August 13, 2004

Entity Class Classification

I began my additional EDR Co-reference Task Evaluation plan with Entity Class Classification.

ACE would not provide the corresponding entity class, so this was an additional classification task for me. I had three schemes to choose from.

The first was to adopt the EMD modules. Since the EMD output already includes the entity class, this was the most direct scheme. After some consideration, however, I found a flaw in it: the EMD output includes not only the entity class attribute but also a lot of other information. From an algorithmic point of view, if that other information is already provided and we only need to classify the entity class, we can apply a machine learning algorithm to the features of the provided entities and entity mentions. So I used such methods to classify the entity class rather than the EMD modules.

The second was to use SPC as the default value for the entity class attribute. Why? Statistically, SPC accounts for 89.45% of all entity class attributes in the whole training corpus. Using one third of the corpus for testing, this scheme scored 8798.5 under the ACE scoring plan for entity class.
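The default-value scheme is what is usually called a majority-class baseline. A minimal sketch of the idea follows. The labels are toy data invented for illustration, not the ACE corpus, and the function names are my own; the accuracy computed here is plain label accuracy, not the ACE score.

```python
from collections import Counter

def majority_class(labels):
    """Return the most frequent label and its share of the training labels."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)

def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Toy labels, invented for illustration (hypothetical, not corpus statistics).
train_labels = ["SPC"] * 9 + ["GEN"]
test_labels = ["SPC", "SPC", "GEN", "SPC"]

# Learn the default from the training part, then predict it for every test item.
default, share = majority_class(train_labels)
predictions = [default] * len(test_labels)
baseline_acc = accuracy(predictions, test_labels)
```

Any learned classifier has to beat this number to be worth the extra work.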

The last one was a decision tree approach. When I discussed this with Xiantao Liao, she said I could write some manual rules for classifying the entity class attribute. There were lots of feature values I could use, and a decision tree was the best choice for deriving the best rules from the training corpus. I tested this scheme: the classification accuracy was 90.3%. Using one third of the corpus for testing, it scored 8942.75 under the ACE scoring plan for entity class.
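To show how a decision tree turns feature values into rules, here is a small ID3-style sketch. It is an assumption on my part that an entropy-based tree is representative; the post does not say which decision tree algorithm or toolkit was used. The feature names (`mention_type`, `determiner`) and the training rows are hypothetical toy data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_feature(rows, labels, features):
    """Pick the feature whose split gives the lowest weighted entropy."""
    def split_entropy(f):
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[f], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(features, key=split_entropy)

def build_tree(rows, labels, features):
    """Recursively build a tree; leaves are labels, inner nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    f = best_feature(rows, labels, features)
    rest = [x for x in features if x != f]
    branches = {}
    for value in set(row[f] for row in rows):
        sub = [(r, y) for r, y in zip(rows, labels) if r[f] == value]
        branches[value] = build_tree([r for r, _ in sub], [y for _, y in sub], rest)
    return {"feature": f, "branches": branches, "default": majority}

def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree

# Hypothetical toy training data, not taken from the ACE corpus.
rows = [
    {"mention_type": "NAM", "determiner": "none"},
    {"mention_type": "NAM", "determiner": "the"},
    {"mention_type": "NOM", "determiner": "the"},
    {"mention_type": "NOM", "determiner": "none"},
    {"mention_type": "PRO", "determiner": "none"},
    {"mention_type": "NOM", "determiner": "none"},
]
labels = ["SPC", "SPC", "SPC", "GEN", "SPC", "GEN"]

tree = build_tree(rows, labels, ["mention_type", "determiner"])
prediction = classify(tree, {"mention_type": "NOM", "determiner": "none"})
```

Each root-to-leaf path reads as one rule, for example "if the mention is nominal and has no determiner, then GEN" on this toy data, which is the kind of rule one would otherwise write by hand.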

Comparing the three schemes above, I decided to use decision trees as the final scheme.

The next problem was evaluating the whole system by myself.

