August 6, 2004

Comparing the evaluation results with different training scales

Yes, I have got the final evaluation result from the Perl program ace04-eval-v09.pl. When I used three fourths of the training corpus for training and one fourth for testing, the final evaluation result was as follows:
Unweighted-F=70.4% Value=49.6% Value-based-F=50.5%

The unweighted F-score was close to the 74.17% that was my best original result under my own evaluation method. But the main ACE evaluation item, Value, was a little lower.
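For reference, a minimal sketch of how such a fixed three-fourths / one-fourth split might be produced is below. The directory name, the file extension, and the use of a shuffled document list are assumptions for illustration, not the setup I actually used.

```python
import random
from pathlib import Path

def split_corpus(corpus_dir, train_fraction=0.75, seed=0):
    """Split the corpus documents into a training set and a test set."""
    docs = sorted(Path(corpus_dir).glob("*.sgm"))  # hypothetical layout and extension
    random.Random(seed).shuffle(docs)              # fixed seed so the split is repeatable
    cut = int(len(docs) * train_fraction)
    return docs[:cut], docs[cut:]

# Roughly three fourths for training, one fourth for testing.
train_docs, test_docs = split_corpus("corpus", train_fraction=0.75)
print(len(train_docs), "training docs,", len(test_docs), "test docs")
```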

Last night I found that we should submit a description of our algorithmic approach and a comparison with some preliminary system. I also wanted to know whether there is a rule that the final Value gets better and better as the training corpus gets larger and larger. I planned to do about seven experiments, with the training scale ranging from 10% to 70%.
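A rough sketch of how such a sweep could be scripted is below. Here `train_and_evaluate` is a hypothetical placeholder for whatever trains the system and then scores it with ace04-eval-v09.pl, since the actual commands are not shown in this post; it is only assumed to return the three scores for one run.

```python
# Sweep the training fraction from 10% to 70% and collect the three scores
# for each run. train_and_evaluate() is a hypothetical stand-in for the real
# training step plus the ace04-eval-v09.pl scoring step; it is assumed to
# return (unweighted_f, value, value_based_f) as percentages.
def sweep_training_scale(all_train_docs, test_docs, train_and_evaluate):
    results = {}
    for pct in range(10, 80, 10):            # 10%, 20%, ..., 70%
        n = int(len(all_train_docs) * pct / 100)
        subset = all_train_docs[:n]          # first n docs of one fixed, shuffled list
        results[pct] = train_and_evaluate(subset, test_docs)
    return results
```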

Just now I finished four of them. Each experiment took about one hour. And there was a surprising pattern: the three main evaluation scores only fluctuated a little.

This pattern was only seen by comparing the evaluation results. However, I will make a deeper analysis.
