I used C4.5 Release 8 for my task and obtained the first results on three samples (bnews, treebank, nwire) and their combination, as follows:
------------------------------------------------------------------
bnews  3:1 train:test
------------------------------------------------------------------
Evaluation on training data (348405 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        65340  13719( 3.9%)   1172  17106( 4.9%)    ( 5.1%)   <<
Evaluation on test data (116134 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        65340  4709( 4.1%)   1172  3819( 3.3%)    ( 5.1%)   <<
          (a)  (b)      <-classified as
         ---- ----
         3468 3498      (a): class +
          321 108847    (b): class -
Precision: 3468/(3468+321)=0.91528  Recall: 3468/(3468+3498)=0.49785 F: 2PR/(P+R)=0.64491
------------------------------------------------------------------
treebank 3:1 train:test
------------------------------------------------------------------
Evaluation on training data (151249 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        20672  4558( 3.0%)    384  5630( 3.7%)    ( 3.9%)   <<
Evaluation on test data (50417 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        20672  2700( 5.4%)    384  2407( 4.8%)    ( 3.9%)   <<
          (a)  (b)      <-classified as
         ---- ----
         1688 2184      (a): class +
          223 46322     (b): class -
Precision: 1688/(1688+223)=0.8833  Recall: 1688/(1688+2184)=0.43595 F: 2PR/(P+R)=0.58378
------------------------------------------------------------------
nwire 3:1 train:test
------------------------------------------------------------------
Evaluation on training data (351814 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        53589  11635( 3.3%)    880  14826( 4.2%)    ( 4.4%)   <<
Evaluation on test data (117271 items):
         Before Pruning           After Pruning
        ----------------   ---------------------------
        Size      Errors   Size      Errors   Estimate
        53589  6599( 5.6%)    880  5942( 5.1%)    ( 4.4%)   <<
          (a)  (b)      <-classified as
         ---- ----
         3763 5507      (a): class +
          435 107566    (b): class -
Precision: 3763/(3763+435)=0.89638  Recall: 3763/(3763+5507)=0.405933  F: 2PR/(P+R)=0.5588
------------------------------------------------------------------
all 3:1 train:test
------------------------------------------------------------------
Evaluation on training data (851468 items):
	 Before Pruning           After Pruning
	----------------   ---------------------------
	Size      Errors   Size      Errors   Estimate
	96104  31886( 3.7%)   1753  37773( 4.4%)    ( 4.6%)   <<
Evaluation on test data (283822 items):
	 Before Pruning           After Pruning
	----------------   ---------------------------
	Size      Errors   Size      Errors   Estimate
	96104  13422( 4.7%)   1753  12154( 4.3%)    ( 4.6%)   <<
	  (a)  (b)	<-classified as
	 ---- ----
	 8872 11236	(a): class +
	  918 262796	(b): class - 
Precision: 8872/(8872+918)=0.90623  Recall: 8872/(8872+11236)=0.44122 F: 2PR/(P+R)=0.59348
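For reference, the precision, recall, and F values reported above follow directly from the confusion-matrix counts: P = TP/(TP+FP), R = TP/(TP+FN), F = 2PR/(P+R). Below is a minimal Python sketch that reproduces the numbers for the combined run; the counts are taken from the "all" matrix above, and the helper name prf is my own.

def prf(tp, fn, fp):
    # precision, recall and F from confusion-matrix counts
    precision = float(tp) / (tp + fp)
    recall = float(tp) / (tp + fn)
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

# "all" run: 8872 of class + classified correctly, 11236 missed, 918 false positives
p, r, f = prf(tp=8872, fn=11236, fp=918)
print("Precision=%.5f  Recall=%.5f  F=%.5f" % (p, r, f))
# -> Precision=0.90623  Recall=0.44122  F=0.59348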
Although these F-scores are close to the internationally reported MUC results, I feel they can still be improved. I will keep working on improving them.
 