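The afternoon lecture was "Perl and Natural Language Processing" by Dr. Endong Xun of Beijing Language and Culture University. Dr. Xun received his PhD from our own HIT, so he is a fellow alumnus! I had long heard that Perl is very well suited to text processing, and I once taught myself a little of it, but I never really put it to use. Only after today's introduction did I appreciate how powerful and useful it is. The lecture mainly covered Perl's basic data structures and basic control-flow statements. Perl's most powerful feature is pattern matching, that is, regular expressions. After covering the basics, Dr. Xun presented applications of Perl in natural language processing through the following examples: dictionary lookup, word frequency counting, Chinese word segmentation, part-of-speech tagging, Simplified/Traditional Chinese conversion, web robots, database connections, and calling the Google API. Watching a few short lines of code do what would take several times as many lines in C++, I could not help getting excited; once I am back at school I must study Perl properly and start using it. One demo was especially instructive: for Simplified/Traditional conversion, the approach was to record a macro in Word and transplant the recorded macro code into Perl. Office bundles a large number of COM components with many different functions, and recording macros lets us graft many special capabilities into our own programs.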
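By comparison, the afternoon session was the one I learned the most from. First, Dr. Jianfeng Gao, a researcher at Microsoft Research, presented a paper they published at the just-concluded ACL on a strategy for building an adaptive word segmentation system. The background is that the NLP community has not yet agreed on what counts as a word, and different corpora handle the question differently. A segmenter trained on one corpus usually performs poorly when evaluated on another. Their approach modifies the basic Bayesian model along the lines of linear weighting and trains a finer-grained model adapted to the corpus to do the segmentation. The experiments show that this greatly improves a single system's evaluation results across different corpora. Another PhD, who is responsible for information extraction, showed Microsoft Research's work in that area. At present their information extraction tool is built on SQL Server 2003, and in the future some APIs will be provided so that users can build on top of it. The main idea is to extract relations on top of chunks and display them as a visual graph. Another PhD, who had just passed his thesis defense this morning, showed his work on Chinese chunk identification. His idea is also to use linear weighting to build a fine-grained chunk identification model. The Microsoft Research session ended with some comments from Dr. Changning Huang, a senior figure in Chinese information processing. He spoke briefly about the prospects of the field and introduced some of Dr. Jianfeng Gao's achievements during his time at Microsoft Research Asia. It turns out that Dr. Gao has published 7 papers at ACL in only about 4 years. Truly impressive. This also shows that the field belongs to young people: work hard and think hard, and results will come.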
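Next came the conference proper. The morning talks ran in parallel in rooms 301 and 401 of Teaching Building 2 at Beijing Language and Culture University; room 301 was on word segmentation and room 401 on machine translation. I picked talks according to my own interests. The first machine translation talk was titled Fuzzy Matching in Machine Translation Evaluation. The idea in it that is valuable for automatic summarization evaluation is to compare the machine translation output with reference translations by measuring n-gram overlap. The lesson for summary evaluation is that non-extractive summaries can be compared by their overlapping n-grams (for several values of n), while for extractive summaries the same method can be used to compare fluency. Some of the morning's talks had no presenter, so both rooms finished fairly early.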
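The n-gram overlap idea can be sketched in a few lines of Python. This is only a generic clipped n-gram precision written for illustration; it is not the fuzzy matching method from the talk, and the function names and toy sentences are my own assumptions.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Counts are clipped, so a repeated n-gram is matched at most as many
    times as it appears in the reference.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

# Toy sentences, invented for illustration only.
candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
for n in (1, 2, 3):
    print(n, round(ngram_overlap(candidate, reference, n), 3))
```

Trying several values of n, as the loop does, matches the remark above that n can take more than one value; longer n-grams reward word order and so say more about fluency.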
Exam of Fault Tolerance
I prepared for the Fault Tolerance exam from yesterday evening until this morning. It was still difficult for me, because some of the topics confused me. I tried my best, I think.
Tomorrow I will go to Beijing to take part in SWCL2004. It is a good chance to exchange ideas with others.
A good book
I recommend a good book to you: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. It is one of the Springer Series in Statistics. The preface opens with a quotation: "We are drowning in information and starving for knowledge." -- Rutherford D. Roger.
Its view of supervised and unsupervised learning was new to me.
"The learning problems that we consider can be roughly categorized as either supervised or unsupervised. In supervised learning, the goal is to predict the value of an outcome measure based on a number of input measures; in unsupervised learning, there is no outcome measure, and the goal is to describe the associations and patterns among a set of input measures".
This book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting -- the first comprehensive treatment of this topic in any book.
So perfect, I believe.
The final class of Wearable Computing
This afternoon we finished the last Wearable Computing class. After three presentations, Prof. Daniel summed up our five days of study. His main points were as follows:
It was a challenge for us to listen, speak, read, write, and think entirely in English. We could not have imagined this before taking part in the classes. It could be viewed as special training for our English.
The whole process was research in miniature. When you do research, you first have to read hundreds of papers. You read the papers, understand others' work, come up with new ideas, and realize them. After that, you have to sell your ideas to others to get money to support your research. That is why he asked us to write a business plan for an application of wearable computing. All of the above together makes up research.
Finally we took a group photo. It's a good memory.
Busiest days (2)
Having submitted the description document for the ACE evaluation, I could concentrate fully on preparing my Wearable Computing presentation. I got up at 6:30 like yesterday, prepared the presentation until 8:30, attended the Fault Tolerance class, continued preparing until 13:50, and at 14:00 began my presentation on "Power Management". I didn't think I had prepared enough, though. All I could do was present as well as I could.
The 35 minutes were hard for me. At first I stayed calm and described the diagrams one by one. But when I got to the third part, I started reading the slides to the audience. I knew that was not good, but I had not prepared this part well. At the end of my presentation I read out the five conclusions. As with the others' presentations, Dr. Daniel P. Siewiorek asked me a question about what some of the points meant.
I had packed up my stuff at Four-Flat. Today was the deadline for moving to Nine-Flat, so after class I moved in a hurry. It was tiring for all of us.
Arriving in the new room, everything felt new to me. When I came back at night the dormitory was quiet, which was good for reading some materials.
Busiest days (1)
This was the busiest day of my life. Yeah, I thought so.
I dragged myself out of bed at 6:30, still half asleep, after my alarm clock had quivered for half an hour. At about 6:50 I started finishing my ACE evaluation task. Yeah! The simulated evaluation performed well: the unweighted F value, total Value, and Value-based F score were 84.3%, 75.6%, and 81.1%, respectively. Good news!! Following the original plan, I generated the final 246 APF documents. When I checked them, I found that many of the mentions were clustered into the right entities, but the accuracy for "PRO"-type mentions was not good enough. I believe this was because the entity type and entity subtype were missing. Just so. I hope we do well in the ACE evaluation.
At 8:50 I hurried to A315 for the Fault Tolerance class. The class was perfect, being entirely in English.
Luckily we were free this afternoon. Starting from the ACE EDR system description, I added some new features and finished the algorithmic approach and system description for ACE EDR Co-Reference. But by the time I finished this document it was nearly 20:00. That left only four hours to complete the Fault Tolerance homework and make the PowerPoint slides for tomorrow's presentation.
Time was very tight. I read the papers as fast as I could. By 22:30 I had the outline in my mind and began to prepare the slides. By the time the building administrator came by, I had finished half of them.
It was nearly 23:00 when I got back to my room. I still had not finished today's homework. Writing in poor light, I completed it at 00:30.
Being busy can enrich my life.
Two tasks
Keep attending the Fault Tolerant Techniques classes and finish the ACE CR Co-Reference evaluation: these are my two tasks.
Starting at 6 this afternoon, I worked on the ACE task. Following my plan, I finished a basic version of the test documents. Now the ACE evaluation program is running on my test results.
It is too late. My Fault Tolerance homework is not finished yet. I must go back now.
Dr. Daniel P. Siewiorek
I am very glad to be taking classes taught by Dr. Daniel P. Siewiorek. He will teach us two courses over these five days: Fault Tolerant Computing and Wearable Computing. The timetable is Fault Tolerant Computing every morning and Wearable Computing every afternoon.
In the morning Dr. Daniel taught us himself. Frankly, he spoke a little too fast for me to follow, but I believe I can adapt to his speed. This morning's Fault Tolerant Computing class mainly covered basic concepts, which are common and useful in this area.
We face a big challenge: each of us has to give a 30-minute presentation comparing three papers. This afternoon three of us gave their presentations. There were some problems. Some spoke too fast to be heard clearly. Some talks had no clear structure, so the listeners could not follow them. I will give my presentation in three days, and I must avoid these problems.
I must go to sleep early. Tomorrow is another busy day.
Finishing the preparation work
Right now I can say I am fully prepared for ACE EDR Co-Reference.
Building on the last few days' work, I completed the comparison tests. The first simulated the ACE EDR Co-Reference evaluation; I got a final unweighted F score of 83.2, a weighted F score of 80.7, and a final overall value of 71.5. The second added the original entity class attributes when constructing the EMD output corpora for testing; the final values were an unweighted F score of 85.3, a weighted F score of 81.8, and a final overall value of 72.4.
The missing entity class attributes do not affect the result much!
So I believe I have made all the preparations. Let me meet the challenge!
Good effect
To finish today's task, I linked all five programs together and tested them on ten training documents. The final result showed that my method works well. The current performance is only a little lower than the ideal result.
OK. All the preparation work is done. The next job is to simulate the real environment and evaluate my system on the training corpora.
Entity Class Classification
I began my additional EDR Co-Reference evaluation plan with entity class classification.
ACE will not provide the corresponding entity class, so this was an additional classification task for me. I had three schemes for it.
The first was to adopt the EMD modules. The EMD output includes the entity class, so this was the most direct scheme. On reflection, however, I found a flaw in it. The EMD output includes not only entity class attributes but also much other information. In this task that other information is already provided and only the entity class needs to be classified, so a machine learning algorithm based on the features of the provided entities and entity mentions would do. I therefore decided to classify the entity class by other methods rather than with the EMD modules.
The second was to use SPC as the default value of the entity class attribute. Why? Statistically, SPC accounts for 89.45% of all entity class attributes in the whole training corpora. Using one third of the corpus for testing, this scheme scored 8798.5 under the ACE entity class scoring plan.
The last was to use a decision tree. When I discussed it with Xiantao Liao, she said I could write some manual rules for classifying entity class attributes. There were many feature values I could use, and a decision tree was the best choice for generating good rules from the training corpora. I tested this scheme: the classification accuracy was 90.3%, and using one third of the corpus for testing it scored 8942.75 under the ACE entity class scoring plan.
Comparing the three schemes above, I decided on decision trees as the final scheme.
The next problem is evaluating the whole system myself.
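The second scheme is what is usually called a majority-class baseline, and it is the number a learned classifier has to beat. A minimal sketch of how such a baseline could be measured follows; the label lists are toy data invented for illustration ("SPC" is the only class name taken from the post, "OTHER" is a placeholder), and the ACE scoring plan itself is not reproduced here.

```python
from collections import Counter

def majority_class(labels):
    """Return the most frequent label in the training data."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(predicted, gold):
    """Fraction of positions where the prediction equals the gold label."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Toy labels, invented for illustration; not the real ACE training data.
train_labels = ["SPC"] * 18 + ["OTHER"] * 2
test_labels = ["SPC"] * 9 + ["OTHER"]

default = majority_class(train_labels)
baseline = accuracy([default] * len(test_labels), test_labels)
print(default, baseline)
```

Seen this way, the decision tree's 90.3% accuracy is a gain of less than one point over always answering SPC.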
My problem of summarization evaluation
I have studied one of the three papers on summarization evaluation from ACL 2004. The outline is as follows:
Title: Automatic Evaluation of Summaries Using Document Graphs
Authors: Eugene Santos Jr., Ahmed A. Mohamed, and Qunhua Zhao
Organization: Computer Science and Engineering Department, University of Connecticut, 191 Auditorium Road, U-155, Storrs, CT 0269-3155 {Eugene,amohamed,qzhao}@engr.uconn.edu
Abstract: Summarization evaluation has been always a challenge to researchers in the document summarization field. Usually, human involvement is necessary to evaluate the quality of a summary. Here we present a new method for automatic evaluation of text summaries by using document graphs. Data from Document Understanding Conference 2002 (DUC-2002) has been used in the experiment. We propose measuring the similarity between two summaries or between a summary and a document based on the concept/entities and relation between them in the text.
Essentials: A bottleneck in text summarization research is how to evaluate the quality of a summary or the performance of a summarization tool.
This paper offers a new view of summary evaluation based on document graphs (DGs). A document graph has only two kinds of nodes: concept/entity nodes and relation nodes. Currently, only two kinds of relations, "isa" and "related to", are captured for simplicity. By comparing the similarity of two document graphs, we can compare the two documents. The similarity between the summary DG and the source DG tends to agree closely with human evaluation, so this tool could replace human involvement in the evaluation.
Use for reference: The concept/entity nodes and relation nodes are generated within the bounds of noun phrases. But Chinese has no good noun phrase identifier yet, which is a bottleneck for applying this approach to Chinese summarization.
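To make the document graph comparison concrete, here is a small sketch of one plausible overlap measure: the share of one graph's concept/entity nodes and relations that also appear in the other. The data layout, the equal weighting, and the toy graphs are my assumptions; the paper's exact formula should be checked against the original.

```python
def dg_similarity(dg_a, dg_b):
    """Overlap of dg_a with dg_b: mean of shared-node and shared-relation ratios.

    Each graph is a dict with "nodes" (a set of concept/entity strings) and
    "relations" (a set of (source, relation, target) triples).
    The measure is asymmetric: it asks how much of dg_a is covered by dg_b.
    """
    nodes_a, rels_a = dg_a["nodes"], dg_a["relations"]
    node_ratio = len(nodes_a & dg_b["nodes"]) / len(nodes_a) if nodes_a else 0.0
    rel_ratio = len(rels_a & dg_b["relations"]) / len(rels_a) if rels_a else 0.0
    return 0.5 * node_ratio + 0.5 * rel_ratio

# Toy graphs, invented for illustration only.
source = {
    "nodes": {"dog", "animal", "owner", "park"},
    "relations": {
        ("dog", "isa", "animal"),
        ("dog", "related to", "owner"),
        ("dog", "related to", "park"),
    },
}
summary = {
    "nodes": {"dog", "animal"},
    "relations": {("dog", "isa", "animal")},
}
print(dg_similarity(summary, source))  # how much of the summary is in the source
print(dg_similarity(source, summary))  # how much of the source the summary keeps
```

The two directions answer different questions, so an evaluation would have to say which one it reports.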
Four vs. six: splitting my work
This morning I was given another task: automatic text summarization evaluation. Dr. Tliu suggested that I split my work into two parts: four tenths for automatic text summarization evaluation and six tenths for the ACE EDR Co-Reference evaluation.
ACL 2004 has just finished. It had a workshop on text summarization, and I found three papers there on summarization evaluation: Automatic Evaluation of Summaries Using Document Graphs; ROUGE: A Package for Automatic Evaluation of Summaries; and Evaluation Measures Considering Sentence Concatenation for Automatic Summarization by Sentence or Word Extraction. Some words in the workshop preface excited me: automatic evaluation of summaries plays a more and more important role in text summarization research. That is to say, it is the bottleneck that holds back further progress in text summarization.
Such a good research area.
The final evaluation of ACE EDR
According to the original plan, we received the evaluation corpus at 9 o'clock this morning. But we did not run the processing programs to produce the final documents right away, because there were still some small problems in the EMD processing program. Meanwhile we tested our systems on the training corpus.
At 20:00 the EMD program reached its best state. About an hour later, all the final documents for EMD, EDR, and RMD had been produced. While the programs were running, we were busy writing the system descriptions.
At about 21:10 we submitted all the result files together with the three system descriptions.
Then we could have our supper.
We had been preparing for these evaluation tasks for more than a month. It was like my experience of the American MCM.
Good experience!! But there is one more ACE evaluation task for me: EDR Co-Reference. I have some doubts about it, and I should write an email to ask for clarification.
Keep on improving our systems
Nothing else could disturb us. We were all concentrating on improving the processing programs.
Link with Entity Mention Detection
My co-reference resolution is a subtask of the EDR evaluation. The ACE EDR deadline was close, so I had to test my system on the EMD output. The new evaluation result showed that the entity scores improved a little. However, that result came from evaluating only three documents.
Tomorrow I can finish a more detailed evaluation.
Comparing evaluation results at different training scales
Yes, I got the final evaluation result from the Perl program ace04-eval-v09.pl. Using three fourths of the training corpus for training and one fourth for testing, the final evaluation result was:
Unweighted-F=70.4% Value=49.6% Value-based-F=50.5%
The unweighted F score was close to 74.17%, my best earlier result under my own evaluation method. But the main ACE evaluation measure, the Value, was a little lower.
Last night I found that we have to submit a description of our algorithmic approach and a comparison with a primary system. I also wanted to know whether the final value keeps getting better as the training corpus gets larger. I planned about seven experiments with the training scale ranging from 10% to 70%.
So far I have finished four of them; each takes about an hour. Surprisingly, the three main evaluation scores fluctuate only a little.
This pattern shows up only in the comparison of the evaluation results, though. I will analyze it more deeply.
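A series of experiments like this amounts to a learning curve. One way to set up the splits is sketched below, assuming a fixed held-out test set and nested training sets so that each larger scale contains the smaller ones; the post does not say how the splits were actually drawn, and the document ids and function name are placeholders.

```python
def training_scale_splits(documents, fractions, test_fraction=0.25):
    """Yield (fraction, train_docs, test_docs) with a fixed held-out test set.

    The last `test_fraction` of the documents is always the test set; each
    training set is a prefix of the remaining pool, so larger scales contain
    the smaller ones.
    """
    n_test = int(round(len(documents) * test_fraction))
    split_at = len(documents) - n_test
    pool, test_docs = documents[:split_at], documents[split_at:]
    for fraction in fractions:
        n_train = min(len(pool), int(round(len(documents) * fraction)))
        yield fraction, pool[:n_train], test_docs

documents = [f"doc{i:03d}" for i in range(100)]  # placeholder document ids
fractions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
for fraction, train_docs, test_docs in training_scale_splits(documents, fractions):
    # A real run would train on train_docs and score test_docs here.
    print(f"{fraction:.0%}: {len(train_docs)} train, {len(test_docs)} test")
```

Keeping the test set fixed means that any change in the scores comes from the amount of training data rather than from a different test sample.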
Playing Table Tennis
I have hardly played table tennis for two months. This evening, when I was tired of wrestling with the ACE evaluation result file format, Shiqi called us together to play. After changing clothes, I went to the basement of the 12th dorm.
I played with Yongguang Huang. We both like to smash and topspin the ball, and this style is tiring.
We played for about an hour.
The last program of the ACE CR task was finished just now. Its main flow consists of four parts: GetDecisionInformation, FormatAPF, Clusteri, and GetFinalDocument.
I have just finished the last part and produced the final document, but there are some errors in it. It is too late now, so I will debug them all tomorrow.
Three Programs
I need three programs to solve my last ACE CR task. They are the result of careful thought and abstraction.
The first program is named EMDProcess. It prepares all the materials for c4.5.
The second is named ExtractDecisionResult.
The last is named GetFinalDocument.
Just now I finished the first two. Tomorrow I can finish the last one.
Good news! Keep going!!
ACE CR next plan
I still had some problems with ACE CR, but the detailed plan is clear in my mind now.
In order to finish the ACE EDR evaluation task with Xiantao Liao, I could
convert some of the tagged corpus into the EMD output format,
treat the result as the final output of Xiantao Liao's programs,
construct the final training samples under the hypothesis that the samples are all positive,
use the final trained decision tree to extract the decision results,
run the sample-generating programs with the decision results and, whenever a positive result comes up, do some clustering,
output the final XML documents.
This process flow can be implemented in two days.
Hoho, another two tense work days.
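The clustering step in the plan above, which turns positive pairwise decisions into entities, could be sketched with a union-find structure. This is my assumption about one reasonable way to do it; the post only says "do some cluster process", and the mentions and pairs below are toy data standing in for the decision tree's output.

```python
def cluster_mentions(mentions, positive_pairs):
    """Group mentions into entities from pairwise "coreferent" decisions.

    Uses union-find: every positive pair merges the two mentions' clusters,
    so coreference is treated as transitive.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Walk to the cluster representative, compressing the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in positive_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())

# Toy mentions and decisions, invented for illustration only.
mentions = ["Bill Gates", "he", "the chairman", "Microsoft", "the company"]
positive_pairs = [
    ("Bill Gates", "he"),
    ("he", "the chairman"),
    ("Microsoft", "the company"),
]
print(cluster_mentions(mentions, positive_pairs))
```

Because merging is transitive, a single wrong positive decision can fuse two entities, which is one reason the quality of the pairwise classifier matters so much.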
DLL interface problem
One of my recent tasks was to build a co-reference resolution DLL for automatic summarization. The detailed steps were clear in my mind. However, when I implemented and tested the DLL, I got an error message.
I asked a lot of people. The answer was that using vector in the interface of a DLL module causes errors. Nobody has a direct solution so far; the compromise is to use char*.