October 31, 2004

Reply to the robot attack on my blog

Recently I discovered that many spam comments had appeared on my blog. Because I had left comments enabled, robots posted lots of them. I believe some SEO (search engine optimization) companies use this trick to raise the Google rank of certain hyperlinks.

At first I deleted the spam comments one by one. These days I deal with them the way they deal with my blog: I use a robot named ROBOT5 to delete the comments automatically, without any human intervention. With this robot I can record a series of operations as a macro and run the macro as many times as I like.

This robot software is useful for lots of mechanical operations. Wonderful!

October 30, 2004

Pattern Classification homework

This evening I was studying in a classroom in D building. My Pattern Classification homework was not yet finished. After studying chapter four I started on the five problems. The first was nearly a pure mathematics problem; I thought about it for an hour without finding a solution. The other problems were easy to solve. To some extent, I believe that mathematics is very important for applied computer science subjects.

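October 29, 2004

A day of self-study

I planned to spend today reviewing. I went to the library on the Second Campus, hoping to study in a fresh environment. I got sleepy soon after I started, and the same thing happened in the afternoon. I slept away a lot of the day.

The last few times I have studied on my own, I have become sleepy after a short while and could only continue after a nap. I used to think I was not getting enough sleep, but this never happens when I am in front of a computer. Thinking it over carefully, I suspect it is because the computer's radiation keeps me in an excited state.

I once read in someone's health advice for IT people that one should not spend more than six hours a day in front of a computer, or it will harm the body. I do not know whether that is right or wrong, but judging from my own condition, I need to limit my time in front of the computer.

This may be today's biggest gain.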

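October 28, 2004

Exams are approaching

In this afternoon's Philosophy of Science and Technology class, the teacher briefly reviewed the exam questions from the previous two years and listed some key points to review. This was an extra session added by the teacher, and it ended after about 30 minutes. The exam for this course is coming soon.

In the Combinatorics class during periods 7 and 8 this afternoon, the teacher sent the same signal: the exam is coming.

Suddenly I felt the pressure of exams bearing down. I have not read much lately. It looks like I need to prepare and review properly.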

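October 27, 2004

Google Desktop Search

Today I received the latest issue of 《计算机世界》 (Computer World). As usual, I quickly skimmed the whole issue. Two of the articles discussed Google Desktop Search. One said that Google Desktop Search is simple and efficient, that the whole program is very small, and that its remarkable way of finding information is integrated with Google's web search. In short, it was full of praise.

The other article raised a question that anyone who has used the software will think of: its security. Google Desktop Search makes it very easy for users to find information on their machines, but the same capability makes it a perfect spying tool, because your email, chat logs, and personal Office documents are all within its view. If someone breaks into your machine, all of your personal data and private information could be stolen.

Every new thing has its good and bad sides. Let the facts be the judge.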

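October 26, 2004

Tranquility takes you far

I remember that not long after I joined the lab I was so busy that my spirits sank. Dr.Tliu's advice at the time was: if a person can organize all his tasks well and keep his inner calm amid the busyness, he will certainly accomplish great things in the future.

A while ago I was again in this very busy state. The club's recruitment interviews and founding meeting, the paper outline that needed revising, the growing pressure of course exams, and more all came down at once. Judging from my state of mind over the past few days, I have become much more mature than before. Facing this situation, I no longer become anxious and restless as easily as I used to. I take every task seriously. In fact, when many things weigh on you, all you need is a state of mind, the tranquility that takes you far. The same things have to be done and faced whatever your state of mind, so rather than being anxious it is better to face them calmly. When the state of mind is good, the mood is good, and everything goes smoothly. All this busyness leaves me feeling fulfilled.

I saw "宁静以致远" ("tranquility takes you far") on the wall of my cousin's home seven years ago, but I could not appreciate it then. I hope I can keep this state of mind.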

October 25, 2004

The student club plan

Recently our college leaders decided to go ahead with the student club plan. The clubs are founded on research centers and labs, under the lead of Ph.D. and master's students. The aim is to train students' scientific research and development ability. The basic idea is to let sophomores and juniors join the student clubs early and learn more about each research center and lab.

The results of our IRClub interviews were published at noon today. I phoned the students who passed, one by one. They were all excited. This afternoon a junior who had not passed came to ask us to give him a chance, because he cherished this opportunity very much. Carl and I were moved by his spirit and let him take part in tomorrow's meeting.

This kind of opportunity is very good for every undergraduate. When I was an undergraduate, we had no chance to join a research center or lab. I envy them.

October 24, 2004

IRClub Interview (2)

Our Information Retrieval Club interviewed about twenty-seven sophomores this evening. Drawing on last evening's experience, we ran the interviews in a more standardized way.

Facing the sophomores, I could clearly feel that their experience and thinking were not as rich and deep as the juniors'. Thinking back to my own sophomore and junior years, I was like them. At this moment I understood more about the effect of university. We must cherish campus life more.

The process took three and a half hours, like yesterday. We were tired again, but I felt better.

October 23, 2004

IRClub Interview (1)

Our original plan was to hold the IRClub interviews tomorrow evening. But at 18:10 some juniors came to our lab and said they had been notified that the interview was this evening. Dr.Tliu said that, in that case, we should interview them.

Carl, zsq and I quickly formed a three-interviewer panel, using the interview Excel table that I had designed at noon. Our rule was three students to a group. Each student was interviewed three times, once by each of us, within 15 minutes. We asked each interviewee some questions and gave a score. Finally we reconciled our opinions on each student.

This process was rigorous and fair to every interviewee. We interviewed twenty-seven juniors.

We were all tired after the four hours. But it was a nice experience, and my first time as an interviewer.

October 22, 2004

Beginning to study VC++

This afternoon I began to study VC++. This time I have made up my mind to study VC++ and not VB.

I started with MFC, building on my experience with a simple calculator. I finished a "Hello World!" program. It is simple, but it helped me understand how VC++ works.

I will keep going!

October 21, 2004

Reading the new paper about Anaphora Resolution

There is a new paper on Chinese anaphora resolution, "On Anaphora Resolution within Chinese Text". The author is Wang Houfeng, an expert in Chinese anaphora resolution.

In the paper he raises some issues in anaphora resolution within Chinese text and analyzes the difficulties of solving them with the current state of the art. Three aspects are discussed: (1) it is difficult to identify some Chinese anaphors, such as zero forms and common-noun anaphors; (2) it is difficult to recognize potential antecedents and their features, such as gender, number, and grammatical role; (3) both the necessary NLP technology and language resources are lacking.

I learned a new syntactic technique for anaphora resolution: the c-command condition.

October 20, 2004

Time Management

How do you manage your time when you have more things to do than you can keep track of or carry? This is a big problem in my study and life.

Over the past while I have spent lots of time on the lab's tasks, and even my spare time went to them. As a result there are chapters of Combinatorics and Pattern Classification that I have not read, and some homework for them that I have not finished.

This evening I thought more about my recent life and study. I found that I should manage my time more sensibly, and I drew up a simple schedule as follows:

6:30 Get up, do morning exercise and have breakfast.

8:00~11:30 Finish lab tasks or read materials on my research topic.

11:30~1:30 Have lunch and take a nap.

2:00~5:30 Practise programming techniques or finish the lab's programming tasks.

6:30~10:00 Study my graduate courses.

10:00~10:30 Write my diary and make a detailed plan for the next day.

11:00 Go to sleep.

This plan leaves room for flexibility in my study and life. There is a well-known saying that you should devote all your energy to work and study when you are working or studying, and all your energy to enjoying yourself in your spare time. This rule suits my needs.

New management, new life! I hope so.

October 19, 2004

Hosting a visit

This afternoon I was given the task of introducing our laboratory to some foreign visitors in English. At 3:10 the guests, a couple, came to our laboratory. After an introduction by our associate dean, I began to introduce the laboratory to them. During the associate dean's speech the woman smiled at me, and I felt her kindness.

Frankly speaking, I had spent an hour preparing the English introduction before their visit. First I talked a little about our research areas: IF, IE and NLP. Then I showed them some demos. At the beginning I felt a little nervous. About two minutes later the couple and I sat down to chat. The man was very interested in natural language text understanding technology. When he watched the demo of the Chinese sentence dependency parser, he asked some questions about Chinese word segmentation and said that it was very different from English words. When I demoed the summarization system, he told me something about his own work, which involves reading lots of information. Finally, he was interested in the Chinese character recognition system. He wrote a character that was an old Chinese symbol, so the system could not recognize it. He was very interested in this character and told us some of the history of the symbol.

Frankly speaking, I did not understand some of their sentences, but I could tell that their English was perfect. This was a nice chance for me to practice my spoken English.

Nice experience!

October 18, 2004

First TA

This evening I went to the Second Campus to TA the C programming lab. I stood in for a senior labmate to supervise the experiments. It was a good chance to exercise my abilities.

I supervised fifteen students. Some of them finished the experiment quickly and well, but others struggled with the problem and the language. I thought back to my own C language labs when I was a freshman. The supervision was stricter then: every fifteen students had a supervising teacher. I believe this rule benefits the students more.

It was a good experience.

October 17, 2004

Continue writing SE paper

This took the whole day.

October 16, 2004

Get together

At noon today Hang Chen came to the campus. Ten of us from our class got together at the Hong Ming restaurant. We talked a lot about work experience and how everyone has been doing. Hang Chen has become more mature and stouter. He gave us some advice about finding a good job and passed on news about other classmates: some want to change jobs, and some are preparing for the upcoming graduate entrance exam.

We were all happy recalling the good memories of our four undergraduate years.

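October 15, 2004

Questioning grey system theory

It has been a long time since I discussed grey systems with anyone. Today, on the Machine Learning board of HIT-IR-BBS, I met a friend whose ID is phew. He began by questioning grey system theory, and then we had some discussion. I list the discussion below: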

phew:
-----------------------------------------------------------------------------------------
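I didn't know people here were so enthusiastic about grey systems.
When Deng Julong's grey theory first came out, I checked his worked examples and found that he had not done the calculations at all; he had simply written down the results he assumed. Later I checked the so-called GM(1,1) and GM(1,2) models following his method and found that his theory is simply an insult to higher mathematics.
If you just want to go along with the theory, feel free to use "grey" in your title. If you want to do scientific research, then please think about how the first derivative at a point (x,y) is defined, and then go and check his examples, preferably in his first book.
I also checked the journal 《系统工程》 (Systems Engineering) of that time (I can't remember the exact title). Of all the grey-theory examples published in it, not one gave the correct result when worked through along his lines. Yet the reported prediction accuracy was always very high.
So I gave up on this profound theory.
I therefore hope that everyone will also check these things before using them.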
-----------------------------------------------------------------------------------------

billlang:
-----------------------------------------------------------------------------------------
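Discussion is welcome.
Criticism is welcome.
But what you said is too general. Could you please give a concrete counterexample that would convince people?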
-----------------------------------------------------------------------------------------

phew:
-----------------------------------------------------------------------------------------
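Here is an example. If you have doubts, you can find the book Deng Julong published around 1986 and check the calculation yourself.
Since it has been elevated to the level of systems engineering, grey theory ought to apply within the scope it advertises, and it must not violate the laws of mathematics.

Let t = (1,2,3,4,5)

Assume the time step dt = 1
The sequence (35,47,22,150,47,33) is not increasing, so we accumulate it and obtain the following sequence
(82,104,254,301,334)
Therefore: [dx/dt] = (47,22,150,47,33)
[x] = (82,104,254,301,334)

GM(1,1) should be:
dx/dt + a * x + b = 0

x = exp(-a * t) - b/a

Once the coefficients (a,b) are found, everything is settled
E1 = 47 + a * 82 + b
E2 = 22 + a * 104 + b
E3 = 150 + a * 254 + b
E4 = 47 + a * 301 + b
E5 = 33 + a * 334 + b

By least squares, minimize [e] * [e]'
{-dx/dt} = {C}{a,b}'
which gives (a,b) = (-2.4362, -0.0105)

dx/dt - 2.4362 * x - 0.0105 = 0

x = exp(2.4362 * t) - 0.0043

x = (10 13 14.9 1707 19509)

This result cannot be corrected back to the original sequence merely by changing the initial condition.

There is no need for a GM(1,2) example. Can accumulation alone really force every increasing sequence to obey an exponential law?

The matrix operations here were done in Matlab.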
-----------------------------------------------------------------------------------------
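For readers who want to repeat this kind of check, the following is a minimal sketch of GM(1,1) in its common textbook form, which differs from phew's setup above: it fits x0(k) + a * z(k) = b, where z(k) is the mean of consecutive accumulated values. The function names and the geometric test series are made up for illustration and do not come from the thread or from Deng's book.

```python
import math


def fit_gm11(x0):
    """Fit the textbook GM(1,1) form x0(k) + a * z(k) = b; return (a, b)."""
    # Accumulated generating operation: running sum of the raw series.
    x1 = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = list(x0[1:])
    # Ordinary least squares for the line y = -a * z + b.
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    slope = szy / szz
    return -slope, mean_y - slope * mean_z


def restore_gm11(x0, a, b, steps):
    """Rebuild fitted values of the raw series from the exponential solution."""
    c = b / a
    x1_hat = [(x0[0] - c) * math.exp(-a * k) + c for k in range(steps)]
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]


if __name__ == "__main__":
    series = [1.0, 2.0, 4.0, 8.0, 16.0]  # hypothetical data, exactly geometric
    a, b = fit_gm11(series)
    fitted = restore_gm11(series, a, b, len(series))
    print("a, b =", a, b)
    print("residuals =", [s - f for s, f in zip(series, fitted)])
```

Even for this exactly geometric series the restored values do not reproduce the input, because the coefficients are fitted on a difference equation while the restoration uses the continuous exponential solution. Printing the residuals is a simple way to check a published example.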

billlang:
-----------------------------------------------------------------------------------------
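Hello phew!
I agree with your analysis of this example.
But rejecting grey system theory on the strength of this one example does not seem right to me. Personally I think you could improve the theory, or even put forward new ideas of your own.
Grey system theory has been greatly improved since it was born. The original GM(1,1) model was only a prototype prediction model, or one might call it a data-fitting method. Many later refinements have greatly improved the models built on GM(1,1).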
-----------------------------------------------------------------------------------------

phew:
-----------------------------------------------------------------------------------------
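This is not just an isolated example. Rather, using that ordinary differential equation has simple prerequisites, and it does not automatically satisfy the requirements set out in the concept of greyness.
--------------------------------
--------------------------------
Bill, I read your ppt. While studying grey systems I had a problem with the part that derives the GM(1,1) model from the whitening equation; I could not derive the correct answer. Is there a detailed theory and derivation for this part? If you have a related electronic document, could you send me a copy? My email: yang.guan@gmail.com. Many thanks.
----------------------------------
It seems I am not the only one with doubts. Fifteen years ago, Mr. Deng's reply to me was the same as the encouragement you gave me.
1. dx/dt = dx holds only under a condition, namely that dx/dt is near 0
2. Even when using the so-called grey differential equation, one cannot violate the rules for solving differential equations
You have attached so many conditions to the grey differential equation; what then is the point of "small data"?
I read your grey system lecture notes, and I really have to admire the way you packaged grey theory. What I don't understand is why, with your strong mathematical background, you don't plug such elementary mathematical holes.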
-----------------------------------------------------------------------------------------

phew:
-----------------------------------------------------------------------------------------
This is a thesis on grey prediction from Chaoyang University of Technology in Taiwan. Everyone, please check its results.
Page 44 of the thesis
------------------------------------------------------
http://ethesys.lib.cyut.edu.tw/ETD-db/ETD-search/getfile?URN=etd-0724104-142556&filename=etd-0724104-142556.pdf
------------------------------------------------------------
Key matrices:
{B}=
-349 1
-612 1
-858 1

{Y}=
272
254
239

A={a,b}

Matrix equation

{B}*A'={Y}

Solve for A by least squares

------------------------------------------------------
If his conclusion is correct, then there must be something wrong with the least squares method
-----------------------------------------------------------------------------------------
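The least-squares step phew asks readers to repeat can be sketched in a few lines. This is only an illustration: the numbers are the {B} and {Y} entries quoted in the post, while the function name and the plain-Python closed-form fit are my own choices, not the thesis's code.

```python
def solve_a_b(z, y):
    """Least-squares solution of y = -a * z + b, i.e. {B} * A' = {Y} with rows (-z, 1)."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szz = sum((zi - mean_z) ** 2 for zi in z)
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    slope = szy / szz
    a = -slope
    b = mean_y - slope * mean_z
    # Residuals of the fitted line; least squares minimizes their squared sum.
    residuals = [yi - (-a * zi + b) for zi, yi in zip(z, y)]
    return a, b, residuals


if __name__ == "__main__":
    z = [349.0, 612.0, 858.0]  # negated first column of {B} from the post
    y = [272.0, 254.0, 239.0]  # {Y} from the post
    a, b, residuals = solve_a_b(z, y)
    print("a =", a, "b =", b)
    print("sum of squared residuals =", sum(r * r for r in residuals))
```

With these three rows the fit gives a of roughly 0.065, and the sum of squared residuals is small but not zero.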


billlang:
-----------------------------------------------------------------------------------------
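So you discussed this with Mr. Deng 15 years ago. It seems your study of grey systems goes much deeper than mine. I have known grey systems for less than two years. I first came to them because I wanted to use them in a mathematical modeling contest, and later I completed a simple research project based on grey systems. I am still a student and grey systems are not my research area. I have many things to finish now and no more time to study grey system theory in depth. But I firmly believe that grey system theory has its value and significance and is a theory full of vitality. Looking at the history of many classical theories, none developed without setbacks, and I think grey system theory has a bright future.

My lecture notes introducing grey system theory were prepared with reference to 《灰色系统理论及其应用》 (Grey System Theory and Its Applications); I did not package grey systems (I suggest you browse this book). You can find many articles on grey system theory online. There are many researchers in grey systems, and their fruitful work is taking grey systems to a higher level. I feel my own mathematical background is not yet very deep, and I am not yet able to develop grey system theory.

Your study of grey system theory is very deep. I suggest you discuss it with Professor Liu Sifeng, a current expert in grey systems.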
-----------------------------------------------------------------------------------------

phew:
-----------------------------------------------------------------------------------------
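Because so many people use it, I want to remind everyone: before using a new theory, first get clear about its premises. Rushing in with the crowd is all too common nowadays. My view of the GM models in grey theory is:
1. It uses least squares, but least squares only gives a minimum of the sum of squared errors, and that minimum is not 0. This point has been overlooked.
2. Too many people are fabricating data. In the thesis mentioned above, the value of a is clearly 0.0648, yet in the final back-calculation formula it miraculously becomes 0.01804 (I hope this is the author's slip of the pen). But whichever value of a is used (0.0648 or 0.01804), following the grey approach described in the thesis, one cannot recover the original sequence (I hope my check is wrong).

Describe grey theory at length --> make up some data for show --> produce a miracle: this has almost become the standard routine for using grey theory.

I understand the importance of publishing papers. But these authors use the theory while knowing it is wrong, in front of colleagues who need papers just as much, and they ought to show some restraint.

-------------------------------------------------------
There are too many problems that cannot be explained clearly, so theoretical breakthroughs are needed. It is not only grey theory; the fashionable Fuzzy Probability is likewise being interrogated by mathematicians. A muddled theory cannot be used to explain the unknown. I hope grey theory makes a major breakthrough, because my own field is waiting for one too. But not by these methods.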
-----------------------------------------------------------------------------------------

billlang:
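I partly agree with your view.
I don't quite agree with what you call "using it while knowing it is wrong",
because whether grey systems are right or wrong has to be verified by a large body of facts. I believe practice is the sole criterion for testing truth, so why not wait for time and facts to prove everything?
May I ask what your field is, and what problem you need grey systems to help you solve?

Thank you for your advice.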
-----------------------------------------------------------------------------------------


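The depth of phew's thinking on this problem is admirable. I look forward to more exchanges with him!
For the full discussion see: http://ir.hit.edu.cn/cgi-bin/newbbs/topic.cgi?forum=20&topic=4&start=24&show=0

New book recommendation: 《神经网络及其应用》 (Neural Networks and Their Applications)

Publisher: Tsinghua University Press
Authors: Zhou Zhihua / Cao Cungen
Series: China Computer Federation Academic Monograph Series
Publication date: September 2004
Summary:
This book invites well-known Chinese experts in neural networks and related fields to discuss the theoretical foundations and typical applications of neural networks. Topics include neural network learning methods, optimization computation, knowledge theory, manifold learning, process neuron networks, random binary networks, discrete associative memory neural networks, and applications of neural networks in medical data processing, Chinese language cognition, and other areas. Drawing on a rich body of literature and research work, the chapters review and analyze the latest developments and are a valuable reference for academic research.
This book is suitable as a reference for graduate students, teachers, engineers, and researchers in computer science and automation.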


October 14, 2004

Related materials on Summarization Evaluation

There are many materials on summarization evaluation. At the recent DUC 2004 conference there was a summarization evaluation tool named ROUGE. Its main idea is to calculate the n-gram co-occurrence rate. Following the successful application of automatic evaluation methods such as BLEU in machine translation evaluation, Lin and Hovy (2003) showed that methods similar to BLEU, i.e. n-gram co-occurrence statistics, could be applied to evaluate summaries.

ROUGE stands for Recall-Oriented Understudy for Gisting Evaluation. It includes several automatic evaluation methods that measure the similarity between summaries.
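
To make the n-gram co-occurrence idea concrete, here is a minimal sketch of an n-gram recall score in the spirit of ROUGE-N. It is not the ROUGE package itself: the function names, whitespace tokenization, lowercasing, and example sentences are my own assumptions for illustration.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(candidate, references, n=1):
    """Share of reference n-grams that also occur in the candidate summary.

    Matches are clipped: an n-gram is credited at most as many times as it
    appears in the candidate.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    matched = 0
    total = 0
    for reference in references:
        ref = ngram_counts(reference.lower().split(), n)
        total += sum(ref.values())
        matched += sum(min(count, cand[gram]) for gram, count in ref.items())
    return matched / total if total else 0.0


if __name__ == "__main__":
    system_summary = "the cat sat on the mat"      # hypothetical system output
    human_summaries = ["the cat is on the mat"]    # hypothetical reference
    print(ngram_recall(system_summary, human_summaries, n=1))
    print(ngram_recall(system_summary, human_summaries, n=2))
```

The score is recall-oriented because the denominator counts n-grams in the human references rather than in the system summary.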


References:

[1] Chin-Yew Lin. ROUGE: A Package for Automatic Evaluation of Summaries. ACL 2004.
[2] Chin-Yew Lin and E. H. Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of 2003 Language Technology Conference, Edmonton, Canada.

October 13, 2004

New Scheme

I have finished the summarization evaluation task. But I have not worked on CR for nearly two months, so my next task is to read lots of papers about CR.

We have had many classes on Pattern Classification and Combinatorics, and I have not reviewed them for nearly half a month.

These are the two main areas I can work on. Try again!

October 12, 2004

Poor pronunciation

This year I have given two English presentations. The first was in the Graduate English class. My topic was High-tech, and it lasted only eight minutes. After that presentation my English teacher, Mrs. Zhang, suggested that I improve my pronunciation. The second was in the summer course on Fault Tolerant Computing and Wearable Computing. My topic was Power Management, and it lasted thirty-five minutes. After that presentation the teacher, Dr. Daniel P. Siewiorek, suggested that I improve my pronunciation.

This evening I gave my third presentation, at the Doctoral English Forum, where we discuss papers in our research fields. It was my turn to present. My topic was Random Forests in Language Modeling, and it lasted seventy minutes. I kept my speaking speed steady in order to express myself clearly. After the presentation I answered lots of questions. Dr.Tliu suggested that I improve my pronunciation.

It is clear that my pronunciation is not good. I feel this is a very serious problem for me, and I should start solving it now.

October 11, 2004

Continue reading paper

For tomorrow's presentation I must continue reading the paper. There are still lots of things in it that puzzle me.

One thing I would like to note is that the author, Peng Xu, is Chinese. His education is as follows:

1990 ~ 1995 B.S. in Tsinghua University;
1995 ~ 1998 M.S. in National Lab of Pattern Recognition Beijing;
1998 ~ 1999 Ph.D. Candidate in Brown University;
1999 ~ now Ph.D. Candidate in The Center for Language and Speech Processing, Johns Hopkins University, USA.
Fields of Interest: Speech Recognition, Pattern Recognition, Language Modeling, Machine Translation, Natural Language Processing, Multimedia Coding.

He has accomplished a lot.

October 10, 2004

Random Forests in Language Modeling

This paper is about language modeling. My reading notes are as follows:

Title: Random Forests in Language Modeling

Author(s): Peng Xu and Frederick Jelinek

Author Affiliation: Center for Language and Speech Processing, the Johns Hopkins University, Baltimore, MD 21218, USA

Conference Title: Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing.

Language: English

Type: Conference Paper (PA)

Treatment: Practical (P) Experimental (X)

Abstract: In this paper, we explore the use of Random Forests (RFs) (Amit and Geman, 1997; Breiman, 2001) in language modeling, the problem of predicting the next word based on words already seen before. The goal in this work is to develop a new language modeling approach based on randomly grown Decision Trees (DTs) and apply it to automatic speech recognition. We study our RF approach in the context of n-gram type language modeling. Unlike regular n-gram language models, RF language models have the potential to generalize well to unseen data, even when a complicated history is used. We show that our RF language models are superior to regular n-gram language models in reducing both the perplexity (PPL) and word error rate (WER) in a large vocabulary speech recognition system.
Descriptors: Natural Language Processing basic problem
Identifiers: random forests, language model, decision tree, perplexity

Personal feeling: The paper introduces decision tree language models and the random forest concept. The main idea is wonderful: randomly generate several decision tree language models and combine them into a single model. This model can alleviate the data sparseness problem to some extent.

What could be improved: The random decision tree generation method is not good enough. I believe we could use some optimization principles to obtain better random decision trees.

October 9, 2004

Three new summarization systems

Based on the idea behind my best evaluation method, I implemented three summarization systems. The first follows the traditional approach: it calculates a weight for every sentence and selects the best ones.

The new idea in the first system is the weighting method. I used the evaluation method to calculate the weight of each sentence. The resulting summaries of the test papers achieved somewhat better scores under my best evaluation system.

For the second system I changed my approach: enumerate all possible sets of sentences and calculate each set's similarity to the source file. This method is very slow, because its complexity is pow(2,n).
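
The exhaustive search can be sketched as follows, which also shows where the pow(2,n) cost comes from. This is only an illustration under my own assumptions: the similarity is a plain Jaccard overlap of word sets, the word budget is hypothetical, and neither is the similarity measure my system actually used.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_sentence_set(sentences, source, max_words):
    """Try every non-empty subset of sentences and keep the one most similar to the source.

    Returns (indices, score, examined); examined is 2**n - 1 for n sentences.
    """
    source_words = set(source.lower().split())
    best, best_score, examined = (), -1.0, 0
    n = len(sentences)
    for size in range(1, n + 1):
        for indices in combinations(range(n), size):
            examined += 1
            # Skip subsets that exceed the (hypothetical) summary length budget.
            if sum(len(sentences[i].split()) for i in indices) > max_words:
                continue
            words = set()
            for i in indices:
                words |= set(sentences[i].lower().split())
            score = jaccard(words, source_words)
            if score > best_score:
                best, best_score = indices, score
    return best, best_score, examined


if __name__ == "__main__":
    sents = ["a b", "c d", "e f", "a c e"]  # toy sentences
    print(best_sentence_set(sents, "a b c d e f", max_words=3))
```

Each extra sentence doubles the number of subsets, so this is only practical for very short documents.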

The third one generalizes the sentence weights of the first system. But its final evaluation results were worse than those of the four systems implemented by yhb.

Three systems, three methods. I will think more about them.

October 8, 2004

Some checking results

The original plan was for me to use the new evaluation method to help yhb find his best system. yhb gave me eight new systems, and I used my program to evaluate them one by one.

It was very effective. But some trends did not match Mrs. Qin's intuition. We could analyze them further.

October 7, 2004

Exciting Scheme

This morning, after reading the latest news, I made my plan for the day. Just now I finished the first item: implementing the relative word frequency approach. The experimental result was close to what I expected. The method can distinguish different summarization systems, but its agreement with human judgment is not high. It is not as good as the other two methods, but it can serve as a point of comparison.

Right now I have an exciting scheme. I discussed the recent progress with Wanxiang Che, a Ph.D. student. He did not believe that my TF method was powerful and suggested that I implement some new method. He also suggested that I build a new summarization system based on my TF method. We could then do some human evaluation to show the usefulness of the method.

This system is not complicated, and I can implement it quickly. So exciting.

After comparing this system with the others I can do another thing: compare its summaries with human summaries and obtain a new evaluation approach.

Such exciting news for me today! I am excited!!!!

Let me start on the new plans.

October 6, 2004

Simple but effective method for SE

It is said that the most effective method is the simplest one. I never used to believe it. But now I cannot help believing it.

This morning I kept feeling down about the SE task. I had no ideas. But I held to my view that the breakthrough point was how to combine the new system into my SE system. I wanted to use ROUGE, the famous package for summarization evaluation from DUC 2004. But because of some unknown problem I could not run the programs. Out of ideas, I began to review the presentation slides from 26 Sep. Suddenly I found that the two methods in those slides could be re-implemented and used in a new way.

I compared the four new rankings and got the ideal result. Wonderful!!

I had done it. I told Dr.Tliu the news. He discussed it with me and was excited too. He suggested that I think more about the methods.

At noon I kept working, implemented an even more basic method, and achieved a better result. Good news for me.

Frankly speaking, the two new methods are not sophisticated, but they are simple and effective. I cannot fully explain why.

October 5, 2004

Visiting the science and technology museum

This morning we set out on our planned visit to the science and technology museum.
There were 17 of us, including members of our lab and WF.

Last time we had planned to visit this place, but it is closed every Monday, so we went to Sun Island instead. Today was Tuesday, and sunny.

The beautiful building has three floors. The science and technology exhibits of all kinds were interesting.

The most exciting attraction was the four-dimensional film.

October 4, 2004

No answer for SE

I spent the whole day thinking about the key to the SE problem.

I found I had no ideas for this problem, so I turned to other people's published papers. There are lots of papers about SE by Hongyan Jin, who is famous in this area. I have read some of her papers, but found nothing in them to guide me through my task.

The key problem of the SE task is how to combine the new summarization system into the SE system.

DUC is a well-known summarization evaluation conference. Its evaluation tool is ROUGE, which is based on n-gram and related co-occurrence information. I wanted to test it, but there were some problems with my Perl environment.

So far I have not got it working.

October 3, 2004

Three new methods for SE

My recent task is designing new methods for SE (Summarization Evaluation). Today I tested three new methods: artificial neural networks, decision trees, and multiple regression analysis. But none of them worked well for my task.

I believe there is a way to break through. I must combine the new system with my SE methods. But how? This is the essence of my problem, and I need to think hard about it.

October 2, 2004

New method for SE

There is a new method for SE (Summarization Evaluation), based on hints from Dr.Tliu, Carl and Yhb. The good thing about this method is that I can use my machine learning methods. I want to extract lots of features from the summaries and let the learner fit the data.

The framework has been fixed. I only need to implement the sub-modules one by one.

October 1, 2004

National Day!

Today is National Day!

As originally planned, I worked in the lab this morning and afternoon. At 3:00 pm WF and I went to the Harbin Odeum, where the Harbin Philharmonic Group gave a wonderful National Day concert. It began at 6:30 pm.

The conductor was a famous young man named Qiuhong Teng. He conducted 12 pieces, including the famous Carmen, the Blue Danube, and so on. They were beautiful.

This was my first time at a concert. A wonderful experience. WF and I also wandered through some streets, including Central Street.

At 8:00 pm we came back.