April 30, 2005

AI -> Computation and Algorithms

We had the final AI exam this morning. I did not do very well on it, but the next two courses to prepare for are Computation and Algorithms.

OK. The exam season is here. Keep trying!

April 29, 2005

Mind Map of Chapters 7, 8, and 9 of AI

I have now finished all the mind maps for our AI exam.
Here is the mind map for chapters 7, 8, and 9, just for sharing:
Your comments are welcome.


April 28, 2005

Mind Map of Chapters 4, 5, and 6 of AI

Just for sharing, as follows:


April 27, 2005

Mind Map of AI Chapters 1, 2, and 3

I had reviewed the AI slides once, but I found that I knew little about the exam points. I must review them once again!

I used a mind map for my second review. I'd like to share it with you, if you like, as follows:


April 26, 2005

ANN -> AI

This evening at 6:00 pm we had the ANN exam in D12. There were eight topics. The AI exam is in three days, so let me switch to reviewing AI.

Anyone who wants to review something must devote himself to it. OK. Tomorrow I will go to the classroom.

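April 25, 2005

Father, Mother, and My Teacher

While reviewing AI this evening I felt tired for a while, walked to the stairwell, and looked out the window. I could not help remembering that I had not called home for almost half a month. I picked up the phone and dialed right away. Beep... beep... beep... beep. "Hello..." My father's familiar voice came over the line. As always, he first asked how I was doing: study, health, work, one by one. Then, as usual, he asked whether I needed the family to send money. Being a little stubborn, I say no every time. The independence I learned years ago always makes me answer that way without thinking.

Then my mother took the phone and began asking whether I was taking care of my eyesight, whether I rested once every hour, and so on. I can always feel my parents' care in these conversations, and it is very warm. I said I was sorry that I had not called home for more than half a month. I already knew what my mother would answer: "Everything is fine at home. Your dad and I are both in good health, so don't worry." Then she went on: "But a while ago, when your dad went out to buy vegetables, a motorcycle clipped him and he hurt his ribs a little." I became very nervous at once. I learned that it was only a minor injury; after a few days in the hospital he has almost recovered and only needs to take some Chinese herbal medicine to recuperate. Only after asking in detail how the injury happened and how the hospital examinations went did I feel at ease. The phone went back to my father, who said, "It's all right now, don't worry." I urged him again to get plenty of rest, and then the phone returned to my mother.

After saying a little more about my father, my mother mentioned Ms. Tao, my head teacher in junior middle school. She said she had found Ms. Tao's phone number. It turned out that once, on her way to an activity for retired teachers of my old school held somewhere near my home, Ms. Tao was running late and lost her way. She happened to pass my home and chatted with my mother for a while. After my mother told her that I had twice looked for her on visits home without success, she left her current phone number.

A little while after getting the number, I called her. On the other end was the same kind voice. When I gave my name she was surprised, and then delighted, because she had not seen me for five years. Her voice had not changed; it was as clear and crisp as ever. In the excitement neither of us knew what to say. She retired many years ago and now stays home looking after a child. She said, "I taught all my life. Now that I am retired I don't want to do anything, and I can't do much anyway, so I rest all day." Indeed, I remember that when I was in junior middle school, ours was the second-to-last class she taught. Those three years flew by like a flash of light before my eyes. She had already heard about my recent situation when she was at my home, so she asked directly, "Is your health good now? Do you feel a lot of pressure?" Her concern was the same as ever. My eyes blurred, and after a moment I collected myself and said, "Very good, everything is fine." After we talked about some of my old classmates, she told me to keep working hard. I wished her good health always.

Father, mother, teacher: I could not do without them in my life. I thank them for raising and teaching me. I wish them good health always!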

April 24, 2005

Reviewing

Reviewing AI...

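April 23, 2005

Going into Retreat? Begin!

Today I saw on a good friend's blog an announcement of a twenty-day retreat in order to finish writing a paper. It even humorously gave a phone number for extremely urgent matters.

Seeing the word "retreat" (biguan, literally "closing the gate"), I could not help thinking of the martial arts novels in which some master shuts himself away completely to practice an advanced skill or grasp some essence. Some only drink water and do not eat; some neither eat nor drink. A retreat may last as little as a month or two, or as long as several months or even years. During the retreat the master is completely immersed in what he is learning, and by the time he comes out he has usually mastered some advanced skill.

Is the effect of a retreat really that good? I thought of the several mathematical modeling contests I entered as an undergraduate. Each time I was almost cut off from the outside world, eating and sleeping in the computer room, completely absorbed in thinking and analysis in front of the computer. My classmates joked that I had gone into "retreat." After three or four days with very little sleep, when I "came out," I was usually so tired that I needed a long sleep.

Having read martial arts novels and adding my own experience, I have to admit that a "retreat" really is important.

A "retreat" simply means shielding yourself from all distractions for a period of time and immersing yourself completely in the task at hand. This state is not necessarily very efficient, but divergent thinking, after wandering for a fairly long time, often produces useful results.

This morning in the artificial intelligence class, the teacher announced the scope of the exam, which is seven days away. The neural networks exam is three days away; computation theory has entered a 17-day countdown; algorithms and data mining probably have only about 20 days left each. With so many exams bearing down at once, it seems I need a bit of "retreat" too.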

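Because I cannot be sure where the reference materials I need during review will be, I need to stay in the lab and study wholeheartedly. So my "retreat" rules (mainly rules for using the computer) are as follows:
1. Time is limited, so I will not use the 24-hour, no-rest "full retreat" mode; my daily routine stays the same.
2. At the start of each morning, afternoon, and evening session I open my communication software (MSN, QQ, Outlook, and so on) once; the rest of the time it all stays closed.
3. Rest for 10 minutes every hour, mainly by gazing outdoors or walking one lap around the sports field.
4. Keep my work notebook open and jot down stray ideas as they come, to be dealt with outside retreat hours.
5. Concentrate on studying, play working music such as Bandari at low volume where appropriate, and turn off the monitor.
6. The retreat runs from today until the last exam is over (around May 14).
7. Each day's daytime is divided into three sessions; the retreat takes up no more than two of them, and the remaining one is for lab tasks.
8. Haven't thought of it yet ^-^


Heh, looking at the eight rules above, I suppose this is a "retreat" tailored to the specific situation :)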

April 22, 2005

Earth Day!

Today is Earth Day! Google has a fitting logo today. The Earth is just like our mother! Mother's Day is coming soon, on May 8.

Today I filled in two big forms and joined three discussions. A little tiring! Fortunately, I have finished them, so I can pay more attention to the courses I have exams in.




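April 21, 2005

Depth-First Search: the "Half-Cooked Rice Never Becomes Cooked Rice" Phenomenon in Learning

For the past few days I have been reading Artificial Intelligence: A Modern Approach. The process of studying was like the heat of boiled water, gradually cooling off; this is what people often call "three-minute enthusiasm." I read the first few chapters very carefully; in later chapters I began to skip pages; and then, for chapters I seemed to know something about, I began to skip sections or even whole chapters. Of the later chapters I read only the parts I knew nothing about and the parts I was very interested in. In this way I "finished" a book of nearly a thousand pages in a few days by swallowing it whole.

Closing the book and thinking carefully: what had this book, which I had always regarded as a classic, actually taught me? What followed was a boundless "depth-first search"; I even thought about the question during my afternoon nap. Eventually a "depth-limited search" (limited because my time in this world so far is limited :) ) found something my music teacher said in class in my first year of junior middle school. I no longer remember the exact words (human memory is complex: I cannot remember it clearly, yet I can still retell the gist. Remarkable!), but the gist was this: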


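"In this class we will learn the song such-and-such. First, will the students who have already learned this song please raise their hands."
Several students raised their hands.
"Good. Now, will the students who just raised their hands please completely forget that you have learned this song. In practice, many students learn songs as 'half-cooked rice.' If you do not put yourself in the position of knowing nothing and learn the song from scratch, in the end you will never learn it properly, just as half-cooked rice never becomes properly cooked no matter how long you boil it."
Everyone burst out laughing.
"Good. Let's start with the score. One, two, three, ready, go..."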



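Practice proved our teacher right. Everyone learned very quickly, and the students who already knew a little soon corrected some of their inaccurate pronunciation.

Connecting this with the subjects I am studying now, the conclusion is obvious. The courses we take overlap heavily: machine learning and data mining repeat a great deal, for example, and artificial intelligence overlaps with both. The same thing happens in our everyday study of natural language processing. When we study, we tend to judge in advance whether we have learned something before, and hearing it again may annoy us. It is as if we went back to first grade to learn pinyin and characters: now and then we would notice some pronunciation we had been getting wrong, but the vast majority would be things we already knew. I think hardly anyone in that situation could study all of pinyin conscientiously from beginning to end.

So how do we solve this problem? My music teacher already had the answer. When you want to study a book or a course seriously, you need to put yourself at zero and assume you know nothing. What you know nothing about obviously needs careful study; what you already know something about needs careful and patient study too, because different books reflect different people's understanding of the same problem, and understanding something from several angles gives a better result.

I call this process "re-check." The "re" step temporarily clears the brain's registers to zero, so that you do not study with "half-cooked rice" in your head. The "check" step is the normal process of learning with an open mind.

One situation may arise. When studying a very thick book or listening to a very long talk, you can start from the "zero state," but as time goes on, "half-cooked rice" appears at some new point. I think the way to avoid this is to study one relatively focused topic in each time slot and to clear to zero at the start of each. Of course, dividing material into topics takes a lot of skill too. That question is left for another discussion.

Heh, today's depth-first search found my junior middle school days. Who knows, maybe one day it will find my primary school days :)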

April 20, 2005

Postponing the machine learning group activity

I was told that Mr. Qinghua Hu had gone to America for an international conference. Congratulations to him! However, he cannot come back by this weekend. The original plan of our machine learning group was to discuss rough set theory and decision trees this weekend. When I heard Mr. Hu's news, I began to change the plan.

I wanted to present decision trees alone this weekend. But I found I had too many things to do and could not give the presentation. So I would like to postpone the activity.

The original plan for our machine learning group was to discuss some classic and new machine learning algorithms. But so far, our activities have consisted only of presentations. I do not think this works well. Speakers are not easy to invite, and the discussions have not been effective. This problem should be solved in the near future.

April 19, 2005

Mining the information around you

These days I began reading books for our exams, and I had a new feeling.
I'd like to share it with you, as follows:

Yes. Every day we come across so much information that we cannot understand or master it all. For instance, I have so many books around me, but I have not read them all. AI: A Modern Approach is an excellent book that I have recommended in my blog. However, I had not read it through before. When I read it, I found so much useful information for our study and research.

I thought more about this example. I was wasting my books and electronic publications. I should mine them first, and then study other things.

April 18, 2005

Bandari

In my opinion, Bandari makes very nice music. I had collected many of their tracks from Victor. In recent months I have listened to them while working, programming, and thinking. I call them my working music, and I like them very much.

This afternoon, when I started Winamp and searched for Bandari in the gaea music library, I found that there were many other Bandari songs I had never listened to. I deleted my original Bandari music library and downloaded them from gaea one by one. In the end I collected as many as 260 Bandari songs. I found the other Bandari songs beautiful too.

Nice feeling!

April 17, 2005

Hammers and nails

We do research just as we use hammers on nails.

Just for fun, I collected many hammers and nails, as follows.
I think we should be very familiar with the characteristics and particularities of each hammer, and try to understand the situation of each nail. For our NLP & IR research, the hammers are all kinds of machine learning methods and mathematics. The nails are the concrete problems of NLP and IR.


My last three presentations of this term

This afternoon, I gave my last three presentations of this term in our lab. First, Sheqi and I introduced our own experiences. Then I introduced decision tree theory and the C45R8 library. The third one covered the XML plan in detail.

After them, I felt a little more at ease. I have given so many presentations in recent days. I should summarize and think more about them. Yes, there are many things to pay attention to when presenting.

I will draw a mind map about it.

April 16, 2005

Machine Learning Group

This was the second activity of our Machine Learning Group. This time I had planned four presentations. Car gave us a wonderful presentation, "Overview: Machine Learning for NLP." Quietsea introduced Libsvm in detail. Because of the time limit, heitu and I did not give our presentations on roseta and decision trees.

It was successful. However, I found some shortcomings in today's activity.

First, I did not register each participant myself. That would have been a good chance for me to get to know everyone's situation.

Second, I did not manage the timetable well. I had not planned for running over time.

Many warmhearted friends gave me advice and suggestions. Thanks to them. I will prepare an even better activity for next weekend.

April 15, 2005

Fruitful and tiring preparation for presentations

In recent months I have found myself sinking into a sea of presentations. Last month I presented eight times. This month I have presented three times so far, and three more are coming.

Although it is very good practice for my presentation skills and for summing up my ideas, I feel a little tired. For each presentation, I try my best.

Tomorrow morning I will give a presentation to our machine learning group. Let me try again!

April 14, 2005

Summarizing yourself

We have so many kinds of summaries: annual, quarterly, monthly, weekly, and daily. We are familiar with them. But have you ever summarized your work over many years? Do you see yourself clearly? Usually I have little time to think about such questions. Now I have the chance. This weekend, Shiqi and I will give presentations summarizing our work since we joined IR.

When I started to prepare the presentation, I had no idea what to say. Maybe I could share some useful experience with the newcomers to our lab, introduce the projects I have worked on, or cover some machine learning topics. I had no clear idea.

Recently I spent some time learning mind mapping. I believed it was useful for everyone. But after some consideration, I was not sure whether all of us would be interested in it.

It was hard to decide. I wrote down everything I could think of in MindManager. When I arranged the ideas in MindManager, I found the flow for my presentation. Yeah. Mind mapping is interesting, novel, and powerful.

Thanks to mind maps.

April 13, 2005

Scientific American

After many days of asking and being told that Scientific American had not arrived, I finally managed to buy it this afternoon.

It is a nice magazine for me, but I have little time to read it. This evening, when I came back to our apartment, it was 23:00. I chose to read it in the apartment's reading room.

I found it so good that it held my attention completely. This was the first time I had read it seriously. It is April now, but the Science Review column concentrated on February. At first I did not understand this and even suspected it was an error. When I turned to the front pages, I found that although the Chinese edition is the April issue, the original edition was from Feb. 2005. So I understood that translating the original edition takes two months.

As I read it in detail, I found myself thinking beyond the content of the articles. At first I hesitated to write comments directly on the pages. But as I had more ideas, I could not help writing directly on the pages. I believe that when I review it, I will think even more.

Yes. Reading alone is not enough. Thinking is the most important part of reading. I think so.

April 12, 2005

Exams are drawing near

We have five courses this term. All of them will have exams near the end of this month, so our busiest period is coming.

I want to do a survey on data stream mining as homework for the data mining course. It's an interesting subject. I don't know much about it beyond the basic idea. This afternoon, in the data mining class, I asked Mrs. Gao. She gave me some suggestions about this direction. The well-known research groups are at Wisconsin University, Stanford University, and Cornell University.

I believe I can submit an excellent piece of homework.

April 11, 2005

To live is to function

To live is to function. This is a nice sentence from my friend niwenjie.

Yesterday, he (or she, I don't know) sent me an email with some advice. He found some grammatical errors in my diary entries and suggested that I study English more.

This is the first time that somebody has given me advice on my blog. Yeah. As you know, my English is not good. So far I have not passed the CET-6 test, even though I have tried five times. I used to think of English as only a tool, so I did not treat it as important enough. In my daily schedule I left very little time for studying English.

After reading his advice, I am going to change my view of English. I am going to study English more.

Every day I write a diary entry in my blog. This is a habit of my daily life. But I didn't pay much attention to its quality. That was wrong, I now think. Everybody knows that writing a diary is good practice for improving English. But this only works if you keep correcting the errors in your entries. I should do so.

Thanks to niwenjie!

April 10, 2005

Grey System

I have not studied or discussed grey systems for a long time. Yesterday I found that a friend had commented on my entry about Grey System in our Machine Learning discussion board. His ID is stochasticpr. From his comments I learned that the Journal of Grey System is now indexed by SCI. That's a piece of good news.

We discussed more by email. In my judgment, he is good at grey systems and has done a lot of work on them.

We both believe that grey system theory is going through a painful period right now. We'd like to discuss more.

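April 9, 2005

MindManager

When you discover that something is very good, you often feel an urge to learn it and use it.

Yesterday, after studying 《思维导图》 (Mind Maps), I found and tried out the Inspiration 7 software. The first thing I did in the lab today was to make today's plan with Inspiration 7. While listing my current personal tasks I experienced the power of mind maps firsthand: an idea that had not seemed remarkable expanded, through my own divergent thinking, into a great many details. After completing several tasks according to the plan, I found it really very convenient.

As the saying goes, no investigation, no right to speak. In the evening, with a little free time, I opened some mind-mapping forums I had collected, wanting to learn how other people use mind maps. I saw many very refined mind maps, but their style was quite different from what I had made with FreeMind and Inspiration. Mind maps emphasize color and graphics, and the examples in the forums looked far better than the few I had made. After searching carefully I learned that they were all made with MindManager.

Others describe MindManager's advantages as follows:
MindManager is professional mind-mapping software. MindManager is very quick to pick up. MindManager supports conversion to and from Word and PowerPoint. MindManager is well suited to mind maps with many branches and complex structure.

After trying the software myself, I found it really, really good, practically perfect. In particular, its help documents are all very nicely made videos. Another day I will post my own work to share with everyone.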

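April 8, 2005

Notes and Thinking

This morning, while tidying the folders on my computer desktop, I found a PDF I had come across while learning FreeMind: 《思维导图》 (Mind Maps). It is only 300 pages, and I felt excited as soon as I opened it.

The book describes the harm of our usual linear note-taking and the benefits of divergent thinking. With all kinds of examples it explains the significance of mind maps, which have divergent thinking at their core. One of the book's authors describes some simple but highly effective ways he used mind maps in study and research. Mind maps made him a generalist in research, and he also finished his doctoral thesis quickly during his PhD. A passage by the author, Barry, reads as follows: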


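I soon realized that the problem of connecting thinking with writing was a major deciding factor in the success or failure of my fellow graduate students. Many failed to make the connection. They knew more and more about their research subject, yet became less and less able to pull the details together into a thesis, and more and more at a loss.
Mind maps put me in a very competitive position. They gave me the ability to organize my ideas and to deepen and refine them without repeating the time-consuming, laborious process of drafting and redrafting. Because thinking was separated from writing, I could think more clearly and much more broadly. By the time I started writing I already had a clear structure and a definite sense of direction, which made writing easier, faster, and more enjoyable.
I completed my doctoral thesis ahead of the prescribed three years, and also found time to write another book and a chapter, to help found and then edit a quarterly journal of international relations, to serve as associate editor of the student newspaper, to take part in motorcycle racing, and to get married (my fiancée and I drafted our wedding vows with a mind map). Because of these experiences, my enthusiasm for the creative-thinking side of the technique grew.
Mind maps have remained an important method in my academic work. They have let me be highly productive in writing books, articles, and conference papers. In a field where the weight of information is enormous and many people are forced to become specialists, they have let me remain a generalist. I also credit mind maps with my ability to write clearly about theoretical matters whose complexity often leaves people incoherent and unable to say what they mean. Their greatest effect on my career is perhaps reflected in the surprise people show when they first meet me: "You are much younger than I imagined. How did you write so much in so short a time?"
Having tasted so many benefits of mind maps in my own life and work, I became an advocate of them.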



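Something this good, a method this practical: why would it not be worth studying properly? So far my only use of mind maps has been one related presentation I made with FreeMind, and in that presentation I did not feel their great power. Today, after trying Inspiration 7.6, I decided to use this software at least once a day from now on, and to try to make mind maps my best helper in research and life.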


April 7, 2005

Football Match

Two weeks after the basketball match, our lab held a football match this morning. Our opponent was Wei He's class team.

The match started at 9:00. All of us, about twenty people, came to the field. As each side had only eight players, we took turns on the pitch. Prof.Tliu was also a member of our football team, and he is good at this sport. At the beginning, Wei He was the opposing goalkeeper. Our team's first goal was scored by Yu Hong. In the penalty area, facing Wei He, Yu Hong kicked the ball into the goal, straight between Wei He's legs. What an excellent goal it was.

The best player on our team was Jialun Deng. At first he was our goalkeeper. Relying on their skill, the opposing forwards shot again and again, and Jialun Deng turned the shots away one by one. We gave him the title of Iron Man. After several substitutions he became a forward. Facing the goalkeeper, he scored in the same way as Yu Hong. After so many matches, we have all found that he is a good player in every ball sport.

At 11:00 we finished the match. We were all happy.

Nice sports!

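April 6, 2005

ostream_iterator< int > ofile( cout, " ");

ostream_iterator
This is an iterator class.
ostream_iterator< int > ofile( cout, " ");
This declares an iterator that writes directly to cout (the output stream that is open by default). There is a space between the two quotation marks. It compiles for me, so I guess you left out the space. The string is the delimiter written to the stream after each element; other characters can be used as well.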

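#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;

int main(int argc, char* argv[])
{
    vector<int> v;
    vector<char> vc;
    // Each iterator writes to cout and appends its delimiter after every element.
    ostream_iterator< int > ofile( cout, ",");
    ostream_iterator< char > ofile2( cout, ";");
    for(int i=0;i<10;i++) v.push_back(i);
    for(int i=0;i<10;i++) vc.push_back('w');
    copy(v.begin(),v.end(),ofile);
    copy(vc.begin(),vc.end(),ofile2);
    // Assigning to an ostream_iterator writes the value followed by the delimiter.
    ofile=4;ofile2='m';
    return 0;
}
Let's see what this program outputs. Everything is written on a single line, because no newline is ever sent to cout; the output is split here by statement for readability:
0,1,2,3,4,5,6,7,8,9,
w;w;w;w;w;w;w;w;w;w;
4,m;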
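
The delimiter behaviour described above can be checked without looking at the console by pointing the iterator at a string stream instead of cout. This is only a minimal sketch: the helper name join_with_delimiter is hypothetical and is not part of the original post or of the standard library.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): writes every element of
// `values` through a std::ostream_iterator and returns the resulting text.
// The delimiter is written after EVERY element, including the last one.
template <class T>
std::string join_with_delimiter(const std::vector<T>& values,
                                const char* delimiter)
{
    std::ostringstream out;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<T>(out, delimiter));
    return out.str();
}
```

For a vector holding 0, 1, 2, join_with_delimiter(v, ",") yields "0,1,2," with a trailing comma, which matches the trailing "," and ";" in the output listed above.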

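Notes on the functions bind1st and bind2nd

Both bind1st and bind2nd convert a binary functor (bf) into a unary functor (uf). The conversion takes two arguments: bf and a value (v).

The value (v) is the fixed argument. In other words, uf(x) is equivalent to:
 * bf( x, v) - for bind2nd
 * bf( v, x) - for bind1st

bind1st and bind2nd are useful when working with predicates. These two functions convert a binary predicate into a unary predicate. This is especially useful when comparing each value in a range (for example, in a container) with a reference value. For example: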

std::vector< int> a;
// . . . fill a with values

// The following statement removes all elements less than 30
a.erase( std::remove_if( a.begin(), a.end(),
    std::bind2nd( std::less< int>(), 30)), a.end());

In most cases bind2nd is enough, as in the example above.
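
A note for readers on newer compilers: std::bind1st and std::bind2nd were deprecated in C++11 and removed in C++17, so the snippets in this entry need an older standard mode. The same uf(x) semantics can be sketched with lambdas. The helper names bind_first, bind_second, and erase_less_than below are hypothetical, chosen only to mirror the description above, and the sketch assumes C++14 or later.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helpers that mirror the semantics described above.
// bind_second(bf, v)(x) == bf(x, v)   (what bind2nd does)
// bind_first(bf, v)(x)  == bf(v, x)   (what bind1st does)
template <class BinaryFn, class V>
auto bind_second(BinaryFn bf, V v)
{
    return [bf, v](const auto& x) { return bf(x, v); };
}

template <class BinaryFn, class V>
auto bind_first(BinaryFn bf, V v)
{
    return [bf, v](const auto& x) { return bf(v, x); };
}

// Erase-remove idiom from the example above: drop every element < limit.
inline void erase_less_than(std::vector<int>& a, int limit)
{
    a.erase(std::remove_if(a.begin(), a.end(),
                           bind_second(std::less<int>(), limit)),
            a.end());
}
```

With a holding 10, 45, 30, 5, calling erase_less_than(a, 30) leaves 45 and 30, because only elements for which less(x, 30) is true are removed.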

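In generic programming, however, predicates are passed in as function objects. Usually only the ordering is specified, and it is expressed with less-than "<" (std::less< type>). Remember that the other comparisons, "<=", ">=", and ">", then have to be built from "<". This is where both bind1st and bind2nd become useful. For example: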

#include <algorithm>
#include <functional>
#include <iostream>

// Calls do_it on every element of [itFirst, itLast) that satisfies pred.
template< class iterator, class predicate, class doer>
void for_each_if( iterator itFirst, iterator itLast, predicate pred, doer do_it)
{
    while ( itFirst != itLast)
    {
        if ( pred( *itFirst)) do_it( *itFirst);
        ++itFirst;
    }
}

void print( int i) { std::cout << i << " "; }

int main(int argc, char* argv[])
{
    int aNumbers[] = { 10, 5, 89, 9, 30, -2, -8, 7, 33, 25, 30, 76, 0, 2};
    int nCount = sizeof( aNumbers) / sizeof( aNumbers[ 0]);

    // a < b: less( x, 30)
    std::cout << "\nNumbers less than 30: ";
    for_each_if( aNumbers, aNumbers + nCount,
        std::bind2nd( std::less< int>(), 30), print);

    // a > b: less( 30, x)
    std::cout << "\nNumbers bigger than 30: ";
    for_each_if( aNumbers, aNumbers + nCount,
        std::bind1st( std::less< int>(), 30), print);

    // a <= b <=> !(a > b)
    std::cout << "\nNumbers less than or equal to 30: ";
    for_each_if( aNumbers, aNumbers + nCount,
        std::not1( std::bind1st( std::less< int>(), 30)), print);

    // a >= b <=> !(a < b)
    std::cout << "\nNumbers bigger than or equal to 30: ";
    for_each_if( aNumbers, aNumbers + nCount,
        std::not1( std::bind2nd( std::less< int>(), 30)), print);

    return 0;
}
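
The four cases in the program above all come from a single strict ordering. As a minimal sketch of that idea, the struct below packages the four comparisons against a fixed pivot. The name ComparedTo and its member names are hypothetical; they are not from the original post or from the standard library.

```cpp
#include <functional>

// Hypothetical helper: given only a strict ordering `less` and a pivot
// value, expose all four comparisons against that pivot.
template <class T, class Less = std::less<T> >
struct ComparedTo
{
    T pivot;
    Less less;

    bool below(const T& x) const    { return less(x, pivot); }   // x <  pivot
    bool above(const T& x) const    { return less(pivot, x); }   // x >  pivot
    bool at_most(const T& x) const  { return !less(pivot, x); }  // x <= pivot
    bool at_least(const T& x) const { return !less(x, pivot); }  // x >= pivot
};
```

For example, with ComparedTo<int> c = {30, std::less<int>()}, c.at_most(30) is true and c.above(30) is false, matching the not1(bind1st(...)) and bind1st(...) cases above.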

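Below is an example of a generic function that removes all elements less than or equal to a minimum value or greater than or equal to a maximum value:

// Removes all elements satisfying 'x <= least' or 'x >= biggest'.
// pred must be an adaptable binary predicate that behaves like '<'.
template< class iterator, class value_type, class predicate>
iterator remove_least_and_biggest(
    iterator itFirst, iterator itLast,
    value_type least, value_type biggest, predicate pred)
{
    // Remove all elements with x <= least, i.e. !(least < x)
    iterator itAfterRemovingLeast =
        std::remove_if( itFirst, itLast,
            std::not1( std::bind1st( pred, least)));
    // Remove all elements with x >= biggest, i.e. !(x < biggest)
    iterator itNewLast =
        std::remove_if( itFirst, itAfterRemovingLeast,
            std::not1( std::bind2nd( pred, biggest)));
    return itNewLast;
}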

For case-insensitive string comparison, the following code can be used:

// Must behave like a case-insensitive '<'. The arguments are taken by value
// so that std::ptr_fun and bind1st/bind2nd store std::string copies rather
// than references to temporaries.
bool case_insensitive( std::string first, std::string second)
{ /* code */ }

std::string aStrs[] = { "john", "John Doe", "Mircea", "nicole",
    "Nicole Kidman", "Abraham", "Zeek" };
int n = sizeof( aStrs) / sizeof( aStrs[ 0]);
std::vector< std::string> a( aStrs, aStrs + n);
std::copy( a.begin(), a.end(),
    std::ostream_iterator< std::string>( std::cout, ", "));
std::cout << std::endl;
// Removes every element other than "John Doe", "Mircea", "nicole"
a.erase( remove_least_and_biggest(
    a.begin(), a.end(), "John", "nicole kidman",
    std::ptr_fun(case_insensitive)), a.end());
std::copy( a.begin(), a.end(),
    std::ostream_iterator< std::string>( std::cout, ", "));
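
The body of the comparison is left out in the snippet above. One possible way to write such a comparison is sketched here, together with a lambda-based version of the range filter. This is an assumption, not the original author's code: the names case_insensitive_less and keep_strictly_between are hypothetical, and the comparison only lowercases single bytes with std::tolower, so it is suitable for plain ASCII names like those in the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the elided comparison: a case-insensitive '<'
// for ASCII strings, comparing character by character after lowercasing.
inline bool case_insensitive_less(const std::string& first,
                                  const std::string& second)
{
    return std::lexicographical_compare(
        first.begin(), first.end(), second.begin(), second.end(),
        [](unsigned char a, unsigned char b) {
            return std::tolower(a) < std::tolower(b);
        });
}

// Keeps only the strings strictly between `least` and `biggest`
// under case_insensitive_less; everything else is removed.
inline std::vector<std::string> keep_strictly_between(
    std::vector<std::string> names,
    const std::string& least, const std::string& biggest)
{
    names.erase(
        std::remove_if(names.begin(), names.end(),
            [&](const std::string& x) {
                return !case_insensitive_less(least, x)      // x <= least
                    || !case_insensitive_less(x, biggest);   // x >= biggest
            }),
        names.end());
    return names;
}
```

Applied to the seven names above with the bounds "John" and "nicole kidman", this keeps "John Doe", "Mircea", and "nicole", in that order, which agrees with the comment in the original snippet.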


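April 5, 2005

FSNLP Chapter 5

On Friday I will present Chapter 5 of FSNLP. I have had too many tasks lately and have had to finish the relatively easy and urgent ones first.

Yesterday afternoon I woke from a nap in my dorm room and wanted to find a place to study and start reading Chapter 5 of FSNLP. As I picked up my bag to go out, I realized I might as well study in the dorm. So I started reading quietly by myself. After about an hour and a half I was fired up and felt an urge to start making slides right away.

At about 3:40 I came to the lab, put on my working music, Bandari, and started writing slides. As of just now my slides are finished; I checked them and found basically no problems. All that is left is to go over them again on Friday morning.

As in yesterday's blog entry, reading a book once and making slides to present it to others are completely different things. When making slides you have to keep the audience in mind and think about how to present the content. You also have to take in the book's content carefully.

I had heard of the benefits of copying a book out by hand, and I once hand-copied the first volume of the English edition of FSNLP. It now seems that the most effective thing is to read the material and then make slides to introduce it to everyone. I have had the same feeling several times when presenting papers at lab meetings and in the reading group.

I thought of a clumsy method: from now on, when I find literature that is really worth reading, I will read it once and then make slides and try presenting it to everyone.

Good idea. The clumsiest method is also the most effective.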

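April 4, 2005

Coreference Disambiguation

While preparing Friday's presentation on Chapter 5 of FSNLP, I thought of a question about coreference disambiguation.

The main topic of Chapter 5 is collocations. The end of the chapter mentions the recognition of proper nouns, which poses some major challenges: coreference (how can we tell that IBM and International Business Machines refer to the same entity?) and disambiguation (when does AMEX refer to the American Exchange, and when to American Express?).

This reminded me of an abbreviation in Teacher Lu's talk at yesterday's lab seminar: TCL. When people first see this word, the first thing they think of is the electronics company "王牌高频电子有限公司", but in Teacher Lu's talk TCL was short for Thai Computational Linguistic (a Thai linguistics research institute). TCL here is just like the AMEX example in FSNLP.

I thought about this carefully. For someone who has never heard that TCL can refer to the Thai linguistics institute, TCL simply refers to the electronics company. That is purely the problem that coreference resolution has to solve, or more specifically, coreference resolution for abbreviations. But once one has heard that TCL can also refer to the Thai linguistics institute, the problem is different. According to FSNLP, it becomes a disambiguation problem. But what kind of disambiguation? At first I thought it was coreference disambiguation, because there are really two possible referents. I then searched the web to check my idea (entering "指代消歧", "coreference disambiguation", or "anaphora disambiguation") but found none of the information I needed. I think coreference disambiguation should be a fairly deep topic within the study of coreference resolution.

I discussed my idea with Teacher Lu, who works on word sense disambiguation in our lab. Teacher Lu said that when determining what "he" refers to in a context, the candidate answers may be just a few personal names. Determining what "he" refers to is in itself very similar to the TCL question I raised.

I think that, following Teacher Lu's hint, the problems are indeed the same. But, as FSNLP says, AMEX has two meanings (the expansion of an abbreviation can also be seen as a meaning): American Exchange and American Express. If the task is to determine what AMEX means in a given context, then it is a word sense disambiguation problem.

This problem needs further thought. To be continued.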

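April 3, 2005

End-of-Month Summary

It is time for the end-of-month summary again. Writing this kind of document always feels like combing through my life. Combing through is necessary; people should summarize themselves regularly. As Teacher Zhou Ming once told us in our university's auditorium: "Summarize yourself regularly, find the reasons for your successes, draw lessons from your failures, and strive to do even better in what you do next."

Before "combing through," I had only a vague impression of my work in March. I just felt that I was busy all day long, in a disorganized way. Only after looking closely at the talks I gave and the tasks I finished during the month did I get a clear picture of what I had done in March. It turned out I had done quite a lot. To some extent this eased my feeling of having wasted time.

A month is not very long. Four busy and repetitive work weeks can pass without one's noticing. Yet a month of the same length, the winter vacation for example, passes very slowly. It always seems to me that the clock changes speed with the festivals of the year. Time is a strange thing.

The end-of-month summary also has to include the plan for the next month. While drawing up next month's plan, I suddenly noticed that several talks were already on my schedule. Adding the research tasks of the group I lead and my own graduate courses, I feel that my April 2005 will be an even busier month.

What is busyness? I keep reflecting on this question. Busyness is really a double-edged sword. Some people are busy and achieve nothing, some die from overwork, and some experience and enjoy life in their busyness. Different habits and different lives lead to different kinds of busyness. Busyness needs to be given order, and that is where planning comes in.

Speaking of planning, I think of my own shortcomings here. When I used to make plans, I always packed the schedule full, covering several dozen days at a time. But plans cannot keep up with changes, and few of my plans were ever carried through to the end. This naturally raises the question of how to make a plan, which is in fact a deep subject. After all this, combining my own experience with what others have learned, I think the following scheme is workable for me:

At the start of each month, write the summary of the previous month, including the monthly plan; at each weekend, write a weekly summary and make the plan for the next week; every evening, write the blog and make the plan for the next day. Each summary should begin by reviewing the plan for the previous month, week, or day, finding the reasons for unfinished tasks, working out solutions, and adjusting the plan as needed.

My earlier plans lacked the most crucial part: monitoring how well the plan is carried out.

Busy April has arrived. With a sound scheme for monitoring my plans, I believe I will be more fulfilled and will no longer have that vague feeling.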

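April 2, 2005

Graduate Courses

In the blink of an eye it is April. According to the graduate teaching plan, our first-year courses end this month, and the exams for each course are held at the beginning of next month.

Since the term started, in my busy life I have chosen the strategy of listening carefully in every class and reviewing hard in the final month. That final month has now arrived, and I need to start preparing seriously for the homework and exams of every course.

Time: tight. Tasks: many. A detailed and thorough plan is urgently needed.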

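April 1, 2005

A First Look at Genetic Algorithms

At this afternoon's regular meeting of the lab's TS group, Li Zhenghua, an IR club student I supervise, gave a talk titled "A First Look at Genetic Algorithms." The talk was rich in content, and everyone came away with a deeper understanding of genetic algorithms.

Watching Li Zhenghua's excellent talk, I could see that he has developed a strong interest in machine learning. His passion for machine learning has, to some extent, had a big influence on me too. We learn from each other!

I wish him success in genetic algorithms.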