July 31, 2005

Reading(2): Language research, application, evaluation and analysis

I remembered a story about how to be successful. It said: write down today's tasks, order them by importance, finish the first one, and keep going, and you will be successful. I believe the reason is that once you finish the most important thing, you feel excited and have the passion and energy to finish the others easily.

Following this principle, I listed my tasks today. Because I am in the important process of building my reading habit, I believed the reading task was the most difficult for me.

Now I have finished the first phase of my reading task. I would like to write a summary and note what I gained.
------------------------------------------------------------
Pages: 1~10 of Natural Language Understanding, second edition, by James F. Allen, 1995

Language research:
There are four main kinds of language researchers: linguists, psycholinguists, philosophers, and computational linguists. Given my background, I belong to the last group. For computational linguists, the typical research problems are how to identify the structure of a sentence, how to model knowledge and reasoning, and how to use language to accomplish specific tasks. Our main tools are algorithms, data structures, formal models of representation and reasoning, and artificial intelligence techniques (search and representation methods). I think I have mastered most of the tools but am familiar with only a few of the typical problems. I wish I were well versed in the problems. After all, the problems are the most important part of any research.

Language research application:
There are two categories of natural language understanding applications: text-based and dialogue-based. The former concern the processing of text, such as books, newspapers, reports, handbooks, and email. They are all reading-based tasks. Typical applications of this type are finding particular topics in a text database, extracting information from messages and articles, translating documents, and summarization. Their main problem is constructing a representation of the information in the text that can then be used for reasoning.
Dialogue-based applications are more about communication between human and computer, such as question answering systems, automated customer service over the telephone, tutoring systems, spoken language control of machines, and collaborative problem-solving systems. They are all based on spoken or keyboard-based interaction.

Language understanding system evaluation:
System evaluation has two main types: black-box and white-box testing. Evaluation is very important for NLU and NLP research; if you cannot construct an evaluation, you cannot practically start your research. Black-box testing should be relied on only after the system performs well in white-box testing. This is a very important principle. For example, ELIZA, the famous program that plays a psychotherapist, has no real intelligence but can look impressive in a black-box test. It is based on keywords, so if you give it nonsensical sentences that contain the keywords, it answers just the same.
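
To make the keyword point concrete, here is a minimal sketch of a keyword-matching responder. It is not ELIZA's actual script: the keywords, the replies, and the function name `reply` are all made up for illustration, and the real program also transforms the user's own words.

```python
import re

# Hypothetical keyword rules for illustration; the real ELIZA script is much richer.
RULES = [
    ("mother", "Tell me more about your family."),
    ("sad", "Why do you feel sad?"),
    ("dream", "What does that dream suggest to you?"),
]
DEFAULT_REPLY = "Please go on."


def reply(utterance: str) -> str:
    """Return the canned reply for the first rule whose keyword occurs in the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for keyword, response in RULES:
        if keyword in words:
            return response
    return DEFAULT_REPLY


if __name__ == "__main__":
    # A sensible sentence and a scrambled one get the same answer,
    # because only the keyword matters.
    print(reply("I had a fight with my mother yesterday."))
    print(reply("Mother banana the of green quickly."))
```

A black-box test that only feeds in sensible sentences would never see the difference, which is why looking inside the system matters first.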

Language analysis:
There are three layers of language analysis: syntax, semantics, and pragmatics.
Syntax considers how words are arranged into correct sentences, the role each word plays in a sentence, and the relations between phrases.
Semantics studies how the meanings of individual words combine into the meaning of the whole sentence. It is the study of context-free sentence meaning.
Pragmatics deals with how the same sentence is used in different contexts and how the context influences the sentence's meaning.
Beyond these three basic layers, there are two main aspects of context: discourse and world knowledge. Research on these two is currently active. Anaphora resolution and coreference resolution belong to this kind.
------------------------------------------------------------

July 30, 2005

Reading(1): NLU vs. NLP

I finished my first day of reading Natural Language Understanding, Second Edition, by James F. Allen.
After reading the first ten pages (from the title page to the end of the table of contents), I had already collected a lot of information.
First, the author gave me an introduction to the difference between NLP and NLU and described the gap between them. NLU (Natural Language Understanding) emphasizes the importance of interpreting meaning and intent in order to achieve "deep" understanding of what is said. NLP (Natural Language Processing) refers to any techniques that are used to process linguistic information, and currently involves little semantics and intent. While NLP can serve many useful tasks, such as helping design better search engines, better information retrieval, and rough summarization and rough translation of documents, it so far has little to suggest about meaning and intent. This book attempts to combine the best NLP techniques with work connected with meaning, understanding, intent, reasoning, and acting. It describes how recent advances in statistical language processing can be used to advantage in natural language understanding.

This book was written in 1995, ten years ago. When James Allen wrote it, there was a rift forming in the language processing community: work on language understanding and work on statistical methods were considered to be competing, incompatible methodologies. But researchers working on statistical methods are increasingly turning their attention to semantics and meaning, and work in language understanding is increasingly using statistical techniques to improve algorithms, ranging from parsing to intent recognition.

So the two factions are merging into one. Having studied in the Computer Department since my undergraduate years, I believe I belong to the faction working on statistical methods. So I need to read this book.

From Carl, I learned to look up information about a publication's author. The following link is his homepage: http://www.cs.rochester.edu/u/www/u/james
His publications on the DBLP Bibliography Server are at this link: http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/a/Allen:James_F=.html
He is famous and has done a lot of work on dialogue, which is my current survey topic. I hope I can learn something useful from him.

How to keep reading every day

This morning, when I came to MSRA, I moved to the fifth floor. The environment is better than on 4F. We are in cubicles, just like in our lab, so we cannot see many people at a glance. Because they are in other groups, my friends Xiaoyuan Cui, Huo Yong, Long Jiang, and Ke Wu stayed on 4F.

The first thing today was to set up my computer and sort my materials. When I found the book Natural Language Understanding, it made me think.

Yes. We are in an age of abundant information. We can get so many good books and wonderful papers. But in this same computer age, we are nearly submerged by daily information. The result is that we do not read our good books enough. There is a gap.

How to fill it? I know of a great scientist who read ten pages every day. By his old age, he had read a great many books. I believe once more that if you keep doing something every day, you will succeed at it. Nothing can stop you.

In my daily life, reading ten pages is not difficult, but I have not kept up the habit. How do you keep a habit? I have some experience. First do it today. Then do it again tomorrow. After a week, you will have the habit. If the habit does not suit you, after a week you will drop it.

So my current goal is to read ten pages every day this week. The reading material is the book Natural Language Understanding, Second Edition.

July 29, 2005

Say goodbye to 4F

This afternoon, we are saying goodbye to 4F of Sigma.
Our environment on 4F looks like the following picture:

Since we came here, we have stayed for two and a half months.
This evening, we will move to 5F.
New place, new feeling!

July 28, 2005

Best papers on ACL and EMNLP

EMNLP2002 http://ufal.ms.mff.cuni.cz/~hajic/emnlp02/best.html
--------------
Michael Collins.
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms.
EMNLP 2002. (Received Best Paper Award.)
(This paper includes theorems and proofs which apply to the algorithms in the ACL 2002 papers.)
-----------------
Frank Keller(1), Maria Lapata(1), Olga Ourioupina(2)
(1) University of Edinburgh, United Kingdom and (2) University of Saarland, Germany
Using the Web to Overcome Data Sparseness
-----------------

EMNLP2003 http://people.csail.mit.edu/mcollins/emnlp03.all.html
Training Connectionist Models for the Structured Language Model
Peng Xu, Ahmad Emami and Frederick Jelinek

EMNLP2004
Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Christopher Manning. Max-Margin Parsing. EMNLP 2004. (Received Best Paper Award.)

ACL2001
"Fast Decoding and Optimal Decoding for Machine Translation" (U. Germann, M. Jahr, K. Knight, D. Marcu, and K. Yamada), Proc. of the Conference of the Association for Computational Linguistics (ACL-2001). ACL Best Paper award.

ACL2002
Franz Josef Och, Hermann Ney. 2002.
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation.
In "ACL 2002: Proc. of the 40th Annual Meeting of the Association for Computational Linguistics" (best paper award), pp. 295-302, Philadelphia, PA, July 2002.

ACL2003
----------------------
Yukiko Nakano, Gabe Reinstein, Tom Stocky and Justine Cassell. Towards a Model of Face-to-Face Grounding. ACL
----------------------
Dan Klein, Best Paper Award, ACL 2003, for "Accurate Unlexicalized Parsing"
----------------------

ACL2004
The best paper prize was awarded to Diana McCarthy, Rob Koeling, Julie Weeds, & John Carroll for their paper "Finding Predominant Word Senses in Untagged Text". In this paper they develop a method for selecting the predominant sense of a word from a corpus without sense annotations, using unsupervised thesaurus extraction techniques. The paper was rated highly by the reviewers and PC on quality and innovativeness, and was selected also because it combines two important topics in our field: unsupervised learning and robust semantic analysis.

ACL2005
David Chiang, A Hierarchical Phrase-Based Model for Statistical Machine Translation. (Best paper award.) This paper takes the standard phrase-based MT model that is popular in our field (basically, translate a sentence by individually translating phrases and reordering them according to a complicated statistical model) and extends it to take into account hierarchy in phrases, so that you can learn things like “X ’s Y” -> “Y de X” in Chinese, where X and Y are arbitrary phrases. This takes a step toward linguistic syntax for MT, which our group is working strongly on, but doesn’t require any linguists to sit down and write out grammars or parse sentences.

July 27, 2005

Reading papers

How to read a paper? This is a very important issue for any researcher. When I was working in our laboratory, I adopted our paper reading outline, as follows:
-----------------------------------------------------------------
Title:
Source:
Publish date:
Authors:
Author’s organization:
Why did the authors do this research/project?
What about others' work on this topic?
What are the existing problems?
What new method/solution did the authors adopt?
What is the theoretical advantage of this method?
Experiment design:
Experiment result:
Experiment analysis:
The remaining problems:
My ideas/criticism to this paper/work:
-----------------------------------------------------------------

Each time I read a paper, I would fill a lot of information into a text file. I found this inefficient. Now I use BiblioExpress to help me fill in the related information. I read a paper, note its key point in a single sentence, and then sort it into one of three folders according to its usefulness for my task. The three folders are named: direct to my idea about my task, indirect to my idea but related to my task, and useless for my task.
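
The one-sentence note plus three folders can also be kept as a small data structure. The following is only a sketch of that idea in Python, not how BiblioExpress stores anything; the names `Relevance`, `PaperNote`, and `sort_into_folders`, and the placeholder titles, are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Relevance(Enum):
    # The three folders described above.
    DIRECT = "direct to my idea about my task"
    INDIRECT = "indirect to my idea but related to my task"
    USELESS = "useless for my task"


@dataclass
class PaperNote:
    title: str
    key_point: str        # the one-sentence note
    relevance: Relevance  # which folder the paper goes into


def sort_into_folders(notes):
    """Group paper titles by folder, keeping the order in which they were read."""
    folders = defaultdict(list)
    for note in notes:
        folders[note.relevance].append(note.title)
    return dict(folders)


if __name__ == "__main__":
    # Placeholder titles and notes, not real papers.
    notes = [
        PaperNote("Paper A", "Placeholder key point.", Relevance.DIRECT),
        PaperNote("Paper B", "Placeholder key point.", Relevance.USELESS),
        PaperNote("Paper C", "Placeholder key point.", Relevance.DIRECT),
    ]
    for relevance, titles in sort_into_folders(notes).items():
        print(relevance.value, titles)
```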

I believe that after reading my collected papers, I will have a nice hierarchy of papers for my task.

After reading, we must write a research proposal. I have collected two classic ones to share with you:
Proposal for Collaboration Task on Evaluation of Spoken (MM) Dialogue Systems
PhD Research Proposal: Dynamic software updates within ad-hoc connected environments.

July 26, 2005

Constructing the related papers library

Constructing a papers library without any tool is a terrible process. I believe so now, having done surveys several times.

I remember that when I surveyed coreference resolution, I simply collected all the related papers and put them in one folder without any relations or index. So when I finished collecting, I was nearly lost when facing them.

A fall into the pit, a gain in your wit. This time, surveying dialogue modeling and management, I recorded each paper's information in detail and organized it with BiblioExpress in Biblioscape format. If you enter the information items one by one, I believe you will not last ten papers. So I recommend one useful trick.

Search Google for "filetype:bib" together with the title of your paper, and you can find BibTeX-format information for most papers. If that returns no correct result, search for "@inproceedings" or "@article" together with the paper title. Copy the BibTeX entry and import it automatically with a tool such as EndNote or Biblioscape. Then copy the corresponding PDF or PS file into the attachment folder and add the paper's relative path to your library.
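
If you have many titles, the query strings can be generated instead of typed. This is a small sketch under my own assumptions: the function names `bibtex_queries` and `search_url` are invented, and the search URL pattern is the ordinary Google search address, not an official API.

```python
from typing import List
from urllib.parse import quote_plus


def bibtex_queries(title: str) -> List[str]:
    """Build the three queries described above, most specific first."""
    quoted = f'"{title}"'
    return [
        f"filetype:bib {quoted}",
        f'"@inproceedings" {quoted}',
        f'"@article" {quoted}',
    ]


def search_url(query: str) -> str:
    """Turn a query into a link to open in a browser (assumed URL pattern)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for query in bibtex_queries("Max-Margin Parsing"):
        print(query)
        print(search_url(query))
```

Try the queries in order and stop at the first one that returns a correct BibTeX entry.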

After this process, you can easily build a wonderful paper library. Then you should read the papers. How to read papers is another very important issue for a researcher. I will share my experience later.


July 25, 2005

Surveying with tools

How do you do a survey? I think the first step is to collect as many related materials as you need. How to store and organize your materials is another important matter, since you can download as many files as you want, but I believe that after downloading them you do not have a clear picture of them.
So I recommend a useful tool for research: Biblioscape. As I cannot use the commercial edition, I chose its free edition, BiblioExpress. It is enough for collecting and organizing your research information and for using it when writing in Word.
I think that after my survey on this topic I should write a good report on "How to do an efficient and fruitful survey and write with the tools Biblioscape and BiblioExpress".

These skills are very important to a researcher. All of us should master them.

July 24, 2005

Nice movie: Quill



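Quill, the film said to have moved 100 million viewers across Asia
January 8, 2005, Beijing Yule Xinbao (北京娱乐信报)
Chinese title: 导盲犬小Q
English title: Quill
Director: Yoichi Sai (崔洋一)
Main cast: Kaoru Kobayashi (小林熏), Kippei Shiina (椎名桔平), Teruyuki Kagawa (香川照之)
Genre: Drama / Animal
Running time: 100 minutes

Quill: what moves you is behind the film

Quill is billed as having moved 100 million viewers across Asia, but as a film it offers nothing new: the well-worn affection between a person and a dog, the familiar style of Japanese films, and plain acting. What leaves the deepest impression in this hour-and-a-half film is the first 15 minutes, from the newborn, bleary-eyed Quill to his wobbly exploration of the world, and from his play with a teddy bear to the sad parting at one year old. The Quill in the film is like a child on whom hopes are placed, cared for and protected by parents. And we, too, get a taste of the joy of raising him.

Film is a distillation of life, and films that feel good leave out many things that would not feel good. For example, when you sleep alongside five puppies, you are not shown that two of them may relieve themselves on you; after five puppies have finished playing with a roll of toilet paper, imagine who has to clean up the mess, not to mention spending money at the supermarket to buy more. Do you like gardening? Do you think puppies only hop around in the garden now and then? Still, I cannot tell whether the Japanese attitude toward puppies goes too far. Take the "puppy walker" father, who tiptoes in after work to check whether Quill is sleeping well. It is exactly how parents treat a child, and it makes you wonder whether the film is deliberately sentimental. The first half tells the story of Quill growing up and keeps its eyes on Quill throughout; the second half brings in the blind man Watanabe and spends a good deal of time on his character, leaving Quill as a bystander who merely goes along. The silent Quill is reduced to an ordinary guide dog and seems all the more pitiable next to the talkative Watanabe. Nor does the film have many scenes that show a close bond between the two. The most moving moment is no more than Watanabe's reaction when they are reunited at the training center.

What is really moving about Quill is the crew's earnest attitude toward the film. You can imagine how many difficulties come with shooting a film whose lead is a dog, and how much patience the crew needed. For example, one scene shows Quill going back to sleep after being woken by the "puppy walker" father. In a single shot you can see Quill's eyes slowly close. Think how hard that shot was to get, and how long it takes to get a puppy used to the camera (don't tell me they used drugs). Another is the close-up of Quill going cross-eyed at a caterpillar on his nose; another is the anthropomorphic shot of Quill asleep on the grass, dreaming of the teddy bear and then waking with a start and looking around. Almost all the memorable shots come in the first 15 minutes; the further the film goes, the fewer there are, and the flatter it feels, including the two deaths, Watanabe's and Quill's.

The film uses a voice-over to narrate Quill's birth, growth, and training, and everything with Watanabe. Only midway through do you realize that the speaker is Mr. Watanabe's elder daughter. To be strict about it, she did not live through Quill's life before he came to her family; she only heard about it secondhand and cannot vouch for it. That greatly discounts the credibility of the first half. I think it would have been better to use the head trainer's voice-over in the first half and Mr. Watanabe's elder daughter's in the second; at least it would leave fewer questions in mind.

Quill is still worth watching; go and enjoy a bit of the feeling of raising a puppy!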

July 23, 2005

Whole day rain

This summer in Beijing there has been little rain. Today we finally got some. It felt so cool. When I walked along the street to MSRA, the water came up over my soles. Yeah, such a nice feeling. Tomorrow we will watch a movie. I'm glad about it!

July 22, 2005

How to search the papers citing one paper

This afternoon I ran into a problem. I had a paper titled "An Effective Conversational Agent with User Modeling Based on Bayesian Network", and I wanted to find which papers cited it.

First I used CiteSeer, but the page only told me which papers it cited. Then I tried my second option: Google Scholar. In Google Scholar there was a link to the citing papers: on the second retrieved result of the page, luckily, there was a "Cited by 3" link. This was what I needed. I found the three papers that cited the paper.

Google Scholar really seems to be a good tool for scholars. I am fond of it. Maybe there are other, better tools. I should read the Google Scholar help in detail. If you know of one, please tell me. Thanks in advance. If I find one, I will share it with you.

July 21, 2005

Special Materials for ChatterBot

After two days of collecting, I have found some of the most useful links for ChatterBots. I think anybody looking for material about ChatterBots or chatbots will need the following links, so I am sharing them with you:

Chatterbot FAQ
Forums of Chatterbots
The Chatterbot Collection
Documentation for Chatterbots
The Chatterbox Challenge
Chatterbot References

July 20, 2005

16 persons joined in our ping pong club this evening

It's good news for our club, I think. Originally, I thought only seven of us would play ping pong this evening. We seven went to BUAA together from the MSRA 4F hall. But at the entrance we met three more, and in the ping pong room we met six more. Aha! This is the record for our ping pong club.
Maybe my presentation at Winddown last week had a positive effect on our club. I'd like more and more people to take part in our activity. Indeed, some of us have kept the habit of playing every Wednesday.

July 19, 2005

Dialogue Model

I think dialogue modelling is difficult, even though I have taken part in many mathematical modelling contests, because there are so many uncertain factors in dialogue.
I have collected three papers for my survey on dialogue models.

1. Title: Bridging the Gap Between Dialogue Management and Dialogue Models
Source: Proceedings of the Third SIGdial Workshop on Discourse and Dialogue,Philadelphia, July 2002, pp. 201-210. Association for Computational Linguistics.
Authors: Weiqun Xu and Bo Xu and Taiyi Huang and Hairong Xia
Organization: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science, Beijing, 100080, P. R. China

2. Title: Probabilistic Dialogue Modelling
Source: Proceedings of the Third SIGdial Workshop on Discourse and Dialogue, Philadelphia, July 2002, pp. 125-128. Association for Computational Linguistics
Authors: Oliver Lemon(1), Prashant Parikh(2), Stanley Peters(1)
Organization: Stanford University(1), University of Pennsylvania(2)

3. Title: Survey of the State of the Art in Human Language Technology
Source: Cambridge Studies in Natural Language Processing Series; Vol. XII-XIII
Pages: 513
Editors: Ron Cole, Joseph Mariani, Hans Uszkoreit, Giovanni Batista Varile, Annie Zaenen, Antonio Zampolli, Victor Zue
Year of Publication: 1997
Chapter 6.2 Discourse Modeling
Chapter 6.3 Dialogue Modeling
URL: http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html


------------------------------------------------------------------------------------------
I have read Chapter 6 of the third one. My reading notes are as follows:

Chapter 6 Discourse and Dialogue
6.1 Overview
The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?
Computational work in discourse has focused on two different types of discourse: extended texts and dialogues, both spoken and written. Although there are clear overlaps between these---dialogues contain text-like sequences spoken by a single individual and texts may contain dialogues---the current state of the art leads research to focus on different questions for each. In addition, application opportunities and needs are different.
Text and dialogue have, however, two significant commonalities. First is a discourse segment. The segment boundaries need to be detected. Second, discourse research on the interpretation of referring expressions, including pronouns and definite descriptions, and the event reference aspect of verb phrase interpretation also is relevant to both text and dialogue.

6.2 Discourse Modeling
6.2.1 Overview: Discourse and Dialogue
Current approaches to discourse and dialogue from the field of artificial intelligence and computational linguistics are based on four predominant theories of discourse which emerged in the mid- to late-eighties:
[Hobbs1985]:
A theory of discourse coherence based on a small, limited set of coherence relations, applied recursively to discourse segments. This is part of a larger, still-developing theory of the relations between text interpretation and belief systems.
[Grosz and Sidner 1986]:
A tripartite organization of discourse structure according to the focus of attention of the speaker (the attentional state), the structure of the speaker's purposes (the intentional structure) and the structure of sequences of utterances (the linguistic structure); each of these three constituents deal with different aspects of the discourse.
[Mann and Thompson (1987) ]:
A hierarchical organization of text spans, where each span is either the nucleus (central) or satellite (support) of one of a set of discourse relations. This approach is commonly known as Rhetorical Structure Theory (RST).
[McKeown (1985) ]:
A hierarchical organization of discourse around fixed schemata which guarantee coherence and which drive content selection in generation.

[Hobbs1985] and [Grosz and Sidner 1986] are suitable for natural language processing. However, [Mann and Thompson (1987) ] and [McKeown (1985) ] are more appropriate for natural language generation.
One important aspect of dialogues is that the successive utterances which make it up are often interconnected by cross references of various sorts. (Anaphora resolution or coreference resolution)
6.2.2 Discourse Representation Theory(DRT)
Discourse Representation Theory (DRT) (cf. [Kam81,KR93]), a semantic theory developed for the express purpose of representing and computing trans-sentential anaphora and other forms of text cohesion, thus offers itself as a natural semantic framework for the design of sophisticated dialogue systems. DRT has already been used in the design of a number of question-answering systems, some of them of considerable sophistication.
DRT is being used as the semantic representation formalism in VERBMOBIL [Wah93], a project to develop a machine translation system for face-to-face spoken dialogue funded by the German Department of Science and Technology. Here the aim is to integrate DRT-like semantics with the various kinds of pragmatic information that are needed for translation purposes.

There are many implemented systems for discourse understanding and generation. Most involve hybrid approaches, selectively exploiting the power of existing theories. Available systems for handling dialogue tend either to have sophisticated discourse generation coupled to a crude discourse understanding systems or vice versa; attempts at full dialogue systems are only now beginning to appear.

6.3 Dialogue Modeling
6.3.1 Research Goals
Two related, but at times conflicting, research goals are often adopted by researchers of dialogue. First is the goal of developing a theory of dialogue, including, at least, a theory of cooperative task-oriented dialogue. A second research goal is to develop algorithms and procedures to support a computer's participation in a cooperative dialogue.
In general, no consensus exists on the appropriate research goals, methodologies, and evaluation procedures for modeling dialogue.

Three approaches to modeling dialogue---dialogue grammars, plan-based models of dialogue, and joint action theories of dialogue---will be discussed, both from theoretical and practical perspectives.
6.3.2 Dialogue Grammars
This approach is based on the observation that there exist a number of sequencing regularities in dialogue, termed adjacency pairs [SSJ78], describing such facts as that questions are generally followed by answers, proposals by acceptances, etc.
The rules state sequential and hierarchical constraints on acceptable dialogues, just as syntactic grammar rules state constraints on grammatically acceptable strings. The terminal elements of these rules are typically illocutionary act names [Aus62,Sea69], such as request, reply, offer, question, answer, propose, accept, reject, etc. The non-terminals describe various stages of the specific type of dialogue being modeled [SC75], such as initiating, reacting, and evaluating.
Just as syntactic grammar rules can be used in parsing sentences, it is often thought that dialogue grammar rules can be used in parsing the structure of dialogues. With a bottom-up parser and top-down prediction, it is expected that such dialogue grammar rules can predict the set of possible next elements in the sequence, given a prior sequence [GWF90].
The speech acts become the state transition labels. When the state machine variant of a dialogue grammar is used as a control mechanism for a dialogue system, the system first recognizes the user's speech act from the utterance, makes the appropriate transition, and then chooses one of the outgoing arcs to determine the appropriate response to supply. When the system performs an action, it makes the relevant transition, and uses the outgoing arcs from the resulting state to predict the type of response to expect from the user.
In summary, dialogue grammars are a potentially useful computational tool to express simple regularities of dialogue behavior. However, they need to function in concert with more powerful plan-based approaches (described below) in order to provide the input data, and to choose a cooperative system response. As a theory, dialogue grammars are unsatisfying as they provide no explanation of the behavior they describe, i.e., why the actions occur where they do, why they fit together into a unit, etc.
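
To see how the state machine variant works, here is a minimal sketch in Python. The states and speech acts are hypothetical examples chosen by me, not taken from the survey, and a real system would first need a recognizer that maps an utterance to a speech act.

```python
# Hypothetical states and speech acts, chosen only for illustration.
TRANSITIONS = {
    ("start", "question"): "awaiting_answer",
    ("awaiting_answer", "answer"): "start",
    ("start", "propose"): "awaiting_decision",
    ("awaiting_decision", "accept"): "start",
    ("awaiting_decision", "reject"): "start",
}


def step(state, act):
    """Follow the arc labeled with the speech act; fail if the grammar has none."""
    try:
        return TRANSITIONS[(state, act)]
    except KeyError:
        raise ValueError(f"act {act!r} is not allowed in state {state!r}")


def expected_acts(state):
    """Predict which speech acts may come next, from the outgoing arcs."""
    return sorted(act for (source, act) in TRANSITIONS if source == state)


def accepts(acts):
    """Check whether a whole sequence of speech acts is an acceptable dialogue."""
    state = "start"
    for act in acts:
        try:
            state = step(state, act)
        except ValueError:
            return False
    return state == "start"


if __name__ == "__main__":
    print(expected_acts("awaiting_decision"))
    print(accepts(["question", "answer", "propose", "accept"]))
    print(accepts(["question", "accept"]))
```

The table records only which act may follow which. It says nothing about why, which is exactly the weakness noted in the summary above.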

6.3.3 Plan-based Models of Dialogue
Plan-based models are founded on the observation that utterances are not simply strings of words, but rather are the observable performance of communicative actions, or speech acts [Sea69], such as requesting, informing, warning, suggesting, and confirming.
Plan-based theories of communicative action and dialogue [AP80,App85,Car90,CL90,CP79,PA80,Sad91,SI81] assume that the speaker's speech acts are part of a plan, and the listener's job is to uncover and respond appropriately to the underlying plan, rather than just to the utterance.
For example, in response to a customer's question of "Where are the steaks you advertised?", a butcher's reply of "How many do you want?" is appropriate because the butcher has discovered that the customer's plan of getting steaks himself is going to fail. Being cooperative, he attempts to execute a plan to achieve the customer's higher-level goal of having steaks.
A major accomplishment of plan-based theories of dialogue is to offer a generalization in which dialogue can be treated as a special case of other rational noncommunicative behavior. The primary elements are accounts of planning and plan-recognition, which employ various inference rules, action definitions, models of the mental states of the participants, and expectations of likely goals and actions in the context. The set of actions may include speech acts, whose execution affects the beliefs, goals, commitments, and intentions of the conversants. Importantly, this model of cooperative dialogue solves problems of indirect speech acts as a side-effect [PA80].
Drawbacks of the Plan-based Approach
Illocutionary Act Recognition is Redundant
Discourse versus Domain Plans
Complexity of Inference
Lack of a Theoretical Base

6.3.4 Joint Action Theories of Dialogue
Plan-based approaches that model dialogue simply as a product of the interaction of plan generators and recognizers working in synchrony and harmony, do not explain why addressees ask clarification questions, why they confirm, or even, why they do not simply walk away during a conversation. A new theory of conversation is emerging in which dialogue is regarded as a joint activity, something that agents do together [CWG86,CL91b,GS90,GK93,Loc94,Sch81,Suc87]. The joint action model claims that both parties to a dialogue are responsible for sustaining it. Participating in a dialogue requires the conversants to have at least a joint commitment to understand one another, and these commitments motivate the clarifications and confirmations so frequent in ordinary conversation.

6.3.5 Future Directions
Typical areas in which such models are distinguished from individual plan-based models are dealing with reference and confirmations. Clark and colleagues [CWG86,Cla89] have argued that actual referring behavior cannot be adequately modeled by the simple notion that speakers simply provide noun phrases and listeners identify the referents. Rather, both parties offer noun phrases, refine previous ones, correct misidentifications, etc. They claim that people appear to be following the strategy of minimizing the joint effort involved in successfully referring. Computer models of referring based on this analysis are beginning to be developed [HH92,Edm93]. Theoretical models of joint action [CL91b,CL91a] have been shown to minimize the overall team effort in dynamic, uncertain worlds [JM92]. Thus, if a more general theory of joint action can be applied to dialogue as a special case, an explanation for numerous dialogue phenomena (such as collaboration on reference, confirmations, etc.) will be derivable. Furthermore, such a theory offers the possibility for providing a specification of what dialogue participants should do, which could be used to guide and evaluate dialogue management components for spoken language systems. Finally, future work in this area can also form the basis for protocols for communication among intelligent software agents.

July 18, 2005

Personal Knowledge Management

Yeah. This is the knowledge age. We are all nearly submerged by information and knowledge, so we need personal knowledge management.
I recommend this paper to you: Personal Knowledge Management: Who, What, Why, When, Where, How? It was written by Jason Frand and Carol Hixon in December 1999. I believe it is a good article for you, although it is a little long.

July 17, 2005

What do you really want to do

Confucius said: "If a man is not far-sighted, then suffering will be close to him." Yes, I believe I am close to being such a man.

I remember that when I was a child, somebody asked me what I was going to be. At that time, I said I wanted to be a scientist and invent many things.
When I was in middle school, I said I could be a scientist or a computer expert. And when I was an undergraduate, I said I would like to be a researcher. After I began to study for my master's and doctoral degrees, somebody asked me what I wanted to do after my doctorate. This time, I said I did not know. Maybe I would work in a company, or teach in a school, or continue studying abroad, or nothing. From my junior year until now, I have been very interested in machine learning. Prof. Liu said that I could therefore be a researcher and do some useful research on machine learning and natural language processing. Yes. I think it is a better choice for me.

Yes. I have not had any life plan of my own. Now I am twenty-four years old, and I must think more about this topic. Maybe being a researcher in NLP and machine learning is better for me.

The most important thing, I think, as it is for others, is what you really want to do.

2005年7月16日

Madagascar



Yeah. This is a new cartoon made by DreamWorks. During the 80 minutes, we all laughed from the beginning to the end. It's a great cartoon, I think. I believe it is like Ice Age: both make people laugh and tell us some philosophy. Their computer animation skills are so excellent that they can display so many good scenes.

We six IR guys and our three NLC friends enjoyed this nice cartoon in the morning. So nice!

2005年7月15日

John's farewell

This noon, John said his farewell to us. This was his last day at MSRA. All of us NLC VSs and John had lunch at the Siji Yuan restaurant. There, all sixteen of us sat at one big table. During the lunch, we talked about many things. I first got to know John on our visit to LongQingXia. At that time, we all played the Killer game in the car. John sat beside me. Each time, he said so few words that many times we could not recognize him as a killer. That day, we all enjoyed ourselves very much.
This evening, Johnny Cui invited us to dinner. He did best in the slogan competition at Winedown yesterday. All fifteen of us went to a Sichuan tofu pudding restaurant. This time the fifteen of us sat at two big tables, but we pulled them very close so that we could all see each other. Although we were at two tables, we ordered just one of each dish, so during dinner we kept passing dishes between the two tables. It was very interesting. After supper, we began to play the Killer game again. We played three rounds. I played the policeman role twice. By cooperating, we policemen won the games.
How time flew! When we looked at the clock, we found it was half past nine.
When we returned to MSRA, Gallen told us many things about the American Killer game. It was very interesting.

2005年7月14日

MSRA Winedown

What is Winedown? I did not know before this afternoon.
Gallen, an American student who sits beside me, told me that Winedown came from MSR in Redmond, USA. It's a part of MS culture. This is the third Winedown at MSRA. The last two were last year and in March this year. This afternoon, from 16:00, Eileen hosted the Winedown. First, she briefly reviewed the past two MSRA Winedowns. In the club showcase, as a leader, I presented the Ping Pong Club and invited more friends to play with us. I used a chat format to display our information. I think it was interesting and attractive.
With help from my friends and me, Xiaoyuan Cui won the Best Slogan in our character-filling game. This format was very interesting.
Finally, there was a birthday party to celebrate those whose birthdays fall in this month. Jizhou Huang, our project member, was one of them.
We all felt happy and relaxed during the two-hour Winedown. Thanks to Eileen, Haibin, Yu Ge, Jixu Chen and so many volunteers.

2005年7月13日

Ping Pong Activity

This is the sixth time of our Ping Pong Club activity. I am very glad to have invited Jixu Chen to take photos for us this evening. The past five times, we all played in a good mood. However, tomorrow is the third Winedown, and each club will show some pictures. So I asked Jixu for help.

It suddenly rained when we returned to BUAA, and we walked to the BUAA ping pong room in the rain. After the rain, the air was very cool, and we felt much better.

After about one hour, we began to play doubles. Cui and I were one pair; Jun Xu and Cui's friend were the other. We were excited. We played the final game at 19:17.

Thanks to Jixu, we got the pictures of our Ping Pong Club.

2005年7月12日

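[collection] Guiding principles for the usability design and acceptance of WEB interaction interfaces

Guiding principles for the usability design and acceptance of WEB interaction interfaces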





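With the rapid growth of corporate intranets and the international Internet, more and more workflows, business transactions, education, training, conferences and lectures, as well as personal consumption and entertainment, have moved onto the so-called World Wide Web (hereafter WEB). Correspondingly, interactive operations have become more and more complex.

As the Browser/Server model grows in popularity, many operations are completed on web pages inside a browser. Broken links and unexpected errors are not the only things that annoy operators: even a complete, successful operation can be an unpleasant experience if it is too cumbersome or inconvenient to carry out.

This article sets out some guiding principles for designing WEB interaction pages, principles that help avoid unpleasant operating experiences. They are user-friendliness principles: for a given operation, they describe the WEB interface design that feels easiest, simplest and most comfortable to the user. We assume the WEB pages under discussion function correctly and are aesthetically sound. Note that the principles may conflict with aesthetic views of design and with existing functional designs. When that happens, on the view that "what is practical is beautiful", we suggest giving up the original aesthetic view or functional design as appropriate.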


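1. Auto-focus the input controls and allow the input focus to be switched from the keyboard

Use JavaScript to focus the first input control as soon as the page has loaded. The TAB key (IE's default behavior) or the arrow keys can be used to move focus to the next input control.

Input controls are the visible elements of a WEB page form (<form>) that the user is expected to modify or edit. Without auto-focus, the user inevitably has to position the mouse once (and if the user was typing beforehand, or needs to type after positioning, this is in effect a switch between keyboard and mouse). If typing is needed after positioning and the input focus cannot be moved from the keyboard, then every change of focus requires another keyboard-mouse switch, which is tedious.

If the page focuses the first input control on load and the input focus can be moved from the keyboard, the user may complete the whole page without touching the mouse, or with only a few mouse operations. This is a convenience; frequent switching between keyboard and mouse is tiring, after all.

For a dialog box or web page with input fields, the control focus should by default be placed on the field awaiting input. If the content of the fields normally does not need changing, the focus should go straight to the "OK" button. Tab and Shift+Tab should switch between the fields, and "OK" and "Cancel" should be the last stops of the tab order, regardless of where they are placed on the page.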
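
A minimal sketch of principle 1, not taken from the original article. The helper name `firstEditableControl` and the list `NON_EDITABLE_TYPES` are hypothetical; controls are treated as plain objects with `type`, `disabled` and `readOnly` fields (mirroring the DOM properties) so that the selection logic can be tried outside a browser. The browser wiring at the end assumes the page contains at least one form.

```javascript
// Assumed list of control types the user does not type into.
var NON_EDITABLE_TYPES = ["hidden", "submit", "button", "reset", "image", "fieldset"];

// Hypothetical helper: return the first control the user is expected to edit,
// skipping hidden, disabled and read-only controls.
function firstEditableControl(controls) {
  for (var i = 0; i < controls.length; i++) {
    var control = controls[i];
    var editableType = NON_EDITABLE_TYPES.indexOf(control.type) === -1;
    if (editableType && !control.disabled && !control.readOnly) {
      return control;
    }
  }
  return null; // nothing to edit: the caller may focus the "OK" button instead
}

// Browser wiring, skipped when no DOM is available.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.onload = function () {
    var form = document.forms[0];
    if (!form) { return; }
    var target = firstEditableControl(form.elements);
    if (target) { target.focus(); }
  };
}
```

Keeping the choice of control in a separate function makes it possible to fall back to the "OK" button when the function returns null, as the paragraph above suggests.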


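2. Allow submission with the Enter key (or Ctrl+Enter), and make sure it has the same effect as clicking the submit button

Do not put JavaScript such as onClick="…" on the submit button.

Submitting a page with the Enter key is a natural extension of principle 1, and browsers support it by default. It is listed separately because some designers build pages on which it does not work properly, so that submitting with Enter and submitting by clicking "OK" have different effects. In most cases the designer has put code such as onClick="…" on the "OK" button: clicking the button runs a piece of JavaScript, for example setting the values of some input elements of type hidden, whereas submitting with Enter does not run that code.

The correct approach is to move that code into the <form> tag, attached through the onSubmit="…" attribute.

A <textarea> element consumes the Enter key, so submitting with Enter does not work there. JavaScript can be added to catch the Ctrl+Enter combination and call the form's submit() method as soon as it is caught. Where submissions are frequent, for example on a BBS, such code is well worth having.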
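
The Ctrl+Enter idea can be sketched as follows. This is an illustration under assumptions, not the article's own code: `isCtrlEnter` and `submitOnCtrlEnter` are hypothetical names, and the handler is attached through the `onkeydown` property. Because calling `form.submit()` from script does not run the form's `onsubmit` handler, the sketch calls that handler itself first so that the behavior matches a click on the submit button.

```javascript
// True when a keydown event is Ctrl+Enter. keyCode 13 is Enter;
// `key` is also checked for browsers that provide it.
function isCtrlEnter(event) {
  var isEnter = event.key === "Enter" || event.keyCode === 13;
  return Boolean(event.ctrlKey) && isEnter;
}

// Hypothetical wiring: submit the owning form when Ctrl+Enter is pressed
// inside the given textarea.
function submitOnCtrlEnter(textarea) {
  textarea.onkeydown = function (event) {
    event = event || window.event; // old IE exposes the event globally
    if (isCtrlEnter(event)) {
      var form = textarea.form;
      // form.submit() skips onsubmit, so run it here and respect a "false" result.
      if (!form.onsubmit || form.onsubmit() !== false) {
        form.submit();
      }
      return false; // do not insert a line break
    }
    return true;
  };
}
```

In a page this might be used as `submitOnCtrlEnter(document.getElementById("message"))`, where the element id is made up for the example.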


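3. Mouse action hints and feedback

When the user moves the mouse over a position that responds to clicks, give a visual or audible hint.

The simplest form of feedback is the mouse icon turning into a hand. Browsers do this automatically only for HTML tags that have an href attribute. For tags without an href attribute (or with none set), JavaScript can set the cursor of the style attribute to hand (an IE-specific value; the standard CSS value is pointer).

A change in the target area is a more active form of response. When the pointer moves over the target area, a change of pointer shape or text color greatly reduces the attention the user must spend locating it. Add intuitive graphics to buttons and make buttons as large as possible. Keep a suitable distance between buttons: too close, and the user has to work to tell their boundaries apart to avoid mistakes; too far, and the user has to work to find them.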


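4. Validate input data on the client as early as possible

The validity of input data should be checked on the client with JavaScript. Unless a check can only be done on the server, it should be done at the earliest point possible.

Validating on the client avoids a round trip to the server, during which the user has to wait. If the user waits a long time only for the server to report an error that could have been detected at input time, the design is unfriendly. Password length limits, the characters allowed in a user name and the like should clearly be checked on the client before submission.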
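
A small sketch of such a client-side check. The concrete rules below (3 to 16 letters, digits or underscores for the user name; 6 to 20 characters for the password) and the function name `validateRegistration` are assumptions made for the example; the article does not specify them.

```javascript
// Assumed rules, for illustration only.
var USERNAME_PATTERN = /^[A-Za-z0-9_]{3,16}$/;
var PASSWORD_MIN = 6;
var PASSWORD_MAX = 20;

// Returns the names of the fields that fail validation (empty array when valid).
function validateRegistration(fields) {
  var errors = [];
  if (!USERNAME_PATTERN.test(fields.username || "")) {
    errors.push("username");
  }
  var password = fields.password || "";
  if (password.length < PASSWORD_MIN || password.length > PASSWORD_MAX) {
    errors.push("password");
  }
  return errors;
}
```

Following principle 2, a check like this would be called from the form's onSubmit handler, returning false when the array is not empty. It saves the user a round trip but does not replace the same checks on the server.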


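5. Decide from the usage scenario whether to use an intermediate transition page between the form page and the page returned after submission

Depending on the usage scenario, decide whether to display the form-receiving page (the intermediate transition page between the form page and the page returned after submission), and how to display it.

The form page and the form-receiving page are the cooperating pair on which most WEB interactions rely. The following aspects should be considered when designing the relationship between them.

First, where operations are frequent, convenience and speed call for as few server-client exchanges as possible, so the intermediate transition page should be avoided. After submission, return directly to the original form page or to a default page. Data safety and recoverability need to be considered in this case.

If the user's input is invalid and must be re-entered, the simplest design is to drop the intermediate page and display the error message directly on the original form page. The user only has to correct the input according to the message. This does add a little programming work: the form-receiving page has to include the content of the original form page, and every input field has to be set, by server-side code or client-side JavaScript, to the value the user entered. To speed up development, the form page and the form-receiving page can be implemented by the same server-side script page, which does the work of the two original pages as follows:

Page script initialization
Check whether the "submit" variable is set
    Set: validate the data
        Validation passes -> business logic processing -> return to a specific page by including it or by redirecting
        Validation fails -> save the data the user entered -> leave submission handling and enter the form page flow
    Not set: run the form page flow; if there is user input saved by the submission flow, display it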
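
The flow above can be sketched as a single function. This is a framework-neutral illustration, not the article's code: the function name `handleFormRequest`, the shape of the `request` object and the returned `view` values are all assumptions, and a real page would be written in whatever server-side scripting environment is in use.

```javascript
// request:  { submitted: boolean, fields: object }  (assumed shape)
// validate: function(fields) -> array of error strings, empty when valid
// process:  function(fields) carrying out the business logic
function handleFormRequest(request, validate, process) {
  if (request.submitted) {
    var errors = validate(request.fields);
    if (errors.length === 0) {
      process(request.fields);
      // Return to a specific page, by inclusion or by redirect.
      return { view: "result" };
    }
    // Validation failed: keep what the user typed and fall through to the form.
    return { view: "form", values: request.fields, errors: errors };
  }
  // First visit: show an empty form.
  return { view: "form", values: {}, errors: [] };
}
```

The point of the design is the middle branch: because the same script renders the form, the values the user entered can be put straight back into the fields together with the error messages.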


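Returning to the specific page by including it avoids a client-side redirect, and is somewhat faster and more stable than one. In some cases, though, variable name clashes in the code or other reasons make inclusion inconvenient; then a server-side redirect can be used: the Server.Transfer method in ASP, or the RequestDispatcher.forward() method in Java Servlets. Do not use client-side HTTP redirects such as Response.Redirect or HttpServletResponse.sendRedirect(). Doing without an intermediate transition page also means the user cannot go back to the form page already filled in, because the same URL is used. That is why saving the user's input when validation fails is essential.

Another problem with omitting the intermediate transition page is that returning by inclusion or server-side redirect breaks the one-to-one correspondence between URL and page content. A user who visits the returned page directly through this URL (for example from a bookmark) will find that it leads to the form page, not the result page wanted. So removing the intermediate transition page does blur the relationship between URL and content, and is unsuitable where URL and page content must correspond one to one.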


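Second, from a technical point of view, an intermediate transition page guarantees the one-to-one correspondence between URL and page content and simplifies page development.

To make sure page content is always tied to a fixed URL, a client-side redirect must be used:

form page --(submit)--> form-receiving page --(business logic processing)--> display the result (intermediate transition page) --> client-side redirect to a specific page

Client-side redirects come in several kinds:
(1) Redirect with an HTTP header, Location: http://www.netall.com.cn. This is the fastest: another page is fetched (GET) immediately while the window is still blank. It cannot actually display the processing result, so it can only be regarded as a compromise toward the fast approach described under the first point.
(2) HTML tag refresh, <META HTTP-EQUIV="Refresh" CONTENT="5;URL=http://www.netall.com.cn">. This is friendlier: the other page is visited after this page has finished loading. Many designers use it as a trick, placing a buffer page before a large page to spare the user a dull wait.
(3) JavaScript redirect. Because the redirect is controlled by code, it can be more flexible, for example steering the flow after an operation according to the user's habits.
(4) Passive redirect. Put a button or link on the page and let the user decide when to return to the specific page. This suits result pages that carry a lot of information, which the user needs to read carefully before deciding what to do next.

When an intermediate transition page is used, page expiry must not be used as well. Otherwise, when an error occurs and the form data has to be re-entered, the user cannot recover the previously entered data with the back button, unless the designer deliberately intends to prevent such recovery.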


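6. Prevent duplicate form submission

Gray out the submit button after it is clicked, so that the user does not submit the same form again when the network is slow. Use page expiry to stop the user from going back and submitting the form again.

Some complex applications take a long time to return a result, and on slow networks this happens even more often. Anxious users tend to click the submit button repeatedly, which is not what the designer wants.

The most direct remedy is to use JavaScript to disable and gray out the submit button once it is clicked (by principle 2, this code belongs in onSubmit="…" on the <form> tag). In addition, on the form page, a server-side script can set the HTTP Expires header to expire immediately, so that the user cannot restore the form page by going back. Note the possible cost: a user who has painstakingly typed a long text cannot recover it after a slip. Page expiry should therefore be avoided on pages that contain a <textarea> element.

A stricter approach, it should be said, is for the server-side script itself to withstand duplicate submissions, for example by assigning the form a unique ID or a verification code that becomes invalid after one use. Form handling should also be transactional: if the form is rejected, the changes already made can still be rolled back. In financial applications, submitting the same transaction twice is certainly not allowed. A party that profits from duplicate submission will always try to get around the browser's restrictions, so client-side techniques cannot be relied on.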
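
The "use once, then invalid" idea can be sketched with an in-memory store. Everything here is an assumption made for illustration: the name `createTokenStore`, the token format, and keeping the tokens in a plain object. A real server would keep tokens in the session or a database and would generate them with a cryptographically secure source rather than `Math.random()`.

```javascript
// Hypothetical one-time token store for form submissions.
function createTokenStore() {
  var issued = {};   // token -> true while the token is still unused
  var counter = 0;
  return {
    // Called when the form page is rendered; the token goes into a hidden field.
    issue: function () {
      counter += 1;
      var token = "form-" + counter + "-" + Math.random().toString(36).slice(2);
      issued[token] = true;
      return token;
    },
    // Called when the form is received; true only the first time a valid token arrives.
    consume: function (token) {
      if (issued[token] === true) {
        delete issued[token];
        return true;
      }
      return false;
    }
  };
}
```

A second submission of the same form carries the same token, so `consume` returns false and the server can refuse to process it, whatever the browser did or did not prevent.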


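7. When a link should open a new window, reuse the current window, or use a pop-up window

In general, links on the home page may open a new window with the target="_blank" attribute, while links on other pages should use the current window or a pop-up window. If the linked content is unimportant and subsidiary relative to the original page, a pop-up window can be used.

Normally the current window should be used, leaving the choice of whether to keep its content to the user, unless the designer believes the original page is so important that it remains useful after the click and should not be casually replaced or overwritten. In general only a home page is in that position: after opening one link from it, the user usually returns to open another. Examples are portal sites whose home page contains a great many links, and search engine result pages. The links on Google.com's result pages used to open in the current window; later, realizing that users come back to the page repeatedly, they switched to opening new windows. An ordinary site whose home page has few links need not use new windows; that is the user-friendly choice.

The extreme case is a new page whose content is so much less important than the original's that a new page is hardly needed at all. A pop-up window suits this case. JavaScript offers several kinds of pop-up. One is the window.open() function. There is a trick here: open a blank window with window.open() first, then replace it with the target page using location.replace(). This keeps the original page from becoming unresponsive while the new page opens. window.open() starts a new browser window process, so it consumes a fair amount of resources. Another is createPopup(), a method added in Microsoft's Dynamic HTML specification; it creates a borderless pop-up window and uses fewer system resources. A third is to simulate a pop-up page with a hidden <div> layer in the page. The content of the latter two can be filled in by JavaScript code. If a web page has to be downloaded as their content, the <download> tag from Microsoft's Dynamic HTML specification is needed.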


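8. Present as few options as possible and require as few steps as possible

Arrange as few menu options as possible, following users' habits, while also keeping the number of steps as small as possible.

Reducing menu items and steps without reducing functionality is user-friendly design, and it is hard to achieve. Start from the users and consider which operations they perform most often. Normally the operations a user needs fall into no more than 5 categories; if there are more, the design has surely failed to prioritize by user interest. It is hard to imagine one user having more than 5 strong centers of interest at once, and a user who clicks around casually is unlikely to interact deeply within any category. Each of the 5 categories may have several second-level operations. If these second-level items are invisible, two choices are needed to reach an operable page, which violates the principle of "as few steps as possible". Building the second-level menu with JavaScript, avoiding a server request, is somewhat better. If there are no more than about 20 second-level items in total, they might as well be displayed directly, for example in a single column down the left side, so that one choice reaches an operable item, which is clearer and more convenient.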


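9. Leave no loopholes in the operating logic, and keep data safe during operation

The logical relationships among operations across pages, and among multiple operations on one page, should be safe and rigorous by design. Make sure that disallowed combinations of user operations cannot occur, or at least that inappropriate operations do not cause errors.

The most typical example is the linked drop-down list widely used on pages, where the options allowed in one drop-down change with the choice made in another. Another example is enabling or disabling form elements according to a selection. If some validity logic must also hold across several pages, server-side scripts have to take part. This makes the form design depend on the sequence of operations, which is arguably not good design. Changing the order or grouping of steps can avoid the situation as far as possible.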
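
A sketch of the logic behind a linked drop-down, kept free of DOM calls so that it can be tried anywhere. The data in `CITIES_BY_REGION` and the function names are invented for the example.

```javascript
// Hypothetical data: the cities allowed for each region.
var CITIES_BY_REGION = {
  north: ["Beijing", "Harbin"],
  south: ["Haikou", "Guangzhou"]
};

// Options the second drop-down may offer for the chosen region.
function allowedCities(region) {
  var known = Object.prototype.hasOwnProperty.call(CITIES_BY_REGION, region);
  return known ? CITIES_BY_REGION[region].slice() : [];
}

// Keep the current city only if it is still allowed; otherwise fall back to
// the first allowed city, or to an empty value when there is none.
function reconcileCity(region, currentCity) {
  var cities = allowedCities(region);
  if (cities.indexOf(currentCity) !== -1) {
    return currentCity;
  }
  return cities.length > 0 ? cities[0] : "";
}
```

Running `reconcileCity` whenever the first drop-down changes means the form can never hold a disallowed combination, which is the guarantee this principle asks for.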


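The operating logic must ensure both that arbitrary input does not cause errors and that the data the user enters is handled safely. Typing a long text into a form under session control may cause a timeout error, often accompanied by a redirect, and the user's long input is wiped out. Using JavaScript to warn the user that the session has timed out, asking them to save the input and submit again, is a good remedy. Some form elements, such as <input type="text">, clear their data on the ESC key with no undo, which is also dangerous. Chinese input methods often use ESC to clear the code being typed, and one extra press of ESC makes the entered data disappear. It is therefore necessary to use JavaScript to disable ESC handling on <input> and <textarea>.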
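
One way the ESC suppression might look, as a sketch only. The names `isEscape` and `protectFromEscape` are hypothetical, and whether cancelling the key interferes with a particular input method is not something this sketch checks.

```javascript
// True when a keydown event is the Escape key (keyCode 27).
function isEscape(event) {
  return event.key === "Escape" || event.key === "Esc" || event.keyCode === 27;
}

// Hypothetical wiring: stop ESC from triggering the control's default action.
function protectFromEscape(control) {
  control.onkeydown = function (event) {
    event = event || window.event; // old IE exposes the event globally
    if (isEscape(event)) {
      if (event.preventDefault) { event.preventDefault(); }
      return false; // cancels the default action in property-style handlers
    }
    return true;
  };
}
```

It would be applied to each text input and textarea on the page, for example in the same onload handler that sets the initial focus.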


Revisiting Weka

This evening, when I read the paper Opinion Observer: Analyzing and Comparing Opinions, I found some related information. Suddenly, the word Weka came into view. Yes. Weka is a wonderful and powerful machine learning tool. I had spent some time on it but, frankly speaking, I knew little about it. Luckily, I had brought the book Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations.

When I visited the Weka website, I found some useful slides on how to use Weka. After I had read ten pages of them, I found I could not help reading more. Yeah. I think this is my favorite thing to do in my spare time. I could spend some time on it each day. I hope I can master Weka completely. Some day, I could do some high-quality research.

OK. Why not try Weka again? See you tomorrow; I will come back for evening practice with my MSRA friends.

2005年7月11日

New Tasks

From this afternoon I can do some knowledge base construction. I can temporarily say goodbye to asp.net.

OK. I have submitted the action items to Mr. Ming Zhou. I will begin doing NLP-related tasks again. I will try my best to do so ^_^

2005年7月10日

2005年7月9日

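IR gathering in Beijing

CR arrived in Beijing this morning, and all of us IR brothers in Beijing got together. Counting Shanglin, who has just come to Beijing to work, and Wang Zhen, who rushed over from his company, there were eight of us. We jokingly called ourselves the Eight Immortals crossing the sea.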

2005年7月8日

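CR has graduated

In the evening I received a text message: CR arrives in Beijing tomorrow. We are getting ready for the welcome. My idol has now graduated.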

2005年7月7日

Program <=> knowledge base

Continued practicing ASP.NET programming, merging the knowledge bases written by several people.

2005年7月6日

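Shanglin arrives in Beijing

Lately my MSN nickname has been "A fond farewell to IR's graduates". Parting is bitter; the feeling of saying goodbye to my brothers and sisters at undergraduate graduation is still vivid. This June and July I have been in Beijing, so I have not really felt the farewells of the lab's graduation. Luckily, Shanglin came to Beijing today.
Shanglin left Harbin last night, and reportedly more than ten people went to see him off. Seeing someone off is painful; Shanglin said he did not cry himself, but many of those seeing him off did. For Shiqi and me, welcoming Shanglin in Beijing was a happy occasion. A welcome, after all, is always cheerful.
Shanglin reported to Kingsoft in the morning. In the evening, after the three of us had dinner, we went to the center's housing and moved part of his luggage to our place at BUAA. With the trip there and back, we were busy until ten at night. Haha, it is good that Shanglin is here: there is now one more IR member in Beijing. I think that by the time IR sets up a Beijing branch, staffing will not be a problem ^_^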

2005年7月5日

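MSRA brothers

We had dinner downstairs at Siji Yuan: several of us NLC brothers and sisters who get along well day to day. Captain Liu Jingjing put the theme best: "Welcome Xiao Huang and Chen Yi back to the team". There were eight at our table: the captain, Brother Long, Dasheng, Xiao Cui, Xiao Chen, me, Shiqi and Xiao Huang. Haha, the Eight Immortals crossing the sea again.
Everyone enjoyed this meal very much. The chat ranged over every topic under the sun. It was the most enjoyable meal I have had since coming to MSRA.
Haha, thanks to my NLC brothers and sisters!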

2005年7月4日

[collection] Inserting characters at the cursor position in JS


<script language="Javascript">
function AddOnPos(obj, charvalue)
{
    // obj: the input box into which the characters are inserted
    // charvalue: the characters to insert

    obj.focus();
    var r = document.selection.createRange();
    var ctr = obj.createTextRange();
    var i;
    var s = obj.value;

    // The commented-out approach below works only in a single-line input box;
    // it has no effect in a multi-line textarea
    // r.setEndPoint("StartToStart", ctr);
    // i = r.text.length;
    // Get the cursor position ---- Start ----
    var ivalue = "&^asdjfls2FFFF325%$^&";
    r.text = ivalue;
    i = obj.value.indexOf(ivalue);
    r.moveStart("character", -ivalue.length);
    r.text = "";
    // Get the cursor position ---- End ----
    // Insert the characters
    obj.value = s.substr(0, i) + charvalue + s.substr(i, s.length);
    ctr.collapse(true);
    ctr.moveStart("character", i + charvalue.length);
    ctr.select();
}
</script>
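
The collected snippet relies on document.selection and createTextRange(), which are Internet Explorer interfaces. For comparison, here is a sketch of the same idea using the standard selectionStart and setSelectionRange members of input and textarea elements. It is not part of the collected code; the names `insertAtCaret` and `addOnPosStandard` are made up for the example.

```javascript
// Pure helper: replace the range [start, end) of `value` with `text` and
// report where the caret should land afterwards.
function insertAtCaret(value, start, end, text) {
  return {
    value: value.slice(0, start) + text + value.slice(end),
    caret: start + text.length
  };
}

// Wiring for an <input> or <textarea> that supports selectionStart.
// Like the collected snippet, it inserts at the start of the selection and
// keeps any selected text; pass obj.selectionEnd as `end` to replace it instead.
function addOnPosStandard(obj, charvalue) {
  obj.focus();
  var start = obj.selectionStart;
  var result = insertAtCaret(obj.value, start, start, charvalue);
  obj.value = result.value;
  obj.setSelectionRange(result.caret, result.caret);
}
```

Because the caret position is read directly, the trick of writing a marker string into the field and searching for it is no longer needed.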

2005年7月3日

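First anniversary of graduation gathering

In the blink of an eye it has been exactly one year since we finished our undergraduate studies. It is rare for several of us classmates to meet in the capital. Xiaozhu came all the way from Hainan for a week of training, Renqing and I are both here as interns, Shao Aihua is here on a three-month business trip, and Zhao Jianning, Ding Zheng, Bahe and Song Jike, the four with regular jobs in Beijing, all had the weekend free. It was not easy for everyone to squeeze out the time to meet.

Today is July 3. On this day last year the classmates in our dormitory began to leave campus, and the parting scenes are still vivid. They say men do not shed tears easily, yet every one of us tough boys cried then. The girls, of course, were no exception.

We chose the McDonald's next to the Xidan subway station as the meeting place. After working overtime, Renqing and I arrived on time at three in the afternoon; Zhao Jianning had already had a quick lunch inside and come out. Then came a long wait, not knowing when the others would turn up.

At around four we were finally all there. The first impression was that, apart from Xiaozhu having put on weight from dining with clients every day and A-Ke having lost some from hard work, nobody had changed.

After chatting for a long time in McDonald's, we all agreed to find a restaurant nearby for dinner. None of us knew Xidan well. We went round in several big circles and walked through long Beijing hutongs before finding a restaurant. Tired of walking, we decided to go no further and ordered on the spot.

We had not seen each other for so long that once the conversation started it could not be stopped. We recalled our four undergraduate years and relived the classic lines and scenes one by one. Our drinking at the table has become much more "civilized", no longer as fierce as around graduation. The peanuts were refilled several times; the drinks only twice.

At the table we talked a lot about everyone's work. Housing in Hainan is much cheaper than in Beijing or Shanghai, and Xiaozhu has already bought a flat there on a mortgage and plans to move both parents to Hainan. A-Ke changed jobs not long ago and is about to move to the Zizhuyuan area. Ding Zheng is still at the China Construction Bank headquarters, as cheerful as ever. Jianning has changed jobs and is in much better shape than before. Shao Aihua's lodging is rather far from the office, so the commute is a long ride every day. Renqing is busy with projects at the center and gets back to sleep at 10:30 every night. Bahe works at a public relations firm with good prospects. And I am interning at MSRA and will return to Harbin before the end of the year.

We reminisced a lot and learned a lot about each other's present lives. What struck me is that whether one goes to work or on to graduate school after graduation, it is a new start in life, and everyone's life is wonderful.

I wish my undergraduate classmates happiness always and a bright future!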

2005年7月2日

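IR memoirs

A few days ago Simply wrote a super long blog post, the longest in the IR Blog Group. It fully shows her literary talent and writing ability, and is filled with love for the lab and reluctance to leave. I think she must have cried her heart out when she got on the train out of Harbin. That is the kind of person our senior is: someone who values affection and loyalty.
Reading blogs today, I found that victor has also written a post about graduation, sincere and very moving.
Then, turning to teacher Stream's blog, I found a new post too. The teacher looked back on the history of IRLab and spoke of the hardships and joys of founding the lab.
I think this is simply a time for remembering IR, and everyone feels the same. Over these years what has grown among us is not just the simple forms of address between teachers and fellow students, but friendship that has risen to family affection. Indeed, how many times in a life does one go through university, and how many times can one's youth be spent there?
My feelings at the moment are complicated. I only know that I love our IRLab deeply, and I love every member there. Being in Beijing and unable to say goodbye in person to Simply, YuCoast, Zhiwei Wang, my idol Cr999, Slchen and Zuoran Wang is hard. I can only quietly wish them well from here. All is understood without words; I wish them all the best after graduation!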

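[Collection] Eight classic, fatal questions of life

Question 1: Suppose there is a restaurant near your home where the food is expensive and bad and cockroaches crawl on the tables. Would you keep going back again and again just because it is close and convenient?
Answer: You will surely say: what a lousy question, who would be so stupid as to pay to suffer?
Yet put the same situation in another setting, and we may do something just as foolish.
Plenty of men and women have complained that their lover or spouse is of poor character, fickle and irresponsible. They know nothing good will come of staying together, and resentment already outweighs love, yet "for some reason" they carry on and cannot break up. Put bluntly, it is only unwillingness to let go and force of habit. Is that not the same as going back to that restaurant?
-- In life, why cling so hard?!

Question 2: If you accidentally lost 100 yuan and only knew it had probably been dropped somewhere along your way, would you spend 200 yuan on fares to go and find that 100?
Answer: A supremely stupid question.
Yet similar things keep happening in life. Having done something wrong and knowing we are at fault, we refuse to admit it and instead spend twice the time looking for excuses, which badly damages others' impression of us. Being scolded with one sentence and then spending countless hours upset about it is the same thing. Flying into a rage over one matter and seeking revenge regardless of cost and time, harming others without benefiting ourselves: is that not just as pointless?
Losing someone's love and knowing it cannot be won back, we still grieve, for years at a time, drowning our sorrows in drink and wasting away. It does no good at all; it only adds to the loss.
-- In life, why make things hard for yourself?!

Question 3: Would you stop going out because the newspaper reports car accidents every day?
Answer: What kind of lousy question is this? Of course not; that would be giving up eating for fear of choking.
Yet quite a few people have said: the divorce rate is so high these days that I dare not fall in love. And they say it as if it were perfectly reasonable. Quite a few women, too, read all the related reports and become anxious about their other half. Is that not the same reaction? Being optimistic means believing that although the road is full of danger, I am still someone who will cross it safely; if I take a little care, there is no need to fear crossing the street.
-- In life, believe in yourself first.

Question 4: Do you believe that anyone can succeed and build a career just like that?
Answer: Of course not.
But by my observation, some people, after hearing the advice a successful person has racked their brains to give, such as read more and practice more, ask another question: isn't that hard?
We all want to master English in 3 minutes and solve every problem in 5. Is success really that easy? Change is of course hard. People stand out because they are not afraid of difficulty.
Once, in a taxi, I heard the driver, seeing luxury cars in front of and behind him, sigh to himself: "Why are other people so rich, while my money is so hard to earn?"
On a whim I asked him: "What money in this world do you think is easy to earn?" He could not answer, and after a long pause said: it always seems that other people's money is easier to earn.
In fact every successful person got there through hardship. We really should not complain about fate.
-- In life, rely on yourself!

Question 5: Do you think someone who has never played basketball can be a very good basketball coach?
Answer: Of course not; an outsider cannot lead the experts.
Yet many people know nothing about a trade, hear only that it makes money, and open a business in it at once.
I have seen people with no taste in clothes, or who do not care what they wear at all, dream of opening a clothing shop, and people who do not know how to switch on a computer want to chat online. They act on hearsay, and afterwards, instead of asking whether they lacked the expertise, only complain that the times were against them.
-- In life, act within your abilities.

Question 6: A similar but different question: do you think a basketball coach can direct a perfect victory with eyes closed, without ever going to the court?
Answer: Are you crazy? Of course not.
Yet quite a few friends, with no time to manage things themselves, pour money into opening cafés, restaurants and companies they know nothing about, as if in a desperate hurry to spend their hard-earned savings, and become muddle-headed investors. They lose more than they earn, yet think it was bad luck rather than bad thinking.
-- In life, remember to examine yourself.

Question 7: Would you rather regret it forever than try to turn defeat into victory?
Answer: Probably nobody would say, "Yes, I am exactly that kind of coward."
Yet we often beat a retreat just when we should not, and for fear of failure dare not try for success.
Take Michelle Kwan's brilliant performance in winning the 2000 World Figure Skating Championships. She was set on first place, but before the final event she ranked only third overall. In the final free skate she chose to go for a breakthrough rather than to limit mistakes. In the 4-minute program she included the most difficult triple jumps, and boldly performed two in a row. She could have lost badly, but in the end she succeeded.
She said: "Because I did not want to wait until I had failed to regret that I still had potential left unused."
A great Chinese figure once said: the hope of victory and the recovery of a favorable situation often come from the effort of holding on a little longer.
-- In life, why not give it your all.

Question 8: Is your time unlimited and are you immortal, so that the things you most want to do should be put off indefinitely?
Answer: No, only a fool would think so.
Yet we often say: when I am old, I will travel the world; when I retire, I will do what I want to do; when the children have grown up, I can...
We all assume we have unlimited time and energy. In fact we can realize our ideals step by step, without wasting our lives waiting. If we start moving toward them step by step now, we will not live half our lives only to meet the ending we least wanted to see.
-- In life, live in the present.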

2005年7月1日

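Lunch Talks

Two Lunch Talks in a row, yesterday and today. It was my first time attending one. The format: the lady in charge of logistics orders fast food (KFC or McDonald's), students collect it in the meeting room from 12 noon, everyone finishes eating within 15 minutes, and then the speaker begins. Why this format? I think it is because everyone is short of time, so people are gathered together, eat quickly, and the talk starts right away. This makes full use of the midday break. Strangely, the habit of taking a nap after lunch that I used to have has disappeared since I came here. Last time I asked car whether it was the same for him when he was here; car said yes, this is where he got into the habit of not napping.
Let me turn to the talks themselves.
Yesterday's was on Feature Selection. The speaker was Huan Liu, author of a Feature Selection paper I had read earlier. On seeing him in person I said to the classmate next to me: "Haha, he looks exactly like the photo on his homepage!" I have heard talks by many eminent people here, most of them overseas Chinese, and their English is all very good. Professor Huan Liu's English, with its rises and falls, was very pleasant to listen to, and since I had had some exposure to Feature Selection before, I understood part of the talk. Previous Feature Selection amounts to selecting columns in the data space, and many methods compute some statistics over all of the data to choose features. The method Professor Huan Liu introduced this time not only selects feature columns but also uses sampling to do so. The basic idea is that when the data set is huge, Feature Selection is carried out on part of the data rather than all of it. The idea is actually very simple. I think it bears out teacher Wang Yihe's saying once more: "simple is beautiful". The paper behind the talk is A selective sampling approach to active feature selection. After the talk I found Professor Huan Liu and gave him some suggestions based on my use of the FCBF software he provides.

Today's noon talk was given by Hsiao-Wuen Hon (洪小文), deputy director of the research institute, on how to write a good paper. As Sun Chengjie, Shiqi and I confirmed, haha, this talk was the same one he gave when he visited our university. That one was in Chinese; today he spoke in English. Hearing the talk again was, I think, useful. As the saying goes, reviewing the old teaches you something new. Indeed, after the talk the techniques and methods for writing good papers were more firmly imprinted in my mind. It made me think that the very useful things we learn day to day all need frequent review. When I have time, I must write a personal assistant program for managing things to remember, making full use of the memory curve.

Haha, I have digressed again ^_^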