January 18, 2006

[2006-01-13] YOCSEF Conference: Content-based Searching and Search Engines

YOCSEF Conference: Content-based Searching and Search Engines
2006-01-13 14:00~18:00


This afternoon there was a YOCSEF meeting on Content-based Searching and Search Engines. It was held at Beijing University.

Our Prof. Tliu was the Executive Chairman. The three well-known speakers ranged from a university research institute and a famous enterprise to a world-level research institute.

First, my supervisor Prof. Tliu gave a wonderful introduction to the meeting. His speech, in my opinion, was perfect. He summarized the most important events and people of 2005.

Then Shuo Bai gave the first talk, on several developments in text mining. His presentation was wonderful. There were three main points:
1) Analyze very large-scale regular text corpora to obtain macroscopic characteristics within specific domains.
2) Model and analyze people, institutions, locations, events, music, software, and other abstract objects, then obtain their related properties, relations, and documents.
3) Integrate structured and unstructured data mining based on XML frameworks.

In his talk, a few points caught my attention. He said document representation has four types; the ones I noted were link analysis, expansion factor (each word's neighbors are distinctive, which is captured as statistical characteristics), and graph representation (each node is a word and each edge is a co-occurrence frequency; this is used for sub-graph finding). Document representation is the link between shallow natural language processing, classification and clustering, and information extraction.
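To make the graph representation concrete for myself, here is a minimal sketch in Python. It is only my own illustration, not Mr. Bai's method: the function name `cooccurrence_graph`, the window size of 2, whitespace tokenization, and the sample sentences are all assumptions.

```python
from collections import Counter


def cooccurrence_graph(sentences, window=2):
    """Build an undirected word graph.

    Nodes are words; the weight of an edge counts how often the two words
    appear within `window` positions of each other in the same sentence.
    (Hypothetical helper, for illustration only.)
    """
    edges = Counter()
    for sentence in sentences:
        words = sentence.lower().split()  # naive whitespace tokenization
        for i, word in enumerate(words):
            # Pair the word with the next `window` words that follow it.
            for other in words[i + 1 : i + 1 + window]:
                if other != word:
                    # Sort the pair so (a, b) and (b, a) are the same edge.
                    edges[tuple(sorted((word, other)))] += 1
    return edges


# Made-up sample document, two sentences.
doc = [
    "search engines index text",
    "text mining helps search engines",
]
graph = cooccurrence_graph(doc)

# Print the edges, heaviest first.
for (a, b), weight in graph.most_common():
    print(a, b, weight)
```

Once the edges are counted this way, sub-graph finding could start from the heaviest edges, though how it was actually done in the talk I did not record.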

He listed several macroscopic characteristics, such as single-document trend analysis, attitude search engines, a public opinion index, popularity sequence analysis (such as popular words, viruses, and hot topics), event tracking, person tracking, and tracking of research papers and topics.

During the audience question time, I asked Mr. Shuo Bai a question about XML. He agreed with my opinion that XML was only a representation form and a tool; the essential technology was the same. With XML, a document could be represented as a hierarchy. With ample tool support, XML could be very useful to NLP and IR. Maybe it would be a new revolution.
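As a small illustration of a document represented as a hierarchy, here is a sketch using Python's standard `xml.etree.ElementTree` module. The element names (`document`, `section`, `paragraph`), the sample text, and the helper `outline` are my own assumptions, not anything shown in the talk.

```python
import xml.etree.ElementTree as ET

# A made-up document; the tag names are assumptions for illustration.
xml_text = """
<document>
  <title>Notes on search</title>
  <section name="text mining">
    <paragraph>Documents can be modeled as graphs.</paragraph>
  </section>
  <section name="search engines">
    <paragraph>Keyword search is common.</paragraph>
    <paragraph>Navigation helps discovery.</paragraph>
  </section>
</document>
"""

root = ET.fromstring(xml_text.strip())


def outline(element, depth=0):
    """Return (depth, tag) pairs for the tree in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


# Count paragraphs per section by walking the hierarchy.
paragraphs_per_section = {
    section.get("name"): len(section.findall("paragraph"))
    for section in root.findall("section")
}

for depth, tag in outline(root):
    print("  " * depth + tag)
print(paragraphs_per_section)
```

The point is only that the hierarchy gives NLP and IR tools something to walk, such as sections and paragraphs; the underlying analysis stays the same.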


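The second talk was by Pei Chen, the CEO of ZhongSou (Chinese Search). Pei Chen was much in the media in 2005, and several of his remarks are often cited now. You can read about him in the following articles:
Profile of Pei Chen
Profile of Pei Chen, President of Chinese Search
ZhongSou CEO Pei Chen gives a keynote speech
Chinese Search President Pei Chen is a guest on "Interview Room"
Pei Chen: Toward the era of Chinese search engine 4.0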

His presentation at this forum was "The Future of Search" (搜索的未来). This was the first time I had heard him speak. In my opinion, he was sure of himself. His ZhongSou was famous now, and I wished him success! He defined the third stage of search as the combination of the directory (like the earlier Yahoo) and the keyword-based search engine (like Google today), and his ZhongSou was built around this idea. When I heard him introduce the third stage of search, I was excited, because I had a similar idea for our English short search engine, and I was glad to meet it again here. Pei Chen's analysis of the third stage of search was very good, and I agreed with it. We needed a stronger search engine, one that offered both web navigation and keyword searching. With the popular search engines, we could only search for something we already had in mind; they could not help us discover things we did not yet know about. That was their weakness. I was looking forward to a more powerful one.

The third presentation was given by Dr. Huican Zhu, a professional engineer at Google China. He introduced some of the operations of a commercial search engine. Much of the talk was an introduction to information retrieval, which I already knew a little about, so I was more interested in the final part about Google. The speaker introduced the site that hosts papers by Googlers: http://labs.google.com/papers. I had found many good papers there.

Finally, Dr. Huican Zhu introduced some of the challenges for search engines. I recorded the last two. The first was Natural Language Processing (NLP) for understanding the question and the relevant facts. The other was the Semantic Web, which could make data easier to process and understand.

Of these two challenges, NLP was our main research direction, so we could do more research on NLP for information retrieval. I had investigated the Semantic Web a little. I knew it was very popular now, and there was a lot of work we could do on it.


--------------------------------------------------
In short, this was a successful forum on search, and I gained a lot.
