We spent the whole day at the First MS IJARC Symposium on Natural Language Processing. MS IJARC stands for the Microsoft Institute for Japanese Academic Research Collaboration, which was established in 2005. We had the opportunity to hear lectures from Japanese researchers.
There were many nice presentations.
Bill Dolan, the manager of the NLP group at MSR Redmond, introduced many of their research topics. I was interested in their multi-document summarization, MindNet, and paraphrase projects, which are closely related to our research at IRLab.
Takashi Ninomiya, from Japan, introduced his beam search for probabilistic HPSG parsing. The basic idea was to use dynamic programming to search for the best parse while discarding unlikely partial analyses. Compared with common parsing techniques it was very fast, though I questioned whether his method loses some final precision.
It was a nice trick: a thousands-of-times speedup for parsing a very large corpus at only a small loss of precision, which makes it a good fit for large-scale corpus processing.
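To keep the idea for myself, here is a minimal sketch of beam pruning inside a chart parser. This is my own toy Python version under made-up assumptions (a simple binary grammar, a hypothetical beam width of 5), not Ninomiya's actual HPSG parser; it only shows how keeping the top-k analyses per cell trades a little precision for speed.

```python
# Toy CKY-style chart parsing with per-cell beam pruning (illustration only).
from collections import defaultdict

BEAM_WIDTH = 5  # hypothetical beam size

def prune(cell, k=BEAM_WIDTH):
    """Keep only the k highest-scoring analyses in a chart cell."""
    best = {}
    for label, logprob in cell:
        if label not in best or logprob > best[label]:
            best[label] = logprob
    return sorted(best.items(), key=lambda x: -x[1])[:k]

def cky_beam(words, lexicon, rules):
    """CKY parsing with beam pruning.

    lexicon: word -> [(category, logprob)]
    rules:   (left_cat, right_cat) -> [(parent_cat, logprob)]
    """
    n = len(words)
    chart = defaultdict(list)
    for i, w in enumerate(words):
        chart[(i, i + 1)] = prune(lexicon.get(w, []))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            cell = []
            for k in range(i + 1, j):
                for lcat, lp in chart[(i, k)]:
                    for rcat, rp in chart[(k, j)]:
                        for parent, pp in rules.get((lcat, rcat), []):
                            cell.append((parent, lp + rp + pp))
            # Beam pruning here is the speed/precision trade-off discussed above.
            chart[(i, j)] = prune(cell)
    return chart[(0, n)]
```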
During the break, I asked a question of Junichi Tsujii, a professor at the University of Tokyo and the Ph.D. supervisor of Hang Li. One of Takashi Ninomiya's parsing examples was "I saw the girl with the telescope," the famous ambiguous sentence for parsing: we cannot decide the correct parse from the sentence alone and need context information to disambiguate it. My question was how to use context information to solve this problem, and whether anybody had tried such a method. Prof. Junichi Tsujii explained it very clearly. He said that because we cannot model the local context well, the problem is very hard to solve; over the past twenty years some researchers had tried combining heuristic rules, but with very little effect, and there is still no good result. So I believe it would be a nice research topic given recent techniques.
Changning Huang, a senior researcher in the NLC group of MSRA, introduced a method for automatically detecting word segmentation errors. His closing words captured one good idea: for each NLP task, you must write a specification and define your problem very well before you can do the work. It reminded me of some of my tagging work for coreference research, which I should re-check after I return to Harbin.
Cheng Niu, who is now my mentor, introduced his paper "Word Independent Context Pair Classification Model for Word Sense Disambiguation." His idea was to model context with many features, but what interested me most was his view of WSD from a coreference-resolution perspective: WSD deals with the same mention carrying different meanings. It was a good idea, and it suggests we could re-define the coreference problem based on other related research.
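Here is how I understand the context-pair idea, as a rough sketch of my own in Python (not Cheng Niu's actual model; the bag-of-words features, similarity measure, and threshold are all made up for illustration): given two contexts of the same word, decide whether they use the word in the same sense, much like deciding whether two mentions corefer.

```python
# Toy context-pair classification for WSD (illustration only).
def context_features(context, target):
    """Bag-of-words features from the context around the target word."""
    return {w.lower() for w in context.split() if w.lower() != target}

def same_sense_score(ctx_a, ctx_b, target):
    """Jaccard overlap of the two context feature sets as a crude similarity."""
    fa, fb = context_features(ctx_a, target), context_features(ctx_b, target)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)

def same_sense(ctx_a, ctx_b, target, threshold=0.2):
    """Classify a context pair: True if the target word likely shares a sense."""
    return same_sense_score(ctx_a, ctx_b, target) >= threshold

# Pairs judged "same sense" can then be grouped together,
# much like linking mentions into a coreference chain.
print(same_sense("deposit money in the bank account",
                 "the bank raised interest rates on the account", "bank"))
```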
It was a nice chance to attend such a meeting.