Wednesday, April 22, 2009

Lecture: Lauri Karttunen

The Department of Linguistics announces the following Distinguished Alumnus Lecture:

Lauri Karttunen (Ph.D. 1969), PARC and Stanford University, will present, "Computing Textual Inferences."

Friday, April 24
4:00 p.m.
Ballantine 228

Co-sponsored by: Cognitive Science, Computer Science, IULC

Abstract:
The ultimate goal of computational linguistics is to build systems that understand natural language. It is unlikely that we will get there in my lifetime but I will try in this talk to convince you that progress is being made.

One indication that a human has understood a piece of text is that she can answer questions about it. A computer should be able to do the same, but not in the way that search engines such as Google operate. If you type the query “Who was the 21st president of the US?” into a search box, Google comes back with a set of links to relevant articles, with several highlighted passages that contain the answer: Chester Alan Arthur was the 21st President of the US. But Google does not understand the question. It extracts the string was the 21st president of the US and retrieves from its index all documents that contain it. But if you ask a question like “Did Robert Downey Jr. win an Academy Award?” Google does not find the document that has the answer, “Robert Downey Jr. failed to win an Academy Award this week for his performance in Tropic Thunder,” because Google cannot make the obvious inference that failed to win implies did not win. None of the current search engines, all of which rely on string matching, can deliver the simple NO answer in such cases.

Much progress has been made in recent years on computing such inferences. From statements such as Robert Downey Jr. failed/managed to win an Academy Award we can conclude the truth or falsity of Robert Downey Jr. won an Academy Award. From Joe failed not to get lost in Peru we can infer that Joe got lost in Peru. Inferences of this type are called LOCAL TEXTUAL INFERENCES because they are based on the semantic properties of particular lexical items such as fail (to), manage (to), and not, and require no chains of reasoning or world knowledge. My 1971 IU Linguistics Club publication "The Logic of English Predicate Complement Constructions" established the semantic classification of verbs that take infinitival complements, such as fail, manage, bother, happen, forget, remember, force, allow, hesitate, etc. It is satisfying to see that these distinctions have been implemented in current state-of-the-art systems for computing textual inferences.
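As a concrete illustration of how such lexically driven inferences can be computed, here is a minimal sketch in Python of polarity propagation over two-way implicatives. The signature table and the function name are illustrative assumptions of this note, not the notation of the 1971 monograph or the machinery of any particular system; one-way implicatives such as allow and factives such as forget (that) would need a richer encoding.

    # Minimal sketch: each signature maps the polarity of the embedding
    # context (True = positive, False = negated) to the polarity entailed
    # for the complement clause. Illustrative assumption, not any system's API.
    SIGNATURES = {
        "manage": {True: True,  False: False},  # managed to V => V; didn't manage to V => not V
        "fail":   {True: False, False: True},   # failed to V => not V; didn't fail to V => V
        "forget": {True: False, False: True},   # forgot to V => not V; didn't forget to V => V
    }

    def complement_polarity(operators, start=True):
        """Compose negation and implicatives from the outside in.

        operators: e.g. ["fail", "not"] for "failed not to ...".
        Returns True if the complement clause is entailed to be true,
        False if it is entailed to be false.
        """
        polarity = start
        for op in operators:
            polarity = (not polarity) if op == "not" else SIGNATURES[op][polarity]
        return polarity

    # "Robert Downey Jr. failed to win an Academy Award" => he did not win.
    assert complement_polarity(["fail"]) is False
    # "Joe failed not to get lost in Peru" => Joe got lost in Peru.
    assert complement_polarity(["fail", "not"]) is True

The point of the sketch is only that these inferences are purely compositional: each lexical item contributes a fixed signature, and the entailed polarity of the complement falls out of composing them, with no world knowledge involved.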

Local textual inference is in many respects a good test bed for computational semantics. It is task oriented. It abstracts away from particular meaning representations and inference procedures. It allows for systems that make purely linguistic inferences, but it also allows for systems that bring in world knowledge and statistical reasoning. Because shallow statistical approaches have plateaued, there is a clear need for deeper processing. The system I will describe in the final part of the talk is the Bridge system (a bridge from language to understanding) developed at the Palo Alto Research Center by the Natural Language Theory and Technology group. I will explain how the system infers, among other things, that Joe failed not to get lost in Peru implies that he did, why Nobody moved contradicts A girl danced, and how the system concludes Every small boy saw a cat from Every boy saw a small cat.
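To make the last example concrete, the following toy checker (an assumption made for illustration, not the Bridge system's actual machinery) exploits the fact that every is downward monotone in its restrictor and upward monotone in its scope: specializing the restrictor and generalizing the scope both preserve truth.

    # Toy monotonicity checker for sentences of the form "every R S".
    # The subsumption facts and helper names are hypothetical.
    MORE_SPECIFIC = {("small boy", "boy"), ("small cat", "cat")}

    def specializes(a, b):
        """True if phrase a denotes a subset of what phrase b denotes."""
        return a == b or (a, b) in MORE_SPECIFIC

    def every_entails(premise, conclusion):
        """Each argument is a (restrictor, scope) pair for 'every R S'."""
        (r1, s1), (r2, s2) = premise, conclusion
        # Downward monotone in the restrictor: the conclusion's restrictor
        # may be more specific. Upward monotone in the scope: the
        # conclusion's scope may be more general.
        return specializes(r2, r1) and specializes(s1, s2)

    # "Every boy saw a small cat" entails "Every small boy saw a cat".
    assert every_entails(("boy", "small cat"), ("small boy", "cat"))
    # The converse does not hold.
    assert not every_entails(("small boy", "cat"), ("boy", "small cat"))

The contradiction between Nobody moved and A girl danced works the same way in spirit: nobody is downward monotone, so with a lexical fact like dance being a kind of move, the two sentences cannot both be true.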
