News understanding through event pattern clustering


In this talk I will describe HEADY, an abstractive approach to news understanding and headline generation from news collections. From a web-scale corpus of English news, we mine syntactic patterns that a Noisy-OR model generalizes into event descriptions. At inference time, we query the model with the patterns observed in an unseen news collection, identify the event that best captures the gist of the collection, automatically produce updates for the knowledge graph, and retrieve the most appropriate pattern to generate a headline. The talk will focus on the event understanding and information extraction sides of the system, and on the main challenges we see moving forward.


Enrique Alfonseca is a research scientist at Google Research Zurich, where he currently manages the language understanding team. During the past four years he has worked in the areas of ads quality, search quality and natural language understanding, contributing to topics such as query expansion and relevance estimation for sponsored search, ranking for web search, information extraction, unsupervised semantic parsing, lexical semantics and automatic text summarization. He received a Ph.D. in Computer Science from Universidad Autónoma de Madrid in 2003, and he has over 60 research publications, mainly in the fields of computational linguistics and information retrieval.