News understanding through event pattern clustering

In this talk I will describe HEADY: an abstractive approach for news
understanding and headline generation from news collections. From a
web-scale corpus of English news, we mine syntactic patterns that a
Noisy-OR model generalizes into event descriptions. At inference time,
we query the model with the patterns observed in an unseen news
collection, identify the event that best captures the gist of the
collection, automatically produce updates for the knowledge graph,
and retrieve the most appropriate pattern to generate a headline. The
talk will focus on the event understanding and information extraction
sides of the system, and the main challenges we see moving forward.
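
As a rough illustration of the inference step described above, the sketch below scores each candidate event against the patterns observed in a collection using a noisy-OR activation, then picks the highest-weight pattern of the winning event as a headline template. It is not the HEADY implementation: the event names, patterns, weights, leak value and function names are all invented for this example, and it assumes for simplicity that exactly one event is active per collection, whereas a full noisy-OR model allows several.

```python
import math

# Hypothetical noisy-OR weights: P(pattern fires | only this event is active).
# Event names, patterns and numbers are invented for illustration only.
WEIGHTS = {
    "wedding": {"[X] married [Y]": 0.8, "[X] wed [Y]": 0.6, "[X] divorced [Y]": 0.01},
    "divorce": {"[X] divorced [Y]": 0.7, "[X] split from [Y]": 0.5, "[X] married [Y]": 0.05},
}
LEAK = 0.01  # assumed probability that a pattern appears with no event explaining it

# Every pattern the toy model knows about.
PATTERNS = sorted({p for ws in WEIGHTS.values() for p in ws})


def pattern_prob(event, pattern):
    """Noisy-OR probability that `pattern` is observed when only `event` is active."""
    w = WEIGHTS[event].get(pattern, 0.0)
    return 1.0 - (1.0 - LEAK) * (1.0 - w)


def log_likelihood(event, observed):
    """Log-probability of the observed/unobserved pattern set under one event."""
    total = 0.0
    for p in PATTERNS:
        prob = pattern_prob(event, p)
        total += math.log(prob if p in observed else 1.0 - prob)
    return total


def best_event(observed):
    """Event that best explains the observed patterns (single-event assumption)."""
    return max(WEIGHTS, key=lambda e: log_likelihood(e, observed))


def headline_pattern(event):
    """Highest-weight pattern for the event, used here as the headline template."""
    return max(WEIGHTS[event], key=WEIGHTS[event].get)
```

Calling `best_event` with the set of patterns extracted from an unseen collection, and then `headline_pattern` on the result, mirrors the query-then-retrieve flow of the abstract in miniature.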

Alfonseca is a research scientist at Google Research Zurich, where he
currently manages the language understanding team. During the past four
years he has worked in the areas of ads quality, search quality and
natural language understanding, contributing to topics such as query
expansion and relevance estimation for sponsored search, ranking for
web search, information extraction, unsupervised semantic parsing,
lexical semantics and automatic text summarization. He received a Ph.D.
in Computer Science from Universidad Autónoma de Madrid in 2003, and he
has over 60 research publications, mainly in the fields of
computational linguistics and information retrieval.