Research on automatic text summarization started over 50 years ago and, although mature in some application domains (e.g. news), it faces new challenges in the current context of user-generated on-line content and social networks.
Information on the Web is constantly updated, sometimes without any quality control, and a significant proportion of it is informal and ephemeral; opinions and messages posted on the Internet are a typical example.
- What techniques can be used to produce appropriate summaries in this context?
- How to measure the relevance of ill-formed input?
- How to produce understandable summaries from noisy texts?
- How to identify the most relevant information in a set of opinions?
High-quality documentation, such as technical and scientific articles and patents, has not received the attention it deserves in the field. Given the explosion of technical documentation available on the Web and in intranets, scientists and research and development facilities face a true deluge of scientific information: summarization should be a key instrument not only for reducing the information content but also for measuring information relevance in context, providing users with adequate answers.
- What techniques can be used to extract knowledge from complex technical documents?
- How to compile the extracted information back into a well-formed summary?
- How to measure relevance in a network of scientific articles, beyond mere citation counts?
Another summarization research topic that is lagging behind is non-extractive summarization: the generation of a concise summary that is not a set of sentences taken from the input. This is a very difficult problem, since summarization systems must adapt easily from one domain to another in order to recognize what is important and to produce coherent text from a textual or conceptual representation.
The workshop Automatic Text Summarization of the Future aims to bring together researchers and practitioners of natural language processing to address the aforementioned and related issues.