SUMMA

Overview

SUMMA is a toolkit for the development of text summarization systems.

SUMMA Ready-made Applications

These applications will give you an idea of how to use and sequence components to create your summarization applications.

Single Document Summarization in English

  • GAPP file: SUMMA-POS-TF-FIRST-DOCVEC.gapp under the summa_plugin 'gapps' directory
  • Functionality: a corpus pipeline with all the components necessary for a single-document summarization system, configured for English documents.
  • Use: load the documents into a corpus, pass the corpus to the application as its parameter, and run it. The application should produce a summary for each document in the corpus.

Single Document Summarization in Spanish

  • GAPP file: SUMMA-SINGLE-SPANISH.gapp under the summa_plugin 'gapps' directory
  • Functionality: a corpus pipeline with all the components necessary for a single-document summarization system, configured for Spanish documents.
  • Use: load the documents into a corpus, pass the corpus to the application as its parameter, and run it. The application should produce a summary for each document in the corpus.

Random Single Document Summarization

  • GAPP file: SUMMA-RANDOM.gapp under the summa_plugin 'gapps' directory
  • Functionality: a corpus pipeline with all the components necessary to produce random summaries.
  • Use: load the documents into a corpus, pass the corpus to the application as its parameter, and run it. The application should produce a summary for each document in the corpus.

Lead-based Single Document Summarization

  • GAPP file: SUMMA-LEAD.gapp under the summa_plugin 'gapps' directory
  • Functionality: a corpus pipeline with all the components necessary to produce lead-based summaries.
  • Use: load the documents into a corpus, pass the corpus to the application as its parameter, and run it. The application should produce a summary for each document in the corpus.
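
The random and lead-based applications are typically used as baselines. The following minimal Java sketch illustrates the idea behind each one, assuming the usual meaning of the terms: a lead-based summary keeps the first sentences of a document, and a random summary keeps sentences chosen at random. It is not SUMMA's implementation; the class name BaselineSummarizers, its methods, and the seed parameter are hypothetical, and the input is assumed to be already split into sentences.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative only: NOT SUMMA's code. Sentences are assumed to be
// pre-split; in the real pipelines other components do that work.
class BaselineSummarizers {

    // Lead-based baseline: keep the first n sentences in document order.
    static List<String> lead(List<String> sentences, int n) {
        int k = Math.max(0, Math.min(n, sentences.size()));
        return new ArrayList<>(sentences.subList(0, k));
    }

    // Random baseline: pick n distinct sentences at random, then restore
    // document order so the result reads like an extract of the text.
    static List<String> random(List<String> sentences, int n, long seed) {
        int k = Math.max(0, Math.min(n, sentences.size()));
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < sentences.size(); i++) {
            indices.add(i);
        }
        // A fixed seed makes the "random" summary reproducible.
        Collections.shuffle(indices, new Random(seed));
        List<Integer> chosen = new ArrayList<>(indices.subList(0, k));
        Collections.sort(chosen);
        List<String> summary = new ArrayList<>();
        for (int i : chosen) {
            summary.add(sentences.get(i));
        }
        return summary;
    }
}
```

For example, BaselineSummarizers.lead(sentences, 2) returns the first two sentences of the list, and requesting more sentences than the document contains simply returns the whole document.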

SUMMA Multi-document Summarization Application

  • This is NOT a gapp application but a component that loads a number of gapps and uses them to provide multi-document summarization functionality. All the gapps it uses are distributed with SUMMA under the 'gapps' directory; for them to work, summa_plugin should be installed under the plugins directory of GATE.
  • How to use it:
    • Load the component called SUMMA Vanilla MultiDoc Summarizer PR. A number of applications will be loaded in the interface.
    • Create a GATE pipeline
    • Add the SUMMA Vanilla MultiDoc Summarizer PR to the pipeline
    • Leave the parameters of type CorpusController or SerialController at their default settings until you are familiar with how this component works. The other parameters can be changed.
    • One parameter you do need to provide is the set of documents to summarize, which should be in a corpus.

Restriction

For the applications to run correctly, summa_plugin must be installed directly under the plugins directory in GATE; otherwise you will have to change the relative paths of the gapp files.
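
A quick way to check this layout before loading anything in GATE is sketched below in plain Java. It only assumes the directory structure described on this page (plugins/summa_plugin/gapps under the GATE installation directory) and the four gapp file names listed above; the class SummaInstallCheck and its methods are hypothetical helpers, not part of SUMMA or GATE.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: NOT part of SUMMA or GATE. It only inspects the
// file system layout described in the Restriction section.
class SummaInstallCheck {

    // The ready-made gapp files named on this page.
    static final String[] GAPP_FILES = {
        "SUMMA-POS-TF-FIRST-DOCVEC.gapp",
        "SUMMA-SINGLE-SPANISH.gapp",
        "SUMMA-RANDOM.gapp",
        "SUMMA-LEAD.gapp"
    };

    // Expected location: <GATE home>/plugins/summa_plugin/gapps/<file>
    static Path gappPath(Path gateHome, String gappFile) {
        return gateHome.resolve("plugins")
                       .resolve("summa_plugin")
                       .resolve("gapps")
                       .resolve(gappFile);
    }

    // Returns the names of the gapp files not found at the expected place.
    static List<String> missingGapps(Path gateHome) {
        List<String> missing = new ArrayList<>();
        for (String name : GAPP_FILES) {
            if (!Files.isRegularFile(gappPath(gateHome, name))) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Calling SummaInstallCheck.missingGapps with the path of your GATE installation returns an empty list when all four files are where the applications expect them; any names it does return point to a plugin that is installed elsewhere or incompletely.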

Copyright 2002-2014 Universitat Pompeu Fabra