SUMMA

 

Overview

SUMMA is a toolkit for the development of text summarization systems.

SUMMA Centroid Computation

Functionality

Computes the centroid of a set of documents (e.g. a corpus) from the vectors of the individual documents. The result is stored as a feature named 'centroid' on the corpus itself.
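
As a rough illustration of the computation described above, the following Python sketch averages sparse document vectors term by term. It is not SUMMA's code: it assumes the centroid is the arithmetic mean of the document vectors (the text does not specify the formula), and the names compute_centroid, corpus_vectors and corpus_features are hypothetical stand-ins for the GATE corpus and its feature map.

```python
from collections import defaultdict


def compute_centroid(doc_vectors):
    """Return the term-wise mean of sparse vectors (dicts of term -> weight).

    Assumption: the centroid is the arithmetic mean over all documents;
    a term missing from a document counts as weight 0 for that document.
    """
    if not doc_vectors:
        raise ValueError("corpus has no document vectors")
    totals = defaultdict(float)
    for vec in doc_vectors:
        for term, weight in vec.items():
            totals[term] += weight
    n = len(doc_vectors)
    return {term: total / n for term, total in totals.items()}


# Hypothetical stand-in for a corpus: exactly one sparse vector per document.
corpus_vectors = [
    {"summary": 2.0, "text": 1.0},
    {"summary": 1.0, "corpus": 3.0},
]

# Stand-in for storing the result as a corpus feature named 'centroid'.
corpus_features = {"centroid": compute_centroid(corpus_vectors)}
```

Dividing by the number of documents, rather than by the number of documents containing each term, keeps the result a true mean over the corpus; a different weighting would need a different divisor.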

Parameters of the Resource

  • annSet: the annotation set where the document vectors are to be found (only one vector per document)
  • vecName: the annotation representing the document vector
  • corpus: the corpus holding the set of documents

Restriction

This resource should be used in a GATE pipeline. It does not make sense to use it in a Corpus Pipeline, since it operates on the corpus as a whole rather than on each document in turn. Each document must already have a document vector computed; this can be produced using the vector computation component in SUMMA.


Copyright 2002-2014 Universitat Pompeu Fabra