SUMMA

Overview

SUMMA is a toolkit for the development of text summarization systems.

SUMMA Centroid Sentence Similarity

Functionality

Adds a feature ('centroid_sim') to each sentence, representing the similarity of that sentence to the centroid of the set of documents.
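
As an illustration only, the computation can be sketched as follows. This is not SUMMA or GATE code: it assumes the similarity measure is cosine similarity (the measure is not named here), that vectors are sparse term-to-weight mappings, and that sentences are plain dictionaries. The names cosine_similarity, add_centroid_sim, "vector" and "features" are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse term -> weight vectors (assumed measure)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as sharing nothing with the centroid
    return dot / (norm_a * norm_b)


def add_centroid_sim(sentences, centroid):
    """Store 'centroid_sim' on every sentence (hypothetical data layout)."""
    for sentence in sentences:
        sentence["features"]["centroid_sim"] = cosine_similarity(
            sentence["vector"], centroid
        )
    return sentences


# Toy data: one centroid for the document set and three sentence vectors.
centroid = {"summary": 1.0, "text": 1.0}
sentences = [
    {"vector": {"summary": 1.0, "text": 1.0}, "features": {}},
    {"vector": {"weather": 2.0}, "features": {}},
    {"vector": {}, "features": {}},
]
add_centroid_sim(sentences, centroid)
```

Under these assumptions, a sentence whose vector points in the same direction as the centroid scores close to 1, and a sentence sharing no terms with it scores 0.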

Parameters of the Resource

  • annSet: the annotation set where the annotations live
  • sentAnn: the name of the annotation for which you want to compute the feature (e.g. Sentence)
  • corpus: the corpus with the documents
  • sentVec: the name of the annotation with the sentence vector
  • centroid: the vector with the centroid (a feature of the corpus)

Restriction

This resource should be used in a GATE Pipeline; it does not make sense to use it in a Corpus Pipeline. Before it runs, a centroid must exist as a feature of the corpus, and sentence vectors must exist in the documents.

Copyright 2002-2014 Universitat Pompeu Fabra