SUMMA

Overview

SUMMA is a toolkit for the development of text summarization systems.


SUMMA Sentence Document Similarity

Functionality

Adds a feature ('sent_doc_sim') to each sentence, holding the similarity between that sentence and the whole document.

Parameters of the Resource

  • sentAnnSet: the annotation set that contains the annotations
  • sentAnn: the name of the annotation for which you want to compute the feature (e.g. Sentence)
  • vecAnn: the name of the annotation representing the sentence vector
  • docVecName: the name of the vector you want to compare each sentence to (there should be only one annotation of this type in the document)
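
The computation described above can be sketched outside of SUMMA as follows. This is an illustration only, not SUMMA's implementation: the similarity measure is not stated here, so cosine similarity is assumed, and the dictionary-based vectors and the names cosine_similarity, add_sent_doc_sim and vec_key are hypothetical stand-ins for the sentence vector annotation (vecAnn) and the document vector (docVecName).

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse term-weight vectors (dicts).

    Assumption: cosine is used only as a common choice for this kind of
    feature; the measure SUMMA actually applies is not stated in the text.
    """
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def add_sent_doc_sim(sentences, doc_vector, vec_key="vector"):
    """Store a 'sent_doc_sim' feature on every sentence (hypothetical helper)."""
    for sentence in sentences:
        sentence["sent_doc_sim"] = cosine_similarity(sentence[vec_key], doc_vector)
    return sentences


# Toy data: two sentences and a single vector for the whole document.
sentences = [
    {"text": "first sentence", "vector": {"summary": 1.0, "text": 1.0}},
    {"text": "second sentence", "vector": {"weather": 1.0}},
]
doc_vector = {"summary": 2.0, "text": 1.0, "toolkit": 1.0}

add_sent_doc_sim(sentences, doc_vector)
for sentence in sentences:
    print(sentence["text"], round(sentence["sent_doc_sim"], 3))
```

In this sketch the first sentence shares terms with the document and receives a high score, while the second shares none and scores 0.0. Within SUMMA itself, the vectors are read from the annotations named by the parameters rather than passed in as dictionaries.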

Restriction

Before running this resource, you must compute a vector for the whole document using the SUMMA Vector Computation resource. This requires a single annotation that covers the whole text of your document, which you can produce with a grammar distributed with SUMMA.


Copyright 2002-2014 Universitat Pompeu Fabra