Thursday, November 18, 2010

Sanderson and Croft, Deriving Concept Hierarchies from Text

Monothetic clusters: membership is based on a single feature.

Polythetic clusters: a document's membership in a given cluster is defined by its possession of a sufficient fraction of the terms from that cluster. See Scatter/Gather.

The topic of a monothetic cluster is usually more intuitive than the topic of a polythetic cluster.
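To make the two membership rules concrete, here is a minimal sketch of my own (not code from the paper). The function names, the set-of-terms document representation, and the 0.5 default threshold are all assumptions chosen for illustration; the paper does not prescribe a specific fraction.

```python
def monothetic_member(doc_terms, feature):
    """Monothetic rule: membership depends on a single feature (term)."""
    return feature in doc_terms


def polythetic_member(doc_terms, cluster_terms, min_fraction=0.5):
    """Polythetic rule: the document must contain a sufficient fraction
    of the cluster's terms. min_fraction is a hypothetical threshold."""
    if not cluster_terms:
        return False
    shared = len(doc_terms & cluster_terms)
    return shared / len(cluster_terms) >= min_fraction


# Toy data, made up for this example.
doc = {"neural", "network", "training", "gradient"}
cluster_terms = {"neural", "network", "backpropagation", "gradient"}

print(monothetic_member(doc, "network"))            # True
print(polythetic_member(doc, cluster_terms))        # True (3 of 4 terms shared)
print(polythetic_member(doc, cluster_terms, 0.8))   # False (0.75 < 0.8)
```

The monothetic cluster can simply be labelled "network", while the polythetic one has no single term that explains membership, which is one way to see why its topic is harder to name.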

Five basic principles:
1) Terms in the hierarchy are extracted from the documents and reflect the topics covered in them
