1997 ProjectionsForEfficientDocumentClustering


Subject Headings: Lexical Semantic Similarity Function, Text Clustering Algorithm, Latent Semantic Indexing

Cited By

1999

Quotes

Abstract

Clustering is increasing in importance, but linear- and even constant-time clustering algorithms are often too slow for real-time applications. A simple way to speed up clustering is to speed up the distance calculations at the heart of clustering routines. We study two techniques for improving the cost of distance calculations, LSI and truncation, and determine both how much these techniques speed up clustering and how much they affect the quality of the resulting clusters. We find that the speed increase is significant while — surprisingly — the quality of clustering is not adversely affected. We conclude that truncation yields clusters as good as those produced by full-profile clustering while offering a significant speed advantage.
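The truncation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a "profile" is a sparse term-weight vector and that truncation means keeping only its k highest-weighted terms before computing cosine similarity; the function names, the toy weights, and the choice of k are hypothetical and are not taken from the paper.

```python
import math

def truncate(profile, k):
    """Keep only the k highest-weighted terms of a sparse term-weight profile."""
    top = sorted(profile.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)

def cosine(a, b):
    """Cosine similarity between two sparse profiles stored as dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter profile for the dot product
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy data (assumed for illustration only).
centroid = {"cluster": 0.9, "document": 0.8, "vector": 0.4, "the": 0.05, "of": 0.03}
doc = {"cluster": 0.7, "document": 0.6, "speed": 0.2, "the": 0.04}

full = cosine(doc, centroid)                # similarity against the full profile
short = cosine(doc, truncate(centroid, 3))  # similarity against the truncated profile
print(full, short)
```

With a truncated profile the dot product touches at most k terms, which is where the speed-up in the distance calculation would come from; in this toy case the two similarity values differ only slightly because the dropped terms carry little weight.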



Author: Hinrich Schütze, Craig Silverstein
Title: Projections for Efficient Document Clustering
Title URL: http://dx.doi.org/10.1145/278459.258539
DOI: 10.1145/278459.258539
Year: 1997