1998 DistributionalClusteringofWords

Subject Headings: Supervised Document Classification, Supervised Dimensionality Reduction.

Notes

Experimental results obtained on three real-world data sets show that we can reduce the feature dimensionality by three orders of magnitude and lose only 2% accuracy, significantly better than Latent Semantic Indexing [6], class-based clustering [1], feature selection by mutual information [23] or Markov-blanket-based feature selection [13]. We also show that less aggressive clustering sometimes results in improved classification accuracy over classification without clustering.


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
1998 DistributionalClusteringofWords	L. Douglas Baker, Andrew McCallum			Distributional Clustering of Words for Text Classification				10.1145/290941.290970		1998