1999 UsingMaximumEntropyforTextClass


Subject Headings:

Notes

Cited By

Quotes

Abstract

This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that without external knowledge, one should prefer distributions that are uniform. Constraints on the distribution, derived from labeled training data, inform the technique where to be minimally non-uniform. The maximum entropy formulation has a unique solution which can be found by the improved iterative scaling algorithm. In this paper, maximum entropy is used for text classification by estimating the conditional distribution of the class variable given the document. In experiments on several text datasets we compare accuracy to naive Bayes and show that maximum entropy is sometimes significantly better, but also sometimes worse. Much future work remains, but the results indicate that maximum entropy is a promising technique for text classification.
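
The conditional maximum entropy model the abstract describes has the form P(c|d) = exp(Σᵢ λᵢ fᵢ(d, c)) / Z(d), which is equivalent to multinomial logistic regression over document features. Below is a minimal sketch of that formulation, assuming scikit-learn; it is not the authors' code, and it fits the λ weights with scikit-learn's generic solver rather than the improved iterative scaling algorithm used in the paper. The toy documents and labels are hypothetical, for illustration only.

<pre>
# Minimal sketch: maximum entropy text classification as multinomial
# logistic regression over word-count features (assumes scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled training documents (illustration only).
docs = [
    "the team won the game",
    "stocks fell sharply today",
    "the striker scored a goal",
    "the market rallied on earnings",
]
labels = ["sports", "finance", "sports", "finance"]

# CountVectorizer derives per-word features from the labeled data;
# LogisticRegression then estimates the conditional distribution
# P(class | document), the maximum entropy solution under those
# feature-expectation constraints.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(docs, labels)

print(model.predict(["the goal decided the game"]))         # e.g. ['sports']
print(model.predict_proba(["traders watched the market"]))  # P(c | d) estimates
</pre>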

References


Kamal Nigam, John D. Lafferty, and Andrew McCallum. (1999). "Using Maximum Entropy for Text Classification."