1999 UsingMaximumEntropyforTextClassification
- (Nigam & al, 1999) ⇒ Kamal Nigam, John Lafferty, and Andrew McCallum. (1999). "Using Maximum Entropy for Text Classification." In: IJCAI-99 workshop on machine learning for information filtering.
Subject Headings: Maximum Entropy Model, Text Classification Task.
Notes
Cited By
Quotes
Abstract
This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that without external knowledge, one should prefer distributions that are uniform. Constraints on the distribution, derived from labeled training data, inform the technique where to be minimally non-uniform. The maximum entropy formulation has a unique solution which can be found by the improved iterative scaling algorithm. In this paper, maximum entropy is used for text classification by estimating the conditional distribution of the class variable given the document. In experiments on several text datasets we compare accuracy to naive Bayes and show that maximum entropy is sometimes significantly better, but also sometimes worse. Much future work remains, but the results indicate that maximum entropy is a promising technique for text classification.
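The classifier the abstract describes can be illustrated with a minimal sketch: the conditional maximum entropy model P(c|d) ∝ exp(Σᵢ λᵢ fᵢ(d, c)), with one word-count feature per (word, class) pair. This is not the paper's implementation: it trains by plain gradient ascent on the conditional log-likelihood rather than improved iterative scaling (both reach the same unique solution, since the objective is concave), and the feature set, function names, toy data, and learning-rate/iteration settings are all illustrative assumptions.

```python
import math
from collections import defaultdict

def predict(lambdas, doc, classes):
    """Conditional maxent distribution P(c | d) over classes.

    doc is a {word: count} map; features are f_{w,c}(d, c') = count of w
    in d if c' == c, else 0, so each class gets its own weight per word.
    """
    scores = {c: sum(lambdas.get((w, c), 0.0) * n for w, n in doc.items())
              for c in classes}
    m = max(scores.values())
    exps = {c: math.exp(s - m) for c, s in scores.items()}  # stable softmax
    z = sum(exps.values())
    return {c: v / z for c, v in exps.items()}

def train(data, classes, iters=500, lr=0.5):
    """Gradient ascent on conditional log-likelihood (stand-in for IIS)."""
    lambdas = defaultdict(float)
    for _ in range(iters):
        grad = defaultdict(float)
        for doc, label in data:
            p = predict(lambdas, doc, classes)
            for w, n in doc.items():
                # empirical feature count minus model-expected feature count:
                # this difference is zero exactly at the maxent constraints
                grad[(w, label)] += n
                for c in classes:
                    grad[(w, c)] -= p[c] * n
        for k, g in grad.items():
            lambdas[k] += lr * g / len(data)
    return lambdas

# Toy usage with hypothetical word-count documents and labels.
data = [({"ball": 2, "goal": 1}, "sports"),
        ({"vote": 3, "law": 1}, "politics")]
classes = ["sports", "politics"]
lambdas = train(data, classes)
print(predict(lambdas, {"goal": 2, "law": 1}, classes))
```

At convergence the model's expected feature counts match the empirical counts from the labeled data, which is exactly the "minimally non-uniform, subject to constraints" behavior the abstract describes.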
References
| | Author | title | journal | year |
|---|---|---|---|---|
| 1999 UsingMaximumEntropyforTextClassification | Kamal Nigam, John D. Lafferty, Andrew McCallum | Using Maximum Entropy for Text Classification | IJCAI-99 Workshop on Machine Learning for Information Filtering | 1999 |