Supervised Text-Item Classification Algorithm


A Supervised Text-Item Classification Algorithm is a data-driven text-item classification algorithm that is a supervised classification algorithm.
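As a minimal illustration of such an algorithm, the sketch below trains a multinomial naive Bayes text classifier (the event model compared in McCallum & Nigam, 1998, cited under References) on a small hypothetical labeled training set. scikit-learn is an assumed library choice, and the texts and labels are invented for illustration; nothing here is prescribed by this page.

```python
# Minimal sketch of a supervised text-item classification pipeline,
# assuming scikit-learn; the multinomial naive Bayes event model follows
# McCallum & Nigam (1998), cited in the References below.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled training data: (text item, class label) pairs.
train_texts = ["cheap pills online", "meeting at noon",
               "win a free prize", "lunch tomorrow?"]
train_labels = ["spam", "ham", "spam", "ham"]

# Supervised step: fit word-count features and class-conditional
# word probabilities from the labeled examples.
classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(train_texts, train_labels)

# Classification step: assign the most probable class to unseen items.
print(classifier.predict(["free pills", "see you at lunch"]))
```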



References

2001

  • (Slonim & Tishby, 2001) ⇒ N. Slonim, and N. Tishby. (2001). “The Power of Word Clusters for Text Classification.” In: Proceedings of the 23rd European Colloquium on Information Retrieval Research (ECIR 2001).

1999

  • (Yang & Liu, 1999) ⇒ Yiming Yang, and Xin Liu. (1999). “A Re-examination of Text Categorization Methods.” In: Proceedings of the 22nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 1999). http://dx.doi.org/10.1145/312624.312647
  • (Nigam et al., 1999) ⇒ Kamal Nigam, John Lafferty, and Andrew McCallum. (1999). “Using Maximum Entropy for Text Classification.” In: Proceedings of the IJCAI-99 Workshop on Machine Learning for Information Filtering. http://www.kamalnigam.com/papers/maxent-ijcaiws99.pdf
    ◦ QUOTE: Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principle of maximum entropy is that without external knowledge, one should prefer distributions that are uniform. Constraints on the distribution, derived from labeled training data, inform the technique where to be minimally non-uniform. The maximum entropy formulation has a unique solution, which can be found by the improved iterative scaling algorithm. In this paper, maximum entropy is used for text classification by estimating the conditional distribution of the class variable given the document. In experiments on several text datasets we compare accuracy to naive Bayes and show that maximum entropy is sometimes significantly better, but also sometimes worse. Much future work remains, but the results indicate that maximum entropy is a promising technique for text classification.
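To make the quoted description concrete, here is a minimal sketch, assuming scikit-learn and hypothetical documents and labels. Multinomial logistic regression is mathematically the same maximum-entropy model, with the lbfgs solver standing in for the improved iterative scaling procedure mentioned in the quote, and predict_proba returning the estimated conditional distribution of the class variable given the document.

```python
# Maximum-entropy text classification sketch (an illustration, not the
# paper's original implementation). Multinomial logistic regression is
# equivalent to the maximum-entropy model: among all distributions
# P(class | document) satisfying feature-expectation constraints from the
# labeled training data, it yields the one with maximal entropy.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled documents.
docs = ["stock prices fell", "the team won the match",
        "markets rally today", "coach praises players"]
labels = ["finance", "sports", "finance", "sports"]

# The lbfgs solver optimizes the same convex objective that improved
# iterative scaling solves; both reach the unique maximum-entropy solution.
maxent = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
maxent.fit(docs, labels)

# Estimated conditional class distribution P(class | document).
print(maxent.predict_proba(["players rally after the match"]))
```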

1998

  • (Apte et al., 1998) ⇒ C. Apte, F. Damerau, and Sholom M. Weiss. (1998). “Text Mining with Decision Rules and Decision Trees.” In: Proceedings of the Conference on Automated Learning and Discovery, Workshop 6: Learning from Text and the Web.
  • (McCallum & Nigam, 1998) ⇒ Andrew McCallum, and Kamal Nigam. (1998). “A Comparison of Event Models for Naive Bayes Text Classification.” In: Proceedings of the AAAI-98 Workshop on Learning for Text Categorization.


1995

  • (Wiener et al., 1995) ⇒ E. Wiener, J. O. Pedersen, and A. S. Weigend. (1995). “A Neural Network Approach to Topic Spotting.” In: Proceedings of the Fourth Annual Symposium on Document Analysis and Information Retrieval (SDAIR 1995).
  • (Cohen, 1995) ⇒ William W. Cohen. (1995). “Text Categorization and Relational Learning.” In: Proceedings of the Twelfth International Conference on Machine Learning (ICML 1995).

1991

  • (Fuhr et al., 1991) ⇒ N. Fuhr, S. Hartmann, G. Lustig, M. Schwantner, and K. Tzeras. (1991). “AIR/X - a Rule-based Multistage Indexing System for Large Subject Fields.” In: Proceedings of RIAO 1991.