1999 UnsupervisedNamedEntityClassification

From GM-RKB

Subject Headings: Semi-Supervised Named Entity Recognition Algorithm, Bootstrapping, Unsupervised Learning Algorithm.

Notes

Cited By

~578 http://scholar.google.com/scholar?cites=13260482622036226672

Quotes

Abstract

  • This paper discusses the use of unlabeled examples for the problem of named entity classification. A large number of rules is needed for coverage of the domain, suggesting that a fairly large number of labeled examples should be required to train a classifier. However, we show that the use of unlabeled data can reduce the requirements for supervision to just 7 simple “seed” rules. The approach gains leverage from natural redundancy in the data: for many named-entity instances both the spelling of the name and the context in which it appears are sufficient to determine its type. We present two algorithms. The first method uses a similar algorithm to that of (Yarowsky 95), with modifications motivated by (Blum and Mitchell 98). The second algorithm extends ideas from boosting algorithms, designed for supervised learning tasks, to the framework suggested by (Blum and Mitchell 98).
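The bootstrapping idea in the abstract (a few seed rules on the spelling of a name, with rules over the surrounding context learned from unlabeled data, and vice versa) can be illustrated with a small sketch. This is a simplified co-training-style loop written for illustration only; it is not the paper's own algorithms. The toy examples, the feature strings such as `contains=Mr.` and `ctx=president`, the function names, and the `min_precision` threshold are all assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Toy unlabeled data (invented): each example is (spelling features, context features).
EXAMPLES = [
    ({"contains=Mr."}, {"ctx=president"}),
    ({"full=Smith"}, {"ctx=president"}),
    ({"contains=Inc."}, {"ctx=maker"}),
    ({"full=Acme"}, {"ctx=maker"}),
    ({"full=Smith"}, {"ctx=said"}),
]

# Hypothetical seed rules over the spelling view only.
SEED_RULES = {"contains=Mr.": "person", "contains=Inc.": "organization"}


def label_with(rules, view, examples):
    """Label every example that has at least one feature covered by a rule."""
    labels = {}
    for i, views in enumerate(examples):
        votes = Counter(rules[f] for f in sorted(views[view]) if f in rules)
        if votes:
            labels[i] = votes.most_common(1)[0][0]
    return labels


def induce_rules(labels, view, examples, min_precision=0.95):
    """Keep feature -> label rules whose precision on the labeled set is high."""
    counts = defaultdict(Counter)
    for i, label in labels.items():
        for f in examples[i][view]:
            counts[f][label] += 1
    rules = {}
    for f, c in counts.items():
        label, n = c.most_common(1)[0]
        if n / sum(c.values()) >= min_precision:
            rules[f] = label
    return rules


def cotrain(examples, seed_rules, rounds=3):
    """Alternate between the spelling view (0) and the context view (1)."""
    spelling_rules = dict(seed_rules)
    context_rules = {}
    for _ in range(rounds):
        labels = label_with(spelling_rules, 0, examples)
        context_rules = induce_rules(labels, 1, examples)
        labels = label_with(context_rules, 1, examples)
        # Seed rules are always kept, whatever the induced rules say.
        spelling_rules = {**induce_rules(labels, 0, examples), **seed_rules}
    return spelling_rules, context_rules


spelling_rules, context_rules = cotrain(EXAMPLES, SEED_RULES)
print(spelling_rules)
print(context_rules)
```

In this toy run, neither seed rule covers the name "Smith", yet it ends up with a rule because it shares a context with a name the seeds do cover. That is the redundancy between spelling and context that the abstract refers to.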

References

  • M. Berland and Eugene Charniak. (1999). Finding Parts in Very Large Corpora. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99).
  • D. M. Bikel, S. Miller, R. Schwartz, and R. Weischedel. (1997). Nymble: a High-Performance Learning Name-finder. In: Proceedings of the Fifth Conference on Applied Natural Language Processing, pages 194-201.
  • A. Blum and Tom M. Mitchell. (1998). Combining Labeled and Unlabeled Data with Co-Training. In: Proceedings of the 11th Annual Conference on Computational Learning Theory (COLT-98).
  • Eric D. Brill. (1995). Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging. In: Proceedings of the Third Workshop on Very Large Corpora.
  • S. Brin. (1998). Extracting Patterns and Relations from the World Wide Web. In: WebDB Workshop at EDBT '98.
  • Michael Collins. (1996). A New Statistical Parser Based on Bigram Lexical Dependencies. In: Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, pages 184-191.
  • Arthur P. Dempster, N.M. Laird, and D.B. Rubin. (1977). Maximum Likelihood from Incomplete Data Via the EM Algorithm. Journal of the Royal Statistical Society, Ser B, 39, 1-38.
  • Yoav Freund. (1995). Boosting a Weak Learning Algorithm by Majority. Information and Computation, 121(2):256-285.
  • Yoav Freund and Robert E. Schapire. (1997). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55(1):119-139.
  • Marti Hearst. (1992). Automatic Acquisition of Hyponyms from Large Text Corpora. In: Proceedings of the Fourteenth International Conference on Computational Linguistics.
  • Michael Kearns. (1988). Thoughts on Hypothesis Boosting. Unpublished manuscript, December 1988.
  • John D. Lafferty. (1999). Additive Models, Boosting, and Inference for Generalized Divergences. In: Proceedings of the Twelfth Annual Conference on Computational Learning Theory.
  • Proceedings of the Sixth Message Understanding Conference (MUC-6). Morgan Kaufmann, San Mateo, CA.
  • E. Riloff and J. Shepherd. (1997). A Corpus-based Approach for Building Semantic Lexicons. In: Proceedings of the Second Conference on Empirical Methods in Natural Language Processing (EMNLP-2).
  • E. Riloff and R. Jones. (1999). Learning Dictionaries for Information Extraction by Multi-level Bootstrapping. In: Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-99).
  • Robert E. Schapire. (1990). The Strength of Weak Learnability. Machine Learning, 5(2):197-227.
  • Robert E. Schapire and Yoram Singer. (1998). Improved Boosting Algorithms Using Confidence-Rated Predictions. In: Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 80-91. To appear, Machine Learning.
  • L. G. Valiant. (1984). A Theory of the Learnable. Communications of the ACM, 27(11):1134-1142, November 1984.
  • David Yarowsky. (1995). Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics. Cambridge, MA, pp. 189-196.

1999 UnsupervisedNamedEntityClassification

  • Author: Michael Collins, Yoram Singer
  • Title: Unsupervised Models for Named Entity Classification
  • Title URL: http://acl.ldc.upenn.edu/W/W99/W99-0613.pdf