2008 AnAnalysisofActiveLearningStrat

(Settles & Craven, 2008) ⇒ Burr Settles and Mark Craven. (2008). “An Analysis of Active Learning Strategies for Sequence Labeling Tasks.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Subject Headings: Active Learning, Supervised NLP Task.

Quotes

Abstract

Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed light on the best active learning approaches for sequence labeling tasks such as information extraction and document segmentation. We survey previously used query selection strategies for sequence models, and propose several novel algorithms to address their shortcomings. We also conduct a large-scale empirical comparison using multiple corpora, which demonstrates that our proposed methods advance the state of the art.
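As an illustration of what a query selection strategy for sequence models can look like, the following is a minimal sketch of pool-based uncertainty sampling using mean per-token entropy. It is not taken from the paper: the function names `token_entropy` and `select_queries`, the toy pool, and the label marginals are all hypothetical, and a real system would obtain the marginals from a trained sequence model such as a CRF.

```python
import math

def token_entropy(marginals):
    """Mean per-token Shannon entropy (in nats) of a sequence's label marginals.

    `marginals` is a non-empty list with one dict per token,
    mapping label -> probability (assumed to sum to 1 per token).
    """
    total = 0.0
    for dist in marginals:
        # Zero-probability labels contribute nothing to the entropy.
        total -= sum(p * math.log(p) for p in dist.values() if p > 0.0)
    return total / len(marginals)

def select_queries(pool, batch_size):
    """Return the ids of the `batch_size` most uncertain sequences in the pool."""
    ranked = sorted(pool, key=lambda sid: token_entropy(pool[sid]), reverse=True)
    return ranked[:batch_size]

# Toy pool: hypothetical marginals that a trained sequence model might output.
pool = {
    "s1": [{"B": 0.9, "O": 0.1}, {"B": 0.8, "O": 0.2}],
    "s2": [{"B": 0.5, "O": 0.5}, {"B": 0.6, "O": 0.4}],
    "s3": [{"B": 1.0, "O": 0.0}, {"B": 0.99, "O": 0.01}],
}

# The sequences whose labels the model is least sure about are sent to the annotator.
print(select_queries(pool, 2))
```

In this toy pool the sequence with near-uniform marginals is ranked first and the near-certain one last, which is the intended behavior: annotation effort goes to the instances the current model finds hardest.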

Author: Burr Settles, Mark Craven
Title: An Analysis of Active Learning Strategies for Sequence Labeling Tasks
Year: 2008