2008 BootstrappingNamedEntityAnnotation

Subject Headings: Named Entity Recognition, Active Learning

Quotes

Abstract

This thesis describes the development and in-depth empirical investigation of a method, called BootMark, for bootstrapping the marking up of named entities in textual documents. The reason for working with documents, as opposed to for instance sentences or phrases, is that the BootMark method is concerned with the creation of corpora. The claim made in the thesis is that BootMark requires a human annotator to manually annotate fewer documents in order to produce a named entity recognizer with a given performance, than would be needed if the documents forming the basis for the recognizer were randomly drawn from the same corpus. The intention is then to use the created named entity recognizer as a pre-tagger and thus eventually turn the manual annotation process into one in which the annotator reviews system-suggested annotations rather than creating new ones from scratch. The BootMark method consists of three phases: (1) Manual annotation of a set of documents; (2) Bootstrapping – active machine learning for the purpose of selecting which document to annotate next; (3) The remaining unannotated documents of the original corpus are marked up using pre-tagging with revision.

Five emerging issues are identified, described and empirically investigated in the thesis. Their common denominator is that they all depend on the realization of the named entity recognition task, and as such, require the context of a practical setting in order to be properly addressed. The emerging issues are related to: (1) the characteristics of the named entity recognition task and the base learners used in conjunction with it; (2) the constitution of the set of documents annotated by the human annotator in phase one in order to start the bootstrapping process; (3) the active selection of the documents to annotate in phase two; (4) the monitoring and termination of the active learning carried out in phase two, including a new intrinsic stopping criterion for committee-based active learning; and (5) the applicability of the named entity recognizer created during phase two as a pre-tagger in phase three.

The outcomes of the empirical investigations concerning the emerging issues support the claim made in the thesis. The results also suggest that while the recognizer produced in phases one and two is as useful for pre-tagging as a recognizer created from randomly selected documents, the applicability of the recognizer as a pre-tagger is best investigated by conducting a user study involving real annotators working on a real named entity recognition task.
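
Phase two of the method described in the abstract (choosing which document to annotate next) can be illustrated with a minimal sketch. The abstract does not say how committee disagreement is measured, so the sketch assumes vote entropy averaged over the tokens of a document, a common choice in committee-based active learning. All function names, the toy taggers, and the toy pool are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A committee member maps a token sequence to one label per token.
Tagger = Callable[[Sequence[str]], List[str]]


def vote_entropy(votes: Sequence[str]) -> float:
    """Entropy (in bits) of the committee's label votes for one token."""
    k = len(votes)
    return -sum((c / k) * math.log2(c / k) for c in Counter(votes).values())


def document_disagreement(tokens: Sequence[str], committee: Sequence[Tagger]) -> float:
    """Mean per-token vote entropy; higher means the committee disagrees more."""
    if not tokens:
        return 0.0
    predictions = [tagger(tokens) for tagger in committee]
    # zip(*predictions) groups the committee's votes token by token.
    per_token = [vote_entropy(votes) for votes in zip(*predictions)]
    return sum(per_token) / len(per_token)


def select_next_document(pool: Dict[str, Sequence[str]], committee: Sequence[Tagger]) -> str:
    """Return the id of the unannotated document with the highest disagreement."""
    return max(pool, key=lambda doc_id: document_disagreement(pool[doc_id], committee))


# Toy committee (assumption): three deliberately simple taggers.
def capital_tagger(tokens: Sequence[str]) -> List[str]:
    return ["NAME" if t[:1].isupper() else "O" for t in tokens]


def gazetteer_tagger(tokens: Sequence[str]) -> List[str]:
    return ["NAME" if t in {"Gothenburg"} else "O" for t in tokens]


def null_tagger(tokens: Sequence[str]) -> List[str]:
    return ["O"] * len(tokens)


committee = [capital_tagger, gazetteer_tagger, null_tagger]
pool = {
    "doc_a": ["the", "cat", "sat"],
    "doc_b": ["Olsson", "visited", "Gothenburg"],
}
next_doc = select_next_document(pool, committee)
```

In a full loop, the selected document would be annotated by the human, added to the training set, and the committee retrained before the next selection. Averaging over tokens keeps long documents from being favoured simply for their length; summing instead would be an equally defensible assumption.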




Author: Fredrik Olsson
Title: Bootstrapping Named Entity Annotation by Means of Active Machine Learning
URL: http://gupea.ub.gu.se/dspace/bitstream/2077/18722/2/gupea 2077 18722 2.pdf
Year: 2008