2004 CotrainAndSelftrainForWSD

Subject Headings: Word Sense Disambiguation Algorithm, Cotraining Algorithm, Self-Training Algorithm.



This paper investigates the application of co-training and self-training to word sense disambiguation. Optimal and empirical parameter selection methods for co-training and self-training are investigated, with various degrees of error reduction. A new method that combines co-training with majority voting is introduced, with the effect of smoothing the bootstrapping learning curves and improving the average performance.
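As a rough illustration of the bootstrapping idea, the sketch below implements a self-training loop in Python. It is not the paper's implementation: the bag-of-words Naive Bayes classifier, the function names (`train`, `predict`, `self_train`), and the parameter names (`iterations`, `pool_size`, `growth_size`) are all assumptions chosen for illustration. In each round the classifier labels a pool of unlabeled examples, and the most confidently labeled ones are added to the training set.

```python
import math
from collections import Counter, defaultdict

def train(labeled):
    """Fit a tiny bag-of-words Naive Bayes model from (tokens, sense) pairs."""
    word_counts = defaultdict(Counter)   # sense -> token counts
    sense_counts = Counter()             # sense -> number of examples
    vocab = set()
    for tokens, sense in labeled:
        sense_counts[sense] += 1
        word_counts[sense].update(tokens)
        vocab.update(tokens)
    return word_counts, sense_counts, vocab

def predict(model, tokens):
    """Return (best_sense, confidence); confidence is the posterior of the best sense."""
    word_counts, sense_counts, vocab = model
    total = sum(sense_counts.values())
    log_scores = {}
    for sense, n in sense_counts.items():
        # Add-one smoothing, with one extra slot for unseen tokens.
        denom = sum(word_counts[sense].values()) + len(vocab) + 1
        score = math.log(n / total)
        for tok in tokens:
            score += math.log((word_counts[sense][tok] + 1) / denom)
        log_scores[sense] = score
    top = max(log_scores.values())
    exp = {s: math.exp(v - top) for s, v in log_scores.items()}
    best = max(exp, key=exp.get)
    return best, exp[best] / sum(exp.values())

def self_train(labeled, unlabeled, iterations=3, pool_size=4, growth_size=2):
    """Hypothetical self-training loop: grow the labeled set with confident guesses."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(iterations):
        if not unlabeled:
            break
        model = train(labeled)
        pool, unlabeled = unlabeled[:pool_size], unlabeled[pool_size:]
        # Label the pool and rank by classifier confidence, highest first.
        scored = sorted(((predict(model, t), t) for t in pool),
                        key=lambda item: item[0][1], reverse=True)
        for (sense, _conf), tokens in scored[:growth_size]:
            labeled.append((tokens, sense))
        # Examples that were not selected go back to the unlabeled set.
        unlabeled.extend(tokens for _, tokens in scored[growth_size:])
    return train(labeled), labeled
```

Co-training follows the same loop but, in the standard formulation of Blum and Mitchell, trains two classifiers on different feature views and lets each contribute its confidently labeled examples to the shared training set.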


This paper investigated the application of co-training and self-training to supervised word sense disambiguation. If the right parameters for co-training and self-training can be identified for each individual classifier, an average error reduction of 25.5% is achieved, with similar performance observed for both co-training and self-training. Given that these optimal settings cannot always be identified in practical applications, two algorithms for empirical parameter selection were investigated, both using a validation set: global settings, determined as the best set of parameters across all classifiers, and per-word settings, identified separately for each classifier. An improved co-training method was also introduced that combines co-training with majority voting, with the effect of smoothing the learning curves and improving the average performance. This improved co-training algorithm, applied with a global parameter selection scheme, brought a significant error reduction of 9.8% with respect to the basic classifier, which shows that co-training can be successfully employed in practice for bootstrapping sense classifiers.
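The text above does not spell out how the majority vote is taken. One plausible reading, sketched below in Python purely as an assumption, is that the label reported for each test instance at iteration i is the majority vote over the classifiers from iterations 0 through i, which damps flips between consecutive iterations. The function name `smoothed_predictions` and the tie-breaking rule (the earliest-seen label wins) are hypothetical.

```python
from collections import Counter

def smoothed_predictions(per_iteration_labels):
    """Smooth a bootstrapping run by voting across iterations.

    per_iteration_labels[i][j] is the label that the classifier from
    iteration i assigns to test instance j. The result has the same shape;
    row i holds, for each instance, the majority vote over rows 0..i.
    """
    if not per_iteration_labels:
        return []
    # One running vote tally per test instance.
    tallies = [Counter() for _ in per_iteration_labels[0]]
    smoothed = []
    for row in per_iteration_labels:
        voted = []
        for tally, label in zip(tallies, row):
            tally[label] += 1
            # max() returns the first key with the top count, so ties go
            # to the label that was seen earliest for this instance.
            voted.append(max(tally, key=tally.get))
        smoothed.append(voted)
    return smoothed
```

With raw per-iteration labels `[["a", "b"], ["b", "b"], ["a", "a"]]`, the first instance flips from "a" to "b" and back while the second flips at the last iteration; the voted output keeps "a" and "b" respectively at every iteration.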


  • Steven P. Abney. (2002). Bootstrapping. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL 2002), pages 360–367, Philadelphia, PA, July.
  • A. Blum and Tom M. Mitchell. (1998). Combining labeled and unlabeled data with co-training. In: COLT: Proceedings of the Workshop on Computational Learning Theory, Morgan Kaufmann Publishers.
  • S. Clark, J. R. Curran, and M. Osborne. (2003). Bootstrapping POS taggers using unlabelled data. In: Walter Daelemans and Miles Osborne, editors, Proceedings of CoNLL-2003, pages 49–55, Edmonton, Canada.
  • Y.K. Lee and H.T. Ng. (2002). An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. In: Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), pages 41–48, Philadelphia, June.
  • Bernardo Magnini, C. Strapparava, G. Pezzulo, and A. Gliozzo. (2002). Using domain information for word sense disambiguation. In: Proceedings of Senseval-2 Workshop, Association for Computational Linguistics, pages 111–115, Toulouse, France.
  • Rada Mihalcea. (2002). Instance Based Learning with Automatic Feature Selection Applied to Word Sense Disambiguation. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING-ACL 2002).
  • V. Ng and C. Cardie. (2003). Weakly supervised natural language learning without redundant views. In: Human Language Technology/Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), Edmonton, Canada, May.
  • K. Nigam and R. Ghani. (2000). Analyzing the effectiveness and applicability of co-training. In: Proceedings of the Conference on Information and Knowledge Management (CIKM-2000), pages 86–93, McLean, Virginia.
  • D. Pierce and C. Cardie. (2001). Limitations of co-training for natural language learning from large datasets. In: Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP-2001), Pittsburgh, PA.
  • Anoop Sarkar. (2001). Applying co-training methods to statistical parsing. In: Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL 2001), Pittsburgh, June.
  • David Yarowsky and R. Florian. (2002). Evaluating sense disambiguation across diverse parameter spaces. JNLE Special Issue on Evaluating Word Sense Disambiguation Systems. Forthcoming.


  • Key: 2004 CotrainAndSelftrainForWSD
  • Author: Rada Mihalcea
  • Title: Co-training and Self-training for Word Sense Disambiguation
  • Journal: Proceedings of NAACL Conference
  • Title URL: http://acl.ldc.upenn.edu/hlt-naacl2004/conll04/pdf/mihalcea.pdf
  • Year: 2004