2006 EffectiveSelfTrainingforParsing
- (McClosky et al., 2006) ⇒ David McClosky, Eugene Charniak, and Mark Johnson. (2006). “Effective Self-training for Parsing.” In: Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics. doi:10.3115/1220835.1220855
Subject Headings:
Notes
- Additional information can be found at http://stanford.edu/~mcclosky/selftraining.html
Cited By
- http://scholar.google.com/scholar?q=%22Effective+self-training+for+parsing%22+2006
- http://dl.acm.org/citation.cfm?id=1220835.1220855&preflayout=flat#citedby
Quotes
Abstract
We present a simple, but surprisingly effective, method of self-training a two-phase parser-reranker system using readily available unlabeled data. We show that this type of bootstrapping is possible for parsing when the bootstrapped parses are processed by a discriminative reranker. Our improved model achieves an f-score of 92.1%, an absolute 1.1% improvement (12% error reduction) over the previous best result for Wall Street Journal parsing. Finally, we provide some analysis to better understand the phenomenon.
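The figures quoted in the abstract can be checked against one another. The following is a worked sketch, assuming the 12% error reduction is measured relative to the previous best system's error rate (100% minus its f-score); the symbol $F_{\text{prev}}$ is introduced here for illustration and does not appear in the abstract.

```latex
\[
F_{\text{prev}} = 92.1\% - 1.1\% = 91.0\%,
\qquad
\frac{(100 - 91.0) - (100 - 92.1)}{100 - 91.0}
  = \frac{1.1}{9.0} \approx 12.2\% \approx 12\%
\]
```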
2 Previous work
A simple method of incorporating unlabeled data into a new model is self-training. In self-training, the existing model first labels unlabeled data. The newly labeled data is then treated as truth and combined with the actual labeled data to train a new model. This process can be iterated over different sets of unlabeled data if desired. It is not surprising that self-training is not normally effective: Charniak (1997) and Steedman et al. (2003) report either minor improvements or significant damage from using self-training for parsing. Clark et al. (2003) applies self-training to POS-tagging and reports the same outcomes. One would assume that errors in the original model would be amplified in the new model.
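The generic self-training loop described above can be sketched as follows. This is a minimal illustration, not the authors' parser-reranker system: the word-to-label "model" is a toy stand-in for a parser, the names `train_majority_model` and `self_train` are hypothetical, and accumulating the automatically labeled data across rounds is an assumption the passage does not specify.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable, List, Sequence, Tuple

Example = Tuple[str, str]        # (input word, label)
Model = Callable[[str], str]     # maps an input word to a predicted label


def train_majority_model(examples: Sequence[Example]) -> Model:
    """Toy stand-in for a parser: predict each word's most frequent label."""
    per_word = defaultdict(Counter)
    overall = Counter()
    for word, label in examples:
        per_word[word][label] += 1
        overall[label] += 1
    default_label = overall.most_common(1)[0][0]  # fallback for unseen words

    def model(word: str) -> str:
        if word in per_word:
            return per_word[word].most_common(1)[0][0]
        return default_label

    return model


def self_train(
    labeled: Sequence[Example],
    unlabeled_batches: Iterable[Sequence[str]],
    train: Callable[[Sequence[Example]], Model],
) -> Model:
    """Label each unlabeled batch with the current model, then retrain."""
    training_set: List[Example] = list(labeled)
    model = train(training_set)
    for batch in unlabeled_batches:
        # The existing model labels the unlabeled data ...
        auto_labeled = [(x, model(x)) for x in batch]
        # ... and its output is treated as truth alongside the real labels.
        # Assumption: auto-labeled data accumulates across rounds.
        training_set.extend(auto_labeled)
        model = train(training_set)
    return model


labeled = [("bank", "NOUN"), ("dog", "NOUN"), ("run", "VERB")]
unlabeled_batches = [["bank", "walk"], ["walk", "run"]]
final_model = self_train(labeled, unlabeled_batches, train_majority_model)
print(final_model("walk"))
```

In this toy run the initial model has never seen "walk" and falls back to the majority label; that guess then enters the training data as if it were correct. This is the kind of error amplification the passage warns about.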
References
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2006 EffectiveSelfTrainingforParsing | Eugene Charniak, Mark Johnson, David McClosky | | | Effective Self-training for Parsing | | | | 10.3115/1220835.1220855 | | 2006 |