2006 ProbabilisticWSDAnalysisAndTechniques

From GM-RKB

Subject Headings: Supervised Word Sense Disambiguation Algorithm.

Notes

  • This technical report is based on a dissertation submitted July 2005 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity College.

Quotes

Abstract

  • This thesis shows that probabilistic word sense disambiguation systems based on established statistical methods are strong competitors to current state-of-the-art word sense disambiguation (WSD) systems.
  • We begin with a survey of approaches to WSD, and examine their performance in the systems submitted to the SENSEVAL-2 WSD evaluation exercise. We discuss existing resources for WSD, and investigate the amount of training data needed for effective supervised WSD.
  • We then present the design of a new probabilistic WSD system. The main feature of the design is that it combines multiple probabilistic modules using both Dempster-Shafer theory and Bayes Rule. Additionally, the use of Lidstone’s smoothing provides a uniform mechanism for weighting modules based on their accuracy, removing the need for an additional weighting scheme.
  • Lastly, we evaluate our probabilistic WSD system using traditional evaluation methods, and introduce a novel task-based approach. When evaluated on the gold standard used in the SENSEVAL-2 competition, the performance of our system lies between the first and second ranked WSD systems submitted to the English all-words task.
  • Task-based evaluations are becoming more popular in natural language processing, being an absolute measure of a system’s performance on a given task. We present a new evaluation method based on subcategorization frame acquisition. Experiments with our probabilistic WSD system give an extremely high correlation between subcategorization frame acquisition performance and WSD performance, thus demonstrating the suitability of SCF acquisition as a WSD evaluation task.
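
The abstract names three ingredients: Lidstone's smoothing, Dempster-Shafer theory, and Bayes Rule. The sketch below illustrates the first two in their textbook form only. It is not taken from the thesis: the function names `lidstone` and `dempster_combine`, the value of `lam`, the sense labels, and the counts are all hypothetical, and the abstract does not say how the actual system wires its modules together. One thing the sketch does show is why smoothing can act as an implicit weighting: a module with little evidence yields a flatter distribution and therefore pulls the combined result less.

```python
from itertools import product


def lidstone(counts, senses, lam=0.5):
    """Lidstone-smoothed estimate: (count + lam) / (total + lam * |senses|)."""
    total = sum(counts.get(s, 0) for s in senses)
    denom = total + lam * len(senses)
    return {s: (counts.get(s, 0) + lam) / denom for s in senses}


def dempster_combine(m1, m2):
    """Dempster's rule of combination.

    Each mass function maps a frozenset of senses to a mass; masses sum to 1.
    Mass assigned to disjoint pairs is treated as conflict and normalized away.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def as_mass(prob):
    """Turn a probability distribution into a mass function on singletons."""
    return {frozenset({s}): p for s, p in prob.items()}


# Hypothetical sense inventory and per-module counts (illustrative only).
senses = ["bank/finance", "bank/river"]
module_a = lidstone({"bank/finance": 8, "bank/river": 2}, senses)
module_b = lidstone({"bank/finance": 1}, senses)  # little evidence, flatter

fused = dempster_combine(as_mass(module_a), as_mass(module_b))
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

When every mass sits on a single sense, as here, Dempster's rule reduces to multiplying the two distributions and renormalizing. The rule only behaves differently from that product once a module assigns mass to a set of senses, for example to the whole inventory to express ignorance.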

  • Record: 2006 ProbabilisticWSDAnalysisAndTechniques
  • Author: Judita Preis
  • Title: Probabilistic Word Sense Disambiguation: Analysis and Techniques for Combining Knowledge Sources
  • URL: http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-673.html