1992 OneSensePerDiscourse

(Gale et al., 1992) ⇒ William A. Gale, Kenneth W. Church, and David Yarowsky. (1992). “One Sense per Discourse.” In: Proceedings of the DARPA Speech and Natural Language Workshop.

Subject Headings: One Sense per Discourse, Word Sense, Word Sense Disambiguation, Polysemous Word, One Sense Per Discourse Heuristic.

Cited By

(Yarowsky, 1993) ⇒ David Yarowsky. (1993). “One Sense per Collocation.” In: Proceedings of the Workshop on Human Language Technology. doi:10.3115/1075671.1075731
https://scholar.google.com/scholar?cluster=13407796614804380317&as_sdt=0,5
https://dl.acm.org/citation.cfm?id=1075579

Quotes

Abstract

It is well-known that there are polysemous words like sentence whose "meaning" or "sense" depends on the context of use. We have recently reported on two new word-sense disambiguation systems, one trained on bilingual material (the Canadian Hansards) and the other trained on monolingual material (Roget's Thesaurus and Grolier's Encyclopedia). As this work was nearing completion, we observed a very strong discourse effect. That is, if a polysemous word such as sentence appears two or more times in a well-written discourse, it is extremely likely that they will all share the same sense. This paper describes an experiment which confirmed this hypothesis and found that the tendency to share sense in the same discourse is extremely strong (98%). This result can be used as an additional source of constraint for improving the performance of the word-sense disambiguation algorithm. In addition, it could also be used to help evaluate disambiguation algorithms that did not make use of the discourse constraint.
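The abstract says the discourse effect can serve as an additional constraint on a disambiguation algorithm, but this excerpt does not say how. One plausible reading, offered here only as a minimal sketch, is a majority vote: every occurrence of a word within one discourse is relabeled with the sense predicted most often for that word in that discourse. The function name, the tuple layout, and the voting rule are assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter


def apply_one_sense_per_discourse(predictions):
    """Relabel each occurrence with the majority sense for its (discourse, word).

    predictions: list of (discourse_id, word, sense) tuples, one per occurrence.
    The majority-vote rule is an illustrative assumption, not the paper's method.
    """
    votes = {}
    for discourse_id, word, sense in predictions:
        votes.setdefault((discourse_id, word), Counter())[sense] += 1

    # Most frequent predicted sense for each (discourse, word) pair.
    majority = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    return [
        (discourse_id, word, majority[(discourse_id, word)])
        for discourse_id, word, _ in predictions
    ]
```

With three occurrences of sentence in one discourse, two tagged as the judicial sense and one as the linguistic sense, the sketch relabels all three as judicial; an occurrence in a different discourse is left unaffected.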

…

2.2. Bayesian Discrimination

Surprisingly good results can be achieved using Bayesian discrimination methods which have been used very successfully in many other applications, especially author identification (Mosteller and Wallace, 1964) and information retrieval (IR) (Salton, 1989, section 10.3). Our word-sense disambiguation algorithm uses the words in a 100-word context surrounding the polysemous word very much like the other two applications use the words in a test document.

It is common to use very small contexts (e.g., 5-words) based on the observation that people do not need very much context in order to perform the disambiguation task. In contrast, we use much larger contexts (e.g., 100 words). Although people may be able to make do with much less context, we believe the machine needs all the help it can get, and we have found that the larger context makes the task much easier. In fact, we have been able to measure information at extremely large distances (10,000 words away from the polysemous word in question), though obviously most of the useful information appears relatively near the polysemous word (e.g., within the first 100 words or so). Needless to say, our 100-word contexts are considerably larger than the smaller 5-word windows that one normally finds in the literature.
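The idea of scoring senses from the words in a wide context can be sketched as a bag-of-words naive Bayes classifier. This is a hedged illustration, not the system described in the paper: the class and function names are hypothetical, the window is assumed to be split evenly on both sides of the target word, and add-one smoothing is swapped in for simplicity because the excerpt does not state how probabilities are estimated.

```python
import math
from collections import Counter, defaultdict


def context_window(tokens, index, width=100):
    """Return up to `width` tokens around tokens[index], excluding the target.

    Assumption: the window is split evenly, width // 2 tokens on each side.
    """
    half = width // 2
    left = tokens[max(0, index - half):index]
    right = tokens[index + 1:index + 1 + half]
    return left + right


class NaiveBayesSenseClassifier:
    """Bag-of-words naive Bayes over context words (add-one smoothing assumed)."""

    def __init__(self):
        self.sense_counts = Counter()            # training examples per sense
        self.word_counts = defaultdict(Counter)  # context-word counts per sense
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (context_tokens, sense) pairs."""
        for context, sense in examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(context)
            self.vocab.update(context)

    def score(self, context, sense):
        """Log prior of the sense plus summed log likelihoods of context words."""
        total = sum(self.sense_counts.values())
        log_p = math.log(self.sense_counts[sense] / total)
        denom = sum(self.word_counts[sense].values()) + len(self.vocab)
        for word in context:
            log_p += math.log((self.word_counts[sense][word] + 1) / denom)
        return log_p

    def classify(self, context):
        """Return the trained sense with the highest score for this context."""
        return max(self.sense_counts, key=lambda s: self.score(context, s))
```

In use, `context_window(tokens, i)` would supply the context for the polysemous word at position `i`, and `classify` would return the best-scoring sense. Changing `width` from 100 to 5 gives the narrow window that the passage contrasts against.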

…


Citation record: 1992 OneSensePerDiscourse
Author: William A. Gale, Kenneth W. Church, David Yarowsky
title: One Sense per Discourse
journal: Proceedings of the DARPA Speech and Natural Language Workshop
titleUrl: http://www.coli.uni-saarland.de/~schulte/Teaching/ESSLLI-06/Referenzen/Senses/gale-et-al-1992.pdf
year: 1992
