1994 CoOccurrenceVectorsfromCorporaVsDistanceVectorsfromDictionaries

Subject Headings:

Notes

Cited By

Quotes

Abstract

A comparison was made of vectors derived by using ordinary co-occurrence statistics from large text corpora and of vectors derived by measuring the interword distances in dictionary definitions. The precision of word sense disambiguation by using co-occurrence vectors from the 1987 Wall Street Journal (20M total words) was higher than that by using distance vectors from the Collins English Dictionary (60K head words + 1.6M definition words). However, other experimental results suggest that distance vectors contain some different semantic information from co-occurrence vectors.

1. Introduction

3. Co-occurrence Vectors

We use ordinary co-occurrence statistics and measure the co-occurrence likelihood between two words, X and Y, by the mutual information estimate (Church and Hanks, 1989): [math]\displaystyle{ I(\mathbf{X},\mathbf{Y}) = \log^+ \frac{P(\mathbf{X} \mid \mathbf{Y})}{P(\mathbf{X})} }[/math], where P(X) is the occurrence density of word X in a whole corpus, and the conditional probability [math]\displaystyle{ P(\mathbf{X} \mid \mathbf{Y}) }[/math] is the density of word X in a neighborhood of word Y. Here the neighborhood is defined as 50 words before or after any appearance of word Y. (There is a variety of neighborhood definitions, such as "100 surrounding words" (Yarowsky, 1992) and "within a distance of no more than 3 words, ignoring function words" (Dagan et al., 1993).)
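To make this estimate concrete, the following is a minimal Python sketch (not from the paper) of building such a co-occurrence vector for a target word Y over a fixed vocabulary of dimension words X: P(X | Y) is estimated as the density of X within the ±50-word neighborhoods of Y, P(X) as its density in the whole corpus, and log⁺ clips negative values to zero. All names (cooccurrence_vector, corpus_tokens, vocab) are hypothetical.

```python
import math
from collections import Counter

def cooccurrence_vector(corpus_tokens, target, vocab, window=50):
    """Sketch: mutual-information co-occurrence vector for `target` (Y).

    corpus_tokens: list of word tokens from the corpus
    target: the word Y whose vector is built
    vocab: the dimension words X of the vector
    window: neighborhood size (50 words before or after each occurrence of Y)
    """
    n = len(corpus_tokens)
    corpus_counts = Counter(corpus_tokens)  # for P(X): density in the whole corpus

    # Collect all tokens within +/- `window` of each occurrence of `target`.
    neighborhood = Counter()
    neighborhood_size = 0
    for i, tok in enumerate(corpus_tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(n, i + window + 1)
            span = corpus_tokens[lo:i] + corpus_tokens[i + 1:hi]
            neighborhood.update(span)
            neighborhood_size += len(span)

    vector = {}
    for x in vocab:
        p_x = corpus_counts[x] / n                        # P(X)
        p_x_given_y = (neighborhood[x] / neighborhood_size
                       if neighborhood_size else 0.0)     # P(X | Y)
        # log+ : negative mutual information values are clipped to zero
        if p_x > 0 and p_x_given_y > 0:
            vector[x] = max(0.0, math.log(p_x_given_y / p_x))
        else:
            vector[x] = 0.0
    return vector
```

For instance, cooccurrence_vector(tokens, "bank", ["money", "river", "loan"]) would be expected to score "money" and "loan" above unrelated vocabulary words, which is the property the paper exploits for word sense disambiguation.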

References


(Niwa & Nitta, 1994) ⇒ Yoshiki Niwa, and Yoshihiko Nitta. (1994). "Co-occurrence Vectors from Corpora Vs. Distance Vectors from Dictionaries." In: Proceedings of the 15th International Conference on Computational Linguistics (COLING 1994). doi:10.3115/991886.991938