1997 SemanticSimBasedOnCorpusStats


Subject Headings: Jiang-Conrath Similarity Measure, Word Sense Disambiguation Algorithm, Lexical Semantic Similarity Function.

Notes

Cited By

2001

Quotes

Abstract

This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the semantic distance between nodes in the semantic space constructed by the taxonomy can be better quantified with the computational evidence derived from a distributional analysis of corpus data. Specifically, the proposed measure is a combined approach that inherits the edge-based approach of the edge counting scheme, which is then enhanced by the node-based approach of the information content calculation. When tested on a common data set of word pair similarity ratings, the proposed approach outperforms other computational models. It gives the highest correlation value (r = 0.828) with a benchmark based on human similarity judgements, whereas an upper bound (r = 0.885) is observed when human subjects replicate the same task.
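The abstract does not spell out the formula, so the following is only a minimal sketch of how a taxonomy and corpus counts can be combined. It assumes the commonly cited simplified form of the Jiang-Conrath distance, dist(c1, c2) = IC(c1) + IC(c2) - 2 * IC(lcs(c1, c2)), where IC(c) = -log p(c) and lcs is the lowest common subsumer in the IS-A hierarchy. The taxonomy, the counts, and all function names are hypothetical and chosen for illustration; none of them come from the paper.

```python
import math

# Hypothetical toy IS-A taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "animal": "entity",
    "dog": "animal",
}

# Made-up corpus counts for the words attached directly to each concept.
COUNTS = {"entity": 0, "vehicle": 10, "car": 60,
          "bicycle": 30, "animal": 20, "dog": 80}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def cumulative_counts():
    """Add each concept's count to itself and to every ancestor."""
    totals = {c: 0 for c in PARENT}
    for concept, count in COUNTS.items():
        for node in ancestors(concept):
            totals[node] += count
    return totals


TOTALS = cumulative_counts()
N = TOTALS["entity"]  # the root subsumes every counted occurrence


def information_content(concept):
    """IC(c) = -log p(c), with p(c) estimated from cumulative counts."""
    return -math.log(TOTALS[concept] / N)


def lowest_common_subsumer(c1, c2):
    """Return the deepest concept that is an ancestor of both inputs."""
    seen = set(ancestors(c1))
    for node in ancestors(c2):  # walks upward, so the first hit is deepest
        if node in seen:
            return node
    raise ValueError("concepts share no ancestor")


def jc_distance(c1, c2):
    """Simplified Jiang-Conrath distance (assumed form, see lead-in)."""
    lcs = lowest_common_subsumer(c1, c2)
    return (information_content(c1) + information_content(c2)
            - 2 * information_content(lcs))


if __name__ == "__main__":
    for pair in [("car", "bicycle"), ("car", "dog"), ("car", "car")]:
        print(pair, round(jc_distance(*pair), 3))
```

In this toy setup the distance between "car" and "bicycle" comes out smaller than between "car" and "dog", because the first pair shares the informative subsumer "vehicle" while the second pair meets only at the root, whose information content is zero.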


References

  • E. Agirre and G. Rigau, 1995, “A Proposal for Word Sense Disambiguation Using Conceptual Distance”, Proceedings of the First International Conference on Recent Advances in NLP, Bulgaria.
  • Kenneth W. Church and P. Hanks, 1989, “Word Association Norms, Mutual Information, and Lexicography”, Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, ACL27’89, 76-83.
  • Grefenstette, G., 1992, “Use of Syntactic Context to Produce Term Association Lists for Text Retrieval”, Proceedings of the 15th Annual International Conference on Research and Development in Information Retrieval, SIGIR’92.
  • Hindle, D., 1990, “Noun Classification from Predicate-Argument Structures”, Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, ACL28’90, 268-275.
  • Kozima, H. and T. Furugori, 1993, “Similarity Between Words Computed by Spreading Activations on an English Dictionary”, Proceedings of the 5th Conference of the European Chapter of the Association for Computational Linguistics, EACL-93, 232-239.
  • Lee, J.H., M.H. Kim, and Y.J. Lee, 1993, “Information Retrieval Based on Conceptual Distance in IS-A Hierarchies”, Journal of Documentation, Vol. 49, No. 2, 188-207.
  • George A. Miller, 1990, “Nouns in WordNet: A Lexical Inheritance System”, International Journal of Lexicography, Vol. 3, No. 4, 245-264.
  • George A. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. Miller, 1990, “Introduction to WordNet: An Online Lexical Database”, International Journal of Lexicography, Vol. 3, No. 4, 235-244.
  • George A. Miller and W.G. Charles, 1991, “Contextual Correlates of Semantic Similarity”, Language and Cognitive Processes, Vol. 6, No. 1, 1-28.
  • George A. Miller, C. Leacock, R. Tengi, and R.T. Bunker, 1993, “A Semantic Concordance”, Proceedings of ARPA Workshop on Human Language Technology, 303-308, March 1993.
  • Morris, J. and Graeme Hirst, 1991, “Lexical Cohesion Computed by Thesaural Relations as an Indicator of the Structure of Text”, Computational Linguistics, Vol. 17, 21-48.
  • Niwa, Y. and Y. Nitta, 1994, “Co-occurrence Vectors from Corpora vs. Distance Vectors from Dictionaries”, Proceedings of the 17th International Conference on Computational Linguistics, COLING’94, 304-309.
  • Rada, R., H. Mili, E. Bicknell, and M. Blettner, 1989, “Development and Application of a Metric on Semantic Nets”, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 19, No. 1, 17-30.
  • Philip Resnik, 1992, “WordNet and Distributional Analysis: A Class-based Approach to Lexical Discovery”, Proceedings of the AAAI Symposium on Probabilistic Approaches to Natural Language.
  • Philip Resnik, 1995, “Using Information Content to Evaluate Semantic Similarity in a Taxonomy”, Proceedings of the 14th International Joint Conference on Artificial Intelligence, Vol. 1, 448-453, Montreal, August 1995.
  • Richardson, R. and A.F. Smeaton, 1995, “Using WordNet in a Knowledge-based Approach to Information Retrieval”, Working Paper, CA-0395, School of Computer Applications, Dublin City University, Ireland.
  • Smeaton, A.F. and I. Quigley, 1996, “Experiments on Using Semantic Distance Between Words in Image Caption Retrieval”, Working Paper, CA-0196, School of Computer Applications, Dublin City University, Ireland.
  • Strzalkowski, T. and B. Vauthey, 1992, “Information Retrieval Using Robust Natural Language Processing”, Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, ACL 1992.
  • Sussna, M., 1993, “Word Sense Disambiguation for Free-text Indexing Using a Massive Semantic Network”, Proceedings of the Second International Conference on Information and Knowledge Management, CIKM 1993.


1997 SemanticSimBasedOnCorpusStats
Author: Jay J. Jiang, David W. Conrath
Title: Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy
Journal: Proceedings on International Conference on Research in Computational Linguistics
Year: 1997