2015 NASARIANovelApproachtoaSemantic
- (Camacho-Collados et al., 2015) ⇒ Jose Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. (2015). “NASARI: A Novel Approach to a Semantically-Aware Representation of Items.” In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (NAACL-HLT 2015).
Subject Headings: NASARI System; Word Embedding System; Semantic Similarity System; Word Sense Disambiguation System.
Notes
- Online Resource(s):
Cited By
- Google Scholar: ~ 110 Citations.
Quotes
Abstract
The semantic representation of individual word senses and concepts is of fundamental importance to several applications in Natural Language Processing. To date, concept modeling techniques have in the main based their representation either on lexicographic resources, such as WordNet, or on encyclopedic resources, such as Wikipedia. We propose a vector representation technique that combines the complementary knowledge of both these types of resource. Thanks to its use of explicit semantics combined with a novel cluster-based dimensionality reduction and an effective weighting scheme, our representation attains state-of-the-art performance on multiple datasets in two standard benchmarks: word similarity and sense clustering. We are releasing our vector representations at http://lcl.uniroma1.it/nasari/.
1. Introduction
...
In this paper we put forward a novel concept representation technique, called NASARI, which exploits the knowledge available in both types of resource in order to obtain effective representations of arbitrary concepts. The contributions of this paper are threefold. First, we propose a novel technique for rich semantic representation of arbitrary WordNet synsets or Wikipedia pages. Second, we provide improvements over the conventional tf-idf weighting scheme by applying lexical specificity (Lafon, 1980), a statistical measure mainly used for term extraction, to the task of computing vector weights in a vector representation. Third, we propose a semantically-aware dimensionality reduction technique that transforms a lexical item's representation from a semantic space of words to one of WordNet synsets, simultaneously providing an implicit disambiguation and a distribution smoothing. We demonstrate that our representation achieves state-of-the-art performance on two different tasks: (1) word similarity on multiple standard datasets: MC30, RG-65, and WordSim-353 similarity, and (2) Wikipedia sense clustering, in which our unsupervised system surpasses the performance of a state-of-the-art supervised technique that exploits knowledge available in Wikipedia in several languages.
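The excerpt above names lexical specificity as the weighting scheme but does not give its formula. As a rough illustration only, the sketch below assumes the common hypergeometric formulation: a word's weight is the negative base-10 logarithm of the probability of seeing it at least `f` times in a sub-corpus of `t` tokens, given that it occurs `F` times in a reference corpus of `T` tokens. The function names and the parameter names `T`, `t`, `F`, `f` are hypothetical and are not taken from the paper or from any released NASARI code.

```python
import math


def hypergeom_tail_counts(T, t, F, f):
    """Return (favourable, total) counts for P(X >= f), where
    X ~ Hypergeometric(population=T, successes=F, draws=t).

    Assumes 0 <= f <= min(F, t), F <= T and t <= T.
    """
    total = math.comb(T, t)
    upper = min(F, t)
    favourable = sum(
        math.comb(F, k) * math.comb(T - F, t - k) for k in range(f, upper + 1)
    )
    return favourable, total


def lexical_specificity(T, t, F, f):
    """Assumed definition: -log10 P(X >= f).

    T: tokens in the reference corpus
    t: tokens in the sub-corpus (e.g. the texts gathered for one concept)
    F: occurrences of the word in the reference corpus
    f: occurrences of the word in the sub-corpus
    """
    favourable, total = hypergeom_tail_counts(T, t, F, f)
    # Subtracting logs of exact integers avoids float underflow for tiny tails.
    return -(math.log10(favourable) - math.log10(total))


if __name__ == "__main__":
    # Toy numbers: a word with 3 occurrences in a 10-token reference corpus,
    # observed 0 to 3 times in a 5-token sub-corpus.
    for f in range(4):
        print(f, lexical_specificity(T=10, t=5, F=3, f=f))
```

Under this assumed definition, a word that is over-represented in the sub-corpus relative to the reference corpus receives a larger weight, and a word observed zero times receives weight 0, which is the behaviour a term-extraction measure would be expected to show.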
2. Semantic Representation of Concepts
3. NASARI for Semantic Similarity
4. Experiments
5. Related Work
6. Conclusions
Acknowledgments
Footnotes
References
BibTeX
@inproceedings{2015_NASARIANovelApproachtoaSemantic,
  author    = {Jose Camacho-Collados and Mohammad Taher Pilehvar and Roberto Navigli},
  editor    = {Rada Mihalcea and Joyce Yue Chai and Anoop Sarkar},
  title     = {NASARI: a Novel Approach to a Semantically-Aware Representation of Items},
  booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (NAACL-HLT 2015)},
  pages     = {567--577},
  publisher = {The Association for Computational Linguistics},
  year      = {2015},
  url       = {https://doi.org/10.3115/v1/n15-1059},
  doi       = {10.3115/v1/n15-1059},
}
| | Author | volume | Date Value | title | type | journal | titleUrl | doi | note | year |
|---|---|---|---|---|---|---|---|---|---|---|
| 2015 NASARIANovelApproachtoaSemantic | Mohammad Taher Pilehvar, Jose Camacho-Collados, Roberto Navigli | | | NASARI: A Novel Approach to a Semantically-Aware Representation of Items | | | | | | 2015 |