Text-Items Meaning Similarity Measure


A Text-Items Meaning Similarity Measure is an items meaning similarity measure (a meaning similarity measure) between two or more text items, such as linguistic sentences.
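As a rough illustration of what such a measure can look like, the sketch below scores two sentences by the cosine similarity of their word-count vectors. This is only a lexical-overlap baseline, chosen here for simplicity: it is not a method described on this page, and it captures meaning only to the extent that shared words indicate shared meaning. The function names `tokenize` and `cosine_similarity` are hypothetical, introduced for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two text items.

    Returns a value in [0, 1]; 0.0 is returned when either item has no tokens.
    """
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Dot product over the words the two items have in common.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("A man is playing a guitar.",
                            "A man plays the guitar."))
```

A baseline like this would score "cat" and "feline" as completely unrelated, which is the kind of gap that knowledge-based, corpus-based, and neural methods aim to close.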



References

2021

  • (Chandrasekaran & Mago, 2021) ⇒ Dhivya Chandrasekaran, and Vijay Mago. (2021). “Evolution of Semantic Similarity — a Survey.” ACM Computing Surveys (CSUR) 54, no. 2.
    • ABSTRACT: Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network–based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.

2016

Semantic Textual Similarity (STS) seeks to measure the degree of semantic equivalence between two snippets of text. Similarity is expressed on an ordinal scale that spans from semantic equivalence to complete unrelatedness. Intermediate values capture specifically defined levels of partial similarity. While prior evaluations constrained themselves to just monolingual snippets of text, the 2016 shared task includes a pilot subtask on computing semantic similarity on cross-lingual text snippets. This year’s traditional monolingual subtask involves the evaluation of English text snippets from the following four domains: Plagiarism Detection, Post-Edited Machine Translations, Question-Answering and News Article Headlines. From the question-answering domain, we include both question-question and answer-answer pairs. The cross-lingual subtask provides paired Spanish-English text snippets drawn from the same sources as the English data as well as independently sampled news data. The English subtask attracted 43 participating teams producing 119 system submissions, while the cross-lingual Spanish-English pilot subtask attracted 10 teams resulting in 26 systems.