2016 Modeling Semantic Compositionality of Relational Patterns

Subject Headings: Relational Pattern; Relational Pattern Learning System.

Notes

Cited By

Quotes

Author's Keywords

Abstract

Vector representation is a common approach for expressing the meaning of a relational pattern. Most previous work obtained a vector of a relational pattern based on the distribution of its context words (e.g., arguments of the relational pattern), regarding the pattern as a single 'word'. However, this approach suffers from the data sparseness problem, because relational patterns are productive, i.e., produced by combinations of words. To address this problem, we propose a novel method for computing the meaning of a relational pattern based on the semantic compositionality of constituent words. We extend the Skip-gram model (Mikolov et al., 2013) to handle semantic compositions of relational patterns using recursive neural networks. The experimental results show the superiority of the proposed method for modeling the meanings of relational patterns, and demonstrate the contribution of this work to the task of relation extraction.
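The following is a minimal sketch (Python with NumPy) of the kind of compositional model the abstract describes: constituent word vectors are folded into a single pattern vector by a recurrent composition h_t = tanh(W [h_{t-1}; x_t]), and the result is scored against a context word with a Skip-gram-style dot product. The dimensions, vocabulary, and composition function here are illustrative assumptions, not the authors' exact formulation.

import numpy as np

rng = np.random.default_rng(0)
dim = 50
vocab = ["cause", "an", "increase", "in", "smoking", "cancer"]
word_vec = {w: rng.normal(scale=0.1, size=dim) for w in vocab}  # input embeddings
ctx_vec = {w: rng.normal(scale=0.1, size=dim) for w in vocab}   # output (context) embeddings
W = rng.normal(scale=0.1, size=(dim, 2 * dim))                  # assumed composition matrix

def compose(pattern_words):
    """Fold constituent word vectors into one pattern vector, left to right."""
    h = word_vec[pattern_words[0]]
    for w in pattern_words[1:]:
        h = np.tanh(W @ np.concatenate([h, word_vec[w]]))
    return h

def skipgram_score(pattern_words, context_word):
    """Sigmoid of the dot product between the composed pattern vector and a
    context-word vector, as in Skip-gram with negative sampling."""
    v = compose(pattern_words)
    return 1.0 / (1.0 + np.exp(-v @ ctx_vec[context_word]))

# Even an unseen pattern gets a well-defined vector from its constituent words,
# which is the point of modeling semantic compositionality:
print(skipgram_score(["cause", "an", "increase", "in"], "smoking"))

In the actual model, the composition and embedding parameters would be trained jointly, raising such scores for observed (pattern, context) pairs and lowering them for sampled negatives.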

Introduction

Relation extraction is the task of extracting semantic relations between entities from corpora. It is crucial for a number of NLP applications such as question answering and recognizing textual entailment. In this task, it is essential to identify the meaning of a relational pattern (a linguistic pattern connecting entities). Based on the distributional hypothesis (Harris, 1954), most previous studies construct a co-occurrence matrix between relational patterns (e.g., “X cause Y”) and entity pairs (e.g., “X: smoking, Y: cancer”), and then recognize relational patterns sharing the same meaning by treating each pattern's co-occurrence distribution as its semantic vector (Mohamed et al., 2011; Min et al., 2012; Nakashole et al., 2012). For example, we can find that the patterns “X cause Y” and “X increase the risk of Y” have a similar meaning because they share many entity pairs (e.g., “X: smoking, Y: cancer”). Using semantic vectors, we can map a relational pattern such as “X cause Y” to a predefined semantic relation such as causality by computing the similarity between the semantic vector of the pattern and a prototype vector for the relation. In addition, we can discover relation types by clustering relational patterns based on their semantic vectors, as sketched below.
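As a toy illustration of this distributional baseline, the sketch below builds a small pattern-by-entity-pair co-occurrence matrix and compares patterns by cosine similarity. The counts are invented for illustration only, not corpus statistics.

import numpy as np

patterns = ["X cause Y", "X increase the risk of Y", "X prevent Y"]
entity_pairs = [("smoking", "cancer"), ("obesity", "diabetes"), ("vaccine", "measles")]

# counts[i][j] = how often pattern i connects entity pair j (toy numbers)
counts = np.array([
    [30.0, 12.0,  0.0],  # "X cause Y"
    [25.0, 10.0,  1.0],  # "X increase the risk of Y"
    [ 0.0,  1.0, 20.0],  # "X prevent Y"
])

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Patterns sharing many entity pairs end up with similar semantic vectors.
print(cosine(counts[0], counts[1]))  # high: both express causality
print(cosine(counts[0], counts[2]))  # low: a different relation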

However, this approach suffers from the data sparseness problem because it regards each pattern as a single ‘word’. Fig. 1 shows the frequency and rank of relational patterns appearing in the ukWaC corpus (Baroni et al., 2009); the graph confirms that the distribution of pattern occurrences follows Zipf's law. This raises two critical problems. First, the quality of a pattern's semantic vector may vary, because the frequency of occurrence of relational patterns varies drastically. For example, the pattern “X cause Y” can obtain sufficient co-occurrence statistics (it appears more than 10^5 times), while the pattern “X cause an increase in Y” cannot (it appears fewer than 10^2 times). Second, we cannot compute semantic vectors of out-of-vocabulary patterns. Less frequent relational patterns, say those occurring fewer than 10^2 times, are often discarded, even though we have no way of computing semantic vectors for the discarded or unseen patterns.
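The thresholding problem can be made concrete with a frequency count: patterns below a cutoff (here 10^2) have too few co-occurrences for a reliable vector and are typically discarded. The corpus below is a stand-in for illustration, not ukWaC.

from collections import Counter

corpus_patterns = (["X cause Y"] * 100_000
                   + ["X increase the risk of Y"] * 500
                   + ["X cause an increase in Y"] * 40)
freq = Counter(corpus_patterns)

threshold = 100  # patterns seen fewer than 10^2 times are often discarded
rare = [p for p, c in freq.items() if c < threshold]
print(rare)  # ['X cause an increase in Y'] gets no reliable co-occurrence vector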

References


Sho Takase, Naoaki Okazaki, and Kentaro Inui (2016). "Modeling Semantic Compositionality of Relational Patterns." In: Engineering Applications of Artificial Intelligence. doi:10.1016/j.engappai.2016.01.027