Keywords: Relation Detection from Text Algorithm, ACE Benchmark Task, SVMlight
Evaluation
Quotes
Abstract
- "This paper presents a new method for extracting meaningful relations from unstructured natural language sources. The method is based on information made available by shallow semantic parsers. Semantic information was used (1) to enhance a dependency tree kernel; and (2) to build semantic dependency structures used for enhanced relation extraction for several semantic classifiers. In our experiments the quality of the extracted relations surpassed the results of kernel-based models employing only semantic class information."
1 Introduction
- "Since meaningful relations between relevant entities are of semantic nature, we argue that additional semantic resources should be used for extracting relations from texts. In this work, we were interested in investigating the contribution of two shallow semantic parsing techniques to the quality of relation extraction."
- "We explored two main resources: PropBank and FrameNet. Proposition Bank or PropBank is a one million word corpus annotated with predicate-argument structures. The corpus consists of the Penn Treebank 2 Wall Street Journal texts (www.cis.upenn.edu/~treebank). The PropBank annotations were performed at University of Pennsylvania (www.cis.upenn.edu/~ace). To date PropBank has addressed only predicates lexicalized by verbs, proceeding from the most to the least common verbs while annotating verb predicates in the corpus. The FrameNet project (www.icsi.berkeley.edu/~framenet) produced a lexico-semantic resource encoding a set of frames, which represent schematic representations of situations characterized by a set of target words, or lexicalized predicates, which can be verbs, nouns or adjectives. In each frame, various participants and conceptual roles are related by case-roles or theta-roles which are called frame elements or FEs. FEs are local to each frame, some are quite general while others are specific to a small family of lexical items. FrameNet annotations were performed on a corpus of over three million words. Recently, semantic parsers using PropBank and FrameNet have started to become available. In each sentence, verbal or nominal predicates are discovered in relation to their arguments or FEs."
- "Our investigation shows that predicate-argument structures and semantic frames discovered by shallow semantic parsers play an important role in discovering extraction relations. This is due to the fact that arguments of extracted relations belong to arguments of predicates or to FEs."
2 Shallow Semantic Parsing
- "Shallow semantic information represented by predicates and their arguments, or frames and their FEs, can be identified in text sentences by semantic parsers. The idea of automatically identifying and labeling shallow semantic information was pioneered by [3]. Semantic parsers operate on the output of a syntactic parser. When using the PropBank information, the semantic parser (1) identifies each verbal predicate and (2) labels its arguments. The expected arguments of a predicate are numbered sequentially from Arg0 to Arg5. Additionally, the arguments may include functional tags from Treebank, e.g. ArgM-DIR indicates a directional, ArgM-LOC indicates a locative and ArgM-TMP stands for a temporal."
3 Dependency Tree Kernels
- "In [Culotta and Sorensen, 2004] the relation extraction problem was cast as a classification problem based on kernels that operate on dependency trees. Kernels measure the similarity between two instances of a relation. If $X$ is the instance space, a kernel function is a mapping $K: X \times X \to [0, \infty)$ such that, given two instances $x$ and $y$, $K(x, y) = \sum_i \phi_i(x)\,\phi_i(y) = \phi(x) \cdot \phi(y)$, where $\phi_i(x)$ is some feature function over the instance $x$. The instances can be represented in several ways. First, each sentence where a relation of interest occurs can be viewed as a list of words. Thus, the similarity between two instances represented in this way is computed as the number of common words between the two instance sentences. All words from instances $x$ and $y$ are indexed and $\phi_i(x)$ is the number of times instance $x$ contains the word referenced by $i$. Such a kernel is known as a bag-of-words kernel. When sentences are represented as strings of words, string kernels count the number of common subsequences in the two strings and weight their matches by their length. Thus $\phi_i(x)$ is the number of times string $x$ contains the subsequence referenced by $i$."
- "If the instances are represented by syntactic trees, more complex kernels are needed. A class of kernels, called convolution kernels, was proposed to handle such instance representations. Convolution kernels measure the similarity between two structured instances by summing the similarity of their substructures. Thus, given all possible substructures in instances $x$ and $y$, $\phi_i(x)$ counts not only the number of times the substructure referenced by $i$ is matched into $x$, but also how many times it is matched into any of its substructures."
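The bag-of-words kernel described above can be sketched as a dot product of word-count vectors. This is only an illustration: the whitespace tokenization, the lowercasing, and the two example sentences are assumptions, not details taken from the paper.

```python
from collections import Counter


def bag_of_words_kernel(x: str, y: str) -> int:
    """K(x, y) = sum_i phi_i(x) * phi_i(y), where phi_i(x) counts word i in x."""
    # Assumption: lowercase the text and split on whitespace to get words.
    phi_x = Counter(x.lower().split())
    phi_y = Counter(y.lower().split())
    # Words missing from either sentence have a zero count, so only the
    # shared vocabulary contributes to the sum.
    return sum(phi_x[w] * phi_y[w] for w in phi_x.keys() & phi_y.keys())


# Hypothetical instance sentences.
x = "the attorneys represented Woodward"
y = "the lawyer represented the intern"
print(bag_of_words_kernel(x, y))  # "the" contributes 1*2, "represented" 1*1
```

Because repeated words multiply their counts, the value can exceed the number of distinct shared words.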
- The features are used by a tree kernel function $K(T_1, T_2)$ that returns a similarity score in the range $(0, 1)$. We preferred the more general version of the kernel introduced in [Culotta and Sorensen, 2004] to the kernel described by [Zelenko et al, 2002]. This kernel is based on two functions defined on the features of tree nodes: a matching function $m(t_i, t_j) \in \{0, 1\}$ and a similarity function $s(t_i, t_j) \in (0, 1)$. The feature vector of a tree node, $\phi(t_i) = \{v_1, \ldots, v_d\}$, consists of two possibly overlapping subsets $\phi_m(t_i) \subseteq \phi(t_i)$ and $\phi_s(t_i) \subseteq \phi(t_i)$.
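A minimal sketch of the two node-level functions follows. The feature names, the split into matching and similarity subsets, and the scaling of the similarity to a fraction are all assumptions made for illustration. The feature sets actually used (F1 and F2 in Figure 7) are not reproduced in these notes.

```python
# Hypothetical feature names; the real sets are defined in the paper's Figure 7.
MATCH_FEATURES = ("pos", "entity_type")        # stands in for phi_m
SIM_FEATURES = ("word", "pos", "entity_type")  # stands in for phi_s


def matching(ti: dict, tj: dict) -> int:
    """m(t_i, t_j): 1 if both nodes agree on every matching feature, else 0."""
    return int(all(ti.get(f) == tj.get(f) for f in MATCH_FEATURES))


def similarity(ti: dict, tj: dict) -> float:
    """s(t_i, t_j): fraction of similarity features on which the nodes agree.

    Assumption: dividing by the number of features keeps the score
    between 0 and 1; the paper's exact definition may differ.
    """
    agree = sum(
        1 for f in SIM_FEATURES if f in ti and f in tj and ti[f] == tj[f]
    )
    return agree / len(SIM_FEATURES)


# Hypothetical dependency-tree nodes.
attorneys = {"word": "attorneys", "pos": "NNS", "entity_type": "PERSON"}
intern = {"word": "intern", "pos": "NN", "entity_type": "PERSON"}
lawyer = {"word": "lawyer", "pos": "NN", "entity_type": "PERSON"}

print(matching(intern, lawyer), similarity(intern, lawyer))
print(matching(attorneys, intern), similarity(attorneys, intern))
```

Separating the two functions lets a kernel first decide whether two nodes are comparable at all, then grade how alike they are.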
4 Relation Extraction
- "When analyzing the dependency kernels, we noticed that only a few nodes bear semantic information derived by the semantic parsers. We also noticed that these nodes are clustered together in the dependency tree. For example, Figure 6(d) illustrates the cluster of nodes from the dependency tree that contains semantic information."
- "Instead of using the entire dependency tree to compute similarities, we selected sub-trees that contain nodes having values for the features from set F2 (illustrated in Figure 7). Typically these nodes correspond to target predicates and their arguments or FEs. This allowed us to compare trees of the form
SDT(R1) ["attorneys" → "represented" "Woodward"]
and SDT(R2) ["intern" → "dismissed" "lawyer"].
- We called such trees semantic dependency trees since they are characterized by semantic features present in the nodes of dependency trees. Semantic dependency trees (SDTs) are binary trees containing three nodes: a verbal predicate that is the root of the tree; and two children nodes, each an argument of the predicate. To measure the similarity of two SDTs we built a very simple kernel:
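The three-node shape of an SDT can be sketched as a small data structure. The kernel over SDTs is not reproduced in these notes, so the sketch covers only the structure. The `SDT` class, the `build_sdt` helper, the choice of the two lowest-numbered arguments, and the role labels in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# PropBank-style numbered roles, Arg0 through Arg5.
CORE_ROLES = ["Arg0", "Arg1", "Arg2", "Arg3", "Arg4", "Arg5"]


@dataclass(frozen=True)
class SDT:
    """A binary tree with three nodes: a predicate root and two argument children."""
    predicate: str
    left_arg: str
    right_arg: str


def build_sdt(predicate: str, arguments: Dict[str, str]) -> Optional[SDT]:
    """Build an SDT from a predicate and its role-labelled arguments.

    Assumption: the two lowest-numbered arguments become the children.
    Returns None when fewer than two numbered arguments are present.
    """
    present = [arguments[role] for role in CORE_ROLES if role in arguments]
    if len(present) < 2:
        return None
    return SDT(predicate, present[0], present[1])


# Hypothetical role assignment for the first example tree.
r1 = build_sdt("represented", {"Arg0": "attorneys", "Arg1": "Woodward"})
print(r1)
```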
5 Experimental results
- "To evaluate the role of shallow semantics provided by semantic parsers on relation extraction we have used the Automatic Content Extraction (ACE) corpus available from LDC (LDC2003T11)."
- "We chose to train on all 24 relations, not only on the first 5 high-level relation types as was done in [1]."
- "We implemented the same five kernels as [1]: K0 = sparse kernel, K1 = contiguous kernel, K2 = bag-of-words kernel, K3 = K0 + K2, and K4 = K1 + K2, and used first only the feature set F1 from Figure 7 and then both feature sets F1 and F2. The comparison of the kernel performance of the two experiments is listed in Figure 9."
- "We used each kernel within an SVM (we augmented the SVMlight implementation to include our kernels)."
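The composite kernels K3 and K4 are sums of two component kernels, and the sum of two valid kernels is itself a valid kernel. The sketch below shows that composition. The two component kernels are toy stand-ins over word strings, not the sparse or contiguous dependency tree kernels of [1], and all names are hypothetical.

```python
from typing import Any, Callable

Kernel = Callable[[Any, Any], float]


def sum_kernels(*kernels: Kernel) -> Kernel:
    """Return the kernel K(x, y) = sum of k(x, y) over the given kernels."""
    def combined(x: Any, y: Any) -> float:
        return sum(k(x, y) for k in kernels)
    return combined


def toy_bag_of_words(x: str, y: str) -> int:
    """Stand-in for K2: number of distinct words shared by x and y."""
    return len(set(x.split()) & set(y.split()))


def toy_contiguous(x: str, y: str) -> int:
    """Stand-in for K1: number of distinct adjacent word pairs shared by x and y."""
    def bigrams(s: str) -> set:
        words = s.split()
        return set(zip(words, words[1:]))
    return len(bigrams(x) & bigrams(y))


# Analogue of K4 = K1 + K2, built from the toy components.
k4 = sum_kernels(toy_contiguous, toy_bag_of_words)

x = "the attorneys represented Woodward"
y = "the attorneys represented the intern"
print(k4(x, y))  # 2 shared bigrams + 3 shared words
```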
Kernel and feature set                              Precision  Recall  F-score
contig + bag-of-words kernel, non-SRL features      60.5       20.3    30.4
contig + bag-of-words kernel, plus all features     72.2       44.5    55.1
- "The results indicate that on average, for K4, the best performing kernel, we obtained an increase of 24.66% in the F-score when features provided by the semantic parsers were added. When relying on SDTs, the average precision that was obtained was 89.3%, the recall was 76.4%, thus an F-score of 82.35%, when using the same data as [1]."
- "In the ACE data 61.71% of the training/testing data could be cast into SDTs. The semantic similarity between arguments of a relation within the same NP and arguments present in SDTs allowed the extraction with an average F1-score of 78.41%. The quality of the extraction results depends on the quality of the semantic parsers, which obtained F-scores of over 90% in recent SENSEVAL evaluations."
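The F-scores reported in this section are consistent with the balanced F-measure, the harmonic mean of precision and recall. The short check below recomputes them from the quoted precision and recall values. The `f_score` helper is a hypothetical name.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# SDT-based extraction: precision 89.3, recall 76.4.
print(round(f_score(89.3, 76.4), 2))  # 82.35

# K4 (contiguous + bag-of-words) without and with the semantic features.
print(round(f_score(60.5, 20.3), 1))  # 30.4
print(round(f_score(72.2, 44.5), 1))  # 55.1
```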
6 Conclusions
- "In this paper we have introduced a new dependency structure that relies on semantic information provided by shallow semantic parsers. This structure enabled the extraction of relevant relations with better performance than previous state-of-the-art kernel methods. Furthermore, the semantic features enabled similarly good results to be obtained with a few other learning algorithms. We also used compatibility functions that made use of semantic knowledge. This framework could be extended to allow processing of idiomatic predicates, e.g. [PERSON “lobbying on behalf of” ORGANIZATION], and combined predications."
References
[2] C. Fellbaum. WordNet: An Electronic Lexical Database. MIT Press, 1998.
[3] Daniel Gildea and Daniel Jurafsky. Automatic labeling of semantic roles. Computational Linguistics, 28(3):496–530, 2002.
[4] F. Jelinek, J. Lafferty, D. Magerman, R. Mercer, A. Ratnaparkhi, and S. Roukos. Decision tree parsing using a hidden derivational model. In Proceedings of the HLT Workshop-1994.
[5] X. Luo, A. Ittycheriah, H. Jing, N. Kambhatla, and S. Roukos. A Mention-Synchronous Coreference Resolution Algorithm Based On the Bell Tree. In Proceedings of the ACL-2004, 2004.
[6] S. Pradhan, K. Hacioglu, V. Krugler, W. Ward, J. H. Martin, and D. Jurafsky. Support Vector Learning for Semantic Argument Classification. Journal of Machine Learning Research, 2004.
[7] M. Surdeanu, S. M. Harabagiu, J. Williams, and J. Aarseth. Using Predicate-Argument Structures for Information Extraction. In Proceedings of the ACL-2003, 2003.
[8] D. Zelenko, C. Aone, and A. Richardella. Kernel Methods for Relation Extraction. In Proceedings of the EMNLP-2002, pages 71–78, 2002.
BibTeX
@inproceedings{DBLP:conf/ijcai/HarabagiuBM05,
author = {Sanda M. Harabagiu and
Cosmin Adrian Bejan and
Paul Morarescu},
title = {Shallow Semantics for Relation Extraction.},
booktitle = {IJCAI},
year = {2005},
pages = {1061-1066},
ee = {http://www.ijcai.org/papers/1589.pdf},
crossref = {DBLP:conf/ijcai/2005},
bibsource = {DBLP, http://dblp.uni-trier.de}}
@proceedings{DBLP:conf/ijcai/2005,
editor = {Leslie Pack Kaelbling and
Alessandro Saffiotti},
title = {IJCAI-05, Proceedings of the Nineteenth International Joint
Conference on Artificial Intelligence, Edinburgh, Scotland,
UK, July 30-August 5, 2005},
booktitle = {IJCAI},
publisher = {Professional Book Center},
year = {2005},
isbn = {0938075934},
bibsource = {DBLP, http://dblp.uni-trier.de}}