Keywords: Relation Recognition from Text Algorithm, ACE Benchmark Task
Cited By
- [ZhangZS, 2006] => M. Zhang, J. Zhang, and J. Su. (2006). Exploring Syntactic Features for Relation Extraction using a Convolution Tree Kernel. In Proceedings of HLT-2006.
- "Culotta and Sorensen (2004) generalize this kernel to estimate similarity between dependency trees. One may note that their tree kernel requires the matchable nodes must be at the same depth counting from the root node. This is a strong constraint on the matching of syntax so it is not surprising that the model has good precision but very low recall on the ACE corpus (Zhao and Grishman, 2005). In addition, according to the top-down node matching mechanism of the kernel, once a node is not matchable with any node in the same layer in another tree, all the sub-trees below this node are discarded even if some of them are matchable to their counterparts in another tree."
- [Zhao and Grishman, 2005] => S. Zhao and R. Grishman. (2005). Extracting Relations with Integrated Information Using Kernel Methods. In Proc. of ACL-2005.
- "Culotta and Sorensen (2004) described a slightly generalized version of this kernel based on dependency trees. Since their kernel is a recursive match from the root of a dependency tree down to the leaves where the entity nodes reside, a successful match of two relation examples requires their entity nodes to be at the same depth of the tree. This is a strong constraint on the matching of syntax so it is not surprising that the model has good precision but very low recall. In their solution a bag-of-words kernel was used to compensate for this problem. In our approach, more flexible kernels are used to capture regularization in syntax, and more levels of syntactic information are considered."
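The depth constraint described in both quotes can be illustrated with a minimal sketch of a top-down recursive tree similarity. This is a simplification, not the paper's exact kernel: the `Node` fields, the `matches` and `similarity` functions, and the all-pairs comparison of children are assumptions made for illustration (the original kernel sums over child subsequences). It shows that nodes are only ever compared at equal depth, and that a non-matching node discards everything beneath it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A dependency-tree node. The feature set here is illustrative only."""
    word: str
    pos: str
    entity_type: str = ""
    children: List["Node"] = field(default_factory=list)


def matches(a: Node, b: Node) -> bool:
    # Hypothetical matching function: two nodes are comparable only if
    # their part of speech and entity type agree.
    return a.pos == b.pos and a.entity_type == b.entity_type


def similarity(a: Node, b: Node) -> float:
    # Hypothetical similarity function: number of shared feature values.
    return (float(a.word == b.word)
            + float(a.pos == b.pos)
            + float(a.entity_type == b.entity_type))


def tree_kernel(a: Node, b: Node) -> float:
    """Simplified top-down similarity between two dependency trees."""
    if not matches(a, b):
        # The whole subtree below a non-matching node is discarded.
        return 0.0
    score = similarity(a, b)
    # Simplification: compare every pair of children. Children are only
    # compared with children, so matched nodes always share the same depth.
    for child_a in a.children:
        for child_b in b.children:
            score += tree_kernel(child_a, child_b)
    return score


# "Paris" sits at depth 1 in the first tree and at depth 2 in the second.
flat = Node("visited", "VB", children=[
    Node("Smith", "NNP", "PERSON"),
    Node("Paris", "NNP", "LOCATION"),
])
nested = Node("visited", "VB", children=[
    Node("Smith", "NNP", "PERSON"),
    Node("capital", "NN", children=[Node("Paris", "NNP", "LOCATION")]),
])

self_score = tree_kernel(flat, flat)
cross_score = tree_kernel(flat, nested)
```

In this toy case the two "Paris" nodes are never compared because they sit at different depths, so `cross_score` is lower than `self_score` even though both sentences mention the same entities.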
Quotes
Abstract
- "We extend previous work on tree kernels to estimate the similarity between the dependency trees of sentences. Using this kernel within a Support Vector Machine, we detect and classify relations between entities in the Automatic Content Extraction (ACE) corpus of news articles. We examine the utility of different features such as Wordnet hypernyms, parts of speech, and entity types, and find that the dependency tree kernel achieves a 20% F1 improvement over a “bag-of-words” kernel."
Introduction
- "Our algorithm is similar to that described by Zelenko et al. (2003). Our contributions are a richer sentence representation, a more general framework to allow feature weighting, as well as the use of composite kernels to reduce kernel sparsity."
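To make the idea of a composite kernel concrete, here is a minimal sketch, assuming a composite is formed as a non-negatively weighted sum of two kernels (a sum of valid kernels is itself a valid kernel). The function names are hypothetical, and `exact_match_kernel` is only a stand-in for a sparse structural kernel, not the dependency tree kernel from the paper.

```python
from collections import Counter
from typing import Callable, Sequence

Kernel = Callable[[Sequence[str], Sequence[str]], float]


def bag_of_words_kernel(x: Sequence[str], y: Sequence[str]) -> float:
    """Dot product of word-count vectors; ignores all structure."""
    count_x, count_y = Counter(x), Counter(y)
    return float(sum(count_x[w] * count_y[w] for w in count_x))


def exact_match_kernel(x: Sequence[str], y: Sequence[str]) -> float:
    """Stand-in for a sparse structural kernel: non-zero only on identical inputs."""
    return 1.0 if tuple(x) == tuple(y) else 0.0


def composite_kernel(k1: Kernel, k2: Kernel, weight: float = 1.0) -> Kernel:
    """Weighted sum of two kernels; weight must be non-negative."""
    if weight < 0:
        raise ValueError("weight must be non-negative to keep a valid kernel")

    def combined(x: Sequence[str], y: Sequence[str]) -> float:
        return k1(x, y) + weight * k2(x, y)

    return combined


x = ("troops", "advanced", "near", "Tikrit")
y = ("forces", "advanced", "near", "Baghdad")

combined = composite_kernel(exact_match_kernel, bag_of_words_kernel)
sparse_score = exact_match_kernel(x, y)
combined_score = combined(x, y)
```

Here the sparse kernel alone scores the pair as zero, while the composite still returns a non-zero similarity from the shared words, which is the sense in which a composite can reduce sparsity.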
Experiments
- "Although training was done over all 24 relation subtypes, we evaluate only over the 5 high-level relation types. Thus, classifying a RESIDENCE relation as a LOCATED relation is deemed correct."
- Figure 3: Distribution over relation types in training data. At_Located ~300, Role_Staff ~200, Role_Member ~200, Role_Mgmt ~170, Part_Part-of ~155, At_based-in ~85, At_residence ~70, Near_Relative_loc ~40, etc.
- "While precision is adequate, recall is low. This is a result of the aforementioned class imbalance – very few of the training examples are relations, so the classifier is less likely to identify a testing instance as a relation. Because we treat every pair of mentions in a sentence as a possible relation, our training set contains fewer than 15% positive relation instances."
- "To remedy this, we retrain each SVM for a binary classification task. Here, we detect, but do not classify, relations. This allows us to combine all positive relation instances into one class, which provides us more training samples to estimate the class boundary. We then threshold our output to achieve an optimal operating point. As seen in Table 5, this method of relation detection outperforms that of the multi-class classifier."
- "We then use these binary classifiers in a cascading scheme as follows: First, we use a classifier to detect possible relations. Then, we use a classifier trained only on positive relation instances to classify each predicted relation. These results are shown in Table 6."
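The experimental setup quoted above (all mention pairs as candidates, thresholded detection followed by classification, and evaluation at the high-level relation type) can be sketched as follows. This is an illustrative outline, not the authors' implementation: the function names are hypothetical, the two toy functions stand in for trained SVMs, and labels are assumed to be encoded as `Type_Subtype`, following the Figure 3 listing.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]


def candidate_pairs(mentions: Sequence[str]) -> List[Pair]:
    """Treat every pair of mentions in one sentence as a possible relation."""
    return list(combinations(mentions, 2))


def cascade(pairs: Sequence[Pair],
            detect_score: Callable[[Pair], float],
            classify: Callable[[Pair], str],
            threshold: float = 0.0) -> List[Optional[str]]:
    """Stage 1 detects relations by thresholding a score; stage 2 labels them."""
    labels: List[Optional[str]] = []
    for pair in pairs:
        if detect_score(pair) >= threshold:
            labels.append(classify(pair))  # classifier trained on positives only
        else:
            labels.append(None)            # predicted "no relation"
    return labels


def high_level(label: Optional[str]) -> Optional[str]:
    """Collapse an assumed 'Type_Subtype' label to its high-level type."""
    return None if label is None else label.split("_", 1)[0]


def is_correct(predicted: Optional[str], gold: Optional[str]) -> bool:
    """Score a prediction at the level of high-level relation types."""
    return high_level(predicted) == high_level(gold)


# Toy stand-ins for the trained detector and classifier (assumptions).
def toy_detect_score(pair: Pair) -> float:
    return 1.0 if "IBM" in pair else -1.0


def toy_classify(pair: Pair) -> str:
    return "Role_Staff" if pair[0] == "Smith" else "At_based-in"


pairs = candidate_pairs(["Smith", "IBM", "New York"])
predicted = cascade(pairs, toy_detect_score, toy_classify, threshold=0.0)
```

Raising `threshold` trades recall for precision in the detection stage, which corresponds to choosing the operating point mentioned in the quote. A sentence with n mentions yields n(n-1)/2 candidates, most of which are negatives, which is the source of the class imbalance.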
References
- [Agichtein and Gravano, 2000] => E. Agichtein and L. Gravano. (2000). Snowball: Extracting Relations from Large Plain-Text Collections. In Proc. of the 5th ACM Int. Conf. on Digital Libraries (DL-2000).
- [Brin, 1998] => S. Brin. (1998). Extracting Patterns and Relations from the World Wide Web. WebDB Workshop at EDBT'98
- [Collins and Duffy, 2001] => M. Collins and N. Duffy. (2001). Convolution Kernels for Natural Language. In Proc. of NIPS-2001.
- [Cortes and Vapnik, 1995] => C. Cortes and V. Vapnik. (1995). Support Vector Networks. Machine Learning, 20(3).
- N. Cristianini and J. Shawe-Taylor. 2000. An introduction to support vector machines. Cambridge University Press.
- [Cumby and Roth, 2003] => C. M. Cumby and D. Roth. (2003). On Kernel Methods for Relational Learning. In Proc. of ICML-2003.
- K. Fukunaga. 1990. Introduction to Statistical Pattern Recognition. Academic Press, second edition.
- [Haussler, 1999] => D. Haussler. (1999). Convolution Kernels on Discrete Structures. Technical Report UCSC-CLR-99-10, University of California at Santa Cruz.
- Thorsten Joachims, Nello Cristianini, and John Shawe-Taylor. 2001. Composite kernels for hypertext categorisation. In Carla Brodley and Andrea Danyluk, editors, Proceedings of ICML-01, 18th International Conference on Machine Learning, pages 250–257, Williams College, US. Morgan Kaufmann Publishers, San Francisco, US.
- Huma Lodhi, John Shawe-Taylor, Nello Cristianini, and Christopher J. C. H. Watkins. 2000. Text classification using string kernels. In NIPS, pages 563–569.
- A. McCallum and B. Wellner. 2003. Toward conditional models of identity uncertainty with application to proper noun coreference. In IJCAI Workshop on Information Integration on the Web.
- [MillerFRW, 2000] => S. Miller, H. Fox, L. Ramshaw, and R. Weischedel. (2000). A novel use of statistical parsing to extract information from text. In Proc. NAACL-2000.
- H. Pasula, B. Marthi, B. Milch, S. Russell, and I. Shpitser. 2002. Identity uncertainty and citation matching.
- Dan Roth and Wen-tau Yih. 2002. Probabilistic reasoning for entity and relation recognition. In 19th International Conference on Computational Linguistics.
- Sam Scott and Stan Matwin. 1999. Feature engineering for text classification. In Proceedings of ICML-99, 16th International Conference on Machine Learning.
- Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Walter Daelemans and Miles Osborne, editors, Proceedings of CoNLL-2003, pages 142–147. Edmonton, Canada.
- Vladimir Vapnik. 1998. Statistical Learning Theory. Wiley, Chichester, GB.
- [ZelenkoAR, 2003] => D. Zelenko, C. Aone, and A. Richardella. (2003). Kernel Methods for Relation Extraction. Journal of Machine Learning Research. https://mitpress.mit.edu/journals/pdf/jmlr_3_6_1083_0.pdf
BibTeX
@inproceedings{DBLP:conf/acl/CulottaS04,
author = {Aron Culotta and
Jeffrey S. Sorensen},
title = {Dependency Tree Kernels for Relation Extraction.},
booktitle = {ACL},
year = {2004},
pages = {423-429},
ee = {http://acl.ldc.upenn.edu/acl2004/main/pdf/244_pdf_2-col.pdf},
bibsource = {DBLP, http://dblp.uni-trier.de}}