2008 An Entity-Mention Model for Coreference Resolution with Inductive Logic Programming

From GM-RKB

Subject Headings: Entity Mention Coreference Resolution, Inductive Logic Programming.

Notes

Cited By

  • ~17 scholar.google.com/scholar?hl=en&q="An+Entity-Mention+Model+for+Coreference+Resolution+with+Inductive+Logic+Programming"+2008

Quotes

Abstract

The traditional mention-pair model for coreference resolution cannot capture information beyond mention pairs for either learning or testing. To deal with this problem, we present an expressive entity-mention model that performs coreference resolution at an entity level. The model adopts the Inductive Logic Programming (ILP) algorithm, which provides a relational way to organize different knowledge of entities and mentions. The solution can explicitly express relations between an entity and the contained mentions, and automatically learn first-order rules important for coreference decisions. The evaluation on the ACE data set shows that the ILP-based entity-mention model is effective.

1. Introduction

Coreference resolution is the process of linking multiple mentions that refer to the same entity. Most previous work adopts the mention-pair model, which recasts coreference resolution as a binary classification problem of determining whether or not two mentions in a document are co-referring (e.g. Aone and Bennett (1995); McCarthy and Lehnert (1995); Soon et al. (2001); Ng and Cardie (2002)). Although it has achieved reasonable success, the mention-pair model has a limitation: information beyond mention pairs is ignored during both training and testing. As an individual mention usually lacks adequate descriptive information about the referred entity, it is often difficult to judge whether or not two mentions are talking about the same entity from the pair alone.

An alternative learning model that can overcome this problem performs coreference resolution based on entity-mention pairs (Luo et al., 2004; Yang et al., 2004b). Compared with the traditional mention-pair counterpart, the entity-mention model aims to make coreference decisions at an entity level. Classification is done to determine whether a mention is a referent of a partially found entity. A mention to be resolved (called an active mention henceforth) is linked to an appropriate entity chain (if any), based on the classification results.

One problem that arises with the entity-mention model is how to represent the knowledge related to an entity. In a document, an entity may have more than one mention. It is impractical to enumerate all the mentions in an entity and record their information in a single feature vector, as it would make the feature space too large. Even worse, the number of mentions in an entity is not fixed, which would result in variable-length feature vectors and cause trouble for normal machine learning algorithms. A solution seen in previous work (Luo et al., 2004; Culotta et al., 2007) is to design a set of first-order features summarizing the information of the mentions in an entity, for example, “whether the entity has any mention that is a name alias of the active mention” or “whether most of the mentions in the entity have the same head word as the active mention”. These features, nevertheless, are designed in an ad-hoc manner and lack the capability of describing each individual mention in an entity.
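The two example features above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the pairwise alias test is passed in as a hypothetical predicate, and mentions are reduced to head-word strings for simplicity.

```python
def any_is_alias(entity, active, is_alias):
    """True if ANY mention in the entity is a name alias of the active mention.
    `is_alias` is a hypothetical pairwise predicate supplied by the caller."""
    return any(is_alias(m, active) for m in entity)

def most_same_head(entity_heads, active_head):
    """True if MOST mentions in the entity share the active mention's head word."""
    return sum(h == active_head for h in entity_heads) > len(entity_heads) / 2
```

Note that both features collapse the whole entity into a single boolean: this is exactly the loss of per-mention information the paragraph above points out.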

In this paper, we present a more expressive entity-mention model for coreference resolution. The model employs Inductive Logic Programming (ILP) to represent the relational knowledge of an active mention, an entity, and the mentions in the entity. On top of this, a set of first-order rules is automatically learned, which can capture the information of each individual mention in an entity, as well as the global information of the entity, to make coreference decisions. Hence, our model has a more powerful representation capability than the traditional mention-pair or entity-mention model, and our experimental results on the ACE data set show that the model is effective for coreference resolution.
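For intuition, a learned first-order rule of this kind is a Horn clause quantifying over the mentions contained in an entity. The rule body below (string match against some contained mention) is an invented example rendered in Python, not a rule reported in the paper:

```python
def str_match(m1, m2):
    # toy pairwise predicate: case-insensitive string match
    return m1.lower() == m2.lower()

def link(active, entity):
    # Prolog-style reading of this rule:
    #   link(Active, E) :- member(M, E), str_match(M, Active).
    # i.e. the rule fires if SOME mention M in entity E matches the active mention.
    return any(str_match(m, active) for m in entity)
```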

2. Related Work

Many learning-based coreference resolution systems employ the mention-pair model.

A typical one is presented by Soon et al. (2001). In that system, a training or testing instance is formed for the two mentions in question, with a feature vector describing their properties and relationships. At testing time, an active mention is checked against all its preceding mentions, and is linked with the closest one that is classified as positive. The work is further enhanced by Ng and Cardie (2002), who expand the feature set and adopt a “best-first” linking strategy.
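The two linking strategies can be contrasted in a short sketch, assuming a pairwise classifier that returns a coreference score in [0, 1] (the scorer is a hypothetical stand-in):

```python
def closest_first(active, candidates, score, threshold=0.5):
    """Soon et al. (2001): link to the CLOSEST preceding mention classified
    positive. `candidates` are preceding mentions in document order."""
    for cand in reversed(candidates):   # nearest preceding mention first
        if score(cand, active) > threshold:
            return cand
    return None                          # no antecedent found

def best_first(active, candidates, score, threshold=0.5):
    """Ng and Cardie (2002): link to the preceding mention with the
    HIGHEST positive score."""
    positive = [(score(c, active), c) for c in candidates]
    positive = [(s, c) for s, c in positive if s > threshold]
    return max(positive)[0:2][1] if positive else None
```

The two strategies can pick different antecedents from the same classifier output, which is exactly the improvement Ng and Cardie report.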

Recent years have seen some work on the entity-mention model. Luo et al. (2004) propose a system that performs coreference resolution by searching a large space of entities. They train a classifier that can determine the likelihood that an active mention should belong to an entity. The entity-level features are calculated with an “Any-X” strategy: an entity-mention pair is assigned a feature X if any mention in the entity has the feature X with the active mention.

Culotta et al. (2007) present a system that uses an online learning approach to train a classifier to judge whether two entities are coreferential. The features describing the relationships between two entities are obtained from the information of every possible pair of mentions drawn from the two entities. Unlike Luo et al. (2004), the entity-level features are computed using a “Most-X” strategy: two given entities have a feature X if most of the mention pairs from the two entities have the feature X.
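The contrast between the two strategies is easy to see in code. In this sketch, `has_x` stands in for an arbitrary pairwise feature test X; Any-X compares an entity against an active mention, while Most-X compares two entities over all cross-entity mention pairs:

```python
def any_x(entity, active, has_x):
    """Luo et al. (2004): entity-mention pair gets feature X if ANY mention
    in the entity has X with the active mention."""
    return any(has_x(m, active) for m in entity)

def most_x(entity1, entity2, has_x):
    """Culotta et al. (2007): two entities get feature X if MOST of the
    mention pairs drawn from them have X."""
    pairs = [(m1, m2) for m1 in entity1 for m2 in entity2]
    return sum(has_x(m1, m2) for m1, m2 in pairs) > len(pairs) / 2
```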

Yang et al. (2004b) suggest an entity-based coreference resolution system. The model adopted in the system is similar to the mention-pair model, except that the entity information (e.g., the global number/gender agreement) is considered as additional features of a mention in the entity.

McCallum and Wellner (2003) propose several graphical models for coreference analysis. These models aim to overcome the limitation that pairwise coreference decisions are made independently of each other. The simplest model conditions coreference on mention pairs, but enforces dependency by calculating the distance of a node to a partition (i.e., the probability that an active mention belongs to an entity) based on the sum of its distances to all the nodes in the partition (i.e., the sum of the probability of the active mention co-referring with the mentions in the entity).
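The partition-distance idea in that simplest model reduces to a one-line computation: the score for linking an active mention to a partition is accumulated over its pairwise scores with every mention already in the partition. A minimal sketch, with the pairwise probability function as a hypothetical stand-in:

```python
def partition_score(active, entity, pair_prob):
    """Score of `active` belonging to the partition `entity`, computed as
    the sum of its pairwise scores with each mention in the partition."""
    return sum(pair_prob(m, active) for m in entity)
```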

Inductive Logic Programming (ILP) has been applied to some natural language processing tasks, including parsing (Mooney, 1997), POS disambiguation (Cussens, 1996), lexicon construction (Claveau et al., 2003), WSD (Specia et al., 2007), and so on. However, to our knowledge, our work is the first effort to adopt this technique for the coreference resolution task.

3. Conclusions

This paper presented an expressive entity-mention model for coreference resolution using Inductive Logic Programming. In contrast to the traditional mention-pair model, our model can capture information beyond single mention pairs for both training and testing. The relational nature of ILP enables our model to explicitly express the relations between an entity and its mentions, and to automatically learn first-order rules effective for the coreference resolution task. The evaluation on the ACE data set shows that the ILP-based entity-mention model performs better than the mention-pair model (with up to a 2.3% increase in F-measure), and also beats the entity-mention model with heuristically designed first-order features.

Our current work focuses on the learning model that calculates the probability of a mention belonging to an entity. For simplicity, we use a greedy clustering strategy for resolution: a mention is linked to the current best partial entity. In future work, we would like to investigate more sophisticated clustering methods that would lead to global optimization, e.g., by keeping a large search space (Luo et al., 2004) or using integer programming (Denis and Baldridge, 2007).
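The greedy clustering strategy mentioned in the conclusion can be sketched as follows. This is a minimal illustration under stated assumptions: `entity_score` stands in for the ILP model's probability of a mention belonging to an entity, and the threshold for starting a new entity is an invented parameter.

```python
def greedy_resolve(mentions, entity_score, threshold=0.5):
    """Process mentions in document order; link each to the best-scoring
    partial entity, or start a new entity if no score clears the threshold."""
    entities = []
    for m in mentions:
        scored = [(entity_score(e, m), e) for e in entities]
        scored = [(s, e) for s, e in scored if s > threshold]
        if scored:
            best = max(scored, key=lambda se: se[0])[1]
            best.append(m)               # link to current best partial entity
        else:
            entities.append([m])         # start a new entity
    return entities
```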

References

  • C. Aone and S. W. Bennett. (1995). Evaluating Automated and Manual Acquisition of Anaphora Resolution Strategies. In: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 122–129.
  • V. Claveau, P. Sebillot, C. Fabre, and P. Bouillon. (2003). Learning semantic lexicons from a part-of-speech and semantically tagged corpus using inductive logic programming. Journal of Machine Learning Research, 4:493–525.
  • A. Culotta, M. Wick, and A. McCallum. (2007). First-order probabilistic models for coreference resolution. In: Proceedings of the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 81–88.
  • J. Cussens. (1996). Part-of-speech disambiguation using ILP. Technical report, Oxford University Computing Laboratory.
  • P. Denis and J. Baldridge. (2007). Joint determination of anaphoricity and coreference resolution using integer programming. In: Proceedings of the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 236–243.
  • X. Luo, A. Ittycheriah, H. Jing, N. Kambhatla, and S. Roukos. (2004). A mention-synchronous coreference resolution algorithm based on the Bell tree. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pages 135–142.
  • A. McCallum and B. Wellner. (2003). Toward conditional models of identity uncertainty with application to proper noun coreference. In: Proceedings of the IJCAI-03 Workshop on Information Integration on the Web, pages 79–86.
  • J. McCarthy and W. Lehnert. (1995). Using decision trees for coreference resolution. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), pages 1050–1055.
  • R. Mooney. (1997). Inductive logic programming for natural language processing. In: Proceedings of the Sixth International Inductive Logic Programming Workshop, pages 3–24.
  • V. Ng and C. Cardie. (2002). Improving machine learning approaches to coreference resolution. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 104–111, Philadelphia.
  • V. Ng. (2005). Machine learning for coreference resolution: From local classification to global ranking. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 157–164.
  • V. Ng. (2007). Semantic class induction and coreference resolution. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pages 536–543.
  • W. Soon, H. Ng, and D. Lim. (2001). A machine learning approach to coreference resolution of noun phrases. Computational Linguistics, 27(4):521–544.
  • L. Specia, M. Stevenson, and M. V. Nunes. (2007). Learning expressive models for word sense disambiguation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pages 41–48.
  • A. Srinivasan. (2000). The aleph manual. Technical report, Oxford University Computing Laboratory.
  • M. Vilain, J. Burger, J. Aberdeen, D. Connolly, and L. Hirschman. (1995). A model-theoretic coreference scoring scheme. In: Proceedings of the Sixth Message Understanding Conference (MUC-6), pages 45–52, San Francisco, CA. Morgan Kaufmann Publishers.
  • X. Yang and J. Su. (2007). Coreference resolution using semantic relatedness information from automatically discovered patterns. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), pages 528–535.
  • X. Yang, J. Su, G. Zhou, and C. Tan. (2004a). Improving pronoun resolution by incorporating coreferential information of candidates. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), pages 127–134, Barcelona.
  • X. Yang, J. Su, G. Zhou, and C. Tan. (2004b). An NP-cluster approach to coreference resolution. In: Proceedings of the 20th International Conference on Computational Linguistics, pages 219–225, Geneva.
  • G. Zhou and J. Su. (2000). Error-driven HMM-based chunk tagger with context-dependent lexicon. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pages 71–79, Hong Kong.
  • G. Zhou and J. Su. (2002). Named entity recognition using an HMM-based chunk tagger. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 473–480, Philadelphia.


Xiaofeng Yang, Jian Su, Jun Lang, Chew Lim Tan, Ting Liu, and Sheng Li. (2008). “An Entity-Mention Model for Coreference Resolution with Inductive Logic Programming.” In: Proceedings of the ACL Conference. http://www.aclweb.org/anthology-new/P/P08/P08-1096.pdf