Pairwise Comparison Coreference Resolution Algorithm

From GM-RKB

A Pairwise Comparison Coreference Resolution Algorithm is a Coreference Resolution Algorithm that uses a Binary Classification Algorithm to determine whether two Entity Mentions are in a Coreference Relation.
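
The definition can be illustrated with a minimal Python sketch. It assumes a hypothetical rule, `same_head`, standing in for a trained binary classifier, and it merges positive pairs by transitive closure, which is only one of several ways to turn pairwise decisions into clusters. All names here are illustrative and are not taken from any cited system.

```python
from itertools import combinations


def same_head(m1: str, m2: str) -> bool:
    """Hypothetical stand-in for a trained binary classifier:
    two mentions are called coreferent if their last tokens match."""
    return m1.lower().split()[-1] == m2.lower().split()[-1]


def cluster_by_pairwise_decisions(mentions, is_coreferent=same_head):
    """Classify every mention pair independently, then merge the
    positive pairs by transitive closure with a union-find structure."""
    parent = list(range(len(mentions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(mentions)), 2):
        if is_coreferent(mentions[i], mentions[j]):
            parent[find(j)] = find(i)

    clusters = {}
    for i, mention in enumerate(mentions):
        clusters.setdefault(find(i), []).append(mention)
    return list(clusters.values())


mentions = ["President Clinton", "Clinton", "Egypt", "Bill Clinton"]
clusters = cluster_by_pairwise_decisions(mentions)
```

Because each pair is judged in isolation, a single false positive can merge two otherwise unrelated clusters under transitive closure; a real system would typically replace both the toy rule and this merging step.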



References

2007

  • (CulottaWM, 2007) ⇒ Aron Culotta, Michael Wick, Robert Hall, and Andrew McCallum. (2007). “First-order probabilistic models for coreference resolution.” In: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL 2007).
    • "Noun phrase coreference resolution is the problem of clustering noun phrases into anaphoric sets. A standard machine learning approach is to perform a set of independent binary classifications of the form “Is mention a coreferent with mention b?” This approach of decomposing the problem into pairwise decisions presents at least two related difficulties. First, it is not clear how best to convert the set of pairwise classifications into a disjoint clustering of noun phrases. The problem stems from the transitivity constraints of coreference: If a and b are coreferent, and b and c are coreferent, then a and c must be coreferent."
    • "In this section we briefly review the standard pairwise coreference model. Given a pair of noun phrases x_ij = {x_i, x_j}, let the binary random variable y_ij be 1 if x_i and x_j are coreferent. Let F = {f_k(x_ij, y)} be a set of features over x_ij. For example, f_k(x_ij, y) may indicate whether x_i and x_j have the same gender or number. Each feature f_k has an associated real-valued parameter λ_k."
    • "We follow Soon et al. (2001) and Ng and Cardie (2002) to generate most of our features for the Pairwise Model. These include:
      • Match features - Check whether gender, number, head text, or entire phrase matches
      • Mention type (pronoun, name, nominal)
      • Aliases - Heuristically decide if one noun is the acronym of the other
      • Apposition - Heuristically decide if one noun is in apposition to the other
      • Relative Pronoun - Heuristically decide if one noun is a relative pronoun referring to the other.
      • Wordnet features - Use Wordnet to decide if one noun is a hypernym, synonym, or antonym of another, or if they share a hypernym.
      • Both speak - True if both contain an adjacent context word that is a synonym of “said.” This is a domain-specific feature that helps for many newswire articles.
      • Modifiers Match - for example, in the phrase “President Clinton”, “President” is a modifier of “Clinton”. This feature indicates if one noun is a modifier of the other, or they share a modifier.
      • Substring - True if one noun is a substring of the other (e.g. “Egypt” and “Egyptian”).
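
A few of the features listed above can be sketched in Python together with a weighted scoring step. This is a rough illustration under stated assumptions: the feature functions are toy string tests rather than the heuristics used in the cited work, the weights and bias are hand-picked rather than learned, and the logistic form is an assumed simplification of a pairwise model with real-valued feature weights. The names `pair_features`, `WEIGHTS`, `BIAS`, and `p_coreferent` are hypothetical.

```python
import math


def pair_features(m1: str, m2: str) -> dict:
    """Toy versions of three pairwise features.
    Real systems rely on parsers, gazetteers, and WordNet instead."""
    a, b = m1.lower(), m2.lower()
    return {
        "phrase_match": float(a == b),
        # Crude head approximation: treat the last token as the head.
        "head_match": float(a.split()[-1] == b.split()[-1]),
        "substring": float(a in b or b in a),
    }


# Hypothetical weights chosen by hand for illustration; a real model learns them.
WEIGHTS = {"phrase_match": 2.0, "head_match": 1.5, "substring": 1.0}
BIAS = -2.0


def p_coreferent(m1: str, m2: str) -> float:
    """Logistic score for one mention pair: sigmoid(bias + sum_k w_k * f_k)."""
    feats = pair_features(m1, m2)
    score = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))


p_match = p_coreferent("President Clinton", "Clinton")
p_mismatch = p_coreferent("Clinton", "Egypt")
```

Thresholding `p_coreferent` at 0.5 gives one independent binary decision per pair; turning those decisions into a consistent clustering is a separate step.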

2006

  • (McCallum, 2006) ⇒ Andrew McCallum. (2006). “Information Extraction, Data Mining and Joint Inference.” Tutorial, KDD-2006.
    • "Traditionally in NLP co-reference has been performed by making independent coreference decisions on each pair of entity mentions. An Affinity Matrix CRF jointly makes all coreference decisions together, accounting for multiple constraints."

2002

  • (Ng & Cardie, 2002) ⇒ Vincent Ng and Claire Cardie. (2002). “Improving Machine Learning Approaches to Coreference Resolution.” In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 104–111, Philadelphia.

2001

1995

the Association for Computational Linguistics (ACL), pages 122–129.