Coreference Resolution System

From GM-RKB

A Coreference Resolution System is a clustering system that can solve a coreference resolution task by means of a coreference resolution algorithm.



References

2019

  • (Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/Coreference#Coreference_resolution Retrieved:2019-3-15.
    • In computational linguistics, coreference resolution is a well-studied problem in discourse. To derive the correct interpretation of a text, or even to estimate the relative importance of various mentioned subjects, pronouns and other referring expressions must be connected to the right individuals. Algorithms intended to resolve coreferences commonly look first for the nearest preceding individual that is compatible with the referring expression. For example, "she" might attach to a preceding expression such as "the woman" or "Anne", but not to "Bill". Pronouns such as "himself" have much stricter constraints. Algorithms for resolving coreference tend to have accuracy in the 75% range. As with many linguistic tasks, there is a tradeoff between precision and recall.

      A classic problem for coreference resolution in English is the pronoun "it", which has many uses. It can refer much like "he" and "she", except that it generally refers to inanimate objects (the rules are actually more complex: animals may be any of "it", "he", or "she"; ships are traditionally "she"; hurricanes are usually "it" despite having gendered names). It can also refer to abstractions rather than beings: "He was paid minimum wage, but didn't seem to mind it." Finally, "it" also has pleonastic uses, which do not refer to anything specific:

        a. It's raining.
        b. It's really a shame.
        c. It takes a lot of work to succeed.
        d. Sometimes it's the loudest who have the most influence.

      Pleonastic uses are not considered referential, and so are not part of coreference. [1]
  1. Li et al. (2009) have demonstrated high accuracy in sorting out pleonastic it, and this success promises to improve the accuracy of coreference resolution overall.
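
The "nearest compatible preceding individual" heuristic described in the excerpt above can be sketched roughly as follows. This is a minimal illustration, not a published system: the `Mention` class, the toy `PRONOUNS` lexicon, and the hand-assigned gender and number features are all assumptions made for the example. A real system would derive such features from a parser or a lexicon and would also need to filter out pleonastic "it".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    text: str
    position: int  # token index of the mention in the discourse
    gender: str    # "f", "m", "n", or "unknown" (hand-assigned here)
    number: str    # "sg" or "pl"


# Toy pronoun lexicon (an assumption for this sketch): gender and number.
PRONOUNS = {
    "she": ("f", "sg"), "her": ("f", "sg"),
    "he": ("m", "sg"), "him": ("m", "sg"),
    "it": ("n", "sg"),
    "they": ("unknown", "pl"),
}


def compatible(pronoun: str, candidate: Mention) -> bool:
    """Check simple gender and number agreement."""
    gender, number = PRONOUNS[pronoun.lower()]
    gender_ok = gender == "unknown" or candidate.gender in (gender, "unknown")
    return gender_ok and candidate.number == number


def nearest_antecedent(pronoun: str, pronoun_position: int,
                       mentions: List[Mention]) -> Optional[Mention]:
    """Return the closest preceding mention that agrees with the pronoun."""
    preceding = [m for m in mentions if m.position < pronoun_position]
    # Walk backwards from the pronoun, nearest candidate first.
    for candidate in sorted(preceding, key=lambda m: m.position, reverse=True):
        if compatible(pronoun, candidate):
            return candidate
    return None


mentions = [Mention("Anne", 0, "f", "sg"), Mention("Bill", 3, "m", "sg")]
antecedent = nearest_antecedent("she", 6, mentions)
print(antecedent.text)  # Anne ("Bill" is nearer but fails gender agreement)
```

The sketch returns `None` when no preceding mention agrees, which leaves room for a caller to treat the pronoun as non-referential or as the start of a new entity.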

    2011a

    These problems can be remedied by an incremental entity-mention model, where candidate pairs are evaluated on the basis of the emerging coreference sets. A clustering phase on top of the pairwise classifier no longer is needed and the number of candidate pairs is reduced, since from each coreference set (be it large or small) only one mention (the most representative one) needs to be compared to a new anaphor candidate. We form a 'virtual prototype' that collects information from all the members of each coreference set in order to maximize 'representativeness'. Constraints such as transitivity and morphological agreement can be assured by just a single comparison. If an anaphor candidate is compatible with the virtual prototype, then it is by definition compatible with all members of the coreference set.
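
    The incremental entity-mention idea in the excerpt above can be illustrated with a small sketch. Everything named here is hypothetical: the `Mention` and `Prototype` classes, the `anaphoric` flag, and the "most recent compatible entity wins" rule, which merely stands in for the trained pairwise classifier that the excerpt mentions. The point is only to show how one comparison against a pooled prototype enforces agreement with every member of a coreference set.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mention:
    text: str
    gender: Optional[str] = None  # None means "not specified by this mention"
    number: Optional[str] = None
    anaphoric: bool = False       # True for pronouns and other anaphor candidates


@dataclass
class Prototype:
    """Virtual prototype: features pooled from all members of one set."""
    members: List[Mention] = field(default_factory=list)
    gender: Optional[str] = None
    number: Optional[str] = None

    def compatible(self, m: Mention) -> bool:
        # A single check against the pooled features covers every member.
        pairs = ((self.gender, m.gender), (self.number, m.number))
        return all(p is None or v is None or p == v for p, v in pairs)

    def absorb(self, m: Mention) -> None:
        self.members.append(m)
        # Fill in features the prototype did not know yet.
        self.gender = self.gender or m.gender
        self.number = self.number or m.number


def resolve_incrementally(mentions: List[Mention]) -> List[Prototype]:
    entities: List[Prototype] = []
    for m in mentions:
        target = None
        if m.anaphoric:
            # Stand-in for a classifier: take the most recent compatible entity.
            target = next((e for e in reversed(entities) if e.compatible(m)), None)
        if target is None:
            target = Prototype()  # start a new coreference set
            entities.append(target)
        target.absorb(m)
    return entities


discourse = [
    Mention("Bill", gender="m", number="sg"),
    Mention("the doctor", number="sg"),  # gender unknown at this point
    Mention("she", gender="f", number="sg", anaphoric=True),
    Mention("he", gender="m", number="sg", anaphoric=True),
]
entities = resolve_incrementally(discourse)
clusters = [[m.text for m in e.members] for e in entities]
print(clusters)  # [['Bill', 'he'], ['the doctor', 'she']]
```

    In this toy run, "he" would agree with "the doctor" in a purely pairwise comparison, but the prototype has already taken the feminine gender from "she", so the single prototype check rejects it and no separate clustering or transitivity repair step is required.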
