2007 JointInferenceInInformationExtraction
- (Poon & Domingos, 2007) ⇒ Hoifung Poon, Pedro Domingos. (2007). “Joint Inference in Information Extraction.” In: Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI 2007).
Subject Headings: Citation Extraction Task, Joint Inference Algorithm, Markov Logic, CORA Citation Matching Benchmark Task.
Notes
- It applies a Joint Inference Algorithm to the combined tasks of Entity Mention Recognition and Entity Mention Coreference Resolution.
- It applies an MC-SAT Algorithm.
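As a rough illustration of the Markov logic setting named above, the sketch below scores possible worlds by exp(sum of weight × number of true groundings) and answers a query by brute-force enumeration. It is only a toy under stated assumptions: the predicates `SameTitle` and `SameCitation`, the single formula, and the weight 1.5 are hypothetical and not taken from the paper, and exact enumeration stands in for MC-SAT, which is a sampling algorithm intended for models far too large to enumerate.

```python
import itertools
import math


def implies(a, b):
    """Truth value of the logical implication a => b."""
    return (not a) or b


# Hypothetical weighted formulas: (weight, world -> number of true groundings).
# The single formula here is SameTitle => SameCitation with an assumed weight of 1.5.
FORMULAS = [
    (1.5, lambda w: int(implies(w["SameTitle"], w["SameCitation"]))),
]


def unnormalized(world):
    """Markov logic score of a world: exp(sum_i weight_i * n_i(world))."""
    return math.exp(sum(weight * count(world) for weight, count in FORMULAS))


def query_probability(query, evidence, hidden):
    """P(query = True | evidence), by enumerating every assignment to the hidden atoms."""
    numerator = 0.0
    denominator = 0.0
    for values in itertools.product([False, True], repeat=len(hidden)):
        world = dict(evidence)
        world.update(zip(hidden, values))
        score = unnormalized(world)
        denominator += score
        if world[query]:
            numerator += score
    return numerator / denominator


# Probability that two citations match, given that their titles match.
p_match = query_probability("SameCitation", {"SameTitle": True}, ["SameCitation"])
# With non-matching titles the formula is satisfied either way, so it carries no information.
p_prior = query_probability("SameCitation", {"SameTitle": False}, ["SameCitation"])
print(p_match, p_prior)
```

With a single formula the conditional probability reduces to a logistic function of the weight, which is why a larger weight pushes the model harder toward merging citations with matching titles.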
Cited By
2009
- (Wick et al., 2009) ⇒ Michael Wick, Aron Culotta, Khashayar Rohanimanesh, and Andrew McCallum. (2009). “An Entity Based Model for Coreference Resolution.” In: Proceedings of the SIAM International Conference on Data Mining (SDM 2009).
- There has also been discriminatively trained methods in undirected graphical models. … Poon and Domingos [13] achieve impressive results by jointly modeling citation matching with segmentation. However, their weighted logic model factorizes mention pairs, forcing the model to reason over mentions instead of entities. In contrast, our model allows first order logic features to be expressed over entire clusters, enabling us to model canonicalization and coreference simultaneously. Weighted Logic Model, Mention Pair.
Quotes
Abstract
The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computationally efficient, this approach is suboptimal, because it ignores the fact that segmenting one candidate record can help to segment similar ones. For example, resolving a well-segmented field with a less-clear one can disambiguate the latter's boundaries. In this paper we propose a joint approach to information extraction, where segmentation of all records and entity resolution are performed together in a single integrated inference process. While a number of previous authors have taken steps in this direction (e.g., Pasula et al. (2003), Wellner et al. (2004)), to our knowledge this is the first fully joint approach. In experiments on the CiteSeer and Cora citation matching datasets, joint inference improved accuracy, and our approach outperformed previous ones. Further, by using Markov logic and the existing algorithms for it, our solution consisted mainly of writing the appropriate logical formulas, and required much less engineering than previous ones.
References
- (Pasula et al., 2003) ⇒ Hanna Pasula, Bhaskara Marthi, Brian Milch, Stuart Russell, and Ilya Shpitser. (2003). “Identity Uncertainty and Citation Matching.” In: Advances in Neural Information Processing Systems 15 (NIPS 2002).
- (Wellner et al., 2004) ⇒ Ben Wellner, Andrew McCallum, Fuchun Peng, and Michael Hay. (2004). “An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching.” In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2004).