2004 An IntegConditIEandCoref

(Wellner et al., 2004) ⇒ Ben Wellner, Andrew McCallum, Fuchun Peng, Michael Hay. (2004). “An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching.” In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2004).

Subject Headings: Entity Mention Normalization Algorithm, Coreference Resolution System.

Notes

Cited By

~122 http://scholar.google.com/scholar?q=%22An+Integrated%2C+Conditional+Model+of+Information+Extraction+and+Coreference+with+Application+to+Citation+Matching%22+2004

2008

(Sarawagi, 2008) ⇒ Sunita Sarawagi. (2008). “Information extraction.” In: Foundations and Trends in Databases, 1(3).

2007

(Poon & Domingos, 2007) ⇒ Hoifung Poon and Pedro Domingos. (2007). “Joint inference in information extraction.” In: Proceedings of the Twenty-Second National Conference on Artificial Intelligence (AAAI 2007).
- QUOTE: While a number of previous authors have taken steps in this direction (e.g., Pasula et al (2003), Wellner et al. (2004)), to our knowledge this is the first fully joint approach.

Quotes

Abstract

Although information extraction and coreference resolution appear together in many applications, most current systems perform them as independent steps. This paper describes an approach to integrated inference for extraction and coreference based on conditionally-trained undirected graphical models. We discuss the advantages of conditional probability training, and of a coreference model structure based on graph partitioning. On a data set of research paper citations, we show significant reduction in error by using extraction uncertainty to improve coreference citation matching accuracy, and using coreference to improve the accuracy of the extracted fields.

References

1 Nikhil Bansal, Avrim Blum, Shuchi Chawla, Correlation Clustering, Proceedings of the 43rd Symposium on Foundations of Computer Science, p.238, November 16-19, 2002
2 Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society, Series B, 48, 259--302.
3 Boykov, Y., Veksler, O., & Zabih, R. (1999). Fast approximate energy minimization via graph cuts. Proceedings of the Seventh IEEE International Conference on Computer Vision (ICCV) (1) (pp. 377--384).
4 Xavier Carreras, Lluís Màrquez, Lluís Padró, Named Entity Extraction using AdaBoost, Proceedings of the 6th Conference on Natural Language Learning, p.1-4, August 31, 2002 doi:10.3115/1118853.1118857
5 William W. Cohen, Ravikumar, P., & Fienberg, S. (2003). A comparison of string metrics for matching names and records. KDD-2003 Workshop on Data Cleaning and Object Consolidation.
6 Demaine, E., & Immorlica, N. (2003). Correlation clustering with partial information. 6th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX 2003).
7 Gillick, L., & Cox, S. (1989). Some statistical issues in the comparison of speech recognition algorithms. Proceedings of the International Conference on Acoustics Speech and Signal Processing (ICASSP) (pp. 532--535).
8 John D. Lafferty, Andrew McCallum, Fernando C. N. Pereira, Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, Proceedings of the Eighteenth International Conference on Machine Learning, p.282-289, June 28-July 01, 2001
9 Steve Lawrence, C. Lee Giles, Kurt Bollacker, Digital Libraries and Autonomous Citation Indexing, Computer, v.32 n.6, p.67-71, June 1999 doi:10.1109/2.769447
10 Marthi, B., Milch, B., & Russell, S. (2003). First-order probabilistic models for information extraction. IJCAI 2003 Workshop on Learning Statistical Models from Relational Data.
11 Andrew McCallum (2003). Efficiently inducing features of conditional random fields. Proceedings of 19th Conference on Uncertainty in Artificial Intelligence (UAI).
12 Andrew McCallum, & Jensen, D. (2003). A note on the unification of information extraction and data mining using conditional-probability relational models. IJCAI Workshop on Learning Statistical Models from Relational Data.
13 Andrew McCallum, & Li, W. (2003). Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons. Proceedings of the Seventh Workshop on Computational Language Learning (CoNLL).
14 Andrew McCallum, & Wellner, B. (2003). Toward conditional models of identity uncertainty. IJCAI Workshop on Information Integration and the Web.
15 Milch, B., Marthi, B., & Russell, S. (2004). BLOG: Relational modeling with unknown objects. ICML 2004 Workshop on Statistical Relational Learning and Its Connections to Other Fields.
16 Un Yong Nahm, Raymond Mooney, A Mutually Beneficial Integration of Data Mining and Information Extraction, Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, p.627-632, July 30-August 03, 2000
17 Neville, J., Simsek, O., & Jensen, D. (2004). Autocorrelation and relational learning: Challenges and opportunities. To appear in ICML Statistical Relational Learning Workshop. Banff, Canada.
18 (Pasula et al., 2003) ⇒ Hanna Pasula, Bhaskara Marthi, Brian Milch, Stuart Russell, and Ilya Shpitser. (2003). “Identity Uncertainty and Citation Matching.” In: Advances in Neural Information Processing Systems (NIPS 2003).
19 Peng, F., & Andrew McCallum (2004). Accurate information extraction from research papers using conditional random fields. Proceedings of Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT-NAACL) (pp. 329--336).
20 David Pinto, Andrew McCallum, Xing Wei, W. Bruce Croft, Table extraction using conditional random fields, Proceedings of the 26th ACM SIGIR Conference on Research and Development in Information Retrieval, July 28-August 01, 2003, Toronto, Canada doi:10.1145/860435.860479
21 Dan Roth, & Yih, W.-T. (2004). A linear programming formulation for global inference in natural language tasks. Proceedings of the Eighth Workshop on Computational Language Learning (CoNLL).
22 Saul, L. K., & Michael I. Jordan (1996). Exploiting tractable substructures in intractable networks. Advances in Neural Information Processing Systems (NIPS).
23 Fei Sha, Fernando Pereira, Shallow parsing with conditional random fields, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, p.134-141, May 27-June 01, 2003, Edmonton, Canada doi:10.3115/1073445.1073473
24 Charles Sutton, Khashayar Rohanimanesh, Andrew McCallum, Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data, Proceedings of the twenty-first International Conference on Machine learning, p.99, July 04-08, 2004, Banff, Alberta, Canada doi:10.1145/1015330.1015422
25 Taskar, B., Abbeel, P., & Koller, D. (2002). Discriminative probabilistic models for relational data. Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI).
26 Wim Wiegerinck, Variational Approximations between Mean Field Theory and the Junction Tree Algorithm, Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, p.626-633, June 30-July 03, 2000
27 Yedidia, J., Freeman, W., & Weiss, Y. (2000). Generalized belief propagation. Advances in Neural Information Processing Systems (NIPS) (pp. 689--695).
28 Ying, L., Frey, B., Koetter, R., & Munson, D. (2002). Analysis of an iterative dynamic programming approach to 2-d phase unwrapping. IEEE International Conference on Image Processing.


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2004 An IntegConditIEandCoref	Fuchun Peng, Ben Wellner, Andrew McCallum, Michael Hay			An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching		Proceedings of the Conference on Uncertainty in Artificial Intelligence	http://portal.acm.org/citation.cfm?id=1036915			2004