2007 FirstOrderProbModForCorefRes

Subject Headings: Coreference Resolution Algorithm, Coreference Resolution System.



Traditional noun phrase coreference resolution systems represent features only of pairs of noun phrases. In this paper, we propose a machine learning method that enables features over sets of noun phrases, resulting in a first-order probabilistic model for coreference. We outline a set of approximations that make this approach practical, and apply our method to the ACE coreference dataset, achieving a 45% error reduction over a comparable method that only considers features of pairs of noun phrases. This result demonstrates an example of how a first-order logic representation can be incorporated into a probabilistic model and scaled efficiently.
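The contrast between pairwise features and features over sets of noun phrases can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Mention` fields, the two feature functions, and the specific features (string match, gender agreement, all-pronoun) are hypothetical. The sketch shows only the representational difference; it does not implement the paper's model, training procedure, or approximations.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """Hypothetical noun-phrase record; field names are illustrative only."""
    text: str
    is_pronoun: bool
    gender: str  # e.g. "m", "f", "n"


def pairwise_features(a, b):
    """Features that look at exactly one pair of noun phrases."""
    return {
        "string_match": a.text.lower() == b.text.lower(),
        "gender_agree": a.gender == b.gender,
    }


def set_features(cluster):
    """Features quantified (for-all / exists) over a whole set of noun phrases."""
    pairs = list(combinations(cluster, 2))
    return {
        # Universal quantifier over single mentions.
        "all_pronouns": all(m.is_pronoun for m in cluster),
        # Universal quantifier over every pair in the set.
        "all_pairs_gender_agree": all(a.gender == b.gender for a, b in pairs),
        # Existential quantifier over pairs.
        "exists_string_match": any(
            a.text.lower() == b.text.lower() for a, b in pairs
        ),
    }


# A candidate cluster made up for this example.
cluster = [
    Mention("she", True, "f"),
    Mention("her", True, "f"),
    Mention("he", True, "m"),
]

print(pairwise_features(cluster[0], cluster[1]))
print(set_features(cluster))
```

In this toy cluster, the pair ("she", "her") looks compatible on its own, while the set-level feature `all_pairs_gender_agree` exposes the gender conflict introduced by "he". A pairwise representation can see such a conflict only one pair at a time, whereas a set-level feature states it about the candidate cluster as a whole.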



  • Record: 2007 FirstOrderProbModForCorefRes
  • Author: Aron Culotta, Michael Wick, Robert Hall, Andrew McCallum
  • Title: First-Order Probabilistic Models for Coreference Resolution
  • Journal: Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
  • Title URL: http://www.cs.umass.edu/~culotta/pubs/culotta07first.pdf
  • Year: 2007