2009 ConditionalRandomFieldsWithHOfeatures

From GM-RKB

Subject Headings: Conditional Random Field Model, Higher-Order Feature, Sequence Labeling Task/Sequential Labeling Task.

Notes

Quotes

Abstract

Dependencies among neighbouring labels in a sequence are an important source of information for sequence labeling problems. However, only dependencies between adjacent labels are commonly exploited in practice, because of the high computational complexity of typical inference algorithms when longer-distance dependencies are taken into account. In this paper, we show that it is possible to design efficient inference algorithms for a conditional random field using features that depend on long consecutive label sequences (high-order features), as long as the number of distinct label sequences used in the features is small. This leads to efficient learning algorithms for these conditional random fields. We show experimentally that exploiting dependencies using high-order features can lead to substantial performance improvements for some problems, and we discuss conditions under which high-order features can be effective.
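
As a concrete illustration of what a "high-order feature" is here, the sketch below scores a fixed label sequence by summing the weight of every feature whose label pattern (a run of consecutive labels) occurs in it. This is a minimal, hypothetical Python sketch, not the authors' implementation; the names score_sequence, patterns, and weights are assumptions introduced for illustration. Scoring one given sequence this way is trivial; the paper's contribution concerns inference, i.e., summing or maximizing over all possible label sequences, which it shows can be done efficiently when the number of distinct patterns is small.

    # Illustrative sketch only (hypothetical names, not from the paper).
    def score_sequence(labels, patterns, weights):
        """Sum the weight of every pattern occurrence in `labels`.

        A first-order CRF uses only length-2 patterns (adjacent label
        pairs); high-order features match longer consecutive label runs.
        """
        total = 0.0
        for pattern, w in zip(patterns, weights):
            k = len(pattern)
            for start in range(len(labels) - k + 1):
                if tuple(labels[start:start + k]) == pattern:
                    total += w
        return total

    # One first-order feature and one third-order feature.
    patterns = [("B", "I"), ("B", "I", "I", "O")]
    weights = [0.7, 1.5]
    print(score_sequence(["B", "I", "I", "O", "B", "I"], patterns, weights))
    # 2.9 = 0.7 (position 0) + 0.7 (position 4) + 1.5 (position 0)

Naively extending forward-backward or Viterbi to such features would blow up the state space exponentially in the pattern length; roughly speaking, the paper's algorithms avoid this by tracking only the prefixes of the distinct patterns that can still match, so the cost scales with the number of patterns rather than with all label sequences of that length.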



2009 ConditionalRandomFieldsWithHOfeatures
Author: Nan Ye, Wee Sun Lee, Hai Leong Chieu, Dan Wu
Title: Conditional Random Fields with High-Order Features for Sequence Labeling
Journal: Advances in Neural Information Processing Systems
URL: http://books.nips.cc/papers/files/nips22/NIPS2009_0300.pdf
Year: 2009