2004 SemiMarkovCRFforIE

Subject Headings: Semi-Markov Conditional Random Field, Sequence Segmentation Statistical Models, Supervised Sequence Segmentation Task.



We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence [math]\bf{x}[/math] outputs a “segmentation” of [math]\bf{x}[/math], in which labels are assigned to segments (i.e., subsequences) of [math]\bf{x}[/math] rather than to individual elements [math]x_i[/math] of [math]\bf{x}[/math]. Importantly, features for semi-CRFs can measure properties of segments, and transitions within a segment can be non-Markovian. In spite of this additional power, exact learning and inference algorithms for semi-CRFs are polynomial-time, and often only a small constant factor slower than conventional CRFs. In experiments on five named entity recognition problems, semi-CRFs generally outperform conventional CRFs.

1. Introduction

Conditional random fields (CRFs) are a recently introduced formalism [12] for representing a conditional model [math]\Pr(\bf{y}|\bf{x})[/math], where both [math]\bf{x}[/math] and [math]\bf{y}[/math] have non-trivial structure (often sequential). Here we introduce a generalization of sequential CRFs called semi-Markov conditional random fields (or semi-CRFs). Recall that semi-Markov chain models extend hidden Markov models (HMMs) by allowing each state [math]s_i[/math] to persist for a non-unit length of time [math]d_i[/math]. After this time has elapsed, the system will transition to a new state [math]s'[/math], which depends only on [math]s_i[/math]; however, during the “segment” of time between [math]i[/math] and [math]i + d_i[/math], the behavior of the system may be non-Markovian. Semi-Markov models are fairly common in certain applications of statistics [8, 9], and are also used in reinforcement learning to model hierarchical Markov decision processes [19].
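
As a rough illustration of the semi-Markov chain recalled above, the following Python sketch samples a sequence in which each state persists for a sampled duration and the next state depends only on the current one. The function name, the transition table, and the duration table are assumptions made for this example; they do not come from the paper.

```python
import random


def sample_semi_markov(transitions, durations, start_state, total_length, seed=0):
    """Sample segments (state, start, end) covering positions 0..total_length-1.

    transitions: dict mapping state -> list of (next_state, probability)
    durations:   dict mapping state -> list of (duration, probability)
    Both tables are hypothetical inputs chosen for this sketch.
    """
    rng = random.Random(seed)
    segments = []
    state, position = start_state, 0
    while position < total_length:
        # Draw how long the current state persists.
        lengths, length_probs = zip(*durations[state])
        d = rng.choices(lengths, weights=length_probs)[0]
        d = min(d, total_length - position)  # truncate the final segment
        segments.append((state, position, position + d - 1))
        position += d
        # The next state depends only on the current state,
        # not on what happened inside the segment.
        next_states, next_probs = zip(*transitions[state])
        state = rng.choices(next_states, weights=next_probs)[0]
    return segments


# Toy tables: "O" lasts 1 or 3 steps, "PER" always lasts 2 steps.
transitions = {"O": [("PER", 1.0)], "PER": [("O", 1.0)]}
durations = {"O": [(1, 0.5), (3, 0.5)], "PER": [(2, 1.0)]}
segments = sample_semi_markov(transitions, durations, "O", 10)
```

Each returned triple is one segment with inclusive start and end positions, so the output is a segmentation of the time axis rather than a per-step state sequence.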

Semi-CRFs are a conditionally trained version of semi-Markov chains. In this paper, we present inference and learning methods for semi-CRFs. We also argue that segments often have a clear intuitive meaning, and hence semi-CRFs are more natural than conventional CRFs. We focus here on named entity recognition (NER), in which a segment corresponds to an extracted entity; however, similar arguments might be made for several other tasks, such as gene-finding [11] or NP-chunking [16].
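
To make segment-level inference concrete, here is a minimal sketch of a Viterbi-style dynamic program over segments, assuming a maximum segment length and a segment scoring function. The names `semi_markov_viterbi` and `toy_score`, the scoring weights, and the example sentence are all hypothetical; this shows one way such decoding can be written, not the authors' implementation.

```python
def semi_markov_viterbi(tokens, labels, score, max_len):
    """Return the highest-scoring segmentation as (start, end, label) triples.

    score(tokens, start, end, label, prev_label) is a hypothetical
    segment-level scoring function; end is inclusive and prev_label
    is None for the first segment.
    """
    n = len(tokens)
    # best[i][y]: best score of a segmentation of tokens[:i]
    # whose last segment carries label y.
    best = [{y: float("-inf") for y in labels} for _ in range(n + 1)]
    back = [{y: None for y in labels} for _ in range(n + 1)]
    for i in range(1, n + 1):
        for y in labels:
            for d in range(1, min(max_len, i) + 1):
                start = i - d
                if start == 0:
                    candidates = [(None, 0.0)]
                else:
                    candidates = [(p, best[start][p]) for p in labels]
                for prev, prev_score in candidates:
                    s = prev_score + score(tokens, start, i - 1, y, prev)
                    if s > best[i][y]:
                        best[i][y] = s
                        back[i][y] = (start, prev)
    # Follow back-pointers from the best final label.
    segments = []
    i = n
    y = max(labels, key=lambda lab: best[n][lab])
    while i > 0:
        start, prev = back[i][y]
        segments.append((start, i - 1, y))
        i, y = start, prev
    segments.reverse()
    return segments


def toy_score(tokens, start, end, label, prev_label):
    """Hand-set segment score; prev_label is unused in this toy example."""
    length = end - start + 1
    capitalized = all(t[0].isupper() for t in tokens[start:end + 1])
    if label == "PER":
        # Segment-level features: every token capitalized,
        # plus a bonus when the entity is two tokens long.
        base = 2.0 * length if capitalized else -5.0
        return base + (1.0 if length == 2 else 0.0)
    # "O" segments: prefer single lowercase tokens.
    return 0.5 if length == 1 and not capitalized else -1.0


tokens = ["yesterday", "William", "Cohen", "spoke"]
segments = semi_markov_viterbi(tokens, ["O", "PER"], toy_score, max_len=3)
print(segments)  # [(0, 0, 'O'), (1, 2, 'PER'), (3, 3, 'O')]
```

Because the scoring function sees a whole candidate segment, it can use properties such as segment length that a per-token model cannot express directly. The nested loops cost on the order of n × max_len × |labels|² score evaluations for an input of n tokens.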

In NER, a semi-Markov formulation allows one to easily construct entity-level features (such as “entity length” and …


2004 SemiMarkovCRFforIE
Author: Sunita Sarawagi, William W. Cohen, Shun-Zheng Yu
Title: Semi-Markov Conditional Random Fields for Information Extraction
Journal: Proceedings of Advances in Neural Information Processing Systems
Title URL: http://books.nips.cc/papers/files/nips17/NIPS2004 0427.pdf
Year: 2004