Linear-Chain Conditional Random Field
- AKA: Linear-Chain CRF.
- See: Linear-CRF Training Algorithm.
- Linear-chain CRFs have many of the same applications as conceptually simpler hidden Markov models (HMMs), but relax certain assumptions about the input and output sequence distributions. An HMM can loosely be understood as a CRF with very specific feature functions that use constant probabilities to model state transitions and emissions. Conversely, a CRF can loosely be understood as a generalization of an HMM that makes the constant transition probabilities into arbitrary functions that vary across the positions in the sequence of hidden states, depending on the input sequence.
- (Wallach, 2004) ⇒ Hanna M. Wallach. (2004). “Conditional Random Fields: An Introduction.” Technical Report MS-CIS-04-21, University of Pennsylvania.
- (McCallum & Li, 2003) ⇒ Andrew McCallum, and Wei Li. (2003). “Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons.” In: Proceedings of the Seventh Conference on Natural Language Learning (CoNLL 2003). doi:10.3115/1119176.1119206
- QUOTE: In the special case in which the output nodes of the graphical model are linked by edges in a linear chain, CRFs make a first-order Markov independence assumption, and thus can be understood as conditionally-trained finite state machines (FSMs). In the remainder of this section we introduce the likelihood model, inference and estimation procedures for CRFs.
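The relationship between HMMs and linear-chain CRFs described above can be illustrated with a small sketch. It is not taken from the cited papers. The toy states, words, probabilities, the `bonus` weight, and all function names are hypothetical and chosen only for illustration. The sketch scores a label sequence as a sum of per-position log-potentials, normalizes with the forward algorithm, and compares two potentials: one that uses only constant log-probabilities (which reproduces an HMM) and one that also depends on the input at a neighboring position.

```python
import math
from itertools import product

# Hypothetical toy HMM; every number here is an illustrative assumption.
STATES = ["N", "V"]
START = {"N": 0.6, "V": 0.4}
TRANS = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
EMIT = {
    "N": {"fish": 0.5, "swim": 0.1, "can": 0.4},
    "V": {"fish": 0.3, "swim": 0.5, "can": 0.2},
}


def hmm_potential(prev, cur, xs, t):
    """Log-potential built only from constant HMM probabilities."""
    trans = START[cur] if prev is None else TRANS[prev][cur]
    return math.log(trans) + math.log(EMIT[cur][xs[t]])


def crf_potential(prev, cur, xs, t):
    """HMM potential plus a hypothetical feature that looks at the next word.

    The score now varies with position and with the whole input sequence,
    which the constant HMM parameters above cannot express.
    """
    next_is_fish = t + 1 < len(xs) and xs[t + 1] == "fish"
    bonus = 1.5 if (cur == "V" and next_is_fish) else 0.0
    return hmm_potential(prev, cur, xs, t) + bonus


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(potential, xs):
    """Forward algorithm: log Z(x), the log-sum of scores over all label sequences."""
    alpha = {s: potential(None, s, xs, 0) for s in STATES}
    for t in range(1, len(xs)):
        alpha = {
            s: log_sum_exp([alpha[p] + potential(p, s, xs, t) for p in STATES])
            for s in STATES
        }
    return log_sum_exp(list(alpha.values()))


def sequence_score(potential, xs, ys):
    """Unnormalized log-score of one label sequence ys for input xs."""
    score, prev = 0.0, None
    for t, y in enumerate(ys):
        score += potential(prev, y, xs, t)
        prev = y
    return score


def conditional_prob(potential, xs, ys):
    """p(ys | xs) = exp(score(xs, ys) - log Z(xs))."""
    return math.exp(sequence_score(potential, xs, ys) - log_partition(potential, xs))


if __name__ == "__main__":
    xs = ["can", "fish"]
    for ys in product(STATES, repeat=len(xs)):
        p_hmm = conditional_prob(hmm_potential, xs, ys)
        p_crf = conditional_prob(crf_potential, xs, ys)
        print(ys, round(p_hmm, 4), round(p_crf, 4))
```

With `hmm_potential`, `exp(log_partition(...))` equals the HMM marginal p(x), so the conditional is the ordinary HMM posterior over label sequences. With `crf_potential`, the same normalization still yields a valid distribution over label sequences, but the scores are no longer tied to constant transition and emission probabilities.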