Conditional random field
From Wikipedia, the free encyclopedia
A conditional random field (CRF) is a type of discriminative probabilistic model most often used for the labeling or parsing of sequential data, such as natural language text or biological sequences.
Contents |
[edit] Description
Much like a Markov random field, a CRF is an undirected graphical model in which each vertex represents a random variable whose distribution is to be inferred, and each edge represents a dependency between two random variables. In a CRF, the distribution of each discrete random variable Y in the graph is conditioned on an input sequence X.
In principle, the layout of the graph of random variables Y can be arbitrary; most often, however, the Yi are structured to form a chain, with an edge between each Yi − 1 and Yi. As well as having a simple interpretation of the Yi as "labels" for each element in the input sequence, this layout admits efficient algorithms for model training, learning the conditional distributions between the Yi and feature functions from some corpus of training data, inference, determining the probability of a given label sequence Y given X, and decoding, determining the most likely label sequence Y given X.
The conditional dependency of each Yi on X is defined through a fixed set of feature functions of the form f(i,Yi − 1,Yi,X), which can informally be thought of as measurements on the input sequence that partially determine the likelihood of each possible value for Yi. The model assigns each feature a numerical weight and combines them to determine the probability of a certain value for Yi.
[edit] Relationship to hidden Markov models
CRFs have many of the same applications as conceptually simpler hidden Markov models (HMMs), but relax certain assumptions about the input and output sequence distributions. An HMM can loosely be understood as a CRF with very specific feature functions that use constant probabilities to model state transitions and emissions. Conversely, a CRF can loosely be understood as a generalization of an HMM that makes the constant transition probabilities into arbitrary functions that vary across the positions in the sequence of hidden states, depending on the input sequence.
Notably in contrast to HMMs, CRFs can contain any number of feature functions, the feature functions can inspect the entire input sequence X at any point during inference, and the range of the feature functions need not have a probabilistic interpretation.
The well-known forward-backward and Viterbi algorithms for HMMs have direct analogues for CRFs, with the same asymptotic running times. The training step, which determines a weight for each feature function, is somewhat more complex; generally, there is no closed-form solution for the optimal assignment of weights, so it must be found using a numerical optimization algorithm.
[edit] Higher-order CRFs and semi-Markov CRFs
CRFs can be extended into higher order models by making each Yi dependent on a fixed number o of previous variables Yi − (o − 1),...,Yi. Training and inference are only practical for small values of o (such as ), since their computational cost increase exponentially with o.
There exists another generalization of CRFs, the semi-Markov conditional random field (semi-CRF), which model variable-length segmentations of the label sequence Y. This provides much of the power of higher-order CRFs to model long-range dependencies of the Yi, at a reasonable computational cost.
[edit] References
- Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In: Proc. 18th International Conf. on Machine Learning, Morgan Kaufmann, San Francisco, CA (2001) 282–289
- McCallum, A.: Efficiently inducing features of conditional random fields. In: Proc. 19th Conference on Uncertainty in Artificial Intelligence. (2003)
- Sha, F., Pereira, F.: Shallow parsing with conditional random fields. Technical Report MS-CIS-02-35, University of Pennsylvania (2003)
- Wallach, H.M.: Conditional random fields: An introduction. Technical Report MS-CIS-04-21, University of Pennsylvania (2004)
- Sutton, C., McCallum, A.: An Introduction to Conditional Random Fields for Relational Learning. In "Introduction to Statistical Relational Learning". Edited by Lise Getoor and Ben Taskar. MIT Press. (2006)
- Wang, Y., Loe, K.F., Wu, J.K.: A dynamic conditional random field model for foreground and shadow segmentation. IEEE Trans Pattern Anal Mach Intell. 28(2):279-89 (2006)