Log-Linear Model


A Log-Linear Model is a mathematical model that takes the form of a function whose logarithm is a first-degree polynomial function of the parameters of the model.

References

2015

• (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Log-linear_model Retrieved:2015-3-22.
• A log-linear model is a mathematical model that takes the form of a function whose logarithm is a first-degree polynomial function of the parameters of the model, which makes it possible to apply (possibly multivariate) linear regression. That is, it has the general form: $\exp \left(c + \sum_{i} w_i f_i(X) \right),$ in which the $f_i(X)$ are quantities that are functions of the variables $X$, in general a vector of values, while $c$ and the $w_i$ stand for the model parameters.

The term may specifically be used for:

• The specific applications of log-linear models are where the output quantity lies in the range 0 to ∞, for values of the independent variables $X$, or more immediately, the transformed quantities $f_i(X)$, in the range −∞ to +∞. This may be contrasted to logistic models, similar to the logistic function, for which the output quantity lies in the range 0 to 1. Thus the contexts where these models are useful or realistic often depend on the range of the values being modelled.
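
As a minimal sketch of the general form and of the range contrast described above, the following Python snippet evaluates $\exp(c + \sum_i w_i f_i(X))$ for a toy input. The feature functions, the weights, and the helper names `log_linear` and `logistic` are hypothetical choices made for illustration; they do not come from the quoted source.

```python
import math

def log_linear(x, features, weights, c):
    """Evaluate exp(c + sum_i w_i * f_i(x)); the result is always positive."""
    linear_part = c + sum(w * f(x) for w, f in zip(weights, features))
    return math.exp(linear_part)

def logistic(z):
    """Logistic function, shown only for contrast: its output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical feature functions f_1, f_2 of a 2-dimensional input X.
features = [lambda x: x[0], lambda x: x[0] * x[1]]
weights = [0.5, -2.0]   # assumed values for w_1, w_2
c = 1.0                 # assumed value for the constant c

x = (3.0, 4.0)
y = log_linear(x, features, weights, c)

# Taking the logarithm recovers an expression that is linear in c and the w_i:
# c + w_1 * f_1(x) + w_2 * f_2(x) = 1.0 + 0.5 * 3.0 - 2.0 * 12.0
linear_part = 1.0 + 0.5 * 3.0 - 2.0 * 12.0
```

Because the logarithm of the output is linear in the parameters, fitting $c$ and the $w_i$ to observed positive outputs can be treated as a linear regression on $\log y$.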

2013

• (Collins, 2013b) ⇒ Michael Collins. (2013). “Log-Linear Models.” Course notes for NLP by Michael Collins, Columbia University.
• QUOTE: The abstract problem is as follows. We have some set of possible inputs, $\mathcal{X}$, and a set of possible labels, $\mathcal{Y}$. Our task is to model the conditional probability $p(y \mid x)$ for any pair $(x, y)$ such that $x \in \mathcal{X}$ and $y \in \mathcal{Y}$.

Definition 1 (Log-linear Models) A log-linear model consists of the following components:

• A set $\mathcal{X}$ of possible inputs.
• A set $\mathcal{Y}$ of possible labels. The set $\mathcal{Y}$ is assumed to be finite.
• A positive integer d specifying the number of features and parameters in the model.
• A function $f : \mathcal{X} \times \mathcal{Y} \rightarrow \mathbb{R}^d$ that maps any $(x, y)$ pair to a feature-vector $f(x, y)$.
• A parameter vector $v \in \mathbb{R}^d$.
• For any $x \in \mathcal{X}$, $y \in \mathcal{Y}$, the model defines a conditional probability: $p(y \mid x; v) = \frac{\exp (v \cdot f(x,y))}{\sum_{y' \in \mathcal{Y}} \exp (v \cdot f(x, y'))}$. Here $\exp(x) = e^x$, and $v \cdot f(x,y) = \sum^d_{k=1} v_k f_k(x,y)$ is the inner product between $v$ and $f(x,y)$. The term $p(y \mid x; v)$ is intended to be read as “the probability of y conditioned on x, under parameter values v”.
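
The components of Definition 1 can be sketched in Python as follows. This is an illustrative toy, not code from the course notes: the label set, the feature function `f`, the parameter values in `v`, and the helper name `conditional_probs` are all assumptions. Subtracting the maximum score before exponentiating is an added numerical-stability step; it leaves the resulting probabilities unchanged.

```python
import math

def conditional_probs(x, labels, f, v):
    """Return {y: p(y | x; v)} for a log-linear model over a finite label set."""
    # Inner product v . f(x, y) for every candidate label y.
    scores = {y: sum(vk * fk for vk, fk in zip(v, f(x, y))) for y in labels}
    # Shift by the maximum score (assumed stability trick; cancels in the ratio).
    m = max(scores.values())
    exps = {y: math.exp(s - m) for y, s in scores.items()}
    z = sum(exps.values())  # normalizer: sum over y' of exp(v . f(x, y'))
    return {y: e / z for y, e in exps.items()}

# Hypothetical toy setup: inputs are words, labels are coarse tags, d = 3.
LABELS = ["NOUN", "VERB"]

def f(x, y):
    """Hypothetical feature vector f(x, y) in R^3."""
    return [
        1.0 if x.endswith("ing") and y == "VERB" else 0.0,
        1.0 if x[0].isupper() and y == "NOUN" else 0.0,
        1.0 if y == "NOUN" else 0.0,
    ]

v = [2.0, 1.5, 0.5]  # assumed parameter vector
probs = conditional_probs("running", LABELS, f, v)
```

Here the score for VERB is 2.0 and the score for NOUN is 0.5, so the model assigns the larger probability to VERB, and the probabilities over the label set sum to 1.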