Log-Linear Model


A Log-Linear Model is a mathematical model that takes the form of a function whose logarithm is a first-degree polynomial function of the parameters of the model.
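Equivalently, writing the model's output as [math]\displaystyle{ g(X) = \exp\left(c + \sum_{i} w_i f_i(X)\right) }[/math] (the notation of the Wikipedia entry below; the symbol [math]\displaystyle{ g }[/math] is introduced here only for exposition), taking logarithms gives [math]\displaystyle{ \log g(X) = c + \sum_{i} w_i f_i(X) }[/math], which is first-degree (linear) in the parameters [math]\displaystyle{ c }[/math] and [math]\displaystyle{ w_i }[/math].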



References

2015

  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Log-linear_model Retrieved:2015-3-22.
    • A log-linear model is a mathematical model that takes the form of a function whose logarithm is a first-degree polynomial function of the parameters of the model, which makes it possible to apply (possibly multivariate) linear regression. That is, it has the general form: [math]\displaystyle{ \exp \left(c + \sum_{i} w_i f_i(X) \right)\,, }[/math] in which the [math]\displaystyle{ f_i(X) }[/math] are quantities that are functions of the variables X, in general a vector of values, while [math]\displaystyle{ c }[/math] and the [math]\displaystyle{ w_i }[/math] stand for the model parameters.

      The term may specifically be used for:

    • The specific applications of log-linear models are where the output quantity lies in the range 0 to ∞, for values of the independent variables X, or more immediately, the transformed quantities [math]\displaystyle{ f_i(X) }[/math] in the range −∞ to +∞. This may be contrasted to logistic models, similar to the logistic function, for which the output quantity lies in the range 0 to 1. Thus the contexts where these models are useful or realistic often depend on the range of the values being modelled.
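      A minimal Python sketch of the general form above, with hypothetical feature functions, weights, and constant chosen only for illustration: the output is always positive, and its logarithm is linear in the parameters [math]\displaystyle{ c }[/math] and [math]\displaystyle{ w_i }[/math].

```python
import math

def log_linear_value(x, feature_fns, weights, c):
    """General log-linear form: exp(c + sum_i w_i * f_i(x)).

    Its logarithm, c + sum_i w_i * f_i(x), is linear in the
    parameters c and w_i, which is what makes (possibly
    multivariate) linear regression applicable after a log
    transform.
    """
    score = c + sum(w * f(x) for w, f in zip(weights, feature_fns))
    return math.exp(score)

# Hypothetical feature functions of a 2-vector x (invented for illustration).
feature_fns = [lambda x: x[0], lambda x: x[0] * x[1]]
weights = [0.7, -0.2]
c = 1.5

x = (2.0, 3.0)
y = log_linear_value(x, feature_fns, weights, c)
print(y)            # exp(1.5 + 0.7*2.0 - 0.2*6.0) = exp(1.7), always > 0
print(math.log(y))  # 1.7, linear in c and the w_i
```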

2013

  • (Collins, 2013b) ⇒ Michael Collins. (2013). “Log-Linear Models.” Course notes for NLP, Columbia University.
    • QUOTE: The abstract problem is as follows. We have some set of possible inputs, [math]\displaystyle{ \mathcal{X} }[/math], and a set of possible labels, [math]\displaystyle{ \mathcal{Y} }[/math]. Our task is to model the conditional probability [math]\displaystyle{ p(y \mid x) }[/math] for any pair (x, y) such that [math]\displaystyle{ x \in \mathcal{X} }[/math] and [math]\displaystyle{ y \in \mathcal{Y} }[/math].

      Definition 1 (Log-linear Models) A log-linear model consists of the following components:

      • A set [math]\displaystyle{ \mathcal{X} }[/math] of possible inputs.
      • A set [math]\displaystyle{ \mathcal{Y} }[/math] of possible labels. The set [math]\displaystyle{ \mathcal{Y} }[/math] is assumed to be finite.
      • A positive integer d specifying the number of features and parameters in the model.
      • A function [math]\displaystyle{ f : \mathcal{X} \times \mathcal{Y} \rightarrow \mathbb{R}^d }[/math] that maps any [math]\displaystyle{ (x, y) }[/math] pair to a feature-vector [math]\displaystyle{ f(x, y) }[/math].
      • A parameter vector [math]\displaystyle{ v \in \mathbb{R}^d }[/math].
    • For any [math]\displaystyle{ x \in \mathcal{X} }[/math], [math]\displaystyle{ y \in \mathcal{Y} }[/math], the model defines a conditional probability: [math]\displaystyle{ p(y \mid x; v) = \frac{\exp (v \cdot f(x,y))}{\sum_{y' \in \mathcal{Y}} \exp (v \cdot f(x, y'))} }[/math] Here [math]\displaystyle{ \exp(x) = e^x }[/math], and [math]\displaystyle{ v \cdot f(x,y) = \sum^d_{k=1} v_k f_k(x,y) }[/math] is the inner product between v and f(x,y). The term [math]\displaystyle{ p(y \mid x; v) }[/math] is intended to be read as “the probability of y conditioned on x, under parameter values v”.
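      A minimal Python sketch of Definition 1, assuming an invented two-label tagging task and hypothetical feature functions purely for illustration: it computes [math]\displaystyle{ p(y \mid x; v) }[/math] by normalizing [math]\displaystyle{ \exp(v \cdot f(x,y)) }[/math] over the finite label set [math]\displaystyle{ \mathcal{Y} }[/math], so the probabilities over all labels sum to 1.

```python
import math

def log_linear_prob(x, y, labels, f, v):
    """p(y | x; v) = exp(v . f(x, y)) / sum_{y'} exp(v . f(x, y')).

    `f` maps an (x, y) pair to a d-dimensional feature vector and
    `v` is the d-dimensional parameter vector.
    """
    def dot(v, feats):
        return sum(vk * fk for vk, fk in zip(v, feats))

    numerator = math.exp(dot(v, f(x, y)))
    denominator = sum(math.exp(dot(v, f(x, y2))) for y2 in labels)
    return numerator / denominator

# Hypothetical toy tagging task (labels and features invented for
# illustration): score a label by whether the word is capitalized
# and whether it ends in "ing".
labels = ["NOUN", "VERB"]

def f(x, y):
    return [
        1.0 if (y == "NOUN" and x[0].isupper()) else 0.0,
        1.0 if (y == "VERB" and x.endswith("ing")) else 0.0,
    ]

v = [1.2, 0.8]

probs = {y: log_linear_prob("running", y, labels, f, v) for y in labels}
print(probs)  # the probabilities over the label set sum to 1
```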
