# Likelihood Function

A likelihood function is a measure function of unknown parameters of a statistical model based on known outcomes.

**AKA:** Likelihood Measure.

**Context:**
- It can be a Pseudo-Likelihood Function.
- It can range from being a Discrete Likelihood Function to being a Continuous Likelihood Function.
- It can be produced by an Estimation Task.

**Example(s):**

**See:** Probability Density Function, Statistical Inference, Bayes Rule, Maximum Likelihood Estimate, Prior Distribution, Posterior Distribution, Likelihood Ratio, Likelihood-Ratio Test, Log Likelihood, Gaussian Likelihood, Pseudo-Likelihood, Likelihood Estimate, Normalized Likelihood, Marginal Likelihood, Maximum Likelihood.

## References

### 2015

- (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Likelihood_function#Historical_remarks Retrieved:2015-6-4.
- In statistics, a **likelihood function** (often simply the likelihood) is a function of the parameters of a statistical model. Likelihood functions play a key role in statistical inference, especially methods of estimating a parameter from a set of statistics. In informal contexts, "likelihood" is often used as a synonym for "probability." But in statistical usage, a distinction is made depending on the roles of the outcome or parameter.

  *Probability* is used when describing a function of the outcome given a fixed parameter value. For example, if a coin is flipped 10 times and it is a fair coin, what is the *probability* of it landing heads-up every time? *Likelihood* is used when describing a function of a parameter given an *outcome*. For example, if a coin is flipped 10 times and it has landed heads-up 10 times, what is the *likelihood* that the coin is fair?
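The coin example can be sketched numerically. The following is a minimal illustration, assuming a binomial model for the number of heads; the helper names `binomial_pmf` and `likelihood` and the candidate parameter values are hypothetical choices made for this sketch and do not come from the quoted source.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for n independent flips with heads probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability: the parameter is fixed (fair coin) and the outcome is the variable.
prob_all_heads = binomial_pmf(10, 10, 0.5)  # 0.5**10 = 1/1024


# Likelihood: the outcome is fixed (10 heads in 10 flips) and the parameter varies.
def likelihood(p: float) -> float:
    return binomial_pmf(10, 10, p)


# Hypothetical candidate values of the heads probability.
candidates = [0.5, 0.7, 0.9, 1.0]
likelihoods = {p: likelihood(p) for p in candidates}

for p, value in likelihoods.items():
    print(f"L(p={p}) = {value:.6f}")
```

Both quantities come from the same formula; only the argument treated as variable differs. The likelihood values over the candidates are not required to sum to one, which is one way to see that a likelihood is not a probability distribution over the parameter.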


### 2014

- (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/likelihood_function#Definition Retrieved:2014-12-10.
- The likelihood function is defined differently for discrete and continuous probability distributions.
- Discrete probability distribution
  - Let *X* be a random variable with a discrete probability distribution *p* depending on a parameter *θ*. Then the function
    [math]\displaystyle{ \mathcal{L}(\theta |x) = p_\theta (x) = P_\theta (X=x), \, }[/math]
    considered as a function of *θ*, is called the likelihood function (of *θ*, given the outcome *x* of *X*). Sometimes the probability on the value *x* of *X* for the parameter value *θ* is written as [math]\displaystyle{ P(X=x|\theta) }[/math]; often written as [math]\displaystyle{ P(X=x;\theta) }[/math] to emphasize that this value is not a conditional probability, because *θ* is a parameter and not a random variable.
- Continuous probability distribution.
  - Let *X* be a random variable with a continuous probability distribution with density function *f* depending on a parameter *θ*. Then the function
    [math]\displaystyle{ \mathcal{L}(\theta |x) = f_{\theta} (x), \, }[/math]
    considered as a function of *θ*, is called the likelihood function (of *θ*, given the outcome *x* of *X*). Sometimes the density function for the value *x* of *X* for the parameter value *θ* is written as [math]\displaystyle{ f(x|\theta) }[/math], but should not be considered as a conditional probability density.
- The actual value of a likelihood function bears no meaning. Its use lies in comparing one value with another. For example, one value of the parameter may be more likely than another, given the outcome of the sample. Or a specific value will be most likely: the maximum likelihood estimate. Comparison may also be performed in considering the quotient of two likelihood values. That is why [math]\displaystyle{ \mathcal{L}(\theta |x) }[/math] is generally permitted to be any positive multiple of the above defined function [math]\displaystyle{ \mathcal{L} }[/math]. More precisely, then, a likelihood function is any representative from an equivalence class of functions,
  [math]\displaystyle{ \mathcal{L} \in \left\lbrace \alpha \; P_\theta: \alpha > 0 \right\rbrace, \, }[/math]
  where the constant of proportionality *α* > 0 is not permitted to depend upon *θ*, and is required to be the same for all likelihood functions used in any one comparison. In particular, the numerical value [math]\displaystyle{ \mathcal{L}(\theta |x) }[/math] alone is immaterial; all that matters are maximum values of [math]\displaystyle{ \mathcal{L} }[/math], or likelihood ratios, such as those of the form
  [math]\displaystyle{ \frac{\mathcal{L}(\theta_2 | x)}{\mathcal{L}(\theta_1 | x)} = \frac{\alpha P(X=x|\theta_2)}{\alpha P(X=x|\theta_1)} = \frac{P(X=x|\theta_2)}{P(X=x|\theta_1)}, }[/math]
  that are invariant with respect to the constant of proportionality *α*.
- For more about making inferences via likelihood functions, see also the method of maximum likelihood, and likelihood-ratio testing.
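The invariance of likelihood ratios under a positive constant can be checked with a short sketch. It assumes, purely for illustration, a normal density with unit standard deviation and unknown mean; the names `normal_density` and `scaled_likelihood`, the observation `x_obs`, and the constants tried for α are hypothetical and not taken from the quoted source.

```python
from math import exp, pi, sqrt


def normal_density(x: float, mu: float, sigma: float = 1.0) -> float:
    """Density f_mu(x) of a normal distribution with mean mu."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def scaled_likelihood(mu: float, x: float, alpha: float = 1.0) -> float:
    """alpha * f_mu(x), viewed as a function of mu for a fixed outcome x."""
    return alpha * normal_density(x, mu)


x_obs = 1.2            # hypothetical observed outcome
mu_1, mu_2 = 0.0, 1.0  # two parameter values to compare

# The same ratio is computed under three different constants alpha.
ratios = [
    scaled_likelihood(mu_2, x_obs, a) / scaled_likelihood(mu_1, x_obs, a)
    for a in (1.0, 0.01, 250.0)
]
print(ratios)
```

Each entry of `ratios` agrees up to floating-point rounding, because α cancels in the quotient as long as the same α is used in the numerator and the denominator.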

### 2013

- http://www.math.uah.edu/stat/point/Likelihood.html
- QUOTE: Suppose again that we have an observable random variable [math]\displaystyle{ X }[/math] for an experiment, that takes values in a set S. Suppose also that distribution of [math]\displaystyle{ X }[/math] depends on an unknown parameter θ, taking values in a parameter space Θ. Specifically, we will denote the probability density function of [math]\displaystyle{ X }[/math] on S by [math]\displaystyle{ f_θ }[/math] for θ∈Θ. Of course, our data variable [math]\displaystyle{ X }[/math] will almost always be vector-valued. The parameter θ may also be vector-valued.
  The *likelihood function* [math]\displaystyle{ L }[/math] is the function obtained by reversing the roles of x and θ in the probability density function; that is, we view θ as the variable and x as the given information (which is precisely the point of view in estimation):
  [math]\displaystyle{ L_x(θ) = f_θ(x); \quad θ∈Θ, \; x∈S }[/math]
  In the *method of maximum likelihood*, we try to find a value [math]\displaystyle{ u(x) }[/math] of the parameter θ that maximizes [math]\displaystyle{ L_x(θ) }[/math] for each [math]\displaystyle{ x∈S }[/math]. If we can do this, then the statistic [math]\displaystyle{ u(X) }[/math] is called a maximum likelihood estimator of θ. The method is intuitively appealing: we try to find the values of the parameters that would have most likely produced the data we in fact observed.
  Since the natural logarithm function is strictly increasing on (0, ∞), the maximum value of [math]\displaystyle{ L_x(θ) }[/math], if it exists, will occur at the same points as the maximum value of [math]\displaystyle{ \ln [L_x(θ)] }[/math]. This latter function is called the *log likelihood function* and in many cases is easier to work with than the likelihood function (typically because the probability density function [math]\displaystyle{ f_θ(x) }[/math] has a product structure).
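The claim that the likelihood and the log likelihood peak at the same parameter value can be illustrated with a small grid search. This is a sketch under stated assumptions: an exponential model with rate θ, a made-up sample, and a coarse grid; none of these choices come from the quoted source.

```python
from math import exp, log, prod

# Hypothetical i.i.d. observations, assumed to follow an exponential
# distribution with density f_theta(x) = theta * exp(-theta * x).
sample = [0.8, 1.9, 0.4, 2.7, 1.2]


def likelihood(theta: float) -> float:
    """Product of the densities: the product structure mentioned in the quote."""
    return prod(theta * exp(-theta * x) for x in sample)


def log_likelihood(theta: float) -> float:
    """The logarithm turns the product into a sum."""
    return sum(log(theta) - theta * x for x in sample)


# Coarse grid of candidate rates in (0, 5].
grid = [0.01 * k for k in range(1, 501)]

theta_hat_L = max(grid, key=likelihood)
theta_hat_logL = max(grid, key=log_likelihood)

# For this model the maximizer is known in closed form: n / sum(x_i).
closed_form = len(sample) / sum(sample)

print(theta_hat_L, theta_hat_logL, closed_form)
```

Both searches select the same grid point, which lies within one grid step of the closed-form maximizer. With larger samples the raw product can underflow to zero in floating point, which is a practical reason to prefer the sum form.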


### 2009

- (WordNet, 2009) ⇒ http://wordnetweb.princeton.edu/perl/webwn?s=likelihood
- S: (n) likelihood, likeliness (the probability of a specified outcome)

- (Gentle, 2009) ⇒ James E. Gentle. (2009). “Computational Statistics.” Springer. ISBN:978-0-387-98143-7
- QUOTE: The likelihood function arises from a probability density, but it is not a probability density function. It does not in any way relate to a “probability” associated with the parameters or the model.
Although non-statisticians will often refer to the “likelihood of an observation”, in statistics, we use the term “likelihood” to refer to a model or a distribution given observations.


### 2006

- http://www.hopkinsmedicine.org/Bayes/PrimaryPages/Glossary.cfm
- Likelihood function: A mathematical expression that indicates the likelihood that the observed data (or sufficient statistic) would have been observed, given the (unknown) population parameter(s). Note the difference from the P value.