Poisson Regression Algorithm


A Poisson Regression Algorithm is a generalized linear model regression algorithm that is restricted to the Poisson distribution family: it models a count-valued response variable whose conditional mean is linked, typically through a log link function, to a linear combination of the predictors.
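As a concrete illustration, the following is a minimal sketch of the fitting step, a Newton–Raphson (equivalently, IRLS) loop for a Poisson GLM with a log link; the function name fit_poisson_regression and the simulated data are hypothetical, not taken from any particular library:

    import numpy as np

    def fit_poisson_regression(X, y, n_iter=25, tol=1e-8):
        # Newton-Raphson / IRLS for a Poisson GLM with a log link.
        # Assumes X already contains an intercept column.
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            mu = np.exp(X @ beta)            # E[y | x] under the log link
            grad = X.T @ (y - mu)            # score of the Poisson log-likelihood
            hess = X.T @ (X * mu[:, None])   # Fisher information (= -Hessian)
            step = np.linalg.solve(hess, grad)
            beta += step
            if np.max(np.abs(step)) < tol:
                break
        return beta

    # Hypothetical simulated counts with rate lambda_i = exp(0.5 + 1.2 * x_i)
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=500)
    X = np.column_stack([np.ones_like(x), x])
    y = rng.poisson(np.exp(0.5 + 1.2 * x))
    print(fit_poisson_regression(X, y))      # roughly [0.5, 1.2]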



References

2015

  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/Poisson_distribution#Maximum_likelihood Retrieved:2015-6-14.
    • Given a sample of n measured values k_i ∈ {0, 1, 2, ...}, for i = 1, ..., n, we wish to estimate the value of the parameter λ of the Poisson population from which the sample was drawn. The maximum likelihood estimate is : [math]\displaystyle{ \widehat{\lambda}_\mathrm{MLE}=\frac{1}{n}\sum_{i=1}^n k_i. \! }[/math] Since each observation has expectation λ, so does this sample mean. Therefore the maximum likelihood estimate is an unbiased estimator of λ. It is also an efficient estimator, i.e. its estimation variance achieves the Cramér–Rao lower bound (CRLB); hence it is minimum-variance unbiased. It can also be proved that the sum (and hence the sample mean, as it is a one-to-one function of the sum) is a complete and sufficient statistic for λ.
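      As a quick numerical check (a hypothetical simulation, not part of the quoted source), the MLE is just the sample mean; averaged over many replications it recovers λ, and its variance matches the CRLB of λ/n:

        import numpy as np

        rng = np.random.default_rng(1)
        lam, n, reps = 3.7, 50, 20_000

        samples = rng.poisson(lam, size=(reps, n))
        mle = samples.mean(axis=1)      # one sample-mean MLE per replication

        print(mle.mean())               # close to lam = 3.7 (unbiased)
        print(mle.var(), lam / n)       # variance attains the CRLB lam/n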

      To prove sufficiency we may use the factorization theorem. Consider partitioning the probability mass function of the joint Poisson distribution for the sample into two parts: one that depends solely on the sample [math]\displaystyle{ \mathbf{x} }[/math] (called [math]\displaystyle{ h(\mathbf{x}) }[/math]) and one that depends on the parameter [math]\displaystyle{ \lambda }[/math] and the sample [math]\displaystyle{ \mathbf{x} }[/math] only through the function [math]\displaystyle{ T(\mathbf{x}) }[/math]. Then [math]\displaystyle{ T(\mathbf{x}) }[/math] is a sufficient statistic for [math]\displaystyle{ \lambda }[/math]. : [math]\displaystyle{ P(\mathbf{x})=\prod_{i=1}^n\frac{\lambda^{x_i} e^{-\lambda}}{x_i!}=\frac{1}{\prod_{i=1}^n x_i!} \times \lambda^{\sum_{i=1}^n x_i}e^{-n\lambda} }[/math] Note that the first term, [math]\displaystyle{ h(\mathbf{x}) }[/math], depends only on [math]\displaystyle{ \mathbf{x} }[/math]. The second term, [math]\displaystyle{ g(T(\mathbf{x})|\lambda) }[/math], depends on the sample only through [math]\displaystyle{ T(\mathbf{x})=\sum_{i=1}^n x_i }[/math]. Thus, [math]\displaystyle{ T(\mathbf{x}) }[/math] is sufficient.
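      To see the factorization in action (a hypothetical check, not part of the quoted source), two samples with the same sum T(x) have likelihoods whose ratio is the same for every λ, since the λ-dependent factor g(T(x)|λ) cancels and only h(x1)/h(x2) remains:

        import numpy as np
        from scipy.stats import poisson

        x1 = np.array([1, 2, 3])        # T(x1) = 6
        x2 = np.array([0, 2, 4])        # T(x2) = 6 as well

        for lam in [0.5, 1.0, 2.5]:
            ratio = poisson.pmf(x1, lam).prod() / poisson.pmf(x2, lam).prod()
            print(lam, ratio)           # always 4.0 = h(x1)/h(x2), free of lam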

      To find the parameter λ that maximizes the probability function for the Poisson population, we can use the logarithm of the probability function: : [math]\displaystyle{ \begin{align} L(\lambda) & = \ln \prod_{i=1}^n f(k_i \mid \lambda) \\ & = \sum_{i=1}^n \ln\!\left(\frac{e^{-\lambda}\lambda^{k_i}}{k_i!}\right) \\ & = -n\lambda + \left(\sum_{i=1}^n k_i\right) \ln(\lambda) - \sum_{i=1}^n \ln(k_i!). \end{align} }[/math] We take the derivative of L with respect to λ and set it equal to zero: : [math]\displaystyle{ \frac{\mathrm{d}}{\mathrm{d}\lambda} L(\lambda) = 0 \iff -n + \left(\sum_{i=1}^n k_i\right) \frac{1}{\lambda} = 0. \! }[/math] Solving for λ gives a stationary point. : [math]\displaystyle{ \lambda = \frac{\sum_{i=1}^n k_i}{n} }[/math] So λ is the average of the k_i values. The sign of the second derivative of L at the stationary point determines what kind of extreme value λ is. : [math]\displaystyle{ \frac{\partial^2 L}{\partial \lambda^2} = -\lambda^{-2}\sum_{i=1}^n k_i }[/math] Evaluating the second derivative at the stationary point gives: : [math]\displaystyle{ \frac{\partial^2 L}{\partial \lambda^2} = - \frac{n^2}{\sum_{i=1}^n k_i} }[/math] which is the negative of n times the reciprocal of the average of the k_i. This expression is negative whenever the average is positive, in which case the stationary point maximizes the probability function.
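      The closed-form answer can be checked numerically (a hypothetical example, not part of the quoted source) by minimizing the negative log-likelihood over λ and comparing the result with the sample mean:

        import numpy as np
        from scipy.optimize import minimize_scalar
        from scipy.special import gammaln

        k = np.array([2, 0, 3, 1, 4, 2])   # hypothetical count data

        def neg_log_lik(lam):
            # -L(lam) = n*lam - (sum k_i) * ln(lam) + sum ln(k_i!)
            return len(k) * lam - k.sum() * np.log(lam) + gammaln(k + 1).sum()

        res = minimize_scalar(neg_log_lik, bounds=(1e-9, 20.0), method="bounded")
        print(res.x, k.mean())             # numerical maximizer matches the mean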

      For completeness, a family of distributions is said to be complete if and only if [math]\displaystyle{ E(g(T)) = 0 }[/math] implies that [math]\displaystyle{ P_\lambda(g(T) = 0) = 1 }[/math] for all [math]\displaystyle{ \lambda }[/math]. If the individual [math]\displaystyle{ X_i }[/math] are iid [math]\displaystyle{ \mathrm{Po}(\lambda) }[/math], then [math]\displaystyle{ T(\mathbf{x})=\sum_{i=1}^n X_i\sim \mathrm{Po}(n\lambda) }[/math]. Knowing the distribution we want to investigate, it is easy to see that the statistic is complete. : [math]\displaystyle{ E(g(T))=\sum_{t=0}^\infty g(t)\frac{(n\lambda)^te^{-n\lambda}}{t!}=0 }[/math] For this equality to hold for all [math]\displaystyle{ \lambda > 0 }[/math], [math]\displaystyle{ g(t) }[/math] must be 0 for every [math]\displaystyle{ t }[/math]: after dividing by [math]\displaystyle{ e^{-n\lambda} }[/math], the left-hand side is a power series in [math]\displaystyle{ n\lambda }[/math] with coefficients [math]\displaystyle{ g(t)/t! }[/math], and a power series that vanishes for all [math]\displaystyle{ \lambda }[/math] must have every coefficient equal to zero. Hence, [math]\displaystyle{ E(g(T)) = 0 }[/math] for all [math]\displaystyle{ \lambda }[/math] implies that [math]\displaystyle{ P_\lambda(g(T) = 0) = 1 }[/math], and the statistic has been shown to be complete.
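      The distributional fact used above, that the sum of n iid Po(λ) variables is Po(nλ), can also be checked by simulation (a hypothetical sketch, not part of the quoted source):

        import numpy as np
        from scipy.stats import poisson

        rng = np.random.default_rng(2)
        lam, n, reps = 1.3, 8, 100_000

        T = rng.poisson(lam, size=(reps, n)).sum(axis=1)   # T = sum of n iid Po(lam)

        for t in range(8, 13):
            print(t, (T == t).mean(), poisson.pmf(t, n * lam))  # empirical vs Po(n*lam)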