Prior Probability Function


A prior probability function is a probability function that represents knowledge or belief about an uncertain quantity before any evidence about it is taken into account.



  • (Wikipedia, 2015) ⇒ Retrieved:2015-6-23.
    • In Bayesian statistical inference, a prior probability distribution, often called simply the prior, of an uncertain quantity is the probability distribution p that would express one's beliefs about this quantity before some evidence is taken into account. For example, p could be the probability distribution for the proportion of voters who will vote for a particular politician in a future election. It is meant to attribute uncertainty, rather than randomness, to the quantity. The unknown quantity may be a parameter or latent variable.

      One applies Bayes' theorem, multiplying the prior by the likelihood function and then normalizing, to get the posterior probability distribution, which is the conditional distribution of the uncertain quantity, given the data.
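
The multiply-then-normalize step described above can be sketched numerically on a grid of candidate values. This is a minimal illustration rather than a reference implementation: the flat prior, the 101-point grid, the poll result of 7 supporters out of 10, and the names `grid_posterior`, `grid`, `supporters` and `polled` are all assumptions introduced here, not taken from the quoted source.

```python
# Grid approximation of Bayes' theorem: posterior is proportional to
# prior times likelihood. All numbers below are illustrative assumptions.

def grid_posterior(prior, likelihood):
    """Multiply prior by likelihood pointwise, then normalize to sum to 1."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Candidate values for the proportion of voters supporting the politician.
grid = [i / 100 for i in range(101)]

# A flat prior: every candidate proportion is equally plausible beforehand.
prior = [1 / len(grid)] * len(grid)

# Hypothetical evidence: 7 of 10 polled voters support the politician.
supporters, polled = 7, 10
likelihood = [q ** supporters * (1 - q) ** (polled - supporters) for q in grid]

posterior = grid_posterior(prior, likelihood)

# The grid value with the highest posterior probability.
posterior_mode = grid[posterior.index(max(posterior))]
print(posterior_mode)
```

With a flat prior the posterior simply follows the likelihood; substituting a prior that favours some proportions over others would shift the result toward those values.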

      A prior is often the purely subjective assessment of an experienced expert. Some will choose a conjugate prior when they can, to make calculation of the posterior distribution easier.

      Parameters of prior distributions are called hyperparameters, to distinguish them from parameters of the model of the underlying data. For instance, if one is using a beta distribution to model the distribution of the parameter p of a Bernoulli distribution, then:

      • p is a parameter of the underlying system (Bernoulli distribution), and
      • α and β are parameters of the prior distribution (beta distribution), hence hyperparameters.
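
The roles of the parameter and the hyperparameters can be illustrated with the standard conjugate update for this pair: a Beta(α, β) prior combined with Bernoulli observations gives a Beta(α + successes, β + failures) posterior. The sketch below assumes hypothetical hyperparameter values and made-up data; the function names `update_beta` and `beta_mean` are introduced here for illustration only.

```python
# Conjugate Beta-Bernoulli update. The hyperparameters and the data are
# hypothetical values chosen only to illustrate the bookkeeping.

def update_beta(alpha, beta, observations):
    """Return posterior hyperparameters after Bernoulli outcomes (1 or 0)."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hyperparameters of the prior over the Bernoulli parameter p (assumed).
alpha0, beta0 = 2, 2

# Hypothetical Bernoulli observations: 6 successes and 2 failures.
data = [1, 0, 1, 1, 0, 1, 1, 1]

alpha_n, beta_n = update_beta(alpha0, beta0, data)

print(beta_mean(alpha0, beta0))    # prior mean of p
print(beta_mean(alpha_n, beta_n))  # posterior mean of p
```

Here p is never set directly; only the hyperparameters change as data arrive, which is why conjugate priors make the posterior easy to compute.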


  • (Webb, 2011k) ⇒ Geoffrey I. Webb. (2011). “Prior Probability.” In: (Sammut & Webb, 2011) p.782
    • QUOTE: In Bayesian inference, a prior probability of a value x of a random variable X, P(X = x), is the probability of X assuming the value x in the absence of (or before obtaining) any additional information. It contrasts with the posterior probability, P(X = x | Y = y), the probability of X assuming the value x in the context of Y = y.

      For example, it may be that the prevalence of a particular form of cancer, exoma, in the population is 0.1%, so the prior probability of exoma, P(exoma = true), is 0.001. However, assume 50% of people who have skin discolorations of greater than 1 cm width (sd > 1cm) have exoma. It follows that the posterior probability of exoma given sd > 1cm, P(exoma = true | sd > 1cm = true), is 0.500.
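
The contrast between the two probabilities in the exoma example can be checked with simple counts. The sketch below assumes a hypothetical population of 1,000,000 in which 400 people have the skin discoloration; only the 0.1% prevalence and the 50% figure come from the quoted text, and all variable names are introduced here for illustration.

```python
from fractions import Fraction

# Hypothetical population; only the 0.1% prevalence and the 50% figure
# come from the quoted example. The remaining counts are assumptions.
population = 1_000_000
with_exoma = 1_000        # 0.1% of the population
with_sd = 400             # assumed number of people with sd > 1cm
with_sd_and_exoma = 200   # 50% of those with sd > 1cm

# Prior: probability of exoma before observing anything else.
prior = Fraction(with_exoma, population)

# Posterior: probability of exoma among people known to have sd > 1cm.
posterior = Fraction(with_sd_and_exoma, with_sd)

# The same posterior obtained through Bayes' theorem.
p_sd_given_exoma = Fraction(with_sd_and_exoma, with_exoma)
p_sd = Fraction(with_sd, population)
posterior_via_bayes = p_sd_given_exoma * prior / p_sd

print(float(prior), float(posterior), float(posterior_via_bayes))
```

Using exact fractions avoids rounding noise, so the direct count and the Bayes' theorem route agree exactly.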