# Probability Measure

A probability measure is a real-valued measure on events which assigns a nonnegative probability value to every set in a sigma-field (a collection of subsets of a sample space).

**AKA:** [math]\displaystyle{ P }[/math], Outcome Probability, Probability Function.

**Context:**

- It must satisfy the following axioms, for a probability space defined by the triple [math]\displaystyle{ (\Omega, \mathcal{F}, P) }[/math], where [math]\displaystyle{ \Omega }[/math] is a sample space, [math]\displaystyle{ \mathcal{F} }[/math] is a sigma-field, and [math]\displaystyle{ P }[/math] is a probability measure:

- (a) [math]\displaystyle{ 0 \leq P(A) \leq 1 }[/math] for all subsets [math]\displaystyle{ A \; \in \; \mathcal{F} }[/math]
- (b) [math]\displaystyle{ P(\emptyset)=0 }[/math] and [math]\displaystyle{ P(\Omega)=1 }[/math]
- (c) If [math]\displaystyle{ \{A_1,A_2,A_3,\cdots \} }[/math] is a sequence of disjoint sets (i.e. [math]\displaystyle{ A_i \cap A_j = \emptyset }[/math] whenever [math]\displaystyle{ i\neq j }[/math]) that belong to [math]\displaystyle{ \mathcal{F} }[/math], then [math]\displaystyle{ P(\cup_iA_i)= \sum^{\infty}_{i=1} P(A_i) }[/math].

- It can be expressed as a Probability Distribution Function.
- It can range from being an Abstract Probability Function to being a Probability Function Structure.
- It can range from being a Parametric Probability Function to being a Non-Parametric Probability Function (family?).
- It can be an Axiomatic Probability Function, based on Probability Axioms and Mathematical Inference.
- It is a member of a Probability Space that returns a Random Experiment Event's Event Probability.

- …
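For a finite sample space, axioms (a) to (c) above can be checked mechanically. The following is a minimal sketch, not a reference implementation: the helper names `powerset`, `make_measure`, and `satisfies_axioms` are hypothetical, the sigma-field is assumed to be the full power set, and countable additivity is checked only as additivity over pairs of disjoint sets, which is sufficient when the sigma-field is finite.

```python
from fractions import Fraction
from itertools import chain, combinations


def powerset(omega):
    """Return every subset of a finite sample space as a frozenset."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def make_measure(point_mass):
    """Build P(A) as the sum of the point masses of the outcomes in A."""
    def P(event):
        return sum((point_mass[w] for w in event), Fraction(0))
    return P


def satisfies_axioms(P, omega, sigma_field):
    """Check axioms (a)-(c) on a finite sigma-field."""
    # (a) every probability lies in [0, 1]
    bounded = all(0 <= P(A) <= 1 for A in sigma_field)
    # (b) the empty set has probability 0 and the sample space has probability 1
    normalized = P(frozenset()) == 0 and P(frozenset(omega)) == 1
    # (c) additivity over disjoint pairs (enough for a finite sigma-field)
    additive = all(
        P(A | B) == P(A) + P(B)
        for A in sigma_field
        for B in sigma_field
        if not (A & B)
    )
    return bounded and normalized and additive


# Illustrative fair-coin measure (assumed point masses of 1/2 each).
omega = {"H", "T"}
sigma_field = powerset(omega)
fair_coin = make_measure({"H": Fraction(1, 2), "T": Fraction(1, 2)})
```

Exact `Fraction` arithmetic is used so that the equality tests in the additivity check are not affected by floating-point rounding.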

**Example(s):**

- any Axiomatic Probability Function, such as:
- [math]\displaystyle{ P(\Omega,\{H,T\}) \rightarrow 1.0 }[/math], for a heads-or-tails event in a Coin Toss Experiment.
- [math]\displaystyle{ P(\Omega,\{H\}) \rightarrow 0.5 }[/math], for Heads Event [math]\displaystyle{ E }[/math] in a Fair Coin Toss Experiment.

- an Empirical Probability Function, such as:
- P(Fair Coin Toss Experiment, {H}) → 0.489
- P(Lightbulb Lifetime Experiment, [0,Inf)) → 1.0
- P(Lightbulb Lifetime Experiment, [0,300]) → 0.4653
- a Discrete Probability Function, such as: uniform probability functions, multinomial probability functions, and Poisson probability functions.
- a Continuous Probability Function, such as: Gaussian density functions.
- a Joint Probability Function.
- a Conditional Probability Function.
- …
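The empirical examples above are relative frequencies from repeated trials. The sketch below estimates such a value by simulation. It is illustrative only: the function names, the seed, and the trial count are assumptions, and the simulated estimate will not reproduce the specific figures listed above.

```python
import random


def empirical_probability(trial, event, n, seed=0):
    """Estimate P(event) as the fraction of n simulated trials landing in event."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if trial(rng) in event)
    return hits / n


def fair_coin(rng):
    """One toss of a simulated fair coin."""
    return rng.choice(["H", "T"])


# Relative frequency of heads over 10,000 simulated tosses.
estimate = empirical_probability(fair_coin, {"H"}, n=10_000)
```

With a fair coin the estimate should settle near 0.5 as `n` grows, while the event `{"H", "T"}` always has relative frequency exactly 1.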

**Counter-Example(s):**

**See:** Probability Space, Sample Space, Sigma-Field, Likelihood Function, Identical Distribution Relation, Probability-Generating Function, Event Space, Probability Theory, Experimental Probability, Certainty, Prior Probability.

## References

### 2022

- (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/Probability_measure Retrieved:2022-1-11.
- In mathematics, a **probability measure** is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as *countable additivity*. The difference between a probability measure and the more general notion of measure (which includes concepts like area or volume) is that a probability measure must assign value 1 to the entire probability space.

Intuitively, the additivity property says that the probability assigned to the union of two disjoint events by the measure should be the sum of the probabilities of the events, e.g. the value assigned to "1 or 2" in a throw of a die should be the sum of the values assigned to "1" and "2".

Probability measures have applications in diverse fields, from physics to finance and biology.


### 2015

- (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/probability_measure Retrieved:2015-6-4.
- In mathematics, a **probability measure** is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as *countable additivity*. The difference between a probability measure and the more general notion of measure (which includes concepts like area or volume) is that a probability measure must assign value 1 to the entire probability space.

Intuitively, the additivity property says that the probability assigned to the union of two disjoint events by the measure should be the sum of the probabilities of the events, e.g. the value assigned to "1 or 2" in a throw of a die should be the sum of the values assigned to "1" and "2".

Probability measures have applications in diverse fields, from physics to finance and biology.


### 2013

- http://en.wikipedia.org/wiki/Probabilistic_model#Formal_definition
- A statistical model is a collection of probability distribution functions or probability density functions (collectively referred to as *distributions* for brevity).


### 2009

- http://en.wiktionary.org/wiki/theoretical_probability
- 1. (mathematics) the probability that a certain outcome will occur, as determined through reasoning or calculation. *Given a die which is a regular octahedron of uniform density, and given that one and only one of its faces is painted black, then if the die is cast, the theoretical probability that the outcome will be the black face is 1/8.*

- http://www.econ.cam.ac.uk/faculty/weeks/Paper6/Exercise1.pdf
- A probability function is defined as a real-valued set function on the class of all subsets of the sample space Ω: the value associated with a subset A is denoted Pr(A).
The assignment of probabilities must satisfy the following three axioms:


### 2008

- (MIT Lecture Notes, 2008) ⇒ http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-436j-fundamentals-of-probability-fall-2008/lecture-notes/MIT6_436JF08_lec01.pdf
- A probabilistic model is defined formally by a triple [math]\displaystyle{ (\Omega, \mathcal{F}, \mathbb{P}) }[/math], called a probability space, comprised of the following three elements:

- (a) [math]\displaystyle{ \Omega }[/math] is the sample space, the set of possible outcomes of the experiment.
- (b) [math]\displaystyle{ \mathcal{F} }[/math] is a σ-field, a collection of subsets of [math]\displaystyle{ \Omega }[/math].
- (c) [math]\displaystyle{ \mathbb{P} }[/math] is a probability measure, a function that assigns a nonnegative probability to every set in the σ-field F.

- (...) Let [math]\displaystyle{ (\Omega, \mathcal{F}) }[/math] be a measurable space. A measure is a function [math]\displaystyle{ \mu \; :\; \mathcal{F} \rightarrow [0, +\infty] }[/math], which assigns a nonnegative extended real number [math]\displaystyle{ \mu(A) }[/math] to every set [math]\displaystyle{ A }[/math] in [math]\displaystyle{ \mathcal{F} }[/math], and which satisfies the following two conditions:
- (a) [math]\displaystyle{ \mu(\emptyset)=0 }[/math];
- (b) (Countable additivity) If [math]\displaystyle{ \{A_i\} }[/math] is a sequence of disjoint sets that belong to [math]\displaystyle{ \mathcal{F} }[/math], then [math]\displaystyle{ \mu(\cup_iA_i) = \sum^{\infty}_{i=1} \mu(A_i) }[/math].

- A probability measure is a measure [math]\displaystyle{ \mathbb{P} }[/math] with the additional property [math]\displaystyle{ \mathbb{P}(\Omega)= 1 }[/math]. In that case, the triple [math]\displaystyle{ (\Omega, \mathcal{F}, \mathbb{P}) }[/math] is called a probability space.
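
As a worked consequence of the definitions quoted above, the following short derivation uses only [math]\displaystyle{ \mu(\emptyset)=0 }[/math], countable additivity, and [math]\displaystyle{ \mathbb{P}(\Omega)=1 }[/math]. It is a sketch added for illustration rather than part of the cited notes, and the symbol [math]\displaystyle{ A^c }[/math] for the complement [math]\displaystyle{ \Omega \setminus A }[/math] is notation assumed here.

- Finite additivity: for disjoint [math]\displaystyle{ A, B \in \mathcal{F} }[/math], take [math]\displaystyle{ A_1 = A }[/math], [math]\displaystyle{ A_2 = B }[/math], and [math]\displaystyle{ A_i = \emptyset }[/math] for [math]\displaystyle{ i \geq 3 }[/math]. Then [math]\displaystyle{ \mathbb{P}(A \cup B) = \mathbb{P}(A) + \mathbb{P}(B) + \sum^{\infty}_{i=3} \mathbb{P}(\emptyset) = \mathbb{P}(A) + \mathbb{P}(B) }[/math].
- Complement rule: [math]\displaystyle{ A }[/math] and [math]\displaystyle{ A^c }[/math] are disjoint members of [math]\displaystyle{ \mathcal{F} }[/math] with union [math]\displaystyle{ \Omega }[/math], so [math]\displaystyle{ 1 = \mathbb{P}(\Omega) = \mathbb{P}(A) + \mathbb{P}(A^c) }[/math], which gives [math]\displaystyle{ \mathbb{P}(A^c) = 1 - \mathbb{P}(A) }[/math].
- Upper bound: since [math]\displaystyle{ \mathbb{P}(A^c) \geq 0 }[/math], the complement rule yields [math]\displaystyle{ \mathbb{P}(A) \leq 1 }[/math] for every [math]\displaystyle{ A \in \mathcal{F} }[/math].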

### 1991

- (Gurevich, 1991) ⇒ Yuri Gurevich. (1991). “Average Case Completeness.” In: Journal of Computer and System Sciences, 42(3). doi:10.1016/0022-0000(91)90007-R.
- QUOTE: We will consider only finite or infinite countable sample spaces, i.e., probability spaces. The function that assigns probabilities to sample points is the *probability function*. If *μ* is a **probability function** and [math]\displaystyle{ X }[/math] is a collection of sample points then the *μ*-probability of the event [math]\displaystyle{ X }[/math] will be denoted *μ*(*X*); in other words, [math]\displaystyle{ \mu(X) = \sum_{x \in X} \mu(x) }[/math]. The letters *μ* and *ν* are reserved for **probability functions**. If *μ*(*x*) is a **probability function** on an ordered sample space then [math]\displaystyle{ \mu^{*}(x) = \sum_{y \lt x} \mu(y) }[/math] is the corresponding *probability distribution*. A **probability function** *μ* is *positive* if every value of *μ* is positive. The *restriction* *μ*|*X* of a **probability function** *μ* to a set [math]\displaystyle{ X }[/math] of sample points with *μ*(*x*) > 0 is the probability function proportional to *μ* on [math]\displaystyle{ X }[/math] and zero outside *X*.
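
The three constructions in this quote (event probability, the probability distribution on an ordered sample space, and the restriction to a set) can be sketched for a finite sample space as follows. This is an illustration under stated assumptions, not Gurevich's own formulation: the function names are hypothetical, the sum defining the distribution is taken over points strictly before *x* as the quoted formula is read here, and the restriction assumes the set as a whole has positive probability.

```python
from fractions import Fraction


def event_probability(mu, X):
    """mu(X): the sum of mu(x) over the sample points x in X."""
    return sum((mu[x] for x in X), Fraction(0))


def distribution(mu, order):
    """mu*(x): the sum of mu(y) over points y strictly before x in `order`."""
    result, running = {}, Fraction(0)
    for x in order:
        result[x] = running  # mass accumulated before reaching x
        running += mu[x]
    return result


def restriction(mu, X):
    """mu|X: proportional to mu on X, zero outside X (assumes mu(X) > 0)."""
    total = event_probability(mu, X)
    if total == 0:
        raise ValueError("cannot restrict to a set of probability zero")
    return {x: (mu[x] / total if x in X else Fraction(0)) for x in mu}


# Illustrative probability function: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
one_or_two = event_probability(die, {1, 2})
cumulative = distribution(die, range(1, 7))
even_only = restriction(die, {2, 4, 6})
```

Here `one_or_two` is the sum of the values assigned to faces 1 and 2, and `even_only` rescales the even faces so that they again sum to 1.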


### 1986

- (Larsen & Marx, 1986) ⇒ Richard J. Larsen, and Morris L. Marx. (1986). “An Introduction to Mathematical Statistics and Its Applications, 2nd edition.” Prentice Hall.
- QUOTE: Consider a sample space, [math]\displaystyle{ S }[/math], and any event, [math]\displaystyle{ A }[/math], defined on [math]\displaystyle{ S }[/math]. If our experiment were performed *one* time, either [math]\displaystyle{ A }[/math] or [math]\displaystyle{ A^C }[/math] would be the outcome. If it were performed [math]\displaystyle{ n }[/math] times, the resulting set of sample outcomes would be members of [math]\displaystyle{ A }[/math] on [math]\displaystyle{ m }[/math] occasions, [math]\displaystyle{ m }[/math] being some integer between [math]\displaystyle{ 1 }[/math] and [math]\displaystyle{ n }[/math], inclusive. Hypothetically, we could continue this process an infinite number of times. As [math]\displaystyle{ n }[/math] gets large, the ratio *m/n* will fluctuate less and less (we will make that statement more precise a little later). The number that *m/n* converges to is called the *empirical probability* of [math]\displaystyle{ A }[/math]: that is, [math]\displaystyle{ P(A) = \lim_{n \to \infty}(m/n) }[/math]. … the very act of repeating an experiment under identical conditions an infinite number of times is physically impossible. And left unanswered is the question of how large [math]\displaystyle{ n }[/math] must be to give a good approximation for [math]\displaystyle{ \lim_{n \to \infty}(m/n) }[/math].

The next attempt at defining probability was entirely a product of the twentieth century. Modern mathematicians have shown a keen interest in developing subjects axiomatically. It was to be expected, then, that probability would come under such scrutiny … The major breakthrough on this front came in 1933 when Andrei Kolmogorov published *Grundbegriffe der Wahrscheinlichkeitsrechnung* (*Foundations of the Theory of Probability*). Kolmogorov's work was a masterpiece of mathematical elegance - it reduced the behavior of the probability function to a set of just three or four simple postulates, three if the sample space is limited to a finite number of outcomes and four if [math]\displaystyle{ S }[/math] is infinite.


### 1933

- (Kolmogorov, 1933) ⇒ Andrei Kolmogorov. (1933). “Grundbegriffe der Wahrscheinlichkeitsrechnung (*Foundations of the Theory of Probability*).” American Mathematical Society. ISBN:0828400237
- QUOTE: … Every distribution function [math]\displaystyle{ F_{\mu_1 \mu_2 \ldots \mu_n} }[/math], satisfying the general conditions of Chap. II, Sec 3, III and also conditions (2) and (3). Every distribution function [math]\displaystyle{ F_{\mu_1 \mu_2 \ldots \mu_n} }[/math] defines uniquely a corresponding probability function [math]\displaystyle{ \text{P}_{\mu_1 \mu_2 \ldots \mu_n} }[/math] for all Borel sets of [math]\displaystyle{ R^n }[/math].