# Multiclass Cross-Entropy Measure

(Redirected from relative or cross entropy)

A Multiclass Cross-Entropy Measure is a dispersion measure that quantifies the average number of bits needed to identify an event drawn from a set of possibilities, when the coding scheme used is optimized for an estimated distribution rather than the true one.

## References

### 2017

• (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/cross_entropy Retrieved:2017-6-7.
• In information theory, the cross entropy between two probability distributions $p$ and $q$ over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set, if a coding scheme is used that is optimized for an "unnatural" probability distribution $q$, rather than the "true" distribution $p$.

The cross entropy for the distributions $p$ and $q$ over a given set is defined as follows: $H(p, q) = \operatorname{E}_p[-\log q] = H(p) + D_{\mathrm{KL}}(p \| q)$, where $H(p)$ is the entropy of $p$, and $D_{\mathrm{KL}}(p \| q)$ is the Kullback–Leibler divergence of $q$ from $p$ (also known as the relative entropy of p with respect to q — note the reversal of emphasis).
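The decomposition $H(p, q) = H(p) + D_{\mathrm{KL}}(p \| q)$ can be checked numerically; the following minimal Python sketch uses two illustrative discrete distributions (the values of `p` and `q` are arbitrary examples, not from the source):

```python
import math

# Two illustrative discrete distributions over the same 3 events.
p = [0.5, 0.25, 0.25]   # "true" distribution
q = [0.4, 0.4, 0.2]     # "unnatural" coding distribution

# Cross entropy H(p, q) = E_p[-log q], in bits (log base 2).
cross_entropy = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))

# Entropy H(p) and Kullback-Leibler divergence D_KL(p || q).
entropy = -sum(pi * math.log2(pi) for pi in p)
kl = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))

# The identity H(p, q) = H(p) + D_KL(p || q) holds up to float error.
print(abs(cross_entropy - (entropy + kl)) < 1e-12)
```

Since $D_{\mathrm{KL}}(p \| q) \ge 0$, the sketch also illustrates that coding with the "unnatural" $q$ never beats the entropy of $p$.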

For discrete $p$ and $q$ this means: $H(p, q) = -\sum_x p(x)\, \log q(x)$. The situation for continuous distributions is analogous. We have to assume that $p$ and $q$ are absolutely continuous with respect to some reference measure $r$ (usually $r$ is a Lebesgue measure on a Borel σ-algebra). Let $P$ and $Q$ be probability density functions of $p$ and $q$ with respect to $r$. Then: $H(p, q) = -\int_X P(x)\, \log Q(x)\, dr(x) = \operatorname{E}_p[-\log Q]$. NB: The notation $H(p, q)$ is also used for a different concept, the joint entropy of $p$ and $q$.
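In the discrete case, the sum above is the familiar multiclass cross-entropy loss used in classification. A minimal Python sketch, assuming a one-hot "true" distribution and an illustrative predicted distribution (both made up for this example):

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) log q(x); terms with p(x) = 0 contribute 0."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

# One-hot "true" distribution p for class 1 of 3, and a model's predicted q.
p = [0.0, 1.0, 0.0]
q = [0.1, 0.7, 0.2]

# With a one-hot p, the sum collapses to -log q(true class).
loss = cross_entropy(p, q)
print(abs(loss - (-math.log(0.7))) < 1e-12)
```

The `if px > 0` guard reflects the usual convention $0 \log 0 = 0$, which also avoids evaluating $\log q(x)$ where $p(x) = 0$.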