Multiclass Cross-Entropy Measure

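A Multiclass Cross-Entropy Measure is a dispersion measure that quantifies the average number of bits needed to identify an event drawn from a set of possibilities when the coding scheme is optimized for an assumed probability distribution [math]\displaystyle{ q }[/math] rather than the true distribution [math]\displaystyle{ p }[/math].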



  • (Wikipedia, 2017) ⇒ Retrieved:2017-6-7.
    • In information theory, the cross entropy between two probability distributions [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set, if a coding scheme is used that is optimized for an "unnatural" probability distribution [math]\displaystyle{ q }[/math], rather than the "true" distribution [math]\displaystyle{ p }[/math].

      The cross entropy for the distributions [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] over a given set is defined as follows:

      [math]\displaystyle{ H(p, q) = \operatorname{E}_p[-\log q] = H(p) + D_{\mathrm{KL}}(p \| q), }[/math]

      where [math]\displaystyle{ H(p) }[/math] is the entropy of [math]\displaystyle{ p }[/math], and [math]\displaystyle{ D_{\mathrm{KL}}(p \| q) }[/math] is the Kullback–Leibler divergence of [math]\displaystyle{ q }[/math] from [math]\displaystyle{ p }[/math] (also known as the relative entropy of [math]\displaystyle{ p }[/math] with respect to [math]\displaystyle{ q }[/math]; note the reversal of emphasis).

      For discrete [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] this means

      [math]\displaystyle{ H(p, q) = -\sum_x p(x)\, \log q(x). }[/math]

      The situation for continuous distributions is analogous. We have to assume that [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] are absolutely continuous with respect to some reference measure [math]\displaystyle{ r }[/math] (usually [math]\displaystyle{ r }[/math] is a Lebesgue measure on a Borel σ-algebra). Let [math]\displaystyle{ P }[/math] and [math]\displaystyle{ Q }[/math] be probability density functions of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] with respect to [math]\displaystyle{ r }[/math]. Then

      [math]\displaystyle{ -\int_X P(x)\, \log Q(x)\, dr(x) = \operatorname{E}_p[-\log Q]. }[/math]

      NB: The notation [math]\displaystyle{ H(p,q) }[/math] is also used for a different concept, the joint entropy of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math].
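
The discrete formula and the decomposition into entropy plus Kullback–Leibler divergence can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names (`entropy`, `cross_entropy`, `kl_divergence`) and the two example distributions are illustrative assumptions, and base-2 logarithms are assumed so that results are in bits, matching the "number of bits" wording above.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in bits; terms with p(x) = 0 contribute 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def cross_entropy(p, q):
    """Cross entropy H(p, q) = -sum_x p(x) log2 q(x), in bits."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # convention: 0 * log q(x) = 0
        if qx == 0:
            return math.inf   # p assigns mass to an event q rules out
        total -= px * math.log2(qx)
    return total

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q), in bits."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue
        if qx == 0:
            return math.inf
        total += px * math.log2(px / qx)
    return total

# Hypothetical three-class example: p is the "true" distribution,
# q is the distribution the code was optimized for.
p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]

h_p = entropy(p)            # 1.5 bits
h_pq = cross_entropy(p, q)  # 1.75 bits
d_kl = kl_divergence(p, q)  # 0.25 bits

print(h_pq, h_p + d_kl)     # prints "1.75 1.75"
```

In this example, coding with [math]\displaystyle{ q }[/math] instead of [math]\displaystyle{ p }[/math] costs 0.25 extra bits per event on average, which is exactly the Kullback–Leibler term. Note also that [math]\displaystyle{ H(p, p) = H(p) }[/math], so the cross entropy is never smaller than the entropy of the true distribution.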