2006 InformationTheoreticLearning

From GM-RKB

Subject Headings:

Notes

Cited By

Quotes

Glossary

Information theoretic learning (ITL): A theoretical framework and associated set of algorithms to implement adaptive information filtering based on information theory. ITL synergistically integrates the general framework of information theory into the design of new cost functions for adaptive systems.

Nonparametric entropy estimation: A method for measuring the degree of randomness in a system that does not rely on the assumptions of any particular distribution type (e.g., Normal, Poisson, or Bernoulli).

Information potential: A method for entropy estimation which combines alternative entropy measures, such as Renyi’s quadratic entropy, with the Parzen window method of estimating the underlying probability density function from a set of data samples. In this method, data samples are treated as physical particles and the entropy is then related to the potential energy of these “information particles.”
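As a minimal sketch of this estimator (not code from the source; the function names and 1-D setting are illustrative), the information potential under a Gaussian Parzen window, and the Renyi quadratic entropy it induces, can be written as:

```python
import numpy as np

def information_potential(x, sigma=1.0):
    """Parzen/Gaussian estimate of the information potential
    V(X) = (1/N^2) * sum_ij G(x_i - x_j; sigma*sqrt(2)) for a 1-D sample x.
    Convolving two Gaussian kernels of size sigma yields one of size
    sigma*sqrt(2), which is why the pairwise kernel is wider."""
    x = np.asarray(x, dtype=float)
    d = x[:, None] - x[None, :]          # pairwise "particle" distances
    s2 = 2.0 * sigma**2                  # variance of the sigma*sqrt(2) kernel
    g = np.exp(-d**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    return g.mean()                      # (1/N^2) * sum over all pairs

def renyi_quadratic_entropy(x, sigma=1.0):
    """Renyi's quadratic entropy H2(X) = -log V(X)."""
    return -np.log(information_potential(x, sigma))
```

Tightly clustered samples yield a large potential (low entropy); spread-out samples yield a small potential (high entropy), matching the physical-particle analogy.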

Information force: The derivative of the potential energy (information potential) of a data set with respect to each sample. These are forces, driven by information, that move the data samples in the space of their interactions. They replace the injected error in the backpropagation framework.
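Differentiating the Gaussian-kernel information potential with respect to a sample gives the force acting on that "information particle". This is my own derivation for the 1-D Gaussian case, not code from the source:

```python
import numpy as np

def information_forces(x, sigma=1.0):
    """Force on each sample: F_k = dV/dx_k for
    V = (1/N^2) * sum_ij G(x_i - x_j; sigma*sqrt(2)), Gaussian G.
    Ascending these forces pulls samples together, lowering entropy."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x[:, None] - x[None, :]          # d[k, j] = x_k - x_j
    s2 = 2.0 * sigma**2
    g = np.exp(-d**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    # x_k appears in both sums of V, hence the factor of 2
    return -(2.0 / n**2) * np.sum(d / s2 * g, axis=1)
```

By symmetry of the pairwise interactions the forces sum to zero, and the force on an outlying sample points back toward the bulk of the data.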

Stochastic information gradient: A faster stochastic algorithm for adapting linear or nonlinear systems. It relies on temporal differences between samples, which arise from the pairwise nature of kernel-based entropy estimation.
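A minimal sketch of the idea (the 2-tap FIR identification problem, window length L, kernel size, and step size below are all illustrative choices, not from the source): at each time step the entropy gradient is estimated only from kernel differences between the current error and the L most recent errors, costing O(L) per update instead of O(N^2) for the full batch estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, L, sigma, mu = 4000, 8, 1.0, 0.3
s2 = 2.0 * sigma**2

# toy unknown 2-tap FIR system to identify (hypothetical setup)
w_true = np.array([1.0, -0.5])
u = rng.normal(size=n)
X = np.stack([u, np.roll(u, 1)], axis=1)      # regressor [u_k, u_{k-1}]
d = X @ w_true + 0.05 * rng.normal(size=n)

w = np.zeros(2)
errors = np.zeros(n)
for k in range(n):
    errors[k] = d[k] - X[k] @ w
    if k < L:
        continue
    de = errors[k] - errors[k - L:k]          # temporal error differences
    dx = X[k] - X[k - L:k]                    # matching regressor differences
    g = np.exp(-de**2 / (2.0 * s2))           # kernel values (normalization absorbed in mu)
    # stochastic ascent on the information potential of the errors
    # (maximizing the potential minimizes Renyi's quadratic error entropy)
    w += (mu / L) * (g * de / s2) @ dx
```

The older errors in the window are reused as stored rather than recomputed with the current weights; that approximation is the "stochastic" part of the gradient.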

Kernel annealing: A method to avoid local minima in non-convex performance surfaces. Kernel annealing is analogous to the method of convolution smoothing in global optimization. Hence, as long as a “proper” annealing schedule is chosen (i.e. slow enough annealing rate), kernel annealing provides a way to avoid local minima and reach the global minimum.
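As a sketch of kernel annealing on a toy problem (everything here is an illustrative assumption: a single weight trained by minimum error entropy, a geometric annealing schedule, and sigma-dependent constants absorbed into the step size), the kernel starts wide, which smooths the performance surface, and shrinks slowly toward the size used for the final estimate:

```python
import numpy as np

def mee_gradient(w, x, y, sigma):
    """Gradient (up to a positive, sigma-dependent constant) of the
    information potential of the errors e = y - w*x with respect to w,
    using a Gaussian kernel of size sigma*sqrt(2).  Ascending it
    minimizes Renyi's quadratic error entropy."""
    e = y - w * x
    d = e[:, None] - e[None, :]              # pairwise error differences
    g = np.exp(-d**2 / (4.0 * sigma**2))     # unnormalized Gaussian kernel
    n = len(x)
    # scaling by sigma^2 (folded into g's missing 1/s2 factor) keeps step
    # sizes comparable as sigma shrinks -- a practical choice, not canonical
    return (2.0 / n**2) * np.sum(d * g * x[:, None])

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 0.1 * rng.normal(size=200)     # toy target: true weight is 2.0

w, lr = 0.0, 0.4
for sigma in np.geomspace(5.0, 0.5, 50):     # slow geometric annealing schedule
    w += lr * mee_gradient(w, x, y, sigma)
```

With a slow enough schedule the wide-kernel phase steers the weight into the basin of the global minimum before the narrow kernel sharpens the cost.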

Euclidean and Cauchy-Schwarz probability density function distances: Methods that use only quadratic terms to measure the divergence ("distance") between two probability density functions. These measures are used in algorithms based on information potential and information forces.
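A sketch of both distances in terms of the quadratic ("cross-potential") terms, estimated with Gaussian Parzen windows (the 1-D setting and function names are illustrative):

```python
import numpy as np

def cross_potential(a, b, sigma=1.0):
    """Parzen estimate of the quadratic term integral(p_a * p_b):
    (1/(N*M)) * sum_ij G(a_i - b_j; sigma*sqrt(2))."""
    d = np.asarray(a, dtype=float)[:, None] - np.asarray(b, dtype=float)[None, :]
    s2 = 2.0 * sigma**2
    return np.mean(np.exp(-d**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2))

def euclidean_pdf_distance(a, b, sigma=1.0):
    """D_ED = int p^2 - 2 int pq + int q^2 = int (p - q)^2 >= 0."""
    return (cross_potential(a, a, sigma) - 2.0 * cross_potential(a, b, sigma)
            + cross_potential(b, b, sigma))

def cauchy_schwarz_divergence(a, b, sigma=1.0):
    """D_CS = -log( (int pq)^2 / (int p^2 * int q^2) ); nonnegative by the
    Cauchy-Schwarz inequality, zero when the two densities coincide."""
    vab = cross_potential(a, b, sigma)
    return -np.log(vab**2 /
                   (cross_potential(a, a, sigma) * cross_potential(b, b, sigma)))
```

Because every term is a pairwise kernel sum, both distances inherit the information-potential machinery directly, which is why they pair naturally with information forces.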

MeRMaId algorithm (Minimum Renyi's Mutual Information algorithm): An algorithm used for blind source separation based on finding a rotation matrix that minimizes Renyi's quadratic mutual information of the input signal, which is first spatially decorrelated by a whitening transformation. MeRMaId differs from the other methods primarily in how it estimates the information quantity: it performs better with small data windows, i.e., its estimation error is smaller in the small-sample case.
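A much-simplified sketch of the pipeline for two sources (whitening followed by a grid search over the rotation angle that minimizes the sum of marginal Renyi quadratic entropies; this stands in for the algorithm's gradient-based minimization of Renyi's quadratic mutual information, and every source, mixing matrix, kernel size, and grid below is an illustrative assumption). For a whitened signal the joint quadratic entropy is rotation-invariant, so minimizing the sum of marginal entropies minimizes the mutual information between the outputs:

```python
import numpy as np

def h2(x, sigma=0.3):
    """Renyi's quadratic entropy via a Gaussian Parzen window."""
    d = x[:, None] - x[None, :]
    s2 = 2.0 * sigma**2
    v = np.mean(np.exp(-d**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2))
    return -np.log(v)

rng = np.random.default_rng(2)
t = np.arange(500)
S = np.stack([np.sin(0.1 * t), rng.uniform(-1.0, 1.0, 500)])  # two toy sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                        # mixing matrix
X = A @ S

# 1) spatial decorrelation (whitening) of the mixtures
evals, evecs = np.linalg.eigh(np.cov(X))
Z = (evecs / np.sqrt(evals)).T @ X

# 2) whitening leaves only a rotation ambiguity; pick the angle that
#    minimizes the sum of marginal entropies of the rotated outputs
def rotate(theta, Z):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]]) @ Z

angles = np.linspace(0.0, np.pi / 2, 90, endpoint=False)
best = min(angles, key=lambda th: sum(h2(y) for y in rotate(th, Z)))
Y = rotate(best, Z)   # separated sources, up to permutation and sign
```

A 1-D angle grid only works for two sources; the actual algorithm parameterizes the rotation matrix and descends the Renyi mutual-information estimate directly.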

