Mahalanobis Distance Measure

From GM-RKB

A Mahalanobis Distance Measure is a unitless, scale-invariant distance measure between a point P and a distribution D that accounts for the covariance structure of D.
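
As a minimal sketch of this definition, the standard formula D_M(x) = sqrt((x - mu)^T S^{-1} (x - mu)), quoted from Wikipedia under References, might be computed in plain Python as shown here. The function names (mean_vector, covariance_matrix, solve, mahalanobis) are hypothetical and chosen only for illustration. The sketch assumes that the distribution D is summarized by a sample mean and an unbiased sample covariance matrix, and a small Gaussian elimination stands in for a linear-algebra library.

```python
import math


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(row[j] for row in samples) / n for j in range(dim)]


def covariance_matrix(samples, mu):
    """Unbiased sample covariance matrix (divides by n - 1)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    dim = len(mu)
    return [
        [
            sum((row[i] - mu[i]) * (row[j] - mu[j]) for row in samples) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]  # augmented copy, inputs untouched
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def mahalanobis(x, mu, cov):
    """Distance of point x from a distribution with mean mu and covariance cov."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)  # z = S^{-1} (x - mu), without forming the inverse
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


if __name__ == "__main__":
    # Hypothetical sample standing in for the distribution D.
    samples = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
    mu = mean_vector(samples)
    cov = covariance_matrix(samples, mu)
    print(mahalanobis([3.0, 1.0], mu, cov))
```

Solving the linear system instead of inverting S explicitly is a design choice made here for numerical stability; it gives the same value as the formula whenever S is invertible.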



References

2016

2012

  • http://en.wikipedia.org/wiki/Mahalanobis_distance#Definition
    • QUOTE: Formally, the Mahalanobis distance of a multivariate vector [math]\displaystyle{ x = (x_1, x_2, x_3, \dots, x_N )^T }[/math] from a group of values with mean [math]\displaystyle{ \mu = (\mu_1, \mu_2, \mu_3, \dots , \mu_N )^T }[/math] and covariance matrix [math]\displaystyle{ S }[/math] is defined as:

      [math]\displaystyle{ D_M(x) = \sqrt{(x - \mu)^T S^{-1} (x-\mu)}. }[/math] [1]

      Mahalanobis distance (or "generalized squared interpoint distance" for its squared value[2]) can also be defined as a dissimilarity measure between two random vectors [math]\displaystyle{ \vec{x} }[/math] and [math]\displaystyle{ \vec{y} }[/math] of the same distribution with the covariance matrix [math]\displaystyle{ S }[/math]:

      [math]\displaystyle{ d(\vec{x},\vec{y})=\sqrt{(\vec{x}-\vec{y})^T S^{-1} (\vec{x}-\vec{y})}. }[/math]

      If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, then the resulting distance measure is called the normalized Euclidean distance:

      [math]\displaystyle{ d(\vec{x},\vec{y})= \sqrt{\sum_{i=1}^N {(x_i - y_i)^2 \over s_{i}^2}}, }[/math]

      where [math]\displaystyle{ s_{i} }[/math] is the standard deviation of the [math]\displaystyle{ x_i }[/math] and [math]\displaystyle{ y_i }[/math] over the sample set.

  1. De Maesschalck, Roy; Jouan-Rimbaud, Delphine; and Massart, Désiré L. (2000); The Mahalanobis distance, Chemometrics and Intelligent Laboratory Systems 50:1–18
  2. Gnanadesikan, Ramanathan; and Kettenring, John R. (1972); Robust estimates, residuals, and outlier detection with multiresponse data, Biometrics 28:81–124
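
The two special cases in the quoted passage can be checked numerically. The following is an illustrative sketch only: the function names (euclidean, normalized_euclidean, mahalanobis_2d) are hypothetical, and the general formula is restricted to two dimensions so that the 2x2 inverse of S can be written out explicitly. The vectors and standard deviations in the usage section are made-up values.

```python
import math


def euclidean(x, y):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def normalized_euclidean(x, y, s):
    """Normalized Euclidean distance; s[i] is the standard deviation of component i."""
    return math.sqrt(sum(((a - b) / si) ** 2 for a, b, si in zip(x, y, s)))


def mahalanobis_2d(x, y, cov):
    """General formula d(x, y) = sqrt((x - y)^T S^{-1} (x - y)) for 2-D vectors."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = [[d / det, -b / det], [-c / det, a / det]]  # explicit 2x2 inverse of S
    diff = [x[0] - y[0], x[1] - y[1]]
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)


if __name__ == "__main__":
    x, y = [3.0, 4.0], [0.0, 0.0]

    # Identity covariance: the distance reduces to the Euclidean distance.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    assert math.isclose(mahalanobis_2d(x, y, identity), euclidean(x, y))

    # Diagonal covariance with variances s_i^2: normalized Euclidean distance.
    s = [3.0, 2.0]
    diagonal = [[s[0] ** 2, 0.0], [0.0, s[1] ** 2]]
    assert math.isclose(mahalanobis_2d(x, y, diagonal), normalized_euclidean(x, y, s))
```

With a non-diagonal S the two values generally differ, because the off-diagonal covariance terms also enter the quadratic form.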

2008