# Mahalanobis distance

In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on the correlations between variables, by which different patterns can be identified and analysed. It is a useful way of determining the similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set.

Formally, the Mahalanobis distance of a multivariate vector [itex]x = ( x_1, x_2, x_3, \dots, x_p )[/itex] from a group of values with mean [itex]\mu = ( \mu_1, \mu_2, \mu_3, \dots , \mu_p )[/itex] and covariance matrix [itex]\Sigma[/itex] is defined as:

[itex]D_M(x) = \sqrt{(x - \mu)' \Sigma^{-1} (x-\mu)}.\, [/itex]
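The definition above can be sketched in a few lines of Python. This is only an illustration using the standard library: the names `solve` and `mahalanobis_distance` and the example mean and covariance are assumptions, not part of the original definition. It solves a linear system rather than inverting the covariance matrix explicitly; in practice a linear-algebra library would normally do this step.

```python
import math

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting (hypothetical helper)."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def mahalanobis_distance(x, mu, sigma):
    """Distance of x from a group with mean mu and covariance matrix sigma."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(sigma, diff)  # z = sigma^{-1} (x - mu)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))

# Illustrative values (assumed): variance 4 along the first axis, 1 along the second
mu = [0.0, 0.0]
sigma = [[4.0, 0.0], [0.0, 1.0]]
d_wide = mahalanobis_distance([2.0, 0.0], mu, sigma)    # along the high-variance axis
d_narrow = mahalanobis_distance([0.0, 2.0], mu, sigma)  # along the low-variance axis
```

Both points lie at Euclidean distance 2 from the mean, but the point along the high-variance axis has the smaller Mahalanobis distance, because a deviation of that size is less unusual in that direction.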

Mahalanobis distance can also be defined as a dissimilarity measure between two random vectors [itex] \vec{x}[/itex] and [itex] \vec{y}[/itex] of the same distribution with covariance matrix [itex]\Sigma[/itex]:

[itex] d(\vec{x},\vec{y})=\sqrt{(\vec{x}-\vec{y})'\Sigma^{-1} (\vec{x}-\vec{y})}.\, [/itex]

If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, the resulting distance measure is called the normalized Euclidean distance:

[itex] d(\vec{x},\vec{y})= \sqrt{\sum_{i=1}^p {(x_i - y_i)^2 \over \sigma_i^2}}, [/itex]

where [itex]\sigma_i[/itex] is the standard deviation of the [itex] x_i [/itex] over the sample set.
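As a rough illustration of the diagonal case, the following Python sketch scales each coordinate difference by the standard deviation of that variable. The function name `normalized_euclidean` and the sample data are hypothetical, and using the sample standard deviation (`statistics.stdev`) rather than the population one is an assumption.

```python
import math
import statistics

def normalized_euclidean(x, y, sigmas):
    """Euclidean distance after dividing each coordinate difference by its sigma."""
    return math.sqrt(sum(((xi - yi) / s) ** 2 for xi, yi, s in zip(x, y, sigmas)))

# Hypothetical sample set: three observations of two variables on very different scales
samples = [[1.0, 10.0], [2.0, 30.0], [3.0, 50.0]]

# Per-variable standard deviation over the sample set (sample standard deviation; assumption)
sigmas = [statistics.stdev(column) for column in zip(*samples)]

d = normalized_euclidean(samples[0], samples[2], sigmas)
```

Here the second variable varies twenty times more than the first, so without the scaling it would dominate the distance. After scaling, both variables contribute equally to the distance between the first and last observations.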
