Bhattacharyya distance

From Wikipedia, the free encyclopedia

In statistics, the Bhattacharyya distance measures the similarity of two discrete probability distributions. It is usually used to measure the separability of classes in classification.

For discrete probability distributions p and q over the same domain X, the Bhattacharyya coefficient is defined as:

BC(p,q) = \sum_{x\in X} \sqrt{p(x) q(x)}.

The Bhattacharyya distance is then D_B(p,q) = -\ln BC(p,q).
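As a minimal sketch of the sum above, the following Python computes the coefficient for two distributions given as aligned lists of probabilities. The function names and the list representation are illustrative assumptions, not part of the article. The distance is taken here as the negative natural logarithm of the coefficient, which is the usual convention.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Return the sum over x of sqrt(p(x) * q(x)).

    p and q are assumed to be sequences of probabilities indexed by the
    same domain, so that p[i] and q[i] refer to the same point x.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same domain")
    return sum(math.sqrt(px * qx) for px, qx in zip(p, q))


def bhattacharyya_distance(p, q):
    """Return -ln(BC(p, q)); infinite when the supports do not overlap."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)


# Two distributions over a three-point domain that overlap on one point.
p = [0.5, 0.5, 0.0]
q = [0.5, 0.0, 0.5]
bc = bhattacharyya_coefficient(p, q)   # only the first point contributes
db = bhattacharyya_distance(p, q)
```

The inputs are not checked for summing to 1; a caller working with raw histograms would need to normalize them first.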

The Bhattacharyya coefficient is a divergence-type measure. It can be seen as the scalar product of two vectors, one for p and one for q, whose components are the square roots of the probabilities of the points x \in X. (Since p and q are probability distributions, both vectors have length 1.) The coefficient thereby lends itself to a geometric interpretation: it is the cosine of the angle enclosed between these two vectors. Because all components of both vectors are nonnegative, the coefficient always lies between 0 and 1, with 1 indicating identical distributions and 0 indicating distributions with no overlapping support.
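The geometric reading can be checked numerically. The sketch below, with hypothetical helper names, builds the square-root vectors, confirms that they have unit length, and recovers the angle from their scalar product. Since square roots of probabilities are nonnegative, the cosine falls in the range 0 to 1 and the angle in the range 0 to π/2.

```python
import math


def sqrt_vector(p):
    """Map a distribution to the vector of square roots of its probabilities."""
    return [math.sqrt(px) for px in p]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


def angle_between(p, q):
    """Angle in radians between the square-root vectors of p and q."""
    u, v = sqrt_vector(p), sqrt_vector(q)
    cos_theta = sum(a * b for a, b in zip(u, v))  # the Bhattacharyya coefficient
    # Clamp to [0, 1] to guard against floating-point rounding before acos.
    return math.acos(min(1.0, max(0.0, cos_theta)))


p = [0.2, 0.3, 0.5]
length_p = norm(sqrt_vector(p))                       # sqrt(0.2 + 0.3 + 0.5)
same = angle_between(p, p)                            # identical distributions
disjoint = angle_between([1.0, 0.0], [0.0, 1.0])      # no shared support
```

Identical distributions give an angle of approximately zero, and distributions with disjoint support give a right angle.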

References

  • A. Bhattacharyya, "On a measure of divergence between two statistical populations defined by probability distributions", Bull. Calcutta Math. Soc., vol. 35, pp. 99–109, 1943.
  • T. Kailath, "The Divergence and Bhattacharyya Distance Measures in Signal Selection", IEEE Trans. on Comm. Technology, vol. 15, pp. 52–60, Feb. 1967.
  • A. Djouadi, O. Snorrason and F. Garber, "The quality of Training-Sample estimates of the Bhattacharyya coefficient", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, pp. 92–97, 1990.

For a short list of properties, see: http://www.mtm.ufsc.br/~taneja/book/node20.html
