Total variation distance of probability measures
In probability theory, the total variation distance is a distance measure for probability distributions. It is an example of a statistical distance metric, and is sometimes just called "the" statistical distance.
Definition
The total variation distance between two probability measures $P$ and $Q$ on a sigma-algebra $\mathcal{F}$ of subsets of the sample space $\Omega$ is defined via[1]
$$\delta(P, Q) = \sup_{A \in \mathcal{F}} \left| P(A) - Q(A) \right|.$$
Informally, this is the largest possible difference between the probabilities that the two probability distributions can assign to the same event.
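On a very small finite sample space, the "largest possible difference over events" can be checked directly by enumerating every event A and taking the largest value of |P(A) - Q(A)|. The following is a minimal sketch under that assumption; the function name `tv_by_events` and the example distributions `p` and `q` are illustrative and not taken from the cited sources.

```python
from itertools import combinations

def tv_by_events(p, q):
    """Brute-force the largest |P(A) - Q(A)| over all events A.

    p and q are dicts mapping outcomes to probabilities (illustrative inputs).
    The loop visits all 2^n subsets, so it is only usable for tiny spaces.
    """
    outcomes = sorted(set(p) | set(q))
    best = 0.0
    for size in range(len(outcomes) + 1):
        for event in combinations(outcomes, size):
            p_a = sum(p.get(x, 0.0) for x in event)
            q_a = sum(q.get(x, 0.0) for x in event)
            best = max(best, abs(p_a - q_a))
    return best

# Hypothetical example distributions on the three-point space {a, b, c}.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}
tv = tv_by_events(p, q)  # the maximum is attained at the event {"a"}
```

In this example the event {a} has probability 0.5 under `p` and 0.2 under `q`, and no other event gives a larger gap.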
Properties
Relation to other distances
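The total variation distance is related to the Kullback–Leibler divergence by Pinsker's inequality:
$$\delta(P, Q) \le \sqrt{\tfrac{1}{2} D_{\mathrm{KL}}(P \parallel Q)},$$
where the divergence is computed with the natural logarithm.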
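Pinsker's inequality states that the total variation distance is at most the square root of half the Kullback–Leibler divergence (in nats). A quick numerical check on a finite space might look like the sketch below. The helper names and the example distributions are assumptions made for illustration, and the code assumes `q` assigns positive probability wherever `p` does.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in nats on a finite space.

    Assumes q[x] > 0 whenever p[x] > 0; terms with p[x] == 0 contribute 0.
    """
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)

def total_variation(p, q):
    """Total variation distance as the largest gap over events.

    On a finite space the maximizing event is {x : p(x) > q(x)}.
    """
    outcomes = set(p) | set(q)
    return sum(max(p.get(x, 0.0) - q.get(x, 0.0), 0.0) for x in outcomes)

# Hypothetical example distributions.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}

tv = total_variation(p, q)
pinsker_bound = math.sqrt(0.5 * kl_divergence(p, q))
holds = tv <= pinsker_bound  # Pinsker's inequality predicts True
```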
On a finite probability space, the total variation distance is related to the L1 norm by the identity:[2]
$$\delta(P, Q) = \frac{1}{2} \|P - Q\|_1 = \frac{1}{2} \sum_{x} \left| P(x) - Q(x) \right|.$$
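On a finite space this identity says the distance is half the sum of |P(x) - Q(x)| over all points x, which is how it is usually computed in practice. The sketch below compares that half-L1 value with the gap on the event {x : P(x) > Q(x)}, where the largest difference over events is attained. Function names and example data are illustrative assumptions.

```python
def tv_half_l1(p, q):
    """Half the L1 distance between two finite distributions given as dicts."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in outcomes)

def tv_best_event(p, q):
    """P(A) - Q(A) for the event A = {x : p(x) > q(x)}."""
    outcomes = set(p) | set(q)
    return sum(max(p.get(x, 0.0) - q.get(x, 0.0), 0.0) for x in outcomes)

# Hypothetical example distributions.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}

half_l1 = tv_half_l1(p, q)
best_event_gap = tv_best_event(p, q)
```

The two values agree because the positive and negative parts of P - Q carry the same total mass, so each equals half of the L1 norm.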
Connection to transportation theory
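The total variation distance arises as the optimal transportation cost when the cost function is $c(x, y) = \mathbf{1}_{x \neq y}$, that is,
$$\delta(P, Q) = \inf_{\pi} \operatorname{E}_{\pi}\left[ \mathbf{1}_{x \neq y} \right] = \inf_{\pi} \pi\left( \{ (x, y) : x \neq y \} \right),$$
where the infimum is taken over all probability distributions $\pi$ on the product space with marginals $P$ and $Q$, respectively.[3] Consequently, on a finite space the L1 distance $\|P - Q\|_1$ is twice this optimal cost.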
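On a finite space, the infimum over joint distributions is attained by a so-called maximal coupling, which keeps as much mass as possible on the diagonal x = y. The sketch below builds one such coupling and measures the mass it places off the diagonal. The function name `maximal_coupling` and the example distributions are illustrative assumptions, and this is one standard construction rather than the only optimal coupling.

```python
def maximal_coupling(p, q):
    """Build a joint distribution with marginals p and q that maximizes P(X == Y).

    Returns a dict mapping (x, y) pairs to probabilities.
    """
    outcomes = sorted(set(p) | set(q))
    # Shared mass min(p(x), q(x)) stays on the diagonal.
    overlap = {x: min(p.get(x, 0.0), q.get(x, 0.0)) for x in outcomes}
    leftover = 1.0 - sum(overlap.values())  # total mass that must move
    joint = {(x, x): overlap[x] for x in outcomes}
    for x in outcomes:
        excess_p = p.get(x, 0.0) - overlap[x]
        for y in outcomes:
            excess_q = q.get(y, 0.0) - overlap[y]
            # Excess of p and excess of q never occur at the same point,
            # so these entries are always off the diagonal.
            if excess_p > 0 and excess_q > 0:
                joint[(x, y)] = excess_p * excess_q / leftover
    return joint

# Hypothetical example distributions.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.2, "b": 0.3, "c": 0.5}

joint = maximal_coupling(p, q)
# Transport cost under c(x, y) = 1 if x != y else 0.
mismatch = sum(mass for (x, y), mass in joint.items() if x != y)
```

For these inputs the off-diagonal mass coincides with the value of the largest gap |P(A) - Q(A)| over events, which is 0.3 up to floating-point rounding.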
See also
References
- Chatterjee, Sourav. "Distances between probability measures" (PDF). UC Berkeley. Archived from the original (PDF) on July 8, 2008. Retrieved 21 June 2013.
- Levin, David A.; Peres, Yuval; Wilmer, Elizabeth L. Markov Chains and Mixing Times. Proposition 5.2, p. 50.
- Villani, Cédric (2009). Optimal Transport, Old and New. Springer-Verlag Berlin Heidelberg. p. 22. ISBN 978-3-540-71049-3. doi:10.1007/978-3-540-71050-9.