Variation of information

In probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements). It is closely related to mutual information; indeed, it is a simple linear expression involving the mutual information. Unlike the mutual information, however, the variation of information is a true metric, in that it obeys the triangle inequality.[1]

Figure: Venn diagram illustrating the relation between information entropies, mutual information and variation of information.

Definition

Suppose we have two partitions $X$ and $Y$ of a set $A$ into disjoint subsets, namely $X = \{X_1, X_2, \ldots, X_k\}$ and $Y = \{Y_1, Y_2, \ldots, Y_l\}$. Let $n = \sum_i |X_i| = \sum_j |Y_j| = |A|$, $p_i = |X_i|/n$, $q_j = |Y_j|/n$, and $r_{ij} = |X_i \cap Y_j|/n$. Then the variation of information between the two partitions is:

$$\mathrm{VI}(X; Y) = -\sum_{i,j} r_{ij} \left[ \log\frac{r_{ij}}{p_i} + \log\frac{r_{ij}}{q_j} \right].$$

This is equivalent to the shared information distance between the random variables $i$ and $j$ with respect to the uniform probability measure on $A$ defined by $\mu(B) := |B|/n$ for $B \subseteq A$.
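A minimal Python sketch of this definition (the function name and the example partitions are illustrative, not taken from the source); it evaluates the sum above directly, using natural logarithms:

from math import log

def variation_of_information(X, Y):
    # X and Y are partitions of the same finite set, given as lists of
    # disjoint sets (blocks); both must cover exactly the same elements.
    n = sum(len(block) for block in X)
    vi = 0.0
    for x in X:
        p = len(x) / n              # p_i = |X_i| / n
        for y in Y:
            q = len(y) / n          # q_j = |Y_j| / n
            r = len(x & y) / n      # r_ij = |X_i ∩ Y_j| / n
            if r > 0.0:
                vi -= r * (log(r / p) + log(r / q))
    return vi

# Illustrative example: two partitions of {1, 2, 3, 4, 5}
X = [{1, 2, 3}, {4, 5}]
Y = [{1, 2}, {3, 4, 5}]
print(variation_of_information(X, Y))  # ≈ 0.764 nats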

Identities

The variation of information satisfies

$$\mathrm{VI}(X; Y) = H(X) + H(Y) - 2 I(X; Y),$$

where $H(X)$ is the entropy of $X$, and $I(X; Y)$ is the mutual information between $X$ and $Y$ with respect to the uniform probability measure on $A$. This can be rewritten as

$$\mathrm{VI}(X; Y) = H(X, Y) - I(X; Y),$$

where $H(X, Y)$ is the joint entropy of $X$ and $Y$, or as

$$\mathrm{VI}(X; Y) = H(X \mid Y) + H(Y \mid X),$$

where $H(X \mid Y)$ and $H(Y \mid X)$ are the respective conditional entropies.
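The three forms agree because of the standard identities $I(X; Y) = H(X) + H(Y) - H(X, Y)$ and $H(X \mid Y) = H(X) - I(X; Y)$; spelled out as a short check:

$$\begin{aligned}
\mathrm{VI}(X; Y) &= H(X) + H(Y) - 2 I(X; Y) \\
&= \bigl[ H(X) + H(Y) - I(X; Y) \bigr] - I(X; Y) = H(X, Y) - I(X; Y) \\
&= \bigl[ H(X) - I(X; Y) \bigr] + \bigl[ H(Y) - I(X; Y) \bigr] = H(X \mid Y) + H(Y \mid X).
\end{aligned}$$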

The variation of information can also be bounded, either in terms of the number of elements $n$:

$$\mathrm{VI}(X; Y) \le \log n,$$

or with respect to a maximum number of clusters, $K^{*}$:

$$\mathrm{VI}(X; Y) \le 2 \log K^{*}.$$
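For instance (an illustrative example, not from the cited reference): if $X$ is the partition of $A$ into $n$ singletons and $Y$ is the partition with the single block $A$, then $H(X) = \log n$, $H(Y) = 0$ and $I(X; Y) = 0$, so $\mathrm{VI}(X; Y) = \log n$ and the first bound is attained.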

References

  1. Alexander Kraskov, Harald Stögbauer, Ralph G. Andrzejak, and Peter Grassberger, "Hierarchical Clustering Based on Mutual Information" (2003), arXiv:q-bio/0311039.
