Gibbs' inequality

In information theory, Gibbs' inequality is a statement about the information entropy of a discrete probability distribution. Several other bounds on the entropy of probability distributions are derived from Gibbs' inequality, including Fano's inequality. It was first presented by J. Willard Gibbs in the 19th century.

Gibbs' inequality

Suppose that

P = \{ p_1 , \ldots , p_n \}

is a probability distribution. Then for any other probability distribution

Q = \{ q_1 , \ldots , q_n \}

the following inequality holds

- \sum_{i=1}^n p_i \log_2 p_i \leq - \sum_{i=1}^n p_i \log_2 q_i

with equality if and only if

p_i = q_i \,

for all i.
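As a quick numerical illustration (a minimal sketch, not part of the original article; the two distributions below are arbitrary examples and NumPy is assumed):

import numpy as np

# Two arbitrary example distributions on three outcomes (chosen only for illustration).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

# Left-hand side: Shannon entropy of P, in bits.
entropy_p = -np.sum(p * np.log2(p))

# Right-hand side: cross-entropy of P relative to Q, in bits.
cross_entropy_pq = -np.sum(p * np.log2(q))

print(entropy_p, cross_entropy_pq)    # about 1.49 and 2.00
assert entropy_p <= cross_entropy_pq  # Gibbs' inequality holds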

The difference between the two sides is the Kullback–Leibler divergence, or relative entropy, of P with respect to Q, so the inequality can also be written:

D_{\mathrm{KL}}(P\|Q) \geq 0
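Written out with the definitions above, the divergence is

D_{\mathrm{KL}}(P\|Q) = \sum_{i=1}^n p_i \log_2 \frac{p_i}{q_i} = - \sum_{i=1}^n p_i \log_2 q_i - \left( - \sum_{i=1}^n p_i \log_2 p_i \right)

so its non-negativity is exactly the inequality stated above.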

Proof

Since

\log_2 a = \frac{ \ln a }{ \ln 2 }

it is sufficient to prove the statement using the natural logarithm (ln). Note that the natural logarithm satisfies

\ln x \leq x-1

for all x > 0, with equality if and only if x = 1.
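To see this, consider the function

f(x) = x - 1 - \ln x

which satisfies f(1) = 0 and f'(x) = 1 - \frac{1}{x}, so f is decreasing on (0,1) and increasing on (1,\infty); hence f(x) \geq 0 for all x > 0, with equality only at x = 1.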

Let I denote the set of all i for which p_i is non-zero. Then

\begin{align} - \sum_{i \in I} p_i \ln \frac{q_i}{p_i} & {} \geq - \sum_{i \in I} p_i \left( \frac{q_i}{p_i} - 1 \right) \\ & {} = - \sum_{i \in I} q_i + \sum_{i \in I} p_i \\ & {} = - \sum_{i \in I} q_i + 1 \\ & {} \geq 0 \end{align}

Here \sum_{i \in I} p_i = 1 because I contains every index for which p_i is non-zero, and \sum_{i \in I} q_i \leq 1 because Q is a probability distribution. Since \ln \frac{q_i}{p_i} = \ln q_i - \ln p_i, the chain above rearranges to

- \sum_{i \in I} p_i \ln q_i \geq - \sum_{i \in I} p_i \ln p_i

and then trivially

- \sum_{i=1}^n p_i \ln q_i \geq - \sum_{i=1}^n p_i \ln p_i

since the additional terms have p_i = 0, so they contribute nothing to the right-hand side (with the convention 0 \ln 0 = 0) and can only leave the left-hand side unchanged or increase it.

For equality to hold, we require:

  1. \frac{q_i}{p_i} = 1 for all i \in I, so that equality holds in \ln \frac{q_i}{p_i} \leq \frac{q_i}{p_i} - 1.
  2. \sum_{i \in I} q_i = 1 so that equality continues to hold between the third and fourth lines of the proof.

This can happen if and only if

p_i = q_i

for i = 1, ..., n.

Alternative proofs

The result can alternatively be proved using Jensen's inequality or the log sum inequality.
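For instance, a sketch of the argument via Jensen's inequality: since the logarithm is concave, Jensen's inequality applied to the weights p_i (with I as in the proof above) gives

\sum_{i \in I} p_i \ln \frac{q_i}{p_i} \leq \ln \left( \sum_{i \in I} p_i \frac{q_i}{p_i} \right) = \ln \left( \sum_{i \in I} q_i \right) \leq \ln 1 = 0

which, after multiplying by -1, is the inequality proved above.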

Corollary

The entropy of P is bounded by:

H(p_1, \ldots , p_n) \leq \log_2 n

The proof is simple: set q_i = 1/n for all i in Gibbs' inequality.
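Explicitly, this substitution gives

- \sum_{i=1}^n p_i \log_2 p_i \leq - \sum_{i=1}^n p_i \log_2 \frac{1}{n} = \log_2 n \sum_{i=1}^n p_i = \log_2 n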
