Limiting density of discrete points

In information theory, the limiting density of discrete points is an adjustment to Claude Shannon's formula for differential entropy.

It was formulated by Edwin Thompson Jaynes to address defects in the initial definition of differential entropy.

Definition

Shannon originally wrote down the following formula for the entropy of a continuous distribution, known as differential entropy:

H(X)=-\int p(x)\log p(x)\,dx.

Unlike Shannon's formula for the discrete entropy, however, this is not the result of any derivation (Shannon simply replaced the summation symbol in the discrete version with an integral) and it turns out to lack many of the properties that make the discrete entropy a useful measure of uncertainty. In particular, it is not invariant under a change of variables and can even become negative.
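Both defects can be seen with a uniform distribution. The following is a minimal numerical sketch, assuming natural logarithms and a simple midpoint rule; the helper name differential_entropy is hypothetical and introduced only for illustration. A variable X uniform on [0, 0.5] has density 2, and the rescaled variable Y = 2X is uniform on [0, 1] with density 1.

```python
import math

def differential_entropy(pdf, lo, hi, steps=100_000):
    """Midpoint-rule estimate of -∫ p(x) log p(x) dx over [lo, hi], in nats."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = pdf(lo + (i + 0.5) * dx)
        if p > 0.0:  # treat 0 * log 0 as 0
            total -= p * math.log(p) * dx
    return total

# X uniform on [0, 0.5] has density 2; Y = 2X is uniform on [0, 1] with density 1.
h_x = differential_entropy(lambda x: 2.0, 0.0, 0.5)
h_y = differential_entropy(lambda y: 1.0, 0.0, 1.0)

print(h_x)        # close to -log 2, so the "entropy" is negative
print(h_y - h_x)  # close to log 2, so the value changed under y = 2x
```

The same uncertainty about a physical quantity thus receives a different value depending only on the units in which x is expressed.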

Jaynes (1963, 1968) argued that the formula for the continuous entropy should be derived by taking the limit of increasingly dense discrete distributions.[1][2] Suppose that we have a set of n discrete points \{x_i\}, such that in the limit n \to \infty their density approaches a function m(x) called the "invariant measure":

\lim_{n \to \infty}\frac{1}{n}\,(\text{number of points in }a<x<b)=\int_a^b m(x)\,dx.

Jaynes derived from this the following formula for the continuous entropy, which he argued should be taken as the correct formula:

H(X)=-\int p(x)\log\frac{p(x)}{m(x)}\,dx.
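The limiting step can be outlined as follows. This is a sketch rather than a full derivation: it assumes the points are dense enough that neighbouring points near x_i are separated by roughly 1/(n\,m(x_i)), and the symbols p_i and H_n are introduced here for illustration. Each point is assigned the probability mass of its neighbourhood,

p_i \approx \frac{p(x_i)}{n\,m(x_i)}.

Substituting this into the discrete entropy H_n=-\sum_i p_i\log p_i, using \sum_i p_i=1, and letting the sum pass to an integral gives

H_n \approx \log n-\int p(x)\log\frac{p(x)}{m(x)}\,dx.

The \log n term diverges as n \to \infty but does not depend on p(x); dropping it leaves the formula above.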

It is similar to the negative of the Kullback–Leibler divergence (or relative entropy), which compares two probability distributions, with one difference: in the Kullback–Leibler divergence, m(x) must be a probability density, whereas in Jaynes' formula, m(x) is simply a density, meaning that it does not have to integrate to 1.

Jaynes' continuous entropy formula has the property of being invariant under a change of variables, provided that m(x) and p(x) are transformed in the same way. (This motivates the moniker "invariant measure" for m.) This solves many of the difficulties that come from applying Shannon's continuous entropy formula.
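The invariance can be checked numerically. The sketch below is illustrative only: the densities p(x) = 2x and m(x) = 1 on [0, 1], the change of variables y = x^2, and the helper name jaynes_entropy are all assumptions chosen for this example, and the integral is approximated with a midpoint rule in natural logarithms.

```python
import math

def jaynes_entropy(p, m, lo, hi, steps=200_000):
    """Midpoint-rule estimate of -∫ p(x) log(p(x)/m(x)) dx over [lo, hi], in nats."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        px = p(x)
        if px > 0.0:  # treat 0 * log 0 as 0
            total -= px * math.log(px / m(x)) * dx
    return total

# Original variable x on [0, 1]: density p(x) = 2x, measure m(x) = 1.
h_x = jaynes_entropy(lambda x: 2.0 * x, lambda x: 1.0, 0.0, 1.0)

# New variable y = x**2 on [0, 1]. Both functions pick up the same Jacobian
# factor dx/dy = 1 / (2 sqrt(y)), giving p_Y(y) = 1 and m_Y(y) = 1 / (2 sqrt(y)).
h_y = jaynes_entropy(lambda y: 1.0, lambda y: 0.5 / math.sqrt(y), 0.0, 1.0)

print(h_x, h_y)  # both close to 0.5 - log 2
```

The Jacobian factors cancel inside the ratio p/m, which is why the two values agree. By contrast, Shannon's formula gives 0.5 - log 2 for p(x) = 2x but 0 for the transformed density, which is uniform on [0, 1].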

References

  1. Jaynes, E. T. (1963). "Information Theory and Statistical Mechanics". In K. Ford (ed.). Statistical Physics. Benjamin, New York. p. 181.
  2. Jaynes, E. T. (1968). "Prior Probabilities". IEEE Trans. on Systems Science and Cybernetics. SSC-4: 227.