Shannon–Fano–Elias coding

In information theory, Shannon–Fano–Elias coding is a precursor to arithmetic coding, in which probabilities are used to determine codewords.[1]

Algorithm description

Given a discrete random variable X of ordered values to be encoded, let p(x) be the probability for any x in X. Define a function

\bar{F}(x) = \sum_{x_i < x} p(x_i) + \frac{1}{2} p(x)

Algorithm:

For each x in X,
Let Z be the binary expansion of \bar{F}(x).
Choose the length of the encoding of x, L(x), to be the integer \left\lceil \log_2 \frac{1}{p(x)} \right\rceil + 1.
Choose the encoding of x, code(x), to be the first L(x) most significant bits after the binary point of Z.
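
As a concrete illustration, the following is a minimal Python sketch of the procedure above. The function name sfe_code, the use of exact Fractions, and the reliance on dictionary insertion order to fix the symbol ordering are implementation choices made here, not part of the original description.

    from fractions import Fraction
    from math import ceil, log2

    def sfe_code(probabilities):
        # Shannon-Fano-Elias coding: `probabilities` maps each symbol,
        # in its fixed order, to its probability (ideally a Fraction).
        codes = {}
        cumulative = Fraction(0)            # sum of p(x_i) for all x_i < x
        for symbol, p in probabilities.items():
            f_bar = cumulative + p / 2      # the modified CDF \bar{F}(x)
            length = ceil(log2(1 / p)) + 1  # L(x) = ceil(log2(1/p(x))) + 1
            # Take the first L(x) bits after the binary point of \bar{F}(x).
            bits, frac = [], f_bar
            for _ in range(length):
                frac *= 2
                bit = int(frac)
                bits.append(str(bit))
                frac -= bit
            codes[symbol] = "".join(bits)
            cumulative += p
        return codes

Exact fractions are used here so that the bits of \bar{F}(x) are extracted without floating-point rounding error.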

Example

Let X = {A, B, C, D}, with probabilities p = {1/3, 1/4, 1/6, 1/4}.

For A
\bar{F}(A) = \frac{1}{2} p(A) = \frac{1}{2} \cdot \frac{1}{3} = 0.1666\ldots
In binary, Z(A) = 0.0010101010...
L(A) = \left\lceil \log_2 \frac{1}{1/3} \right\rceil + 1 = 3
code(A) is 001
For B
\bar{F}(B) = p(A) + \frac{1}{2} p(B) = \frac{1}{3} + \frac{1}{2} \cdot \frac{1}{4} = 0.4583333\ldots
In binary, Z(B) = 0.01110101010101...
L(B) = \left\lceil \log_2 \frac{1}{1/4} \right\rceil + 1 = 3
code(B) is 011
For C
\bar{F}(C) = p(A) + p(B) + \frac{1}{2} p(C) = \frac{1}{3} + \frac{1}{4} + \frac{1}{2} \cdot \frac{1}{6} = 0.6666\ldots
In binary, Z(C) = 0.101010101010...
L(C) = \left\lceil \log_2 \frac{1}{1/6} \right\rceil + 1 = 4
code(C) is 1010
For D
\bar{F}(D) = p(A) + p(B) + p(C) + \frac{1}{2} p(D) = \frac{1}{3} + \frac{1}{4} + \frac{1}{6} + \frac{1}{2} \cdot \frac{1}{4} = 0.875
In binary, Z(D) = 0.111
L(D) = \left\lceil \log_2 \frac{1}{1/4} \right\rceil + 1 = 3
code(D) is 111
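
Continuing the sketch from the algorithm section, running sfe_code on this distribution reproduces the four codewords derived above (probs is an illustrative name):

    probs = {"A": Fraction(1, 3), "B": Fraction(1, 4),
             "C": Fraction(1, 6), "D": Fraction(1, 4)}
    print(sfe_code(probs))
    # {'A': '001', 'B': '011', 'C': '1010', 'D': '111'}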

Algorithm analysis

Prefix code

Shannon–Fano–Elias coding produces a binary prefix code, allowing for direct decoding.

Let bcode(x) be the rational number formed by placing a binary point before the codeword; for example, if code(C) = 1010 then bcode(C) = 0.1010 (in binary). For all x, if no y \ne x exists such that

bcode(x) \le bcode(y) < bcode(x) + 2^{-L(x)}

then all the codes form a prefix code.

By comparing \bar{F} to the CDF of X, this property may be demonstrated graphically for Shannon–Fano–Elias coding.

By the definition of L(x) it follows that

2^{-L(x)} \le \frac{1}{2} p(x)

And because the bits after position L(y) are truncated from \bar{F}(y) to form code(y), it follows that

\bar{F}(y) - bcode(y) \le 2^{-L(y)}

Combining this with 2^{-L(y)} \le \frac{1}{2} p(y) gives bcode(y) \ge \bar{F}(y) - \frac{1}{2} p(y) = \sum_{x_i < y} p(x_i); thus, for any y > x, bcode(y) is no less than the CDF of X at x. Since bcode(x) \le \bar{F}(x), and the CDF at x equals \bar{F}(x) + \frac{1}{2} p(x), it follows that

bcode(y) - bcode(x) \ge \frac{1}{2} p(x) \ge 2^{-L(x)}

so bcode(y) falls outside the interval [bcode(x), bcode(x) + 2^{-L(x)}), and the prefix property holds.
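
As a sanity check of this interval condition, the following self-contained sketch (is_prefix_free is a hypothetical helper; the codewords are the ones computed in the example) tests every ordered pair of distinct symbols:

    from fractions import Fraction
    from itertools import permutations

    def is_prefix_free(codes):
        # True iff no bcode(y) lies in [bcode(x), bcode(x) + 2^(-L(x)))
        # for distinct x and y -- the condition stated above.
        for x, y in permutations(codes, 2):
            bcode_x = Fraction(int(codes[x], 2), 2 ** len(codes[x]))
            bcode_y = Fraction(int(codes[y], 2), 2 ** len(codes[y]))
            if bcode_x <= bcode_y < bcode_x + Fraction(1, 2 ** len(codes[x])):
                return False
        return True

    assert is_prefix_free({"A": "001", "B": "011", "C": "1010", "D": "111"})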

Code length

The average code length is

LC(X) = \sum_{x \in X} p(x) L(x) = \sum_{x \in X} p(x) \left( \left\lceil \log_2 \frac{1}{p(x)} \right\rceil + 1 \right)

Thus for H(X), the entropy of the random variable X,

H(X) + 1 \le LC(X) < H(X) + 2

Shannon–Fano–Elias codes use 1 to 2 extra bits per symbol from X beyond the entropy, so the code is not used in practice.
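
For the example distribution, a short computation illustrates the bound (variable names are illustrative; the code lengths are those from the worked example):

    from fractions import Fraction
    from math import log2

    probs = {"A": Fraction(1, 3), "B": Fraction(1, 4),
             "C": Fraction(1, 6), "D": Fraction(1, 4)}
    lengths = {"A": 3, "B": 3, "C": 4, "D": 3}   # L(x) from the example

    entropy = -sum(p * log2(p) for p in probs.values())       # H(X)
    avg_len = sum(p * lengths[x] for x, p in probs.items())   # LC(X)
    print(f"H(X) = {entropy:.3f}, LC(X) = {float(avg_len):.3f}")
    # H(X) = 1.959, LC(X) = 3.167, which lies in [H(X)+1, H(X)+2)
    assert entropy + 1 <= avg_len < entropy + 2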

References

  1. T. M. Cover and Joy A. Thomas (2006). Elements of information theory (2nd ed.). John Wiley and Sons. pp. 127–128. ISBN 978-0-471-24195-9. 