Fano's inequality

In information theory, Fano's inequality (also known as the Fano converse) is a lower bound on the error probability of any decoder. It was developed by Robert Fano in the early 1950s while teaching a Ph.D. seminar in information theory at MIT, and later recorded in his 1961 textbook.

The lower bound is calculated as:

\Pr(g(V)\neq W) \geq \frac{H(W|V)-\log 2}{\log(|\mathcal{W}|-1)}

for any function (even probabilistic) g of the observation V used to estimate the message W, which takes values in a finite set \mathcal{W}. The logarithm is in the same base as that used to calculate the entropy.
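As a purely illustrative numerical example (the alphabet size and conditional entropy below are hypothetical, not from the article): take base-2 logarithms, so \log 2 = 1 bit, and suppose the message alphabet has |\mathcal{W}| = 4 symbols with residual uncertainty H(W|V) = 1.2 bits. A minimal sketch evaluating the bound:

```python
import math

# Hypothetical values (for illustration only, not from the article):
card_W = 4          # |W|: size of the message alphabet
H_W_given_V = 1.2   # conditional entropy H(W|V) in bits (base-2 logs)

# Fano's inequality: Pr(g(V) != W) >= (H(W|V) - log 2) / log(|W| - 1).
# With base-2 logarithms, log 2 equals 1 bit.
lower_bound = (H_W_given_V - 1.0) / math.log2(card_W - 1)
print(f"Any estimator errs with probability >= {lower_bound:.4f}")
# prints: Any estimator errs with probability >= 0.1262
```

So in this scenario no decoder, however clever, can recover W from V with error probability below about 12.6%.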

Proof

Let E be the error indicator,

E= \begin{cases} 1 & \mbox{if } g(V)\neq W\\ 0 & \mbox{if } g(V)=W \end{cases}

Applying the chain rule to H(W,E|V) in two different ways gives

H(W,E|V)=H(E|V)+H(W|E,V)=H(W|V)+H(E|W,V)\qquad (1)

Consider each term separately (a numerical check of these steps appears after the list):

  • H(E|V)\leq \log 2\qquad(2) because E is binary.
  • H(W|E,V)=\Pr(E=0) H(W|E=0,V) + \Pr(E=1) H(W|E=1,V)
    where
    • 0\leq H(W|E=0,V)=H(g(V)|E=0,V)=0, because given E=0 we have W=g(V), and g(V) carries no uncertainty once V is known; the lower bound is just the non-negativity of entropy.
    • Similarly for the other term: conditioned on E=1 and V, the message W is confined to the |\mathcal{W}|-1 values other than g(V), and the entropy of a variable with at most |\mathcal{W}|-1 possible values is at most \log(|\mathcal{W}|-1); hence H(W|E=1,V)\leq \log(|\mathcal{W}|-1).
    • Hence, since \Pr(E=1)=\Pr(g(V)\neq W), H(W|E,V)\leq \Pr(g(V)\neq W) \log(|\mathcal{W}|-1)\qquad(3)
  • H(E|W,V) \geq 0\qquad(4) with equality if g is a deterministic function, since then g(V) is determined by V and E is a function of (W, g(V)), so that H(E|W,V)=H(E|W,V,g(V))=0. Determinism of g, however, is not necessary for the proof, which uses only non-negativity.
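The following sketch (not part of the original proof) checks the chain-rule identity (1) and the bounds (2)-(4) numerically, under assumed ingredients: a randomly drawn joint pmf p(w, v) on 3 x 3 alphabets, the deterministic MAP decoder as g, and base-2 logarithms so that \log 2 = 1 bit.

```python
import math
import random

random.seed(0)

# Hypothetical example (not from the article): W and V each take 3 values,
# p(w, v) is a random joint pmf, and g is the deterministic MAP decoder.
W_VALS, V_VALS = range(3), range(3)
weights = [[random.random() for _ in V_VALS] for _ in W_VALS]
total = sum(map(sum, weights))
p = [[x / total for x in row] for row in weights]   # p[w][v]

def H(dist):
    """Shannon entropy in bits of a pmf given as a list of probabilities."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

def cond_H(joint_xy, marg_y):
    """H(X|Y) = H(X,Y) - H(Y)."""
    return H(joint_xy) - H(marg_y)

# MAP decoder: g(v) = argmax_w p(w | v), a deterministic function of V.
g = [max(W_VALS, key=lambda w: p[w][v]) for v in V_VALS]

# Joint pmf over (W, V, E) with E = 1{g(V) != W}; E is determined by (W, V).
joint_wve = {(w, v, int(g[v] != w)): p[w][v] for w in W_VALS for v in V_VALS}
p_err = sum(q for (w, v, e), q in joint_wve.items() if e == 1)

def marginal(keep):
    """Marginal pmf over the coordinates in `keep` (0 = W, 1 = V, 2 = E)."""
    out = {}
    for key, q in joint_wve.items():
        k = tuple(key[i] for i in keep)
        out[k] = out.get(k, 0.0) + q
    return list(out.values())

H_W_given_V  = cond_H(marginal((0, 1)), marginal((1,)))       # H(W|V)
H_E_given_V  = cond_H(marginal((1, 2)), marginal((1,)))       # H(E|V)
H_W_given_EV = cond_H(marginal((0, 1, 2)), marginal((1, 2)))  # H(W|E,V)
H_E_given_WV = cond_H(marginal((0, 1, 2)), marginal((0, 1)))  # H(E|W,V)

# (1) both chain-rule expansions of H(W,E|V) agree:
assert abs((H_E_given_V + H_W_given_EV) - (H_W_given_V + H_E_given_WV)) < 1e-12
# (2) H(E|V) <= log 2 (= 1 bit):
assert H_E_given_V <= 1.0 + 1e-12
# (3) H(W|E,V) <= Pr(error) * log(|W| - 1):
assert H_W_given_EV <= p_err * math.log2(len(W_VALS) - 1) + 1e-12
# (4) H(E|W,V) = 0 here, since g is deterministic:
assert H_E_given_WV < 1e-12
# Fano's inequality itself:
assert p_err >= (H_W_given_V - 1.0) / math.log2(len(W_VALS) - 1) - 1e-12
print("all checks passed, Pr(error) =", round(p_err, 4))
```

Because g is the deterministic MAP rule here, H(E|W,V) comes out exactly zero, matching the equality case noted in (4).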

Hence, applying the bounds (2)-(4) to the chain rule expansion (1) gives

H(W|V) \leq H(W|V)+H(E|W,V) = H(E|V)+H(W|E,V) \leq \log 2 + \Pr(g(V)\neq W)\log(|\mathcal{W}|-1),

and rearranging for \Pr(g(V)\neq W) yields Fano's inequality. Q.E.D.

References

  • Robert M. Fano, Transmission of Information: A Statistical Theory of Communications. Cambridge, MA: MIT Press, 1961. ISBN 0-262-06001-9.