Shannon's Noiseless Coding Theorem

Assumptions:

A source emits symbols $s_1, \ldots, s_n$ with probabilities $p_1, \ldots, p_n$ (each $p_i > 0$, $\sum_{i=1}^{n} p_i = 1$), and $C$ is an instantaneous (prefix-free) binary code assigning to the symbol $s_i$ a codeword of length $l_i$. Write $H = H(p_1, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log_2 p_i$ for the Entropy of the source.
___________________________________________________________________________________________________

Theorem:

Under the above assumptions, for any such $C$:

$$\sum_{i=1}^{n} p_i\, l_i \;\geq\; -\sum_{i=1}^{n} p_i \log_2 p_i.$$

In particular, writing $L(C) = \sum_{i=1}^{n} p_i\, l_i$ for the average codeword length,

$$L(C) \;\geq\; H.$$

The average length of an encoded symbol is greater than or equal to the Entropy.
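To make the statement concrete, here is a small numerical check in Python, using a hypothetical four-symbol source and the prefix code 0, 10, 110, 111 (both chosen only for illustration):

# Check L(C) >= H for a hypothetical source and instantaneous code.
import math

p = [0.5, 0.25, 0.15, 0.10]          # hypothetical symbol probabilities
lengths = [1, 2, 3, 3]               # codeword lengths of the prefix code 0, 10, 110, 111

H = -sum(pi * math.log2(pi) for pi in p)          # entropy of the source
L = sum(pi * li for pi, li in zip(p, lengths))    # average codeword length

print(f"H = {H:.4f} bits, L = {L:.4f} bits, L >= H: {L >= H}")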

___________________________________________________________________________________________________

Proof:


Recall Gibbs' Inequality:

If $(p_1, \ldots, p_n)$ and $(q_1, \ldots, q_n)$ are two probability distributions, then

$$-\sum_{i=1}^{n} p_i \log_2 p_i \;\leq\; -\sum_{i=1}^{n} p_i \log_2 q_i,$$

with equality if and only if $p_i = q_i$ for all $i$.

And Kraft's inequality:

If an instantaneous binary code has codeword lengths $l_1, \ldots, l_n$, then

$$\sum_{i=1}^{n} 2^{-l_i} \;\leq\; 1,$$

and, conversely, any lengths satisfying this inequality are the lengths of some instantaneous code.
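Both inequalities are easy to check numerically; the Python snippet below does so for a hypothetical pair of distributions and a hypothetical set of codeword lengths (all chosen only for illustration):

# Numerical sanity check of Gibbs' and Kraft's inequalities on made-up data.
import math

p = [0.5, 0.25, 0.25]
q = [0.4, 0.4, 0.2]
lengths = [1, 2, 2]                  # lengths of the prefix code 0, 10, 11

gibbs_lhs = -sum(pi * math.log2(pi) for pi in p)
gibbs_rhs = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
kraft_sum = sum(2.0 ** -l for l in lengths)

print(f"Gibbs: {gibbs_lhs:.4f} <= {gibbs_rhs:.4f}  ->  {gibbs_lhs <= gibbs_rhs}")
print(f"Kraft: sum 2^(-l_i) = {kraft_sum:.4f} <= 1  ->  {kraft_sum <= 1}")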

Define $D$ by the formula:

$$D \;=\; \sum_{i=1}^{n} 2^{-l_i}.$$

By Kraft's inequality we have $D \leq 1$, thus $\log_2 D \leq 0$.


Letting $(q_1, \ldots, q_n)$ be the probability distribution $q_i = \dfrac{2^{-l_i}}{D}$, we have:

$$H \;=\; -\sum_{i=1}^{n} p_i \log_2 p_i \;\leq\; -\sum_{i=1}^{n} p_i \log_2\!\left(\frac{2^{-l_i}}{D}\right) \;=\; \sum_{i=1}^{n} p_i\, l_i + \log_2 D \;\leq\; \sum_{i=1}^{n} p_i\, l_i.$$
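The chain of inequalities above can also be traced numerically; the Python sketch below does so for hypothetical probabilities and (non-optimal) lengths, the only requirement being that the lengths satisfy Kraft's inequality:

# Trace H <= sum p_i l_i + log2(D) <= sum p_i l_i on made-up data.
import math

p = [0.4, 0.3, 0.2, 0.1]
lengths = [2, 2, 2, 3]                      # sum 2^(-l_i) = 7/8 <= 1

D = sum(2.0 ** -l for l in lengths)
q = [2.0 ** -l / D for l in lengths]        # the auxiliary distribution q_i

H = -sum(pi * math.log2(pi) for pi in p)
cross = -sum(pi * math.log2(qi) for pi, qi in zip(p, q))
L = sum(pi * li for pi, li in zip(p, lengths))

print(f"H = {H:.4f} <= {cross:.4f} = L + log2(D) = {L + math.log2(D):.4f} <= L = {L:.4f}")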

_____________________________________________________________________________________________________________

Realizing the Bound (Huffman Code for Certain Probability Measures):

Theorem:

Given $p_1, \ldots, p_n$ as above with each $p_i = 2^{-k_i}$ for some integer $k_i$, then the associated Huffman code satisfies the equation

$$\sum_{i=1}^{n} p_i\, l_i \;=\; -\sum_{i=1}^{n} p_i \log_2 p_i \;=\; H,$$

that is, the expected length of a codeword is the Entropy of the symbol set.

Proof: See the previous discussion.
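As an illustration (a standard textbook version of Huffman's construction, not necessarily the one from that discussion), here is a minimal Python sketch run on the dyadic probabilities 1/2, 1/4, 1/8, 1/8; the average length comes out exactly equal to the entropy:

# Minimal Huffman construction; checks that L == H for dyadic probabilities.
import heapq, itertools, math

def huffman_lengths(probs):
    """Return the codeword length Huffman's algorithm assigns to each symbol."""
    counter = itertools.count()              # tie-breaker so lists are never compared
    heap = [(p, next(counter), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two least probable groups
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:              # each merge adds one bit to the merged symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), syms1 + syms2))
    return lengths

p = [1/2, 1/4, 1/8, 1/8]                     # dyadic probabilities p_i = 2^(-k_i)
lengths = huffman_lengths(p)
H = -sum(pi * math.log2(pi) for pi in p)
L = sum(pi * li for pi, li in zip(p, lengths))
print(lengths, f"L = {L:.4f}", f"H = {H:.4f}")   # lengths 1, 2, 3, 3; L == H == 1.75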

_____________________________________________________________________________________________________________

Approximating the Bound (Shannon-Fano Coding):

The following approximation to the Shannon noiseless coding bound is less efficient than Huffman's procedure but is a bit easier to demonstrate.

In the setting above, just choose integers $l_1, \ldots, l_n$ such that:

$$-\log_2 p_i \;\leq\; l_i \;<\; -\log_2 p_i + 1, \qquad \text{that is,}\quad l_i = \lceil -\log_2 p_i \rceil.$$

Suppose we knew that we could find an instantaneous code with these lengths. Then, multiplying each inequality by the appropriate $p_i$ and adding over $i$, we get

$$-\sum_{i=1}^{n} p_i \log_2 p_i \;\leq\; \sum_{i=1}^{n} p_i\, l_i \;<\; -\sum_{i=1}^{n} p_i \log_2 p_i + 1,$$

or

$$H \;\leq\; \sum_{i=1}^{n} p_i\, l_i \;<\; H + 1.$$
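A minimal Python sketch of this length assignment, on hypothetical probabilities chosen only for illustration, checking the two-sided bound:

# Shannon-Fano lengths l_i = ceil(-log2 p_i) and the bound H <= L < H + 1.
import math

p = [0.45, 0.25, 0.20, 0.10]                        # hypothetical probabilities
lengths = [math.ceil(-math.log2(pi)) for pi in p]   # l_i = ceil(-log2 p_i)

H = -sum(pi * math.log2(pi) for pi in p)
L = sum(pi * li for pi, li in zip(p, lengths))
print(lengths, f"H = {H:.4f}", f"L = {L:.4f}", H <= L < H + 1)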

To show that we can find such a code:

$$l_i \;\geq\; -\log_2 p_i$$

can be rewritten as:

$$2^{l_i} \;\geq\; \frac{1}{p_i}, \qquad \text{and, by inverting the terms,}\qquad 2^{-l_i} \;\leq\; p_i.$$

Again, adding these inequalities over $i$, we get:

$$\sum_{i=1}^{n} 2^{-l_i} \;\leq\; \sum_{i=1}^{n} p_i \;=\; 1,$$

which, again using Kraft's inequality (this time the converse direction), says that such an instantaneous code exists.
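For completeness, one standard way to actually construct such a code from the lengths is the canonical assignment sketched below; the specific codewords it produces are only one of many valid choices:

# Build a prefix code from lengths satisfying Kraft's inequality:
# sort by length, then assign codewords by counting upward in binary.
import math

def prefix_code_from_lengths(lengths):
    """Given lengths satisfying Kraft's inequality, return binary codewords."""
    assert sum(2.0 ** -l for l in lengths) <= 1 + 1e-12   # Kraft's inequality
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes = [None] * len(lengths)
    value, prev_len = 0, 0
    for i in order:
        value <<= (lengths[i] - prev_len)                  # extend to the new length
        codes[i] = format(value, "b").zfill(lengths[i])
        value += 1
        prev_len = lengths[i]
    return codes

p = [0.45, 0.25, 0.20, 0.10]
lengths = [math.ceil(-math.log2(pi)) for pi in p]          # the Shannon-Fano lengths from above
codes = prefix_code_from_lengths(lengths)
print(codes)                                               # e.g. ['00', '01', '100', '1010']
# No codeword is a prefix of another:
assert all(not b.startswith(a) for a in codes for b in codes if a != b)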