Mutual Information and Channel Capacity

Definition:

In the setting of the previous lecture, given two jointly distributed Finite Random Variables $X:\Omega \rightarrow \{x_{1},\ldots,x_{n}\}$ and $Y:\Omega \rightarrow \{y_{1},\ldots,y_{m}\}$, their Mutual Information is defined as follows:

$$I(X;Y)=H(X)-H(X\mid Y)$$

Expanding the definition of conditional entropy,

$$I(X;Y)=\sum_{i,j}P(x_{i},y_{j})\log_{2}\left(\frac{P(x_{i},y_{j})}{P(x_{i})P(y_{j})}\right)$$

and essentially the same calculation for $I(Y;X)=H(Y)-H(Y\mid X)$ produces the same symmetric expression, so $I(X;Y)=I(Y;X)$. All Information is Mutual.
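The definition can be checked concretely. Below is a short Python sketch (the function name and the joint-distribution numbers are made up for illustration) that computes $I(X;Y)$ directly from a joint table and confirms that transposing the table, i.e. swapping $X$ and $Y$, does not change the value:

```python
import math

def mutual_information(joint):
    """I(X;Y), in bits, from a joint table: joint[i][j] = P(X = x_i, Y = y_j)."""
    px = [sum(row) for row in joint]          # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y
    return sum(p * math.log2(p / (px[i] * py[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row)
               if p > 0)

# A made-up joint distribution for two binary random variables.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
transposed = [list(col) for col in zip(*joint)]

# All Information is Mutual: I(X;Y) = I(Y;X).
assert abs(mutual_information(joint) - mutual_information(transposed)) < 1e-12
```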

_____________________________________________________________________________________________

Theorem:

$$I(X;Y)=H(Y)-H(Y\mid X)=H(X)-H(X\mid Y)=H(X)+H(Y)-H(X,Y)\geq 0$$

Proofs:

These are all simple variants of the definition, the calculations above, and the material in the previous lecture. For example,

$$H(X)+H(Y)-H(X,Y)$$

$$=-\sum_{i}P(x_{i})\log_{2}P(x_{i})-\sum_{j}P(y_{j})\log_{2}P(y_{j})+\sum_{i,j}P(x_{i},y_{j})\log_{2}P(x_{i},y_{j})$$

$$=-\sum_{i,j}P(x_{i},y_{j})\log_{2}P(x_{i})-\sum_{i,j}P(x_{i},y_{j})\log_{2}P(y_{j})+\sum_{i,j}P(x_{i},y_{j})\log_{2}P(x_{i},y_{j})$$

$$=\sum_{i,j}P(x_{i},y_{j})\log_{2}\left(\frac{P(x_{i},y_{j})}{P(x_{i})P(y_{j})}\right)$$

$$=I(X;Y)$$

using the marginals $P(x_{i})=\sum_{j}P(x_{i},y_{j})$ and $P(y_{j})=\sum_{i}P(x_{i},y_{j})$ to rewrite the first two sums as sums over the joint distribution.

_____________________________________________________________________________________

One might read $I(T;R)$, the Mutual Information of the transmitted character $T$ and the received character $R$, as:

  1. $H(R)-H(R\mid T)$: the average information about the character received after the transmission noise has been removed.

  2. $H(T)-H(T\mid R)$: the average information about the character sent after the Bayesian noise has been removed.

  3. $H(T)+H(R)-H(T,R)$: the information in $H(T)+H(R)$ after a copy of the joint information has been removed, the Mutual Information getting counted twice in the sum.
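All three readings are the same number. Here is a quick Python check on a made-up joint distribution for $(T,R)$, using the chain rule $H(R\mid T)=H(T,R)-H(T)$ from the previous lecture:

```python
import math

def H(p):
    """Shannon entropy, in bits, of a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# A made-up joint distribution: joint[i][j] = P(T = t_i, R = r_j).
joint = [[0.30, 0.05, 0.05],
         [0.05, 0.30, 0.05],
         [0.05, 0.05, 0.10]]

pT = [sum(row) for row in joint]               # marginal of T
pR = [sum(col) for col in zip(*joint)]         # marginal of R
H_TR = H([p for row in joint for p in row])    # joint entropy H(T,R)

reading1 = H(pR) - (H_TR - H(pT))   # H(R) - H(R|T)
reading2 = H(pT) - (H_TR - H(pR))   # H(T) - H(T|R)
reading3 = H(pT) + H(pR) - H_TR     # H(T) + H(R) - H(T,R)

assert abs(reading1 - reading2) < 1e-12
assert abs(reading2 - reading3) < 1e-12
```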

_____________________________________________________________________________________

The Extreme Cases, Yet Again:

We have Random Variables $T:\Omega \rightarrow \{A,B,\ldots,Z\}$ and $R:\Omega \rightarrow \{A,B,\ldots,Z\}$, the transmitted and received characters.

The two special cases to be considered are:

  1. $P(R=r\mid T=t)=1$ if $r=t$ and $0$ otherwise: error-free transmission.

    $H(R\mid T)=0$ and $H(T\mid R)=0$, since each of $R$ and $T$ determines the other.

    $I(T;R)=H(R)-H(R\mid T)=H(R)=H(T)$: all of the information is in what is transmitted.

  2. $P(R=r\mid T=t)=P(R=r)$ for all $t$ and $r$: total noise.

    $P(T=t,R=r)=P(T=t)P(R=r)$, $H(R\mid T)=H(R)$, so $I(T;R)=H(R)-H(R\mid T)=0$.

    Since $R$ and $T$ are independent, there is no mutual information.
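Both extreme cases can be verified numerically. The sketch below uses a 4-symbol alphabet standing in for $\{A,B,\ldots,Z\}$ and an arbitrary made-up input distribution:

```python
import math

def H(p):
    """Shannon entropy, in bits, of a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(joint):
    """I(T;R), in bits, from a joint table: joint[i][j] = P(T = t_i, R = r_j)."""
    pT = [sum(row) for row in joint]
    pR = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (pT[i] * pR[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row)
               if p > 0)

n = 4                       # small alphabet standing in for {A, B, ..., Z}
pT = [0.4, 0.3, 0.2, 0.1]   # an arbitrary input distribution

# Case 1: error-free transmission, P(R = r | T = t) = 1 iff r = t,
# so P(T = t_i, R = r_j) = P(T = t_i) when i = j and 0 otherwise.
noiseless = [[pT[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
assert abs(mutual_information(noiseless) - H(pT)) < 1e-12   # I(T;R) = H(T)

# Case 2: total noise, P(R = r | T = t) = P(R = r), so T and R are independent.
pR = [1.0 / n] * n
independent = [[pT[i] * pR[j] for j in range(n)] for i in range(n)]
assert abs(mutual_information(independent)) < 1e-12         # I(T;R) = 0
```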

________________________________________________________________________________________

Definition:

for a given channel, specified by the conditional probabilities $P(R=r_{j}\mid T=t_{i})$ (the channel matrix), the Channel Capacity $C$ is defined by the formula

$$C=\max I(T;R)$$

where the maximum is taken over all probability distributions for the input $T$.

For the example of a Binary Symmetric Channel, $I(T;R)=H(R)-H(R\mid T)$ and $H(R\mid T)$ is the same for every input distribution, so the maximum is achieved when $H(R)$ is a maximum (see below).
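That maximization can be sketched numerically (the function name and the grid search below are just for illustration; no closed form is derived here): for a crossover probability $p$, evaluate $I(T;R)=H(R)-H(R\mid T)$ over a grid of input distributions and locate the maximum.

```python
import math

def H2(p):
    """Binary entropy, in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(a, p):
    """I(T;R) for a Binary Symmetric Channel with crossover probability p
    and input distribution P(T = 0) = a."""
    r0 = a * (1 - p) + (1 - a) * p     # P(R = 0)
    return H2(r0) - H2(p)              # H(R) - H(R|T); H(R|T) = H2(p) is constant

p = 0.1
best_I, best_a = max((bsc_mutual_information(a / 1000, p), a / 1000)
                     for a in range(1001))
assert abs(best_a - 0.5) < 1e-9   # the maximum is at the uniform input distribution
```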

Exercise (Due March 7): Compute the Channel Capacity for a Binary Symmetric Channel in terms of $p$.

_______________________________________________________________________________________________________

Theorem:

For a symmetric channel, one whose channel-matrix rows all have the same values (possibly in different orders) and whose columns all have the same values, the Channel Capacity is achieved by the uniform input distribution $P(T=t_{i})=\frac{1}{n}$, and

$$C=\log_{2}(m)-H(\text{row})$$

where $n$ is the number of input symbols, $m$ is the number of output symbols, and $H(\text{row})$ is the entropy of any row of the channel matrix.

Proof:

$$I(T;R)=H(R)-H(R\mid T)\qquad \text{and}\qquad H(R\mid T)=\sum_{i}P(T=t_{i})\,H(R\mid T=t_{i})$$

but the rows of the channel matrix all have the same values (again, the order may be different),

so $H(R\mid T=t_{i})$ is independent of $i$. In particular, for any input distribution,

$$H(R\mid T)=H(R\mid T=t_{i})=H(\text{row})\ \text{for any}\ i.$$

Now take the uniform input distribution $P(T=t_{i})=\frac{1}{n}$. For a given $r_{j}$, $P(R=r_{j})=\sum_{i}P(T=t_{i})\,P(R=r_{j}\mid T=t_{i})=\frac{1}{n}\sum_{i}P(R=r_{j}\mid T=t_{i})$; in particular, since the columns all have the same values,

$$P(R=r_{j})=\frac{1}{m}$$

since $\sum_{i}P(R=r_{j}\mid T=t_{i})=\sum_{i}P(R=r_{k}\mid T=t_{i})$ for any $j$ and $k$, and $P(R=r_{1}),\ldots,P(R=r_{m})$ is a probability distribution.

Hence $H(R)=\log_{2}(m)$, its largest possible value, so the uniform input distribution achieves the maximum and $C=\log_{2}(m)-H(\text{row})$.
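The proof can be checked numerically; the $3\times 3$ channel matrix below is made up, but symmetric in the required sense (every row and every column is a permutation of $0.7$, $0.2$, $0.1$):

```python
import math

def H(p):
    """Shannon entropy, in bits, of a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# channel[i][j] = P(R = r_j | T = t_i); rows and columns are permutations
# of the same values, so this is a symmetric channel.
channel = [[0.7, 0.2, 0.1],
           [0.1, 0.7, 0.2],
           [0.2, 0.1, 0.7]]
n = len(channel)

# With the uniform input P(T = t_i) = 1/n, the output distribution is uniform.
pR = [sum(channel[i][j] for i in range(n)) / n for j in range(n)]
assert all(abs(q - 1.0 / n) < 1e-12 for q in pR)

# H(R | T = t_i) is the same for every i, so
# C = H(R) - H(R|T) = log2(n) - H(row).
C = math.log2(n) - H(channel[0])
assert 0 < C < math.log2(n)
```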
