Joint Entropy and Conditional Entropy

Random Variables Revisited:

Definition: Two Random Variables $X\colon S\rightarrow \{x_1,\ldots ,x_n\}$ and $Y\colon S\rightarrow \{y_1,\ldots ,y_m\}$ for the same Sample Space $S$ and Sigma Algebra $A$ on $S$ are called jointly distributed.

Notes:

[Figure]
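For concreteness, here is a small worked example (an added illustration, not part of the original notes): let $X$ and $Y$ each take the values $0$ and $1$, with joint probabilities

$$P(0,0)=\tfrac{1}{2},\qquad P(0,1)=\tfrac{1}{4},\qquad P(1,0)=0,\qquad P(1,1)=\tfrac{1}{4}.$$

The marginal distributions are obtained by summing over the other variable: $P(X=0)=\tfrac{3}{4}$, $P(X=1)=\tfrac{1}{4}$ and $P(Y=0)=P(Y=1)=\tfrac{1}{2}$.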

___________________________________________________________________________________

Definitions:

For $H(X\mid Y=y_j)\;=\;-\sum_{i}P(x_i\mid y_j)\log_{2}P(x_i\mid y_j)$. Read: having learned that the value $Y$ has taken is $y_j$, $H(X\mid Y=y_j)$ is the Information you get when you learn the value $X$ has taken.

$H(X\mid Y)\;=\;\sum_{j}P(y_j)\,H(X\mid Y=y_j)\;=\;-\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i\mid y_j)$ is the Expected Value of $H(X\mid Y=y_j)$.
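A minimal computational sketch of these two definitions (an added illustration, not part of the original notes; the joint matrix P below is the small example distribution used above, and NumPy is simply one convenient choice):

    import numpy as np

    # Assumed example joint distribution: P[i, j] = P(X = x_i, Y = y_j)
    P = np.array([[0.50, 0.25],
                  [0.00, 0.25]])

    Py = P.sum(axis=0)        # marginal distribution P(y_j)

    def H(p):
        # Shannon entropy (in bits) of a probability vector, ignoring zero entries
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    # H(X | Y = y_j): entropy of the j-th column of P, renormalized by P(y_j)
    H_X_given_yj = [H(P[:, j] / Py[j]) for j in range(P.shape[1])]

    # H(X | Y) = sum_j P(y_j) H(X | Y = y_j)
    H_X_given_Y = sum(Py[j] * H_X_given_yj[j] for j in range(P.shape[1]))

    print(H_X_given_yj, H_X_given_Y)

For this example the sketch gives $H(X\mid Y=0)=0$, $H(X\mid Y=1)=1$ and $H(X\mid Y)=\tfrac{1}{2}$ bit.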

___________________________________________________________________

Theorem:

$$H(X,Y)\;\leq \;H(X)+H(Y)$$

with equality if and only if $X$ and $Y$ are independent, that is, $P(x_i,y_j)=P(x_i)P(y_j)$ for all $i,j$.

Proof:

By Gibbs' inequality, with the products $P(x_i)P(y_j)$ playing the role of the $q_k$'s for the pairs $(x_i,y_j)$,

$$H(X,Y)\;=\;-\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i,y_j)\;\leq \;-\sum_{i,j}P(x_i,y_j)\log_{2}\bigl( P(x_i)P(y_j)\bigr)$$

with equality if and only if $P(x_i,y_j)=P(x_i)P(y_j)$ for all $i,j$, and thus

$$-\sum_{i,j}P(x_i,y_j)\log_{2}\bigl( P(x_i)P(y_j)\bigr) \;=\;-\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i)\;-\;\sum_{i,j}P(x_i,y_j)\log_{2}P(y_j)$$

and,

$$H(X,Y)\;\leq \;H(X)+H(Y)$$

since,

$$-\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i)\;=\;-\sum_{i}\Bigl( \sum_{j}P(x_i,y_j)\Bigr) \log_{2}P(x_i)$$

$$=\;-\sum_{i}P(x_i)\log_{2}P(x_i)\;=\;H(X)$$

and similarly for $H(Y)$.
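A quick numerical check (an added illustration): if $X$ is a fair bit and $Y=X$, then $H(X)=H(Y)=1$ but $H(X,Y)=1$, so $H(X,Y)=1<2=H(X)+H(Y)$, as expected since $X$ and $Y$ are certainly not independent; if instead $X$ and $Y$ are independent fair bits, all four pairs have probability $\tfrac{1}{4}$ and $H(X,Y)=2=H(X)+H(Y)$.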

_________________________________________________________________________________

Theorem (The Chain Rule):

$$H(X,Y)\;=\;H(X)+H(Y\mid X)\qquad \text{equivalently}\qquad H(X,Y)\;=\;H(Y)+H(X\mid Y)$$

Proof (the second version):

$$H(X,Y)\;=\;-\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i,y_j)$$

$$=\;-\sum_{i,j}P(x_i,y_j)\log_{2}\bigl( P(y_j)\,P(x_i\mid y_j)\bigr)$$

$$=\;-\sum_{i,j}P(x_i,y_j)\log_{2}P(y_j)\;-\;\sum_{i,j}P(x_i,y_j)\log_{2}P(x_i\mid y_j)$$

$$=\;H(Y)+H(X\mid Y),$$

where, as in the previous proof, the first sum collapses to $H(Y)$ because $\sum_{i}P(x_i,y_j)=P(y_j)$.
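Checking the second version on the small example distribution introduced earlier (an added illustration): there $H(Y)=1$ and $H(X\mid Y)=\tfrac{1}{2}$, while

$$H(X,Y)\;=\;-\tfrac{1}{2}\log_{2}\tfrac{1}{2}-\tfrac{1}{4}\log_{2}\tfrac{1}{4}-\tfrac{1}{4}\log_{2}\tfrac{1}{4}\;=\;\tfrac{3}{2},$$

which indeed equals $H(Y)+H(X\mid Y)=1+\tfrac{1}{2}$.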

_____________________________________________________________________________________

The Extreme Cases:

  1. $P(y_j\mid x_i)\;=\;1$, $i=j$

    $=\;0$, $i\neq j$

    No Noise,

    $H(Y\mid X)=0$: we learn nothing new when we learn what character was received, given that we already know what was transmitted.

    Thus $H(X,Y)=H(X)$, since $H(X,Y)=H(X)+H(Y\mid X)=H(X)+0=H(X)$.

  2. $P(y_j\mid x_i)\;=\;\dfrac{1}{m}$ for all $i$ and all $j=1,\ldots ,m$ (each of the $m$ possible received characters is equally likely, whatever was transmitted).

    All Noise,

    $H(X\mid Y)=H(X)$ and $H(X,Y)=H(X)+H(Y)$,

    since the Random Variables are independent. (A small numerical illustration of both extreme cases follows below.)
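A small numerical illustration of the two extreme cases (an added example, assuming a binary alphabet and a uniform input $P(x_1)=P(x_2)=\tfrac{1}{2}$): for the noiseless channel the joint distribution puts probability $\tfrac{1}{2}$ on each of the pairs $(x_1,y_1)$ and $(x_2,y_2)$, so $H(Y\mid X)=0$ and $H(X,Y)=H(X)=1$; for the all-noise channel with $P(y_j\mid x_i)=\tfrac{1}{2}$ every pair has probability $\tfrac{1}{4}$, so $H(X\mid Y)=H(X)=1$ and $H(X,Y)=2=H(X)+H(Y)$.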

___________________________________________________________________________

Exercise (Due March 5): Compute $H(X\mid Y)$ and $H(Y\mid X)$ for a Binary Symmetric Channel and input vector MATH

MATH
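As a possible starting point for the exercise (a sketch only, not the assigned solution: the crossover probability p and the input distribution Px below are placeholder values, since the actual input vector of the assignment is not reproduced above), $H(X\mid Y)$ and $H(Y\mid X)$ for a Binary Symmetric Channel can be computed numerically via the Chain Rule:

    import numpy as np

    def H(p):
        # Shannon entropy (in bits) of a probability vector, ignoring zero entries
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    p  = 0.1                     # placeholder crossover probability of the BSC
    Px = np.array([0.5, 0.5])    # placeholder input distribution (P(X=0), P(X=1))

    # BSC transition matrix: channel[i, j] = P(Y = j | X = i)
    channel = np.array([[1 - p, p],
                        [p, 1 - p]])

    # Joint distribution P(X = i, Y = j) and output marginal P(Y = j)
    Pxy = Px[:, None] * channel
    Py  = Pxy.sum(axis=0)

    H_XY        = H(Pxy.ravel())      # joint entropy H(X, Y)
    H_X_given_Y = H_XY - H(Py)        # Chain Rule: H(X | Y) = H(X, Y) - H(Y)
    H_Y_given_X = H_XY - H(Px)        # Chain Rule: H(Y | X) = H(X, Y) - H(X)

    print(H_X_given_Y, H_Y_given_X)

The actual input vector and crossover probability from the assignment would replace the placeholder values.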