Reviewing, we considered a set of plaintext–probability pairs $\{(x_1,p_1),\ldots,(x_n,p_n)\}$; that is, the probability that the symbol $x_i$ occurs in a message is $p_i$. We also considered an instantaneous binary code $f$ for $\{x_1,\ldots,x_n\}$. Finally we set $\ell(f)=\sum_{k=1}^{n} p_k\,|f(x_k)|$, the expected length of an encoded symbol. One wants to know how to achieve the lower bound $H(p_1,\ldots,p_n)\le \ell(f)$ for $\ell(f)$.
In fact this is not a "hard" problem! There is a rather simple algorithm to generate a "best possible" code.
At the root of Huffman's algorithm is a simple idea: assign the longest binary string to the least probable symbol, because this reduces the expected length of a transmission.
That is: if $p_i > p_j$ and $|f(x_i)| > |f(x_j)|$, then $p_i\,|f(x_j)| + p_j\,|f(x_i)| < p_i\,|f(x_i)| + p_j\,|f(x_j)|$. Hence, writing $g$ for the code obtained by interchanging the codewords of $x_i$ and $x_j$, $\ell(g) = \sum_k p_k\,|g(x_k)| < \sum_k p_k\,|f(x_k)| = \ell(f)$, where the first summation is the result of interchanging $f(x_i)$ and $f(x_j)$ in the second.
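The exchange argument can be checked numerically. The probabilities and lengths below are illustrative values, not taken from the notes:

```python
# Numeric check of the exchange argument: if p_i > p_j but symbol i has the
# longer codeword, swapping the two codewords strictly reduces the expected
# length, since only these two terms of the sum change.
p_i, p_j = 0.4, 0.1    # p_i > p_j  (illustrative)
len_i, len_j = 3, 1    # |f(x_i)| > |f(x_j)|  (illustrative)

before = p_i * len_i + p_j * len_j  # contribution of the two terms before the swap
after = p_i * len_j + p_j * len_i   # contribution after interchanging the codewords
assert after < before
```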
The idea of the algorithm is to build a tree from the leaves back towards the root. At each stage we "combine" the two least probable symbols, in effect adding a new node to the tree.
symbol | codeword
a | 0
b | 10
c | 110
d | 1110
e | 1111
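The merging procedure just described can be sketched in Python. The probabilities below are illustrative, chosen so that the resulting codeword lengths match the table above:

```python
import heapq

def huffman_code(probs):
    """Build a binary prefix code by repeatedly merging the two least
    probable nodes, as in Huffman's algorithm."""
    # Each heap entry: (probability, tiebreaker, {symbol: codeword-so-far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # least probable node
        p2, _, c2 = heapq.heappop(heap)  # second least probable node
        # "Combine" the two nodes: prefix 0 on one subtree, 1 on the other.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Illustrative probabilities: each symbol is half as likely as the previous one.
code = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.0625, "e": 0.0625})
```

With these probabilities the codeword lengths come out as 1, 2, 3, 4, 4, matching the table above.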
______________________________________________________________________________________________________
Suppose we have a cryptosystem $(\mathcal{P},\mathcal{C},\mathcal{K},\mathcal{E},\mathcal{D})$ and an encrypted message $y$; the big question is: can we find the original plaintext message $x$?
Information Theory provides a general framework to study this and related questions. Unfortunately, the practical applications of this framework are somewhat limited, at least for our purposes. On the other hand, it is worthwhile to review this area briefly.
The first thing that must be done is to define "Conditional Entropy." That is, given events $X$ and $Y$, we want to consider $H(X \mid Y)$, the amount of uncertainty in $X$ after $Y$ is revealed. The formula defining $H(X \mid Y)$,
$$H(X \mid Y) = -\sum_{y}\Pr[y]\sum_{x}\Pr[x \mid y]\log_2 \Pr[x \mid y],$$
is a somewhat obvious extension of that for $H(X)$. From the Information Theoretic point of view, the formulation involves the three sets $\mathcal{P}$, $\mathcal{K}$, and $\mathcal{C}$ of plaintexts, keys, and ciphertexts.
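The definition of conditional entropy can be sketched directly from a joint distribution. The function name and the dictionary representation of the distribution are illustrative choices, not from the notes:

```python
import math

def cond_entropy(joint):
    """H(X|Y) = -sum_y Pr[y] sum_x Pr[x|y] log2 Pr[x|y],
    computed from a joint distribution {(x, y): Pr[x, y]}."""
    # Marginal distribution of Y.
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # Note -sum_{x,y} Pr[x,y] log2 Pr[x|y] equals the double sum above.
    h = 0.0
    for (x, y), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_y[y])
    return h
```

For example, if $X$ and $Y$ are independent fair bits then $H(X \mid Y) = H(X) = 1$, while if $Y$ determines $X$ then $H(X \mid Y) = 0$.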
For example, we have the following definition.

Definition: We say that a cryptosystem has perfect secrecy if $H(P \mid C) = H(P)$, where $P$ and $C$ are the random variables giving the plaintext and the ciphertext. In simple terms, the amount of uncertainty about a message is the same whether or not we know the encrypted message. In order for this definition to make sense we need to remember that for a message $x$ to be the plaintext of an encrypted message $y$ there needs to be some key $K$ with $e_K(x) = y$.
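As a sketch of the definition, we can check perfect secrecy for a one-bit "one-time pad," assuming the plaintext and the key are independent fair bits (an illustrative setup, not one from the notes):

```python
import itertools
import math

# One-bit one-time pad: plaintext x and key k are independent fair bits,
# and the ciphertext is y = x XOR k.
joint = {}  # joint distribution of (plaintext, ciphertext)
for x, k in itertools.product([0, 1], repeat=2):
    y = x ^ k
    joint[(x, y)] = joint.get((x, y), 0.0) + 0.25  # Pr[x] * Pr[k] = 1/2 * 1/2

# Marginal distribution of the ciphertext.
p_c = {}
for (x, y), p in joint.items():
    p_c[y] = p_c.get(y, 0.0) + p

# H(P | C) = -sum_{x,y} Pr[x, y] log2 Pr[x | y]
h_p_given_c = -sum(p * math.log2(p / p_c[y]) for (x, y), p in joint.items())
h_p = 1.0  # entropy of a fair plaintext bit

assert abs(h_p_given_c - h_p) < 1e-9  # perfect secrecy: H(P | C) = H(P)
```

Seeing the ciphertext leaves the full one bit of uncertainty about the plaintext, so this system satisfies the definition.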
Examples:
Questions:
What about SHIFT and SUBSTITUTION Ciphers?
What about RSA?
Consider $H(K \mid C)$; remember that if I have an encrypted message and the key that was used, then I can decode the message. Is there a relationship between $H(K \mid C)$ and $H(P \mid C)$?
In all the above we are more or less assuming that plaintext messages are equiprobable. What role does the fact that we are dealing with natural languages play in this?