Julius Caesar used a simple coding technique, now called the Caesar Shift in his private correspondence. He would encode a message by shifting the position of letters a fixed amount. For English with a shift of 4, the coding table would be:
Symbol | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
Code | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D |
Assuming somebody intercepting this message knows Caesar's technique, breaking the code amounts to trying the 26 possible shifts until a readable message appears. One might generalize the process by choosing some arbitrary permutation of the letters rather than a shift permutation. Now the code breaker is faced with the task of trying the 26! possible permutations.
However, the core problem with Caesar's code, even as modified, is that the received message contains too much Information. In particular, below is a full and somewhat accurate table of probablilities of letter occurance.
Symbol | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
Probability | 0.086 | 0.014 | 0.028 | 0.038 | 0.130 | 0.029 | 0.020 | 0.053 | 0.063 | 0.001 | 0.004 | 0.034 | 0.025 | 0.071 | 0.080 | 0.020 | 0.001 | 0.068 | 0.061 | 0.105 | 0.025 | 0.009 | 0.015 | 0.002 | 0.020 | 0.001 |
The Enigma Machine
Select 10 random numbers (n0,n1,...,n9) between 1 and 25. Let l1l2...lm be a code word. The corresponding Enigma Code word is
S(l1,n0)S(l2,n1)...S(lm,nm mod(10))
as we step through the string of letters we cycle through the shifts. To decode the encrypted string, again just reverse the direction of the shift.
Since, for example, two occurrences of the letter E. might be encoded with different shifts, a statistical analysis of encoded letters would not be an effective code breaking strategy