Hill cipher

From Wikipedia, the free encyclopedia

Hill's cipher machine, from figure 4 of the patent
Enlarge
Hill's cipher machine, from figure 4 of the patent

In classical cryptography, the Hill cipher is a polygraphic substitution cipher based on linear algebra. Invented by Lester S. Hill in 1929, it was the first polygraphic cipher in which it was practical (though barely) to operate on more than three symbols at once. The following discussion assumes an elementary knowledge of matrix theory.

Contents

[edit] Operation

Each letter is treated as a digit in base 26: A = 0, B =1, and so on. A block of n letters is then considered as a vector of n dimensions, and multiplied by a n × n matrix, modulo 26. The components of the matrix are the key, and should be random provided that the matrix is invertible in \mathbb{Z}_{26}^n (to ensure decryption is possible). Consider the message 'ACT', and the key below (or GYBNQKURP in letters):

\begin{pmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{pmatrix}

Since 'A' is 0, 'C' is 2 and 'T' is 19, the message is the vector:

\begin{pmatrix} 0 \\ 2 \\ 19 \end{pmatrix}

Thus the enciphered vector is given by:

\begin{pmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{pmatrix} \begin{pmatrix} 0 \\ 2 \\ 19 \end{pmatrix} \equiv \begin{pmatrix} 67 \\ 222 \\ 319 \end{pmatrix} \equiv \begin{pmatrix} 15 \\ 14 \\ 7 \end{pmatrix} \mod 26

which corresponds to a ciphertext of 'POH'. Now, suppose that our message is instead 'CAT', or:

\begin{pmatrix} 2 \\ 0 \\ 19 \end{pmatrix}

This time, the enciphered vector is given by:

\begin{pmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{pmatrix} \begin{pmatrix} 2 \\ 0 \\ 19 \end{pmatrix} \equiv \begin{pmatrix} 31 \\ 216 \\ 325 \end{pmatrix} \equiv \begin{pmatrix} 5 \\ 8 \\ 13 \end{pmatrix} \mod 26

which corresponds to a ciphertext of 'FIN'. Every letter has changed. The Hill cipher has achieved Shannon's diffusion, and an n-dimensional Hill cipher can diffuse fully across n symbols at once.

[edit] Decryption

In order to decrypt, we turn the ciphertext back into a vector, then simply multiply by the inverse matrix of the key matrix (IFKVIVVMI in letters). (There are standard methods to calculate the inverse matrix; see matrix inversion for details.) We find that in \mathbb{Z}_{26}^n the inverse matrix of the one in the previous example is:

\begin{pmatrix} 8 & 5 & 10 \\ 21 & 8 & 21 \\ 21 & 12 & 8 \end{pmatrix}

Taking the previous example ciphertext of 'POH', we get:

\begin{pmatrix} 8 & 5 & 10 \\ 21 & 8 & 21 \\ 21 & 12 & 8 \end{pmatrix} \begin{pmatrix} 15 \\ 14 \\ 7 \end{pmatrix} \equiv \begin{pmatrix} 260 \\ 574 \\ 539 \end{pmatrix} \equiv \begin{pmatrix} 0 \\ 2 \\ 19 \end{pmatrix} \mod 26

which gets us back to 'ACT', just as we hoped.

Two complications that we have glossed over, are that not all matrices have an inverse (see invertible matrix). There is a (relatively) straightforward way to find this out, though. If the determinant of the matrix is 0, or has common factors with the modulus (i.e. factors of 2 or 13, in the case of modulus 26), then the matrix cannot be used in the Hill cipher; discard it and try another one. Fortunately, unless the basis has small factors, most matrices will have an inverse. Alas, because 2 is one of the factors of 26, quite a few matrices modulo 26 will not work. For our example key matrix:

\begin{vmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{vmatrix} \equiv 6(16\cdot15-10\cdot17)-24(13\cdot15-10\cdot20)+1(13\cdot17-16\cdot20) \equiv 441 \equiv 25 \mod 26

25 is coprime with 26, so no problems. The risk of the determinant having common factors with the modulus can be eliminated by making the modulus prime. Consequently a useful variant of the Hill cipher adds 3 extra symbols to increase the modulus to 29.

[edit] Security

Unfortunately, the basic Hill cipher is vulnerable to a known-plaintext attack because it is completely linear. An opponent who intercepts n2 plaintext/ciphertext character pairs can set up a linear system which can (usually) be easily solved; if it happens that this system is indeterminate, it is only necessary to add a few more plaintext/ciphertext pairs. Calculating this solution by standard linear algebra algorithms then takes very little time.

The security could be greatly enhanced by combining with some non-linear step to defeat this attack. (Perhaps the simplest way to nonlinearise would be to use two different mixed alphabets when converting text to and from numerical vectors.) The combination of wider and wider weak, linear diffusive steps like a Hill cipher, with non-linear substitution steps, ultimately leads to a substitution-permutation network (e.g., a Feistel cipher).

[edit] Key size

One might naïvely think that the key size, in bits, is n2log226 or about 4.7n2. In fact, it is slightly less than this because not all randomly selected matrices are usable. A slightly less naïve view might guess that 1/2 + 1/26 of candidate keys would be unusable, reducing the keyspace by about 54%. In fact, determinants are not uniformly distributed, and the key space reduction is closer to 70%. Additionally it seems to be prudent to avoid too many zeroes in the key matrix, since they reduce diffusion. The net effect is that the effective keyspace of a basic Hill cipher is about 4.64n2 − 1.7. For a 5 × 5 Hill cipher, that is about 114 bits. Of course, key search is not the most efficient known attack.

[edit] Mechanical implementation

When operating on 2 symbols at once, a Hill cipher offers no particular advantage over Playfair or the bifid cipher, and in fact is weaker than either, and slightly more laborious to operate by pencil-and-paper. As the dimension increases, the cipher rapidly becomes infeasible for a human to operate by hand. But astonishingly, a Hill cipher of dimension 6 was once implemented mechanically! Hill and a partner were awarded a patent (U.S. Patent 1,845,947 ) for this device, which performed a 6 × 6 matrix multiplication modulo 26 using a system of gears and chains. Unfortunately the gearing arrangements (and thus the key) were fixed for any given machine, so triple encryption was recommended for security: a secret nonlinear step, followed by the wide diffusive step from the machine, followed by a third secret nonlinear step. Such a combination was actually very powerful for 1929, and indicates that Hill apparently understood the concepts of a meet-in-the-middle attack as well as confusion and diffusion. Unfortunately, his machine did not sell.

[edit] See also

Other practical "pencil-and-paper" polygraphic ciphers include:

[edit] References

  • http://ardsrk.blogspot.com Provides excellent explanation on computing matrix inverses with regard to the hill cipher
  • Lester S. Hill, Cryptography in an Algebraic Alphabet, The American Mathematical Monthly 36, June-July 1929, pp306–312.
  • Lester S. Hill, Concerning Certain Linear Transformation Apparatus of Cryptography, The American Mathematical Monthly 38, 1931, pp135–154.
  • Jeffrey Overbey, William Traves, and Jerzy Wojdylo, On the Keyspace of the Hill Cipher, Cryptologia, 29(1), January 2005, pp59–72. (PDF)
  • Shahrokh Saeednia, How to Make the Hill Cipher Secure, Cryptologia, 24(4), October 2000, pp353–360.
Classical cryptography
v  d  e
Rotor machines: CCM | Enigma | Fialka | Hebern | HX-63 | KL-7 | Lacida | M-325 | Mercury | NEMA | OMI | Portex | SIGABA | SIGCUM | Singlet | Typex
Ciphers: ADFGVX | Affine | Alberti | Atbash | Autokey | Bifid | Book | Caesar | Four-square | Hill | Keyword | Nihilist | Permutation | Pigpen | Playfair | Polyalphabetic | Polybius | Rail Fence | Reihenschieber | Reservehandverfahren | ROT13 | Running key | Scytale | Solitaire | Straddling checkerboard | Substitution | Tap Code | Transposition | Trifid | Two-square | VIC cipher | Vigenère
Cryptanalysis: Frequency analysis | Index of coincidence
Misc: Cryptogram | Bacon | Polybius square | Scytale | Straddling checkerboard | Tabula recta
Cryptography
v  d  e
History of cryptography | Cryptanalysis | Cryptography portal | Topics in cryptography
Symmetric-key algorithm | Block cipher | Stream cipher | Public-key cryptography | Cryptographic hash function | Message authentication code | Random numbers
In other languages