Vigenère cipher

From Wikipedia, the free encyclopedia

The Vigenère cipher is named for Blaise de Vigenère (pictured), although Giovan Batista Belaso had invented the cipher earlier. Vigenère did invent a stronger autokey cipher.

The Vigenère cipher is a method of encryption that uses a series of different Caesar ciphers based on the letters of a keyword. It is a simple form of polyalphabetic substitution.

The Vigenère cipher has been reinvented many times. The method was originally described by Giovan Batista Belaso in his 1553 book La cifra del. Sig. Giovan Batista Belaso; however, the scheme was later misattributed to Blaise de Vigenère in the 19th century, and is now widely known as the "Vigenère cipher".

This cipher is well known because, while it is easy to understand and implement, it often appears to beginners to be unbreakable; this earned it the moniker le chiffre indéchiffrable (French for 'the unbreakable cipher'). Consequently, many people have tried to implement obfuscation or encryption schemes that are essentially Vigenère ciphers, only to have them broken.[1]

History

The first polyalphabetic cipher, created by Leone Battista Alberti around 1467, used a metal cipher disc to switch between cipher alphabets. Alberti's system only switched alphabets after several words, and switches were indicated by writing the letter of the corresponding alphabet in the ciphertext. Later, in 1508, Johannes Trithemius, in his work Polygraphia, invented the tabula recta, a critical component of the Vigenère cipher. Trithemius, however, only provided a progressive, rigid and predictable system for switching between cipher alphabets.

What is now known as the Vigenère cipher was originally described by Giovan Batista Belaso in his 1553 book La cifra del. Sig. Giovan Batista Belaso. He built upon the tabula recta of Trithemius, but added a repeating "countersign" (a key) to switch cipher alphabets every letter.

Blaise de Vigenère published his description of a similar but stronger autokey cipher before the court of Henry III of France, in 1586. Later, in the 19th century, the invention of Belaso's cipher was misattributed to Vigenère. David Kahn in his book The Codebreakers lamented the misattribution by saying that history had "ignored this important contribution and instead named a regressive and elementary cipher for him [Vigenère] though he had nothing to do with it".[2]

A reproduction of the Confederacy's cipher disk. Only five originals are known to exist.

The Vigenère cipher gained a reputation for being exceptionally strong. Noted author and mathematician Charles Lutwidge Dodgson (Lewis Carroll) called the Vigenère cipher unbreakable in his 1868 piece "The Alphabet Cipher" in a children's magazine. In 1917, Scientific American described the Vigenère cipher as "impossible of translation".[3] This reputation was not deserved: Kasiski entirely broke the cipher in the 19th century, and some skilled cryptanalysts could occasionally break it as early as the 16th century.[2]

The Vigenère cipher is simple enough to be a field cipher if it is used in conjunction with cipher disks.[4] The Confederate States of America, for example, used a brass cipher disk to implement the Vigenère cipher during the American Civil War. The Confederacy's messages were far from secret, and the Union regularly cracked them. Throughout the war, the Confederate leadership primarily relied upon three keywords: "Manchester Bluff", "Complete Victory" and, as the war came to a close, "Come Retribution".[5]

Gilbert Vernam tried to repair the broken cipher (creating the Vernam-Vigenère cipher in 1918), but, no matter what he did, the cipher was still vulnerable to cryptanalysis. Vernam's work, however, eventually led to the one-time pad, a provably unbreakable cipher.

Description

The Vigenère square or Vigenère table, also known as the tabula recta, can be used for encryption and decryption.

In a Caesar cipher, each letter of the alphabet is shifted along some number of places; for example, in a Caesar cipher of shift 3, A would become D, B would become E and so on. The Vigenère cipher consists of several Caesar ciphers in sequence with different shift values.
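As a minimal sketch of the Caesar shift just described, the following Python function moves each letter a fixed number of places along the alphabet. The function name `caesar_shift` is illustrative only, and the sketch assumes uppercase A to Z input, leaving any other character unchanged.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift each uppercase letter A-Z forward by `shift` places, wrapping at Z."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # assumption: non-letters pass through unchanged
    return "".join(out)


print(caesar_shift("AB", 3))   # DE
print(caesar_shift("XYZ", 3))  # ABC
```

A negative shift undoes the encryption, since Python's `%` operator always returns a non-negative result for a positive modulus.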

To encipher, a table of alphabets can be used, termed a tabula recta, Vigenère square, or Vigenère table. It consists of the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the left compared to the previous alphabet, corresponding to the 26 possible Caesar ciphers. At different points in the encryption process, the cipher uses a different alphabet from one of the rows. The alphabet used at each point depends on a repeating keyword.
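The table described above can be generated in a few lines. This is only an illustrative sketch: the names `build_tabula_recta` and `lookup` are hypothetical, and rows and columns are assumed to be labelled A to Z in order.

```python
import string

ALPHABET = string.ascii_uppercase


def build_tabula_recta():
    """Return 26 rows; row r is the alphabet shifted cyclically left by r places."""
    return [ALPHABET[r:] + ALPHABET[:r] for r in range(26)]


def lookup(table, row_letter, column_letter):
    """Return the entry at the row and column labelled by the given letters."""
    return table[ALPHABET.index(row_letter)][ALPHABET.index(column_letter)]


table = build_tabula_recta()
print(table[0])                 # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(table[1])                 # BCDEFGHIJKLMNOPQRSTUVWXYZA
print(lookup(table, "E", "T"))  # X
```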

For example, suppose that the plaintext to be encrypted is:

ATTACKATDAWN

The person sending the message chooses a keyword and repeats it until it matches the length of the plaintext, for example, the keyword "LEMON":

LEMONLEMONLE

The first letter of the plaintext, A, is enciphered using the alphabet in row L, which is the first letter of the key. This is done by looking at the letter in row L and column A of the Vigenère square, namely L. Similarly, for the second letter of the plaintext, the second letter of the key is used; the letter at row E and column T is X. The rest of the plaintext is enciphered in a similar fashion:

Plaintext: ATTACKATDAWN
Key: LEMONLEMONLE
Ciphertext: LXFOPVEFRNHR

Decryption is performed by finding the position of the ciphertext letter in a row of the table, and then taking the label of the column in which it appears as the plaintext. For example, in row L, the ciphertext L appears in column A, which is taken as the first plaintext letter. The second letter is decrypted by looking up X in row E of the table; it appears in column T, which is taken as the plaintext letter.
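The table-based procedure can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that both message and keyword contain only the uppercase letters A to Z with no spaces.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase
# Row for key letter k: the alphabet shifted cyclically left so that it starts at k.
SQUARE = {k: ALPHABET[i:] + ALPHABET[:i] for i, k in enumerate(ALPHABET)}


def encrypt_with_square(plaintext, keyword):
    # Row = key letter, column = plaintext letter.
    return "".join(
        SQUARE[k][ALPHABET.index(p)] for p, k in zip(plaintext, cycle(keyword))
    )


def decrypt_with_square(ciphertext, keyword):
    # Find the ciphertext letter in the key letter's row;
    # the label of that column is the plaintext letter.
    return "".join(
        ALPHABET[SQUARE[k].index(c)] for c, k in zip(ciphertext, cycle(keyword))
    )


print(encrypt_with_square("ATTACKATDAWN", "LEMON"))  # LXFOPVEFRNHR
print(decrypt_with_square("LXFOPVEFRNHR", "LEMON"))  # ATTACKATDAWN
```

`itertools.cycle` repeats the keyword for as long as the message requires, which mirrors writing the keyword out repeatedly under the plaintext.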

Vigenère can also be viewed algebraically. If the letters A–Z are taken to be the numbers 0–25, and addition is performed modulo 26, then Vigenère encryption can be written as

C_i \equiv P_i + K_i \pmod{26}

and decryption as

P_i \equiv C_i - K_i \pmod{26}

where P_i, K_i and C_i are the i-th letters of the plaintext, the repeated key and the ciphertext respectively.
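The two congruences translate directly into code. The sketch below is one possible rendering, not a reference implementation; the function name `vigenere` and its `decrypt` flag are illustrative, and the input is assumed to consist only of uppercase A to Z.

```python
def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Apply C_i = P_i + K_i (mod 26), or P_i = C_i - K_i (mod 26) when decrypt=True."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        x = ord(ch) - ord("A")                          # letter as a number 0-25
        k = ord(keyword[i % len(keyword)]) - ord("A")   # repeating key letter
        out.append(chr((x + sign * k) % 26 + ord("A")))
    return "".join(out)


print(vigenere("ATTACKATDAWN", "LEMON"))                # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN
```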

Cryptanalysis

The Vigenère cipher is effective because it masks the characteristic letter frequencies of English plaintexts, but some patterns remain.

The strength of the Vigenère cipher, like that of all polyalphabetic ciphers, is its ability to stymie frequency analysis. Frequency analysis is the practice of decrypting a message by counting the frequency of ciphertext letters and matching those counts against the letter frequencies of normal text. For instance, if P occurred most often in a ciphertext whose plaintext is in English, one could suspect that P corresponded to E, because E is the most frequently used letter in English. Using the Vigenère cipher, E can be enciphered as any of several letters in the alphabet at different points in the message, thus defeating simple frequency analysis.

The critical weakness in the Vigenère cipher is the relatively short and repeated nature of its key. If a cryptanalyst discovers the key's length, then the ciphertext can be treated as a series of different Caesar ciphers, which individually are trivially broken. The Kasiski and Friedman tests help determine a ciphertext's key length.

Kasiski examination

For more details on this topic, see Kasiski examination.

Friedrich Kasiski published the first successful attack on the Vigenère cipher in 1863, but Charles Babbage had already developed the same test in 1854. Babbage was goaded into breaking the Vigenère cipher when John Hall Brock Thwaites submitted a "new" cipher to the Journal of the Society of the Arts. When Babbage showed that Thwaites' cipher was essentially just another recreation of the Vigenère cipher, Thwaites grew irritated and challenged Babbage to break his cipher. Babbage tried to get out of it, but eventually gave in and succeeded in decrypting a sample, which turned out to be the poem "The Vision of Sin" by Alfred Tennyson, encrypted with the keyword "Emily", the first name of Tennyson's wife.[6]

The Kasiski examination, also called the Kasiski test, takes advantage of the fact that certain common words like "the" will, by chance, be encrypted using the same key letters, leading to repeated groups in the ciphertext. For example, a message encrypted with the keyword ABCDEF might not encipher "crypto" the same way each time it appears in the plaintext:

Key:        ABCDEF AB CDEFA BCD EFABCDEFABCD
Plaintext:  CRYPTO IS SHORT FOR CRYPTOGRAPHY
Ciphertext: CSASXT IT UKSWT GQU GWYQVRKWAQJB

The encrypted text here will not have repeated sequences that correspond to repeated sequences in the plaintext. With a key of a different length, however, the repetition can show through, as in this example:

Key:        ABCDAB CD ABCDA BCD ABCDABCDABCD
Plaintext:  CRYPTO IS SHORT FOR CRYPTOGRAPHY
Ciphertext: CSASTP KV SIQUT GQU CSASTPIUAQJB

Here the Kasiski test is effective: ignoring spaces, the two occurrences of CSASTP begin 16 letters apart, a multiple of the key length 4. Longer messages make the test more accurate because they usually contain more repeated ciphertext segments. The following ciphertext has several repeated segments and allows a cryptanalyst to discover its key length:

Ciphertext: DYDUXRMHTVDVNQDQNWDYDUXRMHARTJGWNQD

The two occurrences of DYDUXRMH are 18 characters apart. Assuming that the repeated segments represent the same plaintext segments, this implies that the key is 18, 9, 6, 3, or 2 characters long. The two occurrences of NQD are 20 characters apart, so the key length could be 20, 10, 5, 4, or 2 characters. (All factors of the distance are possible key lengths; a key of length one is just a simple shift cipher, where cryptanalysis is much easier.) By taking the intersection of these sets, one can conclude that the key length is almost certainly 2.
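The factor-intersection reasoning above can be sketched in Python. The helper names below are hypothetical, and the sketch makes two simplifying assumptions: it only looks at repeated sequences of a fixed length (three letters by default), and it takes a strict intersection of the factor sets, which works for this small example but would be too brittle for real ciphertexts, where some repeats occur by coincidence.

```python
from collections import defaultdict


def repeated_sequences(ciphertext, length=3):
    """Map each substring of the given length that occurs more than once
    to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}


def factors(n):
    """All divisors of n greater than 1 (a key of length 1 is a plain shift cipher)."""
    return {d for d in range(2, n + 1) if n % d == 0}


def candidate_key_lengths(ciphertext, length=3):
    """Intersect the factor sets of the distances between consecutive repeats."""
    common = None
    for pos in repeated_sequences(ciphertext, length).values():
        for a, b in zip(pos, pos[1:]):
            f = factors(b - a)
            common = f if common is None else common & f
    return common or set()


ciphertext = "DYDUXRMHTVDVNQDQNWDYDUXRMHARTJGWNQD"
print(repeated_sequences(ciphertext)["NQD"])  # [12, 32]
print(candidate_key_lengths(ciphertext))      # {2}
```

A more forgiving variant would count how often each factor appears across all distances and rank the candidates, rather than requiring every distance to agree.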

Friedman test

The Friedman test (also known as the kappa test) was invented in 1925 by William F. Friedman. Friedman used the index of coincidence, the probability that two letters chosen at random from the ciphertext are the same, to break the cipher. Knowing that the probability that two randomly chosen letters of English text are the same is about 6.5%, while for letters drawn uniformly at random from a 26-letter alphabet it is 1/26, or about 3.8%, Friedman found that the key length is approximately equal to:

\frac{0.027\,n}{(n-1)\,I - 0.038\,n + 0.065}

where I (the index of coincidence) equals

I = \sum_{i=1}^{26} \frac{n_i(n_i - 1)}{n(n-1)}

Here n is the length of the text and n_1 through n_26 are the numbers of times each of the 26 letters occurs in it.

The test is, however, only an approximation whose accuracy increases with the size of the text. It would be necessary to try key lengths close to the test result. Despite this weakness, the Friedman test is more accurate than the Kasiski examination at determining key length.[7]
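The two formulas of the Friedman test can be computed as in the sketch below. The function names are illustrative; the constants 0.027, 0.038 and 0.065 are the ones given in the formula above, and the text is assumed to contain only the letters A to Z and to be at least two letters long.

```python
from collections import Counter


def index_of_coincidence(text):
    """I = sum over letters of n_i (n_i - 1), divided by n (n - 1)."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def friedman_estimate(n, ic):
    """Approximate key length from the text length n and the index of coincidence ic."""
    return 0.027 * n / ((n - 1) * ic - 0.038 * n + 0.065)


def estimate_key_length(ciphertext):
    return friedman_estimate(len(ciphertext), index_of_coincidence(ciphertext))


print(index_of_coincidence("AABB"))  # 4/12, about 0.333
```

As a sanity check on the formula, a text whose index of coincidence is exactly 0.065 (that of English) gives an estimate of 1, that is, a single Caesar shift. For short texts the denominator can be close to zero or negative, so the estimate is only meaningful for reasonably long ciphertexts.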

Frequency analysis

Once the length of the key is known, the ciphertext is split into sections with each section corresponding to a single letter in the key. Each portion of the divided ciphertext is equivalent to a single Caesar shift ciphertext. The letter shifts in the sections are determined by the letters in the key for the corresponding portions of the ciphertext. Using methods similar to those used to break the Caesar cipher, the letters in the key can be discovered. Once every letter in the key is known, the cryptanalyst can simply decrypt the ciphertext and reveal the plaintext.
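The splitting step and the per-section Caesar attack can be sketched as follows. This is one possible approach under stated assumptions, not the method of any particular source: each section is scored against approximate English letter frequencies with a chi-squared statistic, and the shift with the lowest score is taken as the key letter. The frequency table holds commonly quoted approximate percentages, the function names are hypothetical, and the input is assumed to be uppercase A to Z with at least one letter per section.

```python
from collections import Counter

# Approximate letter frequencies of English text, in percent (A through Z).
ENGLISH_FREQ = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
     6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07],
))


def split_columns(ciphertext, key_length):
    """Section j holds every key_length-th letter, starting at position j."""
    return [ciphertext[j::key_length] for j in range(key_length)]


def best_shift(column):
    """Return the shift (0-25) whose Caesar decryption looks most like English."""
    n = len(column)
    best, best_score = 0, float("inf")
    for shift in range(26):
        counts = Counter(chr((ord(c) - 65 - shift) % 26 + 65) for c in column)
        # Chi-squared distance between observed and expected letter counts.
        score = sum(
            (counts.get(letter, 0) - n * f / 100) ** 2 / (n * f / 100)
            for letter, f in ENGLISH_FREQ.items()
        )
        if score < best_score:
            best, best_score = shift, score
    return best


def recover_key(ciphertext, key_length):
    """Solve each section as a Caesar cipher and join the shifts into a keyword."""
    return "".join(
        chr(best_shift(col) + 65) for col in split_columns(ciphertext, key_length)
    )


print(split_columns("LXFOPVEFRNHR", 5))  # ['LVH', 'XER', 'FF', 'OR', 'PN']
```

With only two or three letters per section, as in the short example above, the statistics are unreliable; the approach needs enough ciphertext for each section to show a recognisable frequency profile.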

A variant of the Kasiski examination, known as Kerckhoffs' method, continues the analysis by deducing the keyword itself from the letter frequencies in the divided ciphertexts.[8]

Variants

The running key variant of the Vigenère cipher was also considered unbreakable at one time. This version uses as its key a block of text as long as the plaintext. Since the key is as long as the message, the Friedman and Kasiski tests no longer work (the key is not repeated). In 1920, Friedman was the first to discover this variant's weaknesses. The problem with the running key Vigenère cipher is that the cryptanalyst has statistical information about the key (assuming that the block of text is in English), and that information will be reflected in the ciphertext.

Vigenère actually invented a stronger cipher: an autokey cipher. The name "Vigenère cipher" became associated with a simpler polyalphabetic cipher instead. In fact, the two ciphers were often confused, and both were sometimes called "le chiffre indéchiffrable". Babbage actually broke the much stronger autokey cipher, while Kasiski is generally credited with the first published solution to the fixed-key polyalphabetic ciphers.

A simple variant is to encrypt using the Vigenère decryption method and to decrypt using Vigenère encryption. This method is sometimes referred to as "Variant Beaufort". It is different from the Beaufort cipher, created by Sir Francis Beaufort, which is similar to Vigenère but uses a slightly modified enciphering mechanism and tableau. The Beaufort cipher is a reciprocal cipher.
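In algebraic terms the two variants can be sketched as below. The Variant Beaufort rule follows directly from swapping Vigenère encryption and decryption. For the Beaufort cipher the sketch assumes the commonly given rule C = K − P (mod 26), which the text above does not spell out. The function names are illustrative, and input is assumed to be uppercase A to Z.

```python
from itertools import cycle

A = ord("A")


def _combine(text, keyword, op):
    """Apply op(letter, key_letter) modulo 26 along the text, repeating the keyword."""
    return "".join(
        chr(op(ord(t) - A, ord(k) - A) % 26 + A) for t, k in zip(text, cycle(keyword))
    )


def variant_beaufort_encrypt(plaintext, keyword):
    # Vigenère decryption used as encryption: C = P - K (mod 26)
    return _combine(plaintext, keyword, lambda p, k: p - k)


def variant_beaufort_decrypt(ciphertext, keyword):
    # Vigenère encryption used as decryption: P = C + K (mod 26)
    return _combine(ciphertext, keyword, lambda c, k: c + k)


def beaufort(text, keyword):
    # Assumed rule: C = K - P (mod 26). Applying it twice returns the input,
    # which is what makes the cipher reciprocal.
    return _combine(text, keyword, lambda x, k: k - x)


message = "ATTACKATDAWN"
print(beaufort(beaufort(message, "LEMON"), "LEMON"))  # ATTACKATDAWN
```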

Despite the Vigenère cipher's apparent strength, it never became widely used throughout Europe. The Gronsfeld cipher is a variant created by Count Gronsfeld which is identical to the Vigenère cipher, except that it uses just 10 different cipher alphabets (corresponding to the digits 0 to 9). The Gronsfeld cipher is strengthened because its key is not a word, but it is weakened because it has just 10 cipher alphabets. Gronsfeld's cipher did become widely used throughout Germany and Europe, despite its weaknesses.
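A Gronsfeld key can be sketched as a string of digits, each digit giving the shift for the corresponding letter. The function name `gronsfeld`, its `decrypt` flag and the sample key "314" are illustrative assumptions; input is assumed to be uppercase A to Z.

```python
from itertools import cycle


def gronsfeld(text: str, digits: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching key digit (0-9), repeating the digits."""
    sign = -1 if decrypt else 1
    return "".join(
        chr((ord(ch) - 65 + sign * int(d)) % 26 + 65)
        for ch, d in zip(text, cycle(digits))
    )


print(gronsfeld("ATTACK", "314"))                # DUXDDO
print(gronsfeld("DUXDDO", "314", decrypt=True))  # ATTACK
```

Because every shift lies between 0 and 9, each ciphertext letter is at most nine places after its plaintext letter, which illustrates why having only 10 cipher alphabets weakens the scheme.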

References

  1. Smith, Laurence D. (1943). “Substitution Ciphers”, Cryptography: The Science of Secret Writing. Dover Publications, 81. ISBN 048620247X.
  2. Kahn, David (1999). “On the Origin of a Species”, The Codebreakers: The Story of Secret Writing. Simon & Schuster. ISBN 0684831309.
  3. Knudsen, Lars R. (1998). “Block Ciphers - a survey”, in Bart Preneel and Vincent Rijmen: State of the Art in Applied Cryptography: Course on Computer Security and Industrial Cryptography, Leuven, Belgium, June 1997, Revised Lectures, 29. ISBN 3540654747.
  4. Codes, Ciphers, & Codebreaking (The Rise Of Field Ciphers)
  5. Kahn, David (1999). “Crises of the Union”, The Codebreakers: The Story of Secret Writing. Simon & Schuster, 217–221. ISBN 0684831309.
  6. Singh, Simon (1999). “Chapter 2: Le Chiffre Indéchiffrable”, The Code Book. Anchor Book, Random House, 63–78. ISBN 0-385-49532-3.
  7. van Tilborg, Henk C.A. (2005). Encyclopedia of Cryptography and Security, First edition. Springer, 115. ISBN 038723473X.
  8. Lab exercise: Vigenere, RSA, DES, and Authentication Protocols (PDF). CS 415: Computer and Network Security. Retrieved on 2006-11-10.

Sources

  • Beutelspacher, Albrecht (1994). “Chapter 2”, Cryptology, translation from German by J. Chris Fisher, 27-41. ISBN 0-88385-504-6.
  • Singh, Simon (1999). “Chapter 2: Le Chiffre Indéchiffrable”, The Code Book. Anchor Book, Random House. ISBN 0-385-49532-3.
  • Gaines, Helen Fouche (1939). “The Gronsfeld, Porta and Beaufort Ciphers”, Cryptanalysis a Study of Ciphers and Their Solutions. Dover Publications, 117-126. ISBN 0486200973.
