RC4

From Wikipedia, the free encyclopedia

For the Vietnam road named RC4, see Route Coloniale 4.

In cryptography, RC4 (also known as ARC4 or ARCFOUR) is the most widely-used software stream cipher and is used in popular protocols such as Secure Sockets Layer (SSL) (to protect Internet traffic) and WEP (to secure wireless networks). While remarkable in its simplicity, RC4 falls short of the high standards of security set by cryptographers, and some ways of using RC4 can lead to very insecure cryptosystems (including WEP). It is not recommended for use in new systems. However, some systems based on RC4 are secure enough for practical use.

Contents

[edit] History

RC4 was designed by Ron Rivest of RSA Security in 1987; while it is officially termed "Rivest Cipher 4", the RC acronym is alternatively understood to stand for "Ron's Code" (see also RC2, RC5 and RC6).

RC4 was initially a trade secret, but in September 1994 a description of it was anonymously posted to the Cypherpunks mailing list. It was soon posted on the sci.crypt newsgroup, and from there to many sites on the Internet. Because the algorithm is known, it is no longer a trade secret. The name "RC4" is trademarked, however. The current status seems to be that "unofficial" implementations are legal, but cannot use the RC4 name. RC4 is often referred to as "ARCFOUR" or "ARC4" (meaning Alleged RC4, because RSA has never officially released the algorithm), to avoid possible trademark problems. It has become part of some commonly used encryption protocols and standards, including WEP and WPA for wireless cards and TLS.

The main factors which helped its deployment over such a wide range of applications consisted in its impressive speed and simplicity. Implementations in both software and hardware are very easy to develop.

[edit] Description

RC4 generates a pseudorandom stream of bits (a keystream) which, for encryption, is combined with the plaintext using XOR; decryption is performed the same way. (This is similar to the Vernam cipher except that pseudorandom bits, rather than random bits, are used.) To generate the keystream, the cipher makes use of a secret internal state which consists of two parts:

  1. A permutation of all 256 possible bytes (denoted "S" below).
  2. Two 8-bit index-pointers (denoted "i" and "j").

The permutation is initialised with a variable length key, typically between 40 and 256 bits, using the key-scheduling algorithm (KSA). Once this has been completed, the stream of bits is generated using the pseudo-random generation algorithm (PRGA).

[edit] The key-scheduling algorithm (KSA)

The key-scheduling algorithm is used to initialise the permutation in the array "S". "keylength" is defined as the number of bytes in the key and can be in the range 1 ≤ keylength ≤ 256, typically between 5 and 16, corresponding to a key length of 40 – 128 bits. First, the array "S" is initialised to the identity permutation. S is then processed for 256 iterations in a similar way to the main PRGA algorithm, but also mixes in bytes of the key at the same time.

for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap(S[i],S[j])
endfor

[edit] The pseudo-random generation algorithm (PRGA)

The lookup stage of RC4. The output byte is selected by looking up the values of S(i) and S(j), adding them together modulo 256, and then looking up the sum in S; S(S(i) + S(j)) is used as a byte of the key stream, K.
The lookup stage of RC4. The output byte is selected by looking up the values of S(i) and S(j), adding them together modulo 256, and then looking up the sum in S; S(S(i) + S(j)) is used as a byte of the key stream, K.

For as many iterations as are needed, the PRGA modifies the state and outputs a byte of the keystream. In each iteration, the PRGA increments i, adds the value of S pointed to by i to j, exchanges the values of S[i] and S[j], and then outputs the value of S at the location S[i] + S[j] (modulo 256). Each value of S is swapped at least once every 256 iterations.

i := 0
j := 0
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap(S[i],S[j])
    output S[(S[i] + S[j]) mod 256]
endwhile

[edit] Implementation

Many stream ciphers are based on linear feedback shift registers (LFSRs), which while efficient in hardware are less so in software. The design of RC4 avoids the use of LFSRs, and is ideal for software implementation, as it requires only byte manipulations. It uses 256 bytes of memory for the state array, S[0] through S[255], k bytes of memory for the key, key[0] through key[k-1], and integer variables, i, j, and k. Performing a modulus 256 can be done with a bitwise AND with 255 (or on most platforms, simple addition of bytes ignoring overflow).

Here is a simple implementation in Python:

    class WikipediaARC4:
        def __init__(self, key = None):
            self.state = range(256) # Initialize state array with values 0 .. 255
            self.x = self.y = 0 # Our indexes. x, y instead of i, j
 
            if key is not None:
                self.init(key)
        
        # KSA
        def init(self, key):
            for i in range(256):
                self.x = (ord(key[i % len(key)]) + self.state[i] + self.x) & 0xFF
                self.state[i], self.state[self.x] = self.state[self.x], self.state[i]
            self.x = 0
            
        # PRGA
        def crypt(self, input):
            output = [None]*len(input)
            for i in xrange(len(input)):
                self.x = (self.x + 1) & 0xFF
                self.y = (self.state[self.x] + self.y) & 0xFF
                self.state[self.x], self.state[self.y] = self.state[self.y], self.state[self.x]
                output[i] = chr((ord(input[i]) ^ self.state[(self.state[self.x] + self.state[self.y]) & 0xFF]))
            return ''.join(output)
            
    if __name__ == '__main__':
        test_vectors = [['Key', 'Plaintext'], \
                        ['Wiki', 'pedia'], \
                        ['Secret', 'Attack at dawn']]
        for i in test_vectors:
            print WikipediaARC4(i[0]).crypt(i[1]).encode('hex').upper()

[edit] Test vectors

These test vectors are not official, but convenient for anyone testing their own RC4 program. The inputs are ASCII, the output is in hexadecimal.

RC4( "Key", "Plaintext" ) == BBF316E8D940AF0AD3

RC4( "Wiki", "pedia" ) == 1021BF0420

RC4( "Secret", "Attack at dawn" ) == 45A01F645FC35B383552544B9BF5

[edit] Security

RC4 falls short of the standards set by cryptographers for a secure cipher in several ways, and thus is not recommended for use in new applications.

The keystream generated by RC4 is slightly biased in favour of certain sequences of bytes. The best attack based on this bias is due to Fluhrer and McGrew, which will distinguish the keystream from a random stream given a gigabyte of output.

RC4 does not take a separate nonce alongside the key. Such a nonce is, in general, a necessary requirement for security, so that encrypting the same message twice produces a different ciphertext each time. One approach to addressing this is to generate a "fresh" RC4 key by hashing a long-term key with a nonce. However, many applications that use RC4 simply concatenate key and nonce; RC4's weak key schedule then gives rise to a variety of serious problems.

[edit] Fluhrer, Mantin and Shamir attack

In 2001 a new and surprising discovery was made by Fluhrer, Mantin and Shamir: over all possible RC4 keys, the statistics for the first few bytes of output keystream are strongly non-random, leaking information about the key. If the long-term key and nonce are simply concatenated to generate the RC4 key, this long-term key can be discovered by analysing large number of messages encrypted with this key. This and related effects were then used to break the WEP ("wired equivalent privacy") encryption used with 802.11 wireless networks. This caused a scramble for a standards-based replacement for WEP in the 802.11 market, and led to the IEEE 802.11i effort and WPA.

Cryptosystems can defend against this attack by discarding the initial portion of the keystream (say the first 1024 bytes) before using it.

[edit] Combinatorial problem

A combinatorial problem related to the number of inputs and outputs of the RC4 cipher was first posed by Itsik Mantin and Adi Shamir in 2001, whereby, of the total 256 elements in the typical state of RC4, if x number of elements (x ≤ 256) are only known (all other elements can be assumed empty), then the maximum number of elements that can be produced deteministically is also x in the next 256 rounds. This conjecture was put to rest in 2004 with a formal proof given by Souradyuti Paul and Bart Preneel.

[edit] RC4-based cryptosystems

Where a cryptosystem is marked with "(optionally)", RC4 is one of several ciphers the system can be configured to use.

[edit] See also

  • TEA and XTEA - A family of block ciphers that like RC4 are designed to be very simple to implement.


[edit] References

  • Scott R. Fluhrer, Itsik Mantin and Adi Shamir, Weaknesses in the Key Scheduling Algorithm of RC4. Selected Areas in Cryptography 2001, pp1 – 24 (PS).
  • Scott R. Fluhrer and David A. McGrew, Statistical Analysis of the Alleged RC4 Keystream Generator. FSE 2000, pp19 – 30 (PDF).
  • Jovan Dj. Golic, Iterative Probabilistic Cryptanalysis of RC4 Keystream Generator. ACISP 2000, pp220 – 233
  • Jovan Dj. Golic, Linear Statistical Weakness of Alleged RC4 Keystream Generator. EUROCRYPT 1997, pp226 – 238 (PDF).
  • Lars R. Knudsen, Willi Meier, Bart Preneel, Vincent Rijmen and Sven Verdoolaege, Analysis Methods for (Alleged) RC4. ASIACRYPT 1998, pp327 – 341 (PS).
  • Itsik Mantin and Adi Shamir, A Practical Attack on Broadcast RC4. FSE 2001, pp152 – 164 (PS).
  • Serge Mister and Stafford E. Tavares, Cryptanalysis of RC4-like Ciphers. Selected Areas in Cryptography 1998, pp131 – 143
  • Ilya Mironov, (Not So) Random Shuffles of RC4. CRYPTO 2002, pp304 – 319
  • Souradyuti Paul and Bart Preneel, Analysis of Non-fortuitous Predictive States of the RC4 Keystream Generator. INDOCRYPT 2003, pp52 – 67 (PDF).
  • Souradyuti Paul and Bart Preneel, A New Weakness in the RC4 Keystream Generator and an Approach to Improve the Security of the Cipher. Fast Software Encryption - FSE 2004, pp245 – 259 (PDF).

[edit] External links

RC4

RC4 in WEP

Stream ciphers
v d e
Algorithms: A5/1 | A5/2 | E0 | FISH | Grain | HC-256 | ISAAC | LILI-128 | MUGI | Panama | Phelix | Pike | Py | Rabbit | RC4 | Salsa20 | Scream | SEAL | SOBER | SOBER-128 | SOSEMANUK | Trivium | VEST | WAKE
Theory: Shift register | LFSR | NLFSR | Shrinking generator | T-function | IV
Standardization: eSTREAM
Cryptography
v d e
History of cryptography | Cryptanalysis | Cryptography portal | Topics in cryptography
Symmetric-key algorithm | Block cipher | Stream cipher | Public-key cryptography | Cryptographic hash function | Message authentication code | Random numbers