Talk:RC4
From Wikipedia, the free encyclopedia
- This is a consolidation of two talk pages, resulting from a move. Neither section is an archive, threads in both sections can continue here, and the page history will make it obvious what has occured if there's any doubt... not that I can think of any reason you'd want to know anyway... Andrewa 19:32, 19 Sep 2004 (UTC)
[edit] Derivative / copyvio from Schneier?
The second and third paragraphs plus the code fragment look a lot like part of the writeup by Bruce Schneier in Applied_Cryptography 2nd ed. Did Bruce donate this part, or give permission? (anon -- 10:59, 11 May 2003)
- I suspect that even if the code fragments were copied wholesale, they'd be legitimate as "scenes a faire". (See Online service provider law#Sometimes there are only a few ways to do something). As for the rest, I don't have my copy near this machine, but I'll compare them tonight and report back. Securiger 07:05, 30 May 2004 (UTC)
- Ah, just checked the page history and saw that comment was made a year ago. Presumably been changed since then, but at any rate, there is very little similarity to Schneier. The code fragments are somewhat similar, but - given the essential simplicity of the algorithm - about as different as actually possible. Securiger 00:36, 1 Jun 2004 (UTC)
-
- Yep, I agree. I've also had a look at the page as it was back then (http://en.wikipedia.org/w/wiki.phtml?title=RC4_%28cipher%29&oldid=1107229) and it seems to be quite different from Applied Cryptography. You might — possibly — be able to make a case that parts of the second paragraph were written based on information in the second paragraph of Schneier's description (Ch 17.1, p 397), but this is (of course) perfectly legitimate. — Matt 11:19, 1 Jun 2004 (UTC)
Just a little thing I noticed, paragraph three says "n is defined as the number of bytes in the key and can be in the range 1 ≤ n ≤ 255, though for common applications such as WEP, n = 64 or 128 is common", but isn't WEP a 64/128 bit algorithm, not byte? Plasma 14:59, 17 Sep 2004 (UTC)
- Yep, well spotted! I think the case of WEP is complicated by the fact that "128-bit WEP" is effectively 104 bits, and "64-bit WEP" is effectively 40 bits, so I've removed the mention of WEP as an example. — Matt 15:17, 17 Sep 2004 (UTC)
Some threads below copied by Andrewa 19:19, 19 Sep 2004 (UTC) to allow page move:
[edit] Let's put the encryption article here
Any objections to moving RC4 (cipher) to RC4 (and including a disambiguation header) since the encryption algorithm is the meaning that most people will be looking for when coming here? Evidence: try a Google test for RC4; note that Route Coloniale 4 is an orphan, but lots of pages reference the cipher. — Matt 21:38, 12 Sep 2004 (UTC)
- Yeah, I do object, because that would necessitate the creation of a page called RC4 (disambiguation). Personally, I hate (disambiguation) pages - they are klunky, inelegant kludges - and since RC4 (cipher) is already in place, and there aren't a lot of pages pointing to RC4 at the moment, I don't see the need to change it. Kevyn 15:01, 13 Sep 2004 (UTC)
- Hmm, you may have misunderstood...this page is currently a disambiguation page; my proposal is to eliminate the disambiguation page. I suggest we move the page about the cipher to RC4. At the top of the RC4 page would be a link saying "This page is about the encryption algorithm. For the Vietnam road named RC4, see Route Coloniale 4". This is appropriate because the the crypto use is overwhelmingly predominant; I think my solution is the standard Wikipedia approach in this situation. It is an improvement in useability, because most of the time when people type "RC4" into the search box, they are looking for the cipher -- this change will mean one less click for them. — Matt 17:13, 14 Sep 2004 (UTC)
- I understood, I have no problem with RC4 being a disambig page. Changing it from a disambig into a standard page, when there is more than one definition for RC4, that is what I object to, because it would either require the creation of a RC4 (disambiguation) page, or a disambiguation paragraph at the top of the page (what you suggest). I'm not arguing that the RC4 encryption difinition isn't the most used. But if someone searching for "RC4" is looking for the lesser definition, Route Colonial 4, then the Wikipedia standard (which I disagree with) will mean one more click they have to go through to find it, which I believe reduces useability. Kevyn 02:36, 15 Sep 2004 (UTC)
- But that is simply not true -- currently, the (rare) user searching for Route Coloniale 4 by entering "RC4" into the search bar will be taken to this current page -- a disambiguation page. He or she would then have to click on Route Coloniale 4; that is one click. After my suggestion, entering "RC4" into the search bar will take the user to the "crypto page with header". He or she would then have to click on "Route Coloniale 4" to get to the page they want — again, one click. So the user looking for Route Coloniale 4 only needs one click in each instance. The (much more common) user looking for the cipher is saved a click. Is my reasoning at fault? — Matt 02:53, 15 Sep 2004 (UTC)
- Doh! You are correct, what you're proposing (a disambig paragraph on the page) would not add additional links. I was thinking in terms of the creation of an RC4 (disambiguation) page, but I was not being clear. My apologies. That being said, I dislike disambiguation paragraphs only slightly less than (disambiguation) pages, but then I'm a purist about these things. My basic point is, I object to primary definitions monopolizing the namespace in the case of multiple definitions - I prefer to see no definition gets primacy over any other, and in all cases where there are multiple definitions, a disambiguation page should reside at the shared name, period. But, Wikipedia policy states otherwise, and I know I'm in the minority in this opinon, so there you have it. I object, but probably to no avail. Kevyn 23:33, 16 Sep 2004 (UTC)
- But that is simply not true -- currently, the (rare) user searching for Route Coloniale 4 by entering "RC4" into the search bar will be taken to this current page -- a disambiguation page. He or she would then have to click on Route Coloniale 4; that is one click. After my suggestion, entering "RC4" into the search bar will take the user to the "crypto page with header". He or she would then have to click on "Route Coloniale 4" to get to the page they want — again, one click. So the user looking for Route Coloniale 4 only needs one click in each instance. The (much more common) user looking for the cipher is saved a click. Is my reasoning at fault? — Matt 02:53, 15 Sep 2004 (UTC)
- I understood, I have no problem with RC4 being a disambig page. Changing it from a disambig into a standard page, when there is more than one definition for RC4, that is what I object to, because it would either require the creation of a RC4 (disambiguation) page, or a disambiguation paragraph at the top of the page (what you suggest). I'm not arguing that the RC4 encryption difinition isn't the most used. But if someone searching for "RC4" is looking for the lesser definition, Route Colonial 4, then the Wikipedia standard (which I disagree with) will mean one more click they have to go through to find it, which I believe reduces useability. Kevyn 02:36, 15 Sep 2004 (UTC)
- Hmm, you may have misunderstood...this page is currently a disambiguation page; my proposal is to eliminate the disambiguation page. I suggest we move the page about the cipher to RC4. At the top of the RC4 page would be a link saying "This page is about the encryption algorithm. For the Vietnam road named RC4, see Route Coloniale 4". This is appropriate because the the crypto use is overwhelmingly predominant; I think my solution is the standard Wikipedia approach in this situation. It is an improvement in useability, because most of the time when people type "RC4" into the search box, they are looking for the cipher -- this change will mean one less click for them. — Matt 17:13, 14 Sep 2004 (UTC)
[edit] Moved, some cleaning up still to do.
IMO the redirect at RC4 (cipher) and its talk page are both now redundant, and should be listed on RfD as soon as the links through them are tidied up. I've started this but help welcome. Andrewa 19:40, 19 Sep 2004 (UTC)
[edit] Page move request
Could an admin please move RC4 (cipher) to RC4? "Primary topic" disambiguation is more appropriate here than "equal" disambiguation. — Matt 15:40, 19 Sep 2004 (UTC)
- Will do. The history of the current RC4 although extensive is all disambiguation page maintenance, and doesn't need preservation. Andrewa 19:11, 19 Sep 2004 (UTC)
- Done. Discussion on both pages seemed to have reached consensus that this was the best thing, and has been consolidated at talk:RC4. Any problems comment on my talk page for the fastest response from me. Andrewa 19:27, 19 Sep 2004 (UTC)
(Deleted comment describing mistake which is now fixed - ciphergoth 22:39, 2004 Nov 22 (UTC))
[edit] POV?
I've written "RC4 is not recommended for use in new applications".
I guess that could sound like I'm introducing POV into the article. However, it seems to me that this is the concensus among cryptographers. RC4 is used because it's simple and famous, not because it's recommended by anybody. The breaks in RC4 are serious enough that they would consign any new cipher to the dustbin of history; my and Stefan Lucks's attacks on Leviathan are much less serious than the known attacks on RC4, but they were enough to have Leviathan immediately dismissed from consideration as a NESSIE candidate cipher.
I think there's a general puzzle with how to impart the information that will maximise the chances of future cryptosystems being secure without seeming to introduce POV into the crypto pages, and I welcome advice (and of course, edits!) that address it. — ciphergoth 09:43, 2004 Nov 22 (UTC)
- Yeah, I think this is a difficult problem; how to leave a reader with the right impression without prescribing exactly what he or she should do or think. I guess we have a set of facts that we can include:
- RC4 is practically insecure if used in certain ways.
- Avoiding the known problems, you can use RC4 in a way in which there are no known practical attacks.
- RC4 has some other academic, theoretical weaknesses, no matter how you use it.
- Because of the weaknesses, most cryptographers would rule it out of consideration as a primitive, and plump for something like AES in new systems.
- These facts are NPOV, yet if a reader reads them, they'd be quite foolish to then go and implement RC4 in a new application. — Matt 19:34, 22 Nov 2004 (UTC)
-
- I'm not sure it's that easy :-) I considered that wording, but isn't point 4 a weasel term? What's the difference between saying "it's not recommended" and "experts don't recommend it"? I think that on Wikipedia, the former has to be a synonym for the latter. "X is used to treat Y but not Z" would be a fair comment on a drug, and would be read in the same way.
- Hmm..yeah, tricky, I do think there's a small difference between "it is not recommended" and "experts don't recommend it" — at least to me, "it is recommended" is an idiom with the connotation that the author is relating his own opinion, rather a plain passive sense. But to the main issue, perhaps we could point out (1) the general observation that ciphers with even theoretical weaknesses are considered flawed by the academic community, and (2) other ciphers exist with no known weaknesses, such as AES; does that look weasel-free? — Matt 23:22, 22 Nov 2004 (UTC)
- Probably the best that can be done. Your reading of "it is recommended" argues strongly for the other way of putting it - I hadn't thought of that. ciphergoth 00:00, 2004 Nov 23 (UTC)
- Hmm..yeah, tricky, I do think there's a small difference between "it is not recommended" and "experts don't recommend it" — at least to me, "it is recommended" is an idiom with the connotation that the author is relating his own opinion, rather a plain passive sense. But to the main issue, perhaps we could point out (1) the general observation that ciphers with even theoretical weaknesses are considered flawed by the academic community, and (2) other ciphers exist with no known weaknesses, such as AES; does that look weasel-free? — Matt 23:22, 22 Nov 2004 (UTC)
- I'm not sure it's that easy :-) I considered that wording, but isn't point 4 a weasel term? What's the difference between saying "it's not recommended" and "experts don't recommend it"? I think that on Wikipedia, the former has to be a synonym for the latter. "X is used to treat Y but not Z" would be a fair comment on a drug, and would be read in the same way.
[edit] Put back mention of CipherSaber-2?
Correct me if I'm wrong, but aren't the problems in RC4 addressed by adding the trivial modification to the KSA (repeated mixing) proposed by CipherSaber-2? I pointed that out in a revision a long time ago, and I just noticed that ciphergoth removed it (also a while ago). I think his motivation in removing it was because he thought it was off-topic, but I disagree. The reason I put that mention of CipherSaber-2 was not just to say, "hey CipherSaber-2 is based on RC4". It was to counteract the suggestion of the artcile that RC4 was "broken". It may be "broken" in its original form, but CipherSaber-2 is still basically RC4 and as far as I know, it is still considered secure. I would like to put some mention of this important point back into the article. &mdash User:Tzadikv, 03:06, December 15, 2005
- Trivial point first - please sign things you put on the talk page. You can do this simply by writing ~~~~ at the end - Wikipedia will substitute your user name and the date. Thanks.
- CipherSaber-2 is not RC4. It is very closely related, but in cryptography we consider even the smallest tweak to be the creation of a new, distinct cipher. I do not believe this new cipher has received very much cryptanalysis, so its security is unknown.
- At the time, I tried to persuade the author to use RC4 directly and discard a portion of the keystream, which is a construction that is widely used and analyzed, but for some reason he opted not to.
- Even this is not considered to meet the high standards of a modern stream cipher, since a practical distinguisher exists (Fluhrer and McGrew, 2000)
- Nonetheless, it's probably worth a very brief mention, something like "CipherSaber-2 uses a cipher very closely related to RC4 with a tweak to defeat the FMS attack" or something. — ciphergoth 08:50, 15 December 2005 (UTC)
[edit] Test vectors
I added some simple test vectors. The "official" test vectors I found in some RFCs etc were a bit to messy to use in the article. My program is verified against some crypto libraries but I would apreciate if some more people checked those vectors. Please note your result here if you found them correct or not. If your program disagrees write down your result here so we can see if several users get the same. --David Göthberg 04:26, 5 March 2006 (UTC)
-
- My Perl implementation gave all the exact same results for all 3 vectors listed below. --Jesse 5:08PM, 23 January 2007
- Adding test vectors is a good idea. There are some unnecessary spaces in the results of the test vectors; given it is hex there is probably no need for spaces. I thought it was also conventional to use capital letters for hex codes. The following might be more appropriate:
- RC4( "Key", "Plaintext" ) == BBF316E8D940AF0AD3
- RC4( "Wiki", "pedia" ) == 1021BF0420
- RC4( "Secret", "Attack at dawn" ) == 45A01F645FC35B383552544B9BF5
- Mike carton 15:49, 23 October 2006 (UTC)
-
- Hi Mike. Well, the spaces were there for readability and is common when one lists hex data. But doesn't matter much since the strings are so short. And yeah, it looks better in upper case. Been thinking of perhaps changing that myself but didn't get around to it. --David Göthberg 10:42, 24 October 2006 (UTC)
[edit] A couple of things that I am almost certain are mistakes in this page
This page has several mistakes/flaws. I don't feel right correcting them, because I don't really know enough about RC4 to write a decent article.
- Many thanks for taking the time to look for errors and comment here! This sort of thing is what makes Wikipedia great.
"Cryptosystems can defend against this attack by discarding the initial portion of the keystream (say the first 1024 bytes) before using it." -There are two different problems with RC4. The first is the bias in the first few bits of the keystream. That can be defended against by discarding those bits. The second is the Fluent/Mantin/Shamir attack. I do not believe that the Fluent/Mantin/Shamir attack can be defended against by simply discarding part of the keystream. I believe the concatenation of the IV and the fixed key must be run through a random oracle hash function (ex SHA-256). That is certainly what Fluent, Mantin and Shamir write at the end of their paper, if you read it carefully. REPEAT AFTER ME: "RC4 provides a cryptographically secure pseudo-random output given a RANDOM INPUT (or one that is indistinguishable from random)."
- The Fluhrer/Mantin/Shamir attack is a bias in the initial portion of the keystream. Discarding 1024 bytes of the output defends against it. However, there is also a bias in successive pairs and triples of bytes; this is the Fluhrer/McGrew attack, and discarding initial bytes of keystream will not defend against it.
"RC4 falls short of the standards set by cryptographers for a secure cipher in several ways, and thus is not recommended for use in new applications." -I do not believe this is true. I've had the privlege of attending talks by several well known academic cryptographers where RC4 came up and they have never implied that it is an inheriently insecure algorithm, nor is that what Fluent, Mantin and Shamir write. The general gripe seems to be with the initialization step which can be avoided using the techniques from above. Indeed, I don't believe that the WEP mode of operation--concatenating the IV to a fixed key--was ever suggested or condoned by RSA. (RC4 is a PRNG, not a PRF) Stream ciphers have certain inherient usage defects that might discourage their use under many circumstances (programmers love designing software that reuses the key which creates a two time pad).
- You're in luck - Wikipedia does have an academic cryptographer working on these pages, here I am! I know Scott Fluhrer well and I've had the opportunity to discuss this with him at some length. I can tell you that even when you initialize randomly, the Fluhrer/McGrew distinguisher is real and distinguishers weaker than that have been enough to for example rule ciphers out of consideration in eStream. Look at my and Sekar et al's attack on Py, a related cipher - that's a much weaker distinguisher, but it's still considered an attack.
- RSA would never have made the makes in the design of WEP, but the WEP break has nothing to do with the Fluhrer/McGrew attack, which is what the above is discussing.
"This and related effects were then used to break the WEP ("wired equivalent privacy") encryption used with 802.11 wireless networks." -They weren't necessarily related. WEP would have been insecure regardless what cryptosystem was used due to other inherient weaknesses in the cryptographic key & IV sizes and MAC (or lack thereof) algorithms. See Wagner's paper.
- Sure, but the big gaping hole is because of the FMS attack.
-
- Is this really a big gaping hole? Is there any realistic chance (say better than the chances of being struck by lightning) that my RC4-encrypted data will be deciphered because of this? If so, this should be made more clear in the article. If not, that should also be made clear. It seems to me from what I've read on the net that the non-randomness of RC4 is primarily of academic interest, and is not likely to lead to recovery of a user's data. If it just means that someone will be able to detect that I do in fact have RC4 encrypted data on my drive, I can live with that.
There are probably other defects, and I don't claim to be an expert on RC4 or crypto in general. This page needs to be reviewed by a professional cryptographer or at the very least a grad student specializing in crypto who is well acquainted with RC4. It is downright dangerous for Wikipedia to give out incorrect information about cryptographic algorithms.
- You are right that some of the crypto coverage here is a bit patchy, but fortunately this page is OK on the points that you raise. Still, many thanks for raising your concerns! — ciphergoth 07:24, 4 April 2006 (UTC)
-
- I have to agree with the poster of this section. In particular, I was looking for a super-fast super-simple yet somewhat secure encryption algorithm for a fun little project with potential value in encrypting file systems: tinycrypt can be found on tinycrypt.sf.net. From what I've read, RC4 (using the precautions suggested above) is the best thing going if you really want speed and simplicity. If it's insecure for real applications, it's not made clear here, or on the web where I've read. Smilindog2000 16:47, 31 October 2006 (UTC)
-
-
- Well, another good option if you want something super-simple is XTEA. And as far as we know XTEA is as secure as for instance AES. Then you also need a block mode but for instance CTR mode is very simple to implement. Since XTEA only has 64 bit block size an interesting alternative is XXTEA (which is the new version of "Block TEA"). XXTEA has variable block size up to any block size you want. Then in most cases you don't need to use a block mode.
-
-
-
- But yeah, RC4 is very fast. But for encrypting small blocks of data like disk encryption then there will be some overhead since you need to initiate RC4 for each block you encrypt. Then it looses most of its speed advantage so then you can just as well use XTEA. (Note that by initiate I here mean both the initiation with the key+IV and then at throwing away at least 1024 bytes of RC4 output to properly mix the internal state of RC4.)
-
-
-
- --David Göthberg 23:11, 31 October 2006 (UTC)
-
-
-
-
- Thanks for the XTEA link. I agree that if I were encrypting small blocks rather than whole files, XTEA would be better. However, I'm encrypting whole files, and it's MUCH slower than RC4 at this level. Each round is slower, and it takes 64 rounds. RC4 takes one round. AES is probably what I would use if I didn't mind a 2X slowdown, or a more complex algorithm. AES rocks, but I think RC4 is still best for tinycrypt.Smilindog2000 11:28, 1 November 2006 (UTC)
-
-
-
-
-
-
- Ehm, no. One single round of XTEA is certainly not slower than RC4. Note that XTEA uses 32 bit operations while RC4 usually uses 8 bit operations. And those 64 rounds are 32 rounds each on two 32 bit blocks. So after 64 "rounds" a total of 64 bits (8 bytes) are encrypted. XTEA is about the same speed as AES and RC4 is about 2 to 2.5 times faster than XTEA/AES. And yes, if you use RC4 and throw away 1024 bytes of RC4 output during the initiation then when encrypting blocks/files that are about 1000 bytes or larger then RC4 is faster than XTEA/AES. And since most filesystems handles whole clusters at a time then I guess RC4 is the faster choice for disk encryption even if you encrypt on cluster level. However I think that just throwing away 1024 bytes of RC4 output is not enough to fully mix the state of RC4. So even though I am a huge fan of RC4 I am not sure that RC4 is the faster choice for disk encryption. It depends on the "cluster size" you work on and how much RC4 output you throw away at init time etc. (But you mentioned you encrypt whole files so the situation is different.) Anyway, if you want speed, as usual when optimising code there is only one way to know: Test all plausible methods and see which one is the faster in your case. Of course, if you REALLY want speed then there are super fast stream ciphers like ISAAC. --David Göthberg 03:42, 2 November 2006 (UTC)
-
-
-
[edit] MARC4
Is the term "MARC4" in sufficiently widespread use to warrant a mention? 78 Google hits for "MARC4 RC4", only one Citeseer hit. Since some papers recommend discarding more than 256 bytes anyway, I'm not sure it's worth putting in. But maybe it's in more widespread use than I realize. Does anyone know where this term was coined? Thanks! — ciphergoth 06:33, 10 April 2006 (UTC)
[edit] RC4A and VMPC
Aloha!
(I have been waay to busy for Wikipedia contributions lateley, sorry!) What do you guys think of adding some info about the RC4A version of RC4 and references to the VMPC description?
- They're sufficiently different from RC4 that if we were going to write about them it would be best to do it in separate articles, but I'm not sure there's much point, since neither is particularly well known and both are broken. — ciphergoth 05:12, 23 May 2006 (UTC)
-
- I think they deserve being mentioned, at least as a reference (Wikipedia a sort of encyclopedia, right?). Instead of wasting a whole page, one could at least say something to the effect (about RC4A):
-
-
-
- Several attempts at extending RC4 and fixing the vulnerabilities has been done. For example RC4A (bla bla bla.... - a short note what it does), BUT in a paper by Yukiyasu Tsunoo et al in 2005 an efficient distinguisher against was shown...
- You're right, that does sound like it would fit. Go for it! — ciphergoth 06:09, 24 May 2006 (UTC)
-
-
-- Joachim Strombergson
[edit] What is your opinion of implementation links?
I saw that all implementation links was removed and the reason was that Wikipedia is not a web directory. I respectfully agree to this, but I think if a sub section was provided in external links with some examples of implementations, it would be very helpful for those that wants to use it at their projects and I think that it will not make Wikipedia a web directory, just make it more complete and more useful. --Ali Farhadi 15:34, 28 June 2006 (UTC)
- We do have a problem with people spamming implementation links to crypto algorithm pages, and we've started being more strict about it as a result. Probably the best solution would be to either find or create a list of implementations somewhere on the web (Open Directory Project?), and link to that. — Matt Crypto 16:07, 28 June 2006 (UTC)
- I agree with Mr. Farhadi. I only added my implementations as links because both links to the C implementations were dead (at the time), and I figured it would help. While I understand that Wikipedia is not a web directory, I don't quite think it proper to remove implementation links on only one or two articles citing this reason, and leave the others. Also, would it be OK if I posted the test vectors from the Wikisource page that was deleted (combining them with the current ones)? — Reikon 20:30, 14 September 2006 (UTC)
[edit] What language?
What langauge are the examples in, and is that the full code? 75.15.242.0 (talk) 00:55, 2 January 2007 (UTC).
- The example description is in pseudo code so not written in a specific existing programming language. And yes, the example is almost a full description. The description lacks a surrounding function header that takes the key, keylength, cleartext and cleartextlength as paramaters. And it lacks one line in the end where the output bytes are XORed onto the cleartext to get the ciphertext. --David Göthberg 06:10, 15 January 2007 (UTC)
-
- Would anyone mind if the pseudo code example were to have obvious end-of-loop indicators such as endwhile and endfor? Both external links on the pseudo code page titled "pseudocode standards" specify using endwhile and endfor (except the second one which doesn't mention 'for.' It only specifies endwhile). I, (being a hobbyist real language programmer), I was a a bit confused at first due to the lack of instruction group indicators -- normal (like {}) or otherwise. Thanks! -Jesse. January 23 2007, 6:08PM —The preceding unsigned comment was added by 64.146.180.232 (talk) 02:29, 24 January 2007 (UTC).
[edit] What is going on?
I'm trying the Python implementation of RC4 given here, but I've noticed something that seems a little strange. I have the code (the WikipediaARC4
class) in a file called wiki.py
and I have another Python file that uses it like this:
from wiki import WikipediaARC4 print WikipediaARC4("thekey").crypt("theplaintext").encode("hex").upper()
...to generate hex-encoded ciphertext.
When I use the ASCII string "aaaaa" as the key and "test" as the plaintext, I get the hex output 64D9EB6A. Fine, nothing wrong with that. But if I use key "aaaa" on the same plaintext (note one less "a"), I get the same ciphertext. Surely a completely different ciphertext should have been generated!?
The same occurs if I use "aaa", "aa" or "a", but "aaaab" instead of "aaaaa" generates a completely different plaintext, which is what I would have assumed would have happened in all cases.
Is the Wikipedia implementation broken or is this a problem with RC4? I would have thought not. --Tim1988 talk 16:51, 30 April 2007 (UTC)
Hmm...I just realised I'm stupid and that this happens because of the way RC4 works...but...what if two people choose similar keys (e.g. "aaaaa" and "aa")? They would both be able to decrypt each other's ciphertext with keys that are different? --Tim1988 talk 12:13, 15 May 2007 (UTC)
- Yes, you are right about that. But there are solutions. Hash functions and many other things have similar problems. So they use something called "length padding" of their input to guarantee that two different inputs cause two different hashes. There are many different ways to do length padding, but the one used in most hash functions is described in detail at Merkle-Damgård hash function#Length padding example. If you are building your own system that doesn't need to be compatible with other systems you might consider using something similar. Or you could simply hash the keys before you use them as keys in RC4 (if you have code for hashing).
- But in any case I want to remind you and everyone else: RC4 should not be used as is, you have to properly mix the internal state after adding the key, before using RC4 to encrypt. Otherwise you don't get full avalanche effect and then similar keys can encrypt in similar ways. The simplest way to achieve such good mixing is to feed the key as usual. Then read out at least 1024 bytes of "random" bytes from RC4 and throw those bytes away. Then start using RC4 for encryption. Some claim you should do more mixing, perhaps throwing away at least 12*256 = 3072 bytes to stop all known attacks. And I say while your at it why not do key strengthening at the same time and throw away what corresponds to about 0.1 seconds of CPU time? (At least if the keys you use are human memorable passwords/passphrases.)
- --David Göthberg 19:29, 8 September 2007 (UTC)
[edit] RC4 and nonces
I think (but am not sure) that this graf is making the mistake of equating RC4 (a primitive) with constructions and cryptosystems that actually use RC4. This would fall more under the category of an "implementation risk" than a flaw in the algorithm. Can someone sanity check and either fix or remove the tag on the graf? Thanks. --- tqbf 22:55, 15 November 2007 (UTC)
- Have had a go at rewording — ciphergoth 16:16, 30 November 2007 (UTC)
[edit] RC4 not suggested for new cryptosystems?
The intro says RC4 "is not recommended for use in new systems." That seems like a fairly reasonable position (doubly so if you just mean RC4 with no bytes dropped), but I worry it may be OR unless we can find an anti-recommendation.
It's also sad and interesting to me that RC4's problems stem from a bad key schedule, with just the distinguishing attack on the actual PRNG. I don't know whether there's any legitimate way this should to be reflected in the article, but figured I'd throw it out there. 75.24.111.126 (talk) 17:35, 16 January 2008 (UTC)
[edit] Wikipedia is also used by Learners to learn, not just Gurus to discuss?
I am a beginner (well, not exactly a high-school student, but not yet a PhD guru either). It took me about 3 or 5 minutes to realise that the following line of code in the Python example actually implemented a bit-wise XOR function:
output[i] = chr(ord(input[i]) ^ r)
I do acknowledge that much earlier in the article the word 'XOR' is mentioned once and that to someone trained in the art of cryptography this should have been more then enough of an explanation. However, if the objective of the article is to relate knowledge to those who have not yet been initiated and to help readers understand quicker the text I would suggest to add a few more comments to the example, and in particular a comment to this effect, too:
output[i] = chr(ord(input[i]) ^ r) # This implements (in terms of ascii): output[i] = input[i] XOR r
Ideally, from my beginner's perspective, I would prefer that when the article explains that RC4 can be used as pseudo-random number generator and gives the pseudo-code example, that it again would explain how exactly a pseudo-random number sequence is converted into a encrypted/decrypted message and that the XOR operation be shown in the pseudo-code example. Something along these lines:
"...To use the pseudo-random number sequence generated by the above example for encrypting a message each of its bytes is XOR-ed with one ASCII byte from the unencrypted message as shown bellow:
Example 1:
i := 0 j := 0 k := 0 while k<inputMessageLength i := (i + 1) mod 256 j := (j + S[i]) mod 256 swap(S[i],S[j]) r := S[(S[i] + S[j]) mod 256] outputMessage[k] := inputMessage[k] XOR r k := k + 1 endwhile
"Please note that in the above example if inputMessage[i]
is the unencrypted message then outputMessage[i]
will be the encrypted message and conversely, if inputMessage[i]
is the encrypted message then outputMessage[i]
will be the unencrypted message. This is due to the fact that, as mentioned previously, the RC4 algorithm is symmetric and exactly the same code can be used for encryption and decryption..."
Thanks for your understanding and sorry for bothering you with such trifle details. The above is just an idea and I would expect someone better versed in editing to do a better job. I do appreciate that the way the article is worded now may represent a source of considerable pride and I must apologise for interfering with my beginner's point of view and proposing something that could be inappropriate for the high scientific standards of this article.
Wikipedia is also used by Learners to learn, not just Gurus to discuss -- right or wrong?
88.105.237.135 (talk) 20:36, 13 February 2008 (UTC)
- I agree with you 88.105.237.135. RC4 is mostly used as a stream cipher and only rarely as a pseudo random number generator (PRNG). And I have heard the same confusion from everyone I know who have read this article. So I too think we should show the XOR with the message in the pseudocode. I see that you mean we should keep the current example (the PRNG) in the article and add a second example with the encryption and I am okay with that. Although I was thinking of just showing the encryption and skipping the PRNG example. But hey, we got plenty of space so lets keep both as you suggest.
- I played around a little with your suggested pseudocode and below are my conclusions. (I took the liberty to give your example above the title "Example 1" to make it easier to discuss.)
- Example 2:
i := 0 j := 0 n := 0 while n < inputMessageLength i := (i + 1) mod 256 j := (j + S[i]) mod 256 swap(S[i],S[j]) k := S[(S[i] + S[j]) mod 256] outputMessage[n] := inputMessage[n] XOR k n := n + 1 endwhile
- Example 3:
i := 0 j := 0 n := 0 while n < inputMessageLength i := (i + 1) mod 256 j := (j + S[i]) mod 256 swap(S[i],S[j]) outputMessage[n] := inputMessage[n] XOR S[(S[i] + S[j]) mod 256] n := n + 1 endwhile
- My conclusions/comments:
- Example 1: You use the variable "r" for the keystream bytes and "k" for the message length counter. But I find that confusing, instead I think we should use the "k" for the keystream bytes.
- Example 2: Here I use "k" for the keystream bytes and instead for instance "n" for the message length counter.
- Example 3: This is how RC4 normally is coded, but I find this less readable. Also it doesn't give us the possibility to state something like: " 'k' is the stream of pseudo random bytes used as keystream."
- So personally I prefer Example 2.
- --David Göthberg (talk) 15:22, 22 February 2008 (UTC)
[edit] Perl implementation
I redid it in perl. Mine shows all the steps for encryption and decryption and prints strings so you can see the output.
my $key = "Secret"; my $plaintext = "Attack at dawn"; my $hexText = ''; for ( split(//,$plaintext) ) { $hexText .= sprintf("%01X",ord); } print "hex('$plaintext') => $hexText\n"; my (@S,$i,$j); # state variables KSA($key); # PRGA encryption my $cipherText = PRGA($hexText); print "RC4($hexText) => $cipherText\n"; KSA($key); # PRGA decryption $hexText = PRGA($cipherText); print "RC4($cipherText) => $hexText\n"; # And the original string my $origString = ''; for ( $hexText =~ /../g ) { $origString .= chr hex; } print "dehex($hexText) => $origString\n"; sub KSA { my ($k) = @_; my @keyArray = split(//,$k); for ( 0 .. 255 ) { $S[$_] = $_; } $j = 0; for $i (0 .. 255 ) { my $keyValue = ord($keyArray[$i % ($#keyArray + 1)]); # mix in the key $j = ( $j + $S[$i] + $keyValue ) % 256; ($S[$i],$S[$j]) = ($S[$j],$S[$i]); #swap } $i = 0; $j = 0; } sub PRGA { my ($text) = @_; my ($keyStream, $cipherText); my @plainTextArray = $text =~ /../g; for ( @plainTextArray) { $i = ($i + 1) % 256; $j = ($j + $S[$i]) % 256; ($S[$i],$S[$j]) = ($S[$j],$S[$i]); # swap $keyStream = $S[($S[$i] + $S[$j]) % 256]; $cipherText .= sprintf("%02X",$keyStream ^ hex); } return $cipherText; }
Addps4cat (talk) 23:01, 5 April 2008 (UTC)
[edit] RC4 = RedCode4?
The Gpcode.AK is a new June 2008 computer virus and trojan horse originating from Russia. It uses RC4 cipher routine as supplied by the Windows built-in cryptographic provider service to encrypt precious document files on infected computers.
The Gpcode.AK virus takes those files hostage, because as a final act of malice, it discards the RC4 key and only keeps a further ciphered copy of it, hidden within an RSA-1024 envelope.
The victimized user only has four choices: A. Pay the demanded ransom to the hacker, so you get your files back B. Forget about your lost documents, write off the resulting losses and reinstall Windows C. Crack RC4 algorythm to get your files back directly and make fun of the hacker D. Crack RSA-1024 math to get the key which allows deciphering the files yourself, so no pay to hacker
Sorrowfully option C. and D. does not work currently, as the crypto code of Gpcode.Ak virus appears to be sound and well-written, taking an estimated 15 million years to crack if using brute-force computing methods only.
Unless the NSA helps out with a secret RC4 or RSA-1024 vulnerability trick, the infected users will probably never get their documents back for free, they will need to pay the ransom.
I think this scandal may make an interesting addition to the RC4 article. 91.83.19.207 (talk) 20:06, 8 June 2008 (UTC)