Talk:Birthday attack

From Wikipedia, the free encyclopedia

WikiProject on Cryptography This article is part of WikiProject Cryptography, an attempt to build a comprehensive and detailed guide to cryptography on Wikipedia. If you would like to participate, you can choose to edit the article attached to this page, or visit the project page, where you can join the project and see a list of open tasks.

Contents

[edit] Birthday attack recommendation

Copied from User talk:Matt Crypto:

Hi, re Birthday attack, you removed the recommendation of changing a foreign-supplied text befor signing, saying that using a longer hash is better. While using a long hash is certainly good, it seems to me that changing the text can only help: it squares the time your adversary needs, especially since you never know whether the hashes are long enough for the adversary's hardware, or whether the adversary has discovered a weakness in the hash function. Do you know any downsides to changing the text before signing? Thanks, AxelBoldt 11:44, 18 Aug 2004 (UTC)

The text I removed was, "to avoid this birthday attack, it is recommended that Alice slightly modify any digital contract that's presented to her, before signing it.". The downside is that Alice has to modify any digital contract presented to her; the user should be able to treat the crypto primitives as "black boxes", to the greatest possible extent — this is a point brought out in, say, Ferguson and Schneier's Practical Cryptography. The birthday attack is defeated at the minor cost of an extra, say, 80 bits in the hash length; the user shouldn't have to worry about modifying her actions at a higher level of abstraction. — Matt 21:07, 18 Aug 2004 (UTC)
Matt is right. It would be possible to imagine a signing system in which a document is randomly "salted" before hashing, which would provide similar security, but I don't know of any analysis of the security of such signing. As seems to be the case inevitably with hash functions, it's easy to prove the security of such a measure in the random oracle model, but very hard to state the properties a hash function would have to have to make such a measure secure. Universal one-way hash functions are a distinct but related idea. ciphergoth 10:32, 2004 Dec 8 (UTC)

I just wanted to check in regarding "It has also been recommended that Alice slightly change any contract presented to her before signing" — has this practice actually been recommended to any degree? — Matt Crypto 09:02, 24 Feb 2005 (UTC)

Sorry for the late reply. It's on page 430 of Schneier's Applied Cryptography, 2nd ed.. After describing the birthday attack, he writes: "This is a big problem. (One Moral is to always make a cosmetic change to any document you sign.)". I know that I initially read the recommendation somewhere else, but now I don't remember where, and that source probably got it from Schneier anyway. Cheers, AxelBoldt 05:38, 5 Mar 2005 (UTC)
OK, I added that reference. --68.0.120.35 20:02, 9 March 2007 (UTC)

I can see two problems with this recommendation. The first one is that an attacker, who tries to forge a signature, can now exploit the birthday attack. This has already been mentioned in this discussion. The second problem is that the recommendation is very fuzzy. A concrete proposal, how a message should be modified and an analysis of that proposal is missing. Here is an example showing why a simple modification may not help at all. Assume an attacker finds two messages m and m' that hash to the same value. Most hash functions have the property that collisions can be extended. I.e., given h(m)=h(m') we can add further blocks s at the end of the message and get h(m || s) = h(m' || s) and in particular we can even choose s randomly. Hence an attack is still possible. All the attacker has to do is to reject modifications in the first part of the message. In particular, if the signer is given m||s to sign, but signs m||s' instead then the attacker gets the signature of the message m'||s' . This doesn't mean that randomizing messages before hashing them can not improve the security of the scheme. However, any such proposal must be analyzed first. I'd generally be careful about any statement in Applied Cryptography that is neither referenced nor justified. 85.2.52.225 07:22, 31 May 2007 (UTC)

[edit] bob's a she?

"It has also been recommended that Bob cosmetically modify any contract presented to her before signing." now, i guess this is related to the other discussion, since it seems this sentence used to be about alice. but now it's about bob, who is apparently having a gender crisis. is this common among cryptographers? is mallory involved? pauli 11:53, 12 Apr 2005 (UTC)

It's my fault, I changed Bob and Alice around, because I thought it wouldn't be fair if Bob is always portrayed as trying to trick Alice. But I forgot to change one instance of her <-> him. Sorry Bob!
What kind of editor are you?!?

[edit] modifying the contract is no solution

It doesn't matter whether it was Alice or Bob who last modified the contract - if this attack is possible, either could suspect the other of attempting to defraud her or him. So, Bob, altering the contract yourself is no solution, now Alice suspects you of trying to trick her into a fraudulent contract!

I removed the sentence

However, this does not solve the problem, because now Alice suspects Bob of attempting to use a birthday attack.

Yes, it is true that no matter which person "last" modifies the contract, the other could suspect that person of specially crafting it.
But what about this alternative:
Alice signs ( hash( fair contract with Alice's cosmetic modifications ) )
Bob signs ( hash( fair contract with Bob's cosmetic modifications ) )
Each person stores both versions of the contract (and the signatures).
This makes it impossible for Alice to use a birthday attack to "prove" that Bob signed some other contract, and it makes it impossible for Bob to use a birthday attack to "prove" that Alice signed some other contract. (But plausible deniability is a different issue).
Should we add more clarification that *each* person adds cosmetic modifications,
and each person only signs the version that she modified?
--68.0.120.35 20:02, 9 March 2007 (UTC)
No. Any such proposal would be original research. Cosmetic modifications to documents is not used in practice, unless you count probabilistic signature schemes. The usual way to defeat birthday attacks is -- as already stated in the article -- to use large enough collision resistant hashes. 85.2.115.82 10:50, 10 March 2007 (UTC)

[edit] Hard to understand

While I understand the mathmatics behind the Birthday Paradox, I'm having trouble seeing its application to the Birthday Attack. I think I understand a bit of it, but the article isn't very clear.

Agreed! I'm finding this all a bit confusing. The Birthday Paradox page is simple to understand however it seems that a clear example hasn't been applied to this topic.
I added a derivation of the probability and added an example. Removed the 'confusing' tag.--Kbk 18:14, 2 May 2006 (UTC)

[edit] deja vu

[edit] Birthday vs Collision Attack

I've noticed that most other articles seem refer to this as a Collision attack not a Birthday attack. Which is more correct? This is not my area expertise so I don't know if this is noteworthy. ~~k —Preceding unsigned comment added by 203.152.115.183 (talk) 22:58, 23 September 2007 (UTC)

Hi guys... Something is wrong with the references. People reference this page as "collision attack" and "collision resistant". When they get here, it talks about the "birthday attack", which is really a kind-of bruteforce attack; it assumes that the hash is really collision resistant and that no better collision attacks exist. As we all know better attacks exist (for e.g. MD5, SHA, some HMAC implementations), we really really need a page about the general collision attack problem. It would be great if someone more familiar with how to use wikipedia would create a page for "collision attack" /"collision resistant" and let us start pumping information to it. Birthday attack would of course be referenced on the new page.... --Blaufish 15:48, 29 September 2007 (UTC)

We've already got hash collision, preimage attack and collision resistance. Are you saying that these are not enough? -- intgr [talk] 13:28, 30 September 2007 (UTC)

[edit] Page is incomprehensible to non-mathematicians - plain-language introduction would be good.

I don't consider myself mathematically illiterate, but I am not a mathematician, and I find this page completely incomprehensible. This actually illustrates a problem I've seen with dozens of Wikipedia pages which deal with subjects which have mathematical theory behind them, and I've just chosen this article to raise a problem that I think could apply to many, many pages.

I'm not suggesting that any of the mathematical theory (including specialist notation, if relevant) should be removed; but it would seem a good idea to me that the article should open with a broad explanation or description that should give readers an overall view of what the term means. I think I read a Wikipedia policy somewhere that longer articles should contain two main parts: a relatively brief description that gives an overall sense of what the term means, with perhaps a few elementary details, which should be understandable by a person with no knowledge of that topic; and then a much longer section (perhaps subdivided into further sections) that contains all the technical detail.

In articles like this one, I really think this first section is either lacking altogether or inadequate. I would like to be able to read an opening section for articles like this that tell me what the term means, with perhaps some of the broad mathematical theory included if that can be described approximately in plain language without detailed mathematical notation. The technical detail, including mathematical notation, would be better placed in the second, more detailed section; and then, after I've read the opening description and got a sense of what the term means, I can start the technical section, and be in a good position to decide if I'm likely to understand it, or if I'm interested enough to wade through difficult details.

M.J.E. 09:19, 19 October 2007 (UTC)

I agree, this is a pretty prevalent on specialist topics. The birthday paradox article does a much better job explaining the theory behind this, and some of it should probably be repeated on this article.
When you spot problems like these, it can be useful to tag the article or section with "{{context}}", "{{introrewrite}}" or "{{technical}}" to draw attention to the problem (just insert that, including the braces, to the beginning of the article). More such tags can be found at WP:TC. When the tag isn't self-explanatory, it's a good practice to leave a short edit summary or a talk page message, explaining how the tag applies to the particular article. -- intgr [talk] 13:57, 19 October 2007 (UTC)

[edit] Implications of the space/time tradeoff

What is the nature of the space-time tradoff? Do you need to store every source and every hash that you compute to get the stated probability? —Preceding unsigned comment added by NeilenMarais (talkcontribs) 21:50, 19 November 2007 (UTC)

It means that, instead of naively picking one hash, and comparing it against different hashes (which would require 2n computations), you store the hash somehow, and compare it to all previously calculated hashes, requiring "just" \sqrt{2^n} = 2^{n/2} operations according to the birthday problem. This doesn't mean that you have to store all 2n/2 hashes; an obvious alternative is using rainbow tables to reduce the storage requirements, although still quite inefficient.
A much better approach is to start off a large number of computers at a random hash value, and make them simply calculate hashes of the previous hash (or alternatively a reduce function of the previous hash). Hashes that match a particular criteria (for example, have upper half bits set to 0), would be stored ("checkpointed") in a database, along with the previous checkpoint. Eventually, two such threads will "coincide", producing an equal checkpoint hash. At this point, the colliding inputs can be re-calculated from the previous checkpoint hashes. This is how the distributed.net MD5 collision search worked.[1] -- intgr [talk] 01:02, 20 November 2007 (UTC)
Aren't you mixing collision search and preimage attacks here? Rainbow tables are mainly used to store a large number of (input, hash) pairs in a very space-efficient way, at the cost that given a hash retrieving the corresponding input can be done with modest computation depending on how compressed the rainbow table is. Here we have a time-memory tradeoff, but it is not a birthday attack. Collision searches which are birthday attacks can be done in time O(2n / 2) regardless of how much memory is available. I.e. cycle finding algorithms require just constant memory, but are only by a constant slower than if we stored all the input/outputs during a collision search or if we used the distributed method that you describe above. The situation is similar for discrete logarithms in generic cyclic groups. If the order of the group is a prime p then the best known generic algorithms takes time O(\sqrt{p}). Pollard rho needs only constant memory and runs in O(\sqrt{p}) too.
Hence again, memory does not improve the time complexity. Hence classifying a birthday attack as a time-memory attack is not justified. 85.2.91.75 (talk) 14:53, 20 November 2007 (UTC)
You're right, I was describing collision attacks and didn't realize that. -- intgr [talk] 21:19, 20 November 2007 (UTC)

[edit] Birthday attacks and cyphers

Not a comment on this article, but I've removed a comment re birthday attacks on the Blowfish and Disk encryption hardware pages which make reference to birthday attacks and cypher block size. Birthday attacks can potentially be effective wrt hashed values, but aren't effective against encrypted data. In encryption, being deterministic, any given block of cyphertext may only be decrypted to a single plaintext, unless a different key is used

i.e. E(k, a) == E(k, b) only when a == b. Unlike a hash algorithm where H(a) == H(b) can be true when a != b for various a and b

If anyone can find a citation otherwise, feel free to add them back in with a suitable citation! Nuwewsco (talk) 18:47, 21 January 2008 (UTC)

It depends on the block cipher mode that is used. For many modes the birthday paradox does apply. E.g., when using a cipher in CBC mode the i-th ciphertext block is computed as C_i = E_K(P_i \oplus C_{i-1}), where Pi is the i-th plaintext block. Here collisions Ci = Cj are possible and imply P_i \oplus C_{i-1}=P_j \oplus C_{j-1}, hence P_i\oplus P_j is computable and thus some information about the plaintext leaks. Such leaks are expected to appear once the number of encrypted blocks reaches the birthday bound, i.e. about 2k / 2 blocks where k is the block size in bits of the cipher. 85.2.62.68 (talk) 19:54, 21 January 2008 (UTC)

[edit] Alice and bob

In the section about digital signatures it says "Suppose Alice wants to trick Bob into signing a fraudulent contract", Shouldn't it be Mallory or Eve instead of Alice? I thought you didn't want to use alice or bob when doing "mean" things to avoid confusion? -RichoDemus 2008-05-09 09:15 gtm+1 edit: hm, I'll be bold and do it myself —Preceding unsigned comment added by RichoDemus (talkcontribs) 08:05, 15 May 2008 (UTC)