User talk:Michael Birk

Initialisation Vector

Hello, Michael! Thanks for taking the time to contribute to the IV page. I just wanted to discuss your correction. I'm sure you'd agree that the recipient does need to know the IV to be able to decrypt the message. It was probably just worded incorrectly. The recipient doesn't necessarily need to receive the IV when it can simply be calculated or measured - the IV can be the current time, for instance, if it's properly synchronised. —Ruptor 16:40, 18 October 2005 (UTC)

Hi Ruptor,

Yes, I agree it was just a combination of ambiguous wording and my own ignorance. I realized this soon after, agonized over my change, and nearly reverted it. I'm glad, though, that it prompted you to add to the page, because I think your explanations really help.

However, I do think there is a key point about an IV that may be worthy of explanation in the Wikipedia article. There are really two ways to think about, or describe, what an IV is. (NB: I don't have much experience with stream ciphers, so this pertains to block ciphers only. Does a similar dichotomy exist with stream ciphers?)

The first view, the "IV as initial input" view, is what is described in your article. The IV is a block of data that is used to initialize the feedback register. Here, it is obvious that the receiver needs to know the IV, since an encryption algorithm in feedback mode requires this input to operate correctly.

The other view, the "IV as salt" view, is that the IV is simply some random data that is included at the beginning of the plaintext, so that when encrypted (in feedback mode), the resulting ciphertext is unique. In this case, the feedback register is initialized with zeroes. Here, the receiver does not need to know the IV beforehand -- instead, the IV (as salt) is what the receiver sees as the first block after decrypting.

It seems clear that these two views are equivalent; the "IV as input" is merely the encrypted "IV as salt." (Note: If "IV as salt" doesn't really exist -- that is, if the distinction between "IV" and "salt" is clear to everyone, then either I am dense or the texts that I have read are poorly written, or both.)
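
The equivalence claimed above can be checked with a small sketch. Everything here is an assumption made for illustration: the 64-bit Feistel cipher built from SHA-256 is a toy stand-in for a real block cipher such as 3DES, and the function names are hypothetical. It only shows that zero-IV CBC over (salt + message) produces E(salt) followed by exactly the ciphertext that ordinary CBC produces with IV = E(salt).

```python
import hashlib
import os

BLOCK = 8  # toy 64-bit block size (assumption, chosen to match DES/3DES)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy Feistel round function; NOT a real cipher, illustration only.
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0, "sketch omits padding"
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(decrypt_block(key, block), prev)
        prev = block
    return out


key = os.urandom(16)
salt = os.urandom(BLOCK)
message = b"sixteen byte msg"
zero_iv = bytes(BLOCK)

# "IV as salt": zeroed feedback register, random first plaintext block.
as_salt = cbc_encrypt(key, zero_iv, salt + message)

# "IV as input": the IV is the encrypted salt, sent alongside the ciphertext.
iv = encrypt_block(key, salt)
as_input = cbc_encrypt(key, iv, message)

assert as_salt == iv + as_input
# The "salt" receiver decrypts everything and discards the first block.
assert cbc_decrypt(key, zero_iv, as_salt)[BLOCK:] == message
```

In both cases the same bytes travel over the wire; the views differ only in whether the first block is called part of the ciphertext or a separately transmitted IV.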

I can understand why "IV as input" is the normal view, though, since it gives an implementor freedom to calculate the IV as they choose, and you nicely describe various ways that it can be computed. This freedom allows you to avoid, in some cases, the overhead of sending an extra block of encrypted data.

However, perhaps that is also the freedom to hang oneself? Schneier stresses in Applied Cryptography that the IV (as initial input) is not secret. So it seems to me that "IV as salt" has a minor advantage, in that the IV (as salt) is always encrypted with the secret key. Therefore, even if an implementor unwisely chooses to attach semantics to the IV (i.e. it really should be secret), it is at least not available to an attacker in plaintext. Granted, since the feedback register is zeroed when encrypting the salt, the IV itself is exposed to the same attacks as ECB mode. With "IV as input," you can achieve the same protection as "IV as salt" by encrypting (in ECB mode) the IV. But I wonder how many people are actually doing that? It didn't even occur to me until just now.

Michael Birk 23:27, 28 October 2005 (UTC)

The term IV is used for a lot more things than just CBC mode. If you look at any of the eSTREAM entrants, they are all specified as encrypting a plaintext stream into a ciphertext stream using a key and an IV. Usually the IV is either sent explicitly with the message or is implicit from other data accompanying the message. A mode of operation like CBC is just a way of constructing a stream cipher from a block cipher.
You seem to be discussing two variants on CBC mode, one in which the IV is encrypted before it is XORed with the first plaintext, and one in which it isn't. I don't know for sure (if there's a definitive paper on CBC security my brief search hasn't found it) but some thought leads me to the conclusion that it doesn't make much difference, because either way CBC has an unusual insecurity you have to work around.
Normally, when reasoning about the security of a stream cipher you assume that the attacker not only knows the IV and plaintext, but controls it. In this instance either CBC variant is insecure. If instead you assume that the IV is randomly generated, and revealed only once the plaintext has been chosen, then AFAICT either variant is secure (up to the "birthday bound", i.e. so long as you're invoking the block cipher much less than 2^32 times if you're using TDES).
To me this makes CBC mode pretty unattractive. I would always recommend that new applications use another mode. CTR mode, or one of the new AEAD modes, would be a better choice. CTR mode doesn't fail so badly at the birthday bound either. — ciphergoth 09:24, 20 February 2006 (UTC)
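
For a rough feel of the birthday bound mentioned above, here is a small arithmetic sketch. It assumes the cipher's output blocks behave like independent random values and uses the standard birthday approximation 1 - exp(-q(q-1)/2^(n+1)); the function names are hypothetical.

```python
import math


def birthday_bound_blocks(block_bits: int) -> int:
    """Number of blocks at which a collision becomes likely: 2^(n/2)."""
    return 2 ** (block_bits // 2)


def collision_probability(blocks: int, block_bits: int) -> float:
    """Approximate chance that two of `blocks` random n-bit values coincide."""
    exponent = blocks * (blocks - 1) / 2 ** (block_bits + 1)
    return -math.expm1(-exponent)


# TDES has a 64-bit block, AES a 128-bit block.
for name, bits in (("TDES", 64), ("AES", 128)):
    q = birthday_bound_blocks(bits)
    print(f"{name}: bound at 2^{bits // 2} blocks, "
          f"collision probability there is about {collision_probability(q, bits):.2f}")

# Well below the bound the risk is tiny: 2^20 blocks under a 64-bit block cipher.
print(collision_probability(2 ** 20, 64))
```

With a 64-bit block, 2^32 blocks is only 32 GiB of data. In CBC, two equal ciphertext blocks C_i = C_j reveal P_i XOR P_j = C_(i-1) XOR C_(j-1), which is why staying well under the bound matters.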
Hi ciphergoth, thanks for the input. In case it's not abundantly clear, I'm certainly not a cryptography expert. My perspective is that of a systems and applications programmer who occasionally has to grapple with encryption issues. So this discussion may or may not be useful to you, depending on your objectives. Of course, we all know that security must be addressed throughout the software stack, so it is important that non-experts like myself understand the technology.
Anyway, I'm already a little confused when you say normally "you assume that the attacker not only knows the IV and plaintext, but controls it." [emphasis added] If the attacker already knows the plaintext, what is the point of encryption? Or was that a typo?
No, that's not a typo :-) It's hard to get across how high the standards are that we hold our ciphers to - they have to leak no information about the plaintext at all (besides length) no matter how favourable for the attacker circumstances are. For something like CBC mode, that means that the attacker shouldn't even be able to tell if CBC mode is in use, even if they get to choose the plaintexts that get encrypted.
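
To make the chosen-plaintext point concrete, here is a minimal sketch of the well-known weakness of CBC when the attacker can predict the next IV. All of it is assumption for illustration: the SHA-256 Feistel cipher is a toy stand-in for a real block cipher, the counter-based IV is a deliberately bad design, and the class and function names are hypothetical. The attacker never learns the key, yet can confirm a guess about an earlier plaintext block.

```python
import hashlib
import os

BLOCK = 8  # toy 64-bit block size


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation; NOT a real cipher.
    left, right = block[:4], block[4:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:4]
        left, right = right, xor(left, f)
    return left + right


class PredictableIvOracle:
    """Encrypts single-block messages in CBC mode with a counter as the IV."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._counter = 0

    def next_iv(self) -> bytes:
        # Anyone who has seen one IV can compute this value.
        return self._counter.to_bytes(BLOCK, "big")

    def encrypt(self, block: bytes) -> tuple[bytes, bytes]:
        iv = self.next_iv()
        self._counter += 1
        return iv, encrypt_block(self._key, xor(block, iv))


def guess_is_correct(oracle: PredictableIvOracle, seen_iv: bytes,
                     seen_ct: bytes, guess: bytes) -> bool:
    # Cancel the upcoming IV and substitute the old one, so the oracle
    # computes E(guess XOR seen_iv), which equals seen_ct iff guess is right.
    probe = xor(xor(guess, seen_iv), oracle.next_iv())
    _, ct = oracle.encrypt(probe)
    return ct == seen_ct


oracle = PredictableIvOracle(os.urandom(16))
iv, ct = oracle.encrypt(b"PIN=4921")  # the victim's secret block

assert guess_is_correct(oracle, iv, ct, b"PIN=4921")
assert not guess_is_correct(oracle, iv, ct, b"PIN=0000")
```

If the IV is instead drawn at random and revealed only after the plaintext is fixed, the attacker cannot construct the probe, which matches the condition described above for CBC being secure.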
Thanks also for the heads up about the other modes, such as CTR and AEAD. I had not heard of them, but I will do some more research. However, isn't it the case that CBC is by far the most commonly used encryption mode for existing applications today? 3DES with CBC seems to be the default that I most commonly encounter. In that case, the discussion of IV with respect to CBC is still highly relevant to me.
Yes, 3DES with CBC is very common. And some people who don't know any better have started using AES with CBC, though fortunately CTR mode is becoming more prevalent.
In my reply to Ruptor, I was trying to point out that it seemed like there were two distinct, but related, uses of the term "initialization vector" in the literature. For example, in section 9.3 of Applied Cryptography, Schneier states that [in CBC mode] "Two identical messages will still encrypt to the same ciphertext ... Prevent this by encrypting random data as the first block. This block of random data is called the initialization vector (IV)."
That sounds a lot like "salt," according to my understanding. Yet the IV page on Wikipedia makes no mention of salt. Instead it describes the IV as the "initial input" (my own characterization) to the encryption and decryption process.
It was this confusion that led me to (foolishly) delete the statement "[The IV] must be known to the recepient of encrypted information." from the IV entry. If, as Schneier suggests, the IV is simply some "random data as the first block," then clearly the recipient doesn't need to know it in order to properly decrypt.
After giving it some thought, I could see that these different descriptions of IV are two sides of the same coin. Yet, based on my current understanding of the commonly-accepted meaning of IV, I think Schneier's statement is highly misleading.
Michael Birk 22:44, 21 February 2006 (UTC)
It sounds like AC's description isn't that good. I wouldn't be surprised to learn that Practical Cryptography did a better job. AC is a pretty old book and has some important flaws as a guide to cryptography. I think "IV as input" is a clearer way to think about it. — ciphergoth 09:46, 22 February 2006 (UTC)