7 3 Chosen Ciphertext Attacks 12 Min

In the last segment we defined authenticated encryption, but I didn't really show you why authenticated encryption is the right notion of security. In this segment I want to show you that authenticated encryption in fact is a very natural notion of security and I'll do it by showing you that it defends against a very powerful attack called a chosen cipher text attack. So in fact we already saw a number of examples of a chosen cipher text attack so imagine the adversary has some cipher text C that it wants to decrypt. And what it can do is, for example, fool the decryption server into decrypting some cipher text but not actually the cipher text c. So we already saw some examples of that. If you remember in the first segment, we looked at an adversary that can submit arbitrary cipher text, and if the plaintext happened to start with destination equals 25, then the adversary is actually given the plaintext in the clear. So that's an example of an adversary that can obtain the decryption of certain cipher texts but not all cipher texts. Another example we saw is an adversary that can learn something about the plaintext by submitting cipher texts to the decrypter. So we have this example where the adversary submits encrypted TCP/IP packets to the decryption server, and if the decryption server sends back an ACK, the adversary learns that the decrypted plain text had a valid check sum. And otherwise, the decrypted plain text didn't have a valid check sum. So this is, again, an example of a chosen cipher text attack, where the attacker submits cipher text, and learns something about the decryption of that cipher text. So to address this type of threats, we're gonna define a very general notion of security, called chosen cipher text security. So here, we're gonna give the adversary a lot of power, okay? So he can do both chosen plain text attack, and a chosen cipher text attack. In other words, he can obtain the encryption of arbitrary messages of his choice. And he can decrypt any cipher text of his choice, other than some challenge cipher texts. And as I showed you before, this is actually a fairly conservative modeling of real life. In real life, often, the attacker can fool the, the decrypter, into decrypting certain cipher texts for the attacker, but not all cipher texts. So the model here is that the attacker has a certain cipher text that it wants to decrypt. It can interact with the decrypter by issuing these chosen cipher text queries to the decrypter. Namely, to decrypt various cipher text other than the challenge cipher text. And then the adversary's goal is to break semantic security of the challenge cipher text. So you notice that we're giving the adversary a lot of power. And all we're asking you to do is break semantic security. So it's going to be fairly difficult to design systems that are secure against such adversaries. Nevertheless, we're going to do it. So let's define the chosen cipher text security model more precisely. So, as usual, we have a cipher (E, D). And we're gonna define two experiments, experiment zero and experiment one. So this should look somewhat familiar by now. The challenger is gonna start off by choosing a random key. And now the adversary is gonna submit queries to this challenger. Every query can be one of two types. It can be a chosen plain text query, or it can be a chosen cipher text query. So a chosen plain text query, as we already know. Basically, the adversary submits two messages, M zero and M1. They have to be the same length. And the adversary receives the encryption of either M zero if we're in experiment zero, or M1, if we're in experiment one. So he receives the encryption of the left or the right depending on whether we were in experiment zero or in experiment one. The second type of query is the more interesting one. This is where the adversary submits an arbitrary cipher text of his choice and what he gets back is the decryption of that cipher text. So you notice the adverary's allowed to decrypt arbitrary cipher texts of his choice. The only restriction is that the cipher text is not one of the cipher texts that were obtained as a result of a CPA query. And of course this wouldn't be fair otherwise, because the attacker can simply take one cipher text that was obtained from a CPA query. That's gonna to be either the encryption of M0 or the encryption of M1. If he could submit a CCA query for that particular cipher text, he will in response either obtain M0 or M1, and then he'll know whether he is in experiment zero or experiment one. So this wouldn't be fair. So we say that the CPA cipher texts that he received are the challenge cipher texts. And he's allowed to decrypt any cipher texts of his choice, other than these challenge cipher texts. And as usual, his goal is to determine whether he's in experiment zero, or in experiment one. So I'm gonna emphasize again, that this is an extremely powerful adversary. One that can decrypt any cipher text of his choice, other than the challenge cipher text. And still, he can't distinguish whether he is in experiment zero, or in experiment one. So, as usual, we say that the cipher is CCA secure, chosen cipher text secure, if the adversary behaves the same in experiment zero as it does in experiment one. Namely, it cannot distinguish the two experiments. So let's start with a simple example, and show that, in fact, CBC with a random IV, is not CCA secure, is not secure against chosen cipher text attacks. So let's see why that's the case. So what the adversary's gonna start by doing, he's gonna simply submit two distinct messages, M0 and M1. And let's just pretend that these messages are one block messages. And what he's gonna get back is the CBC encryption of either M0 or M1. You notice the cipher text only has one block, because the plain texts were only one block long. Now what is the attacker gonna do? Well, he's gonna modify this cipher text C that he was given into C prime simply by changing the IV. Okay? So he just takes the IV and XORs it with one. That's it. This gives a new cipher text, C prime, which is different from C and as a result it's perfectly valid for the adversary to submit C prime as its chosen cipher text query. So he asks the challenger please decrypt this C prime for me. The challenger, because c prime is not equal to c, must decrypt c prime. And now let's see, what happens when he decrypts c prime? Well, what's the decryption of c prime, let me ask you. So you probably remember from the first segment that if we xor the IV by one, that simply xors the plaintext by one. So now that adversary received M0 xor one, or M1 xor one, and now he can perfectly tell whether he's in experiment zero and, or in experiment one. So the advantage of this adversary is basically one, because he can very easily tell which experiment he's in. And as a result he can win the chosen cipher text security game. So if you think about it for a second, you'll see that this attack game exactly captured the first active attack that we saw, where the adversary slightly changed the cipher text that he was given. And then he got the decrypter to decrypt it for him. And therefore, he was able to eavesdrop on messages that were not intended for the adversary. So I wanna emphasize again that this chosen cipher text game really does come up in real life, where the adversary can submit cipher texts to the decrypter and the decrypter can reveal information about the plain text, or it can give the plain text outright to the adversary for certain cipher texts but not others. So this is a very natural notion of security, and the question is, how do we design crypto-systems that are CCA secure? So I claim that this authenticated encryption notion that we defined before actually implies chosen cipher text security, and this is why authenticated encryption is such a natural concept. Okay? So the theorem basically says, well, if you give me a cipher that provides authenticated encryption, the cipher can withstand chosen cipher text attacks. And more precisely, the theorem says the following. If we have an adversary that issues Q queries, in other words, at most, q CPA queries and q chosen cipher text queries, then there are two efficient adversaries, B1 and B2, that satisfy this inequality here. So since the scheme has authenticated encryption, we know that this quantity is negligible because it's CPA secure. And we know that this quantity is negligible because the encryption scheme has cipher text integrity. And as a result, since both terms are negligible we know that adversary's advantage in winning the CCA game is also negligible. So let's prove this theorem. It's actually a very simple theorem to prove. And so let's just do it as proof by pictures. Okay, so here we have two copies of the CCA game, so this would be experiment zero. And the bottom one is experiment one. You can see the adversary's issuing CPA queries, and he's issuing CCA queries, and at the end he outputs, you know, a certain guess b, let's call it b prime, and our goal is to show that this b prime is indistinguishable in both cases. In other words, probability that b prime is equal to one in the top game is the same as the probability that b prime is equal to one in the bottom game. Okay, so the way we're gonna do it is the following. Well, first of all, we're gonna change the challenger a little bit, so that instead of actually outputting the decryption of CCA queries, the challenger is just gonna always output bottom. So every time the adversary submits a CCA query, the challenger says bottom. And I claim that these two games are, in fact, indistinguishable. In other words, the adversary can't distinguish these two games, for the simple reason that, because the scheme has cipher text integrity, the adversary simply cannot create a cipher text that's not in C1 to CI-1 that decrypts to anything other than bottom. That is the definition of cipher text integrity. And as a result, again, because of cipher text integrity, it must be the case that every chosen cipher text query that the adversary issues results in bottom. If the adversary, in fact, could distinguish between the left game and the right game, that would mean that at some point he issued a query that decrypted to something other than bottom. And that we could use to then break cipher text integrity of the scheme. And since the scheme has cipher-text integrity, these left and right games are indistinguishable. Okay, so that's kind of a cute argument that shows that it's very easy to respond to chosen cipher-text queries when you have cipher-text integrity. And the same thing exactly applies on the bottom, where we can simply replace the chosen cipher-text responses by just always saying bottom. Okay, very good. But now, since the chosen cipher text queries always respond in the same way, they're not giving the adversary any information. The adversary always knows that we're always gonna just respond with bottom. So we might as well just remove these queries, 'cause they don't contribute any information to the adversary. But now, once we remove these queries, the resulting game should look fairly familiar. The top right game, and the [bottom right] game are basically the two games that come up in the definition of CPA security. And as a result, because the scheme is CPA secure, we know that the adversary can't distinguish the top from the bottom. And so now, by this change of reasoning, we've proven that all of these games are equivalent. And in particular, the original two games that we started with are also equivalent, and therefore, the adversary can't distinguish the top left from the bottom left. And therefore, the scheme is CCA secure. So this gives you the intuition as to why authenticated encryption is such a cool concept. Because primarily it implies security against chosen cipher test attacks. Okay, so as we said authenticated encryption ensures confidentiality. Even if the adversary can decrypt a subset of the cipher text, and more generally, even if he can mount a general chosen cipher text attack, he still is not going to be able to break semantic security of the system. However, it is important to remember the two limitations. First of all, it does not prevent replay attacks on its own. We're going to have to do something in addition to defend against replay attacks. We're going to see several examples where if the decryption engine reveals more information about why a cipher text is rejected, it doesn't just output bottom, but it actually outputs more information, say, by timing attacks. And that explains why the cipher text is rejected. Then in fact that can completely destroy security of the authenticated encryption system. So we'll see some cute attacks like this in a later segment. Okay. So, in the next segment we're gonna turn to constructing authenticated encryption systems.