In my last cryptography post, I wrote about using message authentication codes

(MACs) as a way of guaranteeing message integrity. To review briefly, most ciphers

are designed to provide message confidentiality – which means that no one but the

sender and the intended receiver can see the plain-text of the message. But

ciphers that provide confidentiality don’t necessarily make any guarantees that

the message received is exactly the message that was sent. There are a good number

of cryptographic attacks that work by altering the message in transit, and

depending on the cipher, that can result in a variety of undesirable

results.

For example, if you use DES encryption with the ECB mode of operation,

you can insert new blocks anywhere in a message that you want. By using

a replay attack (where you take encrypted blocks from other messages using

the same encryption, and resend them), an attacker can alter your messages, and

you won’t be able to detect it.

So in addition to just confidentiality, we need to provide integrity. What does integrity really mean? Basically, it expands the definition of the

decryption function. Written as a function signature, confidential message

decryption is a function `decrypt : ciphertext × key → plaintext`

. With message integrity, we add the

option that decrypt can return a result saying that the message is invalid: `decrypt`

. _{integ} : ciphertext × key → (plaintext | REJECT)

In the last post, I explained the basic concept behind integrity schemes: you add an addition chunk of data to the message, which summarizes the message; any change to the message will (with a very high degree of probability) result in

a change in the authentication code; so if the message is altered, that change

will be detected, because the authentication code will not match the rest

of the message.

The idea of the message authentication code seems quite simple. We know lots

of pseudo-random functions that can do a good job as MAC functions. But like

so many things in cryptography, the fact that the concept is simple doesn’t

mean that the actual application of that concept is simple: it’s incredibly

easy to get it disastrously wrong, even when all of the building blocks are right.

For example, let’s look at a very simple (and very weak) integrity

system, called CBC-MAC.

In CBC-MAC, you compute an authentication code by running a your

block cipher on the message with an initialization vector set

to 0. The output for encrypting the last block is your MAC.

The reason that this works is fairly simple – CBC keeps pushing the

result from the previous encryption step through to the next one – so

the output from the final block conceptually includes information

from all of the other blocks. So you’ve got an authentication code

whose security is, roughly, equivalent to the security of the block

cipher used in CBC mode.

CBC-MAC isn’t a bad mode of operation. It’s got serious flaws – the biggest

one is that there are a number of attacks against it if it’s used for

variable length messages. (And you can’t fix it just by adding a message

length indicator to the message, it’s deeper than that.) But CBC is

a pretty good integrity mode for fixed-length messages, and there’s a widely

used variant, called CMAC, which works for variable length messages.

So how can this go wrong?

What do you use as the password to the MAC computation? CBC is

a secure cryptographic mode, so why not just use the same password? Lots

of novices (and even some not-no-novice folks!) have implemented CBC-MAC

where the integrity and confidentiality modes used the same key. If you

compute the MAC using the same key as the message, then you’re open to

an attack which completely voids the integrity guarantee!

Imagine a message M, which is divided into n blocks: (M_{1}, …, M_{n}). For convenience, we’ll treat the IV as if it were ciphertext block 0. Tuen using CBC with a key K, M encrypts to ciphertext blocks

(C_{1}, …, C_{K}), where C_i=Encrypt(M_{i} ⊕ C_{i-1}, K), plus a CBC-MAC integrity block. If you look

at the CBC block, if it’s using the same key as the main encryption,

then the value of the MAC is the result of computing

Encrypt(M_{n} ⊕C_{n-1}, k). That value

is *exactly* the same as C_{N}! So it’s *trivial*

to fake the MAC. If you use CBC-MAC with the same encryption key as

you used to encrypt the message, you’re wasting your time: you might as well

not bother with the MAC at all, because you’ve got no integrity guarantee.

But that’s an amazingly common mistake. A huge number of implementations

of supposedly secure cryptosystems that claim to guarantee integrity have

actually used CBC-MAC with one key.

I mentioned above that CBC-MAC is only safe for fixed-length messages. The reason for that is closely related to the trick I described above. Basically, the nature of CBC ultimately means that given a MAC value M, it’s not hard to generate a string to append to a message which will result in it having the desired

MAC. Suppose we know that the message M will generate the mac T_{M},

and we know that the block we want to replace the message with, N has

the MAC T_{N}. Then we can easily generate a string N’, such that

appending MAC(M,N) = MAC(N). All we do is XOR the first block of N

with T_{M}, and add it to the message. The new message

is is (M_{1},…,M_{n},(N_{1}⊕T_{N}),

N_{2},…,N_{n}) – that is, we can remove the MAC from the

first message, use it to glue the two messages together, and tack T_{N} on to the message as the MAC. T_{N}, the MAC computed for T alone, will

be the MAC for the combined message.

It’s easy to do that if we’re allowed to change the length of the message. If the receiver doesn’t know how long the message is going to be, then they can’t

count on its integrity using CBC-MAC.

Next time, I’ll look at a variation of CBC-MAC that provides integrity

for variable-length messages using 2 keys – one for confidentiality, and

one for integrity. Then I’ll go on to a two-pass single-key mode of operation

that provides both confidentiality and integrity.