D. J. Bernstein
Hash functions and ciphers
Notes on the ECRYPT Stream Cipher Project (eSTREAM)
Leaks
Software handling secret data,
and cryptographic software in particular,
must be careful to avoid leaking secrets through side channels.
Plugging a side-channel leak
can drastically increase the cost of building and using the software.
Similar comments apply to hardware.
The AES designers believed, incorrectly, that table lookup was
``not vulnerable to timing attacks.''
See Section 7 of my paper
Cache-timing attacks on AES
for further discussion of this design error.
It is, in fact, extremely difficult to write
constant-time high-speed AES software for modern CPUs.
I see this as one of the big reasons that a new cipher-design process
can do better than the AES design process.
So far eSTREAM has seen very little study of side-channel leaks,
even the simplest timing leaks.
How expensive is protected stream-cipher software?
Here are my initial impressions of the timing leaks from various stream ciphers:
- ABC:
Low-level operations are
addition; xor; and; or; constant-distance shift; dot product.
The dot product takes bits b_0,b_1,...,b_{31}
and 32-bit integers e_0,e_1,...,e_{31}
and computes the sum e_0 b_0 + e_1 b_1 + ... + e_{31} b_{31}.
Every 4 bytes of output have one dot product and several other operations.
The reported speed of ABC relies on computing the dot product
by secret-index table lookups,
creating timing-attack problems.
- CryptMT:
No timing leaks.
- Dragon:
Timing leaks similar to the AES timing leaks.
- HC-256:
RC4-style.
The table lookups in HC-256 take variable time.
I don't see how to carry out a cache-timing attack that uses solely the total HC-256 time:
the array indices in HC-256 are extremely complicated, and not visibly related, functions of the nonce.
But it should be easy to carry out a cache-timing attack on HC-256
using the time for individual operations:
e.g., a hyperthreading attack or a process-interruption attack.
- LEX: Timing leaks similar to the AES timing leaks.
- Mir-1: Uses the Rijndael S-boxes in initialization; potential timing leaks.
The other low-level operations are xor, and, or, addition mod 2^64, and multiplication mod 2^64.
- NLS:
Timing leaks similar to the AES timing leaks.
NLS is built from (page 18) addition, xor, constant-distance shift, and table lookups.
The designers claim, incorrectly, that table lookup takes constant time.
- Phelix:
No timing leaks.
- Py:
RC4-style.
The table lookups in Py take variable time;
I don't see any obstacles to a cache-timing attack
obtaining the entire Py expanded key.
Specifically, the first lookup indices appear to be extremely simple functions
of the nonce and a few expanded-key bytes, exposing those expanded-key bytes to a cache-timing attack;
the next few lookup indices appear to be extremely simple functions of the nonce,
the known expanded-key bytes, and a few more expanded-key bytes,
exposing those expanded-key bytes to a cache-timing attack; etc.
- Rabbit:
Low-level operations are addition; addition with carry;
squaring of a 4-byte input, with the 8-byte output folded by xor into a 4-byte result;
rotation by multiples of 8 bits;
and some other byte shuffling as part of key setup.
Each 16-byte output block involves 8 squarings and various other operations.
I'm concerned about Rabbit's use of
integer squaring on, e.g., the Motorola PowerPC G4e 7450,
which takes a cycle less if the input is between -131072 and 131071;
how much speed does Rabbit lose if this timing leak is eliminated?
- Salsa20:
No timing leaks.
- SOSEMANUK: The reported speed of SOSEMANUK
relies on performing computations in F_{2^32}
by secret-index table lookups,
creating timing-attack problems.
- YAMB:
RC4-style. Timing leaks.
Fischer, Gammel, Kniffler, and Velten (SASC 2007)
reported successful differential power analysis of Trivium and Grain.