Poly1305-AES: a state-of-the-art message-authentication code D. J. Bernstein
Authenticators and signatures

A state-of-the-art message-authentication code

Why is Poly1305-AES better than other message-authentication codes?
How do I use Poly1305-AES in my own software?
How does the Poly1305-AES implementation work?
Where can I learn more about Poly1305-AES?

Poly1305-AES speed tables
AES timing variability at a glance
AES timing variability graphs
Poly1305-AES for the Athlon
Poly1305-AES for the Pentium Pro/II/III/M
Poly1305-AES for the PowerPC under AIX
Poly1305-AES for the PowerPC under MacOS X
Poly1305-AES for the UltraSPARC
Poly1305-AES for generic computers with IEEE floating point
Poly1305-AES for the 8051
Poly1305-AES using GMP and OpenSSL

Poly1305-AES is a state-of-the-art secret-key message-authentication code suitable for a wide variety of applications.

Poly1305-AES computes a 16-byte authenticator of a message of any length, using a 16-byte nonce (unique message number) and a 32-byte secret key. Attackers can't modify or forge messages if the message sender transmits an authenticator along with each message and the message receiver checks each authenticator.

There's a mailing list for Poly1305-AES discussions. To subscribe, send an empty message to poly1305-subscribe@list.cr.yp.to.

Why is Poly1305-AES better than other message-authentication codes?

Poly1305-AES has several useful features:

Guaranteed security if AES is secure. There's a theorem guaranteeing that the security gap is extremely small (n/2^(102) per forgery attempt for 16n-byte messages) even for long-term keys (2^64 messages). The only way for an attacker to break Poly1305-AES is to break AES.
Cipher replaceability. If anything does go wrong with AES, users can switch from Poly1305-AES to Poly1305-AnotherFunction, with an identical security guarantee.
Extremely high speed. My Poly1305-AES software takes just 3843 Athlon cycles, 5361 Pentium III cycles, 5464 Pentium 4 cycles, 4611 Pentium M cycles, 8464 PowerPC 7410 cycles, 5905 PowerPC RS64 IV cycles, 5118 UltraSPARC II cycles, or 5601 UltraSPARC III cycles to verify an authenticator on a 1024-byte message. Poly1305-AES offers consistent high speed, not just high speed for one favored CPU.
Low per-message overhead. My Poly1305-AES software takes just 1232 Pentium 4 cycles, 1264 PowerPC 7410 cycles, or 1077 UltraSPARC III cycles to verify an authenticator on a 64-byte message. Poly1305-AES offers consistent high speed, not just high speed for long messages. Most competing functions are designed for long messages and don't pay attention to short-packet performance.
Key agility. Poly1305-AES can fit thousands of simultaneous keys into cache, and remains fast even when keys are out of cache. Poly1305-AES offers consistent high speed, not just high speed for single-key benchmarks. Almost all competing functions use a large table for each key; as the number of keys grows, those functions miss the cache and slow down dramatically.
Parallelizability and incrementality. Poly1305-AES can take advantage of additional hardware to reduce the latency for long messages, and can be recomputed at low cost for a small modification of a long message.
No intellectual-property claims. I am not aware of any patents or patent applications relevant to Poly1305-AES.

Guaranteed security, cipher replaceability, and parallelizability are provided by the standard polynomial-evaluation-Wegman-Carter-MAC framework. Within that framework, hash127-AES achieved extremely high speed at the expense of a large table for each key. The big advantage of Poly1305-AES is key agility: extremely high speed without any key expansion.

Other standard MACs are slower and less secure than Poly1305-AES. Specifically, HMAC-MD5 is slower and doesn't have a comparable security guarantee; CBC-MAC-AES is much slower and has a weaker security guarantee. Both HMAC-MD5 and CBC-MAC-AES are breakable within 2^64 messages. I'm not saying that anyone is going to carry out this attack; I'm saying that everyone satisfied with the standard CBC security level should be satisfied with the even higher security level of Poly1305-AES.

How do I use Poly1305-AES in my own software?

My fast poly1305aes library is in the public domain. You can and should include it in your own programs, rather than going to the effort of linking to a shared library; the compiled code is between 6 and 10 kilobytes, depending on the CPU.

To get started, download and unpack the poly1305aes library:

     wget https://cr.yp.to/mac/poly1305aes-20050218.tar.gz
     gunzip < poly1305aes-20050218.tar.gz | tar -xf -

Compile the library (making sure to use appropriate compiler options for your platform, such as -m64 for the UltraSPARC) to get an idea of how it's structured:

     cd poly1305aes-20050218
     env CC='gcc -O2' make

Copy the library files into your project:

     cp `cat FILES.lib` yourproject/
     cat Makefile.lib >> yourproject/Makefile

For any C program that will use Poly1305-AES, modify the program to include poly1305aes.h; also modify your Makefile to link the program with poly1305aes.a and to declare that the program depends on poly1305aes.a and poly1305aes.h.

Inside the program, to generate a 32-byte Poly1305-AES key, start by generating 32 secret random bytes from a cryptographically safe source: kr[0], kr[1], ..., kr[31]. Then call

     poly1305aes_clamp(kr)

to create a 32-byte Poly1305-AES secret key kr[0], kr[1], ..., kr[31].

Later, to send a message m[0], m[1], ..., m[len-1] with a 16-byte nonce n[0], n[1], ..., n[15] (which must be different for every message!), call

     poly1305aes_authenticate(a,kr,n,m,len)

to compute a 16-byte authenticator a[0], a[1], ..., a[15].

After receiving an authenticated message a[0], a[1], ..., a[15], n[0], n[1], ..., n[15], m[0], m[1], ..., m[len-1], call

     poly1305aes_verify(a,kr,n,m,len)

to verify the authenticator. Accept the message if poly1305aes_verify returns nonzero; otherwise throw it away.

Do not generate or accept messages longer than a gigabyte. If you need to send large amounts of data, you are undoubtedly breaking the data into small packets anyway; security requires a separate authenticator for every packet.

Please make sure to set up a Googleable web page identifying your program and saying that it is ``powered by Poly1305-AES.''

How does the Poly1305-AES implementation work?

Interested in writing your own Poly1305-AES implementation? Seeing whether Poly1305-AES can benefit from, say, AltiVec? Using Poly1305-AES in a language without C linkage? Checking Poly1305-AES test vectors? Building Poly1305-AES circuits? Adapting the Poly1305-AES computational techniques to other functions?

The simplest C implementation of Poly1305-AES is poly1305aes_test, which relies on GMP and OpenSSL. I suggest starting from the top: read poly1305aes_test_verify.c and work your way down.

Test implementations in other languages:

C++ (only the Poly1305 part): poly1305_gmpxx.h, poly1305_gmpxx.cpp. This is easier to read than the C version.
Python: Ken Raeburn has contributed some sample code with my four ``Appendix A'' tests.
Perl: Tony Betts has contributed some sample code with the same tests.

You can then move on to the serious implementations:

poly1305aes_sparc, published 31 January 2005.
poly1305aes_aix, published 6 February 2005.
poly1305aes_macos, published 7 February 2005.
poly1305aes_ppro, published 14 February 2005.
poly1305aes_athlon, published 18 February 2005.

If you're trying to achieve better speed, make sure you understand all the different situations covered by my speed tables. You might want to start with my essay on the difference between best-case benchmarks and the real world. I designed the Poly1305-AES software, and the underlying Poly1305-AES function, to provide consistent high speed in a broad range of applications. A slight speedup in some situations often means a big slowdown in others; a Poly1305-AES implementation making that tradeoff might be useful for some applications, but it will be at best an alternative, not a replacement.

Where can I learn more about Poly1305-AES?

There are four papers:

[poly1305] 18pp. (PDF) D. J. Bernstein. The Poly1305-AES message-authentication code. Proceedings of Fast Software Encryption 2005, to appear. Document ID: 0018d9551b5546d97c340e0dd8cb5750. URL: https://cr.yp.to/papers.html#poly1305. Date: 2005.03.29. Supersedes: (PDF) (PS) (DVI) 2004.11.01. (PDF) (PS) (DVI) 2005.01.13.
This paper gives the complete definition of Poly1305-AES, explains the Poly1305-AES design decisions, discusses the security of Poly1305-AES, and explains how to compute Poly1305-AES quickly.
[securitywcs] 17pp. (PDF) (PS) (DVI) D. J. Bernstein. Stronger security bounds for Wegman-Carter-Shoup authenticators. Proceedings of Eurocrypt 2005, to appear. Document ID: 2d603727f69542f30f7da2832240c1ad. URL: https://cr.yp.to/papers.html#securitywcs. Date: 2005.02.27. Supersedes: (PDF) (PS) (DVI) 2004.10.19. (PDF) (PS) (DVI) 2004.10.27.
This paper proves security of this type of authenticator up to (and slightly beyond) 2^64 messages. Previous work by Shoup was limited to a smaller number of messages, often below 2^50.
[permutations] 10pp. (PDF) (PS) (DVI) D. J. Bernstein. Stronger security bounds for permutations. Document ID: 2f843f5d86111da8df8a14ef9ae1a3fb. URL: https://cr.yp.to/papers.html#permutations. Date: 2005.03.23.
This paper presents a new proof of the same security bound. The new proof factors the previous proof into (1) the usual Wegman-Carter security bounds and (2) a general technique for replacing uniform random functions with uniform random permutations. Previous versions of the technique were limited to far fewer messages.
[cachetiming] 37pp. (PDF) (PS) D. J. Bernstein. Cache-timing attacks on AES. Document ID: cd9faae9bd5308c440df50fc26a517b4. URL: https://cr.yp.to/papers.html#cachetiming. Date: 2005.04.14. Supersedes: (PDF) (PS) (DVI) 2004.11.11. Supersedes: (PDF) (PS) (DVI) 2004.11.21.
This paper discusses timing leaks in AES software. This is an issue for all AES users, not just Poly1305-AES users.

I'm also giving three talks on Poly1305-AES: 2005.02.15, emphasizing the structure of message-authentication codes; 2005.02.21, emphasizing the difference between best-case benchmarks and the real world; 2005.05, emphasizing the security-bound proof.