Folks:<br><br>I just wrote this essay and posted it on Google+ (which I am using somewhat like a blog, so this is sort of like a blog entry). A conversation I had with Shawn Willden on his G+ "blog" recently is one of the inspirations for this post.<br>
<br><a href="https://plus.google.com/108313527900507320366/posts/cMng6kChAAW">https://plus.google.com/108313527900507320366/posts/cMng6kChAAW</a><br><br><b>On the limits of the use cases for authenticated encryption</b><br>
<br><b>What is authenticated encryption?</b><br><br>Authenticated
encryption is getting a lot of attention among cryptographers and
crypto programmers nowadays. Authenticated encryption is just like
normal (symmetric) encryption, in that it prevents anyone who doesn't
know the key from learning anything [*] about the text. The
"authenticated" part is that it <i>also</i> prevents anyone who doesn't
know the key from undetectably altering the text. (If someone who
doesn't know the key does alter the text, then the recipient will
cleanly reject it as corrupted rather than accepting the altered text.)<br><br>It
is a classic mistake for engineers using crypto to confuse encryption
with authentication. If you're trying to find weaknesses in someone's
crypto protocol, one of the first things to check is whether the
designers of the protocol assumed that by encrypting some data they were
preventing that data from being undetectably modified. Encryption alone
doesn't accomplish that, so if they made that common mistake, you can
attack the system by modifying the ciphertext. Depending on the details
of their system, this could easily lead to a full break of the system,
such that you can violate the security properties that they had intended
to provide to their users.<br><br>Since this is such a common mistake,
with such potentially bad consequences, and because fixing it is not
that easy (especially due to timing attacks against authentication
schemes), cryptographers have studied how to efficiently and securely
integrate both encryption and authentication into one package. The
resulting schemes are called authenticated encryption schemes.<br><br>In
the years since cryptographers developed some good authenticated
encryption schemes, they've started thinking of them as a "drop-in
replacement" for normal old unauthenticated encryption schemes, and
started suggesting that everyone should use authenticated encryption
schemes instead of unauthenticated encryption schemes in all cases.
There was a recent move among cryptographers, spearheaded by the
estimable Daniel J. Bernstein, to collectively focus on developing new
improved authenticated encryption schemes. This would be a sort of
community-wide collaboration, now that the community-wide collaboration
on secure hash functions—the SHA-3 contest—is coming to an end.<br><br>Several
modern cryptography libraries, including “Keyczar” and Daniel J.
Bernstein's “NaCl”, try to make it easy for the programmer to use an
authenticated encryption mode, and some of them make it difficult or
impossible to use an unauthenticated encryption mode.<br><br>When Brian
Warner and I presented Tahoe-LAFS at the RSA Conference in 2010, I was
surprised and delighted when an audience member who approached me
afterward turned out to be Prof. Phil Rogaway, renowned cryptographer
and author of a very efficient authenticated encryption scheme ("OCB
mode"). He said something nice about our presentation and then asked why
we didn't use an authenticated encryption mode. Shortly before that
conversation he had published a very stimulating paper named
“Practice-Oriented Provable Security and the Social Construction of
Cryptography”, but I didn't read it until years later. In that
fascinating and wide-ranging paper he opines, among many other ideas,
that authenticated encryption is one of “the most useful abstraction
boundaries”.<br><br>So, here's what I wish I had been quick-witted
enough to say to him when we met in 2010: authenticated encryption can't
satisfy any of my use cases!<br><br><b>Tahoe-LAFS access control semantics</b><br><br>I'm
one of the original and current designers of the Tahoe-LAFS secure
distributed filesystem. We started out, in 2006, by choosing the access
control semantics that we wanted to offer our users and that we knew how
to implement. Here's what we chose:<br><br><i>There are two kinds of
files: immutable and mutable. When you write a file to the filesystem
you can choose which kind of file it will be in the filesystem.
Immutable files can't be modified once they have been written. A mutable
file can be modified by someone with read-write access to it. A user
can have read-write access to a mutable file or read-only access to it,
or no access to it at all.</i><br><br><i>In addition to read-write
access and read-only access, we implement a third, more limited, form of
access which is "verify-only" access. You can grant someone the ability
to check the integrity of your ciphertexts without also granting that
person the ability to decrypt them.</i><br><br>This is useful in the modern
cloudy world, because with it you can delegate the job of auditing and
repairing your files to a third party without becoming vulnerable to
that party reading your files.<br><br>(You can read a one-page summary of the Tahoe-LAFS design here: <a href="https://tahoe-lafs.org/trac/tahoe-lafs/browser/docs/about.rst" class="ot-anchor">https://tahoe-lafs.org/trac/tahoe-lafs/browser/docs/about.rst</a> , a more detailed explanation of the access control semantics here: <a href="https://tahoe-lafs.org/trac/tahoe-lafs/wiki/Capabilities" class="ot-anchor">https://tahoe-lafs.org/trac/tahoe-lafs/wiki/Capabilities</a> , and a six-page paper about the way it is implemented with cryptography here: <a href="https://tahoe-lafs.org/%7Ezooko/lafs.pdf" class="ot-anchor">https://tahoe-lafs.org/~zooko/lafs.pdf</a> .)<br>
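<br>To make the read-only versus verify-only distinction concrete, here is a minimal sketch for the immutable-file case only. It is an illustration under stated assumptions, not Tahoe-LAFS's actual capability format: the names <code>ReadCap</code>, <code>VerifyCap</code>, <code>publish_immutable</code>, <code>check_integrity</code> and <code>read</code> are hypothetical, a plain SHA-256 hash of the ciphertext stands in for whatever integrity structure a real system would use, and the keystream function is a toy stand-in for a real cipher so that the example needs only the standard library. Mutable files would additionally need digital signatures, which this sketch does not attempt.<br>

```python
import hashlib
import hmac
import secrets
from typing import NamedTuple, Tuple


class VerifyCap(NamedTuple):
    """Hypothetical verify-only capability: integrity check, no decryption."""
    ciphertext_hash: bytes


class ReadCap(NamedTuple):
    """Hypothetical read-only capability: decryption key plus integrity check."""
    key: bytes
    ciphertext_hash: bytes

    def attenuate(self) -> VerifyCap:
        # Dropping the key yields a strictly weaker capability.
        return VerifyCap(self.ciphertext_hash)


def _toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    # ASSUMPTION: toy stream cipher for illustration only (SHA-256 over
    # key + block offset). Use a vetted cipher in any real system.
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


def publish_immutable(plaintext: bytes) -> Tuple[bytes, ReadCap]:
    # A fresh random key per file, so the toy keystream is never reused.
    key = secrets.token_bytes(32)
    ciphertext = _toy_keystream_xor(key, plaintext)
    return ciphertext, ReadCap(key, hashlib.sha256(ciphertext).digest())


def check_integrity(cap: VerifyCap, ciphertext: bytes) -> bool:
    # Needs no key: anyone holding the verify cap can audit the ciphertext.
    actual = hashlib.sha256(ciphertext).digest()
    return hmac.compare_digest(actual, cap.ciphertext_hash)


def read(cap: ReadCap, ciphertext: bytes) -> bytes:
    # Readers check integrity first, then decrypt.
    if not check_integrity(cap.attenuate(), ciphertext):
        raise ValueError("ciphertext failed integrity check")
    return _toy_keystream_xor(cap.key, ciphertext)
```

In this sketch the owner would hand <code>read_cap.attenuate()</code> to an auditing service and the full <code>ReadCap</code> to a reader. Neither holder can substitute different contents without the hash check failing, because the capability itself pins the ciphertext.<br>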
<br><b>Can't be implemented with authenticated encryption!</b><br><br>This
seems like a small, useful set of concepts which users can understand
and employ. It doesn't do everything that everyone wants, but I think
that it has a certain elegance, and also it has stood the test of time
and has served many users well.<br><br>Now, here are three consequences of the above design:<br><br>1. I can authorize you to check the integrity of a file (ciphertext) without authorizing you to read it.<br><br>2. I can authorize you to check the integrity of a file without authorizing you to change its contents.<br>
<br>These
two are necessary for the use case mentioned above: delegating the job
of monitoring and repairing your data to some third party who is not
allowed to read your data.<br><br>3. I can authorize you to read a file without authorizing you to change its contents.<br><br>This
one is necessary to implement both the immutability property of
immutable files (nobody can change the contents, although verifiers and
readers can check the integrity of the contents), and the authenticity
property of mutable files (readers or verifiers can't change the
contents, although they can verify the correctness of the contents).<br><br>As far as I can tell, authenticated encryption cannot be used to implement these properties.<br><br><b>What does this imply for other users of cryptography?</b><br>
<br>I'm
not sure if the Tahoe-LAFS design is sort of the odd duck, and all the
rest of the world should go ahead and upgrade from unauthenticated
encryption to authenticated encryption, or if this mismatch is a warning
sign. Maybe authenticated encryption isn't the most useful abstraction
boundary after all.<br><br>Maybe we should have a conversation about
which abstractions benefit our users. I think it helps to work
“top-down”, from use cases (e.g. one-to-one chat, group chat,
file-sharing, web hosting, live file-editing collaboration, streaming
video, voice, etc.) to desired semantics, and then to the security
properties of protocols. So far I think the enthusiasm for authenticated
encryption has been somewhat “bottom-up”—after we all witnessed the
repeated mistake of relying on encryption for authentication, we
invented a way to prevent that, and then started thinking that we should
apply the solution to all uses.<br><br>[*] Except they get to learn the
length of it, depending on your padding. And of course they get to
learn from where and to where it was transmitted, and when, depending on
how your comms work. That's called "traffic analysis" and it is often
the most valuable information to the attacker anyway.<br>