[tahoe-dev] protecting against bugs in our own crypto code
zooko
zooko at zooko.com
Tue May 6 11:47:17 PDT 2008
On May 5, 2008, at 10:44 PM, Ben Laurie wrote:
> Since you don't do an integrity check after decryption (which seems
> unwise), all that needs to fail is the decryption.
Here's the history of this issue until now:
One of the earliest discussions on the newly-public tahoe-dev mailing
list a year ago was me arguing that an integrity check on the
plaintext was redundant with an integrity check on the ciphertext: [1].
This was my argument against Rob Kinninmont who (in unpublished
personal communication of the same time) argued that redundant
integrity checks were worth it. My argument was basically that our
quality assurance process should ensure that the decryption code was
correct, and that if our quality assurance process wasn't up to that
task, then it probably wouldn't be up to the task of ensuring that
our redundant integrity check was correct either.
We went with Rob's preference -- the redundant integrity checks on
plaintext in addition to ciphertext (and merkle trees of shares) on
immutable files, but planned to remove the redundant checks once we
had sufficient confidence in the correctness of our erasure-decoding,
decryption, and digital signature checking. When we created
decentralized, encrypted mutable files in Tahoe v0.7 we didn't
include redundant checks on the plaintext of the mutable files.
With the Tahoe 1.0 release, we had learned that brute-force/guessing
attacks on convergent encryption were more powerful than previously
thought (controversial -- more below), and we removed the hash of the
plaintext from the immutable files, since it can be used to perform
such attacks. (Thanks to Brian Warner for re-explaining this to me on
IRC today.)
Now that I've seen this problem -- how the port to FreeBSD, done
independently by two different people, yielded silently incorrect AES
decryption -- I've changed my mind and decided that Rob was right. In
a sense my argument was right -- the allmydata quality assurance
process made sure that we never deployed a version which did AES
incorrectly, since we had extensive unit tests which verified the AES
against fixed standard test vectors and other tests, and as soon as
Greg Hazel ported Tahoe to the first platform where this bug was
exercised (Windows using the Microsoft compiler), the very next thing
he did was run the pycryptopp unit tests and discover the bug.
But in a sense I wasn't right, because not everyone runs the
pycryptopp unit tests when installing Tahoe, and because there could
be other bugs which aren't caught by the unit tests but would be
caught by a redundant integrity check on the plaintext.
So, I've changed my mind and think Rob was right.
However, the brute-force/guessing attack on convergent encryption is
still an issue. Since the last time that the topic was discussed
[2], I was able to come up with an example which will hopefully
motivate some of the doubters (although I am sure not all of them).
That example (thanks to some anonymous contributors on IRC for this)
is that when you install MySQL it by default creates a file which
contains some boilerplate bytes and your root password. If you then
back up your entire system using convergent encryption, this enables
an economical brute-force/guessing/rainbow-table attack on your MySQL
root password as well as the MySQL root password of everyone else who uses
the same default setting and the same convergent encryption.
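To make the attack concrete, here is a minimal sketch. It assumes the
encryption key is just a hash of the plaintext and that the attacker can
see some public identifier derived from that key. The names
convergent_key, storage_index and TEMPLATE, and the file layout itself,
are hypothetical illustrations, not Tahoe's actual key derivation or
MySQL's actual file contents.

```python
import hashlib

def convergent_key(plaintext: bytes) -> bytes:
    """Derive the encryption key from the plaintext alone (convergent encryption)."""
    return hashlib.sha256(plaintext).digest()

def storage_index(key: bytes) -> bytes:
    """Hypothetical public identifier under which the ciphertext is stored."""
    return hashlib.sha256(b"storage-index:" + key).digest()

# Hypothetical boilerplate: everything except the password is known to the attacker.
TEMPLATE = b"[client]\nuser=root\npassword=%s\n"

def guess_password(observed_index: bytes, candidates):
    """Rebuild the file for each candidate password and compare identifiers."""
    for candidate in candidates:
        plaintext = TEMPLATE % candidate
        if storage_index(convergent_key(plaintext)) == observed_index:
            return candidate
    return None

# The victim backs up the file; the attacker only observes the identifier.
victim_file = TEMPLATE % b"hunter2"
observed = storage_index(convergent_key(victim_file))

wordlist = [b"123456", b"password", b"hunter2", b"letmein"]
recovered = guess_password(observed, wordlist)
```

Because nothing secret goes into the key, the work of hashing one
candidate file can be reused against every user who has the same
boilerplate, which is what makes the attack economical.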
Fortunately, we can have both integrity-check on the plaintext, and
immunity to the question of such attacks by using a MAC instead of a
secure hash, where the MAC key is (derived from) the encryption key.
As a bonus, we can get reduced CPU usage and smaller Capability
Extension Blocks compared to a secure hash of the plaintext by using
a modern Carter-Wegman MAC such as Poly1305-AES or VMAC-AES-128.
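A rough sketch of the idea, under stated assumptions: the MAC key is
derived from the encryption key, and the tag is computed over the
plaintext, so only someone who already holds the encryption key can test
a guess. HMAC-SHA256 from the Python standard library is used here purely
as a stand-in; it is not a Carter-Wegman MAC and does not give the CPU or
size savings mentioned above. The function names and the derivation label
are hypothetical.

```python
import hashlib
import hmac

def derive_mac_key(encryption_key: bytes) -> bytes:
    """Derive a separate MAC key from the encryption key (hypothetical label)."""
    return hmac.new(encryption_key, b"plaintext-integrity-mac-key",
                    hashlib.sha256).digest()

def plaintext_tag(encryption_key: bytes, plaintext: bytes) -> bytes:
    """Compute the integrity tag over the plaintext with the derived MAC key."""
    return hmac.new(derive_mac_key(encryption_key), plaintext,
                    hashlib.sha256).digest()

def verify_plaintext(encryption_key: bytes, plaintext: bytes,
                     expected_tag: bytes) -> bool:
    """Check decrypted plaintext against the stored tag in constant time."""
    return hmac.compare_digest(plaintext_tag(encryption_key, plaintext),
                               expected_tag)

# Uploader side: compute the tag before encrypting.
key = bytes(range(16))
original = b"some file contents"
tag = plaintext_tag(key, original)

# Downloader side: after decryption, check the result.
good = verify_plaintext(key, original, tag)
bad = verify_plaintext(key, b"some file c0ntents", tag)  # e.g. buggy decryption
```

Unlike a bare hash of the plaintext, the tag cannot be recomputed from a
guessed plaintext without the encryption key, so publishing it does not
by itself enable the guessing attack described earlier.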
Another improvement to our process might be to have the Tahoe tests
run the pycryptopp tests, as suggested by Ben Laurie, or at least to
add a "check if pycryptopp's AES is working" test to the Tahoe tests.
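Such a check could be a known-answer test along the following lines.
This is only a sketch: encrypt_block is a hypothetical adapter supplied
by the caller, and pycryptopp's real API is deliberately not shown. The
vector is the AES-128 example from FIPS-197 Appendix C.1, which is a
single raw block encryption; a real test would need a vector matching
whatever mode the library actually exposes.

```python
from typing import Callable

# AES-128 example vector from FIPS-197, Appendix C.1 (single raw block).
KEY = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
PLAINTEXT = bytes.fromhex("00112233445566778899aabbccddeeff")
CIPHERTEXT = bytes.fromhex("69c4e0d86a7b0430d8cdb78070b4c55a")

def aes_block_self_test(encrypt_block: Callable[[bytes, bytes], bytes]) -> bool:
    """Return True only if encrypt_block(key, block) reproduces the known answer.

    encrypt_block is a hypothetical adapter around the AES implementation
    under test; any exception is treated as a failed self-test.
    """
    try:
        return encrypt_block(KEY, PLAINTEXT) == CIPHERTEXT
    except Exception:
        return False
```

Running a check like this at install or startup time would catch a
silently miscompiled AES even for users who never run the full unit
tests.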
Another improvement, also suggested by Ben Laurie, would be to
eliminate the "default iv" feature from pycryptopp. I'm working on a
new version of pycryptopp, so I'll look at that issue.
Another improvement would be for someone to contribute a reliable
FreeBSD box to serve as a buildslave.
Thanks!
Regards,
Zooko
[1] http://allmydata.org/pipermail/tahoe-dev/2007-May/000020.html
[2] http://www.mail-archive.com/cryptography@metzdowd.com/msg08949.html