[tahoe-dev] Helper integrity (was: Using allmydata.com production grid?)

Sun May 31 13:15:06 PDT 2009

On Sat, 30 May 2009 21:14:35 +0100
David-Sarah Hopwood <david-sarah at jacaranda.org> wrote:

> Peter Secor wrote:
> >    A helper is a node which accepts encrypted data and
> > performs ...
> 
> This requires trusting the helper for availability but not for
> integrity, is that correct?

Ideally, that would be true. The worst the Helper should be able to do is to
upload the wrong shares (or just not upload anything), and the client should
be able to detect this quickly (by checking the helper's work immediately
after upload, while it is still hanging on to the original file and can
upload it directly if it needs to).
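
As a rough sketch of that ideal control flow (not what any Tahoe release does today), the client-side logic might look like the following. The three callables are hypothetical placeholders rather than Tahoe APIs; the point is only that the client still holds the file when the check runs, so it can fall back to a direct upload.

```python
def upload_with_fallback(data, helper_upload, verify_upload, direct_upload):
    """Try the helper first; redo the upload directly if its work fails a check.

    All three callables are hypothetical placeholders, not Tahoe APIs:
      helper_upload(data)      -> cap, or raises if the helper fails
      verify_upload(cap, data) -> True if the uploaded shares match `data`
      direct_upload(data)      -> cap
    """
    try:
        cap = helper_upload(data)
    except Exception:
        # The helper was unreachable or refused: an availability failure.
        return direct_upload(data)
    if verify_upload(cap, data):
        return cap
    # The helper uploaded the wrong shares. We still hold the original
    # file, so we can upload it ourselves.
    return direct_upload(data)
```

How thorough `verify_upload` should be (a full download versus a few spot checks) is left open here; that choice determines how much of the helper's bandwidth savings survive.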

However, after looking at the code, I've realized that the current
implementation relies upon the helper for both availability *and* integrity.
It also turns out that we're granting it the ability to perform a
partial-information-guessing attack. But, as designed, we do not rely upon it
for confidentiality.

What happens in the current tahoe-1.4.1 is that the helper computes the
hashes of both the encrypted segments and the encoded shares, builds them
into the "URI Extension Block" or "UEB" (which is slowly being renamed to
"Capability Extension Block"), hashes this into the "UEB hash", and reports
both the UEB fields and the UEB hash back to the uploading client. The client
then concatenates its privately-held encryption key with the UEB hash to
form the readcap.
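
To make the last step concrete, here is a minimal sketch of how a key and a reported UEB hash combine into a readcap. It assumes plain SHA-256, a 16-byte key, and a bare byte concatenation; the real hash construction, UEB serialization, and cap encoding differ, and the function names are made up for illustration.

```python
import hashlib
import os

KEY_LEN = 16  # assumed key size for this sketch


def ueb_hash(ueb_bytes):
    # Assumption: plain SHA-256 stands in for the real UEB hash function.
    return hashlib.sha256(ueb_bytes).digest()


def make_readcap(key, ueb_hash_value):
    # Hypothetical layout: the key followed directly by the UEB hash.
    return key + ueb_hash_value


def split_readcap(readcap):
    # A downloader recovers both halves: one decrypts, one validates.
    return readcap[:KEY_LEN], readcap[KEY_LEN:]


key = os.urandom(KEY_LEN)                  # never leaves the client
reported_ueb = b"segment_size=131072;num_segments=3"  # made-up UEB contents
readcap = make_readcap(key, ueb_hash(reported_ueb))
```

Note that the hash half of the readcap comes entirely from whatever the Helper reported, which is why the next paragraph matters.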

Because the client does not do anything to validate the UEB directly, it
relies upon the Helper to compute the right one. This enables an integrity
attack: the Helper could upload an unrelated ciphertext (of the same size)
and return the resulting UEB hash. The client would blindly accept the UEB
hash, and later (when someone goes to download the file), would use the
readcap's key to decrypt that ciphertext, resulting in a bad plaintext. (In
practice, because the Helper doesn't have access to the key, its best chance
of doing damage is to flip specific bits, as an attacker can against a wire
protocol that uses encryption but not integrity checking.)
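
The bit-flipping case can be shown with a toy stream cipher. This is only an illustration: the keystream below is built from SHA-256 in a counter construction purely so the example is self-contained, and it is not the cipher Tahoe uses. The attacker is assumed to know (or guess) the offset of the byte it wants to change.

```python
import hashlib


def keystream(key, length):
    # Toy keystream: SHA-256 over key + counter. For illustration only.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key, data):
    # Encryption and decryption are the same XOR operation.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = b"client-only key!"
plaintext = b"pay alice $100"
ciphertext = xor_crypt(key, plaintext)

# The attacker has no key, but XORing a ciphertext byte with (old ^ new)
# changes the decrypted byte from old to new.
tampered = bytearray(ciphertext)
offset = 11  # assumed-known position of the digit "1"
tampered[offset] ^= ord("1") ^ ord("9")

result = xor_crypt(key, bytes(tampered))  # what the downloader decrypts
```

Without an integrity check on the ciphertext, the downloader has no way to tell `result` from the original plaintext.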

The v1.4.1 client does not check up on the helper's work afterwards (by
immediately trying to download the file, for example, or performing some
random spot checks). Nor does it compute its own copy of the ciphertext
hashes to compare them against the UEB data sent by the Helper, nor does it
confirm that the UEB fields actually hash into the claimed UEB hash. If the
client computed and verified the ciphertext hash, we would avoid the
wrong-ciphertext integrity attack described above. Specifically, the Helper
could still upload anything they like, but the downloader would detect the
problem just before decryption (converting an integrity problem into an
availability problem).
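
A minimal sketch of the two missing client-side checks, under stated assumptions: SHA-256 stands in for the real hash functions, the UEB is modelled as a dict of byte strings with a made-up canonical serialization, and the field name "ciphertext_hash" is hypothetical. The real UEB layout and hashing differ.

```python
import hashlib
import hmac


def serialize_ueb(fields):
    # Hypothetical canonical encoding: sorted "name:hexvalue" lines.
    return b"".join(
        name.encode() + b":" + value.hex().encode() + b"\n"
        for name, value in sorted(fields.items())
    )


def check_helper_response(ciphertext, ueb_fields, claimed_ueb_hash):
    """Return True only if the Helper's report matches what we computed."""
    # Check 1: our own ciphertext hash must match the one in the UEB.
    local_ct_hash = hashlib.sha256(ciphertext).digest()
    reported = ueb_fields.get("ciphertext_hash", b"")
    if not hmac.compare_digest(local_ct_hash, reported):
        return False
    # Check 2: the UEB fields must actually hash to the claimed UEB hash.
    local_ueb_hash = hashlib.sha256(serialize_ueb(ueb_fields)).digest()
    return hmac.compare_digest(local_ueb_hash, claimed_ueb_hash)
```

This only covers the ciphertext hash; it says nothing about whether the shares the Helper actually pushed to the storage servers match, which is the availability half of the problem.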

We used to compute a hash of the plaintext (both a flat keyed hash and a
merkle tree) and store it in the UEB, but we disabled the storing code in
v1.0 (released 25-Mar-2008), and removed it entirely in v1.3 (13-Feb-2009),
as part of the fix to thwart the partial-information-guessing attack (#365).
Obviously, this plaintext hash was generated on the client side, and thus
would prevent the Helper from changing anything undetectably.

However, the code that actually generates the plaintext hash is still present
in v1.4.1, and the Helper is granted access to the object that computes it
(even though our implementation never calls it). So at present, the Helper
also gets to perform a partial-information-guessing attack against the
client, by asking it for the plaintext hash and then using that as an oracle
to limit its search space.
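
To illustrate why a plaintext hash works as an oracle, here is a toy version of the guessing attack. It assumes the attacker knows most of the file (a template with one low-entropy field) and that the hash is something the attacker can recompute for each candidate; plain SHA-256 is used here as a stand-in, and the actual plaintext-hash construction may differ.

```python
import hashlib


def plaintext_hash(plaintext):
    # Stand-in for a plaintext hash the attacker can recompute.
    return hashlib.sha256(plaintext).digest()


def guess_plaintext(oracle_hash, candidates):
    # Hash every candidate and compare against the leaked value.
    for candidate in candidates:
        if plaintext_hash(candidate) == oracle_hash:
            return candidate
    return None


template = "PIN: %04d"                      # the part the attacker knows
secret_file = (template % 4821).encode()    # what the client uploaded
leaked = plaintext_hash(secret_file)        # what the Helper can ask for

candidates = ((template % n).encode() for n in range(10000))
recovered = guess_plaintext(leaked, candidates)
```

Ten thousand hashes is trivial work, so the confidentiality of a file like this would rest entirely on the Helper never seeing the hash.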

So, some things that ought to be done:

 #722: stop giving Helper access to plaintext hashes
       (fix partial-information-guessing attack)
 #723: client should verify hashes instead of trusting the helper's response
       (stop relying upon helper for integrity)
 #453: safely add plaintext hashes to the UEB
 #724: client should check up on the helper's work
       (stop relying upon helper for reliability)

I think that #722 is easy to do in the next week, and have marked it as a
1.5.0 item. #723 is slightly harder, but I think we could probably pull it
off for 1.5.0 too. #453 is already 12 months old, so it's probably a 1.6
thing. #724 feels like a bunch of work for insufficient gain, but folks who
use a helper more than I do might feel differently.

For reference, here's a brief timeline of relevant changes:

 12-Mar-2008: v0.9 released, had plaintext hash
 23-Mar-2008: [2331] removed plaintext hashes
 23-Mar-2008: [2337] undoes [2331]
 24-Mar-2008: [2332] add convergence secret (#365)
 24-Mar-2008: [2338] add Encoder.USE_PLAINTEXT_HASHES=False (disables sending code)
 25-Mar-2008: v1.0 released
 21-Jul-2008: v1.2 released
 09-Dec-2008: [3286] remove plaintext-sending code
 13-Feb-2009: v1.3 released
 13-Apr-2009: v1.4.1 released

And some related tickets:

#491 (done): add ciphertext hash to fix alternative-versions attack, v1.2.0
#377: conditionally enable plaintext hashers
#365: add convergence secret

cheers,
 -Brian