[tahoe-dev] how to encrypt and integrity-check with only one value
David-Sarah Hopwood
david-sarah at jacaranda.org
Tue Sep 8 02:14:12 PDT 2009
Brian Warner wrote:
> Some observations:
>
> * obviously the "v = H(ciphertext)" could+should be expanded to include
> our usual UEB scheme, with all integrity information (merkle trees,
> share hash trees, ideally even an encrypted form of the plaintext
> hash data) going into the UEB, and "v" being the hash of the UEB.
> David-Sarah's point about making verifycap=H(v,K1enc) is spot-on.
>
> * verifycap cannot be offline-derived from readcap: you have to run
> through part of the download process, fetch at least "v" and the
> K1enc value, derive K1, hash K1+v together to confirm that you really
> do get the readcap, then emit H(v+K1enc) as the verifycap. This makes
> manifest/repaircap generation really expensive (a network trip per
> file). One mitigation strategy would be to store both readcap and
> verifycap in dirnodes, effectively caching the verifycap computation.

Given that the combined (readcap, H(v, k1_enc)) is as short as just the
readcap in any alternative scheme, this seems quite acceptable to me.

> * what should the storage-index be? It clearly must be the hash of the
> readcap, otherwise readers cannot find the shares (or must carry
> around some extra value, negating the shortness of the readcap).
>
> * but since storage-index != verifycap (i.e. H(UEBhash+k1enc)), servers
> will be unable to completely validate their shares. They can confirm
> that everything (including K1enc, thanks to David-Sarah's suggestion)
> matches the verifycap, but they can't tell that the verifycap matches
> the storage-index under which the share is stored (i.e. they'd be
> unable to detect two swapped sharefiles). This permits the
> "roadblock" attack and generally misses our goals of allowing full
> server-side validation.

That could be fixed by including the storage index in the verifycap,
i.e. (storage_index, H(v, k1_enc)).

Dirnodes still only need to store (readcap, H(v, k1_enc)), since
the readcap can be hashed to get the storage index.
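
As a rough illustration of the two points above, here is a minimal Python
sketch. It rests on several assumptions that the thread does not specify:
H is modelled as SHA-256 over length-prefixed inputs, the storage index is a
tagged hash of the readcap, and the names `storage_index`,
`verifycap_from_dirnode_entry` and `share_matches` are hypothetical. How a
server would obtain the verifycap is left open.

```python
import hashlib


def H(*parts: bytes) -> bytes:
    """Assumed stand-in for H: SHA-256 over length-prefixed parts,
    so that different splits of the same bytes hash differently."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.digest()


def storage_index(readcap: bytes) -> bytes:
    # Assumption: the storage index is a tagged hash of the readcap.
    return H(b"storage-index", readcap)


def verifycap_from_dirnode_entry(readcap: bytes, check_hash: bytes):
    """Rebuild (storage_index, H(v, k1_enc)) from the pair a dirnode stores."""
    return (storage_index(readcap), check_hash)


def share_matches(slot_si: bytes, v: bytes, k1_enc: bytes, verifycap) -> bool:
    """Check a share against a verifycap, including the slot it is stored in."""
    si, check_hash = verifycap
    return si == slot_si and H(v, k1_enc) == check_hash
```

With the storage index inside the verifycap, a share filed under the wrong
slot fails the first comparison, which is the swapped-sharefile case raised
in the quoted text.
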
> * we can't determine the storage-index until after we've encoded the
> entire file (which generally means after we've uploaded it). So we
> need a new uploader protocol that lets us upload to an as-yet-unnamed
> slot, and then provide the slot's storage-index at the very end of
> the process. This is more work, but it isn't a huge deal.
>
> * we wouldn't be able to directly use our permuted-list Tahoe2
> peer-selection protocol, since we won't know the storage-index (and
> thus the permuted list) until after we've uploaded all the shares.

Zooko's protocol still works if r = H(k1, H(plaintext)), rather than
r = H(k1, H(ciphertext)). In that case you would only need to know the
hash of the plaintext, not the encoded ciphertext, to calculate the
storage-index. Does that help?

In the mutable-file variant I suggested there is no corresponding
problem, because v is a public verification key that is fixed for a
given file, and can be generated before any particular ciphertext.
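
The immutable-file suggestion above, r = H(k1, H(plaintext)), can be sketched
as follows. This is only an illustration under stated assumptions: H is
modelled as SHA-256 over length-prefixed inputs, r is taken to be the readcap,
the storage index is taken to be a tagged hash of r, and the function names
are hypothetical. The point it shows is that one streaming pass over the
plaintext is enough to get the storage index, before any encryption or
encoding has happened.

```python
import hashlib
from typing import Iterable


def H(*parts: bytes) -> bytes:
    """Assumed stand-in for H: SHA-256 over length-prefixed parts."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.digest()


def readcap_from_plaintext(k1: bytes, plaintext_chunks: Iterable[bytes]) -> bytes:
    """Compute r = H(k1, H(plaintext)) by streaming over the plaintext."""
    plaintext_hash = hashlib.sha256()
    for chunk in plaintext_chunks:
        plaintext_hash.update(chunk)
    return H(k1, plaintext_hash.digest())


def storage_index(r: bytes) -> bytes:
    # Assumption: the storage index is a tagged hash of the readcap r.
    return H(b"storage-index", r)
```

An uploader could then compute `storage_index(readcap_from_plaintext(k1, chunks))`
up front and use it for peer selection in the usual way, at the cost of a
hashing pass over the file before the upload starts.
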
> So, while I like the one-cryptovalue trick, I'm unsatisfied with both
> the lack of server-side validation and offline readcap-to-verifycap
> attenuation, and the separate SSI value makes me slightly nervous.

Are the above suggestions enough to address your dissatisfaction?

> Incidentally, I kind of suspect that we could get away with longer
> immutable readcaps if we had short directory readcaps, since I imagine
> that people are more likely to share with dircaps (which get you
> filenames) than with the raw filecaps. On the other hand, I fear that we
> have even fewer tricks available for mutable encoding schemes, unless
> semiprivate keys work out.

On the contrary, dircaps can be shorter than immutable filecaps because
they do not need collision resistance.

--
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com