[tahoe-dev] UEB hash size

Zooko Wilcox-O'Hearn zooko at zooko.com
Sun Jul 12 19:50:43 PDT 2009


And in answer to your questions:

On Jul 12, 2009, at 18:45 PM, Shawn Willden wrote:

> What's the rationale for including the full 256-bit UEB hash in the  
> CHK URI?  Those URIs could be shortened considerably by truncating  
> it to, say, 128 bits.

The rationale is that an immutable file cap guarantees that exactly
one file matches the cap.  Getting K bits of security for that
guarantee requires 2K bits in the immutable cap, because of a
birthday-surprise attack: an attacker generates two (or more) files
with the same immutable file cap, distributes the cap to other
people, and can then undetectably swap in an alternate file for the
original.  Finding multiple matching files for a 2K-bit immutable
file cap takes only about 2^K computations (K bits of work).
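
To make the birthday bound concrete, here is a minimal sketch that
finds two inputs whose truncated SHA-256d hashes match.  It is only
an illustration: the function names, the "file-N" inputs, and the
24-bit truncation are my own assumptions, and it hashes raw byte
strings rather than building real UEBs or caps.

```python
import hashlib
from itertools import count


def sha256d(data: bytes) -> bytes:
    """Double SHA-256: SHA-256 applied to the SHA-256 digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def truncated_tag(data: bytes, bits: int) -> int:
    """Return the leading `bits` bits of SHA-256d(data) as an integer."""
    digest = sha256d(data)
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Birthday search: hash distinct inputs until two tags collide.

    Returns (first_input, second_input, number_of_inputs_hashed).
    """
    seen = {}  # tag -> input that produced it
    for attempts in count(1):
        msg = b"file-%d" % attempts  # hypothetical stand-in for a file
        tag = truncated_tag(msg, bits)
        if tag in seen:
            return seen[tag], msg, attempts
        seen[tag] = msg


if __name__ == "__main__":
    a, b, n = find_collision(24)
    print(a, b, n)
```

With a 24-bit tag the search should finish after a number of hashes
on the order of 2^12, not 2^24, which is the K-versus-2K relationship
described above.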

> How difficult would it be to allow Tahoe to operate with either  
> full UEB hashes or abbreviated hashes?

It is a neat idea.  We've discussed it before, but I can't find the  
reference.  I seem to recall that Brian had a good summary of the  
risk of publishing a shortened immutable cap.  Perhaps he just  
pointed out that in the future people may come to distrust whether  
the file that they get by retrieving with that cap is really the only  
file that could have matched.  If your shortened cap is sufficiently
long, let's say 192 bits, this risk doesn't sound like a big issue as
far as brute computing power goes -- even if people in the future
have vastly improved computation technology, 2^96 computations will
probably still be very, perhaps even prohibitively, expensive.
However, people may uncover algorithmic weaknesses in the hash
algorithm that we are using (currently SHA-256d, hopefully in the
future SHA-3), which would reduce the effective strength.
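
As a rough check on those numbers, the sketch below evaluates the
standard birthday approximation p ~= 1 - exp(-q^2 / 2^(n+1)) for an
n-bit hash after q attempts.  It assumes an ideal hash with no
algorithmic weaknesses, and the function name and parameters are
made up for this example.

```python
import math


def collision_probability(hash_bits: int, log2_attempts: float) -> float:
    """Approximate chance of at least one collision among 2**log2_attempts
    random values of an ideal hash_bits-bit hash (birthday approximation)."""
    # q^2 / 2^(n+1), computed in the exponent to avoid huge integers.
    exponent = 2 * log2_attempts - (hash_bits + 1)
    # Clamp so 2.0 ** exponent cannot overflow a float; beyond this
    # point the probability is indistinguishable from 1 anyway.
    exponent = min(exponent, 1000)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-(2.0 ** exponent))


if __name__ == "__main__":
    for bits in (128, 192, 256):
        print(bits, collision_probability(bits, 96))
```

Under this model, 2^96 attempts against a 192-bit cap already give a
substantial chance of a collision, while the same effort against the
full 256 bits gives a negligible one.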

By the way, I'm sitting on a good idea that I haven't finished  
writing up yet for how to combine the encryption key and the  
integrity-checking hash together so that you have only one value  
(perhaps of size 256 bits) instead of two values -- one for the key  
and one for the hash.  Perhaps that would solve most of your  
performance issues?  As I mentioned in my previous mail, I'd like to  
understand more about what the performance implications are in  
GridBackup.

> What is the bare minimum data needed to retrieve, reassemble and  
> decrypt an immutable file?  Just the AES read key?

That, and some way to find the shares, which we currently call the
"storage index".  Relying on only those two would omit not only the
integrity check on the ciphertext (which guarantees that the
immutable cap you started with could match only one file) but also
the integrity check on the shares (which identifies the servers
responsible for serving up corrupted shares when the resulting file
turns out to be corrupted).

Regards,

Zooko

