[Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe

zooko at zooko.com zooko at zooko.com
Tue May 8 14:43:28 PDT 2007


 david wrote:
>
> Unless, perhaps, you can't trust where you're downloading the hash list from
> in the first place.

Yes -- both of the schemes that we evaluated started with the downloader or
verifier having a short string that is the verification key for the file.  In
the hash list scheme, that key is the secure hash of the hash list.  In the
Merkle Tree scheme, it is the root of the Merkle Tree.  The idea is that you
can enable someone to download or verify a file by giving them only a small
key, rather than by giving them a large key (the entire hash list), or by
giving them the ability to interactively query you later for more of the
key that they would need.

Make sense?


>   In that case, you'd need to download the entire
> hashlist and then perform some verification on that.  For a 4GB file it
> means you wouldn't be able to verify anything until you've downloaded and
> verified all half-meg of hashes.  But for a Merkle tree you'd only need the
> leaf hash, all ancestor hashes, and all the siblings of the ancestors.  Then
> you could verify your subset of the Merkle tree, and begin verifying the
> content, before having the rest of the hashes.

Exactly.


> If this is the case, are you really able to obtain the partial Merkle-tree
> subset faster than just downloading the entire hashlist?  Or is there some
> other advantage to the Merkle tree that I've overlooked?

The numbers work out like this (copied from the post-script of my earlier
message and excerpted):

hash list approach:

                    storage    storage
Size     blocksize  overhead   overhead     k  d  alacrity
                    (bytes)      (%)
-------  -------    --------   --------  ---- --  --------
   256    10.24        1.95k    781.25      1  1   10.24
    64k    2.56k       1.95k      3.05      1  1    2.56k
    16M  655.36k      15.62k      0.10      8  1   82.06k
     4G  163.84M       3.91M      0.10   2048  1  121.90k
     1T   40.96G    1000   M      0.10   524288  1   10.08M

Merkle Tree approach:

                    storage    storage
Size     blocksize  overhead   overhead     k  d  alacrity
                    (bytes)      (%)
-------  -------    --------   --------  ---- --  --------
   256    10.24        0          0.00      2  0   10.24
    64k    2.56k       0          0.00      2  0    2.56k
    16M  655.36k      27.34k      0.17      2  3   81.98k
     4G  163.84M       7.81M      0.19      2 11   82.13k
     1T   40.96G       1.95G      0.19      2 19   82.29k

They have equivalent alacrity up to files of size 16 M, above which the Merkle
Tree approach has better alacrity.


Regards,

Zooko



More information about the Tahoe-dev mailing list