[Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe
David Barrett
dbarrett at quinthar.com
Tue May 8 15:34:20 PDT 2007
> -----Original Message-----
> From: zooko at zooko.com
>
> > If this is the case, are you really able to obtain the partial Merkle-tree
> > subset faster than just downloading the entire hashlist? Or is there some
> > other advantage to the Merkle tree that I've overlooked?
>
> The numbers work out like this (copied from the post-script of my earlier
> message and excerpted):
>
> hash list approach:
>
>                    storage   storage
> Size     blocksize overhead  overhead  k       d   alacrity
>                    (bytes)   (%)
> -------  --------- --------  --------  ------  --  --------
> 256      10.24     1.95k     781.25    1       1   10.24
> 64k      2.56k     1.95k     3.05      1       1   2.56k
> 16M      655.36k   15.62k    0.10      8       1   82.06k
> 4G       163.84M   3.91M     0.10      2048    1   121.90k
> 1T       40.96G    1000 M    0.10      524288  1   10.08M
>
> Merkle Tree approach:
>
>                    storage   storage
> Size     blocksize overhead  overhead  k       d   alacrity
>                    (bytes)   (%)
> -------  --------- --------  --------  ------  --  --------
> 256      10.24     0         0.00      2       0   10.24
> 64k      2.56k     0         0.00      2       0   2.56k
> 16M      655.36k   27.34k    0.17      2       3   81.98k
> 4G       163.84M   7.81M     0.19      2       11  82.13k
> 1T       40.96G    1.95G     0.19      2       19  82.29k
>
> They have equivalent alacrity up to files of size 16 M, above which the
> Merkle Tree approach has better alacrity.
Ah, I see. So 'sizes.py' effectively defines 'alacrity' as the "number of
bytes until you can stream some data". And the problem with a hashlist is
you need to download all the hashes before you can stream the first byte of
verified data.
So I can see how a Merkle tree brings the 'alacrity' down by a bunch once
the file gets over a certain size (in this case, 16MB).
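To make that arithmetic concrete, here is a rough sketch of the two alacrity
figures. It is an illustration only, not code from 'sizes.py': the function
names are hypothetical, the 20-byte hash size is an assumption that happens to
fit the quoted numbers, and the block size of 83886 bytes (about 81.92k) is
just the blocksize column divided by k for the larger rows. It also ignores
whatever special-casing the small single-block files get in the tables.

```python
def merkle_depth(num_blocks: int) -> int:
    """Number of levels between a leaf and the root of a binary tree
    built over num_blocks leaves, i.e. ceil(log2(num_blocks))."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return (num_blocks - 1).bit_length()


def alacrity_hash_list(num_blocks: int, block_size: int, hash_size: int = 20) -> int:
    """Bytes to fetch before the first block can be verified when the
    hashes are stored as a flat list: every hash, plus one block."""
    return block_size + num_blocks * hash_size


def alacrity_merkle_tree(num_blocks: int, block_size: int, hash_size: int = 20) -> int:
    """Bytes to fetch before the first block can be verified with a binary
    Merkle tree: one hash per tree level (assumed), plus one block."""
    return block_size + merkle_depth(num_blocks) * hash_size


if __name__ == "__main__":
    block_size = 83886  # assumed: roughly 81.92k, see the note above
    for num_blocks in (8, 2048, 524288):
        print(
            num_blocks,
            merkle_depth(num_blocks),
            alacrity_hash_list(num_blocks, block_size),
            alacrity_merkle_tree(num_blocks, block_size),
        )
```

Under these assumptions the hash list cost grows linearly with the number of
blocks while the Merkle tree cost grows with its logarithm, which appears to
be the gap the 4G and 1T rows show.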
Very clever!
-david