[Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe
David Barrett
dbarrett at quinthar.com
Tue May 8 14:26:27 PDT 2007
I agree that "alacrity" is a good goal. But how is a Merkle tree easier or
faster to download than a simple list of hashes?
If a content block is, say, 256KB and a SHA-256 hash is 32 bytes, then
wouldn't the fastest way to start streaming be to just download the hash list
in parallel with the stream? After downloading just 32 bytes of hash data,
you'd be able to verify and stream out 256KB of content.
Indeed, the hash list for a 4GB file is only about half a meg.
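
Just to sanity-check my own arithmetic (assuming 256KB blocks and 32-byte
SHA-256 hashes, which may not be what Tahoe actually uses):

    BLOCK_SIZE = 256 * 1024      # 256KB per content block (my assumption)
    HASH_SIZE  = 32              # SHA-256 digest length in bytes
    FILE_SIZE  = 4 * 1024**3     # 4GB file

    num_blocks = (FILE_SIZE + BLOCK_SIZE - 1) // BLOCK_SIZE
    hash_list_bytes = num_blocks * HASH_SIZE

    print(num_blocks)            # 16384 blocks
    print(hash_list_bytes)       # 524288 bytes, i.e. 512KB -- "half a meg"
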
So the Merkle tree sounds great, but I'm still not seeing why it's better
than a simple list of hashes.
Unless, perhaps, you can't trust where you're downloading the hash list from
in the first place. In that case, you'd need to download the entire hash list
and then perform some verification on it. For a 4GB file, that means you
wouldn't be able to verify anything until you've downloaded and verified the
whole half-meg of hashes. But for a Merkle tree you'd only need the leaf hash
and the siblings of each of its ancestors (the ancestors themselves you can
recompute on the way up to the trusted root). Then you could verify your
subset of the Merkle tree, and begin verifying the content, before having the
rest of the hashes.
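
In other words, something like this little check (just a sketch of
binary-tree path verification with SHA-256; I'm not claiming this is exactly
how Tahoe hashes things):

    import hashlib

    def verify_merkle_path(block, leaf_index, sibling_hashes, trusted_root):
        # Recompute the root from one block plus the sibling hash at each
        # level of a binary tree, then compare against the trusted root.
        node = hashlib.sha256(block).digest()
        index = leaf_index
        for sibling in sibling_hashes:
            if index % 2 == 0:      # our node is a left child
                node = hashlib.sha256(node + sibling).digest()
            else:                   # our node is a right child
                node = hashlib.sha256(sibling + node).digest()
            index //= 2
        return node == trusted_root

For 16384 blocks that's a 14-level tree, so roughly 14 * 32 = 448 bytes of
sibling hashes per block instead of the whole 512KB list up front.
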
If this is the case, are you really able to obtain the partial Merkle-tree
subset faster than just downloading the entire hash list? Or is there some
other advantage to the Merkle tree that I've overlooked?
-david
> -----Original Message-----
> From: p2p-hackers-bounces at lists.zooko.com [mailto:p2p-hackers-
> bounces at lists.zooko.com] On Behalf Of zooko at zooko.com
> Sent: Tuesday, May 08, 2007 2:08 PM
> To: tahoe-dev at allmydata.org; p2p-hackers at lists.zooko.com
> Subject: Re: [Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe
>
>
> david wrote:
> >
> > Wow, the project looks really great! I'm curious why you chose to use
> > Merkle trees, however, instead of a simple hash list. What's the
> > advantage of the additional complexity?
>
> Thanks!
>
> The reason for the Merkle Tree is so that you can verify the correctness of
> the first part of the file sooner. We call this "alacrity" -- the elapsed
> time between deciding you want to download a file and being able to read
> the first few bits of cleartext. With a simple hash list and a very large
> file, you have to download a very long hash list (or else a short hash list
> and a very large block to check if it matches the first hash in the list)
> before you can safely use the first byte of the data.
>
> Brian Warner wrote a script [1] to calculate the overhead (bytes used for
> verification metadata divided by bytes used for data) and alacrity (bytes
> downloaded before being able to use the first byte of cleartext) for the
> hash-list and Merkle Tree options. See appended output from that script.
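>
> In rough terms (a simplified sketch that ignores the erasure coding and the
> other details the real sizes.py accounts for), the two metrics compare like
> this for a flat hash list versus a binary Merkle Tree:
>
>     import math
>
>     HASH_SIZE = 20    # the appended output assumes 20-byte hashes
>
>     def hash_list_costs(num_blocks, block_size):
>         metadata = num_blocks * HASH_SIZE
>         overhead = metadata / float(num_blocks * block_size)
>         alacrity = metadata + block_size       # whole list + the first block
>         return overhead, alacrity
>
>     def merkle_tree_costs(num_blocks, block_size):
>         metadata = 2 * num_blocks * HASH_SIZE  # ~2N nodes in a binary tree
>         overhead = metadata / float(num_blocks * block_size)
>         depth = int(math.ceil(math.log(max(num_blocks, 2), 2)))
>         alacrity = depth * HASH_SIZE + block_size  # one path + first block
>         return overhead, alacrity
>
> The storage overheads are within a factor of two of each other; alacrity is
> where the tree wins, because the verification path grows with log2 of the
> number of blocks instead of linearly.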
>
> It turns out that this might be useful not only for people who are
> impatient to use the first part of a file that they are downloading, but
> also for probabilistic verifiers who want to sample only one or a few
> blocks instead of downloading the whole thing.
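>
> For example (again just a rough sketch, not the real verifier),
> spot-checking one randomly chosen block costs about one block plus
> log2(num_blocks) hashes:
>
>     import math, random
>
>     def spot_check_cost(num_blocks, block_size, hash_size=20):
>         index = random.randrange(num_blocks)   # which block to sample
>         depth = int(math.ceil(math.log(max(num_blocks, 2), 2)))
>         return index, block_size + depth * hash_size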
>
> By the way, here is a document with pictures describing how this part of
> Tahoe works: [2].
>
> Regards,
>
> Zooko
>
> [1] http://allmydata.org/trac/tahoe/browser/misc/sizes.py
> [2] http://allmydata.org/trac/tahoe/wiki/TahoeIssues/FileEncoding
>
>
> In the following, k is the arity of the hash tree and d is its depth. So
> the hash list approach is where d == 1.
>
> ------- begin appended output from misc/sizes.py
>
> $ python -OOu ./misc/sizes.py --mode beta
> mode=beta # k=num_subblocks, d=1 # each subblock has a 20-byte hash
>                       storage    storage
>    Size  blocksize   overhead   overhead       k   d  alacrity
>                       (bytes)        (%)
> -------  ---------  ---------  ---------  ------  --  --------
>       4       0.16      1.95k   50000.00       1   1      0.16
>      16       0.64      1.95k   12500.00       1   1      0.64
>      64       2.56      1.95k    3125.00       1   1      2.56
>     256      10.24      1.95k     781.25       1   1     10.24
>      1k      40.96      1.95k     195.31       1   1     40.96
>      4k     163.84      1.95k      48.83       1   1    163.84
>     16k     655.36      1.95k      12.21       1   1    655.36
>     64k      2.56k      1.95k       3.05       1   1     2.56k
>    256k     10.24k      1.95k       0.76       1   1    10.24k
>      1M     40.96k      1.95k       0.19       1   1    40.96k
>      4M    163.84k      3.91k       0.10       2   1    81.94k
>     16M    655.36k     15.62k       0.10       8   1    82.06k
>     64M      2.56M     62.50k       0.10      32   1    82.53k
>    256M     10.24M      250 k       0.10     128   1    84.40k
>      1G     40.96M     1000 k       0.10     512   1    91.90k
>      4G    163.84M      3.91M       0.10    2048   1   121.90k
>     16G    655.36M     15.62M       0.10    8192   1   241.90k
>     64G      2.56G     62.50M       0.10   32768   1   721.90k
>    256G     10.24G      250 M       0.10  131072   1     2.58M
>      1T     40.96G     1000 M       0.10  524288   1    10.08M
> $ python -OOu ./misc/sizes.py --mode gamma
> mode=gamma # self.subblock_arity = k = arity
>                       storage    storage
>    Size  blocksize   overhead   overhead       k   d  alacrity
>                       (bytes)        (%)
> -------  ---------  ---------  ---------  ------  --  --------
>       4       0.16          0       0.00       2   0      0.16
>      16       0.64          0       0.00       2   0      0.64
>      64       2.56          0       0.00       2   0      2.56
>     256      10.24          0       0.00       2   0     10.24
>      1k      40.96          0       0.00       2   0     40.96
>      4k     163.84          0       0.00       2   0    163.84
>     16k     655.36          0       0.00       2   0    655.36
>     64k      2.56k          0       0.00       2   0     2.56k
>    256k     10.24k          0       0.00       2   0    10.24k
>      1M     40.96k          0       0.00       2   0    40.96k
>      4M    163.84k      3.91k       0.10       2   1    81.94k
>     16M    655.36k     27.34k       0.17       2   3    81.98k
>     64M      2.56M    121.09k       0.18       2   5    82.02k
>    256M     10.24M    496.09k       0.19       2   7    82.06k
>      1G     40.96M      1.95M       0.19       2   9    82.10k
>      4G    163.84M      7.81M       0.19       2  11    82.13k
>     16G    655.36M     31.25M       0.19       2  13    82.17k
>     64G      2.56G      125 M       0.19       2  15    82.21k
>    256G     10.24G      500 M       0.19       2  17    82.25k
>      1T     40.96G      1.95G       0.19       2  19    82.29k
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers at lists.zooko.com
> http://lists.zooko.com/mailman/listinfo/p2p-hackers