[Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe

zooko at zooko.com
Tue May 8 09:03:43 PDT 2007


 Alen Peacock wrote:
>
> Welcome to the world, Tahoe!  And congrats to the team on the public release.

Thanks!  I'm very pleased that we already have a handful of hackers poking at
it.  The next step is to fix issue #22 so that we can have friend-nets for
backup and sharing which traverse firewalls (when the firewalls are manually
configured to forward a port).


> I'm curious about application level goals for tahoe.  I know that
> allmydata is in the backup business, but it seems like several design
> decisions in tahoe are driven by more generic distributed storage
> constraints (desire for 'alacrity' in download, streaming, etc.),
> i.e., it seems that tahoe could be used as a generic file-sharing
> service, which has different usage patterns and common case scenarios
> from backup.  Is that an accurate characterization?  Care to comment
> on the tradeoffs/benefits of the design in this regard?

You're right on all counts.  Allmydata is in the business of consumer backup,
but as we design Tahoe, we're alert for opportunities to make it more flexible
so that it can be extended to other purposes.  So far this has worked well --
the way that we designed for alacrity and streaming, for example, hasn't caused
any problems for "batch mode" backup and restore as far as I can tell.


> I also have some questions about the Verifier, driven mostly by my
> ignorance I'm sure :)  It is clear that it could verify the file from
> the verifierid, but I think this requires it to download and decode
> the file, right?  Is that sufficient to detect which shares might be
> corrupt (to assess blame to a specific node), or is there an
> additional hash for each share that is stored somewhere other than
> with the share (in the URI, perhaps)?  Or some other means to detect
> corrupt data without reconstructing files remotely?

There is a Merkle Tree.  A verifier is given the root of the Merkle Tree and
can thus validate the correctness of any block given only that block and the
necessary Merkle-Tree-Uncle-Nodes.  To validate every single block would still
require downloading all the blocks, but perhaps a verifier could randomly
sample blocks and thus be satisfied.
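To make the idea concrete, here is a minimal sketch of that kind of check -- not Tahoe's actual code, just an illustration with hypothetical names (h, build_tree, proof_for, verify): the verifier holds only the root, and recomputes it from one block plus the sibling ("uncle") hashes along that block's path.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return a list of levels, leaves first, root last.
    Assumes len(blocks) is a power of two, for simplicity."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def proof_for(levels, index):
    """Collect the sibling hash at each level -- the 'uncle nodes'."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling shares the same parent
        index //= 2
    return path

def verify(root, block, index, path):
    """Recompute the root from one block plus its uncle hashes."""
    node = h(block)
    for sibling in path:
        # even index = left child, odd = right child at this level
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

blocks = [b"block-%d" % i for i in range(8)]
levels = build_tree(blocks)
root = levels[-1][0]
assert verify(root, blocks[3], 3, proof_for(levels, 3))
assert not verify(root, b"corrupted", 3, proof_for(levels, 3))
```

Note that the proof for one block is only log2(n) hashes, which is what makes random sampling cheap: each sampled block can be checked in isolation, without downloading or decoding anything else.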


> Very cool stuff.  Congrats again.  Allmydata should be commended for
> releasing this openly.

Thanks again!  And thanks for the good questions.

And how is flŭd going?  I like the documentation and blog, especially this
entry: [1].  And I love your bibliography [2], especially the "Security /
Robustness / Resilience" section.

I see that you are using LDPC: [3].  I would be interested to see how it
performs compared to my zfec library [4].  Certainly LDPC has a significant
advantage insofar as it can produce large numbers of check blocks, but Tahoe is
currently designed to require no more than a hundred or so, so we can do
without that feature.  In fact, I threw out the GF(2^16) parts of Luigi Rizzo's
fec library in order to have a simpler and potentially faster GF(2^8) library,
because I plan to need no more than 256 shares for the foreseeable future.
fec library in order to have a simpler and potentially faster GF(2^8) library,
because I plan to need no more than 256 shares for the forseeable future.
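To illustrate why the field size caps the share count -- this is a toy sketch, not zfec's or Rizzo's actual code -- each share can be taken as the data polynomial evaluated at a distinct nonzero field element, and GF(2^8) has only 255 of those.  The 2-of-m case below hardcodes k=2 to keep the recovery step readable; the names and the choice of reduction polynomial are illustrative.

```python
POLY = 0x11d  # a common reduction polynomial for GF(2^8): x^8+x^4+x^3+x^2+1

def gf_mul(a, b):
    """Carry-less multiply with reduction: multiplication in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= POLY
    return r

def gf_inv(a):
    """a^254 == a^-1: the nonzero elements form a group of order 255."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def encode(d0, d1, m):
    """2-of-m: share at x is p(x) for p(x) = d0 + d1*x, x in 1..m.
    m is capped at 255: GF(2^8) has only 255 distinct nonzero points."""
    return [(x, d0 ^ gf_mul(d1, x)) for x in range(1, m + 1)]

def decode(share_a, share_b):
    """Recover (d0, d1) from ANY two shares by solving the 2x2 system.
    In GF(2^8) both addition and subtraction are XOR."""
    (xa, ya), (xb, yb) = share_a, share_b
    d1 = gf_mul(ya ^ yb, gf_inv(xa ^ xb))
    d0 = ya ^ gf_mul(d1, xa)
    return d0, d1

shares = encode(0x42, 0x17, 5)
assert decode(shares[1], shares[4]) == (0x42, 0x17)
```

A GF(2^16) code lifts the cap to 65535 shares, but every field operation gets more expensive and the lookup tables get much larger -- the trade Zooko describes making in reverse.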


Regards,

Zooko

[1] http://flud.org/blog/2007/04/26/eradicating-service-outages-once-and-for-all/
[2] http://www.flud.org/wiki/index.php/RelatedPapers
[3] http://flud.googlecode.com/svn/trunk/coding/
[4] http://allmydata.org/trac/tahoe/browser/src/zfec/README.txt
