[Tahoe-dev] [p2p-hackers] announcing Allmydata Tahoe

Tue May 8 23:47:16 PDT 2007

On 5/8/07, zooko at zooko.com <zooko at zooko.com> wrote:
> And how is flŭd going?

  Thanks for asking!  I've been having a lot of fun with it.  It
certainly has taken me longer to get to where it is now than if I
could devote myself full-time to it, but that has been good in a way
-- I've had a lot of time to really think through its design.  It is
getting very close to being ready for another release that will be a
bit less uber-alpha. :)

>  I like the documentation and blog, especially this
> entry: [1].
> [1] http://flud.org/blog/2007/04/26/eradicating-service-outages-once-and-for-all/

  Thanks.  I'm sure it's all preaching to the choir -- old hat stuff
to the p2p-hackers crowd, but it was largely inspired by personal
experiences I've had with both using and trying to maintain
centralized services.  The sad truth is, I fear, that unnoticed
service outages and/or data loss occur in centrally managed
architectures far more frequently than is reported.

> And I love your bibliography [2], especially the "Security /
> Robustness / Resilience" section.
> [2] http://www.flud.org/wiki/index.php/RelatedPapers

  I hope it's a useful resource.  The game theory stuff and
fairness-enforcing mechanisms (http://www.flud.org/wiki/index.php/Fairness) are
areas I get a bit excited about, and it is rewarding to finally have
enough pieces in place in flŭd to start really making that stuff work.

> You're right on all counts.  Allmydata is in the business of consumer backup,
> but as we design Tahoe, we're alert for opportunities to make it more flexible
> so that it can be extended to other purposes.  So far this has worked well --
> the way that we designed for alacrity and streaming, for example, hasn't caused
> any problems for "batch mode" backup and restore as far as I can tell.

  It's always a series of tradeoffs with these things.  In contrast to
Tahoe, with flŭd I decided to focus the architecture squarely on
backup.  The common case for backup is writing data to the grid, which
is the opposite of the common case for file sharing.  In a pure backup
system, an individual user will restore backed-up data very rarely,
but will be sending data to the backup system throughout the day,
every day, all the time.  And on those rare occasions when a user does
need to restore data, they are going to want a full copy of their
file[s].  I suppose I'm claiming that features like partial file
streaming are going to be of little value in that scenario, and if
they impose unnecessary costs, could actually be a net negative.  In
light of Tahoe's more generic goals, it sounds like you believe these
potential negatives are more than offset by the additional flexibility
and future opportunity to leverage the architecture.  Is that a fair
assessment?

  To me it has always seemed useful to have distinct decentralized
storage networks tuned for different usage scenarios.  BitTorrent has
probably influenced me a bit in that thinking; you'd never use
BitTorrent for redundant archiving (unless you have permanently
popular content ;) ), but what BT does well is what it has focused on:
swarming cooperative download.  "Grid Computing," which the press
fawned over for some time, has also influenced my thinking, but in the
opposite direction; here's a set of super generic distributed storage
technologies, and yet it doesn't really seem to have caught on with
anyone other than its big corporate creators.  Maybe I have
exaggerated the effect, but it seems like focused specialization
benefits the backup scenario.

  Having said all that, I will admit that having a single storage
technology that can be applied to multiple uses is attractive.  Which
is why I'm interested in how you deal with those tradeoffs and related
performance implications.  In other words, if it is possible to create
separate decentralized storage systems that are tuned to specific
usage scenarios, and in those scenarios yield significant (e.g., >2x)
performance advantages over the generic architecture, couldn't that be
a problem for Tahoe?  If, on the other hand, such specialization
yields no or only moderate improvement (e.g., 1.1x), then it seems
that an approach like Tahoe's clearly wins the day.  Absolute
performance isn't everything: lots of high-performance technologies
have been marginalized by lower-performing competitors, because
"slightly better" is usually more than offset by "cheaper," "better
marketed," and/or "popular + network effects."

> I see that you are using LDPC: [3].  I would be interested to see how it
> performs compared to my zfec library [4]

  I've found that once a design problem convinces you to make a
focused choice like I made with flŭd, it takes you down a certain
road.  LDPC vs RS was one of those choices that became very clear once
I decided that flŭd was focused only on the backup use case, where
encode-and-upload time dominates download-and-decode time.  I don't
remember off-hand by what factor LDPC is faster than Rizzo's
Reed-Solomon code for encoding, but when I was evaluating, it was
several times faster :) .  Decoding is similarly faster, but the
tradeoff is that for decode, LDPC might need to recover an extra chunk
(or several).  That seems like a very decent tradeoff, especially
given that we can choose parameters to make guarantees like "you'll
only need to recover 25/50 chunks to rebuild the file."  The benefit
is magnified by the above observation that backup is the common case:
users' daily experience will include snappier backup and friendlier
CPU burnage.
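
  To make the encode-heavy argument concrete, here is a rough
back-of-the-envelope sketch in Python.  Every number in it is made up
for illustration (the per-GB costs, the yearly volumes, and the 10%
extra-chunk overhead are assumptions, not measurements of LDPC, Rizzo's
code, or zfec), and the cost model itself is deliberately crude:

```python
def workload_cost(encode_s_per_gb, decode_s_per_gb,
                  gb_written, gb_restored, extra_fetch_fraction=0.0):
    """Total seconds spent encoding and decoding for one workload.

    extra_fetch_fraction models a codec that must recover a few more
    chunks than the theoretical minimum before it can decode.
    """
    encode = encode_s_per_gb * gb_written
    decode = decode_s_per_gb * gb_restored * (1.0 + extra_fetch_fraction)
    return encode + decode


# Hypothetical backup-style workload: lots of writes, one rare restore.
GB_WRITTEN = 365.0   # assumed: about 1 GB backed up per day for a year
GB_RESTORED = 50.0   # assumed: a single full restore during that year

# Hypothetical codecs: one slower but optimal, one faster but needing
# 10% more chunks at decode time.
slow_codec = workload_cost(40.0, 40.0, GB_WRITTEN, GB_RESTORED)
fast_codec = workload_cost(10.0, 10.0, GB_WRITTEN, GB_RESTORED,
                           extra_fetch_fraction=0.1)

print("slow codec: %.0f s, fast codec: %.0f s" % (slow_codec, fast_codec))
```

  With a write-dominated workload like this one, the decode-side
overhead barely registers in the total; the picture would change for a
read-heavy workload such as file sharing.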

  The other thing that LDPC does for flŭd is provide efficient memory
operations even over very large files without segmenting them.
Whether encoding a 1K file or a 2GB file, each encodes to N+M
data+parity blocks.  (It is possible to do this with RS, even with a
2^8 Galois field; it just takes a lot more computation.)  The LDPC library
simplifies the per-file metadata requirements and makes it possible to
have a small, fixed size hashlist for every file, which has relevance
to the Merkle Tree vs. Hash List discussion you and David have been
having.  For Tahoe, which has more generic goals, maybe this is not an
advantage (since you want to chunk files into segments for reasons
other than satisfying the encoder), but for flŭd it is (since it
simplifies the design and improves performance).
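
  A minimal sketch of the fixed-size hash list idea, in Python.  This
is not flŭd's code: the function names, the choice of SHA-256, the
zero padding, and N=20 are all assumptions for illustration, and the
M parity blocks (which a real encoder would produce and hash the same
way) are left out.  It also works entirely in memory, which a real
implementation handling 2GB files would not do:

```python
import hashlib
from typing import List


def split_fixed_count(data: bytes, n_blocks: int) -> List[bytes]:
    """Split data into exactly n_blocks equal blocks, zero-padding the tail.

    The block size grows with the input, so the block count stays fixed.
    """
    block_size = max(1, -(-len(data) // n_blocks))  # ceiling division
    padded = data.ljust(block_size * n_blocks, b"\0")
    return [padded[i * block_size:(i + 1) * block_size]
            for i in range(n_blocks)]


def hash_list(blocks: List[bytes]) -> List[bytes]:
    """One digest per block; the list length equals the block count."""
    return [hashlib.sha256(block).digest() for block in blocks]


N = 20  # hypothetical number of data blocks per file

small = hash_list(split_fixed_count(b"x" * 1024, N))
large = hash_list(split_fixed_count(b"x" * (4 * 1024 * 1024), N))

# Both files end up with the same amount of per-file hash metadata.
print(len(small), len(large))
```

  The point is only that the metadata size depends on N (and M), not
on the file size, which is what keeps the hash list small and fixed.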

  I'll have to play with zfec.  I've always been impressed with how
much performance Rizzo was able to squeeze out of Reed-Solomon.
You've done a lot of people a great favor by providing a Python
wrapper.  I've been meaning to pull the LDPC wrapper (which needs some
serious rewriting, btw) out of flŭd and make it a separate release for
some time now.  Hope you don't mind if I follow your example.

  Alen