[volunteergrid2-l] Problems

Shawn Willden shawn at willden.org
Thu Oct 4 15:30:04 UTC 2012


On Thu, Oct 4, 2012 at 8:58 AM, Peter Secor <secorp at gmail.com> wrote:

> Turn it up to 11!
>
> One failure case that I ran into a couple times is what happened to
> Allmydata (the company).


Yes, that's one of several reasons I don't like 3/7/10, at least in theory.

If the shares of each of your files (including dirnodes) are each deployed
on a small subset of the available servers then the failures of the files
are effectively independent, which is -- somewhat counterintuitively -- a
bad thing.  It means that if each file has a reliability probability p, and
you have a file which is at the end of a chain of n dirnodes, then the
probability that you'll have access to that file is the probability that
*all* of the n+1 nodes in the chain are available, which is p^(n+1).  If you
have chains of any significant depth that can really degrade your net
reliability.

For example, assuming 80% of the servers were available, the nominal
per-file reliability of 3-of-7 (IMO, total shares isn't all that
meaningful; I prefer to look at shares-of-happiness) is about 99.5%.  But
if you have a 20-deep directory hierarchy, that drops to 91%.  Also, even
ignoring directory hierarchies, if you have 1,000 files the odds of having
access to all of them are under 1%.
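
Figures like these can be checked with a short binomial calculation.  The
sketch below is only an illustration, not anything Tahoe-LAFS ships: it
assumes each share sits on a distinct server, that servers are up
independently with probability 0.8, and that a file is retrievable when at
least k of its n placed shares are reachable.  The function names are made
up for this example.

```python
from math import comb


def file_reliability(k, n, p):
    """Probability that at least k of n shares are reachable.

    Assumes one share per server and that each server is independently
    available with probability p (a simplifying assumption).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chain_reliability(per_file, depth):
    """Probability of reaching a file that sits below `depth` dirnodes.

    All depth + 1 objects (the dirnodes plus the file itself) must be
    retrievable, and their failures are treated as independent.
    """
    return per_file ** (depth + 1)


per_file = file_reliability(3, 7, 0.8)
deep_file = chain_reliability(per_file, 20)
all_of_1000 = per_file**1000  # every one of 1,000 independent files

print(f"per-file reliability (3-of-7, p=0.8): {per_file:.4%}")
print(f"file under 20 dirnodes:               {deep_file:.2%}")
print(f"all of 1,000 files available:         {all_of_1000:.2%}")
```

The point of the second and third lines of output is that the per-file
number is raised to a power, so even a small per-file failure rate
compounds quickly.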

If, instead, N is set to the number of servers available and you choose K
to give an appropriate level of reliability, all of your files will live or
die together, which is much better.  You can pick a reliability level and
it will be independent of the depth of your tree or the number of your
files.  Also, using larger values of N results in smaller expansion factors
for a given reliability level.
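
To illustrate the expansion-factor point, here is a rough sketch that, for
several values of N, finds the largest K that still meets a target
reliability and reports the resulting expansion N/K.  The 80% server
availability, the 99.5% target, and the helper names are assumptions chosen
for illustration; it also assumes one share per server and independent
server failures.

```python
from math import comb


def reliability(k, n, p):
    """Probability that at least k of n independently available servers
    (each up with probability p) are reachable."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def max_k_for_target(n, p, target):
    """Largest k whose k-of-n reliability is at least `target`.

    Reliability only falls as k grows, so scanning from k = n downward
    and stopping at the first hit gives the largest workable k.
    Returns None if even k = 1 misses the target.
    """
    for k in range(n, 0, -1):
        if reliability(k, n, p) >= target:
            return k
    return None


P_UP = 0.8      # assumed per-server availability
TARGET = 0.995  # assumed reliability goal

for n in (7, 10, 20, 40):
    k = max_k_for_target(n, P_UP, TARGET)
    if k is None:
        print(f"N={n:2d}: target not reachable")
    else:
        print(f"N={n:2d}: K={k:2d}, expansion={n / k:.2f}")
```

Under these assumptions the expansion factor shrinks as N grows, because
the fraction of servers that are up concentrates around 80% when there are
more of them.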

Except... values that seem like they should work well on a grid the size of
VG2 don't.  Setting N close to the number of nodes in the grid, or setting
H close to N, results in large numbers of upload failures, though download
availability does seem to be good if you can manage to get the files
uploaded.

-- 
Shawn