[volunteergrid2-l] Why high availability is crucial

Billy Earney billy.earney at gmail.com
Sun Jan 16 00:55:59 UTC 2011


Shawn,

You obviously know more about this than I do.

So, in summary, are you suggesting:

(S) # of servers: 20
(H) happiness: 15
(K) # of shares needed: 7



On Sat, Jan 15, 2011 at 5:10 PM, Shawn Willden <shawn at willden.org> wrote:

> It may seem that the reason maintaining high node uptime is important is so
> that files can be retrieved reliably, i.e. read-availability.  In fact, the
> bigger hurdle is maintaining write-availability.  This is fairly obvious,
> since to read you only need K servers, to write you need H servers, and
> usually H is significantly larger than K.
>
> I think it's even more important than it appears, however, because I think
> there's value in setting H very close to S (the number of servers in the
> grid).  If S=20 and H=18, then clearly it's crucial that availability of
> individual servers be very high, otherwise the possibility of more than two
> servers being down at once is high, and the grid is then unavailable for
> writes.
>
> So, why would you want to set H very high, rather than just sticking with
> the 3/7/10 parameters provided by default?
>
> There are two reasons you might want to increase H.  The first is to
> increase read-reliability and the second is so that you can increase K and
> reduce expansion while maintaining a certain level of read-reliability.  For
> purposes of determining the likelihood that a file will be available at some
> point in the future, I ignore N.  Setting H and N to different values is
> basically saying "I'll accept one level of reliability, but if I happen to
> get lucky I'll get a higher one".  That's fine, but when determining what
> parameters to choose, it's H and K that make the difference.  In fact, if
> S happens to decline so that at the moment of your upload S=H, then any
> value of N > H is a waste.
>
> If you want to find out what kinds of reliability you can expect from
> different parameters, there's a tool in the Tahoe source tree.
>  Unfortunately, I haven't done the work to make it available from the web
> UI, but if you want you can use it like this:
>
> 1.  Go to the tahoe/src directory.
> 2.  Run python without any command-line arguments to start the python
> interpreter.
> 3.  Type "import allmydata.util.statistics as s" to import the statistics
> module and give it a handy label (s).
> 4.  Type "s.pr_file_loss([p]*H, K)", where "p" is the server reliability,
> and H and K are the values you want to evaluate, as in the example
> session sketched below.
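>
> For example, evaluating H=15 and K=7 at a server reliability of p=0.95
> (illustrative values, not a recommendation) looks like this:
>
>     $ cd tahoe/src
>     $ python
>     >>> import allmydata.util.statistics as s
>     >>> s.pr_file_loss([0.95]*15, 7)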
>
> What value to use for p?  Well, ideally it's the probability that the data
> on the server will _not_ become lost before your next repair cycle.  To be
> conservative, I just use the server _availability_ target, which I'm
> proposing is 0.95.
>
> The value you get is an estimate of the likelihood that your file will be
> lost before the next repair cycle.  If you want to understand how it's
> calculated and maybe argue with me about its validity, read my lossmodel
> paper (in the docs dir).  I think it's a very useful figure.
>
> However, unless you're only storing one file, it's only part of the story.
>  Suppose you're going to store 10,000 files.  On a sufficiently-large grid
> (which volunteergrid2 will not be), you can model the survival or failure of
> each file independently, which means the probability that all of your files
> survive is "(1-s.pr_file_loss([p]*H, K))**10000".  Since volunteergrid2 will
> not be big enough for the independent-survival model to be accurate, the
> real estimate would fall somewhere between that figure and
> "1-s.pr_file_loss([p]*H, K)", which is the single-file survival probability.
>  To be conservative, I choose to pay attention to the lower probability,
> which is the 10,000-file number.
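>
> Concretely, continuing the illustrative session above (H=15, K=7,
> p=0.95), the two bounds come out as:
>
>     >>> p_loss = s.pr_file_loss([0.95]*15, 7)
>     >>> (1 - p_loss)**10000   # all 10,000 files survive (independent model)
>     >>> 1 - p_loss            # a single file survives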
>
> Anyway, if you use that tool and spend some time playing with different
> values of H and K, what you find is that if you increase H you can increase
> K and reduce your expansion factor while maintaining your survival
> probability.  If you think about it, this makes intuitive sense, because
> although you're decreasing the amount of redundancy, you're actually
> increasing the number of servers that must fail in order for your data to
> get lost.  With 3/7, if five servers fail, your data is gone.  With 7/15,
> nine servers must fail.  With 35/50, 16 must fail.  Of course that's 5 out
> of 7, 9 out of 15, and 16 out of 50; but still, with relatively high
> server availability, the probabilities of those failure counts are very
> close to the same.
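>
> A quick way to see the tradeoff in the same interpreter session (again
> with p=0.95):
>
>     >>> for k, h in [(3, 7), (7, 15), (35, 50)]:
>     ...     print("%d/%d: %g" % (k, h, s.pr_file_loss([0.95]*h, k)))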
>
> From a read-performance perspective there's also some value in increasing
> K, because it will allow more parallelism of downloads -- at least in
> theory.  With the present Tahoe codebase that doesn't help as much as it
> should, but it will be fixed eventually.  (At present, you do download in
> parallel from K servers, but all K downloads are limited to the speed of the
> slowest, so your effective bandwidth is K*min(server_speeds).  If that were
> fixed, it would just be the sum of the bandwidth available to the K
> servers.)
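>
> (Hypothetical illustration: with K=7, if the slowest of the seven
> servers you're downloading from delivers 50 KB/s, your effective rate
> today is 7 * 50 = 350 KB/s, even if the other six could each go much
> faster; the fixed behavior would instead give the sum of all seven
> rates.)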
>
> So, if we can take as a given that larger values of K and H are a good
> thing (and I'm happy to go into more detail about why that is if anyone
> likes; I've glossed over a lot here), then the best way to choose your
> parameters is, ideally, to set H=S and then choose the largest K that
> gives you the level of reliability you're looking for.
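>
> A sketch of that search, continuing the interpreter session from above
> (the 1e-6 loss target is just an assumed example):
>
>     >>> H, p, target = 18, 0.95, 1e-6
>     >>> max(k for k in range(1, H + 1)
>     ...     if s.pr_file_loss([p]*H, k) <= target)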
>
> But if you set H=S, then even a single server being unavailable means that
> the grid is unavailable for writes.  So you want to set H a little smaller
> than S.  How much smaller?  That depends on what level of server
> availability you have, and what level of write-availability you require.
>
> I'd like to have 99% write-availability.  If we have a 95% individual
> server availability and a grid of 20 servers, the probability that at least
> a given number of servers is available at any given moment is:
>
> 20 servers: 35.8%
> 19 servers: 73.6%
> 18 servers: 92.5%
> 17 servers: 98.4%
> 16 servers: 99.7%
> 15 servers: 99.9%
>
> Again, if anyone would like to understand the way I calculated those, just
> ask; a minimal sketch follows.
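>
> This is plain Python, independent of Tahoe; it just sums the binomial
> tail for n=20 servers at p=0.95:
>
>     from math import factorial
>
>     def choose(n, r):
>         return factorial(n) // (factorial(r) * factorial(n - r))
>
>     def pr_at_least(m, n=20, p=0.95):
>         # probability that at least m of the n servers are up at once
>         return sum(choose(n, i) * p**i * (1 - p)**(n - i)
>                    for i in range(m, n + 1))
>
>     for m in range(20, 14, -1):
>         print("%d servers: %.1f%%" % (m, 100 * pr_at_least(m)))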
>
> At 99.9% availability, if I can't write to the grid it's more likely
> because my network connection is down than because there aren't enough
> servers to satisfy H=15.
>
> So, that's why I'd really like everyone to commit to trying to maintain
> 95+% availability on individual servers.  In practice, if you have a
> situation that takes your box down for a few days, it's not a huge deal,
> because more than likely most of the nodes will have >95% availability, but
> what we don't want is a situation (like we have over on volunteergrid1)
> where a server is unavailable for weeks.
>
> If you can't commit to keeping your node available nearly all the time, I
> would rather that you're not in the grid.  Sorry if that seems harsh, but I
> really want this to be a production grid that we can actually use with
> very high confidence that it will always work, for both read and write.
>
> Also, sorry for the length of this e-mail :-)
>
> --
> Shawn
>