[tahoe-dev] How many servers can fail?

Brian Warner warner at lothar.com
Tue Oct 25 22:53:36 UTC 2011


On 10/25/11 9:20 AM, Dirk Loss wrote:

> To foster my understanding, I've tried to visualize what that means:
> 
>  http://dirk-loss.de/tahoe-lafs_nhk-defaults.png

Wow, that's an awesome picture! If we ever get to produce an animated
cartoon of precious shares and valiant servers and menacing attackers
battling it out while the user nervously hits the Repair button, I'm
hiring you for sure :). But yeah, your understanding is correct.

It may help to know that "H" is a relatively recent addition (1.7.0,
Jun-2010). The original design had only k=3 and N=10, but assumed that
you'd only upload files in an environment with at least N servers (in
fact the older design, maybe 1.2.0 in 2009, had the Introducer tell all
clients what k,N to use, instead of them picking it for themselves). Our
expectation was thus that you'd get no more than one share per server,
so "losing 7 servers" was equivalent to "losing 7 shares", leaving you 3
(>=k) left.

I designed the original uploader to allow uploads in the presence of
fewer than N servers, by storing multiple shares per server as necessary
to place all N shares. The code strives for uniform placement (it won't
put 7 on server A and then 1 each on servers B,C,D, unless they're
nearly full). My motivation was to improve the out-of-the-box experience
(where you spin up a test grid with just one or two servers, but don't
think to modify your k/N to match), and to allow reasonable upgrades to
more servers later (by migrating the doubled-up shares to new servers,
keeping the filecaps and encoding the same, but improving the
diversity).
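
To make that concrete, here's a rough sketch of round-robin placement
(this is not the actual uploader code; the function name and servers
are made up for illustration). It just shows how N shares end up
spread as evenly as possible over however many servers are reachable:

  def place_shares(num_shares, servers):
      """Assign share numbers 0..num_shares-1 to servers round-robin."""
      placement = {server: [] for server in servers}
      for share_num in range(num_shares):
          server = servers[share_num % len(servers)]
          placement[server].append(share_num)
      return placement

  # Full grid: one share per server.
  #   place_shares(10, ["s0", "s1", ..., "s9"])
  #     -> {"s0": [0], "s1": [1], ..., "s9": [9]}
  # Only two servers reachable: five shares each.
  #   place_shares(10, ["A", "B"])
  #     -> {"A": [0, 2, 4, 6, 8], "B": [1, 3, 5, 7, 9]}

The two-server case is exactly the doubled-up placement that causes
trouble below.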

There was a "shares of happiness" setting in that original uploader, but
it was limited to throwing an exception if too many servers drop off
*during* the upload itself (which commits to a fixed set of servers at
the start of the process). I still expected there to be plenty of
servers available, so re-trying the upload would still get you full
diversity.

The consequences of my choosing write-availability over reliability show
up when some of your servers are already down when you *start* the
upload (this wasn't a big deal for the AllMyData production grid, but
happens much more frequently in a volunteer grid). You might think you're
on a grid with 20 servers, but it's 2AM and most of those boxes are
turned off, so your upload only actually gets to use 2 servers. The old
code would cheerfully put 5 shares on each, and now you've got a 2POF
(dual-point-of-failure). The worst case was when your combination
client+server hadn't really managed to connect to the network yet, and
stored all the shares on itself (SPOF). You might prefer to get a
failure rather than a less-reliable upload: to choose reliability over
the availability of writes.

So 1.7.0 changed the old "shares of happiness" into a more accurate
(but more confusing) "servers of happiness" metric, although it
unfortunately kept the old name. It also overloaded what "k" means. So
now you set "H" to be the
size of a "target set". The uploader makes sure that any "k"-sized
subset of this target will have enough shares to recover the file. That
means that H and k are counting *servers* now. (N and k still control
encoding as usual, so k also counts shares, but share *placement* is
constrained by H and k). The uploader refuses to succeed unless it can
get sufficient diversity, where H and k define what "sufficient" means.

(there may be situations where an upload would fail even though your
data would still be recoverable: shares-of-happiness is meant to
ensure a given level of diversity/safety, choosing reliability over
write-availability)
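
To make the H/k rule concrete, here's an illustrative check (again not
the real upload code, just a sketch of the idea): count how many
servers can each be credited with a distinct share (a maximum
bipartite matching between servers and shares), and insist that count
is at least H. If H servers each hold their own distinct share, then
any k of them together hold at least k distinct shares, which is
enough to decode.

  def happiness(placement):
      """placement maps server -> set of the share numbers it holds.
      Returns how many servers can be credited with distinct shares."""
      matched = {}  # share number -> server currently credited with it

      def try_credit(server, visited):
          for share in placement[server]:
              if share in visited:
                  continue
              visited.add(share)
              # Take this share if it is free, or if its current
              # holder can be re-credited with another share it holds.
              holder = matched.get(share)
              if holder is None or try_credit(holder, visited):
                  matched[share] = server
                  return True
          return False

      return sum(1 for server in placement if try_credit(server, set()))

  def upload_is_happy(placement, H):
      return happiness(placement) >= H

  # Ten shares doubled up on two servers only scores 2, so with H=7
  # this placement would be rejected:
  #   upload_is_happy({"A": {0,2,4,6,8}, "B": {1,3,5,7,9}}, 7) -> False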

So the new "X servers may fail and your data is still recoverable"
number comes from H-k (both counting servers). The share-placement
algorithm still tries for uniformity, and if it achieves that then you
can tolerate even more failures (up to N-k if you manage to get one
share per server).
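
Worked out with the defaults (I'm assuming k=3, H=7, N=10 here):

  k, H, N = 3, 7, 10

  # Guaranteed by the happiness check: H servers each hold a distinct
  # share, so the file survives while any k of them remain reachable.
  guaranteed = H - k   # 4 servers may fail

  # Best case, one share per server on a full grid of N servers:
  best_case = N - k    # 7 servers may fail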


I'm still not sure the servers-of-happiness metric is ideal. While it
lets you specify a safety level more accurately/meaningfully in the face
of insufficient servers, it's always been harder to explain and
understand. Some days I'm in favor of a more-absolute list of "servers
that must be up" (I mocked up a control panel in
http://tahoe-lafs.org/pipermail/tahoe-dev/2011-January/005944.html).
Having a minimum-reliability constraint is good, but you also want to
tell your client just how many servers you *expect* to have around, so
it can tell you whether it can satisfy your demands.

I still think it'd be pretty cool if the client's Welcome page had a
little game or visualization where it could show you, given the current
set of available servers, whether the configured k/H/N could be
satisfied or not. Something to help explain the share-placement rules
and help users set reasonable expectations.

cheers,
 -Brian

