[tahoe-dev] Selecting optimal FEC parameters (also, idea for new peer selection policy)

Shawn Willden shawn-tahoe at willden.org
Wed Aug 12 09:42:28 PDT 2009


On Wednesday 12 August 2009 09:49:37 am Zooko Wilcox-O'Hearn wrote:
> Interesting ideas, Shawn.  I'll just respond to a couple of details,
> below, plus I'll say that I think Kevan should proceed with his plan
> to implement "servers of happiness" now, as it is a small enough
> change that he can finish it before the fall semester starts.  :-)

I think the statistical approach is perhaps not as much effort as you're 
thinking, but the fall semester is coming up pretty quickly.

> In the long run, I can see how some people (possibly including me)
> would like the sort of sophisticated, heuristical, statistical
> approach that you envision, but I can also see how some people
> (possibly including me) would want a dumber, more predictable set of
> rules, such as "Make sure there are always at least K+1 shares in
> each of these Q co-los, and that's the entire ruleset.".

Absolutely, I see the value in having multiple policies available.  One thing 
that is becoming clear, though, is that what we're talking about is broader 
than just a peer selection policy, because the FEC parameters are part of the 
decision as well -- at least the choice of M, and perhaps K too.

One more comment on that: assuming the reliability calculation is sufficiently 
sophisticated and accurate, your dumber, more predictable rule and the 
statistical calculation should reach the same result.  In fact, that's one 
of the criteria I would use to validate the reliability calculation, assuming 
it was sophisticated enough to include server-set failure modes (meaning 
a set of servers can fail all at once, because a meteor hit the machine room 
or something).

> > 2.  Retrieval performance is maximized when shares are retrieved
> > from as many servers as possible at once (assuming all are roughly equally
> > responsive).  This means that K should be set to the size of the
> > grid, and M adjusted based on reliability requirements.  This is
> > the reverse of what I have been thinking.
>
> I went through a similar reversal:
>
> http://allmydata.org/pipermail/tahoe-dev/2009-April/001554.html #
> using the volunteer grid (dogfood tasting report, continued)

I remember that e-mail.

One of the things I plan to do while sitting on the beach this week is work 
out some concise formulas/algorithms for computing reliability as a function 
of M.  I'll be sure to use your 13-of-16 case as an example, and see how much 
it differs from 13-of-13 or 13-of-26.  After a few seconds' more thought, I've 
changed the guess I made in my previous note: I think 13-of-16 is better than 
13-of-13, and since reliability is a smooth function of M, non-multiples of 
K are useful choices.
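
To make that concrete, here's the kind of calculation I have in mind.  This is 
just a sketch, assuming every share survives independently with some 
probability p (the per-server grouping from point 3 below complicates things), 
and the 0.95 figure is made up purely for illustration:

    from math import comb

    def survival_probability(k, m, p):
        """P(file recoverable) = P(at least k of the m shares survive),
        assuming each share survives independently with probability p."""
        return sum(comb(m, i) * p**i * (1 - p)**(m - i)
                   for i in range(k, m + 1))

    # Compare the 13-of-16 case against 13-of-13 and 13-of-26, using an
    # illustrative per-share survival probability of 0.95.
    for m in (13, 16, 26):
        print(f"13-of-{m}: {survival_probability(13, m, 0.95):.6f}")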

> > 3.  M larger than the grid means that each server will receive
> > multiple shares.  A reasonable first assumption is that all shares
> > on a given server will survive or fail together, so the effective
> > reliability of a file is a function not of how many shares must be
> > lost for it to disappear, but how many servers.
>
> Yes, thus the motivation for #778.

Yes.  I just wanted to make the assumption explicit.  It's not 100% true, of 
course -- there are plenty of ways that a server could lose just one of the 
shares it's keeping.  But they're sufficiently less likely than all-or-none 
that I think we can ignore them for the moment.  Perhaps in the future they 
could be considered.  One nice thing is that the math I described in my paper 
(really need to finish that thing...) can accommodate multiple failure modes 
per share and failure modes that affect multiple shares or multiple servers.  
You can make the model as sophisticated as you like.  The only limitation is 
that the structure of the share groupings must be hierarchical.
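
As a rough illustration of what the all-or-none assumption does to the 
calculation (again just a sketch, with made-up server counts and survival 
probabilities, not the full multi-failure-mode model from the paper):

    def file_survival_prob(shares_per_server, server_probs, k):
        """P(at least k shares survive) when each server keeps or loses all
        of its shares together, independently of the other servers.
        shares_per_server[i] is the number of distinct shares on server i;
        server_probs[i] is that server's survival probability."""
        dist = {0: 1.0}  # dist[c] = P(exactly c shares survived so far)
        for shares, p in zip(shares_per_server, server_probs):
            new_dist = {}
            for c, prob in dist.items():
                new_dist[c + shares] = new_dist.get(c + shares, 0.0) + prob * p
                new_dist[c] = new_dist.get(c, 0.0) + prob * (1.0 - p)
            dist = new_dist
        return sum(prob for c, prob in dist.items() if c >= k)

    # e.g. 3-of-8 spread as two shares on each of four 95%-reliable servers:
    print(file_survival_prob([2, 2, 2, 2], [0.95] * 4, 3))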

> By the way, I wonder if #678 would interest you.  If we had #678,
> then your strategy could delay making up its mind about the best
> value of M until later, during a repair process, possibly after the
> set of servers and their known qualities has changed.

I think I'd ultimately like the repair process to go through the same decision 
process as the original upload when picking parameters -- but #678's idea of 
keeping K and using the systematic nature of the share generation to just 
push additional shares is really interesting as a lower-cost method to 
increase reliability.

An even lower-cost method is to simply replicate existing shares.  If copying 
one or two existing shares from one server to another pushes you above the 
required reliability threshold, it's a zero-computation solution.
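
To evaluate that kind of repair, the calculation has to count *distinct* share 
numbers, since a copy of a share only helps when one of its hosts dies.  Here's 
a sketch of how I'd check whether a single copied share clears a threshold 
(brute-force enumeration, so only sensible for small grids; all the numbers are 
made up):

    from itertools import product

    def distinct_share_survival(server_shares, server_probs, k):
        """P(at least k distinct share numbers survive), where
        server_shares[i] is the set of share numbers on server i and each
        server fails all-or-none, independently, with survival probability
        server_probs[i]."""
        total = 0.0
        for outcome in product((True, False), repeat=len(server_shares)):
            prob = 1.0
            surviving = set()
            for up, shares, p in zip(outcome, server_shares, server_probs):
                prob *= p if up else 1.0 - p
                if up:
                    surviving.update(shares)
            if len(surviving) >= k:
                total += prob
        return total

    # 3-of-4 on four 95%-reliable servers, before and after copying share 0
    # to a fifth server:
    layout, probs = [{0}, {1}, {2}, {3}], [0.95] * 4
    print(distinct_share_survival(layout, probs, 3))
    print(distinct_share_survival(layout + [{0}], probs + [0.95], 3))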

> It should be 
> relatively easy to make up your mind about a good value for K -- for
> example, maybe just K = number-of-servers for starters?

For small grids, I can't think of a reason to choose a different value for K.  
Perhaps you could match your target reliability more precisely by varying 
both M and K, and achieve a little lower expansion factor.  But if the 
assumption is that most servers are on asymmetric Internet connections, 
setting K = number-of-servers should provide the best download performance, 
and that's probably worth a tiny bit of extra expansion.
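
For what it's worth, here's the kind of search I'd expect the "vary both M and 
K" approach to boil down to (a sketch using the simple independent-share model; 
the target reliability and maximum expansion are made-up knobs, and ties go to 
the larger K for download parallelism):

    from math import comb

    def survival(k, m, p):
        """P(at least k of m shares survive), shares independent with prob. p."""
        return sum(comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k, m + 1))

    def choose_parameters(n_servers, p, target, max_expansion=4.0):
        """Return the (k, m) with the smallest expansion m/k that meets the
        target reliability, preferring larger k on ties."""
        best = None
        for k in range(1, n_servers + 1):
            for m in range(k, int(k * max_expansion) + 1):
                if survival(k, m, p) >= target:
                    candidate = (m / k, -k, k, m)
                    if best is None or candidate < best:
                        best = candidate
                    break  # a larger m only raises the expansion for this k
        return (best[2], best[3]) if best else None

    print(choose_parameters(n_servers=8, p=0.95, target=0.999999))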

> #678 is not a ticket that can be fixed before the fall semester
> starts, though.  Hopefully it will be fixed by the next generation of
> capabilities (http://allmydata.org/trac/tahoe/wiki/NewCapDesign ).

Ohhh... that's a good point.  Allowing M and/or K to be changed by a repairer 
will require a new cap structure.  That would also be a good time to add 
a Hamming distance parameter to the cap, assuming we decide it is 
sufficiently helpful to justify the extra length.

	Shawn.

