#791 new enhancement

Optimize FEC parameters to increase download performance

Reported by: swillden Owned by:
Priority: minor Milestone: undecided
Component: code-encoding Version: 1.5.0
Keywords: performance preservation Cc:
Launchpad Bug:

Description

When Tahoe is configured with a large value for k, the number of shares required to reconstruct a file, retrievals can be done by pulling shares from many servers in parallel. The 'swarming download' effect can in many cases significantly improve performance.

In ticket #778 (http://allmydata.org/trac/tahoe/ticket/778) it was decided to change the Tahoe configuration to specify the share generation and distribution parameters in terms of numbers of servers, and to leave the selection of the actual FEC parameters as an implementation decision in Tahoe. This makes it possible for Tahoe to increase both the number of shares required, k_e, and the number of shares distributed, m_e. The ticket #778 discussion included one algorithm for doing that.
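To make the idea concrete, here is a minimal sketch of mapping server-count parameters to FEC parameters. This is a hypothetical illustration (the function name and the fixed shares-per-server scaling are invented for this example), not the algorithm from the attached fecparams.py:

```python
# Hypothetical sketch: derive FEC parameters (k_e, m_e) from server-count
# parameters by placing a fixed number of shares on each server. The real
# algorithm discussed in #778 is more involved.
def choose_fec_params(servers_needed, servers_total, shares_per_server=1):
    """Map server-count parameters to FEC parameters (k_e, m_e).

    servers_needed: how many servers must survive for the file to survive
    servers_total:  how many servers shares should be spread across
    shares_per_server: assumed number of shares placed on each server
    """
    k_e = servers_needed * shares_per_server
    m_e = servers_total * shares_per_server
    return k_e, m_e

print(choose_fec_params(3, 10))     # → (3, 10)
print(choose_fec_params(3, 10, 2))  # → (6, 20)
```

Raising shares_per_server scales k_e and m_e together, which increases download parallelism without changing how many servers must survive.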

Change History (4)

comment:1 Changed at 2009-08-22T19:43:57Z by warner

  • Component changed from unknown to code-encoding
  • Owner nobody deleted

comment:2 Changed at 2009-10-10T23:32:45Z by zooko

Here is the code that Shawn Willden attached to ticket #778 to compute FEC parameters:

http://allmydata.org/trac/tahoe/attachment/ticket/778/fecparams.py

Here was his comment about it:

http://allmydata.org/trac/tahoe/ticket/778#comment:32

comment:3 follow-up: Changed at 2010-01-15T05:50:57Z by warner

Incidentally, if we allowed the downloader to use a bit more RAM, we could achieve maximum parallelism without raising k, by having it pull shares for two segments at the same time (pull from servers 1..k for seg0, and from servers k+1..2k for seg1). If we were really clever, we could time the requests so that the data for the second segment didn't arrive at our downstream congestion point until most of the data for the first segment had shown up, which might let us keep the downstream pipe filled more consistently.
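The two-segment idea above can be sketched as follows. All names here are invented for illustration; this is not Tahoe's downloader API, and the fetch is a stand-in for a real remote read:

```python
# Illustrative sketch: fetch k shares of seg0 from servers 1..k and
# k shares of seg1 from servers k+1..2k, concurrently.
from concurrent.futures import ThreadPoolExecutor

def fetch_share(server, segnum):
    # Stand-in for a real remote share read; just records what was asked for.
    return (server, segnum)

def fetch_two_segments(servers, k):
    # First k servers serve segment 0, next k servers serve segment 1.
    jobs = [(s, 0) for s in servers[:k]] + [(s, 1) for s in servers[k:2 * k]]
    with ThreadPoolExecutor(max_workers=2 * k) as pool:
        results = list(pool.map(lambda job: fetch_share(*job), jobs))
    seg0 = [r for r in results if r[1] == 0]
    seg1 = [r for r in results if r[1] == 1]
    return seg0, seg1

seg0, seg1 = fetch_two_segments(list(range(1, 11)), k=3)
print(len(seg0), len(seg1))  # → 3 3
```

The RAM cost is one extra segment's worth of share buffers; the staggered-timing refinement would additionally require scheduling the seg1 requests slightly later.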

Increasing k has the negative property that it leads to an increase in N, and if N is capped by the total number of servers, then the entropy of share placement (as explored in #872) drops to zero, so all files get placed on an identical set of servers, causing them all to share the same fate. I don't yet know how the overall reliability of k-of-N compares to 2k-of-2N when the total number of servers is not much larger than 2N; I'll have to think about that. I believe that reliability roughly grows with N-k-1, so it's possible that 2k-of-2N is so much better than k-of-N that the peer-selection-entropy is irrelevant.
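Under the (strong) assumption that servers fail independently with probability p, the k-of-N vs 2k-of-2N comparison can be checked with a binomial calculation; the parameters below are illustrative, not Tahoe defaults:

```python
# Back-of-the-envelope reliability comparison, assuming independent
# server failures with probability p_fail (an assumption the follow-up
# comment challenges). A file survives if at least k of its N shares
# survive.
from math import comb

def survival(k, n, p_fail):
    p_ok = 1 - p_fail
    # P(at least k of n shares survive), binomial tail sum.
    return sum(comb(n, i) * p_ok**i * p_fail**(n - i) for i in range(k, n + 1))

p = 0.1
r1 = survival(3, 10, p)   # 3-of-10
r2 = survival(6, 20, p)   # 6-of-20, i.e. 2k-of-2N
print(1 - r1, 1 - r2)     # loss probabilities; r2's is orders of magnitude smaller
```

Under independence, 2k-of-2N does come out far more reliable than k-of-N; the question the ticket raises is whether correlated placement (zero peer-selection entropy) erases that advantage.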

comment:4 in reply to: ↑ 3 Changed at 2010-01-15T18:19:30Z by davidsarah

  • Keywords performance preservation added

Replying to warner:

I believe that reliability roughly grows with N-k-1, so it's possible that 2k-of-2N is so much better than k-of-N that the peer-selection-entropy is irrelevant.

Reliability can't be concluded to grow with N-k-1 without making additional assumptions. Consider a grid with half of the servers in each of two clusters. If some disaster wipes out one cluster, then on average it wipes out half the shares of each file. In that case k-of-N and 2k-of-2N have roughly equal reliability: the proportion of files lost depends on how k compares to N/2 in both cases.
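The two-cluster scenario above can be made concrete with a small sketch (parameters invented for illustration): if one of two equal clusters dies and each file loses exactly half its shares, then k-of-N and 2k-of-2N impose the same survival condition, namely k <= N/2.

```python
# Sketch of the correlated-failure argument: one of two equal clusters
# is destroyed, so exactly half of each file's shares are lost.
def survives_cluster_loss(k, n):
    surviving_shares = n // 2  # the other half was in the dead cluster
    return surviving_shares >= k

# k-of-N and 2k-of-2N behave identically under this failure mode:
print(survives_cluster_loss(3, 10), survives_cluster_loss(6, 20))    # → True True
print(survives_cluster_loss(6, 10), survives_cluster_loss(12, 20))   # → False False
```

This is why the independence assumption matters: under correlated failures of this shape, doubling both parameters buys no additional reliability.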
