[tahoe-dev] [tahoe-lafs] #302: stop permuting peerlist, use SI as offset into ring instead?

Sat Dec 26 08:12:04 PST 2009

#302: stop permuting peerlist, use SI as offset into ring instead?
------------------------------------+---------------------------------------
 Reporter:  warner                  |           Owner:           
     Type:  task                    |          Status:  new      
 Priority:  major                   |       Milestone:  undecided
Component:  code-peerselection      |         Version:  0.7.0    
 Keywords:  repair newcaps newurls  |   Launchpad_bug:           
------------------------------------+---------------------------------------

Comment(by zooko):

 Thanks for doing this work to simulate it and write up such a detailed and
 useful report!  I think you are right that the unpermuted share placement
 can often (depending on node id placement and {{{N}}}) result in
 significantly higher inlet rates to some storage servers than others.  But
 as you say, it isn't clear how much this matters: "Now, do we actually need
 uniform upload rates? What we really want, to attain maximum reliability,
 is to never double-up shares. That means we want all servers to become
 full at the same time, so instead of equal bytes-per-second for all
 servers, we actually want equal percentage-of-space-per-second for all
 servers."

 Note that in actual deployments, storage servers end up spanning multiple
 generations.  For example, on the allmydata.com prodgrid the oldest
 servers have 1 TB hard drives.  Once those started filling up, we deployed
 the thumper, which comprises about 48 storage servers, each with a 0.5 TB
 hard drive.  Once the thumper started getting full, we deployed a few more
 servers, including ten which each had a 2 TB hard drive.  The point is
 that there was never a time (after the initial deployment started to fill
 up) when we had similar amounts of free space on lots of servers, so that
 equal inlet rates would lead to equal time-to-full.
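
 As a rough illustration of why equal inlet rates do not give equal
 time-to-full across server generations, here is a small Python sketch.
 Only the drive sizes come from the example above.  The 1 MB/s rate, the
 simplification that every drive starts empty, and all function names are
 hypothetical and chosen just for this illustration.

```python
import math

TB = 10**12  # decimal terabyte, in bytes

# Drive sizes from the prodgrid example. Treating each drive as completely
# empty is a simplifying assumption made only for this illustration.
free_bytes = {"oldest": 1 * TB, "thumper": TB // 2, "newest": 2 * TB}


def time_to_full(free, rate):
    """Seconds until a server with `free` bytes left fills at `rate` bytes/sec."""
    return free / rate


def equal_rate_times(free_by_server, rate):
    """Every server receives the same bytes-per-second."""
    return {name: time_to_full(free, rate)
            for name, free in free_by_server.items()}


def proportional_rate_times(free_by_server, total_rate):
    """Split `total_rate` in proportion to free space.

    This is the "equal percentage-of-space-per-second" case.
    """
    total_free = sum(free_by_server.values())
    return {name: time_to_full(free, total_rate * free / total_free)
            for name, free in free_by_server.items()}


if __name__ == "__main__":
    rate = 10**6  # hypothetical 1 MB/s per server
    print(equal_rate_times(free_bytes, rate))
    print(proportional_rate_times(free_bytes, 3 * rate))
```

 With equal rates, the 0.5 TB drives fill four times sooner than the 2 TB
 drives.  With rates proportional to free space, all three fill at the
 same moment.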

 My simulator (mentioned earlier in this thread) reported time-to-full
 instead of inlet rate.  It indicated that, regardless of whether you use
 permuted or non-permuted share placement, if you start with a large set
 of empty, same-sized servers and start filling them, then once the first
 one gets full, all the others get full very soon afterwards.
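
 For anyone who wants to experiment with this kind of measurement, the
 following is a minimal Python sketch.  It is not the simulator mentioned
 above.  The hash-based ordering is only a stand-in for permuted placement,
 every share is assumed to be the same size, full servers are simply
 skipped, and all names and default parameters are hypothetical.  It
 returns the upload count at which the first server fills and the upload
 count at which the last one fills.

```python
import hashlib
import random
from bisect import bisect_left


def permuted_order(server_ids, storage_index):
    """Order servers by a hash of (server id, storage index)."""
    def key(sid):
        data = sid.to_bytes(4, "big") + storage_index.to_bytes(4, "big")
        return hashlib.sha256(data).digest()
    return sorted(server_ids, key=key)


def ring_order(server_ids, storage_index):
    """Walk the sorted ring clockwise, starting at the storage index."""
    start = bisect_left(server_ids, storage_index)
    return server_ids[start:] + server_ids[:start]


def simulate(num_servers=40, capacity=200, shares_per_file=10,
             permuted=True, seed=0):
    """Return (uploads when first server fills, uploads when last fills)."""
    rng = random.Random(seed)
    ids = set()
    while len(ids) < num_servers:  # distinct positions on a 32-bit ring
        ids.add(rng.getrandbits(32))
    server_ids = sorted(ids)
    used = dict.fromkeys(server_ids, 0)
    choose = permuted_order if permuted else ring_order

    uploads = 0
    first_full = None
    while True:
        si = rng.getrandbits(32)  # storage index of the next file
        # Skip full servers; never place two shares of a file on one server.
        targets = [s for s in choose(server_ids, si) if used[s] < capacity]
        targets = targets[:shares_per_file]
        if not targets:  # every server is full
            return first_full, uploads
        uploads += 1
        for s in targets:
            used[s] += 1
            if used[s] == capacity and first_full is None:
                first_full = uploads


if __name__ == "__main__":
    for permuted in (True, False):
        first, last = simulate(permuted=permuted)
        print("permuted" if permuted else "ring offset", first, last)
```

 Comparing the two returned numbers for each strategy gives the fraction
 of the grid's lifetime that falls between the first and the last server
 filling up.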

 Note that there are two separate arguments:

  1. A more uniform inlet rate might not be so important.
  2. The time between the first server filling and the last server filling
     is a small fraction of the time between the start of the grid and the
     last server filling (regardless of share placement strategy).

 I guess I'm not sure how you got from "do we actually need uniform upload
 rates?" to "easier to deal with and gives better system-wide properties"
 in your comment:12.

 Oh!  Also note that "What we really want, to attain maximum reliability,
 is to never double-up shares" is at least partially if not fully addressed
 by #778.

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/302#comment:13>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid