[tahoe-dev] [tahoe-lafs] #302: stop permuting peerlist, use SI as offset into ring instead?
Jody Harris
imhavoc at gmail.com
Sat Dec 26 10:39:15 PST 2009
As a user/administrator of an ad-hoc Tahoe-LAFS grid, I find that several of
the design assumptions are not valid for our use case:
- Not all nodes start out with the same amount of storage space
- Almost all nodes will be used to store data other than Tahoe shares
- Bandwidth balancing will be more important than storage space balancing
- Nodes will be expected to join, leave and fail at random
- The rate at which a node reaches capacity is of far less concern
than distribution of shares across nodes.
Nodes will range from 4GB non-dedicated to 2 TB dedicated.
I realize that this is a use case that falls outside of the initial design
parameters of Tahoe-lafs, but you have managed to build something useful and
(somewhat) elegant. When a tool like that becomes available, people will use
it in the most unexpected ways.
----
- Think carefully.
- Contra mundum - "Against the world" (St. Athanasius)
- Credo ut intelligam - "I believe so that I may understand" (St. Augustine of Hippo)
On Sat, Dec 26, 2009 at 9:12 AM, tahoe-lafs <trac at allmydata.org> wrote:
> #302: stop permuting peerlist, use SI as offset into ring instead?
>
> ------------------------------------+---------------------------------------
> Reporter: warner | Owner:
> Type: task | Status: new
> Priority: major | Milestone: undecided
> Component: code-peerselection | Version: 0.7.0
> Keywords: repair newcaps newurls | Launchpad_bug:
>
> ------------------------------------+---------------------------------------
>
> Comment(by zooko):
>
> Thanks for doing this work to simulate it and write up such a detailed and
> useful report! I think you are right that the unpermuted share placement
> can often (depending on node id placement and {{{N}}}) result in
> significantly higher inlet rates to some storage servers than others. But
> as you say it isn't clear how much this matters: "Now, do we actually need
> uniform upload rates? What we really want, to attain maximum reliability,
> is to never double-up shares. That means we want all servers to become
> full at the same time, so instead of equal bytes-per-second for all
> servers, we actually want equal percentage-of-space-per-second for all
> servers."
>
> Note that in actual deployment, storage servers end up being of multiple
> generations. For example, on the allmydata.com prodgrid the oldest
> servers are running 1 TB hard drives; once those started filling up,
> we deployed the thumper, which comprises about 48 storage servers, each
> with a 0.5 TB hard drive; and once the thumper started getting full, we
> deployed a few more servers, including ten that each had a 2 TB hard
> drive. The point is that there was never a time (after the initial
> deployment started to fill up) where we had similar amounts of free space
> on lots of servers so that equal inlet rates would lead to equal time-to-
> full.
>
> My simulator (mentioned earlier in this thread) reported time-to-full
> instead of reporting inlet rate, and it indicated that regardless of
> whether you have permuted or non-permuted share placement, if you start
> with a large set of empty, same-sized servers and start filling them,
> then once the first one gets full, the rest all get full very quickly.
>
> Note that there are two separate arguments: 1. A more uniform inlet rate
> might not be so important. 2. The time between the first one filling and
> the last one filling is a small fraction of the time between the start of
> the grid and the last one filling (regardless of share placement
> strategy).
>
> I guess I'm not sure how you got from "do we actually need uniform upload
> rates?" to "easier to deal with and gives better system-wide properties"
> in your comment:12.
>
> Oh! Also note that "What we really want, to attain maximum reliability,
> is to never double-up shares" is at least partially if not fully addressed
> by #778.
>
> --
> Ticket URL: <http://allmydata.org/trac/tahoe/ticket/302#comment:13>
> tahoe-lafs <http://allmydata.org>
> secure decentralized file storage grid
> _______________________________________________
> tahoe-dev mailing list
> tahoe-dev at allmydata.org
> http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
>
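For readers following the placement discussion above, here is a minimal
Python sketch of the two strategies under debate in #302: the current
permuted-peerlist selection, where each storage index (SI) gets its own
pseudo-random server ordering from a hash of the SI plus the server id,
versus the proposed SI-as-ring-offset selection, where all files walk one
fixed ring and the SI only picks the starting point. The plain SHA-256
calls and byte-string ids are simplifications for illustration; this is
not Tahoe's actual code, which uses tagged hashes.

```python
import hashlib

def permuted_order(server_ids, storage_index):
    # Permuted-peerlist placement: every file (SI) gets its own
    # pseudo-random ordering of the servers, so on average the load
    # spreads out even when server ids are clumped on the ring.
    return sorted(server_ids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())

def ring_offset_order(server_ids, storage_index):
    # Ticket #302 proposal: keep one fixed ring of servers (ordered by
    # a hash of the server id alone) and use the SI only to choose the
    # starting offset; every file then walks the same ring.
    ring = sorted(server_ids, key=lambda sid: hashlib.sha256(sid).digest())
    start = hashlib.sha256(storage_index).digest()
    for i, sid in enumerate(ring):
        if hashlib.sha256(sid).digest() >= start:
            return ring[i:] + ring[:i]  # rotate ring to the SI's position
    return ring  # SI falls past the last server: wrap to the beginning
```

Under the ring-offset scheme every file visits the servers in the same
relative order, which is what can concentrate inlet rate on a server
sitting just after a sparse stretch of the ring; the permuted scheme
breaks that correlation per file, which is the trade-off the inlet-rate
simulations in this thread were measuring.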