Changes between Version 1 and Version 2 of ServerSelection


Timestamp: 2009-04-21T20:47:23Z
Author: warner

As I, Zooko, have emphasized a few times, we really should not try to write a super-clever algorithm into Tahoe which satisfies all of these people, plus all the other crazy people that will be using Tahoe for crazy things in the future.  Instead, we need some sort of configuration language or plugin system so that each crazy person can customize their own crazy server selection policy.  I don't know the best way to implement this yet -- a domain specific language?  Implement the above-mentioned list of seven policies into Tahoe and have an option to choose which of the seven you want for this upload?  My current favorite approach is: you give me a Python function.  When the time comes to upload a file, I'll call that function and then use whichever servers it said to use.

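The "give me a Python function" idea could be realized as a simple policy hook. This is a hypothetical sketch only: the names (`default_policy`, `Uploader`, `even_policy`) are illustrative and are not part of any Tahoe API.

```python
# Hypothetical sketch of the "give me a Python function" idea: the
# uploader calls a user-supplied policy function and uses whichever
# servers it returns. Names are illustrative, not Tahoe API.

def default_policy(candidates, shares_needed):
    """Trivial policy: take the first N candidate servers."""
    return candidates[:shares_needed]

class Uploader:
    def __init__(self, policy=default_policy):
        self.policy = policy

    def upload(self, all_servers, shares_needed):
        chosen = self.policy(all_servers, shares_needed)
        if len(chosen) < shares_needed:
            raise RuntimeError("policy returned too few servers")
        return chosen

# One "crazy" custom policy: prefer servers whose id ends in an
# even hex digit, falling back to the rest.
def even_policy(candidates, shares_needed):
    evens = [s for s in candidates if int(s[-1], 16) % 2 == 0]
    rest = [s for s in candidates if s not in evens]
    return (evens + rest)[:shares_needed]

uploader = Uploader(policy=even_policy)
print(uploader.upload(["s1", "s2", "s4", "s6", "s7"], 3))  # ['s2', 's4', 's6']
```

The point is only that the selection decision is a single replaceable function, so each "crazy person" supplies their own.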
==== Brian says: ====

Having a function or class to control server-selection is a great idea. The
current code already separates out responsibility for server-selection into a
distinct class, at least for immutable files
(source:src/allmydata/immutable/upload.py#L131 {{{Tahoe2PeerSelector}}}). It
would be pretty easy to make the uploader use different classes according to
a {{{tahoe.cfg}}} option.

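Such a {{{tahoe.cfg}}}-driven choice might look like the following sketch. The section and option names (`[client]`, `peer_selector`) and the registry are assumptions for illustration; only `Tahoe2PeerSelector` is a real class name from the source tree.

```python
# Sketch: picking a peer-selector class from a tahoe.cfg-style option.
# The "[client] peer_selector" option and the SELECTORS registry are
# hypothetical; Tahoe2PeerSelector is the existing class name.
import configparser

SELECTORS = {
    "tahoe2": "Tahoe2PeerSelector",   # the current algorithm
    "simple": "SimpleSelector",       # hypothetical alternative
}

def selector_for(cfg_text):
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    name = cfg.get("client", "peer_selector", fallback="tahoe2")
    try:
        return SELECTORS[name]
    except KeyError:
        raise ValueError("unknown peer_selector: %s" % name)

print(selector_for("[client]\npeer_selector = tahoe2\n"))  # Tahoe2PeerSelector
print(selector_for("[client]\n"))  # falls back to the default, tahoe2
```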
However, there are some additional properties that need to be satisfied by the
server-selection algorithm for it to work at all. The basic Tahoe model is
that the filecap is both necessary and sufficient (given some sort of grid
membership) to recover the file. This means that the eventual
'''downloader''' needs to be able to find the same servers, or at least have
a sufficiently high probability of finding "enough" servers within a
reasonable amount of time, using only information which is found in the
filecap.

If the downloader is allowed to ask every server in the grid for shares, then
anything will work. If you want to keep the download setup time low, and/or
if you expect to have more than a few dozen servers, then the algorithm needs
to be able to do something better. Note that this is even more of an issue
for mutable shares, where it is important that publish-new-version is able to
track down and update all of the old shares: the chance of accidental
rollback increases when it cannot reliably/cheaply find them all.

Another potential goal is for the download process to be tolerant of new
servers, removed servers, and shares which have been moved (possibly as the
result of repair or "rebalancing"). Some use cases will care about this,
while others may never change the set of active servers and won't care.

It's worth pointing out the properties we were trying to get when we came up
with the current "tahoe2" algorithm:

 * for mostly static grids, download uses minimal do-you-have-share queries
 * adding one server should only increase download search time by 1/numservers
 * repair/rebalancing/migration may move shares to new places, including
   servers which weren't present at upload time, and download should be able
   to find and use these shares, even though the filecap doesn't change
 * traffic load-balancing: all non-full servers get new shares at the same
   bytes-per-second, even if serverids are not uniformly distributed

We picked the pseudo-random permuted serverlist to get these properties. I'd
love to be able to get stronger diversity among hosts, racks, or data
centers, but I don't yet know how to get that '''and''' get the properties
listed above, while keeping the filecaps small.
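The permuted-serverlist idea can be sketched as follows: sort the servers by a hash of the file's storage index combined with each serverid. The order is then deterministic for a given file (uploader and downloader derive it from the filecap alone), different files get independent pseudo-random orders (spreading load), and a new server slots into a random position instead of shifting everyone. This is a simplified illustration; the real Tahoe code differs in details.

```python
# Simplified sketch of a pseudo-random permuted serverlist: order
# servers by sha256(storage_index + serverid). Deterministic per file,
# pseudo-random across files. Not the exact Tahoe implementation.
from hashlib import sha256

def permuted_servers(storage_index, serverids):
    return sorted(serverids,
                  key=lambda sid: sha256(storage_index + sid).digest())

servers = [b"serverA", b"serverB", b"serverC", b"serverD"]
order1 = permuted_servers(b"file-1-storage-index", servers)
order2 = permuted_servers(b"file-2-storage-index", servers)

# Same file always yields the same order, so a downloader holding only
# the filecap can regenerate the uploader's search order.
assert order1 == permuted_servers(b"file-1-storage-index", servers)
# Different files generally get different orders, balancing load.
```

The uploader places shares on the first writable servers in this order; the downloader queries servers in the same order, so on a mostly static grid it finds the shares with few do-you-have-share queries.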