[tahoe-dev] issue #610: upload should take better advantage of existing shares
zooko
zooko at zooko.com
Mon Feb 9 16:56:59 PST 2009
Brian opened this ticket, which explains most of dreid's performance
problems:
"""
Our current upload process (which is nearly the oldest code in the
entire tahoe tree) could be smarter in the presence of existing
shares. If a file is uploaded in January, then a few dozen servers
are added in February, then in March it is (for whatever reason)
uploaded again, here's what currently happens:
* peer selection comes up with a permuted list of servers, with the
  same partial ordering as the original list but with the new servers
  inserted in various pseudo-random places
* each server in the list is asked, in turn, if it would be
  willing to hold on to the next sequentially numbered share
* each server might say yes or no. In addition, each server will
  return a list of shares that it might already have
* the client never asks a server to accept a share that it already
  has a home for, but it also never withdraws a request to hold a
  share that it later learns is housed somewhere else
So, if the client queries a server which already has a share, that
server will probably end up with two shares. In addition, many shares
will probably end up being sent to a new server even though some
other server (later in the permuted list) already has a copy.
To fix this, the upload process needs to do more work:
* it needs to cancel share-upload requests if it later learns that
  some other server already has that particular share
* perhaps it should perform some sort of validation on the
  claimed already-uploaded share
* if it detects any evidence of pre-existing shares, it should put
  more energy into finding additional ones
* it needs to ask more servers than it strictly needs (for upload
  purposes) to increase the chance that it can detect this evidence
We're planning an overhaul of immutable upload/download code, both to
improve parallelism and to replace the DeferredList with a state
machine (to make it easier to bypass stalled servers, for example).
These goals should be included in that work.
This process will work best when the shares are closer to the
beginning of the permuted list. A "share rebalancing" mechanism
should be created to gradually move shares in this direction over
time. This is another facet of repair: not only should there be enough
shares in existence, but they should be located in the best place for
a downloader to find them quickly.
"""
tickets mentioned in this message:
http://allmydata.org/trac/tahoe/ticket/610 # upload should take better advantage of existing shares
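
To make the behaviour described in the ticket concrete, here is a rough
Python sketch. It is a toy model, not the actual Tahoe upload code. All
function names are hypothetical, every server is assumed to say yes, and
the permutation is assumed to be a sort keyed on a hash of the storage
index plus the server id (the ticket itself does not specify this).
naive_placement models the current process; placement_with_cancel models
only the first and last proposed fixes (ask every server, then cancel
requests for shares that turn out to exist elsewhere).

```python
import hashlib


def permuted_servers(storage_index, server_ids):
    # Assumption: the permuted list is a sort keyed on a hash of
    # storage_index + server id. Adding servers then leaves the
    # relative order of the old ones unchanged.
    return sorted(server_ids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())


def naive_placement(order, existing, total_shares):
    # order: permuted list of server ids
    # existing: dict mapping server id -> set of share numbers it holds
    # Returns a dict mapping server id -> set of share numbers uploaded.
    homed = set()    # shares the client believes have a home
    uploads = {}
    for sid in order:
        pending = [n for n in range(total_shares) if n not in homed]
        if not pending:
            break
        # The share is chosen before the server reports what it holds.
        share = pending[0]
        already = existing.get(sid, set())
        if share not in already:
            uploads.setdefault(sid, set()).add(share)
        homed.add(share)
        # Later requests skip shares this server reported, but earlier
        # requests are never withdrawn.
        homed |= already
    return uploads


def placement_with_cancel(order, existing, total_shares):
    # Start from the naive plan, then query every server in the list
    # (more than strictly needed) and cancel any planned upload of a
    # share that some server already reports holding.
    uploads = naive_placement(order, existing, total_shares)
    found = set()
    for sid in order:
        found |= existing.get(sid, set())
    trimmed = {}
    for sid, shares in uploads.items():
        keep = shares - found
        if keep:
            trimmed[sid] = keep
    return trimmed


order = [b"new1", b"old1", b"new2", b"old2"]
existing = {b"old1": {0}, b"old2": {1}}
print(naive_placement(order, existing, 3))
print(placement_with_cancel(order, existing, 3))
```

With three shares, the naive plan uploads share 0 to new1 (although old1
already has it), share 1 to old1 (which now holds two shares, although
old2 already has share 1), and share 2 to new2. The cancelling variant
uploads only share 2. Validation of the claimed shares is not modelled.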