[tahoe-dev] [tahoe-lafs] #610: upload should take better advantage of existing shares

Fri Feb 15 03:54:46 UTC 2013

#610: upload should take better advantage of existing shares
-------------------------------+-------------------------------------------
     Reporter:  warner         |      Owner:  kevan
         Type:  enhancement    |     Status:  new
     Priority:  major          |  Milestone:  1.11.0
    Component:  code-encoding  |    Version:  1.2.0
   Resolution:                 |   Keywords:  upload verify preservation
Launchpad Bug:                 |              performance space-efficiency
-------------------------------+-------------------------------------------

New description:

 Our current upload process (which is nearly the oldest code in the entire
 tahoe tree) could be smarter in the presence of existing shares. If a file
 is uploaded in January, then a few dozen servers are added in February,
 then in March it is (for whatever reason) uploaded again, here's what
 currently happens:

  * peer selection comes up with a permuted list of servers, with the same
    partial ordering as the original list but with the new servers inserted
    in various pseudo-random places
  * each server in the list is asked, in turn, whether it would be willing
    to hold on to the next sequentially numbered share
  * each server might say yes or no. In addition, each server will return a
    list of shares that it already has
  * the client never asks a server to accept a share for which it already
    has a home, but it also never withdraws a request to hold a share that
    it later learns is housed somewhere else
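
 The ordering property in the first bullet can be illustrated with a small
 sketch. This is not Tahoe's actual peer-selection code: the function name
 `permuted_servers`, the use of SHA-256, and the server ids are all
 assumptions made for illustration. The only point is that when each
 server's sort key depends solely on the storage index and its own id,
 adding servers cannot reorder the old ones relative to each other.

```python
import hashlib

def permuted_servers(storage_index, server_ids):
    """Order servers by a hash of (storage_index + server_id).

    Hypothetical stand-in for peer selection; the hash choice is an
    assumption, not necessarily what Tahoe uses.
    """
    return sorted(
        server_ids,
        key=lambda sid: hashlib.sha256(storage_index + sid).digest(),
    )

si = b"example-storage-index"
january = [b"server-%d" % i for i in range(5)]
march = january + [b"new-server-%d" % i for i in range(3)]

before = permuted_servers(si, january)
after = permuted_servers(si, march)

# The old servers keep their relative order; new ones are interleaved.
old_in_after = [s for s in after if s in january]
```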

 So, if the client queries a server which already has a share, that server
 will probably end up with two shares. In addition, many shares will
 probably end up being sent to a new server even though some other server
 (later in the permuted list) already has a copy.
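
 A toy model of the behaviour described above makes the waste visible. It
 is a simplification under stated assumptions: every server says yes, the
 names `naive_placement`, `new1`, `A` and `B` are hypothetical, and the
 real uploader is asynchronous rather than a simple loop.

```python
def naive_placement(permuted, existing, total_shares):
    """Ask each server in turn to hold the next share without a home.

    existing maps server -> set of share numbers it already holds.
    Returns the (server, share) upload requests that were issued.
    Assumption: every server accepts every request.
    """
    homes = {}      # share number -> server believed to hold it
    requests = []   # upload requests issued, never cancelled
    for server in permuted:
        pending = [s for s in range(total_shares) if s not in homes]
        if not pending:
            break
        share = pending[0]
        homes[share] = server
        requests.append((server, share))
        # The reply also lists shares the server already has, but by now
        # the request for `share` has been sent and is not withdrawn.
        for held in existing.get(server, ()):
            homes.setdefault(held, server)
    return requests

# A and B hold shares 0 and 1 from the January upload; new1 joined later
# and landed ahead of them in the permuted list.
requests = naive_placement(["new1", "A", "B"], {"A": {0}, "B": {1}}, 3)
```

 In this model all three shares are uploaded again, share 0 goes to new1
 even though A already has it, and A and B each end up with two shares.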

 To fix this, the upload process needs to do more work:

  * it needs to cancel share-upload requests if it later learns that some
    other server already has that particular share
   * perhaps it should perform some sort of validation on the claimed
     already-uploaded share
  * if it detects any evidence of pre-existing shares, it should put more
    energy into finding additional ones
  * it needs to ask more servers than it strictly needs (for upload
    purposes) to increase the chance that it can detect this evidence
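
 One simplified way to get the effect of these bullets is to collect
 replies from a wider window of servers before committing to any upload,
 so that no request needs cancelling afterwards. The sketch below is only
 an illustration under that assumption; it is not the algorithm from any
 Tahoe ticket, it skips validation of claimed shares, and the names
 `placement_with_discovery` and `extra_queries` are hypothetical.

```python
def placement_with_discovery(permuted, existing, total_shares,
                             extra_queries=1):
    """Query first, place second.

    Phase 1 asks more servers than strictly needed which shares they
    already hold. Phase 2 uploads only the shares that have no home,
    preferring servers that hold nothing yet.
    """
    window = permuted[: total_shares + extra_queries]

    homes = {}  # share number -> server that already holds it
    for server in window:
        for share in sorted(existing.get(server, ())):
            homes.setdefault(share, server)

    missing = [s for s in range(total_shares) if s not in homes]
    busy = set(homes.values())
    free = [s for s in window if s not in busy]

    # Pair each missing share with a free server. If free servers run
    # out, the remaining shares stay unplaced in this sketch.
    uploads = list(zip(free, missing))
    return homes, uploads

homes, uploads = placement_with_discovery(
    ["new1", "A", "B", "new2"], {"A": {0}, "B": {1}}, 3)
```

 With the same January shares as before, only share 2 is uploaded, and
 shares 0 and 1 stay where they are.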

 We're planning an overhaul of immutable upload/download code, both to
 improve parallelism and to replace the DeferredList with a state machine
 (to make it easier to bypass stalled servers, for example). These goals
 should be included in that work.

 This process will work best when the shares are closer to the beginning of
 the permuted list. A "share rebalancing" mechanism should be created to
 gradually move shares in this direction over time. This is another facet
 of repair: not only should there be enough shares in existence, but they
 should be located in the best place for a downloader to find them quickly.
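
 To make "closer to the beginning of the permuted list" concrete, one
 possible measure is how many servers a downloader has to query, in
 permuted order, before it has seen k distinct shares. The sketch below
 assumes that measure; the function name `servers_queried_for_k` and the
 example layout are hypothetical.

```python
def servers_queried_for_k(permuted, holdings, k):
    """Count servers queried in permuted order until k distinct shares
    have been seen. Returns None if fewer than k shares exist."""
    found = set()
    for count, server in enumerate(permuted, start=1):
        found |= holdings.get(server, set())
        if len(found) >= k:
            return count
    return None

permuted = ["new1", "new2", "A", "B", "C"]

# Shares still sit on the old servers, behind the newly added ones.
before = servers_queried_for_k(
    permuted, {"A": {0}, "B": {1}, "C": {2}}, k=3)

# After rebalancing toward the front of the permuted list.
after = servers_queried_for_k(
    permuted, {"new1": {0}, "new2": {1}, "A": {2}}, k=3)
```

 Here the downloader queries five servers before rebalancing and three
 afterwards.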

--

Comment (by davidsarah):

 This is likely to be fixed by implementing the algorithm in
 ticket:1130#comment:12, provided that it applies to upload as well as
 repair.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/610#comment:14>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage