[tahoe-dev] [tahoe-lafs] #1170: does new-downloader perform badly for certain situations (such as today's Test Grid)?
tahoe-lafs
trac at tahoe-lafs.org
Sat Aug 14 01:52:35 UTC 2010
#1170: does new-downloader perform badly for certain situations (such as today's
Test Grid)?
------------------------------+---------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: critical | Milestone: 1.8.0
Component: code-network | Version: 1.8β
Resolution: | Keywords: immutable download performance regression
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
I had an idea for a not-too-complex share-selection algorithm this morning:
* first, have the {{{ShareFinder}}} report all shares as soon as it learns
about them, instead of its current behavior of withholding them until
someone says that they're hungry
* also record the DYHB RTT in each Share object, so it can be found later.
Keep the list of Shares sorted by this RTT (with share-number as the
secondary sort key).
* assume there's a Share instance for each (server, shnum) pair we've
heard about, which remembers which server it came from
* then, each time the {{{SegmentFetcher}}} needs to start using a new
share, use the following algorithm:
{{{
import itertools
from collections import defaultdict

sharemap = {}                            # maps shnum to Share instance
num_shares_per_server = defaultdict(int) # maps Server to a count of shares
for max_shares_per_server in itertools.count(1):
    progress = False
    for sh in shares:   # 'shares' is the RTT-sorted list described above
        if sh.shnum in sharemap:
            continue    # already have this share number
        if num_shares_per_server[sh.server] >= max_shares_per_server:
            continue    # this server already carries its quota for this pass
        sharemap[sh.shnum] = sh
        num_shares_per_server[sh.server] += 1
        progress = True
        if len(sharemap) >= k:
            return SUCCESS
    if not progress:
        return FAIL
}}}
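Before walking through the behavior, here is the same loop packaged as a
function that the examples further down can actually call. The {{{Share}}}
namedtuple, the {{{select_shares}}} name, and the (sharemap, passes) return
value are purely illustrative, not anything that exists in the tree; the
function also performs the RTT sort from the second bullet above.
{{{
import itertools
from collections import defaultdict, namedtuple

# illustrative stand-in for the real Share object: which share number it
# is, which server holds it, and the DYHB RTT we recorded for that server
Share = namedtuple("Share", ["shnum", "server", "rtt"])

def select_shares(shares, k):
    # keep candidates sorted by DYHB RTT, share number as the tie-breaker
    shares = sorted(shares, key=lambda sh: (sh.rtt, sh.shnum))
    sharemap = {}                            # shnum -> Share
    num_shares_per_server = defaultdict(int) # server -> shares placed there
    for max_shares_per_server in itertools.count(1):
        progress = False
        for sh in shares:
            if sh.shnum in sharemap:
                continue
            if num_shares_per_server[sh.server] >= max_shares_per_server:
                continue
            sharemap[sh.shnum] = sh
            num_shares_per_server[sh.server] += 1
            progress = True
            if len(sharemap) >= k:
                return sharemap, max_shares_per_server
        if not progress:
            return None, max_shares_per_server

# perfect diversity: three servers, each holding a different share
shares = [Share(0, "serverA", 0.08), Share(1, "serverB", 0.11),
          Share(2, "serverC", 0.30)]
sharemap, passes = select_shares(shares, k=3)
# sharemap holds sh0/sh1/sh2 from three different servers, passes == 1
}}}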
The general idea is to cycle through all the shares we know about, but
first try to build a sharemap that only uses one share per server (i.e.
perfect diversity). That might fail because the shares are not diverse
enough, so we can walk through the loop a second time and be willing to
accept '''two''' shares per server. If that fails, we raise our
willingness to three shares per server, etc. If we ever finish a loop
without adding at least one share to our sharemap, we declare failure:
this indicates that there are not enough distinct shares (that we know
about so far) to succeed.
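To make the failure case concrete with the illustrative
{{{select_shares}}} above: if the only shares we know about are copies of
a single share number, the first pass places one of them and the second
pass can add nothing new, so the loop reports failure instead of spinning
forever.
{{{
# three copies of sh0 and nothing else, k=3: pass 1 places sh0 from
# serverA, pass 2 makes no progress, so the selection gives up
shares = [Share(0, "serverA", 0.05), Share(0, "serverB", 0.07),
          Share(0, "serverC", 0.09)]
sharemap, passes = select_shares(shares, k=3)
# sharemap is None: this is the FAIL case described below
}}}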
If this returns FAIL, that really means we should declare "hunger" and
ask the {{{ShareFinder}}} to look for more shares. If we return SUCCESS
but {{{max_shares_per_server > 1}}}, then we should ask for more shares
too (but start the segment anyways: new shares may help the next segment
do better).
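In terms of that same sketch, the caller's reaction might look roughly
like this; {{{want_more_shares()}}} and {{{start_segment()}}} are
placeholder names standing in for whatever the {{{SegmentFetcher}}} and
{{{ShareFinder}}} actually do:
{{{
def fetch_next_segment(known_shares, k):
    sharemap, passes = select_shares(known_shares, k)
    if sharemap is None:
        want_more_shares()    # FAIL: we're hungry, ask the ShareFinder
        return
    if passes > 1:
        want_more_shares()    # poor diversity: keep looking, but...
    start_segment(sharemap)   # ...start this segment with what we have
}}}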
This is still vulnerable to certain pathological situations, like if
everybody has a copy of sh0 but only the first server has a copy of sh1:
this will use sh0 from the first server then circle around and have to
use sh1 from that server as well. A smarter algorithm would peek ahead,
realize the scarcity of sh1, and add sh1 from the first server so it
could get sh0 from one of the other servers instead.
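Running the illustrative sketch on that layout shows the problem: serverA
answers fastest and holds both sh0 and sh1, while serverB and serverC
hold only sh0, so both selected shares end up coming from serverA.
{{{
# serverA (fastest) holds sh0 and sh1; serverB and serverC hold only sh0
shares = [Share(0, "serverA", 0.05), Share(1, "serverA", 0.05),
          Share(0, "serverB", 0.08), Share(0, "serverC", 0.10)]
sharemap, passes = select_shares(shares, k=2)
# both shares in sharemap come from serverA (passes == 2), even though
# sh0 could have been fetched from serverB or serverC instead
}}}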
But I think this might improve the diversity of downloads without going
down the full {{{itertools.combinations}}}-enumerating route that
represents the "complete" way to approach this problem.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1170#comment:27>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage