[tahoe-lafs-trac-stream] [tahoe-lafs] #2107: don't place shares on servers that already have shares

tahoe-lafs trac at tahoe-lafs.org
Tue Dec 3 01:53:11 UTC 2013


#2107: don't place shares on servers that already have shares
---------------------------------------------------------------------------
      Reporter:  zooko
         Owner:
          Type:  enhancement
        Status:  new
      Priority:  normal
     Milestone:  undecided
     Component:  code-peerselection
       Version:  1.10.0
      Keywords:  upload servers-of-happiness brians-opinion-needed
    Resolution:
 Launchpad Bug:
---------------------------------------------------------------------------

Comment (by gdt):

 (I've been too busy to pay attention, so apologies if my throwing out a
 random comment is not helpful.)

 It seems clear that "servers of happiness" is an approximation to a richer
 property that cannot be described so simply.  So a rule written in terms
 of SoH is going to be an approximation to the correct behavior.

 The real question is how to maximize the ability to retrieve the file,
 given some assumed probability distribution of a) a server losing a share
 but staying up and b) a server going away, traded off somehow against
 storage cost.  In particular, I'm not sure how often a) happens, or whether
 we want to support assigning different probabilities to different servers,
 or correlated probabilities.  Either way, two placements with the same SoH
 can still have different retrieval probabilities, and the one with the
 higher probability is still preferred.
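
 As a rough illustration of that last point, here is a minimal sketch that
 computes the chance of still holding at least k shares.  It assumes that
 servers fail independently, that case a) is folded into case b) (a server
 either keeps all of its shares or loses all of them), and that shares on
 different servers are distinct.  The function name and the survival
 probabilities are made up for the example and are not Tahoe-LAFS code.

```python
from itertools import product

def retrieval_probability(placement, k, survive_probs):
    """Probability that the surviving servers hold at least k shares.

    placement[i] is the number of distinct shares on server i.
    survive_probs[i] is the assumed probability that server i keeps its
    shares.  Failures are assumed to be independent (an assumption).
    """
    total = 0.0
    # Enumerate every up/down combination of the servers.
    for alive in product([False, True], repeat=len(placement)):
        prob = 1.0
        shares = 0
        for up, count, p in zip(alive, placement, survive_probs):
            prob *= p if up else (1.0 - p)
            if up:
                shares += count
        if shares >= k:
            total += prob
    return total

# Two ways of spreading ten shares over four servers, k=3, each server
# assumed to survive with probability 0.5.
p_a = retrieval_probability([1, 3, 3, 3], 3, [0.5] * 4)
p_b = retrieval_probability([2, 3, 2, 3], 3, [0.5] * 4)
print(p_a, p_b)
```

 Passing a different list for survive_probs is one way to model servers of
 unequal reliability; correlated failures would need a different model.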

 A key issue is whether the number of servers is less than the number of
 shares or more.  I suspect that the behaviors in those two cases are quite
 different.  We've been talking about that, but perhaps that great divide
 should be more front and center in the discussion.  It's also not
 reasonable to expect people to tweak the encoding as the number of servers
 changes.  It should be sane to run 3/10 all the time, and just place more
 shares on each server if there are 4 servers.

 It seems obvious to me that better balance is better.  But what about S=4
 with 3/10 encoding?  Is 1,3,3,3 really worse than 2,3,2,3?  Or is it
 better, because if 3 out of 4 servers are lost, there are 3 ways to win and
 one to lose, vs 2 and 2 with 2,3,2,3?  So it seems that, regardless of
 probabilities, 1,3,3,3 is more robust.  I'm not sure how to generalize
 this, except that for k=3 one should not put more than k shares on any
 server until all servers have k.
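
 The counting argument above can be checked with a short sketch.  It
 assumes that the shares on different servers are distinct, so a set of
 surviving servers "wins" when its share counts add up to at least k.  The
 function name is hypothetical and this is not the Tahoe-LAFS uploader.

```python
from itertools import combinations

def ways_to_win(placement, k, lost):
    """Count surviving-server sets that can (wins) or cannot (losses)
    recover the file when exactly `lost` servers disappear.

    placement[i] is the number of distinct shares on server i.
    """
    survivors = len(placement) - lost
    wins = losses = 0
    for alive in combinations(range(len(placement)), survivors):
        if sum(placement[i] for i in alive) >= k:
            wins += 1
        else:
            losses += 1
    return wins, losses

# S=4, 3/10 encoding, 3 out of 4 servers lost.
print(ways_to_win([1, 3, 3, 3], 3, 3))  # (3, 1)
print(ways_to_win([2, 3, 2, 3], 3, 3))  # (2, 2)
```

 Running the same function with lost=2 or lost=1 shows no difference
 between the two placements here, since any two surviving servers already
 hold at least 3 shares in both.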

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2107#comment:23>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

