[tahoe-dev] [tahoe-lafs] #778: "shares of happiness" is the wrong measure; "servers of happiness" is better

tahoe-lafs trac at allmydata.org
Tue Aug 18 23:32:29 PDT 2009


#778: "shares of happiness" is the wrong measure; "servers of happiness" is
better
--------------------------------+-------------------------------------------
 Reporter:  zooko               |           Owner:           
     Type:  defect              |          Status:  new      
 Priority:  critical            |       Milestone:  undecided
Component:  code-peerselection  |         Version:  1.4.1    
 Keywords:  reliability         |   Launchpad_bug:           
--------------------------------+-------------------------------------------

Comment(by kevan):

 zooko: I think you're being clear + coherent.

 To summarize the recent discussion:

 Share uploading and encoding will be controlled by five parameters.

 The three that the user can directly specify are:
   * {{{k}}}: The number of servers that must survive for the file to
 survive (this is an upper bound on how many surviving servers are needed,
 so another way of thinking of it is "I should be able to retrieve my file
 even if only {{{k}}} of the original servers that received parts of it
 continue to function").
   * {{{m}}}: The ideal number of distinct servers (or: an upper bound on
 distinct servers)
   * {{{h}}}: A lower bound on distinct servers: at least {{{h}}} servers
 must receive shares in order for an upload to be considered successful.

 Two others, {{{k_e}}} and {{{m_e}}} (wouldn't it be cool if you could use
 LaTeX in trac tickets?), are calculated at upload time by tahoe based on
 user-defined parameters and grid data (namely, {{{n}}}, the number of
 servers receiving shares). These are defined as:
   * {{{k_e = n}}}
   * {{{m_e = n * (k_e / k)}}} if {{{k}}} divides {{{k_e}}}, otherwise
   * {{{m_e = n - k + 1 + n * (k_e // k)}}}
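 A minimal Python sketch of the two definitions above, taken literally as
 written. The function name {{{effective_params}}} is made up for
 illustration and is not anything in the tahoe codebase; {{{k}}} and {{{n}}}
 are assumed to be positive integers.

```python
def effective_params(k, n):
    """Return (k_e, m_e) for user parameter k and n servers receiving shares.

    Hypothetical helper: it only transcribes the formulas from this comment.
    """
    if k <= 0 or n <= 0:
        raise ValueError("k and n must be positive integers")
    k_e = n  # k_e = n
    if k_e % k == 0:
        # k divides k_e: m_e = n * (k_e / k)
        m_e = n * (k_e // k)
    else:
        # otherwise: m_e = n - k + 1 + n * (k_e // k)
        m_e = n - k + 1 + n * (k_e // k)
    return k_e, m_e
```

 For example, {{{effective_params(3, 6)}}} takes the first branch (3 divides
 6) and {{{effective_params(3, 7)}}} takes the second.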

 We impose the constraint that {{{k <= m}}}, that {{{h <= m}}}, and that
 {{{h, k, m > 0}}}. We do not say that {{{h >= k}}} because a user who
 simply wants a backup and doesn't care about the specific dispersal or
 replication of their shares would say {{{h=1}}} (i.e., the upload is
 successful if the shares are somewhere). Additionally, we will say that
 the upload is a failure if {{{n < h}}}.
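 The constraints and the failure condition above could be checked roughly
 like this. It is only a sketch: {{{check_params}}} and {{{upload_ok}}} are
 hypothetical names, not existing tahoe functions, and nothing here says
 where in the uploader such checks would actually live.

```python
def check_params(k, h, m):
    """True if the user-specified parameters satisfy the stated constraints:
    k <= m, h <= m, and h, k, m > 0. Note that h >= k is NOT required."""
    return min(h, k, m) > 0 and k <= m and h <= m


def upload_ok(n, h):
    """An upload counts as a failure if n < h, where n is the number of
    servers that received shares."""
    return n >= h
```

 With these, {{{check_params(10, 1, 10)}}} accepts the "I just want a backup"
 settings, and {{{upload_ok(0, 1)}}} reports failure because no server
 received a share.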

 Does anyone disagree with my summary?

 I like this.

 It elegantly supports the use case of someone who doesn't care much about
 their files beyond the fact that they're on the grid somewhere by
 decoupling {{{h}}} from most of the logic (since it remains, unless I'm
 paraphrasing badly, only in checks at the end of file upload and in the
 repairer) -- they'd set ({{{k=10, h=1, m=10}}}), and be on their way. It
 also gives metacrob the tools he needs to support his use case.

 What sort of defaults seem reasonable for this new behavior?

 I feel a lot better about this ticket, now -- it will be pretty cool once
 we get it working, thanks to all the new feedback. :-)

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/778#comment:30>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

