[tahoe-dev] [tahoe-lafs] #778: "shares of happiness" is the wrong measure; "servers of happiness" is better
tahoe-lafs
trac at allmydata.org
Wed Jan 20 08:22:56 PST 2010
#778: "shares of happiness" is the wrong measure; "servers of happiness" is
better
---------------------------------------+------------------------------------
Reporter: zooko | Owner: warner
Type: defect | Status: new
Priority: critical | Milestone: 1.6.0
Component: code-peerselection | Version: 1.4.1
Keywords: reliability review-needed | Launchpad_bug:
---------------------------------------+------------------------------------
Comment(by zooko):
Kevan:
I've been struggling and struggling to understand the
{{{servers_of_happiness()}}} function. The documentation -- that it
attempts to find a 1-to-1 (a.k.a. "injective") function from servers to
shares -- sounds great! But, despite many attempts, I have yet to understand
whether the code is actually doing the right thing. (Note: this may well be in
part my fault for being thick-headed. Especially these days, when I am
very sleep-deprived and stressed and busy. But if we can make a function
that even I can understand then we'll be golden.)
So, one thing that occurs to me as I look at this function today is that
it might help if {{{existing_shares}}} and {{{used_peers}}} had more
consistent data types and names. If I understand correctly what they do
(which is a big 'if' at this point), they could each be a map from
{{{shareid}}} to {{{serverid}}}, or possibly a map from {{{shareid}}} to a
set of {{{serverid}}}'s, and their names could be {{{existing_shares}}}
and {{{planned_shares}}}, and the doc could explain that
{{{existing_shares}}} describes shares that are already alleged to be
hosted by servers, and {{{planned_shares}}} describes shares that we are
currently planning to upload to servers.
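To make the proposed shape concrete, here is a minimal sketch, assuming both
structures are plain maps from a {{{shareid}}} to a set of {{{serverid}}}'s.
The sample values and the {{{servers_holding}}} helper are hypothetical and
are not taken from the Tahoe-LAFS source.

```python
# Hypothetical illustration of the proposed data shapes; not Tahoe-LAFS code.

# Shares that servers already claim to host: shareid -> set of serverids.
existing_shares = {0: {"server_a"}, 1: {"server_a", "server_b"}}

# Shares we are currently planning to upload: shareid -> set of serverids.
planned_shares = {2: {"server_c"}, 3: {"server_c"}}


def servers_holding(share_map):
    """Return the set of all server ids that appear in a share map."""
    servers = set()
    for serverids in share_map.values():
        servers |= serverids
    return servers
```

With both inputs in the same shape, one helper like this works on either map.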
Would that be correct? It raises the question in my mind as to why
{{{servers_of_happiness()}}} distinguishes between those two inputs
instead of just generating its injective function from the union of those
two inputs. I suspect that this is because we want to prefer existing
shares instead of new shares when the two collide (i.e. when uploading a
new share would be redundant) in the interests of upload efficiency. Is
that true? Perhaps a note to that effect could be added to the
{{{servers_of_happiness()}}} doc.
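For comparison, the "just use the union" alternative could look roughly like
the sketch below. It assumes both inputs are maps from {{{shareid}}} to a set
of {{{serverid}}}'s, and it counts servers with a standard augmenting-path
bipartite matching. The function names are hypothetical, this is not the
current {{{servers_of_happiness()}}} code, and it deliberately does not model
any preference for existing shares over planned ones.

```python
# Hypothetical sketch; not the actual servers_of_happiness() implementation.

def union_share_maps(existing_shares, planned_shares):
    """Merge two shareid -> set(serverid) maps without mutating the inputs."""
    merged = {}
    for share_map in (existing_shares, planned_shares):
        for shareid, serverids in share_map.items():
            merged.setdefault(shareid, set()).update(serverids)
    return merged


def count_happy_servers(share_map):
    """Size of the largest 1-to-1 assignment of servers to distinct shares."""
    # Invert the map: serverid -> set of shareids that server holds or will hold.
    shares_by_server = {}
    for shareid, serverids in share_map.items():
        for serverid in serverids:
            shares_by_server.setdefault(serverid, set()).add(shareid)

    owner_of_share = {}  # shareid -> serverid currently matched to it

    def try_assign(serverid, visited):
        # Try to give this server a share, re-homing earlier servers if needed.
        for shareid in shares_by_server[serverid]:
            if shareid in visited:
                continue
            visited.add(shareid)
            if (shareid not in owner_of_share
                    or try_assign(owner_of_share[shareid], visited)):
                owner_of_share[shareid] = serverid
                return True
        return False

    return sum(1 for serverid in shares_by_server if try_assign(serverid, set()))
```

For example, if server A already holds shares 0 and 1 and we plan to put
shares 1 and 2 on server B, the union gives a count of 2, while the existing
shares alone give a count of 1.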
I realize that I have asked so many times for further explanation of
{{{servers_of_happiness()}}} that it has become comment-heavy. Oh well!
If we see ways to make the comments more concise and just as explanatory
that would be cool, but better too many comments than too few, for this
particular twisty little core function. :-)
Thanks!
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/778#comment:143>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid