[tahoe-dev] [tahoe-lafs] #1092: shares.happy is the wrong name of the measure
tahoe-lafs
trac at tahoe-lafs.org
Fri Dec 24 23:34:14 UTC 2010
#1092: shares.happy is the wrong name of the measure
--------------------------------+-------------------------------------------
Reporter: zooko | Owner: warner
Type: defect | Status: new
Priority: minor | Milestone: eventually
Component: code-nodeadmin | Version: 1.7.0
Resolution: | Keywords: usability upload
Launchpad Bug: |
--------------------------------+-------------------------------------------
Comment (by warner):
You know, I actually kinda like servers.happy=1, probably because I still
haven't internalized the whole bijective-mapping-of-servers concept yet. (I
mean, I know what's going on, yet each time that error appears, I walk away
in confusion because the text of the error message is so hard to follow, so
it leaves a general taste in my mouth that the whole idea is bad, even though
I know it's not really that bad.)
Kevan's arguments in the first comment are spot on. "forcing people to reason
about their grid" needs to happen in a friendlier place than the error
message.
gdt's comment about the flippant use of "happy" is accurate too. I originally
picked that for shares-of-happiness because it was a somewhat arbitrary
threshold applied in a very narrow and probably-rare error case (you've
connected to enough servers at the start of the upload, but then some were
lost by the time you finished... do you still declare success? are you still
happy?)
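A rough sketch of that original threshold check, assuming (for illustration
only) that it simply counts the shares still held by connected servers when
the upload finishes. The names `surviving_shares` and `still_happy` are
hypothetical, not Tahoe's actual code:

```python
from typing import Dict, List, Set

def surviving_shares(placements: Dict[str, List[int]], lost: Set[str]) -> int:
    """Count shares whose server was still connected at the end of the upload."""
    return sum(len(shares) for server, shares in placements.items()
               if server not in lost)

def still_happy(placements: Dict[str, List[int]], lost: Set[str], happy: int) -> bool:
    """Declare success only if at least `happy` shares survived (assumed rule)."""
    return surviving_shares(placements, lost) >= happy

# 10 shares spread over 5 servers, two of which drop out mid-upload.
placements = {
    "A": [0, 1], "B": [2, 3], "C": [4, 5], "D": [6, 7], "E": [8, 9],
}
lost = {"D", "E"}
print(still_happy(placements, lost, happy=7))
print(still_happy(placements, lost, happy=5))
```

With six shares left, the upload fails a threshold of 7 but passes a
threshold of 5, which is the "are you still happy?" question in miniature.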
> The current ordering gives the impression that shares.needed and
> shares.total are more independent than they are. So perhaps
> "shares.coding = (3, 10)" would be better than two variables. (I am under
> the impression that I can't just set shares.total to 12 and reconstruct
> those missing sh10, sh11 without having to recode the entire file; if I'm
> confused on that point this paragraph is invalid.)
(you're correct: you can't go from 3-of-10 to 3-of-12 without reencoding the
whole file. raw zfec would treat them the same, but the share-hash-trees that
tahoe adds for integrity checking would be different, so we fold both k and N
into the CHK hash, so you'll get an entirely different encryption key and
share data anyways)
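To make the "fold k and N into the hash" point concrete, here is a minimal
sketch. It assumes a simplified content-derived key and is not Tahoe's actual
tagged-hash construction; `derive_key` and its input layout are hypothetical:

```python
import hashlib

def derive_key(plaintext: bytes, k: int, n: int) -> bytes:
    """Hypothetical content-derived key that also commits to k and N."""
    h = hashlib.sha256()
    # The encoding parameters are hashed along with the plaintext, so
    # 3-of-10 and 3-of-12 give unrelated keys for identical file contents.
    h.update(f"{k},{n}:".encode("ascii"))
    h.update(plaintext)
    return h.digest()[:16]

data = b"same file contents"
key_3_of_10 = derive_key(data, 3, 10)
key_3_of_12 = derive_key(data, 3, 12)
print(key_3_of_10 != key_3_of_12)
```

Because the key changes, the ciphertext changes, and so does every share;
the extra shares for N=12 cannot be bolted onto the old 3-of-10 set.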
Yeah, combining two tahoe.cfg directives into one might be a good idea. In
fact, it should be phrased in the same way we talk about it in English:
[client]
shares.encoding = 3-of-10
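Parsing such a directive would be simple. A sketch using the standard
library's configparser; the "shares.encoding" key is only the proposal above,
and `parse_encoding` is a hypothetical helper, not existing Tahoe code:

```python
import configparser
import re
from typing import Tuple

def parse_encoding(value: str) -> Tuple[int, int]:
    """Parse a 'k-of-N' string such as '3-of-10' into (k, N)."""
    m = re.fullmatch(r"\s*(\d+)-of-(\d+)\s*", value)
    if m is None:
        raise ValueError(f"expected 'k-of-N', got {value!r}")
    k, n = int(m.group(1)), int(m.group(2))
    if not 1 <= k <= n:
        raise ValueError(f"need 1 <= k <= N, got k={k}, N={n}")
    return k, n

cfg = configparser.ConfigParser()
cfg.read_string("[client]\nshares.encoding = 3-of-10\n")
k, n = parse_encoding(cfg["client"]["shares.encoding"])
print(k, n)
```

A single value also lets the parser reject nonsense like "10-of-3" in one
place, instead of cross-checking two separate directives.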
> So I'll suggest "shares.independent", with the meaning being "the minimum
> number of shares that must be on independent servers"
I get the impression that this issue is more about "servers" than about
"shares", so I wonder if maybe it ought to be "servers.independent". I know
the math touches both, but I'd like to give users the ability to learn how
this works in chunks, where the first chunk is only about shares ("3-of-10, I
need 3 distinct shares, doesn't matter where they come from, ok, got it"),
and then a later chunk is about where those shares are placed ("oh, right,
what happens if there aren't enough servers?"). Maybe, if all the "shares.*"
configuration fit into the first chunk, then all the controls that involve
servers (even though they also involve shares) could be put into a different
namespace and support the user's concept of a second chunk of things to
learn. "servers.*" might support that.
I'm still undecided about what the default "use-case" ought to be. I think
it's vital that folks be able to bring up a small grid and test it out. I
also think it's important to protect "tahoe backup" users against the trivial
case where you're only putting shares on yourself. Maybe what I'm really
wishing for is better #467 explicit-server-selection code and UI. Maybe I'm
coming around to the idea that diversity trumps write-availability: if you
have some way of configuring (or at least acknowledging) who you're
*supposed* to connect to, then you could fail writes unless all those servers
were present. Maybe a set of checkboxes on the known-servers web page,
meaning "don't allow uploads to succeed unless this server is present". Maybe
I'm balking at simple integer success criteria because I don't see it as
being easy for a user (or me) to understand what it means, whereas a list of
required serverids is pretty straightforward.
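The required-serverids rule could be as small as a set difference. A sketch
only; the function names and the short server ids are made up for
illustration, and nothing like this exists in Tahoe today:

```python
from typing import Iterable, Set

def missing_required_servers(required: Iterable[str],
                             connected: Iterable[str]) -> Set[str]:
    """Return the required serverids that are not currently connected."""
    return set(required) - set(connected)

def upload_allowed(required: Iterable[str], connected: Iterable[str]) -> bool:
    """Allow the upload only if every checked (required) server is present."""
    return not missing_required_servers(required, connected)

required = {"srv-a", "srv-b"}            # boxes the user checked (hypothetical ids)
connected = {"srv-a", "srv-c", "srv-d"}  # servers we can currently reach

print(upload_allowed(required, connected))
print(sorted(missing_required_servers(required, connected)))
```

The error message then names the exact servers that are missing, which is
easier to act on than an integer threshold that wasn't met.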
But I'm hesitant on the explicit serverlist too, because of how it'd not work
so well in very dynamic grids, and how it kind of needs constant attention
and decision making by the user.
Hm. I'll think about the checkboxes idea more, I kinda like it.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1092#comment:5>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage