[tahoe-dev] [tahoe-lafs] #1252: use different encoding parameters for dirnodes than for files
tahoe-lafs
trac at tahoe-lafs.org
Thu Jan 6 19:17:32 UTC 2011
#1252: use different encoding parameters for dirnodes than for files
-------------------------------+--------------------------------------------
Reporter: davidsarah | Owner: davidsarah
Type: defect | Status: assigned
Priority: major | Milestone: 1.9.0
Component: code-frontend | Version: 1.8.0
Resolution: | Keywords: preservation availability dirnodes anti-censorship
Launchpad Bug: |
-------------------------------+--------------------------------------------
Comment (by warner):
I guess I'm +0 on the general idea of making dirnodes more robust than the
default, and -0 about the implementation/configuration complexity
involved.
If you have a deep directory tree, and the only path from a rootcap to a
filenode is through 10 subdirectories, then your chances of recovering the
file are P(recover_dirnode)^10^ * P(recover_filenode). We provision things
to make sure that P(recover_node) is extremely high, but that x^10^ is a
big factor, so making P(recover_dirnode) even higher isn't a bad idea.
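As a rough illustration of how the x^10^ term behaves, here is a minimal
sketch. It assumes every node fails independently, and the function name
and the sample probabilities are made up for illustration; they are not
measurements from any real grid.

```python
def path_recovery_probability(p_dirnode: float, p_filenode: float, depth: int) -> float:
    """Chance of reaching a file whose only path crosses `depth` dirnodes.

    Assumes independent failures: every dirnode on the path and the
    filenode itself must all be recoverable.
    """
    return (p_dirnode ** depth) * p_filenode


if __name__ == "__main__":
    # Hypothetical per-node recovery probabilities, purely illustrative.
    for p in (0.999, 0.9999, 0.999999):
        total = path_recovery_probability(p, p, depth=10)
        print(f"P(recover_node)={p}: P(recover file at depth 10)={total:.6f}")
```

Raising only `p_dirnode` while leaving `p_filenode` alone shows how much of
the total loss comes from the directory chain rather than from the file.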
But I agree that it's a pretty vague heuristic, and it'd be nicer to have
something less uncertain, or at least some data to work from. I'd bet that
most people retain a small number of rootcaps and use them to access a
much larger number of files, and that making dirnodes more reliable (at
the cost of more storage space) would be a good thing for 95% of the use
cases. (Note that folks who keep track of individual filecaps directly,
like a big database or something, would not see more storage space
consumed by this change.)
On the "data to work from" front, it might be interesting if
{{{tahoe deep-stats}}} built a histogram of node-depth (i.e. number of
dirnodes traversed, from the root, for each file). With the exception of
multiply-linked nodes and additional external rootcaps, this might give us
a better notion of how much dirnode reliability affects filenode
reachability.
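The kind of histogram I mean could be sketched like this. It is not the
existing deep-stats code and does not use any Tahoe API: the directory
tree is a hypothetical nested dict (a dict is a dirnode, anything else is a
file), and the root counts as the first dirnode traversed. Multiply-linked
nodes would simply be counted once per path.

```python
from collections import Counter


def depth_histogram(tree: dict, depth: int = 1, hist: Counter = None) -> Counter:
    """Map 'number of dirnodes traversed from the root' to a file count.

    `tree` is a hypothetical in-memory stand-in for a directory: child
    values that are dicts are subdirectories, all other values are files.
    """
    if hist is None:
        hist = Counter()
    for child in tree.values():
        if isinstance(child, dict):
            # Descending into a subdirectory adds one dirnode to the path.
            depth_histogram(child, depth + 1, hist)
        else:
            hist[depth] += 1
    return hist


if __name__ == "__main__":
    example = {
        "a.txt": None,
        "docs": {"b.txt": None, "old": {"c.txt": None}},
        "pics": {"d.jpg": None},
    }
    for depth, count in sorted(depth_histogram(example).items()):
        print(f"depth {depth}: {count} file(s)")
```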
I'll also throw in a +0 for Zooko's deeper message, which perhaps he
didn't state explicitly this particular time, which is that our
P(recover_node) probability is already above the
it-makes-sense-to-think-about-it-further threshold: the notion that
unmodeled real-world failures are way more likely than the
nice-clean-(artificial) modeled
all-servers-randomly-independently-fail-simultaneously failures. Once your
P(failure) drops below 10^-5^ or something, any further modeling is just
an act of self-indulgent mathematics.
I go back and forth on this: it feels like a good exercise to do the math
and build a system with a theoretical failure probability low enough that
we don't need to worry about it, and to keep paying attention to that
theoretical number when we make design changes (e.g. the reason we use
segmentation instead of chunking is that the math says that chunking is
highly likely to fail). It's nice to be able to say that, if you have 20
servers with Poisson failure rates X and repair with frequency Y then your
files will have Poisson durability Z (where Z is really good). But it's
also important to remind the listener that you'll never really achieve Z
because something outside the model will happen first: somebody will pour
coffee into your only copy of ~/.tahoe/private/aliases, put a backhoe into
the DSL line that connects you to the whole grid, or introduce a software
bug into all your storage servers at the same time.
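For what it's worth, the "nice-clean" model can be sketched in a few lines.
Everything here is an assumption made for illustration: one share per
server, servers failing independently as a Poisson process, repair
restoring all shares at the end of each interval, and a 3-of-10 encoding
with made-up rate numbers. It models exactly the failures that are easy to
model, and none of the coffee-and-backhoe ones.

```python
from math import comb, exp


def server_survival(failure_rate: float, repair_interval: float) -> float:
    """P(one server survives a repair interval) under a Poisson failure model."""
    return exp(-failure_rate * repair_interval)


def file_survival(k: int, n: int, p_server: float) -> float:
    """P(at least k of n shares survive), one share per independent server."""
    return sum(
        comb(n, i) * p_server ** i * (1 - p_server) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 failures per server-year, monthly repair.
    p = server_survival(failure_rate=0.5, repair_interval=1 / 12)
    print(f"per-server survival over one interval: {p:.6f}")
    print(f"3-of-10 file loss over one interval:   {1 - file_survival(3, 10, p):.3e}")
```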
(Incidentally, this is one of the big reasons I'd like to move us to a
simpler storage protocol: it would allow multiple implementations of the
storage server, in different languages, improving diversity and reducing
the chance of simultaneous non-independent failures.)
So anyways, yeah, I still think reinforcing dirnodes might be a good idea,
but I have no idea how good, or how much extra expansion is appropriate,
so I'm content to put it off for a while yet. Maybe 1.9.0, but I'd
prioritize it lower than most of the other 1.9.0-milestone projects I can
think of.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1252#comment:5>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage