[tahoe-dev] [tahoe-lafs] #607: DIR2:CHK
tahoe-lafs
trac at allmydata.org
Thu Aug 27 20:16:51 PDT 2009
#607: DIR2:CHK
---------------------------+------------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: major | Milestone: undecided
Component: code-dirnodes | Version: 1.2.0
Keywords: | Launchpad_bug:
---------------------------+------------------------------------------------
Comment(by warner):
Another good feature of an immutable-file based directory is that it could be
repaired when referenced through a readcap (#625), like the ones created by
"tahoe backup". Our current RSA-based (write-enabler-based) mutable files
cannot be repaired that way.

I'd like to implement this, and change "tahoe backup" to use it. The basic
steps I anticipate are:
* implement {{{create_dirnode(mutable=True, initial_children={})}}}
* replace the existing {{{create_empty_dirnode()}}} with that
* refactor {{{DirectoryNode}}} to separate out the underlying filenode
  better. The idea would be to nail down the interface that dirnodes need
  from the filenode that they've wrapped. The read side just needs read().
  The write side needs the normal mutable-filenode operations, like
  modify(). We should have an immutable filenode which offers the same
  read-side interface as the mutable filenode does.
* change the "!NodeMaker" code to create dirnodes by first creating a
  filenode and then passing it to the {{{Dirnode()}}} constructor. It may be
  useful to first change the way that uploads are done, and create a special
  kind of immutable filenode for upload purposes. This "gestating" node
  would have an interface to add data, would perform the upload while data
  is added, and would then have a finalize() method, which would finish the
  upload process, compute the filecap, and return the real !IFilesystemNode
  which can be used for reading. Making this special node have the same
  interface as a mutable filenode's initial-upload methods would let Dirnode
  be oblivious to the type of filenode it's been given.

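The "gestating" node idea in the last step could look roughly like the sketch below. Everything here is hypothetical: the class names, `add_data()`, and the fake cap string are placeholders, not existing Tahoe code. The sketch is also synchronous and only buffers bytes, whereas a real node would upload shares as data arrives and would presumably return Deferreds.

```python
import hashlib


class FinishedFileNode:
    """Hypothetical read-only node returned once the upload is complete."""

    def __init__(self, cap, data):
        self._cap = cap
        self._data = data

    def get_cap(self):
        return self._cap

    def is_mutable(self):
        return False

    def read(self):
        return self._data


class GestatingFileNode:
    """Hypothetical write-once node: accepts data, then is finalized."""

    def __init__(self):
        self._chunks = []
        self._hasher = hashlib.sha256()
        self._finalized = False

    def add_data(self, chunk):
        if self._finalized:
            raise RuntimeError("node already finalized")
        # A real implementation would encode and push shares here;
        # this sketch only buffers the bytes and hashes them.
        self._chunks.append(chunk)
        self._hasher.update(chunk)

    def finalize(self):
        if self._finalized:
            raise RuntimeError("node already finalized")
        self._finalized = True
        # Stand-in for computing the real filecap from the content.
        cap = "fake-cap:" + self._hasher.hexdigest()
        return FinishedFileNode(cap, b"".join(self._chunks))


node = GestatingFileNode()
node.add_data(b"hello ")
node.add_data(b"world")
finished = node.finalize()
```

The point of the two-class split is that callers never see a half-uploaded node through the read interface: only `finalize()` hands out something readable.
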
I'm planning to require that the contents of an immutable directory are also
immutable (LIT, CHK, and DIR2:CHK, not regular mutable DIR2), so that these
objects are always deep-readonly. (There may be an argument to provide
shallow-readonly directories, but I think deep-readonly is more generally
useful.)

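As a rough illustration of the deep-readonly rule, a creation-time check might reject any child that is not one of the immutable kinds listed above. The `(kind, cap)` pairs and the function name are assumptions made for this sketch; it does not parse real cap strings.

```python
# Cap kinds named in the text as acceptable children of an immutable
# directory. These are plain labels for this sketch, not parsed caps.
IMMUTABLE_KINDS = frozenset({"LIT", "CHK", "DIR2:CHK"})


def check_deep_readonly(children):
    """Raise ValueError if any child is not an immutable kind.

    `children` maps child name -> (kind, cap).
    """
    for name, (kind, _cap) in children.items():
        if kind not in IMMUTABLE_KINDS:
            raise ValueError("child %r has mutable kind %r" % (name, kind))
    return True
```

Because every child of a DIR2:CHK must itself pass this check, the property holds recursively without having to walk the tree at read time.
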
I'm pondering if there's a way to support multi-level trees in the future
without drastic changes, so that this one-level immutable directory could
turn into a full "virtual CD" (#204), with better performance (by bundling a
whole tree of directories into a single distributed object). This would
suggest making the name table accept tuples of names instead of just a single
one.

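To make the tuple-of-names idea concrete, here is a minimal sketch of a name table keyed by path tuples. The table layout, the helper names, and the cap strings are all invented for illustration; a one-level directory is simply the case where every key has length one.

```python
def lookup(table, path):
    """Look up a slash-separated path in a tuple-keyed name table."""
    return table[tuple(path.split("/"))]


def list_dir(table, prefix=()):
    """Return the sorted names found directly under `prefix`."""
    n = len(prefix)
    return sorted({key[n] for key in table
                   if len(key) > n and key[:n] == prefix})


# Hypothetical table: a whole tree bundled into one object.
table = {
    ("README",): "cap-1",
    ("src", "main.py"): "cap-2",
    ("src", "lib", "util.py"): "cap-3",
}
```

With this layout, intermediate directories such as `src` have no entry of their own; they exist only as shared key prefixes.
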
I've also wondered if we should implement some faster lookup scheme for these
immutable dirnodes, especially because we don't need to update them later.
Maybe djb's "cdb" (constant database). I'm not sure that a database which has
been optimized for minimal disk seeks will necessarily help us here, since
the segment size is drastically larger than what a hard disk offers, and the
network roundtrip latency is frequently an order of magnitude larger too. But
certainly we can come up with something that's easier to pack and unpack than
the DIR2 format.

Also, we can discard several things from the DIR2 format: we don't need child
writecaps (just the readcaps), and we obviously don't need the obsolete salt.
We probably still want the metadata dictionary, although that would
potentially interfere with the grid-side convergence that Zooko mentioned.

Changing the table format would remove some of the benefits of (and thus
motivation for) the other refactoring changes described above: if we've got a
separate class for immutable-dirnodes, then there's not much point in
contorting mutable and immutable filenodes to present the same interface.
But, it would probably be cleaner overall if there were just one dirnode
class, whose mutability is determined solely by asking the underlying
filenode about its own mutability. In this case, all the mutating methods
will still exist on the immutable dirnodes, but they'd throw an exception if
you actually try to call them in that situation, just as they do now.
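
The single-class approach might be sketched as follows. `FakeFileNode`, `NotMutableError`, and the dict-based contents are stand-ins invented for this example; the only idea taken from the text is that the dirnode asks its wrapped filenode about mutability and refuses to mutate when the answer is no.

```python
class NotMutableError(Exception):
    """Raised by this sketch when a mutating method hits an immutable node."""


class FakeFileNode:
    """Stand-in filenode whose contents are a plain dict of children."""

    def __init__(self, mutable, contents=None):
        self._mutable = mutable
        self._contents = dict(contents or {})

    def is_mutable(self):
        return self._mutable

    def read(self):
        return dict(self._contents)

    def modify(self, modifier):
        self._contents = modifier(dict(self._contents))


class DirectoryNode:
    """One dirnode class; mutability comes from the wrapped filenode."""

    def __init__(self, filenode):
        self._node = filenode

    def is_mutable(self):
        return self._node.is_mutable()

    def list(self):
        return self._node.read()

    def set_child(self, name, cap):
        if not self.is_mutable():
            raise NotMutableError("directory is immutable")

        def add(children):
            children[name] = cap
            return children

        self._node.modify(add)
```

The trade-off is the one described above: the mutating methods stay visible on immutable dirnodes and fail only when called, in exchange for not maintaining two parallel classes.
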
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/607#comment:2>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid