[tahoe-dev] [tahoe-lafs] #393: mutable: implement MDMF
tahoe-lafs
trac at tahoe-lafs.org
Wed Jun 23 14:56:50 PDT 2010
#393: mutable: implement MDMF
------------------------------+---------------------------------------------
Reporter: warner | Owner: kevan
Type: enhancement | Status: assigned
Priority: major | Milestone: 1.8.0
Component: code-mutable | Version: 1.0.0
Resolution: | Keywords: newcaps performance random-access privacy gsoc mdmf mutable
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
As we just discussed briefly on IRC, I'd like to avoid O(N) transfers, so
I'd prefer a merkle structure of salts over a flat hash. I
think the best approach would be to put the salt in front of each
encrypted block, and compute the merkle tree over the (salt+ciphertext)
pairs. The one block-tree remains the same size, but is computed over
slightly different data. Downloading a single block requires downloading
the (salt+ciphertext) for that one block, plus the merkle hash chain to
validate it, plus the (constant-length) top-level UEB-like structure
(which includes the root hash of the merkle tree, and the version
number), and the signature of that structure.
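
A rough sketch of the per-block salt idea above, for illustration only. It
assumes SHA-256, a fixed-length salt, simple "leaf:"/"node:" tags, and padding
of odd levels by duplicating the last node; none of these are the actual
Tahoe-LAFS hash constructions, and the function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest (assumed hash function for this sketch)."""
    return hashlib.sha256(data).digest()


def leaf_hash(salt: bytes, ciphertext: bytes) -> bytes:
    """Hash one (salt+ciphertext) pair; the salt is assumed fixed-length."""
    return h(b"leaf:" + salt + ciphertext)


def build_tree(leaves):
    """Return all levels of the tree: levels[0] is the leaves, levels[-1] is [root]."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]  # assumed padding rule: duplicate the last node
        levels.append([h(b"node:" + cur[i] + cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def hash_chain(levels, index):
    """Collect the sibling hashes needed to validate leaf number `index`."""
    chain = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last node of an odd level is paired with itself
        chain.append(level[sibling])
        index //= 2
    return chain


def verify_block(salt, ciphertext, index, chain, root):
    """Recompute the root from one block plus its hash chain."""
    node = leaf_hash(salt, ciphertext)
    for sibling in chain:
        if index % 2 == 0:
            node = h(b"node:" + node + sibling)
        else:
            node = h(b"node:" + sibling + node)
        index //= 2
    return node == root
```

A reader of block i then needs only that block's salt and ciphertext, the
chain from hash_chain (about log2(N) hashes), and the signed root, rather
than all N salts.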
Using version=2 in the share to indicate MDMF sounds fine.
readcaps/writecaps will remain the same as for SDMF: clients will need to
fetch a share before they can discover whether the file is SDMF or MDMF.
One thing to keep in mind, which I learned while writing the new #798
immutable downloader, is to try to structure the fields so that
downloaders can fetch everything in a single request. This means avoiding
as many round-trip-inducing decisions as possible. Having an offset table
makes the share nicely flexible, but it also means that you have to fetch
it (and burn a roundtrip) before you can precisely fetch anything else.
The new downloader makes a best guess about the offset table, and
optimistically fetches both the table and the data it points to at the
same time, with fallbacks in case it guessed wrong.
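
To make the roundtrip cost concrete, here is a minimal sketch of the
guess-and-fall-back pattern described above. The header layout, the
FakeServer class and its readv method are all hypothetical stand-ins, not the
real share format or storage API.

```python
import struct

# Hypothetical header: version byte, data offset, data length.
HEADER = struct.Struct(">BLL")


class FakeServer:
    """In-memory stand-in for a storage server that counts roundtrips."""

    def __init__(self, share: bytes):
        self.share = share
        self.roundtrips = 0

    def readv(self, ranges):
        """One roundtrip: return the bytes for each (offset, length) range."""
        self.roundtrips += 1
        return [self.share[o:o + n] for o, n in ranges]


def make_share(data: bytes, padding: int = 0) -> bytes:
    """Build a toy share: header, optional padding, then the data."""
    offset = HEADER.size + padding
    return HEADER.pack(2, offset, len(data)) + b"\x00" * padding + data


def fetch_data(server, guess_offset, guess_length):
    """Fetch the header and the guessed data range in a single request."""
    header, guessed = server.readv([(0, HEADER.size),
                                    (guess_offset, guess_length)])
    _version, offset, length = HEADER.unpack(header)
    if (offset, length) == (guess_offset, guess_length):
        return guessed  # guess was right: one roundtrip total
    # Guess was wrong: fall back to a second, precise request.
    (data,) = server.readv([(offset, length)])
    return data
```

With a correct guess the data arrives in one roundtrip; a wrong guess costs
two, which is no worse than fetching the offset table first.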
One downside of moving anything in the share format, or of increasing the
size of the data blocks, is that these sorts of guesses are more likely to
be wrong, or that there will be less overlap between the wrong guess and
the real location of the data. The #798 downloader unconditionally fetches
the first few kilobytes of the file, without being too precise about what
it grabs, because that can save a roundtrip in some cases of bad guesses.
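
The overlap point can be illustrated with a small range calculation. This is
only a sketch: the 4096-byte prefetch size and the function name are
assumptions, since the text above says only "the first few kilobytes".

```python
PREFETCH = (0, 4096)  # assumed size of the unconditional first read


def missing_ranges(wanted, have):
    """Return the parts of half-open range `wanted` not covered by `have`."""
    w_start, w_end = wanted
    h_start, h_end = have
    lo = max(w_start, h_start)
    hi = min(w_end, h_end)
    if lo >= hi:
        # No overlap at all: everything wanted must still be fetched.
        return [wanted] if w_start < w_end else []
    out = []
    if w_start < lo:
        out.append((w_start, lo))
    if hi < w_end:
        out.append((hi, w_end))
    return out
```

If the real location of a field lies entirely inside the prefetched bytes, the
result is empty and no extra roundtrip is needed. The further a field moves
toward or past the end of the prefetch, the more has to be fetched again.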
I'm not suggesting that we make a lot of changes to the mutable downloader,
but it might be worth walking through what a #798-style mutable downloader
would look like and seeing how many roundtrips it requires. If there's
something simple we can do to the share format now, it might give us room
to improve performance in the future.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/393#comment:19>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized file storage grid