[tahoe-dev] [tahoe-lafs] #393: mutable: implement MDMF

tahoe-lafs trac at tahoe-lafs.org
Wed Jun 23 14:56:50 PDT 2010


#393: mutable: implement MDMF
------------------------------+---------------------------------------------
     Reporter:  warner        |       Owner:  kevan                                                      
         Type:  enhancement   |      Status:  assigned                                                   
     Priority:  major         |   Milestone:  1.8.0                                                      
    Component:  code-mutable  |     Version:  1.0.0                                                      
   Resolution:                |    Keywords:  newcaps performance random-access privacy gsoc mdmf mutable
Launchpad Bug:                |  
------------------------------+---------------------------------------------

Comment (by warner):

 As we just discussed briefly on IRC, I'd like to avoid O(N) transfers, so
 I'd rather use a Merkle tree of salts than a flat hash. I think the best
 approach would be to put the salt in front of each encrypted block, and
 compute the Merkle tree over the (salt+ciphertext) pairs. The one
 block-tree remains the same size, but is computed over slightly different
 data. Downloading a single block requires downloading the
 (salt+ciphertext) for that one block, plus the Merkle hash chain to
 validate it, plus the (constant-length) top-level UEB-like structure
 (which includes the root hash of the Merkle tree, and the version
 number), and the signature of that structure.
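
 A minimal sketch of the per-block validation described above, to show
 why a single block needs only O(log N) hashes. Everything here is an
 assumption made for illustration: plain SHA-256 with ad-hoc "leaf:" and
 "node:" prefixes stands in for Tahoe's real tagged hashes, salts are
 assumed to be fixed-length, the tree is padded to a power of two, and
 the function names are hypothetical. Verifying the signature over the
 root is left out.

```python
import hashlib
import os


def _h(data: bytes) -> bytes:
    # Stand-in hash; the real share format would use its own tagged hashes.
    return hashlib.sha256(data).digest()


def leaf_hash(salt: bytes, ciphertext: bytes) -> bytes:
    # The leaf covers the salt and the ciphertext together, so one hash
    # chain authenticates both. Assumes a fixed-length salt.
    return _h(b"leaf:" + salt + ciphertext)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"node:" + left + right)


def build_tree(leaves):
    """Return all levels of the tree: levels[0] is the (padded) leaf
    level and levels[-1] is the single-element root level."""
    assert leaves, "need at least one block"
    level = list(leaves)
    # Pad with a dummy hash until the leaf count is a power of two.
    while len(level) & (len(level) - 1):
        level.append(_h(b"pad"))
    levels = [level]
    while len(level) > 1:
        level = [node_hash(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def hash_chain(levels, index):
    """Sibling hashes needed to validate leaf `index`, bottom-up."""
    chain = []
    for level in levels[:-1]:
        chain.append(level[index ^ 1])  # sibling at this level
        index >>= 1
    return chain


def verify_block(salt, ciphertext, index, chain, root):
    """Recompute the root from one (salt, ciphertext) pair and its chain."""
    node = leaf_hash(salt, ciphertext)
    for sibling in chain:
        if index & 1:
            node = node_hash(sibling, node)
        else:
            node = node_hash(node, sibling)
        index >>= 1
    return node == root


if __name__ == "__main__":
    blocks = [(os.urandom(16), os.urandom(64)) for _ in range(5)]
    levels = build_tree([leaf_hash(s, c) for s, c in blocks])
    root = levels[-1][0]          # this is what the signed structure holds
    chain = hash_chain(levels, 3)
    print(len(chain), verify_block(blocks[3][0], blocks[3][1], 3, chain, root))
```

 With a flat hash over all salts, a reader would have to fetch every salt
 to check one block; here the chain grows with the depth of the tree.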

 Using version=2 in the share to indicate MDMF sounds fine.
 readcaps/writecaps will remain the same as for SDMF: clients will need to
 fetch a share before they can discover whether the file is SDMF or MDMF.

 One thing to keep in mind, which I learned while writing the #798 new
 immutable downloader, is to try to structure the fields so that
 downloaders can fetch everything in a single request. This means avoiding
 as many round-trip-inducing decisions as possible. Having an offset table
 makes the share nicely flexible, but it also means that you have to fetch
 it (and burn a roundtrip) before you can precisely fetch anything else.
 The new-downloader makes a best-guess about the offset table, and
 optimistically fetches both the table and the data it points to at the
 same time, with fallbacks in case it guessed wrong.
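
 The guess-then-fall-back pattern can be sketched as follows. This is
 not the #798 downloader's code: the three-field header, the FakeServer
 class, and its readv() method are hypothetical stand-ins, chosen only to
 make the roundtrip counting visible.

```python
import struct

# Hypothetical fixed-size header: version byte, data offset, data length.
HEADER = struct.Struct(">BLL")


class FakeServer:
    """In-memory stand-in for a storage server; counts roundtrips."""

    def __init__(self, share: bytes):
        self.share = share
        self.round_trips = 0

    def readv(self, ranges):
        # One call models one network roundtrip, however many
        # (offset, length) ranges it carries.
        self.round_trips += 1
        return [self.share[off:off + length] for off, length in ranges]


def make_share(data: bytes, padding: int = 0) -> bytes:
    # `padding` models other fields sitting between the header and data.
    offset = HEADER.size + padding
    return HEADER.pack(2, offset, len(data)) + b"\0" * padding + data


def fetch_data(server, guess_offset: int, guess_length: int) -> bytes:
    # Optimistically ask for the header and the guessed span together.
    header_bytes, guessed = server.readv(
        [(0, HEADER.size), (guess_offset, guess_length)])
    _version, offset, length = HEADER.unpack(header_bytes)

    # If the real data lies entirely inside the guessed span, we are done
    # after a single roundtrip.
    if offset >= guess_offset and offset + length <= guess_offset + guess_length:
        start = offset - guess_offset
        return guessed[start:start + length]

    # Wrong guess: fall back to a precise read, costing a second roundtrip.
    (data,) = server.readv([(offset, length)])
    return data


if __name__ == "__main__":
    good = FakeServer(make_share(b"hello world"))
    print(fetch_data(good, HEADER.size, 64), good.round_trips)

    bad = FakeServer(make_share(b"hello world", padding=100))
    print(fetch_data(bad, HEADER.size, 64), bad.round_trips)
```

 A generous guess_length trades some wasted bytes for a better chance of
 covering the real data in the first request.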

 One downside of moving anything in the share format, or of increasing the
 size of the data blocks, is that these sorts of guesses are more likely to
 be wrong, or that there will be less overlap between the wrong guess and
 the real location of the data. The #798 downloader unconditionally fetches
 the first few kilobytes of the file, without being too precise about what
 it grabs, because that can save a roundtrip in some cases of bad guesses.

 I'm not suggesting that we make a lot of changes to the mutable
 downloader, but it might be worth walking through what a #798-style
 mutable downloader would look like and counting how many roundtrips it
 requires. If there's something simple we can do to the share format now,
 it might give us room to improve performance in the future.

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/393#comment:19>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized file storage grid

