[tahoe-lafs-trac-stream] [tahoe-lafs] #1526: make sure the new MDMF extension field is forward-compatible and safe

tahoe-lafs trac at tahoe-lafs.org
Sun Sep 4 15:18:44 PDT 2011


#1526: make sure the new MDMF extension field is forward-compatible and safe
-----------------------------+---------------------------------------------
     Reporter:  zooko        |      Owner:
         Type:  defect       |     Status:  new
     Priority:  critical     |  Milestone:  1.9.0
    Component:  code-mutable |    Version:  1.9.0a1
   Resolution:               |   Keywords:  forward-compatibility mdmf
Launchpad Bug:               |              design-review-needed
-----------------------------+---------------------------------------------

Comment (by warner):

 On IRC just now, I concluded that a "k" hint is useful to the mapupdate
 step (so it can make a reasonable number of parallel DYHB/checkstring
 requests), but that the "segsize" hint isn't so useful.

 Ideally, the hints would let us build a one-round-trip downloader. The
 share contains a bunch of static data (whose size depends upon "N" and the
 keysize, but not the current filesize), followed by the share data,
 finally followed by the block hash tree (which depends upon the current
 filesize). This order lets us grow the file without needing to move all
 the sharedata. So 1 RTT is really hard: you have to correctly guess where
 the block hash tree data is, which means knowing the filesize to within a
 segment, in addition to knowing the segsize.
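
 To make the dependency concrete, here is a rough sketch of where the block
 hash tree would start in one share. The layout model is an assumption for
 illustration only: the static portion is a made-up constant, and each
 segment is assumed to contribute one block of ceil(segment_length / k)
 bytes to every share. The function and constant names are hypothetical.

```python
def ceil_div(n, d):
    """Integer ceiling division."""
    return (n + d - 1) // d


def block_hash_tree_offset(static_size, filesize, segsize, k):
    """Offset at which the block hash tree would start in one share.

    Assumed model (not the real share format): static data, then one
    block per segment, then the block hash tree.
    """
    full_segments, tail = divmod(filesize, segsize)
    share_data = full_segments * ceil_div(segsize, k)
    if tail:
        # The last, short segment contributes a shorter block.
        share_data += ceil_div(tail, k)
    return static_size + share_data


STATIC_SIZE = 2000       # made-up size of the static portion
SEGSIZE = 128 * 1024     # made-up segment size
K = 3

small = block_hash_tree_offset(STATIC_SIZE, 1_000_000, SEGSIZE, K)
grown = block_hash_tree_offset(STATIC_SIZE, 1_000_000 + SEGSIZE, SEGSIZE, K)
```

 Under this model, growing the file by one segment moves the block hash
 tree by one block, so a reader guessing the offset without knowing the
 filesize to within a segment would ask for the wrong bytes.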

 So 2 RTT is a more reasonable goal: the first request tells you the
 current filesize and the offset table, which lets you make an accurate
 second request. For the current Retrieve code, the first request is done
 during the mapupdate phase, which gets the checkstring and offset table
 for each known share. Then the retrieve phase can make an accurate initial
 request, and return the first segment of data in just 2 RTT.

 So anyways, I think "k" is a useful hint, but I'm no longer so sure about
 including "segsize". I think "k" is so useful that I'm willing to make it
 mandatory, or at least untagged, so the MDMF filecap would be
 {{{URI:MDMF:$writekey:$verfkeyfingerprint[:$k][:$EXTNAME=$VALUE]*}}}.
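
 As a sketch of how a reader might split a filecap of that shape, assuming
 fields never contain ":" and that an all-digit field directly after the
 fingerprint is the untagged "k" (parse_mdmf_cap is a hypothetical helper,
 not an existing Tahoe function):

```python
def parse_mdmf_cap(cap):
    """Split a filecap of the proposed form into its fields.

    Returns (writekey, fingerprint, k, extensions); k is None if absent.
    """
    fields = cap.split(":")
    if len(fields) < 4 or fields[:2] != ["URI", "MDMF"]:
        raise ValueError("not an MDMF filecap: %r" % (cap,))
    writekey, fingerprint = fields[2], fields[3]
    rest = fields[4:]

    k = None
    # Assumption: an untagged, all-digit field right after the
    # fingerprint is the "k" hint.
    if rest and rest[0].isascii() and rest[0].isdigit():
        k = int(rest[0])
        rest = rest[1:]

    extensions = {}
    for field in rest:
        name, sep, value = field.partition("=")
        if not sep or not name:
            raise ValueError("malformed extension field: %r" % (field,))
        # Unknown extension names are kept rather than rejected.
        extensions[name] = value
    return writekey, fingerprint, k, extensions
```

 Keeping unrecognized $EXTNAME fields instead of rejecting them is one way
 an older reader could tolerate caps written by a newer version.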

 The downside of making "k" so non-optional is that a repairer/reencoder
 which changes the file's encoding (replacing every single share) would
 then cause the filecap's "k" value to be stale. The retrieve code only
 treats it as a hint, of course (it is used in the mapupdate step but not
 in e.g. decoding), so a stale value couldn't hurt anything but efficiency
 (the mapupdate's first batch of requests might not find enough shares,
 and we'd have to wait for the first batch to return before we learn of
 the larger real "k" and send out a second batch). But I think that's not
 a significant issue.
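
 The cost of a stale hint can be pictured with a toy model, assuming every
 query finds exactly one share and any response reveals the real "k"
 (mapupdate_batches is a hypothetical name, not the actual mapupdate
 code):

```python
def mapupdate_batches(hinted_k, real_k):
    """Return the sizes of the query batches needed to find real_k shares.

    Toy model: each query finds one share, and the first batch of
    responses reveals the real "k".
    """
    batches = [hinted_k]
    found = hinted_k
    if found < real_k:
        # The hint was too small: one more round trip for the rest.
        batches.append(real_k - found)
    return batches
```

 In this model a correct or too-large hint costs one batch, and a
 too-small hint costs exactly one extra round trip.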

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1526#comment:6>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

