[tahoe-lafs-trac-stream] [tahoe-lafs] #1526: make sure the new MDMF extension field is forward-compatible and safe
tahoe-lafs
trac at tahoe-lafs.org
Sun Sep 4 15:18:44 PDT 2011
#1526: make sure the new MDMF extension field is forward-compatible and safe
-------------------------+-------------------------------------------------
Reporter: zooko         | Owner:
Type: defect            | Status: new
Priority: critical      | Milestone: 1.9.0
Component: code-mutable | Version: 1.9.0a1
Resolution:             | Keywords: forward-compatibility mdmf design-review-needed
Launchpad Bug:          |
-------------------------+-------------------------------------------------
Comment (by warner):
On IRC just now, I concluded that a "k" hint is useful to the mapupdate
step (so it can make a reasonable number of parallel DYHB/checkstring
requests), but that the "segsize" hint isn't so useful.
Ideally, the hints would let us build a one-round-trip downloader. The
share contains a bunch of static data (whose size depends upon "N" and the
keysize, but not the current filesize), followed by the share data,
finally followed by the block hash tree (which depends upon the current
filesize). This order lets us grow the file without needing to move all
the sharedata. So 1 RTT is really hard: you have to correctly guess where
the block hash tree data is, which means knowing the filesize to within a
segment, in addition to knowing the segsize.
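To make the dependency on filesize concrete, here is a rough Python sketch of where the block hash tree would start inside one share. It assumes a simplified layout: a fixed-size static header, then one block per segment, with no per-segment salts or padding. The names `ceil_div`, `block_hash_tree_offset` and `static_size` are hypothetical and not taken from the Tahoe-LAFS code.

```python
def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def block_hash_tree_offset(static_size, filesize, segsize, k):
    """Estimate the offset of the block hash tree within one share.

    Simplified model (an assumption, not the real MDMF layout): each
    segment is split into k blocks of ceil(segment_length / k) bytes,
    and each share stores exactly one block per segment.
    """
    num_full, tail = divmod(filesize, segsize)
    share_data = num_full * ceil_div(segsize, k)
    if tail:
        # The last, shorter segment contributes a shorter block.
        share_data += ceil_div(tail, k)
    return static_size + share_data
```

Under this model, a filesize guess that is off by one segment moves the offset by roughly segsize/k bytes, so a single speculative read is likely to miss the hash tree.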
So 2 RTT is a more reasonable goal: the first request tells you the
current filesize and the offset table, which lets you make an accurate
second request. For the current Retrieve code, the first request is done
during the mapupdate phase, which gets the checkstring and offset table
for each known share. Then the retrieve phase can make an accurate initial
request, and return the first segment of data in just 2 RTT.
So anyways, I think "k" is a useful hint, but I'm no longer so sure about
including "segsize". I think "k" is so useful that I'm willing to make it
mandatory, or at least untagged, so the MDMF filecap would be
{{{URI:MDMF:$writekey:$verfkeyfingerprint[:$k][:$EXTNAME=$VALUE]*}}}.
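As an illustration only, a parser for that proposed format might look like the Python sketch below. It assumes that none of the fields contain ":" and that extension names contain no "=". The function name `parse_mdmf_cap` and the returned tuple shape are hypothetical. Unknown extension fields are kept but not interpreted, which is one way to stay forward-compatible.

```python
def parse_mdmf_cap(cap):
    """Parse URI:MDMF:$writekey:$verfkeyfingerprint[:$k][:$EXTNAME=$VALUE]*.

    Returns (writekey, fingerprint, k, extensions), where k is None if
    the untagged hint is absent and extensions maps names to raw strings.
    """
    parts = cap.split(":")
    if len(parts) < 4 or parts[0] != "URI" or parts[1] != "MDMF":
        raise ValueError("not an MDMF cap: %r" % (cap,))
    writekey, fingerprint = parts[2], parts[3]
    rest = parts[4:]

    k = None
    if rest and "=" not in rest[0]:
        # An untagged field right after the fingerprint is the "k" hint.
        k = int(rest[0])
        if k < 1:
            raise ValueError("k hint must be positive: %r" % (rest[0],))
        rest = rest[1:]

    extensions = {}
    for field in rest:
        name, sep, value = field.partition("=")
        if not sep or not name:
            raise ValueError("malformed extension field: %r" % (field,))
        # Unknown names are preserved as-is rather than rejected.
        extensions[name] = value
    return writekey, fingerprint, k, extensions
```

For example, `parse_mdmf_cap("URI:MDMF:wk:fp:3:segsize=131072")` would yield the two key strings, `3` for the hint, and a one-entry extension dict.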
The downside of making "k" so non-optional is that a repairer/reencoder
which changes the file's encoding (replacing every single share) would
then cause the filecap's "k" value to be stale. The retrieve code only
treats it as a hint, of course (it is used by the mapupdate step but not
by e.g. decoding), so it couldn't hurt anything but efficiency (the
mapupdate's first batch of requests might not be enough to find enough
shares, and we'll have to wait for the first batch to return before we
learn of the larger real "k" and send out a second batch). But I think
that's not a significant issue.
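The cost of a stale hint can be sketched with a toy model in Python. It assumes each queried server holds exactly one share and that the first batch is sized exactly from the hint. The real mapupdate code may well query extra servers, so treat `mapupdate_batches` as a hypothetical illustration rather than the actual algorithm.

```python
def mapupdate_batches(hinted_k, real_k):
    """Return the sizes of the request batches needed to find real_k shares.

    Toy model (assumption): one share per server, and the first batch
    contains exactly hinted_k requests.
    """
    if hinted_k >= real_k:
        # The hint was accurate or too large: one batch is enough.
        return [hinted_k]
    # The first batch reveals the larger real k; a second batch covers the gap.
    return [hinted_k, real_k - hinted_k]
```

With a stale hint of 3 and a real "k" of 10, the model gives two batches, i.e. one extra round trip, while a hint that is too large only costs some unnecessary requests.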
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1526#comment:6>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage