[tahoe-dev] URI compatibility break

Brian Warner warner-tahoe at allmydata.com
Thu Feb 7 10:43:36 PST 2008


Hi folks..

Last night I made a change to the way we hash encryption keys into storage
index values, which had the unintended side-effect of breaking all existing
URIs. Oops. I was making a necessary change to the CHK data-to-key hash, and
incorrectly assumed that I could change the key-to-SI hash at the same time
without causing problems.

The result is that any URI that was generated with version 0.7.0-280 or
earlier will be unreadable by version 0.7.0-281 or later. The symptom will be
a "NotEnoughPeersError" when attempting to download those files. In addition,
any URI generated by 0.7.0-281 or later will be unreadable by older (<=280)
clients.
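
To make the failure mode concrete, here is a minimal sketch of why changing
the key-to-SI hash strands existing shares. It is an illustration only: the
tag strings, the SHA-256 construction, the 16-byte truncation, and the helper
names (storage_index, find_shares) are all assumptions made up for this
example, not the actual Tahoe hashes. The point is that the shares stay where
the old hash put them, while a new client looks somewhere else.

```python
import hashlib

# Hypothetical tags standing in for the old and new key-to-SI hashes.
OLD_TAG = b"example-key-to-si-old:"
NEW_TAG = b"example-key-to-si-new:"

def storage_index(key: bytes, tag: bytes) -> bytes:
    """Derive a storage index from an encryption key (illustrative hash only)."""
    return hashlib.sha256(tag + key).digest()[:16]

# Stand-in for the encryption key carried by a URI made with an old client.
key = bytes(range(16))

# The storage servers filed the shares under the SI the old client computed.
server_shares = {storage_index(key, OLD_TAG): ["share0", "share1", "share2"]}

def find_shares(key: bytes, tag: bytes) -> list:
    """Ask the (simulated) servers for shares under the SI derived from key."""
    return server_shares.get(storage_index(key, tag), [])

old_client_result = find_shares(key, OLD_TAG)  # finds the shares
new_client_result = find_shares(key, NEW_TAG)  # looks under a different SI
```

In this sketch the new client gets an empty list back, which is the situation
that surfaces as "NotEnoughPeersError" during a download.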

Since the new hash is better in a few minor ways than the old one, we've
decided to stick with the new scheme, and to accept the compatibility hit
(using the "better now than later" argument). You can still retrieve those
old files by using an old client, but you'll want to re-upload them to make
them visible to versions going forward. Also note that it is only the client
version that matters: neither the storage servers nor the upload helper are
aware of this change.

Also note that because the CHK data-to-key hash changed, any file that you
uploaded before the change will get a different CHK key when uploaded again,
so the usual "avoid duplicate uploads" mechanism will not consider the old
copy to be the same as the new one. The hash was changed to include the encoding
parameters in the CHK hash, to avoid an annoying problem in which two copies
of the same file would get encoded (by different clients) with different
k-of-N parameters, but have the same storage index. The result was a weird
mixture of incompatible shares on the grid, with no good way to tell them
apart. On a good day this merely slowed down the download (since the download
algorithm treated the wrong-encoding shares as corrupted, and dropped them in
favor of better shares), but in some cases it could cause the file to be
unretrievable (as the wrong-encoding shares effectively displaced the
right-encoding shares). It has been an outstanding item for almost a year to
include these parameters in the hash; only with the refactoring work of the
last few days has it become convenient to actually do it.
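
The collision described above, and how folding the encoding parameters into
the hash avoids it, can be sketched roughly as follows. Everything here is
hypothetical: the tag strings, the SHA-256 construction, the way k and N are
serialized, and the function names are assumptions chosen for illustration,
and the real hash may cover more parameters than the two shown.

```python
import hashlib

def _h(tag: bytes, payload: bytes) -> bytes:
    """Illustrative tagged hash, truncated to 16 bytes."""
    return hashlib.sha256(tag + payload).digest()[:16]

def chk_key_old(data: bytes) -> bytes:
    """Old-style sketch: the key depends on the file contents only."""
    return _h(b"example-data-to-key-old:", data)

def chk_key_new(data: bytes, k: int, n: int) -> bytes:
    """New-style sketch: the k-of-N encoding parameters are hashed in too."""
    params = f"{k},{n}:".encode("ascii")
    return _h(b"example-data-to-key-new:", params + data)

def storage_index(key: bytes) -> bytes:
    """Key-to-SI step (again, an illustrative hash only)."""
    return _h(b"example-key-to-si:", key)

data = b"the same file, uploaded by two different clients"

# Old scheme: both clients land on the same storage index even though one
# encodes 3-of-10 and the other 25-of-100, so incompatible shares get mixed.
si_old_a = storage_index(chk_key_old(data))
si_old_b = storage_index(chk_key_old(data))

# New scheme: different encoding parameters give different storage indexes.
si_new_3_of_10 = storage_index(chk_key_new(data, 3, 10))
si_new_25_of_100 = storage_index(chk_key_new(data, 25, 100))

# Same data and same parameters still collide, so duplicate detection
# keeps working between clients that agree on the encoding.
si_new_again = storage_index(chk_key_new(data, 3, 10))
```

The flip side is visible in the sketch as well: chk_key_old and chk_key_new
never agree for the same file, which is why uploads made before the change are
not recognized as duplicates afterwards.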


sorry for the inconvenience,
 -Brian

