[tahoe-dev] tahoe backup re-uploads old files

Brian Warner warner at lothar.com
Wed Feb 29 23:59:01 UTC 2012


On 2/24/12 11:48 AM, Marco Tedaldi wrote:

>> To my surprise it skipped some old pictures (which I consider normal)
>> but started uploading one of the old pictures I had for sure not
>> touched in the mean time.

It depends upon the timing involved, but "tahoe backup" will check up on
unchanged files that haven't been checked in a while. If the file was
last uploaded or checked within a month, it will assume that the shares
are still ok (so 0% chance of doing a filecheck). Starting at one month
old, the probability of doing a filecheck grows, until it reaches 100%
at two months (i.e. if the file hasn't been checked for over two months,
it will *always* do a filecheck). If the filecheck reports any problems,
the file is re-uploaded.
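
A rough sketch of that schedule, for illustration only. It assumes the
probability ramps linearly between the two thresholds and that a "month"
is 30 days; the names below are made up for this example and are not
claimed to match the tahoe source.

```python
import random

DAY = 24 * 60 * 60
MONTH = 30 * DAY                # assumption: a "month" is 30 days
NO_CHECK_BEFORE = 1 * MONTH     # hypothetical name: below this age, never check
ALWAYS_CHECK_AFTER = 2 * MONTH  # hypothetical name: above this age, always check


def check_probability(age_seconds):
    """Chance of a filecheck for a file last checked age_seconds ago.

    Assumed linear ramp: 0.0 up to one month, 1.0 from two months on.
    """
    span = ALWAYS_CHECK_AFTER - NO_CHECK_BEFORE
    fraction = (age_seconds - NO_CHECK_BEFORE) / span
    return min(max(fraction, 0.0), 1.0)


def should_check(age_seconds, rng=random.random):
    """Roll the dice once; rng() returns a float in [0.0, 1.0)."""
    return rng() < check_probability(age_seconds)
```

Under these assumptions a file last checked 45 days ago gets a filecheck
on about half of the backup runs, which is why only some of the old
files are picked up on any given run.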

The idea was to smoothly check up on files that aren't "fresh", to
detect servers going away or other surprises (like switching grids
altogether). So my guess is that you started the backup more than a
month ago, and tahoe is now starting to check up on those files. Either
some servers have gone away or they are temporarily offline, so it
decides it needs to re-upload the files.

Does that seem to match what you're observing?

> (oh yeah, i've deleted some random parts of the URI... if you need the
> whole uri, I can give it out by PM. It's not that these images are big
> secrets)...

Incidentally, if you want to safely mangle a CHK filecap like:

> 'URI:CHK:gxtbrbfds5x63jbcnu4jaq:qzqfninxxp37itz46omv77jk7z65tj5q3rij4aa:5:11:47963680'

then the important thing to hide is the third field ("gxtbrb.."). That's
the part that contains the decryption key. The rest is
publicly visible integrity-checking data and share-counts/filesizes.
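
If you would rather not edit the string by hand, something like the
following would do it. This is only a sketch: the function name and the
placeholder are made up here, and it assumes the cap splits on ":" into
exactly seven fields ("URI", "CHK", the key, and four public fields) as
in the example above.

```python
def redact_chk_filecap(filecap, placeholder="REDACTED"):
    """Return the filecap with its third field (the key) replaced.

    Hypothetical helper for posting a cap publicly; the remaining
    fields are left untouched.
    """
    fields = filecap.split(":")
    # Assumed layout: URI, CHK, key, then the four public fields.
    if len(fields) != 7 or fields[:2] != ["URI", "CHK"]:
        raise ValueError("not a CHK filecap: %r" % (filecap,))
    fields[2] = placeholder
    return ":".join(fields)


if __name__ == "__main__":
    cap = ("URI:CHK:gxtbrbfds5x63jbcnu4jaq:"
           "qzqfninxxp37itz46omv77jk7z65tj5q3rij4aa:5:11:47963680")
    print(redact_chk_filecap(cap))
```

The redacted form can no longer be used to read the file, since the key
is gone, but it still shows which file is being discussed.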

cheers,
 -Brian

