"backup" behavior and corrupted file

Brian Warner warner at lothar.com
Sun Jul 26 19:00:18 UTC 2015


On 7/1/15 10:57 PM, droki wrote:

> This brings me to my second issue. I was trying to work around this
> problem and thought "I'll just make a whole new backup." So I ran
> "tahoe backup" and specified a new directory as the destination. But I
> saw that tahoe was still skipping all the files that had previously
> been backed up, so it wasn't creating a new complete backup. Is this
> the intended behavior?

Incidentally, the "tahoe backup" command (eventually) creates a big
immutable directory tree for each snapshot it creates. As it runs, it
uploads files to the storage servers, and records their (immutable)
filecaps in the backupdb. But it won't create the final tahoe-side
snapshot directory until the very end, to avoid creating
partial/incomplete backups. It's all-or-nothing.
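
To make that ordering concrete, here is a toy model in Python. It is not
Tahoe's actual code: fake_upload, run_backup, the fail_on parameter, and
the plain dict standing in for the backupdb are all made up for this
sketch. The only point is that the per-file records accumulate as the
run proceeds, while the snapshot is built in the very last step.

```python
import hashlib


def fake_upload(data):
    # Stand-in for a real upload: derive a fake "filecap" from the contents.
    return "URI:CHK:" + hashlib.sha256(data).hexdigest()[:16]


def run_backup(files, backupdb, fail_on=None):
    """Upload every file, then build the snapshot as the final step.

    files:    dict of path -> bytes (pretend local filesystem)
    backupdb: dict of path -> filecap, kept between runs
    fail_on:  path at which to simulate a crash (hypothetical, demo only)
    """
    for path in sorted(files):
        if path == fail_on:
            raise RuntimeError("simulated failure at " + path)
        # Record each filecap as soon as its upload succeeds.
        backupdb[path] = fake_upload(files[path])
    # Only reached if every upload succeeded: all-or-nothing.
    return {path: backupdb[path] for path in files}
```

If run_backup raises partway through, the caller gets no snapshot at
all, but the backupdb dict still holds an entry for every file that was
uploaded before the failure.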

This can be a bit unsettling the first time you see it, because "tahoe
backup" seems to be consuming a lot of CPU and upstream network
bandwidth, but without anything to show for it. At least if you're
measuring progress by running "tahoe ls" on the target directory over
and over again, you won't see anything until it finishes.

When combined with a bug (like the one you seem to have encountered)
that causes something to break along the way, it's easy to conclude that
"tahoe backup" has accomplished nothing.

You looked in exactly the right place to measure progress: the number of
files that the backup is skipping. Each time "tahoe backup" successfully
uploads a file, even if the overall backup doesn't complete, the next
backup command will skip that file. The only progress lost by
interrupting a "tahoe backup" is the one file that was in flight at that
moment, plus the time it took to scan the local filesystem to get to
that point.
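
The skip-on-resume behavior can be sketched the same way. Again, this is
a toy with made-up names (backup_pass, Interrupted, interrupt_at), and
it decides whether to skip by comparing file contents directly, which is
a simplification assumed here for brevity, not a description of what the
real backupdb checks.

```python
class Interrupted(Exception):
    """Raised to simulate the user killing the backup mid-run."""


def backup_pass(files, backupdb, interrupt_at=None):
    """Run one backup pass and return (uploaded, skipped) counts."""
    uploaded = skipped = 0
    for path in sorted(files):
        if path == interrupt_at:
            # The file "in flight" is lost; earlier records survive.
            raise Interrupted(path)
        if backupdb.get(path) == files[path]:
            skipped += 1  # already recorded by an earlier pass
        else:
            backupdb[path] = files[path]  # pretend upload + record
            uploaded += 1
    return uploaded, skipped


files = {"a": b"1", "b": b"2", "c": b"3"}
db = {}
try:
    backup_pass(files, db, interrupt_at="c")  # dies while handling "c"
except Interrupted:
    pass
print(backup_pass(files, db))  # (1, 2): only "c" is uploaded again
```

The second pass reports two skipped files even though the first pass
never finished, which is the same signal as the growing skip count you
observed.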

We have some progress-notification code in there, but it's pretty rough.
It has no idea how many total files you have, or how many will need to
be uploaded, so it can't give you a "33% complete" counter or anything.
Adding the "--verbose" option will make it print something about every
file, so (except for files that take a long time to upload) that'll give
you some sense of whether things are still happening or if it's gotten
stuck completely. I've sketched out some better progress-display
schemes, but nothing is ready for porting into Tahoe yet.

cheers,
 -Brian

