[tahoe-dev] Tahoe performance
Shawn Willden
shawn-tahoe at willden.org
Wed Feb 11 00:18:46 PST 2009
On Wednesday 11 February 2009 12:18:07 am Lazy Stream wrote:
> Why is anything necessary other than "mount Tahoe as a filesystem"
> followed by "rsync -av /home/you/ /mnt/tahoe" or "rsync.exe -av
> \users\you z:\" ? Restartable, orthogonal, smurfilicious.
Can't work.
Rsync can only work its magic if the remote machine can read the files (and
can run a copy of the rsync program). The remote rsync instance needs to be
able to generate rolling checksums of the content and send them to the other
end, which uses them to figure out what's changed. But one of the fundamental
design requirements for Tahoe was privacy -- the server should NOT be able to
read the files. Thus rsync cannot work.
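To make the dependency on readable content concrete, here is a rough sketch
in Python of an rsync-style weak rolling checksum. It is illustrative only,
not rsync's actual code; the function names are made up for this example,
and the 16-bit modulus is an assumption based on the commonly described form
of the algorithm.

```python
M = 1 << 16  # modulus for both halves of the weak checksum (assumed)


def weak_checksum(block):
    """Compute the two running sums over a whole block from scratch."""
    n = len(block)
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the window.
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one byte to the right in constant time."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - n * out_byte + a_new) % M
    return a_new, b_new


def rolling_checksums(data, n):
    """Yield the weak checksum of every n-byte window in data."""
    if len(data) < n:
        return
    a, b = weak_checksum(data[:n])
    yield a, b
    for i in range(n, len(data)):
        a, b = roll(a, b, data[i - n], data[i], n)
        yield a, b
```

Every step here reads the plaintext bytes of the file. A storage server that
only holds ciphertext has nothing meaningful to feed into it.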
I'm working on a backup system that should address a lot of these issues. It
supports rsync-based incremental backups in a way that does work for
encrypted file stores, and is also somewhat restartable (it does a file
system scan and then uploads a log of the filesystem metadata before
uploading file data -- as long as that log is safely uploaded, the backup
system can be stopped and restarted, and it will pick up where it left off).
It also supports upload prioritization; it allows you to assign weights to
file and path patterns, so if it's going to take a month to get your data
backed up, the most important stuff will go first. It also applies a
secondary ordering by size, so that among files of the same priority, the
smallest ones are uploaded first. Unlike duplicity, my system should never
need more working space than about 2-3% of the size of the data being
backed up.
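The ordering described above (weight first, then smallest first within a
weight) could look something like the following Python sketch. This is not
the backup system's actual code or configuration format: the patterns, the
weights, the use of fnmatch-style globs, and the "highest matching weight
wins" rule are all assumptions made for illustration.

```python
import fnmatch

# Hypothetical pattern weights; higher means "upload sooner".
WEIGHTS = [
    ("*/Documents/*", 10),
    ("*.jpg", 5),
    ("*/cache/*", -5),
]
DEFAULT_WEIGHT = 0


def weight_for(path):
    """Return the highest weight among matching patterns (assumed rule)."""
    matches = [w for pattern, w in WEIGHTS if fnmatch.fnmatchcase(path, pattern)]
    return max(matches) if matches else DEFAULT_WEIGHT


def upload_order(files):
    """Sort (path, size) pairs by weight descending, then size ascending."""
    return sorted(files, key=lambda f: (-weight_for(f[0]), f[1]))
```

For example, given two files under Documents, one JPEG elsewhere, and a
cache file, upload_order() puts the smaller Documents file first, then the
larger one, then the JPEG, and the cache file last.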
I hope to get an alpha version out this weekend. It will only include the
backup code -- no restore capability, and no GUI, so it's still some distance
from being usable, but I want to throw it out for anyone who's interested in
helping me test it. Once the backup code is stabilized, I'll focus
on restore. Once that's working, I'll make the interface nice. The
release this weekend will probably require editing of Python files to
configure your backup, so it has a long way to go. :-)
Shawn.