[tahoe-dev] Tahoe performance

Shawn Willden shawn-tahoe at willden.org
Wed Feb 11 00:18:46 PST 2009


On Wednesday 11 February 2009 12:18:07 am Lazy Stream wrote:
> Why is anything necessary other than "mount Tahoe as a filesystem"
> followed by "rsync -av /home/you/ /mnt/tahoe" or "rsync.exe -av
> \users\you z:\" ? Restartable, orthogonal, smurfilicious.

Can't work.

Rsync can only work its magic if the remote machine can read the files (and 
can run a copy of the rsync program).  The remote rsync instance needs to be 
able to generate rolling checksums of the content and send them to the other 
end, which uses them to figure out what's changed.  But one of the 
fundamental design requirements for Tahoe is privacy -- the server should NOT 
be able to read the files.  Thus rsync's delta transfer cannot work.
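
For anyone who hasn't looked at how the rolling checksum works, here is a 
rough Python sketch of the weak checksum described in the rsync algorithm.  
The function names are mine for illustration, not rsync's actual code, and 
real rsync pairs this with a stronger per-block hash.  The thing to notice 
is that every step reads the plaintext bytes of the file, which is exactly 
what a Tahoe storage server never gets to see.

```python
M = 1 << 16  # both sums are kept modulo 2^16


def weak_checksum(block: bytes):
    """Compute the two running sums (a, b) for one block from scratch."""
    n = len(block)
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the block.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window forward by one byte in constant time."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - n * out_byte + a_new) % M
    return a_new, b_new


def digest(a, b):
    """Pack the two 16-bit sums into a single 32-bit value."""
    return a | (b << 16)
```

Sliding the window with roll() gives the same result as recomputing 
weak_checksum() at each offset, which is what makes it cheap to search for 
matching blocks at every byte position.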

I'm working on a backup system that should address a lot of these issues.  It 
supports rsync-based incremental backups in a way that does work for 
encrypted file stores, and is also somewhat restartable: it does a filesystem 
scan and then uploads a log of the filesystem metadata before uploading any 
file data.  As long as that log is safely uploaded, the backup can be stopped 
and restarted, and it will pick up where it left off.  
It also supports upload prioritization; it allows you to assign weights to 
file and path patterns, so if it's going to take a month to get your data 
backed up, the most important stuff will go first.  It also applies a 
secondary ordering by size, so that among files of the same priority, the 
smallest ones are uploaded first.  Unlike duplicity, my system should never 
need more working space than about 2-3% of the size of the data being 
backed up.
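
To make the ordering concrete, here is a minimal Python sketch of the idea: 
weights attached to glob-style patterns, highest weight first, and smallest 
file first within a weight.  This is only an illustration, not the code 
from my backup system.  The names (weight_for, upload_order), the rule 
format, the default weight of 0, and the "first matching pattern wins" 
policy are all assumptions made for the example.

```python
from fnmatch import fnmatchcase


def weight_for(path, rules, default=0):
    """Return the weight of the first rule whose pattern matches path.

    rules is a list of (pattern, weight) pairs; "first match wins" and
    the default weight are assumptions for this sketch.
    """
    for pattern, weight in rules:
        if fnmatchcase(path, pattern):
            return weight
    return default


def upload_order(files, rules):
    """Sort (path, size) pairs: highest weight first, then smallest size."""
    return sorted(files, key=lambda f: (-weight_for(f[0], rules), f[1]))
```

For example, with rules [("*/Documents/*", 10), ("*.iso", -5)], everything 
under a Documents directory goes up first, smallest files first, and the 
ISO images go last.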

I hope to get an alpha version out this weekend.  It will only include the 
backup code -- no restore capability, and no GUI, so it's still some distance 
from being usable, but I want to throw it out for anyone who's interested in 
helping me test it.  After I get the backup code stabilized, then I'll focus 
on restore.  After that's working, then I'll make the interface nice.  The 
release this weekend will probably require editing of Python files to 
configure your backup, so it has a long way to go. :-)

	Shawn.

