[tahoe-dev] Potential use for personal backup
Zooko Wilcox-O'Hearn
zooko at zooko.com
Tue May 22 19:01:58 UTC 2012
Dear Saint Germain:
I think Tahoe-LAFS is best understood by (at least temporarily)
forgetting all about a "filesystem" like ext3 or ZFS. Forget
about that. It isn't a filesystem! (Until you get to the end of this
letter.)
• Start by thinking of it as an application, like BitTorrent, which
can be manually triggered to download a single file. (BitTorrent can
only download; Tahoe-LAFS can upload as well as download.) Or
think of it as being more like "scp". You can run "scp $LOCALFILE
$REMOTEHOST:$REMOTELOCATION", and you can run "tahoe put $LOCALFILE
$DIRCAP/$GRIDLOCATION".
So, if you run the Tahoe-LAFS client on your computer, and you ask it
to upload a file with the command-line above, it will transfer the
file contents (encrypted) to one or more servers (using erasure coding
to spread the shares out among multiple servers).
• Next, understand that it can upload or download whole sets of files
by recursively traversing a directory and uploading each file it finds
therein. "tahoe cp -r $LOCALDIRECTORY $DIRCAP/$GRIDLOCATION"
• Next, that it deduplicates any file which is identical to any other
file, but doesn't do deltas, compression, block-level dedup, or
otherwise do anything smart with a file that is not bitwise identical
to another file.
• Next, that the "tahoe backup" command (but not the other commands
such as "tahoe cp") will check the timestamp on your local file, and if
it already uploaded the file and the timestamp hasn't changed, then it
will not upload it again. (Except occasionally it randomly uploads it
again if it has been a while since it last uploaded it, just to make
sure that the copy of it on the server is still good.)
The fact that "tahoe backup" checks timestamps on your local files and
skips ones that don't appear to have been changed is one of the major
differences between "tahoe backup" and "tahoe cp -r". The other major
difference is that "tahoe backup" keeps links to all of the versions
that have been uploaded to the grid, so you can navigate among old
versions stored in the grid. In contrast, "tahoe cp -r" unlinks the
previous version from the grid directory and links the new version
into place, so unless you have a link to the older version stored
somewhere else, you'll never be able to get back to it.
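To make the timestamp check concrete, here is a minimal sketch in
Python. It is not the real "tahoe backup" code: the "seen" dictionary
is a hypothetical stand-in for whatever local record the command keeps,
comparing the file size as well as the timestamp is my own assumption,
and the occasional random re-upload is left out.

```python
import os
import tempfile


def needs_upload(path, seen):
    """Return True if `path` looks changed since it was last recorded.

    `seen` is a hypothetical in-memory record mapping a path to the
    (mtime_ns, size) pair observed when the file was last uploaded.
    """
    st = os.stat(path)
    fingerprint = (st.st_mtime_ns, st.st_size)
    if seen.get(path) == fingerprint:
        return False  # timestamp and size unchanged: skip this file
    seen[path] = fingerprint  # remember what we are about to upload
    return True


# Small demonstration on a temporary file.
seen = {}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "notes.txt")
    with open(path, "w") as f:
        f.write("version 1")
    print(needs_upload(path, seen))  # first run: nothing recorded yet
    print(needs_upload(path, seen))  # second run: unchanged, so skipped
```

Note that a check like this looks only at metadata. A file whose
contents changed without its timestamp changing would be skipped.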
Okay, now you understand the core functionality of Tahoe-LAFS,
including the fact that it will probably be too inefficient to manage
your 1 GB virtual machine images with "tahoe cp" or "tahoe backup".
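A rough way to see why is the following Python sketch of whole-file
deduplication. It is only a model under my own assumptions, not how
Tahoe-LAFS implements it: a SHA-256 digest stands in for "bitwise
identical", the "stored" set stands in for the grid, and the image is
scaled down to 1 MB so that it runs quickly.

```python
import hashlib


def file_identity(data: bytes) -> str:
    # Stand-in for "bitwise identical": a digest of the whole content.
    return hashlib.sha256(data).hexdigest()


def bytes_to_upload(files, stored):
    """Count the bytes that would be sent for `files`.

    `stored` is a set of identities already present (a hypothetical
    model of the grid). A file is skipped only if its whole content
    matches something already stored.
    """
    total = 0
    for data in files:
        ident = file_identity(data)
        if ident not in stored:
            stored.add(ident)
            total += len(data)
    return total


stored = set()
image = bytes(1_000_000)          # pretend virtual machine image
changed = b"\x01" + image[1:]     # same image with one byte altered

print(bytes_to_upload([image, image], stored))  # identical copy is free
print(bytes_to_upload([changed], stored))       # one byte changed
```

In this model the second call sends the entire image again, because
nothing smaller than a whole file is ever compared.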
• Now, can it be used for your purposes? Maybe! There are at least
three different "front ends" that you could try that might fit your
needs:
Option 1: The drop-upload feature:
https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/drop-upload.rst?rev=5486
The drop-upload feature is inspired by the behavior of Dropbox, but it
provides only half of Dropbox's functionality. The Tahoe-LAFS daemon
watches a directory that you specify, and whenever a file is added or
changed in that directory, it backs it up.
That's all. It works only on Linux. There's an issue ticket open to
track our progress on making it work on Windows (help wanted!): #1431.
Option 2: FUSE
Unix people seem to think that this is the only way to go. "If it
isn't FUSE it's CRAP!" seems to be the motto of unix heads. But I
haven't heard a lot of reports of people using it successfully. I
suspect the other two options are better for most purposes. But if you
want to try it and let us know how it works, start here:
https://tahoe-lafs.org/trac/tahoe-lafs/wiki/FAQ#Q23_FUSE
Option 3: duplicity or duplicati
Duplicity is a program that uses the rsync algorithm (via librsync) to
delta-compress successive versions of each file. It can be configured
to use Tahoe-LAFS as its backend. Duplicati is a separate project that
set out to rewrite duplicity.
I haven't heard any reports of people using these with Tahoe-LAFS yet,
but it sounds like a good idea to me if what you need is to back up
large files like virtual machine images.
http://duplicity.nongnu.org/features.html
http://duplicati.com/
I hope this helps! I would very much like to read a letter from you to
the tahoe-dev list saying whether you tried to use Tahoe-LAFS, and how
you tried to use it, and how well it worked.
Regards,
Zooko
#1431: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1431 (drop-upload on Windows)