[tahoe-dev] Person with 3.5GB offsite backups thinks of using BitTorrent. I say Tahoe. What say you?
Kevin Reid
kpreid at mac.com
Sun Feb 14 04:54:33 PST 2010
From http://mmol-6453.livejournal.com/221488.html 2010-02-07 11:00:00:
> I've been thinking about a recurring issue... Namely that none of
> RCo's remote backup targets are very good. My home system has a
> problem remaining alive for any extended duration, and I don't have
> any other good prospective places I can trust to send the data.
> (No offense to anyone who's offered, but it's hard for me to totally
> trust someone I've never met in person.*)
>
> I may have hit on a novel solution, but I want to run it past a
> bunch of people (namely, you), before I do something this crazy.
>
> Step 1: Take backup on server
> Step 2: Compress backup to tarball.
> Step 3: Encrypt tarball using GPG and a long, long public key.
> Step 4: Build a torrent.
> Step 5: Add torrent to RSS feed.
> Step 6: Anyone who wants to help can point their torrent client at
> the RSS feed. Data's encrypted, so I don't have to worry. With
> enough seeder boxes out there, there can be several full copies out
> there.
>
> Bonus: Server data migration happens much, much faster. :)
>
> Steps 2 and 3 can be munched a bit by nesting the encryption in the
> tarball, using separate keys for each subdirectory, or even splitting
> a .tar.lzma, separately encrypting each chunk with a unique key,
> tarring that and using another key for encrypting that final tarball.
>
> And, yes, generating that many keys is problematic; it took me about
> five minutes to generate a 4kbit key yesterday as a test, as my home
> system didn't have enough entropy. That's solvable with either bulk
> data from random.org or using a hardware RNG. There's also key
> transport, but that's somewhat alleviated if I generate the key
> pairs at home, and then copy only the public keys to the server.
>
> * Yes, this is me we're talking about, and I realize the irony.
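For concreteness, steps 2 and 3 of the quoted plan could be sketched roughly as below. This is only an illustration, not the blog author's actual script: the function names, the file names, and the choice of gzip (rather than the lzma mentioned later in the post) are assumptions, and steps 4 and 5 (building the torrent and updating the RSS feed) are left out because the post does not say which tools would be used.

```python
import subprocess
from pathlib import Path


def backup_commands(src_dir, recipient, workdir="."):
    """Return the argv lists for step 2 (compress) and step 3 (encrypt).

    File names are hypothetical; `recipient` is the ID of a public key
    already imported on the server (the private key stays at home).
    """
    work = Path(workdir)
    tarball = work / "backup.tar.gz"
    encrypted = work / "backup.tar.gz.gpg"
    return [
        # Step 2: compress the backup directory into a tarball.
        ["tar", "-czf", str(tarball), "-C", str(src_dir), "."],
        # Step 3: encrypt the tarball to the public key with GnuPG.
        ["gpg", "--batch", "--yes", "--output", str(encrypted),
         "--recipient", recipient, "--encrypt", str(tarball)],
    ]


def run_backup(src_dir, recipient, workdir="."):
    """Run the commands in order, stopping at the first failure."""
    for argv in backup_commands(src_dir, recipient, workdir):
        subprocess.run(argv, check=True)
```

Because only the public key is needed for the encrypt step, this matches the post's idea of generating key pairs at home and copying only the public keys to the server.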
I replied:
> How about using Tahoe-LAFS instead? You get exactly the same privacy
> guarantees, but in Tahoe the uploader actually verifies that there
> are N full (well, erasure-coded) copies around, and there is a
> protocol for you to get the data back at any time rather than having
> to ask “hey, who's got a tarball?”
>
> Find enough mostly-reliable friends to contribute at least (12GB ×
> compression factor × number of backups per month) of storage space
> to the Tahoe-LAFS volunteer grid, and you’re all set.
>
> The only advantage I immediately see for BitTorrent is that
> Tahoe-LAFS doesn't have any cross-node data transfer: the uploader
> (or, if you set one up, a separate helper machine, which is not
> trusted with your unencrypted data) has to send out all of the
> erasure-coded data (3.33× expansion), not just a volume equal to
> the original.
>
> I'll see if I can get some comments from the real Tahoe folks about
> this use case...
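To make the arithmetic in the reply concrete, here is a rough sketch. It assumes the 3.33× expansion comes from a 3-of-10 encoding (any 3 of 10 shares rebuild the file), that each share lands on a different server, and that servers fail independently; the function names and the reading of "compression factor" as compressed size divided by original size are assumptions made for illustration. Under those assumptions a 3.5GB backup expands to roughly 11.7GB, which is presumably where the 12GB figure comes from.

```python
from math import comb


def expansion(k=3, n=10):
    """Expansion factor of a k-of-n erasure code (10/3 is about 3.33)."""
    return n / k


def stored_gb(backup_gb, k=3, n=10, compression=1.0, backups_kept=1):
    """Total grid storage consumed, and also what the uploader must send."""
    return backup_gb * compression * expansion(k, n) * backups_kept


def p_recoverable(p_up, k=3, n=10):
    """Probability that at least k of n share-holding servers are up,
    assuming each is up independently with probability p_up."""
    return sum(comb(n, i) * p_up**i * (1 - p_up)**(n - i)
               for i in range(k, n + 1))


if __name__ == "__main__":
    print(round(stored_gb(3.5), 1), "GB stored per backup")
    print(p_recoverable(0.8), "chance of recovery at 80% server uptime")
```

The same number governs both the space friends must donate and the upload volume, which is the BitTorrent/Tahoe difference the reply points out: a BitTorrent seeder uploads roughly one copy and lets peers exchange pieces, while the Tahoe uploader (or helper) pushes every share itself.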
Anyone have any further thoughts on the matter? Have I missed any
other BitTorrent/Tahoe differences worth pointing out?
--
Kevin Reid <http://switchb.org/kpreid/>