[tahoe-dev] Person with 3.5GB offsite backups thinks of using BitTorrent. I say Tahoe. What say you?

Kevin Reid kpreid at mac.com
Sun Feb 14 04:54:33 PST 2010


 From http://mmol-6453.livejournal.com/221488.html 2010-02-07 11:00:00:

> I've been thinking about a recurring issue... namely, that none of
> RCo's remote backup targets are very good. My home system has a
> problem remaining alive for any extended duration, and I don't have
> any other good prospective places I can trust to send the data.
> (No offense to anyone who's offered, but it's hard for me to totally  
> trust someone I've never met in person.*)
>
> I may have hit on a novel solution, but I want to run it past a  
> bunch of people (namely, you), before I do something this crazy.
> 
> Step 1: Take backup on server
> Step 2: Compress backup to tarball.
> Step 3: Encrypt tarball using GPG and a long, long public key.
> Step 4: Build a torrent.
> Step 5: Add torrent to RSS feed.
> Step 6: Anyone who wants to help can point their torrent client at  
> the RSS feed. Data's encrypted, so I don't have to worry. With  
> enough seeder boxes out there, there can be several full copies out  
> there.
>
> Bonus: Server data migration happens much, much faster. :)
>
> Steps 2 and 3 can be munched a bit by nesting the encryption in the
> tarball, using separate keys for each subdirectory, or even splitting
> a .tar.lzma, separately encrypting each chunk with a unique key,
> tarring that and using another key for encrypting that final tarball.
>
> And, yes, generating that many keys is problematic; it took me about  
> five minutes to generate a 4kbit key yesterday as a test, as my home  
> system didn't have enough entropy. That's solvable with either bulk  
> data from random.org or using a hardware RNG. There's also key  
> transport, but that's somewhat alleviated if I generate the key  
> pairs at home, and then copy only the public keys to the server.
>
> * Yes, this is me we're talking about, and I realize the irony.

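Steps 2 and 3 of the quoted plan could be sketched roughly as follows. This is only an illustration, not the original poster's script: the function names, the gzip compression, and the recipient key ID are assumptions, and the gpg command is built but not executed here.

```python
import tarfile
from pathlib import Path
from typing import List


def make_tarball(source_dir: str, tarball_path: str) -> str:
    """Step 2: pack a backup directory into a gzip-compressed tarball."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        # Store paths relative to the directory name, not the absolute path.
        tar.add(source_dir, arcname=Path(source_dir).name)
    return tarball_path


def gpg_encrypt_command(tarball_path: str, recipient: str) -> List[str]:
    """Step 3: build (but do not run) a gpg public-key encryption command.

    `recipient` is a hypothetical key ID; only the public key needs to be
    on the server, as the quoted post suggests.
    """
    return [
        "gpg", "--encrypt",
        "--recipient", recipient,
        "--output", tarball_path + ".gpg",
        tarball_path,
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine that has gpg and the recipient's public key installed.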

I replied:

> How about using Tahoe-LAFS instead? You get exactly the same privacy  
> guarantees, but in Tahoe the uploader actually verifies that there  
> are N full (well, erasure-coded) copies around, and there is a  
> protocol for you to get the data back at any time rather than having  
> to ask “hey, who's got a tarball?”
>
> Find enough mostly-reliable friends to contribute at least (12GB ×  
> compression factor × number of backups per month) of storage space  
> to the Tahoe-LAFS volunteer grid, and you’re all set.
>
> The only advantage I immediately see for BitTorrent is that Tahoe-LAFS
> doesn't have any cross-node data transfer; the uploader (or, if you
> set one up, a separate helper machine, which is not trusted with
> your unencrypted data) has to send out all of the erasure-coded data
> (3.33× expansion), not just a volume equal to the original.
>
> I'll see if I can get some comments from the real Tahoe folks about  
> this use case...

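The arithmetic in my reply can be worked through with a small sketch. It assumes the 3.33× figure corresponds to 3-of-10 erasure coding (10/3 ≈ 3.33); the compression factor of 0.5 and the four backups per month are made-up example values, and the function names are hypothetical.

```python
def expansion_factor(needed: int, total: int) -> float:
    """Ratio of erasure-coded bytes to original bytes for needed-of-total coding."""
    if not 0 < needed <= total:
        raise ValueError("need 0 < needed <= total")
    return total / needed


def storage_estimate_gb(backup_gb: float, compression_factor: float,
                        backups_per_month: int,
                        needed: int = 3, total: int = 10):
    """Return (lower_bound_gb, expanded_gb) for one month of backups.

    lower_bound_gb is the figure from the reply:
        backup size x compression factor x backups per month.
    expanded_gb additionally multiplies by the erasure-coding expansion,
    which is also the volume the uploader (or helper) has to send out.
    """
    lower_bound = backup_gb * compression_factor * backups_per_month
    expanded = lower_bound * total / needed
    return lower_bound, expanded


# Example values (assumptions): 12 GB backups, compressing to half size,
# four backups kept per month.
lower_bound, expanded = storage_estimate_gb(12, 0.5, 4)
print(f"expansion: {expansion_factor(3, 10):.2f}x")
print(f"lower bound: {lower_bound:.1f} GB, with expansion: {expanded:.1f} GB")
```

With these example numbers the reply's lower bound is 24 GB, and about 80 GB once the expansion is counted.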

Anyone have any further thoughts on the matter? Have I missed any  
other BitTorrent/Tahoe differences worth pointing out?

-- 
Kevin Reid                                  <http://switchb.org/kpreid/>





