[tahoe-dev] backup, revision control

Brian Warner warner at lothar.com
Mon Jan 17 21:35:08 UTC 2011


On 1/16/11 4:53 AM, Greg Troxel wrote:
> 
>   Command line tools for tahoe are less functional than WUI, so it's
>   too tempting to use the WUI, which means firefox/etc. handles caps,
>   which is obviously unsafe. Getting to the point where I don't want
>   to use the WUI beyond seeing server status is one of my gating
>   conditions before real use.

Yeah, we may need some tahoe-specific GUI (which speaks HTTP to
localhost behind the scenes), to keep the caps away from
traditionally-leaky browsers, but still make things easier than a CLI.

>   With gpg, one uses the agent which holds the private key, and goes to
>   great lengths to wipe memory, avoid swapping, etc.  I have no reason
>   to believe that the python code in tahoe client/server does this, but
>   maybe I'm totally confused on this point.

You're right, the tahoe code makes no attempt to protect secrets in RAM.
In addition to being really difficult in general, a language like python
probably makes it impossible. When I was looking into implementing the
Pynchon Gate in python, I concluded that you'd want to have encrypted
swap, use a separate process for long-term keys to achieve some level of
forward-security, and give up on protecting against compromise of a live
system.


Yeah, in general, backup systems are either aggregating
(tar-then-encrypt-then-upload) or fine-grained (encrypt-then-upload).
Aggregating systems tend to be faster and more efficient for a single
snapshot, while fine-grained systems tend to re-use space between
multiple snapshots better and enable sharing of individual files or
directories.
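
To make the space trade-off concrete, here is a rough sketch. It is not
Tahoe code: a SHA-256 hash stands in for the encrypt-and-name step, the
"server" is just a dict, and every function name is made up for
illustration.

```python
import hashlib
import io
import tarfile


def aggregate_snapshot(files):
    """Aggregating style: tar everything into one blob, store it whole.

    `files` maps name -> bytes. Returns {object_id: blob}.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name=name)  # mtime defaults to 0
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    blob = buf.getvalue()
    return {hashlib.sha256(blob).hexdigest(): blob}


def fine_grained_snapshot(files):
    """Fine-grained style: one stored object per file, keyed by content."""
    return {hashlib.sha256(data).hexdigest(): data for data in files.values()}


def new_bytes(store, snapshot):
    """Upload whatever `store` lacks; return how many bytes were added."""
    added = 0
    for key, blob in snapshot.items():
        if key not in store:
            store[key] = blob
            added += len(blob)
    return added
```

With two 1000-byte files, one of which changes between snapshots, the
fine-grained store takes only 1000 new bytes for the second snapshot,
while the aggregated store takes the whole tarball again.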

Tahoe falls into the second category. Git is kind of in-between. Git
gets its speed (specifically its super-efficient network transport,
moving as little data as possible in one or two roundtrips) because both
sides of the wire get to know about the object graph. To make Tahoe work
this fast, we'd have to reveal the directory structure (or something
similar) to the storage servers. We'd also probably need to combine
small files together into single bundles to make things more efficient,
which would complicate the fine-grained sharing thing.
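
As a rough illustration of why knowing the object graph helps, here is
a simplified sketch of the idea (not Git's actual wire protocol, and
the names are hypothetical): if the sender can walk the graph, it can
work out exactly which objects the receiver lacks from just a "want"
list and a "have" list.

```python
def reachable(graph, roots):
    """Return every object id reachable from `roots`.

    `graph` maps an object id to the ids it points at (commit -> tree
    and parent commit, tree -> files, and so on).
    """
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def objects_to_send(graph, want, have):
    """Objects needed for `want` that are not already implied by `have`."""
    return reachable(graph, want) - reachable(graph, have)
```

A storage server that sees only opaque encrypted shares cannot do this
walk, which is why matching that speed would mean revealing the
directory structure (or something like it).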

cheers,
 -Brian

