[tahoe-dev] Getting my root writecap for the production grid

Brian Warner warner-tahoe at allmydata.com
Wed Jul 30 18:17:37 PDT 2008


> 1. Hopefully the client can mark all "important" files simply by
> traversing down from the root URI.

Yeah, that's the plan. I'm thinking POST /uri/ROOTDIRCAP?t=deep-renew, which
will recursively update the lease timer on everything you can reach from that
root. I'm figuring we do this once a week or once a month, and then have the
servers flag anything with a timer more than a few months old as garbage.
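
As a rough sketch of what issuing that request might look like from a client (t=deep-renew is only a proposal at this point, so the operation name could change; the node address and the placeholder cap below are assumptions made up for illustration):

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a local Tahoe node is serving the webapi at this address.
NODE_URL = "http://127.0.0.1:3456"


def build_deep_renew_request(rootcap, node_url=NODE_URL):
    """Build (but do not send) the proposed POST /uri/ROOTDIRCAP?t=deep-renew."""
    # Caps contain ':' characters, so percent-encode the whole cap
    # before placing it in the URL path.
    url = "%s/uri/%s?t=deep-renew" % (node_url.rstrip("/"), quote(rootcap, safe=""))
    # An empty body with an explicit method makes this a POST.
    return Request(url, data=b"", method="POST")


# Placeholder cap for illustration only, not a real rootcap.
req = build_deep_renew_request("URI:DIR2:example")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen() would actually send it; a weekly or monthly cronjob could do just that.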

There's a whole bunch of tradeoffs here: reliability vs traffic vs garbage. I
have a diagram (offline) of the issues: if you put renewal time on one axis,
and expiration time on the other, then you get three irreconcilable
pressures: long renewal time to get low traffic, short expiration time to get
minimal garbage, large expire/renewal ratio to get high reliability.
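
To put rough numbers on those three pressures, here is a back-of-the-envelope sketch. It assumes renewal passes are perfectly periodic and that a renewal arriving exactly at the expiration time still counts; the function and key names are invented for this example.

```python
def lease_tradeoffs(renew_days, expire_days):
    """Return rough figures for one (renewal, expiration) choice."""
    if renew_days <= 0 or expire_days < renew_days:
        raise ValueError("need 0 < renew_days <= expire_days")
    return {
        # Traffic: how many renewal passes each share sees per year.
        "renewals_per_year": 365.0 / renew_days,
        # Garbage: an abandoned share can linger this long after its
        # last renewal before the server may reclaim it.
        "max_garbage_days": expire_days,
        # Reliability: consecutive renewal passes that can fail before
        # the lease lapses (all but the last pass inside the window).
        "missed_renewals_tolerated": int(expire_days // renew_days) - 1,
    }


weekly = lease_tradeoffs(renew_days=7, expire_days=90)
monthly = lease_tradeoffs(renew_days=30, expire_days=90)
print(weekly)
print(monthly)
```

With a 90-day expiration, weekly renewal tolerates 11 consecutive missed passes but costs about 52 passes a year, while monthly renewal costs about 12 passes a year and tolerates only 2. Shortening the expiration to cut garbage lowers the tolerance in both cases.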

> Better yet, why not garbage collect any files that aren't linked to a name
> inside any directory linked from the root (perhaps through more
> directories)?

That's also basically the plan, except for files that have been uploaded
outside a directory structure (the "unlinked upload" feature) that somebody
wants to retain. The main issue here is who is able to renew the leases. The
servers can't see inside the directories.

We've been planning to make lease-renewal something that you can safely
delegate to someone else, a sort of "renewer service": they get to renew your
leases for you, but they don't get to read your plaintext. For example, it
would be completely appropriate for Allmydata to provide a service like this.

One current vague plan has been for the client to give a list of their
renewal-caps (basically the same as a verify-cap, which t=manifest has
returned since v1.0.0) to the renewer service. The client would walk their
rootcaps every once in a while (maybe once a day) to build up their current
manifest (a list of everything they want to keep around), then give it to
this service. The service would then take responsibility for renewing the
leases every week. The client could go offline for an extended period of
time, but the renewer would keep their files alive. The benefit here is that
clients don't need to share their rootcaps with the renewer; the downside is
that they have to do a recursive walk to figure out what *does* need to be
given to the renewer, and that they need to have a pretty comprehensive view
of what their rootcaps are (so they don't tell the renewer to forget about
something that is not actually garbage).
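
The client-side walk could be sketched roughly like this. The dict below is a hypothetical in-memory stand-in for the grid (each directory cap maps to its children's caps), and caps are treated as opaque strings; a real manifest would hold verify-caps or renewal-caps rather than anything that grants read access.

```python
def build_manifest(rootcaps, children_of):
    """Return the set of every cap reachable from any of the rootcaps."""
    manifest = set()
    stack = list(rootcaps)
    while stack:
        cap = stack.pop()
        if cap in manifest:
            continue  # already visited (directories may be shared or cyclic)
        manifest.add(cap)
        # File caps have no entry in the mapping, so they contribute no children.
        stack.extend(children_of.get(cap, ()))
    return manifest


# Toy grid: "dirA" links back to "root", and "other" is a second rootcap.
grid = {"root": ["dirA", "f1"], "dirA": ["f2", "root"], "other": ["f3"]}
print(sorted(build_manifest(["root"], grid)))
print(sorted(build_manifest(["root", "other"], grid)))
```

The second call shows the downside mentioned above: if the client forgets about the "other" rootcap, "f3" drops out of the manifest and the renewer would stop renewing it.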

There are problems with that plan, so as an intermediate position I've been
thinking about abandoning the manifest-of-verifycaps scheme and just
concentrating on the recursive walk. Somebody would be responsible for doing
a deep-renew from their rootcap on a regular basis. That somebody might be
the client, or it might be somebody else.

We're planning to introduce "traversal caps" in the new DSA-based dirnodes
(ticket #217), which would let the holder do a recursive walk but not see any
of the plaintext. With these, the renewer agent (i.e. allmydata) could hold
the traversal cap and renew your files for you, even though we can't see your
plaintext.

In the meantime, allmydata currently retains rootcaps for all our customers
(both to provide password recovery and to let us do recursive traversals like
this). So we could just have our own servers do a recursive walk-and-renew on
all files reachable from those rootcaps. When we get #217 done and switch
people over to DSA-based files, we can retain the traversal-cap instead (and
at least put the rootcap in a different place, available for password
recovery but not generally available to our deep-renew cronjob).

 (the action item for 3rd-party allmydata client code is to be prepared to do
 this deep-renew operation on any directories that allmydata doesn't know
 about, since we won't be able to do it for you)

> 3. Also, if anyone is interested, I would be willing to open source
> (at least) the lowest level Tahoe client portion of my project. That
> portion is a very simple Ruby library for performing basic Tahoe operations
> (put, get, rename, delete, mkdir, list, attributes). Similar to the AWS::S3
> (amazon.rubyforge.org) library.

That's great. Is this a front-end to the Tahoe webapi?

cheers,
 -Brian

