[tahoe-dev] core functionality improvements for v1.4.0

Brian Warner warner-tahoe at allmydata.com
Thu Sep 25 20:48:45 PDT 2008


On Wed, 24 Sep 2008 07:49:07 -0600
zooko <zooko at zooko.com> wrote:

> By the way, I really hope that you, Brian, allocate time in the future
> for some improvements to core Tahoe functionality which you are
> uniquely well prepared to work on.  Backupdb would be a nice
> improvement, and you would do an excellent job of designing and
> implementing it, but there are a lot of other people who could also
> design and implement a backupdb.  There are relatively few other
> people who could contribute to these core pieces of Tahoe:
> 
> Things that I'm slightly embarrassed that Tahoe doesn't already do:
> 
>   * #483 repairer service (this one is my job for v1.3.0)
>   * #119 lease expiration / deletion / garbage-collection
>   * #320 add streaming upload to HTTP interface
>   * #346 increase share-size field to 8 bytes, remove 12GiB filesize
>      limit
> 
> Things that dramatically increase performance (in one case) and make
> capabilities actually usable:
> 
>   * #217 DSA-based mutable files -- small URLs, fast file creation
> 
> Things that open up the way to standardization and re-implementation
> of Tahoe and fit it into more deployment scenarios:
> 
>   * #510 use plain HTTP for storage server protocol?
> 
> Let's make a plan for when to make progress on some of these.  Do you
> think we can do one of these in v1.4.0 before the end of the year?

Probably one, maybe two. Our track record of getting features done in a
reasonable amount of time is not great: our team is just too small.

#510 (HTTP storage-server protocol) is a nice idea, and we have a general
idea of what it would involve, but it doesn't actually make anything easier
in the short-term, and it requires at least #217 and a new immutable format
(which would probably be part of #346), and interacts strongly with
Accounting (which we really haven't figured out yet). So that one is going to
take a while.

#217 depends upon #331 (unfluffy pubkeys), and is an opportunity to fix some
other issues (cleaner URI format #432 and #102). It's mostly data structure
work, but the URI changes could drag it out a bit.

#346 is nominally just a change to the share format on-disk, but that adds a
versioning issue (storage servers need to be able to handle both old and new
shares, and we need unit tests for both), and it might be nice to accommodate
whatever gc/accounting lease scheme we want while we're rearranging the share
format.
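
To make the versioning concern concrete, here is a minimal sketch of a reader
that accepts both a narrow and a wide size field. The header layout, the
version numbers, and the names pack_header/parse_header are all assumptions
made up for illustration; this is not the real share format.

```python
import struct

# Hypothetical header layouts, NOT Tahoe's real on-disk share format.
# v1: version (4 bytes) + share size (4 bytes), so sizes must be < 2**32
# v2: version (4 bytes) + share size (8 bytes)
V1 = struct.Struct(">LL")
V2 = struct.Struct(">LQ")


def pack_header(share_size):
    """Write the old layout when the size fits, the new one otherwise."""
    if share_size < 2**32:
        return V1.pack(1, share_size)
    return V2.pack(2, share_size)


def parse_header(data):
    """Return (version, share_size) for either layout."""
    # The version field sits at the same offset in both layouts,
    # so it can be read first and used to pick the right parser.
    (version,) = struct.unpack_from(">L", data, 0)
    if version == 1:
        return V1.unpack_from(data, 0)
    if version == 2:
        return V2.unpack_from(data, 0)
    raise ValueError("unknown share version: %d" % version)
```

The unit tests for "both" would then be round-trips through each layout, plus
a check that an unknown version is rejected rather than misparsed.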

#119 is really accounting too, in that files can be kept alive by either a
parent directory or an account-based reference. We've made some good design
progress in the last few months, but I think we're still another Zooko-trip
away from a coherent design. The information we'll want to put into each
share will change, so part of the design will involve squeezing the
leases/references into the existing share format (even if/when we do #346,
we've got 40TB of old-format shares that will need leases too).

#320 is a good goal, but I don't see it solving any immediate problems, and
the design will depend upon what sort of goal we care about: is it doing a
big PUT? A PUT with a bunch of Chunked-Encoding pieces? Some sort of
application-visible "upload handle" to which the client does multiple HTTP
POSTs? It's a tradeoff between speed, disk usage, memory usage, and
complexity. If too many storage servers fail during upload, would you rather
have spent the disk space on buffering, or spend the cpu/network/bandwidth on
restarting the upload? It also ties into improvements to resume interrupted
uploads, which involves storage-API changes. The big PUT approach involves
changes to twisted.web that I don't understand (although we know people who
have done this before, and we'd get their help). The upload-handle approach
involves webapi changes.
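
As a rough illustration of the upload-handle shape, here is a minimal sketch
in which the body is consumed in fixed-size pieces, so memory use stays
bounded no matter how large the file is. UploadHandle, iter_chunks, upload,
and the 64KiB piece size are hypothetical names and values chosen for this
example, not an existing webapi; each append() merely stands in for one
HTTP POST.

```python
import io


def iter_chunks(fileobj, chunk_size=64 * 1024):
    """Yield the body piece by piece; only one piece is in memory at a time."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


class UploadHandle:
    """Hypothetical upload handle: each append() stands in for one POST."""

    def __init__(self):
        self.offset = 0      # bytes accepted so far; a resume would start here
        self.closed = False

    def append(self, chunk):
        if self.closed:
            raise ValueError("upload already finished")
        # A real node would encode this piece and push it to storage servers.
        self.offset += len(chunk)
        return self.offset

    def close(self):
        self.closed = True
        return self.offset


def upload(fileobj, chunk_size=64 * 1024):
    """Feed a file-like object through a handle; return total bytes accepted."""
    handle = UploadHandle()
    for chunk in iter_chunks(fileobj, chunk_size):
        handle.append(chunk)
    return handle.close()


if __name__ == "__main__":
    print(upload(io.BytesIO(b"x" * 150000)))
```

The sketch sidesteps the hard part of the tradeoff: it buffers nothing, so
if too many servers failed partway through, the only option here would be to
restart from the beginning.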

If I were working on this stuff entirely by myself, and not working on
anything else, here's what I'd currently guess each would take:

 * #346 >12GiB immutable files: two weeks
 * #217 DSA mutable files: three weeks
 * #320 streaming HTTP upload: three weeks plus help with twisted.web
 * #119 gc/accounting: five weeks

My current near-term task list includes the following items (with relative
priorities that are open to discussion, particularly with my boss :-):

 * #466 signed/extensible introducer, then secure prodnet with it (1wk+#331)
 * #174 make argv[0] be "tahoe" not "twistd", add BASEDIR to argv (two days)
 * assist with checker/repairer automation
 * #512 FTP frontend
 * #287 overhaul download to tolerate lost/slow servers (three weeks? four?)
 * 'tahoe backup' command, backupdb (two weeks?)

(note: 'tahoe backup' is on that list because it would be personally useful
to me: I have not started using Tahoe to back up my own photograph
collection, because I want a simple+efficient linux tool to do that backup.
So I've got a different sort of motivation to have that implemented than the
rest of these tickets. I agree that somebody else could write 'tahoe backup'
while I work on something listed above, but so far nobody has done so :-).

And then my anticipated medium-term task list has:

 * #518 single-file node configuration (three days)
 * improve upload speed with pipelining
 * #284 helper farm via introducer
 * #480/#390 mutable-upload vs readonly servers
 * handle disk-full situations cleanly
 * #68 distributed introducer
 * #484 client feedback channel

And of course the usual fires that pop up at a rate of about two per week.


So, I'd start with #346 (>12GiB immutable files), since that is the easiest
and gives us an immediate improvement (I think we see one or two large-file
upload attempts on the prodnet each week). Then I'd reluctantly put #119
(gc/accounting) ahead of #217 (DSA mutable files), because even though I
would really love to share "D18GeGYBSLAodrPuG" with someone instead of
URI:DIR2:kblqqqo47sswqfte3hodeou8q3:lrvvffwa6qa5wvhsvndjh2fxyuzqvxfb7q6qaihngbxseso4qh2a,
I'm more concerned about allmydata.com being able to efficiently find out how
much storage space they're using.

My current projects, plus #346 and #119 and other scheduling probabilities,
will pretty much fill the rest of the year.


Sigh. If only there were more hours in the day :).

cheers,
 -Brian

