[tahoe-dev] #510: thoughts about an HTTP-based storage-server protocol

Brian Warner warner-tahoe at allmydata.com
Wed Sep 10 13:31:15 PDT 2008


A lot of design work gets recorded in Trac tickets rather than on the
mailing list: I like to make a ticket for new ideas so that we don't
forget about them. Zooko asked me to copy ticket #510 to the mailing
list so that more people would see it:

START  http://allmydata.org/trac/tahoe/ticket/510

Zooko told me about an idea: use plain HTTP for the storage server
protocol, instead of foolscap. Here are some thoughts:

 * it could make Tahoe easier to standardize: the spec wouldn't have to
   include foolscap too
 * the description of the share format (all the hashes/signatures/etc)
   becomes the most important thing: most other aspects of the system
   can be inferred from this format (with peer selection being a
   significant omission)
 * download is easy: use GET with a URL of /shares/STORAGEINDEX/SHNUM,
   perhaps with an HTTP Range header if you only want a portion of
   the share (see the example requests after this list)
 * upload for immutable files is easy: PUT /shares/SI/SHNUM, which
   works only once
 * upload for mutable files:
   * implement DSA-based mutable files, in which the storage index is
     the hash of the public key (or maybe even equal to the public
     key)
   * the storage server is obligated to validate every bit of the
     share against the roothash, validate the roothash signature
     against the pubkey, and validate the pubkey against the storage
     index (these checks are sketched in code after this list)
   * the storage server will accept any share that validates up to the
     SI and has a seqnum higher than any existing share
   * if there is no existing share, the server will accept any valid
     share
   * when using Content-Range: (in some one-message equivalent of
     writev), the server validates the resulting share, which is some
     combination of the existing share and the deltas being written.
     (this is for MDMF where we're trying to modify just one segment,
     plus the modified hash chains, root hash, and signature)
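
To make both paths concrete, here is a minimal sketch in Python
(stdlib http.client only). The host, port, and storage-index string
are made up for illustration; the /shares/SI/SHNUM URL shape is the
one proposed above:

    import http.client

    # Hypothetical server address and storage index, for illustration.
    conn = http.client.HTTPConnection("storage.example.com", 8080)

    # Download: ask for just the first KiB of share 3 via a Range
    # header; the server answers 206 with a Content-Range header.
    conn.request("GET", "/shares/b2ui7nqjkrxkh6fnkjgdmdhgtu/3",
                 headers={"Range": "bytes=0-1023"})
    resp = conn.getresponse()
    assert resp.status == 206  # Partial Content
    first_kib = resp.read()

    # Immutable upload: PUT the whole share. The server should refuse
    # a second PUT to the same URL; the ticket doesn't pick a status
    # code for that, but 409 Conflict would be a natural choice.
    conn.request("PUT", "/shares/b2ui7nqjkrxkh6fnkjgdmdhgtu/3",
                 body=b"...share bytes...")
    resp = conn.getresponse()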
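
On the write side, the validation gate might look something like the
following sketch. Everything here is hypothetical: parse_share(),
compute_roothash(), and dsa_verify() stand in for whatever the real
share format and DSA library would provide, and SHA-256 is just one
way the "storage index = hash of pubkey" rule could be instantiated:

    import hashlib

    def accept_mutable_put(storage_index, new_share_bytes, old_share):
        # Sketch only: parse_share/compute_roothash/dsa_verify are
        # invented helper names, not real Tahoe APIs.
        share = parse_share(new_share_bytes)

        # 1. the pubkey must validate against the storage index
        #    (assuming SI = SHA-256 of the pubkey).
        if hashlib.sha256(share.pubkey).digest() != storage_index:
            return 403  # Forbidden
        # 2. the roothash signature must validate against the pubkey.
        if not dsa_verify(share.pubkey, share.roothash, share.signature):
            return 403
        # 3. every bit of the share must validate against the roothash.
        if compute_roothash(share) != share.roothash:
            return 403
        # 4. accept only a seqnum higher than any existing share (if
        #    there is no existing share, any valid share is accepted).
        if old_share is not None and share.seqnum <= old_share.seqnum:
            return 409  # Conflict
        return 200  # share accepted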

Switching to a validate-the-share scheme to control write access is
good and bad:

 * + repairers can create valid, readable, overwritable shares without
     access to the writecap.
 * - storage servers must do a lot of hashing and public key
     computation on every upload
 * - storage servers must know the format of the uploaded share, so
     clients cannot start using new formats without first upgrading
     all the storage servers

The result would be a share-transfer protocol that would look exactly
like HTTP, but it could not be safely implemented by a simple HTTP
server, because the PUT requests must be constrained by validating
the share. (A simple HTTP server doesn't really implement PUT
anyway.) There is a benefit to using "plain HTTP", but some of that
benefit is lost when HTTP is really being used as an RPC mechanism
(think of the way S3 uses HTTP).

It might be useful to have storage servers declare two separate
interfaces: a plain HTTP interface for read, and a separate port or
something for write. The read side could indeed be provided by a dumb
HTTP server like apache; the write side would need something slightly
more complicated. An apache module to provide the necessary
share-write checking would be fairly straightforward, though.
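
For the read side, something this small would do in principle. A
sketch only: the shares directory is made up, and the stdlib
SimpleHTTPRequestHandler does not honor Range requests, which is one
reason a real deployment would reach for apache instead:

    import functools
    import http.server

    # Serve GET /STORAGEINDEX/SHNUM straight off the filesystem from a
    # hypothetical /var/tahoe/shares layout. Read-only by default:
    # SimpleHTTPRequestHandler answers PUT with 501 Unsupported method.
    handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                                directory="/var/tahoe/shares")
    http.server.HTTPServer(("", 8080), handler).serve_forever()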

Hm, that makes me curious about the potential to write the entire
Tahoe node as an apache module: it could convert requests for
/ROOT/uri/FILECAP etc into share requests and FEC decoding...

END  http://allmydata.org/trac/tahoe/ticket/510

cheers,
 -Brian

