[tahoe-dev] #510: thoughts about an HTTP-based storage-server protocol
Brian Warner
warner-tahoe at allmydata.com
Wed Sep 10 13:31:15 PDT 2008
There is a lot of design work that gets recorded in Trac tickets
rather than on the mailing list: I like to make a ticket for new ideas
so that we don't forget about them. Zooko asked me to copy ticket #510
onto the mailing list so that more people would see it:
START http://allmydata.org/trac/tahoe/ticket/510
Zooko told me about an idea: use plain HTTP for the storage server
protocol, instead of foolscap. Here are some thoughts:
* it could make Tahoe easier to standardize: the spec wouldn't have to
  include foolscap too
* the description of the share format (all the hashes/signatures/etc)
  becomes the most important thing: most other aspects of the system
  can be inferred from this format (with peer selection being a
  significant omission)
* download is easy: use GET with a URL of /shares/STORAGEINDEX/SHNUM,
  perhaps with an HTTP Range header if you only want a portion of the
  share (Range: goes in the request; Content-Range: shows up in the
  server's 206 response)
* upload for immutable files is easy: PUT /shares/SI/SHNUM, which
  works only once (a client-side sketch of both operations follows
  this list)
* upload for mutable files:
  * implement DSA-based mutable files, in which the storage index is
    the hash of the public key (or maybe even equal to the public
    key)
  * the storage server is obligated to validate every bit of the
    share against the roothash, validate the roothash signature
    against the pubkey, and validate the pubkey against the storage
    index (a sketch of this validation also follows the list)
  * the storage server will accept any share that validates up to the
    SI and has a seqnum higher than any existing share
  * if there is no existing share, the server will accept any valid
    share
  * when using Content-Range: (in some one-message equivalent of
    writev), the server validates the resulting share, which is some
    combination of the existing share and the deltas being written.
    (this is for MDMF, where we're trying to modify just one segment,
    plus the modified hash chains, root hash, and signature)
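To make the download and immutable-upload bullets concrete, here's a
minimal client-side sketch in Python (using http.client). The host,
port, base32 rendering of the storage index, and the exact response
codes are all assumptions, since none of this is implemented yet:

    import http.client

    HOST, PORT = "storage.example.com", 8080   # hypothetical server
    SI = "aaaabbbbccccdddd"        # hypothetical base32 storage index
    SHNUM = 0
    PATH = "/shares/%s/%d" % (SI, SHNUM)

    conn = http.client.HTTPConnection(HOST, PORT)

    # download: plain GET, with a Range header to fetch just part of
    # the share (the server answers 206 Partial Content)
    conn.request("GET", PATH, headers={"Range": "bytes=0-1023"})
    resp = conn.getresponse()
    assert resp.status in (200, 206)
    first_kilobyte = resp.read()

    # immutable upload: PUT the encoded share; the server would
    # accept this only once per (storage index, share number)
    share_bytes = b"\x00" * 1000   # stand-in for a real encoded share
    conn.request("PUT", PATH, body=share_bytes)
    resp = conn.getresponse()
    assert resp.status in (200, 201)
    resp.read()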
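And here's a sketch of the server-side check that the mutable-file
rules above imply. The share layout (pubkey/seqnum/roothash/
signature/data fields) is hypothetical, verify_dsa_signature() is a
stand-in for whatever DSA library the real server would use, and a
real format would hash the share data with a Merkle tree over
segments rather than one flat hash:

    import hashlib

    class BadShareError(Exception):
        pass

    def validate_mutable_share(storage_index, share,
                               existing_share=None):
        # the pubkey must validate against the storage index:
        # SI = H(pubkey)
        if hashlib.sha256(share.pubkey).digest() != storage_index:
            raise BadShareError("pubkey does not match storage index")
        # the roothash signature must validate against the pubkey;
        # verify_dsa_signature() is a stand-in for a real DSA library
        signed_message = share.seqnum.to_bytes(8, "big") + share.roothash
        if not verify_dsa_signature(share.pubkey, share.signature,
                                    signed_message):
            raise BadShareError("bad signature over roothash")
        # every bit of the share must validate against the roothash
        # (flattened to a single hash here for brevity)
        if hashlib.sha256(share.data).digest() != share.roothash:
            raise BadShareError("share data does not match roothash")
        # only accept a seqnum higher than any existing share; with
        # no existing share, any valid share is accepted
        if (existing_share is not None
                and share.seqnum <= existing_share.seqnum):
            raise BadShareError("seqnum must be higher than existing")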
Switching to a validate-the-share scheme to control write access is
good and bad:
* + repairers can create valid, readable, overwritable shares without
    access to the writecap.
* - storage servers must do a lot of hashing and public key
    computation on every upload
* - storage servers must know the format of the uploaded share, so
    clients cannot start using new formats without first upgrading
    all the storage servers
The result would be a share-transfer protocol that would look exactly
like HTTP; however, it could not be safely implemented by a simple
HTTP server, because the PUT requests must be constrained by
validating the share. (A simple HTTP server doesn't really implement
PUT anyway.)
There is a benefit to using "plain HTTP", but some of that benefit is
lost when HTTP is really being used as an RPC mechanism (think of the
way S3 uses HTTP).
It might be useful to have storage servers declare two separate
interfaces: a plain HTTP interface for read, and a separate port or
something for write. The read side could indeed be provided by a dumb
HTTP server like apache; the write side would need something slightly
more complicated. An apache module to provide the necessary
share-write checking would be fairly straightforward, though.
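For example, the write side could be a small WSGI app sitting next to
the dumb read-side server: GETs are served straight out of the share
directory, and PUTs are funneled through a validating handler. This
sketch reuses the hypothetical validate_mutable_share() and
BadShareError from above; parse_share(), lookup_existing(), and
store_share() are likewise stand-ins:

    from base64 import b32decode

    def write_side_app(environ, start_response):
        # only PUT is allowed here; reads go to the plain HTTP server
        if environ["REQUEST_METHOD"] != "PUT":
            start_response("405 Method Not Allowed", [("Allow", "PUT")])
            return [b""]
        # expect a path of the form /shares/STORAGEINDEX/SHNUM
        _, _, si_b32, shnum = environ["PATH_INFO"].split("/")
        body = environ["wsgi.input"].read(int(environ["CONTENT_LENGTH"]))
        try:
            share = parse_share(body)          # stand-in parser
            validate_mutable_share(b32decode(si_b32, casefold=True),
                                   share,
                                   lookup_existing(si_b32, int(shnum)))
        except BadShareError as e:
            start_response("403 Forbidden", [])
            return [str(e).encode()]
        store_share(si_b32, int(shnum), body)  # stand-in storage
        start_response("201 Created", [])
        return [b""]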
Hm, that makes me curious about the potential to write the entire
Tahoe node as an apache module: it could convert requests for
/ROOT/uri/FILECAP etc into share requests and FEC decoding...
END http://allmydata.org/trac/tahoe/ticket/510
cheers,
-Brian