[tahoe-lafs-trac-stream] [tahoe-lafs] #1565: URL formats for HTTP-based storage server
tahoe-lafs
trac at tahoe-lafs.org
Mon Oct 17 13:43:53 PDT 2011
#1565: URL formats for HTTP-based storage server
--------------------------+----------------------------
 Reporter:  warner        |          Owner:
     Type:  task          |         Status:  new
 Priority:  major         |      Milestone:  eventually
Component:  code-storage  |        Version:  1.9.0b1
 Keywords:                |  Launchpad Bug:
--------------------------+----------------------------
Ticket #510 is about speaking to storage servers with mostly-plain HTTP. One
piece of this is deciding what the URLs should look like. Downloading a share
from the storage server should be a simple HTTP "GET", using a {{{Range:}}}
header to fetch less than the whole share. But we also need ways to discover
which shares are available for download, and eventually ways to upload data
to the server too.
Here's the starting point that I implemented in my prototype (which still
uses Foolscap and get_buckets() to discover shares):
* {{{GET /storage/imm/SI/%(storage_index)s/share/%(shnum)d}}}: retrieves data
  from the given share. Normal downloads use e.g.
  {{{Range: bytes=87418-131108,422601-422664,423593-423656}}} to fetch a
  bunch of spans.
* {{{GET /storage}}}: this currently returns a human-readable page describing
  the state of the storage server.
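To illustrate the share-read request above, here is a minimal Python sketch that formats the URL path and the {{{Range:}}} header value. The helper names ({{{share_url}}}, {{{range_header}}}) and the (offset, length) span representation are assumptions made for this example, not part of the prototype.

```python
def share_url(storage_index, shnum):
    """Format the immutable-share read path from the ticket's URL template."""
    return "/storage/imm/SI/%(storage_index)s/share/%(shnum)d" % {
        "storage_index": storage_index,
        "shnum": shnum,
    }


def range_header(spans):
    """Build a Range header value from (offset, length) spans.

    HTTP byte ranges are inclusive at both ends, so a span starting at
    `offset` and covering `length` bytes ends at offset + length - 1.
    """
    parts = []
    for offset, length in spans:
        if offset < 0 or length <= 0:
            raise ValueError("spans need offset >= 0 and length > 0")
        parts.append("%d-%d" % (offset, offset + length - 1))
    return "bytes=" + ",".join(parts)


# "exampleindex" is a placeholder, not a real storage index.
path = share_url("exampleindex", 3)
header = range_header([(87418, 43691), (422601, 64), (423593, 64)])
```

With those three spans, {{{header}}} reproduces the example value quoted above, {{{bytes=87418-131108,422601-422664,423593-423656}}}.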
The next steps:
* {{{GET /storage/imm/SI/%(storage_index)s/shares}}}: return a JSON list of
  share numbers
* {{{GET /storage/imm/SI/%(storage_index)s/all_shares}}}: return a JSON
  dictionary mapping share number to a read data vector. The same spans are
  returned for all shares. This collapses the Do-You-Have-Block query with
  the initial data fetch, allowing one-round-trip downloads.
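The response bodies for these two URLs are not specified yet, so the following Python sketch only shows one way a client might parse them. It assumes that {{{shares}}} returns a plain JSON list of integers and that {{{all_shares}}} encodes each returned span as a base64 string (JSON cannot carry raw bytes, and JSON object keys are always strings, so share numbers arrive as strings). Both function names and the base64 choice are assumptions for illustration.

```python
import base64
import json


def parse_shares(body):
    """Parse a `shares` response: a JSON list of share numbers."""
    shnums = json.loads(body)
    if not isinstance(shnums, list):
        raise ValueError("expected a JSON list of share numbers")
    return sorted(int(n) for n in shnums)


def parse_all_shares(body):
    """Parse an `all_shares` response into {shnum: [bytes, ...]}.

    ASSUMPTION: each span in the read data vector is base64-encoded.
    The ticket does not define the wire encoding.
    """
    raw = json.loads(body)
    return {
        int(shnum): [base64.b64decode(span) for span in spans]
        for shnum, spans in raw.items()
    }
```

A client using the second form gets both the list of available shares (the dictionary keys) and the first batch of data in a single round trip.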
I put "imm" into the URL because the current storage server treats immutable
and mutable shares very differently (they have different container formats).
It's not trivial to take an SI and switch on the type of share that it points
to. It might be cleaner to fix the server to handle this well, and then
remove the "imm" from the URL. OTOH, it might be better to leave them
distinct.
We need similar URLs for reading from mutable shares; they can probably be
the same but with "mut" instead of "imm".
We'll need POST URLs for uploading files and modifying mutable shares, as
well as adding/renewing leases and other storage server methods. The request
bodies will be more complicated since they'll need authorization signatures
or something. But the basic URL target could be:
* {{{POST /storage/imm/SI/%(storage_index)s/shares/%(shnum)d}}}: start
  uploading the given share. Return 302 FOUND if the share already exists.
  The upload can be spread across multiple requests, with a "finished" flag
  on the last request. This might involve returning an "upload token" which
  subsequent requests must reference.
* {{{POST /storage/mut/SI/%(storage_index)s/shares/%(shnum)d}}}: modify the
  given mutable share. The body will probably be a signed serialized JSON
  modification request, basically a write-vector, along with a test-vector or
  other collision-avoidance scheme.
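The test-vector plus write-vector idea for mutable shares can be sketched in Python as follows. This is a simplified, in-memory illustration: the equality-only test vector, the tuple layouts, and the name {{{apply_test_and_write}}} are assumptions for this example, and signing and JSON serialization of the request are left out entirely.

```python
def apply_test_and_write(share, test_vector, write_vector):
    """Apply write_vector to share only if every test matches.

    share:        bytearray holding the current share contents
    test_vector:  list of (offset, expected_bytes) pairs (assumed layout)
    write_vector: list of (offset, new_bytes) pairs (assumed layout)

    Returns True if the writes were applied, False if any test failed
    (in which case the share is left untouched).
    """
    # Check every test first so a failed test never leaves a partial write.
    for offset, expected in test_vector:
        if bytes(share[offset:offset + len(expected)]) != expected:
            return False
    for offset, data in write_vector:
        end = offset + len(data)
        if end > len(share):
            # Grow the share with zero bytes so the write fits.
            share.extend(b"\x00" * (end - len(share)))
        share[offset:end] = data
    return True


share = bytearray(b"seq1:hello")
ok = apply_test_and_write(
    share,
    test_vector=[(0, b"seq1")],
    write_vector=[(0, b"seq2"), (5, b"world!")],
)
```

A second writer that still expects {{{seq1}}} at offset 0 would now fail its test and get {{{False}}} back, which is the collision-avoidance property the request is after.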
All of this presumes that Accounting is not being enforced on read access. At
least one of the designs I've drawn up offers {{{read=False}}} control, as a
stick for the storage operator to use against a client who doesn't pay their
bills (but still less drastic than {{{store=False}}}, which deletes all their
data). To enforce {{{read=False}}}, the GETs would need to be authorized,
which either involves adding an extra signature header, or implementing them
with a POST instead (and putting the signature in the request body).
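As a rough illustration of the "extra signature header" option, the Python sketch below authenticates a GET by covering the method, path, and {{{Range:}}} value. It uses an HMAC with a shared key purely as a stand-in for whatever signature scheme the Accounting design ends up using; the key handling, the message layout, and the header name {{{X-Example-Signature}}} are all hypothetical.

```python
import hashlib
import hmac


def sign_get(key, path, range_value):
    """Return a hex MAC over the method, path, and Range header value.

    ASSUMPTION: an HMAC-SHA256 over a newline-joined message. This is a
    placeholder, not the signature scheme of any Tahoe-LAFS design.
    """
    msg = ("GET\n%s\n%s" % (path, range_value)).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_get(key, path, range_value, signature):
    """Server-side check, using a constant-time comparison."""
    expected = sign_get(key, path, range_value)
    return hmac.compare_digest(expected, signature)


key = b"example shared key"  # hypothetical per-client secret
path = "/storage/imm/SI/exampleindex/share/0"
headers = {
    "Range": "bytes=0-63",
    # Hypothetical header name, chosen only for this sketch.
    "X-Example-Signature": sign_get(key, path, "bytes=0-63"),
}
```

A server enforcing {{{read=False}}} would reject any GET whose signature fails {{{verify_get}}} or whose account has had read access disabled.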
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1565>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage