[tahoe-dev] proposal for an HTTP-based storage protocol
Kevin Reid
kpreid at switchb.org
Sun Sep 26 11:51:09 UTC 2010
On Sep 26, 2010, at 1:35, Ravi Pinjala wrote:
> There have been some noises on this list about replacing the
> foolscap-based storage protocol with something HTTP-based and easier
> to work with. I'd like to throw in my own work on an extensible
> HTTP-based storage protocol as a starting point....
[...]
I'd like to offer some criticism of this protocol from a web-
architecture/REST perspective.
> * discovery document URL: http://server.address/
Let this be an arbitrary URL, not required to be the server root.
> * discovery document contents:
> <webfs>
> <module path="data" interface="http://p-static.net/webfs/data/1.0">
> <feature name="max-directory-depth" value="0" />
> <feature name="max-data-size" value="1048576" />
> </module>
> <module path="metadata" interface="http://p-static.net/webfs/metadata/1.0" />
> </webfs>
Place these elements in an XML namespace.
Perhaps even let the XML namespace serve for interface and feature
identification:
<webfs xmlns="http://p-static.net/webfs/1.0">
<module xmlns="http://p-static.net/webfs/data/1.0"
path="data">
<max-directory-depth>0</max-directory-depth>
<max-data-size>1048576</max-data-size>
</module>
<module xmlns="http://p-static.net/webfs/metadata/1.0"
path="metadata" />
</webfs>
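To illustrate the point about general XML tools, here is a minimal sketch of reading the namespaced form with a stock XML parser (Python's xml.etree.ElementTree). The function names and the shape of the returned dictionary are my own assumptions for illustration and are not part of the proposed protocol.

```python
import xml.etree.ElementTree as ET

# The namespaced discovery document from the message above.
DISCOVERY_XML = """\
<webfs xmlns="http://p-static.net/webfs/1.0">
  <module xmlns="http://p-static.net/webfs/data/1.0" path="data">
    <max-directory-depth>0</max-directory-depth>
    <max-data-size>1048576</max-data-size>
  </module>
  <module xmlns="http://p-static.net/webfs/metadata/1.0" path="metadata" />
</webfs>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def read_modules(xml_text):
    """Return {interface namespace: {"path": ..., "features": {...}}}.

    Hypothetical helper: the module's XML namespace is treated as its
    interface identifier, and child elements in that same namespace are
    treated as its features.
    """
    root = ET.fromstring(xml_text)
    modules = {}
    for child in root:
        namespace, local = split_tag(child.tag)
        if local != "module":
            continue
        features = {}
        for feature in child:
            feature_ns, feature_name = split_tag(feature.tag)
            if feature_ns == namespace:
                features[feature_name] = (feature.text or "").strip()
        modules[namespace] = {"path": child.get("path"), "features": features}
    return modules


modules = read_modules(DISCOVERY_XML)
```

A client that does not recognize a module's namespace can simply skip that entry, which is the usual way namespaced XML vocabularies are extended.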
> * "data" interface URL: http://server.address/data/
This URL should be constructed by resolving path="data" as a
relative URL against the discovery document URL. Since the value is
then an ordinary URL reference, use xlink:href= instead of path= as
the attribute.
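As a sketch of what that resolution looks like in practice, standard relative-reference resolution (here via Python's urllib.parse.urljoin) already does the work. The discovery URL under /webfs/ is a made-up example, and I write the reference as "data/" with a trailing slash on the assumption that the module URL is meant to be a directory-like URL.

```python
from urllib.parse import urljoin


def module_url(discovery_url, reference):
    """Resolve a module's relative reference against the discovery document URL."""
    return urljoin(discovery_url, reference)


# Discovery document at the server root, as in the original proposal.
at_root = module_url("http://server.address/", "data/")

# Hypothetical discovery document that is NOT at the server root.
nested = module_url("http://server.address/webfs/discovery.xml", "data/")

# An absolute-path reference still works and ignores the document's directory.
absolute = module_url("http://server.address/webfs/discovery.xml", "/data/")
```

Here at_root is http://server.address/data/, nested is http://server.address/webfs/data/, and absolute is http://server.address/data/, so the same document format works wherever the server chooses to put it.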
The goal of all these changes is to make the XML contain more
semantics that are already understood by general XML/web tools,
reducing the amount of application-specific interpretation logic
needed (thus reducing the chances that someone will casually implement
the interpretation incorrectly).
> * URL of a document stored on the server: http://server.address/data/foo/bar
>
> * URL of the metadata for said document: http://server.address/metadata/foo/bar
>
> * Example of direct access to a metadata key:
> http://server.address/metadata/foo/bar?mtime
It should be explicitly part of the definition of the data and
metadata modules that they define these path patterns (underneath the
path= URL).
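For example, if the module definitions said "a document at path P lives at P relative to the module URL, and a metadata key K is addressed by the query string ?K", a client could build the URLs as in the sketch below. The function names are hypothetical, and the percent-encoding rules are my assumption, since the proposal does not state them.

```python
from urllib.parse import quote, urljoin


def document_url(module_base, doc_path):
    """Build the URL of a stored document underneath a module's base URL."""
    # Treat the module base as a directory so the join appends to it.
    base = module_base if module_base.endswith("/") else module_base + "/"
    # ASSUMPTION: path segments are percent-encoded, "/" is kept as separator.
    return urljoin(base, quote(doc_path.lstrip("/")))


def metadata_key_url(metadata_base, doc_path, key):
    """Build the URL addressing one metadata key of a document."""
    return document_url(metadata_base, doc_path) + "?" + quote(key, safe="")


doc = document_url("http://server.address/data/", "foo/bar")
meta = document_url("http://server.address/metadata/", "foo/bar")
mtime = metadata_key_url("http://server.address/metadata/", "foo/bar", "mtime")
```

This reproduces the three example URLs quoted above from nothing but the module base URLs and the stated patterns.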
> The modules I've implemented so far are a RESTful data module
> (GET/PUT/DELETE on a path do exactly what you'd expect) and metadata
> module (lets you associate arbitrary key-value metadata with a file,
> also uses GET/PUT/DELETE in an intuitive way). If my understanding of
> how a storage node works is correct, this is enough to implement a
> storage node.
What it lacks, and a storage node should have, is verification of
what's uploaded: the server should check that the name of an uploaded
share is the appropriate function of its contents (I don't know offhand
what that function actually is), so that clients can't upload obviously
bogus shares.
IIRC, this is one of the reasons we haven't just implemented 'WebDAV
server as storage node', even though WebDAV does have the GET/PUT/
DELETE and arbitrary-metadata functionality.
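To make the shape of that check concrete, here is a sketch of what the server would do on PUT. The naming function used here, the hex SHA-256 digest of the share bytes, is purely a placeholder assumption. It is not Tahoe's actual rule, which, as I said, I don't know offhand.

```python
import hashlib


def expected_share_name(contents: bytes) -> str:
    """Placeholder naming function.

    ASSUMPTION: the share name is the hex SHA-256 of its contents. This is
    a stand-in for whatever function a real storage node would use.
    """
    return hashlib.sha256(contents).hexdigest()


def accept_upload(name: str, contents: bytes) -> bool:
    """Return True only if the uploaded name matches its contents."""
    return name == expected_share_name(contents)


share = b"example share bytes"
good = accept_upload(expected_share_name(share), share)
bad = accept_upload("foo/bar", share)
```

A generic WebDAV server has no hook for this: it stores whatever bytes are PUT under whatever name the client chose, so good and bad would both be accepted.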
(Ah, that raises another question: What are the advantages of your
protocol over WebDAV? I've implemented WebDAV, and while it does have
a certain amount of architecture bloat, it doesn't seem -- at the
moment -- worth using a different protocol just for that.)
--
Kevin Reid <http://switchb.org/kpreid/>