new WAPI+WUI proposal
Brian Warner
warner at lothar.com
Tue Jun 28 23:12:20 UTC 2016
To support some new accounting / grid-setup / magic-folders workflows
(in which Tahoe nodes get invited to join a grid using a
Magic-Wormhole[1] -based protocol), I'd like to add a new web API
("WAPI"). I wanted to describe the new protocol and see what people
think. I also have plans for a new directory-browsing / node-controlling
web UI ("WUI") that would sit on top of it, to replace the
server-generated HTML -based WUI we currently have.
== How the old WAPI/WUI works
The old (machine-oriented) WAPI works by making HTTP GET/PUT/POST
requests to the web.port, generally putting the filecap in the URL path,
and selecting the operation to perform (upload, download, check, verify,
repair) by adding query args (like "?t=check") to the URL. To fetch a
directory, you GET a dircap with "?t=json".
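As a rough illustration of the old style, here is a small sketch of how a
client might build such a URL. The "/uri/" path prefix follows the existing
Tahoe WAPI convention; the helper name wapi_url and the placeholder cap
string are made up for this example and are not real caps or real API names.

```python
from urllib.parse import quote, urlencode


def wapi_url(cap, t=None, base="http://localhost:3456"):
    """Build an old-style WAPI URL: cap in the path, operation in ?t=.

    `base` defaults to the usual gateway origin mentioned in this post.
    """
    # Caps contain ":" characters, so percent-encode the whole cap.
    url = f"{base}/uri/{quote(cap, safe='')}"
    if t is not None:
        url += "?" + urlencode({"t": t})
    return url


# Placeholder cap for illustration only, not a real dircap.
print(wapi_url("URI:DIR2:abc", t="json"))
```

Note that the cap ends up in the URL itself, which is exactly what makes
the browser history and origin-sharing issues described below matter.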
The old (human-oriented) WUI renders a directory listing when you do a
GET of a dircap without t=json. The directory page has HTML forms which
let you upload a file (adding it to the rendered directory),
rename/delete files, and trigger check/verify/repair operations.
There's also a "Welcome page" which shows grid membership,
storage-server status, and lets you perform upload/download operations
that aren't tied to a specific directory. This page also links to
"Recent Uploads And Downloads", which presents performance and timing
data about recent operations, some of which is presented graphically
(the d3.js-based download timeline). A few of these pages have JSON
representations, but many do not, and as a result Cypher's "Grid-Sync"
project has resorted to scraping the HTML to determine e.g. the list of
connected servers.
== Problems with the old WAPI/WUI
There are three main problems:
* Origin Contamination: All pages retrieved from the same Tahoe
client/gateway will share a single "web origin" (usually
http://localhost:3456), which means any active Javascript content in
those pages shares access to cookies, same-origin tabs, and
same-origin browser history. This allows (the code inside) one
document to learn dircaps/filecaps of other documents. To address
this, we recommend disabling JS for the gateway origin, which
obviously limits the kinds of applications that can be built on top
of Tahoe-backed storage. See tickets #615, #821, #1859, and everything
with the "websec" keyword.
* Ambient Authority: We don't like ambient authority, and we assume that
an attacker can probably find a way to cause your browser to visit a
URL of their own choosing. So the current WAPI is intentionally
limited in power: you must give it a filecap before it can do anything
with files, and it offers no control over non-filecap things (node
admin, storage server controls). It'd be nice if we could use a
browser-based UI to e.g. configure periodic backup, or magic-folder
directories, but there's not currently a way for the node to
distinguish attacker's requests from legitimate ones.
* No Async Notifications: The WUI only does things in response to a POST
or a GET. But there are features we'd like to add that would benefit
from asynchronous user notification, like magic-folder conflict
detection, completion of long-running check/repair operations, backup
progress, etc. The current WUI check/repair has a hacked-up
poll-until-complete scheme, but it's pretty ugly and only helps with
long-running (but user-triggered) operations, nothing else.
== New WAPI/WUI
So I'm looking to build the following:
* new WAPI, listening on a new local port (so a different HTTP origin)
* this hosts a WebSocket at a path of "/v1"
* the websocket accepts CBOR[2]-encoded requests, and returns CBOR
responses
* we'll define new capability tokens which authorize node-admin actions;
CLI tools will read these from the node's basedir/private/, and will
submit them in the CBOR requests (see [3] for one CLI approach, and
ticket #674)
* the same port will also serve a new WUI static-HTML directory
* the port will never provide grid-stored files/directories via GET or
POST. The *only* thing retrievable by normal HTML operations will be
the static WUI pages, to keep the HTML origin safe
* the new WUI will be a JS-based single-page app, which provides
node control and grid-stored directory navigation
* when asked to show a grid-stored document, the WUI will render it into
a sandboxed iframe[4], like ZeroNet[5] does (which was proposed in
ticket #1797)
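To make the "CBOR request carrying a capability token" idea a bit more
concrete, here is a minimal sketch. The request fields ("id", "token",
"op", "args") and the make_request helper are purely hypothetical, since
the message schema has not been designed yet. The tiny encoder only covers
the subset of RFC 7049 needed here (ints, bytestrings, text, lists, maps,
booleans, null); a real implementation would presumably use an existing
CBOR library rather than this.

```python
import struct


def _head(major, n):
    """Encode the CBOR initial byte(s) for major type `major`, argument `n`."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 0x100:
        return bytes([(major << 5) | 24, n])
    if n < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def cbor_encode(obj):
    """Encode a small subset of Python values as CBOR (RFC 7049)."""
    if obj is None:
        return b"\xf6"
    if isinstance(obj, bool):  # must come before int: bool subclasses int
        return b"\xf5" if obj else b"\xf4"
    if isinstance(obj, int):
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, bytes):  # bytestrings are what JSON lacks
        return _head(2, len(obj)) + obj
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def make_request(token, op, request_id, **args):
    """Build one hypothetical WAPI request frame (field names are assumed)."""
    return cbor_encode({"id": request_id, "token": token, "op": op, "args": args})


# Hypothetical usage: the token bytes and operation name are placeholders.
frame = make_request(b"\x00", "ping", 1)
print(frame.hex())
```

The point of the sketch is just that the token travels inside the message
body, not in a URL or a cookie, so nothing ambient in the browser can
supply it on an attacker's behalf.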
We get to define the runtime environment of Tahoe-stored HTML/JS
document/applications, but I think a reasonable approach would be to
give the doc/app access to its parent directory and anything reachable
from there. I think, with the right combination of ServiceWorkers and
intercepted requests, we could arrange it so that when you open
DIRCAP/foo/bar.html, the sandboxed bar.html can retrieve "../other.html"
(i.e. DIRCAP/other.html) but not "../../outside.html". This would enable
Zooko's TiddlyWiki and similar apps, while keeping each app confined to
its own directory tree.
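The containment rule in that example can be sketched as a simple path
check. This is only an illustration of the policy, not of how a
ServiceWorker would actually intercept requests: the function name
resolve_within_root is invented, and it assumes the boundary is the dircap
the document was opened from, with paths expressed relative to that dircap.

```python
import posixpath


def resolve_within_root(doc_path, relative_ref):
    """Resolve `relative_ref` against the document at `doc_path`.

    Both paths are relative to the root dircap. Returns the normalized
    path inside the root, or None if the reference would escape it.
    """
    base_dir = posixpath.dirname(doc_path)
    joined = posixpath.normpath(posixpath.join(base_dir, relative_ref))
    # Anything that climbs above the root, or is absolute, is refused.
    if joined == ".." or joined.startswith("../") or joined.startswith("/"):
        return None
    # normpath() reports the root directory itself as "."
    return "" if joined == "." else joined


# Mirrors the example above: the document is DIRCAP/foo/bar.html.
print(resolve_within_root("foo/bar.html", "../other.html"))
print(resolve_within_root("foo/bar.html", "../../outside.html"))
```

The first call resolves to "other.html" (DIRCAP/other.html) and the second
returns None, since it would climb above the dircap.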
This is a shift from my previous no-JS stance (I've always encouraged
users to disable javascript for their Tahoe clients on localhost:3456,
to prevent the one-document-eats-the-other attacks, e.g. #615 and #821).
It's been about 10 years since I established that opinion. What's
changed?
* I've learned a bit more JS and see how useful it can be
* sandboxed iframes are a thing now
* ServiceWorkers are a thing now
Also, Accounting needs storagecaps and retrievalcaps (which are
basically private signing keys that allow the node to make authorized
upload and download requests), and these ought to be passed into the
WAPI rather than being ambiently available to all WAPI clients (see
#587), and I think a JS-based frontend is the only practical way to pass
in multiple caps (directory writecap + storagecap) at the same time.
The new WAPI port is also a good place to offer control of
non-file-based resources. Imagine an "admincap" (just an unguessable
token) which enables the frontend to add a new local directory to the
periodic "tahoe backup" list. Or which authorizes an API call that can
generate or accept Wormhole-style "invitation codes" to exchange
introducer.furl and accounting keys and an initial shared directory with
a new node.
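For what it's worth, the "unguessable token" part is easy to sketch. The
filename "admincap" and both function names below are hypothetical (nothing
here is an existing Tahoe file or API); the only assumptions taken from the
proposal are that the token lives under the node's private directory and
that clients must present it with each request.

```python
import hmac
import secrets
from pathlib import Path


def create_admincap(private_dir, filename="admincap"):
    """Generate a random token and store it where local CLI tools can read it.

    `filename` is a made-up name for illustration.
    """
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    path = Path(private_dir) / filename
    path.write_text(token + "\n")
    path.chmod(0o600)  # readable only by the node's own user
    return token


def is_authorized(private_dir, presented, filename="admincap"):
    """Check a presented token against the stored one in constant time."""
    expected = (Path(private_dir) / filename).read_text().strip()
    return hmac.compare_digest(expected.encode(), presented.encode())
```

A CLI tool would read the file and include the token in its request; the
node would call something like is_authorized() before acting on any
node-admin operation.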
So anyways, I plan to start implementing this at some point, and wanted
to describe my ideas with enough detail to give people a chance to
complain first (or, who knows, maybe offer ideas and feedback instead
:-).
I'll give folks a chance to chime in here on the mailing list first,
then I'll write this up in a proper ticket.
Let me know what you think!
-Brian
[1]: http://magic-wormhole.io
[2]: CBOR is like JSON but includes bytestrings and has a very compact
encoding. It actually looks a lot like Banana, the encoding used by
Foolscap. https://tools.ietf.org/html/rfc7049 is the spec, and
there are multiple libraries for both Python and JS (and others)
[3]: https://github.com/warner/petmail/blob/master/docs/steal_this_software.md#secure-all-http-frontend-access-tokens-event-channels
[4]: https://html.spec.whatwg.org/#attr-iframe-sandbox
[5]: http://zeronet.readthedocs.io/en/latest/