[tahoe-lafs-trac-stream] [Tahoe-LAFS] #1310: separate "gateway state directory" from "client state directory"
Tahoe-LAFS
trac at tahoe-lafs.org
Mon Sep 22 21:40:20 UTC 2014
#1310: separate "gateway state directory" from "client state directory"
-----------------------------------+-----------------------
     Reporter:  zooko              |       Owner:  warner
         Type:  defect             |      Status:  reopened
     Priority:  major              |   Milestone:  undecided
    Component:  code-frontend-cli  |     Version:  1.8.1
   Resolution:                     |    Keywords:  usability
Launchpad Bug:                     |
-----------------------------------+-----------------------
Comment (by warner):
(circling back to this ticket thanks to zooko's link from #2045, which is
about larger-scale changes to the node's and code's directory layout)
Rereading zooko's initial issue, I found myself tempted to yell out "don't
do that!". I guess I've always optimized tahoe's frontend- and setup-
management tools for the common case of a single "gateway" per
(user*computer) tuple. I really want the instructions to be as simple as
"tahoe create; tahoe start; tahoe webopen". I don't want to complicate
that for the sake of the less-common use case of multiple
nodes/gateways/clients/whatevers.
Partly that indicates a lack of universality in our design (which we've
always known about, and always regretted, but also know better than to try
and fix, because it's very hard, certainly distracting, probably
confusing, and slightly impossible). There's no "one true grid" (#2009).
Grids can't be too big (#235, #444), increasing the demand for using
multiple ones, in particular if you want to use tahoe to share files with
other people. There are no "grid identifiers" in filecaps (#403), so if
you want to use multiple distinct grids, you need multiple distinct client
nodes ("gateways" in zooko's lexicon) and must be careful to give the
right filecap to the right node (WAPI port / gateway process).
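
To make the bookkeeping burden concrete, here is a minimal sketch of what a
multi-grid user has to track by hand today. The `GATEWAYS` table, the grid
nicknames, the second port number, and the `download_url` helper are all
hypothetical and not part of tahoe; the only thing assumed from tahoe itself
is that a gateway serves files under `uri/` followed by the filecap on its
WAPI port.

```python
from urllib.parse import quote

# Hypothetical table: one client node (gateway) per grid, keyed by a
# nickname the user chooses. Nicknames and ports are made up.
GATEWAYS = {
    "family-grid": "http://127.0.0.1:3456/",
    "work-grid": "http://127.0.0.1:3457/",
}


def download_url(grid_nickname, filecap):
    """Build the WAPI URL for fetching `filecap` through one gateway.

    Nothing in the filecap says which grid it belongs to, so the caller
    has to supply the grid nickname out of band.
    """
    base = GATEWAYS[grid_nickname]  # KeyError if the grid is unknown
    return base + "uri/" + quote(filecap)


# Placeholder cap string, not a real filecap.
print(download_url("work-grid", "URI:CHK:placeholder"))
```

Handing the same cap to the "family-grid" gateway would produce a
well-formed URL too, which is exactly the problem: the mistake only shows
up when the download fails.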
A side-note on terminology mismatches: I think (and talk) about tahoe in
terms of three pieces:
* 1: frontends (CLI scripts, web browsers, FTP/SFTP clients), all
talking over a network connection (WAPI or others) to the client node
* 2: client nodes, which respond to WAPI requests, perform the
upload/download/encode/decode algorithms and make connections to server
nodes
* 3: server nodes, which are (so far) agnostic about file-encoding
formats and just respond to PUT/GET-share requests from client nodes
I think Zooko thinks/talks in terms of the same three pieces but with
different names:
* 1: clients (CLI scripts, web browsers, FTP/SFTP clients)
* 2: gateways
* 3: servers
A related angle is the imperfect distinction between the functions
performed by pieces 1 and 2. `tahoe backup` is a good example: this is
currently a CLI command, but I feel that it should really be moved into
the client node (#1018). Backup is more of an ongoing process than a one-
off action (#643). A one-shot CLI command needs to be run from cron to
make it into a process, and then it doesn't have enough information to
coordinate with other (overlapping) runs (#2062, #2053). I'd like to have
backups be managed through some sort of control panel (#1588, #1587),
where you can express your priorities and preferences about what you want
to be backed up and how much network/CPU it's allowed to consume, and then
the backup agent handles the rest. This control panel should also be a
place to check in on the process, especially for progress reports during
the long initial upload.
There are two big blockers for this sort of long-running agent. The first
is how/whether to split this from the long-running code that knows how to
upload/download files. We had a good [wiki:Summit2Day1#AgentGatewaysplit
discussion] about this at the [wiki:Summit2011 2011 summit]. Zooko's
mental model, which uses the word "gateway", helps make this a bit more
clear: my desired backup-manager process would live in an "Agent", and the
upload/download stuff would live in the "Gateway", and maybe the Agent
would use the Gateway but not vice-versa. Depending upon the value of
caching server connections and generally having a long-term relationship
with servers (tracking uptime/speed/reliability), it might even make sense
for the Gateway functionality to *not* live in a long-term process, and
instead be a short-lifetime library that gets loaded on demand (imagine if
"tahoe put" were standalone, maybe learning about cached server
information from a sqlite database, but establishing its own server
connections as necessary).
The second is how to safely talk to this agent while honoring our objcap
no-ambient-authority style: probably through some sort of restricted-access
web-based control panel (#674, wiki:Summit2Day2#ControlPanel), which I've
been prototyping externally in my "toolbed" and "petmail" projects (but
it's very JS-heavy).
Anyways, that was a long diversion away from the main point: the use of a
single NODEDIR to manage the states and configurations of all these pieces
(client-ish stuff, agent-ish stuff, gateway-ish stuff, heck even server-
ish stuff) is ideal for one-grid cases, and confusing for multiple-grid
cases.
I'm warming slightly to the `--cli-directory=` idea. Maybe we could split
these different bits of functionality into separate subdirs, put all of
them in the single NODEDIR by default, and make it clear that e.g. CLI
commands only touch stuff in NODEDIR/cli/* . Then make it possible to
override either the top-level `--nodedir=` or a CLI-functionality-specific
`--cli-directory=`.
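
A rough sketch of how that override order might resolve, assuming the
more specific option wins. The `resolve_cli_dir` function and the `cli`
subdirectory name are hypothetical (nothing like this exists in tahoe
today); `~/.tahoe` is used as the default NODEDIR.

```python
import os
from typing import Optional

DEFAULT_NODEDIR = "~/.tahoe"


def resolve_cli_dir(nodedir: Optional[str] = None,
                    cli_directory: Optional[str] = None) -> str:
    """Return the directory that CLI commands would read and write.

    Hypothetical precedence:
      1. an explicit --cli-directory= value, used as-is
      2. NODEDIR/cli under an explicit --nodedir= value
      3. NODEDIR/cli under the default node directory
    """
    if cli_directory is not None:
        return os.path.abspath(os.path.expanduser(cli_directory))
    base = nodedir if nodedir is not None else DEFAULT_NODEDIR
    return os.path.join(os.path.abspath(os.path.expanduser(base)), "cli")


print(resolve_cli_dir())
print(resolve_cli_dir(nodedir="/srv/tahoe/grid-a"))
print(resolve_cli_dir(nodedir="/srv/tahoe/grid-a",
                      cli_directory="/home/alice/cli-state"))
```

The single-grid user never passes either option and everything stays under
one NODEDIR; only the multiple-gateway user has to learn the second flag.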
It's worth remembering the overlap between these components, though. The
gateway writes out a `node.url` file, and the frontend commands read it.
Control panels will involve access keys being read from or written to
agent-accessible databases. We might be able to statically construct enough of
this that we don't need to think about ongoing config-directory-based
communication between components after initial `tahoe create`, but maybe
not.
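
The `node.url` handoff is the simplest example of that config-directory
communication. A minimal sketch of both sides, assuming the file holds a
single URL on one line; the helper names are hypothetical and this is not
tahoe's actual code:

```python
import os
import tempfile


def write_node_url(nodedir, url):
    """Gateway side: record where the WAPI is listening."""
    with open(os.path.join(nodedir, "node.url"), "w") as f:
        f.write(url + "\n")


def read_node_url(nodedir):
    """Frontend side: find the gateway by reading the same file."""
    with open(os.path.join(nodedir, "node.url")) as f:
        url = f.read().strip()
    # Normalize so callers can append paths like "uri/..." directly.
    if not url.endswith("/"):
        url += "/"
    return url


# Demo with a throwaway directory standing in for NODEDIR.
with tempfile.TemporaryDirectory() as nodedir:
    write_node_url(nodedir, "http://127.0.0.1:3456")
    print(read_node_url(nodedir))
```

If the CLI state moved to a separate directory, the frontend would still
need some way to locate this file, which is the overlap in question.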
But I still think it may be easier to tell people "don't do that" and try
to make the one-grid-per-user case work better.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1310#comment:14>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage