[tahoe-dev] [tahoe-lafs] #83: Extend external interfaces for operation monitoring.
tahoe-lafs
trac at allmydata.org
Sat Dec 12 19:59:38 PST 2009
#83: Extend external interfaces for operation monitoring.
-------------------------+--------------------------------------------------
Reporter: nejucomo | Owner: nejucomo
Type: enhancement | Status: new
Priority: minor | Milestone: undecided
Component: code | Version: 0.6.1
Keywords: | Launchpad_bug:
-------------------------+--------------------------------------------------
Comment(by zooko):
So I definitely would have preferred the simplicity of using in-band
progress indicators and cancellation as described in comment:1, but Brian
persuaded me that this just wasn't good enough. The part of his argument
that I remember being unable to counter was that we have some operations
that take longer than an HTTP connection can reliably last. For example,
suppose you want to do a deep-verify-and-repair, which walks a large
directory structure, downloads every bit of every share of every file,
and, if necessary, uploads replacement shares. This could take days or
weeks or months, and if your control of the process is a single HTTP
connection then you're quite likely to suffer a network glitch which
closes your TCP connection, or to encounter some kind of stupid timeout in
an HTTP proxy or something.
(The way I like to think of this is that the comms abstraction of TCP is
insufficiently robust -- there isn't a widely understood and implemented
way to force your HTTP transaction to outlive temporary disconnections of
the underlying TCP connection. That means that HTTP, while a wonderful
lingua franca for some protocols, can't be used for long-running
operations or operations which cannot be safely retried when the
first try might or might not have failed to get through.)
So, Brian went ahead and invented "operation handles", documented here:
[source:docs/frontends/webapi.txt@4112#L203].
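To make the idea concrete, here is a minimal sketch of how a client might poll an operation handle so that a dropped connection only costs one poll rather than the whole operation. It is an illustration, not Tahoe's own client code. The names `poll_operation` and `make_http_fetcher` are hypothetical, and the `/operations/$HANDLE?output=JSON` URL shape and the `"finished"` key in the JSON status are assumptions based on my reading of webapi.txt, so check them against the docs before relying on them.

```python
import json
import time
import urllib.parse
import urllib.request


def make_http_fetcher(base_url):
    """Return a function that fetches the status text for one handle.

    Assumption: status lives at <base_url>/operations/<handle>?output=JSON.
    """
    def fetch(handle):
        url = "%s/operations/%s?output=JSON" % (
            base_url.rstrip("/"), urllib.parse.quote(handle, safe=""))
        with urllib.request.urlopen(url) as resp:
            return resp.read().decode("utf-8")
    return fetch


def poll_operation(fetch_status, handle, interval=5.0, max_failures=10,
                   sleep=time.sleep):
    """Poll until the status reports finished, tolerating network glitches.

    Each poll is its own short request, so a broken TCP connection only
    loses that one poll. Gives up after more than max_failures
    consecutive network errors.
    """
    failures = 0
    while True:
        try:
            status = json.loads(fetch_status(handle))
        except OSError:
            # urllib's URLError and socket errors are OSError subclasses.
            failures += 1
            if failures > max_failures:
                raise
            sleep(interval)
            continue
        failures = 0
        # Assumption: the status object carries a boolean "finished" key.
        if status.get("finished"):
            return status
        sleep(interval)
```

Usage would look something like `poll_operation(make_http_fetcher("http://127.0.0.1:3456"), "my-handle")`, where the node address and the handle are placeholders.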
Hm, reading those docs again, I see this new text:
{{{
Many "slow" operations can begin to use unacceptable amounts of memory
when operating on large directory structures. The memory usage increases
when the ophandle is polled, as the results must be copied into a JSON
string, sent over the wire, then parsed by a client. So, as an
alternative, many "slow" operations have streaming equivalents. These
equivalents do not use operation handles. Instead, they emit
line-oriented status results immediately. Client code can cancel the
operation by simply closing the HTTP connection.
}}}
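As a rough sketch of what consuming such a streaming operation might look like on the client side: read the response one line at a time, hand each result to a callback so nothing accumulates in memory, and close the connection to cancel. The function name `consume_stream` is hypothetical, and treating each line as one JSON value is my assumption; the quoted text only promises "line-oriented status results".

```python
import json


def consume_stream(response, handle_unit):
    """Process line-oriented results one at a time.

    `response` is any iterable of byte lines with a close() method, for
    example the object returned by urllib.request.urlopen(). `handle_unit`
    is called with each decoded line and returns True to cancel early.
    Returns the number of results processed.
    """
    count = 0
    try:
        for raw in response:
            text = raw.decode("utf-8").strip()
            if not text:
                continue  # skip blank lines
            count += 1
            # Assumption: each non-blank line is one complete JSON value.
            if handle_unit(json.loads(text)):
                break
    finally:
        # Closing the connection is how the client cancels the operation.
        response.close()
    return count
```

Because only one line is held at a time, memory use stays flat however large the directory structure is. The flip side is the one described above: if the connection drops, there is nothing to reconnect to.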
Oh dear, so it appears that neither the operation-handles nor the single
HTTP connection is really good enough in all dimensions. Hm.
So what shall we do with this ticket? I guess we'll close it as "fixed",
and then maybe open a new ticket saying "Make operation-handle-querying
use only a little memory" and maybe open another ticket saying "Invent
robust HTTP so that streaming operations can be used on operations that
last longer than a TCP connection lasts".
I'm not actually going to open either of those two tickets right now. I
just took painkillers for my knee (recuperating from surgery).
If Brian, Nathan, or David-Sarah (or anyone) has any ideas on how to
follow up on this, by all means post to the list or comment on this or
some other ticket.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/83#comment:4>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid