[tahoe-dev] uTP (low extra delay transport)

Thu May 27 16:45:29 PDT 2010

On 5/22/10 10:37 PM, ghazel at gmail.com wrote:
> 
> So my question is; would the Tahoe project be interested in integrating uTP?

Heya!

In the abstract, yes. As you observed, there are a number of places in
Tahoe where low-level background traffic could be put to good use. File
repair, in particular, is the kind of thing that you want to do
continuously, but which needs to avoid interfering with user experience.
And it would be a lot easier to talk somebody into running a storage
server if you can also promise them that their web browsing won't be
interrupted by the Tahoe traffic. (In the long run, if you do a lot of
web browsing, you might not leave much bandwidth for Tahoe, and
you'll lose reputation among the other nodes, and they'll stop using
you, and you'll stop getting points/mojo/etc.)

If we treat non-human-inconveniencing bandwidth as free, then uTP is a
great way to take advantage of all this free bandwidth lying around.
Configuring a router to perform QoS (and throttle all Tahoe traffic in
deference to everything else) could accomplish the same thing, but
that's not something that the average user is going to bother with. uTP
is much easier to deploy than QoS, and would probably behave better too,
because QoS is hard to get right (where "right" means that TCP behaves
well when its packets are delayed or dropped).

The NAT handling properties are useful too.

To use it, though, we'd need to build some layers on top. Tahoe's
storage-server connections are currently built on Foolscap+SSL+TCP. We
currently need confidentiality from the protocol to protect the shared
secrets used for lease-management and mutable-share-modification. The
Accounting project will add signatures for resource-consumption
authority, and that may be a good time to replace the shared secrets with
ECDSA keypairs, removing the confidentiality requirement. If we do it
right, we should be able to drop SSL and send shares over basic HTTP
(using signed request/response messages, with no secrets inside).

(I imagine that it would be non-trivial to build SSL on top of uTP, so I
don't think we'd make a lot of progress until we get away from the
shared secrets).

If we get there, then we can probably build a transport that could
safely run over uTP. We might want to have two transports: the
normal-priority TCP one that's used for foreground uploads/downloads,
and the low-priority uTP one that's used for repair work. Using HTTP for
the normal-priority one would leverage existing code and experience.

What does the uTP API look like? Is it a byte-pipe? Or like a reliable
datagram?

After moving away from SSL, our HTTP-friendly protocol would probably
look like pairwise large reliable datagram exchanges. Uploads would look
like a reserve-space-message-plus-ACK, followed by a series of
store-block-plus-ACK, followed by a close-plus-ACK. Downloads would be a
Do-You-Have-Block-plus-response, followed by
Read-Some-Data-plus-response. We could manage our own
request-id-response-id mapping table, if we only got a datagramish
interface. We'd have to think through the security implications of
losing the notion of "connection", but that's probably manageable.
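
As a rough sketch of the request-id-response-id mapping table described
above: the class below assumes a datagram-style send function and a JSON
message layout with "id", "kind", "body" and "in_reply_to" fields. All of
those names are made up for illustration; none of this is an existing
Tahoe, Foolscap, or uTP API.

```python
import itertools
import json


class PendingRequests:
    """Hypothetical table matching datagram responses to earlier requests."""

    def __init__(self, send_datagram):
        # send_datagram is an assumed callable that takes one bytes message.
        self._send = send_datagram
        self._next_id = itertools.count(1)
        self._pending = {}  # request id -> callback awaiting the response

    def request(self, kind, body, on_response):
        """Send one request and remember who wants the answer."""
        req_id = next(self._next_id)
        self._pending[req_id] = on_response
        message = {"id": req_id, "kind": kind, "body": body}
        self._send(json.dumps(message).encode())
        return req_id

    def datagram_received(self, data):
        """Dispatch a response; return False if nobody was waiting for it."""
        message = json.loads(data.decode())
        callback = self._pending.pop(message.get("in_reply_to"), None)
        if callback is None:
            return False  # unknown or duplicate response, so drop it
        callback(message["body"])
        return True
```

Popping the entry on first delivery means a replayed or duplicated
response is ignored, which is one small piece of the "no connection"
security question, not an answer to it.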

We'd want to have a uTP API that's close enough to the subset of
TCP/HTTP that we use, so these two modes (foreground vs background)
wouldn't need drastically different code.
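
To make the "same code for both modes" idea concrete, here is a minimal
sketch in which the upload sequence (reserve-space, store-block, close,
each with an ACK) is written once against a small transport interface.
MessageTransport, RecordingTransport, exchange() and upload_share() are
all hypothetical names, and the in-memory transport merely stands in for
a real TCP/HTTP or uTP implementation.

```python
from abc import ABC, abstractmethod


class MessageTransport(ABC):
    """Hypothetical interface shared by a TCP/HTTP and a uTP transport."""

    @abstractmethod
    def exchange(self, kind, body):
        """Send one request message and return the peer's response."""


class RecordingTransport(MessageTransport):
    """In-memory stand-in that ACKs everything; for illustration only."""

    def __init__(self, name):
        self.name = name
        self.log = []  # kinds of the messages sent, in order

    def exchange(self, kind, body):
        self.log.append(kind)
        return "ACK"


def upload_share(transport, blocks):
    """Run reserve-space, one store-block per block, then close."""
    total = sum(len(block) for block in blocks)
    if transport.exchange("reserve-space", total) != "ACK":
        return False
    for block in blocks:
        if transport.exchange("store-block", block) != "ACK":
            return False
    return transport.exchange("close", None) == "ACK"


# The same upload code runs over either transport.
foreground = RecordingTransport("tcp-http")
background = RecordingTransport("utp")
upload_share(foreground, [b"ab", b"cd"])
upload_share(background, [b"ab", b"cd"])
```

The caller picks the transport (foreground for user uploads, background
for repair) and the upload logic itself does not change.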

So, sounds exciting, and I'd like to use it, but there's a lot of prep
work that has to happen first.

thanks,
 -Brian