[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2759: separate Tub per server connection
Tahoe-LAFS
trac at tahoe-lafs.org
Tue Mar 29 21:41:38 UTC 2016
#2759: separate Tub per server connection
--------------------------+---------------------------
Reporter: warner | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: undecided
Component: code-network | Version: 1.10.2
Keywords: | Launchpad Bug:
--------------------------+---------------------------
Leif, dawuud, and I had an idea during today's devchat: what if we used a
separate Tub for each server connection?
The context was Leif's use case, where he wants a grid in which all
servers (including his own) advertise a Tor .onion address, but he wants
to connect to his own servers over faster direct TCP connections (these
servers are on the local network).
Through a combination of the #68 multi-introducer work, and the #517
server-override work, the plan is:
* each introducer's data is written into a cache file (YAML-format, with
one clause per server)
* there is also an override file, which contains YAML clauses of server
data that should be used instead-of/in-addition-to the data received from
the introducer
* the !StorageFarmBroker, when deciding how to contact a server, combines
data from all introducers, then updates that dict with data from the
override file
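As a rough illustration of that combine-then-override step, here is a minimal sketch. It assumes each introducer cache and the override file have already been parsed into dicts keyed by server ID; the function name `merge_server_data` and the dict shapes are hypothetical, not existing Tahoe code.

```python
def merge_server_data(introducer_caches, overrides):
    """Combine per-introducer announcements, then apply local overrides.

    introducer_caches: list of dicts mapping server ID -> announcement dict
    overrides: dict mapping server ID -> override dict
    (Both shapes are assumptions made for this sketch.)
    """
    merged = {}
    for cache in introducer_caches:
        for server_id, ann in cache.items():
            # In this sketch, later introducers win on conflicting keys.
            merged.setdefault(server_id, {}).update(ann)
    for server_id, local in overrides.items():
        # Overrides replace matching keys and may add entirely new servers.
        merged.setdefault(server_id, {}).update(local)
    return merged
```

Applying the overrides last is what lets a local clause be used "instead-of" (same server ID) or "in-addition-to" (new server ID) the introducer's data.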
So Leif can:
* start up the node normally, wait for the introducers to collect
announcements
* copy the cached clauses for his local servers into the override file
* edit the override file to modify the FURL to use a direct
"tcp:HOST:PORT" hint, instead of the "tor:XYZ.onion:80" hint that they
advertised
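The FURL edit in the last step could be done by hand, or with a small helper along these lines. This is only a sketch: it assumes the usual `pb://TUBID@HINT,HINT/SWISSNUM` FURL layout, and `replace_furl_hints` is a hypothetical name, not a Foolscap or Tahoe function.

```python
def replace_furl_hints(furl, new_hints):
    """Return a copy of furl with its connection hints replaced by new_hints.

    Assumes the layout pb://TUBID@HINT,HINT,.../SWISSNUM.
    """
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    tubid, at, rest = furl[len(prefix):].partition("@")
    if not at:
        raise ValueError("FURL has no connection hints")
    _old_hints, slash, swissnum = rest.partition("/")
    if not slash:
        raise ValueError("FURL has no name/swissnum part")
    # Keep the TubID and swissnum; only the location hints change.
    return "%s%s@%s/%s" % (prefix, tubid, ",".join(new_hints), swissnum)
```

For example, `replace_furl_hints("pb://TUBID@tor:XYZ.onion:80/SWISSNUM", ["tcp:HOST:PORT"])` yields `pb://TUBID@tcp:HOST:PORT/SWISSNUM`.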
But now the issue is: tahoe.cfg has an `anonymous=true` flag, which tells
it to configure Foolscap to remove the `DefaultTCP` connection-hint
handler, for safety: no direct-TCP hints will be honored. So how should
this overridden server use an otherwise-prohibited TCP connection?
So our idea was that each YAML clause has two chunks of data: one local,
one copied from the introducer announcement. The local data should include
a string of some form that specifies the properties of the Tub that should
be used for connections to this server. The !StorageFarmBroker will spin
up a new Tub for each connection, configure it according to those
properties, then call `getReference()` (actually `connectTo()`, to get the
reconnect-on-drop behavior).
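The format of that properties string is still open. Purely to make the idea concrete, the sketch below assumes it is a comma-separated list of permitted hint types (e.g. "tcp" or "tor,tcp") and shows which of a server's hints a Tub configured that way would honor. Both the format and the `usable_hints` helper are hypothetical; a real implementation would instead install the matching Foolscap connection-hint handlers on the per-server Tub.

```python
def usable_hints(hints, tub_properties):
    """Filter connection hints down to the types this Tub may use.

    hints: list of strings such as "tcp:HOST:PORT" or "tor:XYZ.onion:80"
    tub_properties: hypothetical spec, a comma-separated list of hint types
    """
    allowed = {t.strip() for t in tub_properties.split(",") if t.strip()}
    # The hint type is everything before the first colon.
    return [h for h in hints if h.split(":", 1)[0] in allowed]
```

Under this assumption, an overridden local server would carry "tcp" in its local data while every introducer-supplied server would carry "tor", so the anonymity restriction stays in force everywhere except where it was explicitly overridden.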
The tahoe.cfg settings for foolscap connection-hint handlers get written
into the cached introducer data. !StorageFarmBroker creates Tubs that obey
those rules because those rules are sitting next to the announcement that
will contain the FURL.
In this world, we'll have one Tub for the server (if any), with a
persistent identity (storing its key in private/node.privkey as usual).
Then we'll have a separate ephemeral Tub for each storage server, which
doesn't store its private key anywhere. (I think we'll also have a
separate persistent Tub for the control-port / logport).
Potential issues:
* Performance: we have to make a new TLS key (probably RSA?) for each
connection. Probably not a big deal.
* We can't share client-side objects between storage servers. We don't do
this now, so it's no big loss. The idea would be something like: instead
of the client getting access to a server !ShareWriter object and sending
`.write(data)` messages to it, we could flip it around and *give* the
server access to a client-side !ShareReader object, and the server would
issue `.read(length)` calls to it. That would let the server set the pace
more directly. And then the server could sub-contract to a different
server by passing it the !ShareReader object, then step out of the
conversation entirely. However this would only work if our client could
accept inbound connections, or if the subcontractor server already had a
connection to the client (maybe the client connected to them as well).
* We lose the sneaky NAT-bypass trick that lets you run a storage server
on a NAT-bound machine. The trick is that you also run a client on your
machine, it connects to other client+server nodes, then when those nodes
want to use your server, they utilize the existing reverse connection
(Foolscap doesn't care who originally initiated a connection, as long as
both sides have proved control over the right TLS key). This trick only
worked when those other clients had public IP addresses, so your box could
connect to them.
None of those issues are serious: I think we could live with them.
And one benefit is that we'd eliminate the TubID-based correlation between
connections to different storage servers. This is the correlation that
foils your plans when you call yourself Alice when you connect to server1,
and Bob when you connect to server2.
It would leave the #666 Accounting pubkey relationship (but you'd probably
turn that off if you wanted anonymity), and the timing relationship
(server1 and server2 compare notes, and see that "Alice" and "Bob" connect
at exactly the same time, and conclude that Alice==Bob). And of course
there's the usual storage-index correlation: Alice and Bob are always
asking for the same shares. But removing the TubID correlation is a good
(and necessary) first step.
The !StorageFarmBroker object has responsibility for creating IServer
objects for each storage server, and it doesn't have to expose what Tub
it's using, so things would be encapsulated pretty nicely. (In the long
run, the IServer objects it provides won't be using Foolscap at all).
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2759>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage