#2773 closed task (fixed)

`tahoe create-node` should require `--location` and/or `--hostname`, and not autodetect

Reported by: warner Owned by: warner
Priority: normal Milestone: 1.12.0
Component: code-nodeadmin Version: 1.11.0
Keywords: Cc:
Launchpad Bug:

Description (last modified by warner)

This splits out the most user-visible aspect of #2491. The create-node commands that create listening nodes should always be told what hostname or IP address(es) to advertise, rather than guessing by using IP-address autodetection functions.

That means the following commands should require a --location= argument:

  • create-node (when storage service is enabled)
  • create-introducer
  • create-key-generator (deleted in #2783)
  • create-stats-gatherer

These commands should also accept (but not require) a --port= argument, which says what port the server should listen on. These will be endpoint descriptors, so things like tcp:12345. Both location and port get written into tahoe.cfg. Unlike --location=, --port= is not mandatory, and the code should pick something sensible if it isn't provided.

I'm undecided on what exactly counts as "sensible". One option is just to allocate a free one inside create-node (different for each server). Another is for us to choose a port number for Tahoe servers to listen on (now, in this ticket, maybe register it with IANA or something), and let the option default to that. A third is a hybrid: attempt to listen on the default port during create-node and record that (in tahoe.cfg) if that succeeds, but if it's already claimed by some other process, allocate and record a random one.
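The hybrid option could look roughly like the sketch below. It is only an illustration: `HYPOTHETICAL_DEFAULT_PORT` and `allocate_port` are made-up names, and the port number is a placeholder, since no default port has been chosen or registered.

```python
import socket

# Placeholder value only: the ticket does not settle on a default port.
HYPOTHETICAL_DEFAULT_PORT = 3457


def allocate_port(preferred=HYPOTHETICAL_DEFAULT_PORT):
    """Return `preferred` if it can be bound right now, else an OS-chosen free port."""
    # Port 0 asks the operating system to pick any free port.
    for candidate in (preferred, 0):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind(("", candidate))
            return s.getsockname()[1]
        except OSError:
            # Already claimed by some other process: fall through to the next candidate.
            continue
        finally:
            s.close()
    raise RuntimeError("could not allocate a TCP port")
```

The socket is closed before returning, so another process could still grab the port between create-node and the first start of the node. Recording the number in tahoe.cfg does not reserve it.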

Note: if the server is using a Tor "onion service", then the user should not be obligated to figure this out: the --tor option should provision and register the XYZ.onion location automatically. So --location= and --port= should *not* be required (and in fact should probably be rejected) if --tor is active.

But for plain TCP locations, the server should stop trying to figure it out on its own, and have the node-constructing command get this information from the user.

Change History (18)

comment:1 Changed at 2016-04-28T06:58:32Z by warner

Hm, I realized I'm not sure whether --location= is supposed to be just a hostname (example.com), or a full Foolscap connection hint (tcp:example.com:12345).

If it's just a hostname, then it makes sense for --port to be optional, because the command can build a connection hint from the given hostname and a self-allocated port.

If it's a full connection hint, then you basically must provide a --port that matches the hint: --location=tcp:example.com:12345 --port=12345. If you have some funky non-1-to-1 port-mapping firewall going on, you might provide different port numbers (where --location points at the external one, and --port provides the internal one). You'd do the same to use an externally-configured Tor onion service, --location=tor:xyz.onion:80 --port=tcp:12345.

I suppose we could say that if you pass --location, but not --port, and the location is in the form tcp:HOSTNAME:PORT, then we glean --port from the last component of the location, but this feels a bit magic to me.

We could also say that if you provide --location=tcp:HOSTNAME:PORT1 and a different --port=PORT2, then we could pretend that you really said --location=tcp:HOSTNAME:PORT2, but that would be even more magical, and would prevent setting up the portmapping example above.

I guess I'm most inclined to be fully explicit. --location= would be a full tcp:HOSTNAME:PORT connection hint (or a comma-separated list of them). --port= would be a full endpoint specification (so tcp:PORT or tcp:PORT:interface=127.0.0.1), with maybe the "tcp:" being optional.

Oh, what if we have a separate argument name --hostname=? Then we'd have the following possibilities:

  • --hostname=example.com : self-allocate a port, use location=tcp:example.com:PORT, and port=tcp:PORT
  • --hostname=example.com --port=12345: use tcp:example.com:12345 and tcp:12345
  • --location=tcp:example.com:12345 --port=tcp:12345: use those as location and port
  • it would be an error to provide both --hostname and --location, or to provide --location without --port.
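As a rough sketch of those four rules (not the eventual implementation), the argument handling might reduce to one function. `resolve_listener` and its `allocate_port` callback are hypothetical names. Requiring one of --hostname/--location comes from the ticket summary, and accepting a bare number for --port in the hostname case is an assumption.

```python
def resolve_listener(hostname=None, location=None, port=None, allocate_port=None):
    """Return (location, port) strings for tahoe.cfg, following the cases above."""
    if hostname and location:
        raise ValueError("--hostname and --location are mutually exclusive")
    if location:
        if not port:
            raise ValueError("--location requires --port")
        # Fully explicit: pass both values through untouched.
        return location, port
    if not hostname:
        raise ValueError("one of --hostname or --location is required")
    if port is None:
        # Self-allocate; the caller supplies the allocation strategy.
        portnum = allocate_port()
    else:
        # Accept "12345" or "tcp:12345"; anything else raises ValueError.
        portnum = int(port[4:] if port.startswith("tcp:") else port)
    return "tcp:%s:%d" % (hostname, portnum), "tcp:%d" % portnum
```

For example, `resolve_listener(hostname="example.com", port="12345")` gives `("tcp:example.com:12345", "tcp:12345")`.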

And then using an automatically-configured Tor onion service would still be something different, like --tor or --listen-tor, which builds the service, then sets --location and --port internally.

comment:2 Changed at 2016-04-28T07:07:07Z by warner

  • Summary changed from `tahoe create-node` should require `--location`, and not autodetect to `tahoe create-node` should require `--location` and/or `--hostname`, and not autodetect

comment:3 Changed at 2016-04-28T07:12:50Z by warner

#1478 would be easier with something like this in place.

comment:4 Changed at 2016-04-29T18:12:50Z by warner

At today's devchat, meejah and I worked something out.

The node creation commands are responsible for writing tub.location and tub.port into tahoe.cfg, with the same semantics as we have now: tub.location is a foolscap hint list (e.g. "tcp:HOSTNAME:PORT,tcp:HOST2:PORT2"), and tub.port is a twisted server endpoint specification ("tcp:12345" or "tcp:12345:interface=127.0.0.1"). These commands will also write information about Tor setup, when necessary. All allocation is finished by the time create-node exits, so nothing dynamic needs to happen at runtime (tahoe start).

Then the server-like node-creation commands (create-server, create-introducer, and create-node if it means client+server) have a couple of different allowable cases:

  • tahoe create-server --hostname=HOST: This allocates a TCP port (maybe an IANA-standardized Tahoe port, maybe just any free one), and sets location to tcp:HOST:PORT and port to tcp:PORT.
  • --listen-tor: This allocates an onion address (inside create-server, which will need to start up a reactor, maybe start a tor instance, and talk to Tor's control port to create the private key), then sets location to tor:XYZ.onion:PORT1 and port to tcp:PORT2:interface=127.0.0.1. If we're able to use a unix socket for the inbound connections from tor, the port will be something like unix:PATH instead.
  • --listen-i2p: same, but for I2P
  • any combination of the above three
  • --location= --port=: This explicitly provides both the FURL connection hints and the listening port. --location and --port must be provided as a pair (both, or neither, but never just one). If they are provided, none of --hostname/--listen-tor/--listen-i2p are accepted.
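Whichever case applies, the end result is a tub.port/tub.location pair recorded in the [node] section of tahoe.cfg. Below is a minimal sketch of that final step using the standard-library configparser. `render_node_config` is a hypothetical helper, not how the node creation commands actually write the file.

```python
import configparser
import io


def render_node_config(location, port):
    """Render a [node] section holding the listening port and advertised location."""
    # Interpolation is disabled so values are stored exactly as given.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["node"] = {"tub.port": port, "tub.location": location}
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(render_node_config("tcp:example.com:12345", "tcp:12345"))
```

Because both values are fixed strings by the time the file is written, nothing needs to be allocated or detected when the node later starts.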

In the outbound direction, clients will automatically use Tor hints if possible. At node startup, they'll attempt to import txtorcon, and if that works, they'll register a foolscap connection handler for "tor" hints. The handler will provide Tor endpoints on request (synchronously). Then, when .connect() is called on one of those hints, the handler will attempt to spin up a Tor instance. If this fails (e.g. because Tor is not actually installed), the connection will fail, no big deal. Same story for i2p: try to import txi2p, if that works install the connection handler (which might not work), if not, don't.

tahoe.cfg will have an option to *not* use Tor, for the benefit of folks operating in an environment where they have txtorcon and Tor installed, but for whatever reason don't want Tahoe to use them. I don't think this is common enough to need a CLI argument: folks can just edit tahoe.cfg after node creation.

The client-like node creation commands (create-client, maybe create-node) will then have an argument like --only-tor which indicates that all outbound TCP connections ought to route through Tor instead. It will also have an --anonymous flag that enforces restrictions on the other settings: it requires --only-tor, it looks at the server settings and complains if it sees non-onion hints in --location or non-localhost listening ports in --port.
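The --anonymous checks described above might be sketched as follows. This is an illustration only: `anonymity_problems` is a hypothetical name. Treating `interface=127.0.0.1` and `unix:` endpoints as the only "localhost" listeners is an assumption, and so is accepting only `tor:XYZ.onion:PORT` as an onion hint.

```python
def anonymity_problems(location, port, only_tor):
    """Return a list of complaints; an empty list means the settings look anonymous."""
    problems = []
    if not only_tor:
        problems.append("--anonymous requires --only-tor")
    for hint in location.split(","):
        hint = hint.strip()
        if not hint:
            continue
        parts = hint.split(":")
        # Only hints of the form tor:XYZ.onion:PORT are accepted.
        is_onion = len(parts) == 3 and parts[0] == "tor" and parts[1].endswith(".onion")
        if not is_onion:
            problems.append("non-onion location hint: %s" % hint)
    # Assumption: a listener is local if it is a unix socket or bound to 127.0.0.1.
    is_local = port.startswith("unix:") or "interface=127.0.0.1" in port.split(":")
    if not is_local:
        problems.append("listening port is not restricted to localhost: %s" % port)
    return problems
```

Returning every complaint at once, rather than raising on the first, lets the command report all the settings that need fixing in a single run.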

The client-like commands will also take something like --socks-port, which registers a foolscap connection handler (for both tcp and tor) that sends everything to the given SOCKS server. This might be a Tor instance (in which case both tcp:HOST:PORT and tor:XYZ.onion:PORT will work), or it might be a plain SOCKS daemon (so tor: won't work).

All commands will accept --tor-controlport=endpoint, which means txtorcon ought to talk to the pre-existing Tor instance at that control port, rather than launching its own. This gets written into tahoe.cfg somehow.

It might be a good idea to have the node creation commands, when asked to use tor, check to see if it will actually work. That means spinning up a Tor executable (or connecting to the --tor-controlport= provided) and seeing if Tor can get the full descriptor list. Maybe even making a connection to a tahoe-lafs.org-hosted onion service to make sure it works.

And looking towards the future, meejah had a great idea about a sort of "setup wizard". tahoe create-node --web could run a server and pop open your web browser to a control panel. It could then run ifconfig and see if your box seemed to have a public IP address, or try to get a UPnP port, or provide instructions for setting up a firewall port-forwarding, etc. Then it could ask the user for permission to confirm connectivity, by having a service on tahoe-lafs.org connect back down to the purported address.

In the end, we really want running a server to be as easy as running a client, but the networking world in which we live doesn't make that trivial. I think --hostname= is about the simplest possible tool (but only works if you have a public-IP VPS server of some sort), and --location= --port= is the second-simplest tool.

comment:5 Changed at 2016-05-05T01:38:45Z by Brian Warner <warner@…>

In f57d1e9/trunk:

Merge branch '2773-stats'

This changes "tahoe create-stats-gatherer" to take --hostname,
--location, and --port, according to (refs ticket:2773).

comment:6 Changed at 2016-05-05T01:40:17Z by warner

  • Milestone changed from undecided to 1.12.0
  • Owner set to warner
  • Status changed from new to assigned

comment:7 Changed at 2016-05-05T02:04:28Z by warner

BTW, that 2773-stats patch fixed the last of our "pending deprecations": warnings we'd get from running the test suite against the latest trunk version of Foolscap and Twisted.

comment:8 Changed at 2016-06-28T18:20:37Z by warner

  • Milestone changed from 1.12.0 to 1.13.0

moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders

comment:9 Changed at 2016-06-30T03:57:53Z by warner

  • Description modified (diff)

comment:10 Changed at 2016-09-02T16:55:10Z by warner

At some point on IRC, I think we (maybe str4d?) suggested we use --listen=tor / --listen=i2p, instead of --listen-tor / --listen-i2p. Talking with dawuud today, we thought of:

  • --listen=tcp (the default for create-node and create-server)
  • --listen=none (the default for create-client)
  • --listen=tor
  • --listen=i2p
  • --listen=tor,i2p,tcp (arbitrary combinations, except for none)

although I don't think we should feel obligated to support automatically-configured multiple listeners on the first step.
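A small sketch of how such a --listen= value might be parsed, assuming the values listed above are the only ones accepted. `parse_listen` and `KNOWN_LISTENERS` are hypothetical names.

```python
KNOWN_LISTENERS = {"tcp", "tor", "i2p"}


def parse_listen(value):
    """Turn a --listen= value into a set of listener names (empty set for 'none')."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    if not names:
        raise ValueError("--listen= needs at least one value")
    if "none" in names:
        # "none" means no listeners at all, so it cannot be combined with others.
        if len(names) > 1:
            raise ValueError("--listen=none cannot be combined with other listeners")
        return set()
    unknown = set(names) - KNOWN_LISTENERS
    if unknown:
        raise ValueError("unknown listener(s): %s" % ", ".join(sorted(unknown)))
    return set(names)
```

A first implementation could accept only single values and reject combinations, with multiple listeners added once automatic setup for them exists.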

comment:11 Changed at 2016-09-02T17:02:20Z by warner

The first step is probably to change the create-node functions to be async, which means changing runner.py to accept a Deferred back from the dispatch function, spin up a reactor, and wait for the Deferred to finish before shutting it down and returning the exit code. Twisted has a utility for this, twisted.internet.task.react(), which I'm using in magic-wormhole, although it calls sys.exit(rc) itself, so we might only want to use it in the run-by-human=true path.

Second step is to add --location/--port. Third is to add --listen=tcp. Fourth is to implement --listen=tor or i2p.

Although we could put off the async create-node until just before implementing --listen=tor, because we've got a synchronous allmydata.util.iputil.allocate_tcp_port().

comment:12 Changed at 2016-09-02T17:23:44Z by warner

  • Milestone changed from 1.13.0 to 1.12.0

It's looking like we're pulling this into 1.12, since it's turning into the "private-IP AND magic-folders" release. If it looks like trouble, we can push it back out again.

comment:13 Changed at 2016-09-03T00:08:32Z by warner

Note: #2490 is about --listen=tor/i2p and automatic server setup. This ticket (#2773) is just about --port/--location/--hostname, the non-automatic setup. #2773 may include --listen=tcp and/or --listen=none, but should not block on the async-ification of node creation, or on anything involving onion server creation.

1.12 will block on #2773, since I think we want those setup functions to make manually-configured tahoe servers easier to build. But I'm going to leave #2490 in the next (1.13) milestone, unless it happens to get finished early (in which case, yay! 1.12 is even cooler).

So I'm going to suggest that the order of operations is:

  • 1: add --location=/--port=
  • 2: add --listen=tcp (as a default), and --hostname=
  • 3: close #2773
  • 4: make create-node functions async
  • 5: implement --listen=tor, --listen=i2p
  • 6: close #2490

comment:14 Changed at 2016-09-06T15:12:01Z by dawuud

this PR comes pretty close to the above steps 1 & 2: https://github.com/tahoe-lafs/tahoe-lafs/pull/336

the tests don't pass. not sure why.

comment:15 Changed at 2016-09-08T14:58:36Z by daira

The create-node command requires --hostname=localhost to be added to get the previous behaviour. I know that often Explicit Is Better Than Implicit, but I'm not clear on what the advantage is of requiring this.

comment:16 Changed at 2016-09-08T16:55:19Z by warner

Hm.. the previous behavior was that tahoe create-node (with no arguments) would 1: listen on all interfaces, 2: auto-detect the host's IP addresses, 3: put all detected IP addresses in the Tub location hints. So you'd wind up with the equivalent of e.g. tub.port = tcp:12345 ; tub.location = tcp:127.0.0.1:12345,tcp:8.8.8.8:12345.

I believe we decided that automatic IP address detection was bad/confusing/unhelpful, and automatic detection of externally-unreachable IPs like 127.0.0.1 and RFC1918 addresses was nigh-useless. (I originally pushed back against this, because I run test networks on my local box all the time, but I finally agreed that the majority use case is more important).

So I think we agreed that running a server on a box with a public IP can/should be one argument more verbose: you have to know enough about networking to know the hostname (or public IP address) of that box, and you provide it when setting up the node.

So as far as use cases go:

  • no server? run tahoe create-client and none of this matters, no extra arguments are needed
  • server on public host? run tahoe create-node --hostname=MYHOSTNAME and you'll wind up with an allocated TCP port and a matching tub.port/tub.location
  • server behind a firewall? figure out port-forwarding and external IP/hostname, run tahoe create-node --port=tcp:LOCALPORT --location=tcp:EXTERNALIP:EXTERNALPORT
  • even more unusual setup? use --port= --location= to make it work
  • need multiple location hints? use --port= --location=

and then when #2490 lands, we add:

  • server behind Tor? run tahoe create-node --listen=tor (or i2p) to automatically set up the .onion/.i2p address

comment:17 Changed at 2016-09-10T02:15:38Z by Brian Warner <warner@…>

  • Resolution set to fixed
  • Status changed from assigned to closed

In a8899c8/trunk:

Merge branch '2773-listen-port'

This adds several arguments to "tahoe create-node" and
create-introducer:

  • --location=/--port=: always provided as a pair, directly set the listening port and the advertised location
  • --hostname=: provides the node's hostname so it doesn't have to crawl the network interfaces for IP addresses, when listening on TCP
  • --listen=: can only be "tcp" for now, but will soon be the way to enable automatic listener setup for Tor and I2P services

This is a rebased and cleaned-up version of #336, which fixes a bunch of
tests, and simplifies the argument validation slightly.

closes tahoe-lafs/tahoe-lafs#336
closes ticket:2773

comment:18 Changed at 2016-10-09T06:11:26Z by Brian Warner <warner@…>

In 5a195e2/trunk:

Merge branch '2490-tor-2'

This adds --listen=tor to create-node and create-server, along with
.onion-address allocation at creation time, and onion-service
starting (launching or connecting to tor as necessary) at node startup
time.

closes ticket:2490
refs ticket:2773
refs ticket:1010
refs ticket:517
