#2850 new defect

ugly (temporary) error message when connecting to new onion server

Reported by: warner Owned by:
Priority: normal Milestone: soon
Component: code-frontend-web Version: 1.11.0
Keywords: Cc:
Launchpad Bug:

Description

While running some smoketests on 1.12, I spun up a new onion-based server (tahoe create-node --listen=tor) on an existing grid, while watching a client on that grid.

I think the client hears about the new service before it's really ready to listen, because for the first 10 or 15 seconds, the client's Welcome page displays an ugly traceback squeezed into the "Connection" column (screenshots attached). The full traceback (obtained from the flogtool log) was:

21:56:02.363 L20 []#132 TubConnector created from p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj
21:56:02.363 L20 []#133 considering hint: tor:7uq4kbwg5cbiwlhv.onion:3457
21:56:02.363 L20 []#134 connecting to hint: tor:7uq4kbwg5cbiwlhv.onion:3457
21:56:03.671 L28 []#135
 FAILURE:
 [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last):
   File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 498, in errback
     self._startRunCallbacks(fail)
   File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 565, in _startRunCallbacks
     self._runCallbacks()
   File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 651, in _runCallbacks
     current.result = callback(current.result, *args, **kw)
   File "/home/warner/venv/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1355, in gotResult
     _inlineCallbacks(r, g, deferred)
 --- <exception caught here> ---
   File "/home/warner/v

 -- TRACEBACK ELIDED --

 receive(data)
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/tube.py", line 41, in receive
     status = self._interp.receive(data)
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 48, in receive
     for x in self.next:
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 177, in apply
     for x in self._apply(f, ruleName, argvals):
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 110, in _apply
     for x in rule():
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 256, in parse_Or
     for x in self._eval(subexpr):
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 241, in parse_And
     for x in self._eval(subexpr):
   File "/home/warner/venv/local/lib/python2.7/site-packages/ometa/interp.py", line 440, in parse_Action
     val = eval(expr.data, self.globals, self._localsStack[-1])
   File "<string>", line 1, in <module>

   File "/home/warner/venv/local/lib/python2.7/site-packages/txsocksx/client.py", line 153, in serverResponse
     raise e.socks5ErrorMap.get(status)()
 txsocksx.errors.HostUnreachable:
 ]
21:56:03.671 L20 []#136 connectorFinished (<foolscap.connection.TubConnector object at 0x7fe0acf919d0 from p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj>)

After about 15 or maybe 30 seconds, the server became reachable, and the error message cleared up. I believe the client was using the SOCKS port of the local Tor daemon, so the delay probably wasn't because the client was waiting for a local Tor to launch, nor waiting for a control connection to the local Tor.

I think the fix for this will be to have the foolscap Tor handler map this txsocksx.errors.HostUnreachable to something more concise, like just "onion server unreachable".

It's possible that we could do something on the server side too. The server was using a control port on the system Tor daemon to register the onion service. It will have published the .onion address as soon as it started (just after establishing a Tor-mediated connection to the Introducer), but it tells Tor to set up the server at startup too, and that process probably didn't finish by the time the client heard about the address. The .onion address is generated during tahoe create-node, and written into tahoe.cfg, so it's available very early, before the Tor control connection is even started.

So maybe we want to change the server to wait for some sort of ACK from the tor_provider, telling us that the Tor daemon says the onion service descriptors and rendezvous points are ready to go, before allowing the IntroducerClient to publish the storage server. That might help with the first launch of the server. It won't help with later launches, when the Introducer (and clients) already have the announcement, and they're busy trying+retrying to connect, but there's nothing we can do about that.

Attachments (2)

onion-connect-error.png (328.1 KB) - added by warner at 2016-12-17T22:19:01Z.
error during connect
onion-connect-success.png (185.2 KB) - added by warner at 2016-12-17T22:19:17Z.
successful connect

Download all attachments as: .zip

Change History (4)

Changed at 2016-12-17T22:19:01Z by warner

error during connect

Changed at 2016-12-17T22:19:17Z by warner

successful connect

comment:1 Changed at 2020-06-30T14:45:13Z by exarkun

  • Milestone changed from 1.13.0 to 1.15.0

Moving open issues out of closed milestones.

comment:2 Changed at 2021-03-30T18:40:19Z by meejah

  • Milestone changed from 1.15.0 to soon

Ticket retargeted after milestone closed

Note: See TracTickets for help on using tickets.