[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2850: ugly (temporary) error message when connecting to new onion server

Tahoe-LAFS trac at tahoe-lafs.org
Sat Dec 17 22:18:32 UTC 2016


#2850: ugly (temporary) error message when connecting to new onion server
-------------------------------+------------------------
 Reporter:  warner             |          Owner:
     Type:  defect             |         Status:  new
 Priority:  normal             |      Milestone:  1.13.0
Component:  code-frontend-web  |        Version:  1.11.0
 Keywords:                     |  Launchpad Bug:
-------------------------------+------------------------
 While running some smoketests on 1.12, I spun up a new onion-based server
 (`tahoe create-node --listen=tor`) on an existing grid, while watching a
 client on that grid.

 I think the client hears about the new service before it's really ready to
 listen, because for the first 10 or 15 seconds, the client's Welcome page
 displays an ugly traceback squeezed into the "Connection" column
 (screenshots attached). The full traceback (obtained from the flogtool
 log) was:

 {{{
 21:56:02.363 L20 []#132 TubConnector created from
 p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj
 21:56:02.363 L20 []#133 considering hint: tor:7uq4kbwg5cbiwlhv.onion:3457
 21:56:02.363 L20 []#134 connecting to hint:
 tor:7uq4kbwg5cbiwlhv.onion:3457
 21:56:03.671 L28 []#135
  FAILURE:
  [CopiedFailure instance: Traceback from remote host -- Traceback (most
 recent call last):
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/twisted/internet/defer.py", line 498, in errback
      self._startRunCallbacks(fail)
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/twisted/internet/defer.py", line 565, in _startRunCallbacks
      self._runCallbacks()
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/twisted/internet/defer.py", line 651, in _runCallbacks
      current.result = callback(current.result, *args, **kw)
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/twisted/internet/defer.py", line 1355, in gotResult
      _inlineCallbacks(r, g, deferred)
  --- <exception caught here> ---
    File "/home/warner/v

  -- TRACEBACK ELIDED --

  receive(data)
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/tube.py", line 41, in receive
      status = self._interp.receive(data)
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 48, in receive
      for x in self.next:
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 177, in apply
      for x in self._apply(f, ruleName, argvals):
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 110, in _apply
      for x in rule():
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 256, in parse_Or
      for x in self._eval(subexpr):
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 241, in parse_And
      for x in self._eval(subexpr):
    File "/home/warner/venv/local/lib/python2.7/site-
 packages/ometa/interp.py", line 440, in parse_Action
      val = eval(expr.data, self.globals, self._localsStack[-1])
    File "<string>", line 1, in <module>

    File "/home/warner/venv/local/lib/python2.7/site-
 packages/txsocksx/client.py", line 153, in serverResponse
      raise e.socks5ErrorMap.get(status)()
  txsocksx.errors.HostUnreachable:
  ]
 21:56:03.671 L20 []#136 connectorFinished
 (<foolscap.connection.TubConnector object at 0x7fe0acf919d0 from
 p4pmhkdfwp2hdrm33xgvyoiijzkkj6cm to likvz52mql57cxczpf4nnthfjsm4tngj>)
 }}}

 After about 15 or maybe 30 seconds, the server became reachable, and the
 error message cleared up. I believe the client was using the SOCKS port of
 the local Tor daemon, so the delay probably wasn't because the client was
 waiting for a local Tor to launch, nor waiting for a control connection to
 the local Tor.

 I think the fix for this will be to have the foolscap Tor handler map this
 {{{txsocksx.errors.HostUnreachable}}} to something more concise, like just
 "onion server unreachable".

 It's possible that we could do something on the server side too. The
 server was using a control port on the system Tor daemon to register the
 onion service. It will have published the .onion address as soon as it
 started (just after establishing a Tor-mediated connection to the
 Introducer), but it tells Tor to set up the server at startup too, and
 that process probably didn't finish by the time the client heard about the
 address. The .onion address is generated during `tahoe create-node`, and
 written into tahoe.cfg, so it's available very early, before the Tor
 control connection is even started.

 So maybe we want to change the server to wait for some sort of ACK from
 the tor_provider, telling us that the Tor daemon says the onion service
 descriptors and rendezvous points are ready to go, before allowing the
 !IntroducerClient to publish the storage server. That might help with the
 first launch of the server. It won't help with later launches, when the
 Introducer (and clients) already have the announcement, and they're busy
 trying+retrying to connect, but there's nothing we can do about that.

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2850>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage


More information about the tahoe-lafs-trac-stream mailing list