#2837 new defect

create-node --listen=tor hangs with tor-0.2.8.8

Reported by: warner
Owned by:
Priority: normal
Milestone: undecided
Component: code-network
Version: 1.11.0
Keywords: anonymity tor
Cc: meejah
Launchpad Bug:

Description

After finishing off #2490, I noticed during testing that tahoe create-node --listen=tor consistently hangs on one of my test machines (which is running debian/sid, with tor-0.2.8.8 git-8d8a099454d994bd). This happens when the system Tor process has been running for a while, at least a few days. If I bounce the Tor process, then create-node finishes correctly in the 30-40 seconds that I expect it to take.

This doesn't happen on an Ubuntu-16.04 box (running tor-0.2.7.6 git-605ae665009853bd). Both machines are using txtorcon-0.17.0. I'm guessing that there's something broken with the Tor on my sid box, but maybe there's something about the tor-control-port command stream in the more recent Tor that's confusing txtorcon.

Meejah suggested a patch like this to turn on command-stream debugging:

# temporary debugging patch: turn on txtorcon's command-stream debug logging
from txtorcon.log import debug_logging
debug_logging()

and with that, I found differences between the two command streams. They're identical (modulo the random auth-cookie) through the following commands and their responses:

cmd: AUTHCHALLENGE SAFECOOKIE [cookie]
cmd: AUTHENTICATE [cookie]
cmd: GETINFO signal/names
cmd: GETINFO version
Connected to a Tor with VERSION [version]
cmd: GETINFO events/names
cmd: USEFEATURE EXTENDED_EVENTS
cmd: GETINFO ns/all
6278 named routers found.
[list of duplicates]
2494 GUARDs

At that point, both do a cmd: GETINFO circuit-status. The working case (with 0.2.7.6) gets back a bunch of circuit_(new|extend|built) responses, then does a series of GETINFO ip-to-country/[ipaddr] commands, then a GETINFO stream-status. The hanging case sends the circuit-status but never sees the circuit_* messages, and goes directly to the GETINFO stream-status.
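
For anyone repeating this comparison on captured logs, here is a minimal sketch of finding the first command at which two streams diverge. It assumes each command appears on its own line prefixed with "cmd: " (as in the excerpt above) and that only the AUTHCHALLENGE and AUTHENTICATE arguments differ randomly between runs. The names extract_commands and first_divergence are hypothetical helpers, not part of txtorcon or Tahoe.

```python
from itertools import zip_longest

# Assumption: commands show up in the debug log as lines starting with "cmd: ".
CMD_PREFIX = "cmd: "
# Assumption: only these commands carry per-session random values (the cookie).
AUTH_COMMANDS = ("AUTHCHALLENGE", "AUTHENTICATE")


def extract_commands(log_text):
    """Return the control-port commands in a log, with auth arguments masked."""
    commands = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(CMD_PREFIX):
            continue  # skip replies, notifications, and other log output
        command = line[len(CMD_PREFIX):]
        verb = command.split(" ", 1)[0]
        if verb in AUTH_COMMANDS:
            command = verb + " [masked]"
        commands.append(command)
    return commands


def first_divergence(log_a, log_b):
    """Return (index, cmd_a, cmd_b) for the first differing command, or None."""
    pairs = zip_longest(extract_commands(log_a), extract_commands(log_b))
    for index, (cmd_a, cmd_b) in enumerate(pairs):
        if cmd_a != cmd_b:
            return index, cmd_a, cmd_b
    return None
```

Calling first_divergence(working_log, hanging_log) on the text of the two logs would report the position and the two differing commands. If one log is simply shorter, the missing side is reported as None.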

I don't know if this debugging includes the actual responses to each command, or if it's just logging async notifications.

Change History (2)

comment:1 Changed at 2016-10-09T17:17:31Z by meejah

All the "600"-level responses are async notifications (i.e. all the circuit_* etc stuff) -- so it sort of seems like Tor "isn't doing stuff" in the hanging case (or at least: not creating new circuits).

You can also try something like:

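# control_proto is assumed to be an already-connected txtorcon TorControlProtocol
def log_msg(msg):
    # print each log message that Tor emits
    print("Tor: {}".format(msg))
# subscribe to Tor's INFO- and DEBUG-level log events
control_proto.add_event_listener("INFO", log_msg)
control_proto.add_event_listener("DEBUG", log_msg)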
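
To separate command replies from async notifications in a raw control-port capture, a rough sketch like the following may help. It relies on the control protocol's three-digit status codes: 2xx for success, 4xx and 5xx for errors, and 6xx for asynchronous events. classify_reply is a hypothetical helper, and it assumes you have the raw reply lines available, which the txtorcon debug log may or may not include.

```python
def classify_reply(line):
    """Classify one control-port reply line by its three-digit status code."""
    code = line[:3]
    if len(code) != 3 or not code.isdigit():
        return "unknown"  # not a reply line (e.g. a logged "cmd: ..." line)
    if code[0] == "2":
        return "success"
    if code[0] in "45":
        return "error"
    if code[0] == "6":
        return "async-event"  # e.g. "650 CIRC ..." circuit notifications
    return "unknown"
```

If no line in the hanging capture classifies as "async-event" after GETINFO circuit-status, that would be consistent with Tor not building new circuits.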

comment:2 Changed at 2016-10-09T19:25:10Z by warner

debian#835119 and tor#19969 might be related.
