[tahoe-lafs-trac-stream] [tahoe-lafs] #2017: non-deterministic test hang on OpenBSD
tahoe-lafs
trac at tahoe-lafs.org
Sat Jul 6 15:16:56 UTC 2013
#2017: non-deterministic test hang on OpenBSD
--------------------------------------+---------------------------
Reporter: zooko | Owner: sickness
Type: defect | Status: new
Priority: normal | Milestone: undecided
Component: code | Version: 1.10.0
Keywords: iputil heisenbug openbsd | Launchpad Bug:
--------------------------------------+---------------------------
sickness's !OpenBSD buildslave showed a test timeout:
{{{
===============================================================================
[ERROR]
Traceback (most recent call last):
Failure: twisted.internet.defer.TimeoutError:
<allmydata.test.test_runner.RunNode testMethod=test_client_no_noise>
(test_client_no_noise) still running at 240.0 secs
allmydata.test.test_runner.RunNode.test_client_no_noise
===============================================================================
[ERROR]
Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was
unclean.
DelayedCalls: (set twisted.internet.base.DelayedCall.debug = True to
debug)
<DelayedCall 0x816eb82c [0.00169348716736s] called=0 cancelled=0
LoopingCall<0.01>(RunNode._poll, *(<function _node_has_started at
0x7ff29ed4>, 1373030506.664452), **{})()>
allmydata.test.test_runner.RunNode.test_client_no_noise
-------------------------------------------------------------------------------
Ran 1139 tests in 1784.336s
FAILED (skips=15, expectedFailures=3, errors=2, successes=1120)
}}}
(from https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/sickness%20OpenBSD%205.0%20x86%20py2.7/builds/27)
Rerunning the tests with the exact same build (using Buildbot's "force
rebuild" feature) resulted in success:
https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/sickness%20OpenBSD%205.0%20x86%20py2.7/builds/28
In that run (build number 28), test_client and test_client_no_noise each
finished in under 20 seconds:
{{{
19.917 seconds: allmydata.test.test_runner.RunNode.test_client
}}}
{{{
13.758 seconds: allmydata.test.test_runner.RunNode.test_client_no_noise
}}}
(from https://tahoe-lafs.org/buildbot-tahoe-lafs/builders/sickness%20OpenBSD%205.0%20x86%20py2.7/builds/28/steps/test/logs/timings)
So there is a non-deterministic bug that shows up on sickness's buildslave
and causes test_client_no_noise to hang (both errors in the log above are
attributed to that test).
Questions:
1. Does this happen on any other buildslaves?
2. Did this ever happen before the recent patches which changed the
behavior of iputil — [b0883807361830c609dff1677c3cb34fd64d3ebb],
[f97b8e5e1df75284aa9b89dd830f8728040eab67],
[08590b1f6a880d51751fdcacea6a007ebc568f2e],
[16b245563db2f6ca71b9332b06debbe3e1d734b4],
[b31a4f6e870cb56efa40c785a868a944b964e8b9],
[a493ee0bb641175ecf918e28fce4d25df15994b6],
[6104950ed8a7a356eed2218f2df958d074022eea],
[f77ec470d75f4b8fb81b1abca4ee3b73f1ad8b22],
[8e31d66cd0b0821ccaa2c7c259e7d6f262ad4738],
[6a445d73bc5253ec4ae0dec70af02e33bc869cf6]?
I suspect those iputil patches of causing this hang.
sickness: could you please run the unit tests from the current trunk
version repeatedly with trial's {{{--until-failure}}} option, like this:
{{{./bin/tahoe debug trial --until-failure allmydata.test}}}? (See
[wiki:HowToWriteTests] for more options.) If you can reliably reproduce
the problem, could you then use git to check out the revision just before
those patches and see whether the problem goes away? Thanks!
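In case it helps with the repeated runs, here is a minimal sketch of a wrapper that reruns one command until it fails or hangs. It is an assumption-laden illustration, not part of Tahoe-LAFS: the name {{{run_until_failure}}} and its parameters are hypothetical, it is written for Python 3 as a standalone harness (separate from the Python 2.7 used by the buildslave), and it kills only the direct child process on timeout, so node processes spawned by the test might be left behind.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=20, timeout=300.0):
    """Run cmd repeatedly and report the first run that fails.

    Returns None if all max_runs runs exit with status 0. Otherwise returns
    (run_number, reason), where reason is "hang" if the run exceeded
    timeout seconds, or "exit N" for a non-zero exit status N.
    """
    for run in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the timeout elapses.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            return (run, "hang")
        if result.returncode != 0:
            return (run, "exit %d" % result.returncode)
    return None


def main():
    # Hypothetical invocation: rerun only the test that timed out above.
    cmd = ["./bin/tahoe", "debug", "trial",
           "allmydata.test.test_runner.RunNode.test_client_no_noise"]
    outcome = run_until_failure(cmd)
    print("all runs passed" if outcome is None else "run %d: %s" % outcome)
    # Exit 0 when every run passed and 1 otherwise.
    sys.exit(0 if outcome is None else 1)
```

Calling {{{main()}}} from a script gives an exit status of 0 when every run passed and 1 otherwise, which is the convention {{{git bisect run}}} expects, so the same script could drive a bisection across the iputil patches once a known-good revision has been identified.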
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2017>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage