Opened at 2013-07-06T15:16:55Z
Last modified at 2020-01-13T20:36:24Z
#2017 closed defect
non-deterministic test hang on OpenBSD — at Initial Version
Reported by: | zooko | Owned by: | sickness |
---|---|---|---|
Priority: | normal | Milestone: | soon |
Component: | code | Version: | 1.10.0 |
Keywords: | iputil heisenbug openbsd test hang | Cc: | |
Launchpad Bug: |
Description
sickness's OpenBSD buildslave showed a test timeout:
    ===============================================================================
    [ERROR]
    Traceback (most recent call last):
    Failure: twisted.internet.defer.TimeoutError: <allmydata.test.test_runner.RunNode testMethod=test_client_no_noise> (test_client_no_noise) still running at 240.0 secs
    allmydata.test.test_runner.RunNode.test_client_no_noise
    ===============================================================================
    [ERROR]
    Traceback (most recent call last):
    Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
    DelayedCalls: (set twisted.internet.base.DelayedCall.debug = True to debug)
    <DelayedCall 0x816eb82c [0.00169348716736s] called=0 cancelled=0 LoopingCall<0.01>(RunNode._poll, *(<function _node_has_started at 0x7ff29ed4>, 1373030506.664452), **{})()>
    allmydata.test.test_runner.RunNode.test_client_no_noise
    -------------------------------------------------------------------------------
    Ran 1139 tests in 1784.336s
    FAILED (skips=15, expectedFailures=3, errors=2, successes=1120)
Rerunning the tests with the exact same build (using Buildbot's "force rebuild" feature) resulted in success.
In that run (build number 28), those tests took only a few seconds:
19.917 seconds: allmydata.test.test_runner.RunNode.test_client
13.758 seconds: allmydata.test.test_runner.RunNode.test_client_no_noise
So there is a non-deterministic bug, showing up on sickness's buildslave, that causes those two tests to hang.
Questions:
- Does this happen on any other buildslaves?
- Did this ever happen before the recent patches which changed the behavior of iputil — [b0883807361830c609dff1677c3cb34fd64d3ebb], [f97b8e5e1df75284aa9b89dd830f8728040eab67], [08590b1f6a880d51751fdcacea6a007ebc568f2e], [16b245563db2f6ca71b9332b06debbe3e1d734b4], [b31a4f6e870cb56efa40c785a868a944b964e8b9], [a493ee0bb641175ecf918e28fce4d25df15994b6], [6104950ed8a7a356eed2218f2df958d074022eea], [f77ec470d75f4b8fb81b1abca4ee3b73f1ad8b22], [8e31d66cd0b0821ccaa2c7c259e7d6f262ad4738], [6a445d73bc5253ec4ae0dec70af02e33bc869cf6]?
I suspect those iputil patches of causing this hang.
sickness: could you please run the unit tests from the current trunk version repeatedly with trial's --until-failure option?

    ./bin/tahoe debug trial --until-failure allmydata.test

(See HowToWriteTests for more options.) If you can reliably reproduce the problem, would you then use git to rewind to before those patches and see if that makes the problem go away? Thanks!
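If trial's own --until-failure loop is itself blocked by the hang, an outer wrapper with a hard timeout might help record which run went wrong. The following is only a sketch under assumptions: `run_until_failure` is a hypothetical helper written for this ticket, not part of Tahoe-LAFS or Twisted, and it targets Python 3's `subprocess.run`, independently of the interpreter the test suite itself runs under.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=50, timeout=None):
    """Run cmd repeatedly; stop at the first failure or hang.

    Hypothetical helper, not part of Tahoe-LAFS or Twisted.
    Returns None if every run exited with status 0, otherwise a tuple
    (run_number, reason) where reason is "failed" or "timed out".
    """
    for run in range(1, max_runs + 1):
        try:
            # timeout is in seconds; None means wait indefinitely.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills and reaps the direct child before
            # raising; processes spawned by that child may survive.
            return (run, "timed out")
        if result.returncode != 0:
            return (run, "failed")
    return None
```

For example, `run_until_failure(["./bin/tahoe", "debug", "trial", "allmydata.test"], max_runs=20, timeout=3600)` would repeat the suite up to 20 times and report the first run that fails or exceeds an hour; the run count and the timeout are placeholders to adjust for the buildslave. Because only the direct child is killed on timeout, any node processes started by the tests may need to be cleaned up by hand.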