#1852 closed defect (wontfix)

test suite: find a way to prevent unclean reactor errors causing subsequent test failures

Reported by: davidsarah Owned by: daira
Priority: normal Milestone: soon
Component: code Version: 1.9.2
Keywords: test twisted-trial unclean reactor timeout hang Cc: zooko
Launchpad Bug:

Description (last modified by zooko)

For example, in the 666-accounting branch, if test_mutable.Filenode.test_modify_backoffer is reenabled, it will fail with an unclean reactor error and subsequent tests will time out.

Probably this is due to some non-obvious sharing of global state between tests (it doesn't always happen that an unclean reactor error causes subsequent timeouts).

In any case, it makes it difficult to run the test suite the whole way through when some tests cause this problem. One rather brute-force solution would be to exit on an unclean reactor error and restart the test process from the next test. I can't see a way to do that without forking trial, though. Another possibility is to try to use twisted.test.proto_helpers.MemoryReactor.

Change History (17)

comment:1 Changed at 2012-11-10T01:52:57Z by davidsarah

It seems like the unclean reactor errors in our test suite are most often due to DelayedCalls. I wonder whether it's possible to automatically wait (until a timeout) for all DelayedCalls to have completed.

comment:2 Changed at 2012-11-10T02:36:53Z by davidsarah

Yes, it is:

import time

from twisted.internet import reactor

from allmydata.util import log
from allmydata.util.pollmixin import PollMixin


class WaitForDelayedCallsMixin(PollMixin):
    def _delayed_calls_done(self):
        # We're done when the only remaining DelayedCalls fire after threshold.
        # (These will be associated with the test timeout.)
        threshold = time.time() + 60
        for delayed in reactor._pendingTimedCalls + reactor._newTimedCalls:
            if delayed.getTime() < threshold:
                return False
        return True

    def wait_for_delayed_calls(self, res):
        d = self.poll(self._delayed_calls_done)
        d.addErrback(log.err, "error while waiting for delayed calls")
        d.addBoth(lambda ign: res)
        return d

comment:3 Changed at 2012-11-11T14:54:44Z by davidsarah

A variation of the above code is now in allmydata.util.deferredutil on my 666-accounting branch. It works well to avoid some spurious unclean reactor errors in tests that are known to fail for that reason, but it doesn't solve the general problem that such errors cause timeouts in later tests.

comment:4 Changed at 2012-11-22T23:25:02Z by zooko

See also clean_pending(), which I think accomplishes a similar result without the delay. I've used it successfully in the past, but not recently.

comment:5 Changed at 2012-11-22T23:29:23Z by zooko

We should figure out whether WaitForDelayedCallsMixin and pyutil's clean_pending() are each sufficient to solve our problem with errors in unit tests causing later unit tests to spurious fail-or-error, and if so which approach is better, and we should deprecate the other one.

comment:6 Changed at 2012-11-22T23:29:42Z by zooko

  • Cc zooko added

comment:7 follow-up: Changed at 2012-11-22T23:43:25Z by davidsarah

Ah, I did not know about the reactor.getDelayedCalls method used by clean_pending. I think clean_pending does not quite do the right thing, though:

  • I want the delay, because that's more likely to avoid problems with subsequent tests. Cancelling the delayed calls might still leave other reactor resources unclean.
  • I think clean_pending will cancel the Deferred that implements the test timeout, or any others that are internal to Twisted Trial. (There weren't any others when I tested, but that's an implementation detail.)

I'll see whether reactor.getDelayedCalls can be used to replace the use of internal attributes of reactor in WaitForDelayedCalls.

Last edited at 2012-11-22T23:59:26Z by davidsarah (previous) (diff)

comment:8 in reply to: ↑ 7 Changed at 2012-11-22T23:56:58Z by davidsarah

Replying to davidsarah:

I'll see whether reactor.getDelayedCalls can be used to replace the use of internal attributes of reactor in WaitForDelayedCalls.

It can; in fact its implementation does exactly the right thing:

def getDelayedCalls(self):
    """Return all the outstanding delayed calls in the system.
    They are returned in no particular order.
    This method is not efficient -- it is really only meant for
    test cases."""
    return [x for x in (self._pendingTimedCalls + self._newTimedCalls) if not x.cancelled]
Last edited at 2012-11-22T23:59:51Z by davidsarah (previous) (diff)

comment:10 follow-up: Changed at 2012-12-02T01:36:19Z by davidsarah

clean_pending is not just in pyutil; it's also in src/allmydata/test/common_util.py's TestMixin. So we should probably merge WaitForDelayedCalls into that, but TextMixin also contains code I don't understand relating to repeatable_random, so I'm not going to attempt that merge now.

Assigning to zooko to explain what TestMixin.setUp is trying to do (I can't decipher the comment, which appears to be missing some words).

Last edited at 2014-06-25T19:39:48Z by zooko (previous) (diff)

comment:11 Changed at 2012-12-02T01:37:19Z by davidsarah

  • Milestone changed from undecided to 1.11.0
  • Owner changed from davidsarah to zooko

comment:12 Changed at 2012-12-02T01:41:39Z by davidsarah

Note that TestMixin and WaitForDelayedCalls also differ in that inheriting TestMixin causes all test methods in the class to cancel delayed calls, whereas WaitForDelayedCalls only waits in methods that use d.addBoth(self.wait_for_delayed_calls).

comment:13 Changed at 2014-06-25T19:46:43Z by zooko

  • Description modified (diff)

comment:14 in reply to: ↑ 10 ; follow-up: Changed at 2014-06-25T19:47:16Z by zooko

Replying to davidsarah:

Assigning to zooko to explain what TestMixin.setUp is trying to do (I can't decipher the comment, which appears to be missing some words).

daira: the other thing the TestMixin does replaces os.urandom with a deterministic RNG in order to make tests more reproducible. It isn't currently used, so let's remove that functionality.

Also, yes, the doc is missing a word.

comment:15 Changed at 2014-06-25T19:47:23Z by zooko

  • Owner changed from zooko to daira

comment:16 in reply to: ↑ 14 Changed at 2014-07-24T00:18:59Z by daira

Replying to zooko:

daira: the other thing the TestMixin does replaces os.urandom with a deterministic RNG in order to make tests more reproducible. It isn't currently used, so let's remove that functionality.

+1. Filed as #2266.

comment:17 Changed at 2020-01-13T20:26:42Z by exarkun

  • Resolution set to wontfix
  • Status changed from new to closed

Instead, fix tests that fail with UncleanReactorError?. And stop using trial's TestCase? and use testools' instead.

Note: See TracTickets for help on using tickets.