[tahoe-lafs-trac-stream] [Tahoe-LAFS] #3412: Many tests are flaky

Tahoe-LAFS trac at tahoe-lafs.org
Fri Oct 9 19:38:00 UTC 2020


#3412: Many tests are flaky
-------------------------+------------------------------
     Reporter:  jaraco   |      Owner:  jaraco
         Type:  defect   |     Status:  assigned
     Priority:  normal   |  Milestone:  Support Python 3
    Component:  unknown  |    Version:  n/a
   Resolution:           |   Keywords:
Launchpad Bug:           |
-------------------------+------------------------------

Comment (by jaraco):

 Here is some of the conversation about this effort:

 {{{
 [2020-09-25 11:33:39] <jaraco> Is there a way for a decorator on a method
 to detect all of the deferreds that were created during the method and
 wait on all of them to succeed and retry if any of them fail?
 [2020-09-25 11:37:02] <jaraco> I’ve written
 https://github.com/jaraco/trial-retry/blob/13499eb3a5d9e8d068c28b3c40e52ca71a56a13a/test_everything.py#L40-L71
 to achieve that for a single deferred in the return value.
 [2020-09-25 11:38:30] <jaraco> Except, even that approach won’t work
 because it’s only retrying the callback. It’s not retrying the test.
 [2020-09-25 11:58:17] <itamarst> there's gather_results
 [2020-09-25 11:58:36] <itamarst> but...
 [2020-09-25 11:58:54] <itamarst> I'm not sure why you can't retry whole
 method
 [2020-09-25 11:59:19] <itamarst> oh, you don't want to mess with internal
 list
 [2020-09-25 11:59:52] <itamarst> you want to addErrback a function that,
 for N times, returns the result of the original decorated function
 [2020-09-25 12:00:50] <itamarst> a single retry would look like this:
 `return result.addErrback(lambda _: f(*args, **kwargs))`
 [2020-09-25 12:01:35] <itamarst> a naive "retry 3 times" is `return
 result.addErrback(lambda _: f(*args, **kwargs)).addErrback(lambda _:
 f(*args, **kwargs)).addErrback(lambda _: f(*args, **kwargs))`
 [2020-09-25 12:01:53] <itamarst> you don't want to mess with `.callbacks`
 at all, just call `addErrback`
 [2020-09-25 12:10:33] <jaraco> The problem is that the deferreds aren’t
 always returned by the method.
 [2020-09-25 12:16:48] <jaraco> Such as in `test_status_path_404_error` -
 how does one attach an errback to the “up” call?
 [2020-09-25 12:20:40] <jaraco> Also, adding errback isn’t enough to
 suppress the failure. You want to trap exceptions in the callback and
 errbacks of the deferreds created.
 [2020-09-25 12:25:53] <itamarst> if isinstance(result, Deferred): result =
 result.addErrback(lambda _: [_.trap(), f(*args, **kwargs)][0])
 [2020-09-25 12:25:56] <itamarst> return result
 [2020-09-25 12:26:04] <itamarst> should be [1]
 [2020-09-25 12:26:11] <itamarst> and really, shouldn't be lambda should be
 real function
 [2020-09-25 12:26:16] <itamarst> but easier to do one liners in IRC :)
 [2020-09-25 12:27:39] <itamarst> and it's probably not trap(), it's
 probably something with self.expectedFailures or something
 [2020-09-25 12:27:45] <itamarst> I forget the API name
 [2020-09-25 12:27:52] <itamarst> but the basic structure is as above, just
 addErrback
 [2020-09-25 12:28:07] <itamarst> Deferreds get chained automatically
 [2020-09-25 12:28:26] <itamarst> so you just need to suppress the "logged
 exceptions get marked as failures logic"
 [2020-09-25 12:28:31] <itamarst> that Twisted's TestCase adds
 [2020-09-25 12:28:43] <itamarst> back later if you want to pair
 [2020-09-25 13:18:10] <exarkun> jaraco: I'm not sure this is going to be a
 fruitful avenue
 [2020-09-25 13:18:27] <exarkun> jaraco: If just retrying the test method
 isn't sufficient then the test is probably so broken it needs to be
 rewritten
 [2020-09-25 13:19:09] <exarkun> jaraco: Clobbering `Deferred.callbacks` is
 pretty much guaranteed to make it more broken rather than less
 [2020-09-25 13:29:48] <itamarst> jaraco: if the worry is "what if someone
 does addCallback after the return", that will break the normal test
 infrastructure too, so you don't need to handle that case
 }}}

 I've thought about this some more and had some ideas. For example, what if
 the asynchronous test could be made synchronous and then retried? That
 doesn't work because the event loop is already created for setup/teardown.

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3412#comment:18>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

