[tahoe-lafs-trac-stream] [Tahoe-LAFS] #3945: Retry moody GitHub Actions steps

Tahoe-LAFS trac at tahoe-lafs.org
Sun Nov 27 02:21:29 UTC 2022


#3945: Retry moody GitHub Actions steps
--------------------------------+---------------------------
 Reporter:  sajith              |          Owner:  sajith
     Type:  task                |         Status:  new
 Priority:  normal              |      Milestone:  undecided
Component:  dev-infrastructure  |        Version:  n/a
 Keywords:                      |  Launchpad Bug:
--------------------------------+---------------------------
 Some workflows fail on !GitHub Actions either because the tests are moody
 or because !GitHub Actions itself is moody.  Example:
 https://github.com/tahoe-lafs/tahoe-lafs/actions/runs/3556042011/jobs/5973114477

 {{{
 2022-11-27T01:09:13.3236569Z [FAIL]
 2022-11-27T01:09:13.3236873Z Traceback (most recent call last):
 2022-11-27T01:09:13.3237795Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\allmydata\util\pollmixin.py", line 47, in _convert_done
 2022-11-27T01:09:13.3238340Z     f.trap(PollComplete)
 2022-11-27T01:09:13.3239166Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\twisted\python\failure.py", line 480, in trap
 2022-11-27T01:09:13.3244610Z     self.raiseException()
 2022-11-27T01:09:13.3245778Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\twisted\python\failure.py", line 504, in raiseException
 2022-11-27T01:09:13.3259779Z     raise self.value.with_traceback(self.tb)
 2022-11-27T01:09:13.3260719Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\twisted\internet\defer.py", line 206, in maybeDeferred
 2022-11-27T01:09:13.3261254Z     result = f(*args, **kwargs)
 2022-11-27T01:09:13.3261923Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\allmydata\util\pollmixin.py", line 69, in _poll
 2022-11-27T01:09:13.3262457Z     self.fail("Errors snooped, terminating early")
 2022-11-27T01:09:13.3262935Z twisted.trial.unittest.FailTest: Errors snooped, terminating early
 2022-11-27T01:09:13.3263257Z
 2022-11-27T01:09:13.3263547Z allmydata.test.test_system.SystemTest.test_upload_and_download_convergent
 2022-11-27T01:09:13.3263989Z ===============================================================================
 2022-11-27T01:09:13.3264288Z [ERROR]
 2022-11-27T01:09:13.3264609Z Traceback (most recent call last):
 2022-11-27T01:09:13.3265386Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\allmydata\util\rrefutil.py", line 26, in _no_get_version
 2022-11-27T01:09:13.3268422Z     f.trap(Violation, RemoteException)
 2022-11-27T01:09:13.3269217Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\twisted\python\failure.py", line 480, in trap
 2022-11-27T01:09:13.3269711Z     self.raiseException()
 2022-11-27T01:09:13.3270396Z   File "D:\a\tahoe-lafs\tahoe-lafs\.tox\py310-coverage\lib\site-packages\twisted\python\failure.py", line 504, in raiseException
 2022-11-27T01:09:13.3270976Z     raise self.value.with_traceback(self.tb)
 2022-11-27T01:09:13.3271553Z foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=4vg7) (during method=RIStorageServer.tahoe.allmydata.com:get_version)
 2022-11-27T01:09:13.3271977Z
 2022-11-27T01:09:13.3272448Z allmydata.test.test_system.SystemTest.test_upload_and_download_convergent
 2022-11-27T01:09:13.3272884Z ===============================================================================
 2022-11-27T01:09:13.3273207Z [ERROR]
 2022-11-27T01:09:13.3273530Z Traceback (most recent call last):
 2022-11-27T01:09:13.3274088Z Failure: foolscap.ipb.DeadReferenceError: Connection was lost (to tubid=4vg7) (during method=RIUploadHelper.tahoe.allmydata.com:upload)
 2022-11-27T01:09:13.3274512Z
 2022-11-27T01:09:13.3274802Z allmydata.test.test_system.SystemTest.test_upload_and_download_convergent
 2022-11-27T01:09:13.3275437Z -------------------------------------------------------------------------------
 2022-11-27T01:09:13.3275958Z Ran 1776 tests in 1302.475s
 2022-11-27T01:09:13.3276195Z
 2022-11-27T01:09:13.3276435Z FAILED (skips=27, failures=1, errors=2, successes=1748)
 }}}

 That failure has nothing to do with the changes that triggered that
 workflow; it might be a good idea to retry that step.

 Some other workflows take a long time to run.  Examples from
 https://github.com/tahoe-lafs/tahoe-lafs/actions/runs/3556042011/jobs/5973114477:
 `coverage (ubuntu-latest, pypy-37)`, `integration (ubuntu-latest, 3.7)`,
 and `integration (ubuntu-latest, 3.9)`.  Although in this specific instance
 the integration tests are failing due to #3943, it might be a good idea to
 time each attempt out after a reasonable interval, retry, and give up
 altogether after a fixed number of tries instead of spinning for many hours
 on end.
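
 To make the intended behaviour concrete, here is a minimal sketch in
 Python of "time out each attempt, retry, then give up".  It is only an
 illustration of the semantics, not how any particular action implements
 them.  The helper name `run_with_retries` and its parameters
 (`max_attempts`, `timeout_seconds`, `delay_seconds`) are hypothetical and
 do not come from the Tahoe-LAFS code base or from !GitHub Actions.

```python
import subprocess
import sys
import time


def run_with_retries(command, max_attempts=3, timeout_seconds=600, delay_seconds=0):
    """Run `command` until it exits 0, giving up after `max_attempts` tries.

    Hypothetical helper for illustration only.  Each attempt is killed if it
    runs longer than `timeout_seconds`.  Returns the number of the attempt
    that succeeded, or raises RuntimeError once all attempts are used up.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # when the timeout elapses.
            result = subprocess.run(command, timeout=timeout_seconds)
            if result.returncode == 0:
                return attempt
            reason = f"exit code {result.returncode}"
        except subprocess.TimeoutExpired:
            reason = f"timed out after {timeout_seconds}s"
        print(f"attempt {attempt}/{max_attempts} failed: {reason}", file=sys.stderr)
        # Optionally pause before the next attempt, but not after the last one.
        if attempt < max_attempts and delay_seconds:
            time.sleep(delay_seconds)
    raise RuntimeError(f"giving up after {max_attempts} attempts")
```

 A CI step could then call something like
 `run_with_retries([sys.executable, "-m", "tox", "-e", "py310-coverage"],
 max_attempts=3, timeout_seconds=45 * 60)`; the attempt count and timeout
 here are placeholders, not measured values.  Note that the timeout only
 kills the direct child process, so a real setup may need extra care with
 grandchild processes.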

 Perhaps this would be a good use of
 [https://github.com/marketplace/actions/retry-step actions/retry-step]?

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3945>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

