Opened at 2010-06-16T01:16:35Z
Last modified at 2011-09-28T16:22:47Z
#1084 assigned defect
nondeterministic failure of allmydata.test.test_system.SystemTest.test_upload_and_download_{random_key,convergent}
Reported by: | davidsarah | Owned by: | zooko |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | code | Version: | 1.7β |
Keywords: | test upload heisenbug | Cc: | |
Launchpad Bug: |
Description
http://tahoe-lafs.org/buildbot/builders/FreeStorm%20CentOS5-i386/builds/17/steps/test/logs/stdio
[ERROR]: allmydata.test.test_system.SystemTest.test_upload_and_download_random_key
Traceback (most recent call last):
  File "/home/buildbot/tahoe-lafs/FreeStorm CentOS5-i386/build/src/allmydata/test/test_system.py", line 340, in _uploaded
    "resumption saved us some work even though we were using random keys:"
exceptions.TypeError: int argument required
The code around that line is here.
This did not happen on a subsequent build with only this change.
Attachments (2)
Change History (10)
comment:1 follow-up: ↓ 2 Changed at 2010-06-16T01:25:01Z by davidsarah
comment:2 in reply to: ↑ 1 Changed at 2010-06-16T01:53:05Z by davidsarah
Replying to davidsarah:
Because there was an exception in the %d formatting of the message argument, we do not know what the value of bytes_sent was. It doesn't seem to have been None because that would have produced a different error message: ... (but maybe there is a difference between Python 2.4.3 and 2.6.2 here.)
There was; discount this argument. bytes_sent might have been None.
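The point about the lost value can be illustrated with a minimal sketch. This is not the test code from test_system.py; the function names `format_bytes_sent` and `describe_bytes_sent` are made up for illustration. It only shows that formatting None with %d raises a TypeError before the message is built, so the value never reaches the log, whereas %r would have shown it. The exact text of the TypeError differs between Python versions, so the sketch does not rely on it.

```python
def format_bytes_sent(bytes_sent):
    """Build a message with %d, the way the failing test line did.

    Returns (message, error). Hypothetical helper, for illustration only.
    """
    try:
        return ("bytes_sent: %d" % (bytes_sent,), None)
    except TypeError as e:
        # The exception fires while the message is being built, so the
        # offending value is never reported. The wording of the error
        # varies between Python versions.
        return (None, e)


def describe_bytes_sent(bytes_sent):
    """Build the same message with %r, which accepts any value, including None."""
    return "bytes_sent: %r" % (bytes_sent,)


msg, err = format_bytes_sent(None)
print(type(err).__name__)
print(describe_bytes_sent(None))
```

With %r in the message, a failure like the one above would have reported the actual value instead of only a type error.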
comment:3 Changed at 2011-07-31T21:01:45Z by davidsarah
- Keywords centos removed
- Summary changed from nondeterministic failure of allmydata.test.test_system.SystemTest.test_upload_and_download_random_key on CentOS builder to nondeterministic failure of allmydata.test.test_system.SystemTest.test_upload_and_download_random_key
#1273 was probably a duplicate. That failure occurred on Windows Vista, so the problem is not specific to CentOS or the CentOS builder. The error message is not exactly the same (I think an assertion was added), but seems to be due to the same type error.
comment:4 Changed at 2011-08-02T01:29:45Z by davidsarah
- Summary changed from nondeterministic failure of allmydata.test.test_system.SystemTest.test_upload_and_download_random_key to nondeterministic failure of allmydata.test.test_system.SystemTest.test_upload_and_download_{random_key,convergent}
In http://tahoe-lafs.org/buildbot/builders/Arthur%20lenny%20c7%2032bit/builds/745/steps/test/logs/stdio , this problem happens for both test_upload_and_download_random_key and test_upload_and_download_convergent:
[FAIL]
Traceback (most recent call last):
  File "/home/arthur/buildbot/Arthur lenny c7 32bit/build/src/allmydata/test/test_system.py", line 329, in _uploaded
    self.failUnless(isinstance(bytes_sent, (int, long)), bytes_sent)
twisted.trial.unittest.FailTest: None

allmydata.test.test_system.SystemTest.test_upload_and_download_convergent
allmydata.test.test_system.SystemTest.test_upload_and_download_random_key
Changed at 2011-08-02T01:31:10Z by davidsarah
Changed at 2011-08-02T01:33:03Z by davidsarah
comment:5 Changed at 2011-09-09T05:54:30Z by zooko
A problem with similar characteristics happened just now on Ruben's Fedora buildslave:
failed: http://tahoe-lafs.org/buildbot/builders/Ruben%20Fedora/builds/864
Ended with:
allmydata.test.test_system.SystemTest.test_upload_and_download_random_key ... Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py", line 133, in maybeDeferred
    result = f(*args, **kw)
  File "/home/buildbot/tahoe/Ruben Fedora/build/src/allmydata/util/pollmixin.py", line 34, in _poll
    raise TimeoutError("PollMixin never saw %s return True" % check_f)
allmydata.util.pollmixin.TimeoutError: PollMixin never saw <bound method SystemTest._check_connections of <allmydata.test.test_system.SystemTest testMethod=test_upload_and_download_random_key>> return True
[ERROR]
Traceback (most recent call last):
Failure: twisted.internet.defer.TimeoutError: <allmydata.test.test_system.SystemTest testMethod=test_upload_and_download_random_key> (tearDown) still running at 3600.0 secs
[ERROR]
Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
DelayedCalls: (set twisted.internet.base.DelayedCall.debug = True to debug)
<DelayedCall 0x534e5f0 [39.4281477928s] called=0 cancelled=0 LoopingCall<60>(CPUUsageMonitor.check, *(), **{})()>
<DelayedCall 0x48e40e0 [39759.6027431s] called=0 cancelled=0 LeaseCheckingCrawler.start_slice()>
<DelayedCall 0x4bcb710 [99.8050701618s] called=0 cancelled=0 BucketCountingCrawler.start_slice()>
[ERROR]
Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
Selectables:
<<class 'twisted.internet.tcp.Port'> of foolscap.pb.Listener on 42824>
[ERROR]

command timed out: 7200 seconds without output, attempting to kill process
killed by signal 9
program finished with exit code -1
elapsedTime=11884.623658
rebuilt exact same version and passed: http://tahoe-lafs.org/buildbot/builders/Ruben%20Fedora/builds/865
comment:6 Changed at 2011-09-09T05:58:10Z by zooko
I remember having a brainstorm that bytes_sent would be None if the helper had not been connected to the node before the node started its upload. So I hypothesized that there is a race condition in the test setup, between the node connecting to the helper and the node starting its upload.
Not sure if that applies to this new issue from comment:5. Also, I thought I wrote some notes about that last time, but they are not on this ticket. Is there a different (redundant) ticket somewhere? Did I post my notes elsewhere than trac? I will investigate...
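To make the hypothesized race concrete, here is a toy model. It is not Tahoe-LAFS code: `ToyNode`, its methods, and `poll_until` are all invented for illustration, and the rule that bytes_sent stays None when no helper is connected is an assumption taken from the hypothesis above, not something confirmed in the uploader. `poll_until` is a plain synchronous stand-in for waiting on a connection check; it is not the real PollMixin.

```python
import time


class ToyNode:
    """Hypothetical stand-in for a client node; not Tahoe-LAFS code."""

    def __init__(self):
        self.helper_connected = False

    def connect_to_helper(self):
        self.helper_connected = True

    def upload(self, data):
        # Assumption for illustration: bytes_sent is only filled in when
        # the upload actually went through the helper.
        bytes_sent = len(data) if self.helper_connected else None
        return {"bytes_sent": bytes_sent}


def poll_until(check_f, timeout=1.0, interval=0.01):
    """Call check_f until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if check_f():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Racy ordering: the upload starts before the helper connection exists.
racy = ToyNode()
racy_result = racy.upload(b"abc")

# Ordered setup: wait for the connection, then upload.
safe = ToyNode()
safe.connect_to_helper()
connected = poll_until(lambda: safe.helper_connected)
safe_result = safe.upload(b"abc")

print(racy_result, connected, safe_result)
```

If the real tests behave like this model, making the test wait for the helper connection before starting the upload would remove the nondeterminism. Whether that is the actual cause still needs to be checked against the code.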
comment:7 Changed at 2011-09-09T18:21:15Z by davidsarah
comment:8 Changed at 2011-09-28T16:22:47Z by zooko
- Owner changed from somebody to zooko
- Status changed from new to assigned