#1166 closed defect (fixed)

ZeroDivisionError in web/status.py

Reported by: davidsarah Owned by: francois
Priority: major Milestone: 1.8.0
Component: code-frontend-web Version: 1.8β
Keywords: error reliability easy reviewed Cc: francois@…
Launchpad Bug:

Description

http://tahoe-lafs.org/buildbot/builders/FreeStorm%20WinXP-x86%20py2.6/builds/252/steps/test/logs/stdio

[ERROR]: allmydata.test.test_system.SystemTest.test_filesystem

Traceback (most recent call last):
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\twist.py", line 24, in _drive
    next = iterable.next()
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 83, in iterflatten
    for item in gen:
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\flatstan.py", line 103, in TagSerializer
    yield serialize(toBeRenderedBy, context)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 70, in serialize
    return partialflatten(context, obj)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 61, in partialflatten
    return flattener(obj, context)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\flatstan.py", line 264, in DirectiveSerializer
    return serialize(renderer, context)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 70, in serialize
    return partialflatten(context, obj)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\ten.py", line 61, in partialflatten
    return flattener(obj, context)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\flatstan.py", line 247, in MethodSerializer
    return FunctionSerializer(original, context, nocontext)
  File "c:\python26\lib\site-packages\nevow-0.10.0-py2.6.egg\nevow\flat\flatstan.py", line 236, in FunctionSerializer
    result = original(context, data)
  File "c:\buildbot_tahoe\freestorm_winxp-x86_py2.6\build\src\allmydata\web\status.py", line 462, in render_events
    speed = self.render_rate(None, 1.0 * seglen / segtime)
exceptions.ZeroDivisionError: float division

Looks like segtime was zero because something took less time than the clock granularity.

Attachments (3)

1166-zerodiv.diff (1.4 KB) - added by warner at 2010-08-11T07:16:37Z.
avoid divide-by-zero. Windows, how I hate thee.
rate-computation-refactoring.dpatch (7.3 KB) - added by francois at 2010-08-14T10:27:58Z.
rate-computation-refactoring-2.dpatch (7.5 KB) - added by francois at 2010-08-15T14:51:04Z.

Download all attachments as: .zip

Change History (18)

comment:1 Changed at 2010-08-10T08:38:45Z by davidsarah

  • Keywords easy added
  • Milestone changed from undecided to soon

Changed at 2010-08-11T07:16:37Z by warner

avoid divide-by-zero. Windows, how I hate thee.

comment:2 Changed at 2010-08-11T07:16:59Z by warner

  • Keywords review-needed added

comment:3 Changed at 2010-08-11T07:26:23Z by zooko

Oh come on, you can't blame Windows! "I will tick my clock faster than you do anything you care about." is not part of the contract that we choose to rely on. Also note that even after attachment:1166-zerodiv.diff, when rtt > 0 but rtt <= ϵ then the speed measurement will be wildly inaccurate.

comment:4 Changed at 2010-08-11T15:26:51Z by warner

  • Keywords tests-needed added; review-needed removed

ok, ok.

Needs tests anyways: test_web should acquire a second DownloadStatus instance in its FakeHistory object (with zero durations), and Web.test_status should make sure it renders without exception.

I don't know what to suggest about very small durations and very large speed measurements. Since the duration is clearly visible next to the speed, I believe users will figure it out for themselves ("huh, 18 terabytes per second? oh, look, and the whole thing completed in one microsecond. maybe that's not very accurate").

comment:5 follow-ups: Changed at 2010-08-11T18:32:03Z by warner

Maybe we should change the common abbreviate_rate function to take (bytes, seconds) arguments and have it do the division (and zero-avoidance) in one central place.

comment:6 in reply to: ↑ 5 Changed at 2010-08-12T01:28:56Z by davidsarah

  • Keywords test-needed added; tests-needed removed

Replying to warner:

Maybe we should change the common abbreviate_rate function to take (bytes, seconds) arguments and have it do the division (and zero-avoidance) in one central place.

+1.

(Brian and Zooko are both right: we shouldn't rely on the clock ticking fast enough, but Windows' clock ticks stupidly slowly.)

comment:7 Changed at 2010-08-14T07:04:28Z by zooko

  • Milestone changed from soon to 1.8.0

Is this important enough to fix in 1.8.0?

comment:8 Changed at 2010-08-14T08:30:53Z by francois

  • Owner set to francois

comment:9 Changed at 2010-08-14T08:31:46Z by francois

Looks like an easy ticket, I'll give it a go.

comment:10 in reply to: ↑ 5 Changed at 2010-08-14T10:36:01Z by francois

  • Cc francois@… added
  • Keywords review-needed added; test-needed removed

The attached patch contains a test case for this specific bug and a complete refactoring of rate computation on the status page.

Replying to warner:

Maybe we should change the common abbreviate_rate function to take (bytes, seconds) arguments and have it do the division (and zero-avoidance) in one central place.

This is implemented in a new function compute_rate. It sounds cleaner to me, because abbreviate_rate is called (via render_rate) from Nevow templates which only pass a single data argument. Moreover, this allows the function to be used more often in web/status.py.

comment:11 follow-up: Changed at 2010-08-14T18:27:33Z by warner

wait, what's that 0.1 doing in there? If it said 1.0, and if the tests were updated to match, I'd be +1 on this patch.

comment:12 in reply to: ↑ 11 Changed at 2010-08-15T14:50:49Z by francois

Replying to warner:

wait, what's that 0.1 doing in there? If it said 1.0, and if the tests were updated to match, I'd be +1 on this patch.

Oh crap! Can't believe how I missed that. Thanks for spotting it!

This is probably an indication that tests were not thorough enough, so I added a new human understandable one.

comment:13 Changed at 2010-08-21T21:19:39Z by terrell

looks good to me.

comment:14 Changed at 2010-08-21T22:29:56Z by zooko

  • Keywords reviewed added; review-needed removed

terrell reviewed it so I'm removing the review-needed and adding reviewed.

comment:15 Changed at 2010-08-21T23:33:20Z by francois@…

  • Resolution set to fixed
  • Status changed from new to closed

In f026927f86584870:

web: refactor rate computation, fixes #1166

Note: See TracTickets for help on using tickets.