#1055 closed defect (fixed)

Crashed WUI due to abbreviate_time in common.py

Reported by: rycee Owned by: jacoblyles
Priority: major Milestone: 1.8β
Component: code-frontend-web Version: 1.6.1
Keywords: news-needed Cc:
Launchpad Bug:

Description

I've been having a few difficulties with a Tahoe-node I'm running. It produced a long error message whenever I accessed the /storage web path.

With a bit of sleuthing I tracked the problem down to common.py:93. The line

return "%dus" % (1000000*s)

seems to assume that s an integer but s actually is a float as can be seen earlier in the function. Changing this line to

return "%dus" % (1000000*int(s))

fixed the issue for me and /storage now shows without problem.

Looking through the other functions in the file it appears as if there are other cases of %d being used together with float values. Perhaps they can also cause problems?

I cannot produce a stack trace at this time since the node is currently in use but I can try again later if there is a need for it.

Attachments (3)

WHAT.dpatch (2.1 KB) - added by jacoblyles at 2010-07-29T21:37:26Z.
Added unit test to show bug in allmydata.web.common.abbreviate_time in python 2.5
web_common_25.dpatch (2.5 KB) - added by jacoblyles at 2010-07-30T21:27:16Z.
This patch fixes an error with interpolating float values in "%d" format strings in python 2.5
NEWS-1055.dpatch.txt (2.6 KB) - added by zooko at 2010-08-01T04:06:40Z.

Download all attachments as: .zip

Change History (25)

comment:1 Changed at 2010-05-27T21:39:08Z by zooko

  • Milestone changed from undecided to 1.7.0

Let's fix this for v1.7.0.

comment:2 Changed at 2010-05-28T07:49:42Z by rycee

I have tried to reproduce this problem and have been unsuccesfull. I think what happened is that the computer clock had for some reason gotten very much out of sync prior to the previous pass by the expire crawler. Eventually the time got synchronized over ntp and the reported problem occurred. This should have caused the abbreviate_time function to recieve a very large negative number. This seems to rhyme well with the following python session:

>>> "%d" % (1000000 * -1234567891234.0)
'-1234567891233999872'
>>> "%d" % (1000000 * -12345678912345.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int argument required
>>> "%d" % (1000000 * int(-12345678912345.0))
'-12345678912345000000'

That is, python will handle floats as well as can be expected together with %d, as long as the number is small enough.

So, in conclusion, this error should be quite rare indeed and I would not protest if this bug report is closed as WONTFIX or similar. It may possibly be interesting to make the code a bit more robust in these cases, I would have appreciated a "check your clock dummy!" message :-)

comment:3 follow-up: Changed at 2010-05-30T21:00:27Z by zooko

Tahoe-LAFS is intended to function correctly in all respects even if the system clock jumps around to arbitrary values [*].

So instead of closing this as wontfix, we'll write a unit test that shows this bug. Thanks for tracking it down! If you want to help out with the next step, report to us what version of Python you have, because my version of Python doesn't behave like yours:

Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> "%d" % (1000000 * -1234567891234.0)
'-1234567891233999872'
>>> "%d" % (1000000 * -12345678912345.0)
'-12345678912344999936'
>>> "%d" % (1000000 * int(-12345678912345.0))
'-12345678912345000000'

[*] Not that it currently does so, but I consider it to be a bug that it exhibits random misbehavior when the gettimeofday(2) gets kicked by ntp or by a sysadmin, due to Tahoe-LAFS relying on the ordering of events in the Twisted reactor and Twisted using the result from gettimeofday(2); see http://twistedmatrix.com/trac/ticket/2424

comment:4 in reply to: ↑ 3 Changed at 2010-05-31T06:10:51Z by rycee

Replying to zooko:

If you want to help out with the next step, report to us what version of Python you have, because my version of Python doesn't behave like yours

The session of comment 2 was taken with the following Python:

Python 2.5.5 (r255:77872, Apr 21 2010, 08:40:04) 
[GCC 4.4.3] on linux2

This is the stock Python of an amd64 Debian Sid installation. The error of the bug report occurred on a 32bit Lenny box and on this box I get the following result:

Python 2.5.2 (r252:60911, Jan 24 2010, 14:53:14) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> "%d" % (1000000 * -1234.0)
'-1234000000'
>>> "%d" % (1000000 * -12345.0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int argument required

Thanks for following up on this bug!

comment:5 Changed at 2010-05-31T06:47:32Z by rycee

Just a note that I cannot reproduce the "int argument required" error with Python 2.6.5 in Debian Sid. So I suppose it is something specific to Python 2.5?

comment:6 Changed at 2010-05-31T07:25:55Z by zooko

Yeah I suppose it is a bug in Python 2.5. We have Python 2.5 buildslaves, so if you write a unit test that exercises this bug then we can add it to the test suite and know that it is fixed on our supported platforms.

comment:7 Changed at 2010-06-18T23:56:39Z by zooko

  • Milestone changed from 1.7.0 to 1.7.1

comment:8 Changed at 2010-07-10T18:59:41Z by zooko

  • Keywords test-needed added

comment:9 Changed at 2010-07-17T07:34:57Z by zooko

  • Milestone changed from 1.7.1 to 1.8β

This just needs a simple unit test.

comment:10 Changed at 2010-07-29T06:12:03Z by jacoblyles

  • Owner set to jacoblyles

Changed at 2010-07-29T21:37:26Z by jacoblyles

Added unit test to show bug in allmydata.web.common.abbreviate_time in python 2.5

comment:11 Changed at 2010-07-29T21:39:58Z by jacoblyles

  • Keywords review-needed added; test-needed removed

comment:12 Changed at 2010-07-29T22:14:28Z by zooko

jacoblyles:

Thanks for the patch! Are you sure that your patch will trigger the bug, though? It looks like your patch will cause common.abbreviate_time() to be called with the argument -123000, which is an integer. common.abbreviate_time() will then multiply it by 1,000,000, making it be -123,000,000,000 As far as I understand, the bug shows up only when the argument is float instead of integer, and only when the argument is less than about negative 10 billion, and when the Python version is Python 2.5. If you could change your test to pass a float then we can run the test on our Python 2.5 buildbots.

comment:13 Changed at 2010-07-29T22:45:15Z by jacoblyles

Yes, this will trigger the bug. common.abbreviate_time() casts its input to a float at common.py:82. I have run the test on my own machine with python 2.6 and python 2.5 to confirm that it fails with 2.5.

comment:14 Changed at 2010-07-30T20:41:27Z by davidsarah

  • Keywords reviewed added; review-needed removed

Test looks good to me.

Changed at 2010-07-30T21:27:16Z by jacoblyles

This patch fixes an error with interpolating float values in "%d" format strings in python 2.5

comment:15 Changed at 2010-07-30T21:29:20Z by jacoblyles

I've attached a patch that should fix the problem. The magic value to break "%d" in Python 2.5 is a float of size 2.031 or larger. I replaced "%d" with "%.0f"

comment:16 Changed at 2010-07-30T21:29:57Z by jacoblyles

  • Keywords review-needed added; reviewed removed

comment:17 Changed at 2010-07-30T23:27:37Z by davidsarah

  • Keywords reviewed added; review-needed removed
  • Resolution set to fixed
  • Status changed from new to closed

Fixed by d3e6e15e23bcc93f. Not sure why this wasn't auto-closed by trac.

comment:18 Changed at 2010-07-30T23:46:02Z by zooko

jacoblyles: thanks! A couple of things to note for future reference:

  1. I re-edited your patch description to be a little more detailed and to have the string "fixes #1055" in it. See also Patches for examples of good patch descriptions.
  2. David-Sarah is adding unit tests of the adjacent formatting code in addition to the unit test you added for the one that your patch fixed.
  3. It might be good to bundle the tests together into the fix-it patch. At least in this case it would have prevented me from forgetting to apply your tests. :-)

comment:19 Changed at 2010-07-31T00:07:46Z by davidsarah

Test patch committed in e276050d5ab2a937. It turned out that the instances of %d in abbreviate_rate and abbreviate_size could not have overflowed (and float underflow is not a problem here), so I just committed the original attachment:WHAT.dpatch.

comment:20 Changed at 2010-07-31T00:07:56Z by davidsarah

  • Keywords reviewed removed

comment:21 Changed at 2010-07-31T02:53:00Z by zooko

Oh, and another thing: we need a snippet for NEWS to describe this change.

comment:22 Changed at 2010-07-31T02:53:19Z by zooko

  • Keywords news-needed added

Changed at 2010-08-01T04:06:40Z by zooko

Note: See TracTickets for help on using tickets.