#1812 closed defect (fixed)

parse_abbreviated_size doesn't accept T for terabytes (and other quibbles with the regex it uses)

Reported by: davidsarah Owned by: warner
Priority: minor Milestone: 1.10.0
Component: code Version: 1.9.2
Keywords: reserved_space storage error usability needs-review Cc: stercor, zancas@…
Launchpad Bug:

Description

stercor tried to use "reserved_space = 1.5TiB" in tahoe.cfg. This failed and used a reserved_space of 0 bytes (as shown on the server status page), because T is not recognized by parse_abbreviated_size in allmydata.util.abbreviate.

Note that the error due to an unparseable size is silent and only reported in the foolscap log. Continuing with a reserved_space of 0 bytes is a rather unhelpful failure mode. The code responsible for that is at git/src/allmydata/client.py@32377469#L248.

The regex is r"^(\d+)([kKmMgG]?[iB]?[bB]?)$", which also allows strings ending in "BB" or "Bb"; I don't think that allowing the repeated B was intentional.

Change History (19)

comment:1 in reply to: ↑ description Changed at 2012-09-19T03:31:30Z by davidsarah

Replying to davidsarah:

The regex is r"^(\d+)([kKmMgG]?[iB]?[bB]?)$", which also allows strings ending in "BB" or "Bb"; I don't think that allowing the repeated B was intentional.

... especially because parse_abbreviated_size will then fail with a KeyError (not a ValueError as it should) because only one B is stripped off, not both.

comment:2 Changed at 2012-09-19T04:28:14Z by davidsarah

  • Keywords test-needed review-needed added
  • Owner changed from davidsarah to warner

Please review: https://github.com/davidsarah/tahoe-lafs/commit/d8baf7aa5fe967f2611c55d297fb522b16706f89

Note that this only addresses the terabyte and double-B issues, not the error reporting. It also needs tests. I allowed E for exabytes as well.

Last edited at 2012-09-19T04:37:25Z by davidsarah (previous) (diff)

comment:3 Changed at 2012-09-19T04:38:07Z by davidsarah

Apparently I don't know my petabytes from my exabytes. Here's a fixed version: https://github.com/davidsarah/tahoe-lafs/commit/84d55da319f199c1ff335186330d34ebf94868fe

comment:4 Changed at 2012-12-27T20:04:03Z by davidsarah

  • Keywords test-needed removed
  • Milestone changed from soon to 1.10.0
  • Owner changed from warner to zancas

Review needed for https://github.com/tahoe-lafs/tahoe-lafs/pull/26 (now with tests).

comment:5 Changed at 2012-12-27T21:35:01Z by Zancas

I don't understand why "I" and "i" are considered valid suffixes.

comment:6 Changed at 2012-12-27T21:43:50Z by Zancas

The "^" symbol in the re.match call used to parse out the "number" and "suffix" values is, if I understand correctly, redundant with the way re.match works.

This is because the "pattern" (first) argument to re.match is applied to the beginning of the "string" (second) argument. The "^" specifies that the pattern is to match from the beginning of its target. So the two functions, the one intrinsic to re.match, and the one specified by "^" are identical.

This doesn't necessarily mean that the "^" should be removed from the code as it, debatably, increases readability.

comment:7 Changed at 2012-12-27T22:35:18Z by Zancas

  • Cc zancas@… added

I've reviewed these changes and run the unittests they include.

I concur that the doubled "BB" suffix now correctly raises a ValueError.

I concur that Tera-, Peta-, and Exa-bytes are now correctly handled as scales of 240, 250, and 260 respectively.

I note that the "^" in the "pattern" argument to re.match is not strictly necessary, but assert that it is correct, and may enhance readability.

I do not understand why the suffix's "I" and "i" are to be handled as they are.

It seems to me that patterns that contain only "I", "i", or e.g. "iB", are errors, but it's plausible I'm simply unfamiliar with established convention.

Last edited at 2012-12-27T22:40:45Z by Zancas (previous) (diff)

comment:8 Changed at 2012-12-27T22:41:51Z by Zancas

  • Keywords reviewed added; review-needed removed

comment:9 Changed at 2012-12-27T23:28:50Z by davidsarah

Whether "IB" or "I" on its own should be an error is debatable, but it is currently accepted, and I don't see any particular reason to change that. Use of ^ at the start of the pattern is simply a matter of style; I prefer the explicitness. So I'll commit this as-is, plus a change to git/docs/configuration.rst to mention the new prefixes.

comment:10 Changed at 2012-12-27T23:29:14Z by davidsarah

  • Owner changed from zancas to davidsarah
  • Status changed from new to assigned

comment:11 Changed at 2012-12-27T23:42:48Z by davidsarah

  • Resolution set to fixed
  • Status changed from assigned to closed

Change to configuration.rst is in eba64f2a, the rest is in the previous two patches.

comment:12 Changed at 2012-12-29T02:33:54Z by davidsarah

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening to fix this:

Note that the error due to an unparseable size is silent and only reported in the foolscap log. Continuing with a reserved_space of 0 bytes is a rather unhelpful failure mode. The code responsible for that is at git/src/allmydata/client.py@32377469#L248.

Last edited at 2012-12-29T02:35:16Z by davidsarah (previous) (diff)

comment:13 Changed at 2013-01-31T15:50:42Z by davidsarah

  • Priority changed from normal to minor

comment:14 Changed at 2013-01-31T15:57:20Z by warner

  • Owner changed from davidsarah to warner
  • Status changed from reopened to new

Looks like we should *not* catch the exception, and allow it to be raised, since init_storage() is called before daemonization, so errors will appear on the 'tahoe start' stderr. The only thing that might need special-casing is "reserved_space=" (i.e. the default do-not-reserve-any-space), just in case that returns an empty string or something unparseable. This should be covered in unit tests.

comment:15 follow-up: Changed at 2013-03-07T16:35:04Z by zooko

  • Keywords reviewed removed

So this patch isn't actually ready, since it needs the improvements Brian described in comment:14. Removing the "reviewed" tag.

comment:16 in reply to: ↑ 15 Changed at 2013-03-10T23:54:20Z by davidsarah

Replying to zooko:

So this patch isn't actually ready, since it needs the improvements Brian described in comment:14. Removing the "reviewed" tag.

The patch has already been applied, but when I reopened the ticket I should have removed the "reviewed" tag since it was not applicable to any yet-to-be-applied patch.

comment:17 Changed at 2013-03-11T06:00:15Z by zooko

Um, so since the patch was applied, but according to Brian (comment:14), we should not catch the exception, does this mean that fixing the patch should be a blocker for Tahoe-LAFS v1.10? Assigning to Brian to clarify.

comment:18 Changed at 2013-03-20T22:35:00Z by warner

  • Keywords needs-review added

Please see the last patch on https://github.com/warner/tahoe-lafs/tree/1812-parse-size for review, it should report the parse error early, and updates the tests to match.

comment:19 Changed at 2013-03-20T23:25:42Z by David-Sarah Hopwood <david-sarah@…>

  • Resolution set to fixed
  • Status changed from new to closed

In b084396bddd1e12a:

client.py: throw error when reserved_space= is unparseable. Closes #1812.

This should now fail quickly (during "tahoe start"). Previously this
would silently treat an unparseable size as "0", and the only way to
discover that it had had a problem would be to look at the foolscap log,
or examine the storage-service web page for the unexpected "Reserved
Size" number.

Note: See TracTickets for help on using tickets.