#1120 closed defect (fixed)

simplify Unicode support by assuming that argv and output encodings are the same

Reported by: davidsarah Owned by: zooko
Priority: minor Milestone: 1.9.0
Component: code Version: 1.7.0
Keywords: unicode cleanup Cc:
Launchpad Bug:

Description

When the changes to support Unicode on Windows (#1074 and #565) are applied, the argv and output encodings will always be the same as each other, on all platforms. Call that the "I/O encoding" and make any consequent simplifications.

Attachments (2)

same-argv-and-output-encoding.darcs.patch (30.6 KB) - added by davidsarah at 2011-06-28T19:02:46Z.
encodingutil: argv and output encodings are always the same on all platforms. Lose the unnecessary generality of them being different. fixes #1120
same-argv-and-output-encoding-withdarcsreplace.darcs.patch (29.2 KB) - added by zooko at 2011-07-27T19:40:56Z.

Download all attachments as: .zip

Change History (13)

comment:1 Changed at 2010-07-15T02:47:37Z by davidsarah

  • Component changed from unknown to code
  • Owner changed from nobody to somebody

comment:2 Changed at 2010-07-17T05:29:42Z by davidsarah

  • Milestone changed from undecided to 1.8β
  • Owner changed from somebody to davidsarah
  • Status changed from new to assigned

comment:3 Changed at 2010-09-10T20:54:18Z by davidsarah

  • Milestone changed from 1.8β to 1.9.0

I have a patch for this, but don't intend to apply it for 1.8.

Changed at 2011-06-28T19:02:46Z by davidsarah

encodingutil: argv and output encodings are always the same on all platforms. Lose the unnecessary generality of them being different. fixes #1120

comment:4 Changed at 2011-06-28T19:03:17Z by davidsarah

  • Keywords review-needed added
  • Owner changed from davidsarah to zooko
  • Priority changed from major to minor
  • Status changed from assigned to new

comment:5 Changed at 2011-07-27T19:48:28Z by zooko

I reviewed attachment:same-argv-and-output-encoding.darcs.patch and while doing so I recorded an equivalent patch using darcs replace. Using darcs replace instead of editing hunks can avoid unnecessary merge conflicts. I confirmed that my patch, attachment:same-argv-and-output-encoding-withdarcsreplace.darcs.patch, has the same effect when applied to the current trunk.

So, I've reviewed this patch and give it +1. If someone wants to review my darcs replace variant of this patch, that would be cool, or if someone wants to just apply the original patch to trunk that would be cool too.

comment:6 Changed at 2011-07-27T19:48:45Z by zooko

  • Keywords reviewed added; review-needed removed

comment:7 Changed at 2011-07-28T00:00:56Z by davidsarah

Maybe this is off-topic for this bug, but I'm really confused about the patches that darcs generates when darcs replace is used.

The effect of either patch set is basically to rename [get_]argv_encoding and [get_]output_encoding to [get_]io_encoding, and clean up any resulting duplicated imports and attributes. But in same-argv-and-output-encoding-withdarcsreplace.darcs.patch, we also have explicit hunks that are either redundant with, or seem to conflict with the replace hunks.

For example, in test_cli.py we have hunks that explicitly change get_argv_encoding to get_output_encoding (not get_io_encoding), and then we also have replace hunks that change both get_argv_encoding and get_output_encoding to get_io_encoding. So why were the explicit hunks needed, and what is the implied ordering or priority when both explicit and replace hunks in a given patch affect the same lines?

There are some hunks that apparently go in the wrong direction, for instance changing get_io_encoding back to get_output_encoding in test_encodingutil.py. But there was no get_io_encoding in the original file, so how can those hunks be applied, even if they were correct? That is very puzzling.

Also, won't the explicit, non-replace hunks still cause conflicts if those lines or neighbouring lines are touched by a patch to be merged -- and if they do, doesn't this partly defeat the point of using darcs replace?

In conclusion, there's too much I don't understand about darcs replace, that makes me lack confidence in reviewing patches that use it. For patches that use only explicit hunks, I can often (not always) review them without actually applying them to see the effects; for replace patches that seems never to be the case :-(

comment:8 Changed at 2011-07-28T00:04:28Z by david-sarah@…

  • Resolution set to fixed
  • Status changed from new to closed

In f9d218c6736cf389:

encodingutil: argv and output encodings are always the same on all platforms. Lose the unnecessary generality of them being different. fixes #1120

comment:9 Changed at 2011-07-28T00:05:30Z by davidsarah

  • Keywords reviewed removed

I committed the non-darcs replace version of the patch.

comment:10 Changed at 2011-07-28T05:24:39Z by zooko

I went to report the problem with darcs replace on the darcs bug tracker, and saw that Zancas had previously reported the same issue.

comment:11 Changed at 2011-07-29T00:49:10Z by davidsarah

Heffalump wrote in #darcs:

darcs applies patches in the order listed.

a replace patch on its own does exactly what it says on the tin - it replaces token A with token B in the file listed

so if you have hunk patches then a replace patch, the hunk patches happen first then the replace

in your example I think force must have been used to do a couple of replace operations. I'm not entirely sure, but perhaps output -> argv was done first then argv -> io, or something like that.

with --force, if you are renaming A to B, then --force first inserts hunks that change any existing Bs to A, then a replace patch from A to B. That's to ensure that everything is invertible.

because darcs tries to squish together individual pieces of patches in a single named patch to avoid redundant operations, I can imagine replace --force twice generting something like you describe.

Note: See TracTickets for help on using tickets.