[tahoe-dev] String encoding in tahoe
Dan McNair
glucnac at gmail.com
Mon Dec 22 19:31:19 PST 2008
On Mon, Dec 22, 2008 at 18:33, zooko <zooko at zooko.com> wrote:
> Okay, after testing on my Macbook Pro, I committed François's patch
> [1], and some related patches of my own [2, 3, 4]. This fixed the
> cli tests on Ubuntu Feisty -- hooray! But it broke the test on
> cygwin, GNU/OpenSolaris, Windows, and ArchLinux -- boo! See the
> buildbot for details [5].
Got ArchLinux working again with a simple fix outside of the Tahoe source.
I had all my locale environment variables set to "C", so when os.stat() was
called, it automatically converted its unicode argument using the 'ascii'
encoding. (At least, I assume it's still a unicode object; I only glanced at
the source, and figured it wouldn't be trying to call encode() on anything
else.) That failed, because ASCII can't encode anything beyond its 128 basic
characters (A-Za-z0-9, punctuation, and control codes).
So I switched my system over to using "en_US.UTF-8" for the locale, and now
os.stat() (I assume) is autoconverting to UTF-8, which can represent the
special characters in the test filename, and all is well.
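
To illustrate the difference between the two locales, here is a minimal sketch of the encoding step alone. It is written for current Python rather than the interpreter on the buildslave, the filename is a hypothetical stand-in for the one in the test, and try_encode() is a helper invented for this example. It does not call os.stat(); it only shows what happens when each codec is asked to encode the name.

```python
# Hypothetical filename containing one non-ASCII character (U+00E4);
# the real test filename may differ.
name = u"\u00e4rtonwall.txt"


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


ascii_result = try_encode(name, "ascii")  # the codec a "C" locale implies
utf8_result = try_encode(name, "utf-8")   # the codec en_US.UTF-8 implies

print(ascii_result, utf8_result)
```

With the 'ascii' codec the helper returns None because the encode call raises UnicodeEncodeError, which matches the failure I saw. With 'utf-8' the same name encodes without trouble.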
There's a good chance something similar is happening on the OpenSolaris
box, a smaller chance it's related to the cygwin failure, and I have no idea
whether the failure on Windows has anything at all to do with it.
> I guess that it is incorrect to assume that the Python strings that
> appear in sys.argv are utf-8 encoded. They could be in some other
> encoding.
My guess is this is locale-specific on most POSIX platforms.
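
A rough sketch of what that would mean in practice, assuming the arguments arrive as raw bytes in whatever encoding the terminal uses. decode_arg() is a hypothetical helper for this example, not anything in the Tahoe source, and the sample byte is made up. The only library call it relies on is locale.getpreferredencoding().

```python
import locale


def decode_arg(raw, encoding=None):
    """Decode one raw argument, defaulting to the locale's preferred encoding."""
    if encoding is None:
        # False means: report the current setting without calling setlocale().
        encoding = locale.getpreferredencoding(False)
    return raw.decode(encoding)


# The byte a Latin-1 terminal would send for U+00E4.
raw = u"\u00e4".encode("latin-1")

as_latin1 = decode_arg(raw, "latin-1")

# Assuming UTF-8 for the same byte fails, because it is not valid UTF-8.
try:
    decode_arg(raw, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_latin1), utf8_ok)
```

So the same bytes decode correctly under one assumed encoding and blow up under another, which is why hard-coding utf-8 looks risky to me.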
Your mileage may vary,
Dan