Opened at 2014-07-30T15:41:25Z
Last modified at 2021-03-30T18:40:19Z
#2268 assigned enhancement
on Unix, if filesystem and/or I/O encodings are ASCII, ignore that and use UTF-8 instead
Reported by: | daira | Owned by: | daira |
---|---|---|---|
Priority: | normal | Milestone: | soon |
Component: | code-frontend-cli | Version: | 1.10.0 |
Keywords: | cli error unicode utf-8 unix easy | Cc: | |
Launchpad Bug: | | | |
Description (last modified by daira)
Alexander Kaufman wrote:
I wanted to let you know that "tahoe cp" on "Дядя Ваня (1970).avi" results in an error:
tahoe --version
allmydata-tahoe: 1.10.0
foolscap: 0.6.4
pycryptopp: 0.6.0.1206569328141510525648634803928199668821045408958
zfec: 1.4.24
Twisted: 14.0.0
Nevow: 0.10.0
zope.interface: unknown
python: 2.7.8
platform: Linux-Arch_Linux_-x86_64-64bit_ELF
pyOpenSSL: 0.14
simplejson: 3.4.0
pycrypto: 2.6.1
pyasn1: 0.1.7
mock: 1.0.1
setuptools: 5.4.1
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 156, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 141, in runner
    rc = cli.dispatch[command](so)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/cli.py", line 551, in cp
    rc = tahoe_cp.copy(options)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 770, in copy
    return Copier().do_copy(options)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 451, in do_copy
    status = self.try_copy()
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 512, in try_copy
    return self.copy_to_directory(sources, target)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 616, in copy_to_directory
    source_dirs = self.build_graphs(source_infos)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 764, in build_graphs
    source.populate(True)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 112, in populate
    child.populate(True)
  File "/usr/lib/python2.7/site-packages/allmydata/scripts/tahoe_cp.py", line 104, in populate
    children = listdir_unicode(self.pathname)
  File "/usr/lib/python2.7/site-packages/allmydata/util/encodingutil.py", line 279, in listdir_unicode
    return listdir_unicode_fallback(path)
  File "/usr/lib/python2.7/site-packages/allmydata/util/encodingutil.py", line 264, in listdir_unicode_fallback
    raise FilenameEncodingError(fn)
FilenameEncodingError: ������������(1970).avi
[...]
The above happens when LANG=C, and is fixed with export LANG=en_US.UTF-8.
In general the correct encoding for LANG is not necessarily UTF-8; it is whatever the filesystem uses. The error message should say this.
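The failure can be reproduced outside Tahoe-LAFS with a minimal sketch. It assumes the filename is stored on disk as UTF-8 bytes (the report does not say so, but the LANG=en_US.UTF-8 workaround suggests it). The helper name decode_filename is hypothetical and is not part of allmydata.

```python
# -*- coding: utf-8 -*-

def decode_filename(raw_bytes, encoding):
    """Hypothetical helper: decode a filename's bytes, or return None
    if the bytes are not valid in the given encoding."""
    try:
        return raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None

# Assumption: the filesystem stores the name as UTF-8 bytes.
raw = u"Дядя Ваня (1970).avi".encode("utf-8")

# Under LANG=C the detected encoding is ASCII, so decoding fails.
ascii_result = decode_filename(raw, "ascii")

# With a UTF-8 locale the same bytes decode cleanly.
utf8_result = decode_filename(raw, "utf-8")
```

Here ascii_result is None while utf8_result is the original name, which matches the observed behaviour: the same directory listing fails or succeeds depending only on the encoding derived from LANG.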
Change History (9)
comment:1 Changed at 2014-07-30T15:42:12Z by daira
- Description modified (diff)
- Keywords easy added
comment:2 follow-up: ↓ 3 Changed at 2014-07-30T15:54:04Z by daira
- Keywords utf-8 added
comment:3 in reply to: ↑ 2 Changed at 2014-08-06T04:09:12Z by zooko
Replying to daira:
Perhaps if the filesystem and/or I/O encodings are ASCII, we should just ignore that and assume UTF-8? (See canonical_encoding in src/allmydata/util/encodingutil.py; the change needed is obvious.)
+1
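A rough sketch of the proposed behaviour, not the actual canonical_encoding code: the function name prefer_utf8_over_ascii is hypothetical, and using codecs.lookup to recognise ASCII aliases is an assumption made for illustration.

```python
import codecs

def prefer_utf8_over_ascii(encoding):
    """Hypothetical helper: treat a missing or ASCII encoding as UTF-8.

    Any other encoding is returned under Python's canonical codec name.
    Unknown encoding names raise LookupError, as codecs.lookup does.
    """
    if encoding is None:
        return "utf-8"
    name = codecs.lookup(encoding).name
    if name == "ascii":
        # ASCII usually means "no locale configured" (e.g. LANG=C),
        # so assume the filesystem really uses UTF-8.
        return "utf-8"
    return name

# "ANSI_X3.4-1968" is a common name for ASCII reported under the C locale.
for reported in ("ANSI_X3.4-1968", "US-ASCII", "UTF-8", "iso-8859-15"):
    print(reported, "->", prefer_utf8_over_ascii(reported))
```

Because UTF-8 is a superset of ASCII, pure-ASCII filenames decode identically under either encoding, so the substitution only changes the outcome for names that would otherwise raise FilenameEncodingError. Explicitly configured non-ASCII encodings such as iso-8859-15 are left alone.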
comment:4 Changed at 2015-04-05T15:41:26Z by daira
- Milestone changed from undecided to 1.11.0
- Owner set to daira
- Status changed from new to assigned
comment:5 Changed at 2015-04-05T15:42:50Z by daira
- Summary changed from "on Unix, in case of a FilenameEncodingError, suggest changing the LANG environment variable" to "on Unix, if filesystem and/or I/O encodings are ASCII, ignore that and use UTF-8 instead"
comment:6 Changed at 2016-03-22T05:02:52Z by warner
- Milestone changed from 1.11.0 to 1.12.0
Milestone renamed
comment:7 Changed at 2016-06-28T18:20:37Z by warner
- Milestone changed from 1.12.0 to 1.13.0
moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders
comment:8 Changed at 2020-06-30T14:45:13Z by exarkun
- Milestone changed from 1.13.0 to 1.15.0
Moving open issues out of closed milestones.
comment:9 Changed at 2021-03-30T18:40:19Z by meejah
- Milestone changed from 1.15.0 to soon
Ticket retargeted after milestone closed