Opened at 2010-06-17T20:07:16Z
Last modified at 2013-08-02T03:06:46Z
#1089 closed defect
SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8") — at Version 13
Reported by: | slush | Owned by: | nobody |
---|---|---|---|
Priority: | minor | Milestone: | eventually |
Component: | code-frontend | Version: | 1.7.0 |
Keywords: | utf-8 unicode ftpd sftp names docs | Cc: | slush@…, amontero@… |
Launchpad Bug: |
Description (last modified by amontero)
I am opening a new ticket because I didn't find any reported problem with UTF-8 (except #704, which is, as far as I know, something different).
When I create a file or directory through the web frontend, everything goes well. But any attempt to create a directory or file with national characters through the FTP/SFTP frontend fails.
WinSCP tells me "Path could not be decoded as UTF-8"; Total Commander just reports a general error.
Example directory name which fails: žluťoučký kůň úpěl ďábelské ódy (roughly: "a yellow horse moaned devilish odes"), but it also fails with any other reasonable name such as "dovolená" (holiday).
Reproducibility: always
Python version: 2.6
Change History (13)
comment:1 follow-up: ↓ 5 Changed at 2010-06-17T20:22:12Z by slush
comment:2 Changed at 2010-06-17T20:42:22Z by slush
- Description modified (diff)
comment:3 Changed at 2010-06-17T21:40:25Z by davidsarah
- Keywords docs added
- Summary changed from Path could not be decoded as UTF-8 to SFTP and FTP: Path could not be decoded as UTF-8
This error will occur if the file or directory name is not valid UTF-8. Polish systems often use ISO-Latin-2 locales -- is that what your filesystem uses?
If so, the SFTP specification implies that it is the responsibility of the client to convert names to UTF-8, and apparently the clients you tried aren't doing that.
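To illustrate the failure described above, here is a minimal sketch of what happens when a client sends a name in ISO-8859-2 (Latin-2) instead of UTF-8. The helper `is_valid_utf8` is a hypothetical name used only for this illustration; it is not part of Tahoe-LAFS.

```python
# -*- coding: utf-8 -*-
# Illustration only: the same name encoded two ways by a client.
name = u"dovolená"

utf8_bytes = name.encode("utf-8")          # what the SFTP server expects
latin2_bytes = name.encode("iso-8859-2")   # what an unconverted Latin-2 client would send


def is_valid_utf8(raw):
    """Return True if the raw bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin2_bytes))
```

The Latin-2 form ends in the single byte 0xE1, which UTF-8 treats as the start of a multi-byte sequence with nothing following it, so decoding fails.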
Alternatives:
- have an option in tahoe.cfg or the FTP/SFTP accounts file to specify another encoding.
- declare this to be a client bug.
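The first alternative could look roughly like the sketch below. Both the function `decode_path` and its `fallback_encoding` parameter are assumptions made for illustration; no such option exists in tahoe.cfg or the accounts file as far as this ticket states.

```python
# -*- coding: utf-8 -*-
def decode_path(raw, fallback_encoding=None):
    """Decode a path received from a client.

    UTF-8 is tried first. If that fails and a fallback encoding has been
    configured (hypothetical setting), the fallback is tried instead.
    With no fallback, the UnicodeDecodeError propagates to the caller.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if fallback_encoding is None:
            raise
        return raw.decode(fallback_encoding)


# A Latin-2 client sending "kůň" without converting it to UTF-8:
raw_name = u"kůň".encode("iso-8859-2")
print(decode_path(raw_name, fallback_encoding="iso-8859-2"))
```

One caveat with this design: a byte string in another encoding can occasionally happen to be valid UTF-8, in which case it would be decoded as UTF-8 and yield the wrong name without any error.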
The FTP frontend does not support Unicode at all; that is ticket #682. However, if #682 were fixed by implementing RFC 2640, then the FTP frontend would also only support UTF-8.
We can improve the error message for clients that display it.
It is a bug in Total Commander that it doesn't display the message. Also, if "Internal server error" was reported for SFTP then that description is inaccurate -- FX_FAILURE just means an error that has no more specific code. In this case we could arguably report FX_BAD_MESSAGE, though.
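The choice of status code can be sketched as follows. The constants mirror the numeric values of SSH_FX_OK, SSH_FX_FAILURE and SSH_FX_BAD_MESSAGE from the SFTP drafts; the function `status_for_path` is a hypothetical name and does not reflect how the Tahoe-LAFS frontend is actually structured.

```python
# Numeric status codes as defined in the SFTP drafts.
FX_OK = 0
FX_FAILURE = 4        # generic error with no more specific code
FX_BAD_MESSAGE = 5    # badly formatted packet or protocol incompatibility


def status_for_path(raw, use_bad_message=True):
    """Return (status_code, message) for a raw path from a client (hypothetical helper).

    A path that is not valid UTF-8 is reported as FX_BAD_MESSAGE when
    use_bad_message is True, otherwise as the generic FX_FAILURE.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        code = FX_BAD_MESSAGE if use_bad_message else FX_FAILURE
        return code, "Path could not be decoded as UTF-8"
    return FX_OK, ""


print(status_for_path(b"dovolen\xe1"))
```

Either code carries the same message text, so a client that displays the message shows the cause regardless of which code is chosen.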
comment:4 Changed at 2010-06-17T21:45:11Z by davidsarah
wiki:SftpFrontend#Unicodefilenames already documented this problem, but I added a reference to this ticket.
comment:5 in reply to: ↑ 1 Changed at 2010-06-17T21:47:24Z by davidsarah
comment:6 follow-up: ↓ 8 Changed at 2010-06-20T23:24:02Z by slush
@David:
You are right, I fixed problems with UTF-8 in WinSCP by setting correct charset. I think it should be mentioned in the SFTP frontend docs (because many other SFTP servers I'm using work without any settings changes).
comment:7 Changed at 2010-06-21T01:55:08Z by davidsarah
- Keywords ftpd added; ftp removed
- Milestone changed from undecided to soon
- Owner set to davidsarah
- Status changed from new to assigned
comment:8 in reply to: ↑ 6 ; follow-up: ↓ 11 Changed at 2010-06-21T01:57:02Z by davidsarah
- Owner changed from davidsarah to slush
- Status changed from assigned to new
Replying to slush:
@David:
You are right, I fixed problems with UTF-8 in WinSCP by setting correct charset.
Please change the 'Unicode filenames' section of wiki:SftpFrontend to explain how to do this.
comment:9 Changed at 2010-06-21T03:11:01Z by davidsarah
- Keywords names added
comment:10 Changed at 2010-06-21T20:59:57Z by zooko
- Version changed from 1.7β to 1.7.0
comment:11 in reply to: ↑ 8 Changed at 2010-07-11T22:07:40Z by davidsarah
- Milestone changed from soon to eventually
- Owner changed from slush to nobody
Replying to davidsarah:
Replying to slush:
You are right, I fixed problems with UTF-8 in WinSCP by setting correct charset.
Please change the 'Unicode filenames' section of wiki:SftpFrontend to explain how to do this.
Done.
comment:12 Changed at 2011-02-03T00:02:21Z by davidsarah
- Priority changed from major to minor
- Summary changed from SFTP and FTP: Path could not be decoded as UTF-8 to SFTP and FTP: support for non-UTF-8 charsets (error message "Path could not be decoded as UTF-8")
I'm unconvinced that supporting non-UTF-8 encodings is worth the hassle and complexity. Does anyone want to argue in favour of it?
Note that neither SFTP nor FTP has any standard by which a specific non-UTF-8 encoding could be automatically negotiated, so this would have to be manually configured. But clients that do a reasonable job of supporting non-ASCII characters at all usually have an option to select UTF-8. So I think that support for other encodings would benefit very few users.
comment:13 Changed at 2013-07-27T12:54:22Z by amontero
- Cc amontero@… added
- Description modified (diff)
And when I create a directory through the web interface, I cannot enter it using an FTP client (TC tells me "Internal server error").