[tahoe-dev] String encoding in tahoe
Dan McNair
glucnac at gmail.com
Tue Dec 23 14:09:25 PST 2008
A collection of more or less random thoughts follows.
I think that ignoring the encoding issue will work more often than assuming
the encoding is utf-8.
Ignoring encoding will only break if what the user passes in on the command
line is unsupported by the filesystem. This is more like a user error than
an application error. Our responsibility should be limited to gracefully
alerting the user to the problem, as opposed to dying with a cryptic
exception.
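A rough sketch of what "gracefully alerting the user" could look like, in Python 3 syntax. The helper name check_filename and the wording of the message are made up for illustration; this is not existing Tahoe code, and the encoding is a parameter only so the behaviour is easy to try out:

```python
import sys


def check_filename(name, encoding=None):
    """Return (ok, message) for a filename given as a text string.

    Hypothetical helper: it only checks whether `name` can be represented
    in `encoding` (the filesystem encoding by default) and builds a
    readable message instead of letting UnicodeEncodeError escape.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError as exc:
        message = "cannot represent %r in encoding %s (problem at character %d)" % (
            name, encoding, exc.start)
        return False, message
    return True, ""


if __name__ == "__main__":
    # Check every command-line argument and report problems on stderr.
    for arg in sys.argv[1:]:
        ok, message = check_filename(arg)
        if not ok:
            sys.stderr.write("error: " + message + "\n")
```

The point is only that the failure is caught at the edge and reported in terms of the filename the user typed, rather than surfacing as a traceback from somewhere deeper in the code.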
FWIW: The current 'default' encoding and 'filesystem' encoding can both be
queried in the sys module (sys.getdefaultencoding() and
sys.getfilesystemencoding()). Do we need to confirm that '/' isn't munged
by the encoding? Something like:
    import sys
    assert u'/'.encode(sys.getfilesystemencoding()) == b'/'
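To make the '/' question a bit more concrete, here is a small sketch (Python 3 syntax) that prints both encodings and tries the same check against a handful of codecs. The function name slash_survives and the list of codecs are my own picks for illustration:

```python
import sys


def slash_survives(encoding):
    """True if '/' encodes to the single byte 0x2F in `encoding`."""
    return "/".encode(encoding) == b"/"


if __name__ == "__main__":
    print("default encoding:   ", sys.getdefaultencoding())
    print("filesystem encoding:", sys.getfilesystemencoding())
    # A few codecs chosen arbitrarily; utf-16 is included as a known
    # counterexample, since it adds a BOM and uses two bytes per character.
    for enc in ("ascii", "utf-8", "latin-1", "shift_jis", "utf-16"):
        print(enc, slash_survives(enc))
```

ASCII-compatible encodings pass, so the check mostly matters if the reported filesystem encoding is something like utf-16.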
Adding CLI options to control encoding/decoding might be useful for power
users, but otherwise I think it should be left alone. To be honest, I can't
even dream up a situation in which having the options would help.
Curious: does Tahoe support arbitrary binary strings as filenames in the
backend, or does it only accept certain encodings? HTTP certainly supports
arbitrary byte sequences, ugly though it may be. I don't recall anything
from my scan of the DIR2 documentation that would cause problems with
filenames in arbitrary encodings.
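On the HTTP side, what I have in mind is plain percent-encoding of the raw bytes. The sketch below (Python 3 syntax) uses the standard urllib.parse functions; the helper names are made up, and it says nothing about how Tahoe's web API actually treats filenames:

```python
from urllib.parse import quote, unquote_to_bytes


def to_url_segment(raw_name):
    """Percent-encode a filename given as raw bytes into one URL path segment.

    safe="" means '/' is escaped as well, so a name containing a slash
    cannot be mistaken for two path components.
    """
    return quote(raw_name, safe="")


def from_url_segment(segment):
    """Recover the original bytes from a percent-encoded path segment."""
    return unquote_to_bytes(segment)


if __name__ == "__main__":
    raw = b"caf\xe9"              # latin-1 bytes, not valid utf-8
    segment = to_url_segment(raw)
    print(segment)
    print(from_url_segment(segment) == raw)
```

The bytes round-trip without anyone having to know which encoding they were in, which is the "ugly but works" property mentioned above.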