[tahoe-dev] #534: "tahoe cp" command encoding issue
Francois Deppierraz
francois at ctrlaltdel.ch
Fri Feb 27 14:58:52 PST 2009
zooko wrote:
> In that case, we don't need the separate base32-encoded bytestring,
> just the flag to say whether the child name element was the result of
> a successful decode using the encoding declared by the filesystem, or
> else the result of a "fallback" latin-1 decode.
What about using the object replacement character (U+FFFC) described on
[1] as a flag?
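import sys

def fs_to_unicode(s):
    # Decode a filesystem bytestring; flag names that need the latin-1 fallback.
    encoding = sys.getfilesystemencoding()
    try:
        return unicode(s, encoding)
    except UnicodeDecodeError:
        # Prefix with U+FFFC so the fallback decode can be recognised later.
        return u"\ufffc" + s.decode('latin-1')

def unicode_to_fs(s):
    # Inverse of fs_to_unicode: flagged names go back through latin-1.
    encoding = sys.getfilesystemencoding()
    if s.startswith(u"\ufffc"):
        return s[1:].encode('latin-1')
    else:
        # Unflagged names use the filesystem encoding, mirroring the decode.
        return s.encode(encoding)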
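To check that the flag really lets a name round-trip, here is a rough Python 3 sketch of the same idea. The explicit `encoding` parameter and the `FLAG` constant are assumptions added only so the example behaves the same on any machine; they are not part of the proposal above.

```python
import sys

# U+FFFC marks names that could not be decoded with the declared encoding.
FLAG = "\ufffc"

def fs_to_unicode(raw, encoding=None):
    """Decode filename bytes; fall back to latin-1 and prepend FLAG on failure."""
    # `encoding` is a hypothetical override, defaulting to the filesystem encoding.
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return FLAG + raw.decode("latin-1")

def unicode_to_fs(name, encoding=None):
    """Inverse of fs_to_unicode: flagged names are re-encoded as latin-1."""
    encoding = encoding or sys.getfilesystemencoding()
    if name.startswith(FLAG):
        return name[1:].encode("latin-1")
    return name.encode(encoding)

# One name that is valid UTF-8 and one that is not (a bare latin-1 byte).
valid = "\u00e9.txt".encode("utf-8")
invalid = b"\xe9.txt"
for raw in (valid, invalid):
    # Both should come back as the exact bytes we started with.
    assert unicode_to_fs(fs_to_unicode(raw, "utf-8"), "utf-8") == raw
```

Because latin-1 maps every byte to a code point, the fallback decode never fails, so any bytestring survives the round trip under this scheme.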
Am I missing something?
François
[1] http://en.wikipedia.org/wiki/Replacement_character