[tahoe-dev] #534: "tahoe cp" command encoding issue

Francois Deppierraz francois at ctrlaltdel.ch
Fri Feb 27 14:58:52 PST 2009


zooko wrote:

> In that case, we don't need the separate base32-encoded bytestring,  
> just the flag to say whether the child name element was the result of  
> a successful decode using the encoding declared by the filesystem, or  
> else the result of a "fallback" latin-1 decode.

What about using the object replacement character (U+FFFC), described
at [1], as the flag?


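import sys

def fs_to_unicode(s):
  # Decode a filename (byte string) read from the filesystem.
  encoding = sys.getfilesystemencoding()
  try:
    return unicode(s, encoding)
  except UnicodeDecodeError:
    # latin-1 decoding never fails; the U+FFFC prefix flags the fallback.
    return u"\ufffc" + s.decode('latin-1')

def unicode_to_fs(s):
  # Reverse of fs_to_unicode: recover the original bytes.
  encoding = sys.getfilesystemencoding()
  # startswith() also copes with an empty name, unlike s[0].
  if s.startswith(u"\ufffc"):
    return s[1:].encode('latin-1')
  else:
    return s.encode(encoding)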
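
To make the intended round trip concrete, here is a minimal Python 3 sketch of the same idea. It is not the Tahoe code: the optional encoding parameter and the FLAG constant are assumptions added only so the example is deterministic and does not depend on the local filesystem encoding.

```python
import sys

# U+FFFC OBJECT REPLACEMENT CHARACTER, used here as the "fallback decode" flag.
FLAG = "\ufffc"

def fs_to_unicode(raw, encoding=None):
    """Decode filename bytes; prefix FLAG if the latin-1 fallback was needed."""
    # The encoding argument is hypothetical; by default use the filesystem's.
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail.
        return FLAG + raw.decode("latin-1")

def unicode_to_fs(name, encoding=None):
    """Reverse of fs_to_unicode: recover the original filename bytes."""
    encoding = encoding or sys.getfilesystemencoding()
    if name.startswith(FLAG):
        # Flagged name: drop the flag and undo the latin-1 fallback.
        return name[1:].encode("latin-1")
    return name.encode(encoding)

# A name that is valid UTF-8 and one that is not (latin-1 bytes).
good = "caf\u00e9".encode("utf-8")
bad = b"caf\xe9"

decoded_good = fs_to_unicode(good, "utf-8")
decoded_bad = fs_to_unicode(bad, "utf-8")
```

With a declared encoding of UTF-8, only the second name picks up the flag, and both names convert back to the bytes they started from.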


Am I missing something?

François

[1] http://en.wikipedia.org/wiki/Replacement_character


