[tahoe-dev] #534: "tahoe cp" command encoding issue
    Francois Deppierraz 
    francois at ctrlaltdel.ch
       
    Fri Feb 27 14:58:52 PST 2009
    
    
  
zooko wrote:
> In that case, we don't need the separate base32-encoded bytestring,  
> just the flag to say whether the child name element was the result of  
> a successful decode using the encoding declared by the filesystem, or  
> else the result of a "fallback" latin-1 decode.
What about using the object replacement character (U+FFFC), described at
[1], as the flag?
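
import sys

def fs_to_unicode(s):
  # Decode a filename using the filesystem encoding; on failure, fall back
  # to latin-1 and prefix U+FFFC to flag the fallback.
  encoding = sys.getfilesystemencoding()
  try:
    return unicode(s, encoding)
  except UnicodeDecodeError:
    return u"\ufffc" + s.decode('latin-1')

def unicode_to_fs(s):
  # A leading U+FFFC means the name came from the latin-1 fallback.
  encoding = sys.getfilesystemencoding()
  if s.startswith(u"\ufffc"):
    return s[1:].encode('latin-1')
  else:
    return s.encode(encoding)
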
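For a quick sanity check, here is a minimal Python 3 sketch of the same idea (the snippet above is Python 2). The FLAG constant and the explicit encoding parameter are additions for illustration only and are not part of the proposal; passing the encoding in just makes the behaviour independent of the local filesystem setting.

```python
import sys

# U+FFFC OBJECT REPLACEMENT CHARACTER, used here as the "fallback" marker.
FLAG = "\ufffc"


def fs_to_unicode(raw, encoding=None):
    """Decode filename bytes; on failure, flag the name and decode as latin-1."""
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this never fails.
        return FLAG + raw.decode("latin-1")


def unicode_to_fs(name, encoding=None):
    """Invert fs_to_unicode: flagged names go back through latin-1."""
    encoding = encoding or sys.getfilesystemencoding()
    if name.startswith(FLAG):
        return name[1:].encode("latin-1")
    return name.encode(encoding)


good = "fran\u00e7ois.txt".encode("utf-8")  # valid UTF-8 bytes
bad = b"fran\xe7ois.txt"                    # latin-1 bytes, invalid as UTF-8

# Both kinds of name survive a round trip on a UTF-8 filesystem.
assert unicode_to_fs(fs_to_unicode(good, "utf-8"), "utf-8") == good
assert unicode_to_fs(fs_to_unicode(bad, "utf-8"), "utf-8") == bad

# Possible catch: a name that legitimately starts with U+FFFC decodes
# cleanly, but is then treated as flagged on the way back.
tricky = (FLAG + "x").encode("utf-8")
assert unicode_to_fs(fs_to_unicode(tricky, "utf-8"), "utf-8") != tricky
```

The last assertion shows one case the flag cannot distinguish: a correctly decoded name whose first character is itself U+FFFC. That is probably rare in practice, but it means the marker is not strictly unambiguous.
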
Am I missing something?
François
[1] http://en.wikipedia.org/wiki/Replacement_character