[tahoe-dev] [tahoe-lafs] #534: "tahoe cp" command encoding issue

tahoe-lafs trac at allmydata.org
Fri Apr 10 09:51:16 PDT 2009


#534: "tahoe cp" command encoding issue
-----------------------------------+----------------------------------------
     Reporter:  francois           |       Owner:  francois                          
         Type:  defect             |      Status:  assigned                          
     Priority:  minor              |   Milestone:  1.5.0                             
    Component:  code-frontend-cli  |     Version:  1.2.0                             
   Resolution:                     |    Keywords:  cp encoding unicode filename utf-8
Launchpad_bug:                     |  
-----------------------------------+----------------------------------------

Comment(by zooko):

 Hm.  I just learned that the {{{windows-1252}}} encoding is a superset of
 the {{{iso-8859-1}}} a.k.a. {{{latin-1}}} encoding:

 http://en.wikipedia.org/wiki/Windows-1252

 The difference is that the bytes 0x80-0x9F, which are mapped to C1 control
 characters in {{{iso-8859-1}}}, are mostly mapped to printable characters
 in {{{windows-1252}}}.  (All other byte values map to the same characters
 in both encodings, so ordering doesn't matter for this purpose.)

 Does that mean that, in the mojibake fallback we use when decoding fails,
 decoding with {{{windows-1252}}} instead of {{{iso-8859-1}}} would leave
 fewer control characters in the resulting unicode string?  That sounds
 like an improvement.
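
 A minimal sketch of the comparison, using only Python's standard codecs.
 The helper names {{{count_control_chars}}} and {{{fallback_decode}}} are
 hypothetical and are not part of the tahoe code base.  One caveat the
 sketch makes visible: Python's {{{windows-1252}}} codec leaves five bytes
 (0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined, so unlike {{{iso-8859-1}}} it can
 itself fail and still needs a last-resort fallback.

```python
import unicodedata

# The two encodings differ only for bytes 0x80-0x9F.
C1_RANGE = bytes(range(0x80, 0xA0))


def count_control_chars(text):
    """Count characters whose Unicode general category is Cc (control)."""
    return sum(1 for ch in text if unicodedata.category(ch) == "Cc")


def fallback_decode(raw):
    """Hypothetical mojibake fallback (an assumption, not tahoe's code).

    Try windows-1252 first; if the input contains one of the bytes that
    windows-1252 leaves undefined, decode the whole string as iso-8859-1,
    which accepts every byte value.
    """
    try:
        return raw.decode("windows-1252")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")


# iso-8859-1 turns every byte in the range into a C1 control character.
latin1_text = C1_RANGE.decode("iso-8859-1")

# windows-1252 maps most of them to printable characters; the undefined
# bytes become U+FFFD here because of errors="replace".
cp1252_text = C1_RANGE.decode("windows-1252", errors="replace")

print(count_control_chars(latin1_text))
print(count_control_chars(cp1252_text))
print(repr(fallback_decode(b"caf\xe9 \x93quoted\x94")))
```

 Note that this fallback switches encodings for the whole string as soon
 as one undefined byte appears; decoding with {{{errors="replace"}}} would
 be an alternative, at the cost of losing the original byte values.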

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/534#comment:57>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid

