[tahoe-dev] #534: "tahoe cp" command encoding issue

Terrell Russell terrellrussell at gmail.com
Thu Feb 26 10:25:49 PST 2009


On 2/26/09 12:56 PM, zooko wrote:
> Strategy 2: decode the filename using the declared codec of the
> filesystem, if that fails, just copy the bytes without decoding them.
>
> And, we must mark down somewhere that this is a "just the bytes"
> filename instead of a utf-8 encoded filename.  I think the easiest
> place to mark this down might be to add a flag to the "metadata" dict
> associated with that name, something like "unknown_codec: True".
>
> <snip>
>
> Note that this strategy could cause failures in older tahoe clients
> which are expecting utf-8 encoded names in the name field.  They
> could get a decode error.  Newer tahoe clients would know to check
> for the "unknown_codec" flag before decoding.  Hm -- that doesn't
> sound good.  I can think of three options:
>
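
For reference, the Strategy 2 quoted above could be sketched roughly as
follows.  This is only an illustration under stated assumptions: the
function names and the shape of the metadata dict are hypothetical (not
Tahoe's actual API), and it is written for Python 3, where filenames
read as bytes must be decoded explicitly.  Only the "unknown_codec"
flag name comes from the proposal itself.

```python
def name_for_upload(raw_name: bytes, fs_codec: str):
    """Hypothetical writer side of Strategy 2.

    Try the filesystem's declared codec first.  On success, store the
    name as utf-8.  On failure, store the original bytes untouched and
    flag them so that readers know not to decode them.
    """
    try:
        text = raw_name.decode(fs_codec)
    except UnicodeDecodeError:
        # "Just the bytes" case: keep the name as-is and mark it.
        return raw_name, {"unknown_codec": True}
    return text.encode("utf-8"), {}


def name_for_display(stored_name: bytes, metadata: dict):
    """Hypothetical reader side for a newer client.

    Check the flag before decoding.  An older client that skips this
    check and always decodes as utf-8 is the one that could hit a
    decode error.
    """
    if metadata.get("unknown_codec"):
        return stored_name  # raw bytes, codec unknown
    return stored_name.decode("utf-8")


if __name__ == "__main__":
    # b"caf\xe9" is latin-1 for "cafe" with an accented e; it is not valid utf-8.
    print(name_for_upload(b"caf\xe9", "latin-1"))
    print(name_for_upload(b"caf\xe9", "utf-8"))
```

In the second call the declared codec is wrong for the bytes on disk,
so the name is stored undecoded with the flag set.  Calling
.decode("utf-8") on that stored name directly, as an older client
would, raises UnicodeDecodeError, which is the compatibility concern
raised in the quoted text.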

How many older tahoe clients are we talking about?  What's the deployed 
base here?  5?  10 users?  Thousands?

Is this not something we can live with?  I mean, the people who are
pushing around 'funky' characters live with this stuff all the time.
Aren't we making their lives easier in the long run by helping them
out here, early?  And wouldn't they love to upgrade real-soon-now if
they're seeing these types of issues?  Clamoring, even...

Since the 100% happy case isn't presenting itself, 2a seems the best 
option moving forward.

Strategy 2b seems awkward, and Strategy 2c has the word "magical" in it.


I vote 2a.

Terrell

