[tahoe-dev] [tahoe-lafs] #534: "tahoe cp" command encoding issue

Shawn Willden shawn-tahoe at willden.org
Thu Feb 26 04:53:15 PST 2009


On Thursday 26 February 2009 01:43:06 am Andrej Falout wrote:
> I was hoping that the patch had the intention of allowing all valid encoded
> file names to be backed up?

It does *exactly* that.

The issue is that the filename in question is not a valid encoding (or so it 
appears).  The other tools you refer to are blindly backing up the file 
without regard to encoding issues.  Tahoe backup is trying to maximize the 
chance that the file name will be fully interpretable even if you restore the 
file to a system that uses an entirely different encoding.  This is a 
laudable goal, but one that requires that the names in the file system 
actually be encoded in the file system encoding.

To make sure that the name in question actually isn't valid UTF-8, can you 
change into the directory containing the file and run the following in a 
Python interpreter (or put it in a file and run it from the directory).

import codecs, os
lst = os.listdir('.')
for name in lst: print codecs.escape_encode(name)

Actually, we don't need the whole output, just the line about the file in 
question.

	Shawn.


More information about the tahoe-dev mailing list