[tahoe-dev] [tahoe-lafs] #897: "tahoe backup" thinks "ctime" means "creation time"

tahoe-lafs trac at allmydata.org
Tue Jan 12 15:36:42 PST 2010


#897: "tahoe backup" thinks "ctime" means "creation time"
-----------------------------------------------------+----------------------
 Reporter:  zooko                                    |           Owner:  nobody   
     Type:  defect                                   |          Status:  new      
 Priority:  major                                    |       Milestone:  undecided
Component:  unknown                                  |         Version:  1.5.0    
 Keywords:  forward-compatibility docs tahoe-backup  |   Launchpad_bug:           
-----------------------------------------------------+----------------------
 backupdb seems to think "ctime" means "creation time", which it does, but
 only on Windows.

 This has three consequences: the documentation contains an incorrect
 statement; "tahoe backup" unnecessarily re-uploads files when only the
 ownership or permission bits have changed and the file contents have not;
 and "tahoe backup" incorrectly conflates "unix change time" with "file
 creation time" when used on Windows.  So this ticket covers three bugs,
 but they are all closely related and should probably be fixed at once.

 I noticed in [source:docs/backupdb.txt@4111#L84] that the backupdb docs
 mention "creation time".  POSIX doesn't provide a "creation time" but it
 does provide a "change time", abbreviated "ctime", which most people
 mistakenly think is a "creation time".  Windows ''does'' provide a
 "creation time", and unfortunately Python provides unix "change time" and
 Windows "creation time" in the same slot -- the {{{st_ctime}}} field of the
 result of {{{os.stat()}}}.  Here is my [http://bugs.python.org/issue5720 bug
 report] saying that the Python stdlib is wrong to do this, and that any
 Python code which uses the Python stdlib is wrong unless it immediately
 disambiguates.

 In particular, it is a bug for any Tahoe-LAFS code to read the
 {{{st_ctime}}} member without immediately switching on whether the current
 platform is Windows or not.  If you read the {{{st_ctime}}} member and do
 not use the current platform to disambiguate, then you have a value whose
 semantics are uninterpretable without guessing what platform that value
 was generated on.
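
 A minimal sketch of that disambiguation, assuming a hypothetical helper
 named {{{labelled_ctime}}} (it is not part of Tahoe-LAFS) and hypothetical
 label strings.  It pairs the raw {{{st_ctime}}} value with an explicit
 statement of what that value means on the platform that produced it:

```python
import os
import sys


def labelled_ctime(path, platform=sys.platform):
    """Return (label, value) so st_ctime never travels without its meaning.

    Hypothetical helper for illustration; the label strings are assumptions.
    """
    value = os.stat(path).st_ctime
    if platform == "win32":
        # On Windows, st_ctime has historically held the file creation time.
        return ("file creation time", value)
    # On POSIX systems, st_ctime is the inode change time.
    return ("unix change time", value)
```

 The {{{platform}}} parameter defaults to the running platform; it is only
 exposed so that the branch can be exercised in tests.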

 In particular, for "tahoe backup" purposes, it is probably a mistake to
 say that a new {{{ctime}}} means that the file needs to be uploaded again.
 Unix and Windows both guarantee that the {{{mtime}}} will be changed if
 the file contents have changed, and therefore if {{{mtime}}} is unchanged
 then the file contents are unchanged, even if the {{{ctime}}} has changed.
 On the other hand the {{{ctime}}} changes on Unix even when the file
 contents have not changed, such as if ownership or permission bits have
 changed.  So if only the {{{ctime}}} has changed then "tahoe backup" might
 want to set the new {{{ctime}}} value on the link leading to that file,
 but it should not reupload the file contents.
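
 The decision described above could be sketched roughly as follows.  This
 is not the actual backupdb logic: {{{FileState}}}, {{{classify_change}}}
 and the returned strings are hypothetical names, and comparing size
 alongside {{{mtime}}} is an added assumption.

```python
from collections import namedtuple

# Hypothetical record of what was seen the last time a file was backed up.
FileState = namedtuple("FileState", ["size", "mtime", "ctime"])


def classify_change(previous, current):
    """Decide what a backup run should do with one file.

    Only size and mtime decide whether contents are re-uploaded; a changed
    ctime alone leads to a metadata update at most.
    """
    if previous is None:
        return "upload"  # never backed up before
    if (previous.size, previous.mtime) != (current.size, current.mtime):
        return "upload"  # contents may have changed
    if previous.ctime != current.ctime:
        return "update-metadata"  # e.g. chmod/chown on Unix
    return "unchanged"
```

 With this shape, a {{{chmod}}} on Unix changes only {{{ctime}}} and so
 results in "update-metadata" rather than a re-upload.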

 In addition, I think "tahoe backup" should disambiguate between "unix
 change time" and "creation time" in the metadata that it stores.  Why not
 rename the metadata stored on the tahoe-lafs filesystem edge from the
 ambiguous and widely misunderstood "ctime" to something like "unix change
 time"?  Then on non-Windows platforms you can set it from the local
 filesystem's {{{ctime}}} on upload and set the local filesystem's
 {{{ctime}}} from it on download.  On Windows, on the other hand, it is a
 bug to set the "unix change time" from the local filesystem's
 {{{ctime}}}, although it would be correct to set a different metadata
 entry named {{{file creation time}}} from the local filesystem's
 {{{ctime}}}.

 See also #628, which is about the same issue in "tahoe cp", includes a
 taxonomy of filesystem "ctime" semantics, and includes a satisfactory
 backward-compatible solution that was shipped in Tahoe-LAFS v1.4.1.

 I'm tagging this ticket with "forward-compatibility" because we'll
 eventually have to clarify these semantics and the longer we ship a tool
 that uploads ambiguous data the harder it will be to fix.

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/897>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid