#897 new defect

"tahoe backup" thinks "ctime" means "creation time" — at Initial Version

Reported by: zooko
Owned by: nobody
Priority: major
Milestone: soon
Component: code-frontend-cli
Version: 1.6.1
Keywords: forward-compatibility docs tahoe-backup time
Cc: kpreid@…, zooko
Launchpad Bug:

Description

backupdb seems to think "ctime" means "creation time", which it does, but only on Windows.

This causes three problems: the documentation contains an incorrect statement; "tahoe backup" unnecessarily re-uploads files whose ownership or permission bits have changed but whose contents have not; and, on Windows, "tahoe backup" incorrectly maps "file creation time" onto "unix change time". So this ticket covers three bugs, but they are closely related and should probably be fixed together.

I noticed in docs/backupdb.txt@4111#L84 that the backupdb docs mention "creation time". POSIX doesn't provide a "creation time" but it does provide a "change time", abbreviated "ctime", which most people mistakenly think is a "creation time". Windows does provide a "creation time", and unfortunately Python provides unix "change time" and Windows "creation time" in the same slot: the st_ctime field of the os.stat() result. Here is my bug report saying that the Python stdlib is wrong to do this, and that any Python code which reads this field is wrong unless it immediately disambiguates.

In particular, it is a bug for any Tahoe-LAFS code to read the st_ctime member without immediately switching on whether the current platform is Windows or not. If you read the st_ctime member and do not use the current platform to disambiguate, then you have a value whose semantics are uninterpretable without guessing what platform that value was generated on.
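As a minimal sketch of what "immediately switching on the platform" could look like: the helper name read_ctime and the tags "change" and "creation" are hypothetical, not existing Tahoe-LAFS code, and the sketch assumes that sys.platform == "win32" is an adequate test for Windows.

```python
import os
import sys


def read_ctime(path, platform=sys.platform):
    """Return (semantics, timestamp) so the value is never passed around untagged.

    Hypothetical helper: the tag records what st_ctime meant on the
    platform where it was read.
    """
    value = os.stat(path).st_ctime
    if platform == "win32":
        # On Windows, st_ctime has historically held the file creation time.
        return ("creation", value)
    # On POSIX, st_ctime is the inode change time (chmod, chown, writes, ...).
    return ("change", value)
```

The platform parameter exists only so that both branches can be exercised on one machine; real callers would leave it at its default.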

For "tahoe backup" specifically, it is probably a mistake to say that a new ctime means that the file needs to be uploaded again. Unix and Windows both guarantee that the mtime will be changed if the file contents have changed, and therefore if the mtime is unchanged then the file contents are unchanged, even if the ctime has changed. On the other hand, the ctime changes on Unix even when the file contents have not changed, for example when ownership or permission bits have changed. So if only the ctime has changed then "tahoe backup" might want to set the new ctime value on the link leading to that file, but it should not re-upload the file contents.
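The rule above can be sketched as a small decision function. This is an illustration only, not the backupdb implementation: FileTimes, plan_backup and the three result strings are hypothetical names, and the size comparison is an added assumption that the text above does not mention.

```python
from collections import namedtuple

# Hypothetical record of what was stored for a file at the last backup.
FileTimes = namedtuple("FileTimes", ["size", "mtime", "ctime"])


def plan_backup(previous, current):
    """Decide what to do with a file, given its previous and current times."""
    if previous is None:
        # Never backed up before.
        return "upload"
    if current.size != previous.size or current.mtime != previous.mtime:
        # Contents may have changed (size check is an extra assumption).
        return "upload"
    if current.ctime != previous.ctime:
        # Only ownership/permissions (or similar) changed: refresh the
        # metadata on the link, but do not re-upload the contents.
        return "update-metadata"
    return "skip"
```

A ctime-only change therefore costs one metadata update rather than a full re-upload.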

In addition, I think "tahoe backup" should disambiguate between "unix change time" and "creation time" in the metadata that it stores. Why not change the name of the metadata stored in the tahoe-lafs filesystem edge from the ambiguous and widely misunderstood "ctime" to something like "unix change time"? Then on non-Windows platforms you can set that from the local filesystem's ctime on upload and set the local filesystem's ctime from that on download. On Windows, on the other hand, it is a bug to set the "unix change time" from the local filesystem's ctime, although it would be correct to set a different metadata entry, named "file creation time", from the local filesystem's ctime.
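A rough sketch of the upload side of this proposal follows. The key names "unix_change_time" and "creation_time" are placeholders for whatever names are eventually chosen, and edge_time_metadata is a hypothetical function, not part of Tahoe-LAFS.

```python
import os
import sys


def edge_time_metadata(stat_result, platform=sys.platform):
    """Build unambiguous time metadata for a directory edge (hypothetical keys)."""
    metadata = {}
    if platform == "win32":
        # Windows: st_ctime is a creation time, so never store it as change time.
        metadata["creation_time"] = stat_result.st_ctime
    else:
        # POSIX: st_ctime is the inode change time.
        metadata["unix_change_time"] = stat_result.st_ctime
    return metadata
```

Because exactly one of the two keys is written, a reader of the metadata never has to guess which platform produced the value.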

See also #628, which is about the same issue in "tahoe cp", includes a taxonomy of filesystem "ctime" semantics, and includes a satisfactory backward-compatible solution that was shipped in Tahoe-LAFS v1.4.1.

I'm tagging this ticket with "forward-compatibility" because we'll eventually have to clarify these semantics and the longer we ship a tool that uploads ambiguous data the harder it will be to fix.

Change History (0)
