#1325 new enhancement

make `tahoe backup` keep more filesystem metadata

Reported by: chrysn
Owned by: nobody
Priority: major
Milestone: undecided
Component: unknown
Version: 1.8.1
Keywords: tahoe-backup metadata symlink hardlink
Cc: amontero@…
Launchpad Bug:

Description (last modified by amontero)

there are a number of problems that keep tahoe backup from replacing rsync-style backups for now. at the core, tahoe-lafs cannot keep all the information that is stored in a posix-style file system. the issues i see are:

  • ctime/mtime are not saved
  • symlinks can not be saved (compare ticket #641, which has been around for two years)
  • other special files can not be saved (devices etc)
  • user, group and permissions are not saved
  • acls are not saved

i am aware that tahoe has its own ways of dealing with permissions, that it has its own time stamps, and that directories work in a way that every directory entry is kind of a link anyway, but that's not the point -- it's about being able to restore a disk's contents from a backup.

from my point of view, symlinks, times and user/group/permissions are the most important of these; device files are nowadays created on a ramdisk on the fly anyway, and acl users know the problem well enough to have their workarounds (afair this is an issue with most backup systems).

implementation-wise, i guess that most if not all of this can be stored in the directory as additional information.
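as a rough illustration of what such additional information could look like, here is a minimal sketch that collects the posix metadata listed above for one path into a json-serializable dict. the function name `collect_posix_metadata` and all key names are made up for this example; they are not an existing tahoe-lafs schema or api, and acls are left out.

```python
import json
import os
import stat


def collect_posix_metadata(path):
    """Return a JSON-serializable dict of POSIX metadata for `path`.

    Hypothetical helper: the key names are invented for this sketch
    and do not correspond to any Tahoe-LAFS metadata format.
    """
    # lstat describes a symlink itself instead of following it
    st = os.lstat(path)
    meta = {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits only
        "type": stat.S_IFMT(st.st_mode),   # file type bits (regular, dir, symlink, ...)
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mtime": st.st_mtime,
        # on POSIX, ctime is the inode change time; it can be recorded,
        # but a restore cannot set it back
        "ctime": st.st_ctime,
    }
    if stat.S_ISLNK(st.st_mode):
        # keep the link target so the symlink can be recreated on restore
        meta["symlink_target"] = os.readlink(path)
    return meta


if __name__ == "__main__":
    print(json.dumps(collect_posix_metadata("."), indent=2))
```

a restore tool could feed such a dict back into `os.chmod`, `os.chown`, `os.utime` and `os.symlink`; where the dict would live inside tahoe-lafs is exactly the open question of this ticket.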

if it is possible in trac, i suggest marking all related bugs as "blocking" this bug.

is this something that is realistic to achieve for tahoe-lafs?

Change History (5)

comment:1 Changed at 2011-01-16T09:22:30Z by zooko

I think this would require #307 (maybe add node metadata? (in addition to edge metadata)) and/or #947 (Add file-with-metadata caps). (Hm, maybe those two tickets should be merged.)

comment:2 Changed at 2011-01-18T14:39:02Z by chrysn

two other issues came to my mind related to this, though both in the low-priority class:

  • sparse files (might actually be implemented, didn't test it)
  • hardlinks

hardlinks are not much of an issue on the server side, because the node doing the backup uses the same convergence key, but on restore the file gets duplicated. (hardlinking all files that share a readcap is not a good idea either, as they might originally have been distinct files with equal contents.)

it might be reasonable to implement this by saving each file's device and inode number (i figure there has to be something compatible for each file system that provides hard links). that would solve it for the backup case (where all files are created more-or-less atomically), but is probably the wrong approach for a more general case where one wants to create arbitrary hard-links in tahoe-lafs. (one could argue that identical mutable file hashes are equivalent to hard-links, but then again that wouldn't work out too well for the backup scenario.)
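the device/inode idea can be sketched as follows, assuming a plain walk over the local tree at backup time. `find_hardlink_groups` is a hypothetical helper written for this comment, not part of `tahoe backup`: it groups regular files by their `(st_dev, st_ino)` pair, so only paths that really share an inode end up together, and files that merely have equal contents stay separate.

```python
import os
import stat
from collections import defaultdict


def find_hardlink_groups(root):
    """Map (st_dev, st_ino) to the sorted list of paths sharing that inode.

    Hypothetical helper for illustration; only inodes that are reached
    through more than one path under `root` are returned.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            # only regular files with a link count above one can be hardlinked
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    # drop inodes whose other links live outside the walked tree
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

on restore, the first path of each group would be written normally and the remaining ones recreated with `os.link`. this relies on the inode numbers staying stable for the duration of one backup run, which is the more-or-less atomic case described above.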

comment:3 Changed at 2011-01-27T13:31:36Z by zooko

  • Summary changed from make `tahoe backup` useable as a replacement for rsync to make `tahoe backup` keep more filesystem metadata

comment:4 Changed at 2011-05-21T00:14:39Z by davidsarah

  • Keywords tahoe-backup metadata symlink hardlink added

comment:5 Changed at 2013-11-28T01:50:14Z by amontero

  • Cc amontero@… added
  • Description modified (diff)