#729 closed defect (fixed)
Tahoe backup should WARN and go on when finding errors like: links to deleted files or access/read permission denied in local files/directories
Reported by: | stockrt | Owned by: | francois |
---|---|---|---|
Priority: | major | Milestone: | 1.6.0 |
Component: | code-frontend-cli | Version: | 1.4.1 |
Keywords: | reviewed tahoe-backup symlink permissions reliability news-done | Cc: | stockrt@… |
Launchpad Bug: |
Description (last modified by warner)
I am facing this problem when backing up my files:
stockrt@host ~/ $ tahoe backup -v stockrt tahoe:backups/stockrt Traceback (most recent call last): File "/usr/bin/tahoe", line 8, in <module> load_entry_point('allmydata-tahoe==1.4.1', 'console_scripts', 'tahoe')() File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/runner.py", line 91, in run rc = runner(sys.argv[1:]) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/runner.py", line 78, in runner rc = cli.dispatch[command](so) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/cli.py", line 424, in backup rc = tahoe_backup.backup(options) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 374, in backup return bu.run() File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 216, in run new_backup_dircap = self.process(options.from_dir, latest_backup_dircap) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process newchilddircap = self.process(childpath, oldchildcap) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process newchilddircap = self.process(childpath, oldchildcap) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 266, in process newchilddircap = self.process(childpath, oldchildcap) File "/usr/lib/python2.5/site-packages/allmydata_tahoe-1.4.1-py2.5.egg/allmydata/scripts/tahoe_backup.py", line 272, in process raise BackupProcessingError("Cannot backup this file %r" % childpath) allmydata.scripts.tahoe_backup.BackupProcessingError: Cannot backup this file 'stockrt/Roger/etc/namedb' stockrt@host ~/ $ ll stockrt/Roger/etc/namedb lrwxrwxrwx 1 stockrt stockrt 21 2009-01-29 23:00 stockrt/Roger/etc/namedb -> /var/named/etc/namedb
This destination does not exists in my local filesystem: /var/named/etc/namedb but I have this (broken) link in my local "to backup" directory: stockrt/Roger/etc
I think Tahoe should warn it but do not break just because of it.
The same should occur for read/access permission denied for files and directories.
Regards,
Rogério Schneider
Attachments (3)
Change History (21)
comment:1 Changed at 2009-07-11T11:32:25Z by warner
- Component changed from unknown to code-frontend-cli
- Description modified (diff)
comment:2 Changed at 2009-12-07T02:40:02Z by davidsarah
- Keywords tahoe-backup added; backup file directory removed
- Priority changed from trivial to major
comment:3 Changed at 2009-12-07T02:40:56Z by davidsarah
- Keywords reliability added
comment:4 Changed at 2009-12-21T00:28:45Z by davidsarah
- Keywords symlink permissions added; link permission denied not found removed
The failure for dangling symlinks is #641. (This ticket is not quite a duplicate because it also asks to tolerate other failures.)
comment:5 Changed at 2010-01-10T01:35:31Z by francois
- Owner changed from nobody to francois
- Status changed from new to assigned
Changed at 2010-01-10T01:57:39Z by francois
comment:6 Changed at 2010-01-10T02:03:14Z by francois
- Keywords review-needed added
This patch renders tahoe backup more tolerant to special or unreadable files. A warning is displayed on stderr for each file which was skipped.
I think it's a good first step which allows tahoe backup to work out of the box for most users before it can store all those Unix specialties (see ticket #641).
comment:7 Changed at 2010-01-10T02:09:25Z by francois
- Milestone changed from undecided to 1.6.0
comment:8 Changed at 2010-01-10T06:51:57Z by davidsarah
+ def test_ignore_symlinks(self): + if not hasattr(os, 'symlink'): + raise unittest.SkipTest("There is no symlink on this platform.")
Vista has symbolic links (NTFS junction points), but Windows Python probably doesn't have os.symlink yet. There is a mklink command that could be used to test behaviour with junction points. I don't think that the lack of a test on Windows should hold up this fix, though.
comment:9 Changed at 2010-01-10T06:58:04Z by davidsarah
Wouldn't it be better to catch the exception caused by an access error, rather than testing using os.access etc?
comment:10 Changed at 2010-01-10T08:13:02Z by warner
I agree, although I'd also like to avoid even trying to open some things. It would be pretty unfortunate to discover a copy of /dev/zero or /dev/urandom lying in your backup source tree.
Our discussions on #833 (dealing with illegal mutable children inside an immutable directory) led us to want "tahoe cp" to warn-and-skip "bad" things found on the tahoe side, and we figured that the same policy would be appropriate for "bad" things found on the local disk side (specifically symlinks and device files). I think it'd be reasonable to have "tahoe backup" do the same.
If so, the way I'd do it would be to have both "tahoe backup" and "tahoe cp" test the potential local source filesystem object with os.path.isfile and os.path.isdir before opening or entering it. These tests should rule out special files and symlinks (whether dangling or not). Skipping symlinks to directories is more important than skipping symlinks to files, but I think it'd be ok to skip both unless/until we come up with some reasonable way to record these sorts of things in tahoe directly.
If these tests pass, then we try to open the thing, and if that raises EnvironmentError (as would be caused by permission problems), we should emit a warning and skip over it. It might be nice to count how many such warnings we emit and dump a summary at the end, so that folks using --verbose never get to see that something went wrong. Also, we should consider having the "tahoe backup" exit code be non-zero if anything was skipped due to errors or non-normal-fileness.
comment:11 Changed at 2010-01-19T22:58:26Z by francois
Thanks Brian and David-Sarah for the comments, here is a newer version of the patch which takes your comments into account. What do you think now?
Changed at 2010-01-19T22:58:46Z by francois
comment:12 follow-up: ↓ 13 Changed at 2010-01-20T00:50:54Z by davidsarah
Looks good, just some nitpicks:
- why except OSError: when listing a directory, but except IOError: when processing a file?
- typo in the comment for upload: "when call" should be "when called"
comment:13 in reply to: ↑ 12 Changed at 2010-01-20T08:43:41Z by francois
Replying to davidsarah:
- why except OSError: when listing a directory, but except IOError: when processing a file?
It seems to be the way the Python stdlib works. But as Brian mentioned, I should probably catch the parent exception EnvironmentError instead in both cases.
- typo in the comment for upload: "when call" should be "when called"
Thanks.
I'll send an updated patch.
Changed at 2010-01-20T09:49:17Z by francois
comment:14 Changed at 2010-01-24T21:23:18Z by francois
Brian, Zooko, do you feel comfortable commiting this patch before 1.6?
comment:15 Changed at 2010-01-24T21:37:53Z by davidsarah
- Keywords reviewed added; review-needed removed
Reviewed, can't see any problems. Tests look sufficient.
comment:16 Changed at 2010-01-26T15:32:50Z by zooko
- Resolution set to fixed
- Status changed from assigned to closed
applied in b03406af9d216278. Thank you very much!
comment:17 Changed at 2010-01-27T20:13:28Z by warner
I kinda think we should also be skipping symlinks in general (not just dangling ones). The os.path.isdir and os.path.isfile tests will return True for symlinks that point to directories and files. So if we want to skip symlinks, those tests need to turn into if os.path.isdir() and not os.path.islink(), etc.
comment:18 Changed at 2010-02-02T06:03:51Z by davidsarah
- Keywords news-done added
reformatted traceback