[tahoe-lafs-trac-stream] [tahoe-lafs] #1331: --verify option for `tahoe backup`
tahoe-lafs
trac at tahoe-lafs.org
Thu Nov 28 01:50:28 UTC 2013
#1331: --verify option for `tahoe backup`
-------------------------+-------------------------------------------------
Reporter: chrysn         | Owner: nobody
Type: defect             | Status: new
Priority: major          | Milestone: undecided
Component:               | Version: 1.7.1
  code-frontend-cli      | Keywords: tahoe-backup preservation backupdb
Resolution:              |   gridid verify
Launchpad Bug:           |
-------------------------+-------------------------------------------------
Changes (by amontero):
* cc: amontero@… (added)
New description:
tahoe backup will happily finish even if the files it is supposed to have
backed up are not present on any node.

there are two parts to this problem:

 * the backupdb does not seem to track introducer URLs (e.g. when one backs
   up the same directory to different clouds)
 * the caps that the new backup version relies on are not verified
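
to illustrate the first point, here is a minimal sketch of a lookup table keyed by grid as well as by path, so that a cap remembered for one cloud is never reused for another. this is not tahoe's actual backupdb schema: the `uploads` table, the `grid_id` column and the helper functions are all hypothetical names chosen for the example.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical grid-aware table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE uploads ("
        " grid_id TEXT NOT NULL,"   # assumed identifier for one grid/cloud
        " path TEXT NOT NULL,"
        " filecap TEXT NOT NULL,"
        " PRIMARY KEY (grid_id, path))"
    )
    return db


def remember(db, grid_id, path, filecap):
    """Record the cap that `path` was uploaded under on the given grid."""
    db.execute(
        "INSERT OR REPLACE INTO uploads VALUES (?, ?, ?)",
        (grid_id, path, filecap),
    )


def lookup(db, grid_id, path):
    """Return the remembered cap for this grid and path, or None."""
    row = db.execute(
        "SELECT filecap FROM uploads WHERE grid_id = ? AND path = ?",
        (grid_id, path),
    ).fetchone()
    return row[0] if row else None
```

with such a key, a directory backed up to one grid would simply miss in the table when the same directory is later backed up to another grid, and would be uploaded again.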

while the first may be unfixable for all i know (that is, if tahoe has no
concept of "different clouds"), for the second one i suggest the following:

 * have a --verify option that takes one of four values:
   * none -- rely on the caps remembered in the backupdb being present
   * shallow -- check for the existence of every cap remembered from the
     backupdb
   * deep -- do a deep check on all caps used in the backupdb
   * checksum -- calculate the data checksums of all files involved in
     re-using a cap, and compare them to the reference cap (this requires
     equal convergence secrets)
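
the four proposed values could be sketched roughly as follows. this is only an illustration of the proposal, not existing tahoe code: `VerifyMode`, `find_verify_misses` and the `checker` callable are hypothetical, and `checker` stands in for whatever actually performs the check (for example something built on `tahoe check`).

```python
from enum import Enum
from typing import Callable, Iterable, List


class VerifyMode(Enum):
    NONE = "none"          # trust the caps remembered in the backupdb
    SHALLOW = "shallow"    # check that each remembered cap exists
    DEEP = "deep"          # do a deep check on each cap
    CHECKSUM = "checksum"  # recompute checksums and compare to the cap


# Hypothetical checker signature: returns True if the cap passed the check.
Checker = Callable[[str, VerifyMode], bool]


def find_verify_misses(
    caps: Iterable[str], mode: VerifyMode, checker: Checker
) -> List[str]:
    """Return the caps that failed verification under the given mode."""
    if mode is VerifyMode.NONE:
        # Equivalent to the behaviour described for 1.7.1: nothing is checked.
        return []
    return [cap for cap in caps if not checker(cap, mode)]
```

for example, with a stand-in checker that only knows about one cap, `find_verify_misses(["URI:a", "URI:b"], VerifyMode.SHALLOW, lambda cap, mode: cap == "URI:a")` reports `"URI:b"` as a miss.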

the current implementation (i'm using 1.7.1, but the changelog doesn't
mention anything relevant) does the equivalent of none, which is
especially a problem in combination with the first problem mentioned above.

i'd suggest making at least --verify=shallow the default for backups; it
keeps the backupdb's advantage of O(1) network traffic.

another switch (--verify-fatal or similar) should be added to configure
whether verify misses are treated as fatal or just reported to stderr.
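
a minimal sketch of how the two proposed switches might fit together is shown below. the flags are the ones suggested in this ticket, not existing `tahoe backup` options, and argparse is used only for illustration; tahoe's own CLI may parse options differently. `build_parser` and `report_misses` are hypothetical names.

```python
import argparse
import sys
from typing import List, TextIO


def build_parser() -> argparse.ArgumentParser:
    """Parser for the proposed (not yet existing) verification switches."""
    parser = argparse.ArgumentParser(prog="backup-sketch")
    parser.add_argument(
        "--verify",
        choices=["none", "shallow", "deep", "checksum"],
        default="shallow",  # the default suggested in this ticket
    )
    parser.add_argument("--verify-fatal", action="store_true")
    return parser


def report_misses(misses: List[str], fatal: bool,
                  stream: TextIO = sys.stderr) -> int:
    """Report every miss to the stream; return a non-zero exit code
    only when there were misses and they are configured to be fatal."""
    for cap in misses:
        print(f"verify miss: {cap}", file=stream)
    return 1 if (misses and fatal) else 0
```

without --verify-fatal the misses are still written to stderr, but the returned exit code stays 0.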
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1331#comment:4>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage