[tahoe-lafs-trac-stream] [tahoe-lafs] #1331: --verify option for `tahoe backup`

tahoe-lafs trac at tahoe-lafs.org
Thu Nov 28 01:50:28 UTC 2013


#1331: --verify option for `tahoe backup`
-----------------------------------+----------------------------------------
     Reporter:  chrysn             |      Owner:  nobody
         Type:  defect             |     Status:  new
     Priority:  major              |  Milestone:  undecided
    Component:  code-frontend-cli  |    Version:  1.7.1
   Resolution:                     |   Keywords:  tahoe-backup preservation
Launchpad Bug:                     |              backupdb gridid verify
-----------------------------------+----------------------------------------
Changes (by amontero):

 * cc: amontero@… (added)


New description:

 `tahoe backup` will happily finish even if the files that are to be backed
 up are not present on any node.

 there are two parts of this problem:

 * the backupdb does not seem to track introducer URLs (e.g. when one backs
 up the same directory to different clouds)
 * the caps that the new backup version relies on are not verified

 while the first could be un-fixable for all i know (that is, in case tahoe
 has no concept of "different clouds"), for the second one i suggest the
 following:

 * have a --verify option that takes one of four values:
  * none -- trust that the caps remembered in the backupdb are still present
  * shallow -- check for the existence of every cap remembered in the
 backupdb
  * deep -- do a deep check on all caps used from the backupdb
  * checksum -- calculate the data checksums of all files for which a cap
 is re-used, and compare them to the reference cap (this requires equal
 convergence secrets)
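
 a minimal sketch of how the four proposed values could be dispatched. this
 is not existing tahoe code: `VerifyLevel`, `find_verify_misses` and the
 three checker callables are hypothetical names, and the actual grid checks
 are passed in as plain functions so that no tahoe API is assumed.

```python
import enum


class VerifyLevel(enum.Enum):
    """The four values proposed for --verify (names are illustrative)."""
    NONE = "none"
    SHALLOW = "shallow"
    DEEP = "deep"
    CHECKSUM = "checksum"


def find_verify_misses(level, entries, exists, deep_check, checksum_matches):
    """Return the (path, cap) pairs from the backupdb that fail verification.

    entries          -- iterable of (local_path, cap) pairs remembered in the backupdb
    exists           -- hypothetical callable: cap -> bool (shallow existence check)
    deep_check       -- hypothetical callable: cap -> bool (deep check)
    checksum_matches -- hypothetical callable: (local_path, cap) -> bool
    """
    if level is VerifyLevel.NONE:
        # Trust the backupdb blindly; this matches the behaviour the ticket describes.
        return []

    misses = []
    for path, cap in entries:
        if level is VerifyLevel.SHALLOW:
            ok = exists(cap)
        elif level is VerifyLevel.DEEP:
            ok = deep_check(cap)
        else:  # VerifyLevel.CHECKSUM
            ok = checksum_matches(path, cap)
        if not ok:
            misses.append((path, cap))
    return misses
```

 a file whose cap shows up in the returned list would have to be uploaded
 again instead of being re-used from the backupdb.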

 the current implementation (i'm using 1.7.1, but the changelog doesn't
 mention anything relevant) does the equivalent of --verify=none, which is
 especially a problem together with the first problem mentioned above.

 i'd suggest making at least --verify=shallow the default for backups; it
 keeps the O(1) network traffic advantage of the backupdb.

 another switch should be created to configure whether verify misses are to
 be treated as critical or should just be reported to stderr (--verify-fatal
 or similar).
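
 the fatal/non-fatal distinction could look roughly like this. again only a
 sketch under assumptions: `report_verify_misses` is a hypothetical helper,
 and the exit code 2 is an arbitrary choice, not something tahoe defines.

```python
import sys


def report_verify_misses(misses, verify_fatal, stream=None):
    """Report verify misses and return a process exit code.

    misses       -- list of (local_path, cap) pairs that failed verification
    verify_fatal -- True if a miss should make the backup fail (the proposed
                    --verify-fatal switch); False to only warn
    stream       -- where to write the report; defaults to stderr
    """
    if stream is None:
        stream = sys.stderr

    for path, _cap in misses:
        # Only the path is printed: a cap grants access, so it stays out of logs.
        print("verify miss: %s" % (path,), file=stream)

    if misses and verify_fatal:
        return 2  # assumed non-zero exit code for "backup not trustworthy"
    return 0
```

 without the switch the misses are still written to stderr, but the exit
 code stays 0 so that scripted backups keep running.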

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1331#comment:4>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage