[tahoe-lafs-trac-stream] [tahoe-lafs] #662: change "tahoe manifest" to not skip duplicates

tahoe-lafs trac at tahoe-lafs.org
Mon Sep 2 17:33:33 UTC 2013


#662: change "tahoe manifest" to not skip duplicates
-------------------------------+----------------------------------
     Reporter:  warner         |      Owner:
         Type:  enhancement    |     Status:  new
     Priority:  major          |  Milestone:  undecided
    Component:  code-dirnodes  |    Version:  1.3.0
   Resolution:                 |   Keywords:  tahoe-manifest cycle
Launchpad Bug:                 |
-------------------------------+----------------------------------
Changes (by kmarkley86):

 * cc: kyle@… (added)


Old description:

> My current job involves tools which modify a directory tree ("tahoe debug
> consolidate"), and I'd like to use "tahoe manifest" to compare the
> before- and after- trees to make sure they're the same. Unfortunately,
> "tahoe manifest"'s cycle-avoidance code (which simply ignores files or
> directories that it's seen before) is causing me trouble, since an object
> that's referenced by multiple places in the tree will appear in the
> manifest output at only one of them, and that location will depend upon
> the traversal order. (I just pushed a patch to make deep_traverse at
> least sort the child names before walking them, so it should now be
> consistent).
>
> I'm thinking that it might be nice to have a flag to "tahoe manifest"
> that tells it to not suppress duplicates like this. The cycle-avoidance
> code would need to change: instead of keeping a set of nodes that have
> already been visited, it should just keep a list of the ancestors of the
> current node. A cycle should be declared if the child node we're
> considering entering appears on its own ancestor list.
>
> It might also be useful to have two sets of stats: one that includes
> shared objects, and one that does not.

New description:

 My current job involves tools which modify a directory tree ("tahoe debug
 consolidate"), and I'd like to use "tahoe manifest" to compare the before-
 and after- trees to make sure they're the same. Unfortunately, "tahoe
 manifest"'s cycle-avoidance code (which simply ignores files or
 directories that it's seen before) is causing me trouble, since an object
 that's referenced by multiple places in the tree will appear in the
 manifest output at only one of them, and that location will depend upon
 the traversal order. (I just pushed a patch to make deep_traverse at least
 sort the child names before walking them, so it should now be consistent).

 I'm thinking that it might be nice to have a flag to "tahoe manifest" that
 tells it to not suppress duplicates like this. The cycle-avoidance code
 would need to change: instead of keeping a set of nodes that have already
 been visited, it should just keep a list of the ancestors of the current
 node. A cycle should be declared if the child node we're considering
 entering appears on its own ancestor list.
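
 The ancestor-list idea could be sketched roughly as follows. This is only
 an illustration, not the actual deep_traverse code: the dict-based graph,
 the node ids, and the name walk_with_duplicates are all hypothetical
 stand-ins, and a child that would close a cycle is simply skipped.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical in-memory stand-in for a directory graph: each directory id
# maps to {child_name: child_id}; ids that are not keys are treated as files.
Graph = Dict[str, Dict[str, str]]


def walk_with_duplicates(graph: Graph, root: str) -> Iterator[Tuple[str, str]]:
    """Yield (path, node_id) for every reachable path, shared nodes included.

    Only a child that appears on its own ancestor list is skipped, so a
    cycle is cut but an object linked from two places is reported twice.
    """
    ancestors: List[str] = [root]

    def visit(node: str, path: str) -> Iterator[Tuple[str, str]]:
        yield path, node
        # Sort child names so the output order is deterministic.
        for name, child in sorted(graph.get(node, {}).items()):
            if child in ancestors:
                continue  # entering this child would close a cycle
            child_path = f"{path}/{name}" if path else name
            ancestors.append(child)
            yield from visit(child, child_path)
            ancestors.pop()

    yield from visit(root, "")


if __name__ == "__main__":
    example = {
        "root": {"a": "dirA", "b": "dirB"},
        "dirA": {"shared": "file1"},
        "dirB": {"shared": "file1", "loop": "root"},
    }
    for path, node_id in walk_with_duplicates(example, "root"):
        print(repr(path), node_id)
```

 In this example "file1" is listed under both a/shared and b/shared, while
 the b/loop link back to the root is not followed. A visited-set approach
 would instead list "file1" only once.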

 It might also be useful to have two sets of stats: one that includes
 shared objects, and one that does not.
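
 As a rough illustration of the two sets of stats, assuming the manifest is
 available as (path, node_id) pairs with one pair per place an object is
 linked from (the function and key names here are hypothetical, not
 existing "tahoe manifest" output fields):

```python
from typing import Dict, Iterable, Tuple


def manifest_stats(entries: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count manifest entries both with and without shared objects."""
    entries = list(entries)
    return {
        # Every path counts, so a shared object is counted once per link.
        "count-including-shared": len(entries),
        # Each distinct node id counts once, however many paths reach it.
        "count-unique": len({node_id for _path, node_id in entries}),
    }
```

 The difference between the two numbers is the number of extra links to
 objects that are already reachable by another path.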

--

Comment:

 I tried using manifest as a sort of recursive ls, and immediately ran into
 this issue that it wasn't showing duplicates.  Unless there's recursive ls
 behavior available somewhere else, it would be great to fix this.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/662#comment:3>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

