[tahoe-lafs-trac-stream] [tahoe-lafs] #662: change "tahoe manifest" to not skip duplicates
tahoe-lafs
trac at tahoe-lafs.org
Wed Sep 4 20:23:41 UTC 2013
#662: change "tahoe manifest" to not skip duplicates
-------------------------------+----------------------------------
     Reporter:  warner         |     Owner:
         Type:  enhancement    |    Status:  new
     Priority:  major          | Milestone:  undecided
    Component:  code-dirnodes  |   Version:  1.3.0
   Resolution:                 |  Keywords:  tahoe-manifest cycle
Launchpad Bug:                 |
-------------------------------+----------------------------------
Description changed by daira:
Old description:
> My current job involves tools which modify a directory tree ("tahoe debug
> consolidate"), and I'd like to use "tahoe manifest" to compare the
> before- and after- trees to make sure they're the same. Unfortunately,
> "tahoe manifest"'s cycle-avoidance code (which simply ignores files or
> directories that it's seen before) is causing me trouble, since an object
> that's referenced by multiple places in the tree will appear in the
> manifest output at only one of them, and that location will depend upon
> the traversal order. (I just pushed a patch to make deep_traverse at
> least sort the child names before walking them, so it should now be
> consistent).
>
> I'm thinking that it might be nice to have a flag to "tahoe manifest"
> that tells it to not suppress duplicates like this. The cycle-avoidance
> code would need to change: instead of keeping a set of nodes that have
> already been visited, it should just keep a list of the ancestors of the
> current node. A cycle should be declared if the child node we're
> considering entering appears on its own ancestor list.
>
> It might also be useful to have two sets of stats: one that includes
> shared objects, and one that does not.
New description:
My current job involves tools which modify a directory tree [...], and I'd
like to use "tahoe manifest" to compare the before- and after- trees to
make sure they're the same. Unfortunately, "tahoe manifest"'s
cycle-avoidance code (which simply ignores files or directories that it's seen
before) is causing me trouble, since an object that's referenced by
multiple places in the tree will appear in the manifest output at only one
of them, and that location will depend upon the traversal order. (I just
pushed a patch to make deep_traverse at least sort the child names before
walking them, so it should now be consistent).

I'm thinking that it might be nice to have a flag to "tahoe manifest" that
tells it to not suppress duplicates like this. The cycle-avoidance code
would need to change: instead of keeping a set of nodes that have already
been visited, it should just keep a list of the ancestors of the current
node. A cycle should be declared if the child node we're considering
entering appears on its own ancestor list.

It might also be useful to have two sets of stats: one that includes
shared objects, and one that does not.
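
A minimal sketch of the ancestor-list idea, not the actual deep_traverse
code. It assumes a toy model in which a plain dict maps each directory's
node id to its {child_name: child_node_id} table, and any node id missing
from that dict is a file. The names manifest, dirs and example are
hypothetical. Skipping a cyclic child entirely (rather than listing it
without entering it) is also just one possible choice.

```python
def manifest(dirs, root):
    """Return [(path, node_id), ...] for every path, duplicates included.

    dirs: hypothetical stand-in for the grid, mapping a directory's node
    id to {child_name: child_node_id}. Ids absent from dirs are files.
    """
    entries = []

    def walk(node, path, ancestors):
        entries.append(("/".join(path), node))
        children = dirs.get(node)
        if children is None:
            return  # a file: nothing to descend into
        # The current directory is an ancestor of all of its children.
        ancestors.append(node)
        for name in sorted(children):  # sorted, so the order is stable
            child = children[name]
            if child in ancestors:
                # Cycle: the child appears on its own ancestor list.
                continue
            walk(child, path + [name], ancestors)
        ancestors.pop()

    walk(root, [], [])
    return entries


# "file1" is shared by two directories, and "dirB/loop" points at the root.
example = {
    "root": {"a": "dirA", "b": "dirB"},
    "dirA": {"shared": "file1"},
    "dirB": {"shared": "file1", "loop": "root"},
}
entries = manifest(example, "root")

# Two sets of stats: one counting shared objects once per reference,
# one counting each distinct object once.
total_with_shared = len(entries)
total_unique = len({node for _path, node in entries})
```

Here the shared file shows up under both "a/shared" and "b/shared", while
the link back to the root is not followed. Note that a shared subtree is
walked once per reference, so a tree with heavy sharing costs more to
traverse than it does with a visited set.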
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/662#comment:4>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage