[tahoe-lafs-trac-stream] [tahoe-lafs] #662: change "tahoe manifest" to not skip duplicates
tahoe-lafs
trac at tahoe-lafs.org
Mon Sep 2 17:33:33 UTC 2013
#662: change "tahoe manifest" to not skip duplicates
-------------------------------+----------------------------------
     Reporter:  warner         |          Owner:
         Type:  enhancement    |         Status:  new
     Priority:  major          |      Milestone:  undecided
    Component:  code-dirnodes  |        Version:  1.3.0
   Resolution:                 |       Keywords:  tahoe-manifest cycle
Launchpad Bug:                 |
-------------------------------+----------------------------------
Changes (by kmarkley86):
* cc: kyle@… (added)
Old description:
> My current job involves tools which modify a directory tree ("tahoe debug
> consolidate"), and I'd like to use "tahoe manifest" to compare the
> before- and after- trees to make sure they're the same. Unfortunately,
> "tahoe manifest"'s cycle-avoidance code (which simply ignores files or
> directories that it's seen before) is causing me trouble, since an object
> that's referenced by multiple places in the tree will appear in the
> manifest output at only one of them, and that location will depend upon
> the traversal order. (I just pushed a patch to make deep_traverse at
> least sort the child names before walking them, so it should now be
> consistent).
>
> I'm thinking that it might be nice to have a flag to "tahoe manifest"
> that tells it to not suppress duplicates like this. The cycle-avoidance
> code would need to change: instead of keeping a set of nodes that have
> already been visited, it should just keep a list of the ancestors of the
> current node. A cycle should be declared if the child node we're
> considering entering appears on its own ancestor list.
>
> It might also be useful to have two sets of stats: one that includes
> shared objects, and one that does not.
New description:
My current job involves tools which modify a directory tree ("tahoe debug
consolidate"), and I'd like to use "tahoe manifest" to compare the before-
and after- trees to make sure they're the same. Unfortunately, "tahoe
manifest"'s cycle-avoidance code (which simply ignores files or
directories that it's seen before) is causing me trouble, since an object
that's referenced by multiple places in the tree will appear in the
manifest output at only one of them, and that location will depend upon
the traversal order. (I just pushed a patch to make deep_traverse at least
sort the child names before walking them, so it should now be consistent).

I'm thinking that it might be nice to have a flag to "tahoe manifest" that
tells it to not suppress duplicates like this. The cycle-avoidance code
would need to change: instead of keeping a set of nodes that have already
been visited, it should just keep a list of the ancestors of the current
node. A cycle should be declared if the child node we're considering
entering appears on its own ancestor list.

It might also be useful to have two sets of stats: one that includes
shared objects, and one that does not.
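
A minimal sketch of the ancestor-list idea described above, in Python. It is
not the deep_traverse code: the tree is a hypothetical in-memory structure
(a dict is a directory, a File instance is a file), and object identity via
id() stands in for whatever a real dirnode would use to recognize "the same
node". The names File, manifest and stats are made up for illustration.

```python
class File:
    """Hypothetical stand-in for a file node."""


def manifest(root):
    """Yield (path, node) for every reachable entry, duplicates included.

    A child directory is skipped only if it is one of its own ancestors,
    so shared (non-cyclic) objects are listed at every location.
    """
    def walk(node, path, ancestors):
        yield path, node
        if not isinstance(node, dict):
            return  # files have no children
        # The current node becomes an ancestor of everything below it.
        ancestors = ancestors + (id(node),)
        for name in sorted(node):  # sorted, so output order is stable
            child = node[name]
            if isinstance(child, dict) and id(child) in ancestors:
                continue  # cycle: child appears on its own ancestor list
            yield from walk(child, path + (name,), ancestors)

    yield from walk(root, (), ())


def stats(root):
    """Two counts: one including shared objects, one counting each once."""
    entries = list(manifest(root))
    return {
        "with_shared": len(entries),
        "unique": len({id(node) for _, node in entries}),
    }


# A directory referenced from two places, plus a link back to the root.
shared = {"notes.txt": File()}
root = {"a": shared, "b": shared}
shared["loop"] = root

for path, _ in manifest(root):
    print("/".join(path) or ".")
print(stats(root))  # {'with_shared': 5, 'unique': 3}
```

Here the shared directory shows up under both "a" and "b", while the "loop"
entry is never entered. A visited-set walk would have listed it only under
"a".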
--
Comment:
I tried using manifest as a sort of recursive ls, and immediately ran into
this issue that it wasn't showing duplicates. Unless there's recursive ls
behavior available somewhere else, it would be great to fix this.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/662#comment:3>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage