[tahoe-dev] Mutable directory update performance
Kyle Markley
kyle at arbyte.us
Tue Dec 4 05:17:40 UTC 2012
Hi,
I've been tinkering with some code that would like to grow up to be a
parallelized flavor of "tahoe backup", so that I can back up, and
eventually deep-check, trees with tens of thousands of files without
waiting many hours for the standard tools to walk serially through every
file. (And without the first serious error from a network glitch
aborting the entire operation halfway through those many hours!)
As soon as I got this functioning, I noticed that it was spending almost
all its time on directory updates.
My code just invokes the tahoe CLI to do all its work, and I don't
see that immutable directories are available to the CLI, so I'm linking
files into mutable directories. That turns into a serial operation for
all files that belong in the same directory.
Is there a plan to make immutable directories available to the CLI, or
does anyone have advice on making "tahoe ln" faster? The only other
idea I have for the short run is to randomize the file order so I'm
usually touching many separate directories at once. That's simple but
not elegant at all...
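
A more structured alternative to randomizing the order might be to group
the pending links by target directory, run each directory's links one
after another, and let different directories proceed in parallel. The
following is only a minimal sketch under assumptions: link_all,
group_by_directory, and link_fn are hypothetical names, and link_fn
stands in for whatever actually performs one link (for example, a thin
wrapper that shells out to the tahoe CLI). Failures are collected rather
than raised, so one bad link does not abort the whole run.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_directory(links):
    """Group (directory, name, filecap) tuples by target directory."""
    groups = defaultdict(list)
    for directory, name, filecap in links:
        groups[directory].append((name, filecap))
    return dict(groups)


def link_all(links, link_fn, max_workers=8):
    """Call link_fn(directory, name, filecap) for every entry in links.

    Links into the same directory run sequentially inside one worker;
    different directories run concurrently. Returns a list of
    (directory, name, exception) tuples for the links that failed.
    link_fn is a hypothetical callable supplied by the caller.
    """
    def run_directory(directory, entries):
        failures = []
        for name, filecap in entries:
            try:
                link_fn(directory, name, filecap)
            except Exception as exc:  # record the failure and keep going
                failures.append((directory, name, exc))
        return failures

    groups = group_by_directory(links)
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_directory, d, e) for d, e in groups.items()]
        for future in futures:
            failures.extend(future.result())
    return failures
```

This keeps at most one update in flight per directory, but a single
directory holding most of the files would still be updated serially.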
--
Kyle Markley