#383 new enhancement

large directories take a long time to modify

Reported by: warner
Owned by:
Priority: major
Milestone: eventually
Component: code-dirnodes
Version: 1.0.0
Keywords: dirnode performance newcaps
Cc:
Launchpad Bug:


We found that the prodnet webapi servers were taking about 35 seconds to modify a large (about 10k entries) dirnode. That time is measured from the end of the Retrieve to the beginning of the Publish. We're pretty sure that this is because the loop that decrypts and verifies the write-cap in each row is in Python (whereas the code that decrypts the mutable file contents as a whole, in a single pycryptopp call, runs in 8 milliseconds). The other loop, which re-encrypts everything, takes a similar amount of time, so each loop probably accounts for about 17 seconds.

We don't actually need to decrypt the whole thing. Most of the modifications we're doing are to add or replace specific children. Since the dirnode is represented as a concatenation of netstrings (one per child), we could have a loop that iterates through the string, reading the netstring length prefix, extracting the child name, seeing if it matches, and skipping ahead to the next child if not. This would result in a big string of everything before the match, the match itself, and a big string of everything after the match. We should modify the small match piece, then concatenate everything back together when we're done. Only the piece we're changing needs to be decrypted/reencrypted.
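A rough sketch of that scan, to make the idea concrete. This is not the real dirnode code: the function names are made up for illustration, and it assumes that each child's netstring payload begins with a nested netstring holding the child name, which should be checked against the actual packing format. The rest of each entry (caps, metadata) is treated as opaque bytes and never decrypted.

```python
def netstring(payload: bytes) -> bytes:
    """Encode payload as a netstring: b"<length>:<payload>,"."""
    return b"%d:%s," % (len(payload), payload)


def read_netstring(data: bytes, pos: int):
    """Parse the netstring starting at pos.

    Returns (payload_start, payload_end, next_pos) as offsets into data.
    """
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    end = start + length
    if data[end:end + 1] != b",":
        raise ValueError("malformed netstring at offset %d" % pos)
    return start, end, end + 1


def split_on_child(packed: bytes, name: bytes):
    """Split packed dirnode contents into (before, match, after).

    ASSUMPTION: the payload of each entry starts with netstring(name).
    If no child matches, returns (packed, b"", b"").
    """
    pos = 0
    while pos < len(packed):
        start, _end, next_pos = read_netstring(packed, pos)
        # Only the leading name field is inspected; the rest is skipped.
        name_start, name_end, _ = read_netstring(packed, start)
        if packed[name_start:name_end] == name:
            return packed[:pos], packed[pos:next_pos], packed[next_pos:]
        pos = next_pos
    return packed, b"", b""


def replace_child(packed: bytes, name: bytes, new_entry_payload: bytes) -> bytes:
    """Replace the entry for name, or append it if it is not present."""
    before, _old, after = split_on_child(packed, name)
    return before + netstring(new_entry_payload) + after
```

Only the `match` piece would then need its write-cap decrypted and re-encrypted; `before` and `after` are copied through untouched, so the per-row Python work drops from one iteration per child with crypto to one cheap length-prefix parse per child.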

In addition, we could probably get rid of the HMAC on those writecaps now; I think they're left over from the central-vdrive-server days. But we should put that compatibility break off until we move to DSA directories (if we choose to go with the 'deep-verify' caps).

Change History (5)

comment:1 Changed at 2008-04-24T23:50:10Z by warner

  • Component changed from code-performance to code-dirnodes

comment:2 Changed at 2009-05-04T16:53:43Z by zooko

See also #327 (performance measurement of directories), #414 (profiling on directory unpacking), and #329 (dirnodes could cache encrypted/serialized entries for speed).

comment:3 Changed at 2009-06-25T16:30:40Z by zooko

Tahoe-LAFS hasn't checked the HMAC since f1fbd4feae1fb5d7, 2008-12-21, which patch was first released in Tahoe-LAFS v1.3.0, 2009-02-13.

If we produced dirnode entries which didn't have the HMAC tag (or which had a blank space instead of correct tag bytes there -- I don't know how the parsing works), then clients older than v1.3.0 would get some sort of integrity error when trying to read that entry. Our backward-compatibility tradition is typically longer-duration than this. For example, the most recent release notes say that Tahoe-LAFS v1.4.1 is backwards-compatible with v1.0, and in fact it is actually compatible with v0.8 or so (unless you try to upload large files -- files with shares larger than about 4 GiB).

So, let's not yet break compatibility by ceasing to emit the HMAC tags.

Also, let this be a lesson to us that if we notice forward-compatibility issues and fix them early, then this frees us up to evolve the protocols earlier. We actually stopped needing the HMAC tags when we released Tahoe-LAFS v0.7 on 2008-01-07, but we didn't notice that we were still checking them and erroring if they were wrong until the v1.3.0 release. So, everybody go look at forward-compatibility issues and fix them!

comment:4 Changed at 2009-06-25T16:39:02Z by zooko

Oh, by the way the time to actually compute and write the HMAC tags is really tiny compared to the other performance issues. (The following tickets are how we can be sure of this: #327 (performance measurement of directories), #414 (profiling on directory unpacking).) If we could stop producing the HMAC tags, I would be happier about the simplification than about the speed-up...

comment:5 Changed at 2010-01-04T20:06:50Z by davidsarah

  • Keywords performance newcaps added