Opened at 2016-09-03T21:40:31Z
#2822 new defect
remove redundant read from web GET of directory
Reported by: | warner | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | undecided |
Component: | code-frontend-web | Version: | 1.11.0 |
Keywords: | dirnode cache performance tahoe-cp | Cc: | |
Launchpad Bug: |
Description
While checking out the "recent and active operations" page, I noticed that doing a simple tahoe cp into a pre-existing top-level directory caused a total of 4 mapupdate operations, 4 retrieves, and 1 publish (where I was expecting a single retrieve and a single publish).
It looks like we're doing some redundant operations. The tahoe cp command does two WAPI operations: GET /uri/ALIAS/CHILD?t=json (to see what we're replacing), then a PUT /uri/ALIAS/CHILD (to do the actual assignment). The WAPI GET causes two dirnode operations:
- a get(childname), called from allmydata.web.directory.DirectoryNodeHandler.childFactory() as it walks through the ALIAS dirnode to find CHILD
- a get_metadata_for(childname), called from allmydata.web.filenode.FileNodeHandler.render_GET (in the t=json clause when self.parentnode and self.name are present). We have to retrieve the metadata from the parent directory, because a tahoe dirnode stores each child's metadata in the parent's entry for that child, not in the child itself.
I think we should remove the get_metadata_for call by changing DirectoryNodeHandler.childFactory to use get_child_and_metadata and passing the metadata into the newly created FileNodeHandler.
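A rough sketch of the difference, using a toy model rather than real Tahoe-LAFS code. ToyDirnode, get_json_current and get_json_proposed are hypothetical names made up for illustration; only the method names get, get_metadata_for and get_child_and_metadata come from this ticket. The real dirnode methods return Deferreds, and the toy is synchronous to keep the read counting easy to follow.

```python
# Toy model, not Tahoe-LAFS code: ToyDirnode stands in for a mutable dirnode
# and counts how many times its contents have to be fetched.

class ToyDirnode:
    def __init__(self, children):
        # children maps name -> (child, metadata)
        self._children = dict(children)
        self.reads = 0

    def _read(self):
        # Each call models one full read (mapupdate + retrieve) of the directory.
        self.reads += 1
        return self._children

    def get(self, name):
        return self._read()[name][0]

    def get_metadata_for(self, name):
        return self._read()[name][1]

    def get_child_and_metadata(self, name):
        return self._read()[name]


def get_json_current(parent, name):
    # Models the GET path described above: two separate reads of the parent.
    child = parent.get(name)                  # lookup while walking the path
    metadata = parent.get_metadata_for(name)  # second read, for t=json
    return child, metadata


def get_json_proposed(parent, name):
    # Models the proposal: one read yields both values, and the metadata
    # would be handed to the handler instead of being fetched again.
    return parent.get_child_and_metadata(name)


# Placeholder entry; the values are arbitrary.
entry = {"CHILD": ("child-node", {"mtime": 0})}
before = ToyDirnode(entry)
after = ToyDirnode(entry)
assert get_json_current(before, "CHILD") == get_json_proposed(after, "CHILD")
print(before.reads, after.reads)  # prints "2 1"
```

Both paths return the same (child, metadata) pair; the only difference in this model is the number of directory reads.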
It might be possible to remove the first read that PUT does, but I'm not yet sure how. In general, I wonder if we should have some sort of write-through cache that allows us to remember the contents of dirnodes for a little while, until we know they've changed (because we wrote to the dirnode ourselves).
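To make the cache idea concrete, here is a minimal sketch under loud assumptions: WriteThroughDirCache, fetch, store and the ttl value are all hypothetical and do not exist in Tahoe-LAFS. It remembers a directory's contents for a short time and refreshes the remembered copy whenever we write through it. It does nothing about changes made by other clients, so a read inside the ttl window can still be stale.

```python
import time


class WriteThroughDirCache:
    """Hypothetical helper: remembers directory contents for ttl seconds."""

    def __init__(self, fetch, store, ttl=5.0, clock=time.monotonic):
        self._fetch = fetch   # callable(dircap) -> dict of children (a real read)
        self._store = store   # callable(dircap, children) (a real publish)
        self._ttl = ttl
        self._clock = clock
        self._entries = {}    # dircap -> (expiry time, children)

    def _remember(self, dircap, children):
        self._entries[dircap] = (self._clock() + self._ttl, dict(children))

    def read(self, dircap):
        entry = self._entries.get(dircap)
        if entry is not None and entry[0] > self._clock():
            return dict(entry[1])          # still fresh: no real read
        children = self._fetch(dircap)     # missing or expired: real read
        self._remember(dircap, children)
        return dict(children)

    def write(self, dircap, children):
        self._store(dircap, children)      # write through to the backend first
        self._remember(dircap, children)   # then record what we just wrote


# Fake backend that records each real read.
backend = {"ALIAS": {"CHILD": "old"}}
fetches = []


def fetch(dircap):
    fetches.append(dircap)
    return dict(backend[dircap])


def store(dircap, children):
    backend[dircap] = dict(children)


now = [0.0]  # manual clock so the example is deterministic
cache = WriteThroughDirCache(fetch, store, ttl=5.0, clock=lambda: now[0])

children = cache.read("ALIAS")      # miss: one real read
children["CHILD"] = "new"
cache.write("ALIAS", children)      # publish, and the cache is updated
seen = cache.read("ALIAS")          # served from the cache
reads_before_expiry = len(fetches)
now[0] = 10.0
cache.read("ALIAS")                 # entry expired: a second real read
print(seen["CHILD"], reads_before_expiry, len(fetches))  # prints "new 1 2"
```

In this model the read-modify-write sequence costs one real read and one publish, which is the count the description says was expected for a simple tahoe cp.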