[tahoe-dev] [tahoe-lafs] #1170: new-downloader performs badly when downloading a lot of data from a file
tahoe-lafs
trac at tahoe-lafs.org
Wed Aug 25 07:17:40 UTC 2010
#1170: new-downloader performs badly when downloading a lot of data from a file
------------------------------+---------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: critical | Milestone: 1.8.0
Component: code-network | Version: 1.8β
Resolution: | Keywords: immutable download performance regression
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
I did some more testing with those visualization tools (adding some misc
events like entry/exit of internal functions). I've found one place where
the downloader makes excessive eventual-send calls, which appears to cost
250us per {{{remote_read}}} call. I've also measured hash-tree operations
as consuming a surprising amount of overhead.
* each {{{Share._got_response}}} call queues an eventual-send to
  {{{Share.loop}}}, which checks the satisfy/desire processes. Since a
  single TCP buffer is parsed into lots of Foolscap response messages,
  these are all queued during the same turn, which means the first
  {{{loop()}}} call will see all of the data, and the remaining ones will
  see nothing. Each of these empty {{{loop()}}} calls takes about 250us.
  There is one for each {{{remote_read}}} call, which means
  k*(3/2)*numsegs for the block hash trees and an additional
  k*(3/2)*numsegs for the ciphertext hash tree (because we ask each share
  for the CTHT nodes, rather than asking only one and hoping they return
  it, so we can avoid an extra roundtrip). For k=3 that's 2.25ms per
  segment. The cost is variable: on some segments (in particular the
  first and middle ones) the overhead is maximal, whereas on every odd
  segnum there is no overhead. On a 12MB download this is about 225ms,
  and on my local one-CPU testnet the download took 2.9s, so this
  represents about 8%.
* It takes my laptop 1.34ms to process a set of blocks into a segment
  (seg2 of a 96-segment file). 1.19ms of that was checking the ciphertext
  hash tree (probably two extra hash nodes), and a mere 73us was spent in
  FEC. AES decryption of the segment took 1.1ms, and accounted for 65% of
  the 1.7ms inter-segment gap (the delay between delivering seg2 and
  requesting seg3).
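The per-segment overhead figures above can be checked with a quick
back-of-the-envelope calculation. This snippet just re-derives the numbers
from the comment (250us per empty call, k=3, ~96 segments for 12MB); none
of it is Tahoe-LAFS code.

```python
# Back-of-the-envelope check of the eventual-send overhead estimate.
CALL_OVERHEAD_S = 250e-6   # measured cost of one empty Share.loop() call

def empty_loop_calls_per_segment(k):
    # k*(3/2) remote_read calls for the block hash trees, plus the same
    # again for the ciphertext hash tree
    return 2 * k * 3 // 2

k = 3
per_segment_s = empty_loop_calls_per_segment(k) * CALL_OVERHEAD_S
print(per_segment_s * 1e3)          # 2.25 ms per segment for k=3

numsegs = 96                        # 12MB at the default 128KiB segment size
total_s = per_segment_s * numsegs
print(total_s * 1e3)                # ~216 ms of pure eventual-send overhead
print(100 * total_s / 2.9)          # ~7.5% of the 2.9s download
```

Nine empty {{{loop()}}} calls per segment at 250us each gives the 2.25ms
figure, and scaling by ~96 segments gives the roughly-225ms / roughly-8%
numbers quoted above.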
I'd like to change the {{{_got_response}}} code to set a flag and queue a
single call to {{{loop}}} instead of queueing multiple calls. That would
save a little time (and probably remove the severe jitter that I've seen
on local downloads), but I don't think it can explain the 50% slowdown
that Zooko has observed.
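The flag-and-single-queue idea can be sketched roughly like this. This is
a hypothetical stand-in, not the real {{{Share}}} class: {{{eventually}}}
here is a plain callback queue standing in for Foolscap's eventual-send,
and the satisfy/desire work is elided.

```python
class Share:
    """Sketch: coalesce many _got_response events into one loop() call."""
    def __init__(self, eventually):
        self._eventually = eventually   # schedules f() for a later turn
        self._loop_scheduled = False
        self.loop_calls = 0             # instrumentation for this sketch

    def _got_response(self, data):
        # ... record the received data ...
        if not self._loop_scheduled:    # queue at most one loop() per turn
            self._loop_scheduled = True
            self._eventually(self.loop)

    def loop(self):
        self._loop_scheduled = False    # let the next batch reschedule us
        self.loop_calls += 1
        # ... satisfy/desire processing sees everything delivered this turn ...

# Minimal single-turn event queue standing in for the reactor:
pending = []
share = Share(pending.append)
for msg in range(9):        # nine responses parsed from one TCP buffer
    share._got_response(msg)
for f in pending:           # end of turn: run the queued eventual-sends
    f()
print(share.loop_calls)     # prints 1 instead of 9
```

With the flag in place, the nine responses parsed out of a single TCP
buffer trigger one {{{loop()}}} call that sees all the data, instead of
one full call plus eight empty 250us ones.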
These visualization tools are a lot of fun. One direction to explore is
to record some packet timings (with tcpdump) and add them as an extra
row: that would show us how much latency/load Foolscap incurs before it
delivers a message response to the application.
I'll attach two samples of the viz output as attachment:viz-3.png and
attachment:viz-4.png . The two captures are of different parts of the
download, but in both cases the horizontal ticks are 500us apart. The
candlestick-diagram-like shapes are the satisfy/desire sections of
{{{Share.loop}}}, and the lines (actually very narrow boxes) between them
are the "disappointment" calculation at the end of {{{Share.loop}}}, so
the gap before it must be the {{{send_requests}}} routine.
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1170#comment:92>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage