[tahoe-dev] [tahoe-lafs] #1170: new-downloader performs badly when downloading a lot of data from a file
tahoe-lafs
trac at tahoe-lafs.org
Tue Aug 24 18:09:39 UTC 2010
#1170: new-downloader performs badly when downloading a lot of data from a file
------------------------------+---------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: critical | Milestone: 1.8.0
Component: code-network | Version: 1.8β
Resolution: | Keywords: immutable download performance regression
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
I made a lot of progress with my javascript-based download-status
visualization tools last night, after switching to the
[http://vis.stanford.edu/protovis/ Protovis] library (which rocks!). Here are
two diagrams of a 12MB download performed on my laptop (using a local
testgrid entirely contained on one computer: lots of bandwidth, but only one
CPU to share among them all, and only one disk). The downloader code is from
current trunk, which means 1.8.0c2 (it was '''not''' using any of the patches
from this ticket, so it exhibits all the misbehaviors of 1.8.0c2).
I'm still working on the graphics. Time proceeds from left to right. The live
display is pan+zoomable. Currently DYHB and block-reads are filled with a
color that indicates which server they used, and block-reads get an outline
color that indicates which share number was being accessed. Overlapping
block-reads are shown stacked up. Most block reads are tiny (32 or 64 bytes)
but of course each segment requires 'k' large reads (each of about 41kB,
segsize/k).
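The segsize/k arithmetic can be sketched as follows. The segment size and k value here are assumptions (the common Tahoe-LAFS defaults of 128 KiB and k=3), not figures stated in this comment; the comment only says each block is about 41kB.

```python
# Rough block-size arithmetic for an immutable download.
# Assumed parameters (not from this ticket): 128 KiB max segment
# size and k=3 required shares.

SEGSIZE = 128 * 1024   # assumed default max segment size, in bytes
K = 3                  # assumed default "needed shares" parameter

block_size = SEGSIZE // K            # bytes read per share per segment
file_size = 12 * 1024 * 1024         # the 12MB file from this comment
num_segments = -(-file_size // SEGSIZE)  # ceiling division

print(block_size)     # on the order of the ~41kB blocks described above
print(num_segments)   # segments to fetch, each needing k large reads
```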
 * attachment:180c2-viz-dyhb.png : this shows the startup phase. Note how all
   block reads are coming from a single server (w5gi, in purple), even though
   we heard from other servers by the time the second segment started. Also
   note that, for some reason, the requests made for the first segment were
   all serialized by shnum: we waited for all requests from the first share
   to return before sending any requests for the second share.
 * attachment:180c2-viz-delays.png : this shows the midpoint of the download
   (specifically the segments that cross the middle of the Merkle tree,
   requiring the most hash nodes to retrieve). By this point, I'd added a
   thicker outline around block reads that fetched more than 1kB of data, so
   the actual data blocks can be distinguished from the hash tree nodes. The
   reads are properly pipelined. But note the large gap (about 7.5ms) between
   the receipt of the last block and the delivery of the segment. Also note
   how the segments that require fewer hash nodes are delivered much faster.
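The "middle of the Merkle tree" effect can be illustrated with a toy count of how many fresh hash nodes a sequential reader must fetch per leaf, assuming nodes fetched for earlier leaves are cached. This is a sketch of the general binary-Merkle-tree pattern, not Tahoe's actual hash-tree code.

```python
# Toy sketch: for each leaf of a complete binary Merkle tree, count
# how many uncle-chain nodes are newly needed, given that nodes
# fetched for earlier leaves are remembered. The leaf that crosses
# the middle of the tree needs the most fresh nodes.

def new_nodes_per_leaf(num_leaves):
    known = set()
    counts = []
    for leaf in range(num_leaves):
        # node numbering: root=1, children of node n are 2n and 2n+1,
        # leaves occupy positions num_leaves .. 2*num_leaves-1
        node = num_leaves + leaf
        needed = set()
        while node > 1:
            needed.add(node ^ 1)   # the sibling at each level
            node //= 2
        counts.append(len(needed - known))
        known |= needed
    return counts

print(new_nodes_per_leaf(8))   # → [3, 1, 2, 1, 3, 1, 2, 1]
```

The spike at the mid-tree crossing (leaf 4 in this toy run) mirrors why those segments need the most hash-node fetches.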
I haven't yet ported these tools to the combo-patch-fixed downloader, nor
have I applied them to a download from the testgrid (which would behave very
differently: longer latencies, but less contention for disk or CPU). I'm
partially inclined to disregard the idiosyncrasies displayed by these charts
until I do that, but they still represent interesting problems to understand
further.
The large delay on the lots-of-hash-nodes segments raises suspicions of bad
performance in {{{IncompleteHashTree}}} when you add nodes, or about the
behavior of {{{DataSpans}}} when you add/remove data in it. The
{{{DataSpans.add}}} time occurs immediately after the response comes back, so
it is clearly minimal (it lives in the space between one response and the
next, along the steep downwards slope), but the {{{DataSpans.pop}}} occurs
during the mysterious gap. The Foolscap receive-processing time occurs inside
the request block rectangle. The Foolscap transmit-serialization time occurs
during the previous mysterious gap, so it must be fairly small (after the
previous segment was delivered, we sent a bazillion hash requests, and the
gap was small, whereas after the big segment was delivered, we didn't send
any hash requests, and the gap was big).
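To make the add/pop distinction concrete, here is a toy spans structure, not Tahoe's {{{DataSpans}}} implementation: received bytes are added keyed by offset, and a contiguous range is popped out once a block is needed. A naive version like this makes {{{pop}}} linear in the number of stored spans, which is the kind of hidden cost the mysterious gap could conceal.

```python
# Toy illustration (NOT Tahoe's DataSpans): a sorted list of
# (offset, bytes) spans with add() for received data and pop() to
# extract a contiguous byte range, splitting the span it came from.

class ToySpans:
    def __init__(self):
        self.spans = []   # sorted, non-overlapping (offset, bytes) pairs

    def add(self, offset, data):
        self.spans.append((offset, data))
        self.spans.sort()
        # a real implementation would merge adjacent/overlapping spans

    def pop(self, offset, length):
        # linear scan: this is the cost a profiler would surface
        for i, (start, data) in enumerate(self.spans):
            if start <= offset and offset + length <= start + len(data):
                lo = offset - start
                chunk = data[lo:lo + length]
                remainder = []
                if lo:
                    remainder.append((start, data[:lo]))
                if lo + length < len(data):
                    remainder.append((offset + length, data[lo + length:]))
                self.spans[i:i + 1] = remainder
                return chunk
        raise KeyError("requested range not present")

s = ToySpans()
s.add(0, b"abcdef")
print(s.pop(2, 3))   # → b'cde', leaving spans at offsets 0 and 5
```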
The next set of information that will be useful to add here will be a
generalized event list: in particular I want to see the start/finish times of
all hashtree-manipulation calls, zfec-decode calls, and AES decrypt calls.
That should take about 15 minutes to add, and should illuminate some of that
gap.
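One minimal way to collect such an event list is a timing decorator that records (name, start, finish) tuples for later plotting. The names and structure here are illustrative assumptions, not Tahoe's actual instrumentation API.

```python
# Sketch of a generalized event list: wrap the interesting calls
# (hashtree updates, zfec decode, AES decrypt) so each invocation
# appends a (name, start, finish) tuple for the timeline view.

import time
from functools import wraps

EVENTS = []   # (name, start, finish) tuples, in completion order

def timed(name):
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return fn(*args, **kwargs)
            finally:
                EVENTS.append((name, start, time.monotonic()))
        return wrapper
    return deco

@timed("zfec-decode")
def decode(blocks):
    return b"".join(blocks)   # hypothetical stand-in for the decode step

decode([b"a", b"b"])
print(EVENTS[0][0])   # → zfec-decode
```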
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1170#comment:90>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage
More information about the tahoe-dev mailing list