[tahoe-dev] [tahoe-lafs] #1170: does new-downloader perform badly for certain situations (such as today's Test Grid)?
tahoe-lafs
trac at tahoe-lafs.org
Thu Aug 12 18:15:54 UTC 2010
#1170: does new-downloader perform badly for certain situations (such as today's
Test Grid)?
------------------------------+---------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: major | Milestone: 1.8.0
Component: code-network | Version: 1.8β
Resolution: | Keywords: immutable download
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
yeah, the 32/64-byte reads are hashtree nodes. The spans structure only
coalesces adjacent/overlapping reads (the 64-byte reads are the result of
two neighboring 32-byte hashtree nodes being fetched), but all requests
are pipelined (note the "txtime" column in the "Requests" table, which
tracks remote-bucket-read requests), and the overhead of each message is
fairly small (also note the close proximity of the "rxtime" for those
batches of requests). So I'm not particularly worried about merging these
requests further.
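To make the coalescing behavior concrete, here is a minimal sketch of the merging rule described above (illustrative only; it is not the actual Spans API, which tracks ranges inside the share file):

```python
def coalesce(spans):
    """Merge adjacent or overlapping (start, length) spans into maximal runs.

    Two neighboring 32-byte hashtree-node reads at offsets 0 and 32
    coalesce into a single 64-byte read; non-adjacent spans stay separate.
    """
    merged = []
    for start, length in sorted(spans):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # new span touches or overlaps the previous one: extend it
            prev_start, prev_len = merged[-1]
            new_end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, new_end - prev_start)
        else:
            merged.append((start, length))
    return merged

print(coalesce([(0, 32), (32, 32), (100, 32)]))  # [(0, 64), (100, 32)]
```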
My longer-term goal is to extend the Spans data structure with some sort
of "close enough" merging feature: given a Spans bitmap, return a new
bitmap with all the small holes filled in, so e.g. a 32-byte gap between
two hashtree nodes (which might not be strictly needed until a later
segment is read) would be retrieved early. The max-hole-size would need to
be tuned to match the overhead of each remote-read message (probably on
the order of 30-40 bytes): there's a breakeven point somewhere in there.
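A sketch of that "close enough" merging, again using illustrative (start, length) tuples rather than the real Spans bitmap; `max_hole` stands in for the to-be-tuned threshold:

```python
def fill_holes(spans, max_hole):
    """Merge sorted, non-overlapping (start, length) spans whenever the gap
    between two neighbors is at most max_hole bytes, so the hole's bytes
    (e.g. hashtree nodes needed by a later segment) get fetched early."""
    filled = []
    for start, length in sorted(spans):
        if filled and start - (filled[-1][0] + filled[-1][1]) <= max_hole:
            prev_start, _ = filled[-1]
            filled[-1] = (prev_start, start + length - prev_start)
        else:
            filled.append((start, length))
    return filled

# A 32-byte gap between two hashtree nodes is filled when max_hole >= 32:
print(fill_holes([(0, 32), (64, 32)], max_hole=40))  # [(0, 96)]
print(fill_holes([(0, 32), (64, 32)], max_hole=16))  # [(0, 32), (64, 32)]
```

With a per-message overhead around 30-40 bytes, a `max_hole` in that range is the breakeven point: filling a smaller hole costs fewer wasted bytes than a second request would.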
Another longer-term goal is to add a {{{readv()}}}-type API to the remote
share-read protocol, so we could fetch multiple ranges in a single call.
This doesn't shave much overhead off of just doing multiple pipelined
{{{read()}}} requests, so again it's low-priority.
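For illustration, a readv()-type call can be emulated client-side today by pipelining single-range reads; the `FakeShare` class below is a hypothetical stand-in for a remote bucket reader, not the real protocol object:

```python
class FakeShare:
    """Hypothetical stand-in for a remote share reader offering only
    read(offset, length), as the current protocol does."""
    def __init__(self, data):
        self.data = data
        self.calls = 0  # count round trips (pipelined, but still per-range)
    def read(self, offset, length):
        self.calls += 1
        return self.data[offset:offset + length]

def readv(share, vector):
    """Fetch multiple (offset, length) ranges. Emulated here with one
    read() per range; a protocol-level readv() would bundle all ranges
    into a single request/response pair."""
    return [share.read(off, length) for off, length in vector]

share = FakeShare(b"abcdefghij")
print(readv(share, [(0, 2), (4, 3)]))  # [b'ab', b'efg']
```

Since the emulated reads are pipelined anyway, a true readv() mostly saves per-message framing overhead, which is why it stays low-priority.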
And yes, a cleverer which-share-should-I-use-now algorithm might reduce
stalls like that. I'm working on visualization tools to show the raw
download-status events in a Gantt-chart-like form, which should make it
easier to develop such an algorithm. For now, you want to look at the
Request table for correlations between reads that occur at the same time.
For example, at the +1.65s point, I see several requests that take
1.81s/2.16s/2.37s. One clear improvement would be to fetch shares 0 and 5
from different servers: whatever slowed down the reads of sh0 also slowed
down sh5. But note that sh8 (from the other server) took even longer: this
suggests that the congestion was on your end of the line, not theirs,
especially since the next segment arrived in less than half a second.
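One simple form of that improvement is a server-diversity rule when picking shares: take at most one share per server before reusing any server, so a single slow server (or congested path) stalls fewer shares. A hedged sketch, with illustrative names rather than the downloader's actual API:

```python
def pick_shares(available, k):
    """available: list of (share_num, server_id) pairs, in preference order.
    Pick k shares, preferring one per distinct server; fall back to
    already-used servers only if distinct ones run out."""
    picked, used = [], set()
    # first pass: at most one share from each server
    for shnum, server in available:
        if len(picked) >= k:
            break
        if server not in used:
            picked.append((shnum, server))
            used.add(server)
    # second pass: fill any remaining slots from already-used servers
    for entry in available:
        if len(picked) >= k:
            break
        if entry not in picked:
            picked.append(entry)
    return picked

# sh0 and sh5 on server A, sh8 on server B: k=2 spreads across both servers,
# instead of taking sh0 and sh5 from the same (possibly slow) server A.
print(pick_shares([(0, "A"), (5, "A"), (8, "B")], k=2))  # [(0, 'A'), (8, 'B')]
```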
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1170#comment:2>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage