[tahoe-lafs-trac-stream] [tahoe-lafs] #1543: rearrange share format to make downloads faster
tahoe-lafs
trac at tahoe-lafs.org
Sun Sep 25 17:02:01 PDT 2011
#1543: rearrange share format to make downloads faster
---------------------------+---------------------------
Reporter: warner | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: undecided
Component: code-encoding | Version: 1.9.0a1
Keywords: performance | Launchpad Bug:
---------------------------+---------------------------
(I thought I had a ticket for this already, but now I can't find it)
In a new share format, we could arrange the layout of the various share
pieces to benefit download speed. The biggest slowdowns are:
* extra round trips required by unpredictable/overflexible layouts: the
downloader must first fetch an offset table, and then use a second query
to fetch the actual data. Speculative fetches can mitigate this, but a
deterministic layout is preferable.
* scatter/gather reads (e.g. hash-tree nodes spread out over a large
area) force discontiguous read operations, which means larger read
vectors, more disk IO operations, and in the worst case more disk seeks
(a deterministic layout that avoids both problems is sketched below).
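To illustrate, a strawman in Python (HEADER_SIZE and the field layout
are made up, not the current share format): with a deterministic
layout, every offset is a pure function of the layout parameters, so
the downloader computes its read spans locally and skips the
offset-table round trip entirely:
{{{
HEADER_SIZE = 0x44   # hypothetical fixed-size share header

def block_span(block_num, block_size):
    """(offset, length) of one data block, computable client-side."""
    return (HEADER_SIZE + block_num * block_size, block_size)

# one contiguous read per block, with no offset-table fetch first:
readv = [block_span(i, 128*1024) for i in range(4)]
}}}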
Our current designs treat whole-file downloads and random access as
equal peers: we design first to minimize alacrity in the
single-segment case.
We could instead recognize that most downloads are whole-file, and accept
a slight alacrity hit in exchange for faster throughput and higher
transport efficiency:
* we could fetch multiple segments at once, accepting higher alacrity
and a larger memory footprint in exchange for faster throughput (see
the pipelining sketch below)
* we could fetch hash nodes for multiple segments ahead of time.
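A strawman of the multi-segment pipeline (illustrative only:
concurrent.futures-style futures stand in for the Twisted Deferreds the
real downloader uses, and fetch_segment/decode_segment are
hypothetical):
{{{
from collections import deque

PIPELINE_DEPTH = 4   # segments in flight: more memory, more throughput

def download_all(fetch_segment, decode_segment, num_segments):
    """Deliver segments in order, keeping PIPELINE_DEPTH fetches in
    flight so the pipe stays full between segment deliveries."""
    inflight = deque(fetch_segment(i)
                     for i in range(min(PIPELINE_DEPTH, num_segments)))
    for seg in range(num_segments):
        data = inflight.popleft().result()     # oldest fetch completes
        if seg + PIPELINE_DEPTH < num_segments:
            inflight.append(fetch_segment(seg + PIPELINE_DEPTH))
        yield decode_segment(data)             # erasure-decode in order
}}}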
And we could rearrange the share to help:
* Zooko's idea was to store extra hash nodes: put the full merkle
chain after each data block. My thought was to order each appended
chain with the bottom-most nodes first. This would increase storage
requirements, because we'd be storing multiple redundant copies of the
hash tree nodes, but reads would be much faster: each segment would
need a single fetch of the data block plus 0 or more trailing hash
nodes (depending upon which nodes were already known). That means one
contiguous IO operation per block, which is the fastest and most
efficient pattern for the storage servers (sketched after this list).
* integrity storage overhead with this approach becomes O(n log n)
instead of O(n). Rough estimate: for a 10MB file with segsize=128KiB, I
think that needs 18kB of overhead per share instead of 5kB, about
N*0.1% total, maybe double that because there are two file-sized hash
trees (crypttext_hash_tree, block_hash_tree). (A worked calculation
follows this list.)
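To make the bottom-first ordering concrete: adjacent blocks share the
upper nodes of their chains, so a sequential reader usually already
knows everything except the bottom-most node or two, and bottom-first
ordering puts exactly the still-unknown nodes in a prefix right after
the block. A strawman (assumes 32-byte hashes and one sibling node per
tree level; names are illustrative, not a spec):
{{{
import math

HASH_SIZE = 32   # SHA-256d output

def chain_len(num_blocks):
    # one sibling node per level of the block hash tree
    return max(1, math.ceil(math.log2(num_blocks)))

def record_size(block_size, num_blocks):
    # each record = data block + full merkle chain, bottom-most first
    return block_size + chain_len(num_blocks) * HASH_SIZE

def read_span(block_num, block_size, num_blocks, nodes_needed):
    """One contiguous (offset, length): the block plus however many
    trailing hash nodes are still unknown (0 once the chain is known)."""
    offset = block_num * record_size(block_size, num_blocks)
    return (offset, block_size + nodes_needed * HASH_SIZE)
}}}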
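And the arithmetic behind the estimate, reading "10MB" as 10MiB and
assuming 32-byte hashes and an unpadded binary tree with 2n-1 nodes
(our real trees pad the leaf count to a power of two, so the true
numbers run slightly higher):
{{{
import math

HASH_SIZE = 32                    # SHA-256d output
filesize = 10 * 1024 * 1024       # "10MB" read as 10 MiB
segsize = 128 * 1024
n = filesize // segsize           # 80 blocks per share

# store each tree node exactly once (the O(n) layout):
shared = (2*n - 1) * HASH_SIZE                      # 5088 B  ~ 5kB
# full bottom-first chain after every block (O(n log n)):
chains = n * math.ceil(math.log2(n)) * HASH_SIZE    # 17920 B ~ 18kB
}}}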
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1543>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage