[tahoe-dev] [tahoe-lafs] #1223: got 'WrongSegmentError' during repair
tahoe-lafs
trac at tahoe-lafs.org
Fri Oct 15 08:16:23 UTC 2010
#1223: got 'WrongSegmentError' during repair
-------------------------------+--------------------------------------------
Reporter: francois | Owner: somebody
Type: defect | Status: new
Priority: major | Milestone: 1.8.1
Component: code-encoding | Version: 1.8.0
Resolution: | Keywords: regression repair performance
Launchpad Bug: |
-------------------------------+--------------------------------------------
Comment (by warner):
so I think there are two problems: one performance-harming, one crashing:
* the {{{read_encrypted(input_chunk_size)}}} call will read about
segsize/k bytes at a time. When k=3 this isn't too bad, but when k=22 it
starts to hurt (especially for a small file, or for a short tail segment
of a multiple-segment file). zfec wants data in pieces of that size
({{{input_chunk_size}}} comes from zfec), but that doesn't mean we have
to do {{{read_encrypted()}}} in chunks of that size.
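As a rough back-of-the-envelope illustration (not actual Tahoe-LAFS code), here is how the per-call read size shrinks as k grows, assuming the common 128 KiB segsize:

```python
# Sketch only: segsize=128 KiB is an assumed default; k=3 and k=22 are
# the encoding parameters discussed above.
segsize = 128 * 1024

for k in (3, 22):
    input_chunk_size = segsize // k  # roughly what zfec wants per call
    # one read_encrypted() call per chunk means k calls per segment
    print("k=%d: %d-byte chunks, %d reads per segment"
          % (k, input_chunk_size, k))
```

So at k=22 each {{{read_encrypted()}}} asks for under 6 KiB, and a segment costs 22 separate reads instead of one.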
* we should read full segments at a time, and then split them into
{{{input_chunk_size}}} pieces after hashing. If we do it right, I don't
think this will increase the memory footprint, although it will add
another brief window where there's an extra 1*segsize in use (during the
split and before we free the original segsize-sized buffer).
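A minimal sketch of that split-after-hashing step, with hypothetical names ({{{split_segment}}} is illustrative, not the real Tahoe-LAFS API):

```python
# Hypothetical sketch: after reading and hashing a whole segment, split
# its ciphertext into k equal-sized input chunks for zfec, zero-padding
# the tail chunk if the segment doesn't divide evenly.

def split_segment(segment, k):
    chunk_size = (len(segment) + k - 1) // k  # ceil(segsize / k)
    chunks = []
    for i in range(k):
        chunk = segment[i * chunk_size:(i + 1) * chunk_size]
        chunk += b"\x00" * (chunk_size - len(chunk))  # pad short tail
        chunks.append(chunk)
    return chunks
```

The extra 1*segsize window mentioned above is the moment when both {{{segment}}} and the {{{chunks}}} list are alive, before the original buffer is freed.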
* this performance problem is more significant with new-downloader than
with old-downloader. I think old-downloader would have cached the
ciphertext onto disk (via the old {{{CacheFileManager}}} or something
like it). A client doing sequential reads of small ranges would trigger
one full download and then read tiny chunks off disk (really out of the
kernel's dirty FS buffers) at RAM speeds. New-downloader, which doesn't
use a disk cache, responds to each {{{read()}}} call by fetching a
single segment; since it doesn't cache those segments anywhere, a client
which does a lot of tiny reads will trigger a lot of segment fetches,
each taking a round-trip.
* the Repairer has probably been reading past the end of the input file
all along. The old-downloader tolerated this (because it was really just
reading the cachefile from disk), but the new-downloader does not, and
in fact crashes when you ask it to {{{read()}}} with a starting offset
that is beyond the end of the file.
* we should first fix the Repairer to not do this, and then fix new-
downloader to tolerate it, or at least to raise a sensible exception.
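Both fixes can be sketched in a few lines. Everything here is hypothetical ({{{ReadPastEndOfFileError}}}, {{{clamped_read_size}}}, {{{check_read}}} are illustrative names, not the real API):

```python
# Sketch of the two proposed fixes, with made-up names.

class ReadPastEndOfFileError(Exception):
    """A clearer failure than the WrongSegmentError crash."""

def clamped_read_size(filesize, offset, length):
    # Repairer-side fix: never ask for bytes past EOF.
    if offset >= filesize:
        return 0
    return min(length, filesize - offset)

def check_read(filesize, offset):
    # Downloader-side fix: reject an out-of-range read() loudly
    # instead of failing deep inside segment selection.
    if offset >= filesize:
        raise ReadPastEndOfFileError(
            "read offset %d is beyond filesize %d" % (offset, filesize))
```

With the clamp in place the Repairer never issues the bad read; the explicit check is belt-and-suspenders for any other caller that does.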
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1223#comment:11>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage