[tahoe-dev] [tahoe-lafs] #616: bug in repairer causes sporadic hangs in unit tests
tahoe-lafs
trac at allmydata.org
Tue Feb 10 00:48:40 PST 2009
#616: bug in repairer causes sporadic hangs in unit tests
---------------------------+------------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: major | Milestone: 1.3.0
Component: code-encoding | Version: 1.2.0
Keywords: | Launchpad_bug:
---------------------------+------------------------------------------------
There is a bug in {{{DownUpConnector._satisfy_reads_if_possible()}}}:
[source:src/allmydata/immutable/repairer.py@20090112214120-e01fd-7d241072d30b14d3e243829e952e8c8440e6c461#L127]
It should put the {{{leftover}}} bytes back into {{{self.bufs}}} and the
rest into the result; instead it puts all-but-{{{leftover}}} bytes back
and the rest into the result! When the input chunks arrive in different
sizes than the read requests, this bug can give a read request more or
fewer bytes than it asked for. That could lead to data corruption
(although not irreversibly so -- the repairer would then upload the same
sequence of bytes but in different-sized blocks, which would break the
integrity checking code but not the ciphertext).
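The slicing mistake described above can be illustrated with a small sketch. This is not the code in repairer.py: {{{ChunkBuffer}}} and its {{{swap_bug}}} flag are hypothetical names invented for this illustration, and only the {{{bufs}}} attribute name is borrowed from the ticket. It assumes a read either returns exactly the requested number of bytes or nothing yet.

```python
from collections import deque


class ChunkBuffer:
    """Hypothetical stand-in for the buffering half of a down/up connector."""

    def __init__(self, swap_bug=False):
        self.bufs = deque()       # queued input chunks, oldest first
        self.bufsiz = 0           # total number of buffered bytes
        self.swap_bug = swap_bug  # True reproduces the swapped-slice mistake

    def write(self, data):
        self.bufs.append(data)
        self.bufsiz += len(data)

    def read(self, length):
        """Return exactly `length` bytes, or None if too few are buffered."""
        if self.bufsiz < length:
            return None
        pieces = []
        needed = length
        while needed > 0:
            buf = self.bufs.popleft()
            if len(buf) <= needed:
                # The whole chunk is consumed by this read.
                pieces.append(buf)
                needed -= len(buf)
            elif self.swap_bug:
                # Mistake: the leftover tail goes into the result and the
                # part that was actually wanted goes back into the queue.
                pieces.append(buf[needed:])
                self.bufs.appendleft(buf[:needed])
                needed = 0
            else:
                # Correct: the first `needed` bytes go into the result and
                # the leftover tail goes back on the front of the queue.
                pieces.append(buf[:needed])
                self.bufs.appendleft(buf[needed:])
                needed = 0
        self.bufsiz -= length
        return b"".join(pieces)


good = ChunkBuffer()
good.write(b"abcde")
print(good.read(3))  # b'abc'

bad = ChunkBuffer(swap_bug=True)
bad.write(b"abcde")
print(bad.read(3))   # b'de', only 2 bytes although 3 were requested
```

When every write and every read has the same size, the {{{len(buf) <= needed}}} branch is the only one taken, so the swapped branch never runs and both variants behave identically.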
Fortunately, in our current code, the writes and the read requests are
always of the same size (the block size), so this doesn't happen in
practice. I've added an assertion in [20090210054605-92b7f-
81c751b4418ffa63b4b2b43a459318ea3659ad90] so that it fails safely if
this ever does happen. I have started writing unit tests for
{{{DownUpConnector._satisfy_reads_if_possible()}}} -- it turns out that we
need unit tests in addition to the functional tests that I already wrote:
[source:src/allmydata/test/test_repairer.py].
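As a rough idea of what such a unit test might check, here is a sketch that feeds a buffer randomly sized chunks and issues randomly sized reads. It does not use the real test suite or the real connector: {{{ByteQueue}}} and {{{check_mismatched_sizes}}} are hypothetical names, and the assertion inside {{{read}}} only imitates the spirit of the fail-safe assertion mentioned above.

```python
import random


class ByteQueue:
    """Hypothetical minimal buffer standing in for the object under test."""

    def __init__(self):
        self._data = bytearray()

    def write(self, chunk):
        self._data += chunk

    def read(self, length):
        """Return exactly `length` bytes, or None if too few are buffered."""
        if len(self._data) < length:
            return None
        result = bytes(self._data[:length])
        del self._data[:length]
        # Fail safely instead of handing back a wrongly sized read.
        assert len(result) == length, (len(result), length)
        return result


def check_mismatched_sizes(seed, total=1000):
    """Write and read in unrelated sizes; report whether the bytes survive."""
    rng = random.Random(seed)
    source = bytes(rng.randrange(256) for _ in range(total))
    queue = ByteQueue()
    out = []
    written = 0
    while written < total:
        chunk = source[written:written + rng.randint(1, 50)]
        queue.write(chunk)
        written += len(chunk)
        # Issue reads of unrelated sizes until the buffer cannot satisfy one.
        while True:
            want = rng.randint(1, 50)
            got = queue.read(want)
            if got is None:
                break
            assert len(got) == want
            out.append(got)
    # Drain whatever is still buffered with one final read.
    consumed = sum(len(piece) for piece in out)
    if consumed < total:
        out.append(queue.read(total - consumed))
    return b"".join(out) == source


for seed in range(20):
    assert check_mismatched_sizes(seed)
```

The two properties checked are that each read returns exactly the requested length and that the concatenated reads equal the concatenated writes, which are the properties the bug would violate once chunk and read sizes differ.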
This explains the sporadic "lost progress" failure in the functional
tests. Hm... Could it also explain the "lost progress" behavior that
Brian and I witnessed on the testgrid when this code was newly committed
to trunk? I hope not, because that would mean that I am wrong about the
writes and reads always having the same sizes. But I'm pretty sure I am
right about that.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/616>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid