[tahoe-dev] [tahoe-lafs] #1154: mplayer triggers two bugs in Tahoe's new downloader
tahoe-lafs
trac at tahoe-lafs.org
Thu Aug 5 17:40:53 UTC 2010
#1154: mplayer triggers two bugs in Tahoe's new downloader
------------------------------+---------------------------------------------
Reporter: francois | Owner: warner
Type: defect | Status: assigned
Priority: critical | Milestone: 1.8.0
Component: code-network | Version: 1.8β
Resolution: | Keywords: download regression random-access error
Launchpad Bug: |
------------------------------+---------------------------------------------
Comment (by warner):
Oh, never mind, I think I figured it out. There are actually three bugs
overlapping here:
1. the {{{Spans/DataSpans}}} classes used {{{__len__}}} methods that
returned {{{long}}}s instead of {{{int}}}s, causing an exception during
download. (my [4664] fix was incorrect: it turns out that
{{{__nonzero__}}} is not allowed to return a {{{long}}} either).
1. there is a lost-progress bug in {{{DownloadNode}}}, where a failure in
one segment-fetch will cause all other pending segment-fetches to hang
forever
1. a {{{stopProducing}}} that occurs during this hang-forever period
causes an exception, because there is no active segment-fetch in place
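To illustrate bug 1, here is a minimal sketch (not the actual {{{Spans}}}
or {{{DataSpans}}} code) of the constraint involved. It uses Python 3's
{{{__bool__}}}, the successor to Python 2's {{{__nonzero__}}}; the
{{{BadSpans}}}/{{{GoodSpans}}} names are made up for the example:

```python
# Python requires __bool__ (Python 2's __nonzero__) to return a real
# bool, and __len__ to return a plain int -- returning anything else
# raises TypeError at truth-testing or len() time.

class BadSpans:
    def __bool__(self):
        return 1  # an int, not a bool -- CPython rejects this

class GoodSpans:
    def __init__(self, length):
        self._len = length
    def __bool__(self):
        return bool(self._len)  # always coerce to a real bool
    def __len__(self):
        return int(self._len)   # and return a plain int

try:
    bool(BadSpans())
except TypeError as e:
    print("rejected:", e)

print(bool(GoodSpans(3)), len(GoodSpans(3)))
```

The same rule is why returning a {{{long}}} from {{{__len__}}} or
{{{__nonzero__}}} blew up under Python 2.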
The bug 1 fix is easy: replace {{{self.__len__}}} with {{{self.len}}} and
make {{{__nonzero__}}} always return a {{{bool}}}. The bug 3 fix is also
easy: {{{DownloadNode._cancel_request}}} should tolerate
{{{self._active_segment}}} being {{{None}}}.
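A hedged sketch of the bug 3 guard follows. The {{{DownloadNode}}},
{{{_cancel_request}}}, and {{{_active_segment}}} names come from the
ticket; the method body and the other attributes are illustrative
assumptions, not the real implementation:

```python
class DownloadNode:
    def __init__(self):
        self._active_segment = None   # None during the hang-forever window
        self._segment_requests = []

    def _cancel_request(self, request):
        # remove the request whether or not a fetch is in flight
        if request in self._segment_requests:
            self._segment_requests.remove(request)
        # guard: a stopProducing may arrive when no segment-fetch is
        # active, so tolerate None instead of raising AttributeError
        if self._active_segment is not None and not self._segment_requests:
            self._active_segment.stop()
            self._active_segment = None
```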
The bug 2 fix is not trivial, but not hard either. The start-next-fetch
code in {{{DownloadNode}}} should be factored out, and
{{{DownloadNode.fetch_failed}}} should invoke it after sending errbacks
to the requests that failed. This will add a nice property: if you get
unrecoverable bit errors in one segment, you might still be able to get
valid data from other segments (as opposed to giving up on the whole
file because of a single error). I think some other changes must be made
to really get this property, though. When we get to the point where we
sort shares by "goodness", we'll probably clean this up. The basic idea
is that shares with errors go to the bottom of the list but are not
removed from it entirely: if we really can't find the data we need
anywhere else, we'll give the known-corrupted share a try, in the hope
that some parts of the share are uncorrupted.
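The shape of the bug 2 fix can be sketched as follows. Only the
{{{DownloadNode}}} and {{{fetch_failed}}} names come from the ticket;
the synchronous fetcher, the {{{_start_new_segment}}} helper, and the
request-tuple layout are simplifying assumptions (the real code is
Deferred-based), so treat this as an outline of the control flow, not
the actual implementation:

```python
class DownloadNode:
    def __init__(self, fetcher):
        self._fetcher = fetcher   # callable(segnum) -> data, may raise
        self._pending = []        # [(segnum, callback, errback)]
        self._active_segment = None

    def request(self, segnum, callback, errback):
        self._pending.append((segnum, callback, errback))
        self._start_new_segment()

    def _start_new_segment(self):
        # factored-out helper: kick off the next pending fetch, if any
        if self._active_segment is None and self._pending:
            self._active_segment = self._pending[0][0]
            try:
                data = self._fetcher(self._active_segment)
            except Exception as e:
                self.fetch_failed(self._active_segment, e)
                return
            self._fetch_done(self._active_segment, data)

    def _fetch_done(self, segnum, data):
        self._active_segment = None
        for req in [r for r in self._pending if r[0] == segnum]:
            self._pending.remove(req)
            req[1](data)
        self._start_new_segment()

    def fetch_failed(self, segnum, failure):
        self._active_segment = None
        # errback only the requests for the failed segment...
        for req in [r for r in self._pending if r[0] == segnum]:
            self._pending.remove(req)
            req[2](failure)
        # ...then resume the remaining fetches -- this call is the
        # lost-progress fix; without it the other requests hang forever
        self._start_new_segment()
```

The key point is the final {{{_start_new_segment()}}} call in
{{{fetch_failed}}}: one bad segment errbacks its own requests but no
longer stalls every other pending segment-fetch.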
I've got a series of test cases to exercise these three bugs. I just
have to build them in the right order to make sure that I'm not fixing
the wrong one first (and thus hiding one of the others from my test).
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1154#comment:10>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage