[tahoe-dev] [tahoe-lafs] #698: corrupted file displayed to user after failure to download followed by retry
tahoe-lafs
trac at allmydata.org
Thu May 7 09:21:42 PDT 2009
#698: corrupted file displayed to user after failure to download followed by
retry
--------------------------+-------------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: critical | Milestone: 1.5.0
Component: code-network | Version: 1.4.1
Keywords: integrity | Launchpad_bug:
--------------------------+-------------------------------------------------
I clicked the bookmark to load my blog writably, and I got an HTML page
saying:
{{{
<class 'twisted.internet.defer.FirstError'>:
FirstError(<twisted.python.failure.Failure <class
'foolscap.ipb.DeadReferenceError'>>, 2)
<class 'twisted.internet.defer.FirstError'>:
FirstError(<twisted.python.failure.Failure <class
'foolscap.ipb.DeadReferenceError'>>, 2)
}}}
I looked at the "Recent Uploads/Downloads" page and saw that my attempt to
load it had failed:
{{{
09:23:12 07-May-2009 download lxershd2xflho66w6yikhwg3ne No
588.7kB 0.0% Failed
09:23:12 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No
2.0kB 100.0% Done
09:23:12 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe
No -NA- 100.0% Done
}}}
The three details pages are attached: {{{mapupdate-35.html}}},
{{{retrieve-35.html}}}, and {{{down-13.html}}}.
I looked in the {{{logs}}} directory of my Tahoe node, and saw that the
{{{twistd.log}}} had these same error messages about
{{{DeadReferenceError}}}. {{{twistd.log}}} is attached (bzipped).
I looked in the {{{logs/incidents}}} directory and saw that there was one
incident that was recorded at the time of this attempt to load. It is
attached as {{{incident-2009-05-07-094319-jg54cni.flog.bz2}}}. The
triggering incident line is
{{{
13:51:05.704 [6928]: SCARY <CiphertextDownloader #10>(hekksgbfsn6w):
download failed! FAILURE:
[CopiedFailure instance: Traceback from remote host -- Traceback (most
recent call last):
Failure: twisted.internet.defer.FirstError:
FirstError(<twisted.python.failure.Failure <class
'foolscap.ipb.DeadReferenceError'>>, 0)
]
}}}
So far I think that this is Tahoe demonstrating suboptimal handling of a
network failure -- it should probably have returned an HTTP 503 "Service
Unavailable" (or maybe 504 "Gateway Timeout" or just 500 "Internal Server
Error"?) instead of an HTML page containing cryptic error messages. But
it gets worse:
Then I hit the "Reload" button on my web browser, and I got the same two
error message lines followed by a partial copy of the contents of my blog
source code! This result is attached as {{{wiki.html}}} (bzipped). This
is what I mean by a corrupted file being displayed to the user.
The "Recent Uploads/Downloads" page now says:
{{{
09:47:10 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No
2.0kB 100.0% Done
09:47:10 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe
No -NA- 100.0% Done
09:23:12 07-May-2009 download lxershd2xflho66w6yikhwg3ne No
588.7kB 0.0% Failed
09:23:12 07-May-2009 retrieve 6s64wyhfbm7yxb5cwzqblnndpe No
2.0kB 100.0% Done
09:23:12 07-May-2009 mapupdate MODE_READ 6s64wyhfbm7yxb5cwzqblnndpe
No -NA- 100.0% Done
}}}
The fact that there is no download following the map-update is surprising
to me.
The details from the new {{{mapupdate-36.html}}} and
{{{retrieve-36.html}}} are attached. There are no new problems reported
in the {{{twistd.log}}} or the {{{logs/incidents/}}}.
I'm going to go ahead and mark this with {{{Priority: critical}}} because
I see a corrupted file and I don't understand why. Hopefully it will turn
out to be a bug in the web browser, which is an unstable release:
firefox-3.5 {{{3.5~b4~hg20090330r24021+nobinonly-0ubuntu1}}}.
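
As a rough illustration of the suggestion above (answering a failed download with an HTTP error status instead of an HTML page of error text), here is a minimal sketch. It is not Tahoe's web code: the two exception classes are simplified local stand-ins for {{{twisted.internet.defer.FirstError}}} and {{{foolscap.ipb.DeadReferenceError}}}, and the function names and the choice of 503 versus 500 are assumptions made for the example.

```python
from http import HTTPStatus


class DeadReferenceError(Exception):
    """Local stand-in for foolscap.ipb.DeadReferenceError (assumption)."""


class FirstError(Exception):
    """Simplified stand-in for twisted.internet.defer.FirstError.

    It only records the wrapped exception and the index of the
    Deferred that failed; the real class wraps a Failure object.
    """

    def __init__(self, sub_failure, index):
        super().__init__(sub_failure, index)
        self.sub_failure = sub_failure
        self.index = index


def unwrap(exc):
    """Follow FirstError wrappers down to the underlying exception."""
    while isinstance(exc, FirstError):
        exc = exc.sub_failure
    return exc


def status_for_download_failure(exc):
    """Choose an HTTP status for a failed download (hypothetical policy).

    A lost connection to a storage server is treated as a temporary
    condition (503); anything else falls back to 500.
    """
    if isinstance(unwrap(exc), DeadReferenceError):
        return HTTPStatus.SERVICE_UNAVAILABLE
    return HTTPStatus.INTERNAL_SERVER_ERROR


def render_error(exc):
    """Return (status code, short plain-text body) for the failure."""
    status = status_for_download_failure(exc)
    body = "%d %s\n" % (status.value, status.phrase)
    return status.value, body
```

With something like this, the failure shown in the first error page would map to a 503 response, so a browser or script could tell that the request failed rather than receiving a 200 page whose body contains error text.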
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/698>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid