[tahoe-dev] [tahoe-lafs] #698: corrupted file displayed to user after failure to download followed by retry

Thu May 7 09:21:42 PDT 2009

#698: corrupted file displayed to user after failure to download followed by
retry
--------------------------+-------------------------------------------------
 Reporter:  zooko         |           Owner:       
     Type:  defect        |          Status:  new  
 Priority:  critical      |       Milestone:  1.5.0
Component:  code-network  |         Version:  1.4.1
 Keywords:  integrity     |   Launchpad_bug:       
--------------------------+-------------------------------------------------
 I clicked the bookmark to load my blog writably, and I got an HTML page
 saying:

 {{{
 <class 'twisted.internet.defer.FirstError'>:
 FirstError(<twisted.python.failure.Failure <class
 'foolscap.ipb.DeadReferenceError'>>, 2)

 <class 'twisted.internet.defer.FirstError'>:
 FirstError(<twisted.python.failure.Failure <class
 'foolscap.ipb.DeadReferenceError'>>, 2)
 }}}

 I looked at the "Recent Uploads/Downloads" page and saw that my attempt to
 load it had failed:

 {{{
 09:23:12 07-May-2009    download        lxershd2xflho66w6yikhwg3ne      No      588.7kB 0.0%    Failed
 09:23:12 07-May-2009    retrieve        6s64wyhfbm7yxb5cwzqblnndpe      No      2.0kB   100.0%  Done
 09:23:12 07-May-2009    mapupdate MODE_READ     6s64wyhfbm7yxb5cwzqblnndpe      No      -NA-    100.0%  Done
 }}}

 The three details pages are attached: {{{mapupdate-35.html}}},
 {{{retrieve-35.html}}}, and {{{down-13.html}}}.

 I looked in the {{{logs}}} directory of my Tahoe node, and saw that the
 {{{twistd.log}}} had these same error messages about
 {{{DeadReferenceError}}}.  {{{twistd.log}}} is attached (bzipped).

 I looked in the {{{logs/incidents}}} directory and saw that there was one
 incident that was recorded at the time of this attempt to load.  It is
 attached as {{{incident-2009-05-07-094319-jg54cni.flog.bz2}}}.  The
 triggering incident line is

 {{{
 13:51:05.704 [6928]: SCARY <CiphertextDownloader #10>(hekksgbfsn6w):
 download failed! FAILURE:
 [CopiedFailure instance: Traceback from remote host -- Traceback (most
 recent call last):
 Failure: twisted.internet.defer.FirstError:
 FirstError(<twisted.python.failure.Failure <class
 'foolscap.ipb.DeadReferenceError'>>, 0)
 ]
 }}}

 So far I think that this is Tahoe demonstrating suboptimal handling of a
 network failure -- it should probably have returned an HTTP 503 "Service
 Unavailable" (or maybe 504 "Gateway Timeout" or just 500 "Internal Server
 Error"?) instead of an HTML page containing cryptic error messages.  But
 it gets worse:

 Then I hit the "Reload" button on my web browser, and I got the same two
 error message lines followed by a partial copy of the contents of my blog
 source code!  This result is attached as {{{wiki.html}}} (bzipped).  This
 is what I mean by a corrupted file being displayed to the user.
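
 For the first failure, the suggestion of returning an HTTP error status
 instead of an HTML page of exception text could be sketched roughly as
 follows.  This is only an illustration, not Tahoe's actual web code: the
 exception classes are hypothetical stand-ins (the first one merely borrows
 the name of foolscap's {{{DeadReferenceError}}}), and the mapping from
 failure to status code is one assumed policy among the options mentioned
 above.

```python
from http import HTTPStatus


class DeadReferenceError(Exception):
    """Hypothetical stand-in for foolscap.ipb.DeadReferenceError."""


class DownloadTimeout(Exception):
    """Hypothetical: a storage server did not answer in time."""


def status_for_failure(exc):
    """Pick an HTTP status for a failed download (assumed policy)."""
    if isinstance(exc, DeadReferenceError):
        # The connection to a storage server was lost: retrying may help.
        return HTTPStatus.SERVICE_UNAVAILABLE
    if isinstance(exc, DownloadTimeout):
        return HTTPStatus.GATEWAY_TIMEOUT
    # Anything unrecognized is reported as a generic server error.
    return HTTPStatus.INTERNAL_SERVER_ERROR


def render_error(exc):
    """Return (status code, short plain-text body) for a failed download."""
    status = status_for_failure(exc)
    body = "%d %s\n" % (status.value, status.phrase)
    return status.value, body
```

 With a mapping like this, the gateway would answer the failed load with a
 bare error status and a one-line body, so a browser would not render the
 exception text as if it were file content.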

 The "Recent Uploads and Downloads" page now says:

 {{{
 09:47:10 07-May-2009    retrieve        6s64wyhfbm7yxb5cwzqblnndpe      No      2.0kB   100.0%  Done
 09:47:10 07-May-2009    mapupdate MODE_READ     6s64wyhfbm7yxb5cwzqblnndpe      No      -NA-    100.0%  Done
 09:23:12 07-May-2009    download        lxershd2xflho66w6yikhwg3ne      No      588.7kB 0.0%    Failed
 09:23:12 07-May-2009    retrieve        6s64wyhfbm7yxb5cwzqblnndpe      No      2.0kB   100.0%  Done
 09:23:12 07-May-2009    mapupdate MODE_READ     6s64wyhfbm7yxb5cwzqblnndpe      No      -NA-    100.0%  Done
 }}}

 The fact that there is no download following the map-update is surprising
 to me.

 The details from the new {{{mapupdate-36.html}}} and
 {{{retrieve-36.html}}} are attached.  There are no new problems reported
 in the {{{twistd.log}}} or the {{{logs/incidents/}}}.

 I'm going to go ahead and mark this with {{{Priority: critical}}} because
 I see a corrupted file and I don't understand why.  Hopefully it will turn
 out to be a bug in the web browser, which is an unstable release:
 firefox-3.5 {{{3.5~b4~hg20090330r24021+nobinonly-0ubuntu1}}}.

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/698>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid