[tahoe-lafs-trac-stream] [Tahoe-LAFS] #1278: gateway won't serve any page; variety of interesting error messages in twistd.log
Tahoe-LAFS
trac at tahoe-lafs.org
Fri Nov 28 15:50:09 UTC 2014
#1278: gateway won't serve any page; variety of interesting error messages in
twistd.log
-------------------------+-------------------------------------------------
Reporter: zooko | Owner: somebody
Type: defect | Status: new
Priority: major | Milestone: soon
Component: code | Version: 1.8.0
Resolution: | Keywords: reliability availability error hang
Launchpad Bug: | heisenbug wui anti-censorship test-needed
-------------------------+-------------------------------------------------
Old description:
> Some servers on the public test grid are misbehaving -- I can tell
> because updating my blog takes about ten minutes sometimes. Having
> updated my blog a few times this morning, my gateway got into a broken
> state where it will not even load the recent uploads and downloads
> page...
>
> Oh, wait, now it works again. Hm, I will attach {{{twistd.log}}}, and
> here are a few of the interesting messages:
> {{{
> 2010-12-08 22:48:44-0800 [HTTPChannel,8,127.0.0.1] Unhandled Error
> Traceback (most recent call last):
> Failure: allmydata.immutable.downloader.common.DownloadStopped:
> our Consumer called stopProducing()
>
> 2010-12-10 10:21:11-0800 [-] Unhandled Error
> Traceback (most recent call last):
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 546, in _got_response
> return self._loop()
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 435, in _loop
> self.contacted_peers2)
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 320, in addBoth
> callbackKeywords=kw, errbackKeywords=kw)
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 286, in addCallbacks
> self._runCallbacks()
> --- <exception caught here> ---
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 542, in _runCallbacks
> current.result = callback(current.result, *args, **kw)
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 546, in _got_response
> return self._loop()
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 458, in _loop
> return self._failed(msg)
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 561, in _failed
> raise UploadUnhappinessError(msg)
> allmydata.interfaces.UploadUnhappinessError: peer selection
> failed for <Tahoe2PeerSelector for upload kcaaq>: shares could be placed
> on only 4 server(s) such that any 3 of them have enough share
> s to recover the file, but we were asked to place shares on at least 5
> such servers. (placed 9 shares out of 10 total (1 homeless), want to
> place shares on at least 5 servers such that any 3 of them hav
> e enough shares to recover the file, sent 13 queries to 11 peers, 2
> queries placed some shares, 11 placed none (of which 4 placed none due to
> the server being full and 7 placed none due to an error)) (l
> ast failure (from <PeerTracker for peer nszizgf5 and SI kcaaq>) was:
> [Failure instance: Traceback: <class 'foolscap.ipb.DeadReferenceError'>:
> Calling Stale Broker
> /Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py:546:_got_response
> /Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py:433:_loop
> /Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py:119:query
> /Library/Python/2.6/site-packages/foolscap-0.5.1_-
> py2.6.egg/foolscap/referenceable.py:408:callRemote
> --- <exception caught here> ---
> /Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py:133:maybeDeferred
> /Library/Python/2.6/site-packages/foolscap-0.5.1_-
> py2.6.egg/foolscap/referenceable.py:447:_callRemote
> /Library/Python/2.6/site-packages/foolscap-0.5.1_-
> py2.6.egg/foolscap/broker.py:468:newRequestID
> ])
>
> 2010-12-10 10:21:12-0800 [-] Exception rendering error page:
> 2010-12-10 10:21:12-0800 [-] Unhandled Error
> Traceback (most recent call last):
> File "/Library/Python/2.6/site-packages/foolscap-0.5.1_-
> py2.6.egg/foolscap/call.py", line 95, in fail
> self.deferred.errback(why)
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 388, in errback
> self._startRunCallbacks(fail)
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 455, in _startRunCallbacks
> self._runCallbacks()
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 542, in _runCallbacks
> current.result = callback(current.result, *args, **kw)
> --- <exception caught here> ---
> File "/Library/Python/2.6/site-
> packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 82, in
> processingFailed
> handler.renderHTTP_exception(ctx, reason)
> File "/Library/Python/2.6/site-
> packages/allmydata/web/common.py", line 278, in renderHTTP_exception
> return super.renderHTTP_exception(self, ctx, f)
> File "/Library/Python/2.6/site-
> packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 61, in
> renderHTTP_exception
> request.finishRequest( False )
> File "/Library/Python/2.6/site-
> packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 176, in
> finishRequest
> server.Request.finish(self)
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/web/http.py",
> line 900, in finish
> "Request.finish called on a request after its connection was
> lost; "
> exceptions.RuntimeError: Request.finish called on a request after
> its connection was lost; use Request.notifyFinish to keep track of this.
>
> 2010-12-10 10:21:12-0800 [-] 'Original exception:'
> }}}
>
> {{{
> 2010-12-10 11:26:14-0800 [-] Unhandled Error
> Traceback (most recent call last):
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 286, in addCallbacks
> self._runCallbacks()
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 542, in _runCallbacks
> current.result = callback(current.result, *args, **kw)
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 694, in <lambda>
> self._read_encrypted(length, ciphertext, hash_only, d2))
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 709, in _read_encrypted
> d = defer.maybeDeferred(self.original.read, size)
> --- <exception caught here> ---
> File "/Library/Python/2.6/site-
> packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
> line 133, in maybeDeferred
> result = f(*args, **kw)
> File "/Library/Python/2.6/site-
> packages/allmydata/immutable/upload.py", line 1362, in read
> return defer.succeed([self._filehandle.read(length)])
> exceptions.ValueError: I/O operation on closed file
>
> }}}
>
> There is one flog incident file. I'll attach it.
New description:
Some servers on the public test grid are misbehaving -- I can tell because
updating my blog takes about ten minutes sometimes. Having updated my blog
a few times this morning, my gateway got into a broken state where it will
not even load the recent uploads and downloads page...
Oh, wait, now it works again. Hm, I will attach {{{twistd.log}}}, and here
are a few of the interesting messages:
{{{
2010-12-08 22:48:44-0800 [HTTPChannel,8,127.0.0.1] Unhandled Error
Traceback (most recent call last):
Failure: allmydata.immutable.downloader.common.DownloadStopped:
our Consumer called stopProducing()
2010-12-10 10:21:11-0800 [-] Unhandled Error
Traceback (most recent call last):
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 546, in _got_response
return self._loop()
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 435, in _loop
self.contacted_peers2)
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 320, in addBoth
callbackKeywords=kw, errbackKeywords=kw)
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 286, in addCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 542, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 546, in _got_response
return self._loop()
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 458, in _loop
return self._failed(msg)
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 561, in _failed
raise UploadUnhappinessError(msg)
allmydata.interfaces.UploadUnhappinessError: peer selection failed
for <Tahoe2PeerSelector for upload kcaaq>: shares could be placed on only
4 server(s) such that any 3 of them have enough share
s to recover the file, but we were asked to place shares on at least 5
such servers. (placed 9 shares out of 10 total (1 homeless), want to place
shares on at least 5 servers such that any 3 of them hav
e enough shares to recover the file, sent 13 queries to 11 peers, 2
queries placed some shares, 11 placed none (of which 4 placed none due to
the server being full and 7 placed none due to an error)) (l
ast failure (from <PeerTracker for peer nszizgf5 and SI kcaaq>) was:
[Failure instance: Traceback: <class 'foolscap.ipb.DeadReferenceError'>:
Calling Stale Broker
/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py:546:_got_response
/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py:433:_loop
/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py:119:query
/Library/Python/2.6/site-packages/foolscap-0.5.1_-
py2.6.egg/foolscap/referenceable.py:408:callRemote
--- <exception caught here> ---
/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py:133:maybeDeferred
/Library/Python/2.6/site-packages/foolscap-0.5.1_-
py2.6.egg/foolscap/referenceable.py:447:_callRemote
/Library/Python/2.6/site-packages/foolscap-0.5.1_-
py2.6.egg/foolscap/broker.py:468:newRequestID
])
2010-12-10 10:21:12-0800 [-] Exception rendering error page:
2010-12-10 10:21:12-0800 [-] Unhandled Error
Traceback (most recent call last):
File "/Library/Python/2.6/site-packages/foolscap-0.5.1_-
py2.6.egg/foolscap/call.py", line 95, in fail
self.deferred.errback(why)
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 388, in errback
self._startRunCallbacks(fail)
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 455, in _startRunCallbacks
self._runCallbacks()
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 542, in _runCallbacks
current.result = callback(current.result, *args, **kw)
--- <exception caught here> ---
File "/Library/Python/2.6/site-
packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 82, in
processingFailed
handler.renderHTTP_exception(ctx, reason)
File "/Library/Python/2.6/site-
packages/allmydata/web/common.py", line 278, in renderHTTP_exception
return super.renderHTTP_exception(self, ctx, f)
File "/Library/Python/2.6/site-
packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 61, in
renderHTTP_exception
request.finishRequest( False )
File "/Library/Python/2.6/site-
packages/Nevow-0.10.0-py2.6.egg/nevow/appserver.py", line 176, in
finishRequest
server.Request.finish(self)
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/web/http.py",
line 900, in finish
"Request.finish called on a request after its connection was
lost; "
exceptions.RuntimeError: Request.finish called on a request after
its connection was lost; use Request.notifyFinish to keep track of this.
2010-12-10 10:21:12-0800 [-] 'Original exception:'
}}}
{{{
2010-12-10 11:26:14-0800 [-] Unhandled Error
Traceback (most recent call last):
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 286, in addCallbacks
self._runCallbacks()
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 542, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 694, in <lambda>
self._read_encrypted(length, ciphertext, hash_only, d2))
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 709, in _read_encrypted
d = defer.maybeDeferred(self.original.read, size)
--- <exception caught here> ---
File "/Library/Python/2.6/site-
packages/Twisted-10.2.0_r30364-py2.6-macosx-10.6-universal.egg/twisted/internet/defer.py",
line 133, in maybeDeferred
result = f(*args, **kw)
File "/Library/Python/2.6/site-
packages/allmydata/immutable/upload.py", line 1362, in read
return defer.succeed([self._filehandle.read(length)])
exceptions.ValueError: I/O operation on closed file
}}}
There is one flog incident file. I'll attach it.
--
Comment (by zooko):
I have a consulting client (codename "WAG") who is reporting an issue with
very similar symptoms. They say that they have hundreds of thousands of
people inconvenienced by this failure, so it is urgent for them to figure
out what happened.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1278#comment:6>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage
More information about the tahoe-lafs-trac-stream
mailing list