[tahoe-dev] Sighting reports
Kyle Markley
kyle at arbyte.us
Tue Jul 13 19:36:20 UTC 2010
Hey developers,
I've been putting my 4-node grid through some stress and I've encountered
a few problems I wanted to report.
1) Sometimes backup operations fail like this:
allmydata.scripts.common_http.HTTPError: Error during file PUT: 500 Internal Server Error
Traceback (most recent call last):
  File "build/bdist.openbsd-4.6-amd64/egg/foolscap/call.py", line 674, in _done
  File "build/bdist.openbsd-4.6-amd64/egg/foolscap/call.py", line 60, in complete
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 280, in callback
    self._startRunCallbacks(result)
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 354, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 371, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/local/lib/python2.6/site-packages/allmydata_tahoe-1.7.0-py2.6.egg/allmydata/immutable/upload.py", line 506, in _got_response
    return self._loop()
  File "/usr/local/lib/python2.6/site-packages/allmydata_tahoe-1.7.0-py2.6.egg/allmydata/immutable/upload.py", line 359, in _loop
    self._get_progress_message()))
allmydata.interfaces.UploadUnhappinessError: shares could be placed on
only 3 server(s) such that any 2 of them have enough shares to recover the
file, but we were asked to place shares on at least 4 such servers. (placed
all 4 shares, want to place shares on at least 4 servers such that any 2 of
them have enough shares to recover the file, sent 4 queries to 4 peers, 3
queries placed some shares, 1 placed none (of which 1 placed none due to
the server being full and 0 placed none due to an error))
This error report is incorrect -- all of the storage nodes show on their
status pages that they are still accepting new shares! Further, if I keep
restarting the backup, the reported situation degrades until eventually it
says that none of the 4 shares could be placed due to the server being
full. If I restart the Tahoe node that is running the backup, the problem
goes away, at least for a while.
This backup operation is not using a helper, but is running on the node
that runs the helper.
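To make the numbers in that message concrete, here is a minimal sketch of
how I understand the "3 server(s)" count. It assumes the count is the
largest number of servers that can each be paired with a distinct share (a
maximum bipartite matching). The function name and the placement map are
hypothetical illustrations, not Tahoe's actual code or my grid's real
layout.

```python
def happiness(placements):
    """Count servers that can each be paired with a distinct share.

    placements: dict mapping a server id to the set of share numbers it
    holds. This is an illustrative sketch, not Tahoe's implementation.
    """
    match = {}  # share number -> server currently paired with it

    def try_assign(server, seen):
        # Look for a share for this server, re-pairing others if needed.
        for share in placements[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placements if try_assign(server, set()))


# Hypothetical layout: all 4 shares placed, but one server took none.
example = {"A": set([0]), "B": set([1]), "C": set([2, 3]), "D": set()}
print(happiness(example))
```

With a layout like this, all 4 shares are placed but only 3 servers count,
which falls short of the 4 that were asked for.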
2) A long tahoe backup aborted with this error:
allmydata.scripts.common_http.HTTPError: Error during file PUT: 500 Internal Server Error
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 325, in unpause
    self._runCallbacks()
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 371, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 330, in _continue
    self.unpause()
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 325, in unpause
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py", line 371, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/local/lib/python2.6/site-packages/allmydata_tahoe-1.7.0-py2.6.egg/allmydata/immutable/upload.py", line 896, in set_shareholders
    assert len(buckets) == sum([len(peer.buckets) for peer in used_peers])
exceptions.AssertionError:
I can reproduce this error, which should make it more debuggable.
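One way an assertion of this shape could fail, sketched below. This is a
guess rather than a diagnosis: it assumes `buckets` is a dict keyed by
share number, built by merging each peer's `buckets`, so that two peers
holding the same share number collapse into one entry. The `PeerTracker`
class and both helper functions are hypothetical stand-ins, not the code in
upload.py.

```python
class PeerTracker(object):
    """Hypothetical stand-in for whatever upload.py keeps per peer."""
    def __init__(self, peerid, buckets):
        self.peerid = peerid
        self.buckets = buckets  # share number -> bucket writer


def merge_buckets(used_peers):
    # Merge per-peer buckets into one dict; duplicate share numbers
    # overwrite each other, so the merged dict can come out smaller.
    buckets = {}
    for peer in used_peers:
        buckets.update(peer.buckets)
    return buckets


def duplicated_shares(used_peers):
    # Report share numbers claimed by more than one peer.
    owners = {}
    for peer in used_peers:
        for shnum in peer.buckets:
            owners.setdefault(shnum, []).append(peer.peerid)
    return dict((shnum, ids) for shnum, ids in owners.items()
                if len(ids) > 1)


# Hypothetical case: peers A and B both hold share 1.
peers = [PeerTracker("A", {0: "w0", 1: "w1"}),
         PeerTracker("B", {1: "w1b", 2: "w2"})]
merged = merge_buckets(peers)
total = sum([len(peer.buckets) for peer in peers])
print(len(merged) == total)
print(duplicated_shares(peers))
```

If that assumption holds, logging the duplicated share numbers just before
the assert would show whether this is what happens on my grid.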
--
Kyle Markley