[tahoe-lafs-trac-stream] [tahoe-lafs] #1791: UploadUnhappinessError with available storage nodes > shares.happy
tahoe-lafs
trac at tahoe-lafs.org
Sat Jul 7 21:30:23 UTC 2012
#1791: UploadUnhappinessError with available storage nodes > shares.happy
------------------------------------+------------------------------------
Reporter: gyver | Owner: gyver
Type: defect | Status: new
Priority: major | Milestone: 1.10.0
Component: code-peerselection | Version: 1.9.2
Resolution: | Keywords: happiness upload error
Launchpad Bug: |
------------------------------------+------------------------------------
Comment (by gyver):
Replying to [comment:6 davidsarah]:
> Please add the following just after line 225 (i.e. after
{{{readonly_servers = }}}... and before {{{# decide upon the
renewal/cancel secrets}}}...) of
[source:1.9.2/src/allmydata/immutable/upload.py
src/allmydata/immutable/upload.py in 1.9.2]:
I may not have done it right: I got the same output, with this at the end:
{{{
23:09:02.238 L23 []#2436 an outbound callRemote (that we [omkz] sent to
someone else [zqxq]) failed on the far end
23:09:02.238 L10 []#2437 reqID=873, rref=<RemoteReference at 0x2e780d0>,
methname=RILogObserver.foolscap.lothar.com.msg
23:09:02.238 L10 []#2438 the REMOTE failure was:
FAILURE:
[CopiedFailure instance: Traceback from remote host -- Traceback (most
recent call last):
File "/usr/lib64/python2.7/site-packages/foolscap/slicers/root.py",
line 107, in send
d.callback(None)
File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py",
line 361, in callback
self._startRunCallbacks(result)
File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py",
line 455, in _startRunCallbacks
self._runCallbacks()
File "/usr/lib64/python2.7/site-packages/twisted/internet/defer.py",
line 542, in _runCallbacks
current.result = callback(current.result, *args, **kw)
--- <exception caught here> ---
File "/usr/lib64/python2.7/site-packages/foolscap/banana.py", line 215,
in produce
slicer = self.newSlicerFor(obj)
File "/usr/lib64/python2.7/site-packages/foolscap/banana.py", line 314,
in newSlicerFor
return topSlicer.slicerForObject(obj)
File "/usr/lib64/python2.7/site-packages/foolscap/slicer.py", line 48,
}}}
BUT... I may have a lead, looking at the last error message in my original
log dump:
{{{
server selection unsuccessful for <Tahoe2ServerSelector for upload k5ga2>:
shares could be placed on only 5 server(s) [...], merged=sh0: zp6jpfeu,
sh1: pa2myijh, sh2: pa2myijh, sh3: omkzwfx5, sh4: wo6akhxt, sh5: ughwvrtu
}}}
I assume the sh<n> are the shares to be placed. sh1 and sh2 were both
assigned to pa2myijh. I'm not sure whether this distribution is the result
of share detection (my guess) or of a share placement algorithm that could
produce an invalid placement and needs a check before upload (late error
detection isn't good practice, so I bet that's not the case).
What if these shares were already stored on pa2myijh '''before''' the
upload attempt (due to past uploads with a buggy version, or whatever
happened in the store directory outside Tahoe's control)? Is the code able
to detect such a case and re-upload one of the two shares to a free server
(one holding none of the 6 shares)? If not, this might be the cause of my
problem (the file was part of a long list of files I tried to upload, with
only partial success, weeks ago...) and my storage nodes are most probably
polluted by "dangling" shares.
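If that's the case, the situation would be detectable by looking for
servers holding more than one share and for connected servers holding none.
A hypothetical sketch (the extra server names are made up for illustration,
this is not how Tahoe's selector is implemented):

```python
from collections import defaultdict

# Share -> server map from the log, plus a made-up full peer list that
# includes two servers holding none of this file's shares.
merged = {
    "sh0": "zp6jpfeu", "sh1": "pa2myijh", "sh2": "pa2myijh",
    "sh3": "omkzwfx5", "sh4": "wo6akhxt", "sh5": "ughwvrtu",
}
all_servers = set(merged.values()) | {"extra001", "extra002"}

# Invert the map to find servers doubled up with several shares
by_server = defaultdict(list)
for share, server in sorted(merged.items()):
    by_server[server].append(share)

doubled = {s: shs for s, shs in by_server.items() if len(shs) > 1}
free = all_servers - set(merged.values())
print(doubled)       # pa2myijh holds both sh1 and sh2
print(sorted(free))  # candidate servers for re-placing one of them
```

Moving either sh1 or sh2 to one of the free servers would bring the
distinct-server count back up to 6 and satisfy shares.happy.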
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1791#comment:9>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage