[tahoe-dev] [tahoe-lafs] #732: Not Enough Shares when repairing a file which has 7 shares on 2 servers
tahoe-lafs
trac at allmydata.org
Wed Jun 10 07:23:41 PDT 2009
#732: Not Enough Shares when repairing a file which has 7 shares on 2 servers
----------------------------+-----------------------------------------------
Reporter: zooko | Owner: zooko
Type: defect | Status: new
Priority: major | Milestone: 1.5.0
Component: code-encoding | Version: 1.4.1
Keywords: repair process | Launchpad_bug:
----------------------------+-----------------------------------------------
My demo at the Northern Colorado Linux Users Group had an unfortunate
climactic conclusion when someone (whose name I didn't catch) asked about
repairing damaged files, so I clicked the check button with the "repair"
checkbox turned on, and got this:
{{{
NotEnoughSharesError: no shares could be found. Zero shares usually
indicates a corrupt URI, or that no servers were connected, but it might
also indicate severe corruption. You should perform a filecheck on this
object to learn more.
}}}
I couldn't figure it out and had to just bravely claim that Tahoe had
really great test coverage and this sort of unpleasant surprise wasn't
common. I also promised to email them all with the explanation, so I'm
subscribing to the NCLUG mailing list so that I can e-mail the URL to this
ticket. :-)
The problem remains reproducible today. I have a little demo grid with an
introducer, a gateway, and two storage servers. The gateway has storage
service turned off. I have a file stored therein with 3-of-10 encoding,
and I manually {{{rm}}}'ed three shares from one of the storage servers.
Check correctly reports:
{{{
"summary": "Not Healthy: 7 shares (enc 3-of-10)"
}}}
Check also works with the "verify" checkbox turned on.
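To spell out why the repair failure is surprising: with 3-of-10 encoding any 3 distinct shares are enough to reconstruct the file, so 7 surviving shares should leave it damaged but still recoverable. The following is only a minimal sketch of that arithmetic, not Tahoe's checker code; the function name `classify` and its return strings are hypothetical.

```python
def classify(shares_found, needed, total):
    """Classify a file by its count of distinct surviving shares.

    Hypothetical helper for illustration; not part of Tahoe-LAFS.
    needed/total correspond to the k-of-N encoding parameters.
    """
    if shares_found >= total:
        return "healthy"
    if shares_found >= needed:
        # Enough shares remain to decode, so a repair should be possible.
        return "unhealthy but recoverable"
    return "unrecoverable"


# The situation in this ticket: 3-of-10 encoding, 3 shares deleted.
print(classify(10 - 3, needed=3, total=10))
```

By this reasoning the file in the demo grid sits in the middle case, which is why a "no shares could be found" error from the repairer looks like a defect rather than an expected outcome.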
When I try to repair I get the Not Enough Shares error and an incident
report like this one (full incident report file attached):
{{{
07:03:12.747 [5977]: web: 127.0.0.1 GET /uri/[CENSORED].. 200 308553
07:03:25.604 [5978]: <Repairer #6>(u7rxp): starting repair
07:03:25.604 [5979]: CHKUploader starting
07:03:25.604 [5980]: starting upload of <DownUpConnector #6>
07:03:25.604 [5981]: creating Encoder <Encoder for unknown storage index>
07:03:25.604 [5982]: <CiphertextDownloader #22>(u7rxpbtbw5wb): starting download
07:03:25.613 [5983]: SCARY <CiphertextDownloader #22>(u7rxpbtbw5wb): download failed! FAILURE:
[CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last):
  File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/immutable/repairer.py", line 69, in start
    d2 = dl.start()
  File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/immutable/download.py", line 715, in start
    d.addCallback(self._got_all_shareholders)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 195, in addCallback
    callbackKeywords=kw)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 186, in addCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/Twisted-8.2.0-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/Users/wonwinmcbrootles/playground/allmydata/tahoe/trunk/trunk/src/allmydata/immutable/download.py", line 810, in _got_all_shareholders
    self._verifycap.needed_shares)
allmydata.interfaces.NotEnoughSharesError: Failed to get enough shareholders
]
[INCIDENT-TRIGGER]
07:03:26.253 [5984]: web: 127.0.0.1 POST /uri/[CENSORED].. 410 234
}}}
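The `POST /uri/...` line in the log is the check-and-repair request sent by the web page. For anyone who wants to reproduce this without the browser, here is a rough sketch that only builds the corresponding URL. It assumes the gateway listens on `http://127.0.0.1:3456` and that the webapi accepts `t=check` with `repair=true`, `verify=true` and `output=JSON` as I understand the webapi documentation; the helper name `check_url` and the placeholder cap are made up for illustration.

```python
from urllib.parse import quote, urlencode


def check_url(gateway, cap, verify=False, repair=False):
    """Build a check URL for a file cap (hypothetical helper).

    Assumes the webapi's t=check operation and its optional
    verify/repair/output arguments; adjust if your version differs.
    """
    params = [("t", "check"), ("output", "JSON")]
    if verify:
        params.append(("verify", "true"))
    if repair:
        params.append(("repair", "true"))
    # The cap contains ':' characters, so percent-encode it fully.
    return "%s/uri/%s?%s" % (gateway.rstrip("/"), quote(cap, safe=""), urlencode(params))


# Placeholder cap; substitute the real one. The request itself would be a POST.
print(check_url("http://127.0.0.1:3456", "URI:CHK:placeholder", repair=True))
```

The sketch deliberately stops at constructing the URL, so it can be inspected before anything is sent to a live gateway.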
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/732>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid