[tahoe-lafs-trac-stream] [tahoe-lafs] #1280: deal with fragile, but disposable, bucket state files (was: if bucket_counter.state or lease_checker.state can't be written, stop the node with an error message)
tahoe-lafs
trac at tahoe-lafs.org
Tue Oct 16 02:36:30 UTC 2012
#1280: deal with fragile, but disposable, bucket state files
--------------------------------+--------------------------------
Reporter: francois | Owner: zooko
Type: defect | Status: reopened
Priority: normal | Milestone: 1.11.0
Component: code-nodeadmin | Version: 1.8.1
Resolution: | Keywords: pickle reliability
Launchpad Bug: |
--------------------------------+--------------------------------
Description changed by zooko:
Old description:
> After a hard system shutdown due to a power failure, a Tahoe node might
> not be able to start again automatically because the files
> '''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
> are empty.
>
> The easy workaround is to manually delete the empty files before
> restarting the nodes.
>
> {{{
> find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
> find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
> }}}
>
> Here is what a startup attempt looks like in such a case.
>
> {{{
> Traceback (most recent call last):
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
>     runApp(config)
>   File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
>     _SomeApplicationRunner(config).run()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
>     self.application = self.createOrGetApplication()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
>     application = getApplication(self.config, passphrase)
> --- <exception caught here> ---
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
>     application = service.loadApplication(filename, style, passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
>     application = sob.loadValueFromFile(filename, 'application', passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
>     exec fileObj in d, d
>   File "tahoe-client.tac", line 10, in <module>
>     c = client.Client()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
>     self.init_storage()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
>     expiration_sharetypes=expiration_sharetypes)
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
>     self.add_bucket_counter()
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
>     self.bucket_counter = BucketCountingCrawler(self, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
>     ShareCrawler.__init__(self, server, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
>     self.load_state()
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
>     state = pickle.load(f)
> exceptions.EOFError:
> }}}
New description:
If a bucket state file can't be loaded and parsed for any reason (usually
because it is 0-length, but any other sort of error should also be handled
similarly), then just blow it away and start fresh.
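A minimal sketch of that behaviour, written for current Python rather than the Python 2.5 shown in the traceback. The helper name `load_state_or_reset` and its `default_factory` parameter are hypothetical and are not taken from the Tahoe-LAFS source; the sketch only assumes that the state file is a pickle, as the `pickle.load(f)` call in `load_state` suggests.

```python
import os
import pickle


def load_state_or_reset(statefile, default_factory=dict):
    """Return the unpickled state, or a fresh default if the file is unusable.

    Hypothetical helper for illustration; not the actual crawler code.
    """
    try:
        with open(statefile, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        # No state file yet: nothing to discard, just start fresh.
        return default_factory()
    except Exception:
        # Zero-length, truncated, or otherwise unparseable file. The state
        # is disposable, so delete the file and start fresh.
        try:
            os.remove(statefile)
        except OSError:
            pass  # Best effort: the caller still gets a fresh state.
        return default_factory()
```

Catching `Exception` rather than only `EOFError` is deliberate here: an empty file raises `EOFError`, but a partially written one may fail in other ways, and the description asks for all of them to be handled the same way.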
------- original post by François below:
After a hard system shutdown due to a power failure, a Tahoe node might not
be able to start again automatically because the files
'''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
are empty.
The easy workaround is to manually delete the empty files before
restarting the nodes.
{{{
find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
}}}
Here is what a startup attempt looks like in such a case.
{{{
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
    runApp(config)
  File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
    _SomeApplicationRunner(config).run()
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
    self.application = self.createOrGetApplication()
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
    application = getApplication(self.config, passphrase)
--- <exception caught here> ---
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
    application = service.loadApplication(filename, style, passphrase)
  File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
    application = sob.loadValueFromFile(filename, 'application', passphrase)
  File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
    exec fileObj in d, d
  File "tahoe-client.tac", line 10, in <module>
    c = client.Client()
  File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
    self.init_storage()
  File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
    expiration_sharetypes=expiration_sharetypes)
  File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
    self.add_bucket_counter()
  File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
    self.bucket_counter = BucketCountingCrawler(self, statefile)
  File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
    ShareCrawler.__init__(self, server, statefile)
  File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
    self.load_state()
  File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
    state = pickle.load(f)
exceptions.EOFError:
}}}
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1280#comment:11>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage