[tahoe-lafs-trac-stream] [tahoe-lafs] #1280: deal with fragile, but disposable, bucket state files (was: if bucket_counter.state or lease_checker.state can't be written, stop the node with an error message)

tahoe-lafs trac at tahoe-lafs.org
Tue Oct 16 02:36:30 UTC 2012


#1280: deal with fragile, but disposable, bucket state files
--------------------------------+--------------------------------
     Reporter:  francois        |      Owner:  zooko
         Type:  defect          |     Status:  reopened
     Priority:  normal          |  Milestone:  1.11.0
    Component:  code-nodeadmin  |    Version:  1.8.1
   Resolution:                  |   Keywords:  pickle reliability
Launchpad Bug:                  |
--------------------------------+--------------------------------
Description changed by zooko:

Old description:

> After a hard system shutdown due to power failure, Tahoe node might not
> be able to start again automatically because files
> '''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
> are empty.
>
> The easy workaround is to manually delete the empty files before
> restarting nodes.
>
> {{{
> find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
> find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
> }}}
>
> Here is what a startup attempt looks like in such case.
>
> {{{
> Traceback (most recent call last):
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
>     runApp(config)
>   File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
>     _SomeApplicationRunner(config).run()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
>     self.application = self.createOrGetApplication()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
>     application = getApplication(self.config, passphrase)
> --- <exception caught here> ---
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
>     application = service.loadApplication(filename, style, passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
>     application = sob.loadValueFromFile(filename, 'application', passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
>     exec fileObj in d, d
>   File "tahoe-client.tac", line 10, in <module>
>     c = client.Client()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
>     self.init_storage()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
>     expiration_sharetypes=expiration_sharetypes)
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
>     self.add_bucket_counter()
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
>     self.bucket_counter = BucketCountingCrawler(self, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
>     ShareCrawler.__init__(self, server, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
>     self.load_state()
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
>     state = pickle.load(f)
> exceptions.EOFError:
> }}}

New description:

 If a bucket state file can't be loaded and parsed for any reason (usually
 because it is 0-length, but any other sort of error should be handled the
 same way), then just blow it away and start fresh.
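
 The behaviour described above could look roughly like the sketch below.
 This is not the Tahoe-LAFS code: `load_state_or_reset` and its
 `default_factory` parameter are hypothetical names invented for
 illustration, and the sketch is written for Python 3 rather than the
 Python 2.5 seen in the traceback. It only assumes, as the traceback shows,
 that the state file is read with `pickle.load`.

```python
import os
import pickle


def load_state_or_reset(statefile, default_factory=dict):
    """Hypothetical helper: return the pickled state, or fresh state if unreadable.

    The state file is treated as disposable: any failure to load or parse it
    (a 0-length file, truncated or corrupt pickle data, ...) causes the file
    to be deleted and a fresh default state to be returned.
    """
    try:
        with open(statefile, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        # No state file yet (first run): nothing to delete, just start fresh.
        return default_factory()
    except Exception:
        # An empty file raises EOFError; corrupt data may raise
        # pickle.UnpicklingError or other exceptions. Handle them all alike.
        try:
            os.remove(statefile)
        except OSError:
            pass  # Could not delete it; a later save would overwrite it anyway.
        return default_factory()
```

 With something like this in place, an empty `bucket_counter.state` left
 behind by a power failure would be discarded at startup instead of
 stopping the node with `EOFError`.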

 ------- original post by François below:

 After a hard system shutdown due to a power failure, a Tahoe node might not
 be able to start again automatically because the files
 '''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
 are empty.

 The easy workaround is to manually delete the empty files before
 restarting the nodes.

 {{{
 find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
 find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
 }}}

 Here is what a startup attempt looks like in such a case.

 {{{
 Traceback (most recent call last):
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
     runApp(config)
   File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
     _SomeApplicationRunner(config).run()
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
     self.application = self.createOrGetApplication()
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
     application = getApplication(self.config, passphrase)
 --- <exception caught here> ---
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
     application = service.loadApplication(filename, style, passphrase)
   File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
     application = sob.loadValueFromFile(filename, 'application', passphrase)
   File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
     exec fileObj in d, d
   File "tahoe-client.tac", line 10, in <module>
     c = client.Client()
   File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
     self.init_storage()
   File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
     expiration_sharetypes=expiration_sharetypes)
   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
     self.add_bucket_counter()
   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
     self.bucket_counter = BucketCountingCrawler(self, statefile)
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
     ShareCrawler.__init__(self, server, statefile)
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
     self.load_state()
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
     state = pickle.load(f)
 exceptions.EOFError:
 }}}

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1280#comment:11>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

