[tahoe-lafs-trac-stream] [tahoe-lafs] #1280: deal with fragile, but disposable, bucket state files

tahoe-lafs trac at tahoe-lafs.org
Tue Jul 9 14:25:47 UTC 2013


#1280: deal with fragile, but disposable, bucket state files
--------------------------------+--------------------------------
     Reporter:  francois        |      Owner:  zooko
         Type:  defect          |     Status:  reopened
     Priority:  normal          |  Milestone:  1.11.0
    Component:  code-nodeadmin  |    Version:  1.8.1
   Resolution:                  |   Keywords:  pickle reliability
Launchpad Bug:                  |
--------------------------------+--------------------------------

Old description:

> If a bucket state file can't be loaded and parsed for any reason (usually
> because it is 0-length, but any other sort of error should also be
> handled similarly), then just blow it away and start fresh.
>
> ------- original post by François below:
>
> After a hard system shutdown due to power failure, a Tahoe node might
> not be able to start again automatically because the files
> '''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
> are empty.
>
> The easy workaround is to manually delete the empty files before
> restarting nodes.
>
> {{{
> find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
> find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
> }}}
>
> Here is what a startup attempt looks like in such a case.
>
> {{{
> Traceback (most recent call last):
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
>     runApp(config)
>   File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
>     _SomeApplicationRunner(config).run()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
>     self.application = self.createOrGetApplication()
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
>     application = getApplication(self.config, passphrase)
> --- <exception caught here> ---
>   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
>     application = service.loadApplication(filename, style, passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
>     application = sob.loadValueFromFile(filename, 'application', passphrase)
>   File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
>     exec fileObj in d, d
>   File "tahoe-client.tac", line 10, in <module>
>     c = client.Client()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
>     self.init_storage()
>   File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
>     expiration_sharetypes=expiration_sharetypes)
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
>     self.add_bucket_counter()
>   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
>     self.bucket_counter = BucketCountingCrawler(self, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
>     ShareCrawler.__init__(self, server, statefile)
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
>     self.load_state()
>   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
>     state = pickle.load(f)
> exceptions.EOFError:
> }}}

New description:

 If a bucket state file can't be loaded and parsed for any reason (usually
 because it is 0-length, but any other sort of error should also be handled
 similarly), then just blow it away and start fresh.
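
 The behaviour requested above could be sketched roughly as follows. This
 is only an illustration, not the code in crawler.py: the helper name
 `load_state_or_reset` and its `default_factory` parameter are
 hypothetical, and the sketch is written for Python 3 rather than the
 Python 2.5 shown in the traceback.

```python
import os
import pickle


def load_state_or_reset(statefile, default_factory=dict):
    """Return the unpickled state, or a fresh default if the file is unusable.

    Hypothetical helper for illustration; not the actual crawler.py code.
    """
    try:
        with open(statefile, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        # No state file yet: nothing to delete, just start fresh.
        return default_factory()
    except Exception:
        # Zero-length, truncated, or otherwise unparseable. The state is
        # disposable, so remove the file and start fresh.
        try:
            os.remove(statefile)
        except OSError:
            pass  # Best effort: a later save would overwrite it anyway.
        return default_factory()
```

 Catching a broad `Exception` is deliberate here, since the description
 asks for any load or parse error to be handled the same way as the
 0-length case.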

 ------- original post by François below:

 After a hard system shutdown due to power failure, a Tahoe node might not
 be able to start again automatically because the files
 '''storage/bucket_counter.state''' or '''storage/lease_checker.state'''
 are empty.

 The easy workaround is to manually delete the empty files before
 restarting nodes.

 {{{
 find /srv/tahoe/*/storage/bucket_counter.state -size 0 -exec rm {} \;
 find /srv/tahoe/*/storage/lease_checker.state -size 0 -exec rm {} \;
 }}}
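
 For admins who would rather script the same cleanup in Python, a rough
 equivalent of the two `find` commands might look like this. The function
 name `remove_empty_state_files` is hypothetical, and the default path
 pattern simply mirrors the `/srv/tahoe/*/storage` layout used above.

```python
import glob
import os

# The two state files named in the ticket.
STATE_FILES = ("bucket_counter.state", "lease_checker.state")


def remove_empty_state_files(base_glob="/srv/tahoe/*/storage"):
    """Delete zero-length state files and return the paths that were removed."""
    removed = []
    for storage_dir in glob.glob(base_glob):
        for name in STATE_FILES:
            path = os.path.join(storage_dir, name)
            # Only touch regular files that are exactly 0 bytes long.
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                os.remove(path)
                removed.append(path)
    return removed
```

 As with the shell version, this should be run before restarting the
 nodes; non-empty state files are left alone.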

 Here is what a startup attempt looks like in such a case.

 {{{
 Traceback (most recent call last):
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
     runApp(config)
   File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
     _SomeApplicationRunner(config).run()
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
     self.application = self.createOrGetApplication()
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
     application = getApplication(self.config, passphrase)
 --- <exception caught here> ---
   File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
     application = service.loadApplication(filename, style, passphrase)
   File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
     application = sob.loadValueFromFile(filename, 'application', passphrase)
   File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
     exec fileObj in d, d
   File "tahoe-client.tac", line 10, in <module>
     c = client.Client()
   File "/opt/tahoe-lafs/src/allmydata/client.py", line 140, in __init__
     self.init_storage()
   File "/opt/tahoe-lafs/src/allmydata/client.py", line 269, in init_storage
     expiration_sharetypes=expiration_sharetypes)
   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 97, in __init__
     self.add_bucket_counter()
   File "/opt/tahoe-lafs/src/allmydata/storage/server.py", line 114, in add_bucket_counter
     self.bucket_counter = BucketCountingCrawler(self, statefile)
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 449, in __init__
     ShareCrawler.__init__(self, server, statefile)
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 86, in __init__
     self.load_state()
   File "/opt/tahoe-lafs/src/allmydata/storage/crawler.py", line 195, in load_state
     state = pickle.load(f)
 exceptions.EOFError:
 }}}
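
 The last frame of the traceback is easy to reproduce in isolation:
 unpickling an empty byte stream raises `EOFError`. The small check below
 is a sketch for Python 3; the function name `raises_eof_on_load` is
 hypothetical and only serves the demonstration.

```python
import io
import pickle


def raises_eof_on_load(data):
    """Return True if unpickling `data` fails with EOFError."""
    try:
        pickle.load(io.BytesIO(data))
    except EOFError:
        return True
    return False


# An empty stream behaves like the 0-length state files from the ticket.
empty_fails = raises_eof_on_load(b"")
# A well-formed pickle loads without error.
valid_fails = raises_eof_on_load(pickle.dumps({"cycle": 1}))
```

 Other kinds of corruption may raise different exceptions (for example
 `pickle.UnpicklingError`), which is why the ticket asks for any load
 error to be handled, not only the empty-file case.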

--

Comment (by daira):

 I suspect that changeset
 [changeset:3cb99364e6a83d0064d2838a0c470278903e19ac/trunk 3cb99364]
 effectively fixed this ticket. (It doesn't delete a corrupted state file,
 but it does use default values, and the corrupted file will be overwritten
 on the next crawler pass.) Please close this ticket as fixed in milestone
 1.9.2 if you agree.
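
 The behaviour described in this comment (fall back to defaults, leave the
 bad file alone, and let the next save overwrite it) might be sketched as
 below. This is not the code from the changeset: `DEFAULT_STATE`,
 `load_state` and `save_state` are hypothetical names, and the
 write-to-temp-then-rename step is an added assumption about how one could
 make zero-length files less likely after a power failure, not something
 the comment claims the changeset does.

```python
import os
import pickle

# Hypothetical defaults; the real crawler state has its own fields.
DEFAULT_STATE = {"cycle": 0}


def load_state(statefile):
    """Load pickled state, using defaults if the file is missing or corrupt."""
    try:
        with open(statefile, "rb") as f:
            return pickle.load(f)
    except Exception:
        # The corrupt file is left in place; save_state() replaces it later.
        return dict(DEFAULT_STATE)


def save_state(statefile, state):
    """Write state to a temporary file, then rename it over the old one."""
    tmp = statefile + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # Push the data to disk before the rename.
    os.replace(tmp, statefile)
```

 With this shape, a node that finds an empty state file starts with the
 defaults, and the first completed pass replaces the empty file with a
 valid one.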

 See also #1290.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1280#comment:12>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

