[tahoe-lafs-trac-stream] [tahoe-lafs] #1749: bug in mutable publish that could cause an IndexError when a writer is removed in Publish._connection_problem

tahoe-lafs trac at tahoe-lafs.org
Wed May 23 04:16:23 UTC 2012


#1749: bug in mutable publish that could cause an IndexError when a writer is
removed in Publish._connection_problem
------------------------------+--------------------------------------------
     Reporter:  davidsarah    |      Owner:  zooko
         Type:  defect        |     Status:  new
     Priority:  critical      |  Milestone:  1.9.2
    Component:  code-mutable  |    Version:  1.9.1
   Resolution:                |   Keywords:  publish regression test-needed
Launchpad Bug:                |
------------------------------+--------------------------------------------
Changes (by davidsarah):

 * owner:   => zooko


Old description:

> {{{
> "Traceback (most recent call last):
> Failure: allmydata.mutable.common.NotEnoughServersError: (\"Publish ran
> out of good servers, last failure was:
> [Failure instance: Traceback: <type 'exceptions.IndexError'>: list index
> out of range
> /home/davidsarah/cloud-branch/support/lib/python2.6/site-
> packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/base.py:800:runUntilCurrent
> /home/davidsarah/cloud-branch/support/lib/python2.6/site-
> packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn
> /home/davidsarah/cloud-branch/support/lib/python2.6/site-
> packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:368:callback
> /home/davidsarah/cloud-branch/support/lib/python2.6/site-
> packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:464:_startRunCallbacks
> (04:15:44) davidsarah: --- <exception caught here> ---\\n/home/davidsarah
> /cloud-branch/support/lib/python2.6/site-
> packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:551:_runCallbacks
> /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:634:_push
> /home/davidsarah/cloud-
> branch/src/allmydata/mutable/publish.py:651:push_segment
> /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:637:_push
> /home/davidsarah/cloud-
> branch/src/allmydata/mutable/publish.py:773:push_everything_else
> /home/davidsarah/cloud-
> branch/src/allmydata/mutable/publish.py:878:finish_publishing
> /home/davidsarah/cloud-
> branch/src/allmydata/mutable/publish.py:886:_record_verinfo
> }}}
>
> I can reproduce this, at least on the cloud-branch, when I do a {{{tahoe
> put --mutable}}} shortly after the gateway has started.

New description:

 {{{
 "Traceback (most recent call last):
 Failure: allmydata.mutable.common.NotEnoughServersError: (\"Publish ran out of good servers, last failure was:
 [Failure instance: Traceback: <type 'exceptions.IndexError'>: list index out of range
 /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/base.py:800:runUntilCurrent
 /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn
 /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:368:callback
 /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:464:_startRunCallbacks
 --- <exception caught here> ---
 /home/davidsarah/cloud-branch/support/lib/python2.6/site-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py:551:_runCallbacks
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:634:_push
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:651:push_segment
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:637:_push
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:773:push_everything_else
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:878:finish_publishing
 /home/davidsarah/cloud-branch/src/allmydata/mutable/publish.py:886:_record_verinfo
 }}}

 I can reproduce this, at least on the cloud-branch, when I do a
 {{{tahoe put --mutable}}} shortly after the gateway has started.

--

Comment:

 zooko suggests this fix, which I reviewed and approved of:
 {{{
 --- old-dw/src/allmydata/mutable/publish.py     2012-05-22 21:48:48.764939441 -0600
 +++ new-dw/src/allmydata/mutable/publish.py     2012-05-22 21:48:50.641598788 -0600
 @@ -253,6 +253,10 @@
          # updating, we ignore damaged and missing shares -- callers must
          # do a repair to repair and recreate these.
          self.goal = set(self._servermap.get_known_shares())
 +
 +        # k: shnum, v: [ instance of IMutableSlotWriter ]
 +        # The value is required to always be non-empty if the item is present
 +        # in the dict at all.
          self.writers = {}

          # SDMF files are updated differently.
 @@ -891,8 +895,10 @@
          """
          self.log("found problem: %s" % str(f))
          self._last_failure = f
 -        self.writers[writer.shnum].remove(writer)
 -
 +        writers = self.writers[writer.shnum]
 +        writers.remove(writer)
 +        if len(writers) == 0:
 +            del self.writers[writer.shnum]

      def log_goal(self, goal, message=""):
          logmsg = [message]
 }}}
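
 The invariant the patch documents (a shnum key is present in
 self.writers only while its list is non-empty) can be illustrated
 outside of Tahoe-LAFS. The following is a minimal sketch, not the
 actual publish.py code: the Writer class and the three helper
 functions are hypothetical names, and first_writer() only assumes that
 some later step picks the first writer of the first entry, which the
 ticket itself does not confirm.

```python
class Writer:
    """Hypothetical stand-in for an IMutableSlotWriter; only shnum matters."""

    def __init__(self, shnum):
        self.shnum = shnum


def remove_writer_old(writers, writer):
    # Pre-patch behaviour: the list for this shnum may be left empty
    # while its key stays in the dict.
    writers[writer.shnum].remove(writer)


def remove_writer_fixed(writers, writer):
    # Patched behaviour: drop the key as soon as its list is empty, so
    # every value that remains in the dict is non-empty.
    remaining = writers[writer.shnum]
    remaining.remove(writer)
    if len(remaining) == 0:
        del writers[writer.shnum]


def first_writer(writers):
    # Assumed access pattern (not taken from the ticket): use the first
    # writer of the first entry. An empty list raises IndexError here.
    return list(writers.values())[0][0]


if __name__ == "__main__":
    w0, w1 = Writer(0), Writer(1)

    old = {0: [w0], 1: [w1]}
    remove_writer_old(old, w0)
    try:
        first_writer(old)
    except IndexError as e:
        print("old removal:", e)

    fixed = {0: [w0], 1: [w1]}
    remove_writer_fixed(fixed, w0)
    print("fixed removal: shnum", first_writer(fixed).shnum)
```

 In this sketch, first_writer() still fails if every writer has been
 removed and the dict is empty, so a caller would need to handle that
 case separately.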

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1749#comment:1>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

