[tahoe-lafs-trac-stream] [tahoe-lafs] #1670: KeyError in mutable retrieve

tahoe-lafs trac at tahoe-lafs.org
Thu Jun 21 19:30:43 UTC 2012


#1670: KeyError in mutable retrieve
-------------------------+-------------------------------------------------
     Reporter:  vikarti  |      Owner:  zooko
         Type:  defect   |     Status:  new
     Priority:           |  Milestone:  1.9.2
  critical               |    Version:  1.9.1
    Component:  code-    |   Keywords:  mutable retrieve error tahoe-backup
  mutable                |  regression
   Resolution:           |
Launchpad Bug:           |
-------------------------+-------------------------------------------------
Changes (by zooko):

 * status:  assigned => new


Comment:

 Fixed #1669 and it is different from this -- this isn't a duplicate.

 By inspecting the bug reports and incident files I've confirmed what Kevan
 said, that the error happens during retry after a UCWE. One of the
 incident files shows the following:

 {{{
 02:26:28.641 [1920941]: current goal: before update: , sh0 to [fbil5app],
 sh1 to [d4f7ssed], sh2 to [lxur2oq3], sh3 to [rczho2ut], sh4 to
 [xi6zcwi2], sh5 to [zpm5ziqh], sh6 to [e5l3augd], sh7 to [yf6ahya4], sh8
 to [7xdwosoe], sh9 to [t3zmoeba], sh10 to [lfc4wmmk], sh11 to [wasuxoux],
 sh12 to [iz3pbtdv]
     02:26:28.641 [1920942]: we are planning to push new seqnum=#2
     02:26:28.642 [1920943]: Starting push
     02:26:28.642 [1920944]: Pushing segment 1 of 1
     02:26:28.950 [1920945]: _got_write_answer from d4f7ssed, share 1
         02:26:28.950 [1920947]: WEIRD our testv failed, so the write did
 not happen [INCIDENT-TRIGGER]
         02:26:33.803 [1920948]: somebody modified the share on us:
 shnum=1: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:28.950 [1920946]: found the following surprise shares: set([])
     02:26:33.805 [1920949]: _got_write_answer from rczho2ut, share 3
         02:26:33.805 [1920951]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.805 [1920952]: somebody modified the share on us:
 shnum=3: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.805 [1920950]: found the following surprise shares: set([])
     02:26:33.806 [1920953]: _got_write_answer from t3zmoeba, share 9
         02:26:33.806 [1920955]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.806 [1920956]: somebody modified the share on us:
 shnum=9: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.806 [1920954]: found the following surprise shares: set([])
     02:26:33.812 [1920957]: _got_write_answer from fbil5app, share 0
         02:26:33.812 [1920959]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.812 [1920960]: somebody modified the share on us:
 shnum=0: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.812 [1920958]: found the following surprise shares: set([])
     02:26:33.813 [1920961]: _got_write_answer from lfc4wmmk, share 10
         02:26:33.813 [1920963]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.813 [1920964]: somebody modified the share on us:
 shnum=10: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.813 [1920962]: found the following surprise shares: set([])
     02:26:33.815 [1920965]: _got_write_answer from wasuxoux, share 11
         02:26:33.815 [1920967]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.815 [1920968]: somebody modified the share on us:
 shnum=11: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.815 [1920966]: found the following surprise shares: set([])
     02:26:33.816 [1920969]: _got_write_answer from lxur2oq3, share 2
         02:26:33.816 [1920971]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.818 [1920972]: somebody modified the share on us:
 shnum=2: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.816 [1920970]: found the following surprise shares: set([])
     02:26:33.819 [1920973]: _got_write_answer from 7xdwosoe, share 8
         02:26:33.819 [1920975]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.819 [1920976]: somebody modified the share on us:
 shnum=8: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.819 [1920974]: found the following surprise shares: set([])
     02:26:33.819 [1920977]: _got_write_answer from iz3pbtdv, share 12
         02:26:33.819 [1920979]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.821 [1920980]: somebody modified the share on us:
 shnum=12: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.819 [1920978]: found the following surprise shares: set([])
     02:26:33.822 [1920981]: _got_write_answer from e5l3augd, share 6
         02:26:33.822 [1920983]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.822 [1920984]: somebody modified the share on us:
 shnum=6: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.822 [1920982]: found the following surprise shares: set([])
     02:26:33.882 [1920985]: _got_write_answer from zpm5ziqh, share 5
         02:26:33.882 [1920987]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.882 [1920988]: somebody modified the share on us:
 shnum=5: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.882 [1920986]: found the following surprise shares: set([])
     02:26:33.884 [1920989]: _got_write_answer from xi6zcwi2, share 4
         02:26:33.884 [1920991]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.884 [1920992]: somebody modified the share on us:
 shnum=4: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.884 [1920990]: found the following surprise shares: set([])
     02:26:33.885 [1920993]: _got_write_answer from yf6ahya4, share 7
         02:26:33.885 [1920995]: WEIRD our testv failed, so the write did
 not happen
         02:26:33.885 [1920996]: somebody modified the share on us:
 shnum=7: I thought they had #1:R=ibc6, but testv reported
 #2:R=�2e!J&t����
ܘvҐ:%��l�i�tA2
     02:26:33.885 [1920994]: found the following surprise shares: set([])
     02:26:33.885 [1920997]: Publish failed with UncoordinatedWriteError

 02:26:37.075 [1920998]: doing modify
 02:26:37.075 [1920999]: SharemapUpdater(wmnt2): starting (MODE_CHECK)

     02:26:37.076 [1921000]: sending query to [iz3pbtdv], readsize=1000
     02:26:37.078 [1921001]: sending query to [rczho2ut], readsize=1000
     02:26:37.079 [1921002]: sending query to [lfc4wmmk], readsize=1000
     02:26:37.081 [1921003]: sending query to [yf6ahya4], readsize=1000
     02:26:37.082 [1921004]: sending query to [d4f7ssed], readsize=1000
     02:26:37.084 [1921005]: sending query to [t3zmoeba], readsize=1000
     02:26:37.085 [1921006]: sending query to [fbil5app], readsize=1000
     02:26:37.085 [1921007]: sending query to [xi6zcwi2], readsize=1000
     02:26:37.086 [1921008]: sending query to [wasuxoux], readsize=1000
     02:26:37.088 [1921009]: sending query to [7xdwosoe], readsize=1000
     02:26:37.088 [1921010]: sending query to [zpm5ziqh], readsize=1000
     02:26:37.089 [1921011]: sending query to [lxur2oq3], readsize=1000
     02:26:37.098 [1921012]: sending query to [e5l3augd], readsize=1000
     02:26:37.131 [1921013]: got result from [d4f7ssed], 1 shares
         02:26:37.131 [1921014]: _got_results done
         02:26:37.134 [1921019]: _got_results: got shnum #1 from peerid
 d4f7ssed
         02:26:37.134 [1921020]: found valid version 2-tmzg from d4f7ssed-
 sh1: 4-13/352/349
     02:26:37.131 [1921015]: got result from [rczho2ut], 1 shares
         02:26:37.132 [1921016]: _got_results done
         02:26:37.137 [1921025]: _got_results: got shnum #3 from peerid
 rczho2ut
         02:26:37.137 [1921026]: found valid version 2-tmzg from rczho2ut-
 sh3: 4-13/352/349
     02:26:37.134 [1921017]: got result from [t3zmoeba], 1 shares
         02:26:37.134 [1921018]: _got_results done
         02:26:37.138 [1921031]: _got_results: got shnum #9 from peerid
 t3zmoeba
         02:26:37.138 [1921032]: found valid version 2-tmzg from t3zmoeba-
 sh9: 4-13/352/349
     02:26:37.135 [1921021]: _check_for_done, mode is 'MODE_CHECK', 12
 queries outstanding, 0 extra peers available, 12 'must query' peers left,
 need_privkey=False
         02:26:37.135 [1921022]: 12 'must query' peers left
     02:26:37.135 [1921023]: _check_for_done, mode is 'MODE_CHECK', 12
 queries outstanding, 0 extra peers available, 12 'must query' peers left,
 need_privkey=False
         02:26:37.135 [1921024]: 12 'must query' peers left
     02:26:37.137 [1921027]: _check_for_done, mode is 'MODE_CHECK', 11
 queries outstanding, 0 extra peers available, 11 'must query' peers left,
 need_privkey=False
         02:26:37.137 [1921028]: 11 'must query' peers left
     02:26:37.137 [1921029]: _check_for_done, mode is 'MODE_CHECK', 11
 queries outstanding, 0 extra peers available, 11 'must query' peers left,
 need_privkey=False
         02:26:37.137 [1921030]: 11 'must query' peers left
     02:26:37.138 [1921033]: _check_for_done, mode is 'MODE_CHECK', 10
 queries outstanding, 0 extra peers available, 10 'must query' peers left,
 need_privkey=False
         02:26:37.138 [1921034]: 10 'must query' peers left
     02:26:37.138 [1921035]: _check_for_done, mode is 'MODE_CHECK', 10
 queries outstanding, 0 extra peers available, 10 'must query' peers left,
 need_privkey=False
         02:26:37.138 [1921036]: 10 'must query' peers left
     02:26:37.269 [1921037]: got result from [7xdwosoe], 1 shares
         02:26:37.270 [1921038]: _got_results done
         02:26:37.272 [1921039]: _got_results: got shnum #8 from peerid
 7xdwosoe
         02:26:37.273 [1921040]: found valid version 2-tmzg from 7xdwosoe-
 sh8: 4-13/352/349
     02:26:37.273 [1921041]: _check_for_done, mode is 'MODE_CHECK', 9
 queries outstanding, 0 extra peers available, 9 'must query' peers left,
 need_privkey=False
         02:26:37.273 [1921042]: 9 'must query' peers left
     02:26:37.273 [1921043]: _check_for_done, mode is 'MODE_CHECK', 9
 queries outstanding, 0 extra peers available, 9 'must query' peers left,
 need_privkey=False
         02:26:37.273 [1921044]: 9 'must query' peers left
     02:26:37.319 [1921045]: got result from [lfc4wmmk], 1 shares
         02:26:37.319 [1921046]: _got_results done
         02:26:37.321 [1921047]: _got_results: got shnum #10 from peerid
 lfc4wmmk
 }}}

 And the user then reported the following:

 {{{
   File "c:\tahoe\allmydata-tahoe-1.9.1\src\allmydata\mutable\filenode.py",
 line 855, in <lambda>
     self._modify_once(modifier, first_time))
   File "c:\tahoe\allmydata-tahoe-1.9.1\src\allmydata\mutable\filenode.py",
 line 881, in _modify_once
     d = self._try_to_download_data()
   File "c:\tahoe\allmydata-tahoe-1.9.1\srcllmydata\mutable\filenode.py",
 line 959, in _try_to_download_data
     d = self._read(c, fetch_privkey=True)
   File "c:\tahoe\allmydata-tahoe-1.9.1\src\allmydata\mutable\filenode.py",
 line 980, in _read
     d = r.download(consumer, offset, size)
   File "c:\tahoe\allmydata-tahoe-1.9.1\src\allmydata\mutable\retrieve.py",
 line 237, in download
     self._setup_download()
   File "c:\tahoe\allmydata-tahoe-1.9.1\src\allmydata\mutable\retrieve.py",
 line 277, in _setup_download
     shares = versionmap[self.verinfo]
 exceptions.KeyError: (1,
 '@E\xe0H\x8a\xbc\xa6rQA\xfa\xab\xfd&B\xf6td"jv\xae\x96\xccel\xcf\xc2\x9f\x8aG)',
 '\xde\xdeyKb\xcc\x8f-\xed\xea\xa7\xcc\xe1a\xfe\x10', 0, 0, 4, 13,
 '\x00\x00\x00\x00\x00\x00\x00\x01 at E\xe0H\x8a\xbc\xa6rQA\xfa\xab\xfd&B\xf6td"jv\xae\x96\xccel\xcf\xc2\x9f\x8aG)\xde\xdeyKb\xcc\x8f-\xed\xea\xa7\xcc\xe1a\xfe\x10\x04\r\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
 (('enc_privkey', 823), ('EOF', 2038), ('share_data', 823), ('signature',
 399), ('block_hash_tree', 791), ('share_hash_chain', 655)))
 }}}

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1670#comment:19>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage


More information about the tahoe-lafs-trac-stream mailing list