[tahoe-lafs-trac-stream] [tahoe-lafs] #1130: Failure to achieve happiness in upload or repair

tahoe-lafs trac at tahoe-lafs.org
Thu Jun 27 17:11:31 UTC 2013


#1130: Failure to achieve happiness in upload or repair
-------------------------+-------------------------------------------------
     Reporter:  kmarkley86          |      Owner:  kevan
         Type:  defect              |     Status:  new
     Priority:  major               |  Milestone:  1.11.0
    Component:  code-peerselection  |    Version:  1.7.1
   Resolution:                      |   Keywords:  upload repair rebalancing
Launchpad Bug:                      |  availability unfinished-business
                                    |  servers-of-happiness
-------------------------+-------------------------------------------------

New description:

 Prior to Tahoe-LAFS v1.7.1, the immutable uploader would sometimes raise
 an assertion error (#1118). We fixed that problem, and we also fixed the
 problem of the uploader placing an insufficiently well-distributed set of
 shares while believing that it had achieved servers-of-happiness. But now
 the uploader gives up and uploads nothing, reporting that it could not
 achieve happiness, even though a smarter share placement would have
 achieved it. This ticket is to make the upload succeed in this case.

 Log excerpt:
 {{{
 19:12:35.519 L20 []#1337 CHKUploader starting
 19:12:35.519 L20 []#1338 starting upload of
 <allmydata.immutable.upload.EncryptAnUploadable instance at 0x20886b5a8>
 19:12:35.520 L20 []#1339 creating Encoder <Encoder for unknown storage
 index>
 19:12:35.520 L20 []#1340 file size: 106
 19:12:35.520 L10 []#1341 my encoding parameters: (2, 4, 4, 106)
 19:12:35.520 L20 []#1342 got encoding parameters: 2/4/4 106
 19:12:35.520 L20 []#1343 now setting up codec
 19:12:35.520 L20 []#1344 using storage index 5xpii
 19:12:35.520 L20 []#1345 <Tahoe2PeerSelector for upload 5xpii> starting
 19:12:35.633 L10 []#1346 response from peer 47cslusc: alreadygot=(),
 allocated=(0,)
 19:12:36.590 L10 []#1347 response from peer vjqcroal: alreadygot=(0, 3),
 allocated=(1,)
 19:12:37.119 L10 []#1348 response from peer sn4ana4b: alreadygot=(1,),
 allocated=(2,)
 19:12:37.124 L20 []#1349 storage: allocate_buckets
 5xpiivbjrybcmy4ws7xp7dxez4
 19:12:37.130 L10 []#1350 response from peer yuzbctlc: alreadygot=(2,),
 allocated=(0,)
 19:12:37.130 L25 []#1351 server selection unsuccessful for
 <Tahoe2PeerSelector for upload 5xpii>: shares could be placed on only 3
 server(s) such that any 2 of them have enough shares to recover the file,
 but we were asked to place shares on at least 4 such servers. (placed all
 4 shares, want to place shares on at least 4 servers such that any 2 of
 them have enough shares to recover the file, sent 4 queries to 4 peers, 4
 queries placed some shares, 0 placed none (of which 0 placed none due to
 the server being full and 0 placed none due to an error)), merged={0:
 set(['\xc52\x11Mb\xa1\xff\x8d\xafn\x0b#s\x17\xbe\x82\x85\x93G0']), 1:
 set(['\xaa`(\xb8\x0b\x89\x98Y\xfb\xcc2,T\xd0\xde\xf7\xca\xbfA#',
 '\x93x\x06\x83\x81\xdb\x12*\xe5\xb095T\xf0\x1e\xa5\x00V+\x0f']), 2:
 set(['\xc52\x11Mb\xa1\xff\x8d\xafn\x0b#s\x17\xbe\x82\x85\x93G0',
 '\x93x\x06\x83\x81\xdb\x12*\xe5\xb095T\xf0\x1e\xa5\x00V+\x0f']), 3:
 set(['\xaa`(\xb8\x0b\x89\x98Y\xfb\xcc2,T\xd0\xde\xf7\xca\xbfA#'])}
 19:12:37.133 L20 []#1352 web: 127.0.0.1 PUT /uri/[CENSORED].. 500 1826
 19:12:37.148 L23 []#1353 storage: aborting sharefile
 /home/tahoe/.tahoe/storage/shares/incoming/5x/5xpiivbjrybcmy4ws7xp7dxez4/0
 }}}
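
 The excerpt above can be read as a bipartite matching problem. The
 following is a minimal sketch, not Tahoe-LAFS code. It assumes that
 servers-of-happiness is measured as the size of a maximum matching between
 servers and shares, and that peer 47cslusc (which reported no existing
 shares but accepted an allocation) could hold any of the four shares. The
 function name `max_matching` and the dictionary layout are hypothetical.

```python
def max_matching(edges):
    """Return a maximum matching as {share: server}.

    edges maps each server name to the set of share numbers it holds
    or could hold. Uses the classic augmenting-path method.
    """
    match = {}

    def try_assign(server, seen):
        # Try to give `server` a share, re-homing other servers if needed.
        for share in sorted(edges[server]):
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    for server in sorted(edges):
        try_assign(server, set())
    return match


# Shares each peer reported in "alreadygot" in the log excerpt.
existing = {
    "47cslusc": set(),
    "vjqcroal": {0, 3},
    "sn4ana4b": {1},
    "yuzbctlc": {2},
}

# Assumption: 47cslusc is writable and could accept any share.
with_new_share = dict(existing, **{"47cslusc": {0, 1, 2, 3}})

print(len(max_matching(existing)))
print(len(max_matching(with_new_share)))
```

 Under these assumptions the existing shares alone give a matching of size
 3, which agrees with the "only 3 server(s)" message, while a single new
 share on 47cslusc (share 0 or share 3) raises it to the requested 4.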

--

Comment (by daira):

 Step 5 in the comment:12 algorithm isn't very specific about where the
 remaining shares are placed. I can think of two possibilities:

 a) continue the loop in step 4, i.e. place in the order of the permuted
 list with wrap-around.

 b) sort the servers by the number of shares they have at that point
 (breaking ties in some deterministic way) and place on the servers with
 fewest shares first.
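
 To make the difference between the two options concrete, here is a rough
 sketch of each. It is illustrative only and not taken from the comment:12
 algorithm: the function names, the `start` index (where the step 4 loop is
 assumed to have stopped), and the choice of permuted-list position as the
 tie-breaker in (b) are all assumptions. The sketch of (b) also re-checks
 share counts after every placement, which is one possible reading of it.

```python
import heapq


def place_wraparound(permuted, start, remaining):
    """Option (a): keep walking the permuted list, wrapping around."""
    n = len(permuted)
    return {
        share: permuted[(start + i) % n]
        for i, share in enumerate(remaining)
    }


def place_fewest_first(permuted, counts, remaining):
    """Option (b): always place on the server holding the fewest shares.

    Ties are broken by position in the permuted list (an assumption).
    """
    heap = [(counts.get(s, 0), idx, s) for idx, s in enumerate(permuted)]
    heapq.heapify(heap)
    placement = {}
    for share in remaining:
        count, idx, server = heapq.heappop(heap)
        placement[share] = server
        # The server now holds one more share; put it back in the queue.
        heapq.heappush(heap, (count + 1, idx, server))
    return placement


# Hypothetical state after step 4: three servers, three shares left over.
permuted = ["s1", "s2", "s3"]
counts = {"s1": 2, "s2": 0, "s3": 1}
remaining = [4, 5, 6]

print(place_wraparound(permuted, 1, remaining))
print(place_fewest_first(permuted, counts, remaining))
```

 In this made-up state, (a) gives one extra share to every server
 regardless of load, while (b) evens the counts out to two shares each.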

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1130#comment:18>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

