#1118 closed defect (fixed)

broken assert in upload.py

Reported by: kevan Owned by: kevan
Priority: critical Milestone: 1.7.1
Component: code-peerselection Version: 1.7.0
Keywords: regression upload availability reliability Cc:
Launchpad Bug:

Description

from http://tahoe-lafs.org/pipermail/tahoe-dev/2010-July/004656.html:

allmydata.scripts.common_http.HTTPError: Error during file PUT: 500
Internal Server Error
Traceback (most recent call last):
  File
"/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py",
line 325, in unpause
    self._runCallbacks()
  File
"/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py",
line 371, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File
"/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py",
line 330, in _continue
    self.unpause()
  File
"/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py",
line 325, in unpause
    self._runCallbacks()
--- <exception caught here> ---
  File
"/usr/local/lib/python2.6/site-packages/Twisted-10.0.0-py2.6-openbsd-4.6-amd64.egg/twisted/internet/defer.py",
line 371, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File
"/usr/local/lib/python2.6/site-packages/allmydata_tahoe-1.7.0-py2.6.egg/allmydata/immutable/upload.py",
line 896, in set_shareholders
    assert len(buckets) == sum([len(peer.buckets) for peer in used_peers])
exceptions.AssertionError: 

Attachments (8)

stuff.flog.bz2 (8.8 KB) - added by kmarkley86 at 2010-07-15T04:55:40Z.
flogtool log
debugging-1118-2.dpatch.txt (18.8 KB) - added by zooko at 2010-07-18T22:19:01Z.
test-1118.dpatch.txt (12.6 KB) - added by zooko at 2010-07-18T22:37:17Z.
diff.txt (967 bytes) - added by kevan at 2010-07-19T03:50:12Z.
fix-1118.dpatch (12.6 KB) - added by davidsarah at 2010-07-19T04:26:55Z.
* upload.py: fix #1118 by aborting newly-homeless buckets when reassignment runs. This makes a previously failing assert correct.
happy_share_distribution.dpatch.txt (37.4 KB) - added by zooko at 2010-07-19T05:07:38Z.
fix-1118-v2.dpatch (12.9 KB) - added by davidsarah at 2010-07-19T05:51:41Z.
upload.py: abort homeless buckets when reassignment runs. This makes a previously failing assert correct. This version refactors 'abort' into two methods, rather than using a default argument. fixes #1118
improved-logging.dpatch.txt (51.3 KB) - added by zooko at 2010-07-19T07:11:24Z.


Change History (33)

comment:1 Changed at 2010-07-14T06:52:01Z by zooko

Kevan: could this be a regression from v1.6.0? Looking at the annotation of that file, the assertion and the calculation of buckets date from way back (patch 17299fc96e6fb855!), but of course the server-selection was changed in v1.7.0 thanks to your work, so maybe it is a regression.

comment:2 Changed at 2010-07-14T08:19:07Z by kevan

(caveat: I haven't thoroughly investigated this issue yet, so what I say below might not be the case)

I think that the old peer selection technique would never allocate a remote bucket for a particular share more than once. The new peer selection technique can, if during redistribution it allocates a share to a different peer than it did in the initial pass. The assert, as written, works fine for the former case, because the peers can never hold more buckets than are in the dictionary. In the servers-of-happiness case there can be more. Wait, I guess there shouldn't be -- there is no reason for used_peers to contain duplicate buckets, since we should be aborting and removing the ones that we reassign. So in this case the assert is right, and the code is wrong. I think. I'll look at this tomorrow when I'm less sleepy.

In summary, I think this is a regression from 1.6.0.
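
For illustration only (the data here is made up, not taken from the failing run), this is the shape of the invariant the assert checks and how a duplicate allocation would break it:

# buckets maps shnum -> bucket writer, aggregated across servers;
# each entry in used_peers carries its own .buckets dict.
buckets = {0: "writer-A", 1: "writer-B"}

class FakePeer(object):
    def __init__(self, buckets):
        self.buckets = buckets

# Normal case: each shnum allocated on exactly one peer.
used_peers = [FakePeer({0: "writer-A"}), FakePeer({1: "writer-B"})]
assert len(buckets) == sum([len(peer.buckets) for peer in used_peers])   # holds

# If redistribution allocates shnum 0 on a second peer without aborting the
# first allocation, the sum becomes 3 while the dict still has 2 keys, and
# the assert in set_shareholders() fires.
used_peers.append(FakePeer({0: "writer-C"}))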

comment:3 Changed at 2010-07-14T14:11:15Z by zooko

  • Component changed from code to code-peerselection
  • Keywords regression upload added
  • Priority changed from major to critical

Changed at 2010-07-15T04:55:40Z by kmarkley86

flogtool log

comment:4 Changed at 2010-07-15T05:02:45Z by kmarkley86

I've just attached stuff.flog.bz2 containing one instance of each problem I reported in the original e-mail to tahoe-dev. This log is from the node performing the backup. (I couldn't get flogtool running on Windows, and the three other nodes in the grid are all different flavors of Windows.)

First is the assert error:

assert len(buckets) == sum([len(peer.buckets) for peer in used_peers])

Second, if I immediately retry the backup operation, I immediately hit the second problem: shares cannot be placed even though no storage nodes are anywhere near full:

allmydata.interfaces.UploadUnhappinessError: shares could be placed on only 3 server(s) such that any 2 of them have enough shares to recover the file, but we were asked to place shares on at least 4 such servers. (placed all 4 shares, want to place shares on at least 4 servers such that any 2 of them have enough shares to recover the file, sent 4 queries to 4 peers, 3 queries placed some shares, 1 placed none (of which 1 placed none due to the server being full and 0 placed none due to an error))

I think it's important to note that as I was using my grid I originally saw the UploadUnhappinessError a couple times, nonreproducibly. Only after encountering the (reproducible) assert error do I now have an easily reproducible UploadUnhappinessError.

comment:5 Changed at 2010-07-18T03:05:05Z by warner

The persistent UploadUnhappinessError is probably explained by the failure to abort the allocated-but-not-written shares, covered in #1117. If the number of available servers in the grid is not much bigger than N, it wouldn't take more than one or two of these assert failures to get your client into a state where it would think that the servers were all full. Restarting the client (specifically dropping and reestablishing the TCP connections to the servers) should cause those not-yet-written shares to be abandoned, clearing the problem.

comment:6 Changed at 2010-07-18T16:30:35Z by zooko

I tried to write a unit test for this case, but my server 0 is constructed by the test code as having all shares. I want server 0 to have no shares. Can someone else tell me how to write a unit test that creates a server with no shares? Here's the relevant part of the test.log:

2010-07-18 10:14:44-0600 [-] just created a server, number: 0 => <StorageServer xgru5adv>
2010-07-18 10:14:44-0600 [-] just created a server, number: 1 => <StorageServer ysbz4st7>
2010-07-18 10:14:44-0600 [-] just created a server, number: 2 => <StorageServer rvsry4kn>
2010-07-18 10:14:44-0600 [-] just created a server, number: 3 => <StorageServer b3llgpww>
2010-07-18 10:14:44-0600 [-] response to allocate_buckets() from peer xgru5adv: alreadygot=(0, 1, 2, 3), allocated=()
2010-07-18 10:14:44-0600 [-] response to allocate_buckets() from peer ysbz4st7: alreadygot=(0, 3), allocated=()
2010-07-18 10:14:44-0600 [-] response to allocate_buckets() from peer b3llgpww: alreadygot=(1,), allocated=()
2010-07-18 10:14:44-0600 [-] response to allocate_buckets() from peer rvsry4kn: alreadygot=(2,), allocated=()
2010-07-18 10:14:44-0600 [-] peer selection successful for <Tahoe2PeerSelector for upload t2uvf>: placed all 4 shares, want to place shares on at least 4 servers such that any 2 of them have enough shares to recover the file, sent 4 queries to 4 peers, 4 queries placed some shares, 0 placed none (of which 0 placed none due to the server being full and 0 placed none due to an error): sh0: ysbz4st7, sh1: b3llgpww, sh2: rvsry4kn, sh3: ysbz4st7+xgru5adv

I will attach the test code as a darcs patch.

comment:7 Changed at 2010-07-18T19:19:37Z by kevan

When you do

d = self._setup_and_upload(k=2, n=4)

you cause server 0 to be created, and to receive all of the initial shares of the upload. The server 0 that you create later shares a storage directory with the first one, so from the perspective of the second upload, it does have all the shares. You can fix this by emptying the storedir of server 0 after you've finished copying shares, using something like this:

storedir = self.get_serverdir(0)
# remove the storedir, wiping out any existing shares
shutil.rmtree(storedir)
# create an empty storedir to replace the one we just removed
os.mkdir(storedir)

placed after the self._copy_share_to_server call. On my machine, this modification raises the AssertionError.

comment:8 Changed at 2010-07-18T19:52:11Z by zooko

Brian suggested on IRC last night that maybe the assertion should just be removed.

comment:9 Changed at 2010-07-18T22:12:39Z by zooko

Here's a better patch for adding debug diagnostics. I don't intend to commit this to trunk before 1.7.1, but it might be useful for people who are debugging.

Changed at 2010-07-18T22:19:01Z by zooko

comment:10 Changed at 2010-07-18T22:36:38Z by zooko

Here is a unit test which uses Kevan's suggestion to reproduce the assertion failure. I'll attach it as a darcs patch.

Changed at 2010-07-18T22:37:17Z by zooko

comment:11 Changed at 2010-07-19T00:32:42Z by zooko

Okay, I've figured this out.

CHKUploader decides what to upload, including potentially deciding to upload one share to multiple servers. But encoder.set_shareholders() takes only a dict mapping from shnum to IStorageBucketWriter, so encoder will not upload a shnum to more than one server.

The assertion is showing this mismatch.

Now what to do about it? We could simply comment-out the assertion, but then for each shnum that CHKUploader wants to upload to multiple servers, it will get uploaded to only one (random) server. Interesting fact: in this case, as reported by Kyle Markley and implemented in attachment:test-1118.dpatch.txt, it wouldn't matter! CHKUploader is trying to upload sh0 to two different servers, but as far as I can tell if it uploads to either one of the servers it will still achieve happiness level 2.

(Question: why then is CHKUploader overdoing it? Kevan: do you know?)

Next step is to write a unit test for a situation in which you need to actually successfully upload a share to more than one server in order to achieve the desired happiness level. The current upload code would get this assertion failure under that test. Even if we commented out that assertion, then the code would think that it had successfully uploaded but would in fact leave an insufficiently happy share distribution.

I'm not sure if we should hold up v1.7.1 for this issue.

I think the solution is actually fairly straightforward: extend the encoder to accept a dict mapping each shnum to a set of IStorageBucketWriters, instead of to a single IStorageBucketWriter. We should of course write the aforementioned unit test first.
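
For concreteness, a rough sketch of the shape of that change; the class and method bodies below are illustrative stand-ins, not the real allmydata.immutable.encode API, and the real bucket writers return Deferreds rather than writing synchronously:

class SketchEncoder(object):
    def set_shareholders(self, landlords, servermap):
        # landlords: dict mapping shnum -> set of bucket-writer-like objects
        # (previously shnum -> a single IStorageBucketWriter)
        self.landlords = dict((shnum, set(ws)) for (shnum, ws) in landlords.items())
        self.servermap = servermap

    def send_block(self, shnum, segnum, block):
        # push the same encoded block to every writer that holds this share
        for writer in self.landlords.get(shnum, set()):
            writer.put_block(segnum, block)

class FakeWriter(object):
    def __init__(self):
        self.blocks = {}
    def put_block(self, segnum, data):
        self.blocks[segnum] = data

# usage: share 0 placed on two servers, share 1 on one
enc = SketchEncoder()
w_a, w_b, w_c = FakeWriter(), FakeWriter(), FakeWriter()
enc.set_shareholders({0: set([w_a, w_b]), 1: set([w_c])}, servermap={})
enc.send_block(0, 0, "block-data")
assert w_a.blocks == w_b.blocks == {0: "block-data"}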

comment:12 Changed at 2010-07-19T00:35:39Z by davidsarah

  • Keywords availability reliability added

comment:13 follow-up: Changed at 2010-07-19T01:08:15Z by kevan

I'm thinking about it, but I'm having trouble coming up with a case where uploading one share to multiple servers would increase the size of a maximum matching in the graph from which happiness is computed. I think that the fact that CHKUploader does this is a bug. The only time it should open multiple bucketwriters for one share is when it redistributes a share to another server to try to increase happiness. When it does this, it should look in its collection of remote proxies and check whether it already has one for the share being redistributed -- if it does, that one should be abort()ed and removed. Extending the encoder to accept a dict of shnum => set(IStorageBucketWriter) would also solve the issue, but, if I'm right that multiply-uploaded shares don't increase happiness, at an efficiency cost: an upload would need to send 1/k more data without any increase in happiness. I favor the first approach, and favor leaving the assert in place for that reason.
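
A minimal sketch of that first approach, assuming a peer-tracker object with a .buckets dict and an abort method that takes share numbers (the actual fix is in the attached dpatches):

def abort_stale_allocation(shnum, peer_trackers):
    # Before reassigning shnum to a different server, abort any existing
    # remote bucket for it so that no share ends up with two live writers.
    for tracker in peer_trackers:
        if shnum in tracker.buckets:
            tracker.abort([shnum])      # a remote call in the real code
            del tracker.buckets[shnum]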

comment:14 Changed at 2010-07-19T01:10:28Z by zooko

David-Sarah asked on IRC: doesn't the uploader check whether it has actually successfully uploaded to happiness level H? I thought so, but I looked at the source code to remind myself. Bad news: Encoder._remove_shareholder() checks by using its .servermap attribute, which was initialized with CHKUploader's intended upload plan. That plan can include uploading the same shnum to multiple servers, so Encoder might falsely believe the plan is still capable of achieving happiness when it is not.

We want to have some unit tests which check whether the actual share distribution that ends up on the servers satisfies the happiness requirement, not just whether the CHKUploader and Encoder were satisfied with their own progress. Let's see...

Hm, no: most of the tests in src/allmydata/test/test_upload.py are just relying on the uploader's self-evaluation. I see there is one test that looks at result.pushed_shares. That could be a useful thing to check too. But in general we should probably extend some or all of the tests that perform an upload so that afterward we inspect the share distribution on the servers and fail the test if that distribution isn't actually happy.

Now another tweak that we could do right away is to change the calculation of servermap in CHKUploader so that it doesn't include a plan to upload the same shnum to multiple servers. That would be in CHKUploader.set_shareholders() and is an easy change to make.
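
A sketch of what such a check could look like; the import path, the servers_of_happiness signature, and the way shares are enumerated on disk are assumptions for illustration:

from allmydata.util.happinessutil import servers_of_happiness  # assumed import path

def effective_happiness(shares_on_disk):
    # shares_on_disk: iterable of (server_id, shnum) pairs found by walking
    # the storage directories of the test grid
    sharemap = {}
    for (server_id, shnum) in shares_on_disk:
        sharemap.setdefault(shnum, set()).add(server_id)
    return servers_of_happiness(sharemap)   # size of a maximum matching

# in a test, roughly (find_all_shares is a hypothetical helper):
#   self.failUnless(effective_happiness(self.find_all_shares()) >= happy)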

comment:15 in reply to: ↑ 13 Changed at 2010-07-19T01:13:36Z by zooko

Replying to kevan:

I think that the only time that it would open multiple bucketwriters for one share is the case where it redistributes a share to another server to try to increase happiness. When it does this, it should look in its collection of remote proxies and check to see if it has one for the share that it redistributes -- if it does, that should be abort()ed and removed.

That does sound right! So, we should still extend the test_upload.py unit tests to inspect the actual share distribution, and we should leave the assertion in place, and we should figure out ...

Oh wait, see the example in attachment:test-1118.dpatch.txt. It is choosing to upload sh0 to two different servers, neither of which is the server that already had sh0! So I agree that this is, in this particular instance at least, a bug in CHKUploader, but it isn't the bug you thought of just now.

comment:16 Changed at 2010-07-19T01:36:24Z by zooko

I'm working on extending the tests in src/allmydata/test/test_upload.py to inspect the actual share distribution's happiness level.

Changed at 2010-07-19T03:50:12Z by kevan

comment:17 Changed at 2010-07-19T03:51:47Z by kevan

This diff changes the abort function on the PeerTracker class to take a list of share numbers that should be aborted, rather than simply aborting all of them. It also aborts all of the homeless shares after the reassignment algorithm runs. AFAICT, it makes the test in attachment:test-1118.dpatch.txt pass.
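
Roughly the shape of that change (illustrative, not the literal diff; fix-1118-v2 later refactors this into two separate methods instead of a default argument):

class PeerTrackerSketch(object):
    def __init__(self):
        self.buckets = {}   # shnum -> remote bucket writer

    def abort(self, sharenums=None):
        # old behaviour: abort every allocated bucket;
        # new behaviour: abort only the share numbers passed in
        if sharenums is None:
            sharenums = list(self.buckets.keys())
        for shnum in sharenums:
            writer = self.buckets.pop(shnum, None)
            if writer is not None:
                writer.abort()   # a remote (foolscap) call in the real code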

Changed at 2010-07-19T04:26:55Z by davidsarah

  • upload.py: fix #1118 by aborting newly-homeless buckets when reassignment runs. This makes a previously failing assert correct.

comment:18 Changed at 2010-07-19T04:27:37Z by davidsarah

  • Keywords review-needed added

comment:19 Changed at 2010-07-19T05:06:57Z by zooko

Please review this patch which extends the unit tests to make sure that the shares that land on the storage servers really do form a happy distribution. It would be okay to leave this patch out of 1.7.1 since there is no known bug that these tests would have caught, but it might be good to apply it too.

comment:20 follow-up: Changed at 2010-07-19T05:18:40Z by zooko

  • Keywords review-needed removed

It is a bit confusing to attempt aborting on all of homeless_shares, since that gets populated with all shares at the beginning and then gradually whittled down and sometimes repopulated. It would make more sense to me to just attempt aborting on the shares that are getting aborted right now -- the share variable inside the while loop. (Also of course this would waste a tad fewer CPU cycles.)

comment:21 in reply to: ↑ 20 Changed at 2010-07-19T05:31:40Z by kevan

Replying to zooko:

It is a bit confusing to attempt aborting on all of homeless_shares, since that gets populated with all shares at the beginning and then gradually whittled down and sometimes repopulated. It would make more sense to me to just attempt aborting on the shares that are getting aborted right now -- the share variable inside the while loop. (Also of course this would waste a tad fewer CPU cycles.)

The reason I didn't do this was that I couldn't think of an obvious way to know which of the PeerTrackers, if any, was the appropriate one for a particular share. So, to abort within the loop, we'd have to loop through the list of peer trackers once for each share that we wanted to abort. By aborting all of the homeless shares at once, we only have to loop through that list once. To avoid wasting CPU cycles on homeless shares that aren't homeless as a result of redistribution, we could store those in a separate list, and then just abort that list at the end, if you think that addresses your comment (or I could be missing something really obvious which may also address your comment).
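
Something like the following would capture that suggestion; the names are illustrative only:

class TrackerSketch(object):
    def __init__(self, buckets):
        self.buckets = dict(buckets)
        self.aborted = []
    def abort(self, sharenums):
        for shnum in sharenums:
            self.buckets.pop(shnum, None)
        self.aborted.extend(sharenums)

peer_trackers = [TrackerSketch({0: "w0", 3: "w3"}), TrackerSketch({1: "w1"})]
displaced_shares = [0]   # only the shares made homeless by the reassignment step

# single pass over the trackers, aborting only the displaced shares
for tracker in peer_trackers:
    stale = [s for s in displaced_shares if s in tracker.buckets]
    if stale:
        tracker.abort(stale)

assert peer_trackers[0].aborted == [0]
assert 0 not in peer_trackers[0].buckets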

comment:22 Changed at 2010-07-19T05:40:22Z by zooko

  • Keywords reviewed added

reviewed! +1

Changed at 2010-07-19T05:51:41Z by davidsarah

upload.py: abort homeless buckets when reassignment runs. This makes a previously failing assert correct. This version refactors 'abort' into two methods, rather than using a default argument. fixes #1118

Changed at 2010-07-19T07:11:24Z by zooko

comment:23 Changed at 2010-07-19T08:41:06Z by zooko

  • Keywords reviewed removed
  • Resolution set to fixed
  • Status changed from new to closed

The problem with hitting an assertion failure is fixed by 461f9464579f2442. The other problem that it would sometimes upload an unhappy share distribution while thinking it was happy is fixed by 13b5e44fbc2effd0 -- see also tests in 2e83f243c2c2492b and 1dbfcf753de46e06. The remaining problem, that it gives up with unhappiness when it could more cleverly find a happy way to upload in this situation, is moved to ticket #1128.

Thanks Kyle, Kevan, David-Sarah, and Brian!

comment:24 Changed at 2010-07-19T08:49:22Z by david-sarah@…

In 461f9464579f2442:

(The changeset message doesn't reference this ticket)

comment:25 Changed at 2010-07-20T03:49:00Z by davidsarah

Fixing this problem uncovered another one, due to a bug in merge_peers in happinessutil.py, which sometimes resulted in an upload succeeding even though the share distribution should not have been considered happy.

merge_peers was only shallow-copying its servermap argument, rather than deep-copying it. This resulted in the sets of servers being aliased between the merged result, and the self.preexisting_shares attribute of Tahoe2PeerSelector in immutable/upload.py. So merge_peers was accidentally adding servers to the sets in preexisting_shares. This bug was fixed in 13b5e44fbc2effd0.

The current code deep-copies the servermap even when merge_peers is not going to mutate it. This isn't strictly necessary, but returning a result in which the sets are aliased would be correct only because the call sites of merge_peers in source:src/allmydata/immutable/upload.py have finished looking at the result before modifying any of those sets. Relying on that would be fragile under maintenance.
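
The aliasing can be demonstrated with a small generic-Python example (not the literal merge_peers code):

import copy

preexisting_shares = {0: set(["server-A"])}      # shnum -> set of server ids

# Shallow copy: the outer dict is new, but the inner sets are shared.
merged = dict(preexisting_shares)
merged[0].add("server-B")                        # also mutates preexisting_shares!
assert preexisting_shares[0] == set(["server-A", "server-B"])

# Deep copy: mutating the merged result leaves the original alone.
preexisting_shares = {0: set(["server-A"])}
merged = copy.deepcopy(preexisting_shares)
merged[0].add("server-B")
assert preexisting_shares[0] == set(["server-A"])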
