#3540 new defect

allmydata.mutable.publish.Publish.publish has unreliably covered bad-share handling code

Reported by: exarkun
Owned by:
Priority: normal
Milestone: undecided
Component: unknown
Version: n/a
Keywords:
Cc:
Launchpad Bug:

Description

From https://www.tahoe-lafs.org/trac/tahoe-lafs/ticket/2891 and https://app.codecov.io/gh/tahoe-lafs/tahoe-lafs/compare/896/changes, these lines are non-deterministically covered:

        for key, old_checkstring in list(self._servermap.get_bad_shares().items()):
            (server, shnum) = key
            self.goal.add( (server,shnum) )
            self.bad_share_checkstrings[(server,shnum)] = old_checkstring

Add some deterministic coverage for them.
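
A deterministic test could seed the servermap with a known bad share and assert that the loop re-adds it to the goal and records its checkstring. Below is a minimal, self-contained sketch of that assertion; FakeServerMap, collect_bad_shares, and the test name are hypothetical stand-ins, not Tahoe-LAFS APIs (a real test would drive Publish against a ServerMap that has recorded a bad share):

    class FakeServerMap:
        def __init__(self, bad_shares):
            self._bad_shares = bad_shares

        def get_bad_shares(self):
            # maps (server, shnum) -> old checkstring, as in the quoted loop
            return self._bad_shares


    def collect_bad_shares(servermap):
        # Reproduces the quoted loop: re-add bad shares to the goal and
        # remember their old checkstrings so the writers can be told later.
        goal = set()
        bad_share_checkstrings = {}
        for (server, shnum), old_checkstring in list(servermap.get_bad_shares().items()):
            goal.add((server, shnum))
            bad_share_checkstrings[(server, shnum)] = old_checkstring
        return goal, bad_share_checkstrings


    def test_bad_shares_are_rescheduled():
        servermap = FakeServerMap({("server-0", 3): b"checkstring-seq0"})
        goal, checkstrings = collect_bad_shares(servermap)
        assert ("server-0", 3) in goal
        assert checkstrings[("server-0", 3)] == b"checkstring-seq0"


    test_bad_shares_are_rescheduled()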

Change History (1)

comment:1 Changed at 2020-11-30T21:05:16Z by exarkun

It's hard to tell what the point of this loop is. Nothing in the test suite fails if I just delete it.

The self.update_goal() call that follows immediately afterwards discovers that the bad shares are homeless and adds them to self.goal itself, so this loop does not seem to be necessary to get bad shares re-uploaded before the publish operation is considered successful.
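
A rough sketch of why the loop looks redundant, assuming update_goal behaves roughly as described above (the real method places shares via the permuted server list; the names here are illustrative only):

    # Rough sketch, not the real update_goal: any share number without an
    # assigned home is added back to the goal, so a bad share that was
    # dropped from the goal would be re-placed here anyway.
    def update_goal_sketch(total_shares, goal, pick_server):
        placed = set(shnum for (_server, shnum) in goal)
        homeless = set(range(total_shares)) - placed
        for shnum in sorted(homeless):
            goal.add((pick_server(shnum), shnum))
        return goal

    goal = set()  # pretend bad share 0 was dropped from the goal
    update_goal_sketch(3, goal, lambda shnum: "server-%d" % shnum)
    assert ("server-0", 0) in goal  # share 0 gets a home again without the loop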

The bad_share_checkstrings dictionary might be the purpose: if values are found there later, the writer is told about the old checkstrings. Perhaps this avoids uncoordinated repairs?

So ...

  1. node0 decides that share0 is bad and has sequence number seq0.
  2. it records a checkstring including seq0 and gets ready to repair it.
  3. node1 decides that share0 is bad and has sequence number seq0.
  4. it records a checkstring including seq0 and gets ready to repair it.
  5. node1 uploads a new share0 with sequence number seq1 against checkstring seq0.
  6. the share is now repaired and contains a new content version.
  7. node0 tries to upload a new share0 with sequence number seq1 against checkstring seq0.
  8. the upload fails because the checkstring doesn't match.

maybe?
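
If that guess is right, the old checkstring acts as a test-and-set condition on the remote share: the write only succeeds if the share still looks the way the repairer last saw it, so the second, uncoordinated repair is rejected instead of silently overwriting the first. A toy sketch of that interaction (ShareSlot and conditional_write are hypothetical, not Tahoe-LAFS APIs):

    class ShareSlot:
        def __init__(self, checkstring):
            self.checkstring = checkstring

        def conditional_write(self, expected_checkstring, new_checkstring):
            # The write succeeds only if the share still matches what the
            # writer saw when it decided to repair.
            if self.checkstring != expected_checkstring:
                return False
            self.checkstring = new_checkstring
            return True


    share0 = ShareSlot(b"seq0")

    # node1 repairs first: seq0 -> seq1
    assert share0.conditional_write(b"seq0", b"seq1-from-node1")

    # node0 tries the same repair against the stale seq0 checkstring and is refused
    assert not share0.conditional_write(b"seq0", b"seq1-from-node0")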
