#1157 new enhancement

new downloader could still get block data from shares with UEB/hashchain corruption

Reported by: warner Owned by:
Priority: minor Milestone: undecided
Component: code-encoding Version: 1.8β
Keywords: download availability Cc:
Launchpad Bug:


During the work on #1154, I was reminded that I made an expedient/conservative choice during my recent new-downloader work: once we see any corruption in a share, we completely give up on it. It would be nice if we could get value out of partially-corrupted shares. Specifically, when the block data for e.g. segment0 is corrupted, we should still be able to get block data for segment1.

This change will show up in source:src/allmydata/immutable/downloader/fetcher.py#L222, in SegmentFetcher._block_request_activity, in the handling of state=CORRUPT (that state will probably go away, or become per-segment).

I plan to defer this work until we get a "sort shares by quality" scheme in place, where the general idea is to put the "best" shares (fastest, most-data-already-downloaded) at the top of the list, occasionally try out new shares for serendipity, and put slow/tardy shares at the bottom. In this system, corruption would move a share to the bottom of the list, but would not discard it completely.
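
A minimal sketch of what such a "sort shares by quality" ordering might look like. None of these names exist in the downloader: `ShareStats`, `quality_key`, `rank_shares`, and the `explore_probability` knob are all hypothetical, and the exact ranking criteria are an assumption based only on the description above.

```python
import random
from dataclasses import dataclass


@dataclass
class ShareStats:
    """Hypothetical per-share bookkeeping; not a Tahoe-LAFS class."""
    name: str
    bytes_downloaded: int = 0   # data we already hold from this share
    avg_latency: float = 0.0    # seconds per request, lower is better
    corrupt: bool = False       # corruption was seen at least once


def quality_key(share):
    # Tuples compare element by element: corrupted shares sink to the
    # bottom, then more-data-already-downloaded wins, then lower latency.
    return (share.corrupt, -share.bytes_downloaded, share.avg_latency)


def rank_shares(shares, explore_probability=0.1, rng=random):
    """Return all shares ordered best-first, without discarding any."""
    ranked = sorted(shares, key=quality_key)
    # Serendipity: occasionally promote an untried, uncorrupted share.
    untried = [s for s in ranked if s.bytes_downloaded == 0 and not s.corrupt]
    if untried and rng.random() < explore_probability:
        pick = rng.choice(untried)
        ranked.remove(pick)
        ranked.insert(0, pick)
    return ranked
```

The point of the sketch is that a corrupted share only changes position in the list: it is still returned, so a later segment read can fall back to it if the better shares run out.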

There is a test (test_download.py, DownloadTest.test_simultaneous_onefails) which will need to be updated when this is fixed, since it asserts that two simultaneous segment reads (one for a segment that we've intentionally corrupted, the other for an uncorrupted segment) result in two failures, whereas once we've fixed this they will result in one failure and one success.
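
To illustrate the "one failure and one success" outcome, here is a toy model of per-segment corruption tracking. It is not the real downloader or the real test: `ToyShare`, `CorruptBlockError`, and `read_segments` are hypothetical names, and a flat list of SHA-256 digests stands in for the block hash tree.

```python
import hashlib


class CorruptBlockError(Exception):
    """Raised when a block fails its hash check (hypothetical name)."""


class ToyShare:
    """Toy stand-in for a share: one block per segment, plus expected hashes."""

    def __init__(self, blocks, expected_hashes):
        self.blocks = blocks
        self.expected_hashes = expected_hashes
        # Corruption is recorded per segment instead of as one per-share flag.
        self.corrupt_segments = set()

    def read_segment(self, segnum):
        block = self.blocks[segnum]
        if hashlib.sha256(block).digest() != self.expected_hashes[segnum]:
            self.corrupt_segments.add(segnum)
            raise CorruptBlockError("segment %d failed validation" % segnum)
        return block


def read_segments(share, segnums):
    """Read each segment independently; map segnum to block or exception."""
    results = {}
    for segnum in segnums:
        try:
            results[segnum] = share.read_segment(segnum)
        except CorruptBlockError as e:
            results[segnum] = e
    return results


good = [b"segment zero", b"segment one"]
hashes = [hashlib.sha256(b).digest() for b in good]
share = ToyShare([b"CORRUPTED!!!", good[1]], hashes)  # segment 0 is damaged
results = read_segments(share, [0, 1])
```

In this model the read of segment 0 fails and the read of segment 1 succeeds, which is the shape of result the updated test would presumably assert.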

Change History (2)

comment:1 Changed at 2010-08-05T19:21:33Z by warner

  • Summary changed from new downloader should reuse shares with only partial corruption to new downloader could still get block data from shares with UEB/hashchain corruption

Actually, I was less conservative than I thought. We *do* use partially-corrupted shares, as long as the corruption is limited to block data. If there is corruption in the hash trees or UEB, and we notice it (because we tried to get those fields from a given share), then we will abandon that share for the life of the FileNode instance. If we don't notice the corruption (because we already had those fields from some earlier share), we'll keep using it.

So the only improvement we might make is to not give up on UEB-corrupted shares in the hopes of getting useful block data from them later (even though we must get the UEB from some other share).

Updating the description to match.
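
The rules in this comment can be summarized as a small decision function. This is only a sketch of the policy as described here, not code from the downloader: `share_stays_usable` and the field constants are hypothetical, and `proposed=True` models the possible improvement, assuming it would cover hash-tree corruption as well as UEB corruption (as the ticket summary suggests).

```python
BLOCK, HASH_TREE, UEB = "block", "hash_tree", "ueb"


def share_stays_usable(corrupt_fields, fields_read, proposed=False):
    """Decide whether we keep using a share (hypothetical helper).

    corrupt_fields: fields of the share that are actually corrupted
    fields_read:    fields we tried to fetch from this particular share
    proposed:       False = behaviour described in this comment,
                    True  = keep the share around for its block data
    """
    # Corruption is only noticed in fields we actually read from this share.
    noticed = set(corrupt_fields) & set(fields_read)
    metadata_noticed = noticed & {HASH_TREE, UEB}
    if not metadata_noticed:
        # Block-only corruption, or metadata corruption we never looked at:
        # the share stays in use.
        return True
    # Noticed UEB/hash-tree corruption: abandoned today, kept if proposed.
    return proposed
```

For example, `share_stays_usable({UEB}, {BLOCK})` is true because the UEB came from some earlier share, while `share_stays_usable({UEB}, {UEB, BLOCK})` is false unless `proposed=True`.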

comment:2 Changed at 2011-05-21T15:06:57Z by davidsarah

  • Keywords download availability added