[tahoe-lafs-trac-stream] [tahoe-lafs] #1566: if a stored share has a corrupt header, other shares held by that server for the file should still be accessible to clients

Fri Jul 12 18:10:22 UTC 2013

#1566: if a stored share has a corrupt header, other shares held by that server
for the file should still be accessible to clients
------------------------------+-------------------------------------------------
     Reporter:  davidsarah    |      Owner:  zooko
         Type:  defect        |     Status:  new
     Priority:  major         |  Milestone:  1.11.0
    Component:  code-storage  |    Version:  1.9.0b1
   Resolution:                |   Keywords:  corruption preservation storage
Launchpad Bug:                |              review-needed
------------------------------+-------------------------------------------------

New description:

 When a storage server receives a {{{remote_get_buckets}}} or
 {{{remote_slot_testv_and_readv_and_writev}}} request, it will try to
 create share objects for each of the shares it stores under that SI that
 are wanted by the client. If any of those shares have a corrupt header
 (typically resulting in a {{{UnknownMutableContainerVersionError}}},
 {{{UnknownImmutableContainerVersionError}}}, or {{{struct.error}}} from
 the share class constructor), the whole request will fail, even though the
 server might hold other shares that are not corrupted.

 Unfortunately there is no way in the current storage protocol to report
 success for some shares and a failure for others. The options are:
  * the status quo -- no shares in the shareset are accessible;
  * shares with corrupt headers are ignored on read requests;
  * if ''all'' shares are corrupted then report one of the errors, but if
    only some shares in a shareset have corrupted headers, ignore them and
    allow access to the rest.

 I found this bug when working on the branch for #999, but I think it also
 applies to trunk.
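
 As a rough illustration of the third option, here is a minimal,
 self-contained Python sketch. It is not the Tahoe-LAFS storage server
 code: the header layout, `open_share`, `get_readable_shares`, and
 `UnknownContainerVersionError` are hypothetical stand-ins, assumed only
 for this example. Shares whose headers fail to parse are skipped, and an
 error is raised only when every wanted share is corrupt.

```python
import struct


class UnknownContainerVersionError(Exception):
    """Hypothetical stand-in for the Unknown*ContainerVersionError classes."""


# Assumption for this sketch: a share starts with a 4-byte big-endian version.
HEADER = struct.Struct(">L")
SUPPORTED_VERSION = 1


def open_share(data):
    """Parse the header and return the share body.

    Raises struct.error if the header is truncated, or
    UnknownContainerVersionError if the version is not recognised.
    """
    (version,) = HEADER.unpack_from(data)
    if version != SUPPORTED_VERSION:
        raise UnknownContainerVersionError(version)
    return data[HEADER.size:]


def get_readable_shares(stored, wanted=None):
    """Return {shnum: body} for every wanted share with a valid header.

    `stored` maps share numbers to raw share bytes. Shares with corrupt
    headers are ignored, unless no share at all could be opened, in which
    case the first error encountered is re-raised.
    """
    readable = {}
    first_error = None
    for shnum, data in sorted(stored.items()):
        if wanted is not None and shnum not in wanted:
            continue
        try:
            readable[shnum] = open_share(data)
        except (UnknownContainerVersionError, struct.error) as e:
            if first_error is None:
                first_error = e
    if not readable and first_error is not None:
        raise first_error
    return readable
```

 With one good share and two corrupt ones, this sketch returns only the
 good share; with only corrupt shares, it raises the first error seen.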

--

Comment (by markberger):

 It looks good to me, except for one line I have a question about. I left a
 comment on
 [https://github.com/LeastAuthority/tahoe-lafs/commit/fd819cea11599cc274b8e1d72bfce0fffea39296 fd819cea]
 about it.

 Does the server file a local corruption report like Brian suggested? I
 can't seem to find code anywhere that does this.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1566#comment:12>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage