#269 closed defect (wontfix)

client should handle migrated shares by updating the write-enabler

Reported by: warner Owned by:
Priority: major Milestone: eventually
Component: code-storage Version: 0.7.0
Keywords: migration leases preservation Cc:
Launchpad Bug:

Description

On 12/17/07, in zooko's patch named "put all private state in $BASEDIR/private", the location of the Tub's certificate moved from node.pem to private/node.pem. Production nodes that were created before this change had their cert in the old location. When I upgraded most of these nodes to the newer code (some time after 12/17), I forgot about the change, and as a result these nodes generated new certificate files (and thus changed their tubid).

However, these nodes had already been running for a while, and had accepted shares with the old tubid. As a result, the leases recorded in these shares were made for a different nodeid, so the cancel/renew secrets are different, and the write_enabler used for mutable slots is different.

The consequence is that certain files created before this change can no longer be modified. They also have leases which can no longer be renewed or canceled, but that's not as big a deal because (I think) the client nodes can just establish new leases.

Effectively, we've accidentally migrated these shares from one storage server to another. We made provisions for this, by recording the nodeid of the storage server which created the slot inside the share (next to the write enabler) and also in all the leases. The exception raised by a bad write enabler includes this recorded nodeid (of the old server), so the client can compute the same write enabler. Eventually, we'll write migration tools which allow the client to demonstrate knowledge of the old write_enabler and then ask the server to update the share to the new (correct) write-enabler.
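To illustrate why a changed nodeid invalidates the write enabler, and why the recorded nodeid is enough for the client to recover it, here is a minimal sketch. It assumes the write enabler is a hash over a per-file master secret and the server's nodeid; the function name, tag string, and hash layout are hypothetical and are not the actual Tahoe derivation.

```python
import hashlib


def derive_write_enabler(write_enabler_master: bytes, server_nodeid: bytes) -> bytes:
    """Hypothetical derivation: hash a per-file master secret with a server nodeid.

    The tag and length-prefixed layout are illustrative assumptions only.
    """
    h = hashlib.sha256()
    h.update(b"write-enabler-sketch:")  # illustrative tag, not the real one
    for field in (write_enabler_master, server_nodeid):
        # Length-prefix each field so the two inputs cannot run together.
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.digest()


master = b"per-file master secret"   # known only to holders of the write cap
old_nodeid = b"\x01" * 20            # nodeid when the slot was created
new_nodeid = b"\x02" * 20            # nodeid after the cert was regenerated

# What the share has stored next to the recorded (old) nodeid.
stored = derive_write_enabler(master, old_nodeid)

# What a client computes today, using the server's current nodeid.
presented = derive_write_enabler(master, new_nodeid)

# What the client can compute once the error tells it the recorded nodeid.
recomputed = derive_write_enabler(master, old_nodeid)

print(presented == stored)   # the write is rejected
print(recomputed == stored)  # the client can still prove knowledge of the old value
```

Under these assumptions, nothing is lost by the accidental migration: the client still holds the master secret, and only needs the old nodeid to reproduce the value the share expects.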

But we haven't written these tools yet, and we're hoping to avoid it until we actually need to start migrating shares.

I'm adding this ticket to remind me what the problem is. We can close it once we've written and deployed the migration tools.

Note that CHK/immutable slots do not record the nodeid information, because 1) they don't have a write-enabler, and leases are somewhat less of a concern, and 2) we didn't think about it at the time. We should probably add these identifiers to the CHK leases, and include CHK lease migration in the tools we build. The basic idea is that if you can demonstrate knowledge of the old lease secret, then you are allowed to change it to be whatever you want.
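The "prove the old secret, then replace it" rule could look roughly like the following on the server side. This is only a sketch of the idea: the `Lease` class and `migrate_lease_secret` function are hypothetical names, and no such storage-server API exists.

```python
import hmac


class Lease:
    """Hypothetical in-memory lease holding a single renew secret."""

    def __init__(self, renew_secret: bytes):
        self.renew_secret = renew_secret


def migrate_lease_secret(lease: Lease, old_secret: bytes, new_secret: bytes) -> bool:
    """Replace the lease secret only if the caller knows the current one.

    Returns True when the secret was updated, False when the proof failed.
    """
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(lease.renew_secret, old_secret):
        return False
    lease.renew_secret = new_secret
    return True


lease = Lease(renew_secret=b"secret-for-old-nodeid")

# A caller without the old secret cannot change anything.
rejected = migrate_lease_secret(lease, b"wrong guess", b"attacker-secret")

# A caller who demonstrates the old secret may set any new value.
accepted = migrate_lease_secret(lease, b"secret-for-old-nodeid", b"secret-for-new-nodeid")

print(rejected, accepted, lease.renew_secret)
```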

Change History (7)

comment:1 Changed at 2008-01-11T05:19:26Z by warner

It might also be a good idea to delete the current testnet shares suffering from this problem, since they're causing our lack of mutable-file recovery code to become a bigger issue than it otherwise needs to be: see #272 for details.

comment:2 Changed at 2008-02-07T19:09:25Z by zooko

  • Summary changed from accidental share migration to share migration

comment:3 Changed at 2008-02-07T19:09:37Z by zooko

  • Summary changed from share migration to share migration (update write-enabler)

comment:4 Changed at 2008-07-01T23:28:47Z by warner

The clients could respond to the wrong-write-enabler error by computing the new one and using it immediately: work around the problem rather than fixing it. This would not require any new storage-server API, and clients would continue to be able to modify all their shares (it would just take an extra roundtrip).
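A rough sketch of that client-side workaround follows. Everything here is an assumption made for illustration: `FakeSlot` stands in for a mutable share on a storage server, `BadWriteEnablerError` models the exception that carries the recorded nodeid, and `derive_write_enabler` is a placeholder hash rather than the real derivation.

```python
import hashlib
import hmac


def derive_write_enabler(master: bytes, nodeid: bytes) -> bytes:
    """Placeholder derivation (assumption): hash of master secret and nodeid."""
    return hashlib.sha256(b"sketch:" + nodeid + b":" + master).digest()


class BadWriteEnablerError(Exception):
    """Hypothetical error that reports the nodeid recorded in the share."""

    def __init__(self, recorded_nodeid: bytes):
        super().__init__("bad write enabler")
        self.recorded_nodeid = recorded_nodeid


class FakeSlot:
    """In-memory stand-in for one mutable share on a storage server."""

    def __init__(self, recorded_nodeid: bytes, write_enabler: bytes):
        self.recorded_nodeid = recorded_nodeid
        self.write_enabler = write_enabler
        self.data = b""

    def write(self, write_enabler: bytes, data: bytes) -> None:
        if not hmac.compare_digest(write_enabler, self.write_enabler):
            raise BadWriteEnablerError(self.recorded_nodeid)
        self.data = data


def write_with_fallback(slot: FakeSlot, master: bytes,
                        current_nodeid: bytes, data: bytes) -> int:
    """Write to the slot, retrying once with the recorded nodeid.

    Returns the number of round trips used (1 or 2).
    """
    try:
        slot.write(derive_write_enabler(master, current_nodeid), data)
        return 1
    except BadWriteEnablerError as err:
        # Recompute the enabler for the server identity that created the slot.
        slot.write(derive_write_enabler(master, err.recorded_nodeid), data)
        return 2


master = b"per-file master secret"
old_nodeid, new_nodeid = b"old-node", b"new-node"

# A share created under the old nodeid, now served by a node with a new one.
migrated = FakeSlot(old_nodeid, derive_write_enabler(master, old_nodeid))
trips = write_with_fallback(migrated, master, new_nodeid, b"updated contents")
print(trips, migrated.data)
```

The share itself is never repaired in this sketch, so every later write to a migrated share pays the same extra round trip unless the client caches the recorded nodeid.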

comment:5 Changed at 2008-07-01T23:29:39Z by warner

  • Summary changed from share migration (update write-enabler) to client should handle migrated shares by updating the write-enabler

comment:6 Changed at 2009-12-13T03:36:32Z by davidsarah

  • Keywords preservation added

comment:7 Changed at 2009-12-13T05:56:45Z by zooko

  • Resolution set to wontfix
  • Status changed from new to closed

The new plan is to define a new mutable-file format (wiki:NewCapDesign) which doesn't have write-enablers (instead it has a digital signature checkable by the server), so I think we should close this as wontfix.

This means that people using the old cap format (which is everyone, currently) can't expect that a future version of Tahoe-LAFS will have the ability to fix the write-enablers on migrated mutable shares.
