[tahoe-lafs-trac-stream] [tahoe-lafs] #881: cancel leases on extra shares in repairer, check-and-add-lease, upload, and publish

tahoe-lafs trac at tahoe-lafs.org
Thu Dec 6 22:26:27 UTC 2012


#881: cancel leases on extra shares in repairer, check-and-add-lease, upload, and
publish
---------------------------+-------------------------------------
     Reporter:  warner     |      Owner:  somebody
         Type:  defect     |     Status:  closed
     Priority:  major      |  Milestone:  soon
    Component:  code       |    Version:  1.5.0
   Resolution:  duplicate  |   Keywords:  leases space-efficiency
Launchpad Bug:             |
---------------------------+-------------------------------------
Changes (by davidsarah):

 * status:  new => closed
 * resolution:   => duplicate


Description:

 The ideal state of a file is to have exactly N distinct shares on N
 distinct servers. Anything beyond that is "extra": extra shares might
 improve reliability, but they also consume storage space. We'd like to
 remove these extra shares to bring the total consumed storage back down
 to the target implied by the user's choice of the N/k "expansion
 ratio".
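
 For concreteness, here is the arithmetic as a sketch (k=3/N=10 are the
 usual Tahoe-LAFS defaults; the 1 MiB file size is just an
 illustration, and per-share hash/UEB overhead is ignored):

     # Storage cost of the ideal state vs. one extra share.
     k, N = 3, 10                  # shares needed / shares total
     filesize = 2**20              # a 1 MiB file, for illustration
     share_size = filesize / k     # each share carries ~1/k of the data
     ideal_total = N * share_size  # the N/k "expansion ratio" target
     print(ideal_total)            # ~3.3 MiB consumed for a 1 MiB file
     print(share_size)             # each extra share adds another ~0.33 MiB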

 For mutable files, anyone with a writecap can simply delete the extra
 shares. We should modify the "publish" operation to identify and delete
 the extra shares (after successfully updating the non-extra shares).
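
 A sketch of that cleanup step, assuming the extras have already been
 identified as (server, shnum) pairs. The call is modeled on the
 storage protocol's slot_testv_and_readv_and_writev operation, where
 truncating a mutable share to new_length=0 deletes it; treat the exact
 shapes here as illustrative, not the real publish code:

     def delete_extra_mutable_shares(extras, storage_index, secrets):
         # secrets: (write-enabler, renew-secret, cancel-secret). Only
         # writecap holders can compute the write-enabler, which is why
         # only they can delete mutable shares.
         for server, shnum in extras:
             server.callRemote("slot_testv_and_readv_and_writev",
                               storage_index, secrets,
                               {shnum: ([], [], 0)},  # no tests, no writes,
                               [])                    # truncate to length 0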

 But there is no appropriate way to explicitly delete an immutable share:
 we intentionally do not provide a "destroycap". So the way to get rid of
 these shares is through garbage collection.

 The operations that add leases (check --add-lease, and the repairer)
 should track which shares have been seen, identify the extra ones, and
 then cancel whatever leases we can on them.
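
 One way to pick the extras (an illustrative sketch, not the actual
 checker code; the response shape is an assumption):

     def identify_extras(dyhb_responses):
         # dyhb_responses: iterable of (server, set_of_shnums) pairs.
         # Keep the first holder seen for each share number; every later
         # duplicate is "extra" and a candidate for lease cancellation.
         seen = set()
         extras = []
         for server, shnums in dyhb_responses:
             for shnum in sorted(shnums):
                 if shnum in seen:
                     extras.append((server, shnum))
                 else:
                     seen.add(shnum)
         return extras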

 Check-and-add-lease pipelines both operations: it sends a DYHB
 ("do you have block?") query and an add-lease-to-anything-you-have
 message together, ignoring the response from the add-lease message and
 counting the DYHB responses to form the checker results. This speeds up
 the operation: if we allowed the code to have an unbounded number of
 outstanding messages in flight, the entire operation could finish in
 one RTT.
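
 A sketch of that pipelined send (the remote-method names follow
 Tahoe's RIStorageServer interface, but the surrounding plumbing is
 illustrative):

     from twisted.internet import defer

     def check_and_add_lease(servers, storage_index, renew_secret,
                             cancel_secret):
         dyhb_deferreds = []
         for server in servers:
             # DYHB: ask which shares this server holds
             d = server.callRemote("get_buckets", storage_index)
             d.addCallback(lambda buckets, s=server: (s, set(buckets)))
             dyhb_deferreds.append(d)
             # add-lease-to-anything-you-have: fire and forget
             da = server.callRemote("add_lease", storage_index,
                                    renew_secret, cancel_secret)
             da.addErrback(lambda f: None)  # response is ignored
         # only the DYHB answers feed the checker results
         return defer.gatherResults(dyhb_deferreds)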

 Instead, this code should watch the DYHB responses and identify the extra
 shares, then send out cancel-lease messages for the extra shares. This
 increases the required time to two RTT (since we can't send out any
 cancel-lease messages until we've seen enough DYHB responses to correctly
 identify shares as being extra), but only in the (hopefully rare) case
 where there are extra shares. In the common case, check-and-add-lease
 should proceed at full speed and never need to send out additional
 messages.
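
 In sketch form, the second (hopefully rare) round trip could look like
 this, reusing identify_extras() from above. The per-share shnum
 argument to cancel_lease is the server-side enhancement proposed at
 the end of this ticket, not something today's protocol accepts:

     def cancel_extras(dyhb_results, storage_index, cancel_secret):
         # dyhb_results: the (server, shnums) pairs gathered above. In
         # the common case identify_extras() returns an empty list and
         # no additional messages are sent at all.
         for server, shnum in identify_extras(dyhb_results):
             server.callRemote("cancel_lease", storage_index,
                               cancel_secret, shnum)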

 Sending out cancel-lease messages is also easier than carefully refraining
 from sending out add-lease messages on the extra shares. To accomplish
 that, we'd have to do a full check run (i.e. DYHB messages to everyone),
 and only after most of those came back could we do the selective add-lease
 messages. By sending out cancel-lease messages instead, we're sending
 more
 messages (DYHB, add-lease, cancel-lease), but we can pipeline them more
 efficiently.

 Extra shares can arise in a variety of ways. The most common is when a
 mutable file is modified while some of the servers are offline: new shares
 (to replace the unavailable ones) will be created and sent to new servers,
 and then on a subsequent publish, all shares will be updated. This
 typically results in e.g. sh1 being present on both servers A and B.

 Another cause is the immutable repairer, which (because immutable
 upload is still pretty simplistic) will place a share on a server
 without first checking whether that same share already lives on a
 different server, or whether that server already holds other shares.
 This typically results in e.g. sh1 and sh2 being present on server A,
 while sh2 is also present on server B.

 The storage server's add/cancel lease operations need to be enhanced to
 allow clients to selectively manipulate leases on each share, not just the
 bucket as a whole. This is needed to allow the sh2 on server A to expire,
 while preserving the sh1 on server A. This also argues against some of the
 storage-server changes that I've recommended elsewhere (#600), in which
 the lease information would be pulled out of the per-share files and into
 a per-bucket structure, since that would make it impossible to cancel a
 lease on one share but not the other.
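
 A sketch of what those enhanced signatures could look like (the
 optional shnum parameter is the proposal; the existing methods operate
 on the whole bucket):

     class RIStorageServerSketch:
         # Illustrative signatures only, modeled on RIStorageServer.
         def remote_add_lease(self, storage_index, renew_secret,
                              cancel_secret, shnum=None):
             """shnum=None keeps today's whole-bucket behavior; an
             integer shnum adds the lease to just that one share."""

         def remote_cancel_lease(self, storage_index, cancel_secret,
                                 shnum=None):
             """With shnum given, cancel this lease on that share only,
             e.g. let sh2 on server A expire while sh1 keeps its
             lease."""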

--

Comment:

 Merging this into #1816.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/881#comment:3>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

