[tahoe-lafs-trac-stream] [tahoe-lafs] #1832: support indefinite leases with garbage collection

tahoe-lafs trac at tahoe-lafs.org
Tue Oct 30 18:40:03 UTC 2012


#1832: support indefinite leases with garbage collection
-------------------------------------------------+-------------------------
 Reporter:  zooko                                |          Owner:
     Type:  enhancement                          |         Status:  new
 Priority:  normal                               |      Milestone:  undecided
Component:  code-network                         |        Version:  1.9.2
 Keywords:  leases garbage-collection accounting |  Launchpad Bug:
-------------------------------------------------+-------------------------
 !LeastAuthority.com runs a storage server and we want to offer our
 customers an indefinite (within the scope of their business relationship
 with us) lease. That is: as long as they keep paying their credit card
 charges to us (or even ''longer'', if we choose to keep their ciphertext
 until they bring their account back into good standing) we will not delete
 ciphertext that their LAFS storage client has marked as something to keep,
 even if they don't successfully get their LAFS storage client to renew
 leases ever again.

 We need this, because the current protocol offers us only two options,
 neither of which is acceptable:

 * We can turn on periodic, lease-renewal-based garbage collection, but if
 the customer fails to renew the leases on their data, it will get deleted.
 We don't believe that's an acceptable risk to offer our customers.

 Or:

 * We can turn off garbage collection altogether and keep everything that
 the customer ever uploaded, and charge them for keeping it.

 The latter is what we currently do, but it isn't sustainable, because:

 * Some of the ciphertext that the customer has uploaded is no longer
 valuable to them, and their LAFS client could inform us of the fact that
 we need to delete it (or at the very least, to stop charging ''them'' for
 the cost of keeping it!)

 and even more difficult:

 * Some of the ciphertext that the customer has uploaded is no longer
 valuable to them, but they've forgotten all about it and deleted or lost
 all of their links to it, and their LAFS storage client never informed us
 about this fact (possibly due to a network failure when it tried to inform
 us).

 This implies that to satisfy this use case, there ''must'' be a protocol
 whereby the LAFS client can tell the storage server "Okay, here are some
 ciphertext shares which ''as of now'' I want to keep, and ''any other
 ones'' that you might have I hereby cease paying for, so you'd better
 delete them, if they exist."

 Now, it would be troublesome for the LAFS client to be required to build a
 complete manifest of all ciphertext shares that it wants to keep and then
 deliver that entire manifest to the server at once. A more incremental
 protocol would satisfy this use case. The following is what the LAFS
 client does when it wants to stop paying for everything not reachable
 from a given root.

 1. The client asks the server for a magic token which is something that is
 meaningful ''only'' to the server. The meaning of this is "Give me a
 special token that when I later give it back to you, you'll know you can
 delete everything that I didn't touch since you created this special
 token." As a matter of implementation, the storage server will find it
 convenient to use a timestamp from its own clock as the token, but in order
 to deter the client from comparing it to timestamps from the client's
 clock, it is sent as a string. Ooh, in fact, the server may have actually
 ''encrypted'' the token with a secret key known only to the server and
 unknown to the client, just to prevent the client from comparing its value
 to a value taken from their own clock. This way, the protocol adds
 absolutely no requirement for clock sync between the client's clock and
 the server's clock, but instead this "timestamp" is derived from the
 server's clock, and is only ever compared to the server's clock. If the
 server's clock is set to 1969 and the client's clock is set to 2099, or
 vice versa, that's fine.

 2. The client starts traversing the files from the root, and for each one
 (or batch of them) it sends a message to the storage server saying "Please
 mark these ones as to-keep." The server replies "Okay, done." (This "mark
 as to-keep" can be implemented as a "100 year lease" if that helps
 implementation.) It might make sense for the client to send the magic
 token with every one of these requests, so that the request means "Please
 mark these ones as to-keep when you do the garbage-collection sweep
 associated with this token." or "Please exempt these ones from a probable
 future garbage-collection sweep that will, when and if it comes, be
 associated with this token."

 Note: If the client crashes or gets stopped and restarted or loses and
 regains connection to the storage server during this process, it can
 always resume at step 2, provided that it still has the "token" from step
 1 written down.

 Note: If the client creates new files during this process, each newly
 created file comes with an equivalent mark ("100 year lease"), so that the
 client doesn't have to worry about race conditions between its traversal
 for marking keepers and its addition of new files.

 3. Once the client is satisfied that it has marked all files that it wants
 to keep, it sends a message to the storage server saying "Here is the
 magic token that you gave me earlier. I hereby cease paying you for any
 files that I haven't marked (or created) since you gave me that magic
 token."
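
 To make the three steps concrete, here is a rough in-memory sketch of the
 server side. It is only an illustration under stated assumptions, not a
 proposed implementation: the class `StorageServerSketch` and its methods
 (`upload`, `get_token`, `mark_to_keep`, `sweep`, `shares`) are hypothetical
 names, not part of any existing Tahoe-LAFS API. Two simplifications: a
 logical counter stands in for the server's clock, and the token is a random
 string that the server maps to its own time, rather than an encrypted
 timestamp. Either way the client never sees or compares a clock value.

```python
import itertools
import secrets


class StorageServerSketch:
    """Hypothetical in-memory model of the token / mark / sweep protocol."""

    def __init__(self):
        # Assumption: a logical counter stands in for the server's clock.
        self._clock = itertools.count(1)
        self._tokens = {}    # opaque token -> (account, server time at issue)
        self._touched = {}   # (account, share_id) -> server time last marked

    def _now(self):
        return next(self._clock)

    def upload(self, account, share_id):
        # New shares arrive already marked (the "100 year lease" note),
        # so uploads that race with a traversal survive the sweep.
        self._touched[(account, share_id)] = self._now()

    def get_token(self, account):
        # Step 1: hand out a token that is meaningful only to the server.
        token = secrets.token_hex(16)
        self._tokens[token] = (account, self._now())
        return token

    def mark_to_keep(self, account, share_ids):
        # Step 2: record that these shares were touched; safe to repeat,
        # so a client that crashed can resume with the same token.
        now = self._now()
        for share_id in share_ids:
            if (account, share_id) in self._touched:
                self._touched[(account, share_id)] = now

    def sweep(self, account, token):
        # Step 3: drop every share of this account that was not marked
        # or created since the token was issued.
        owner, issued = self._tokens[token]  # KeyError for unknown tokens
        if owner != account:
            raise PermissionError("token belongs to another account")
        del self._tokens[token]
        doomed = sorted(
            share_id
            for (acct, share_id), when in self._touched.items()
            if acct == account and when < issued
        )
        for share_id in doomed:
            del self._touched[(account, share_id)]
        return doomed

    def shares(self, account):
        return sorted(s for (a, s) in self._touched if a == account)


server = StorageServerSketch()
for share_id in ["a", "b", "c"]:
    server.upload("alice", share_id)

token = server.get_token("alice")          # step 1
server.mark_to_keep("alice", ["a"])        # step 2 (one batch)
server.upload("alice", "d")                # new file during the traversal
deleted = server.sweep("alice", token)     # step 3
print(deleted)                             # ['b', 'c']
```

 Only server-side times are ever compared with each other, which is the
 property the protocol above relies on to avoid any clock-sync requirement.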


 (discussed [//pipermail/tahoe-dev/2012-October/007768.html on the mailing
 list])

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1832>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

