[tahoe-lafs-trac-stream] [tahoe-lafs] #1835: stop grovelling the whole storage backend looking for externally-added shares to add a lease to

tahoe-lafs trac at tahoe-lafs.org
Tue Oct 30 23:02:37 UTC 2012


#1835: stop grovelling the whole storage backend looking for externally-added
shares to add a lease to
-------------------------------------------------+-------------------------
 Reporter:  zooko                                |          Owner:
     Type:  enhancement                          |         Status:  new
 Priority:  normal                               |      Milestone:  undecided
Component:  code-storage                         |        Version:  1.9.2
 Keywords:  leases garbage-collection accounting |  Launchpad Bug:
-------------------------------------------------+-------------------------
 Currently, storage server operators can manually add share files into the
 storage backend, such as with "mv" or "rsync" or what have you, and a
 crawler will eventually discover that share and add a lease to it.

 I propose that we stop supporting this method of installing shares. That
 would leave three options for adding a share to a server:

 1. Send it through the front door — use a tool that speaks the LAFS
 protocol, connects to the storage server over a network socket, and
 delivers the share. This will make the server write the share out to
 persistent storage, and also update the leasedb to reflect the share's
 existence, so that the share can get garbage-collected when appropriate.
 This would be a good way to do it if you have few shares or if they are on
 a remote server that can connect to this storage server over a network.
 2. Copy the shares directly into place in the storage backend and then
 remove the leasedb. The next time the storage server starts, it will
 initiate a crawl that will eventually reconstruct the leasedb, and the
 newly reconstructed leasedb will include lease information about the new
 share so that it can eventually be garbage collected. This might be a
 reasonable thing to do when you are adding a large number of shares and it
 is easier/more efficient for you to add them directly to the storage
 backend, and you don't mind temporarily losing the lease information on
 the shares that are already there.
 3. Copy the shares into place, but don't do anything that would register
 them in the leasedb. They are now immortal, unless a client subsequently
 adds a lease to them.

 The combination of these three options ''might'' suffice for most real use
 cases. If there are use cases where these aren't good enough, i.e. it is
 too inconvenient or slow to send all of the shares through the LAFS
 storage protocol, you don't want to destroy the extant lease information,
 and you don't want the new shares to possibly become immortal, then we
 could invent other ways to do it:

 4. Copy the shares into place and then use a newly added feature of the
 storage server which tells it to notice the existence of each new share
 (by storage index). This feature doesn't need to be exported over the
 network to remote foolscap clients; it could just be a "tahoe"
 command-line command that connects to the storage server's local WAPI.
 When the server is informed this way about the existence of a share, it
 checks that the share is really there and then adds it to the leasedb.
 5. Copy the shares into place and then use a newly added feature of the
 storage server which performs a full crawl to update the leasedb without
 first deleting it.
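
 A minimal sketch of the server-side check described in option 4: confirm
 that the share files for a storage index are really on disk, then record
 them. Everything here is an assumption for illustration: the leasedb is
 modelled as a one-table SQLite database with a made-up schema, the on-disk
 layout is assumed to be <sharedir>/<first two characters of the storage
 index>/<storage index>/<share number>, and the function names are
 hypothetical rather than existing Tahoe-LAFS APIs.

```python
import sqlite3
from pathlib import Path


def open_leasedb(path=":memory:"):
    """Open a stand-in leasedb (hypothetical one-table schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS shares ("
        " storage_index TEXT, shnum INTEGER, size INTEGER,"
        " PRIMARY KEY (storage_index, shnum))"
    )
    return db


def add_share_to_leasedb(db, sharedir, storage_index):
    """Register every share file found on disk for one storage index.

    Returns the share numbers that were found. An empty list means the
    server could not confirm that any share is really there, so nothing
    was recorded.
    """
    # Assumed layout: <sharedir>/<first two chars of SI>/<SI>/<shnum>
    si_dir = Path(sharedir) / storage_index[:2] / storage_index
    found = []
    if not si_dir.is_dir():
        return found
    for entry in sorted(si_dir.iterdir()):
        if entry.is_file() and entry.name.isdigit():
            shnum = int(entry.name)
            # OR IGNORE makes repeated calls harmless for known shares.
            db.execute(
                "INSERT OR IGNORE INTO shares VALUES (?, ?, ?)",
                (storage_index, shnum, entry.stat().st_size),
            )
            found.append(shnum)
    db.commit()
    return found
```

 Because the insert ignores rows that already exist, an operator could run
 such a command twice for the same storage index without disturbing
 anything already recorded.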

 Option 4 would be a bit more efficient than option 5, but a lot more
 complicated for the server administrator, who has to figure out how to
 call {{{tahoe add-share-to-lease-db $STORAGEINDEX}}} for each share that
 they have added, or else that share will be immortal. It is also more
 work for us to implement.

 Option 5 is really simple both for us to implement and for storage server
 operators to use. It is exactly like the current crawler code, except
 that instead of continuously restarting itself and going to look for new
 shares, it quiesces and doesn't restart unless the server operator
 invokes {{{tahoe resync-lease-db}}}.
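
 A rough sketch of the one-shot crawl behind option 5: walk the share
 directory once, add any share the leasedb doesn't know about, leave
 existing rows alone, and then stop. As assumptions for illustration, the
 leasedb is modelled as a SQLite table with a made-up schema, the layout is
 assumed to be <sharedir>/<prefix>/<storage index>/<share number>, and
 resync_leasedb is a hypothetical name, not an existing function.

```python
import sqlite3
from pathlib import Path


def resync_leasedb(db, sharedir):
    """Walk the share directory once and register unknown shares.

    Rows already in the (hypothetical) shares table are not touched, so
    existing lease information survives. Returns the number of shares
    that were newly registered.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS shares ("
        " storage_index TEXT, shnum INTEGER,"
        " PRIMARY KEY (storage_index, shnum))"
    )
    added = 0
    # Assumed layout: <sharedir>/<prefix>/<storage index>/<shnum>
    for share in Path(sharedir).glob("*/*/*"):
        if not (share.is_file() and share.name.isdigit()):
            continue
        key = (share.parent.name, int(share.name))
        known = db.execute(
            "SELECT 1 FROM shares WHERE storage_index = ? AND shnum = ?",
            key,
        ).fetchone()
        if known is None:
            db.execute("INSERT INTO shares VALUES (?, ?)", key)
            added += 1
    db.commit()
    # The function returns instead of rescheduling itself: the crawl
    # quiesces until the operator asks for another one.
    return added
```

 The difference from a continuously running crawler is only in the last
 step: nothing reschedules the walk, so the cost is paid once per explicit
 request.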

 So my proposal boils down to: change the accounting crawler never to run
 unless the leasedb is missing or corrupted (which also happens the first
 time you upgrade your server to a leasedb-capable version), or unless the
 operator has specifically indicated that the accounting crawler should
 run.
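
 The rule above can be sketched as a single predicate. This assumes the
 leasedb is a SQLite file and treats "corrupted" as "fails SQLite's
 integrity check"; both the function names and that definition of
 corruption are assumptions for illustration, not the actual
 implementation.

```python
import sqlite3
from pathlib import Path


def leasedb_is_usable(path):
    """Return True if the file exists and passes SQLite's integrity check."""
    if not Path(path).is_file():
        # Missing leasedb, e.g. first start after upgrading the server.
        return False
    try:
        db = sqlite3.connect(path)
        try:
            row = db.execute("PRAGMA integrity_check").fetchone()
        finally:
            db.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a database at all.
        return False
    return row is not None and row[0] == "ok"


def should_run_accounting_crawler(leasedb_path, operator_requested=False):
    """Run only on explicit request or when the leasedb is missing/corrupt."""
    return operator_requested or not leasedb_is_usable(leasedb_path)
```

 With a healthy leasedb and no operator request the predicate is false, so
 the crawler stays idle across normal restarts.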

 This is part of an "overarching ticket" to eliminate most uses of the
 crawler: ticket #1834.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1835>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

