[tahoe-lafs-trac-stream] [tahoe-lafs] #1835: stop grovelling the whole storage backend looking for externally-added shares to add a lease to
tahoe-lafs
trac at tahoe-lafs.org
Tue Oct 30 23:02:37 UTC 2012
#1835: stop grovelling the whole storage backend looking for externally-added
shares to add a lease to
--------------------------+-------------------------
 Reporter: zooko          | Owner:
 Type: enhancement        | Status: new
 Priority: normal         | Milestone: undecided
 Component: code-storage  | Version: 1.9.2
 Keywords: leases garbage-collection accounting
 Launchpad Bug:
--------------------------+-------------------------
Currently, storage server operators can manually add share files into the
storage backend, such as with "mv" or "rsync" or what have you, and a
crawler will eventually discover that share and add a lease to it.
I propose that we stop supporting this method of installing shares. That
would leave three ways to add a share to a server:
1. Send it through the front door — use a tool that speaks the LAFS
protocol, connects to the storage server over a network socket, and
delivers the share. This will make the server write the share out to
persistent storage, and also update the leasedb to reflect the share's
existence, so that the share can get garbage-collected when appropriate.
This would be a good way to do it if you have few shares or if they are on
a remote server that can connect to this storage server over a network.
2. Copy the shares directly into place in the storage backend and then
remove the leasedb. The next time the storage server starts, it will
initiate a crawl that will eventually reconstruct the leasedb, and the
newly reconstructed leasedb will include lease information about the new
share so that it can eventually be garbage collected. This might be a
reasonable thing to do when you are adding a large number of shares and it
is easier/more efficient for you to add them directly to the storage
backend, and you don't mind temporarily losing the lease information on
the shares that are already there.
3. Copy the shares into place, but don't do anything that would register
them in the leasedb. They are now immortal, unless a client subsequently
adds a lease to them.
The combination of these three options ''might'' suffice for most real use
cases. If there are use cases where they aren't good enough, i.e. it is
too inconvenient or slow to send all of the shares through the LAFS
storage protocol, and you don't want to destroy the extant lease
information, and you don't want the new shares to possibly become
immortal, then we could invent other ways to do it:
4. Copy the shares into place and then use a newly added feature of
storage server which tells it to notice the existence of each new share
(by storage index). This newly added feature doesn't need to be exported
over the network to remote foolscap clients, it could just be a "tahoe"
command-line that connects to the storage server's local WAPI. What the
server does when it is informed this way about the existence of a share is
check if the share is really there and then add it to the leasedb.
5. Copy the shares into place and then use a newly added feature of
storage server which performs a full crawl to update the leasedb without
first deleting it.
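As a rough illustration of what option 4 asks the server to do (check that
the share is really there, then add it to the leasedb), here is a minimal
sketch. The sqlite schema, the function name {{{notice_shares}}}, and the
on-disk layout {{{<root>/<prefix>/<storage index>/<shnum>}}} are all
assumptions made for this example, not the actual leasedb design.

```python
import sqlite3
import time
from pathlib import Path

# Hypothetical leasedb schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS shares (
    storage_index TEXT NOT NULL,
    shnum INTEGER NOT NULL,
    noticed_at REAL NOT NULL,
    PRIMARY KEY (storage_index, shnum)
)
"""


def notice_shares(db, shares_root, storage_index):
    """Register every share file found on disk for one storage index.

    Returns the sorted list of share numbers found. An empty list means
    nothing was on disk, so nothing was added to the leasedb.
    """
    # Assumed layout: <shares_root>/<first two chars of SI>/<SI>/<shnum>
    si_dir = Path(shares_root) / storage_index[:2] / storage_index
    if not si_dir.is_dir():
        return []
    shnums = sorted(
        int(p.name) for p in si_dir.iterdir()
        if p.is_file() and p.name.isdecimal()
    )
    db.execute(SCHEMA)
    now = time.time()
    # INSERT OR IGNORE keeps the call idempotent for already-known shares.
    db.executemany(
        "INSERT OR IGNORE INTO shares VALUES (?, ?, ?)",
        [(storage_index, n, now) for n in shnums],
    )
    db.commit()
    return shnums
```

Because the insert is idempotent, an operator could safely call this more
than once for the same storage index.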
Option 4 would be a bit more efficient than option 5 when used, but a lot
more complicated for the server administrator, who has to figure out how to
call {{{tahoe add-share-to-lease-db $STORAGEINDEX}}} for each share that
they have added, or else that share will be immortal. It is also more work
for us to implement.
Option 5 is really simple both for us to implement and for storage server
operators to use. It is exactly like the current crawler code, except that instead
of continuously restarting itself and going to look for new shares, it
quiesces and doesn't restart unless the server operator invokes {{{tahoe
resync-lease-db}}}.
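The one-shot behaviour described for option 5 could look roughly like the
sketch below: a single pass over the share tree that adds missing entries
without deleting existing ones, and then stops. The schema, the function
name {{{resync_lease_db}}}, and the directory layout are hypothetical
stand-ins chosen for the example.

```python
import sqlite3
import time
from pathlib import Path

# Hypothetical leasedb schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS shares (
    storage_index TEXT NOT NULL,
    shnum INTEGER NOT NULL,
    noticed_at REAL NOT NULL,
    PRIMARY KEY (storage_index, shnum)
)
"""


def resync_lease_db(db, shares_root):
    """Crawl the share tree once and add any share the leasedb lacks.

    Existing rows are left untouched. Returns the number of rows added.
    """
    db.execute(SCHEMA)
    before = db.total_changes
    now = time.time()
    # Assumed layout: <shares_root>/<prefix>/<storage index>/<shnum>
    for share in Path(shares_root).glob("*/*/*"):
        if share.is_file() and share.name.isdecimal():
            db.execute(
                "INSERT OR IGNORE INTO shares VALUES (?, ?, ?)",
                (share.parent.name, int(share.name), now),
            )
    db.commit()
    # Ignored inserts do not count as changes, so this is the new-row count.
    return db.total_changes - before
```

A second call with no new files on disk adds nothing, which matches the
idea that the crawl quiesces until the operator asks for it again.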
So my proposal boils down to: change the accounting crawler so that it
never runs unless the leasedb is missing or corrupted (it will also be
missing the first time you upgrade your server to a leasedb-capable
version), or unless the operator has specifically indicated that the
accounting crawler should run.
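The rule in this proposal can be written down as a small predicate. This
is only a sketch: it assumes the leasedb is a single sqlite file and uses
sqlite's {{{PRAGMA integrity_check}}} as a stand-in for whatever "corrupted"
ends up meaning; the function names are invented for the example.

```python
import sqlite3
from pathlib import Path


def leasedb_is_usable(db_path):
    """Return True if the file exists and sqlite reports it as intact."""
    path = Path(db_path)
    if not path.is_file():
        return False  # missing, e.g. first start after an upgrade
    try:
        conn = sqlite3.connect(str(path))
        try:
            row = conn.execute("PRAGMA integrity_check").fetchone()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        return False  # not a readable sqlite database
    return row is not None and row[0] == "ok"


def should_run_accounting_crawler(db_path, operator_requested=False):
    """Run only on explicit request or when the leasedb must be rebuilt."""
    return operator_requested or not leasedb_is_usable(db_path)
```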
This is part of an "overarching ticket" to eliminate most uses of the
crawler: ticket #1834.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1835>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage