[tahoe-dev] [tahoe-lafs] #633: lease-expiring share crawler
tahoe-lafs
trac at allmydata.org
Wed Feb 18 13:21:15 PST 2009
#633: lease-expiring share crawler
--------------------------+-------------------------------------------------
Reporter: warner | Owner: warner
Type: task | Status: new
Priority: major | Milestone: 1.4.0
Component: code-storage | Version: 1.3.0
Keywords: | Launchpad_bug:
--------------------------+-------------------------------------------------
Comment(by warner):
Hm, yeah, there are a number of optimizations that can take advantage of the
fact that we're allowed to delete shares late. You can think of this as
another factor in the tradeoff diagram I just attached to this ticket: with
marginally increased complexity, we can reduce the CPU/disk-IO costs by
increasing the lease expiration time.
For example, we don't need to maintain an exact sorted order: if the leases
on A and B both don't expire for a month, we don't care (right now) whether
A comes first or B does; we can put off that sort for a couple of weeks.
Likewise, we don't care about timestamp resolution smaller than a day.
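A minimal sketch of that coarse ordering, assuming plain
(storage_index, expiration_time) pairs (names are hypothetical, not actual
Tahoe code): bucket leases by expiration day, skip any sort entirely, and
only look at buckets whose day has fully passed, which is fine precisely
because we're allowed to expire late.

from collections import defaultdict

DAY = 24 * 60 * 60  # one-day timestamp resolution is all we need

def bucket_by_day(leases):
    # leases: iterable of (storage_index, expiration_time) pairs.
    # No exact sort: everything expiring on the same day lands in the
    # same bucket, in arbitrary order.
    buckets = defaultdict(list)
    for storage_index, expiration_time in leases:
        buckets[expiration_time // DAY].append(storage_index)
    return buckets

def candidates(buckets, now):
    # Only buckets from fully-elapsed days are returned, so a lease is
    # acted on up to a day late -- which is explicitly allowed.
    today = now // DAY
    return [si for day, sis in buckets.items() if day < today
            for si in sis]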
I definitely like having the share contain the canonical lease information,
and using the ancillary data structures merely as a cache. If we were to go
with a traditional database (sqlite or the like), then I'd have the DB
contain a table with (storageindex, leasedata, expirationtime), with an
index on both storageindex and expirationtime, and the daily or hourly query
would then be "SELECT storageindex FROM table WHERE expirationtime < now".
We'd read the real lease data from the share before acting upon it (which
incurs an IO cost, but share expiration is relatively infrequent, and the
safety benefits are well worth it).
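For concreteness, here's what that might look like with Python's sqlite3
module (the table and query are as described above; the surrounding
harness is just an assumed sketch):

import sqlite3, time

db = sqlite3.connect("leases.db")
db.execute("CREATE TABLE IF NOT EXISTS leases"
           " (storageindex TEXT, leasedata BLOB, expirationtime INTEGER)")
db.execute("CREATE INDEX IF NOT EXISTS leases_si ON leases (storageindex)")
db.execute("CREATE INDEX IF NOT EXISTS leases_exp ON leases (expirationtime)")

def expired_candidates(now=None):
    # The daily/hourly query. These are candidates only: the share file
    # holds the canonical lease, so re-read it before deleting anything.
    if now is None:
        now = int(time.time())
    rows = db.execute("SELECT storageindex FROM leases"
                      " WHERE expirationtime < ?", (now,))
    return [si for (si,) in rows]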
Given the large number of shares we're talking about (a few million per
server), I'm hesitant to create a persistent data structure that needs one
file per share. The shares themselves are already wasting GBs of space on
minimum-block-size overhead. Mind you, ext3 is pretty good about zero-length
files: a quick test shows that it spends one 4kB block for each 113 files
(each named with the same length as one of our storage-index strings, 26
bytes, which means ext3's per-file overhead is an impressively small 10.25
bytes), so a million would take about 36MB. Not too bad.
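The arithmetic behind those numbers, for the record (the 113-files-per-block
figure is the measured one above):

BLOCK = 4096          # one ext3 block, shared by 113 zero-length files
FILES_PER_BLOCK = 113
NAME_LEN = 26         # bytes in a storage-index filename

per_file = BLOCK / FILES_PER_BLOCK  # ~36.25 bytes of metadata per file
overhead = per_file - NAME_LEN      # ~10.25 bytes beyond the name itself
total = 1000000 * per_file / 1e6    # ~36 MB for a million marker files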
Having a separate directory for each second would probably result in a
million directories, but a tree of expire-time directories (as you
described) that only goes down to the kilosecond might be reasonably sized.
It would still require a slow initial crawl to set up, though.
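Something like this for the kilosecond tree (the layout and names are my
assumption, not a spec): a month of pending expirations is ~2.6M seconds,
so kilosecond buckets need only ~2600 directories.

import os

def bucket_dir(base, expiration_time):
    # One directory per kilosecond (~17 minutes) of expiration time.
    return os.path.join(base, "%d" % (int(expiration_time) // 1000))

def add_marker(base, storage_index, expiration_time):
    d = bucket_dir(base, expiration_time)
    os.makedirs(d, exist_ok=True)
    # Zero-length marker file, named by storage index: the cheap ext3
    # case measured above.
    open(os.path.join(d, storage_index), "w").close()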
Incidentally, a slow-share-crawler could also be used to do local share
verification (slowly read and check hashes on all local shares, to discover
local disk failures before the filecap holder gets around to doing a
bandwidth-expensive remote verification), and even server-driven repair
(ask other servers if they have other shares for this file, and perform
ciphertext repair if it looks like the file needs it). Hm, note to self:
server-driven repair should create new shares with the same lease
expiration time as the original shares, so that it doesn't cause a garbage
file to live forever like some infectious epidemic.
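E.g., something along these lines when the repairer writes the replacement
share (a hedged sketch; "same expiration as the originals" could also
reasonably mean the minimum, or a per-share copy):

def repaired_share_expiration(original_expirations):
    # Give the new share the same expiration as the originals (here:
    # the latest surviving one) rather than a fresh full-length lease,
    # so server-driven repair can't make a garbage file immortal.
    return max(original_expirations)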
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/633#comment:2>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid