[tahoe-dev] [tahoe-lafs] #700: have servers publish Bloom filter of which shares they have

Thu Dec 24 12:48:48 PST 2009

#700: have servers publish Bloom filter of which shares they have
--------------------------------+-------------------------------------------
 Reporter:  warner              |           Owner:           
     Type:  enhancement         |          Status:  new      
 Priority:  major               |       Milestone:  undecided
Component:  code-storage        |         Version:  1.4.1    
 Keywords:  performance repair  |   Launchpad_bug:           
--------------------------------+-------------------------------------------

Comment(by warner):

 One approach we've discussed a lot is for an occasionally-connected client
 (who holds a rootcap) to generate a list of currently-active storage-index
 values (known in the code as a "manifest"), and deliver it to an
 always-connected maintenance node, like a
 checker/verifier/repairer/lease-renewer agent. The agent would take on
 responsibility for the files indicated by the manifest: updating their
 leases, repairing as necessary. So the "pregenerated manifest of files" is
 actually a pretty reasonable thing to use.

 There are actually two directions in which a Bloom filter might get used.
 We've discussed the first, in which the server generates the filter and
 publishes it for use by clients: this is most useful to accelerate the
 DYHB ("Do You Have Block") query, which is used at the start of a
 download, and is the only query sent by the Checker. For download, we can
 tolerate the false-positive rate of the Bloom filter because we're going
 to ask more questions later (like fetching the actual share data), so
 false positives merely cause a minor performance hit. For the Checker, we
 have to be more conscious of the percentages, because a false positive
 makes it count a share that isn't there, which overstates file health.
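
 As a rough illustration of this first direction, here is a minimal Bloom
 filter sketch keyed on storage-index values. Everything in it is an
 assumption made for illustration, not Tahoe code: the BloomFilter class,
 the SHA-256 double-hashing scheme, and the sizing parameters are all
 hypothetical. Sizing uses the standard formulas m = -n*ln(p)/(ln 2)^2 bits
 and k = (m/n)*ln 2 hash functions for n items at target false-positive
 rate p.

```python
import hashlib
import math


class BloomFilter:
    """Hypothetical filter a server might publish; not an existing Tahoe API."""

    def __init__(self, expected_items, fp_rate):
        # Standard sizing: m bits and k hash functions for the target rate.
        self.m = max(8, math.ceil(-expected_items * math.log(fp_rate)
                                  / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / expected_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


# Server side: build the filter from the storage indexes it holds.
# These storage-index values are made up for the example.
held = [b"storage-index-%d" % i for i in range(1000)]
server_filter = BloomFilter(expected_items=len(held), fp_rate=0.01)
for si in held:
    server_filter.add(si)

# Client side: skip the DYHB round trip when the filter says "definitely not".
def might_have_share(bloom, storage_index):
    return storage_index in bloom
```

 A "no" from the filter is always correct, so a downloader can skip that
 server outright. A "yes" is only probable, so a Checker that trusted it
 without a follow-up query would occasionally report a share that does not
 exist.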

 (Incidentally, we should take a step back and think about what sorts of
 failures we're anticipating here. Our current servers don't just delete
 shares on a whim, and we believe that disk errors tend to take out the
 whole disk instead of taking out individual shares, so it seems unlikely
 that these DYHB queries will ever return different answers from one day to
 the next, and the Checker is far more likely to experience a whole server
 going offline than an individual share disappearing. I've been looking for
 an excuse to use a Bloom filter for years now, but I shouldn't let that
 desire push me into wasting time on building something that won't actually
 be of much use.)

 The second direction for using a Bloom filter would be to take the
 client's manifest and send it to the storage server, saying "please do
 something with any shares that match this list". This wouldn't be useful
 for a checker, but it could be used by a slow lease-updater process (one
 in which a share-crawler had a list of outstanding per-account Bloom
 filters, with instructions to add/renew a lease on anything that matched).
 OTOH, it would probably be easier to have a share-to-account (one-to-many)
 mapping table on each server, and have the client renew a "lease" on the
 account in general. (This is the scheme that we've discussed before, in
 which each client sends one message per storage server per renewal period,
 instead of one per (SS*share*period), which would be an awful lot of
 messages.)
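
 To make the account-lease alternative concrete, here is a small sketch of
 what a per-server share-to-account table could look like, plus the message
 arithmetic. The AccountLeaseTable class, its method names, and the numbers
 in the example are all hypothetical; nothing here reflects an existing
 storage-server interface.

```python
import time


class AccountLeaseTable:
    """Hypothetical per-server table: each share maps to many accounts."""

    def __init__(self, lease_duration):
        self.lease_duration = lease_duration
        self.share_accounts = {}   # storage index -> set of account ids
        self.account_expiry = {}   # account id -> lease expiry time

    def add_share(self, storage_index, account):
        self.share_accounts.setdefault(storage_index, set()).add(account)

    def renew_account(self, account, now=None):
        # One message renews the lease on every share the account holds.
        now = time.time() if now is None else now
        self.account_expiry[account] = now + self.lease_duration

    def share_is_live(self, storage_index, now=None):
        # A share stays alive while any account holding it is unexpired.
        now = time.time() if now is None else now
        accounts = self.share_accounts.get(storage_index, ())
        return any(self.account_expiry.get(a, 0) > now for a in accounts)


def messages_per_share(servers, shares, periods):
    return servers * shares * periods


def messages_per_account(servers, periods):
    return servers * periods


# Made-up sizes: 10 servers, 1000 shares, 12 renewal periods.
print(messages_per_share(10, 1000, 12))   # 120000
print(messages_per_account(10, 12))       # 120
```

 With these made-up numbers the per-account scheme sends 1000 times fewer
 renewal messages, and the saving grows linearly with the number of shares
 the client cares about.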

-- 
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/700#comment:6>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid