[tahoe-lafs-trac-stream] [tahoe-lafs] #600: storage: maybe store buckets as files, not directories

tahoe-lafs trac at tahoe-lafs.org
Wed Oct 2 01:23:15 UTC 2013


#600: storage: maybe store buckets as files, not directories
------------------------------+-------------------------------------------------
     Reporter:  warner        |      Owner:  warner
         Type:  enhancement   |     Status:  new
     Priority:  minor         |  Milestone:  undecided
    Component:  code-storage  |    Version:  1.2.0
   Resolution:                |   Keywords:  storage disk-backend performance
Launchpad Bug:                |  migration crawlers
------------------------------+-------------------------------------------------
Changes (by warner):

 * keywords:
     storage disk-backend performance migration crawlers
     brians-opinion-needed
     => storage disk-backend performance migration crawlers


Comment:

 Hm. Yeah, buckets are a thing of the past, and lease information
 wants to be per-share, not per-anything-larger. Likewise any
 metadata we might add in the future should be per-share too.

 The real question is: how should the on-disk storage backend
 organize its pieces? If we rely upon the leasedb to satisfy
 "do-you-have-share" queries (which I think is good), then we don't
 need to query the disk each time. The crawler still needs to walk
 the disk, but that can be relatively slow, since it only happens
 in the background.
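
 To make that split concrete, here is a minimal sketch of a
 leasedb-style lookup that answers "do-you-have-share" without
 touching the share files. It assumes a SQLite table named `shares`
 with `storage_index` and `shnum` columns; that schema and the helper
 names (open_leasedb, add_share, have_shares) are hypothetical, not
 the actual leasedb design.

```python
import sqlite3


def open_leasedb(path=":memory:"):
    """Open (or create) a toy leasedb. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shares ("
        " storage_index TEXT NOT NULL,"
        " shnum INTEGER NOT NULL,"
        " PRIMARY KEY (storage_index, shnum))"
    )
    return conn


def add_share(conn, si, shnum):
    """Record that we hold share `shnum` of storage index `si`."""
    conn.execute("INSERT OR IGNORE INTO shares VALUES (?, ?)", (si, shnum))
    conn.commit()


def have_shares(conn, si):
    """Answer "do-you-have-share" from the database, not the disk."""
    rows = conn.execute(
        "SELECT shnum FROM shares WHERE storage_index = ? ORDER BY shnum",
        (si,),
    )
    return [shnum for (shnum,) in rows]
```

 The primary key on (storage_index, shnum) means the lookup is an
 index probe rather than a directory listing, which is the property
 that lets the disk layout be chosen for other reasons.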

 Removing per-bucket subdirectories will probably slow down the
 on-disk "do we know anything about this SI" query, because it
 basically turns into a large readdir() and a grep through the
 results (looking for a prefix-match on the SI). For our nominal
 1M-share server, each of the 1024 prefix-directories contains
 about 1k shares, and an ideal one-share-per-server encoding will
 result in listing a 1k-entry directory for each query.
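
 As a rough sketch of what that readdir()-plus-grep looks like, the
 snippet below assumes a flat layout where each share is a file named
 `<SI>.<shnum>` directly inside its two-character prefix directory.
 That naming scheme, the function name find_shares, and the short
 made-up storage indexes in demo() are all hypothetical; no on-disk
 format has been decided here.

```python
import os
import tempfile

PREFIX_LEN = 2  # two base32 characters -> 32**2 = 1024 prefix directories


def find_shares(shares_root, si):
    """Return sorted share numbers held for storage index `si`.

    Assumes the hypothetical layout shares_root/<prefix>/<SI>.<shnum>.
    """
    prefix_dir = os.path.join(shares_root, si[:PREFIX_LEN])
    try:
        names = os.listdir(prefix_dir)  # the large readdir()
    except FileNotFoundError:
        return []
    wanted = si + "."
    found = []
    for name in names:  # the "grep": prefix-match each entry on the SI
        if name.startswith(wanted):
            suffix = name[len(wanted):]
            if suffix.isdigit():
                found.append(int(suffix))
    return sorted(found)


def demo():
    """Build a tiny share tree in a temp dir and query it."""
    with tempfile.TemporaryDirectory() as root:
        for si, shnum in [("aabbccdd", 0), ("aabbccdd", 3), ("aaxxyyzz", 1)]:
            d = os.path.join(root, si[:PREFIX_LEN])
            os.makedirs(d, exist_ok=True)
            open(os.path.join(d, "%s.%d" % (si, shnum)), "wb").close()
        return find_shares(root, "aabbccdd"), find_shares(root, "zzzzzzzz")
```

 With 1M shares spread over 32**2 = 1024 prefix directories, each
 os.listdir() call returns roughly 1_000_000 // 1024 = 976 names to
 scan, and every one of them is compared against the SI.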

 If people are doing crazy encodings that put lots of shares on each
 server, we'll incur a larger lookup cost.

 So yeah, I think I'm +1 on changing the on-disk format to get rid
 of the bucket directories. It should probably be driven by the
 pluggable-backend-storage changes y'all (LAE) are making, though:
 what would fit best with the scheme you've put together?

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/600#comment:4>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage

