Opened at 2012-10-30T23:12:32Z
Last modified at 2020-10-30T12:35:44Z
#1836 closed defect
use leasedb (not crawler) to figure out how many shares you have and how many bytes — at Version 10
Reported by: | zooko | Owned by: | davidsarah |
---|---|---|---|
Priority: | normal | Milestone: | 1.15.0 |
Component: | code-storage | Version: | 1.9.2 |
Keywords: | leases garbage-collection test-needed accounting | Cc: | |
Launchpad Bug: |
Description (last modified by zooko)
In current trunk, there is a "BucketCountingCrawler" whose job it is to count up how many shares are stored.
I propose that this be replaced by using the leasedb to count files (a simple SQL COUNT query!), and at the same time to extend the storage server's abilities by letting it be able to add up the aggregate sizes of things as well as their number.
This is part of an "overarching ticket" to eliminate most uses of crawler — ticket #1834.
Change History (10)
comment:1 Changed at 2012-10-30T23:12:45Z by zooko
- Description modified (diff)
comment:2 Changed at 2012-10-30T23:14:24Z by zooko
comment:3 Changed at 2012-10-31T00:09:16Z by davidsarah
- Owner set to davidsarah
- Status changed from new to assigned
+1.
comment:4 Changed at 2012-10-31T10:08:04Z by zooko
- Summary changed from stop crawling share files in order to figure out how many shares you have to use leasedb (not crawler) to figure out how many shares you have and how many bytes
comment:5 Changed at 2012-11-09T06:51:58Z by zooko
Using leasedb this way would facilitate solving #671 — bring back sizelimit (i.e. max consumed, not min free).
comment:6 Changed at 2012-11-21T00:49:35Z by zooko
- Description modified (diff)
comment:7 Changed at 2012-12-14T20:24:43Z by zooko
Using leasedb this way would facilitate solving #940.
comment:8 Changed at 2012-12-15T00:59:41Z by davidsarah
The most basic form of the 'total used space' query is
SELECT SUM(`used_space`) FROM `shares`
How much account-specific information should we add? At the moment, there are only two accounts -- anonymous and starter -- but that is already enough to introduce the complication that more than one account can hold a lease on the same share, so the query above is not equivalent to
SELECT SUM(`used_space`) FROM `shares` s JOIN `leases` l ON (s.`storage_index` = l.`storage_index` AND s.`shnum` = l.`shnum`)
since that can count space for a share more than once.
comment:9 Changed at 2012-12-15T01:09:38Z by davidsarah
This query solves the above problem, giving the total number of leased shares and the total space used by leased shares:
SELECT COUNT(*), SUM(`used_space`) FROM (SELECT `used_space` FROM `shares` s JOIN `leases` l ON (s.`storage_index` = l.`storage_index` AND s.`shnum` = l.`shnum`) GROUP BY s.`storage_index`, s.`shnum`)
(Any WHERE clause can be added to the inner SELECT to pick leases that satisfy certain criteria.)
And this gives the number of shares and total used space leased by each account, sorted beginning with the one that is using most space:
SELECT `account_id`, COUNT(*), SUM(`used_space`) FROM `leases` l LEFT JOIN `shares` s ON (l.`storage_index` = s.`storage_index` AND l.`shnum` = s.`shnum`) GROUP BY `account_id` ORDER BY SUM(`used_space`) DESC
comment:10 Changed at 2013-07-04T16:23:57Z by zooko
- Description modified (diff)
The part about reporting total space usage would be very useful for customers of LeastAuthority.com (who pay per byte), among others.