| 1 | |
| 2 | (This was copied from a LeastAuthority wiki page summarizing the steps needed, and the motivation, to get the cloud-backend work into master. It relates mostly to the S4 service, but is fairly general.) |
| 3 | |
| 4 | # background |
| 5 | |
| 6 | We wish to get the 2237-cloud-backend branch onto master. The |
| 7 | cloud-backend branch was built off of a minimal Accounting prototype |
| 8 | (warner/accounting-2) so that the new "lease-db" could have somewhere |
| 9 | to hang. |
| 10 | |
| 11 | ## currently |
| 12 | |
| 13 | As far as leases and accounting go, 2237 / accounting-3 have the |
| 14 | following design: |
| 15 | |
| 16 | - The Accountant holds accounts. There are just 2 accounts and no way |
| 17 | (yet) to create or manage them: |
| 18 | |
| 19 | - "starter" account |
| 20 | - "anonymous" account |
| 21 | |
| 22 | - an Account object now implements RIStorageServer (formerly |
| 23 | implemented by StorageServer). So from a client perspective, |
| 24 | nothing changes: they contact a fURL that implements the |
| 25 | RIStorageServer API. During client setup, that fURL is now pointed |
| 26 | at the anonymous Account instance (instead of the StorageServer |
| 27 | instance). |
| 28 | |
| 29 | - leases are stored in a local sqlite database |
| 30 | - new "starter" leases are created for anything which lacks a lease |
| 31 | - all the code that reads/writes leases to the shares themselves is gone |
| 32 | |
| 33 | - the Accountant and Account objects have access to the leasedb |
| 34 | - the Account object manages leases |
| 35 | - an AccountingCrawler replaces the LeaseCheckingCrawler. This new crawler will: |
| 36 | - Remove leases that are past their expiration time. |
| 37 | - Delete objects containing unleased shares. |
| 38 | - Discover shares that have been manually added to storage. |
| 39 | - Discover shares that are present when a storage server is upgraded from |
| 40 | a pre-leasedb version, and give them "starter leases". |
| 41 | - Recover from a situation where the leasedb is lost or detectably |
| 42 | corrupted. This is handled in the same way as upgrading. |
| 43 | - Detect shares that have unexpectedly disappeared from storage. |
| 44 | |
| 45 | ## problems |
| 46 | |
| 47 | |
| 48 | There are a few problems with this: |
| 49 | |
| 50 | ### database durability, ops burden |
| 51 | |
| 52 | - ultimately, cloud-backend uses "not local disk" for storage |
| 53 | - ...but the leasedb, which is "a thing that should be backed up", isn't |
| 54 | stored in the "not local disk" storage. That is, if we're using |
| 55 | S3 for shares, it would be best to have the lease-db in S3 too (or in an AWS |
| 56 | database) |
| 57 | |
| 58 | - this is "okay" for now, because the lease-db is built to recover |
| 59 | from "zero leases". Basically: |
| 60 | - if there's no lease for a share, add a "starter" one |
| 61 | - eventually (after the default-30-days expiry) we will either |
| 62 | learn which clients care about that share (because they renewed |
| 63 | their leases) or the starter lease expires (and we delete the |
| 64 | share) |
| 65 | - ...but this means we can't use the lease-db to definitively answer |
| 66 | the question "how much space is Alice using" if our lease-db is |
| 67 | younger than "default-expiry-time". |
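
The recovery rule above can be sketched roughly as follows. This is only an illustration, not the AccountingCrawler code: the dict-based lease table, `add_starter_leases` and `usage_is_authoritative` are hypothetical names, and the 30-day figure is the default expiry mentioned above.

```python
# Hypothetical sketch of "recover from zero leases"; not the real crawler.
DEFAULT_EXPIRY = 30 * 24 * 60 * 60  # default lease duration: 30 days, in seconds


def add_starter_leases(shares_on_disk, leases, now):
    """Give every share that has no lease a "starter" lease.

    `leases` maps a share id to a list of (account, expiry_time) tuples.
    Returns the ids of the shares that received a starter lease.
    """
    added = []
    for share in shares_on_disk:
        if not leases.get(share):
            leases[share] = [("starter", now + DEFAULT_EXPIRY)]
            added.append(share)
    return added


def usage_is_authoritative(db_created, now):
    """True once the lease-db is at least one default-expiry period old.

    Before that, some shares may only be covered by starter leases, so
    "how much space is Alice using" cannot be answered for certain.
    """
    return now - db_created >= DEFAULT_EXPIRY
```

Until `usage_is_authoritative` would return true, any per-account usage figure derived from the lease-db is a lower bound at best.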
| 68 | |
| 69 | ### non-async APIs |
| 70 | |
| 71 | - the current LeaseDB API is synchronous. This is "sort of fine if |
| 72 | you squint" for a local sqlite database (although still not |
| 73 | correct, because a database read can take an arbitrary amount of |
| 74 | time). Ideally the LeaseDB API should be async. |
| 75 | - e.g. by using twisted.enterprise.adbapi (or similar "general-purpose |
| 76 | Twisted database API" -- is there a better one?) |
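
To show the shape an async LeaseDB API might take, here is a minimal sketch. It deliberately swaps in the standard library (`asyncio.to_thread` plus `sqlite3`) as a stand-in for adbapi's "run the blocking query in a thread, hand back a future" approach, so that it runs without Twisted. The class name, method names and table schema are all assumptions, not the real LeaseDB.

```python
import asyncio
import sqlite3


class AsyncLeaseDB:
    """Hypothetical async wrapper around a sqlite lease table."""

    def __init__(self, path):
        self._path = path

    def _run(self, sql, args=()):
        # Blocking helper. It runs entirely in a worker thread and opens its
        # own connection, so no sqlite3 handle is shared across threads.
        conn = sqlite3.connect(self._path)
        try:
            with conn:  # commit on success, roll back on error
                return conn.execute(sql, args).fetchall()
        finally:
            conn.close()

    async def create_schema(self):
        await asyncio.to_thread(
            self._run,
            "CREATE TABLE IF NOT EXISTS leases "
            "(storage_index TEXT, shnum INTEGER, account TEXT, expires REAL)",
        )

    async def add_lease(self, storage_index, shnum, account, expires):
        await asyncio.to_thread(
            self._run,
            "INSERT INTO leases VALUES (?, ?, ?, ?)",
            (storage_index, shnum, account, expires),
        )

    async def get_leases(self, storage_index):
        return await asyncio.to_thread(
            self._run,
            "SELECT shnum, account, expires FROM leases "
            "WHERE storage_index = ? ORDER BY shnum",
            (storage_index,),
        )
```

The point is only that callers `await` every database operation, so a slow read never blocks the event loop. A Twisted version would return Deferreds instead of coroutines but would look much the same to callers.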
| 77 | |
| 78 | |
| 79 | ### "database as cache" |
| 80 | |
| 81 | - currently, the database is completely throw-away |
| 82 | - that may limit future designs (i.e. we can't put anything |
| 83 | "permanent" in the leasedb) |
| 84 | - is this a problem? (if so, is it a problem we *can't* easily fix |
| 85 | later? i.e. if and when we want to add a feature that needs durable |
| 86 | lease-db data?) |
| 87 | - I *think* we decided in last Nuts&Bolts that treating the database |
| 88 | as "mostly disposable" is okay |
| 89 | |
| 90 | |
| 91 | ## the future |
| 92 | |
| 93 | ### Remote API Design |
| 94 | |
| 95 | - obviously, to support "not yet upgraded" clients, the |
| 96 | "anonymous-storage-FURL" API can't change. That is, it must |
| 97 | implement RIStorageServer. |
| 98 | - but maybe having Account directly implement that isn't great. |
| 99 | - Consider this: |
| 100 | |
| 101 | - we want introducers to go away |
| 102 | - thus, "tahoe storage servers" need to stay (as "the" smart thing) |
| 103 | - what if we call these "tahoe servers" instead, and they provide services |
| 104 | - one of those services is "storage" |
| 105 | - (another service might be e.g. a "membrane" that provides |
| 106 | temporary access to a read-cap) |
| 107 | - (another might be a payment API of some kind, to pay for "storage" or other services) |
| 108 | |
| 109 | - ...so I think a better API might be this: |
| 110 | |
| 111 | - Account just provides a "services" API |
| 112 | - "storage" is one of those services (the only one we provide right now) |
| 113 | - ...and "storage" implements RIStorageServer |
| 114 | |
| 115 | - not much changes, except the shape of the code: during client |
| 116 | setup, we get the "anonymous-storage-FURL" from the "storage" |
| 117 | service of the anonymous Account (instead of it just *being* the |
| 118 | Account directly). |
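
A rough sketch of that shape, under stated assumptions: `StorageService`, `get_service`, `list_services` and the `register` callable are hypothetical names for illustration only. In real code the storage object would implement RIStorageServer and be registered with the Foolscap Tub to obtain the fURL.

```python
class StorageService:
    """Placeholder for the object that would implement RIStorageServer."""

    def __init__(self, account_name):
        self.account_name = account_name


class Account:
    """Hypothetical Account that exposes services instead of being one."""

    def __init__(self, name):
        self.name = name
        # "storage" is the only service provided right now.
        self._services = {"storage": StorageService(name)}

    def list_services(self):
        return sorted(self._services)

    def get_service(self, service_name):
        try:
            return self._services[service_name]
        except KeyError:
            raise KeyError(
                f"account {self.name!r} has no {service_name!r} service"
            ) from None


def anonymous_storage_furl(anonymous_account, register):
    # `register` stands in for whatever turns an object into a fURL.
    # The fURL now points at the account's storage service rather than
    # at the Account itself.
    return register(anonymous_account.get_service("storage"))
```

Old clients would see no difference, since the object behind the fURL still speaks RIStorageServer. Later services (a membrane, payments) would become extra entries in the services table.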
| 119 | |
| 120 | |
| 121 | ### Backing up the Database |
| 122 | |
| 123 | - one thing suggested was to just periodically (e.g. every hour) back |
| 124 | up the sqlite database to "whatever storage the backend is |
| 125 | using". That is, a "storage backend" has an API to backup (and |
| 126 | restore) an sqlite file. |
| 127 | - then we can "mostly" still answer the "how much space is Alice using" |
| 128 | question (except for the possibility that shares were added by Alice |
| 129 | after the last database backup) |
| 130 | - ...but you get fast, local queries most of the time for other things |
| 131 | - (I still think we should make the LeaseDB API async even if we're |
| 132 | "always" using sqlite) |
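
A minimal sketch of the backup/restore idea, assuming a backend that exposes simple `put`/`get` blob operations. `InMemoryBackend`, `backup_leasedb`, `restore_leasedb` and the `leasedb-backup` key are hypothetical, not the cloud-backend API. The snapshot uses the standard library's `sqlite3.Connection.backup`, which copies a consistent image even while the database is in use.

```python
import os
import sqlite3
import tempfile


class InMemoryBackend:
    """Hypothetical stand-in for a cloud storage backend (e.g. an S3 bucket)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


def backup_leasedb(db_path, backend, key="leasedb-backup"):
    """Snapshot the sqlite file and store the bytes in the backend."""
    fd, snapshot_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        src = sqlite3.connect(db_path)
        dst = sqlite3.connect(snapshot_path)
        try:
            src.backup(dst)  # consistent copy, safe against concurrent writes
        finally:
            dst.close()
            src.close()
        with open(snapshot_path, "rb") as f:
            backend.put(key, f.read())
    finally:
        os.remove(snapshot_path)


def restore_leasedb(db_path, backend, key="leasedb-backup"):
    """Replace the local sqlite file with the last stored snapshot."""
    with open(db_path, "wb") as f:
        f.write(backend.get(key))
```

Running `backup_leasedb` from a periodic timer (e.g. hourly) would bound how much lease data a lost disk could cost. Anything newer than the last snapshot would still fall back to the starter-lease recovery path.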
| 133 | |