| 1 | |
| 2 | Quota management is about protecting the Storage Servers. Given a limited |
| 3 | amount of storage space, how should it be allocated between the users? The |
| 4 | admin of a server may give out space in exchange for money, for other storage |
| 5 | space, or out of friendship, but many use cases work better if the admin can |
| 6 | track and control how much space any given user can consume. |
| 7 | |
| 8 | There are roughly three different approaches to this: secret-based, |
| 9 | private-key-based, and FURL-based. From certain points of view, these are all |
| 10 | equivalent: a private key is like a remote capability that can be used to |
| 11 | create attenuated capabilities on demand, which is kind of like an asymmetric |
| 12 | form of hash derivation. The differences consist of tradeoffs between CPU |
| 13 | usage, storage space, and online message exchanges: it is possible to reduce |
| 14 | the dependence upon a central server being available by using signed messages |
| 15 | and certificates that can be generated and verified offline, at the cost of |
| 16 | CPU time and complexity. Likewise the online check can be replaced by a big |
| 17 | table of shared secrets, at the cost of storage space. |
| 18 | |
| 19 | All of these schemes use the same core design: each share kept on a Storage |
| 20 | Server is associated with a list of leases. Each lease has an "account |
| 21 | number" that refers to some particular user/account that we might like to |
| 22 | track. The storage servers have the ability to sum the size of all their |
| 23 | shares by account number, to report that account number 4 is consuming 2GB. |
| 24 | By adding these reports across all storage servers, we discover that account |
| 25 | 4 is using a total of 10GB. Some other mechanism is used to give a name |
| 26 | (Alice) to account 4. This mechanism might also be able to instruct the |
| 27 | storage servers to stop accepting new leases for that account until it |
| 28 | returns below-quota. |
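The per-account summing described above can be sketched as follows. This is
only an illustration, not the storage server's actual code: the `Lease`,
`usage_by_account`, and `grid_usage` names and the in-memory layout are
assumptions, and each lease is simply charged the full size of its share.

```python
from collections import Counter
from dataclasses import dataclass

GB = 10**9

@dataclass(frozen=True)
class Lease:
    account: int      # account number recorded in the lease
    share_size: int   # size in bytes of the share the lease is attached to

def usage_by_account(leases):
    """Sum share sizes per account number on one storage server."""
    totals = Counter()
    for lease in leases:
        totals[lease.account] += lease.share_size
    return totals

def grid_usage(per_server_reports):
    """Add the per-server reports together to get grid-wide usage."""
    grid = Counter()
    for report in per_server_reports:
        grid.update(report)   # Counter.update adds counts, it does not replace
    return grid

# Five hypothetical servers, each holding 2GB of shares leased by account 4
# and 1GB leased by account 7.
servers = [[Lease(4, GB), Lease(4, GB), Lease(7, GB)] for _ in range(5)]
reports = [usage_by_account(leases) for leases in servers]
total = grid_usage(reports)
```

Here each server reports 2GB for account 4, and the grid-wide sum for that
account is 10GB.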
| 29 | |
| 30 | The schemes differ in the way that we decide which account number to put in |
| 31 | the lease. If accounts are in use, the client must be confined to use a |
| 32 | specific (authorized) account number: Bob should not be able to get leases |
| 33 | placed using Alice's account number, and Carol (who is not a valid user) |
| 34 | should not be able to place leases at all. |
| 35 | |
| 36 | The main design goals we seek to attain here are: |
| 37 | |
| 38 | * optional central control over storage space consumed ("quotas") and |
| 39 | ability to participate in the grid at all (storage authority). Grids |
| 40 | which do not wish to maintain this level of control are not required |
| 41 | to use Account Servers at all. Grids can also enable opt-in quota |
| 42 | checking, in which clients are trusted to provide their correct |
| 43 | account number, and to refrain from stealing each other's accounts. |
| 44 | * when an account is added, the user should be able to use the grid |
| 45 | immediately |
| 46 | * when an account is removed, the user should eventually be prohibited |
| 47 | from using the grid. It is ok if it takes a month for this revocation |
| 48 | to take effect. |
| 49 | * most functionality should continue uninterrupted even if a central Account |
| 50 | Server falls offline |
| 51 | |
| 52 | Secondary design goals (not all of which we can meet) are: |
| 53 | |
| 54 | * clients should be able to easily delegate their storage authority to |
| 55 | someone else, like a trusted Helper or a Repair agent. These two agents |
| 56 | may need to create leases in the user's name. |
| 57 | * clients should be able to delegate storage authority for servers they |
| 58 | haven't met yet. Specifically: |
| 59 | * the Repairer may be creating leases on new storage servers that were |
| 60 | added while the client was offline (otherwise the client would be |
| 61 | repairing its own files). |
| 62 | * one purpose of the Helper is to allow clients to be unaware of all |
| 63 | storage servers. If this is the case, the client won't know which |
| 64 | storage servers it should be delegating authority for. |
| 65 | * Storage authority may need to be passed through the web API as a query |
| 66 | parameter. The WUI (the human-facing side) may need a storage-authority |
| 67 | mechanism as well, perhaps through cookies. |
| 68 | * storage requirements should remain sensible. A large grid may have one |
| 69 | million accounts: the storage servers may need to record a shared secret |
| 70 | for each one, but it would be nicer if they didn't have to. |
| 71 | * it should be possible to delegate limited authority. It would be nice if |
| 72 | we could run Helpers on untrusted machines, but if the Helper gets to |
| 73 | consume the full quotas of all clients who use it, then it must be trusted |
| 74 | to not do that. If we could delegate just 2MB of storage authority, or |
| 75 | authority that expired after an hour, we could use more machines for these |
| 76 | services. |
| 77 | |
| 78 | My current plan is to pursue the "secret-based" approach described here. The |
| 79 | other approaches are summarized in subsequent sections. |
| 80 | |
| 81 | == Secret-based "storage authority" approach == |
| 82 | |
| 83 | In this scheme, each user has a master storage authority secret: just a |
| 84 | random string, either 128 or 256 bits long. They also have a unique (but |
| 85 | non-secret) "Account Number". In centrally-managed grids, these are both |
| 86 | created and stored by an Account Server, which uses sequential account |
| 87 | numbers. In friend-nets, there is no Account Server: account numbers are made |
| 88 | unique either by agreement or by making them large and random, and there is |
| 89 | more manual work involved to distribute the various secrets. |
| 90 | |
| 91 | Each time a client performs a storage operation, it does so under the |
| 92 | auspices of a specific storage authority. The Tahoe node on Alice's computer |
| 93 | is running solely for the benefit of Alice, so all operations it performs |
| 94 | will use Alice's storage authority (i.e. all leases that it creates will |
| 95 | include Alice's account number). On the other hand, a shared Tahoe node |
| 96 | accessed through its web-API port may be working for a variety of users. This |
| 97 | node must be given the storage authority to use for each operation. The means |
| 98 | to do this is still under investigation: adding an account= query argument is |
| 99 | one approach, passing the information through cookies is another, each with |
| 100 | its own advantages and drawbacks. |
| 101 | |
| 102 | Each time the client talks to a storage server, it computes a per |
| 103 | (account*SS) secret by hashing the master secret with the Storage Server's |
| 104 | nodeid (the "SSid"). It then prepends the account number, resulting in a |
| 105 | per-SS authority string that looks like 123-lmaypcuoh6c4l3icvvloo2656y. The |
| 106 | Storage Server has a function that takes this string and decides whether or |
| 107 | not the authority is valid, and if valid, which account number to use. |
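A minimal sketch of this derivation follows. The text above does not name the
hash function or the encoding, so several choices here are assumptions:
SHA-256 over the master secret followed by the SSid, truncation to 128 bits,
and lowercase unpadded base32 (which happens to give 26 characters, like the
example string). The `per_ss_authority` name is hypothetical.

```python
import base64
import hashlib

def per_ss_authority(account_number, master_secret, ssid):
    """Build the per-SS authority string '<account number>-<derived secret>'."""
    # Assumption: plain SHA-256 of secret+SSid, truncated to 128 bits.
    digest = hashlib.sha256(master_secret + ssid).digest()[:16]
    # Assumption: lowercase base32 without '=' padding.
    derived = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return f"{account_number}-{derived}"

master_secret = bytes(range(16))   # stand-in for a random 128-bit secret
authority_a = per_ss_authority(123, master_secret, b"ssid-of-server-A")
authority_b = per_ss_authority(123, master_secret, b"ssid-of-server-B")
```

Because the SSid is hashed in, the string given to one storage server cannot
be replayed against another.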
| 108 | |
| 109 | (the "123" account number is used as an index when communicating with the AS, |
| 110 | to avoid requiring a complete table of user-to-secret mappings. This might |
| 111 | not be the same account number that is used in the final lease, to allow the |
| 112 | creation of "temporary account numbers" that are attenuated in some way, like |
| 113 | a short validity period or limited to a certain number of bytes) |
| 114 | |
| 115 | For friend-nets, this "authority-is-valid" function is implemented by a |
| 116 | simple static table lookup. The storage server has a file named |
| 117 | NODE/private/valid-accounts, that contains one line per account. Each line |
| 118 | looks like "123-lmaypcuoh6c4l3icvvloo2656y 123 Alice", and contains the |
| 119 | authority string, followed by the account number to use, followed by the |
| 120 | account's nickname. The client node has a function that creates one of these |
| 121 | lines for every known storage server, and the user can send each line to the |
| 122 | SS's admin and ask them to add it to their valid-accounts file. By doing so, |
| 123 | the SS's admin is granting that user the right to claim storage on the |
| 124 | account named "Alice". |
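The static table lookup might look roughly like this. The line format is the
one given above; the function names and the decision to skip blank lines are
assumptions made for the sketch.

```python
def load_valid_accounts(text):
    """Parse valid-accounts lines: '<authority> <account number> <nickname>'."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue   # assumption: blank lines are ignored
        authority, account, nickname = line.split(None, 2)
        table[authority] = (int(account), nickname)
    return table

def authority_is_valid(table, authority):
    """Return the account number to use, or None if the authority is unknown."""
    entry = table.get(authority)
    return None if entry is None else entry[0]

table = load_valid_accounts("123-lmaypcuoh6c4l3icvvloo2656y 123 Alice\n")
alice = authority_is_valid(table, "123-lmaypcuoh6c4l3icvvloo2656y")
stranger = authority_is_valid(table, "456-notinthetable")
```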
| 125 | |
| 126 | For a centrally-managed grid, there is a special Account Server node which |
| 127 | manages these authorities. Each storage server is configured with a reference |
| 128 | to the AS by providing NODE/account-server.furl . The authority-is-valid |
| 129 | function works by telling the AS the (SSid,authority-string) pair for each |
| 130 | query. The AS responds with either a "no" or a "yes, account=123" answer. |
| 131 | |
| 132 | To reduce network traffic and improve tolerance to AS downtime, the SS |
| 133 | maintains a cache of positive responses. The cache entries are aged out after |
| 134 | one month. Negative responses are not cached. |
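The caching rule can be sketched as below. This is an illustration under
stated assumptions: "one month" is approximated as 30 days, the
`AuthorityCache` class is hypothetical, and the Account Server query is
modelled as a plain callable that returns an account number or None.

```python
import time

ONE_MONTH = 30 * 24 * 60 * 60   # seconds; assumption: a month is 30 days

class AuthorityCache:
    def __init__(self, ask_account_server, clock=time.time):
        self._ask = ask_account_server   # callable(authority) -> account or None
        self._clock = clock
        self._entries = {}               # authority -> (account, time cached)

    def lookup(self, authority):
        now = self._clock()
        cached = self._entries.get(authority)
        if cached is not None:
            account, cached_at = cached
            if now - cached_at < ONE_MONTH:
                return account           # positive answer still fresh
            del self._entries[authority]  # aged out
        account = self._ask(authority)
        if account is not None:          # only positive responses are cached
            self._entries[authority] = (account, now)
        return account

calls = []
def fake_account_server(authority):
    calls.append(authority)
    return 123 if authority == "123-lmaypcuoh6c4l3icvvloo2656y" else None

now = [0.0]
cache = AuthorityCache(fake_account_server, clock=lambda: now[0])
first = cache.lookup("123-lmaypcuoh6c4l3icvvloo2656y")    # asks the AS
second = cache.lookup("123-lmaypcuoh6c4l3icvvloo2656y")   # served from cache
bad1 = cache.lookup("999-bogus")                          # asks the AS
bad2 = cache.lookup("999-bogus")                          # asks the AS again
```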
| 135 | |
| 136 | Furthermore, the AS gets to pre-emptively manipulate this cache. When the SS |
| 137 | connects to the AS, it makes its "valid-accounts-manager" object available to |
| 138 | the AS, and this manager object gives the AS complete control over the SS's |
| 139 | valid-accounts table. |
| 140 | |
| 141 | A user starts by creating a new account, using some AS-specific mechanism. |
| 142 | (in the case of the allmydata.com commercial grid, this uses a PHP script |
| 143 | that also accepts credit-card payments). The AS records the new user's |
| 144 | storage authority in a table, which is used to answer the subsequent |
| 145 | "authority-is-valid" queries from storage servers. For each query, the AS |
| 146 | extracts the account number from the authority string, looks up the |
| 147 | corresponding table entry, hashes the secret it finds there with the SSid, |
| 148 | then compares the result to the authority string. |
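The Account Server's side of the check could be sketched like this. As with
any sketch of the derivation, the hash and encoding (SHA-256, 128-bit
truncation, lowercase base32) are assumptions rather than the specified
format, and the `derive_authority` and `as_check` names are hypothetical. The
comparison uses `hmac.compare_digest`, a constant-time comparison, which is a
choice made here and not something the text requires.

```python
import base64
import hashlib
import hmac

def derive_authority(account_number, master_secret, ssid):
    # Assumption: same derivation the client uses (hash secret with the SSid).
    digest = hashlib.sha256(master_secret + ssid).digest()[:16]
    derived = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return f"{account_number}-{derived}"

def as_check(accounts, ssid, authority):
    """Return the account number if the authority is valid for this SSid."""
    number_text, sep, _ = authority.partition("-")
    if not sep or not number_text.isdigit():
        return None                        # malformed string: answer "no"
    account_number = int(number_text)
    master_secret = accounts.get(account_number)
    if master_secret is None:
        return None                        # no such account: answer "no"
    expected = derive_authority(account_number, master_secret, ssid)
    if hmac.compare_digest(expected.encode(), authority.encode()):
        return account_number              # answer "yes, account=N"
    return None

accounts = {123: b"\x01" * 16}             # account number -> master secret
good = derive_authority(123, accounts[123], b"ssid-of-server-A")
yes = as_check(accounts, b"ssid-of-server-A", good)
wrong_server = as_check(accounts, b"ssid-of-server-B", good)
garbage = as_check(accounts, b"ssid-of-server-A", "not-an-authority")
```

Note that the AS only needs one table row per account: the per-SS strings are
recomputed on demand rather than stored.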
| 149 | |
| 150 | When the AS creates a new account, it also creates authority strings for all |
| 151 | current storage servers, and uses its valid-accounts-manager connections to |
| 152 | push these strings into the SS caches. This improves availability: even if |
| 153 | the AS fell over and died a moment later, the new user would still be able to |
| 154 | use their storage authority for a month without problems. |
| 155 | |
| 156 | If the AS deletes an account (because the user has stopped paying their |
| 157 | bills), the AS uses its valid-accounts-manager connection to delete the cache |
| 158 | entries for that account. This accomplishes fast revocation for all storage |
| 159 | servers that are currently online. Any SS that was offline at the time of |
| 160 | the account termination will continue to provide service for the rest of the |
| 161 | month. Other timings are possible: for example the SS might refresh its cache |
| 162 | after a day, but treat AS unavailability as meaning it should keep using the |
| 163 | previous answer. |
| 164 | |
| 165 | === creating attenuated authorities === |
| 166 | |
| 167 | Eventually we may want to take advantage of untrusted Helpers, by allowing |
| 168 | clients to create attenuated storage authority strings. The possessor of |
| 169 | these strings might be allowed to claim leases for a specific storage index, |
| 170 | or only for a certain number of bytes. The untrusted Helper might abuse this |
| 171 | authority, but the damage it can do is limited by the extra restrictions. |
| 172 | |
| 173 | Likewise, the Repairer might get a "repair-cap" which contains enough |
| 174 | information to download and verify the plaintext, and enough authority to |
| 175 | upload new shares in the name of the original uploader. This repair-cap could |
| 176 | contain an authority string which can only be used to create shares for the |
| 177 | specific storage index. It might also be restricted to creating shares that |
| 178 | contain a specific root hash, to prevent the repairer from using the |
| 179 | authority to store its own data in the same slot (on new storage servers). |
| 180 | |
| 181 | The shared-secret scheme is the least favorable for creating attenuated |
| 182 | authorities: it requires more work (and more network traffic) than DSA |
| 183 | private-key approaches. To provide this, the client contacts the AS and asks |
| 184 | it for an attenuated authority string: it specifies the conditions of use |
| 185 | (validity period, storage index restrictions, size limits, etc), and gets |
| 186 | back a new string and account number. The client uses this pair when |
| 187 | delegating authority to Helpers and Repairers, instead of their usual |
| 188 | (full-powered) pair. |
| 189 | |
| 190 | The Account Server adds an entry to its authority table with the new number |
| 191 | and string. When storage servers eventually come asking about the validity of |
| 192 | derived strings, the AS will find this entry, read out the restrictions, and |
| 193 | respond to the SS with the (restrictions, real account number) pair. (one |
| 194 | might think of the "real account number" as a special kind of restriction: |
| 195 | the string grants the authority to consume space in account 123, possibly |
| 196 | with other restrictions). |
| 197 | |
| 198 | The SS will enforce these restrictions. When the restriction involves total |
| 199 | storage space consumed, the SS will need to maintain a table that is indexed |
| 200 | by the authority string, counting bytes. This sort of restriction will be |
| 201 | much easier to manage if the authority includes a duration restriction, |
| 202 | because that will tend to limit the size of this table. |
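A rough sketch of such a byte-counting table follows. Everything about its
shape is an assumption: the `AttenuatedAuthorityTable` class, the entry
layout, and the rule that expired entries are dropped on the next claim. It
only shows why a duration restriction keeps the table small.

```python
import time

class AttenuatedAuthorityTable:
    def __init__(self, clock=time.time):
        self._clock = clock
        # authority -> [real account, byte limit, expiry time, bytes used]
        self._entries = {}

    def register(self, authority, real_account, max_bytes, expires_at):
        self._entries[authority] = [real_account, max_bytes, expires_at, 0]

    def claim(self, authority, nbytes):
        """Charge nbytes; return the real account number, or None if refused."""
        self._expire()
        entry = self._entries.get(authority)
        if entry is None:
            return None                   # unknown or expired authority
        real_account, max_bytes, _, used = entry
        if used + nbytes > max_bytes:
            return None                   # would exceed the byte restriction
        entry[3] = used + nbytes
        return real_account

    def _expire(self):
        now = self._clock()
        expired = [a for a, e in self._entries.items() if e[2] <= now]
        for authority in expired:
            del self._entries[authority]  # duration limits bound the table size

    def __len__(self):
        return len(self._entries)

now = [0]
table = AttenuatedAuthorityTable(clock=lambda: now[0])
# Hypothetical authority: 2MB of account 123's space, valid for one hour.
table.register("900-attenuated", 123, 2 * 10**6, expires_at=3600)
ok = table.claim("900-attenuated", 1_500_000)
too_much = table.claim("900-attenuated", 1_000_000)
```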
| 203 | |
| 204 | Clearly, the AS must be online and reachable to generate these attenuated |
| 205 | authorities. Likewise, either the AS must inform the SS about the strings and |
| 206 | restrictions at the time of creation (storage+network), or the AS must be |
| 207 | reachable by the SS when the strings are used (availability). A |
| 208 | private-key-based scheme would not suffer from this tradeoff. |
| 209 | |
| 210 | The need for attenuated authorities is not fully established at this point, |
| 211 | so the relative simplicity of the shared-secret approach remains appealing |
| 212 | (i.e. the fact that private-key makes attenuation easier is not a strong |
| 213 | motivation to go with DSA over shared-secret). Untrusted Helpers could |
| 214 | alternatively be required to establish leases under their own authority, in a |
| 215 | make-before-break handoff: the Helper uploads the shares and adds temporary |
| 216 | helper leases, the client is informed about the shares and establishes its |
| 217 | own leases, then the helper cancels its temporary leases. Likewise the |
| 218 | Repairer could maintain its own leases on behalf of offline clients, keeping |
| 219 | track of how much space it is consuming for whom, and reporting the quota |
| 220 | data to the accounting machinery (and perhaps simply refusing to repair |
| 221 | accounts which have gone over-quota). When the client comes back online and |
| 222 | syncs up with the Repairer, the client can establish its own leases on the |
| 223 | repairer-generated shares, allowing the Repairer to drop the temporary ones. |
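The make-before-break ordering can be illustrated with a toy model in which a
share's leases are just a set of account identifiers. The `handoff` function
and the identifiers are hypothetical; the point is only that the share is
never left without a lease during the handoff.

```python
def handoff(leases, helper, client):
    """Yield the lease set after each step of a make-before-break handoff."""
    leases = set(leases)
    leases.add(helper)        # 1. Helper uploads the share, adds a temporary lease
    yield frozenset(leases)
    leases.add(client)        # 2. client learns of the share, adds its own lease
    yield frozenset(leases)
    leases.discard(helper)    # 3. Helper cancels its temporary lease
    yield frozenset(leases)

steps = list(handoff(set(), helper="helper-account", client=123))
```

The same ordering applies to a Repairer holding leases on behalf of an
offline client.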
| 224 | |
| 225 | These temporary leases would add traffic but would remove the need to |
| 226 | delegate storage authority to the helper, removing some of the need for |
| 227 | attenuated authorities. |
| 228 | |
| 229 | Another conceivable use for attenuated authority would be to give "drop-box" |
| 230 | access to other users: provide a write-cap (or a hypothetical append-cap) to |
| 231 | a directory, along with enough storage authority to use it. This would allow |
| 232 | Alice to safely grant non-account-holders the ability to send her files. The |
| 233 | current accounting mechanisms we are developing do not allow this: |
| 234 | non-account-holders can read files, but not add new ones. |
| 235 | |
| 236 | |
| 237 | == DSA private-key-based "membership card" approach == |
| 238 | |
| 239 | In this scheme, each user has a DSA private key, and a "membership card" |
| 240 | (signed by a central Account Server) that declares that the associated pubkey |
| 241 | is an authorized member of the grid. Clients then use that key to sign |
| 242 | messages that are sent to the storage servers. The SS will verify the |
| 243 | signatures before accepting lease requests. Attenuated authority is |
| 244 | implemented by certificate chains: the client creates a new private/public |
| 245 | key pair, uses their main key to sign a cert declaring that the new key has |
| 246 | certain (limited) powers, then gives the new privkey and certificate chain to |
| 247 | the delegate. The delegate uses the new privkey to sign request messages. |
| 248 | |
| 249 | To handle revocation, the certificates need to have expiration dates, and get |
| 250 | replaced every once in a while. |
| 251 | |
| 252 | This approach maximizes offline-ness: the Account Server is only known by its |
| 253 | public key, and almost never needs to be spoken to directly. It also |
| 254 | minimizes storage requirements: everything can be computed from the |
| 255 | certificates that are passed around. PK operations are somewhat expensive, |
| 256 | but this can be mitigated through more protocol work: creating a |
| 257 | limited-purpose shared secret on-demand that effectively caches the PK |
| 258 | verification check. |
| 259 | |
| 260 | |
| 261 | == FURL-based "managed introducer" approach == |
| 262 | |