| | 1 | |
| | 2 | Quota management is about protecting the Storage Servers. Given a limited |
| | 3 | amount of storage space, how should it be allocated between the users? The |
| | 4 | admin of a server may give out space in exchange for money, for other storage |
| | 5 | space, or out of friendship, but many use cases work better if the admin can |
| | 6 | track and control how much space any given user can consume. |
| | 7 | |
| | 8 | There are roughly three different approaches to this: secret-based, |
| | 9 | private-key-based, and FURL-based. From certain points of view, these are all |
| | 10 | equivalent: a private key is like a remote capability that can be used to |
| | 11 | create attenuated capabilities on demand, which is kind of like an asymmetric |
| | 12 | form of hash derivation. The differences consist of tradeoffs between CPU |
| | 13 | usage, storage space, and online message exchanges: it is possible to reduce |
| | 14 | the dependence upon a central server being available by using signed messages |
| | 15 | and certificates that can be generated and verified offline, at the cost of |
| | 16 | CPU time and complexity. Likewise the online check can be replaced by a big |
| | 17 | table of shared secrets, at the cost of storage space. |
| | 18 | |
| | 19 | All of these schemes use the same core design: each share kept on a Storage |
| | 20 | Server is associated with a list of leases. Each lease has an "account |
| | 21 | number" that refers to some particular user/account that we might like to |
| | 22 | track. The storage servers have the ability to sum the size of all their |
| | 23 | shares by account number, to report that account number 4 is consuming 2GB. |
| | 24 | By adding these reports across all storage servers, we discover that account |
| | 25 | 4 is using a total of 10GB. Some other mechanism is used to give a name |
| | 26 | (Alice) to account 4. This mechanism might also be able to instruct the |
| | 27 | storage servers to stop accepting new leases for that account until it |
| | 28 | returns below-quota. |
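The per-account summation described above can be sketched as follows. This is only an illustration: the function names, the shape of the share records, and the policy of charging each lease-holding account the full size of the share are all assumptions, not part of the design.

```python
from collections import defaultdict

def usage_by_account(shares):
    """Sum share sizes per account number on one storage server.

    `shares` is an iterable of (size_in_bytes, [account_number, ...]) pairs,
    one per share; the list holds the account numbers found in its leases.
    Assumption: every account holding a lease is charged the full share size.
    """
    totals = defaultdict(int)
    for size, lease_accounts in shares:
        # Several leases from the same account charge that account only once.
        for account in set(lease_accounts):
            totals[account] += size
    return dict(totals)

def grid_usage(server_reports):
    """Add the per-server reports together to get grid-wide usage."""
    totals = defaultdict(int)
    for report in server_reports:
        for account, used in report.items():
            totals[account] += used
    return dict(totals)

def over_quota(grid_totals, quotas):
    """Return the sorted account numbers whose usage exceeds their quota."""
    return sorted(account for account, used in grid_totals.items()
                  if account in quotas and used > quotas[account])
```

The result of `over_quota` is what some other mechanism would use to tell storage servers to stop accepting new leases for an account.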
| | 29 | |
| | 30 | The schemes differ in the way that we decide which account number to put in |
| | 31 | the lease. If accounts are in use, the client must be confined to use a |
| | 32 | specific (authorized) account number: Bob should not be able to get leases |
| | 33 | placed using Alice's account number, and Carol (who is not a valid user) |
| | 34 | should not be able to place leases at all. |
| | 35 | |
| | 36 | The main design goals we seek to attain here are: |
| | 37 | |
| | 38 | * optional central control over storage space consumed ("quotas") and |
| | 39 | ability to participate in the grid at all (storage authority). Grids |
| | 40 | which do not wish to maintain this level of control are not required |
| | 41 | to use Account Servers at all. Grids can also enable opt-in quota |
| | 42 | checking, in which clients are trusted to provide their correct |
| | 43 | account number, and to refrain from stealing each other's accounts. |
| | 44 | * when an account is added, the user should be able to use the grid |
| | 45 | immediately |
| | 46 | * when an account is removed, the user should eventually be prohibited |
| | 47 | from using the grid. It is ok if it takes a month for this revocation |
| | 48 | to take effect. |
| | 49 | * most functionality should continue uninterrupted even if a central Account |
| | 50 | Server falls offline |
| | 51 | |
| | 52 | Secondary design goals (not all of which we can meet) are: |
| | 53 | |
| | 54 | * clients should be able to easily delegate their storage authority to |
| | 55 | someone else, like a trusted Helper or a Repair agent. These two agents |
| | 56 | may need to create leases in the user's name. |
| | 57 | * clients should be able to delegate storage authority for servers they |
| | 58 | haven't met yet. Specifically: |
| | 59 | * the Repairer may be creating leases on new storage servers that were |
| | 60 | added while the client was offline (otherwise the client would be |
| | 61 | repairing its own files). |
| | 62 | * one purpose of the Helper is to allow clients to be unaware of all |
| | 63 | storage servers. If this is the case, the client won't know which |
| | 64 | storage servers it should be delegating authority for. |
| | 65 | * Storage authority may need to be passed through the web API as a query |
| | 66 | parameter. The WUI (the human-facing side) may need a storage-authority |
| | 67 | mechanism as well, perhaps through cookies. |
| | 68 | * storage requirements should remain sensible. A large grid may have one |
| | 69 | million accounts: the storage servers may need to record a shared secret |
| | 70 | for each one, but it would be nicer if they didn't have to. |
| | 71 | * it should be possible to delegate limited authority. It would be nice if |
| | 72 | we could run Helpers on untrusted machines, but if the Helper gets to |
| | 73 | consume the full quotas of all clients who use it, then it must be trusted |
| | 74 | to not do that. If we could delegate just 2MB of storage authority, or |
| | 75 | authority that expired after an hour, we could use more machines for these |
| | 76 | services. |
| | 77 | |
| | 78 | My current plan is to pursue the "secret-based" approach described here. The |
| | 79 | other approaches are summarized in subsequent sections. |
| | 80 | |
| | 81 | == Secret-based "storage authority" approach == |
| | 82 | |
| | 83 | In this scheme, each user has a master storage authority secret: just a |
| | 84 | random string, either 128 or 256 bits long. They also have a unique (but |
| | 85 | non-secret) "Account Number". In centrally-managed grids, these are both |
| | 86 | created and stored by an Account Server, which uses sequential account |
| | 87 | numbers. In friend-nets, there is no Account Server: account numbers are |
| | 88 | made unique either by agreement or by making them large and random, and there |
| | 89 | is more manual work involved to distribute the various secrets. |
| | 90 | |
| | 91 | Each time a client performs a storage operation, it does so under the |
| | 92 | auspices of a specific storage authority. The Tahoe node on Alice's computer |
| | 93 | is running solely for the benefit of Alice, so all operations it performs |
| | 94 | will use Alice's storage authority (i.e. all leases that it creates will |
| | 95 | include Alice's account number). On the other hand, a shared Tahoe node |
| | 96 | accessed through its web-API port may be working for a variety of users. This |
| | 97 | node must be given the storage authority to use for each operation. The means |
| | 98 | to do this is still under investigation: adding an account= query argument is |
| | 99 | one approach, passing the information through cookies is another, each with |
| | 100 | its own advantages and drawbacks. |
| | 101 | |
| | 102 | Each time the client talks to a storage server, it computes a per |
| | 103 | (account*SS) secret by hashing the master secret with the Storage Server's |
| | 104 | nodeid (the "SSid"). It then prepends the account number, resulting in a |
| | 105 | per-SS authority string that looks like 123-lmaypcuoh6c4l3icvvloo2656y. The |
| | 106 | Storage Server has a function that takes this string and decides whether or |
| | 107 | not the authority is valid, and if valid, which account number to use. |
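A minimal sketch of the client-side derivation follows. The text only says the master secret is hashed with the SSid, so the hash function (SHA-256), the tag prefix, the truncation to 128 bits, and the lowercase base32 encoding are all assumptions, chosen so that the output has the same shape as the example string above. The function name is hypothetical.

```python
import base64
import hashlib

def per_server_authority(account_number, master_secret, ssid):
    """Derive the per-SS authority string for one storage server.

    Assumed construction (not specified by the design):
    SHA-256(tag + master_secret + ssid), truncated to 128 bits, base32.
    `master_secret` and `ssid` are bytes; the secret is assumed to have a
    fixed length so that the concatenation is unambiguous.
    """
    digest = hashlib.sha256(b"storage-authority:" + master_secret + ssid).digest()
    # 16 bytes encode to 26 base32 characters once the padding is stripped.
    token = base64.b32encode(digest[:16]).decode("ascii").rstrip("=").lower()
    return f"{account_number}-{token}"
```

Because the SSid is part of the hash input, a string captured by one storage server is useless on any other server.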
| | 108 | |
| | 109 | (the "123" account number is used as an index when communicating with the AS, |
| | 110 | to avoid requiring a complete table of user-to-secret mappings. This might |
| | 111 | not be the same account number that is used in the final lease, to allow the |
| | 112 | creation of "temporary account numbers" that are attenuated in some way, like |
| | 113 | a short validity period or limited to a certain number of bytes) |
| | 114 | |
| | 115 | For friend-nets, this "authority-is-valid" function is implemented by a |
| | 116 | simple static table lookup. The storage server has a file named |
| | 117 | NODE/private/valid-accounts, that contains one line per account. Each line |
| | 118 | looks like "123-lmaypcuoh6c4l3icvvloo2656y 123 Alice", and contains the |
| | 119 | authority string, followed by the account number to use, followed by the |
| | 120 | account's nickname. The client node has a function that creates one of these |
| | 121 | lines for every known storage server, and the user can send each line to the |
| | 122 | SS's admin and ask them to add it to their valid-accounts file. By doing so, |
| | 123 | the SS's admin is granting that user the right to claim storage on the |
| | 124 | account named "Alice". |
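The static table lookup might look like the sketch below. The three-field line format comes from the description above; the function names, the tolerance for blank lines, and the treatment of "#" lines as comments are assumptions.

```python
def parse_valid_accounts(text):
    """Parse the contents of a valid-accounts file.

    Each line holds: authority string, account number, nickname.
    Returns a dict mapping authority string -> (account number, nickname).
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: allow comments
            continue
        # The nickname is the remainder of the line and may contain spaces.
        authority, account, nickname = line.split(None, 2)
        table[authority] = (int(account), nickname)
    return table

def authority_is_valid(table, authority):
    """Return the account number to use, or None if the string is unknown."""
    entry = table.get(authority)
    return None if entry is None else entry[0]
```

A storage server would read NODE/private/valid-accounts once at startup (or when it changes) and call `authority_is_valid` for each lease request.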
| | 125 | |
| | 126 | For a centrally-managed grid, there is a special Account Server node which |
| | 127 | manages these authorities. Each storage server is configured with a reference |
| | 128 | to the AS by providing NODE/account-server.furl . The authority-is-valid |
| | 129 | function works by telling the AS the (SSID,authority-string) pair for each |
| | 130 | query. The AS responds with either a "no" or a "yes, account=123" answer. |
| | 131 | |
| | 132 | To reduce network traffic and improve tolerance to AS downtime, the SS |
| | 133 | maintains a cache of positive responses. The cache entries are aged out after |
| | 134 | one month. Negative responses are not cached. |
| | 135 | |
| | 136 | Furthermore, the AS gets to pre-emptively manipulate this cache. When the SS |
| | 137 | connects to the AS, it makes its "valid-accounts-manager" object available to |
| | 138 | it, and this manager object gives the AS complete control over the |
| | 139 | valid-accounts table. |
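The cache and its manager interface could be sketched like this. The class and method names are hypothetical, "one month" is approximated as 30 days, and `ask_account_server` stands in for the remote query to the AS (it returns an account number for "yes" and None for "no").

```python
import time

ONE_MONTH = 30 * 24 * 3600  # seconds; "one month" approximated as 30 days

class ValidAccountsCache:
    def __init__(self, ask_account_server, clock=time.time):
        self._ask = ask_account_server  # callable(authority) -> account or None
        self._clock = clock
        self._entries = {}              # authority -> (account, expires_at)

    def lookup(self, authority):
        """The SS-side authority-is-valid check."""
        now = self._clock()
        entry = self._entries.get(authority)
        if entry is not None and entry[1] > now:
            return entry[0]             # fresh positive answer: no AS traffic
        self._entries.pop(authority, None)
        account = self._ask(authority)
        if account is not None:
            # Only positive responses are cached.
            self._entries[authority] = (account, now + ONE_MONTH)
        return account

    # Operations the AS would reach through the valid-accounts-manager object.
    def add(self, authority, account):
        """Pre-load an entry, e.g. right after an account is created."""
        self._entries[authority] = (account, self._clock() + ONE_MONTH)

    def remove_account(self, account):
        """Drop every cached entry that maps to the given account."""
        self._entries = {k: v for k, v in self._entries.items()
                         if v[0] != account}
```

In this sketch an unreachable AS would surface as an exception from `ask_account_server`; entries that are still fresh keep working without contacting it.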
| | 140 | |
| | 141 | A user starts by creating a new account, using some AS-specific mechanism. |
| | 142 | (in the case of the allmydata.com commercial grid, this uses a PHP script |
| | 143 | that also accepts credit-card payments). The AS records the new user's |
| | 144 | storage authority in a table, which is used to answer the subsequent |
| | 145 | "authority-is-valid" queries from storage servers. It extracts the account |
| | 146 | number from the authority string, looks up the corresponding table entry, |
| | 147 | hashes the secret it finds there with the SSID, then compares the result to |
| | 148 | the authority string. |
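The AS-side check can be sketched as below. As with any derivation in this scheme, the hash construction is an assumption (SHA-256 with a tag, truncated to 128 bits, base32); the only requirement is that the AS and the client compute it the same way. The names `derive_token` and `as_authority_is_valid` are hypothetical.

```python
import base64
import hashlib
import hmac

def derive_token(master_secret, ssid):
    """Assumed derivation; must match whatever the client computes."""
    digest = hashlib.sha256(b"storage-authority:" + master_secret + ssid).digest()
    return base64.b32encode(digest[:16]).decode("ascii").rstrip("=").lower()

def as_authority_is_valid(accounts, ssid, authority):
    """Answer a storage server's query about (SSID, authority-string).

    `accounts` maps account number -> master secret (bytes).
    Returns the account number for "yes", or None for "no".
    """
    number, sep, token = authority.partition("-")
    if not sep or not (number.isascii() and number.isdigit()):
        return None
    secret = accounts.get(int(number))   # the account number is the table index
    if secret is None:
        return None
    expected = derive_token(secret, ssid)
    # Constant-time comparison, to avoid leaking how many characters matched.
    if hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8")):
        return int(number)
    return None
```

Note that the AS needs only one row per account, regardless of how many storage servers exist.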
| | 149 | |
| | 150 | When the AS creates a new account, it also creates authority strings for all |
| | 151 | current storage servers, and uses its valid-accounts-manager connections to |
| | 152 | push these strings into the SS caches. This improves availability: even if |
| | 153 | the AS fell over and died a moment later, the new user would still be able to |
| | 154 | use their storage authority for a month without problems. |
| | 155 | |
| | 156 | If the AS deletes an account (because the user has stopped paying their |
| | 157 | bills), the AS uses its valid-accounts-manager connection to delete the cache |
| | 158 | entries for that account. This accomplishes fast revocation for all storage |
| | 159 | servers that are currently online. Any SS that was offline at the time of |
| | 160 | the account termination will continue to provide service for the rest of the |
| | 161 | month. Other timings are possible: for example the SS might refresh its cache |
| | 162 | after a day, but treat AS unavailability as meaning it should keep using the |
| | 163 | previous answer. |
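The alternative timing mentioned last (refresh after a day, but keep the previous answer while the AS is unreachable) might be sketched as follows. The class name, the exception type, and the exact one-day constant are assumptions for illustration.

```python
import time

ONE_DAY = 24 * 3600  # seconds

class AccountServerUnavailable(Exception):
    """Raised by the query callable when the AS cannot be reached."""

class RefreshingCache:
    def __init__(self, ask_account_server, clock=time.time):
        self._ask = ask_account_server  # callable(authority) -> account or None
        self._clock = clock
        self._entries = {}              # authority -> (account, checked_at)

    def lookup(self, authority):
        now = self._clock()
        entry = self._entries.get(authority)
        if entry is not None and now - entry[1] < ONE_DAY:
            return entry[0]             # checked recently enough
        try:
            account = self._ask(authority)
        except AccountServerUnavailable:
            # AS is down: fall back to the previous answer, if there is one.
            return entry[0] if entry is not None else None
        if account is None:
            self._entries.pop(authority, None)  # revoked or never valid
        else:
            self._entries[authority] = (account, now)
        return account
```

With this policy, revocation reaches an SS within a day of it coming back online, while an AS outage of any length does not interrupt existing users.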
| | 164 | |
| | 165 | === creating attenuated authorities === |
| | 166 | |
| | 167 | Eventually we may want to take advantage of untrusted Helpers, by allowing |
| | 168 | clients to create attenuated storage authority strings. The possessor of |
| | 169 | these strings might be allowed to claim leases for a specific storage index, |
| | 170 | or only for a certain number of bytes. The untrusted Helper might abuse this |
| | 171 | authority, but the damage it can do is limited by the extra restrictions. |
| | 172 | |
| | 173 | Likewise, the Repairer might get a "repair-cap" which contains enough |
| | 174 | information to download and verify the plaintext, and enough authority to |
| | 175 | upload new shares in the name of the original uploader. This repair-cap could |
| | 176 | contain an authority string which can only be used to create shares for the |
| | 177 | specific storage index. It might also be restricted to creating shares that |
| | 178 | contain a specific root hash, to prevent the repairer from using the |
| | 179 | authority to store its own data in the same slot (on new storage servers). |
| | 180 | |
| | 181 | The shared-secret scheme is the least favorable for creating attenuated |
| | 182 | authorities: it requires more work (and more network traffic) than DSA |
| | 183 | private-key approaches. To provide this, the client contacts the AS and asks |
| | 184 | it for an attenuated authority string: it specifies the conditions of use |
| | 185 | (validity period, storage index restrictions, size limits, etc), and gets |
| | 186 | back a new string and account number. The client uses this pair when |
| | 187 | delegating authority to Helpers and Repairers, instead of their usual |
| | 188 | (full-powered) pair. |
| | 189 | |
| | 190 | The Account Server adds an entry to its authority table with the new number |
| | 191 | and string. When storage servers eventually come asking about the validity of |
| | 192 | derived strings, the AS will find this entry, read out the restrictions, and |
| | 193 | respond to the SS with the (restrictions, real account number) pair. (one |
| | 194 | might think of the "real account number" as a special kind of restriction: |
| | 195 | the string grants the authority to consume space in account 123, possibly |
| | 196 | with other restrictions). |
| | 197 | |
| | 198 | The SS will enforce these restrictions. When the restriction involves total |
| | 199 | storage space consumed, the SS will need to maintain a table that is indexed |
| | 200 | by the authority string, counting bytes. This sort of restriction will be |
| | 201 | much easier to manage if the authority includes a duration restriction, |
| | 202 | because that will tend to limit the size of this table. |
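A sketch of the byte-counting table follows, showing why a duration restriction keeps it small: expired rows can simply be purged. The class name, method names, and the choice to purge on every request are assumptions.

```python
import time

class RestrictionTable:
    """Per-authority byte counters kept by a storage server."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._rows = {}  # authority -> [bytes_used, max_bytes, expires_at]

    def register(self, authority, max_bytes, expires_at):
        """Record the restrictions reported by the AS for this string."""
        self._rows.setdefault(authority, [0, max_bytes, expires_at])

    def try_consume(self, authority, nbytes):
        """Charge nbytes to the authority; return False if not allowed."""
        self._purge_expired()
        row = self._rows.get(authority)
        if row is None or nbytes < 0 or row[0] + nbytes > row[1]:
            return False
        row[0] += nbytes
        return True

    def _purge_expired(self):
        # Expired authorities can never be used again, so drop their rows.
        now = self._clock()
        for authority in [a for a, r in self._rows.items() if r[2] <= now]:
            del self._rows[authority]

    def __len__(self):
        return len(self._rows)
```

For example, an authority limited to 2MB and one hour would be registered with `max_bytes=2 * 10**6` and `expires_at=now + 3600`; its row disappears on the first request after the hour is up.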
| | 203 | |
| | 204 | Clearly, the AS must be online and reachable to generate these attenuated |
| | 205 | authorities. Likewise, either the AS must inform the SS about the strings and |
| | 206 | restrictions at the time of creation (storage+network), or the AS must be |
| | 207 | reachable by the SS when the strings are used (availability). A |
| | 208 | private-key-based scheme would not suffer from this tradeoff. |
| | 209 | |
| | 210 | The need for attenuated authorities is not fully established at this point, |
| | 211 | so the relative simplicity of the shared-secret approach remains appealing |
| | 212 | (i.e. the fact that private-key makes attenuation easier is not a strong |
| | 213 | motivation to go with DSA over shared-secret). Untrusted Helpers could |
| | 214 | alternatively be required to establish leases under their own authority, in a |
| | 215 | make-before-break handoff: the Helper uploads the shares and adds temporary |
| | 216 | helper leases, the client is informed about the shares and establishes its |
| | 217 | own leases, then the helper cancels its temporary leases. Likewise the |
| | 218 | Repairer could maintain its own leases on behalf of offline clients, keeping |
| | 219 | track of how much space it is consuming for whom, and reporting the quota |
| | 220 | data to the accounting machinery (and perhaps simply refusing to repair |
| | 221 | accounts which have gone over-quota). When the client comes back online and |
| | 222 | syncs up with the Repairer, the client can establish its own leases on the |
| | 223 | repairer-generated shares, allowing the Repairer to drop the temporary ones. |
| | 224 | |
| | 225 | These temporary leases would add traffic but would remove the need to |
| | 226 | delegate storage authority to the helper, removing some of the need for |
| | 227 | attenuated authorities. |
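The make-before-break handoff can be illustrated with a toy model of a share and its lease set. The `Share` class and the generator-based stepping are purely illustrative assumptions; the point is that the share is covered by at least one lease at every step, and that neither party ever uses the other's storage authority.

```python
class Share:
    """Toy model: a share stays alive while it has at least one lease."""

    def __init__(self, size):
        self.size = size
        self.leases = set()  # account numbers currently holding a lease

    def add_lease(self, account):
        self.leases.add(account)

    def cancel_lease(self, account):
        self.leases.discard(account)

    def is_protected(self):
        return bool(self.leases)

def make_before_break(share, helper_account, client_account):
    """Yield a label after each step of the handoff."""
    share.add_lease(helper_account)     # Helper uploads, adds a temporary lease
    yield "helper-leased"
    share.add_lease(client_account)     # client adds its own lease ("make")
    yield "both-leased"
    share.cancel_lease(helper_account)  # Helper cancels its lease ("break")
    yield "client-leased"
```

The same pattern would apply to a Repairer holding leases on behalf of an offline client until the client returns and takes them over.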
| | 228 | |
| | 229 | Another conceivable use for attenuated authority would be to give "drop-box" |
| | 230 | access to other users: provide a write-cap (or a hypothetical append-cap) to |
| | 231 | a directory, along with enough storage authority to use it. This would allow |
| | 232 | Alice to safely grant non-account-holders the ability to send her files. The |
| | 233 | current accounting mechanisms we are developing do not allow this: |
| | 234 | non-account-holders can read files, but not add new ones. |
| | 235 | |
| | 236 | |
| | 237 | == DSA private-key-based "membership card" approach == |
| | 238 | |
| | 239 | In this scheme, each user has a DSA private key, and a "membership card" |
| | 240 | (signed by a central Account Server) that declares that the associated pubkey |
| | 241 | is an authorized member of the grid. Clients then use that key to sign |
| | 242 | messages that are sent to the storage servers. The SS will verify the |
| | 243 | signatures before accepting lease requests. Attenuated authority is |
| | 244 | implemented by certificate chains: the client creates a new private/public |
| | 245 | key pair, uses their main key to sign a cert declaring that the new key has |
| | 246 | certain (limited) powers, then gives the new privkey and certificate chain to |
| | 247 | the delegate. The delegate uses the new privkey to sign request messages. |
| | 248 | |
| | 249 | To handle revocation, the certificates need to have expiration dates, and get |
| | 250 | replaced every once in a while. |
| | 251 | |
| | 252 | This approach maximizes offline-ness: the Account Server is only known by its |
| | 253 | public key, and almost never needs to be spoken to directly. It also |
| | 254 | minimizes storage requirements: everything can be computed from the |
| | 255 | certificates that are passed around. PK operations are somewhat expensive, |
| | 256 | but this can be mitigated through more protocol work: creating a |
| | 257 | limited-purpose shared secret on-demand that effectively caches the PK |
| | 258 | verification check. |
| | 259 | |
| | 260 | |
| | 261 | == FURL-based "managed introducer" approach == |
| | 262 | |