| 1 | |
|---|
| 2 | = Accounting = |
|---|
| 3 | |
|---|
| 4 | "Accounting" is the arena of the Tahoe system that concerns measuring, |
|---|
| 5 | controlling, and enabling the ability to upload and download files, and to |
|---|
| 6 | create new directories. In contrast with the capability-based access control |
|---|
| 7 | model, which dictates how specific files and directories may or may not be |
|---|
| 8 | manipulated, Accounting is concerned with resource consumption: how much disk |
|---|
| 9 | space a given person/account/entity can use. |
|---|
| 10 | |
|---|
| 11 | Tahoe releases up to and including 1.4.1 have a nearly-unbounded resource |
|---|
| 12 | usage model. Anybody who can talk to the Introducer gets to talk to all the |
|---|
| 13 | Storage Servers, and anyone who can talk to a Storage Server gets to use as |
|---|
| 14 | much disk space as they want (up to the reserved_space= limit imposed by the |
|---|
| 15 | server, which affects all users equally). Not only is the per-user space |
|---|
| 16 | usage unlimited, it is also unmeasured: the owner of the Storage Server has |
|---|
| 17 | no way to find out how much space Alice or Bob is using. |
|---|
| 18 | |
|---|
| 19 | The goals of the Accounting system are thus: |
|---|
| 20 | |
|---|
| 21 | * allow the owner of a storage server to control who gets to use disk space, |
|---|
| 22 | with separate limits per user |
|---|
| 23 | * allow both the server owner and the user to measure how much space the user |
|---|
| 24 | is consuming, in an efficient manner |
|---|
| 25 | * provide grid-wide aggregation tools, so a set of cooperating server |
|---|
| 26 | operators can easily measure how much a given user is consuming across all |
|---|
| 27 | servers. This information should also be available to the user in question. |
|---|
| 28 | |
|---|
| 29 | For the purposes of this document, the terms "Account" and "User" are mostly |
|---|
| 30 | interchangeable. The fundamental unit of Accounting is the "Account", in that |
|---|
| 31 | usage measurement and quota enforcement are performed separately for each account. These |
|---|
| 32 | accounts might correspond to individual human users, or they might be shared |
|---|
| 33 | among a group, or a user might have an arbitrary number of accounts. |
|---|
| 34 | |
|---|
| 35 | Accounting interacts with Garbage Collection. To protect their shares from |
|---|
| 36 | GC, clients maintain limited-duration leases on those shares: when the last |
|---|
| 37 | lease expires, the share is deleted. Each lease has a "label", which |
|---|
| 38 | indicates the account or user which wants to keep the share alive. A given |
|---|
| 39 | account's "usage" (their per-server aggregate usage) is simply the sum of the |
|---|
| 40 | sizes of all shares on which they hold a lease. The storage server may limit |
|---|
| 41 | the user to a fixed "quota" (an upper bound on their usage). To keep a file |
|---|
| 42 | alive, the user must be willing to use up some of their quota. |
|---|
| 43 | |
|---|
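The lease rules above can be sketched in a few lines of Python. This is only an illustration, not Tahoe's storage code: the `Share` and `Lease` classes, their field names, the string labels, and the plain integer timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    label: str        # account that wants the share kept alive (hypothetical format)
    expires_at: int   # expiry time, in seconds


@dataclass
class Share:
    size: int                                   # bytes on disk
    leases: list = field(default_factory=list)

    def live_leases(self, now):
        return [l for l in self.leases if l.expires_at > now]


def collect_garbage(shares, now):
    """Keep only shares that still have at least one unexpired lease."""
    return [s for s in shares if s.live_leases(now)]


def usage(shares, label, now):
    """Per-server usage: total size of shares on which `label` holds a live lease."""
    return sum(s.size for s in shares
               if any(l.label == label for l in s.live_leases(now)))


# Invented sample data: three shares with overlapping leases.
shares = [
    Share(10_000_000, [Lease("alice", 500), Lease("bob", 900)]),
    Share(2_000_000, [Lease("alice", 900)]),
    Share(4_000_000, [Lease("bob", 100)]),
]
```

Note that both leaseholders of the first share are charged its full size, which matches the non-pro-rated accounting described in this document.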
| 44 | Note that a popular file might have leases from multiple users, in which case |
|---|
| 45 | one user might take a chance and decline to add their own lease, saving some |
|---|
| 46 | of their quota and hoping that the other leases continue to keep the file |
|---|
| 47 | alive despite their personal unwillingness to contribute to the effort. One |
|---|
| 48 | could imagine a "pro-rated quotas" scheme, in which a 10MB file with 5 |
|---|
| 49 | leaseholders would deduct 2MB from each leaseholder's quota. We have decided |
|---|
| 50 | to not implement pro-rated quotas, because such a scheme would make usage |
|---|
| 51 | values hard to predict: a given account might suddenly go over quota solely |
|---|
| 52 | because of a third party's actions. |
|---|
| 53 | |
|---|
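The unpredictability argument can be made concrete with a small calculation. The function names below are hypothetical; the figures are the 10MB / 5 leaseholder example from the text.

```python
def prorated_charge(share_size, leaseholders):
    """Charge each leaseholder an equal fraction of the share (the rejected scheme)."""
    return share_size / leaseholders


def full_charge(share_size, leaseholders):
    """Charge every leaseholder the full share size (the scheme described here)."""
    return share_size


MB = 1_000_000
# Five leaseholders on a 10MB file: 2MB each under pro-rating.
before = prorated_charge(10 * MB, 5)
# Four of them let their leases expire: the remaining user now pays for all 10MB.
after = prorated_charge(10 * MB, 1)
```

Under pro-rating the last leaseholder's charge jumps from 2MB to 10MB without any action on their part, whereas `full_charge` returns the same value regardless of how many other leaseholders exist.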
| 54 | == Accounting Implementation == |
|---|
| 55 | |
|---|
| 56 | The implementation of these accounting features is tracked in this ticket: |
|---|
| 57 | |
|---|
| 58 | https://tahoe-lafs.org/trac/tahoe-lafs/ticket/666 |
|---|
| 59 | |
|---|
| 60 | == Authority Flow == |
|---|
| 61 | |
|---|
| 62 | The authority to consume space on the storage server originates, of course, |
|---|
| 63 | with the storage server operator. These operators start with complete control |
|---|
| 64 | over their space, and delegate portions of it to others: either directly to |
|---|
| 65 | clients who want to upload files, or to intermediaries who can then delegate |
|---|
| 66 | attenuated authority onwards. The operators have various reasons for wanting |
|---|
| 67 | to share their space: monetary consideration, expectations of in-kind |
|---|
| 68 | exchange, or simple generosity. But the final authority always rests with the |
|---|
| 69 | operator. |
|---|
| 70 | |
|---|
| 71 | The server operator grants limited authority over their space by configuring |
|---|
| 72 | their server to accept requests that demonstrate knowledge of certain |
|---|
| 73 | secrets. They then share those secrets with the client who intends to use |
|---|
| 74 | this space, or with an intermediary who will generate still more secrets and |
|---|
| 75 | share those with the client. Eventually, an upload or create-directory |
|---|
| 76 | operation will be performed that needs this authority. Part of the operation |
|---|
| 77 | will involve proving knowledge of the secret to the storage server, and the |
|---|
| 78 | server will require this proof before accepting the uploaded share or adding |
|---|
| 79 | a new lease. |
|---|
| 80 | |
|---|
| 81 | The authority is expressed as a string, containing cryptographically-signed |
|---|
| 82 | messages and keys. The string also contains "restrictions", which are |
|---|
| 83 | annotations that explain the limits imposed upon this authority, either by |
|---|
| 84 | the original grantor (the storage server operator) or by one of the |
|---|
| 85 | intermediaries. Authority can be reduced but not increased. Any holder of a |
|---|
| 86 | given authority can delegate some or all of it to another party. |
|---|
| 87 | |
|---|
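The "reduced but not increased" property could be modelled as an append-only chain of restrictions, all of which must hold. The sketch below is an assumption-laden illustration: the `Authority` class, the restriction names `label_prefix` and `max_size`, and the request dictionary are invented here, and the cryptographic signatures that would bind the chain together are omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Authority:
    # Each entry is one (key, value) restriction added by the grantor or a delegate.
    restrictions: tuple = ()

    def attenuate(self, key, value):
        """Return a new authority carrying every old restriction plus one more."""
        return Authority(self.restrictions + ((key, value),))

    def permits(self, request):
        """A request is allowed only if it satisfies every restriction in the chain."""
        return all(CHECKS[key](value, request) for key, value in self.restrictions)


# Hypothetical restriction checkers, keyed by restriction name.
CHECKS = {
    "label_prefix": lambda prefix, req: req["label"][:len(prefix)] == prefix,
    "max_size": lambda limit, req: req["size"] <= limit,
}

root = Authority()                                  # the operator's full authority
alice = root.attenuate("label_prefix", (1,))
amy = alice.attenuate("label_prefix", (1, 4)).attenuate("max_size", 1000)
```

Because delegation can only append to the chain, a delegate has no way to drop a restriction imposed further up.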
| 88 | The authority string may be short enough to include as an argument to a CLI |
|---|
| 89 | command (--with-authority ABCDE), or it may be long enough that it must be |
|---|
| 90 | stashed in a file and referenced in some other fashion (--with-authority-file |
|---|
| 91 | ~/.my_authority). There are CLI tools to create brand new authority strings, |
|---|
| 92 | to derive attenuated authorities from an existing one, and to explain the |
|---|
| 93 | contents of an authority string. These authority strings can be shared with |
|---|
| 94 | others just like filecaps and dircaps: knowledge of the authority string is |
|---|
| 95 | both necessary and sufficient to wield the authority it represents. |
|---|
| 96 | |
|---|
| 97 | Web-API requests will include the authority necessary to complete the |
|---|
| 98 | operation. When used by a CLI tool, the authority is likely to come from |
|---|
| 99 | ~/.tahoe/private/authority (i.e. it is ambient to the user who has access to |
|---|
| 100 | that node, just like aliases provide similar access to a specific "root |
|---|
| 101 | directory"). When used by the browser-oriented WUI, the authority will [TODO] |
|---|
| 102 | somehow be retained on each page in a way that minimizes the risk of CSRF |
|---|
| 103 | attacks and allows safe sharing (cut-and-paste of a URL without sharing the |
|---|
| 104 | storage authority too). The client node receiving the web-API request will |
|---|
| 105 | extract the authority string from the request and use it to build the storage |
|---|
| 106 | server messages that it sends to fulfill that request. |
|---|
| 107 | |
|---|
| 108 | == Definition Of Authority == |
|---|
| 109 | |
|---|
| 110 | The term "authority" is used here in the object-capability sense: it refers |
|---|
| 111 | to the ability of some principal to cause some action to occur, whether |
|---|
| 112 | because they can do it themselves, or because they can convince some other |
|---|
| 113 | principal to do it for them. In Tahoe terms, "storage authority" is the |
|---|
| 114 | ability to do one of the following actions: |
|---|
| 115 | |
|---|
| 116 | * upload a new share, thus consuming storage space |
|---|
| 117 | * add a new lease to a share, thus preventing space from being reclaimed |
|---|
| 118 | * modify an existing mutable share, potentially increasing the space consumed |
|---|
| 119 | |
|---|
| 120 | The Accounting effort may involve other kinds of authority that get limited |
|---|
| 121 | in a similar manner to storage authority, like the ability to download a |
|---|
| 122 | share or query whether a given share is present: anything that may consume |
|---|
| 123 | CPU time, disk bandwidth, or other limited resources. The authority to renew |
|---|
| 124 | or cancel a lease may be controlled in a similar fashion. |
|---|
| 125 | |
|---|
| 126 | Storage authority, as granted from a server operator to a client, is not |
|---|
| 127 | simply a binary "use space or not" grant. Instead, it is parameterized by a |
|---|
| 128 | number of "restrictions". The most important of these restrictions (with |
|---|
| 129 | respect to the goals of Accounting) is the "Account Label". |
|---|
| 130 | |
|---|
| 131 | === Account Labels === |
|---|
| 132 | |
|---|
| 133 | A Tahoe "Account" is defined by a variable-length sequence of small integers |
|---|
| 134 | (they are not required to be small: the actual limit is 2**64, but neither |
|---|
| 135 | are they required to be unguessable). For the purposes of discussion, these |
|---|
| 136 | lists will be expressed as period-joined strings: the two-element list (1,4) |
|---|
| 137 | will be displayed here as "1.4". |
|---|
| 138 | |
|---|
| 139 | These accounts are arranged in a hierarchy: the account identifier 1.4 is |
|---|
| 140 | considered to be a "parent" of 1.4.2 . There is no relationship between the |
|---|
| 141 | values used by unrelated accounts: 1.4 is unrelated to 2.4, despite both |
|---|
| 142 | coincidentally using a "4" in the second element. |
|---|
| 143 | |
|---|
| 144 | Each lease has a label, which contains the Account identifier. The storage |
|---|
| 145 | server maintains an aggregate size count for each label prefix: when asked |
|---|
| 146 | about account 1.4, it will report the amount of space used by shares labeled |
|---|
| 147 | 1.4, 1.4.2, 1.4.7, 1.4.7.8, etc (but *not* 1 or 1.5). |
|---|
| 148 | |
|---|
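A minimal sketch of the prefix aggregation, assuming labels are held as tuples of integers and that each share carries a single lease (both are simplifications introduced for the example, as are the function names and sizes):

```python
def parse_label(text):
    """Turn the display form "1.4.7" into the tuple (1, 4, 7)."""
    return tuple(int(part) for part in text.split("."))


def is_under(label, prefix):
    """True if `label` equals `prefix` or is one of its descendants."""
    return label[:len(prefix)] == prefix


def usage_for(prefix, lease_sizes):
    """Sum the sizes of all leased shares whose label falls under `prefix`."""
    return sum(size for label, size in lease_sizes if is_under(label, prefix))


# (label, share size in bytes) pairs; the figures are invented for the example.
lease_sizes = [
    (parse_label("1"), 100),
    (parse_label("1.4"), 20),
    (parse_label("1.4.2"), 30),
    (parse_label("1.4.7.8"), 5),
    (parse_label("1.5"), 40),
    (parse_label("2.4"), 7),
]
```

Comparing tuples element by element, rather than comparing the period-joined strings, avoids treating an account such as 14 as a child of account 1.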
| 149 | The "Account Label" restriction allows a client to apply any label it wants, |
|---|
| 150 | as long as that label begins with a specific prefix. If account 1 is |
|---|
| 151 | associated with Alice, then Alice will receive a storage authority string |
|---|
| 152 | that contains a "must start with 1" restriction, enabling her to use |
|---|
| 153 | storage space but obligating her to lease her shares with a label that can be |
|---|
| 154 | traced back to her. She can delegate part of her authority to others (perhaps |
|---|
| 155 | with other non-label restrictions, such as a space restriction or time limit) |
|---|
| 156 | with or without an additional label restriction. For example, she might |
|---|
| 157 | delegate some of her authority to her friend Amy, with a 1.4 label |
|---|
| 158 | restriction. Amy could then create labels with 1.4 or 1.4.7, but she could |
|---|
| 159 | not create labels with the bare 1 identifier as Alice can, nor could she |
|---|
| 160 | create labels with 1.5 (which Alice might have given to her other friend |
|---|
| 161 | Annette). The storage server operator can ask about the usage of 1 to find |
|---|
| 162 | out how much Alice is responsible for (which includes the space that she has |
|---|
| 163 | delegated to Amy and Annette), and none of the A-users can avoid being |
|---|
| 164 | counted in this total. But Alice can ask the storage server about the usage |
|---|
| 165 | of 1.4 to find out how much Amy has taken advantage of her gift. Likewise, |
|---|
| 166 | Alice has control over any lease with a label that begins with 1, so she can |
|---|
| 167 | cancel Amy's leases and free the space they were consuming. If this seems |
|---|
| 168 | surprising, consider that the storage server operator considered Alice to be |
|---|
| 169 | responsible for that space anyway: with great responsibility (for space |
|---|
| 170 | consumed) comes great power (to stop consuming that space). |
|---|
| 171 | |
|---|
| 172 | === Server Space Restriction === |
|---|
| 173 | |
|---|
| 174 | The storage server's basic control over space usage (apart from the |
|---|
| 175 | binary use-it-or-not authority granted by handing out an authority string at |
|---|
| 176 | all) is implemented by keeping track of the space used by any given account |
|---|
| 177 | identifier. If account 1.4 sends a request to allocate a 1MB share, but that |
|---|
| 178 | 1MB would bring the 1.4 usage over its quota, the request will be denied. |
|---|
| 179 | |
|---|
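The allocation check might look roughly like the following. `AccountTable` and `QuotaExceeded` are hypothetical names, and for brevity the sketch tracks each account identifier on its own, ignoring any subaccounts.

```python
class QuotaExceeded(Exception):
    pass


class AccountTable:
    """Tracks usage and an optional quota per account identifier (sketch only)."""

    def __init__(self):
        self.quota = {}   # account tuple -> byte limit
        self.used = {}    # account tuple -> bytes currently leased

    def set_quota(self, account, limit):
        self.quota[account] = limit

    def allocate(self, account, size):
        """Accept the share only if it keeps the account within its quota."""
        new_total = self.used.get(account, 0) + size
        limit = self.quota.get(account)
        if limit is not None and new_total > limit:
            raise QuotaExceeded(f"{account} would use {new_total} of {limit} bytes")
        self.used[account] = new_total
        return new_total
```

A denied request leaves the recorded usage untouched, so a later, smaller request from the same account can still succeed.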
| 180 | For this to be useful, the storage server must give each usage-limited |
|---|
| 181 | principal a separate account, and it needs to configure a size limit at the |
|---|
| 182 | same time as the authority string is minted. For a friendnet, the CLI "add |
|---|
| 183 | account" tool can do both at once: |
|---|
| 184 | |
|---|
| 185 | tahoe server add-account --quota 5GB Alice |
|---|
| 186 | --> Please give the following authority string to "Alice", who should |
|---|
| 187 | provide it to the "tahoe add-authority" command |
|---|
| 188 | (authority string..) |
|---|
| 189 | |
|---|
| 190 | This command will allocate an account identifier, add Alice to the "pet name |
|---|
| 191 | table" to associate it with the new account, and establish the 5GB sizelimit. |
|---|
| 192 | Both the sizelimit and the petname can be changed later. |
|---|
| 193 | |
|---|
| 194 | Note that this restriction is independent for each server: some additional |
|---|
| 195 | mechanism must be used to provide a grid-wide restriction. |
|---|
| 196 | |
|---|
| 197 | Also note that this restriction is not expressed in the authority string. It |
|---|
| 198 | is purely local to the storage server. |
|---|
| 199 | |
|---|
| 200 | === Attenuated Server Space Restriction === |
|---|
| 201 | |
|---|
| 202 | TODO (or not) |
|---|
| 203 | |
|---|
| 204 | The server-side space restriction described above can only be applied by the |
|---|
| 205 | storage server, and cannot be attenuated by other delegates. Alice might be |
|---|
| 206 | allowed to use 5GB on this server, but she cannot use that restriction to |
|---|
| 207 | delegate, say, just 1GB to Amy. |
|---|
| 208 | |
|---|
| 209 | Instead, Alice's sub-delegation should include a "server_size" restriction |
|---|
| 210 | key, which contains a size limit. The storage server will only honor a |
|---|
| 211 | request that uses this authority string if it does not cause the aggregate |
|---|
| 212 | usage of this authority string's account prefix to rise above the given size |
|---|
| 213 | limit. |
|---|
| 214 | |
|---|
| 215 | Note that this will not enforce the desired restriction if the size limits |
|---|
| 216 | are not consistent across multiple delegated authorities for the same label. |
|---|
| 217 | For example, if Amy ends up with two delegations, A1 (which gives her a size |
|---|
| 218 | limit of 1GB) and A2 (which gives her 5GB), then she can consume 5GB despite |
|---|
| 219 | the limit in A1. |
|---|
| 220 | |
|---|
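The A1/A2 problem follows directly from the server checking only the limit carried by whichever authority string accompanies the request. A small worked example, with a hypothetical `request_allowed` helper standing in for the server-side check:

```python
GB = 10**9


def request_allowed(current_usage, request_size, server_size_limit):
    """The server only sees the limit carried by the authority string presented."""
    return current_usage + request_size <= server_size_limit


A1_LIMIT = 1 * GB   # first delegation handed to Amy
A2_LIMIT = 5 * GB   # second, looser delegation for the same label

usage = 1 * GB      # Amy has already filled the A1 allowance
with_a1 = request_allowed(usage, 1 * GB, A1_LIMIT)   # refused under A1
with_a2 = request_allowed(usage, 1 * GB, A2_LIMIT)   # accepted under A2
```

Since both delegations count against the same label, Amy can simply present A2 whenever A1 would refuse her.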
| 221 | === Other Restrictions === |
|---|
| 222 | |
|---|
| 223 | Many storage authority restrictions are meant for internal use by tahoe tools |
|---|
| 224 | as they delegate short-lived subauthorities to each other, and are not likely |
|---|
| 225 | to be set by end users. |
|---|
| 226 | |
|---|
| 227 | * "SI": a storage index string. The authority can only be used to upload |
|---|
| 228 | shares of a single file. |
|---|
| 229 | * "serverid": a server identifier. The authority can only be used when |
|---|
| 230 | talking to a specific server |
|---|
| 231 | * "UEB_hash": a binary hash. The authority can only be used to upload shares |
|---|
| 232 | of a single file, identified by its share's contents. (note: this |
|---|
| 233 | restriction would require the server to parse the share and validate the |
|---|
| 234 | hash) |
|---|
| 235 | * "before": a timestamp. The authority is only valid until a specific time. |
|---|
| 236 | Requires synchronized clocks or a better definition of "timestamp". |
|---|
| 237 | * "delegate_to_furl": a string, used to acquire a FURL for an object that |
|---|
| 238 | contains the attenuated authority. When it comes time to actually use the |
|---|
| 239 | authority string to do something, this is the first step. |
|---|
| 240 | * "delegate_to_key": an ECDSA pubkey, used to grant attenuated authority to |
|---|
| 241 | a separate private key. |
|---|
| 242 | |
|---|
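Three of the restrictions listed above ("SI", "serverid", and "before") are simple enough to check with plain comparisons. The sketch below assumes the restrictions arrive as a dictionary and the request as another dictionary with `storage_index` and `serverid` fields; those shapes, and the choice to treat the "before" time itself as already expired, are assumptions rather than anything specified here.

```python
def check_restrictions(restrictions, request, now):
    """Return the names of the restrictions that `request` violates."""
    failed = []
    if "SI" in restrictions and request["storage_index"] != restrictions["SI"]:
        failed.append("SI")
    if "serverid" in restrictions and request["serverid"] != restrictions["serverid"]:
        failed.append("serverid")
    # "before" is only meaningful if client and server clocks roughly agree.
    if "before" in restrictions and now >= restrictions["before"]:
        failed.append("before")
    return failed
```

Returning the list of failed restrictions, rather than a bare boolean, would let a server explain why a request was refused.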
| 243 | == User Experience == |
|---|
| 244 | |
|---|
| 245 | The process starts with Bob the storage server operator, who has just created |
|---|
| 246 | a new Storage Server: |
|---|
| 247 | |
|---|
| 248 | tahoe create-node |
|---|
| 249 | --> creates ~/.tahoe |
|---|
| 250 | # edit ~/.tahoe/tahoe.cfg, add introducer.furl, configure storage, etc |
|---|
| 251 | |
|---|
| 252 | Now Bob decides that he wants to let his friend Alice use 5GB of space on his |
|---|
| 253 | new server. |
|---|
| 254 | |
|---|
| 255 | tahoe server add-account --quota=5GB Alice |
|---|
| 256 | --> Please give the following authority string to "Alice", who should |
|---|
| 257 | provide it to the "tahoe add-authority" command |
|---|
| 258 | (authority string XYZ..) |
|---|
| 259 | |
|---|
| 260 | Bob copies the new authority string into an email message and sends it to |
|---|
| 261 | Alice. Meanwhile, Alice has created her own client, and attached it to the |
|---|
| 262 | same Introducer as Bob. When she gets the email, she pastes the authority |
|---|
| 263 | string into her local client: |
|---|
| 264 | |
|---|
| 265 | tahoe client add-authority (authority string XYZ..) |
|---|
| 266 | --> new authority added: account (1) |
|---|
| 267 | |
|---|
| 268 | Now all CLI commands that Alice runs with her node will take advantage of |
|---|
| 269 | Bob's space grant. Once Alice's node connects to Bob's, any upload which |
|---|
| 270 | needs to send a share to Bob's server will search her list of authorities to |
|---|
| 271 | find one that allows her to use Bob's server. |
|---|
| 272 | |
|---|
| 273 | When Alice uses her WUI, upload will be disabled until and unless she pastes |
|---|
| 274 | one or more authority strings into a special "storage authority" box. TODO: |
|---|
| 275 | Once pasted, we'll use some trick to keep the authority around in a |
|---|
| 276 | convenient-yet-safe fashion. |
|---|
| 277 | |
|---|
| 278 | When Alice uses her javascript-based web drive, the javascript program will |
|---|
| 279 | be launched with some trick to hand it the storage authorities, perhaps via a |
|---|
| 280 | fragment identifier (http://server/path#fragment). |
|---|
| 281 | |
|---|
| 282 | If Alice decides that she wants Amy to have some space, she takes the |
|---|
| 283 | authority string that Bob gave her and uses it to create one for Amy: |
|---|
| 284 | |
|---|
| 285 | tahoe authority dump (authority string XYZ..) |
|---|
| 286 | --> explanation of what is in XYZ |
|---|
| 287 | tahoe authority delegate --account 1,4 --space 2GB (authority string XYZ..) |
|---|
| 288 | --> (new authority string ABC..) |
|---|
| 289 | |
|---|
| 290 | Alice sends the ABC string to Amy, who uses "tahoe client add-authority" to |
|---|
| 291 | start using it. |
|---|
| 292 | |
|---|
| 293 | Later, Bob would like to find out how much space Alice is using. He brings up |
|---|
| 294 | his node's Storage Server Web Status page. In addition to the overall usage |
|---|
| 295 | numbers, the page will have a collapsible-treeview table with lines like: |
|---|
| 296 | |
|---|
| 297 | AccountID Usage TotalUsage Petname |
|---|
| 298 | (1) 1.5GB 2.5GB Alice |
|---|
| 299 | +(1,4) 1.0GB 1.0GB ? |
|---|
| 300 | |
|---|
| 301 | This indicates that Alice, as a whole, is using 2.5GB. It also indicates that |
|---|
| 302 | Alice has delegated some space to a (1,4) account, and that delegation has |
|---|
| 303 | used 1.0GB. Alice has used 1.5GB on her own, but is responsible for the full |
|---|
| 304 | 2.5GB. If Alice tells Bob that the subaccount is for Amy, then Bob can assign |
|---|
| 305 | a pet name for (1,4) with "tahoe server add-pet-name 1,4 Amy". Note that Bob |
|---|
| 306 | is not aware of the 2GB limit that Alice has imposed upon Amy: the size |
|---|
| 307 | restriction may have appeared on all the requests that have shown up thus |
|---|
| 308 | far, but Bob has no way of being sure that a less-restrictive delegation |
|---|
| 309 | hasn't been created, so his UI does not attempt to remember or present the |
|---|
| 310 | restrictions it has seen before. |
|---|
| 311 | |
|---|
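The Usage and TotalUsage columns in the table above differ only in whether subaccounts are included. A short sketch that reproduces the example figures, with a hypothetical `usage_rows` helper and sizes given in GB:

```python
def usage_rows(own_usage):
    """Return (account, own, total) rows, where total includes all subaccounts."""
    rows = []
    for account in sorted(own_usage):
        total = sum(size for other, size in own_usage.items()
                    if other[:len(account)] == account)
        rows.append((account, own_usage[account], total))
    return rows


own_usage = {(1,): 1.5, (1, 4): 1.0}   # GB leased directly under each label
rows = usage_rows(own_usage)
```

Here the row for (1) shows 1.5 of its own usage and 2.5 in total, while the row for (1,4) shows 1.0 in both columns because it has no subaccounts.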
| 312 | === Friendnet === |
|---|
| 313 | |
|---|
| 314 | A "friendnet" is a set of nodes, each of which is both a storage server and a |
|---|
| 315 | client, each operated by a separate person, all of which have granted storage |
|---|
| 316 | rights to the others. |
|---|
| 317 | |
|---|
| 318 | The simplest way to get a friendnet started is to simply grant storage |
|---|
| 319 | authority to everybody. "tahoe server enable-ambient-storage-authority" will |
|---|
| 320 | configure the storage server to give space to anyone who asks. This behaves |
|---|
| 321 | just like a 1.3.0 server, without accounting of any sort. |
|---|
| 322 | |
|---|
| 323 | The next step is to restrict server use to just the participants. "tahoe |
|---|
| 324 | server disable-ambient-storage-authority" will undo the previous step; then |
|---|
| 325 | there are two basic approaches: |
|---|
| 326 | |
|---|
| 327 | * "full mesh": each node grants authority directly to all the others. |
|---|
| 328 | First, agree upon a userid number for each participant (the value doesn't |
|---|
| 329 | matter, as long as it is unique). Each user should then use "tahoe server |
|---|
| 330 | add-account" for all the accounts (including themselves, if they want some |
|---|
| 331 | of their shares to land on their own machine), including a quota if they |
|---|
| 332 | wish to restrict individuals: |
|---|
| 333 | |
|---|
| 334 | tahoe server add-account --account 1 --quota 5GB Alice |
|---|
| 335 | --> authority string for Alice |
|---|
| 336 | tahoe server add-account --account 2 --quota 5GB Bob |
|---|
| 337 | --> authority string for Bob |
|---|
| 338 | tahoe server add-account --account 3 --quota 5GB Carol |
|---|
| 339 | --> authority string for Carol |
|---|
| 340 | |
|---|
| 341 | Then email Alice's string to Alice, Bob's string to Bob, etc. Once all |
|---|
| 342 | users have used "tahoe client add-authority" on everything, each server |
|---|
| 343 | will accept N distinct authorities, and each client will hold N distinct |
|---|
| 344 | authorities. |
|---|
| 345 | |
|---|
| 346 | * "account manager": the group designates somebody to be the "AM", or |
|---|
| 347 | "account manager". The AM generates a keypair and publishes the public key |
|---|
| 348 | to all the participants, who create a local authority which delegates full |
|---|
| 349 | storage rights to the corresponding private key. The AM then delegates |
|---|
| 350 | account-restricted authority to each user, sending them their personal |
|---|
| 351 | authority string: |
|---|
| 352 | |
|---|
| 353 | AM: |
|---|
| 354 | tahoe authority create-authority --write-private-to=private.txt |
|---|
| 355 | --> public.txt |
|---|
| 356 | # email public.txt to all members |
|---|
| 357 | AM: |
|---|
| 358 | tahoe authority delegate --from-file=private.txt --account 1 --quota 5GB |
|---|
| 359 | --> alice_authority.txt # email this to Alice |
|---|
| 360 | tahoe authority delegate --from-file=private.txt --account 2 --quota 5GB |
|---|
| 361 | --> bob_authority.txt # email this to Bob |
|---|
| 362 | tahoe authority delegate --from-file=private.txt --account 3 --quota 5GB |
|---|
| 363 | --> carol_authority.txt # email this to Carol |
|---|
| 364 | ... |
|---|
| 365 | Alice: |
|---|
| 366 | # receives alice_authority.txt |
|---|
| 367 | tahoe client add-authority --from-file=alice_authority.txt |
|---|
| 368 | # receives public.txt |
|---|
| 369 | tahoe server add-authorization --from-file=public.txt |
|---|
| 370 | Bob: |
|---|
| 371 | # receives bob_authority.txt |
|---|
| 372 | tahoe client add-authority --from-file=bob_authority.txt |
|---|
| 373 | # receives public.txt |
|---|
| 374 | tahoe server add-authorization --from-file=public.txt |
|---|
| 375 | Carol: |
|---|
| 376 | # receives carol_authority.txt |
|---|
| 377 | tahoe client add-authority --from-file=carol_authority.txt |
|---|
| 378 | # receives public.txt |
|---|
| 379 | tahoe server add-authorization --from-file=public.txt |
|---|
| 380 | |
|---|
| 381 | If the members want to see names next to their local usage totals, they |
|---|
| 382 | can set local petnames for the accounts: |
|---|
| 383 | |
|---|
| 384 | tahoe server set-petname 1 Alice |
|---|
| 385 | tahoe server set-petname 2 Bob |
|---|
| 386 | tahoe server set-petname 3 Carol |
|---|
| 387 | |
|---|
| 388 | Alternatively, the AM could provide a usage aggregator, which will collect |
|---|
| 389 | usage values from all the storage servers and show the totals in a single |
|---|
| 390 | place, and add the petnames to that display instead. |
|---|
| 391 | |
|---|
| 392 | The AM gets more authority than anyone else (they can spoof everybody), |
|---|
| 393 | but each server has just a single authorization instead of N, and each |
|---|
| 394 | client has a single authority instead of N. When a new member joins the |
|---|
| 395 | group, the amount of work that must be done is significantly less, and |
|---|
| 396 | only two parties are involved instead of all N: |
|---|
| 397 | |
|---|
| 398 | AM: |
|---|
| 399 | tahoe authority delegate --from-file=private.txt --account 4 --quota 5GB |
|---|
| 400 | --> dave_authority.txt # email this to Dave |
|---|
| 401 | Dave: |
|---|
| 402 | # receives dave_authority.txt |
|---|
| 403 | tahoe client add-authority --from-file=dave_authority.txt |
|---|
| 404 | # receives public.txt |
|---|
| 405 | tahoe server add-authorization --from-file=public.txt |
|---|
| 406 | |
|---|
| 407 | Another approach is to let everybody be the AM: instead of keeping the |
|---|
| 408 | private.txt file secret, give it to all members of the group (but not to |
|---|
| 409 | outsiders). This lets current members bring new members into the group |
|---|
| 410 | without depending upon anybody else doing work. It also renders any notion |
|---|
| 411 | of enforced quotas meaningless, so it is only appropriate for actual |
|---|
| 412 | friends who are voluntarily refraining from spoofing each other. |
|---|
| 413 | |
|---|
| 414 | === Commercial Grid === |
|---|
| 415 | |
|---|
| 416 | A "commercial grid", like the one that allmydata.com manages as a for-profit |
|---|
| 417 | service, is characterized by a large number of independent clients (who do |
|---|
| 418 | not know each other), and by all of the storage servers being managed by a |
|---|
| 419 | single entity. In this case, we use an Account Manager like above, to |
|---|
| 420 | collapse the potential N*M explosion of authorities into something smaller. |
|---|
| 421 | We also create a dummy "parent" account, and give all the real clients |
|---|
| 422 | subaccounts under it, to give the operations personnel a convenient "total |
|---|
| 423 | space used" number. Each time a new customer joins, the AM is directed to |
|---|
| 424 | create a new authority for them, and the resulting string is provided to the |
|---|
| 425 | customer's client node. |
|---|
| 426 | |
|---|
| 427 | AM: |
|---|
| 428 | tahoe authority create-authority --account 1 \ |
|---|
| 429 | --write-private-to=AM-private.txt --write-public-to=AM-public.txt |
|---|
| 430 | |
|---|
| 431 | Each time a new storage server is brought up: |
|---|
| 432 | |
|---|
| 433 | SERVER: |
|---|
| 434 | tahoe server add-authorization --from-file=AM-public.txt |
|---|
| 435 | |
|---|
| 436 | Each time a new client joins: |
|---|
| 437 | |
|---|
| 438 | AM: |
|---|
| 439 | N = next_account++ |
|---|
| 440 | tahoe authority delegate --from-file=AM-private.txt --account 1,N |
|---|
| 441 | --> new_client_authority.txt # give this to new client |
|---|
| 442 | |
|---|
| 443 | == Programmatic Interfaces == |
|---|
| 444 | |
|---|
| 445 | The storage authority can be passed as a string in a single serialized form, |
|---|
| 446 | which is cut-and-pasteable and printable. It uses minimal punctuation, to |
|---|
| 447 | make it possible to include it as a URL query argument or HTTP header field |
|---|
| 448 | without requiring character-escaping. |
|---|
| 449 | |
|---|
| 450 | Before passing it over HTTP, however, note that revealing the authority |
|---|
| 451 | string to someone is equivalent to irrevocably delegating all that authority |
|---|
| 452 | to them. While this is appropriate when transferring authority from, say, a |
|---|
| 453 | receptive storage server to your local agent, it is not appropriate when |
|---|
| 454 | using a foreign tahoe node, or when asking a Helper to upload a specific |
|---|
| 455 | file. Attenuations (see below) should be used to limit the delegated |
|---|
| 456 | authority in these cases. |
|---|
| 457 | |
|---|
| 458 | In the programmatic web-API, any operation that consumes storage will accept |
|---|
| 459 | a storage-authority= query argument, the value of which will be the printable |
|---|
| 460 | form of an authority string. This includes all PUT operations, POST t=upload |
|---|
| 461 | and t=mkdir, and anything which creates a new file, creates a directory |
|---|
| 462 | (perhaps an intermediate one), or modifies a mutable file. |
|---|
| 463 | |
|---|
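From a client's point of view, attaching the proposed query argument could be as simple as the sketch below. The `storage-authority` argument is the one proposed in this document, not an existing web-API feature; the helper name, the example URL, and the placeholder authority string are illustrative assumptions.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def with_storage_authority(url, authority):
    """Append the proposed storage-authority= query argument to a web-API URL."""
    separator = "&" if urlsplit(url).query else "?"
    return url + separator + urlencode({"storage-authority": authority})


# "ABCDE" stands in for a real printable authority string.
url = with_storage_authority("http://127.0.0.1:3456/uri?t=mkdir", "ABCDE")
```

`urlencode` will escape any character that needs it, so the sketch stays correct even if a printable authority string turns out to contain punctuation.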
| 464 | Alternatively, the authority string can also be passed through an HTTP |
|---|
| 465 | header. A single "X-Tahoe-Storage-Authority:" header can be used with the |
|---|
| 466 | printable authority string. If the string is too large to fit in a single |
|---|
| 467 | header, the application can provide a series of numbered |
|---|
| 468 | "X-Tahoe-Storage-Authority-1:", "X-Tahoe-Storage-Authority-2:", etc, headers, |
|---|
| 469 | and these will be sorted in alphabetical order (please use 08/09/10/11 rather |
|---|
| 470 | than 8/9/10/11), stripped of leading and trailing whitespace, and |
|---|
| 471 | concatenated. The HTTP header form can accommodate larger authority strings, |
|---|
| 472 | since these strings can grow too large to pass as a query argument |
|---|
| 473 | (especially when several delegations or attenuations are involved). However, |
|---|
| 474 | depending upon the HTTP client library being used, passing extra HTTP headers |
|---|
| 475 | may be more complicated than simply modifying the URL, and may be impossible |
|---|
| 476 | in some cases (such as javascript running in a web browser). |
|---|
| 477 | |
|---|
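The numbered-header scheme can be sketched as a pair of functions: one that splits a long string on the client side, and one that reassembles it as described (sort, strip, concatenate). The function names and the `chunk_size` parameter are assumptions, and the sketch treats header names as case-sensitive dictionary keys, which a real HTTP stack would not.

```python
BASE = "X-Tahoe-Storage-Authority"


def split_into_headers(authority, chunk_size):
    """Split a long authority string into zero-padded, numbered headers."""
    chunks = [authority[i:i + chunk_size]
              for i in range(0, len(authority), chunk_size)]
    if len(chunks) <= 1:
        return {BASE: authority}
    width = len(str(len(chunks)))   # e.g. 2 digits once there are 10+ chunks
    return {f"{BASE}-{n:0{width}d}": chunk
            for n, chunk in enumerate(chunks, start=1)}


def join_headers(headers):
    """Reassemble the string the way the text describes: sort, strip, concatenate."""
    if BASE in headers:
        return headers[BASE].strip()
    names = sorted(name for name in headers if name.startswith(BASE + "-"))
    return "".join(headers[name].strip() for name in names)
```

Zero-padding matters because a plain alphabetical sort places "-10" before "-2", which would scramble any string split into ten or more unpadded pieces.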
| 478 | TODO: we may add a stored-token form of authority-passing to handle |
|---|
| 479 | environments in which query-args won't work and headers are not available. |
|---|
| 480 | This approach would use a special PUT which takes the authority string as the |
|---|
| 481 | HTTP body, and remembers it on the server side in association with a |
|---|
| 482 | brief-but-unguessable token. Later operations would then use the authority by |
|---|
| 483 | passing a storage-authority-token=XYZ query argument. These authorities |
|---|
| 484 | would expire after some period. |
|---|
| 485 | |
|---|
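One possible shape for the stored-token idea is an in-memory table keyed by a random token. Everything in this sketch is hypothetical, including the class name, the one-hour default lifetime, and the 16-byte token length; the injectable `clock` exists only to make expiry easy to exercise.

```python
import secrets
import time


class AuthorityTokenStore:
    """Maps short unguessable tokens to stored authority strings, with expiry."""

    def __init__(self, lifetime_seconds=3600, clock=time.time):
        self.lifetime = lifetime_seconds
        self.clock = clock
        self._entries = {}   # token -> (authority string, expiry time)

    def remember(self, authority):
        """Store the authority string and hand back a fresh random token."""
        token = secrets.token_urlsafe(16)
        self._entries[token] = (authority, self.clock() + self.lifetime)
        return token

    def lookup(self, token):
        """Return the stored authority, or None if unknown or expired."""
        entry = self._entries.get(token)
        if entry is None:
            return None
        authority, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[token]
            return None
        return authority
```

Anyone who learns the token can wield the stored authority until it expires, so the token deserves the same care as the authority string itself.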
| 486 | == Quota Management, Aggregation, Reporting == |
|---|
| 487 | |
|---|
| 488 | The storage server will maintain enough information to efficiently compute |
|---|
| 489 | usage totals for each account referenced in all of their leases, as well as |
|---|
| 490 | all their parent accounts. This information is used for several purposes: |
|---|
| 491 | |
|---|
| 492 | * enforce server-space restrictions, by selectively rejecting storage |
|---|
| 493 | requests which would cause the account-usage-total to rise above the limit |
|---|
| 494 | specified in the enabling authorization string |
|---|
| 495 | * report individual account usage to the account-holder (if a client can |
|---|
| 496 | consume space under account A, they are also allowed to query usage for |
|---|
| 497 | account A or a subaccount). |
|---|
| 498 | * report individual account usage to the storage-server operator, possibly |
|---|
| 499 | associated with a pet name |
|---|
| 500 | * report usage for all accounts to the storage-server operator, possibly |
|---|
| 501 | associated with a pet name, in the form of a large table |
|---|
| 502 | * report usage for all accounts to an external aggregator |
|---|
| 503 | |
|---|
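
One way to picture the per-account totals and the limit check is the sketch below. It assumes that a parent account is any proper prefix of the account-id sequence (so 1,4 is the parent of 1,4,7), and that leases can be summarized as (accountid, size) pairs; the function names and the in-memory dictionary are hypothetical, since the actual server-side tables are not specified here.

```python
from collections import defaultdict


def usage_totals(leases):
    """Sum lease sizes per account, crediting every parent account too.

    `leases` is an iterable of (accountid, size_in_bytes) pairs, where
    accountid is a tuple of non-negative integers.
    """
    totals = defaultdict(int)
    for accountid, size in leases:
        # Credit the account itself and each of its prefixes (parents).
        for depth in range(1, len(accountid) + 1):
            totals[tuple(accountid[:depth])] += size
    return dict(totals)


def would_exceed(totals, accountid, request_size, server_size_limit):
    """True if accepting the request would push usage above the limit."""
    current = totals.get(tuple(accountid), 0)
    return current + request_size > server_size_limit
```

A server following this sketch would reject a storage request whenever `would_exceed` returns True for the account named in the request, using the 'server-size' value from the enabling authority as the limit.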
| 504 | The external aggregator would take usage information from all the storage |
|---|
| 505 | servers in a single grid and sum them together, providing a grid-wide usage |
|---|
| 506 | number for each account. This could be used by e.g. clients in a commercial |
|---|
| 507 | grid to report overall-space-used to the end user. |
|---|
| 508 | |
|---|
| 509 | There will be web-API URLs available for all of these reports. |
|---|
| 510 | |
|---|
| 511 | TODO: storage servers might also have a mechanism to apply space-usage limits |
|---|
| 512 | to specific account ids directly, rather than requiring that these be |
|---|
| 513 | expressed only through authority-string limitation fields. This would let a |
|---|
| 514 | storage server operator revoke their space-allocation after delivering the |
|---|
| 515 | authority string. |
|---|
| 516 | |
|---|
| 517 | == Low-Level Formats == |
|---|
| 518 | |
|---|
| 519 | This section describes the low-level formats used by the Accounting process, |
|---|
| 520 | beginning with the storage-authority data structure and working upwards. This |
|---|
| 521 | section is organized to follow the storage authority, starting from the point |
|---|
| 522 | of grant. The discussion will thus begin at the storage server (where the |
|---|
| 523 | authority is first created), work back to the client (which receives the |
|---|
| 524 | authority as a web-API argument), then follow the authority back to the |
|---|
| 525 | servers as it is used to enable specific storage operations. It will then |
|---|
| 526 | detail the accounting tables that the storage server is obligated to |
|---|
| 527 | maintain, and describe the interfaces through which these tables are accessed |
|---|
| 528 | by other parties. |
|---|
| 529 | |
|---|
| 530 | === Storage Authority === |
|---|
| 531 | |
|---|
| 532 | ==== Terminology ==== |
|---|
| 533 | |
|---|
| 534 | Storage Authority is represented as a chain of certificates and a private |
|---|
| 535 | key. Each certificate authorizes and restricts a specific private key. The |
|---|
| 536 | initial certificate in the chain derives its authority by being placed in the |
|---|
| 537 | storage server's tahoe.cfg file (i.e. by being authorized by the storage |
|---|
| 538 | server operator). All subsequent certificates are signed by the authorized |
|---|
| 539 | private key that was identified in the previous certificate: they derive |
|---|
| 540 | their authority by delegation. Each certificate has restrictions which limit |
|---|
| 541 | the authority being delegated. |
|---|
| 542 | |
|---|
| 543 | authority: ([cert[0], cert[1], cert[2] ...], privatekey) |
|---|
| 544 | |
|---|
| 545 | The "restrictions dictionary" is a table which establishes an upper bound on |
|---|
| 546 | how this authority (or any attenuations thereof) may be used. It is |
|---|
| 547 | effectively a set of key-value pairs. |
|---|
| 548 | |
|---|
| 549 | A "signing key" is an EC-DSA192 private key string and is 12 bytes |
|---|
| 550 | long. A "verifying key" is an EC-DSA192 public key string, and is 24 |
|---|
| 551 | bytes long. A "key identifier" is a string which securely identifies a |
|---|
| 552 | specific signing/verifying keypair: for long RSA keys it would be a |
|---|
| 553 | secure hash of the public key, but since ECDSA192 keys are so short, |
|---|
| 554 | we simply use the full verifying key verbatim. A "key hint" is a |
|---|
| 555 | variable-length prefix of the key identifier, perhaps zero bytes long, |
|---|
| 556 | used to help a recipient reduce the number of verifying keys that it |
|---|
| 557 | must search to find one that matches a signed message. |
|---|
| 558 | |
|---|
| 559 | ==== Authority Chains ==== |
|---|
| 560 | |
|---|
| 561 | The authority chain consists of a list of certificates, each of which has a |
|---|
| 562 | serialized restrictions dictionary. Each dictionary will have a |
|---|
| 563 | "delegate-to-key" field, which delegates authority to a private key, |
|---|
| 564 | referenced with a key identifier. In addition, the non-initial certs are |
|---|
| 565 | signed, so they each contain a signature and a key hint: |
|---|
| 566 | |
|---|
| 567 | cert[0]: serialized(restrictions_dictionary) |
|---|
| 568 | cert[1]: serialized(restrictions_dictionary), signature, keyhint |
|---|
| 569 | cert[2]: serialized(restrictions_dictionary), signature, keyhint |
|---|
| 570 | |
|---|
| 571 | In this example, suppose cert[0] contains a delegate-to-key field that |
|---|
| 572 | identifies a keypair sign_A/verify_A. In this case, cert[1] will have a |
|---|
| 573 | signature that was made with sign_A, and the keyhint in cert[1] will |
|---|
| 574 | reference verify_A. |
|---|
| 575 | |
|---|
| 576 | cert[0].restrictions[delegate-to-key] = A_keyid |
|---|
| 577 | |
|---|
| 578 | cert[1].signature = SIGN(sign_A, serialized(cert[1].restrictions)) |
|---|
| 579 | cert[1].keyhint = verify_A |
|---|
| 580 | cert[1].restrictions[delegate-to-key] = B_keyid |
|---|
| 581 | |
|---|
| 582 | cert[2].signature = SIGN(sign_B, serialized(cert[2].restrictions)) |
|---|
| 583 | cert[2].keyhint = verify_B |
|---|
| 584 | cert[2].restrictions[delegate-to-key] = C_keyid |
|---|
| 585 | |
|---|
| 586 | In this example, the full storage authority consists of the cert[0,1,2] chain |
|---|
| 587 | and the sign_C private key: anyone who is in possession of both will be able |
|---|
| 588 | to exert this authority. To wield the authority, a client will present the |
|---|
| 589 | cert[0,1,2] chain and an action message signed by sign_C; the server will |
|---|
| 590 | validate the chain and the signature before performing the requested action. |
|---|
| 591 | The only circumstances that might prompt the client to share the sign_C |
|---|
| 592 | private key with another party (including the server) would be if it wanted |
|---|
| 593 | to irrevocably share its full authority with that party. |
|---|
| 594 | |
|---|
| 595 | ==== Restriction Dictionaries ==== |
|---|
| 596 | |
|---|
| 597 | Within a restriction dictionary, the following keys are defined. Their full |
|---|
| 598 | meanings are defined later. |
|---|
| 599 | |
|---|
| 600 | 'accountid': an arbitrary-length sequence of integers >=0, restricting the |
|---|
| 601 | accounts which can be manipulated or used in leases |
|---|
| 602 | 'SI': a storage index (binary string), controlling which file may be |
|---|
| 603 | manipulated |
|---|
| 604 | 'serverid': binary string, limiting which server will accept requests |
|---|
| 605 | 'UEB-hash': binary string, limiting the content of the file being manipulated |
|---|
| 606 | 'before': timestamp (seconds since epoch), limits the lifetime of this |
|---|
| 607 | authority |
|---|
| 608 | 'server-size': integer >0, maximum aggregate storage (in bytes) per account |
|---|
| 609 | 'delegate-to-key': binary string (ECDSA pubkey identifier) |
|---|
| 610 | 'furl-to': printable FURL string |
|---|
| 611 | |
|---|
| 612 | ==== Authority Serialization ==== |
|---|
| 613 | |
|---|
| 614 | There is only one form of serialization: a somewhat-compact URL-safe |
|---|
| 615 | cut-and-pasteable printable form. We are interested in minimizing the size of |
|---|
| 616 | the resulting authority, so rather than using a general-purpose (perhaps |
|---|
| 617 | JSON-based) serialization scheme, we use one that is specialized for this |
|---|
| 618 | task. |
|---|
| 619 | |
|---|
| 620 | This URL-safe form will use minimal punctuation to avoid quoting issues when |
|---|
| 621 | used in a URL query argument. It would be nice to avoid word-breaking |
|---|
| 622 | characters that make cut-and-paste troublesome; however, this is more |
|---|
| 623 | difficult because most non-alphanumeric characters are word-breaking in at |
|---|
| 624 | least one application. |
|---|
| 625 | |
|---|
| 626 | The serialized storage authority as a whole contains a single version |
|---|
| 627 | identifier and magic number at the beginning. None of the internal components |
|---|
| 628 | contain redundant version numbers: they are implied by the container. If |
|---|
| 629 | components are serialized independently for other reasons, they may contain |
|---|
| 630 | version identifiers in that form. |
|---|
| 631 | |
|---|
| 632 | Signing keys (i.e. private keys) are URL-safe-serialized using Zooko's base62 |
|---|
| 633 | alphabet, which offers almost the same density as standard base64 but without |
|---|
| 634 | any non-URL-safe or word-breaking characters. Since we use fixed-format keys |
|---|
| 635 | (EC-DSA, 192bit, with SHA256), the private keys are fixed-length (96 bits or |
|---|
| 636 | 12 bytes), so there is no length indicator: all URL-safe-serialized signing |
|---|
| 637 | keys are 17 base62 characters long. The 192-bit verifying keys (i.e. public |
|---|
| 638 | keys) use the same approach: the URL-safe form is 33 characters long. |
|---|
| 639 | |
|---|
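
The character counts quoted above follow from the key sizes: an n-bit value needs the smallest number of base62 characters whose range covers 2**n. The short check below is only a worked calculation under that assumption (the helper name `base62_length` is made up for this sketch); it does not implement the base62 encoding itself.

```python
def base62_length(num_bits):
    """Smallest count of base62 characters able to represent num_bits bits."""
    length = 0
    capacity = 1  # number of distinct values representable so far
    while capacity < 2 ** num_bits:
        capacity *= 62
        length += 1
    return length


# Sizes taken from the text: 96-bit signing keys, 192-bit verifying keys.
signing_key_chars = base62_length(96)
verifying_key_chars = base62_length(192)
```

Each base62 character carries a little under 6 bits, so 96 bits do not quite fit in 16 characters and 192 bits do not quite fit in 32.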
| 640 | An account-id sequence (a variable-length sequence of non-negative numbers) |
|---|
| 641 | is serialized by representing each number in decimal ASCII, then joining the |
|---|
| 642 | pieces with commas. The string is terminated by the first non-[0-9,] |
|---|
| 643 | character encountered, which will either be the key-identifier letter of the |
|---|
| 644 | next field, or the dictionary-terminating character at the end. |
|---|
| 645 | |
|---|
| 646 | Any single integral decimal number (such as the "before" timestamp field, or |
|---|
| 647 | the "server-size" field) is serialized as a variable-length sequence of ASCII |
|---|
| 648 | decimal digits, terminated by any non-digit. |
|---|
| 649 | |
|---|
| 650 | The restrictions dictionary is serialized as a concatenated series of |
|---|
| 651 | key-identifier-letter / value string pairs, ending with the marker "E.". The |
|---|
| 652 | URL-safe form uses a single printable letter to indicate which key is |
|---|
| 653 | being serialized. Each type of value string is serialized differently: |
|---|
| 654 | |
|---|
| 655 | "A": accountid: variable-length sequence of comma-joined numbers |
|---|
| 656 | "I": storage index: fixed-length 26-character *base32*-encoded storage index |
|---|
| 657 | "P": server id (peer id): fixed-length 32-character *base32* encoded serverid |
|---|
| 658 | (matching the printable Tub.tubID string that Foolscap provides) |
|---|
| 659 | "U": UEB hash: fixed-length 43-character base62 encoded UEB hash |
|---|
| 660 | "B": before: variable-length sequence of decimal digits, seconds-since-epoch. |
|---|
| 661 | "S": server-size: variable-length sequence of decimal digits, max size in bytes |
|---|
| 662 | "D": delegate-to-key: ECDSA public key, 33 base62 characters. |
|---|
| 663 | "F": furl-to: variable-length FURL string, wrapped in a netstring: |
|---|
| 664 | "%d:%s," % (len(FURL), FURL). Note that this is rarely pasted. |
|---|
| 665 | "E.": end-of-dictionary marker |
|---|
| 666 | |
|---|
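
A minimal sketch of this dictionary serialization, covering only the "A", "B", "S", "D" and "F" fields, might look like the following. Several things here are assumptions rather than part of the design: the function names, the choice to emit fields in the order they are listed above (the text does not fix an order), and the convention that 'delegate-to-key' is passed in already base62-encoded, since the base62 codec is not shown.

```python
def serialize_accountid(accountid):
    """Decimal ASCII numbers joined with commas, e.g. (1, 4) -> "1,4"."""
    return ",".join(str(number) for number in accountid)


def serialize_restrictions(restrictions):
    """Serialize a subset of a restrictions dictionary, ending with "E."."""
    parts = []
    if "accountid" in restrictions:
        parts.append("A" + serialize_accountid(restrictions["accountid"]))
    if "before" in restrictions:
        parts.append("B%d" % restrictions["before"])
    if "server-size" in restrictions:
        parts.append("S%d" % restrictions["server-size"])
    if "delegate-to-key" in restrictions:
        key = restrictions["delegate-to-key"]  # assumed already base62
        if len(key) != 33:
            raise ValueError("delegate-to-key must be 33 base62 characters")
        parts.append("D" + key)
    if "furl-to" in restrictions:
        furl = restrictions["furl-to"]
        # Netstring wrapping, as given in the field table.
        parts.append("F%d:%s," % (len(furl), furl))
    return "".join(parts) + "E."
```

Since every key-identifier letter is alphabetic, a reader of the "A", "B" and "S" values can stop at the first character outside the value's own alphabet, which is what lets these fields go without a length prefix.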
| 667 | The ECDSA signature is serialized as a variable number of base62 characters, |
|---|
| 668 | terminated by a period. We expect the signature to be about 384 bits (48 |
|---|
| 669 | bytes) long, or 65 base62 characters. A missing signature (such as for the |
|---|
| 670 | initial cert) is represented as a single period. |
|---|
| 671 | |
|---|
| 672 | The key hint is serialized with a base62-encoded serialized hint string (a |
|---|
| 673 | byte-quantized prefix of the serialized public key), terminated by a period. |
|---|
| 674 | An empty hint would thus be serialized as a single period. For the current |
|---|
| 675 | design, we expect the key hint to be empty. |
|---|
| 676 | |
|---|
| 677 | The full storage authority string consists of a certificate chain and a |
|---|
| 678 | delegate private key. Given the single-certificate serialization scheme |
|---|
| 679 | described above, the full authority is serialized as follows: |
|---|
| 680 | |
|---|
| 681 | * version prefix: depends upon the application, but for storage-authority |
|---|
| 682 | chains this will be "sa0-", for Storage-Authority Version 0. |
|---|
| 683 | * serialized certificates, concatenated together |
|---|
| 684 | * serialized private key (to which the last certificate delegates authority) |
|---|
| 685 | |
|---|
| 686 | Note that this serialization form does not have an explicit terminator, so |
|---|
| 687 | the environment must provide a length indicator or some other way to identify |
|---|
| 688 | the end of the authority string. The benefit of this approach is that the |
|---|
| 689 | full string will begin and end with alphanumeric characters, making |
|---|
| 690 | cut-and-paste easier (increasing the size of the mouse target: anywhere |
|---|
| 691 | within the final component will work). |
|---|
| 692 | |
|---|
| 693 | Also note that the period is a reserved delimiter: it cannot appear in the |
|---|
| 694 | serialized restrictions dictionary. The parser can remove the version prefix, |
|---|
| 695 | split the rest on periods, and expect to see 3*k+1 fields, consisting of k |
|---|
| 696 | (restriction-dictionary,signature,keyhint) 3-tuples and a single private key |
|---|
| 697 | at the end. |
|---|
| 698 | |
|---|
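
The splitting step described above could be sketched like this. It is an outline under stated assumptions, not a full parser: the name `split_authority` is hypothetical, at least one certificate is assumed to be present, the fields are returned as raw strings without decoding, and the code relies on the rule that a period never appears inside a serialized dictionary.

```python
def split_authority(authority, prefix="sa0-"):
    """Split a serialized authority into certificate fields and a key.

    Returns (certs, private_key), where certs is a list of
    (restrictions, signature, keyhint) string 3-tuples.
    """
    if not authority.startswith(prefix):
        raise ValueError("missing version prefix %r" % prefix)
    fields = authority[len(prefix):].split(".")
    # k certificates contribute 3 fields each, plus one trailing key.
    if len(fields) < 4 or len(fields) % 3 != 1:
        raise ValueError("expected 3*k+1 period-separated fields")
    private_key = fields[-1]
    certs = [tuple(fields[i:i + 3]) for i in range(0, len(fields) - 1, 3)]
    return certs, private_key
```

Note that splitting consumes the periods, so each restrictions field comes back ending in "E" rather than "E.", and an absent signature or empty key hint shows up as an empty string.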
| 699 | Some examples: |
|---|
| 700 | |
|---|
| 701 | (example A) |
|---|
| 702 | cert[0] delegates account 1,4 to (pubkey ZlFA / privkey 1f2S): |
|---|
| 703 | |
|---|
| 704 | sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...1f2SI9UJPXvb7vdJ1 |
|---|
| 705 | |
|---|
| 706 | (example B) |
|---|
| 707 | cert[0] delegates account 1,4 to ZlFA/1f2S |
|---|
| 708 | cert[1] subdelegates 5GB and subaccount 1,4,7 to pubkey 0BPo/06rt: |
|---|
| 709 | |
|---|
| 710 | sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...A1,4,7S5000000000D0BPoGxJ3M4KWrmdpLnknhJABrWip5e9kPE,7cyhQvv5axdeihmOzIHjs85TcUIYiWHdsxNz50GTerEOR5ucj2TITPXxyaCUli1oF...06rtcPQotR3q4f2cT |
|---|
| 711 | |
|---|
| 712 | |
|---|
| 713 | |
|---|
| 714 | |
|---|
| 715 | |
|---|
| 716 | |
|---|
| 717 | |
|---|
| 718 | == Problems == |
|---|
| 719 | |
|---|
| 720 | Problems which have thus far been identified with this approach: |
|---|
| 721 | |
|---|
| 722 | * allowing arbitrary subaccount generation will permit a DoS attack, in |
|---|
| 723 | which an authorized uploader consumes lots of DB space by creating an |
|---|
| 724 | unbounded number of randomly-generated subaccount identifiers. OTOH, they |
|---|
| 725 | can already attach an unbounded number of leases to any file they like, |
|---|
| 726 | consuming a lot of space. |
|---|
| 727 | |
|---|