1 | |
---|
2 | = Accounting = |
---|
3 | |
---|
4 | "Accounting" is the arena of the Tahoe system that concerns measuring, |
---|
5 | controlling, and enabling the ability to upload and download files, and to |
---|
6 | create new directories. In contrast with the capability-based access control |
---|
7 | model, which dictates how specific files and directories may or may not be |
---|
8 | manipulated, Accounting is concerned with resource consumption: how much disk |
---|
9 | space a given person/account/entity can use. |
---|
10 | |
---|
11 | Tahoe releases up to and including 1.4.1 have a nearly-unbounded resource |
---|
12 | usage model. Anybody who can talk to the Introducer gets to talk to all the |
---|
13 | Storage Servers, and anyone who can talk to a Storage Server gets to use as |
---|
14 | much disk space as they want (up to the reserved_space= limit imposed by the |
---|
15 | server, which affects all users equally). Not only is the per-user space |
---|
16 | usage unlimited, it is also unmeasured: the owner of the Storage Server has |
---|
17 | no way to find out how much space Alice or Bob is using. |
---|
18 | |
---|
19 | The goals of the Accounting system are thus: |
---|
20 | |
---|
21 | * allow the owner of a storage server to control who gets to use disk space, |
---|
22 | with separate limits per user |
---|
23 | * allow both the server owner and the user to measure how much space the user |
---|
24 | is consuming, in an efficient manner |
---|
25 | * provide grid-wide aggregation tools, so a set of cooperating server |
---|
26 | operators can easily measure how much a given user is consuming across all |
---|
27 | servers. This information should also be available to the user in question. |
---|
28 | |
---|
29 | For the purposes of this document, the terms "Account" and "User" are mostly |
---|
30 | interchangeable. The fundamental unit of Accounting is the "Account", in that |
---|
31 | usage and quota enforcement are performed separately for each account. These |
---|
32 | accounts might correspond to individual human users, or they might be shared |
---|
33 | among a group, or a user might have an arbitrary number of accounts. |
---|
34 | |
---|
35 | Accounting interacts with Garbage Collection. To protect their shares from |
---|
36 | GC, clients maintain limited-duration leases on those shares: when the last |
---|
37 | lease expires, the share is deleted. Each lease has a "label", which |
---|
38 | indicates the account or user which wants to keep the share alive. A given |
---|
39 | account's "usage" (their per-server aggregate usage) is simply the sum of the |
---|
40 | sizes of all shares on which they hold a lease. The storage server may limit |
---|
41 | the user to a fixed "quota" (an upper bound on their usage). To keep a file |
---|
42 | alive, the user must be willing to use up some of their quota. |
---|
43 | |
---|
44 | Note that a popular file might have leases from multiple users, in which case |
---|
45 | one user might take a chance and decline to add their own lease, saving some |
---|
46 | of their quota and hoping that the other leases continue to keep the file |
---|
47 | alive despite their personal unwillingness to contribute to the effort. One |
---|
48 | could imagine a "pro-rated quotas" scheme, in which a 10MB file with 5 |
---|
49 | leaseholders would deduct 2MB from each leaseholder's quota. We have decided |
---|
50 | to not implement pro-rated quotas, because such a scheme would make usage |
---|
51 | values hard to predict: a given account might suddenly go over quota solely |
---|
52 | because of a third party's actions. |
---|
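The unpredictability argument above can be made concrete with a small sketch. The function names are hypothetical and exist only for illustration; the point is that under pro-rating a user's charge depends on what the other leaseholders do, while under the chosen full-charge model it does not.

```python
MB = 1000 * 1000

def full_charge(share_size, leaseholders):
    # Chosen model: every leaseholder is charged the whole share size.
    return {who: share_size for who in leaseholders}

def pro_rated_charge(share_size, leaseholders):
    # Rejected model: the share size is split evenly among leaseholders.
    return {who: share_size / len(leaseholders) for who in leaseholders}

five = ["alice", "bob", "carol", "dave", "eve"]
before = pro_rated_charge(10 * MB, five)["alice"]      # 2MB
after = pro_rated_charge(10 * MB, ["alice"])["alice"]  # 10MB once the others let their leases expire
```

Alice's pro-rated charge grows fivefold without any action on her part, whereas `full_charge` reports 10MB for her in both situations.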
53 | |
---|
54 | == Accounting Implementation == |
---|
55 | |
---|
56 | The implementation of these accounting features is tracked in this ticket: |
---|
57 | |
---|
58 | https://tahoe-lafs.org/trac/tahoe-lafs/ticket/666 |
---|
59 | |
---|
60 | == Authority Flow == |
---|
61 | |
---|
62 | The authority to consume space on the storage server originates, of course, |
---|
63 | with the storage server operator. These operators start with complete control |
---|
64 | over their space, and delegate portions of it to others: either directly to |
---|
65 | clients who want to upload files, or to intermediaries who can then delegate |
---|
66 | attenuated authority onwards. The operators have various reasons for wanting |
---|
67 | to share their space: monetary consideration, expectations of in-kind |
---|
68 | exchange, or simple generosity. But the final authority always rests with the |
---|
69 | operator. |
---|
70 | |
---|
71 | The server operator grants limited authority over their space by configuring |
---|
72 | their server to accept requests that demonstrate knowledge of certain |
---|
73 | secrets. They then share those secrets with the client who intends to use |
---|
74 | this space, or with an intermediary who will generate still more secrets and |
---|
75 | share those with the client. Eventually, an upload or create-directory |
---|
76 | operation will be performed that needs this authority. Part of the operation |
---|
77 | will involve proving knowledge of the secret to the storage server, and the |
---|
78 | server will require this proof before accepting the uploaded share or adding |
---|
79 | a new lease. |
---|
80 | |
---|
81 | The authority is expressed as a string, containing cryptographically-signed |
---|
82 | messages and keys. The string also contains "restrictions", which are |
---|
83 | annotations that explain the limits imposed upon this authority, either by |
---|
84 | the original grantor (the storage server operator) or by one of the |
---|
85 | intermediaries. Authority can be reduced but not increased. Any holder of a |
---|
86 | given authority can delegate some or all of it to another party. |
---|
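The "reduced but not increased" property can be sketched by modelling an authority as a chain of restrictions, all of which must accept a request. This sketch leaves out the signatures and keys entirely; `delegate`, `permits`, and the shape of the request dictionary are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

Request = Dict[str, int]
Restriction = Callable[[Request], bool]
Authority = Tuple[Restriction, ...]

def delegate(authority: Authority, *extra: Restriction) -> Authority:
    # Delegation only appends restrictions; earlier ones are never dropped.
    return authority + extra

def permits(authority: Authority, request: Request) -> bool:
    # A request is honored only if every restriction in the chain accepts it.
    return all(check(request) for check in authority)

operator: Authority = ()
alice = delegate(operator, lambda req: req["size"] <= 5)
# Alice tries to hand Amy a looser limit than her own; it has no effect.
amy = delegate(alice, lambda req: req["size"] <= 10)
```

Because Alice's own restriction remains in Amy's chain, a request of size 7 is refused even though Amy's added restriction would allow it.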
87 | |
---|
88 | The authority string may be short enough to include as an argument to a CLI |
---|
89 | command (--with-authority ABCDE), or it may be long enough that it must be |
---|
90 | stashed in a file and referenced in some other fashion (--with-authority-file |
---|
91 | ~/.my_authority). There are CLI tools to create brand new authority strings, |
---|
92 | to derive attenuated authorities from an existing one, and to explain the |
---|
93 | contents of an authority string. These authority strings can be shared with |
---|
94 | others just like filecaps and dircaps: knowledge of the authority string is |
---|
95 | both necessary and sufficient to wield the authority it represents. |
---|
96 | |
---|
97 | Web-API requests will include the authority necessary to complete the |
---|
98 | operation. When used by a CLI tool, the authority is likely to come from |
---|
99 | ~/.tahoe/private/authority (i.e. it is ambient to the user who has access to |
---|
100 | that node, just like aliases provide similar access to a specific "root |
---|
101 | directory"). When used by the browser-oriented WUI, the authority will [TODO] |
---|
102 | somehow be retained on each page in a way that minimizes the risk of CSRF |
---|
103 | attacks and allows safe sharing (cut-and-paste of a URL without sharing the |
---|
104 | storage authority too). The client node receiving the web-API request will |
---|
105 | extract the authority string from the request and use it to build the storage |
---|
106 | server messages that it sends to fulfill that request. |
---|
107 | |
---|
108 | == Definition Of Authority == |
---|
109 | |
---|
110 | The term "authority" is used here in the object-capability sense: it refers |
---|
111 | to the ability of some principal to cause some action to occur, whether |
---|
112 | because they can do it themselves, or because they can convince some other |
---|
113 | principal to do it for them. In Tahoe terms, "storage authority" is the |
---|
114 | ability to do one of the following actions: |
---|
115 | |
---|
116 | * upload a new share, thus consuming storage space |
---|
117 | * add a new lease to a share, thus preventing space from being reclaimed |
---|
118 | * modify an existing mutable share, potentially increasing the space consumed |
---|
119 | |
---|
120 | The Accounting effort may involve other kinds of authority that get limited |
---|
121 | in a manner similar to storage authority, such as the ability to download a |
---|
122 | share or query whether a given share is present: anything that may consume |
---|
123 | CPU time, disk bandwidth, or other limited resources. The authority to renew |
---|
124 | or cancel a lease may be controlled in a similar fashion. |
---|
125 | |
---|
126 | Storage authority, as granted from a server operator to a client, is not |
---|
127 | simply a binary "use space or not" grant. Instead, it is parameterized by a |
---|
128 | number of "restrictions". The most important of these restrictions (with |
---|
129 | respect to the goals of Accounting) is the "Account Label". |
---|
130 | |
---|
131 | === Account Labels === |
---|
132 | |
---|
133 | A Tahoe "Account" is defined by a variable-length sequence of small integers |
---|
134 | (they are not required to be small, as the actual limit is 2**64, but neither |
---|
135 | are they required to be unguessable). For the purposes of discussion, these |
---|
136 | lists will be expressed as period-joined strings: the two-element list (1,4) |
---|
137 | will be displayed here as "1.4". |
---|
138 | |
---|
139 | These accounts are arranged in a hierarchy: the account identifier 1.4 is |
---|
140 | considered to be a "parent" of 1.4.2 . There is no relationship between the |
---|
141 | values used by unrelated accounts: 1.4 is unrelated to 2.4, despite both |
---|
142 | coincidentally using a "4" in the second element. |
---|
143 | |
---|
144 | Each lease has a label, which contains the Account identifier. The storage |
---|
145 | server maintains an aggregate size count for each label prefix: when asked |
---|
146 | about account 1.4, it will report the amount of space used by shares labeled |
---|
147 | 1.4, 1.4.2, 1.4.7, 1.4.7.8, etc (but *not* 1 or 1.5). |
---|
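One possible way to keep the per-prefix aggregate is to update a counter for every prefix of a label whenever a lease is added. The class and method names below are hypothetical; labels are represented as tuples of integers so that 1.4 is never confused with 14.

```python
from collections import Counter

class PrefixUsage:
    """Aggregate size count for each label prefix (a sketch, not the real server)."""

    def __init__(self):
        self.totals = Counter()

    def add_lease(self, label, size):
        # Charge the share size to the label and to each of its parents.
        for n in range(1, len(label) + 1):
            self.totals[label[:n]] += size

    def usage(self, account):
        # Counter returns 0 for accounts that hold no leases.
        return self.totals[account]

server = PrefixUsage()
for label, size in [((1,), 100), ((1, 4), 10), ((1, 4, 2), 20),
                    ((1, 4, 7), 30), ((1, 4, 7, 8), 40), ((1, 5), 50)]:
    server.add_lease(label, size)
```

With these leases, `server.usage((1, 4))` covers 1.4, 1.4.2, 1.4.7, and 1.4.7.8 but not 1 or 1.5.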
148 | |
---|
149 | The "Account Label" restriction allows a client to apply any label it wants, |
---|
150 | as long as that label begins with a specific prefix. If account 1 is |
---|
151 | associated with Alice, then Alice will receive a storage authority string |
---|
152 | that contains a "must start with 1" restriction, enabling her to use |
---|
153 | storage space but obligating her to lease her shares with a label that can be |
---|
154 | traced back to her. She can delegate part of her authority to others (perhaps |
---|
155 | with other non-label restrictions, such as a space restriction or time limit) |
---|
156 | with or without an additional label restriction. For example, she might |
---|
157 | delegate some of her authority to her friend Amy, with a 1.4 label |
---|
158 | restriction. Amy could then create labels with 1.4 or 1.4.7, but she could |
---|
159 | not create labels with the bare 1 identifier that Alice can use, nor could she |
---|
160 | create labels with 1.5 (which Alice might have given to her other friend |
---|
161 | Annette). The storage server operator can ask about the usage of 1 to find |
---|
162 | out how much Alice is responsible for (which includes the space that she has |
---|
163 | delegated to Amy and Annette), and none of the A-users can avoid being |
---|
164 | counted in this total. But Alice can ask the storage server about the usage |
---|
165 | of 1.4 to find out how much Amy has taken advantage of her gift. Likewise, |
---|
166 | Alice has control over any lease with a label that begins with 1, so she can |
---|
167 | cancel Amy's leases and free the space they were consuming. If this seems |
---|
168 | surprising, consider that the storage server operator considered Alice to be |
---|
169 | responsible for that space anyways: with great responsibility (for space |
---|
170 | consumed) comes great power (to stop consuming that space). |
---|
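The prefix rule in the Alice and Amy example reduces to a tuple comparison. This is a minimal sketch; `label_allowed` is a hypothetical helper, and the same test would decide whether a holder may cancel a given lease.

```python
def label_allowed(prefix, label):
    # A label is acceptable when it begins with the restriction's prefix.
    return label[:len(prefix)] == prefix

ALICE = (1,)     # "must start with 1"
AMY = (1, 4)     # "must start with 1.4"
```

Amy may use 1.4 or 1.4.7 but not 1 or 1.5, while Alice's prefix matches every label that Amy can create.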
171 | |
---|
172 | === Server Space Restriction === |
---|
173 | |
---|
174 | The storage server's basic control over space usage (apart from the |
---|
175 | binary use-it-or-not authority granted by handing out an authority string at |
---|
176 | all) is implemented by keeping track of the space used by any given account |
---|
177 | identifier. If account 1.4 sends a request to allocate a 1MB share, but that |
---|
178 | 1MB would bring the 1.4 usage over its quota, the request will be denied. |
---|
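The allocation check might look like the following sketch. `AccountTable`, `OverQuota`, and the method names are assumptions made for illustration; the limit lives only in the server's local table, not in the authority string.

```python
MB = 1000 * 1000

class OverQuota(Exception):
    pass

class AccountTable:
    def __init__(self):
        self.quota = {}  # account tuple -> size limit in bytes
        self.used = {}   # account tuple -> bytes currently leased

    def add_account(self, account, quota):
        self.quota[account] = quota
        self.used[account] = 0

    def allocate(self, account, size):
        # Deny the request if it would bring the account over its quota.
        if self.used[account] + size > self.quota[account]:
            raise OverQuota(account)
        self.used[account] += size

table = AccountTable()
table.add_account((1, 4), 5 * MB)
table.allocate((1, 4), 4 * MB)
table.allocate((1, 4), 1 * MB)  # exactly reaches the limit, still accepted
```

A further 1MB request for account 1.4 would now raise `OverQuota`.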
179 | |
---|
180 | For this to be useful, the storage server must give each usage-limited |
---|
181 | principal a separate account, and it needs to configure a size limit at the |
---|
182 | same time as the authority string is minted. For a friendnet, the CLI "add |
---|
183 | account" tool can do both at once: |
---|
184 | |
---|
185 | tahoe server add-account --quota 5GB Alice |
---|
186 | --> Please give the following authority string to "Alice", who should |
---|
187 | provide it to the "tahoe add-authority" command |
---|
188 | (authority string..) |
---|
189 | |
---|
190 | This command will allocate an account identifier, add Alice to the "pet name |
---|
191 | table" to associate it with the new account, and establish the 5GB sizelimit. |
---|
192 | Both the sizelimit and the petname can be changed later. |
---|
193 | |
---|
194 | Note that this restriction is independent for each server: some additional |
---|
195 | mechanism must be used to provide a grid-wide restriction. |
---|
196 | |
---|
197 | Also note that this restriction is not expressed in the authority string. It |
---|
198 | is purely local to the storage server. |
---|
199 | |
---|
200 | === Attenuated Server Space Restriction === |
---|
201 | |
---|
202 | TODO (or not) |
---|
203 | |
---|
204 | The server-side space restriction described above can only be applied by the |
---|
205 | storage server, and cannot be attenuated by other delegates. Alice might be |
---|
206 | allowed to use 5GB on this server, but she cannot use that restriction to |
---|
207 | delegate, say, just 1GB to Amy. |
---|
208 | |
---|
209 | Instead, Alice's sub-delegation should include a "server_size" restriction |
---|
210 | key, which contains a size limit. The storage server will only honor a |
---|
211 | request that uses this authority string if it does not cause the aggregate |
---|
212 | usage of this authority string's account prefix to rise above the given size |
---|
213 | limit. |
---|
214 | |
---|
215 | Note that this will not enforce the desired restriction if the size limits |
---|
216 | are not consistent across multiple delegated authorities for the same label. |
---|
217 | For example, if Amy ends up with two delegations, A1 (which gives her a size |
---|
218 | limit of 1GB) and A2 (which gives her 5GB), then she can consume 5GB despite |
---|
219 | the limit in A1. |
---|
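The A1/A2 weakness follows from the server evaluating each request only against the string that accompanies it. A sketch, with hypothetical function names:

```python
GB = 10 ** 9

def honors(current_usage, request_size, server_size):
    # The server checks the prefix's aggregate usage against the limit
    # carried by the authority string presented with this request.
    return current_usage + request_size <= server_size

def effective_limit(delegations):
    # The holder chooses which string to present, so the loosest limit wins.
    return max(delegations.values())

amy = {"A1": 1 * GB, "A2": 5 * GB}
```

Once Amy's usage reaches 1GB, requests made with A1 are refused, but the same requests made with A2 are accepted until 5GB is reached.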
220 | |
---|
221 | === Other Restrictions === |
---|
222 | |
---|
223 | Many storage authority restrictions are meant for internal use by tahoe tools |
---|
224 | as they delegate short-lived subauthorities to each other, and are not likely |
---|
225 | to be set by end users. |
---|
226 | |
---|
227 | * "SI": a storage index string. The authority can only be used to upload |
---|
228 | shares of a single file. |
---|
229 | * "serverid": a server identifier. The authority can only be used when |
---|
230 | talking to a specific server. |
---|
231 | * "UEB_hash": a binary hash. The authority can only be used to upload shares |
---|
232 | of a single file, identified by its share's contents. (note: this |
---|
233 | restriction would require the server to parse the share and validate the |
---|
234 | hash) |
---|
235 | * "before": a timestamp. The authority is only valid until a specific time. |
---|
236 | Requires synchronized clocks or a better definition of "timestamp". |
---|
237 | * "delegate_to_furl": a string, used to acquire a FURL for an object that |
---|
238 | contains the attenuated authority. When it comes time to actually use the |
---|
239 | authority string to do something, this is the first step. |
---|
240 | * "delegate_to_key": an ECDSA pubkey, used to grant attenuated authority to |
---|
241 | a separate private key. |
---|
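A server-side check of the simpler restrictions could be sketched as below. Only the restriction keys "SI", "serverid", and "before" come from the list above; the request field names (`storage_index`, `serverid`, `now`) are assumptions, and "UEB_hash" and the two delegate_to keys are omitted because they need share parsing or key handling.

```python
def request_allowed(restrictions, request):
    # Each restriction that is present must be satisfied; absent keys impose no limit.
    if "SI" in restrictions and restrictions["SI"] != request["storage_index"]:
        return False
    if "serverid" in restrictions and restrictions["serverid"] != request["serverid"]:
        return False
    if "before" in restrictions and request["now"] >= restrictions["before"]:
        return False
    return True

restrictions = {"SI": "si1", "before": 100}
request = {"storage_index": "si1", "serverid": "server-a", "now": 50}
```

The timestamps here are plain numbers; as noted above, a real deployment needs synchronized clocks or a better definition of "timestamp".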
242 | |
---|
243 | == User Experience == |
---|
244 | |
---|
245 | The process starts with Bob the storage server operator, who has just created |
---|
246 | a new Storage Server: |
---|
247 | |
---|
248 | tahoe create-node |
---|
249 | --> creates ~/.tahoe |
---|
250 | # edit ~/.tahoe/tahoe.cfg, add introducer.furl, configure storage, etc |
---|
251 | |
---|
252 | Now Bob decides that he wants to let his friend Alice use 5GB of space on his |
---|
253 | new server. |
---|
254 | |
---|
255 | tahoe server add-account --quota=5GB Alice |
---|
256 | --> Please give the following authority string to "Alice", who should |
---|
257 | provide it to the "tahoe add-authority" command |
---|
258 | (authority string XYZ..) |
---|
259 | |
---|
260 | Bob copies the new authority string into an email message and sends it to |
---|
261 | Alice. Meanwhile, Alice has created her own client, and attached it to the |
---|
262 | same Introducer as Bob. When she gets the email, she pastes the authority |
---|
263 | string into her local client: |
---|
264 | |
---|
265 | tahoe client add-authority (authority string XYZ..) |
---|
266 | --> new authority added: account (1) |
---|
267 | |
---|
268 | Now all CLI commands that Alice runs with her node will take advantage of |
---|
269 | Bob's space grant. Once Alice's node connects to Bob's, any upload which |
---|
270 | needs to send a share to Bob's server will search her list of authorities to |
---|
271 | find one that allows her to use Bob's server. |
---|
272 | |
---|
273 | When Alice uses her WUI, upload will be disabled until and unless she pastes |
---|
274 | one or more authority strings into a special "storage authority" box. TODO: |
---|
275 | Once pasted, we'll use some trick to keep the authority around in a |
---|
276 | convenient-yet-safe fashion. |
---|
277 | |
---|
278 | When Alice uses her javascript-based web drive, the javascript program will |
---|
279 | be launched with some trick to hand it the storage authorities, perhaps via a |
---|
280 | fragment identifier (http://server/path#fragment). |
---|
281 | |
---|
282 | If Alice decides that she wants Amy to have some space, she takes the |
---|
283 | authority string that Bob gave her and uses it to create one for Amy: |
---|
284 | |
---|
285 | tahoe authority dump (authority string XYZ..) |
---|
286 | --> explanation of what is in XYZ |
---|
287 | tahoe authority delegate --account 1,4 --space 2GB (authority string XYZ..) |
---|
288 | --> (new authority string ABC..) |
---|
289 | |
---|
290 | Alice sends the ABC string to Amy, who uses "tahoe client add-authority" to |
---|
291 | start using it. |
---|
292 | |
---|
293 | Later, Bob would like to find out how much space Alice is using. He brings up |
---|
294 | his node's Storage Server Web Status page. In addition to the overall usage |
---|
295 | numbers, the page will have a collapsible-treeview table with lines like: |
---|
296 | |
---|
297 | AccountID Usage TotalUsage Petname |
---|
298 | (1) 1.5GB 2.5GB Alice |
---|
299 | +(1,4) 1.0GB 1.0GB ? |
---|
300 | |
---|
301 | This indicates that Alice, as a whole, is using 2.5GB. It also indicates that |
---|
302 | Alice has delegated some space to a (1,4) account, and that delegation has |
---|
303 | used 1.0GB. Alice has used 1.5GB on her own, but is responsible for the full |
---|
304 | 2.5GB. If Alice tells Bob that the subaccount is for Amy, then Bob can assign |
---|
305 | a pet name for (1,4) with "tahoe server add-pet-name 1,4 Amy". Note that Bob |
---|
306 | is not aware of the 2GB limit that Alice has imposed upon Amy: the size |
---|
307 | restriction may have appeared on all the requests that have shown up thus |
---|
308 | far, but Bob has no way of being sure that a less-restrictive delegation |
---|
309 | hasn't been created, so his UI does not attempt to remember or present the |
---|
310 | restrictions it has seen before. |
---|
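The two usage columns in the table can be derived from per-label usage alone. In this sketch, `usage_rows` is a hypothetical helper: "Usage" is what was leased under exactly that label, "TotalUsage" adds every sub-account, and a missing pet name is shown as "?".

```python
def usage_rows(own_usage, petnames):
    """own_usage maps a label tuple to the bytes leased under exactly that label."""
    rows = []
    for account in sorted(own_usage):
        n = len(account)
        # TotalUsage includes the account itself and every label beneath it.
        total = sum(size for label, size in own_usage.items()
                    if label[:n] == account)
        rows.append((account, own_usage[account], total,
                     petnames.get(account, "?")))
    return rows

MB = 1000 * 1000
rows = usage_rows({(1, 4): 1000 * MB, (1,): 1500 * MB}, {(1,): "Alice"})
```

The resulting rows match the example: account (1) with 1.5GB of its own and 2.5GB in total, and (1,4) with 1.0GB in both columns and no pet name yet.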
311 | |
---|
312 | === Friendnet === |
---|
313 | |
---|
314 | A "friendnet" is a set of nodes, each of which is both a storage server and a |
---|
315 | client, each operated by a separate person, all of which have granted storage |
---|
316 | rights to the others. |
---|
317 | |
---|
318 | The simplest way to get a friendnet started is to simply grant storage |
---|
319 | authority to everybody. "tahoe server enable-ambient-storage-authority" will |
---|
320 | configure the storage server to give space to anyone who asks. This behaves |
---|
321 | just like a 1.3.0 server, without accounting of any sort. |
---|
322 | |
---|
323 | The next step is to restrict server use to just the participants. "tahoe |
---|
324 | server disable-ambient-storage-authority" will undo the previous step, then |
---|
325 | there are two basic approaches: |
---|
326 | |
---|
327 | * "full mesh": each node grants authority directly to all the others. |
---|
328 | First, agree upon a userid number for each participant (the value doesn't |
---|
329 | matter, as long as it is unique). Each user should then use "tahoe server |
---|
330 | add-account" for all the accounts (including themselves, if they want some |
---|
331 | of their shares to land on their own machine), including a quota if they |
---|
332 | wish to restrict individuals: |
---|
333 | |
---|
334 | tahoe server add-account --account 1 --quota 5GB Alice |
---|
335 | --> authority string for Alice |
---|
336 | tahoe server add-account --account 2 --quota 5GB Bob |
---|
337 | --> authority string for Bob |
---|
338 | tahoe server add-account --account 3 --quota 5GB Carol |
---|
339 | --> authority string for Carol |
---|
340 | |
---|
341 | Then email Alice's string to Alice, Bob's string to Bob, etc. Once all |
---|
342 | users have used "tahoe client add-authority" on everything, each server |
---|
343 | will accept N distinct authorities, and each client will hold N distinct |
---|
344 | authorities. |
---|
345 | |
---|
346 | * "account manager": the group designates somebody to be the "AM", or |
---|
347 | "account manager". The AM generates a keypair and publishes the public key |
---|
348 | to all the participants, who create a local authority which delegates full |
---|
349 | storage rights to the corresponding private key. The AM then delegates |
---|
350 | account-restricted authority to each user, sending them their personal |
---|
351 | authority string: |
---|
352 | |
---|
353 | AM: |
---|
354 | tahoe authority create-authority --write-private-to=private.txt |
---|
355 | --> public.txt |
---|
356 | # email public.txt to all members |
---|
357 | AM: |
---|
358 | tahoe authority delegate --from-file=private.txt --account 1 --quota 5GB |
---|
359 | --> alice_authority.txt # email this to Alice |
---|
360 | tahoe authority delegate --from-file=private.txt --account 2 --quota 5GB |
---|
361 | --> bob_authority.txt # email this to Bob |
---|
362 | tahoe authority delegate --from-file=private.txt --account 3 --quota 5GB |
---|
363 | --> carol_authority.txt # email this to Carol |
---|
364 | ... |
---|
365 | Alice: |
---|
366 | # receives alice_authority.txt |
---|
367 | tahoe client add-authority --from-file=alice_authority.txt |
---|
368 | # receives public.txt |
---|
369 | tahoe server add-authorization --from-file=public.txt |
---|
370 | Bob: |
---|
371 | # receives bob_authority.txt |
---|
372 | tahoe client add-authority --from-file=bob_authority.txt |
---|
373 | # receives public.txt |
---|
374 | tahoe server add-authorization --from-file=public.txt |
---|
375 | Carol: |
---|
376 | # receives carol_authority.txt |
---|
377 | tahoe client add-authority --from-file=carol_authority.txt |
---|
378 | # receives public.txt |
---|
379 | tahoe server add-authorization --from-file=public.txt |
---|
380 | |
---|
381 | If the members want to see names next to their local usage totals, they |
---|
382 | can set local petnames for the accounts: |
---|
383 | |
---|
384 | tahoe server set-petname 1 Alice |
---|
385 | tahoe server set-petname 2 Bob |
---|
386 | tahoe server set-petname 3 Carol |
---|
387 | |
---|
388 | Alternatively, the AM could provide a usage aggregator, which will collect |
---|
389 | usage values from all the storage servers and show the totals in a single |
---|
390 | place, and add the petnames to that display instead. |
---|
391 | |
---|
392 | The AM gets more authority than anyone else (they can spoof everybody), |
---|
393 | but each server has just a single authorization instead of N, and each |
---|
394 | client has a single authority instead of N. When a new member joins the |
---|
395 | group, the amount of work that must be done is significantly less, and |
---|
396 | only two parties are involved instead of all N: |
---|
397 | |
---|
398 | AM: |
---|
399 | tahoe authority delegate --from-file=private.txt --account 4 --quota 5GB |
---|
400 | --> dave_authority.txt # email this to Dave |
---|
401 | Dave: |
---|
402 | # receives dave_authority.txt |
---|
403 | tahoe client add-authority --from-file=dave_authority.txt |
---|
404 | # receives public.txt |
---|
405 | tahoe server add-authorization --from-file=public.txt |
---|
406 | |
---|
407 | Another approach is to let everybody be the AM: instead of keeping the |
---|
408 | private.txt file secret, give it to all members of the group (but not to |
---|
409 | outsiders). This lets current members bring new members into the group |
---|
410 | without depending upon anybody else doing work. It also renders any notion |
---|
411 | of enforced quotas meaningless, so it is only appropriate for actual |
---|
412 | friends who are voluntarily refraining from spoofing each other. |
---|
413 | |
---|
414 | === Commercial Grid === |
---|
415 | |
---|
416 | A "commercial grid", like the one that allmydata.com manages as a for-profit |
---|
417 | service, is characterized by a large number of independent clients (who do |
---|
418 | not know each other), and by all of the storage servers being managed by a |
---|
419 | single entity. In this case, we use an Account Manager like above, to |
---|
420 | collapse the potential N*M explosion of authorities into something smaller. |
---|
421 | We also create a dummy "parent" account, and give all the real clients |
---|
422 | subaccounts under it, to give the operations personnel a convenient "total |
---|
423 | space used" number. Each time a new customer joins, the AM is directed to |
---|
424 | create a new authority for them, and the resulting string is provided to the |
---|
425 | customer's client node. |
---|
426 | |
---|
427 | AM: |
---|
428 | tahoe authority create-authority --account 1 \ |
---|
429 | --write-private-to=AM-private.txt --write-public-to=AM-public.txt |
---|
430 | |
---|
431 | Each time a new storage server is brought up: |
---|
432 | |
---|
433 | SERVER: |
---|
434 | tahoe server add-authorization --from-file=AM-public.txt |
---|
435 | |
---|
436 | Each time a new client joins: |
---|
437 | |
---|
438 | AM: |
---|
439 | N = next_account++ |
---|
440 | tahoe authority delegate --from-file=AM-private.txt --account 1,N |
---|
441 | --> new_client_authority.txt # give this to new client |
---|
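The "N = next_account++" step is pseudocode. One way the AM might allocate subaccounts under the dummy parent is sketched here; the class name is hypothetical, and a real AM would have to persist the counter across restarts so that account numbers are never reused.

```python
import itertools

class SubaccountAllocator:
    def __init__(self, parent=(1,), first=1):
        self.parent = tuple(parent)
        self._counter = itertools.count(first)

    def next_account(self):
        # Each new customer gets the next unused subaccount under the parent.
        return self.parent + (next(self._counter),)

def cli_form(account):
    # Comma-joined form, as used by the --account arguments in this document.
    return ",".join(str(n) for n in account)

allocator = SubaccountAllocator()
first_customer = cli_form(allocator.next_account())
second_customer = cli_form(allocator.next_account())
```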
442 | |
---|
443 | == Programmatic Interfaces == |
---|
444 | |
---|
445 | The storage authority can be passed as a string in a single serialized form, |
---|
446 | which is cut-and-pasteable and printable. It uses minimal punctuation, to |
---|
447 | make it possible to include it as a URL query argument or HTTP header field |
---|
448 | without requiring character-escaping. |
---|
449 | |
---|
450 | Before passing it over HTTP, however, note that revealing the authority |
---|
451 | string to someone is equivalent to irrevocably delegating all that authority |
---|
452 | to them. While this is appropriate when transferring authority from, say, a |
---|
453 | receptive storage server to your local agent, it is not appropriate when |
---|
454 | using a foreign tahoe node, or when asking a Helper to upload a specific |
---|
455 | file. Attenuations (see below) should be used to limit the delegated |
---|
456 | authority in these cases. |
---|
457 | |
---|
458 | In the programmatic web-API, any operation that consumes storage will accept |
---|
459 | a storage-authority= query argument, the value of which will be the printable |
---|
460 | form of an authority string. This includes all PUT operations, POST t=upload |
---|
461 | and t=mkdir, and anything which creates a new file, creates a directory |
---|
462 | (perhaps an intermediate one), or modifies a mutable file. |
---|
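A client could attach the query argument with the standard library, as in this sketch. The helper name and the example URL are assumptions; only the storage-authority= argument name comes from the text above. `urlencode` escapes any character that would not be safe in a query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_storage_authority(url, authority):
    parts = urlsplit(url)
    # Preserve any existing query arguments (such as t=mkdir) and append ours.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("storage-authority", authority))
    return urlunsplit(parts._replace(query=urlencode(query)))

mkdir_url = with_storage_authority("http://127.0.0.1:3456/uri?t=mkdir", "ABCDE")
```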
463 | |
---|
464 | Alternatively, the authority string can also be passed through an HTTP |
---|
465 | header. A single "X-Tahoe-Storage-Authority:" header can be used with the |
---|
466 | printable authority string. If the string is too large to fit in a single |
---|
467 | header, the application can provide a series of numbered |
---|
468 | "X-Tahoe-Storage-Authority-1:", "X-Tahoe-Storage-Authority-2:", etc, headers, |
---|
469 | and these will be sorted in alphabetical order (please use 08/09/10/11 rather |
---|
470 | than 8/9/10/11), stripped of leading and trailing whitespace, and |
---|
471 | concatenated. The HTTP header form can accommodate larger authority strings, |
---|
472 | since these strings can grow too large to pass as a query argument |
---|
473 | (especially when several delegations or attenuations are involved). However, |
---|
474 | depending upon the HTTP client library being used, passing extra HTTP headers |
---|
475 | may be more complicated than simply modifying the URL, and may be impossible |
---|
476 | in some cases (such as javascript running in a web browser). |
---|
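The header reassembly rule (sort alphabetically, strip whitespace, concatenate) can be sketched as follows. The function name is hypothetical, and preferring the numbered headers when both forms are present is an assumption, since the text does not say which should win. Header names are compared case-insensitively, as HTTP requires.

```python
PREFIX = "x-tahoe-storage-authority"

def authority_from_headers(headers):
    """headers: iterable of (name, value) pairs from one HTTP request."""
    single = None
    numbered = []
    for name, value in headers:
        lower = name.lower()
        if lower == PREFIX:
            single = value.strip()
        elif lower.startswith(PREFIX + "-"):
            numbered.append((lower, value.strip()))
    if numbered:
        # Alphabetical order of the header names decides the piece order.
        return "".join(value for _, value in sorted(numbered))
    return single

padded = authority_from_headers([
    ("X-Tahoe-Storage-Authority-10", "C"),
    ("X-Tahoe-Storage-Authority-08", " A "),
    ("X-Tahoe-Storage-Authority-09", "B"),
])
unpadded = authority_from_headers([
    ("X-Tahoe-Storage-Authority-10", "C"),
    ("X-Tahoe-Storage-Authority-8", "A"),
    ("X-Tahoe-Storage-Authority-9", "B"),
])
```

The zero-padded headers reassemble to "ABC", while the unpadded ones come out as "CAB" because "-10" sorts before "-8"; this is why the padding is requested.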
477 | |
---|
478 | TODO: we may add a stored-token form of authority-passing to handle |
---|
479 | environments in which query-args won't work and headers are not available. |
---|
480 | This approach would use a special PUT which takes the authority string as the |
---|
481 | HTTP body, and remembers it on the server side in association with a |
---|
482 | brief-but-unguessable token. Later operations would then use the authority by |
---|
483 | passing a storage-authority-token=XYZ query argument. These authorities |
---|
484 | would expire after some period. |
---|
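Since the stored-token form is still a TODO, the following is only one way it might work. The class name, the one-hour default lifetime, and the injectable clock are all assumptions; the token comes from the standard `secrets` module so that it is brief but unguessable.

```python
import secrets
import time

class AuthorityTokenStore:
    def __init__(self, lifetime_seconds=3600, clock=time.monotonic):
        self._lifetime = lifetime_seconds
        self._clock = clock
        self._entries = {}  # token -> (authority string, expiry time)

    def remember(self, authority):
        # Called by the special PUT whose body is the authority string.
        token = secrets.token_urlsafe(16)
        self._entries[token] = (authority, self._clock() + self._lifetime)
        return token

    def lookup(self, token):
        # Called by later operations that pass the token instead of the string.
        entry = self._entries.get(token)
        if entry is None:
            return None
        authority, expires = entry
        if self._clock() >= expires:
            del self._entries[token]
            return None
        return authority
```

A later request would pass the returned token, and the node would call `lookup` to recover the full authority string, or refuse the operation if the token is unknown or has expired.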
485 | |
---|
486 | == Quota Management, Aggregation, Reporting == |
---|
487 | |
---|
488 | The storage server will maintain enough information to efficiently compute |
---|
489 | usage totals for each account referenced in all of their leases, as well as |
---|
490 | all their parent accounts. This information is used for several purposes: |
---|
491 | |
---|
492 | * enforce server-space restrictions, by selectively rejecting storage |
---|
493 | requests which would cause the account-usage-total to rise above the limit |
---|
494 | specified in the enabling authorization string |
---|
495 | * report individual account usage to the account-holder (if a client can |
---|
496 | consume space under account A, they are also allowed to query usage for |
---|
497 | account A or a subaccount). |
---|
498 | * report individual account usage to the storage-server operator, possibly |
---|
499 | associated with a pet name |
---|
500 | * report usage for all accounts to the storage-server operator, possibly |
---|
501 | associated with a pet name, in the form of a large table |
---|
502 | * report usage for all accounts to an external aggregator |
---|
503 | |
---|
504 | The external aggregator would take usage information from all the storage |
---|
505 | servers in a single grid and sum them together, providing a grid-wide usage |
---|
506 | number for each account. This could be used by e.g. clients in a commercial |
---|
507 | grid to report overall-space-used to the end user. |
---|
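The aggregator's core step is a per-account sum over the reports from each server. This is a minimal sketch that assumes each server reports a mapping from account label to bytes used; how the reports are fetched is left out.

```python
from collections import Counter

def aggregate(reports):
    """reports: one {account label: bytes used} mapping per storage server."""
    total = Counter()
    for report in reports:
        # Counter.update adds the counts rather than replacing them.
        total.update(report)
    return dict(total)

grid_usage = aggregate([
    {(1,): 100, (2,): 50},   # server A
    {(1,): 25},              # server B
])
```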
508 | |
---|
509 | There will be web-API URLs available for all of these reports. |
---|
510 | |
---|
511 | TODO: storage servers might also have a mechanism to apply space-usage limits |
---|
512 | to specific account ids directly, rather than requiring that these be |
---|
513 | expressed only through authority-string limitation fields. This would let a |
---|
514 | storage server operator revoke their space-allocation after delivering the |
---|
515 | authority string. |
---|
516 | |
---|
517 | == Low-Level Formats == |
---|
518 | |
---|
519 | This section describes the low-level formats used by the Accounting process, |
---|
520 | beginning with the storage-authority data structure and working upwards. This |
---|
521 | section is organized to follow the storage authority, starting from the point |
---|
522 | of grant. The discussion will thus begin at the storage server (where the |
---|
523 | authority is first created), work back to the client (which receives the |
---|
524 | authority as a web-API argument), then follow the authority back to the |
---|
525 | servers as it is used to enable specific storage operations. It will then |
---|
526 | detail the accounting tables that the storage server is obligated to |
---|
527 | maintain, and describe the interfaces through which these tables are accessed |
---|
528 | by other parties. |
---|
529 | |
---|
530 | === Storage Authority === |
---|
531 | |
---|
532 | ==== Terminology ==== |
---|
533 | |
---|
534 | Storage Authority is represented as a chain of certificates and a private |
---|
535 | key. Each certificate authorizes and restricts a specific private key. The |
---|
536 | initial certificate in the chain derives its authority by being placed in the |
---|
537 | storage server's tahoe.cfg file (i.e. by being authorized by the storage |
---|
538 | server operator). All subsequent certificates are signed by the authorized |
---|
539 | private key that was identified in the previous certificate: they derive |
---|
540 | their authority by delegation. Each certificate has restrictions which limit |
---|
541 | the authority being delegated. |
---|
542 | |
---|
543 | authority: ([cert[0], cert[1], cert[2] ...], privatekey) |
---|
544 | |
---|
545 | The "restrictions dictionary" is a table which establishes an upper bound on |
---|
546 | how this authority (or any attenuations thereof) may be used. It is |
---|
547 | effectively a set of key-value pairs. |
---|
548 | |
---|
549 | A "signing key" is an EC-DSA192 private key string and is 12 bytes |
---|
550 | long. A "verifying key" is an EC-DSA192 public key string, and is 24 |
---|
551 | bytes long. A "key identifier" is a string which securely identifies a |
---|
552 | specific signing/verifying keypair: for long RSA keys it would be a |
---|
553 | secure hash of the public key, but since ECDSA192 keys are so short, |
---|
554 | we simply use the full verifying key verbatim. A "key hint" is a |
---|
555 | variable-length prefix of the key identifier, perhaps zero bytes long, |
---|
556 | used to help a recipient reduce the number of verifying keys that it |
---|
557 | must search to find one that matches a signed message. |
---|
558 | |
---|
559 | ==== Authority Chains ==== |
---|
560 | |
---|
561 | The authority chain consists of a list of certificates, each of which has a |
---|
562 | serialized restrictions dictionary. Each dictionary will have a |
---|
563 | "delegate-to-key" field, which delegates authority to a private key, |
---|
564 | referenced with a key identifier. In addition, the non-initial certs are |
---|
565 | signed, so they each contain a signature and a key hint: |
---|
566 | |
---|
567 | cert[0]: serialized(restrictions_dictionary) |
---|
568 | cert[1]: serialized(restrictions_dictionary), signature, keyhint |
---|
569 | cert[2]: serialized(restrictions_dictionary), signature, keyhint |
---|
570 | |
---|
571 | In this example, suppose cert[0] contains a delegate-to-key field that |
---|
572 | identifies a keypair sign_A/verify_A. In this case, cert[1] will have a |
---|
573 | signature that was made with sign_A, and the keyhint in cert[1] will |
---|
574 | reference verify_A. |
---|
575 | |
---|
576 | cert[0].restrictions[delegate-to-key] = A_keyid |
---|
577 | |
---|
578 | cert[1].signature = SIGN(sign_A, serialized(cert[0].restrictions)) |
---|
579 | cert[1].keyhint = verify_A |
---|
580 | cert[1].restrictions[delegate-to-key] = B_keyid |
---|
581 | |
---|
582 | cert[2].signature = SIGN(sign_B, serialized(cert[1].restrictions)) |
---|
583 | cert[2].keyhint = verify_B |
---|
584 | cert[2].restrictions[delegate-to-key] = C_keyid |
---|
585 | |
---|
586 | In this example, the full storage authority consists of the cert[0,1,2] chain |
---|
587 | and the sign_C private key: anyone who is in possession of both will be able |
---|
588 | to exert this authority. To wield the authority, a client will present the |
---|
589 | cert[0,1,2] chain and an action message signed by sign_C; the server will |
---|
590 | validate the chain and the signature before performing the requested action. |
---|
591 | The only circumstances that might prompt the client to share the sign_C |
---|
592 | private key with another party (including the server) would be if it wanted |
---|
593 | to irrevocably share its full authority with that party. |
---|
594 | |
---|
595 | ==== Restriction Dictionaries ==== |
---|
596 | |
---|
597 | Within a restriction dictionary, the following keys are defined. Their full |
---|
598 | meanings are defined later. |
---|
599 | |
---|
600 | 'accountid': an arbitrary-length sequence of integers >=0, restricting the |
---|
601 | accounts which can be manipulated or used in leases |
---|
602 | 'SI': a storage index (binary string), controlling which file may be |
---|
603 | manipulated |
---|
604 | 'serverid': binary string, limiting which server will accept requests |
---|
605 | 'UEB-hash': binary string, limiting the content of the file being manipulated |
---|
606 | 'before': timestamp (seconds since epoch), limits the lifetime of this |
---|
607 | authority |
---|
608 | 'server-size': integer >0, maximum aggregate storage (in bytes) per account |
---|
609 | 'delegate-to-key': binary string (ECDSA pubkey identifier) |
---|
610 | 'furl-to': printable FURL string |
---|
611 | |
---|
612 | ==== Authority Serialization ==== |
---|
613 | |
---|
614 | There is only one form of serialization: a somewhat-compact URL-safe |
---|
615 | cut-and-pasteable printable form. We are interested in minimizing the size of |
---|
616 | the resulting authority, so rather than using a general-purpose (perhaps |
---|
617 | JSON-based) serialization scheme, we use one that is specialized for this |
---|
618 | task. |
---|
619 | |
---|
620 | This URL-safe form will use minimal punctuation to avoid quoting issues when |
---|
621 | used in a URL query argument. It would be nice to avoid word-breaking |
---|
622 | characters that make cut-and-paste troublesome; however, this is more |
---|
623 | difficult because most non-alphanumeric characters are word-breaking in at |
---|
624 | least one application. |
---|
625 | |
---|
626 | The serialized storage authority as a whole contains a single version |
---|
627 | identifier and magic number at the beginning. None of the internal components |
---|
628 | contain redundant version numbers: they are implied by the container. If |
---|
629 | components are serialized independently for other reasons, they may contain |
---|
630 | version identifiers in that form. |
---|
631 | |
---|
632 | Signing keys (i.e. private keys) are URL-safe-serialized using Zooko's base62 |
---|
633 | alphabet, which offers almost the same density as standard base64 but without |
---|
634 | any non-URL-safe or word-breaking characters. Since we use fixed-format keys |
---|
635 | (EC-DSA, 192bit, with SHA256), the private keys are fixed-length (96 bits or |
---|
636 | 12 bytes), so there is no length indicator: all URL-safe-serialized signing |
---|
637 | keys are 17 base62 characters long. The 192-bit verifying keys (i.e. public |
---|
638 | keys) use the same approach: the URL-safe form is 33 characters long. |
---|
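The character counts quoted above follow from the size of the alphabet: an
n-character base62 string can represent 62**n distinct values, so a b-bit
value needs the smallest n for which 62**n >= 2**b. A small check (the
function name is just for illustration):

```python
def base62_length(nbits):
    """Smallest n such that 62**n >= 2**nbits, using exact integer arithmetic."""
    n = 0
    capacity = 1
    while capacity < 2 ** nbits:
        capacity *= 62
        n += 1
    return n


print(base62_length(96))   # 96-bit signing key    -> 17
print(base62_length(192))  # 192-bit verifying key -> 33
```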
639 | |
---|
640 | An account-id sequence (a variable-length sequence of non-negative numbers) |
---|
641 | is serialized by representing each number in decimal ASCII, then joining the |
---|
642 | pieces with commas. The string is terminated by the first non-[0-9,] |
---|
643 | character encountered, which will either be the key-identifier letter of the |
---|
644 | next field, or the dictionary-terminating character at the end. |
---|
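A minimal sketch of this account-id encoding, with hypothetical helper names.
The parser takes a starting offset and returns the offset of the terminating
character, so that a caller can continue with the next field:

```python
import re

_ACCOUNTID_RE = re.compile(r"[0-9,]*")


def serialize_accountid(numbers):
    """(1, 4, 7) -> "1,4,7"."""
    if any(n < 0 for n in numbers):
        raise ValueError("account-id components must be non-negative")
    return ",".join(str(n) for n in numbers)


def parse_accountid(s, pos=0):
    """Read an account-id starting at s[pos].

    Returns (tuple_of_ints, new_pos), where new_pos indexes the first
    character outside [0-9,] (or the end of the string).
    """
    m = _ACCOUNTID_RE.match(s, pos)
    text = m.group(0)
    # int("") raises ValueError, so malformed input such as "1,,4" is rejected.
    numbers = tuple(int(p) for p in text.split(",")) if text else ()
    return numbers, m.end()
```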
645 | |
---|
646 | Any single integral decimal number (such as the "before" timestamp field, or |
---|
647 | the "server-size" field) is serialized as a variable-length sequence of ASCII |
---|
648 | decimal digits, terminated by any non-digit. |
---|
649 | |
---|
650 | The restrictions dictionary is serialized as a concatenated series of |
---|
651 | key-identifier-letter / value string pairs, ending with the marker "E.". The |
---|
652 | URL-safe form uses a single printable letter to indicate which key is |
---|
653 | being serialized. Each type of value string is serialized differently: |
---|
654 | |
---|
655 | "A": accountid: variable-length sequence of comma-joined numbers |
---|
656 | "I": storage index: fixed-length 26-character *base32*-encoded storage index |
---|
657 | "P": server id (peer id): fixed-length 32-character *base32* encoded serverid |
---|
658 | (matching the printable Tub.tubID string that Foolscap provides) |
---|
659 | "U": UEB hash: fixed-length 43-character base62 encoded UEB hash |
---|
660 | "B": before: variable-length sequence of decimal digits, seconds-since-epoch. |
---|
661 | "S": server-size: variable-length sequence of decimal digits, max size in bytes |
---|
662 | "D": delegate-to-key: ECDSA public key, 33 base62 characters. |
---|
663 | "F": furl-to: variable-length FURL string, wrapped in a netstring: |
---|
664 | "%d:%s," % (len(FURL), FURL). Note that this is rarely pasted. |
---|
665 | "E.": end-of-dictionary marker |
---|
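The table above is enough to write a simple left-to-right parser. The sketch
below is illustrative only: the function and table names are assumptions, and
for brevity the fixed-length values are left in their printable (base32 or
base62) form rather than decoded to binary.

```python
import re

# Fixed field widths, in characters, taken from the table above.
FIXED_LENGTHS = {"I": 26, "P": 32, "U": 43, "D": 33}
NAMES = {"A": "accountid", "I": "SI", "P": "serverid", "U": "UEB-hash",
         "B": "before", "S": "server-size", "D": "delegate-to-key",
         "F": "furl-to"}
_ACCOUNTID_RE = re.compile(r"[0-9,]*")
_DIGITS_RE = re.compile(r"[0-9]+")
_NETSTRING_LEN_RE = re.compile(r"([0-9]+):")


def parse_restrictions(s):
    """Parse one serialized dictionary, including its trailing "E." marker.

    Returns (restrictions_dict, index_just_past_the_marker).
    """
    out = {}
    pos = 0
    while True:
        if pos >= len(s):
            raise ValueError("missing end-of-dictionary marker")
        letter = s[pos]
        pos += 1
        if letter == "E":
            if s[pos:pos + 1] != ".":
                raise ValueError("'E' must be followed by a period")
            return out, pos + 1
        if letter == "A":
            m = _ACCOUNTID_RE.match(s, pos)
            value = tuple(int(p) for p in m.group(0).split(","))
            pos = m.end()
        elif letter in ("B", "S"):
            m = _DIGITS_RE.match(s, pos)
            if m is None:
                raise ValueError("expected decimal digits after %r" % letter)
            value = int(m.group(0))
            pos = m.end()
        elif letter in FIXED_LENGTHS:
            n = FIXED_LENGTHS[letter]
            value = s[pos:pos + n]
            if len(value) != n:
                raise ValueError("truncated %r field" % letter)
            pos += n
        elif letter == "F":
            # netstring: "<length>:<FURL>,"
            m = _NETSTRING_LEN_RE.match(s, pos)
            if m is None:
                raise ValueError("malformed netstring")
            n = int(m.group(1))
            start = m.end()
            value = s[start:start + n]
            if len(value) != n or s[start + n:start + n + 1] != ",":
                raise ValueError("malformed netstring")
            pos = start + n + 1
        else:
            raise ValueError("unknown key letter %r" % letter)
        out[NAMES[letter]] = value
```

Because every value is either fixed-length, length-prefixed, or terminated by
the first character outside its own alphabet, the parser never needs to look
ahead more than one character.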
666 | |
---|
667 | The ECDSA signature is serialized as a variable number of base62 characters, |
---|
668 | terminated by a period. We expect the signature to be about 384 bits (48 |
---|
669 | bytes) long, or 65 base62 characters. A missing signature (such as for the |
---|
670 | initial cert) is represented as a single period. |
---|
671 | |
---|
672 | The key hint is serialized with a base62-encoded serialized hint string (a |
---|
673 | byte-quantized prefix of the serialized public key), terminated by a period. |
---|
674 | An empty hint would thus be serialized as a single period. For the current |
---|
675 | design, we expect the key hint to be empty. |
---|
676 | |
---|
677 | The full storage authority string consists of a certificate chain and a |
---|
678 | delegate private key. Given the single-certificate serialization scheme |
---|
679 | described above, the full authority is serialized as follows: |
---|
680 | |
---|
681 | * version prefix: depends upon the application, but for storage-authority |
---|
682 | chains this will be "sa0-", for Storage-Authority Version 0. |
---|
683 | * serialized certificates, concatenated together |
---|
684 | * serialized private key (to which the last certificate delegates authority) |
---|
685 | |
---|
686 | Note that this serialization form does not have an explicit terminator, so |
---|
687 | the environment must provide a length indicator or some other way to identify |
---|
688 | the end of the authority string. The benefit of this approach is that the |
---|
689 | full string will begin and end with alphanumeric characters, making |
---|
690 | cut-and-paste easier (increasing the size of the mouse target: anywhere |
---|
691 | within the final component will work). |
---|
692 | |
---|
693 | Also note that the period is a reserved delimiter: it cannot appear in the |
---|
694 | serialized restrictions dictionary. The parser can remove the version prefix, |
---|
695 | split the rest on periods, and expect to see 3*k+1 fields, consisting of k |
---|
696 | (restriction-dictionary,signature,keyhint) 3-tuples and a single private key |
---|
697 | at the end. |
---|
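The split-on-periods step can be sketched as follows. The function name is
hypothetical, and the fields are returned as raw strings without decoding or
signature checking:

```python
PREFIX = "sa0-"


def split_authority(s):
    """Split a serialized storage authority into its certificates and private key.

    Returns (certs, privkey), where certs is a list of
    (restrictions, signature, keyhint) string 3-tuples. Each restrictions
    string ends in "E": the period of its "E." marker is consumed as a
    delimiter by the split.
    """
    if not s.startswith(PREFIX):
        raise ValueError("missing %r version prefix" % PREFIX)
    fields = s[len(PREFIX):].split(".")
    if len(fields) % 3 != 1:
        raise ValueError("expected 3*k+1 period-separated fields")
    privkey = fields[-1]
    certs = [tuple(fields[i:i + 3]) for i in range(0, len(fields) - 1, 3)]
    return certs, privkey


certs, privkey = split_authority(
    "sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...1f2SI9UJPXvb7vdJ1")
print(certs)    # [('A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E', '', '')]
print(privkey)  # 1f2SI9UJPXvb7vdJ1
```

The input string here is example A below; its single certificate has an empty
signature (it is the initial cert) and an empty key hint.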
698 | |
---|
699 | Some examples: |
---|
700 | |
---|
701 | (example A) |
---|
702 | cert[0] delegates account 1,4 to (pubkey ZlFA / privkey 1f2S): |
---|
703 | |
---|
704 | sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...1f2SI9UJPXvb7vdJ1 |
---|
705 | |
---|
706 | (example B) |
---|
707 | cert[0] delegates account 1,4 to ZlFA/1f2S |
---|
708 | cert[1] subdelegates 5GB and subaccount 1,4,7 to pubkey 0BPo/06rt: |
---|
709 | |
---|
710 | sa0-A1,4D2lFA6LboL2xx0ldQH2K1TdSrwuqMMiME3E...A1,4,7S5000000000D0BPoGxJ3M4KWrmdpLnknhJABrWip5e9kPE.7cyhQvv5axdeihmOzIHjs85TcUIYiWHdsxNz50GTerEOR5ucj2TITPXxyaCUli1oF..06rtcPQotR3q4f2cT |
---|
711 | |
---|
712 | |
---|
713 | |
---|
714 | |
---|
715 | |
---|
716 | |
---|
717 | |
---|
718 | == Problems == |
---|
719 | |
---|
720 | Problems which have thus far been identified with this approach: |
---|
721 | |
---|
722 | * allowing arbitrary subaccount generation will permit a DoS attack, in |
---|
723 | which an authorized uploader consumes lots of DB space by creating an |
---|
724 | unbounded number of randomly-generated subaccount identifiers. OTOH, they |
---|
725 | can already attach an unbounded number of leases to any file they like, |
---|
726 | consuming a lot of space. |
---|
727 | |
---|