.. -*- coding: utf-8 -*-

Storage Node Protocol ("Great Black Swamp", "GBS")
==================================================

The target audience for this document is developers working on Tahoe-LAFS or on an alternate implementation intended to be interoperable.
After reading this document,
one should expect to understand how Tahoe-LAFS clients interact over the network with Tahoe-LAFS storage nodes.

The primary goal of the introduction of this protocol is to simplify the task of implementing a Tahoe-LAFS storage server.
Specifically, it should be possible to implement a Tahoe-LAFS storage server without a Foolscap implementation
(substituting a simpler GBS server implementation).
The Tahoe-LAFS client will also need to change but it is not expected that it will be noticeably simplified by this change
(though this may be the first step towards simplifying it).

Glossary
--------

`Foolscap <https://github.com/warner/foolscap/>`_
  an RPC/RMI (Remote Procedure Call / Remote Method Invocation) protocol for use with Twisted

storage server
  a Tahoe-LAFS process configured to offer storage and reachable over the network for store and retrieve operations

storage service
  a Python object held in memory in the storage server which provides the implementation of the storage protocol

introducer
  a Tahoe-LAFS process at a known location configured to re-publish announcements about the location of storage servers

:ref:`fURLs <fURLs>`
  a self-authenticating URL-like string which can be used to locate a remote object using the Foolscap protocol (the storage service is an example of such an object)

:ref:`NURLs <NURLs>`
  a self-authenticating URL-like string almost exactly like a fURL but without being tied to Foolscap

swissnum
  a short random string which is part of a fURL/NURL and which acts as a shared secret to authorize clients to use a storage service

lease
  state associated with a share informing a storage server of the duration of storage desired by a client

share
  a single unit of client-provided arbitrary data to be stored by a storage server (in practice, one of the outputs of applying ZFEC encoding to some ciphertext with some additional metadata attached)

bucket
  a group of one or more immutable shares held by a storage server and having a common storage index

slot
  a group of one or more mutable shares held by a storage server and having a common storage index (sometimes "slot" is considered a synonym for "storage index of a slot")

storage index
  a 16 byte string which can address a slot or a bucket (in practice, derived by hashing the encryption key associated with contents of that slot or bucket)

write enabler
  a short secret string which storage servers require to be presented before allowing mutation of any mutable share

lease renew secret
  a short secret string which storage servers require to be presented before allowing a particular lease to be renewed

Additional terms related to the Tahoe-LAFS project in general are defined in the :doc:`../glossary`.

The key words
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
in this document are to be interpreted as described in RFC 2119.

Motivation
----------

Foolscap
~~~~~~~~

Foolscap is a remote method invocation protocol with several distinctive features.
At its core it allows separate processes to refer to each other's objects and methods using a capability-based model.
This allows for extremely fine-grained access control in a system that remains highly securable without becoming overwhelmingly complicated.
Supporting this is a flexible and extensible serialization system which allows data to be exchanged between processes in carefully controlled ways.

Tahoe-LAFS avails itself of only a small portion of these features.
A Tahoe-LAFS storage server typically only exposes one object with a fixed set of methods to clients.
A Tahoe-LAFS introducer node does roughly the same.
Tahoe-LAFS exchanges simple data structures that have many common, standard serialized representations.

In exchange for this slight use of Foolscap's sophisticated mechanisms,
Tahoe-LAFS pays a substantial price:

* Foolscap is implemented only for Python.
  Tahoe-LAFS is thus limited to being implemented only in Python.
* There is only one Python implementation of Foolscap.
  The implementation is therefore the de facto standard and understanding of the protocol often relies on understanding that implementation.
* The Foolscap developer community is very small.
  The implementation therefore advances very little and some non-trivial part of the maintenance cost falls on the Tahoe-LAFS project.
* The extensible serialization system imposes substantial complexity compared to the simple data structures Tahoe-LAFS actually exchanges.

HTTP
~~~~

HTTP is a request/response protocol that has become the lingua franca of the internet.
Combined with the principles of Representational State Transfer (REST) it is widely employed to create, update, and delete data in collections on the internet.
HTTP itself provides only modest functionality in comparison to Foolscap.
However its simplicity and widespread use have led to a diverse and almost overwhelming ecosystem of libraries, frameworks, toolkits, and so on.

By adopting HTTP in place of Foolscap Tahoe-LAFS can realize the following concrete benefits:

* Practically every language or runtime has an HTTP protocol implementation (or a dozen of them) available.
  This change paves the way for new Tahoe-LAFS implementations using tools better suited for certain situations
  (mobile client implementations, high-performance server implementations, easily distributed desktop clients, etc).
* The simplicity of and vast quantity of resources about HTTP make it a very easy protocol to learn and use.
  This change reduces the barrier to entry for developers to contribute improvements to Tahoe-LAFS's network interactions.
* For any given language there is very likely an HTTP implementation with a large and active developer community.
  Tahoe-LAFS can therefore benefit from the large effort being put into making better libraries for using HTTP.
* One of the core features of HTTP is the mundane transfer of bulk data and implementations are often capable of doing this with extreme efficiency.
  The alignment of this core feature with a core activity of Tahoe-LAFS of transferring bulk data means that a substantial barrier to improved Tahoe-LAFS runtime performance will be eliminated.

TLS
~~~

The Foolscap-based protocol provides *some* of Tahoe-LAFS's confidentiality, integrity, and authentication properties by leveraging TLS.
An HTTP-based protocol can make use of TLS in largely the same way to provide the same properties.
Provision of these properties *is* dependent on implementers following Great Black Swamp's rules for x509 certificate validation
(rather than the standard "web" rules for validation).

Design Requirements
-------------------

Security
~~~~~~~~

Summary
!!!!!!!

The storage node protocol should offer at minimum the security properties offered by the Foolscap-based protocol.
The Foolscap-based protocol offers:

* **Peer authentication** by way of checked x509 certificates
* **Message authentication** by way of TLS
* **Message confidentiality** by way of TLS

  * A careful configuration of the TLS connection parameters *may* also offer **forward secrecy**.
    However, Tahoe-LAFS' use of Foolscap takes no steps to ensure this is the case.

* **Storage authorization** by way of a capability contained in the fURL addressing a storage service.

Discussion
!!!!!!!!!!

A client node relies on a storage node to persist certain data until a future retrieval request is made.
In this way, the client node is vulnerable to attacks which cause the data not to be persisted.
Though this vulnerability can be (and typically is) mitigated by including redundancy in the share encoding parameters for stored data,
it is still sensible to attempt to minimize unnecessary vulnerability to this attack.

One way to do this is for the client to be confident the storage node with which it is communicating is really the expected node.
That is, for the client to perform **peer authentication** of the storage node it connects to.
This allows it to develop a notion of that node's reputation over time.
The more retrieval requests the node satisfies correctly, the more likely it is to satisfy future requests correctly.
Therefore, the protocol must include some means for verifying the identity of the storage node.
The initialization of the client with the correct identity information is out of scope for this protocol
(the system may be trust-on-first-use, there may be a third-party identity broker, etc).

With confidence that communication is proceeding with the intended storage node,
it must also be possible to trust that data is exchanged without modification.
That is, the protocol must include some means to perform **message authentication**.
This is most likely done using cryptographic MACs (such as those used in TLS).

The messages which enable the mutable shares feature include secrets related to those shares.
For example, the write enabler secret is used to restrict the parties with write access to mutable shares.
It is exchanged over the network as part of a write operation.
An attacker learning this secret can overwrite share data with garbage
(lacking a separate encryption key,
there is no way to write data which appears legitimate to a legitimate client).
Therefore, **message confidentiality** is necessary when exchanging these secrets.
**Forward secrecy** is preferred so that an attacker recording an exchange today cannot launch this attack at some future point after compromising the necessary keys.

A storage service offers service only to some clients.
A client proves their authorization to use the storage service by presenting a shared secret taken from the fURL.
In this way **storage authorization** is performed to prevent disallowed parties from consuming any storage resources.

Functionality
-------------

Tahoe-LAFS application-level information must be transferred using this protocol.
This information is exchanged with a dozen or so request/response-oriented messages.
Some of these messages carry large binary payloads.
Others are small structured-data messages.
Some facility for expansion to support new information exchanges should also be present.

Solutions
---------

An HTTP-based protocol, dubbed "Great Black Swamp" (or "GBS"), is described below.
This protocol aims to satisfy the above requirements at a lower level of complexity than the current Foolscap-based protocol.

Summary (Non-normative)
~~~~~~~~~~~~~~~~~~~~~~~

Communication with the storage node will take place using TLS.
The TLS version and configuration will be dictated by an ongoing understanding of best practices.
The storage node will present an x509 certificate during the TLS handshake.
Storage clients will require that the certificate have a valid signature.
The Subject Public Key Information (SPKI) hash of the certificate will constitute the storage node's identity.
The **tub id** portion of the storage node fURL will be replaced with the SPKI hash.

When connecting to a storage node,
the client will take the following steps to gain confidence it has reached the intended peer:

* It will perform the usual cryptographic verification of the certificate presented by the storage server.
  That is,
  it will check that the certificate itself is well-formed,
  that it is currently valid [#]_,
  and that the signature it carries is valid.
* It will compare the SPKI hash of the certificate to the expected value.
  The specifics of the comparison are the same as for the comparison specified by `RFC 7469`_ with "sha256" [#]_.

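The SPKI hash comparison in the second step can be sketched as follows.
This is a minimal illustration, not the reference implementation:
the function names are hypothetical,
the DER-encoded SubjectPublicKeyInfo bytes are assumed to have been extracted from the certificate already by an x509 library,
and the expected value is assumed to be carried as unpadded base64url text.
The certificate well-formedness, validity-period, and signature checks of the first step are not shown.

```python
import base64
import hashlib
import hmac


def spki_hash(spki_der: bytes) -> str:
    """Return the SHA-256 digest of DER-encoded SPKI bytes as unpadded base64url text."""
    digest = hashlib.sha256(spki_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def spki_matches(spki_der: bytes, expected_hash: str) -> bool:
    """Compare the presented certificate's SPKI hash to the expected value.

    hmac.compare_digest performs a constant-time comparison of two ASCII strings.
    """
    return hmac.compare_digest(spki_hash(spki_der), expected_hash)
```

A client following this sketch would refuse to continue if ``spki_matches`` returned ``False``.
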
To further clarify, consider this example.
Alice operates a storage node.
Alice generates a key pair and secures it properly.
Alice generates a self-signed storage node certificate with the key pair.
Alice's storage node announces (to an introducer) a NURL containing (among other information) the SPKI hash.
Imagine the SPKI hash is ``i5xb...``.
This results in a NURL of ``pb://i5xb...@example.com:443/g3m5...#v=1``.
Bob creates a client node pointed at the same introducer.
Bob's client node receives the announcement from Alice's storage node
(indirected through the introducer).

Bob's client node recognizes the NURL as referring to an HTTP-dialect server due to the ``v=1`` fragment.
Bob's client node can now perform a TLS handshake with a server at the address in the NURL location hints
(``example.com:443`` in this example).
Following the above described validation procedures,
Bob's client node can determine whether it has reached Alice's storage node or not.
If and only if the validation procedure is successful does Bob's client node conclude it has reached Alice's storage node.
**Peer authentication** has been achieved.

Additionally,
by continuing to interact using TLS,
Bob's client and Alice's storage node are assured of both **message authentication** and **message confidentiality**.

Bob's client further inspects the NURL for the *swissnum*.
When Bob's client issues HTTP requests to Alice's storage node it includes the *swissnum* in its requests.
**Storage authorization** has been achieved.

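As an illustration of how Bob's client might pull these pieces out of a NURL,
the sketch below splits one into its SPKI hash, location hint, and swissnum.
It is a simplified assumption-laden example:
``parse_nurl`` is a hypothetical helper,
only a single ``host:port`` location hint is handled,
and the placeholder values in the usage note are invented for illustration.

```python
from urllib.parse import urlsplit


def parse_nurl(nurl: str) -> dict:
    """Split a NURL into its SPKI hash, location hint, and swissnum.

    Raises ValueError if the string is not a ``pb://`` URL with a ``v=1`` fragment.
    """
    parts = urlsplit(nurl)
    if parts.scheme != "pb" or parts.fragment != "v=1":
        raise ValueError("not a GBS NURL")
    return {
        "spki_hash": parts.username,           # text before the "@"
        "host": parts.hostname,                # location hint host
        "port": parts.port,                    # location hint port
        "swissnum": parts.path.lstrip("/"),    # path component
    }
```

For example, ``parse_nurl("pb://hash@example.com:443/swiss#v=1")`` yields the host ``example.com``, the port ``443``, and the swissnum ``swiss``.
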
.. note::

   Foolscap TubIDs are 20 bytes (SHA1 digest of the certificate).
   They are encoded with `Base32`_ for a length of 32 bytes.
   SPKI information discussed here is 32 bytes (SHA256 digest).
   It would be encoded in `Base32`_ for a length of 52 bytes.
   `unpadded base64url`_ provides a more compact encoding of the information while remaining URL-compatible.
   This would encode the SPKI information for a length of merely 43 bytes.
   SHA1,
   the current Foolscap hash function,
   is not a practical choice at this time due to advances made in `attacking SHA1`_.
   The selection of a safe hash function with output smaller than SHA256 could be the subject of future improvements.
   A 224 bit hash function (SHA3-224, for example) might be suitable -
   improving the encoded length to 38 bytes.


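The encoded lengths quoted in the note can be checked with a few lines of Python.
The helper names below are hypothetical and exist only for this check;
padding is stripped because both encodings are used unpadded.

```python
import base64


def base32_length(digest_size: int) -> int:
    """Length of the unpadded Base32 encoding of a digest of the given size in bytes."""
    return len(base64.b32encode(bytes(digest_size)).rstrip(b"="))


def base64url_length(digest_size: int) -> int:
    """Length of the unpadded base64url encoding of a digest of the given size in bytes."""
    return len(base64.urlsafe_b64encode(bytes(digest_size)).rstrip(b"="))


# SHA1 is 20 bytes, SHA256 is 32 bytes, and a 224 bit hash is 28 bytes.
lengths = {
    "sha1/base32": base32_length(20),
    "sha256/base32": base32_length(32),
    "sha256/base64url": base64url_length(32),
    "224-bit/base64url": base64url_length(28),
}
```
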
Transition
~~~~~~~~~~

To provide a seamless user experience during this protocol transition,
there should be a period during which both protocols are supported by storage nodes.
The GBS announcement will be introduced in a way that *updated client* software can recognize.
Its introduction will also be made in such a way that *non-updated client* software disregards the new information
(of which it cannot make any use).

Storage nodes will begin to operate a new GBS server.
They may re-use their existing x509 certificate or generate a new one.
Generation of a new certificate allows for certain non-optimal conditions to be addressed:

* The ``commonName`` of ``newpb_thingy`` may be changed to a more descriptive value.
* A ``notValidAfter`` field with a timestamp in the past may be updated.

Storage nodes will announce a new NURL for this new HTTP-based server.
This NURL will be announced alongside their existing Foolscap-based server's fURL.
Such an announcement will resemble this::

  {
      "anonymous-storage-FURL": "pb://...",          # The old entry
      "anonymous-storage-NURLs": ["pb://...#v=1"]    # The new, additional entry
  }

The transition process will proceed in three stages:

1. The first stage represents the starting conditions in which clients and servers can speak only Foolscap.
#. The intermediate stage represents a condition in which some clients and servers can speak both Foolscap and GBS.
#. The final stage represents the desired condition in which all clients and servers speak only GBS.

During the first stage only one client/server interaction is possible:
the storage server announces only Foolscap and speaks only Foolscap.
During the final stage there is only one supported interaction:
the client and server are both updated and speak GBS to each other.

During the intermediate stage there are four supported interactions:

1. Both the client and server are non-updated.
   The interaction is just as it would be during the first stage.
#. The client is updated and the server is non-updated.
   The client will see the Foolscap announcement and the lack of a GBS announcement.
   It will speak to the server using Foolscap.
#. The client is non-updated and the server is updated.
   The client will see the Foolscap announcement.
   It will speak Foolscap to the storage server.
#. Both the client and server are updated.
   The client will see the GBS announcement and disregard the Foolscap announcement.
   It will speak GBS to the server.

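The four intermediate-stage interactions reduce to one decision on the client side.
The sketch below is an illustration only:
``choose_protocol`` and its return values are hypothetical,
and the announcement is assumed to be a dictionary using the two keys shown in the announcement example above.

```python
def choose_protocol(announcement: dict, client_speaks_gbs: bool) -> tuple:
    """Pick the protocol and address a client would use for one storage server.

    An updated client prefers GBS whenever a NURL is announced.
    In every other case the Foolscap fURL is used.
    """
    nurls = announcement.get("anonymous-storage-NURLs", [])
    if client_speaks_gbs and nurls:
        return ("gbs", nurls[0])
    return ("foolscap", announcement["anonymous-storage-FURL"])
```

A non-updated client never looks at ``anonymous-storage-NURLs`` at all,
which corresponds to calling this function with ``client_speaks_gbs=False``.
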
There is one further complication:
the client maintains a cache of storage server information
(to avoid continuing to rely on the introducer after it has been introduced).
The following sequence of events is likely:

1. The client connects to an introducer.
#. It receives an announcement for a non-updated storage server (Foolscap only).
#. It caches this announcement.
#. At some point, the storage server is updated.
#. The client uses the information in its cache to open a Foolscap connection to the storage server.

Ideally,
the client would not rely on an update from the introducer to give it the GBS NURL for the updated storage server.
In practice, we have decided not to implement this functionality.

Server Details
--------------

The protocol primarily enables interaction with "resources" of two types:
storage indexes
and shares.
A particular resource is addressed by the HTTP request path.
Details about the interface are encoded in the HTTP message body.

String Encoding
~~~~~~~~~~~~~~~

.. _Base32:

Base32
!!!!!!

Where the specification refers to Base32 the meaning is *unpadded* Base32 encoding as specified by `RFC 4648`_ using a *lowercase variation* of the alphabet from Section 6.

That is, the alphabet is:

.. list-table:: Base32 Alphabet
   :header-rows: 1

   * - Value
     - Encoding
     - Value
     - Encoding
     - Value
     - Encoding
     - Value
     - Encoding

   * - 0
     - a
     - 9
     - j
     - 18
     - s
     - 27
     - 3
   * - 1
     - b
     - 10
     - k
     - 19
     - t
     - 28
     - 4
   * - 2
     - c
     - 11
     - l
     - 20
     - u
     - 29
     - 5
   * - 3
     - d
     - 12
     - m
     - 21
     - v
     - 30
     - 6
   * - 4
     - e
     - 13
     - n
     - 22
     - w
     - 31
     - 7
   * - 5
     - f
     - 14
     - o
     - 23
     - x
     -
     -
   * - 6
     - g
     - 15
     - p
     - 24
     - y
     -
     -
   * - 7
     - h
     - 16
     - q
     - 25
     - z
     -
     -
   * - 8
     - i
     - 17
     - r
     - 26
     - 2
     -
     -

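One way to produce this encoding with the Python standard library is sketched below.
The helper names are hypothetical.
The standard ``base64.b32encode`` function emits the uppercase, padded RFC 4648 form,
so the sketch strips the padding and lowercases the result,
and reverses both steps when decoding.

```python
import base64


def b32_encode(data: bytes) -> str:
    """Encode bytes as unpadded, lowercase Base32."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()


def b32_decode(text: str) -> bytes:
    """Decode unpadded, lowercase Base32 by restoring case and padding first."""
    padded = text.upper() + "=" * (-len(text) % 8)
    return base64.b32decode(padded)
```

Under this encoding a 16 byte storage index becomes a 26 character string.
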
Message Encoding
~~~~~~~~~~~~~~~~

Clients and servers MUST use the ``Content-Type`` and ``Accept`` header fields as specified in `RFC 9110`_ for message body negotiation.

The encoding for HTTP message bodies SHOULD be `CBOR`_.
Clients submitting requests using this encoding MUST include a ``Content-Type: application/cbor`` request header field.
A request MAY be submitted using an alternate encoding by declaring this in the ``Content-Type`` header field.
A request MAY indicate its preference for an alternate encoding in the response using the ``Accept`` header field.
A request which includes no ``Accept`` header field MUST be interpreted in the same way as a request including an ``Accept: application/cbor`` header field.

Clients and servers MAY support additional request and response message body encodings.

Clients and servers SHOULD support ``application/json`` request and response message body encoding.
For HTTP messages carrying binary share data,
this is expected to be a particularly poor encoding.
However,
for HTTP messages carrying small payloads of strings, numbers, and containers
it is expected that JSON will be more convenient than CBOR for ad hoc testing and manual interaction.

For this same reason,
JSON is used throughout for the examples presented here.
Because of the simple types used throughout
and the equivalence described in `RFC 7049`_
these examples should be representative regardless of which of these two encodings is chosen.

There are two exceptions to this rule.

1. Sets
!!!!!!!

For CBOR messages,
any sequence that is semantically a set (i.e. no repeated values allowed, order doesn't matter, and elements are hashable in Python) should be sent as a set.
Tag 6.258 is used to indicate sets in CBOR;
see `the CBOR registry <https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml>`_ for more details.
The JSON encoding does not support sets.
Sets MUST be represented as arrays in JSON-encoded messages.

2. Bytes
!!!!!!!!

The CBOR encoding natively supports a bytes type while the JSON encoding does not.
Bytes MUST be represented as strings giving the `Base64`_ representation of the original bytes value.

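The two JSON exceptions can be illustrated with a small conversion helper.
This is a sketch under stated assumptions rather than part of the specification:
``to_jsonable`` is a hypothetical name,
the standard Base64 alphabet with padding is assumed,
and set elements are sorted only to make the output deterministic
(which in turn assumes the elements are mutually comparable).

```python
import base64
import json


def to_jsonable(value):
    """Recursively convert sets to arrays and bytes to Base64 strings for JSON."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    if isinstance(value, (set, frozenset)):
        return sorted(to_jsonable(item) for item in value)
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


# The field names here are placeholders chosen for the example.
body = json.dumps(to_jsonable({"share-numbers": {3, 1, 2}, "data": b"\x00\x01"}))
```
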
HTTP Design
~~~~~~~~~~~

The HTTP interface described here is informed by the ideas of REST
(Representational State Transfer).
For ``GET`` requests query parameters are preferred over values encoded in the request body.
For other requests query parameters are encoded into the message body.

Many branches of the resource tree are conceived as homogeneous containers:
one branch contains all of the share data;
another branch contains all of the lease data;
etc.

Clients and servers MUST use the ``Authorization`` header field,
as specified in `RFC 9110`_,
for authorization of all requests to all endpoints specified here.
The authentication *type* MUST be ``Tahoe-LAFS``.
Clients MUST present the `Base64`_-encoded representation of the swissnum from the NURL used to locate the storage service as the *credentials*.

If credentials are not presented or the swissnum is not associated with a storage service then the server MUST issue a ``401 UNAUTHORIZED`` response and perform no other processing of the message.

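A client might build the ``Authorization`` header field along these lines.
The function name is hypothetical,
the swissnum is assumed to be available as raw bytes,
and the standard Base64 alphabet with padding is assumed.

```python
import base64


def authorization_header(swissnum: bytes) -> dict:
    """Return the Authorization header field for a request to a storage service."""
    credentials = base64.b64encode(swissnum).decode("ascii")
    return {"Authorization": f"Tahoe-LAFS {credentials}"}
```
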
Requests to certain endpoints MUST include additional secrets in ``X-Tahoe-Authorization`` header fields.
The endpoints which require these secrets are:

* ``PUT /storage/v1/lease/:storage_index``:
  The secrets included MUST be ``lease-renew-secret`` and ``lease-cancel-secret``.

* ``POST /storage/v1/immutable/:storage_index``:
  The secrets included MUST be ``lease-renew-secret``, ``lease-cancel-secret``, and ``upload-secret``.

* ``PATCH /storage/v1/immutable/:storage_index/:share_number``:
  The secrets included MUST be ``upload-secret``.

* ``PUT /storage/v1/immutable/:storage_index/:share_number/abort``:
  The secrets included MUST be ``upload-secret``.

* ``POST /storage/v1/mutable/:storage_index/read-test-write``:
  The secrets included MUST be ``lease-renew-secret``, ``lease-cancel-secret``, and ``write-enabler``.

If any of these secrets is:

1. Missing.
2. The wrong length.
3. Not the expected kind of secret.
4. Otherwise unparseable before it is actually semantically used.

the server MUST respond with ``400 BAD REQUEST`` and perform no other processing of the message.
401 is not used because this is not an authorization problem;
it is a "you sent garbage and should know better" bug.

If authorization using the secret fails,
then the server MUST send a ``401 UNAUTHORIZED`` response and perform no other processing of the message.

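The ``400 BAD REQUEST`` conditions can be sketched as a parsing step that runs before any secret is actually checked.
This is an illustration only:
``parse_secrets`` is a hypothetical helper,
each header value is assumed to have the form ``<kind> <base64-secret>``,
and every kind of secret is assumed to be 32 bytes long
(the specification states this length explicitly only for the lease secrets).

```python
import base64
import binascii

SECRET_LENGTH = 32  # assumption: applied to every kind of secret


def parse_secrets(header_values, required_kinds):
    """Parse X-Tahoe-Authorization values.

    Returns (400, {}) if any secret is missing, the wrong length,
    an unexpected kind, or unparseable.  Otherwise returns (None, secrets).
    """
    secrets = {}
    for value in header_values:
        try:
            kind, encoded = value.strip().split(" ", 1)
            secret = base64.b64decode(encoded.strip(), validate=True)
        except (ValueError, binascii.Error):
            return 400, {}
        if kind not in required_kinds or len(secret) != SECRET_LENGTH:
            return 400, {}
        secrets[kind] = secret
    if set(secrets) != set(required_kinds):
        return 400, {}
    return None, secrets
```

Only after this step succeeds would a server compare the secrets against stored values,
answering ``401 UNAUTHORIZED`` if that comparison fails.
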
Encoding
~~~~~~~~

* ``storage_index`` MUST be `Base32`_ encoded in URLs.
* ``share_number`` MUST be a decimal representation.

General
~~~~~~~

``GET /storage/v1/version``
!!!!!!!!!!!!!!!!!!!!!!!!!!!

This endpoint allows clients to retrieve some basic metadata about a storage server from the storage service.
The response MUST validate against this CDDL schema::

  {'http://allmydata.org/tahoe/protocols/storage/v1' => {
      'maximum-immutable-share-size' => uint
      'maximum-mutable-share-size' => uint
      'available-space' => uint
      }
   'application-version' => bstr
  }

The server SHOULD populate as many fields as possible with accurate information about its behavior.

For fields which relate to a specific API
the semantics are documented below in the section for that API.
For fields that are more general than a single API the semantics are as follows:

* available-space:
  The server SHOULD use this field to advertise the amount of space that it currently considers unused and is willing to allocate for client requests.
  The value is a number of bytes.


557 | ``PUT /storage/v1/lease/:storage_index`` |
---|
558 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
559 | |
---|
560 | Either renew or create a new lease on the bucket addressed by ``storage_index``. |
---|
561 | |
---|
562 | The renew secret and cancellation secret should be included as ``X-Tahoe-Authorization`` headers. |
---|
563 | For example:: |
---|
564 | |
---|
565 | X-Tahoe-Authorization: lease-renew-secret <base64-lease-renew-secret> |
---|
566 | X-Tahoe-Authorization: lease-cancel-secret <base64-lease-cancel-secret> |
---|
567 | |
---|
568 | If the ``lease-renew-secret`` value matches an existing lease |
---|
569 | then the expiration time of that lease will be changed to 31 days after the time of this operation. |
---|
570 | If it does not match an existing lease |
---|
571 | then a new lease will be created with this ``lease-renew-secret`` which expires 31 days after the time of this operation. |
---|
572 | |
---|
573 | ``lease-renew-secret`` and ``lease-cancel-secret`` values must be 32 bytes long. |
---|
574 | The server treats them as opaque values. |
---|
575 | :ref:`Share Leases` gives details about how the Tahoe-LAFS storage client constructs these values. |
---|
576 | |
---|
577 | In these cases the response is ``NO CONTENT`` with an empty body. |
---|
578 | |
---|
579 | It is possible that the storage server will have no shares for the given ``storage_index`` because: |
---|
580 | |
---|
581 | * no such shares have ever been uploaded. |
---|
582 | * a previous lease expired and the storage server reclaimed the storage by deleting the shares. |
---|
583 | |
---|
584 | In these cases the server takes no action and returns ``NOT FOUND``. |
---|
585 | |
---|
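The renew-or-create rule above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the Tahoe-LAFS implementation:
the in-memory ``leases`` list, the function name ``renew_or_create_lease``, and the choice to raise ``ValueError`` for a wrongly sized secret are all hypothetical.

```python
import time

LEASE_PERIOD_SECONDS = 31 * 24 * 60 * 60  # the hard-coded 31 day lease period


def renew_or_create_lease(leases, renew_secret, cancel_secret, now=None):
    """Apply the renew-or-create rule to the leases of one bucket.

    ``leases`` is a hypothetical in-memory list of dicts, or None when the
    server holds no shares for the storage index.
    Returns the HTTP status code the server would send.
    """
    if leases is None:
        return 404  # NOT FOUND: no shares, so the server takes no action
    if len(renew_secret) != 32 or len(cancel_secret) != 32:
        # Assumption: how a server rejects bad lengths is not specified here.
        raise ValueError("lease secrets must be 32 bytes long")
    now = time.time() if now is None else now
    expiration = now + LEASE_PERIOD_SECONDS
    for lease in leases:
        if lease["renew_secret"] == renew_secret:
            lease["expiration"] = expiration  # renew the matching lease
            return 204  # NO CONTENT
    # No lease matched the renew secret, so create a new one.
    leases.append({
        "renew_secret": renew_secret,
        "cancel_secret": cancel_secret,
        "expiration": expiration,
    })
    return 204  # NO CONTENT
```

Passing ``now`` explicitly keeps the sketch deterministic; a real server would use its own clock.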
586 | |
---|
587 | Discussion |
---|
588 | `````````` |
---|
589 | |
---|
590 | We considered an alternative where ``lease-renew-secret`` and ``lease-cancel-secret`` are placed in query arguments on the request path. |
---|
591 | This increases chances of leaking secrets in logs. |
---|
592 | Putting the secrets in the body reduces the chances of leaking secrets, |
---|
593 | but eventually we chose headers as the least likely information to be logged. |
---|
594 | |
---|
595 | Several behaviors here are blindly copied from the Foolscap-based storage server protocol. |
---|
596 | |
---|
597 | * There is a cancel secret but there is no API to use it to cancel a lease (see ticket:3768). |
---|
598 | * The lease period is hard-coded at 31 days. |
---|
599 | |
---|
600 | These are not necessarily ideal behaviors |
---|
601 | but they are adopted to avoid any *semantic* changes between the Foolscap- and HTTP-based protocols. |
---|
602 | It is expected that some or all of these behaviors may change in a future revision of the HTTP-based protocol. |
---|
603 | |
---|
604 | Immutable |
---|
605 | --------- |
---|
606 | |
---|
607 | Writing |
---|
608 | ~~~~~~~ |
---|
609 | |
---|
610 | ``POST /storage/v1/immutable/:storage_index`` |
---|
611 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
612 | |
---|
613 | Initialize an immutable storage index with some buckets. |
---|
614 | The server MUST allow share data to be written to the buckets at most one time. |
---|
615 | The server MAY create a lease for the buckets. |
---|
616 | Details of the buckets to create are encoded in the request body. |
---|
617 | The request body MUST validate against this CDDL schema:: |
---|
618 | |
---|
619 | { |
---|
620 | share-numbers: #6.258([0*256 uint]) |
---|
621 | allocated-size: uint |
---|
622 | } |
---|
623 | |
---|
624 | For example:: |
---|
625 | |
---|
626 | {"share-numbers": [1, 7, ...], "allocated-size": 12345} |
---|
627 | |
---|
628 | The server SHOULD accept a value for **allocated-size** that is less than or equal to the lesser of the **maximum-immutable-share-size** and **available-space** values in the server's version message. |
---|
629 | |
---|
630 | The request MUST include ``X-Tahoe-Authorization`` HTTP headers that set the various secrets—upload, lease renewal, lease cancellation—that will be later used to authorize various operations. |
---|
631 | For example:: |
---|
632 | |
---|
633 | X-Tahoe-Authorization: lease-renew-secret <base64-lease-renew-secret> |
---|
634 | X-Tahoe-Authorization: lease-cancel-secret <base64-lease-cancel-secret> |
---|
635 | X-Tahoe-Authorization: upload-secret <base64-upload-secret> |
---|
636 | |
---|
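As a rough client-side illustration, the headers shown above could be assembled as follows.
The sketch assumes standard Base64 (RFC 4648 section 4, which is where this document's ``Base64`` link target points);
the helper name ``tahoe_authorization_headers`` and the example secret values are hypothetical.

```python
import base64


def tahoe_authorization_headers(secrets):
    """Return one (name, value) header pair per secret.

    ``secrets`` maps a secret kind (for example "upload-secret") to raw bytes.
    The helper name and the example values below are illustrative only.
    """
    headers = []
    for kind, raw in secrets.items():
        encoded = base64.b64encode(raw).decode("ascii")
        headers.append(("X-Tahoe-Authorization", f"{kind} {encoded}"))
    return headers


example = tahoe_authorization_headers({
    "lease-renew-secret": b"r" * 32,
    "lease-cancel-secret": b"c" * 32,
    "upload-secret": b"u" * 32,
})
```

Each secret gets its own ``X-Tahoe-Authorization`` header, so the result is a list of pairs and not a dictionary keyed by header name.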
637 | The response body MUST include encoded information about the created buckets. |
---|
638 | The response body MUST validate against this CDDL schema:: |
---|
639 | |
---|
640 | { |
---|
641 | already-have: #6.258([0*256 uint]) |
---|
642 | allocated: #6.258([0*256 uint]) |
---|
643 | } |
---|
644 | |
---|
645 | For example:: |
---|
646 | |
---|
647 | {"already-have": [1, ...], "allocated": [7, ...]} |
---|
648 | |
---|
649 | The upload secret is an opaque *byte* string. |
---|
650 | |
---|
651 | Handling repeat calls: |
---|
652 | |
---|
653 | * If the same API call is repeated with the same upload secret, the response is the same and no change is made to server state. |
---|
654 | This is necessary to ensure retries work in the face of lost responses from the server. |
---|
655 | * If the API call is made with a different upload secret, this implies a new client, perhaps because the old client died. |
---|
656 | Or it may happen because the client wants to upload a different share number than a previous client. |
---|
657 | New shares will be created, existing shares will be unchanged, regardless of whether the upload secret matches or not. |
---|
658 | |
---|
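The repeat-call rules can be sketched with a small piece of hypothetical server state.
The names ``allocate_buckets``, ``complete`` and ``in_progress`` are illustrative,
and the handling of a share that is still being uploaded under a *different* upload secret is an assumption, since the text above does not specify it.

```python
def allocate_buckets(complete, in_progress, share_numbers, upload_secret):
    """Compute the ``already-have`` and ``allocated`` sets for one request.

    ``complete`` is a set of fully uploaded share numbers and ``in_progress``
    maps share number -> upload secret; both are hypothetical server state.
    """
    already_have, allocated = set(), set()
    for number in share_numbers:
        if number in complete:
            already_have.add(number)  # existing shares are left unchanged
        elif number not in in_progress:
            in_progress[number] = upload_secret  # new share for this upload
            allocated.add(number)
        elif in_progress[number] == upload_secret:
            allocated.add(number)  # retry: same answer, no state change
        # Assumption: a share being uploaded under a different secret is
        # reported in neither set.
    return {"already-have": already_have, "allocated": allocated}
```

Repeating a call with the same secret returns the same result without touching ``in_progress``, which is what makes client retries safe after a lost response.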
659 | Discussion |
---|
660 | `````````` |
---|
661 | |
---|
662 | We considered making this ``POST /storage/v1/immutable`` instead. |
---|
663 | The motivation was to keep *storage index* out of the request URL. |
---|
664 | Request URLs have an elevated chance of being logged by something. |
---|
665 | We were concerned that having the *storage index* logged may increase some risks. |
---|
666 | However, we decided this does not matter because: |
---|
667 | |
---|
668 | * the *storage index* can only be used to retrieve (not decrypt) the ciphertext-bearing share. |
---|
669 | * the *storage index* is already persistently present on the storage node in the form of directory names in the storage server's ``shares`` directory. |
---|
670 | * the request is made via HTTPS and so only Tahoe-LAFS can see the contents, |
---|
671 | therefore no proxy servers can perform any extra logging. |
---|
672 | * Tahoe-LAFS itself does not currently log HTTP request URLs. |
---|
673 | |
---|
674 | The response includes ``already-have`` and ``allocated`` for two reasons: |
---|
675 | |
---|
676 | * If an upload is interrupted and the client loses its local state that lets it know it already uploaded some shares |
---|
677 | then this allows it to discover this fact (by inspecting ``already-have``) and only upload the missing shares (indicated by ``allocated``). |
---|
678 | |
---|
679 | * If an upload has completed, a client may still choose to re-balance storage by moving shares between servers. |
---|
680 | This might be because a server has become unavailable and a remaining server needs to store more shares for the upload. |
---|
681 | It could also just be that the client's preferred servers have changed. |
---|
682 | |
---|
683 | Regarding upload secrets, |
---|
684 | the goal is for uploading and aborting (see next sections) to be authenticated by more than just the storage index. |
---|
685 | In the future, we may want to generate them in a way that allows resuming/canceling when the client has issues. |
---|
686 | In the short term, they can just be a random byte string. |
---|
687 | The primary security constraint is that each upload to each server has its own unique upload key, |
---|
688 | tied to uploading that particular storage index to this particular server. |
---|
689 | |
---|
690 | Rejected designs for upload secrets: |
---|
691 | |
---|
692 | * Upload secret per share number. |
---|
693 | In order to make the secret unguessable by attackers, which includes other servers, |
---|
694 | it must contain randomness. |
---|
695 | Randomness means there is no need to have a secret per share, since adding share-specific content to randomness doesn't actually make the secret any better. |
---|
696 | |
---|
697 | ``PATCH /storage/v1/immutable/:storage_index/:share_number`` |
---|
698 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
699 | |
---|
700 | Write data for the indicated share. |
---|
701 | The share number MUST belong to the storage index. |
---|
702 | The request body MUST be the raw share data (i.e., ``application/octet-stream``). |
---|
703 | The request MUST include a *Content-Range* header field; |
---|
704 | for large transfers this allows partially complete uploads to be resumed. |
---|
705 | |
---|
706 | For example, |
---|
707 | a 1MiB share can be divided into eight separate 128KiB chunks. |
---|
708 | Each chunk can be uploaded in a separate request. |
---|
709 | Each request can include a *Content-Range* value indicating its placement within the complete share. |
---|
710 | If any one of these requests fails then at most 128KiB of upload work needs to be retried. |
---|
711 | |
---|
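The chunking in this example can be made concrete with a short sketch that computes the *Content-Range* value for each request.
The helper name ``content_ranges`` is hypothetical; note that *Content-Range* uses an inclusive last-byte position.

```python
def content_ranges(total_size, chunk_size):
    """Yield (offset, Content-Range value) for each chunk of a share."""
    for begin in range(0, total_size, chunk_size):
        end = min(begin + chunk_size, total_size)  # exclusive end
        # Content-Range uses an inclusive last-byte position, hence end - 1.
        yield begin, f"bytes {begin}-{end - 1}/{total_size}"


MIB, KIB = 1024 * 1024, 1024
chunks = list(content_ranges(1 * MIB, 128 * KIB))
```

For the 1MiB share above this produces eight values, starting with ``bytes 0-131071/1048576``.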
712 | The server MUST recognize when all of the data has been received and mark the share as complete |
---|
713 | (which it can do because it was informed of the size when the storage index was initialized). |
---|
714 | |
---|
715 | The request MUST include a ``X-Tahoe-Authorization`` header that includes the upload secret:: |
---|
716 | |
---|
717 | X-Tahoe-Authorization: upload-secret <base64-upload-secret> |
---|
718 | |
---|
719 | Responses: |
---|
720 | |
---|
721 | * When a chunk that does not complete the share is successfully uploaded the response MUST be ``OK``. |
---|
722 | The response body MUST indicate the range of share data that has yet to be uploaded. |
---|
723 | The response body MUST validate against this CDDL schema:: |
---|
724 | |
---|
725 | { |
---|
726 | required: [0* {begin: uint, end: uint}] |
---|
727 | } |
---|
728 | |
---|
729 | For example:: |
---|
730 | |
---|
731 | { "required": |
---|
732 | [ { "begin": <byte position, inclusive> |
---|
733 | , "end": <byte position, exclusive> |
---|
734 | } |
---|
735 | , |
---|
736 | ... |
---|
737 | ] |
---|
738 | } |
---|
739 | |
---|
740 | * When the chunk that completes the share is successfully uploaded the response MUST be ``CREATED``. |
---|
741 | * If the *Content-Range* for a request covers part of the share that has already been uploaded, |
---|
742 | and the data does not match already written data, |
---|
743 | the response MUST be ``CONFLICT``. |
---|
744 | In this case the client MUST abort the upload. |
---|
745 | The client MAY then restart the upload from scratch. |
---|
746 | |
---|
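The three responses above can be sketched with an in-memory stand-in for a share that is being uploaded.
This is an illustration only: the class name ``ShareUpload`` is hypothetical, the per-byte bookkeeping favors clarity over efficiency, there is no bounds checking,
and accepting a rewrite of identical data is an assumption that the text above does not spell out.

```python
class ShareUpload:
    """In-memory stand-in for one immutable share being uploaded."""

    def __init__(self, allocated_size):
        self.data = bytearray(allocated_size)
        self.written = [False] * allocated_size

    def required(self):
        """Return missing ranges as {begin (inclusive), end (exclusive)}."""
        ranges, begin = [], None
        # The trailing True is a sentinel that closes a final open range.
        for position, done in enumerate(self.written + [True]):
            if not done and begin is None:
                begin = position
            elif done and begin is not None:
                ranges.append({"begin": begin, "end": position})
                begin = None
        return ranges

    def write(self, begin, chunk):
        """Apply one chunk; return (status code, response body or None)."""
        for offset, byte in enumerate(chunk):
            position = begin + offset
            if self.written[position] and self.data[position] != byte:
                return 409, None  # CONFLICT: differs from data already written
        for offset, byte in enumerate(chunk):
            self.data[begin + offset] = byte
            self.written[begin + offset] = True
        missing = self.required()
        if missing:
            return 200, {"required": missing}  # OK: more data is needed
        return 201, None  # CREATED: the share is complete
```

The ``begin`` argument would come from the first byte position of the request's *Content-Range* header.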
747 | Discussion |
---|
748 | `````````` |
---|
749 | |
---|
750 | ``PUT`` verbs are only supposed to be used to replace the whole resource, |
---|
751 | thus the use of ``PATCH``. |
---|
752 | From RFC 7231:: |
---|
753 | |
---|
754 | An origin server that allows PUT on a given target resource MUST send |
---|
755 | a 400 (Bad Request) response to a PUT request that contains a |
---|
756 | Content-Range header field (Section 4.2 of [RFC7233]), since the |
---|
757 | payload is likely to be partial content that has been mistakenly PUT |
---|
758 | as a full representation. Partial content updates are possible by |
---|
759 | targeting a separately identified resource with state that overlaps a |
---|
760 | portion of the larger resource, or by using a different method that |
---|
761 | has been specifically defined for partial updates (for example, the |
---|
762 | PATCH method defined in [RFC5789]). |
---|
763 | |
---|
764 | |
---|
765 | |
---|
766 | ``PUT /storage/v1/immutable/:storage_index/:share_number/abort`` |
---|
767 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
768 | |
---|
769 | This cancels an *in-progress* upload. |
---|
770 | |
---|
771 | The request MUST include a ``X-Tahoe-Authorization`` header that includes the upload secret:: |
---|
772 | |
---|
773 | X-Tahoe-Authorization: upload-secret <base64-upload-secret> |
---|
774 | |
---|
775 | If there is an incomplete upload with a matching upload-secret then the server MUST consider the abort to have succeeded. |
---|
776 | In this case the response MUST be ``OK``. |
---|
777 | The server MUST respond to all future requests as if the operations related to this upload did not take place. |
---|
778 | |
---|
779 | If there is no incomplete upload with a matching upload-secret then the server MUST respond with ``Method Not Allowed`` (405). |
---|
780 | The server MUST make no client-visible changes to its state in this case. |
---|
781 | |
---|
782 | ``POST /storage/v1/immutable/:storage_index/:share_number/corrupt`` |
---|
783 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
784 | |
---|
785 | Advise the server that the data read from the indicated share was corrupt. |
---|
786 | The request body includes a human-meaningful text string with details about the corruption. |
---|
787 | It also includes potentially important details about the share. |
---|
788 | The request body MUST validate against this CDDL schema:: |
---|
789 | |
---|
790 | { |
---|
791 | reason: tstr .size (1..32765) |
---|
792 | } |
---|
793 | |
---|
794 | For example:: |
---|
795 | |
---|
796 | {"reason": "expected hash abcd, got hash efgh"} |
---|
797 | |
---|
798 | The report pertains to the immutable share with a **storage index** and **share number** given in the request path. |
---|
799 | If the identified **storage index** and **share number** are known to the server then the report SHOULD be accepted and made available to server administrators. |
---|
800 | In this case the response SHOULD be ``OK``. |
---|
801 | If the report is not accepted then the response SHOULD be ``Not Found`` (404). |
---|
802 | |
---|
803 | Discussion |
---|
804 | `````````` |
---|
805 | |
---|
806 | The seemingly odd length limit on ``reason`` is chosen so that the *encoded* representation of the message is limited to 32768 bytes. |
---|
807 | |
---|
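One reading that makes this arithmetic work is that the limit applies to the CBOR encoding of the ``reason`` text string itself, not to the enclosing map.
That reading is an interpretation and is not stated above.
Under it, a text string of 32765 bytes needs a 3-byte CBOR head (one initial byte plus a two-byte length), giving 32768 bytes in total.
The helper name below is hypothetical.

```python
def cbor_text_encoded_length(text):
    """Size in bytes of ``text`` encoded as a CBOR text string (major type 3)."""
    payload_length = len(text.encode("utf-8"))
    if payload_length < 24:
        head = 1  # the length fits in the initial byte
    elif payload_length < 2 ** 8:
        head = 2  # initial byte + 1-byte length
    elif payload_length < 2 ** 16:
        head = 3  # initial byte + 2-byte length
    elif payload_length < 2 ** 32:
        head = 5  # initial byte + 4-byte length
    else:
        head = 9  # initial byte + 8-byte length
    return head + payload_length
```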
808 | Reading |
---|
809 | ~~~~~~~ |
---|
810 | |
---|
811 | ``GET /storage/v1/immutable/:storage_index/shares`` |
---|
812 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
813 | |
---|
814 | Retrieve a list (semantically, a set) indicating all shares available for the indicated storage index. |
---|
815 | The response body MUST validate against this CDDL schema:: |
---|
816 | |
---|
817 | #6.258([0*256 uint]) |
---|
818 | |
---|
819 | For example:: |
---|
820 | |
---|
821 | [1, 5] |
---|
822 | |
---|
823 | If the **storage index** in the request path is not known to the server then the response MUST include an empty list. |
---|
824 | |
---|
825 | ``GET /storage/v1/immutable/:storage_index/:share_number`` |
---|
826 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
827 | |
---|
828 | Read a contiguous sequence of bytes from one share in one bucket. |
---|
829 | The response body MUST be the raw share data (i.e., ``application/octet-stream``). |
---|
830 | The ``Range`` header MAY be used to request exactly one ``bytes`` range, |
---|
831 | in which case the response code MUST be ``Partial Content`` (206). |
---|
832 | Interpretation and response behavior MUST be as specified in RFC 7233 § 4.1. |
---|
833 | Multiple ranges in a single request are *not* supported; |
---|
834 | open-ended ranges are also not supported. |
---|
835 | Clients MUST NOT send requests using these features. |
---|
836 | |
---|
837 | If the requested range extends beyond the end of the data, |
---|
838 | the response MUST be shorter than the requested range. |
---|
839 | It MUST contain all data up to the end of the share and then end. |
---|
840 | The resulting ``Content-Range`` header MUST be consistent with the returned data. |
---|
841 | |
---|
842 | If the response to a query is an empty range, |
---|
843 | the server MUST send a ``No Content`` (204) response. |
---|
844 | |
---|
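The range handling described above can be sketched as follows.
The function name ``read_share_range`` is hypothetical, the caller is assumed to have already parsed a single closed ``bytes`` range with ``first <= last``,
and treating a range that starts at or past the end of the share as the "empty range" case is one reading of the text.

```python
def read_share_range(share, first, last):
    """Serve ``Range: bytes=first-last`` (both inclusive) from ``share``.

    Returns (status code, Content-Range value or None, body).
    Assumes the caller has validated that first <= last.
    """
    if first >= len(share):
        return 204, None, b""  # No Content: nothing lies inside the range
    last = min(last, len(share) - 1)  # stop at the end of the share
    body = share[first:last + 1]
    return 206, f"bytes {first}-{last}/{len(share)}", body
```

For a 48-byte share, a request for bytes 40 through 100 yields the last eight bytes with ``Content-Range: bytes 40-47/48``.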
845 | Discussion |
---|
846 | `````````` |
---|
847 | |
---|
848 | Multiple ``bytes`` ranges are not supported. |
---|
849 | HTTP requires that the ``Content-Type`` of the response in that case be ``multipart/...``. |
---|
850 | The ``multipart`` major type brings along string sentinel delimiting as a means to frame the different response parts. |
---|
851 | There are many drawbacks to this framing technique: |
---|
852 | |
---|
853 | 1. It is resource-intensive to generate. |
---|
854 | 2. It is resource-intensive to parse. |
---|
855 | 3. It is complex to parse safely [#]_ [#]_ [#]_ [#]_. |
---|
856 | |
---|
857 | A previous revision of this specification allowed requesting one or more contiguous sequences from one or more shares. |
---|
858 | This *superficially* mirrored the Foolscap-based interface somewhat closely. |
---|
859 | The interface was simplified to this version because this version is all that is required to let clients retrieve any desired information. |
---|
860 | It only requires that the client issue multiple requests. |
---|
861 | This can be done with pipelining or parallel requests to avoid an additional latency penalty. |
---|
862 | In the future, |
---|
863 | if there are performance goals, |
---|
864 | benchmarks can demonstrate whether they are achieved by a more complicated interface or some other change. |
---|
865 | |
---|
866 | Mutable |
---|
867 | ------- |
---|
868 | |
---|
869 | Writing |
---|
870 | ~~~~~~~ |
---|
871 | |
---|
872 | ``POST /storage/v1/mutable/:storage_index/read-test-write`` |
---|
873 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
874 | |
---|
875 | General purpose read-test-and-write operation for mutable storage indexes. |
---|
876 | A mutable storage index is also called a "slot" |
---|
877 | (particularly by the existing Tahoe-LAFS codebase). |
---|
878 | The first write operation on a mutable storage index creates it |
---|
879 | (that is, |
---|
880 | there is no separate "create this storage index" operation as there is for the immutable storage index type). |
---|
881 | |
---|
882 | The request MUST include ``X-Tahoe-Authorization`` headers with write enabler and lease secrets:: |
---|
883 | |
---|
884 | X-Tahoe-Authorization: write-enabler <base64-write-enabler-secret> |
---|
885 | X-Tahoe-Authorization: lease-cancel-secret <base64-lease-cancel-secret> |
---|
886 | X-Tahoe-Authorization: lease-renew-secret <base64-lease-renew-secret> |
---|
887 | |
---|
888 | The request body MUST include test, read, and write vectors for the operation. |
---|
889 | The request body MUST validate against this CDDL schema:: |
---|
890 | |
---|
891 | { |
---|
892 | "test-write-vectors": { |
---|
893 | 0*256 share_number : { |
---|
894 | "test": [0*30 {"offset": uint, "size": uint, "specimen": bstr}] |
---|
895 | "write": [* {"offset": uint, "data": bstr}] |
---|
896 | "new-length": uint / null |
---|
897 | } |
---|
898 | } |
---|
899 | "read-vector": [0*30 {"offset": uint, "size": uint}] |
---|
900 | } |
---|
901 | share_number = uint |
---|
902 | |
---|
903 | For example:: |
---|
904 | |
---|
905 | { |
---|
906 | "test-write-vectors": { |
---|
907 | 0: { |
---|
908 | "test": [{ |
---|
909 | "offset": 3, |
---|
910 | "size": 5, |
---|
911 | "specimen": "hello" |
---|
912 | }, ...], |
---|
913 | "write": [{ |
---|
914 | "offset": 9, |
---|
915 | "data": "world" |
---|
916 | }, ...], |
---|
917 | "new-length": 5 |
---|
918 | } |
---|
919 | }, |
---|
920 | "read-vector": [{"offset": 3, "size": 12}, ...] |
---|
921 | } |
---|
922 | |
---|
923 | The response body contains a boolean indicating whether the tests all succeed |
---|
924 | (and writes were applied) and a mapping giving read data (pre-write). |
---|
925 | The response body MUST validate against this CDDL schema:: |
---|
926 | |
---|
927 | { |
---|
928 | "success": bool, |
---|
929 | "data": {0*256 share_number: [0* bstr]} |
---|
930 | } |
---|
931 | share_number = uint |
---|
932 | |
---|
933 | For example:: |
---|
934 | |
---|
935 | { |
---|
936 | "success": true, |
---|
937 | "data": { |
---|
938 | 0: ["foo"], |
---|
939 | 5: ["bar"], |
---|
940 | ... |
---|
941 | } |
---|
942 | } |
---|
943 | |
---|
944 | A client MAY send a test vector or read vector to bytes beyond the end of existing data. |
---|
945 | In this case a server MUST behave as if the test or read vector referred to exactly as much data as exists. |
---|
946 | |
---|
947 | For example, |
---|
948 | consider the case where the server has 5 bytes of data for a particular share. |
---|
949 | If a client sends a read vector with an ``offset`` of 1 and a ``size`` of 4 then the server MUST respond with all of the data except the first byte. |
---|
950 | If a client sends a read vector with the same ``offset`` and a ``size`` of 5 (or any larger value) then the server MUST respond in the same way. |
---|
951 | |
---|
952 | Similarly, |
---|
953 | if there is no data at all, |
---|
954 | an empty byte string is returned no matter what the offset or length. |
---|
955 | |
---|
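The behavior for vectors that reach beyond the end of the data can be sketched with ordinary slicing.
The function names are hypothetical, and the equality-only test is an assumption drawn from the ``specimen`` field in the schema above.

```python
def read_vector(share, offset, size):
    """Return up to ``size`` bytes at ``offset``; short or empty past the end."""
    return share[offset:offset + size]


def check_test_vector(share, offset, size, specimen):
    """A test passes when the bytes actually present equal the specimen."""
    return read_vector(share, offset, size) == specimen
```

With a 5-byte share, ``read_vector(share, 1, 4)`` and ``read_vector(share, 1, 5)`` return the same four bytes.
A test vector of size 1 with an empty specimen passes only when the share holds no data.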
956 | Reading |
---|
957 | ~~~~~~~ |
---|
958 | |
---|
959 | ``GET /storage/v1/mutable/:storage_index/shares`` |
---|
960 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
961 | |
---|
962 | Retrieve a set indicating all shares available for the indicated storage index. |
---|
963 | The response body MUST validate against this CDDL schema:: |
---|
964 | |
---|
965 | #6.258([0*256 uint]) |
---|
966 | |
---|
967 | For example:: |
---|
968 | |
---|
969 | [1, 5] |
---|
970 | |
---|
971 | ``GET /storage/v1/mutable/:storage_index/:share_number`` |
---|
972 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
973 | |
---|
974 | Read data from the indicated mutable share, just like ``GET /storage/v1/immutable/:storage_index/:share_number``. |
---|
975 | |
---|
976 | The response body MUST be the raw share data (i.e., ``application/octet-stream``). |
---|
977 | The ``Range`` header MAY be used to request exactly one ``bytes`` range, |
---|
978 | in which case the response code MUST be ``Partial Content`` (206). |
---|
979 | Interpretation and response behavior MUST be as specified in RFC 7233 § 4.1. |
---|
980 | Multiple ranges in a single request are *not* supported; |
---|
981 | open-ended ranges are also not supported. |
---|
982 | Clients MUST NOT send requests using these features. |
---|
983 | |
---|
984 | If the requested range extends beyond the end of the data, |
---|
985 | the response MUST be shorter than the requested range. |
---|
986 | It MUST contain all data up to the end of the share and then end. |
---|
987 | The resulting ``Content-Range`` header MUST be consistent with the returned data. |
---|
988 | |
---|
989 | If the response to a query is an empty range, |
---|
990 | the server MUST send a ``No Content`` (204) response. |
---|
991 | |
---|
992 | |
---|
993 | ``POST /storage/v1/mutable/:storage_index/:share_number/corrupt`` |
---|
994 | !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! |
---|
995 | |
---|
996 | Advise the server that the data read from the indicated share was corrupt. |
---|
997 | Just like the immutable version. |
---|
998 | |
---|
999 | Sample Interactions |
---|
1000 | ------------------- |
---|
1001 | |
---|
1002 | This section contains examples of client/server interactions to help illuminate the above specification. |
---|
1003 | This section is non-normative. |
---|
1004 | |
---|
1005 | Immutable Data |
---|
1006 | ~~~~~~~~~~~~~~ |
---|
1007 | |
---|
1008 | 1. Create a bucket for storage index ``AAAAAAAAAAAAAAAA`` to hold two immutable shares, discovering that share ``1`` was already uploaded:: |
---|
1009 | |
---|
1010 | POST /storage/v1/immutable/AAAAAAAAAAAAAAAA |
---|
1011 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1012 | X-Tahoe-Authorization: lease-renew-secret efgh |
---|
1013 | X-Tahoe-Authorization: lease-cancel-secret jjkl |
---|
1014 | X-Tahoe-Authorization: upload-secret xyzf |
---|
1015 | |
---|
1016 | {"share-numbers": [1, 7], "allocated-size": 48} |
---|
1017 | |
---|
1018 | 200 OK |
---|
1019 | {"already-have": [1], "allocated": [7]} |
---|
1020 | |
---|
1021 | #. Upload the content for immutable share ``7``:: |
---|
1022 | |
---|
1023 | PATCH /storage/v1/immutable/AAAAAAAAAAAAAAAA/7 |
---|
1024 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1025 | Content-Range: bytes 0-15/48 |
---|
1026 | X-Tahoe-Authorization: upload-secret xyzf |
---|
1027 | <first 16 bytes of share data> |
---|
1028 | |
---|
1029 | 200 OK |
---|
1030 | { "required": [ {"begin": 16, "end": 48 } ] } |
---|
1031 | |
---|
1032 | PATCH /storage/v1/immutable/AAAAAAAAAAAAAAAA/7 |
---|
1033 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1034 | Content-Range: bytes 16-31/48 |
---|
1035 | X-Tahoe-Authorization: upload-secret xyzf |
---|
1036 | <second 16 bytes of share data> |
---|
1037 | |
---|
1038 | 200 OK |
---|
1039 | { "required": [ {"begin": 32, "end": 48 } ] } |
---|
1040 | |
---|
1041 | PATCH /storage/v1/immutable/AAAAAAAAAAAAAAAA/7 |
---|
1042 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1043 | Content-Range: bytes 32-47/48 |
---|
1044 | X-Tahoe-Authorization: upload-secret xyzf |
---|
1045 | <final 16 bytes of share data> |
---|
1046 | |
---|
1047 | 201 CREATED |
---|
1048 | |
---|
1049 | #. Download the content of the previously uploaded immutable share ``7``:: |
---|
1050 | |
---|
1051 | GET /storage/v1/immutable/AAAAAAAAAAAAAAAA/7 |
---|
1052 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1053 | Range: bytes=0-47 |
---|
1054 | |
---|
1055 | 206 PARTIAL CONTENT |
---|
1056 | Content-Range: bytes 0-47/48 |
---|
1057 | <complete 48 bytes of previously uploaded data> |
---|
1058 | |
---|
1059 | #. Renew the lease on all immutable shares in bucket ``AAAAAAAAAAAAAAAA``:: |
---|
1060 | |
---|
1061 | PUT /storage/v1/lease/AAAAAAAAAAAAAAAA |
---|
1062 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1063 | X-Tahoe-Authorization: lease-cancel-secret jjkl |
---|
1064 | X-Tahoe-Authorization: lease-renew-secret efgh |
---|
1065 | |
---|
1066 | 204 NO CONTENT |
---|
1067 | |
---|
1068 | Mutable Data |
---|
1069 | ~~~~~~~~~~~~ |
---|
1070 | |
---|
1071 | 1. Create mutable share number ``3`` with ``10`` bytes of data in slot ``BBBBBBBBBBBBBBBB``. |
---|
1072 | The special test vector of size 1 but empty bytes will only pass |
---|
1073 | if there is no existing share, |
---|
1074 | otherwise it will read a byte which won't match ``b""``:: |
---|
1075 | |
---|
1076 | POST /storage/v1/mutable/BBBBBBBBBBBBBBBB/read-test-write |
---|
1077 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1078 | X-Tahoe-Authorization: write-enabler abcd |
---|
1079 | X-Tahoe-Authorization: lease-cancel-secret efgh |
---|
1080 | X-Tahoe-Authorization: lease-renew-secret ijkl |
---|
1081 | |
---|
1082 | { |
---|
1083 | "test-write-vectors": { |
---|
1084 | 3: { |
---|
1085 | "test": [{ |
---|
1086 | "offset": 0, |
---|
1087 | "size": 1, |
---|
1088 | "specimen": "" |
---|
1089 | }], |
---|
1090 | "write": [{ |
---|
1091 | "offset": 0, |
---|
1092 | "data": "xxxxxxxxxx" |
---|
1093 | }], |
---|
1094 | "new-length": 10 |
---|
1095 | } |
---|
1096 | }, |
---|
1097 | "read-vector": [] |
---|
1098 | } |
---|
1099 | |
---|
1100 | 200 OK |
---|
1101 | { |
---|
1102 | "success": true, |
---|
1103 | "data": [] |
---|
1104 | } |
---|
1105 | |
---|
1106 | #. Safely rewrite the contents of a known version of mutable share number ``3`` (or fail):: |
---|
1107 | |
---|
1108 | POST /storage/v1/mutable/BBBBBBBBBBBBBBBB/read-test-write |
---|
1109 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1110 | X-Tahoe-Authorization: write-enabler abcd |
---|
1111 | X-Tahoe-Authorization: lease-cancel-secret efgh |
---|
1112 | X-Tahoe-Authorization: lease-renew-secret ijkl |
---|
1113 | |
---|
1114 | { |
---|
1115 | "test-write-vectors": { |
---|
1116 | 3: { |
---|
1117 | "test": [{ |
---|
1118 | "offset": 0, |
---|
1119 | "size": <length of checkstring>, |
---|
1120 | "specimen": "<checkstring>" |
---|
1121 | }], |
---|
1122 | "write": [{ |
---|
1123 | "offset": 0, |
---|
1124 | "data": "yyyyyyyyyy" |
---|
1125 | }], |
---|
1126 | "new-length": 10 |
---|
1127 | } |
---|
1128 | }, |
---|
1129 | "read-vector": [] |
---|
1130 | } |
---|
1131 | |
---|
1132 | 200 OK |
---|
1133 | { |
---|
1134 | "success": true, |
---|
1135 | "data": [] |
---|
1136 | } |
---|
1137 | |
---|
1138 | #. Download the contents of share number ``3``:: |
---|
1139 | |
---|
1140 | GET /storage/v1/mutable/BBBBBBBBBBBBBBBB/3 |
---|
1141 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1142 | Range: bytes=0-16 |
---|
1143 | |
---|
1144 | 206 PARTIAL CONTENT |
---|
1145 | Content-Range: bytes 0-15/16 |
---|
1146 | <complete 16 bytes of previously uploaded data> |
---|
1147 | |
---|
1148 | #. Renew the lease on previously uploaded mutable share in slot ``BBBBBBBBBBBBBBBB``:: |
---|
1149 | |
---|
1150 | PUT /storage/v1/lease/BBBBBBBBBBBBBBBB |
---|
1151 | Authorization: Tahoe-LAFS nurl-swissnum |
---|
1152 | X-Tahoe-Authorization: lease-cancel-secret efgh |
---|
1153 | X-Tahoe-Authorization: lease-renew-secret ijkl |
---|
1154 | |
---|
1155 | 204 NO CONTENT |
---|
1156 | |
---|
1157 | .. _Base64: https://www.rfc-editor.org/rfc/rfc4648#section-4 |
---|
1158 | |
---|
1159 | .. _RFC 4648: https://tools.ietf.org/html/rfc4648 |
---|
1160 | |
---|
1161 | .. _RFC 7469: https://tools.ietf.org/html/rfc7469#section-2.4 |
---|
1162 | |
---|
1163 | .. _RFC 7049: https://tools.ietf.org/html/rfc7049#section-4 |
---|
1164 | |
---|
1165 | .. _RFC 9110: https://tools.ietf.org/html/rfc9110 |
---|
1166 | |
---|
1167 | .. _CBOR: http://cbor.io/ |
---|
1168 | |
---|
1169 | .. [#] |
---|
1170 | The security value of checking ``notValidBefore`` and ``notValidAfter`` is not entirely clear. |
---|
1171 | The arguments which apply to web-facing certificates do not seem to apply |
---|
1172 | (due to the decision for Tahoe-LAFS to operate independently of the web-oriented CA system). |
---|
1173 | |
---|
1174 | Arguably, complexity is reduced by allowing an existing TLS implementation which wants to make these checks make them |
---|
1175 | (compared to including additional code to either bypass them or disregard their results). |
---|
1176 | Reducing complexity, at least in general, is often good for security. |
---|
1177 | |
---|
1178 | On the other hand, checking the validity time period forces certificate regeneration |
---|
1179 | (which comes with its own set of complexity). |
---|
1180 | |
---|
1181 | A possible compromise is to recommend certificates with validity periods of many years or decades. |
---|
1182 | "Recommend" may be read as "provide software supporting the generation of". |
---|
1183 | |
---|
1184 | What about key theft? |
---|
1185 | If certificates are valid for years then a successful attacker can pretend to be a valid storage node for years. |
---|
1186 | However, short-validity-period certificates are no help in this case. |
---|
1187 | The attacker can generate new, valid certificates using the stolen keys. |
---|
1188 | |
---|
1189 | Therefore, the only recourse to key theft |
---|
1190 | (really *identity theft*) |
---|
1191 | is to burn the identity and generate a new one. |
---|
1192 | Burning the identity is a non-trivial task. |
---|
1193 | It is worth solving but it is not solved here. |
---|
1194 | |
---|
1195 | .. [#] |
---|
1196 | More simply:: |
---|
1197 | |
---|
1198 | from hashlib import sha256 |
---|
1199 | from cryptography.hazmat.primitives.serialization import ( |
---|
1200 | Encoding, |
---|
1201 | PublicFormat, |
---|
1202 | ) |
---|
1203 | from pybase64 import urlsafe_b64encode |
---|
1204 | |
---|
1205 | def check_tub_id(cert, tub_id): |
---|
1206 | spki_bytes = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo) |
---|
1207 | spki_sha256 = sha256(spki_bytes).digest() |
---|
1208 | spki_encoded = urlsafe_b64encode(spki_sha256) |
---|
1209 | assert spki_encoded == tub_id |
---|
1210 | |
---|
1211 | Note we use `unpadded base64url`_ rather than the Foolscap- and Tahoe-LAFS-preferred Base32. |
---|
1212 | |
---|
1213 | .. [#] |
---|
1214 | https://www.cvedetails.com/cve/CVE-2017-5638/ |
---|
1215 | .. [#] |
---|
1216 | https://pivotal.io/security/cve-2018-1272 |
---|
1217 | .. [#] |
---|
1218 | https://nvd.nist.gov/vuln/detail/CVE-2017-5124 |
---|
1219 | .. [#] |
---|
1220 | https://efail.de/ |
---|
1221 | |
---|
1222 | .. _unpadded base64url: https://tools.ietf.org/html/rfc7515#appendix-C |
---|
1223 | |
---|
1224 | .. _attacking SHA1: https://en.wikipedia.org/wiki/SHA-1#Attacks |
---|