.. -*- coding: utf-8-with-signature -*-

=======================
Tahoe-LAFS Architecture
=======================

1.  `Overview`_
2.  `The Key-Value Store`_
3.  `File Encoding`_
4.  `Capabilities`_
5.  `Server Selection`_
6.  `Swarming Download, Trickling Upload`_
7.  `The File Store Layer`_
8.  `Leases, Refreshing, Garbage Collection`_
9.  `File Repairer`_
10. `Security`_
11. `Reliability`_


Overview
========

(See the `docs/specifications directory`_ for more details.)

There are three layers: the key-value store, the file store, and the
application.

The lowest layer is the key-value store. The keys are "capabilities" -- short
ASCII strings -- and the values are sequences of data bytes. This data is
encrypted and distributed across a number of nodes, such that it will survive
the loss of most of the nodes. There are no hard limits on the size of the
values, but there may be performance issues with extremely large values (just
due to the limitation of network bandwidth). In practice, values as small as
a few bytes and as large as tens of gigabytes are in common use.
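
The shape of this layer can be pictured as a tiny interface. This is an
illustrative sketch only -- the class and method names below are invented and
the real storage API is richer (shares, leases, mutable slots, and so on):

.. code-block:: python

    from typing import Protocol


    class KeyValueStore(Protocol):
        """Hypothetical view of the key-value layer, for illustration only."""

        def put(self, data: bytes) -> str:
            """Store *data* and return the capability string that refers to it."""
            ...

        def get(self, capability: str) -> bytes:
            """Return the exact bytes previously stored under *capability*."""
            ...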

The middle layer is the decentralized file store: a directed graph in which
the intermediate nodes are directories and the leaf nodes are files. The leaf
nodes contain only the data -- they contain no metadata other than the length
in bytes. The edges leading to leaf nodes have metadata attached to them
about the file they point to. Therefore, the same file may be associated with
different metadata if it is referred to through different edges.

The top layer consists of the applications using the file store.
Allmydata.com used it for a backup service: the application periodically
copied files from the local disk onto the decentralized file store, and
read-only access to those files was provided later, allowing users to recover
them. There are several other applications built on top of the Tahoe-LAFS
file store (see the RelatedProjects_ page of the wiki for a list).

.. _docs/specifications directory: https://github.com/tahoe-lafs/tahoe-lafs/tree/master/docs/specifications
.. _RelatedProjects: https://tahoe-lafs.org/trac/tahoe-lafs/wiki/RelatedProjects

The Key-Value Store
===================

The key-value store is implemented by a grid of Tahoe-LAFS storage servers --
user-space processes. Tahoe-LAFS storage clients communicate with the storage
servers over TCP.

There are two supported protocols:

* Foolscap, the only supported protocol in releases before v1.19.
* HTTPS, new in v1.19.

By default HTTPS is enabled. When HTTPS is enabled on the server, the server
transparently listens for both Foolscap and HTTPS on the same port. When it is
disabled, the server only supports Foolscap. Clients can use either; by default
they will use HTTPS when possible, falling back to Foolscap, but when configured
appropriately they will only use Foolscap. At this time the only limitation of
HTTPS is that I2P is not supported, so any usage of I2P uses Foolscap only.

Storage servers hold data in the form of "shares". Shares are encoded pieces
of files. There are a configurable number of shares for each file, 10 by
default. Normally, each share is stored on a separate server, but in some
cases a single server can hold multiple shares of a file.

Nodes learn about each other through an "introducer". Each server connects to
the introducer at startup and announces its presence. Each client connects to
the introducer at startup, and receives a list of all servers from it. Each
client then connects to every server, creating a "bi-clique" topology. In the
current release, nodes behind NAT boxes will connect to all nodes that they
can open connections to, but they cannot open connections to other nodes
behind NAT boxes. Therefore, the more nodes behind NAT boxes, the less the
topology resembles the intended bi-clique topology.

The introducer is a Single Point of Failure ("SPoF"), in that clients who
never connect to the introducer will be unable to connect to any storage
servers, but once a client has been introduced to everybody, it does not need
the introducer again until it is restarted. The danger of a SPoF is further
reduced in two ways. First, the introducer is defined by a hostname and a
private key, which are easy to move to a new host in case the original one
suffers an unrecoverable hardware problem. Second, even if the private key is
lost, clients can be reconfigured to use a new introducer.

For future releases, we have plans to decentralize introduction, allowing any
server to tell a new client about all the others.


File Encoding
=============

When a client stores a file on the grid, it first encrypts the file. It then
breaks the encrypted file into small segments, in order to reduce the memory
footprint, and to decrease the lag between initiating a download and
receiving the first part of the file; for example the lag between hitting
"play" and a movie actually starting.

The client then erasure-codes each segment, producing blocks of which only a
subset are needed to reconstruct the segment (3 out of 10, with the default
settings).

It sends one block from each segment to a given server. The set of blocks on
a given server constitutes a "share". Therefore a subset of the shares (3 out
of 10, by default) are needed to reconstruct the file.
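
The k-of-N idea can be illustrated with a toy Reed-Solomon-style code over
GF(257). This is a sketch only: Tahoe-LAFS actually uses the zfec library,
whose block format differs from this, and the ``encode``/``decode`` names here
are invented for the example:

.. code-block:: python

    P = 257  # a prime field; every ciphertext byte value 0..255 fits

    def _interp(points, x):
        """Evaluate at *x* the unique polynomial through *points* (Lagrange, mod P)."""
        total = 0
        for i, (xi, yi) in enumerate(points):
            num = den = 1
            for j, (xj, _) in enumerate(points):
                if i != j:
                    num = num * (x - xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P
        return total

    def encode(segment: bytes, k: int, n: int):
        """Turn one segment into n blocks of symbols; any k blocks recover it."""
        if len(segment) % k:
            segment += b"\x00" * (k - len(segment) % k)  # real code records the true length
        blocks = [[] for _ in range(n)]
        for g in range(0, len(segment), k):
            group = segment[g:g + k]
            points = [(x + 1, group[x]) for x in range(k)]  # data symbols sit at x = 1..k
            for b in range(n):
                blocks[b].append(group[b] if b < k else _interp(points, b + 1))
        return blocks

    def decode(block_nums, blocks, k: int) -> bytes:
        """Recover the (padded) segment from any k (block number, block) pairs."""
        out = bytearray()
        for symbols in zip(*blocks):
            points = list(zip((b + 1 for b in block_nums), symbols))
            out.extend(_interp(points, x + 1) for x in range(k))
        return bytes(out)

    blocks = encode(b"one segment of ciphertext", k=3, n=10)
    recovered = decode([0, 4, 9], [blocks[0], blocks[4], blocks[9]], k=3)
    assert recovered.rstrip(b"\x00") == b"one segment of ciphertext"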

A hash of the encryption key is used to form the "storage index", which is
used for both server selection (described below) and to index shares within
the Storage Servers on the selected nodes.

The client computes secure hashes of the ciphertext and of the shares. It
uses `Merkle Trees`_ so that it is possible to verify the correctness of a
subset of the data without requiring all of the data. For example, this
allows you to verify the correctness of the first segment of a movie file and
then begin playing the movie file in your movie viewer before the entire
movie file has been downloaded.
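
A minimal sketch of that idea: build a Merkle tree over the segments, then
check any one segment against the root using only the sibling hashes along its
path. The tagging and tree layout here are simplified assumptions and do not
reproduce Tahoe's actual hash-tree format:

.. code-block:: python

    import hashlib

    def _h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def merkle_levels(leaves):
        """Build the tree bottom-up; returns every level, leaf hashes first, root last."""
        level = [_h(b"leaf:" + leaf) for leaf in leaves]
        levels = [level]
        while len(level) > 1:
            if len(level) % 2:                    # duplicate the last node on odd levels
                level = level + [level[-1]]
            level = [_h(b"node:" + level[i] + level[i + 1])
                     for i in range(0, len(level), 2)]
            levels.append(level)
        return levels

    def merkle_path(levels, index):
        """Collect the sibling hashes needed to check leaf *index* against the root."""
        path = []
        for level in levels[:-1]:
            if len(level) % 2:
                level = level + [level[-1]]
            path.append(level[index ^ 1])         # the sibling at this level
            index //= 2
        return path

    def verify_leaf(leaf, index, path, root):
        """Recompute the root from one leaf plus its sibling path."""
        node = _h(b"leaf:" + leaf)
        for sibling in path:
            pair = node + sibling if index % 2 == 0 else sibling + node
            node = _h(b"node:" + pair)
            index //= 2
        return node == root

    segments = [b"segment-0", b"segment-1", b"segment-2", b"segment-3", b"segment-4"]
    levels = merkle_levels(segments)
    root = levels[-1][0]
    assert verify_leaf(b"segment-2", 2, merkle_path(levels, 2), root)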

These hashes are stored in a small data structure named the Capability
Extension Block, which is stored on the storage servers alongside each share.

The capability contains the encryption key, the hash of the Capability
Extension Block, and any encoding parameters necessary to perform the
eventual decoding process. For convenience, it also contains the size of the
file being stored.
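
As a rough picture, an immutable read capability can be thought of as the
following record. The class and field names are hypothetical; a real
capability is a compact ASCII string, not a Python object:

.. code-block:: python

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ImmutableReadCap:          # hypothetical name, for illustration only
        encryption_key: bytes        # symmetric key for the ciphertext
        ceb_hash: bytes              # hash of the Capability Extension Block
        needed_shares: int           # k: how many shares reconstruct the file
        total_shares: int            # N: how many shares were produced
        size: int                    # length of the stored file, in bytes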

To download, the client that wishes to turn a capability into a sequence of
bytes will obtain the blocks from storage servers, use erasure-decoding to
turn them into segments of ciphertext, use the decryption key to convert that
into plaintext, then emit the plaintext bytes to the output target.

.. _`Merkle Trees`: http://systems.cs.colorado.edu/grunwald/Classes/Fall2003-InformationStorage/Papers/merkle-tree.pdf


Capabilities
============

Capabilities to immutable files represent a specific set of bytes. Think of
it like a hash function: you feed in a bunch of bytes, and you get out a
capability, which is deterministically derived from the input data: changing
even one bit of the input data will result in a completely different
capability.

Read-only capabilities to mutable files represent the ability to get a set of
bytes representing some version of the file, most likely the latest version.
Each read-only capability is unique. In fact, each mutable file has a unique
public/private key pair created when the mutable file is created, and the
read-only capability to that file includes a secure hash of the public key.

Read-write capabilities to mutable files represent the ability to read the
file (just like a read-only capability) and also to write a new version of
the file, overwriting any extant version. Read-write capabilities are unique
-- each one includes the secure hash of the private key associated with that
mutable file.

The capability provides both "location" and "identification": you can use it
to retrieve a set of bytes, and then you can use it to validate ("identify")
that these potential bytes are indeed the ones that you were looking for.

The "key-value store" layer doesn't include human-meaningful names.
Capabilities sit on the "global+secure" edge of `Zooko's Triangle`_. They are
self-authenticating, meaning that nobody can trick you into accepting a file
that doesn't match the capability you used to refer to that file. The
file store layer (described below) adds human-meaningful names atop the
key-value layer.

.. _`Zooko's Triangle`: https://en.wikipedia.org/wiki/Zooko%27s_triangle


Server Selection
================

When a file is uploaded, the encoded shares are sent to some servers. But to
which ones? The "server selection" algorithm is used to make this choice.

The storage index is used to consistently-permute the set of all server nodes
(by sorting them by ``HASH(storage_index+nodeid)``). Each file gets a different
permutation, which (on average) will evenly distribute shares among the grid
and avoid hotspots. Each server has announced its available space when it
connected to the introducer, and we use that available space information to
remove any servers that cannot hold an encoded share for our file. Then we ask
some of the servers thus removed if they are already holding any encoded shares
for our file; we use this information later. (We ask any servers which are in
the first 2*``N`` elements of the permuted list.)
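
A minimal sketch of the permutation step, with SHA-256 standing in for the
hash that the real implementation uses (which is a tagged hash over binary
node identifiers):

.. code-block:: python

    import hashlib

    def permuted_servers(storage_index: bytes, server_ids: list) -> list:
        """Sort servers by HASH(storage_index + nodeid); each file gets its own order."""
        return sorted(server_ids,
                      key=lambda nodeid: hashlib.sha256(storage_index + nodeid).digest())

    # Different storage indexes yield different orderings, which is what spreads
    # shares evenly across the grid on average.
    servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
    print(permuted_servers(b"storage-index-1", servers))
    print(permuted_servers(b"storage-index-2", servers))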

We then use the permuted list of servers to ask each server, in turn, if it
will hold a share for us (a share that was not reported as being already
present when we talked to the full servers earlier, and that we have not
already planned to upload to a different server). We plan to send a share to a
server by sending an 'allocate_buckets() query' to the server with the number
of that share. Some will say yes they can hold that share, others (those who
have become full since they announced their available space) will say no; when
a server refuses our request, we take that share to the next server on the
list. In the response to allocate_buckets() the server will also inform us of
any shares of that file that it already has. We keep going until we run out of
shares that need to be stored. At the end of the process, we'll have a table
that maps each share number to a server, and then we can begin the encode and
push phase, using the table to decide where each share should be sent.
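
That loop can be sketched roughly as follows. The ``allocate_buckets`` call
here is a stand-in for the storage-protocol query described above, assumed to
return the share numbers the server already has plus the ones it newly
accepted; the real client batches queries and handles errors and timeouts:

.. code-block:: python

    def place_shares(permuted_servers, total_shares):
        """Walk the permuted list, assigning each share number to a willing server."""
        unplaced = set(range(total_shares))
        placements = {}                              # share number -> server
        candidates = list(permuted_servers)
        while unplaced and candidates:
            for server in list(candidates):          # loop around again if shares remain
                if not unplaced:
                    break
                sharenum = min(unplaced)
                already_have, accepted = server.allocate_buckets({sharenum})
                for s in (already_have | accepted) & unplaced:
                    placements[s] = server
                    unplaced.discard(s)
                if not (already_have or accepted):   # full or refusing: stop asking it
                    candidates.remove(server)
        return placements                            # the share-to-server table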

Most of the time, this will result in one share per server, which gives us
maximum reliability.  If there are fewer writable servers than there are
unstored shares, we'll be forced to loop around, eventually giving multiple
shares to a single server.

If we have to loop through the node list a second time, we accelerate the query
process by asking each node to hold multiple shares on the second pass. In
most cases, this means we'll never send more than two queries to any given
node.

If a server is unreachable, or has an error, or refuses to accept any of our
shares, we remove it from the permuted list, so we won't query it again for
this file. If a server already has shares for the file we're uploading, we add
that information to the share-to-server table. This lets us do less work for
files which have been uploaded once before, while making sure we still wind up
with as many shares as we desire.

Before a file upload is called successful, it has to pass an upload health
check. For immutable files, we check to see that a condition called
'servers-of-happiness' is satisfied. When satisfied, 'servers-of-happiness'
assures us that enough pieces of the file are distributed across enough
servers on the grid to ensure that the availability of the file will not be
affected if a few of those servers later fail. For mutable files and
directories, we check to see that all of the encoded shares generated during
the upload process were successfully placed on the grid. This is a weaker
check than 'servers-of-happiness'; it does not consider any information about
how the encoded shares are placed on the grid, and cannot detect situations in
which all or a majority of the encoded shares generated during the upload
process reside on only one storage server. We hope to extend
'servers-of-happiness' to mutable files in a future release of Tahoe-LAFS. If,
at the end of the upload process, the appropriate upload health check fails,
the upload is considered a failure.

The current defaults use ``k`` = 3, ``servers_of_happiness`` = 7, and ``N`` = 10.
``N`` = 10 means that we'll try to place 10 shares. ``k`` = 3 means that we need
any three shares to recover the file. ``servers_of_happiness`` = 7 means that
we'll consider an immutable file upload to be successful if we can place shares
on enough servers that there are 7 different servers, the correct functioning
of any ``k`` of which guarantees the availability of the immutable file.

``N`` = 10 and ``k`` = 3 means there is a 3.3x expansion factor. On a small grid, you
should set ``N`` about equal to the number of storage servers in your grid; on a
large grid, you might set it to something smaller to avoid the overhead of
contacting every server to place a file. In either case, you should then set ``k``
such that ``N``/``k`` reflects your desired availability goals. The best value for
``servers_of_happiness`` will depend on how you use Tahoe-LAFS. In a friendnet
with a variable number of servers, it might make sense to set it to the smallest
number of servers that you expect to have online and accepting shares at any
given time. In a stable environment without much server churn, it may make
sense to set ``servers_of_happiness`` = ``N``.
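
For a quick feel for these parameters, the expansion factor and the raw
storage consumed by an upload can be estimated as follows (ignoring hashes and
other per-share overhead):

.. code-block:: python

    k, happy, n = 3, 7, 10                # the current defaults
    expansion = n / k                     # ~3.33: each ciphertext byte becomes ~3.33 share bytes

    def grid_bytes(file_size: int) -> int:
        """Rough total bytes stored across the grid for a file of *file_size* bytes."""
        return file_size * n // k

    print(round(expansion, 2), grid_bytes(1_000_000))   # 3.33 and ~3.3 MB for a 1 MB file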

When downloading a file, the current version just asks all known servers for
any shares they might have. Once it has received enough responses that it
knows where to find the needed ``k`` shares, it downloads at least the first
segment from those servers. This means that it tends to download shares from
the fastest servers. If some servers had more than one share, it will continue
sending "Do You Have Block" requests to other servers, so that it can download
subsequent segments from distinct servers (sorted by their DYHB round-trip
times), if possible.
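
A simplified sketch of that selection step, assuming each DYHB reply has
already been reduced to a (round-trip time, server, share numbers) tuple:

.. code-block:: python

    def choose_sources(responses, k):
        """Pick the fastest servers until at least k distinct shares are covered."""
        chosen, covered = [], set()
        for rtt, server, shares in sorted(responses, key=lambda r: r[0]):
            new = set(shares) - covered              # shares this server adds
            if new:
                chosen.append((server, sorted(new)))
                covered |= new
            if len(covered) >= k:
                break
        return chosen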

  *future work*

  A future release will use the server selection algorithm to reduce the
  number of queries that must be sent out.

  Other peer-node selection algorithms are possible. One earlier version
  (known as "Tahoe 3") used the permutation to place the nodes around a large
  ring, distributed the shares evenly around the same ring, then walked
  clockwise from 0 with a basket. Each time it encountered a share, it put it
  in the basket; each time it encountered a server, it gave that server as many
  shares from the basket as it would accept. This reduced the number of queries
  (usually to 1) for small grids (where ``N`` is larger than the number of
  nodes), but resulted in extremely non-uniform share distribution, which
  significantly hurt reliability (sometimes the permutation resulted in most
  of the shares being dumped on a single node).

  Another algorithm (known as "denver airport" [#naming]_) uses the permuted hash to
  decide on an approximate target for each share, then sends lease requests
  via Chord routing. The request includes the contact information of the
  uploading node, and asks that the node which eventually accepts the lease
  should contact the uploader directly. The shares are then transferred over
  direct connections rather than through multiple Chord hops. Download uses
  the same approach. This allows nodes to avoid maintaining a large number of
  long-term connections, at the expense of complexity and latency.

.. [#naming]  all of these names are derived from the location where they were
        concocted, in this case in a car ride from Boulder to DEN. To be
        precise, "Tahoe 1" was an unworkable scheme in which everyone who holds
        shares for a given file would form a sort of cabal which kept track of
        all the others, "Tahoe 2" is the first-100-nodes in the permuted hash
        described in this document, and "Tahoe 3" (or perhaps "Potrero hill 1")
        was the abandoned ring-with-many-hands approach.


Swarming Download, Trickling Upload
===================================

Because the shares being downloaded are distributed across a large number of
nodes, the download process will pull from many of them at the same time. The
current encoding parameters require 3 shares to be retrieved for each
segment, which means that up to 3 nodes will be used simultaneously. For
larger networks, 8-of-22 encoding could be used, meaning 8 nodes can be used
simultaneously. This allows the download process to use the sum of the
available nodes' upload bandwidths, resulting in downloads that take full
advantage of the common 8x disparity between download and upload bandwidth on
modern ADSL lines.

On the other hand, uploads are hampered by the need to upload encoded shares
that are larger than the original data (3.3x larger with the current default
encoding parameters), through the slow end of the asymmetric connection. This
means that on a typical 8x ADSL line, uploading a file will take about 32
times longer than downloading it again later.

Smaller expansion ratios can reduce this upload penalty, at the expense of
reliability (see `Reliability`_, below). By using an "upload helper", this
penalty is eliminated: the client does a 1x upload of encrypted data to the
helper, then the helper performs encoding and pushes the shares to the
storage servers. This is an improvement if the helper has significantly
higher upload bandwidth than the client, so it makes the most sense for a
commercially-run grid for which all of the storage servers are in a colo
facility with high interconnect bandwidth. In this case, the helper is placed
in the same facility, so the helper-to-storage-server bandwidth is huge.

See :doc:`helper` for details about the upload helper.


The File Store Layer
====================

The "file store" layer is responsible for mapping human-meaningful pathnames
(directories and filenames) to pieces of data. The actual bytes inside these
files are referenced by capability, but the file store layer is where the
directory names, file names, and metadata are kept.

The file store layer is a graph of directories. Each directory contains a
table of named children. These children are either other directories or
files. All children are referenced by their capability.

A directory has two forms of capability: read-write caps and read-only caps.
The table of children inside the directory has a read-write and read-only
capability for each child. If you have a read-only capability for a given
directory, you will not be able to access the read-write capability of its
children. This results in "transitively read-only" directory access.
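
The child table can be pictured roughly like this; the structure is
illustrative only (real directories are serialized, encrypted records, not
Python dictionaries). The point is that a holder of the read-only directory
capability never sees the children's read-write capabilities:

.. code-block:: python

    # Each child entry carries both capabilities; the read-write one is only
    # reachable by holders of the directory's read-write capability.
    children = {
        "report.txt": {"rw_cap": "<rw cap for report.txt>", "ro_cap": "<ro cap for report.txt>"},
        "photos":     {"rw_cap": "<rw cap for photos>",     "ro_cap": "<ro cap for photos>"},
    }

    def read_only_view(children):
        """What the read-only capability exposes: read-only child caps only."""
        return {name: {"ro_cap": entry["ro_cap"]} for name, entry in children.items()}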

By having two different capabilities, you can choose which you want to share
with someone else. If you create a new directory and share the read-write
capability for it with a friend, then you will both be able to modify its
contents. If instead you give them the read-only capability, then they will
*not* be able to modify the contents. Any capability that you receive can be
linked in to any directory that you can modify, so very powerful
shared+published directory structures can be built from these components.

This structure enables individual users to have their own personal space, with
links to spaces that are shared with specific other users, and other spaces
that are globally visible.


Leases, Refreshing, Garbage Collection
======================================

When a file or directory in the file store is no longer referenced, the space
that its shares occupied on each storage server can be freed, making room for
other shares. Tahoe-LAFS uses a garbage collection ("GC") mechanism to
implement this space-reclamation process. Each share has one or more
"leases", which are managed by clients who want the file/directory to be
retained. The storage server accepts each share for a pre-defined period of
time, and is allowed to delete the share if all of the leases are cancelled
or allowed to expire.

Garbage collection is not enabled by default: storage servers will not delete
shares without being explicitly configured to do so. When GC is enabled,
clients are responsible for renewing their leases on a periodic basis at
least frequently enough to prevent any of the leases from expiring before the
next renewal pass.

See :doc:`garbage-collection` for further information, and for how to
configure garbage collection.

File Repairer
=============

Shares may go away because the storage server hosting them has suffered a
failure: either temporary downtime (affecting availability of the file), or a
permanent data loss (affecting the preservation of the file). Hard drives
crash, power supplies explode, coffee spills, and asteroids strike. The goal
of a robust distributed file store is to survive these setbacks.

To work against this slow, continual loss of shares, a File Checker is used
to periodically count the number of shares still available for any given
file. A more extensive form of checking known as the File Verifier can
download the ciphertext of the target file and perform integrity checks
(using strong hashes) to make sure the data is still intact. When the file is
found to have decayed below some threshold, the File Repairer can be used to
regenerate and re-upload the missing shares. These processes are conceptually
distinct (the repairer is only run if the checker/verifier decides it is
necessary), but in practice they will be closely related, and may run in the
same process.

The repairer process does not get the full capability of the file to be
maintained: it merely gets the "repairer capability" subset, which does not
include the decryption key. The File Verifier uses that data to find out
which nodes ought to hold shares for this file, and to see if those nodes are
still around and willing to provide the data. If the file is not healthy
enough, the File Repairer is invoked to download the ciphertext, regenerate
any missing shares, and upload them to new nodes. The goal of the File
Repairer is to finish up with a full set of ``N`` shares.
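
A rough sketch of the checker's decision, assuming the checker has already
counted how many distinct shares are still retrievable; the repair threshold
used here is illustrative (the real one is configurable):

.. code-block:: python

    def file_health(distinct_shares: int, k: int = 3, n: int = 10, repair_below: int = 7) -> str:
        """Classify a file from the number of distinct shares the checker found."""
        if distinct_shares < k:
            return "unrecoverable"        # fewer than k shares: repair is impossible
        if distinct_shares < repair_below:
            return "needs repair"         # recoverable, but too few shares for comfort
        return "healthy"                  # leave it alone until the next checker pass

    print(file_health(9), file_health(5), file_health(2))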

There are a number of engineering issues to be resolved here. The bandwidth,
disk IO, and CPU time consumed by the verification/repair process must be
balanced against the robustness that it provides to the grid. The nodes
involved in repair will have very different access patterns than normal
nodes, such that these processes may need to be run on hosts with more memory
or network connectivity than usual. The frequency of repair will directly
affect the resources consumed. In some cases, verification of multiple files
can be performed at the same time, and repair of files can be delegated off
to other nodes.

  *future work*

  Currently there are two modes of checking on the health of your file:
  "Checker" simply asks storage servers which shares they have and does
  nothing to try to verify that they aren't lying. "Verifier" downloads and
  cryptographically verifies every bit of every share of the file from every
  server, which costs a lot of network and CPU. A future improvement would be
  to make a random-sampling verifier which downloads and cryptographically
  verifies only a few randomly-chosen blocks from each server. This would
  require much less network and CPU but it could make it extremely unlikely
  that any sort of corruption -- even malicious corruption intended to evade
  detection -- would go undetected. This would be an instance of a
  cryptographic notion called "Proof of Retrievability". Note that to implement
  this requires no change to the server or to the cryptographic data structure
  -- with the current data structure and the current protocol it is up to the
  client which blocks they choose to download, so this would be solely a change
  in client behavior.


Security
========

The design goal for this project is that an attacker may be able to deny
service (i.e. prevent you from recovering a file that was uploaded earlier)
but can accomplish none of the following three attacks:

1) violate confidentiality: the attacker gets to view data to which you have
   not granted them access
2) violate integrity: the attacker convinces you that the wrong data is
   actually the data you were intending to retrieve
3) violate unforgeability: the attacker gets to modify a mutable file or
   directory (either the pathnames or the file contents) to which you have
   not given them write permission

Integrity (the promise that the downloaded data will match the uploaded data)
is provided by the hashes embedded in the capability (for immutable files) or
the digital signature (for mutable files). Confidentiality (the promise that
the data is only readable by people with the capability) is provided by the
encryption key embedded in the capability (for both immutable and mutable
files). Data availability (the hope that data which has been uploaded in the
past will be downloadable in the future) is provided by the grid, which
distributes failures in a way that reduces the correlation between individual
node failure and overall file recovery failure, and by the erasure-coding
technique used to generate shares.

Many of these security properties depend upon the usual cryptographic
assumptions: the resistance of AES and RSA to attack, the resistance of
SHA-256 to collision attacks and pre-image attacks, and upon the proximity of
2^-128 and 2^-256 to zero. A break in AES would allow a confidentiality
violation, a collision break in SHA-256 would allow a consistency violation,
and a break in RSA would allow a mutability violation.

There is no attempt made to provide anonymity, neither of the origin of a
piece of data nor the identity of the subsequent downloaders. In general,
anyone who already knows the contents of a file will be in a strong position
to determine who else is uploading or downloading it. Also, it is quite easy
for a sufficiently large coalition of nodes to correlate the set of nodes who
are all uploading or downloading the same file, even if the attacker does not
know the contents of the file in question.

Also note that the file size and (when convergence is being used) a keyed
hash of the plaintext are not protected. Many people can determine the size
of the file you are accessing, and if they already know the contents of a
given file, they will be able to determine that you are uploading or
downloading the same one.

The capability-based security model is used throughout this project.
Directory operations are expressed in terms of distinct read- and write-
capabilities. Knowing the read-capability of a file is equivalent to the
ability to read the corresponding data. The capability to validate the
correctness of a file is strictly weaker than the read-capability (possession
of read-capability automatically grants you possession of
validate-capability, but not vice versa). These capabilities may be expressly
delegated (irrevocably) by simply transferring the relevant secrets.

The application layer can provide whatever access model is desired, built on
top of this capability access model.


Reliability
===========

File encoding and peer-node selection parameters can be adjusted to achieve
different goals. Each choice results in a number of properties; there are
many tradeoffs.

First, some terms: the erasure-coding algorithm is described as ``k``-out-of-``N``
(for this release, the default values are ``k`` = 3 and ``N`` = 10). Each grid will
have some number of nodes; this number will rise and fall over time as nodes
join, drop out, come back, and leave forever. Files are of various sizes, some
are popular, others are unpopular. Nodes have various capacities, variable
upload/download bandwidths, and network latency. Most of the mathematical
models that look at node failure assume some average (and independent)
probability 'P' of a given node being available: this can be high (servers
tend to be online and available >90% of the time) or low (laptops tend to be
turned on for an hour then disappear for several days). Files are encoded in
segments of a given maximum size, which affects memory usage.

The ratio of ``N``/``k`` is the "expansion factor". Higher expansion factors
improve reliability very quickly (the binomial distribution curve is very sharp),
but consume much more grid capacity. When P=50%, the absolute value of ``k``
affects the granularity of the binomial curve (1-out-of-2 is much worse than
50-out-of-100), but high values asymptotically approach a constant (i.e.
500-of-1000 is not much better than 50-of-100). When P is high and the
expansion factor is held constant, higher values of ``k`` and ``N`` give
much better reliability (for P=99%, 50-out-of-100 is much much better than
5-of-10, roughly 10^50 times better), because there are more shares that can
be lost without losing the file.
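
The effect of ``k`` and ``N`` on availability can be made concrete with the
usual binomial model: if each server is independently available with
probability p, the file is unavailable whenever fewer than ``k`` of its ``N``
shares are reachable. A small sketch (illustrative only, since real-world
failures are rarely independent):

.. code-block:: python

    from math import comb

    def p_file_lost(p_node: float, k: int, n: int) -> float:
        """Probability that fewer than k of the n shares' servers are available."""
        return sum(comb(n, i) * p_node**i * (1 - p_node)**(n - i) for i in range(k))

    for k, n in [(1, 2), (3, 10), (5, 10), (50, 100)]:
        print(f"{k}-of-{n}: P(file unavailable) = {p_file_lost(0.99, k, n):.3g}")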

Likewise, the total number of nodes in the network affects the same
granularity: having only one node means a single point of failure, no matter
how many copies of the file you make. Independent nodes (with uncorrelated
failures) are necessary to hit the mathematical ideals: if you have 100 nodes
but they are all in the same office building, then a single power failure
will take out all of them at once. Pseudospoofing, also called a "Sybil Attack",
is where a single attacker convinces you that they are actually multiple
servers, so that you think you are using a large number of independent nodes,
but in fact you have a single point of failure (where the attacker turns off
all their machines at once). Large grids, with lots of truly independent nodes,
will enable the use of lower expansion factors to achieve the same reliability,
but will increase overhead because each node needs to know something about
every other, and the rate at which nodes come and go will be higher (requiring
network maintenance traffic). Also, the File Repairer work will increase with
larger grids, although then the job can be distributed out to more nodes.

Higher values of ``N`` increase overhead: more shares means more Merkle hashes
that must be included with the data, and more nodes to contact to retrieve
the shares. Smaller segment sizes reduce memory usage (since each segment
must be held in memory while erasure coding runs) and improve "alacrity"
(since downloading can validate a smaller piece of data faster, delivering it
to the target sooner), but also increase overhead (because more blocks means
more Merkle hashes to validate them).

In general, small private grids should work well, but the participants will
have to decide between storage overhead and reliability. Large stable grids
will be able to reduce the expansion factor down to a bare minimum while
still retaining high reliability, but large unstable grids (where nodes are
coming and going very quickly) may require more repair/verification bandwidth
than actual upload/download traffic.