1 | ============================================ |
---|
2 | Performance costs for some common operations |
---|
3 | ============================================ |
---|
4 | |
---|
5 | 1. `Publishing an A-byte immutable file`_ |
---|
6 | 2. `Publishing an A-byte mutable file`_ |
---|
7 | 3. `Downloading B bytes of an A-byte immutable file`_ |
---|
8 | 4. `Downloading B bytes of an A-byte mutable file`_ |
---|
9 | 5. `Modifying B bytes of an A-byte mutable file`_ |
---|
10 | 6. `Inserting/Removing B bytes in an A-byte mutable file`_ |
---|
11 | 7. `Adding an entry to an A-entry directory`_ |
---|
12 | 8. `Listing an A entry directory`_ |
---|
13 | 9. `Performing a file-check on an A-byte file`_ |
---|
14 | 10. `Performing a file-verify on an A-byte file`_ |
---|
15 | 11. `Repairing an A-byte file (mutable or immutable)`_ |
---|
16 | |
---|
17 | ``K`` indicates the number of shares required to reconstruct the file |
---|
18 | (default: 3) |
---|
19 | |
---|
20 | ``N`` indicates the total number of shares produced (default: 10) |
---|
21 | |
---|
22 | ``S`` indicates the segment size (default: 128 KiB) |
---|
23 | |
---|
24 | ``A`` indicates the number of bytes in a file |
---|
25 | |
---|
26 | ``B`` indicates the number of bytes of a file which are being read or |
---|
27 | written |
---|
28 | |
---|
29 | ``G`` indicates the number of storage servers on your grid |
---|
30 | |
---|
31 | Publishing an ``A``-byte immutable file |
---|
32 | ======================================= |
---|
33 | |
---|
34 | when the file is already uploaded |
---|
35 | --------------------------------- |
---|
36 | |
---|
37 | If the file is already uploaded with the exact same contents, same |
---|
38 | erasure coding parameters (K, N), and same added convergence secret, |
---|
39 | then it reads the whole file from disk one time while hashing it to |
---|
40 | compute the storage index, then contacts about N servers to ask each |
---|
41 | one to store a share. All of the servers reply that they already have |
---|
42 | a copy of that share, and the upload is done. |
---|
43 | |
---|
44 | disk: A |
---|
45 | |
---|
46 | cpu: ~A |
---|
47 | |
---|
48 | network: ~N |
---|
49 | |
---|
50 | memory footprint: N/K*S |
---|
51 | |
---|
52 | when the file is not already uploaded |
---|
53 | ------------------------------------- |
---|
54 | |
---|
55 | If the file is not already uploaded with the exact same contents, same |
---|
56 | erasure coding parameters (K, N), and same added convergence secret, |
---|
57 | then it reads the whole file from disk one time while hashing it to |
---|
58 | compute the storage index, then contacts about N servers to ask each |
---|
59 | one to store a share. Then it uploads each share to a storage server. |
---|
60 | |
---|
61 | disk: 2*A |
---|
62 | |
---|
63 | cpu: 2*~A |
---|
64 | |
---|
65 | network: ~N + ~A |
---|
66 | |
---|
67 | memory footprint: N/K*S |
---|
68 | |
---|
69 | Publishing an ``A``-byte mutable file |
---|
70 | ===================================== |
---|
71 | |
---|
72 | network: A |
---|
73 | |
---|
74 | memory footprint: N/K*A |
---|
75 | |
---|
76 | cpu: ~A + a large constant for RSA keypair generation |
---|
77 | |
---|
78 | notes: Tahoe-LAFS generates a new RSA keypair for each mutable file that it |
---|
79 | publishes to a grid. This takes up to 1 or 2 seconds on a typical desktop PC. |
---|
80 | |
---|
81 | Part of the process of encrypting, encoding, and uploading a mutable file to a |
---|
82 | Tahoe-LAFS grid requires that the entire file be in memory at once. For larger |
---|
83 | files, this may cause Tahoe-LAFS to have an unacceptably large memory footprint |
---|
84 | (at least when uploading a mutable file). |
---|
85 | |
---|
86 | Downloading ``B`` bytes of an ``A``-byte immutable file |
---|
87 | ======================================================= |
---|
88 | |
---|
89 | network: B |
---|
90 | |
---|
91 | cpu: ~A |
---|
92 | |
---|
93 | notes: When Tahoe-LAFS 1.8.0 or later is asked to read an arbitrary |
---|
94 | range of an immutable file, only the S-byte segments that overlap the |
---|
95 | requested range will be downloaded. |
---|
96 | |
---|
97 | (Earlier versions would download from the beginning of the file up |
---|
98 | until the end of the requested range, and then continue to download |
---|
99 | the rest of the file even after the request was satisfied.) |
---|
100 | |
---|
101 | Downloading ``B`` bytes of an ``A``-byte mutable file |
---|
102 | ===================================================== |
---|
103 | |
---|
104 | network: A |
---|
105 | |
---|
106 | memory footprint: A |
---|
107 | |
---|
108 | notes: As currently implemented, mutable files must be downloaded in |
---|
109 | their entirety before any part of them can be read. We are |
---|
110 | exploring fixes for this; see ticket #393 for more information. |
---|
111 | |
---|
112 | Modifying ``B`` bytes of an ``A``-byte mutable file |
---|
113 | =================================================== |
---|
114 | |
---|
115 | network: A |
---|
116 | |
---|
117 | memory footprint: N/K*A |
---|
118 | |
---|
119 | notes: If you upload a changed version of a mutable file that you |
---|
120 | earlier put onto your grid with, say, 'tahoe put --mutable', |
---|
121 | Tahoe-LAFS will replace the old file with the new file on the |
---|
122 | grid, rather than attempting to modify only those portions of the |
---|
123 | file that have changed. Modifying a file in this manner is |
---|
124 | essentially uploading the file over again, except that it re-uses |
---|
125 | the existing RSA keypair instead of generating a new one. |
---|
126 | |
---|
127 | Inserting/Removing ``B`` bytes in an ``A``-byte mutable file |
---|
128 | ============================================================ |
---|
129 | |
---|
130 | network: A |
---|
131 | |
---|
132 | memory footprint: N/K*A |
---|
133 | |
---|
134 | notes: Modifying any part of a mutable file in Tahoe-LAFS requires that |
---|
135 | the entire file be downloaded, modified, held in memory while it is |
---|
136 | encrypted and encoded, and then re-uploaded. A future version of the |
---|
137 | mutable file layout ("LDMF") may provide efficient inserts and |
---|
138 | deletes. Note that this sort of modification is mostly used internally |
---|
139 | for directories, and isn't something that the WUI, CLI, or other |
---|
140 | interfaces will do -- instead, they will simply overwrite the file to |
---|
141 | be modified, as described in "Modifying B bytes of an A-byte mutable |
---|
142 | file". |
---|
143 | |
---|
144 | Adding an entry to an ``A``-entry directory |
---|
145 | =========================================== |
---|
146 | |
---|
147 | network: ~A |
---|
148 | |
---|
149 | memory footprint: N/K*A |
---|
150 | |
---|
151 | notes: In Tahoe-LAFS, directories are implemented as specialized mutable |
---|
152 | files. So adding an entry to a directory is essentially adding B |
---|
153 | (actually, 300-330) bytes somewhere in an existing mutable file. |
---|
154 | |
---|
155 | Listing an ``A`` entry directory |
---|
156 | ================================ |
---|
157 | |
---|
158 | network: ~A |
---|
159 | |
---|
160 | memory footprint: N/K*A |
---|
161 | |
---|
162 | notes: Listing a directory requires that the mutable file storing the |
---|
163 | directory be downloaded from the grid. So listing an A entry |
---|
164 | directory requires downloading a (roughly) 330 * A byte mutable |
---|
165 | file, since each directory entry is about 300-330 bytes in size. |
---|
166 | |
---|
167 | Performing a file-check on an ``A``-byte file |
---|
168 | ============================================= |
---|
169 | |
---|
170 | network: ~G, where G is the number of servers on your grid |
---|
171 | |
---|
172 | memory footprint: negligible |
---|
173 | |
---|
174 | notes: To check a file, Tahoe-LAFS queries all the servers that it knows |
---|
175 | about. Note that neither of these values directly depend on the size |
---|
176 | of the file. This is relatively inexpensive, compared to the verify |
---|
177 | and repair operations. |
---|
178 | |
---|
179 | Performing a file-verify on an ``A``-byte file |
---|
180 | ============================================== |
---|
181 | |
---|
182 | network: N/K*A |
---|
183 | |
---|
184 | memory footprint: N/K*S |
---|
185 | |
---|
186 | notes: To verify a file, Tahoe-LAFS downloads all of the ciphertext |
---|
187 | shares that were originally uploaded to the grid and integrity |
---|
188 | checks them. This is, for well-behaved grids, likely to be more |
---|
189 | expensive than downloading an A-byte file, since only a fraction |
---|
190 | of these shares are necessary to recover the file. |
---|
191 | |
---|
192 | Repairing an ``A``-byte file (mutable or immutable) |
---|
193 | =================================================== |
---|
194 | |
---|
195 | network: variable; up to around ~A |
---|
196 | |
---|
197 | memory footprint: from S to (1+N/K)*S |
---|
198 | |
---|
199 | notes: To repair a file, Tahoe-LAFS downloads the file, and generates/uploads |
---|
200 | missing shares in the same way as when it initially uploads the file. |
---|
201 | So, depending on how many shares are missing, this can be about as |
---|
202 | expensive as initially uploading the file in the first place. |
---|