1 | Mon Aug 9 16:25:14 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
2 | * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF |
---|
3 | |
---|
4 | The checker and repairer required minimal changes to work with the MDMF |
---|
5 | modifications made elsewhere. The checker duplicated a lot of the code |
---|
6 | that was already in the downloader, so I modified the downloader |
---|
7 | slightly to expose this functionality to the checker and removed the |
---|
8 | duplicated code. The repairer only required a minor change to deal with |
---|
9 | data representation. |
---|
10 | |
---|
11 | Mon Aug 9 16:32:44 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
12 | * interfaces.py: Add #993 interfaces |
---|
13 | |
---|
14 | Mon Aug 9 16:35:35 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
15 | * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes |
---|
16 | |
---|
17 | Mon Aug 9 16:36:23 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
18 | * nodemaker.py: Make nodemaker expose a way to create MDMF files |
---|
19 | |
---|
20 | Mon Aug 9 16:37:55 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
21 | * web: Alter the webapi to get along with and take advantage of the MDMF changes |
---|
22 | |
---|
23 | The main benefit that the webapi gets from MDMF, at least initially, is |
---|
24 | the ability to do a streaming download of an MDMF mutable file. It also |
---|
25 | exposes a way (through the PUT verb) to append to or otherwise modify |
---|
26 | (in-place) an MDMF mutable file. |
---|
27 | |
---|
28 | Mon Aug 9 16:40:04 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
29 | * mutable/layout.py and interfaces.py: add MDMF writer and reader |
---|
30 | |
---|
31 | The MDMF writer is responsible for keeping state as plaintext is |
---|
32 | gradually processed into share data by the upload process. When the |
---|
33 | upload finishes, it will write all of its share data to a remote server, |
---|
34 | reporting its status back to the publisher. |
---|
35 | |
---|
36 | The MDMF reader is responsible for abstracting an MDMF file as it sits |
---|
37 | on the grid from the downloader; specifically, by receiving and |
---|
38 | responding to requests for arbitrary data within the MDMF file. |
---|
39 | |
---|
40 | The interfaces.py file has also been modified to contain an interface |
---|
41 | for the writer. |
---|
42 | |
---|
43 | Mon Aug 9 17:06:19 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
44 | * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one |
---|
45 | |
---|
46 | Mon Aug 9 17:06:33 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
47 | * immutable/literal.py: implement the same interfaces as other filenodes |
---|
48 | |
---|
49 | Wed Aug 11 16:30:49 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
50 | * mutable/filenode.py: add versions and partial-file updates to the mutable file node |
---|
51 | |
---|
52 | One of the goals of MDMF as a GSoC project is to lay the groundwork for |
---|
53 | LDMF, a format that will allow Tahoe-LAFS to deal with and encourage |
---|
54 | multiple versions of a single cap on the grid. In line with this, there |
---|
55 | is now a distinction between an overriding mutable file (which can be |
---|
56 | thought to correspond to the cap/unique identifier for that mutable |
---|
57 | file) and versions of the mutable file (which we can download, update, |
---|
58 | and so on). All download, upload, and modification operations end up |
---|
59 | happening on a particular version of a mutable file, but there are |
---|
60 | shortcut methods on the object representing the overriding mutable file |
---|
61 | that perform these operations on the best version of the mutable file |
---|
62 | (which is what code should be doing until we have LDMF and better |
---|
63 | support for other paradigms). |
---|
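The split between the overriding file and its versions can be pictured with a small, self-contained sketch. This is not the code in mutable/filenode.py: the class names `FileVersion` and `MutableFile` are hypothetical, the real methods return Deferreds rather than plain values, and recoverability checks are left out. It only illustrates how shortcut methods on the overriding file delegate to the best version (taken here to be the highest sequence number).

```python
class FileVersion:
    """One version of a mutable file (hypothetical stand-in)."""

    def __init__(self, seqnum, contents):
        self.seqnum = seqnum
        self.contents = contents

    def download_to_data(self):
        # The real method returns a Deferred; this sketch is synchronous.
        return self.contents

    def get_size(self):
        return len(self.contents)


class MutableFile:
    """The overriding file: roughly, the thing a cap identifies."""

    def __init__(self, versions):
        self._versions = list(versions)

    def get_best_version(self):
        # "Best" is assumed to mean the highest sequence number;
        # share counting and recoverability are omitted here.
        if not self._versions:
            raise LookupError("no recoverable version")
        return max(self._versions, key=lambda v: v.seqnum)

    # Shortcut methods: act on the best version on the caller's behalf.
    def download_best_version(self):
        return self.get_best_version().download_to_data()

    def get_size_of_best_version(self):
        return self.get_best_version().get_size()
```

Callers that do not care about versions use only the shortcut methods; callers that do can fetch a version object and operate on it directly.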
64 | |
---|
65 | Another goal of MDMF was to take advantage of segmentation to give |
---|
66 | callers more efficient partial file updates or appends. This patch |
---|
67 | implements methods that do that, too. |
---|
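Why segmentation makes partial updates cheaper can be shown with a little arithmetic. The helper below is a sketch under the assumption of fixed-size segments; `segments_touched` is a hypothetical name, and the actual segment size and the way the publisher re-encodes boundary segments are not described in this message.

```python
def segments_touched(offset, length, segment_size):
    """Return (first, last) indices of the segments covered by a write
    of `length` bytes starting at byte `offset`."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // segment_size
    # The last byte written is at offset + length - 1.
    last = (offset + length - 1) // segment_size
    return first, last
```

Only the segments in that range need to be fetched and rewritten, whereas a single-segment file has to be processed in full for any change.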
68 | |
---|
69 | |
---|
70 | Wed Aug 11 16:31:01 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
71 | * mutable/publish.py: Modify the publish process to support MDMF |
---|
72 | |
---|
73 | The inner workings of the publishing process needed to be reworked to a |
---|
74 | large extent to cope with segmented mutable files, and to cope with |
---|
75 | partial-file updates of mutable files. This patch does that. It also |
---|
76 | introduces wrappers for uploadable data, allowing the use of |
---|
77 | filehandle-like objects as data sources, in addition to strings. This |
---|
78 | reduces memory inefficiency when dealing with large files through the |
---|
79 | webapi, and clarifies update code there. |
---|
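The idea behind the uploadable wrappers can be sketched as two small classes that present the same interface over a string and over a filehandle. The names `StringSource`, `FileHandleSource`, and `iter_segments` are hypothetical and are not the classes this patch adds to mutable/publish.py; the sketch only shows why a filehandle-backed source avoids holding the whole file in memory.

```python
import io
import os


class StringSource:
    """Wrap an in-memory byte string (hypothetical stand-in)."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def get_size(self):
        return len(self._data)

    def read(self, length):
        chunk = self._data[self._pos:self._pos + length]
        self._pos += len(chunk)
        return chunk


class FileHandleSource:
    """Wrap a seekable filehandle-like object (hypothetical stand-in)."""

    def __init__(self, filehandle):
        self._fh = filehandle

    def get_size(self):
        # Measure the total size without disturbing the read position.
        pos = self._fh.tell()
        self._fh.seek(0, os.SEEK_END)
        size = self._fh.tell()
        self._fh.seek(pos)
        return size

    def read(self, length):
        return self._fh.read(length)


def iter_segments(source, segment_size):
    """Yield successive chunks of at most segment_size bytes."""
    while True:
        chunk = source.read(segment_size)
        if not chunk:
            break
        yield chunk


# Both kinds of source are consumed the same way, one segment at a time.
in_memory = list(iter_segments(StringSource(b"abcdefg"), 3))
from_handle = list(iter_segments(FileHandleSource(io.BytesIO(b"abcdefg")), 3))
```

With a filehandle-backed source, only one segment's worth of data needs to be resident at a time, which is where the memory saving for large webapi uploads would come from.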
80 | |
---|
81 | Wed Aug 11 16:31:25 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
82 | * mutable/retrieve.py: Modify the retrieval process to support MDMF |
---|
83 | |
---|
84 | The logic behind a mutable file download had to be adapted to work with |
---|
85 | segmented mutable files; this patch performs those adaptations. It also |
---|
86 | exposes some decoding and decrypting functionality to make partial-file |
---|
87 | updates a little easier, and supports efficient random-access downloads |
---|
88 | of parts of an MDMF file. |
---|
89 | |
---|
90 | Wed Aug 11 16:33:09 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
91 | * mutable/servermap.py: Alter the servermap updater to work with MDMF files |
---|
92 | |
---|
93 | These modifications were almost all aimed at having the |
---|
94 | servermap updater use the unified MDMF + SDMF read interface whenever |
---|
95 | possible -- this reduces the complexity of the code, making it easier to |
---|
96 | read and maintain. To do this, I needed to modify the process of |
---|
97 | updating the servermap a little bit. |
---|
98 | |
---|
99 | To support partial-file updates, I also modified the servermap updater |
---|
100 | to fetch the block hash trees and certain segments of files while it |
---|
101 | performed a servermap update (this can be done without adding any new |
---|
102 | roundtrips because of batch-read functionality that the read proxy has). |
---|
103 | |
---|
104 | |
---|
105 | Wed Aug 11 16:33:31 PDT 2010 Kevan Carstensen <kevan@isnotajoke.com> |
---|
106 | * tests: |
---|
107 | |
---|
108 | - A lot of existing tests relied on aspects of the mutable file |
---|
109 | implementation that were changed. This patch updates those tests |
---|
110 | to work with the changes. |
---|
111 | - This patch also adds tests for new features. |
---|
112 | |
---|
113 | New patches: |
---|
114 | |
---|
115 | [mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF |
---|
116 | Kevan Carstensen <kevan@isnotajoke.com>**20100809232514 |
---|
117 | Ignore-this: 1bcef2f262c868f61e57cc19a3cac89a |
---|
118 | |
---|
119 | The checker and repairer required minimal changes to work with the MDMF |
---|
120 | modifications made elsewhere. The checker duplicated a lot of the code |
---|
121 | that was already in the downloader, so I modified the downloader |
---|
122 | slightly to expose this functionality to the checker and removed the |
---|
123 | duplicated code. The repairer only required a minor change to deal with |
---|
124 | data representation. |
---|
125 | ] { |
---|
126 | hunk ./src/allmydata/mutable/checker.py 12 |
---|
127 | from allmydata.mutable.common import MODE_CHECK, CorruptShareError |
---|
128 | from allmydata.mutable.servermap import ServerMap, ServermapUpdater |
---|
129 | from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH |
---|
130 | +from allmydata.mutable.retrieve import Retrieve # for verifying |
---|
131 | |
---|
132 | class MutableChecker: |
---|
133 | |
---|
134 | hunk ./src/allmydata/mutable/checker.py 29 |
---|
135 | |
---|
136 | def check(self, verify=False, add_lease=False): |
---|
137 | servermap = ServerMap() |
---|
138 | + # Updating the servermap in MODE_CHECK will stand a good chance |
---|
139 | + # of finding all of the shares, and getting a good idea of |
---|
140 | + # recoverability, etc, without verifying. |
---|
141 | u = ServermapUpdater(self._node, self._storage_broker, self._monitor, |
---|
142 | servermap, MODE_CHECK, add_lease=add_lease) |
---|
143 | if self._history: |
---|
144 | hunk ./src/allmydata/mutable/checker.py 55 |
---|
145 | if num_recoverable: |
---|
146 | self.best_version = servermap.best_recoverable_version() |
---|
147 | |
---|
148 | + # The file is unhealthy and needs to be repaired if: |
---|
149 | + # - There are unrecoverable versions. |
---|
150 | if servermap.unrecoverable_versions(): |
---|
151 | self.need_repair = True |
---|
152 | hunk ./src/allmydata/mutable/checker.py 59 |
---|
153 | + # - There isn't exactly one recoverable version. |
---|
154 | if num_recoverable != 1: |
---|
155 | self.need_repair = True |
---|
156 | hunk ./src/allmydata/mutable/checker.py 62 |
---|
157 | + # - The best recoverable version is missing some shares. |
---|
158 | if self.best_version: |
---|
159 | available_shares = servermap.shares_available() |
---|
160 | (num_distinct_shares, k, N) = available_shares[self.best_version] |
---|
161 | hunk ./src/allmydata/mutable/checker.py 73 |
---|
162 | |
---|
163 | def _verify_all_shares(self, servermap): |
---|
164 | # read every byte of each share |
---|
165 | + # |
---|
166 | + # This logic is going to be very nearly the same as the |
---|
167 | + # downloader. I bet we could pass the downloader a flag that |
---|
168 | + # makes it do this, and piggyback onto that instead of |
---|
169 | + # duplicating a bunch of code. |
---|
170 | + # |
---|
171 | + # Like: |
---|
172 | + # r = Retrieve(blah, blah, blah, verify=True) |
---|
173 | + # d = r.download() |
---|
174 | + # (wait, wait, wait, d.callback) |
---|
175 | + # |
---|
176 | + # Then, when it has finished, we can check the servermap (which |
---|
177 | + # we provided to Retrieve) to figure out which shares are bad, |
---|
178 | + # since the Retrieve process will have updated the servermap as |
---|
179 | + # it went along. |
---|
180 | + # |
---|
181 | + # By passing the verify=True flag to the constructor, we are |
---|
182 | + # telling the downloader a few things. |
---|
183 | + # |
---|
184 | + # 1. It needs to download all N shares, not just K shares. |
---|
185 | + # 2. It doesn't need to decrypt or decode the shares, only |
---|
186 | + # verify them. |
---|
187 | if not self.best_version: |
---|
188 | return |
---|
189 | hunk ./src/allmydata/mutable/checker.py 97 |
---|
190 | - versionmap = servermap.make_versionmap() |
---|
191 | - shares = versionmap[self.best_version] |
---|
192 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
193 | - offsets_tuple) = self.best_version |
---|
194 | - offsets = dict(offsets_tuple) |
---|
195 | - readv = [ (0, offsets["EOF"]) ] |
---|
196 | - dl = [] |
---|
197 | - for (shnum, peerid, timestamp) in shares: |
---|
198 | - ss = servermap.connections[peerid] |
---|
199 | - d = self._do_read(ss, peerid, self._storage_index, [shnum], readv) |
---|
200 | - d.addCallback(self._got_answer, peerid, servermap) |
---|
201 | - dl.append(d) |
---|
202 | - return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True) |
---|
203 | |
---|
204 | hunk ./src/allmydata/mutable/checker.py 98 |
---|
205 | - def _do_read(self, ss, peerid, storage_index, shnums, readv): |
---|
206 | - # isolate the callRemote to a separate method, so tests can subclass |
---|
207 | - # Publish and override it |
---|
208 | - d = ss.callRemote("slot_readv", storage_index, shnums, readv) |
---|
209 | + r = Retrieve(self._node, servermap, self.best_version, verify=True) |
---|
210 | + d = r.download() |
---|
211 | + d.addCallback(self._process_bad_shares) |
---|
212 | return d |
---|
213 | |
---|
214 | hunk ./src/allmydata/mutable/checker.py 103 |
---|
215 | - def _got_answer(self, datavs, peerid, servermap): |
---|
216 | - for shnum,datav in datavs.items(): |
---|
217 | - data = datav[0] |
---|
218 | - try: |
---|
219 | - self._got_results_one_share(shnum, peerid, data) |
---|
220 | - except CorruptShareError: |
---|
221 | - f = failure.Failure() |
---|
222 | - self.need_repair = True |
---|
223 | - self.bad_shares.append( (peerid, shnum, f) ) |
---|
224 | - prefix = data[:SIGNED_PREFIX_LENGTH] |
---|
225 | - servermap.mark_bad_share(peerid, shnum, prefix) |
---|
226 | - ss = servermap.connections[peerid] |
---|
227 | - self.notify_server_corruption(ss, shnum, str(f.value)) |
---|
228 | - |
---|
229 | - def check_prefix(self, peerid, shnum, data): |
---|
230 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
231 | - offsets_tuple) = self.best_version |
---|
232 | - got_prefix = data[:SIGNED_PREFIX_LENGTH] |
---|
233 | - if got_prefix != prefix: |
---|
234 | - raise CorruptShareError(peerid, shnum, |
---|
235 | - "prefix mismatch: share changed while we were reading it") |
---|
236 | - |
---|
237 | - def _got_results_one_share(self, shnum, peerid, data): |
---|
238 | - self.check_prefix(peerid, shnum, data) |
---|
239 | - |
---|
240 | - # the [seqnum:signature] pieces are validated by _compare_prefix, |
---|
241 | - # which checks their signature against the pubkey known to be |
---|
242 | - # associated with this file. |
---|
243 | |
---|
244 | hunk ./src/allmydata/mutable/checker.py 104 |
---|
245 | - (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature, |
---|
246 | - share_hash_chain, block_hash_tree, share_data, |
---|
247 | - enc_privkey) = unpack_share(data) |
---|
248 | - |
---|
249 | - # validate [share_hash_chain,block_hash_tree,share_data] |
---|
250 | - |
---|
251 | - leaves = [hashutil.block_hash(share_data)] |
---|
252 | - t = hashtree.HashTree(leaves) |
---|
253 | - if list(t) != block_hash_tree: |
---|
254 | - raise CorruptShareError(peerid, shnum, "block hash tree failure") |
---|
255 | - share_hash_leaf = t[0] |
---|
256 | - t2 = hashtree.IncompleteHashTree(N) |
---|
257 | - # root_hash was checked by the signature |
---|
258 | - t2.set_hashes({0: root_hash}) |
---|
259 | - try: |
---|
260 | - t2.set_hashes(hashes=share_hash_chain, |
---|
261 | - leaves={shnum: share_hash_leaf}) |
---|
262 | - except (hashtree.BadHashError, hashtree.NotEnoughHashesError, |
---|
263 | - IndexError), e: |
---|
264 | - msg = "corrupt hashes: %s" % (e,) |
---|
265 | - raise CorruptShareError(peerid, shnum, msg) |
---|
266 | - |
---|
267 | - # validate enc_privkey: only possible if we have a write-cap |
---|
268 | - if not self._node.is_readonly(): |
---|
269 | - alleged_privkey_s = self._node._decrypt_privkey(enc_privkey) |
---|
270 | - alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s) |
---|
271 | - if alleged_writekey != self._node.get_writekey(): |
---|
272 | - raise CorruptShareError(peerid, shnum, "invalid privkey") |
---|
273 | + def _process_bad_shares(self, bad_shares): |
---|
274 | + if bad_shares: |
---|
275 | + self.need_repair = True |
---|
276 | + self.bad_shares = bad_shares |
---|
277 | |
---|
278 | hunk ./src/allmydata/mutable/checker.py 109 |
---|
279 | - def notify_server_corruption(self, ss, shnum, reason): |
---|
280 | - ss.callRemoteOnly("advise_corrupt_share", |
---|
281 | - "mutable", self._storage_index, shnum, reason) |
---|
282 | |
---|
283 | def _count_shares(self, smap, version): |
---|
284 | available_shares = smap.shares_available() |
---|
285 | hunk ./src/allmydata/mutable/repairer.py 5 |
---|
286 | from zope.interface import implements |
---|
287 | from twisted.internet import defer |
---|
288 | from allmydata.interfaces import IRepairResults, ICheckResults |
---|
289 | +from allmydata.mutable.publish import MutableData |
---|
290 | |
---|
291 | class RepairResults: |
---|
292 | implements(IRepairResults) |
---|
293 | hunk ./src/allmydata/mutable/repairer.py 108 |
---|
294 | raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.") |
---|
295 | |
---|
296 | d = self.node.download_version(smap, best_version, fetch_privkey=True) |
---|
297 | + d.addCallback(lambda data: |
---|
298 | + MutableData(data)) |
---|
299 | d.addCallback(self.node.upload, smap) |
---|
300 | d.addCallback(self.get_results, smap) |
---|
301 | return d |
---|
302 | } |
---|
303 | [interfaces.py: Add #993 interfaces |
---|
304 | Kevan Carstensen <kevan@isnotajoke.com>**20100809233244 |
---|
305 | Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce |
---|
306 | ] { |
---|
307 | hunk ./src/allmydata/interfaces.py 495 |
---|
308 | class MustNotBeUnknownRWError(CapConstraintError): |
---|
309 | """Cannot add an unknown child cap specified in a rw_uri field.""" |
---|
310 | |
---|
311 | + |
---|
312 | +class IReadable(Interface): |
---|
313 | + """I represent a readable object -- either an immutable file, or a |
---|
314 | + specific version of a mutable file. |
---|
315 | + """ |
---|
316 | + |
---|
317 | + def is_readonly(): |
---|
318 | + """Return True if this reference provides only read access to the given |
---|
319 | + file or directory (i.e. if you cannot modify it), or False if not. Note |
---|
320 | + that even if this reference is read-only, someone else may hold a |
---|
321 | + read-write reference to it. |
---|
322 | + |
---|
323 | + For an IReadable returned by get_best_readable_version(), this will |
---|
324 | + always return True, but for instances of subinterfaces such as |
---|
325 | + IMutableFileVersion, it may return False.""" |
---|
326 | + |
---|
327 | + def is_mutable(): |
---|
328 | + """Return True if this file or directory is mutable (by *somebody*, |
---|
329 | + not necessarily you), False if it is immutable. Note that a file |
---|
330 | + might be mutable overall, but your reference to it might be |
---|
331 | + read-only. On the other hand, all references to an immutable file |
---|
332 | + will be read-only; there are no read-write references to an immutable |
---|
333 | + file.""" |
---|
334 | + |
---|
335 | + def get_storage_index(): |
---|
336 | + """Return the storage index of the file.""" |
---|
337 | + |
---|
338 | + def get_size(): |
---|
339 | + """Return the length (in bytes) of this readable object.""" |
---|
340 | + |
---|
341 | + def download_to_data(): |
---|
342 | + """Download all of the file contents. I return a Deferred that fires |
---|
343 | + with the contents as a byte string.""" |
---|
344 | + |
---|
345 | + def read(consumer, offset=0, size=None): |
---|
346 | + """Download a portion (possibly all) of the file's contents, making |
---|
347 | + them available to the given IConsumer. Return a Deferred that fires |
---|
348 | + (with the consumer) when the consumer is unregistered (either because |
---|
349 | + the last byte has been given to it, or because the consumer threw an |
---|
350 | + exception during write(), possibly because it no longer wants to |
---|
351 | + receive data). The portion downloaded will start at 'offset' and |
---|
352 | + contain 'size' bytes (or the remainder of the file if size==None). |
---|
353 | + |
---|
354 | + The consumer will be used in non-streaming mode: an IPullProducer |
---|
355 | + will be attached to it. |
---|
356 | + |
---|
357 | + The consumer will not receive data right away: several network trips |
---|
358 | + must occur first. The order of events will be:: |
---|
359 | + |
---|
360 | + consumer.registerProducer(p, streaming) |
---|
361 | + (if streaming == False):: |
---|
362 | + consumer does p.resumeProducing() |
---|
363 | + consumer.write(data) |
---|
364 | + consumer does p.resumeProducing() |
---|
365 | + consumer.write(data).. (repeat until all data is written) |
---|
366 | + consumer.unregisterProducer() |
---|
367 | + deferred.callback(consumer) |
---|
368 | + |
---|
369 | + If a download error occurs, or an exception is raised by |
---|
370 | + consumer.registerProducer() or consumer.write(), I will call |
---|
371 | + consumer.unregisterProducer() and then deliver the exception via |
---|
372 | + deferred.errback(). To cancel the download, the consumer should call |
---|
373 | + p.stopProducing(), which will result in an exception being delivered |
---|
374 | + via deferred.errback(). |
---|
375 | + |
---|
376 | + See src/allmydata/util/consumer.py for an example of a simple |
---|
377 | + download-to-memory consumer. |
---|
378 | + """ |
---|
379 | + |
---|
380 | + |
---|
381 | +class IWritable(Interface): |
---|
382 | + """ |
---|
383 | + I define methods that callers can use to update SDMF and MDMF |
---|
384 | + mutable files on a Tahoe-LAFS grid. |
---|
385 | + """ |
---|
386 | + # XXX: For the moment, we have only this. It is possible that we |
---|
387 | + # want to move overwrite() and modify() in here too. |
---|
388 | + def update(data, offset): |
---|
389 | + """ |
---|
390 | + I write the data from my data argument to the MDMF file, |
---|
391 | + starting at offset. I continue writing data until my data |
---|
392 | + argument is exhausted, appending data to the file as necessary. |
---|
393 | + """ |
---|
394 | + # assert IMutableUploadable.providedBy(data) |
---|
395 | + # to append data: offset=node.get_size_of_best_version() |
---|
396 | + # do we want to support compacting MDMF? |
---|
397 | + # for an MDMF file, this can be done with O(data.get_size()) |
---|
398 | + # memory. For an SDMF file, any modification takes |
---|
399 | + # O(node.get_size_of_best_version()). |
---|
400 | + |
---|
401 | + |
---|
402 | +class IMutableFileVersion(IReadable): |
---|
403 | + """I provide access to a particular version of a mutable file. The |
---|
404 | + access is read/write if I was obtained from a filenode derived from |
---|
405 | + a write cap, or read-only if the filenode was derived from a read cap. |
---|
406 | + """ |
---|
407 | + |
---|
408 | + def get_sequence_number(): |
---|
409 | + """Return the sequence number of this version.""" |
---|
410 | + |
---|
411 | + def get_servermap(): |
---|
412 | + """Return the IMutableFileServerMap instance that was used to create |
---|
413 | + this object. |
---|
414 | + """ |
---|
415 | + |
---|
416 | + def get_writekey(): |
---|
417 | + """Return this filenode's writekey, or None if the node does not have |
---|
418 | + write-capability. This may be used to assist with data structures |
---|
419 | + that need to make certain data available only to writers, such as the |
---|
420 | + read-write child caps in dirnodes. The recommended process is to have |
---|
421 | + reader-visible data be submitted to the filenode in the clear (where |
---|
422 | + it will be encrypted by the filenode using the readkey), but encrypt |
---|
423 | + writer-visible data using this writekey. |
---|
424 | + """ |
---|
425 | + |
---|
426 | + # TODO: Can this be overwrite instead of replace? |
---|
427 | + def replace(new_contents): |
---|
428 | + """Replace the contents of the mutable file, provided that no other |
---|
429 | + node has published (or is attempting to publish, concurrently) a |
---|
430 | + newer version of the file than this one. |
---|
431 | + |
---|
432 | + I will avoid modifying any share that is different than the version |
---|
433 | + given by get_sequence_number(). However, if another node is writing |
---|
434 | + to the file at the same time as me, I may manage to update some shares |
---|
435 | + while they update others. If I see any evidence of this, I will signal |
---|
436 | + UncoordinatedWriteError, and the file will be left in an inconsistent |
---|
437 | + state (possibly the version you provided, possibly the old version, |
---|
438 | + possibly somebody else's version, and possibly a mix of shares from |
---|
439 | + all of these). |
---|
440 | + |
---|
441 | + The recommended response to UncoordinatedWriteError is to either |
---|
442 | + return it to the caller (since they failed to coordinate their |
---|
443 | + writes), or to attempt some sort of recovery. It may be sufficient to |
---|
444 | + wait a random interval (with exponential backoff) and repeat your |
---|
445 | + operation. If I do not signal UncoordinatedWriteError, then I was |
---|
446 | + able to write the new version without incident. |
---|
447 | + |
---|
448 | + I return a Deferred that fires (with a PublishStatus object) when the |
---|
449 | + update has completed. |
---|
450 | + """ |
---|
451 | + |
---|
452 | + def modify(modifier_cb): |
---|
453 | + """Modify the contents of the file, by downloading this version, |
---|
454 | + applying the modifier function (or bound method), then uploading |
---|
455 | + the new version. This will succeed as long as no other node |
---|
456 | + publishes a version between the download and the upload. |
---|
457 | + I return a Deferred that fires (with a PublishStatus object) when |
---|
458 | + the update is complete. |
---|
459 | + |
---|
460 | + The modifier callable will be given three arguments: a string (with |
---|
461 | + the old contents), a 'first_time' boolean, and a servermap. As with |
---|
462 | + download_to_data(), the old contents will be from this version, |
---|
463 | + but the modifier can use the servermap to make other decisions |
---|
464 | + (such as refusing to apply the delta if there are multiple parallel |
---|
465 | + versions, or if there is evidence of a newer unrecoverable version). |
---|
466 | + 'first_time' will be True the first time the modifier is called, |
---|
467 | + and False on any subsequent calls. |
---|
468 | + |
---|
469 | + The callable should return a string with the new contents. The |
---|
470 | + callable must be prepared to be called multiple times, and must |
---|
471 | + examine the input string to see if the change that it wants to make |
---|
472 | + is already present in the old version. If it does not need to make |
---|
473 | + any changes, it can either return None, or return its input string. |
---|
474 | + |
---|
475 | + If the modifier raises an exception, it will be returned in the |
---|
476 | + errback. |
---|
477 | + """ |
---|
478 | + |
---|
479 | + |
---|
480 | # The hierarchy looks like this: |
---|
481 | # IFilesystemNode |
---|
482 | # IFileNode |
---|
483 | hunk ./src/allmydata/interfaces.py 754 |
---|
484 | def raise_error(): |
---|
485 | """Raise any error associated with this node.""" |
---|
486 | |
---|
487 | + # XXX: These may not be appropriate outside the context of an IReadable. |
---|
488 | def get_size(): |
---|
489 | """Return the length (in bytes) of the data this node represents. For |
---|
490 | directory nodes, I return the size of the backing store. I return |
---|
491 | hunk ./src/allmydata/interfaces.py 771 |
---|
492 | class IFileNode(IFilesystemNode): |
---|
493 | """I am a node which represents a file: a sequence of bytes. I am not a |
---|
494 | container, like IDirectoryNode.""" |
---|
495 | + def get_best_readable_version(): |
---|
496 | + """Return a Deferred that fires with an IReadable for the 'best' |
---|
497 | + available version of the file. The IReadable provides only read |
---|
498 | + access, even if this filenode was derived from a write cap. |
---|
499 | |
---|
500 | hunk ./src/allmydata/interfaces.py 776 |
---|
501 | -class IImmutableFileNode(IFileNode): |
---|
502 | - def read(consumer, offset=0, size=None): |
---|
503 | - """Download a portion (possibly all) of the file's contents, making |
---|
504 | - them available to the given IConsumer. Return a Deferred that fires |
---|
505 | - (with the consumer) when the consumer is unregistered (either because |
---|
506 | - the last byte has been given to it, or because the consumer threw an |
---|
507 | - exception during write(), possibly because it no longer wants to |
---|
508 | - receive data). The portion downloaded will start at 'offset' and |
---|
509 | - contain 'size' bytes (or the remainder of the file if size==None). |
---|
510 | - |
---|
511 | - The consumer will be used in non-streaming mode: an IPullProducer |
---|
512 | - will be attached to it. |
---|
513 | + For an immutable file, there is only one version. For a mutable |
---|
514 | + file, the 'best' version is the recoverable version with the |
---|
515 | + highest sequence number. If no uncoordinated writes have occurred, |
---|
516 | + and if enough shares are available, then this will be the most |
---|
517 | + recent version that has been uploaded. If no version is recoverable, |
---|
518 | + the Deferred will errback with an UnrecoverableFileError. |
---|
519 | + """ |
---|
520 | |
---|
521 | hunk ./src/allmydata/interfaces.py 784 |
---|
522 | - The consumer will not receive data right away: several network trips |
---|
523 | - must occur first. The order of events will be:: |
---|
524 | + def download_best_version(): |
---|
525 | + """Download the contents of the version that would be returned |
---|
526 | + by get_best_readable_version(). This is equivalent to calling |
---|
527 | + download_to_data() on the IReadable given by that method. |
---|
528 | |
---|
529 | hunk ./src/allmydata/interfaces.py 789 |
---|
530 | - consumer.registerProducer(p, streaming) |
---|
531 | - (if streaming == False):: |
---|
532 | - consumer does p.resumeProducing() |
---|
533 | - consumer.write(data) |
---|
534 | - consumer does p.resumeProducing() |
---|
535 | - consumer.write(data).. (repeat until all data is written) |
---|
536 | - consumer.unregisterProducer() |
---|
537 | - deferred.callback(consumer) |
---|
538 | + I return a Deferred that fires with a byte string when the file |
---|
539 | + has been fully downloaded. To support streaming download, use |
---|
540 | + the 'read' method of IReadable. If no version is recoverable, |
---|
541 | + the Deferred will errback with an UnrecoverableFileError. |
---|
542 | + """ |
---|
543 | |
---|
544 | hunk ./src/allmydata/interfaces.py 795 |
---|
545 | - If a download error occurs, or an exception is raised by |
---|
546 | - consumer.registerProducer() or consumer.write(), I will call |
---|
547 | - consumer.unregisterProducer() and then deliver the exception via |
---|
548 | - deferred.errback(). To cancel the download, the consumer should call |
---|
549 | - p.stopProducing(), which will result in an exception being delivered |
---|
550 | - via deferred.errback(). |
---|
551 | + def get_size_of_best_version(): |
---|
552 | + """Find the size of the version that would be returned by |
---|
553 | + get_best_readable_version(). |
---|
554 | |
---|
555 | hunk ./src/allmydata/interfaces.py 799 |
---|
556 | - See src/allmydata/util/consumer.py for an example of a simple |
---|
557 | - download-to-memory consumer. |
---|
558 | + I return a Deferred that fires with an integer. If no version |
---|
559 | + is recoverable, the Deferred will errback with an |
---|
560 | + UnrecoverableFileError. |
---|
561 | """ |
---|
562 | |
---|
563 | hunk ./src/allmydata/interfaces.py 804 |
---|
564 | + |
---|
565 | +class IImmutableFileNode(IFileNode, IReadable): |
---|
566 | + """I am a node representing an immutable file. Immutable files have |
---|
567 | + only one version""" |
---|
568 | + |
---|
569 | + |
---|
570 | class IMutableFileNode(IFileNode): |
---|
571 | """I provide access to a 'mutable file', which retains its identity |
---|
572 | regardless of what contents are put in it. |
---|
573 | hunk ./src/allmydata/interfaces.py 869 |
---|
574 | only be retrieved and updated all-at-once, as a single big string. Future |
---|
575 | versions of our mutable files will remove this restriction. |
---|
576 | """ |
---|
577 | - |
---|
578 | - def download_best_version(): |
---|
579 | - """Download the 'best' available version of the file, meaning one of |
---|
580 | - the recoverable versions with the highest sequence number. If no |
---|
581 | + def get_best_mutable_version(): |
---|
582 | + """Return a Deferred that fires with an IMutableFileVersion for |
---|
583 | + the 'best' available version of the file. The best version is |
---|
584 | + the recoverable version with the highest sequence number. If no |
---|
585 | uncoordinated writes have occurred, and if enough shares are |
---|
586 | hunk ./src/allmydata/interfaces.py 874 |
---|
587 | - available, then this will be the most recent version that has been |
---|
588 | - uploaded. |
---|
589 | + available, then this will be the most recent version that has |
---|
590 | + been uploaded. |
---|
591 | |
---|
592 | hunk ./src/allmydata/interfaces.py 877 |
---|
593 | - I update an internal servermap with MODE_READ, determine which |
---|
594 | - version of the file is indicated by |
---|
595 | - servermap.best_recoverable_version(), and return a Deferred that |
---|
596 | - fires with its contents. If no version is recoverable, the Deferred |
---|
597 | - will errback with UnrecoverableFileError. |
---|
598 | - """ |
---|
599 | - |
---|
600 | - def get_size_of_best_version(): |
---|
601 | - """Find the size of the version that would be downloaded with |
---|
602 | - download_best_version(), without actually downloading the whole file. |
---|
603 | - |
---|
604 | - I return a Deferred that fires with an integer. |
---|
605 | + If no version is recoverable, the Deferred will errback with an |
---|
606 | + UnrecoverableFileError. |
---|
607 | """ |
---|
608 | |
---|
609 | def overwrite(new_contents): |
---|
610 | hunk ./src/allmydata/interfaces.py 917 |
---|
611 | errback. |
---|
612 | """ |
---|
613 | |
---|
614 | - |
---|
615 | def get_servermap(mode): |
---|
616 | """Return a Deferred that fires with an IMutableFileServerMap |
---|
617 | instance, updated using the given mode. |
---|
618 | hunk ./src/allmydata/interfaces.py 970 |
---|
619 | writer-visible data using this writekey. |
---|
620 | """ |
---|
621 | |
---|
622 | + def set_version(version): |
---|
623 | + """Tahoe-LAFS supports SDMF and MDMF mutable files. By default, |
---|
624 | + we upload in SDMF for reasons of compatibility. If you want to |
---|
625 | + change this, set_version will let you do that. |
---|
626 | + |
---|
627 | + To say that this file should be uploaded in SDMF, pass in a 0. To |
---|
628 | + say that the file should be uploaded as MDMF, pass in a 1. |
---|
629 | + """ |
---|
630 | + |
---|
631 | + def get_version(): |
---|
632 | + """Returns the mutable file protocol version.""" |
---|
633 | + |
---|
634 | class NotEnoughSharesError(Exception): |
---|
635 | """Download was unable to get enough shares""" |
---|
636 | |
---|
637 | hunk ./src/allmydata/interfaces.py 1786 |
---|
638 | """The upload is finished, and whatever filehandle was in use may be |
---|
639 | closed.""" |
---|
640 | |
---|
641 | + |
---|
642 | +class IMutableUploadable(Interface): |
---|
643 | + """ |
---|
644 | + I represent content that is due to be uploaded to a mutable filecap. |
---|
645 | + """ |
---|
646 | + # This is somewhat simpler than the IUploadable interface above |
---|
647 | + # because mutable files do not need to be concerned with possibly |
---|
648 | + # generating a CHK, nor with per-file keys. It is a subset of the |
---|
649 | + # methods in IUploadable, though, so we could just as well implement |
---|
650 | + # the mutable uploadables as IUploadables that don't happen to use |
---|
651 | + # those methods (with the understanding that the unused methods will |
---|
652 | + # never be called on such objects) |
---|
653 | + def get_size(): |
---|
654 | + """ |
---|
655 | + Returns a Deferred that fires with the size of the content held |
---|
656 | + by the uploadable. |
---|
657 | + """ |
---|
658 | + |
---|
659 | + def read(length): |
---|
660 | + """ |
---|
661 | + Returns a list of strings which, when concatenated, are the next |
---|
662 | + length bytes of the file, or fewer if there are fewer bytes |
---|
663 | + between the current location and the end of the file. |
---|
664 | + """ |
---|
665 | + |
---|
666 | + def close(): |
---|
667 | + """ |
---|
668 | + The process that used the Uploadable is finished using it, so |
---|
669 | + the uploadable may be closed. |
---|
670 | + """ |
---|
671 | + |
---|
672 | class IUploadResults(Interface): |
---|
673 | """I am returned by upload() methods. I contain a number of public |
---|
674 | attributes which can be read to determine the results of the upload. Some |
---|
675 | } |
---|
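The IMutableUploadable interface added in the patch above can be illustrated with a minimal in-memory sketch. This is not the `MutableData` or `MutableFileHandle` class from allmydata.mutable.publish; the class name `InMemoryUploadable` is hypothetical, and `get_size()` returns a plain integer rather than a Deferred so that the sketch runs without Twisted.

```python
class InMemoryUploadable:
    """Hypothetical stand-in that follows the shape of IMutableUploadable."""

    def __init__(self, data):
        self._data = bytes(data)
        self._offset = 0
        self._closed = False

    def get_size(self):
        # The interface describes a Deferred here; a plain int keeps the
        # sketch free of a Twisted dependency (assumption of this example).
        return len(self._data)

    def read(self, length):
        # Return a list of byte strings whose concatenation is the next
        # `length` bytes, or fewer if the end of the content is reached.
        if self._closed:
            raise ValueError("read() called on a closed uploadable")
        chunk = self._data[self._offset:self._offset + length]
        self._offset += len(chunk)
        return [chunk]

    def close(self):
        # The consumer is done with us; refuse further reads.
        self._closed = True


u = InMemoryUploadable(b"hello world")
first = b"".join(u.read(5))    # next 5 bytes
rest = b"".join(u.read(100))   # fewer than 100 bytes remain
u.close()
```

Returning a list of strings from `read()` lets a caller join the pieces itself, which matches the wording of the interface docstring.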
676 | [frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes |
---|
677 | Kevan Carstensen <kevan@isnotajoke.com>**20100809233535 |
---|
678 | Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f |
---|
679 | ] { |
---|
680 | hunk ./src/allmydata/frontends/sftpd.py 33 |
---|
681 | from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \ |
---|
682 | NoSuchChildError, ChildOfWrongTypeError |
---|
683 | from allmydata.mutable.common import NotWriteableError |
---|
684 | +from allmydata.mutable.publish import MutableFileHandle |
---|
685 | from allmydata.immutable.upload import FileHandle |
---|
686 | from allmydata.dirnode import update_metadata |
---|
687 | from allmydata.util.fileutil import EncryptedTemporaryFile |
---|
688 | hunk ./src/allmydata/frontends/sftpd.py 664 |
---|
689 | else: |
---|
690 | assert IFileNode.providedBy(filenode), filenode |
---|
691 | |
---|
692 | - if filenode.is_mutable(): |
---|
693 | - self.async.addCallback(lambda ign: filenode.download_best_version()) |
---|
694 | - def _downloaded(data): |
---|
695 | - self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker) |
---|
696 | - self.consumer.write(data) |
---|
697 | - self.consumer.finish() |
---|
698 | - return None |
---|
699 | - self.async.addCallback(_downloaded) |
---|
700 | - else: |
---|
701 | - download_size = filenode.get_size() |
---|
702 | - assert download_size is not None, "download_size is None" |
---|
703 | + self.async.addCallback(lambda ignored: filenode.get_best_readable_version()) |
---|
704 | + |
---|
705 | + def _read(version): |
---|
706 | + if noisy: self.log("_read", level=NOISY) |
---|
707 | + download_size = version.get_size() |
---|
708 | + assert download_size is not None |
---|
709 | + |
---|
710 | self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker) |
---|
711 | hunk ./src/allmydata/frontends/sftpd.py 672 |
---|
712 | - def _read(ign): |
---|
713 | - if noisy: self.log("_read immutable", level=NOISY) |
---|
714 | - filenode.read(self.consumer, 0, None) |
---|
715 | - self.async.addCallback(_read) |
---|
716 | + |
---|
717 | + version.read(self.consumer, 0, None) |
---|
718 | + self.async.addCallback(_read) |
---|
719 | |
---|
720 | eventually(self.async.callback, None) |
---|
721 | |
---|
722 | hunk ./src/allmydata/frontends/sftpd.py 818 |
---|
723 | assert parent and childname, (parent, childname, self.metadata) |
---|
724 | d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata)) |
---|
725 | |
---|
726 | - d2.addCallback(lambda ign: self.consumer.get_current_size()) |
---|
727 | - d2.addCallback(lambda size: self.consumer.read(0, size)) |
---|
728 | - d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents)) |
---|
729 | + d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file()))) |
---|
730 | else: |
---|
731 | def _add_file(ign): |
---|
732 | self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL) |
---|
733 | } |
---|
734 | [nodemaker.py: Make nodemaker expose a way to create MDMF files |
---|
735 | Kevan Carstensen <kevan@isnotajoke.com>**20100809233623 |
---|
736 | Ignore-this: a8a7c4283bb94be9fabb6fe3f2ca54b6 |
---|
737 | ] { |
---|
738 | hunk ./src/allmydata/nodemaker.py 3 |
---|
739 | import weakref |
---|
740 | from zope.interface import implements |
---|
741 | -from allmydata.interfaces import INodeMaker |
---|
742 | +from allmydata.util.assertutil import precondition |
---|
743 | +from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \ |
---|
744 | + SDMF_VERSION, MDMF_VERSION |
---|
745 | from allmydata.immutable.literal import LiteralFileNode |
---|
746 | from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode |
---|
747 | from allmydata.immutable.upload import Data |
---|
748 | hunk ./src/allmydata/nodemaker.py 10 |
---|
749 | from allmydata.mutable.filenode import MutableFileNode |
---|
750 | +from allmydata.mutable.publish import MutableData |
---|
751 | from allmydata.dirnode import DirectoryNode, pack_children |
---|
752 | from allmydata.unknown import UnknownNode |
---|
753 | from allmydata import uri |
---|
754 | hunk ./src/allmydata/nodemaker.py 93 |
---|
755 | return self._create_dirnode(filenode) |
---|
756 | return None |
---|
757 | |
---|
758 | - def create_mutable_file(self, contents=None, keysize=None): |
---|
759 | + def create_mutable_file(self, contents=None, keysize=None, |
---|
760 | + version=SDMF_VERSION): |
---|
761 | n = MutableFileNode(self.storage_broker, self.secret_holder, |
---|
762 | self.default_encoding_parameters, self.history) |
---|
763 | hunk ./src/allmydata/nodemaker.py 97 |
---|
764 | + n.set_version(version) |
---|
765 | d = self.key_generator.generate(keysize) |
---|
766 | d.addCallback(n.create_with_keys, contents) |
---|
767 | d.addCallback(lambda res: n) |
---|
768 | hunk ./src/allmydata/nodemaker.py 103 |
---|
769 | return d |
---|
770 | |
---|
771 | - def create_new_mutable_directory(self, initial_children={}): |
---|
772 | + def create_new_mutable_directory(self, initial_children={}, |
---|
773 | + version=SDMF_VERSION): |
---|
774 | + # initial_children must have metadata (i.e. {} instead of None) |
---|
775 | + for (name, (node, metadata)) in initial_children.iteritems(): |
---|
776 | + precondition(isinstance(metadata, dict), |
---|
777 | + "create_new_mutable_directory requires metadata to be a dict, not None", metadata) |
---|
778 | + node.raise_error() |
---|
779 | d = self.create_mutable_file(lambda n: |
---|
780 | hunk ./src/allmydata/nodemaker.py 111 |
---|
781 | - pack_children(initial_children, n.get_writekey())) |
---|
782 | + MutableData(pack_children(initial_children, |
---|
783 | + n.get_writekey())), |
---|
784 | + version) |
---|
785 | d.addCallback(self._create_dirnode) |
---|
786 | return d |
---|
787 | |
---|
788 | } |
---|
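The metadata check that create_new_mutable_directory gains in the nodemaker.py patch above can be sketched on its own. This is a rough illustration only: the helper name `check_initial_children` is hypothetical, it raises `TypeError` where the patch uses `precondition` from allmydata.util.assertutil, and it omits the `node.raise_error()` call.

```python
def check_initial_children(initial_children):
    """Reject children whose metadata is not a dict (e.g. None)."""
    # Each value is a (node, metadata) pair, as in the patch.
    for name, (node, metadata) in initial_children.items():
        if not isinstance(metadata, dict):
            raise TypeError(
                "metadata for %r must be a dict, not %r" % (name, metadata))


# An empty dict is acceptable metadata; None is not.
check_initial_children({"good": (object(), {})})
try:
    check_initial_children({"bad": (object(), None)})
    rejected = False
except TypeError:
    rejected = True
```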
789 | [web: Alter the webapi to get along with and take advantage of the MDMF changes |
---|
790 | Kevan Carstensen <kevan@isnotajoke.com>**20100809233755 |
---|
791 | Ignore-this: 724e169319427bb130c1331b30f92686 |
---|
792 | |
---|
793 | The main benefit that the webapi gets from MDMF, at least initially, is |
---|
794 | the ability to do a streaming download of an MDMF mutable file. It also |
---|
795 | exposes a way (through the PUT verb) to append to or otherwise modify |
---|
796 | (in-place) an MDMF mutable file. |
---|
797 | ] { |
---|
798 | hunk ./src/allmydata/web/common.py 34 |
---|
799 | else: |
---|
800 | return boolean_of_arg(replace) |
---|
801 | |
---|
802 | + |
---|
803 | +def parse_offset_arg(offset): |
---|
804 | + # XXX: This will raise a ValueError when invoked on something that |
---|
805 | + # is not an integer. Is that okay? Or do we want a better error |
---|
806 | + # message? Since this call is going to be used by programmers and |
---|
807 | + # their tools rather than users (through the wui), it is not |
---|
808 | + # inconsistent to return that, I guess. |
---|
809 | + offset = int(offset) |
---|
810 | + return offset |
---|
811 | + |
---|
812 | + |
---|
813 | def get_root(ctx_or_req): |
---|
814 | req = IRequest(ctx_or_req) |
---|
815 | # the addSlash=True gives us one extra (empty) segment |
---|
816 | hunk ./src/allmydata/web/filenode.py 12 |
---|
817 | from allmydata.interfaces import ExistingChildError |
---|
818 | from allmydata.monitor import Monitor |
---|
819 | from allmydata.immutable.upload import FileHandle |
---|
820 | +from allmydata.mutable.publish import MutableFileHandle |
---|
821 | from allmydata.util import log, base32 |
---|
822 | |
---|
823 | from allmydata.web.common import text_plain, WebError, RenderMixin, \ |
---|
824 | hunk ./src/allmydata/web/filenode.py 17 |
---|
825 | boolean_of_arg, get_arg, should_create_intermediate_directories, \ |
---|
826 | - MyExceptionHandler, parse_replace_arg |
---|
827 | + MyExceptionHandler, parse_replace_arg, parse_offset_arg |
---|
828 | from allmydata.web.check_results import CheckResults, \ |
---|
829 | CheckAndRepairResults, LiteralCheckResults |
---|
830 | from allmydata.web.info import MoreInfo |
---|
831 | hunk ./src/allmydata/web/filenode.py 27 |
---|
832 | # a new file is being uploaded in our place. |
---|
833 | mutable = boolean_of_arg(get_arg(req, "mutable", "false")) |
---|
834 | if mutable: |
---|
835 | - req.content.seek(0) |
---|
836 | - data = req.content.read() |
---|
837 | + data = MutableFileHandle(req.content) |
---|
838 | d = client.create_mutable_file(data) |
---|
839 | def _uploaded(newnode): |
---|
840 | d2 = self.parentnode.set_node(self.name, newnode, |
---|
841 | hunk ./src/allmydata/web/filenode.py 61 |
---|
842 | d.addCallback(lambda res: childnode.get_uri()) |
---|
843 | return d |
---|
844 | |
---|
845 | - def _read_data_from_formpost(self, req): |
---|
846 | - # SDMF: files are small, and we can only upload data, so we read |
---|
847 | - # the whole file into memory before uploading. |
---|
848 | - contents = req.fields["file"] |
---|
849 | - contents.file.seek(0) |
---|
850 | - data = contents.file.read() |
---|
851 | - return data |
---|
852 | |
---|
853 | def replace_me_with_a_formpost(self, req, client, replace): |
---|
854 | # create a new file, maybe mutable, maybe immutable |
---|
855 | hunk ./src/allmydata/web/filenode.py 66 |
---|
856 | mutable = boolean_of_arg(get_arg(req, "mutable", "false")) |
---|
857 | |
---|
858 | + # create an immutable file |
---|
859 | + contents = req.fields["file"] |
---|
860 | if mutable: |
---|
861 | hunk ./src/allmydata/web/filenode.py 69 |
---|
862 | - data = self._read_data_from_formpost(req) |
---|
863 | - d = client.create_mutable_file(data) |
---|
864 | + uploadable = MutableFileHandle(contents.file) |
---|
865 | + d = client.create_mutable_file(uploadable) |
---|
866 | def _uploaded(newnode): |
---|
867 | d2 = self.parentnode.set_node(self.name, newnode, |
---|
868 | overwrite=replace) |
---|
869 | hunk ./src/allmydata/web/filenode.py 78 |
---|
870 | return d2 |
---|
871 | d.addCallback(_uploaded) |
---|
872 | return d |
---|
873 | - # create an immutable file |
---|
874 | - contents = req.fields["file"] |
---|
875 | + |
---|
876 | uploadable = FileHandle(contents.file, convergence=client.convergence) |
---|
877 | d = self.parentnode.add_file(self.name, uploadable, overwrite=replace) |
---|
878 | d.addCallback(lambda newnode: newnode.get_uri()) |
---|
879 | hunk ./src/allmydata/web/filenode.py 84 |
---|
880 | return d |
---|
881 | |
---|
882 | + |
---|
883 | class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin): |
---|
884 | def __init__(self, client, parentnode, name): |
---|
885 | rend.Page.__init__(self) |
---|
886 | hunk ./src/allmydata/web/filenode.py 167 |
---|
887 | # properly. So we assume that at least the browser will agree |
---|
888 | # with itself, and echo back the same bytes that we were given. |
---|
889 | filename = get_arg(req, "filename", self.name) or "unknown" |
---|
890 | - if self.node.is_mutable(): |
---|
891 | - # some day: d = self.node.get_best_version() |
---|
892 | - d = makeMutableDownloadable(self.node) |
---|
893 | - else: |
---|
894 | - d = defer.succeed(self.node) |
---|
895 | + d = self.node.get_best_readable_version() |
---|
896 | d.addCallback(lambda dn: FileDownloader(dn, filename)) |
---|
897 | return d |
---|
898 | if t == "json": |
---|
899 | hunk ./src/allmydata/web/filenode.py 191 |
---|
900 | if t: |
---|
901 | raise WebError("GET file: bad t=%s" % t) |
---|
902 | filename = get_arg(req, "filename", self.name) or "unknown" |
---|
903 | - if self.node.is_mutable(): |
---|
904 | - # some day: d = self.node.get_best_version() |
---|
905 | - d = makeMutableDownloadable(self.node) |
---|
906 | - else: |
---|
907 | - d = defer.succeed(self.node) |
---|
908 | + d = self.node.get_best_readable_version() |
---|
909 | d.addCallback(lambda dn: FileDownloader(dn, filename)) |
---|
910 | return d |
---|
911 | |
---|
912 | hunk ./src/allmydata/web/filenode.py 199 |
---|
913 | req = IRequest(ctx) |
---|
914 | t = get_arg(req, "t", "").strip() |
---|
915 | replace = parse_replace_arg(get_arg(req, "replace", "true")) |
---|
916 | + offset = parse_offset_arg(get_arg(req, "offset", -1)) |
---|
917 | |
---|
918 | if not t: |
---|
919 | hunk ./src/allmydata/web/filenode.py 202 |
---|
920 | - if self.node.is_mutable(): |
---|
921 | + if self.node.is_mutable() and offset >= 0: |
---|
922 | + return self.update_my_contents(req, offset) |
---|
923 | + |
---|
924 | + elif self.node.is_mutable(): |
---|
925 | return self.replace_my_contents(req) |
---|
926 | if not replace: |
---|
927 | # this is the early trap: if someone else modifies the |
---|
928 | hunk ./src/allmydata/web/filenode.py 212 |
---|
929 | # directory while we're uploading, the add_file(overwrite=) |
---|
930 | # call in replace_me_with_a_child will do the late trap. |
---|
931 | raise ExistingChildError() |
---|
932 | + if offset >= 0: |
---|
933 | + raise WebError("PUT to a file: append operation invoked " |
---|
934 | + "on an immutable cap") |
---|
935 | + |
---|
936 | + |
---|
937 | assert self.parentnode and self.name |
---|
938 | return self.replace_me_with_a_child(req, self.client, replace) |
---|
939 | if t == "uri": |
---|
940 | hunk ./src/allmydata/web/filenode.py 279 |
---|
941 | |
---|
942 | def replace_my_contents(self, req): |
---|
943 | req.content.seek(0) |
---|
944 | - new_contents = req.content.read() |
---|
945 | + new_contents = MutableFileHandle(req.content) |
---|
946 | d = self.node.overwrite(new_contents) |
---|
947 | d.addCallback(lambda res: self.node.get_uri()) |
---|
948 | return d |
---|
949 | hunk ./src/allmydata/web/filenode.py 284 |
---|
950 | |
---|
951 | + |
---|
952 | + def update_my_contents(self, req, offset): |
---|
953 | + req.content.seek(0) |
---|
954 | + added_contents = MutableFileHandle(req.content) |
---|
955 | + |
---|
956 | + d = self.node.get_best_mutable_version() |
---|
957 | + d.addCallback(lambda mv: |
---|
958 | + mv.update(added_contents, offset)) |
---|
959 | + d.addCallback(lambda ignored: |
---|
960 | + self.node.get_uri()) |
---|
961 | + return d |
---|
962 | + |
---|
963 | + |
---|
964 | def replace_my_contents_with_a_formpost(self, req): |
---|
965 | # we have a mutable file. Get the data from the formpost, and replace |
---|
966 | # the mutable file's contents with it. |
---|
967 | hunk ./src/allmydata/web/filenode.py 300 |
---|
968 | - new_contents = self._read_data_from_formpost(req) |
---|
969 | + new_contents = req.fields['file'] |
---|
970 | + new_contents = MutableFileHandle(new_contents.file) |
---|
971 | + |
---|
972 | d = self.node.overwrite(new_contents) |
---|
973 | d.addCallback(lambda res: self.node.get_uri()) |
---|
974 | return d |
---|
975 | hunk ./src/allmydata/web/filenode.py 307 |
---|
976 | |
---|
977 | -class MutableDownloadable: |
---|
978 | - #implements(IDownloadable) |
---|
979 | - def __init__(self, size, node): |
---|
980 | - self.size = size |
---|
981 | - self.node = node |
---|
982 | - def get_size(self): |
---|
983 | - return self.size |
---|
984 | - def is_mutable(self): |
---|
985 | - return True |
---|
986 | - def read(self, consumer, offset=0, size=None): |
---|
987 | - d = self.node.download_best_version() |
---|
988 | - d.addCallback(self._got_data, consumer, offset, size) |
---|
989 | - return d |
---|
990 | - def _got_data(self, contents, consumer, offset, size): |
---|
991 | - start = offset |
---|
992 | - if size is not None: |
---|
993 | - end = offset+size |
---|
994 | - else: |
---|
995 | - end = self.size |
---|
996 | - # SDMF: we can write the whole file in one big chunk |
---|
997 | - consumer.write(contents[start:end]) |
---|
998 | - return consumer |
---|
999 | - |
---|
1000 | -def makeMutableDownloadable(n): |
---|
1001 | - d = defer.maybeDeferred(n.get_size_of_best_version) |
---|
1002 | - d.addCallback(MutableDownloadable, n) |
---|
1003 | - return d |
---|
1004 | |
---|
1005 | class FileDownloader(rend.Page): |
---|
1006 | # since we override the rendering process (to let the tahoe Downloader |
---|
1007 | hunk ./src/allmydata/web/unlinked.py 7 |
---|
1008 | from twisted.internet import defer |
---|
1009 | from nevow import rend, url, tags as T |
---|
1010 | from allmydata.immutable.upload import FileHandle |
---|
1011 | +from allmydata.mutable.publish import MutableFileHandle |
---|
1012 | from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \ |
---|
1013 | convert_children_json, WebError |
---|
1014 | from allmydata.web import status |
---|
1015 | hunk ./src/allmydata/web/unlinked.py 23 |
---|
1016 | def PUTUnlinkedSSK(req, client): |
---|
1017 | # SDMF: files are small, and we can only upload data |
---|
1018 | req.content.seek(0) |
---|
1019 | - data = req.content.read() |
---|
1020 | + data = MutableFileHandle(req.content) |
---|
1021 | d = client.create_mutable_file(data) |
---|
1022 | d.addCallback(lambda n: n.get_uri()) |
---|
1023 | return d |
---|
1024 | hunk ./src/allmydata/web/unlinked.py 87 |
---|
1025 | # "POST /uri", to create an unlinked file. |
---|
1026 | # SDMF: files are small, and we can only upload data |
---|
1027 | contents = req.fields["file"] |
---|
1028 | - contents.file.seek(0) |
---|
1029 | - data = contents.file.read() |
---|
1030 | + data = MutableFileHandle(contents.file) |
---|
1031 | d = client.create_mutable_file(data) |
---|
1032 | d.addCallback(lambda n: n.get_uri()) |
---|
1033 | return d |
---|
1034 | } |
---|
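The PUT handling in the web/filenode.py hunks above picks an action from the node's mutability and the `offset` argument. The following standalone sketch restates that decision as a pure function. The function `choose_put_action`, its return values, and the two exception classes are hypothetical stand-ins (the patch raises `ExistingChildError` and `WebError` and calls methods on the handler instead); only `parse_offset_arg` mirrors the patch directly.

```python
class ExistingChild(Exception):
    """Stand-in for allmydata's ExistingChildError (assumption)."""


class BadRequest(Exception):
    """Stand-in for the webapi's WebError (assumption)."""


def parse_offset_arg(offset):
    # As in the patch: int() raises ValueError for non-integer input.
    return int(offset)


def choose_put_action(is_mutable, offset_arg=-1, replace=True):
    offset = parse_offset_arg(offset_arg)
    if is_mutable and offset >= 0:
        # In-place modification at the given offset.
        return ("update", offset)
    if is_mutable:
        # No offset given: overwrite the whole mutable file.
        return ("replace", None)
    if not replace:
        raise ExistingChild()
    if offset >= 0:
        raise BadRequest("PUT to a file: append operation invoked "
                         "on an immutable cap")
    # Immutable file: upload a new child in its place.
    return ("replace_with_child", None)
```

The default of -1 means "no offset supplied", so a plain PUT to a mutable file keeps its previous whole-file replacement behavior.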
1035 | [mutable/layout.py and interfaces.py: add MDMF writer and reader |
---|
1036 | Kevan Carstensen <kevan@isnotajoke.com>**20100809234004 |
---|
1037 | Ignore-this: 90db36ee3318dbbd4397baebc6014f86 |
---|
1038 | |
---|
1039 | The MDMF writer is responsible for keeping state as plaintext is |
---|
1040 | gradually processed into share data by the upload process. When the |
---|
1041 | upload finishes, it will write all of its share data to a remote server, |
---|
1042 | reporting its status back to the publisher. |
---|
1043 | |
---|
1044 | The MDMF reader is responsible for abstracting an MDMF file as it sits |
---|
1045 | on the grid from the downloader; specifically, by receiving and |
---|
1046 | responding to requests for arbitrary data within the MDMF file. |
---|
1047 | |
---|
1048 | The interfaces.py file has also been modified to contain an interface |
---|
1049 | for the writer. |
---|
1050 | ] { |
---|
1051 | hunk ./src/allmydata/interfaces.py 7 |
---|
1052 | ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable |
---|
1053 | |
---|
1054 | HASH_SIZE=32 |
---|
1055 | +SALT_SIZE=16 |
---|
1056 | + |
---|
1057 | +SDMF_VERSION=0 |
---|
1058 | +MDMF_VERSION=1 |
---|
1059 | |
---|
1060 | Hash = StringConstraint(maxLength=HASH_SIZE, |
---|
1061 | minLength=HASH_SIZE)# binary format 32-byte SHA256 hash |
---|
1062 | hunk ./src/allmydata/interfaces.py 420 |
---|
1063 | """ |
---|
1064 | |
---|
1065 | |
---|
1066 | +class IMutableSlotWriter(Interface): |
---|
1067 | + """ |
---|
1068 | + The interface for a writer around a mutable slot on a remote server. |
---|
1069 | + """ |
---|
1070 | + def set_checkstring(checkstring, *args): |
---|
1071 | + """ |
---|
1072 | + Set the checkstring that I will pass to the remote server when |
---|
1073 | + writing. |
---|
1074 | + |
---|
1075 | + @param checkstring A packed checkstring to use. |
---|
1076 | + |
---|
1077 | + Note that implementations can differ in which semantics they |
---|
1078 | + wish to support for set_checkstring -- they can, for example, |
---|
1079 | + build the checkstring themselves from its constituents, or |
---|
1080 | + some other thing. |
---|
1081 | + """ |
---|
1082 | + |
---|
1083 | + def get_checkstring(): |
---|
1084 | + """ |
---|
1085 | + Get the checkstring that I think currently exists on the remote |
---|
1086 | + server. |
---|
1087 | + """ |
---|
1088 | + |
---|
1089 | + def put_block(data, segnum, salt): |
---|
1090 | + """ |
---|
1091 | + Add a block and salt to the share. |
---|
1092 | + """ |
---|
1093 | + |
---|
1094 | + def put_encprivey(encprivkey): |
---|
1095 | + """ |
---|
1096 | + Add the encrypted private key to the share. |
---|
1097 | + """ |
---|
1098 | + |
---|
1099 | + def put_blockhashes(blockhashes=list): |
---|
1100 | + """ |
---|
1101 | + Add the block hash tree to the share. |
---|
1102 | + """ |
---|
1103 | + |
---|
1104 | + def put_sharehashes(sharehashes=dict): |
---|
1105 | + """ |
---|
1106 | + Add the share hash chain to the share. |
---|
1107 | + """ |
---|
1108 | + |
---|
1109 | + def get_signable(): |
---|
1110 | + """ |
---|
1111 | + Return the part of the share that needs to be signed. |
---|
1112 | + """ |
---|
1113 | + |
---|
1114 | + def put_signature(signature): |
---|
1115 | + """ |
---|
1116 | + Add the signature to the share. |
---|
1117 | + """ |
---|
1118 | + |
---|
1119 | + def put_verification_key(verification_key): |
---|
1120 | + """ |
---|
1121 | + Add the verification key to the share. |
---|
1122 | + """ |
---|
1123 | + |
---|
1124 | + def finish_publishing(): |
---|
1125 | + """ |
---|
1126 | + Do anything necessary to finish writing the share to a remote |
---|
1127 | + server. I require that no further publishing needs to take place |
---|
1128 | + after this method has been called. |
---|
1129 | + """ |
---|
1130 | + |
---|
1131 | + |
---|
1132 | class IURI(Interface): |
---|
1133 | def init_from_string(uri): |
---|
1134 | """Accept a string (as created by my to_string() method) and populate |
---|
1135 | hunk ./src/allmydata/mutable/layout.py 4 |
---|
1136 | |
---|
1137 | import struct |
---|
1138 | from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError |
---|
1139 | +from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \ |
---|
1140 | + MDMF_VERSION, IMutableSlotWriter |
---|
1141 | +from allmydata.util import mathutil, observer |
---|
1142 | +from twisted.python import failure |
---|
1143 | +from twisted.internet import defer |
---|
1144 | +from zope.interface import implements |
---|
1145 | + |
---|
1146 | + |
---|
1147 | +# These strings describe the format of the packed structs they help process |
---|
1148 | +# Here's what they mean: |
---|
1149 | +# |
---|
1150 | +# PREFIX: |
---|
1151 | +# >: Big-endian byte order; the most significant byte is first (leftmost). |
---|
1152 | +# B: The version information; an 8-bit version identifier, stored as
---|
1153 | +# an unsigned char. This is currently 0 (SDMF_VERSION); our modifications
---|
1154 | +# will turn it into 1 (MDMF_VERSION).
---|
1155 | +# Q: The sequence number; this is sort of like a revision history for |
---|
1156 | +# mutable files; they start at 1 and increase as they are changed after |
---|
1157 | +# being uploaded. Stored as an unsigned long long, which is 8 bytes in |
---|
1158 | +# length. |
---|
1159 | +# 32s: The root hash of the share hash tree. We use sha-256d, so we use 32 |
---|
1160 | +# characters = 32 bytes to store the value. |
---|
1161 | +# 16s: The salt for the readkey. This is a 16-byte random value, stored as |
---|
1162 | +# 16 characters. |
---|
1163 | +# |
---|
1164 | +# SIGNED_PREFIX additions, things that are covered by the signature: |
---|
1165 | +# B: The "k" encoding parameter. We store this as an 8-bit character, |
---|
1166 | +# which is convenient because our erasure coding scheme cannot |
---|
1167 | +# encode if you ask for more than 255 pieces. |
---|
1168 | +# B: The "N" encoding parameter. Stored as an 8-bit character for the |
---|
1169 | +# same reasons as above. |
---|
1170 | +# Q: The segment size of the uploaded file. This will essentially be the |
---|
1171 | +# length of the file in SDMF. An unsigned long long, so we can store |
---|
1172 | +# files of quite large size. |
---|
1173 | +# Q: The data length of the uploaded file. Modulo padding, this will be
---|
1174 | +# the same as the segment size field in SDMF. Like the segment size field,
---|
1175 | +# it is an unsigned long long and can be quite large.
---|
1176 | +# |
---|
1177 | +# HEADER additions: |
---|
1178 | +# L: The offset of the signature. An unsigned long.
---|
1179 | +# L: The offset of the share hash chain. An unsigned long. |
---|
1180 | +# L: The offset of the block hash tree. An unsigned long. |
---|
1181 | +# L: The offset of the share data. An unsigned long. |
---|
1182 | +# Q: The offset of the encrypted private key. An unsigned long long, to |
---|
1183 | +# account for the possibility of a lot of share data. |
---|
1184 | +# Q: The offset of the EOF. An unsigned long long, to account for the |
---|
1185 | +# possibility of a lot of share data. |
---|
1186 | +# |
---|
1187 | +# After all of these, we have the following: |
---|
1188 | +# - The verification key: Occupies the space between the end of the header |
---|
1189 | +# and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']]).
---|
1190 | +# - The signature, which goes from the signature offset to the share hash |
---|
1191 | +# chain offset. |
---|
1192 | +# - The share hash chain, which goes from the share hash chain offset to |
---|
1193 | +# the block hash tree offset. |
---|
1194 | +# - The share data, which goes from the share data offset to the encrypted |
---|
1195 | +# private key offset. |
---|
1196 | +# - The encrypted private key, which goes from its offset until the end of the file.
---|
1197 | +# |
---|
1198 | +# The block hash tree in this encoding has only one hash, so the offset of
---|
1199 | +# the share data will be 32 bytes more than the offset of the block hash tree.
---|
1200 | +# Given this, we may need to check to see how many bytes a reasonably sized
---|
1201 | +# block hash tree will take up.
---|
1202 | |
---|
1203 | PREFIX = ">BQ32s16s" # each version has a different prefix |
---|
1204 | SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature |
---|
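The struct format strings used by layout.py can be checked with a short standalone sketch. The format strings and the body of `pack_checkstring` are copied from this patch; the sample values (sequence number 3, filler bytes for the root hash and salt) are made up for illustration, and the byte literals assume Python 3, whereas the patched code targets Python 2.

```python
import struct

PREFIX = ">BQ32s16s"                # version, seqnum, root hash, salt
SIGNED_PREFIX = ">BQ32s16s BBQQ"    # adds k, N, segment size, data length
HEADER = ">BQ32s16s BBQQ LLLLQQ"    # adds the six offsets
OFFSETS = ">LLLLQQ"

# With ">" there is no alignment padding, and struct ignores the
# whitespace between format codes, so the sizes are simple sums.
sizes = {
    "PREFIX": struct.calcsize(PREFIX),
    "SIGNED_PREFIX": struct.calcsize(SIGNED_PREFIX),
    "HEADER": struct.calcsize(HEADER),
    "OFFSETS": struct.calcsize(OFFSETS),
}


def pack_checkstring(seqnum, root_hash, IV, version=0):
    return struct.pack(PREFIX, version, seqnum, root_hash, IV)


# Illustrative values only: version 1 is MDMF_VERSION in this patch.
checkstring = pack_checkstring(3, b"\x11" * 32, b"\x22" * 16, version=1)
version, seqnum, root_hash, salt = struct.unpack(PREFIX, checkstring)
```

The prefix occupies 1 + 8 + 32 + 16 = 57 bytes, the signed prefix adds 1 + 1 + 8 + 8 for 75, and the six offsets add 4 * 4 + 2 * 8 = 32 more, giving a 107-byte header.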
1205 | hunk ./src/allmydata/mutable/layout.py 73 |
---|
1206 | SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX) |
---|
1207 | HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets |
---|
1208 | HEADER_LENGTH = struct.calcsize(HEADER) |
---|
1209 | +OFFSETS = ">LLLLQQ" |
---|
1210 | +OFFSETS_LENGTH = struct.calcsize(OFFSETS) |
---|
1211 | |
---|
1212 | def unpack_header(data): |
---|
1213 | o = {} |
---|
1214 | hunk ./src/allmydata/mutable/layout.py 194 |
---|
1215 | return (share_hash_chain, block_hash_tree, share_data) |
---|
1216 | |
---|
1217 | |
---|
1218 | -def pack_checkstring(seqnum, root_hash, IV): |
---|
1219 | +def pack_checkstring(seqnum, root_hash, IV, version=0): |
---|
1220 | return struct.pack(PREFIX, |
---|
1221 | hunk ./src/allmydata/mutable/layout.py 196 |
---|
1222 | - 0, # version, |
---|
1223 | + version, |
---|
1224 | seqnum, |
---|
1225 | root_hash, |
---|
1226 | IV) |
---|
1227 | hunk ./src/allmydata/mutable/layout.py 269 |
---|
1228 | encprivkey]) |
---|
1229 | return final_share |
---|
1230 | |
---|
1231 | +def pack_prefix(seqnum, root_hash, IV, |
---|
1232 | + required_shares, total_shares, |
---|
1233 | + segment_size, data_length): |
---|
1234 | + prefix = struct.pack(SIGNED_PREFIX, |
---|
1235 | + 0, # version, |
---|
1236 | + seqnum, |
---|
1237 | + root_hash, |
---|
1238 | + IV, |
---|
1239 | + required_shares, |
---|
1240 | + total_shares, |
---|
1241 | + segment_size, |
---|
1242 | + data_length, |
---|
1243 | + ) |
---|
1244 | + return prefix |
---|
1245 | + |
---|
1246 | + |
---|
1247 | +class SDMFSlotWriteProxy: |
---|
1248 | + implements(IMutableSlotWriter) |
---|
1249 | + """ |
---|
1250 | + I represent a remote write slot for an SDMF mutable file. I build a |
---|
1251 | + share in memory, and then write it in one piece to the remote |
---|
1252 | + server. This mimics how SDMF shares were built before MDMF (and the |
---|
1253 | + new MDMF uploader), but provides that functionality in a way that |
---|
1254 | + allows the MDMF uploader to be built without much special-casing for |
---|
1255 | + file format, which makes the uploader code more readable. |
---|
1256 | + """ |
---|
1257 | + def __init__(self, |
---|
1258 | + shnum, |
---|
1259 | + rref, # a remote reference to a storage server |
---|
1260 | + storage_index, |
---|
1261 | + secrets, # (write_enabler, renew_secret, cancel_secret) |
---|
1262 | + seqnum, # the sequence number of the mutable file |
---|
1263 | + required_shares, |
---|
1264 | + total_shares, |
---|
1265 | + segment_size, |
---|
1266 | + data_length): # the length of the original file |
---|
1267 | + self.shnum = shnum |
---|
1268 | + self._rref = rref |
---|
1269 | + self._storage_index = storage_index |
---|
1270 | + self._secrets = secrets |
---|
1271 | + self._seqnum = seqnum |
---|
1272 | + self._required_shares = required_shares |
---|
1273 | + self._total_shares = total_shares |
---|
1274 | + self._segment_size = segment_size |
---|
1275 | + self._data_length = data_length |
---|
1276 | + |
---|
1277 | + # This is an SDMF file, so it should have only one segment, so, |
---|
1278 | + # modulo padding of the data length, the segment size and the |
---|
1279 | + # data length should be the same. |
---|
1280 | + expected_segment_size = mathutil.next_multiple(data_length, |
---|
1281 | + self._required_shares) |
---|
1282 | + assert expected_segment_size == segment_size |
---|
1283 | + |
---|
1284 | + self._block_size = self._segment_size / self._required_shares |
---|
1285 | + |
---|
1286 | + # This is meant to mimic how SDMF files were built before MDMF |
---|
1287 | + # entered the picture: we generate each share in its entirety, |
---|
1288 | + # then push it off to the storage server in one write. When |
---|
1289 | + # callers call set_*, they are just populating this dict. |
---|
1290 | + # finish_publishing will stitch these pieces together into a |
---|
1291 | + # coherent share, and then write the coherent share to the |
---|
1292 | + # storage server. |
---|
1293 | + self._share_pieces = {} |
---|
1294 | + |
---|
1295 | + # This tells the write logic what checkstring to use when |
---|
1296 | + # writing remote shares. |
---|
1297 | + self._testvs = [] |
---|
1298 | + |
---|
1299 | + self._readvs = [(0, struct.calcsize(PREFIX))] |
---|
1300 | + |
---|
1301 | + |
---|
1302 | + def set_checkstring(self, checkstring_or_seqnum, |
---|
1303 | + root_hash=None, |
---|
1304 | + salt=None): |
---|
1305 | + """ |
---|
1306 | + Set the checkstring that I will pass to the remote server when |
---|
1307 | + writing. |
---|
1308 | + |
---|
1309 | + @param checkstring_or_seqnum: A packed checkstring to use, |
---|
1310 | + or a sequence number. I will treat this as a checkstring unless root_hash and salt are also given.
---|
1311 | + |
---|
1312 | + Note that implementations can differ in which semantics they |
---|
1313 | + wish to support for set_checkstring -- they can, for example, |
---|
1314 | + build the checkstring themselves from its constituents, or |
---|
1315 | + some other thing. |
---|
1316 | + """ |
---|
1317 | + if root_hash and salt: |
---|
1318 | + checkstring = struct.pack(PREFIX, |
---|
1319 | + 0, |
---|
1320 | + checkstring_or_seqnum, |
---|
1321 | + root_hash, |
---|
1322 | + salt) |
---|
1323 | + else: |
---|
1324 | + checkstring = checkstring_or_seqnum |
---|
1325 | + self._testvs = [(0, len(checkstring), "eq", checkstring)] |
---|
1326 | + |
---|
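
A minimal sketch of what the SDMF `set_checkstring` builds when it is given a sequence number, root hash, and salt: a packed checkstring and a single "eq" test vector covering it. The helper names `make_checkstring`, `unpack_checkstring`, and `make_testv` are hypothetical, and the sample values are made up. The code uses Python 3 byte strings.

```python
import struct

PREFIX = ">BQ32s16s"  # version, seqnum, root hash, salt/IV (from the patch)


def make_checkstring(seqnum, root_hash, salt, version=0):
    """Pack the fields that make up an SDMF checkstring."""
    return struct.pack(PREFIX, version, seqnum, root_hash, salt)


def unpack_checkstring(checkstring):
    """Inverse of make_checkstring; returns (version, seqnum, root_hash, salt)."""
    return struct.unpack(PREFIX, checkstring)


def make_testv(checkstring):
    """Build a test vector: (offset, length, operator, specimen)."""
    return [(0, len(checkstring), "eq", checkstring)]


# Made-up sample values for illustration only.
checkstring = make_checkstring(3, b"\x11" * 32, b"\x22" * 16)
testv = make_testv(checkstring)
```
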
1327 | + |
---|
1328 | + def get_checkstring(self): |
---|
1329 | + """ |
---|
1330 | + Get the checkstring that I think currently exists on the remote |
---|
1331 | + server. |
---|
1332 | + """ |
---|
1333 | + if self._testvs: |
---|
1334 | + return self._testvs[0][3] |
---|
1335 | + return "" |
---|
1336 | + |
---|
1337 | + |
---|
1338 | + def put_block(self, data, segnum, salt): |
---|
1339 | + """ |
---|
1340 | + Add a block and salt to the share. |
---|
1341 | + """ |
---|
1342 | + # SDMF files have only one segment |
---|
1343 | + assert segnum == 0 |
---|
1344 | + assert len(data) == self._block_size |
---|
1345 | + assert len(salt) == SALT_SIZE |
---|
1346 | + |
---|
1347 | + self._share_pieces['sharedata'] = data |
---|
1348 | + self._share_pieces['salt'] = salt |
---|
1349 | + |
---|
1350 | + # TODO: Figure out something intelligent to return. |
---|
1351 | + return defer.succeed(None) |
---|
1352 | + |
---|
1353 | + |
---|
1354 | + def put_encprivkey(self, encprivkey): |
---|
1355 | + """ |
---|
1356 | + Add the encrypted private key to the share. |
---|
1357 | + """ |
---|
1358 | + self._share_pieces['encprivkey'] = encprivkey |
---|
1359 | + |
---|
1360 | + return defer.succeed(None) |
---|
1361 | + |
---|
1362 | + |
---|
1363 | + def put_blockhashes(self, blockhashes): |
---|
1364 | + """ |
---|
1365 | + Add the block hash tree to the share. |
---|
1366 | + """ |
---|
1367 | + assert isinstance(blockhashes, list) |
---|
1368 | + for h in blockhashes: |
---|
1369 | + assert len(h) == HASH_SIZE |
---|
1370 | + |
---|
1371 | + # serialize the blockhashes, then set them. |
---|
1372 | + blockhashes_s = "".join(blockhashes) |
---|
1373 | + self._share_pieces['block_hash_tree'] = blockhashes_s |
---|
1374 | + |
---|
1375 | + return defer.succeed(None) |
---|
1376 | + |
---|
1377 | + |
---|
1378 | + def put_sharehashes(self, sharehashes): |
---|
1379 | + """ |
---|
1380 | + Add the share hash chain to the share. |
---|
1381 | + """ |
---|
1382 | + assert isinstance(sharehashes, dict) |
---|
1383 | + for h in sharehashes.itervalues(): |
---|
1384 | + assert len(h) == HASH_SIZE |
---|
1385 | + |
---|
1386 | + # serialize the sharehashes, then set them. |
---|
1387 | + sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i]) |
---|
1388 | + for i in sorted(sharehashes.keys())]) |
---|
1389 | + self._share_pieces['share_hash_chain'] = sharehashes_s |
---|
1390 | + |
---|
1391 | + return defer.succeed(None) |
---|
1392 | + |
---|
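
The share hash chain serialization in `put_sharehashes` can be sketched as a round trip. The packing expression mirrors the patch; `unpack_share_hash_chain` is a hypothetical inverse added here only to show that the `>H32s` entries are self-describing. The code uses Python 3 byte strings.

```python
import struct

ENTRY = ">H32s"                       # share number, 32-byte hash
ENTRY_SIZE = struct.calcsize(ENTRY)   # 2 + 32 bytes


def pack_share_hash_chain(sharehashes):
    """Serialize {shnum: hash} in ascending shnum order, as the patch does."""
    return b"".join(struct.pack(ENTRY, i, sharehashes[i])
                    for i in sorted(sharehashes))


def unpack_share_hash_chain(data):
    """Hypothetical inverse: split the blob back into {shnum: hash}."""
    if len(data) % ENTRY_SIZE != 0:
        raise ValueError("share hash chain length is not a multiple of %d"
                         % ENTRY_SIZE)
    return dict(struct.unpack(ENTRY, data[pos:pos + ENTRY_SIZE])
                for pos in range(0, len(data), ENTRY_SIZE))


# Made-up sample values for illustration only.
chain = {5: b"\xaa" * 32, 1: b"\xbb" * 32}
packed = pack_share_hash_chain(chain)
```
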
1393 | + |
---|
1394 | + def put_root_hash(self, root_hash): |
---|
1395 | + """ |
---|
1396 | + Add the root hash to the share. |
---|
1397 | + """ |
---|
1398 | + assert len(root_hash) == HASH_SIZE |
---|
1399 | + |
---|
1400 | + self._share_pieces['root_hash'] = root_hash |
---|
1401 | + |
---|
1402 | + return defer.succeed(None) |
---|
1403 | + |
---|
1404 | + |
---|
1405 | + def put_salt(self, salt): |
---|
1406 | + """ |
---|
1407 | + Add a salt to an empty SDMF file. |
---|
1408 | + """ |
---|
1409 | + assert len(salt) == SALT_SIZE |
---|
1410 | + |
---|
1411 | + self._share_pieces['salt'] = salt |
---|
1412 | + self._share_pieces['sharedata'] = "" |
---|
1413 | + |
---|
1414 | + |
---|
1415 | + def get_signable(self): |
---|
1416 | + """ |
---|
1417 | + Return the part of the share that needs to be signed. |
---|
1418 | + |
---|
1419 | + SDMF writers need to sign the packed representation of the |
---|
1420 | + first eight fields of the remote share, that is: |
---|
1421 | + - version number (0) |
---|
1422 | + - sequence number |
---|
1423 | + - root of the share hash tree |
---|
1424 | + - salt |
---|
1425 | + - k |
---|
1426 | + - n |
---|
1427 | + - segsize |
---|
1428 | + - datalen |
---|
1429 | + |
---|
1430 | + This method is responsible for returning that to callers. |
---|
1431 | + """ |
---|
1432 | + return struct.pack(SIGNED_PREFIX, |
---|
1433 | + 0, |
---|
1434 | + self._seqnum, |
---|
1435 | + self._share_pieces['root_hash'], |
---|
1436 | + self._share_pieces['salt'], |
---|
1437 | + self._required_shares, |
---|
1438 | + self._total_shares, |
---|
1439 | + self._segment_size, |
---|
1440 | + self._data_length) |
---|
1441 | + |
---|
1442 | + |
---|
1443 | + def put_signature(self, signature): |
---|
1444 | + """ |
---|
1445 | + Add the signature to the share. |
---|
1446 | + """ |
---|
1447 | + self._share_pieces['signature'] = signature |
---|
1448 | + |
---|
1449 | + return defer.succeed(None) |
---|
1450 | + |
---|
1451 | + |
---|
1452 | + def put_verification_key(self, verification_key): |
---|
1453 | + """ |
---|
1454 | + Add the verification key to the share. |
---|
1455 | + """ |
---|
1456 | + self._share_pieces['verification_key'] = verification_key |
---|
1457 | + |
---|
1458 | + return defer.succeed(None) |
---|
1459 | + |
---|
1460 | + |
---|
1461 | + def get_verinfo(self): |
---|
1462 | + """ |
---|
1463 | + I return my verinfo tuple. This is used by the ServermapUpdater |
---|
1464 | + to keep track of versions of mutable files. |
---|
1465 | + |
---|
1466 | + The verinfo tuple for MDMF files contains: |
---|
1467 | + - seqnum |
---|
1468 | + - root hash |
---|
1469 | + - a blank (nothing) |
---|
1470 | + - segsize |
---|
1471 | + - datalen |
---|
1472 | + - k |
---|
1473 | + - n |
---|
1474 | + - prefix (the thing that you sign) |
---|
1475 | + - a tuple of offsets |
---|
1476 | + |
---|
1477 | + We include the nonce in MDMF to simplify processing of version |
---|
1478 | + information tuples. |
---|
1479 | + |
---|
1480 | + The verinfo tuple for SDMF files is the same, but contains a |
---|
1481 | + 16-byte IV instead of a hash of salts. |
---|
1482 | + """ |
---|
1483 | + return (self._seqnum, |
---|
1484 | + self._share_pieces['root_hash'], |
---|
1485 | + self._share_pieces['salt'], |
---|
1486 | + self._segment_size, |
---|
1487 | + self._data_length, |
---|
1488 | + self._required_shares, |
---|
1489 | + self._total_shares, |
---|
1490 | + self.get_signable(), |
---|
1491 | + self._get_offsets_tuple()) |
---|
1492 | + |
---|
1493 | + def _get_offsets_dict(self): |
---|
1494 | + post_offset = HEADER_LENGTH |
---|
1495 | + offsets = {} |
---|
1496 | + |
---|
1497 | + verification_key_length = len(self._share_pieces['verification_key']) |
---|
1498 | + o1 = offsets['signature'] = post_offset + verification_key_length |
---|
1499 | + |
---|
1500 | + signature_length = len(self._share_pieces['signature']) |
---|
1501 | + o2 = offsets['share_hash_chain'] = o1 + signature_length |
---|
1502 | + |
---|
1503 | + share_hash_chain_length = len(self._share_pieces['share_hash_chain']) |
---|
1504 | + o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length |
---|
1505 | + |
---|
1506 | + block_hash_tree_length = len(self._share_pieces['block_hash_tree']) |
---|
1507 | + o4 = offsets['share_data'] = o3 + block_hash_tree_length |
---|
1508 | + |
---|
1509 | + share_data_length = len(self._share_pieces['sharedata']) |
---|
1510 | + o5 = offsets['enc_privkey'] = o4 + share_data_length |
---|
1511 | + |
---|
1512 | + encprivkey_length = len(self._share_pieces['encprivkey']) |
---|
1513 | + offsets['EOF'] = o5 + encprivkey_length |
---|
1514 | + return offsets |
---|
1515 | + |
---|
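
The offset computation in `_get_offsets_dict` is a running sum over the pieces in their on-disk order. A compact sketch follows, assuming the SDMF header length of 107 bytes implied by the `HEADER` format string; `compute_offsets` is a hypothetical stand-in, and the piece lengths are made up.

```python
HEADER_LENGTH = 107  # struct.calcsize(">BQ32s16s BBQQ LLLLQQ")

# Pieces in the order they are laid out after the header, paired with the
# name of the offset that marks where the *next* piece begins.
LAYOUT = [
    ("verification_key", "signature"),
    ("signature", "share_hash_chain"),
    ("share_hash_chain", "block_hash_tree"),
    ("block_hash_tree", "share_data"),
    ("sharedata", "enc_privkey"),
    ("encprivkey", "EOF"),
]


def compute_offsets(piece_lengths, header_length=HEADER_LENGTH):
    """Return the offsets dict for the given {piece name: length} mapping."""
    offsets = {}
    position = header_length
    for piece, next_offset_name in LAYOUT:
        position += piece_lengths[piece]
        offsets[next_offset_name] = position
    return offsets


# Made-up lengths for illustration only.
lengths = {
    "verification_key": 292,
    "signature": 256,
    "share_hash_chain": 136,
    "block_hash_tree": 32,
    "sharedata": 10,
    "encprivkey": 1200,
}
offsets = compute_offsets(lengths)
```
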
1516 | + |
---|
1517 | + def _get_offsets_tuple(self): |
---|
1518 | + offsets = self._get_offsets_dict() |
---|
1519 | + return tuple([(key, value) for key, value in offsets.items()]) |
---|
1520 | + |
---|
1521 | + |
---|
1522 | + def _pack_offsets(self): |
---|
1523 | + offsets = self._get_offsets_dict() |
---|
1524 | + return struct.pack(">LLLLQQ", |
---|
1525 | + offsets['signature'], |
---|
1526 | + offsets['share_hash_chain'], |
---|
1527 | + offsets['block_hash_tree'], |
---|
1528 | + offsets['share_data'], |
---|
1529 | + offsets['enc_privkey'], |
---|
1530 | + offsets['EOF']) |
---|
1531 | + |
---|
1532 | + |
---|
1533 | + def finish_publishing(self): |
---|
1534 | + """ |
---|
1535 | + Do anything necessary to finish writing the share to a remote |
---|
1536 | + server. I require that no further publishing needs to take place |
---|
1537 | + after this method has been called. |
---|
1538 | + """ |
---|
1539 | + for k in ["sharedata", "encprivkey", "signature", "verification_key", |
---|
1540 | + "share_hash_chain", "block_hash_tree"]: |
---|
1541 | + assert k in self._share_pieces |
---|
1542 | + # This is the only method that actually writes something to the |
---|
1543 | + # remote server. |
---|
1544 | + # First, we need to pack the share into data that we can write |
---|
1545 | + # to the remote server in one write. |
---|
1546 | + offsets = self._pack_offsets() |
---|
1547 | + prefix = self.get_signable() |
---|
1548 | + final_share = "".join([prefix, |
---|
1549 | + offsets, |
---|
1550 | + self._share_pieces['verification_key'], |
---|
1551 | + self._share_pieces['signature'], |
---|
1552 | + self._share_pieces['share_hash_chain'], |
---|
1553 | + self._share_pieces['block_hash_tree'], |
---|
1554 | + self._share_pieces['sharedata'], |
---|
1555 | + self._share_pieces['encprivkey']]) |
---|
1556 | + |
---|
1557 | + # Our only data vector is going to be writing the final share, |
---|
1558 | + # in its entirety.
---|
1559 | + datavs = [(0, final_share)] |
---|
1560 | + |
---|
1561 | + if not self._testvs: |
---|
1562 | + # Our caller has not provided us with another checkstring |
---|
1563 | + # yet, so we assume that we are writing a new share, and set |
---|
1564 | + # a test vector that will allow a new share to be written. |
---|
1565 | + self._testvs = [] |
---|
1566 | + self._testvs.append(tuple([0, 1, "eq", ""])) |
---|
1567 | + new_share = True |
---|
1568 | + |
---|
1569 | + tw_vectors = {} |
---|
1570 | + tw_vectors[self.shnum] = (self._testvs, datavs, None) |
---|
1571 | + return self._rref.callRemote("slot_testv_and_readv_and_writev", |
---|
1572 | + self._storage_index, |
---|
1573 | + self._secrets, |
---|
1574 | + tw_vectors, |
---|
1575 | + # TODO is it useful to read something? |
---|
1576 | + self._readvs) |
---|
1577 | + |
---|
1578 | + |
---|
1579 | +MDMFHEADER = ">BQ32sBBQQ QQQQQQ" |
---|
1580 | +MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ" |
---|
1581 | +MDMFHEADERSIZE = struct.calcsize(MDMFHEADER) |
---|
1582 | +MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS) |
---|
1583 | +MDMFCHECKSTRING = ">BQ32s" |
---|
1584 | +MDMFSIGNABLEHEADER = ">BQ32sBBQQ" |
---|
1585 | +MDMFOFFSETS = ">QQQQQQ" |
---|
1586 | +MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS) |
---|
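
The MDMF header sizes follow directly from the format strings above. The sketch below computes them and the byte position of each entry in the offset table. The order of `OFFSET_FIELDS` is an assumption, taken by reading the layout comment in `MDMFSlotWriteProxy` in ascending offset order.

```python
import struct

# Format strings copied from the patch.
MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
MDMFCHECKSTRING = ">BQ32s"
MDMFOFFSETS = ">QQQQQQ"

signed_size = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
header_size = struct.calcsize(MDMFHEADER)
checkstring_size = struct.calcsize(MDMFCHECKSTRING)

# Assumed field order of the offset table (see lead-in).
OFFSET_FIELDS = ["enc_privkey", "block_hash_tree", "share_hash_chain",
                 "signature", "verification_key", "EOF"]

# Each offset is an 8-byte "Q", stored immediately after the signed part.
offset_positions = {name: signed_size + 8 * index
                    for index, name in enumerate(OFFSET_FIELDS)}
```
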
1587 | + |
---|
1588 | +class MDMFSlotWriteProxy: |
---|
1589 | + implements(IMutableSlotWriter) |
---|
1590 | + |
---|
1591 | + """ |
---|
1592 | + I represent a remote write slot for an MDMF mutable file. |
---|
1593 | + |
---|
1594 | + I abstract away from my caller the details of block and salt |
---|
1595 | + management, and the implementation of the on-disk format for MDMF |
---|
1596 | + shares. |
---|
1597 | + """ |
---|
1598 | + # Expected layout, MDMF: |
---|
1599 | + # offset: size: name: |
---|
1600 | + #-- signed part -- |
---|
1601 | + # 0 1 version number (01) |
---|
1602 | + # 1 8 sequence number |
---|
1603 | + # 9 32 share tree root hash |
---|
1604 | + # 41 1 The "k" encoding parameter |
---|
1605 | + # 42 1 The "N" encoding parameter |
---|
1606 | + # 43 8 The segment size of the uploaded file |
---|
1607 | + # 51 8 The data length of the original plaintext |
---|
1608 | + #-- end signed part -- |
---|
1609 | + # 59 8 The offset of the encrypted private key
---|
1610 | + # 67 8 The offset of the block hash tree
---|
1611 | + # 75 8 The offset of the share hash chain
---|
1612 | + # 83 8 The offset of the signature
---|
1613 | + # 91 8 The offset of the verification key
---|
1614 | + # 99 8 The offset of the EOF
---|
1615 | + # |
---|
1616 | + # followed by salts and share data, the encrypted private key, the |
---|
1617 | + # block hash tree, the salt hash tree, the share hash chain, a |
---|
1618 | + # signature over the first eight fields, and a verification key. |
---|
1619 | + # |
---|
1620 | + # The checkstring is the first three fields -- the version number, |
---|
1621 | + # sequence number, and root hash. This is consistent
---|
1622 | + # in meaning to what we have with SDMF files, except now instead of |
---|
1623 | + # using the literal salt, we use a value derived from all of the |
---|
1624 | + # salts -- the share hash root. |
---|
1625 | + # |
---|
1626 | + # The salt is stored before the block for each segment. The block |
---|
1627 | + # hash tree is computed over the combination of block and salt for |
---|
1628 | + # each segment. In this way, we get integrity checking for both |
---|
1629 | + # block and salt with the current block hash tree arrangement. |
---|
1630 | + # |
---|
1631 | + # The ordering of the offsets is different to reflect the dependencies |
---|
1632 | + # that we'll run into with an MDMF file. The expected write flow is |
---|
1633 | + # something like this: |
---|
1634 | + # |
---|
1635 | + # 0: Initialize with the sequence number, encoding parameters and |
---|
1636 | + # data length. From this, we can deduce the number of segments, |
---|
1637 | + # and where they should go. We can also figure out where the
---|
1638 | + # encrypted private key should go, because we can figure out how |
---|
1639 | + # big the share data will be. |
---|
1640 | + # |
---|
1641 | + # 1: Encrypt, encode, and upload the file in chunks. Do something |
---|
1642 | + # like |
---|
1643 | + # |
---|
1644 | + # put_block(data, segnum, salt) |
---|
1645 | + # |
---|
1646 | + # to write a block and a salt to the disk. We can do both of |
---|
1647 | + # these operations now because we have enough of the offsets to |
---|
1648 | + # know where to put them. |
---|
1649 | + # |
---|
1650 | + # 2: Put the encrypted private key. Use: |
---|
1651 | + # |
---|
1652 | + # put_encprivkey(encprivkey) |
---|
1653 | + # |
---|
1654 | + # Now that we know the length of the private key, we can fill |
---|
1655 | + # in the offset for the block hash tree. |
---|
1656 | + # |
---|
1657 | + # 3: We're now in a position to upload the block hash tree for |
---|
1658 | + # a share. Put that using something like: |
---|
1659 | + # |
---|
1660 | + # put_blockhashes(block_hash_tree) |
---|
1661 | + # |
---|
1662 | + # Note that block_hash_tree is a list of hashes -- we'll take |
---|
1663 | + # care of the details of serializing that appropriately. When |
---|
1664 | + # we get the block hash tree, we are also in a position to |
---|
1665 | + # calculate the offset for the share hash chain, and fill that |
---|
1666 | + # into the offsets table. |
---|
1667 | + # |
---|
1668 | + # 4: At the same time, we're in a position to upload the salt hash |
---|
1669 | + # tree. This is a Merkle tree over all of the salts. We use a |
---|
1670 | + # Merkle tree so that we can validate each block,salt pair as |
---|
1671 | + # we download them later. We do this using |
---|
1672 | + # |
---|
1673 | + # put_salthashes(salt_hash_tree) |
---|
1674 | + # |
---|
1675 | + # When you do this, I automatically put the root of the tree |
---|
1676 | + # (the hash at index 0 of the list) in its appropriate slot in |
---|
1677 | + # the signed prefix of the share. |
---|
1678 | + # |
---|
1679 | + # 5: We're now in a position to upload the share hash chain for |
---|
1680 | + # a share. Do that with something like: |
---|
1681 | + # |
---|
1682 | + # put_sharehashes(share_hash_chain) |
---|
1683 | + # |
---|
1684 | + # share_hash_chain should be a dictionary mapping shnums to |
---|
1685 | + # 32-byte hashes -- the wrapper handles serialization. |
---|
1686 | + # We'll know where to put the signature at this point, also. |
---|
1687 | + # The root of this tree will be put explicitly in the next |
---|
1688 | + # step. |
---|
1689 | + # |
---|
1690 | + # TODO: Why? Why not just include it in the tree here? |
---|
1691 | + # |
---|
1692 | + # 6: Before putting the signature, we must first put the |
---|
1693 | + # root_hash. Do this with: |
---|
1694 | + # |
---|
1695 | + # put_root_hash(root_hash). |
---|
1696 | + # |
---|
1697 | + # In terms of knowing where to put this value, it was always |
---|
1698 | + # possible to place it, but it makes sense semantically to |
---|
1699 | + # place it after the share hash tree, so that's why you do it |
---|
1700 | + # in this order. |
---|
1701 | + # |
---|
1702 | + # 7: With the root hash put, we can now sign the header. Use:
---|
1703 | + #
---|
1704 | + # get_signable()
---|
1705 | + #
---|
1706 | + # to get the part of the header that you want to sign, and use:
---|
1707 | + #
---|
1708 | + # put_signature(signature)
---|
1709 | + #
---|
1710 | + # to write your signature to the remote server.
---|
1711 | + #
---|
1712 | + # 8: Add the verification key, and finish. Do:
---|
1713 | + # |
---|
1714 | + # put_verification_key(key) |
---|
1715 | + # |
---|
1716 | + # and |
---|
1717 | + # |
---|
1718 | + # finish_publishing()
---|
1719 | + # |
---|
1720 | + # Checkstring management: |
---|
1721 | + # |
---|
1722 | + # To write to a mutable slot, we have to provide test vectors to ensure |
---|
1723 | + # that we are writing to the same data that we think we are. These |
---|
1724 | + # vectors allow us to detect uncoordinated writes; that is, writes |
---|
1725 | + # where both we and some other shareholder are writing to the |
---|
1726 | + # mutable slot, and to report those back to the parts of the program |
---|
1727 | + # doing the writing. |
---|
1728 | + # |
---|
1729 | + # With SDMF, this was easy -- all of the share data was written in |
---|
1730 | + # one go, so it was easy to detect uncoordinated writes, and we only |
---|
1731 | + # had to do it once. With MDMF, not all of the file is written at |
---|
1732 | + # once. |
---|
1733 | + # |
---|
1734 | + # If a share is new, we write out as much of the header as we can |
---|
1735 | + # before writing out anything else. This gives other writers a |
---|
1736 | + # canary that they can use to detect uncoordinated writes, and, if |
---|
1737 | + # they do the same thing, gives us the same canary. We then update
---|
1738 | + # the share. We won't be able to write out two fields of the header |
---|
1739 | + # -- the share tree hash and the salt hash -- until we finish |
---|
1740 | + # writing out the share. We only require the writer to provide the |
---|
1741 | + # initial checkstring, and keep track of what it should be after |
---|
1742 | + # updates ourselves. |
---|
1743 | + # |
---|
1744 | + # If we haven't written anything yet, then on the first write (which |
---|
1745 | + # will probably be a block + salt of a share), we'll also write out |
---|
1746 | + # the header. On subsequent passes, we'll expect to see the header. |
---|
1747 | + # This changes in two places: |
---|
1748 | + # |
---|
1749 | + # - When we write out the salt hash |
---|
1750 | + # - When we write out the root of the share hash tree |
---|
1751 | + # |
---|
1752 | + # since these values will change the header. It is possible that we |
---|
1753 | + # can just make those be written in one operation to minimize |
---|
1754 | + # disruption. |
---|
1755 | + def __init__(self, |
---|
1756 | + shnum, |
---|
1757 | + rref, # a remote reference to a storage server |
---|
1758 | + storage_index, |
---|
1759 | + secrets, # (write_enabler, renew_secret, cancel_secret) |
---|
1760 | + seqnum, # the sequence number of the mutable file |
---|
1761 | + required_shares, |
---|
1762 | + total_shares, |
---|
1763 | + segment_size, |
---|
1764 | + data_length): # the length of the original file |
---|
1765 | + self.shnum = shnum |
---|
1766 | + self._rref = rref |
---|
1767 | + self._storage_index = storage_index |
---|
1768 | + self._seqnum = seqnum |
---|
1769 | + self._required_shares = required_shares |
---|
1770 | + assert self.shnum >= 0 and self.shnum < total_shares |
---|
1771 | + self._total_shares = total_shares |
---|
1772 | + # We build up the offset table as we write things. It is the |
---|
1773 | + # last thing we write to the remote server. |
---|
1774 | + self._offsets = {} |
---|
1775 | + self._testvs = [] |
---|
1776 | + # This is a list of write vectors that will be sent to our |
---|
1777 | + # remote server once we are directed to write things there. |
---|
1778 | + self._writevs = [] |
---|
1779 | + self._secrets = secrets |
---|
1780 | + # The segment size needs to be a multiple of the k parameter -- |
---|
1781 | + # any padding should have been carried out by the publisher |
---|
1782 | + # already. |
---|
1783 | + assert segment_size % required_shares == 0 |
---|
1784 | + self._segment_size = segment_size |
---|
1785 | + self._data_length = data_length |
---|
1786 | + |
---|
1787 | + # These are set later -- we define them here so that we can |
---|
1788 | + # check for their existence easily |
---|
1789 | + |
---|
1790 | + # This is the root of the share hash tree -- the Merkle tree |
---|
1791 | + # over the roots of the block hash trees computed for shares in |
---|
1792 | + # this upload. |
---|
1793 | + self._root_hash = None |
---|
1794 | + |
---|
1795 | + # We haven't yet written anything to the remote bucket. By |
---|
1796 | + # setting this, we tell the _write method as much. The write |
---|
1797 | + # method will then know that it also needs to add a write vector |
---|
1798 | + # for the checkstring (or what we have of it) to the first write |
---|
1799 | + # request. We'll then record that value for future use. If |
---|
1800 | + # we're expecting something to be there already, we need to call |
---|
1801 | + # set_checkstring before we write anything to tell the first |
---|
1802 | + # write about that. |
---|
1803 | + self._written = False |
---|
1804 | + |
---|
1805 | + # When writing data to the storage servers, we get a read vector |
---|
1806 | + # for free. We'll read the checkstring, which will help us |
---|
1807 | + # figure out what's gone wrong if a write fails. |
---|
1808 | + self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))] |
---|
1809 | + |
---|
1810 | + # We calculate the number of segments because it tells us |
---|
1811 | + # where the salt part of the file ends/share segment begins, |
---|
1812 | + # and also because it provides a useful amount of bounds checking. |
---|
1813 | + self._num_segments = mathutil.div_ceil(self._data_length, |
---|
1814 | + self._segment_size) |
---|
1815 | + self._block_size = self._segment_size / self._required_shares |
---|
1816 | + # We also calculate the share size, to help us with block |
---|
1817 | + # constraints later. |
---|
1818 | + tail_size = self._data_length % self._segment_size |
---|
1819 | + if not tail_size: |
---|
1820 | + self._tail_block_size = self._block_size |
---|
1821 | + else: |
---|
1822 | + self._tail_block_size = mathutil.next_multiple(tail_size, |
---|
1823 | + self._required_shares) |
---|
1824 | + self._tail_block_size /= self._required_shares |
---|
1825 | + |
---|
1826 | + # We already know where the sharedata starts; right after the end |
---|
1827 | + # of the header (which is defined as the signable part + the offsets).
---|
1828 | + # We can also calculate where the encrypted private key begins
---|
1829 | + # from what we now know.
---|
1830 | + self._actual_block_size = self._block_size + SALT_SIZE |
---|
1831 | + data_size = self._actual_block_size * (self._num_segments - 1) |
---|
1832 | + data_size += self._tail_block_size |
---|
1833 | + data_size += SALT_SIZE |
---|
1834 | + self._offsets['enc_privkey'] = MDMFHEADERSIZE |
---|
1835 | + self._offsets['enc_privkey'] += data_size |
---|
1836 | + # We'll wait for the rest. Callers can now call my "put_block" and |
---|
1837 | + # "set_checkstring" methods. |
---|
1838 | + |
---|
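
The segment arithmetic in the `MDMFSlotWriteProxy` constructor can be worked through with small numbers. The sketch below is a standalone approximation: `div_ceil` and `next_multiple` are local stand-ins for the `mathutil` helpers, `SALT_SIZE = 16` and `MDMFHEADERSIZE = 107` are assumed values, `share_geometry` is a hypothetical name, and `//` replaces the Python 2 integer `/`. It assumes a positive data length.

```python
SALT_SIZE = 16         # assumed; matches the 16-byte salt field in SDMF
MDMFHEADERSIZE = 107   # struct.calcsize(">BQ32sBBQQ QQQQQQ")


def div_ceil(n, d):
    """Smallest integer >= n / d (stand-in for mathutil.div_ceil)."""
    return (n + d - 1) // d


def next_multiple(n, k):
    """Smallest multiple of k that is >= n (stand-in for mathutil)."""
    return div_ceil(n, k) * k


def share_geometry(data_length, segment_size, required_shares):
    """Reproduce the constructor's size and offset calculations."""
    if segment_size % required_shares != 0:
        raise ValueError("segment size must be a multiple of k")
    num_segments = div_ceil(data_length, segment_size)
    block_size = segment_size // required_shares
    tail_size = data_length % segment_size
    if tail_size == 0:
        tail_block_size = block_size
    else:
        tail_block_size = (next_multiple(tail_size, required_shares)
                           // required_shares)
    actual_block_size = block_size + SALT_SIZE  # each block carries a salt
    data_size = (actual_block_size * (num_segments - 1)
                 + tail_block_size + SALT_SIZE)
    return {
        "num_segments": num_segments,
        "block_size": block_size,
        "tail_block_size": tail_block_size,
        "enc_privkey_offset": MDMFHEADERSIZE + data_size,
        # Where put_block would place each salt+block pair.
        "block_offsets": [MDMFHEADERSIZE + actual_block_size * segnum
                          for segnum in range(num_segments)],
    }


# 100 bytes of data, 30-byte segments, k = 3 (made-up values).
geometry = share_geometry(100, 30, 3)
```

With these inputs the last salt and tail block end exactly where the encrypted private key begins, which is the invariant the constructor relies on.
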
1839 | + |
---|
1840 | + def set_checkstring(self, |
---|
1841 | + seqnum_or_checkstring, |
---|
1842 | + root_hash=None, |
---|
1843 | + salt=None): |
---|
1844 | + """ |
---|
1845 | + Set the checkstring for this share.
---|
1846 | + |
---|
1847 | + This can be invoked in one of two ways. |
---|
1848 | + |
---|
1849 | + With one argument, I assume that you are giving me a literal |
---|
1850 | + checkstring -- e.g., the output of get_checkstring. I will then |
---|
1851 | + set that checkstring as it is. This form is used by unit tests. |
---|
1852 | + |
---|
1853 | + With two arguments, I assume that you are giving me a sequence |
---|
1854 | + number and root hash to make a checkstring from. In that case, I |
---|
1855 | + will build a checkstring and set it for you. This form is used |
---|
1856 | + by the publisher. |
---|
1857 | + |
---|
1858 | + By default, I assume that I am writing new shares to the grid. |
---|
1859 | + If you don't explicitly set your own checkstring, I will use
---|
1860 | + one that requires that the remote share not exist. You will want |
---|
1861 | + to use this method if you are updating a share in-place; |
---|
1862 | + otherwise, writes will fail. |
---|
1863 | + """ |
---|
1864 | + # You're allowed to overwrite checkstrings with this method; |
---|
1865 | + # I assume that users know what they are doing when they call |
---|
1866 | + # it. |
---|
1867 | + if root_hash: |
---|
1868 | + checkstring = struct.pack(MDMFCHECKSTRING, |
---|
1869 | + 1, |
---|
1870 | + seqnum_or_checkstring, |
---|
1871 | + root_hash) |
---|
1872 | + else: |
---|
1873 | + checkstring = seqnum_or_checkstring |
---|
1874 | + |
---|
1875 | + if checkstring == "": |
---|
1876 | + # We special-case this, since len("") = 0, but we need |
---|
1877 | + # length of 1 for the case of an empty share to work on the |
---|
1878 | + # storage server, which is what a checkstring that is the |
---|
1879 | + # empty string means. |
---|
1880 | + self._testvs = [] |
---|
1881 | + else: |
---|
1882 | + self._testvs = [] |
---|
1883 | + self._testvs.append((0, len(checkstring), "eq", checkstring)) |
---|
1884 | + |
---|
1885 | + |
---|
1886 | + def __repr__(self): |
---|
1887 | + return "MDMFSlotWriteProxy for share %d" % self.shnum |
---|
1888 | + |
---|
1889 | + |
---|
1890 | + def get_checkstring(self): |
---|
1891 | + """ |
---|
1892 | + I return a representation of what the checkstring for my
---|
1893 | + share on the server will look like.
---|
1894 | + |
---|
1895 | + I am mostly used for tests. |
---|
1896 | + """ |
---|
1897 | + if self._root_hash: |
---|
1898 | + roothash = self._root_hash |
---|
1899 | + else: |
---|
1900 | + roothash = "\x00" * 32 |
---|
1901 | + return struct.pack(MDMFCHECKSTRING, |
---|
1902 | + 1, |
---|
1903 | + self._seqnum, |
---|
1904 | + roothash) |
---|
1905 | + |
---|
1906 | + |
---|
1907 | + def put_block(self, data, segnum, salt): |
---|
1908 | + """ |
---|
1909 | + I queue a write vector for the data, salt, and segment number |
---|
1910 | + provided to me. I return None, as I do not actually cause |
---|
1911 | + anything to be written yet. |
---|
1912 | + """ |
---|
1913 | + if segnum >= self._num_segments: |
---|
1914 | + raise LayoutInvalid("I won't overwrite the private key") |
---|
1915 | + if len(salt) != SALT_SIZE: |
---|
1916 | + raise LayoutInvalid("I was given a salt of size %d, but " |
---|
1917 | + "I wanted a salt of size %d" % (len(salt), SALT_SIZE))
---|
1918 | + if segnum + 1 == self._num_segments: |
---|
1919 | + if len(data) != self._tail_block_size: |
---|
1920 | + raise LayoutInvalid("I was given the wrong size block to write") |
---|
1921 | + elif len(data) != self._block_size: |
---|
1922 | + raise LayoutInvalid("I was given the wrong size block to write") |
---|
1923 | + |
---|
1924 | + # We want to write at MDMFHEADERSIZE + segnum * (block_size + SALT_SIZE).
---|
1925 | + |
---|
1926 | + offset = MDMFHEADERSIZE + (self._actual_block_size * segnum) |
---|
1927 | + data = salt + data |
---|
1928 | + |
---|
1929 | + self._writevs.append(tuple([offset, data])) |
---|
1930 | + |
---|
1931 | + |
---|
1932 | + def put_encprivkey(self, encprivkey): |
---|
1933 | + """ |
---|
1934 | + I queue a write vector for the encrypted private key provided to |
---|
1935 | + me. |
---|
1936 | + """ |
---|
1937 | + assert self._offsets |
---|
1938 | + assert self._offsets['enc_privkey'] |
---|
1939 | + # You shouldn't re-write the encprivkey after the block hash |
---|
1940 | + # tree is written, since that could cause the private key to run |
---|
1941 | + # into the block hash tree. Before it writes the block hash |
---|
1942 | + # tree, the block hash tree writing method writes the offset of |
---|
1943 | + # the share hash chain. So that's a good indicator of whether or |
---|
1944 | + # not the block hash tree has been written. |
---|
1945 | + if "share_hash_chain" in self._offsets: |
---|
1946 | + raise LayoutInvalid("You must write this before the block hash tree") |
---|
1947 | + |
---|
1948 | + self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \ |
---|
1949 | + len(encprivkey) |
---|
1950 | + self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey])) |
---|
1951 | + |
---|
1952 | + |
---|
1953 | + def put_blockhashes(self, blockhashes): |
---|
1954 | + """ |
---|
1955 | + I queue a write vector to put the block hash tree in blockhashes |
---|
1956 | + onto the remote server. |
---|
1957 | + |
---|
1958 | + The encrypted private key must be queued before the block hash |
---|
1959 | + tree, since we need to know how large it is to know where the |
---|
1960 | + block hash tree should go. The block hash tree must be put |
---|
1961 | + before the share hash chain, since its size determines the |
---|
1962 | + offset of the share hash chain. |
---|
1963 | + """ |
---|
1964 | + assert self._offsets |
---|
1965 | + assert isinstance(blockhashes, list) |
---|
1966 | + if "block_hash_tree" not in self._offsets: |
---|
1967 | + raise LayoutInvalid("You must put the encrypted private key " |
---|
1968 | + "before you put the block hash tree") |
---|
1969 | + # If written, the share hash chain causes the signature offset |
---|
1970 | + # to be defined. |
---|
1971 | + if "signature" in self._offsets: |
---|
1972 | + raise LayoutInvalid("You must put the block hash tree before " |
---|
1973 | + "you put the share hash chain") |
---|
1974 | + blockhashes_s = "".join(blockhashes) |
---|
1975 | + self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s) |
---|
1976 | + |
---|
1977 | + self._writevs.append(tuple([self._offsets['block_hash_tree'], |
---|
1978 | + blockhashes_s])) |
---|
1979 | + |
---|
1980 | + |
---|
1981 | + def put_sharehashes(self, sharehashes): |
---|
1982 | + """ |
---|
1983 | + I queue a write vector to put the share hash chain in my |
---|
1984 | + argument onto the remote server. |
---|
1985 | + |
---|
1986 | + The block hash tree must be queued before the share hash chain, |
---|
1987 | + since we need to know where the block hash tree ends before we |
---|
1988 | + can know where the share hash chain starts. The share hash chain |
---|
1989 | + must be put before the signature, since the length of the packed |
---|
1990 | + share hash chain determines the offset of the signature. Also, |
---|
1991 | + semantically, you must know what the root of the share hash tree |
---|
1992 | + is before you can generate a valid signature. |
---|
1993 | + """ |
---|
1994 | + assert isinstance(sharehashes, dict) |
---|
1995 | + if "share_hash_chain" not in self._offsets: |
---|
1996 | + raise LayoutInvalid("You need to put the block hash tree before " |
---|
1997 | + "you can put the share hash chain") |
---|
1998 | + # The signature comes after the share hash chain. If the |
---|
1999 | + # signature has already been written, we must not write another |
---|
2000 | + # share hash chain. The signature writes the verification key |
---|
2001 | + # offset when it gets sent to the remote server, so we look for |
---|
2002 | + # that. |
---|
2003 | + if "verification_key" in self._offsets: |
---|
2004 | + raise LayoutInvalid("You must write the share hash chain " |
---|
2005 | + "before you write the signature") |
---|
2006 | + sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i]) |
---|
2007 | + for i in sorted(sharehashes.keys())]) |
---|
2008 | + self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s) |
---|
2009 | + self._writevs.append(tuple([self._offsets['share_hash_chain'], |
---|
2010 | + sharehashes_s])) |
---|
2011 | + |
---|
2012 | + |
---|
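
The share hash chain layout that put_sharehashes queues above can be sketched in isolation. This is a minimal illustration, assuming Python 3 bytes rather than the Python 2 strings used in the patch; the helper names pack_share_hash_chain and unpack_share_hash_chain are hypothetical and not part of the patch. Only the ">H32s" entry format (a 2-byte index followed by a 32-byte hash) is taken from the source.

```python
import struct

# One chain entry: big-endian 2-byte hash index, then a 32-byte hash.
ENTRY_FORMAT = ">H32s"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 2 + 32 bytes


def pack_share_hash_chain(sharehashes):
    """Pack a {index: hash} dict into one byte string, sorted by index."""
    return b"".join(struct.pack(ENTRY_FORMAT, i, sharehashes[i])
                    for i in sorted(sharehashes))


def unpack_share_hash_chain(packed):
    """Invert pack_share_hash_chain; reject input with a partial entry."""
    if len(packed) % ENTRY_SIZE:
        raise ValueError("packed chain is not a whole number of entries")
    return dict(struct.unpack(ENTRY_FORMAT, packed[i:i + ENTRY_SIZE])
                for i in range(0, len(packed), ENTRY_SIZE))


chain = {3: b"\x03" * 32, 1: b"\x01" * 32}
packed = pack_share_hash_chain(chain)
```

Because every entry has a fixed size, the length of the packed chain alone fixes where the signature starts, which is why the writer can define the signature offset as soon as the chain is queued.
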
2013 | + def put_root_hash(self, roothash): |
---|
2014 | + """ |
---|
2015 | + Put the root hash (the root of the share hash tree) in the |
---|
2016 | + remote slot. |
---|
2017 | + """ |
---|
2018 | + # It does not make sense to be able to put the root |
---|
2019 | + # hash without first putting the share hashes, since you need |
---|
2020 | + # the share hashes to generate the root hash. |
---|
2021 | + # |
---|
2022 | + # Signature is defined by the routine that places the share hash |
---|
2023 | + # chain, so it's a good thing to look for in finding out whether |
---|
2024 | + # or not the share hash chain exists on the remote server. |
---|
2025 | + if "signature" not in self._offsets: |
---|
2026 | + raise LayoutInvalid("You need to put the share hash chain " |
---|
2027 | + "before you can put the root share hash") |
---|
2028 | + if len(roothash) != HASH_SIZE: |
---|
2029 | + raise LayoutInvalid("hashes and salts must be exactly %d bytes" |
---|
2030 | + % HASH_SIZE) |
---|
2031 | + self._root_hash = roothash |
---|
2032 | + # To write the root hash, we update the checkstring on the |
---|
2033 | + # remote server, which includes it. |
---|
2034 | + checkstring = self.get_checkstring() |
---|
2035 | + self._writevs.append(tuple([0, checkstring])) |
---|
2036 | + # This write, if successful, changes the checkstring, so we need |
---|
2037 | + # to update our internal checkstring to be consistent with the |
---|
2038 | + # one on the server. |
---|
2039 | + |
---|
2040 | + |
---|
2041 | + def get_signable(self): |
---|
2042 | + """ |
---|
2043 | + Get the first seven fields of the mutable file; the parts that |
---|
2044 | + are signed. |
---|
2045 | + """ |
---|
2046 | + if not self._root_hash: |
---|
2047 | + raise LayoutInvalid("You need to set the root hash " |
---|
2048 | + "before getting something to " |
---|
2049 | + "sign") |
---|
2050 | + return struct.pack(MDMFSIGNABLEHEADER, |
---|
2051 | + 1, |
---|
2052 | + self._seqnum, |
---|
2053 | + self._root_hash, |
---|
2054 | + self._required_shares, |
---|
2055 | + self._total_shares, |
---|
2056 | + self._segment_size, |
---|
2057 | + self._data_length) |
---|
2058 | + |
---|
2059 | + |
---|
2060 | + def put_signature(self, signature): |
---|
2061 | + """ |
---|
2062 | + I queue a write vector for the signature of the MDMF share. |
---|
2063 | + |
---|
2064 | + I require that the root hash and share hash chain have been put |
---|
2065 | + to the grid before I will write the signature to the grid. |
---|
2066 | + """ |
---|
2067 | + # It does not make sense to put a signature without first |
---|
2068 | + # putting the root hash (since otherwise the signature would |
---|
2069 | + # be incomplete), so we don't allow that. |
---|
2070 | + if "signature" not in self._offsets: |
---|
2071 | + raise LayoutInvalid("You must put the share hash chain " |
---|
2072 | + "before putting the signature") |
---|
2073 | + if not self._root_hash: |
---|
2074 | + raise LayoutInvalid("You must complete the signed prefix " |
---|
2075 | + "before computing a signature") |
---|
2076 | + # If we put the signature after we put the verification key, we |
---|
2077 | + # could end up running into the verification key, and will |
---|
2078 | + # probably screw up the offsets as well. So we don't allow that. |
---|
2079 | + # The method that writes the verification key defines the EOF |
---|
2080 | + # offset before writing the verification key, so look for that. |
---|
2081 | + if "EOF" in self._offsets: |
---|
2082 | + raise LayoutInvalid("You must write the signature before the verification key") |
---|
2083 | + |
---|
2084 | + self._offsets['verification_key'] = self._offsets['signature'] + len(signature) |
---|
2085 | + self._writevs.append(tuple([self._offsets['signature'], signature])) |
---|
2086 | + |
---|
2087 | + |
---|
2088 | + def put_verification_key(self, verification_key): |
---|
2089 | + """ |
---|
2090 | + I queue a write vector for the verification key. |
---|
2091 | + |
---|
2092 | + I require that the signature have been written to the storage |
---|
2093 | + server before I allow the verification key to be written to the |
---|
2094 | + remote server. |
---|
2095 | + """ |
---|
2096 | + if "verification_key" not in self._offsets: |
---|
2097 | + raise LayoutInvalid("You must put the signature before you " |
---|
2098 | + "can put the verification key") |
---|
2099 | + self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key) |
---|
2100 | + self._writevs.append(tuple([self._offsets['verification_key'], |
---|
2101 | + verification_key])) |
---|
2102 | + |
---|
2103 | + |
---|
2104 | + def _get_offsets_tuple(self): |
---|
2105 | + return tuple([(key, value) for key, value in self._offsets.items()]) |
---|
2106 | + |
---|
2107 | + |
---|
2108 | + def get_verinfo(self): |
---|
2109 | + return (self._seqnum, |
---|
2110 | + self._root_hash, |
---|
2111 | + self._required_shares, |
---|
2112 | + self._total_shares, |
---|
2113 | + self._segment_size, |
---|
2114 | + self._data_length, |
---|
2115 | + self.get_signable(), |
---|
2116 | + self._get_offsets_tuple()) |
---|
2117 | + |
---|
2118 | + |
---|
2119 | + def finish_publishing(self): |
---|
2120 | + """ |
---|
2121 | + I add a write vector for the offsets table, and then cause all |
---|
2122 | + of the write vectors that I've dealt with so far to be published |
---|
2123 | + to the remote server, ending the write process. |
---|
2124 | + """ |
---|
2125 | + if "EOF" not in self._offsets: |
---|
2126 | + raise LayoutInvalid("You must put the verification key before " |
---|
2127 | + "you can publish the offsets") |
---|
2128 | + offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS) |
---|
2129 | + offsets = struct.pack(MDMFOFFSETS, |
---|
2130 | + self._offsets['enc_privkey'], |
---|
2131 | + self._offsets['block_hash_tree'], |
---|
2132 | + self._offsets['share_hash_chain'], |
---|
2133 | + self._offsets['signature'], |
---|
2134 | + self._offsets['verification_key'], |
---|
2135 | + self._offsets['EOF']) |
---|
2136 | + self._writevs.append(tuple([offsets_offset, offsets])) |
---|
2137 | + encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING) |
---|
2138 | + params = struct.pack(">BBQQ", |
---|
2139 | + self._required_shares, |
---|
2140 | + self._total_shares, |
---|
2141 | + self._segment_size, |
---|
2142 | + self._data_length) |
---|
2143 | + self._writevs.append(tuple([encoding_parameters_offset, params])) |
---|
2144 | + return self._write(self._writevs) |
---|
2145 | + |
---|
2146 | + |
---|
2147 | + def _write(self, datavs, on_failure=None, on_success=None): |
---|
2148 | + """I write the data vectors in datavs to the remote slot.""" |
---|
2149 | + tw_vectors = {} |
---|
2150 | + new_share = False |
---|
2151 | + if not self._testvs: |
---|
2152 | + self._testvs = [] |
---|
2153 | + self._testvs.append(tuple([0, 1, "eq", ""])) |
---|
2154 | + new_share = True |
---|
2155 | + if not self._written: |
---|
2156 | + # Write a new checkstring to the share when we write it, so |
---|
2157 | + # that we have something to check later. |
---|
2158 | + new_checkstring = self.get_checkstring() |
---|
2159 | + datavs.append((0, new_checkstring)) |
---|
2160 | + def _first_write(): |
---|
2161 | + self._written = True |
---|
2162 | + self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)] |
---|
2163 | + on_success = _first_write |
---|
2164 | + tw_vectors[self.shnum] = (self._testvs, datavs, None) |
---|
2165 | + datalength = sum([len(x[1]) for x in datavs]) |
---|
2166 | + d = self._rref.callRemote("slot_testv_and_readv_and_writev", |
---|
2167 | + self._storage_index, |
---|
2168 | + self._secrets, |
---|
2169 | + tw_vectors, |
---|
2170 | + self._readv) |
---|
2171 | + def _result(results): |
---|
2172 | + if isinstance(results, failure.Failure) or not results[0]: |
---|
2173 | + # Do nothing; the write was unsuccessful. |
---|
2174 | + if on_failure: on_failure() |
---|
2175 | + else: |
---|
2176 | + if on_success: on_success() |
---|
2177 | + return results |
---|
2178 | + d.addCallback(_result) |
---|
2179 | + return d |
---|
2180 | + |
---|
2181 | + |
---|
2182 | +class MDMFSlotReadProxy: |
---|
2183 | + """ |
---|
2184 | + I read from a mutable slot filled with data written in the MDMF data |
---|
2185 | + format (which is described above). |
---|
2186 | + |
---|
2187 | + I can be initialized with some amount of data, which I will use (if |
---|
2188 | + it is valid) to eliminate some of the need to fetch it from servers. |
---|
2189 | + """ |
---|
2190 | + def __init__(self, |
---|
2191 | + rref, |
---|
2192 | + storage_index, |
---|
2193 | + shnum, |
---|
2194 | + data=""): |
---|
2195 | + # Start the initialization process. |
---|
2196 | + self._rref = rref |
---|
2197 | + self._storage_index = storage_index |
---|
2198 | + self.shnum = shnum |
---|
2199 | + |
---|
2200 | + # Before doing anything, the reader is probably going to want to |
---|
2201 | + # verify that the signature is correct. To do that, they'll need |
---|
2202 | + # the verification key, and the signature. To get those, we'll |
---|
2203 | + # need the offset table. So fetch the offset table on the |
---|
2204 | + # assumption that that will be the first thing that a reader is |
---|
2205 | + # going to do. |
---|
2206 | + |
---|
2207 | + # The fact that these encoding parameters are None tells us |
---|
2208 | + # that we haven't yet fetched them from the remote share, so we |
---|
2209 | + # should. We could just not set them, but the checks will be |
---|
2210 | + # easier to read if we don't have to use hasattr. |
---|
2211 | + self._version_number = None |
---|
2212 | + self._sequence_number = None |
---|
2213 | + self._root_hash = None |
---|
2214 | + # Filled in if we're dealing with an SDMF file. Unused |
---|
2215 | + # otherwise. |
---|
2216 | + self._salt = None |
---|
2217 | + self._required_shares = None |
---|
2218 | + self._total_shares = None |
---|
2219 | + self._segment_size = None |
---|
2220 | + self._data_length = None |
---|
2221 | + self._offsets = None |
---|
2222 | + |
---|
2223 | + # If the user has chosen to initialize us with some data, we'll |
---|
2224 | + # try to satisfy subsequent data requests with that data before |
---|
2225 | + # asking the storage server for it. |
---|
2226 | + self._data = data |
---|
2227 | + # The way callers interact with cache in the filenode returns |
---|
2228 | + # None if there isn't any cached data, but the way we index the |
---|
2229 | + # cached data requires a string, so convert None to "". |
---|
2230 | + if self._data is None: |
---|
2231 | + self._data = "" |
---|
2232 | + |
---|
2233 | + self._queue_observers = observer.ObserverList() |
---|
2234 | + self._queue_errbacks = observer.ObserverList() |
---|
2235 | + self._readvs = [] |
---|
2236 | + |
---|
2237 | + |
---|
2238 | + def _maybe_fetch_offsets_and_header(self, force_remote=False): |
---|
2239 | + """ |
---|
2240 | + I fetch the offset table and the header from the remote slot if |
---|
2241 | + I don't already have them. If I do have them, I do nothing and |
---|
2242 | + return a Deferred that has already fired with None. |
---|
2243 | + """ |
---|
2244 | + if self._offsets: |
---|
2245 | + return defer.succeed(None) |
---|
2246 | + # At this point, we may be either SDMF or MDMF. Fetching 107 |
---|
2247 | + # bytes will be enough to get header and offsets for both SDMF and |
---|
2248 | + # MDMF, though we'll be left with 4 more bytes than we |
---|
2249 | + # need if this ends up being MDMF. This is probably less |
---|
2250 | + # expensive than the cost of a second roundtrip. |
---|
2251 | + readvs = [(0, 107)] |
---|
2252 | + d = self._read(readvs, force_remote) |
---|
2253 | + d.addCallback(self._process_encoding_parameters) |
---|
2254 | + d.addCallback(self._process_offsets) |
---|
2255 | + return d |
---|
2256 | + |
---|
2257 | + |
---|
2258 | + def _process_encoding_parameters(self, encoding_parameters): |
---|
2259 | + assert self.shnum in encoding_parameters |
---|
2260 | + encoding_parameters = encoding_parameters[self.shnum][0] |
---|
2261 | + # The first byte is the version number. It will tell us what |
---|
2262 | + # to do next. |
---|
2263 | + (verno,) = struct.unpack(">B", encoding_parameters[:1]) |
---|
2264 | + if verno == MDMF_VERSION: |
---|
2265 | + read_size = MDMFHEADERWITHOUTOFFSETSSIZE |
---|
2266 | + (verno, |
---|
2267 | + seqnum, |
---|
2268 | + root_hash, |
---|
2269 | + k, |
---|
2270 | + n, |
---|
2271 | + segsize, |
---|
2272 | + datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS, |
---|
2273 | + encoding_parameters[:read_size]) |
---|
2274 | + if segsize == 0 and datalen == 0: |
---|
2275 | + # Empty file, no segments. |
---|
2276 | + self._num_segments = 0 |
---|
2277 | + else: |
---|
2278 | + self._num_segments = mathutil.div_ceil(datalen, segsize) |
---|
2279 | + |
---|
2280 | + elif verno == SDMF_VERSION: |
---|
2281 | + read_size = SIGNED_PREFIX_LENGTH |
---|
2282 | + (verno, |
---|
2283 | + seqnum, |
---|
2284 | + root_hash, |
---|
2285 | + salt, |
---|
2286 | + k, |
---|
2287 | + n, |
---|
2288 | + segsize, |
---|
2289 | + datalen) = struct.unpack(">BQ32s16s BBQQ", |
---|
2290 | + encoding_parameters[:SIGNED_PREFIX_LENGTH]) |
---|
2291 | + self._salt = salt |
---|
2292 | + if segsize == 0 and datalen == 0: |
---|
2293 | + # empty file |
---|
2294 | + self._num_segments = 0 |
---|
2295 | + else: |
---|
2296 | + # non-empty SDMF files have one segment. |
---|
2297 | + self._num_segments = 1 |
---|
2298 | + else: |
---|
2299 | + raise UnknownVersionError("You asked me to read mutable file " |
---|
2300 | + "version %d, but I only understand " |
---|
2301 | + "%d and %d" % (verno, SDMF_VERSION, |
---|
2302 | + MDMF_VERSION)) |
---|
2303 | + |
---|
2304 | + self._version_number = verno |
---|
2305 | + self._sequence_number = seqnum |
---|
2306 | + self._root_hash = root_hash |
---|
2307 | + self._required_shares = k |
---|
2308 | + self._total_shares = n |
---|
2309 | + self._segment_size = segsize |
---|
2310 | + self._data_length = datalen |
---|
2311 | + |
---|
2312 | + self._block_size = self._segment_size // self._required_shares |
---|
2313 | + # We can upload empty files, and need to account for this fact |
---|
2314 | + # so as to avoid zero-division and zero-modulo errors. |
---|
2315 | + if datalen > 0: |
---|
2316 | + tail_size = self._data_length % self._segment_size |
---|
2317 | + else: |
---|
2318 | + tail_size = 0 |
---|
2319 | + if not tail_size: |
---|
2320 | + self._tail_block_size = self._block_size |
---|
2321 | + else: |
---|
2322 | + self._tail_block_size = mathutil.next_multiple(tail_size, |
---|
2323 | + self._required_shares) |
---|
2324 | + self._tail_block_size //= self._required_shares |
---|
2325 | + |
---|
2326 | + return encoding_parameters |
---|
2327 | + |
---|
2328 | + |
---|
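
The segment and block size arithmetic in _process_encoding_parameters can be checked with a small standalone sketch. The helpers div_ceil and next_multiple below are stand-ins written for this example, assuming mathutil.div_ceil rounds a division up and mathutil.next_multiple returns the smallest multiple of k that is at least n; block_sizes is a hypothetical name, not a function in the patch.

```python
def div_ceil(n, d):
    """Integer division of n by d, rounded up (assumed mathutil semantics)."""
    return (n + d - 1) // d


def next_multiple(n, k):
    """Smallest multiple of k that is >= n (assumed mathutil semantics)."""
    return div_ceil(n, k) * k


def block_sizes(k, segsize, datalen):
    """Return (num_segments, block_size, tail_block_size) for an MDMF share."""
    if segsize == 0 and datalen == 0:
        num_segments = 0  # empty file, no segments
    else:
        num_segments = div_ceil(datalen, segsize)
    block_size = segsize // k
    # Guard the modulo so an empty file does not divide by zero.
    tail_size = datalen % segsize if datalen > 0 else 0
    if not tail_size:
        tail_block_size = block_size
    else:
        # The tail segment is padded up to a multiple of k before encoding.
        tail_block_size = next_multiple(tail_size, k) // k
    return num_segments, block_size, tail_block_size
```

For example, with k=3, a 9-byte segment size and 20 bytes of data, the last segment holds 2 bytes, which pads to 3 and yields a 1-byte tail block.
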
2329 | + def _process_offsets(self, offsets): |
---|
2330 | + if self._version_number == 0: |
---|
2331 | + read_size = OFFSETS_LENGTH |
---|
2332 | + read_offset = SIGNED_PREFIX_LENGTH |
---|
2333 | + end = read_size + read_offset |
---|
2334 | + (signature, |
---|
2335 | + share_hash_chain, |
---|
2336 | + block_hash_tree, |
---|
2337 | + share_data, |
---|
2338 | + enc_privkey, |
---|
2339 | + EOF) = struct.unpack(">LLLLQQ", |
---|
2340 | + offsets[read_offset:end]) |
---|
2341 | + self._offsets = {} |
---|
2342 | + self._offsets['signature'] = signature |
---|
2343 | + self._offsets['share_data'] = share_data |
---|
2344 | + self._offsets['block_hash_tree'] = block_hash_tree |
---|
2345 | + self._offsets['share_hash_chain'] = share_hash_chain |
---|
2346 | + self._offsets['enc_privkey'] = enc_privkey |
---|
2347 | + self._offsets['EOF'] = EOF |
---|
2348 | + |
---|
2349 | + elif self._version_number == 1: |
---|
2350 | + read_offset = MDMFHEADERWITHOUTOFFSETSSIZE |
---|
2351 | + read_length = MDMFOFFSETS_LENGTH |
---|
2352 | + end = read_offset + read_length |
---|
2353 | + (encprivkey, |
---|
2354 | + blockhashes, |
---|
2355 | + sharehashes, |
---|
2356 | + signature, |
---|
2357 | + verification_key, |
---|
2358 | + eof) = struct.unpack(MDMFOFFSETS, |
---|
2359 | + offsets[read_offset:end]) |
---|
2360 | + self._offsets = {} |
---|
2361 | + self._offsets['enc_privkey'] = encprivkey |
---|
2362 | + self._offsets['block_hash_tree'] = blockhashes |
---|
2363 | + self._offsets['share_hash_chain'] = sharehashes |
---|
2364 | + self._offsets['signature'] = signature |
---|
2365 | + self._offsets['verification_key'] = verification_key |
---|
2366 | + self._offsets['EOF'] = eof |
---|
2367 | + |
---|
2368 | + |
---|
2369 | + def get_block_and_salt(self, segnum, queue=False): |
---|
2370 | + """ |
---|
2371 | + I return (block, salt), where block is the block data and |
---|
2372 | + salt is the salt used to encrypt that segment. |
---|
2373 | + """ |
---|
2374 | + d = self._maybe_fetch_offsets_and_header() |
---|
2375 | + def _then(ignored): |
---|
2376 | + if self._version_number == 1: |
---|
2377 | + base_share_offset = MDMFHEADERSIZE |
---|
2378 | + else: |
---|
2379 | + base_share_offset = self._offsets['share_data'] |
---|
2380 | + |
---|
2381 | + if segnum + 1 > self._num_segments: |
---|
2382 | + raise LayoutInvalid("Not a valid segment number") |
---|
2383 | + |
---|
2384 | + if self._version_number == 0: |
---|
2385 | + share_offset = base_share_offset + self._block_size * segnum |
---|
2386 | + else: |
---|
2387 | + share_offset = base_share_offset + (self._block_size + \ |
---|
2388 | + SALT_SIZE) * segnum |
---|
2389 | + if segnum + 1 == self._num_segments: |
---|
2390 | + data = self._tail_block_size |
---|
2391 | + else: |
---|
2392 | + data = self._block_size |
---|
2393 | + |
---|
2394 | + if self._version_number == 1: |
---|
2395 | + data += SALT_SIZE |
---|
2396 | + |
---|
2397 | + readvs = [(share_offset, data)] |
---|
2398 | + return readvs |
---|
2399 | + d.addCallback(_then) |
---|
2400 | + d.addCallback(lambda readvs: |
---|
2401 | + self._read(readvs, queue=queue)) |
---|
2402 | + def _process_results(results): |
---|
2403 | + assert self.shnum in results |
---|
2404 | + if self._version_number == 0: |
---|
2405 | + # We only read the share data, but we know the salt from |
---|
2406 | + # when we fetched the header |
---|
2407 | + data = results[self.shnum] |
---|
2408 | + if not data: |
---|
2409 | + data = "" |
---|
2410 | + else: |
---|
2411 | + assert len(data) == 1 |
---|
2412 | + data = data[0] |
---|
2413 | + salt = self._salt |
---|
2414 | + else: |
---|
2415 | + data = results[self.shnum] |
---|
2416 | + if not data: |
---|
2417 | + salt = data = "" |
---|
2418 | + else: |
---|
2419 | + salt_and_data = results[self.shnum][0] |
---|
2420 | + salt = salt_and_data[:SALT_SIZE] |
---|
2421 | + data = salt_and_data[SALT_SIZE:] |
---|
2422 | + return data, salt |
---|
2423 | + d.addCallback(_process_results) |
---|
2424 | + return d |
---|
2425 | + |
---|
2426 | + |
---|
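
The MDMF branch of get_block_and_salt above reduces to a small offset calculation. The sketch below is illustrative only: it uses Python 3 bytes, assumes SALT_SIZE is 16, and takes the header size as a parameter because MDMFHEADERSIZE is defined elsewhere in the patch. The names mdmf_block_readv and split_salt_and_block are hypothetical.

```python
SALT_SIZE = 16  # assumed value; the real constant is defined in the patch


def mdmf_block_readv(segnum, num_segments, block_size, tail_block_size,
                     header_size):
    """Return the (offset, length) read vector for one salted MDMF block."""
    if not 0 <= segnum < num_segments:
        raise ValueError("Not a valid segment number")
    # Each stored block is preceded by its salt, so the stride between
    # consecutive blocks is block_size + SALT_SIZE.
    offset = header_size + (block_size + SALT_SIZE) * segnum
    if segnum + 1 == num_segments:
        length = tail_block_size + SALT_SIZE
    else:
        length = block_size + SALT_SIZE
    return offset, length


def split_salt_and_block(raw):
    """Split the bytes read from the share into (block, salt)."""
    return raw[SALT_SIZE:], raw[:SALT_SIZE]
```

The return order of split_salt_and_block mirrors the (block, salt) pair that get_block_and_salt hands back to its caller.
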
2427 | + def get_blockhashes(self, needed=None, queue=False, force_remote=False): |
---|
2428 | + """ |
---|
2429 | + I return the block hash tree |
---|
2430 | + |
---|
2431 | + I take an optional argument, needed, which is a set of indices |
---|
2432 | + that correspond to hashes that I should fetch. If this argument is |
---|
2433 | + missing, I will fetch the entire block hash tree; otherwise, I |
---|
2434 | + may attempt to fetch fewer hashes, based on what needed says |
---|
2435 | + that I should do. Note that I may fetch as many hashes as I |
---|
2436 | + want, so long as the set of hashes that I do fetch is a superset |
---|
2437 | + of the ones that I am asked for, so callers should be prepared |
---|
2438 | + to tolerate additional hashes. |
---|
2439 | + """ |
---|
2440 | + # TODO: Return only the parts of the block hash tree necessary |
---|
2441 | + # to validate the blocknum provided? |
---|
2442 | + # This is a good idea, but it is hard to implement correctly. It |
---|
2443 | + # is bad to fetch any one block hash more than once, so we |
---|
2444 | + # probably just want to fetch the whole thing at once and then |
---|
2445 | + # serve it. |
---|
2446 | + if needed == set([]): |
---|
2447 | + return defer.succeed([]) |
---|
2448 | + d = self._maybe_fetch_offsets_and_header() |
---|
2449 | + def _then(ignored): |
---|
2450 | + blockhashes_offset = self._offsets['block_hash_tree'] |
---|
2451 | + if self._version_number == 1: |
---|
2452 | + blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset |
---|
2453 | + else: |
---|
2454 | + blockhashes_length = self._offsets['share_data'] - blockhashes_offset |
---|
2455 | + readvs = [(blockhashes_offset, blockhashes_length)] |
---|
2456 | + return readvs |
---|
2457 | + d.addCallback(_then) |
---|
2458 | + d.addCallback(lambda readvs: |
---|
2459 | + self._read(readvs, queue=queue, force_remote=force_remote)) |
---|
2460 | + def _build_block_hash_tree(results): |
---|
2461 | + assert self.shnum in results |
---|
2462 | + |
---|
2463 | + rawhashes = results[self.shnum][0] |
---|
2464 | + results = [rawhashes[i:i+HASH_SIZE] |
---|
2465 | + for i in range(0, len(rawhashes), HASH_SIZE)] |
---|
2466 | + return results |
---|
2467 | + d.addCallback(_build_block_hash_tree) |
---|
2468 | + return d |
---|
2469 | + |
---|
2470 | + |
---|
2471 | + def get_sharehashes(self, needed=None, queue=False, force_remote=False): |
---|
2472 | + """ |
---|
2473 | + I return the part of the share hash chain placed to validate |
---|
2474 | + this share. |
---|
2475 | + |
---|
2476 | + I take an optional argument, needed. Needed is a set of indices |
---|
2477 | + that correspond to the hashes that I should fetch. If needed is |
---|
2478 | + not present, I will fetch and return the entire share hash |
---|
2479 | + chain. Otherwise, I may fetch and return any part of the share |
---|
2480 | + hash chain that is a superset of the part that I am asked to |
---|
2481 | + fetch. Callers should be prepared to deal with more hashes than |
---|
2482 | + they've asked for. |
---|
2483 | + """ |
---|
2484 | + if needed == set([]): |
---|
2485 | + return defer.succeed([]) |
---|
2486 | + d = self._maybe_fetch_offsets_and_header() |
---|
2487 | + |
---|
2488 | + def _make_readvs(ignored): |
---|
2489 | + sharehashes_offset = self._offsets['share_hash_chain'] |
---|
2490 | + if self._version_number == 0: |
---|
2491 | + sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset |
---|
2492 | + else: |
---|
2493 | + sharehashes_length = self._offsets['signature'] - sharehashes_offset |
---|
2494 | + readvs = [(sharehashes_offset, sharehashes_length)] |
---|
2495 | + return readvs |
---|
2496 | + d.addCallback(_make_readvs) |
---|
2497 | + d.addCallback(lambda readvs: |
---|
2498 | + self._read(readvs, queue=queue, force_remote=force_remote)) |
---|
2499 | + def _build_share_hash_chain(results): |
---|
2500 | + assert self.shnum in results |
---|
2501 | + |
---|
2502 | + sharehashes = results[self.shnum][0] |
---|
2503 | + results = [sharehashes[i:i+(HASH_SIZE + 2)] |
---|
2504 | + for i in range(0, len(sharehashes), HASH_SIZE + 2)] |
---|
2505 | + results = dict([struct.unpack(">H32s", data) |
---|
2506 | + for data in results]) |
---|
2507 | + return results |
---|
2508 | + d.addCallback(_build_share_hash_chain) |
---|
2509 | + return d |
---|
2510 | + |
---|
2511 | + |
---|
2512 | + def get_encprivkey(self, queue=False): |
---|
2513 | + """ |
---|
2514 | + I return the encrypted private key. |
---|
2515 | + """ |
---|
2516 | + d = self._maybe_fetch_offsets_and_header() |
---|
2517 | + |
---|
2518 | + def _make_readvs(ignored): |
---|
2519 | + privkey_offset = self._offsets['enc_privkey'] |
---|
2520 | + if self._version_number == 0: |
---|
2521 | + privkey_length = self._offsets['EOF'] - privkey_offset |
---|
2522 | + else: |
---|
2523 | + privkey_length = self._offsets['block_hash_tree'] - privkey_offset |
---|
2524 | + readvs = [(privkey_offset, privkey_length)] |
---|
2525 | + return readvs |
---|
2526 | + d.addCallback(_make_readvs) |
---|
2527 | + d.addCallback(lambda readvs: |
---|
2528 | + self._read(readvs, queue=queue)) |
---|
2529 | + def _process_results(results): |
---|
2530 | + assert self.shnum in results |
---|
2531 | + privkey = results[self.shnum][0] |
---|
2532 | + return privkey |
---|
2533 | + d.addCallback(_process_results) |
---|
2534 | + return d |
---|
2535 | + |
---|
2536 | + |
---|
2537 | + def get_signature(self, queue=False): |
---|
2538 | + """ |
---|
2539 | + I return the signature of my share. |
---|
2540 | + """ |
---|
2541 | + d = self._maybe_fetch_offsets_and_header() |
---|
2542 | + |
---|
2543 | + def _make_readvs(ignored): |
---|
2544 | + signature_offset = self._offsets['signature'] |
---|
2545 | + if self._version_number == 1: |
---|
2546 | + signature_length = self._offsets['verification_key'] - signature_offset |
---|
2547 | + else: |
---|
2548 | + signature_length = self._offsets['share_hash_chain'] - signature_offset |
---|
2549 | + readvs = [(signature_offset, signature_length)] |
---|
2550 | + return readvs |
---|
2551 | + d.addCallback(_make_readvs) |
---|
2552 | + d.addCallback(lambda readvs: |
---|
2553 | + self._read(readvs, queue=queue)) |
---|
2554 | + def _process_results(results): |
---|
2555 | + assert self.shnum in results |
---|
2556 | + signature = results[self.shnum][0] |
---|
2557 | + return signature |
---|
2558 | + d.addCallback(_process_results) |
---|
2559 | + return d |
---|
2560 | + |
---|
2561 | + |
---|
2562 | + def get_verification_key(self, queue=False): |
---|
2563 | + """ |
---|
2564 | + I return the verification key. |
---|
2565 | + """ |
---|
2566 | + d = self._maybe_fetch_offsets_and_header() |
---|
2567 | + |
---|
2568 | + def _make_readvs(ignored): |
---|
2569 | + if self._version_number == 1: |
---|
2570 | + vk_offset = self._offsets['verification_key'] |
---|
2571 | + vk_length = self._offsets['EOF'] - vk_offset |
---|
2572 | + else: |
---|
2573 | + vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ") |
---|
2574 | + vk_length = self._offsets['signature'] - vk_offset |
---|
2575 | + readvs = [(vk_offset, vk_length)] |
---|
2576 | + return readvs |
---|
2577 | + d.addCallback(_make_readvs) |
---|
2578 | + d.addCallback(lambda readvs: |
---|
2579 | + self._read(readvs, queue=queue)) |
---|
2580 | + def _process_results(results): |
---|
2581 | + assert self.shnum in results |
---|
2582 | + verification_key = results[self.shnum][0] |
---|
2583 | + return verification_key |
---|
2584 | + d.addCallback(_process_results) |
---|
2585 | + return d |
---|
2586 | + |
---|
2587 | + |
---|
2588 | + def get_encoding_parameters(self): |
---|
2589 | + """ |
---|
2590 | + I return (k, n, segsize, datalen) |
---|
2591 | + """ |
---|
2592 | + d = self._maybe_fetch_offsets_and_header() |
---|
2593 | + d.addCallback(lambda ignored: |
---|
2594 | + (self._required_shares, |
---|
2595 | + self._total_shares, |
---|
2596 | + self._segment_size, |
---|
2597 | + self._data_length)) |
---|
2598 | + return d |
---|
2599 | + |
---|
2600 | + |
---|
2601 | + def get_seqnum(self): |
---|
2602 | + """ |
---|
2603 | + I return the sequence number for this share. |
---|
2604 | + """ |
---|
2605 | + d = self._maybe_fetch_offsets_and_header() |
---|
2606 | + d.addCallback(lambda ignored: |
---|
2607 | + self._sequence_number) |
---|
2608 | + return d |
---|
2609 | + |
---|
2610 | + |
---|
2611 | + def get_root_hash(self): |
---|
2612 | + """ |
---|
2613 | + I return the root of the share hash tree. |
---|
2614 | + """ |
---|
2615 | + d = self._maybe_fetch_offsets_and_header() |
---|
2616 | + d.addCallback(lambda ignored: self._root_hash) |
---|
2617 | + return d |
---|
2618 | + |
---|
2619 | + |
---|
2620 | + def get_checkstring(self): |
---|
2621 | + """ |
---|
2622 | + I return the packed representation of the following: |
---|
2623 | + |
---|
2624 | + - version number |
---|
2625 | + - sequence number |
---|
2626 | + - root hash |
---|
2627 | + - salt (only if I have one, as SDMF shares do) |
---|
2628 | + |
---|
2629 | + which my users use as a checkstring to detect other writers. |
---|
2630 | + """ |
---|
2631 | + d = self._maybe_fetch_offsets_and_header() |
---|
2632 | + def _build_checkstring(ignored): |
---|
2633 | + if self._salt: |
---|
2634 | + checkstring = struct.pack(PREFIX, |
---|
2635 | + self._version_number, |
---|
2636 | + self._sequence_number, |
---|
2637 | + self._root_hash, |
---|
2638 | + self._salt) |
---|
2639 | + else: |
---|
2640 | + checkstring = struct.pack(MDMFCHECKSTRING, |
---|
2641 | + self._version_number, |
---|
2642 | + self._sequence_number, |
---|
2643 | + self._root_hash) |
---|
2644 | + |
---|
2645 | + return checkstring |
---|
2646 | + d.addCallback(_build_checkstring) |
---|
2647 | + return d |
---|
2648 | + |
---|
2649 | + |
---|
2650 | + def get_prefix(self, force_remote): |
---|
2651 | + d = self._maybe_fetch_offsets_and_header(force_remote) |
---|
2652 | + d.addCallback(lambda ignored: |
---|
2653 | + self._build_prefix()) |
---|
2654 | + return d |
---|
2655 | + |
---|
2656 | + |
---|
2657 | + def _build_prefix(self): |
---|
2658 | + # The prefix is another name for the part of the remote share |
---|
2659 | + # that gets signed. It consists of everything up to and |
---|
2660 | + # including the datalength, packed by struct. |
---|
2661 | + if self._version_number == SDMF_VERSION: |
---|
2662 | + return struct.pack(SIGNED_PREFIX, |
---|
2663 | + self._version_number, |
---|
2664 | + self._sequence_number, |
---|
2665 | + self._root_hash, |
---|
2666 | + self._salt, |
---|
2667 | + self._required_shares, |
---|
2668 | + self._total_shares, |
---|
2669 | + self._segment_size, |
---|
2670 | + self._data_length) |
---|
2671 | + |
---|
2672 | + else: |
---|
2673 | + return struct.pack(MDMFSIGNABLEHEADER, |
---|
2674 | + self._version_number, |
---|
2675 | + self._sequence_number, |
---|
2676 | + self._root_hash, |
---|
2677 | + self._required_shares, |
---|
2678 | + self._total_shares, |
---|
2679 | + self._segment_size, |
---|
2680 | + self._data_length) |
---|
2681 | + |
---|
2682 | + |
---|
2683 | + def _get_offsets_tuple(self): |
---|
2684 | + # The offsets tuple is another component of the version |
---|
2685 | + # information tuple. Despite the name, it is currently a copy of |
---|
2686 | + # our offsets dictionary rather than an itemized tuple. |
---|
2687 | + return self._offsets.copy() |
---|
2688 | + |
---|
2689 | + |
---|
2690 | + def get_verinfo(self): |
---|
2691 | + """ |
---|
2692 | + I return my verinfo tuple. This is used by the ServermapUpdater |
---|
2693 | + to keep track of versions of mutable files. |
---|
2694 | + |
---|
2695 | + The verinfo tuple for MDMF files contains: |
---|
2696 | + - seqnum |
---|
2697 | + - root hash |
---|
2698 | + - a blank (nothing) |
---|
2699 | + - segsize |
---|
2700 | + - datalen |
---|
2701 | + - k |
---|
2702 | + - n |
---|
2703 | + - prefix (the thing that you sign) |
---|
2704 | + - a tuple of offsets |
---|
2705 | + |
---|
2706 | + We include the blank in MDMF to simplify processing of version |
---|
2707 | + information tuples. |
---|
2708 | + |
---|
2709 | + The verinfo tuple for SDMF files is the same, but contains a |
---|
2710 | + 16-byte IV instead of the blank. |
---|
2711 | + """ |
---|
2712 | + d = self._maybe_fetch_offsets_and_header() |
---|
2713 | + def _build_verinfo(ignored): |
---|
2714 | + if self._version_number == SDMF_VERSION: |
---|
2715 | + salt_to_use = self._salt |
---|
2716 | + else: |
---|
2717 | + salt_to_use = None |
---|
2718 | + return (self._sequence_number, |
---|
2719 | + self._root_hash, |
---|
2720 | + salt_to_use, |
---|
2721 | + self._segment_size, |
---|
2722 | + self._data_length, |
---|
2723 | + self._required_shares, |
---|
2724 | + self._total_shares, |
---|
2725 | + self._build_prefix(), |
---|
2726 | + self._get_offsets_tuple()) |
---|
2727 | + d.addCallback(_build_verinfo) |
---|
2728 | + return d |
---|
2729 | + |
---|
2730 | + |
---|
2731 | + def flush(self): |
---|
2732 | + """ |
---|
2733 | + I flush my queue of read vectors. |
---|
2734 | + """ |
---|
2735 | + d = self._read(self._readvs) |
---|
2736 | + def _then(results): |
---|
2737 | + self._readvs = [] |
---|
2738 | + if isinstance(results, failure.Failure): |
---|
2739 | + self._queue_errbacks.notify(results) |
---|
2740 | + else: |
---|
2741 | + self._queue_observers.notify(results) |
---|
2742 | + self._queue_observers = observer.ObserverList() |
---|
2743 | + self._queue_errbacks = observer.ObserverList() |
---|
2744 | + d.addBoth(_then) |
---|
2745 | + |
---|
2746 | + |
---|
2747 | + def _read(self, readvs, force_remote=False, queue=False): |
---|
2748 | + unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs) |
---|
2749 | + # TODO: It's entirely possible to tweak this so that it just |
---|
2750 | + # fulfills the requests that it can, and not demand that all |
---|
2751 | + # requests are satisfiable before running it. |
---|
2752 | + if not unsatisfiable and not force_remote: |
---|
2753 | + results = [self._data[offset:offset+length] |
---|
2754 | + for (offset, length) in readvs] |
---|
2755 | + results = {self.shnum: results} |
---|
2756 | + return defer.succeed(results) |
---|
2757 | + else: |
---|
2758 | + if queue: |
---|
2759 | + start = len(self._readvs) |
---|
2760 | + self._readvs += readvs |
---|
2761 | + end = len(self._readvs) |
---|
2762 | + def _get_results(results, start, end): |
---|
2763 | + if not self.shnum in results: |
---|
2764 | + return {self.shnum: [""]} |
---|
2765 | + return {self.shnum: results[self.shnum][start:end]} |
---|
2766 | + d = defer.Deferred() |
---|
2767 | + d.addCallback(_get_results, start, end) |
---|
2768 | + self._queue_observers.subscribe(d.callback) |
---|
2769 | + self._queue_errbacks.subscribe(d.errback) |
---|
2770 | + return d |
---|
2771 | + return self._rref.callRemote("slot_readv", |
---|
2772 | + self._storage_index, |
---|
2773 | + [self.shnum], |
---|
2774 | + readvs) |
---|
2775 | + |
---|
2776 | + |
---|
2777 | + def is_sdmf(self): |
---|
2778 | + """I tell my caller whether my remote file is SDMF (True) or MDMF (False) |
---|
2779 | + """ |
---|
2780 | + d = self._maybe_fetch_offsets_and_header() |
---|
2781 | + d.addCallback(lambda ignored: |
---|
2782 | + self._version_number == 0) |
---|
2783 | + return d |
---|
2784 | + |
---|
2785 | + |
---|
2786 | +class LayoutInvalid(Exception): |
---|
2787 | + """ |
---|
2788 | + This isn't a valid MDMF mutable file |
---|
2789 | + """ |
---|
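Reviewer note: the SDMF branch of `get_verification_key` above takes the verification key offset to be the packed size of the SDMF header, and `get_checkstring` packs a short fixed-size prefix. The sketch below only illustrates those sizes with the standard `struct` module. The format strings are copied from the test code in this patch; the constant and function names are hypothetical and are not the names defined in `mutable/layout.py`. It is written for Python 3, whereas the patch itself targets Python 2.

```python
import struct

# Format strings copied from this patch's test code. The names are
# hypothetical stand-ins, not the constants exported by mutable/layout.py.
MDMF_CHECKSTRING_FMT = ">BQ32s"          # version, seqnum, root hash
SDMF_PREFIX_FMT = ">BQ32s16sBBQQ"        # + salt, k, n, segsize, datalen
SDMF_HEADER_FMT = ">BQ32s16sBBQQLLLLQQ"  # signed prefix + six offsets


def pack_mdmf_checkstring(version, seqnum, root_hash):
    """Pack the three fields that the tests use as an MDMF checkstring."""
    return struct.pack(MDMF_CHECKSTRING_FMT, version, seqnum, root_hash)


def unpack_mdmf_checkstring(data):
    """Recover (version, seqnum, root_hash) from the start of a share."""
    size = struct.calcsize(MDMF_CHECKSTRING_FMT)
    return struct.unpack(MDMF_CHECKSTRING_FMT, data[:size])


checkstring = pack_mdmf_checkstring(1, 0, b"a" * 32)
print(len(checkstring))                  # 41 bytes: 1 + 8 + 32
print(struct.calcsize(SDMF_PREFIX_FMT))  # 75 bytes
print(struct.calcsize(SDMF_HEADER_FMT))  # 107 bytes: SDMF vk offset
```

Because the format strings use `>` (big-endian, no alignment padding), the sizes are simply the sums of the field widths, which is why a fixed `calcsize` can serve as the offset of the first variable-length field.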
2790 | hunk ./src/allmydata/test/test_storage.py 2 |
---|
2791 | |
---|
2792 | -import time, os.path, stat, re, simplejson, struct |
---|
2793 | +import time, os.path, stat, re, simplejson, struct, shutil |
---|
2794 | |
---|
2795 | from twisted.trial import unittest |
---|
2796 | |
---|
2797 | hunk ./src/allmydata/test/test_storage.py 22 |
---|
2798 | from allmydata.storage.expirer import LeaseCheckingCrawler |
---|
2799 | from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \ |
---|
2800 | ReadBucketProxy |
---|
2801 | -from allmydata.interfaces import BadWriteEnablerError |
---|
2802 | -from allmydata.test.common import LoggingServiceParent |
---|
2803 | +from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \ |
---|
2804 | + LayoutInvalid, MDMFSIGNABLEHEADER, \ |
---|
2805 | + SIGNED_PREFIX, MDMFHEADER, \ |
---|
2806 | + MDMFOFFSETS, SDMFSlotWriteProxy |
---|
2807 | +from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \ |
---|
2808 | + SDMF_VERSION |
---|
2809 | +from allmydata.test.common import LoggingServiceParent, ShouldFailMixin |
---|
2810 | from allmydata.test.common_web import WebRenderingMixin |
---|
2811 | from allmydata.web.storage import StorageStatus, remove_prefix |
---|
2812 | |
---|
2813 | hunk ./src/allmydata/test/test_storage.py 106 |
---|
2814 | |
---|
2815 | class RemoteBucket: |
---|
2816 | |
---|
2817 | + def __init__(self): |
---|
2818 | + self.read_count = 0 |
---|
2819 | + self.write_count = 0 |
---|
2820 | + |
---|
2821 | def callRemote(self, methname, *args, **kwargs): |
---|
2822 | def _call(): |
---|
2823 | meth = getattr(self.target, "remote_" + methname) |
---|
2824 | hunk ./src/allmydata/test/test_storage.py 114 |
---|
2825 | return meth(*args, **kwargs) |
---|
2826 | + |
---|
2827 | + if methname == "slot_readv": |
---|
2828 | + self.read_count += 1 |
---|
2829 | + if "writev" in methname: |
---|
2830 | + self.write_count += 1 |
---|
2831 | + |
---|
2832 | return defer.maybeDeferred(_call) |
---|
2833 | |
---|
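Reviewer note: the `RemoteBucket` change above adds read and write counters so that tests can check how many remote calls a proxy makes. A minimal sketch of the same counting rule, without Twisted, might look like the following. `CountingRemote` and `FakeStorageServer` are hypothetical names invented for this illustration, and the call is synchronous here rather than wrapped in `defer.maybeDeferred`.

```python
class FakeStorageServer:
    """Hypothetical target object exposing remote_* methods."""

    def remote_slot_readv(self, storage_index, shnums, readvs):
        # Return one empty string per read vector for each share number.
        return {shnum: [b"" for _ in readvs] for shnum in shnums}

    def remote_slot_testv_and_readv_and_writev(self, storage_index,
                                               secrets, tws, readv):
        # Pretend every write succeeds and nothing is read back.
        return (True, {})


class CountingRemote:
    """Synchronous stand-in for the counting logic in RemoteBucket."""

    def __init__(self, target):
        self.target = target
        self.read_count = 0
        self.write_count = 0

    def call_remote(self, methname, *args, **kwargs):
        # Same rule as the patch: only plain slot_readv counts as a read,
        # and any method whose name contains "writev" counts as a write.
        if methname == "slot_readv":
            self.read_count += 1
        if "writev" in methname:
            self.write_count += 1
        meth = getattr(self.target, "remote_" + methname)
        return meth(*args, **kwargs)


remote = CountingRemote(FakeStorageServer())
remote.call_remote("slot_readv", b"si1", [0], [(0, 41)])
remote.call_remote("slot_testv_and_readv_and_writev", b"si1", (), {}, [])
print(remote.read_count, remote.write_count)
```

Note that under this rule a combined test-read-write call increments only the write counter, even though it can also return read data.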
2834 | hunk ./src/allmydata/test/test_storage.py 122 |
---|
2835 | + |
---|
2836 | class BucketProxy(unittest.TestCase): |
---|
2837 | def make_bucket(self, name, size): |
---|
2838 | basedir = os.path.join("storage", "BucketProxy", name) |
---|
2839 | hunk ./src/allmydata/test/test_storage.py 1313 |
---|
2840 | self.failUnless(os.path.exists(prefixdir), prefixdir) |
---|
2841 | self.failIf(os.path.exists(bucketdir), bucketdir) |
---|
2842 | |
---|
2843 | + |
---|
2844 | +class MDMFProxies(unittest.TestCase, ShouldFailMixin): |
---|
2845 | + def setUp(self): |
---|
2846 | + self.sparent = LoggingServiceParent() |
---|
2847 | + self._lease_secret = itertools.count() |
---|
2848 | + self.ss = self.create("MDMFProxies storage test server") |
---|
2849 | + self.rref = RemoteBucket() |
---|
2850 | + self.rref.target = self.ss |
---|
2851 | + self.secrets = (self.write_enabler("we_secret"), |
---|
2852 | + self.renew_secret("renew_secret"), |
---|
2853 | + self.cancel_secret("cancel_secret")) |
---|
2854 | + self.segment = "aaaaaa" |
---|
2855 | + self.block = "aa" |
---|
2856 | + self.salt = "a" * 16 |
---|
2857 | + self.block_hash = "a" * 32 |
---|
2858 | + self.block_hash_tree = [self.block_hash for i in xrange(6)] |
---|
2859 | + self.share_hash = self.block_hash |
---|
2860 | + self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)]) |
---|
2861 | + self.signature = "foobarbaz" |
---|
2862 | + self.verification_key = "vvvvvv" |
---|
2863 | + self.encprivkey = "private" |
---|
2864 | + self.root_hash = self.block_hash |
---|
2865 | + self.salt_hash = self.root_hash |
---|
2866 | + self.salt_hash_tree = [self.salt_hash for i in xrange(6)] |
---|
2867 | + self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree) |
---|
2868 | + self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain) |
---|
2869 | + # blockhashes and salt hashes are serialized in the same way, |
---|
2870 | + # only we lop off the first element and store that in the |
---|
2871 | + # header. |
---|
2872 | + self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:]) |
---|
2873 | + |
---|
2874 | + |
---|
2875 | + def tearDown(self): |
---|
2876 | + self.sparent.stopService() |
---|
2877 | + shutil.rmtree(self.workdir("MDMFProxies storage test server")) |
---|
2878 | + |
---|
2879 | + |
---|
2880 | + def write_enabler(self, we_tag): |
---|
2881 | + return hashutil.tagged_hash("we_blah", we_tag) |
---|
2882 | + |
---|
2883 | + |
---|
2884 | + def renew_secret(self, tag): |
---|
2885 | + return hashutil.tagged_hash("renew_blah", str(tag)) |
---|
2886 | + |
---|
2887 | + |
---|
2888 | + def cancel_secret(self, tag): |
---|
2889 | + return hashutil.tagged_hash("cancel_blah", str(tag)) |
---|
2890 | + |
---|
2891 | + |
---|
2892 | + def workdir(self, name): |
---|
2893 | + basedir = os.path.join("storage", "MutableServer", name) |
---|
2894 | + return basedir |
---|
2895 | + |
---|
2896 | + |
---|
2897 | + def create(self, name): |
---|
2898 | + workdir = self.workdir(name) |
---|
2899 | + ss = StorageServer(workdir, "\x00" * 20) |
---|
2900 | + ss.setServiceParent(self.sparent) |
---|
2901 | + return ss |
---|
2902 | + |
---|
2903 | + |
---|
2904 | + def build_test_mdmf_share(self, tail_segment=False, empty=False): |
---|
2905 | + # Start with the checkstring |
---|
2906 | + data = struct.pack(">BQ32s", |
---|
2907 | + 1, |
---|
2908 | + 0, |
---|
2909 | + self.root_hash) |
---|
2910 | + self.checkstring = data |
---|
2911 | + # Next, the encoding parameters |
---|
2912 | + if tail_segment: |
---|
2913 | + data += struct.pack(">BBQQ", |
---|
2914 | + 3, |
---|
2915 | + 10, |
---|
2916 | + 6, |
---|
2917 | + 33) |
---|
2918 | + elif empty: |
---|
2919 | + data += struct.pack(">BBQQ", |
---|
2920 | + 3, |
---|
2921 | + 10, |
---|
2922 | + 0, |
---|
2923 | + 0) |
---|
2924 | + else: |
---|
2925 | + data += struct.pack(">BBQQ", |
---|
2926 | + 3, |
---|
2927 | + 10, |
---|
2928 | + 6, |
---|
2929 | + 36) |
---|
2930 | + # Now we'll build the share data, whose length the offsets depend on. |
---|
2931 | + sharedata = "" |
---|
2932 | + if not tail_segment and not empty: |
---|
2933 | + for i in xrange(6): |
---|
2934 | + sharedata += self.salt + self.block |
---|
2935 | + elif tail_segment: |
---|
2936 | + for i in xrange(5): |
---|
2937 | + sharedata += self.salt + self.block |
---|
2938 | + sharedata += self.salt + "a" |
---|
2939 | + |
---|
2940 | + # The encrypted private key comes after the shares + salts |
---|
2941 | + offset_size = struct.calcsize(MDMFOFFSETS) |
---|
2942 | + encrypted_private_key_offset = len(data) + offset_size + len(sharedata) |
---|
2943 | + # The blockhashes come after the private key |
---|
2944 | + blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey) |
---|
2945 | + # The sharehashes come after the block hashes |
---|
2946 | + sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s) |
---|
2947 | + # The signature comes after the share hash chain |
---|
2948 | + signature_offset = sharehashes_offset + len(self.share_hash_chain_s) |
---|
2949 | + # The verification key comes after the signature |
---|
2950 | + verification_offset = signature_offset + len(self.signature) |
---|
2951 | + # The EOF comes after the verification key |
---|
2952 | + eof_offset = verification_offset + len(self.verification_key) |
---|
2953 | + data += struct.pack(MDMFOFFSETS, |
---|
2954 | + encrypted_private_key_offset, |
---|
2955 | + blockhashes_offset, |
---|
2956 | + sharehashes_offset, |
---|
2957 | + signature_offset, |
---|
2958 | + verification_offset, |
---|
2959 | + eof_offset) |
---|
2960 | + self.offsets = {} |
---|
2961 | + self.offsets['enc_privkey'] = encrypted_private_key_offset |
---|
2962 | + self.offsets['block_hash_tree'] = blockhashes_offset |
---|
2963 | + self.offsets['share_hash_chain'] = sharehashes_offset |
---|
2964 | + self.offsets['signature'] = signature_offset |
---|
2965 | + self.offsets['verification_key'] = verification_offset |
---|
2966 | + self.offsets['EOF'] = eof_offset |
---|
2967 | + # Next, we'll add in the salts and share data, |
---|
2968 | + data += sharedata |
---|
2969 | + # the private key, |
---|
2970 | + data += self.encprivkey |
---|
2971 | + # the block hash tree, |
---|
2972 | + data += self.block_hash_tree_s |
---|
2973 | + # the share hash chain, |
---|
2974 | + data += self.share_hash_chain_s |
---|
2975 | + # the signature, |
---|
2976 | + data += self.signature |
---|
2977 | + # and the verification key |
---|
2978 | + data += self.verification_key |
---|
2979 | + return data |
---|
2980 | + |
---|
2981 | + |
---|
2982 | + def write_test_share_to_server(self, |
---|
2983 | + storage_index, |
---|
2984 | + tail_segment=False, |
---|
2985 | + empty=False): |
---|
2986 | + """ |
---|
2987 | + I write a test share to self.ss for the read tests to read. |
---|
2988 | + |
---|
2989 | + If tail_segment=True, then I will write a share that has a |
---|
2990 | + smaller tail segment than other segments. |
---|
2991 | + """ |
---|
2992 | + write = self.ss.remote_slot_testv_and_readv_and_writev |
---|
2993 | + data = self.build_test_mdmf_share(tail_segment, empty) |
---|
2994 | + # Finally, we write the whole thing to the storage server in one |
---|
2995 | + # pass. |
---|
2996 | + testvs = [(0, 1, "eq", "")] |
---|
2997 | + tws = {} |
---|
2998 | + tws[0] = (testvs, [(0, data)], None) |
---|
2999 | + readv = [(0, 1)] |
---|
3000 | + results = write(storage_index, self.secrets, tws, readv) |
---|
3001 | + self.failUnless(results[0]) |
---|
3002 | + |
---|
3003 | + |
---|
3004 | + def build_test_sdmf_share(self, empty=False): |
---|
3005 | + if empty: |
---|
3006 | + sharedata = "" |
---|
3007 | + else: |
---|
3008 | + sharedata = self.segment * 6 |
---|
3009 | + self.sharedata = sharedata |
---|
3010 | + blocksize = len(sharedata) / 3 |
---|
3011 | + block = sharedata[:blocksize] |
---|
3012 | + self.blockdata = block |
---|
3013 | + prefix = struct.pack(">BQ32s16s BBQQ", |
---|
3014 | + 0, # version, |
---|
3015 | + 0, |
---|
3016 | + self.root_hash, |
---|
3017 | + self.salt, |
---|
3018 | + 3, |
---|
3019 | + 10, |
---|
3020 | + len(sharedata), |
---|
3021 | + len(sharedata), |
---|
3022 | + ) |
---|
3023 | + post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ") |
---|
3024 | + signature_offset = post_offset + len(self.verification_key) |
---|
3025 | + sharehashes_offset = signature_offset + len(self.signature) |
---|
3026 | + blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s) |
---|
3027 | + sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s) |
---|
3028 | + encprivkey_offset = sharedata_offset + len(block) |
---|
3029 | + eof_offset = encprivkey_offset + len(self.encprivkey) |
---|
3030 | + offsets = struct.pack(">LLLLQQ", |
---|
3031 | + signature_offset, |
---|
3032 | + sharehashes_offset, |
---|
3033 | + blockhashes_offset, |
---|
3034 | + sharedata_offset, |
---|
3035 | + encprivkey_offset, |
---|
3036 | + eof_offset) |
---|
3037 | + final_share = "".join([prefix, |
---|
3038 | + offsets, |
---|
3039 | + self.verification_key, |
---|
3040 | + self.signature, |
---|
3041 | + self.share_hash_chain_s, |
---|
3042 | + self.block_hash_tree_s, |
---|
3043 | + block, |
---|
3044 | + self.encprivkey]) |
---|
3045 | + self.offsets = {} |
---|
3046 | + self.offsets['signature'] = signature_offset |
---|
3047 | + self.offsets['share_hash_chain'] = sharehashes_offset |
---|
3048 | + self.offsets['block_hash_tree'] = blockhashes_offset |
---|
3049 | + self.offsets['share_data'] = sharedata_offset |
---|
3050 | + self.offsets['enc_privkey'] = encprivkey_offset |
---|
3051 | + self.offsets['EOF'] = eof_offset |
---|
3052 | + return final_share |
---|
3053 | + |
---|
3054 | + |
---|
3055 | + def write_sdmf_share_to_server(self, |
---|
3056 | + storage_index, |
---|
3057 | + empty=False): |
---|
3058 | + # Some tests need SDMF shares to verify that we can still |
---|
3059 | + # read them. This method writes one. |
---|
3060 | + assert self.rref |
---|
3061 | + write = self.ss.remote_slot_testv_and_readv_and_writev |
---|
3062 | + share = self.build_test_sdmf_share(empty) |
---|
3063 | + testvs = [(0, 1, "eq", "")] |
---|
3064 | + tws = {} |
---|
3065 | + tws[0] = (testvs, [(0, share)], None) |
---|
3066 | + readv = [] |
---|
3067 | + results = write(storage_index, self.secrets, tws, readv) |
---|
3068 | + self.failUnless(results[0]) |
---|
3069 | + |
---|
3070 | + |
---|
3071 | + def test_read(self): |
---|
3072 | + self.write_test_share_to_server("si1") |
---|
3073 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3074 | + # Check that every method equals what we expect it to. |
---|
3075 | + d = defer.succeed(None) |
---|
3076 | + def _check_block_and_salt((block, salt)): |
---|
3077 | + self.failUnlessEqual(block, self.block) |
---|
3078 | + self.failUnlessEqual(salt, self.salt) |
---|
3079 | + |
---|
3080 | + for i in xrange(6): |
---|
3081 | + d.addCallback(lambda ignored, i=i: |
---|
3082 | + mr.get_block_and_salt(i)) |
---|
3083 | + d.addCallback(_check_block_and_salt) |
---|
3084 | + |
---|
3085 | + d.addCallback(lambda ignored: |
---|
3086 | + mr.get_encprivkey()) |
---|
3087 | + d.addCallback(lambda encprivkey: |
---|
3088 | + self.failUnlessEqual(self.encprivkey, encprivkey)) |
---|
3089 | + |
---|
3090 | + d.addCallback(lambda ignored: |
---|
3091 | + mr.get_blockhashes()) |
---|
3092 | + d.addCallback(lambda blockhashes: |
---|
3093 | + self.failUnlessEqual(self.block_hash_tree, blockhashes)) |
---|
3094 | + |
---|
3095 | + d.addCallback(lambda ignored: |
---|
3096 | + mr.get_sharehashes()) |
---|
3097 | + d.addCallback(lambda sharehashes: |
---|
3098 | + self.failUnlessEqual(self.share_hash_chain, sharehashes)) |
---|
3099 | + |
---|
3100 | + d.addCallback(lambda ignored: |
---|
3101 | + mr.get_signature()) |
---|
3102 | + d.addCallback(lambda signature: |
---|
3103 | + self.failUnlessEqual(signature, self.signature)) |
---|
3104 | + |
---|
3105 | + d.addCallback(lambda ignored: |
---|
3106 | + mr.get_verification_key()) |
---|
3107 | + d.addCallback(lambda verification_key: |
---|
3108 | + self.failUnlessEqual(verification_key, self.verification_key)) |
---|
3109 | + |
---|
3110 | + d.addCallback(lambda ignored: |
---|
3111 | + mr.get_seqnum()) |
---|
3112 | + d.addCallback(lambda seqnum: |
---|
3113 | + self.failUnlessEqual(seqnum, 0)) |
---|
3114 | + |
---|
3115 | + d.addCallback(lambda ignored: |
---|
3116 | + mr.get_root_hash()) |
---|
3117 | + d.addCallback(lambda root_hash: |
---|
3118 | + self.failUnlessEqual(self.root_hash, root_hash)) |
---|
3119 | + |
---|
3120 | + d.addCallback(lambda ignored: |
---|
3121 | + mr.get_seqnum()) |
---|
3122 | + d.addCallback(lambda seqnum: |
---|
3123 | + self.failUnlessEqual(0, seqnum)) |
---|
3124 | + |
---|
3125 | + d.addCallback(lambda ignored: |
---|
3126 | + mr.get_encoding_parameters()) |
---|
3127 | + def _check_encoding_parameters((k, n, segsize, datalen)): |
---|
3128 | + self.failUnlessEqual(k, 3) |
---|
3129 | + self.failUnlessEqual(n, 10) |
---|
3130 | + self.failUnlessEqual(segsize, 6) |
---|
3131 | + self.failUnlessEqual(datalen, 36) |
---|
3132 | + d.addCallback(_check_encoding_parameters) |
---|
3133 | + |
---|
3134 | + d.addCallback(lambda ignored: |
---|
3135 | + mr.get_checkstring()) |
---|
3136 | + d.addCallback(lambda checkstring: |
---|
3137 | + self.failUnlessEqual(checkstring, self.checkstring)) |
---|
3138 | + return d |
---|
3139 | + |
---|
3140 | + |
---|
3141 | + def test_read_with_different_tail_segment_size(self): |
---|
3142 | + self.write_test_share_to_server("si1", tail_segment=True) |
---|
3143 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3144 | + d = mr.get_block_and_salt(5) |
---|
3145 | + def _check_tail_segment(results): |
---|
3146 | + block, salt = results |
---|
3147 | + self.failUnlessEqual(len(block), 1) |
---|
3148 | + self.failUnlessEqual(block, "a") |
---|
3149 | + d.addCallback(_check_tail_segment) |
---|
3150 | + return d |
---|
3151 | + |
---|
3152 | + |
---|
3153 | + def test_get_block_with_invalid_segnum(self): |
---|
3154 | + self.write_test_share_to_server("si1") |
---|
3155 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3156 | + d = defer.succeed(None) |
---|
3157 | + d.addCallback(lambda ignored: |
---|
3158 | + self.shouldFail(LayoutInvalid, "test invalid segnum", |
---|
3159 | + None, |
---|
3160 | + mr.get_block_and_salt, 7)) |
---|
3161 | + return d |
---|
3162 | + |
---|
3163 | + |
---|
3164 | + def test_get_encoding_parameters_first(self): |
---|
3165 | + self.write_test_share_to_server("si1") |
---|
3166 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3167 | + d = mr.get_encoding_parameters() |
---|
3168 | + def _check_encoding_parameters((k, n, segment_size, datalen)): |
---|
3169 | + self.failUnlessEqual(k, 3) |
---|
3170 | + self.failUnlessEqual(n, 10) |
---|
3171 | + self.failUnlessEqual(segment_size, 6) |
---|
3172 | + self.failUnlessEqual(datalen, 36) |
---|
3173 | + d.addCallback(_check_encoding_parameters) |
---|
3174 | + return d |
---|
3175 | + |
---|
3176 | + |
---|
3177 | + def test_get_seqnum_first(self): |
---|
3178 | + self.write_test_share_to_server("si1") |
---|
3179 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3180 | + d = mr.get_seqnum() |
---|
3181 | + d.addCallback(lambda seqnum: |
---|
3182 | + self.failUnlessEqual(seqnum, 0)) |
---|
3183 | + return d |
---|
3184 | + |
---|
3185 | + |
---|
3186 | + def test_get_root_hash_first(self): |
---|
3187 | + self.write_test_share_to_server("si1") |
---|
3188 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3189 | + d = mr.get_root_hash() |
---|
3190 | + d.addCallback(lambda root_hash: |
---|
3191 | + self.failUnlessEqual(root_hash, self.root_hash)) |
---|
3192 | + return d |
---|
3193 | + |
---|
3194 | + |
---|
3195 | + def test_get_checkstring_first(self): |
---|
3196 | + self.write_test_share_to_server("si1") |
---|
3197 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3198 | + d = mr.get_checkstring() |
---|
3199 | + d.addCallback(lambda checkstring: |
---|
3200 | + self.failUnlessEqual(checkstring, self.checkstring)) |
---|
3201 | + return d |
---|
3202 | + |
---|
3203 | + |
---|
3204 | + def test_write_read_vectors(self): |
---|
3205 | + # When writing for us, the storage server will return to us a |
---|
3206 | + # read vector, along with its result. If a write fails because |
---|
3207 | + # the test vectors failed, this read vector can help us to |
---|
3208 | + # diagnose the problem. This test ensures that the read vector |
---|
3209 | + # is working appropriately. |
---|
3210 | + mw = self._make_new_mw("si1", 0) |
---|
3211 | + |
---|
3212 | + for i in xrange(6): |
---|
3213 | + mw.put_block(self.block, i, self.salt) |
---|
3214 | + mw.put_encprivkey(self.encprivkey) |
---|
3215 | + mw.put_blockhashes(self.block_hash_tree) |
---|
3216 | + mw.put_sharehashes(self.share_hash_chain) |
---|
3217 | + mw.put_root_hash(self.root_hash) |
---|
3218 | + mw.put_signature(self.signature) |
---|
3219 | + mw.put_verification_key(self.verification_key) |
---|
3220 | + d = mw.finish_publishing() |
---|
3221 | + def _then(results): |
---|
3222 | + self.failUnlessEqual(len(results), 2) |
---|
3223 | + result, readv = results |
---|
3224 | + self.failUnless(result) |
---|
3225 | + self.failIf(readv) |
---|
3226 | + self.old_checkstring = mw.get_checkstring() |
---|
3227 | + mw.set_checkstring("") |
---|
3228 | + d.addCallback(_then) |
---|
3229 | + d.addCallback(lambda ignored: |
---|
3230 | + mw.finish_publishing()) |
---|
3231 | + def _then_again(results): |
---|
3232 | + self.failUnlessEqual(len(results), 2) |
---|
3233 | + result, readvs = results |
---|
3234 | + self.failIf(result) |
---|
3235 | + self.failUnlessIn(0, readvs) |
---|
3236 | + readv = readvs[0][0] |
---|
3237 | + self.failUnlessEqual(readv, self.old_checkstring) |
---|
3238 | + d.addCallback(_then_again) |
---|
3239 | + # The checkstring remains the same for the rest of the process. |
---|
3240 | + return d |
---|
3241 | + |
---|
3242 | + |
---|
3243 | + def test_blockhashes_after_share_hash_chain(self): |
---|
3244 | + mw = self._make_new_mw("si1", 0) |
---|
3245 | + d = defer.succeed(None) |
---|
3246 | + # Put everything up to and including the share hash chain |
---|
3247 | + for i in xrange(6): |
---|
3248 | + d.addCallback(lambda ignored, i=i: |
---|
3249 | + mw.put_block(self.block, i, self.salt)) |
---|
3250 | + d.addCallback(lambda ignored: |
---|
3251 | + mw.put_encprivkey(self.encprivkey)) |
---|
3252 | + d.addCallback(lambda ignored: |
---|
3253 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3254 | + d.addCallback(lambda ignored: |
---|
3255 | + mw.put_sharehashes(self.share_hash_chain)) |
---|
3256 | + |
---|
3257 | + # Now try to put the block hash tree again. |
---|
3258 | + d.addCallback(lambda ignored: |
---|
3259 | + self.shouldFail(LayoutInvalid, "test repeat blockhashes", |
---|
3260 | + None, |
---|
3261 | + mw.put_blockhashes, self.block_hash_tree)) |
---|
3262 | + return d |
---|
3263 | + |
---|
3264 | + |
---|
3265 | + def test_encprivkey_after_blockhashes(self): |
---|
3266 | + mw = self._make_new_mw("si1", 0) |
---|
3267 | + d = defer.succeed(None) |
---|
3268 | + # Put everything up to and including the block hash tree |
---|
3269 | + for i in xrange(6): |
---|
3270 | + d.addCallback(lambda ignored, i=i: |
---|
3271 | + mw.put_block(self.block, i, self.salt)) |
---|
3272 | + d.addCallback(lambda ignored: |
---|
3273 | + mw.put_encprivkey(self.encprivkey)) |
---|
3274 | + d.addCallback(lambda ignored: |
---|
3275 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3276 | + d.addCallback(lambda ignored: |
---|
3277 | + self.shouldFail(LayoutInvalid, "out of order private key", |
---|
3278 | + None, |
---|
3279 | + mw.put_encprivkey, self.encprivkey)) |
---|
3280 | + return d |
---|
3281 | + |
---|
3282 | + |
---|
3283 | + def test_share_hash_chain_after_signature(self): |
---|
3284 | + mw = self._make_new_mw("si1", 0) |
---|
3285 | + d = defer.succeed(None) |
---|
3286 | + # Put everything up to and including the signature |
---|
3287 | + for i in xrange(6): |
---|
3288 | + d.addCallback(lambda ignored, i=i: |
---|
3289 | + mw.put_block(self.block, i, self.salt)) |
---|
3290 | + d.addCallback(lambda ignored: |
---|
3291 | + mw.put_encprivkey(self.encprivkey)) |
---|
3292 | + d.addCallback(lambda ignored: |
---|
3293 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3294 | + d.addCallback(lambda ignored: |
---|
3295 | + mw.put_sharehashes(self.share_hash_chain)) |
---|
3296 | + d.addCallback(lambda ignored: |
---|
3297 | + mw.put_root_hash(self.root_hash)) |
---|
3298 | + d.addCallback(lambda ignored: |
---|
3299 | + mw.put_signature(self.signature)) |
---|
3300 | + # Now try to put the share hash chain again. This should fail |
---|
3301 | + d.addCallback(lambda ignored: |
---|
3302 | + self.shouldFail(LayoutInvalid, "out of order share hash chain", |
---|
3303 | + None, |
---|
3304 | + mw.put_sharehashes, self.share_hash_chain)) |
---|
3305 | + return d |
---|
3306 | + |
---|
3307 | + |
---|
3308 | + def test_signature_after_verification_key(self): |
---|
3309 | + mw = self._make_new_mw("si1", 0) |
---|
3310 | + d = defer.succeed(None) |
---|
3311 | + # Put everything up to and including the verification key. |
---|
3312 | + for i in xrange(6): |
---|
3313 | + d.addCallback(lambda ignored, i=i: |
---|
3314 | + mw.put_block(self.block, i, self.salt)) |
---|
3315 | + d.addCallback(lambda ignored: |
---|
3316 | + mw.put_encprivkey(self.encprivkey)) |
---|
3317 | + d.addCallback(lambda ignored: |
---|
3318 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3319 | + d.addCallback(lambda ignored: |
---|
3320 | + mw.put_sharehashes(self.share_hash_chain)) |
---|
3321 | + d.addCallback(lambda ignored: |
---|
3322 | + mw.put_root_hash(self.root_hash)) |
---|
3323 | + d.addCallback(lambda ignored: |
---|
3324 | + mw.put_signature(self.signature)) |
---|
3325 | + d.addCallback(lambda ignored: |
---|
3326 | + mw.put_verification_key(self.verification_key)) |
---|
3327 | + # Now try to put the signature again. This should fail |
---|
3328 | + d.addCallback(lambda ignored: |
---|
3329 | + self.shouldFail(LayoutInvalid, "signature after verification", |
---|
3330 | + None, |
---|
3331 | + mw.put_signature, self.signature)) |
---|
3332 | + return d |
---|
3333 | + |
---|
3334 | + |
---|
3335 | + def test_uncoordinated_write(self): |
---|
3336 | + # Make two mutable writers, both pointing to the same storage |
---|
3337 | + # server, both at the same storage index, and try writing to the |
---|
3338 | + # same share. |
---|
3339 | + mw1 = self._make_new_mw("si1", 0) |
---|
3340 | + mw2 = self._make_new_mw("si1", 0) |
---|
3341 | + |
---|
3342 | + def _check_success(results): |
---|
3343 | + result, readvs = results |
---|
3344 | + self.failUnless(result) |
---|
3345 | + |
---|
3346 | + def _check_failure(results): |
---|
3347 | + result, readvs = results |
---|
3348 | + self.failIf(result) |
---|
3349 | + |
---|
3350 | + def _write_share(mw): |
---|
3351 | + for i in xrange(6): |
---|
3352 | + mw.put_block(self.block, i, self.salt) |
---|
3353 | + mw.put_encprivkey(self.encprivkey) |
---|
3354 | + mw.put_blockhashes(self.block_hash_tree) |
---|
3355 | + mw.put_sharehashes(self.share_hash_chain) |
---|
3356 | + mw.put_root_hash(self.root_hash) |
---|
3357 | + mw.put_signature(self.signature) |
---|
3358 | + mw.put_verification_key(self.verification_key) |
---|
3359 | + return mw.finish_publishing() |
---|
3360 | + d = _write_share(mw1) |
---|
3361 | + d.addCallback(_check_success) |
---|
3362 | + d.addCallback(lambda ignored: |
---|
3363 | + _write_share(mw2)) |
---|
3364 | + d.addCallback(_check_failure) |
---|
3365 | + return d |
---|
3366 | + |
---|
3367 | + |
---|
3368 | + def test_invalid_salt_size(self): |
---|
3369 | + # Salts need to be 16 bytes in size. Writes that attempt to |
---|
3370 | + # write more or less than this should be rejected. |
---|
3371 | + mw = self._make_new_mw("si1", 0) |
---|
3372 | + invalid_salt = "a" * 17 # 17 bytes |
---|
3373 | + another_invalid_salt = "b" * 15 # 15 bytes |
---|
3374 | + d = defer.succeed(None) |
---|
3375 | + d.addCallback(lambda ignored: |
---|
3376 | + self.shouldFail(LayoutInvalid, "salt too big", |
---|
3377 | + None, |
---|
3378 | + mw.put_block, self.block, 0, invalid_salt)) |
---|
3379 | + d.addCallback(lambda ignored: |
---|
3380 | + self.shouldFail(LayoutInvalid, "salt too small", |
---|
3381 | + None, |
---|
3382 | + mw.put_block, self.block, 0, |
---|
3383 | + another_invalid_salt)) |
---|
3384 | + return d |
---|
3385 | + |
---|
3386 | + |
---|
3387 | + def test_write_test_vectors(self): |
---|
3388 | + # If we give the write proxy a bogus test vector at |
---|
3389 | + # any point during the process, it should fail to write when we |
---|
3390 | + # tell it to write. |
---|
3391 | + def _check_failure(results): |
---|
3392 | + self.failUnlessEqual(len(results), 2) |
---|
3393 | + res, d = results |
---|
3394 | + self.failIf(res) |
---|
3395 | + |
---|
3396 | + def _check_success(results): |
---|
3397 | + self.failUnlessEqual(len(results), 2) |
---|
3398 | + res, d = results |
---|
3399 | + self.failUnless(res) |
---|
3400 | + |
---|
3401 | + mw = self._make_new_mw("si1", 0) |
---|
3402 | + mw.set_checkstring("this is a lie") |
---|
3403 | + for i in xrange(6): |
---|
3404 | + mw.put_block(self.block, i, self.salt) |
---|
3405 | + mw.put_encprivkey(self.encprivkey) |
---|
3406 | + mw.put_blockhashes(self.block_hash_tree) |
---|
3407 | + mw.put_sharehashes(self.share_hash_chain) |
---|
3408 | + mw.put_root_hash(self.root_hash) |
---|
3409 | + mw.put_signature(self.signature) |
---|
3410 | + mw.put_verification_key(self.verification_key) |
---|
3411 | + d = mw.finish_publishing() |
---|
3412 | + d.addCallback(_check_failure) |
---|
3413 | + d.addCallback(lambda ignored: |
---|
3414 | + mw.set_checkstring("")) |
---|
3415 | + d.addCallback(lambda ignored: |
---|
3416 | + mw.finish_publishing()) |
---|
3417 | + d.addCallback(_check_success) |
---|
3418 | + return d |
---|
3419 | + |
---|
3420 | + |
---|
3421 | + def serialize_blockhashes(self, blockhashes): |
---|
3422 | + return "".join(blockhashes) |
---|
3423 | + |
---|
3424 | + |
---|
3425 | + def serialize_sharehashes(self, sharehashes): |
---|
3426 | + ret = "".join([struct.pack(">H32s", i, sharehashes[i]) |
---|
3427 | + for i in sorted(sharehashes.keys())]) |
---|
3428 | + return ret |
---|
3429 | + |
---|
3430 | + |
---|
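The `serialize_sharehashes` helper above packs each (index, hash) pair as a big-endian 2-byte index followed by a 32-byte hash, which is where the `(32 + 2) * 6` length in `test_write` comes from. A minimal standalone sketch of that packing, written for Python 3 with `bytes` where the test code uses Python 2 `str`; the sample hashes are made up for illustration:

```python
import struct

def serialize_sharehashes(sharehashes):
    """Pack {share index: 32-byte hash} as records sorted by index."""
    return b"".join(struct.pack(">H32s", i, sharehashes[i])
                    for i in sorted(sharehashes))

# Hypothetical sample data: six 32-byte hashes keyed by index 0..5.
sample_chain = {i: bytes([i]) * 32 for i in range(6)}
serialized = serialize_sharehashes(sample_chain)

# Each record is a 2-byte index plus a 32-byte hash.
entry_size = struct.calcsize(">H32s")
```

With six entries the serialized chain is `6 * entry_size` bytes long, matching the read length the test uses for the share hash chain.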
3431 | + def test_write(self): |
---|
3432 | + # This translates to a file with 6 6-byte segments, and with 2-byte |
---|
3433 | + # blocks. |
---|
3434 | + mw = self._make_new_mw("si1", 0) |
---|
3435 | + # Test writing some blocks. |
---|
3436 | + read = self.ss.remote_slot_readv |
---|
3437 | + expected_sharedata_offset = struct.calcsize(MDMFHEADER) |
---|
3438 | + written_block_size = 2 + len(self.salt) |
---|
3439 | + written_block = self.block + self.salt |
---|
3440 | + for i in xrange(6): |
---|
3441 | + mw.put_block(self.block, i, self.salt) |
---|
3442 | + |
---|
3443 | + mw.put_encprivkey(self.encprivkey) |
---|
3444 | + mw.put_blockhashes(self.block_hash_tree) |
---|
3445 | + mw.put_sharehashes(self.share_hash_chain) |
---|
3446 | + mw.put_root_hash(self.root_hash) |
---|
3447 | + mw.put_signature(self.signature) |
---|
3448 | + mw.put_verification_key(self.verification_key) |
---|
3449 | + d = mw.finish_publishing() |
---|
3450 | + def _check_publish(results): |
---|
3451 | + self.failUnlessEqual(len(results), 2) |
---|
3452 | + result, ign = results |
---|
3453 | + self.failUnless(result, "publish failed") |
---|
3454 | + for i in xrange(6): |
---|
3455 | + self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]), |
---|
3456 | + {0: [written_block]}) |
---|
3457 | + |
---|
3458 | + expected_private_key_offset = expected_sharedata_offset + \ |
---|
3459 | + len(written_block) * 6 |
---|
3460 | + self.failUnlessEqual(len(self.encprivkey), 7) |
---|
3461 | + self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]), |
---|
3462 | + {0: [self.encprivkey]}) |
---|
3463 | + |
---|
3464 | + expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey) |
---|
3465 | + self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6) |
---|
3466 | + self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]), |
---|
3467 | + {0: [self.block_hash_tree_s]}) |
---|
3468 | + |
---|
3469 | + expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s) |
---|
3470 | + self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]), |
---|
3471 | + {0: [self.share_hash_chain_s]}) |
---|
3472 | + |
---|
3473 | + self.failUnlessEqual(read("si1", [0], [(9, 32)]), |
---|
3474 | + {0: [self.root_hash]}) |
---|
3475 | + expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s) |
---|
3476 | + self.failUnlessEqual(len(self.signature), 9) |
---|
3477 | + self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]), |
---|
3478 | + {0: [self.signature]}) |
---|
3479 | + |
---|
3480 | + expected_verification_key_offset = expected_signature_offset + len(self.signature) |
---|
3481 | + self.failUnlessEqual(len(self.verification_key), 6) |
---|
3482 | + self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]), |
---|
3483 | + {0: [self.verification_key]}) |
---|
3484 | + |
---|
3485 | + signable = mw.get_signable() |
---|
3486 | + verno, seq, roothash, k, n, segsize, datalen = \ |
---|
3487 | + struct.unpack(">BQ32sBBQQ", |
---|
3488 | + signable) |
---|
3489 | + self.failUnlessEqual(verno, 1) |
---|
3490 | + self.failUnlessEqual(seq, 0) |
---|
3491 | + self.failUnlessEqual(roothash, self.root_hash) |
---|
3492 | + self.failUnlessEqual(k, 3) |
---|
3493 | + self.failUnlessEqual(n, 10) |
---|
3494 | + self.failUnlessEqual(segsize, 6) |
---|
3495 | + self.failUnlessEqual(datalen, 36) |
---|
3496 | + expected_eof_offset = expected_verification_key_offset + len(self.verification_key) |
---|
3497 | + |
---|
3498 | + # Check the version number to make sure that it is correct. |
---|
3499 | + expected_version_number = struct.pack(">B", 1) |
---|
3500 | + self.failUnlessEqual(read("si1", [0], [(0, 1)]), |
---|
3501 | + {0: [expected_version_number]}) |
---|
3502 | + # Check the sequence number to make sure that it is correct |
---|
3503 | + expected_sequence_number = struct.pack(">Q", 0) |
---|
3504 | + self.failUnlessEqual(read("si1", [0], [(1, 8)]), |
---|
3505 | + {0: [expected_sequence_number]}) |
---|
3506 | + # Check that the encoding parameters (k, N, segment size, data |
---|
3507 | + # length) are what they should be. These are 3, 10, 6, 36 |
---|
3508 | + expected_k = struct.pack(">B", 3) |
---|
3509 | + self.failUnlessEqual(read("si1", [0], [(41, 1)]), |
---|
3510 | + {0: [expected_k]}) |
---|
3511 | + expected_n = struct.pack(">B", 10) |
---|
3512 | + self.failUnlessEqual(read("si1", [0], [(42, 1)]), |
---|
3513 | + {0: [expected_n]}) |
---|
3514 | + expected_segment_size = struct.pack(">Q", 6) |
---|
3515 | + self.failUnlessEqual(read("si1", [0], [(43, 8)]), |
---|
3516 | + {0: [expected_segment_size]}) |
---|
3517 | + expected_data_length = struct.pack(">Q", 36) |
---|
3518 | + self.failUnlessEqual(read("si1", [0], [(51, 8)]), |
---|
3519 | + {0: [expected_data_length]}) |
---|
3520 | + expected_offset = struct.pack(">Q", expected_private_key_offset) |
---|
3521 | + self.failUnlessEqual(read("si1", [0], [(59, 8)]), |
---|
3522 | + {0: [expected_offset]}) |
---|
3523 | + expected_offset = struct.pack(">Q", expected_block_hash_offset) |
---|
3524 | + self.failUnlessEqual(read("si1", [0], [(67, 8)]), |
---|
3525 | + {0: [expected_offset]}) |
---|
3526 | + expected_offset = struct.pack(">Q", expected_share_hash_offset) |
---|
3527 | + self.failUnlessEqual(read("si1", [0], [(75, 8)]), |
---|
3528 | + {0: [expected_offset]}) |
---|
3529 | + expected_offset = struct.pack(">Q", expected_signature_offset) |
---|
3530 | + self.failUnlessEqual(read("si1", [0], [(83, 8)]), |
---|
3531 | + {0: [expected_offset]}) |
---|
3532 | + expected_offset = struct.pack(">Q", expected_verification_key_offset) |
---|
3533 | + self.failUnlessEqual(read("si1", [0], [(91, 8)]), |
---|
3534 | + {0: [expected_offset]}) |
---|
3535 | + expected_offset = struct.pack(">Q", expected_eof_offset) |
---|
3536 | + self.failUnlessEqual(read("si1", [0], [(99, 8)]), |
---|
3537 | + {0: [expected_offset]}) |
---|
3538 | + d.addCallback(_check_publish) |
---|
3539 | + return d |
---|
3540 | + |
---|
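The hard-coded read positions in `test_write` (9, 41, 42, 43, 51, then 59 through 99) follow from the `">BQ32sBBQQ"` format that the test unpacks the signable prefix with, followed by six 8-byte offset entries. A rough Python 3 sketch of that arithmetic; it does not reproduce the real `MDMFHEADER` definition, which is not shown here, and the offset-table names are labels chosen for this example:

```python
import struct

SIGNABLE = ">BQ32sBBQQ"
FIELDS = [("version", "B"), ("seqnum", "Q"), ("root_hash", "32s"),
          ("k", "B"), ("n", "B"), ("segsize", "Q"), ("datalen", "Q")]

def field_offsets(fields):
    """Return ({field name: byte offset}, total size) for big-endian fields."""
    offsets = {}
    position = 0
    for name, code in fields:
        offsets[name] = position
        position += struct.calcsize(">" + code)
    return offsets, position

header_offsets, table_start = field_offsets(FIELDS)

# Assumed order of the offset table, inferred from the reads in the test.
OFFSET_TABLE = ["enc_privkey", "block_hash_tree", "share_hash_chain",
                "signature", "verification_key", "eof"]
table_offsets = dict((name, table_start + 8 * i)
                     for i, name in enumerate(OFFSET_TABLE))
header_end = table_start + 8 * len(OFFSET_TABLE)
```

Under these assumptions the offset table starts where the signable prefix ends, and `header_end` is the number of bytes a reader needs in order to have both the encoding parameters and the full offset table.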
3541 | + def _make_new_mw(self, si, share, datalength=36): |
---|
3542 | + # This is a file of size 36 bytes. Since it has a segment |
---|
3543 | + # size of 6, we know that it has 6 byte segments, which will |
---|
3544 | + # be split into blocks of 2 bytes because our FEC k |
---|
3545 | + # parameter is 3. |
---|
3546 | + mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10, |
---|
3547 | + 6, datalength) |
---|
3548 | + return mw |
---|
3549 | + |
---|
3550 | + |
---|
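The comment in `_make_new_mw` describes how the data length, segment size and FEC `k` parameter determine the number of segments and the block size. A small Python 3 sketch of that arithmetic; `share_layout` is a hypothetical helper, not part of the writer, and it assumes a non-empty file whose tail blocks are simply the tail segment divided by `k` and rounded up:

```python
def div_ceil(n, d):
    """Integer division rounding up."""
    return (n + d - 1) // d

def share_layout(datalength, segsize, k):
    """Return (number of segments, block size, tail block size)."""
    num_segments = div_ceil(datalength, segsize)
    # The last segment holds whatever is left after the full segments.
    tail_segsize = datalength - (num_segments - 1) * segsize
    block_size = div_ceil(segsize, k)
    tail_block_size = div_ceil(tail_segsize, k)
    return num_segments, block_size, tail_block_size

full = share_layout(36, 6, 3)      # the default writer in these tests
with_tail = share_layout(33, 6, 3) # the writer used for the tail-segment test
```

For the default 36-byte file every segment, including the last, yields 2-byte blocks; for the 33-byte file the last segment is 3 bytes, so its blocks are 1 byte each.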
3551 | + def test_write_rejected_with_too_many_blocks(self): |
---|
3552 | + mw = self._make_new_mw("si0", 0) |
---|
3553 | + |
---|
3554 | + # Try writing too many blocks. We should not be able to write |
---|
3555 | + # more than 6 |
---|
3556 | + # blocks into each share. |
---|
3557 | + d = defer.succeed(None) |
---|
3558 | + for i in xrange(6): |
---|
3559 | + d.addCallback(lambda ignored, i=i: |
---|
3560 | + mw.put_block(self.block, i, self.salt)) |
---|
3561 | + d.addCallback(lambda ignored: |
---|
3562 | + self.shouldFail(LayoutInvalid, "too many blocks", |
---|
3563 | + None, |
---|
3564 | + mw.put_block, self.block, 7, self.salt)) |
---|
3565 | + return d |
---|
3566 | + |
---|
3567 | + |
---|
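The loops in these tests write `lambda ignored, i=i:` rather than `lambda ignored:`. The default argument captures the value of `i` at the moment the lambda is created; without it, every callback would see the loop variable's final value when it eventually runs. A minimal Python 3 illustration that does not depend on Twisted:

```python
# Closures look up `i` when called, so all three see the final value.
late_bound = [lambda: i for i in range(3)]

# A default argument is evaluated when the lambda is defined.
early_bound = [lambda i=i: i for i in range(3)]

late_results = [f() for f in late_bound]
early_results = [f() for f in early_bound]
```

Because Deferred callbacks run after the `for` loop has finished building the chain, the `i=i` form is what makes each `put_block` call receive its own segment number.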
3568 | + def test_write_rejected_with_invalid_salt(self): |
---|
3569 | + # Try writing an invalid salt. Salts are 16 bytes -- any more or |
---|
3570 | + # less should cause an error. |
---|
3571 | + mw = self._make_new_mw("si1", 0) |
---|
3572 | + bad_salt = "a" * 17 # 17 bytes |
---|
3573 | + d = defer.succeed(None) |
---|
3574 | + d.addCallback(lambda ignored: |
---|
3575 | + self.shouldFail(LayoutInvalid, "test_invalid_salt", |
---|
3576 | + None, mw.put_block, self.block, 7, bad_salt)) |
---|
3577 | + return d |
---|
3578 | + |
---|
3579 | + |
---|
3580 | + def test_write_rejected_with_invalid_root_hash(self): |
---|
3581 | + # Try writing an invalid root hash. This should be SHA256d, and |
---|
3582 | + # 32 bytes long as a result. |
---|
3583 | + mw = self._make_new_mw("si2", 0) |
---|
3584 | + # 17 bytes != 32 bytes |
---|
3585 | + invalid_root_hash = "a" * 17 |
---|
3586 | + d = defer.succeed(None) |
---|
3587 | + # Before this test can work, we need to put some blocks + salts, |
---|
3588 | + # a block hash tree, and a share hash tree. Otherwise, we'll see |
---|
3589 | + # failures that match what we are looking for, but are caused by |
---|
3590 | + # the constraints imposed on operation ordering. |
---|
3591 | + for i in xrange(6): |
---|
3592 | + d.addCallback(lambda ignored, i=i: |
---|
3593 | + mw.put_block(self.block, i, self.salt)) |
---|
3594 | + d.addCallback(lambda ignored: |
---|
3595 | + mw.put_encprivkey(self.encprivkey)) |
---|
3596 | + d.addCallback(lambda ignored: |
---|
3597 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3598 | + d.addCallback(lambda ignored: |
---|
3599 | + mw.put_sharehashes(self.share_hash_chain)) |
---|
3600 | + d.addCallback(lambda ignored: |
---|
3601 | + self.shouldFail(LayoutInvalid, "invalid root hash", |
---|
3602 | + None, mw.put_root_hash, invalid_root_hash)) |
---|
3603 | + return d |
---|
3604 | + |
---|
3605 | + |
---|
3606 | + def test_write_rejected_with_invalid_blocksize(self): |
---|
3607 | + # The blocksize implied by the writer that we get from |
---|
3608 | + # _make_new_mw is 2 bytes -- any more or any less than this |
---|
3609 | + # should be cause for failure, unless it is the tail segment, in |
---|
3610 | + # which case it may not be failure. |
---|
3611 | + invalid_block = "a" |
---|
3612 | + mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with |
---|
3613 | + # one byte blocks |
---|
3614 | + # 1 byte != 2 bytes |
---|
3615 | + d = defer.succeed(None) |
---|
3616 | + d.addCallback(lambda ignored, invalid_block=invalid_block: |
---|
3617 | + self.shouldFail(LayoutInvalid, "test blocksize too small", |
---|
3618 | + None, mw.put_block, invalid_block, 0, |
---|
3619 | + self.salt)) |
---|
3620 | + invalid_block = invalid_block * 3 |
---|
3621 | + # 3 bytes != 2 bytes |
---|
3622 | + d.addCallback(lambda ignored: |
---|
3623 | + self.shouldFail(LayoutInvalid, "test blocksize too large", |
---|
3624 | + None, |
---|
3625 | + mw.put_block, invalid_block, 0, self.salt)) |
---|
3626 | + for i in xrange(5): |
---|
3627 | + d.addCallback(lambda ignored, i=i: |
---|
3628 | + mw.put_block(self.block, i, self.salt)) |
---|
3629 | + # Try to put an invalid tail segment |
---|
3630 | + d.addCallback(lambda ignored: |
---|
3631 | + self.shouldFail(LayoutInvalid, "test invalid tail segment", |
---|
3632 | + None, |
---|
3633 | + mw.put_block, self.block, 5, self.salt)) |
---|
3634 | + valid_block = "a" |
---|
3635 | + d.addCallback(lambda ignored: |
---|
3636 | + mw.put_block(valid_block, 5, self.salt)) |
---|
3637 | + return d |
---|
3638 | + |
---|
3639 | + |
---|
3640 | + def test_write_enforces_order_constraints(self): |
---|
3641 | + # We require that the MDMFSlotWriteProxy be interacted with in a |
---|
3642 | + # specific way. |
---|
3643 | + # That way is: |
---|
3644 | + # 0: __init__ |
---|
3645 | + # 1: write blocks and salts |
---|
3646 | + # 2: Write the encrypted private key |
---|
3647 | + # 3: Write the block hashes |
---|
3648 | + # 4: Write the share hashes |
---|
3649 | + # 5: Write the root hash and salt hash |
---|
3650 | + # 6: Write the signature and verification key |
---|
3651 | + # 7: Write the file. |
---|
3652 | + # |
---|
3653 | + # Some of these can be performed out-of-order, and some can't. |
---|
3654 | + # The dependencies that I want to test here are: |
---|
3655 | + # - Private key before block hashes |
---|
3656 | + # - share hashes and block hashes before root hash |
---|
3657 | + # - root hash before signature |
---|
3658 | + # - signature before verification key |
---|
3659 | + mw0 = self._make_new_mw("si0", 0) |
---|
3660 | + # Write some shares |
---|
3661 | + d = defer.succeed(None) |
---|
3662 | + for i in xrange(6): |
---|
3663 | + d.addCallback(lambda ignored, i=i: |
---|
3664 | + mw0.put_block(self.block, i, self.salt)) |
---|
3665 | + # Try to write the block hashes before writing the encrypted |
---|
3666 | + # private key |
---|
3667 | + d.addCallback(lambda ignored: |
---|
3668 | + self.shouldFail(LayoutInvalid, "block hashes before key", |
---|
3669 | + None, mw0.put_blockhashes, |
---|
3670 | + self.block_hash_tree)) |
---|
3671 | + |
---|
3672 | + # Write the private key. |
---|
3673 | + d.addCallback(lambda ignored: |
---|
3674 | + mw0.put_encprivkey(self.encprivkey)) |
---|
3675 | + |
---|
3676 | + |
---|
3677 | + # Try to write the share hash chain without writing the block |
---|
3678 | + # hash tree |
---|
3679 | + d.addCallback(lambda ignored: |
---|
3680 | + self.shouldFail(LayoutInvalid, "share hash chain before " |
---|
3681 | + "block hash tree", |
---|
3682 | + None, |
---|
3683 | + mw0.put_sharehashes, self.share_hash_chain)) |
---|
3684 | + |
---|
3685 | + # Try to write the root hash without writing either the |
---|
3686 | + # block hashes or the share hashes |
---|
3687 | + d.addCallback(lambda ignored: |
---|
3688 | + self.shouldFail(LayoutInvalid, "root hash before share hashes", |
---|
3689 | + None, |
---|
3690 | + mw0.put_root_hash, self.root_hash)) |
---|
3691 | + |
---|
3692 | + # Now write the block hashes and try again |
---|
3693 | + d.addCallback(lambda ignored: |
---|
3694 | + mw0.put_blockhashes(self.block_hash_tree)) |
---|
3695 | + |
---|
3696 | + d.addCallback(lambda ignored: |
---|
3697 | + self.shouldFail(LayoutInvalid, "root hash before share hashes", |
---|
3698 | + None, mw0.put_root_hash, self.root_hash)) |
---|
3699 | + |
---|
3700 | + # We haven't yet put the root hash on the share, so we shouldn't |
---|
3701 | + # be able to sign it. |
---|
3702 | + d.addCallback(lambda ignored: |
---|
3703 | + self.shouldFail(LayoutInvalid, "signature before root hash", |
---|
3704 | + None, mw0.put_signature, self.signature)) |
---|
3705 | + |
---|
3706 | + d.addCallback(lambda ignored: |
---|
3707 | + self.failUnlessRaises(LayoutInvalid, mw0.get_signable)) |
---|
3708 | + |
---|
3709 | + # ..and, since that fails, we also shouldn't be able to put the |
---|
3710 | + # verification key. |
---|
3711 | + d.addCallback(lambda ignored: |
---|
3712 | + self.shouldFail(LayoutInvalid, "key before signature", |
---|
3713 | + None, mw0.put_verification_key, |
---|
3714 | + self.verification_key)) |
---|
3715 | + |
---|
3716 | + # Now write the share hashes. |
---|
3717 | + d.addCallback(lambda ignored: |
---|
3718 | + mw0.put_sharehashes(self.share_hash_chain)) |
---|
3719 | + # We should be able to write the root hash now too |
---|
3720 | + d.addCallback(lambda ignored: |
---|
3721 | + mw0.put_root_hash(self.root_hash)) |
---|
3722 | + |
---|
3723 | + # We should still be unable to put the verification key |
---|
3724 | + d.addCallback(lambda ignored: |
---|
3725 | + self.shouldFail(LayoutInvalid, "key before signature", |
---|
3726 | + None, mw0.put_verification_key, |
---|
3727 | + self.verification_key)) |
---|
3728 | + |
---|
3729 | + d.addCallback(lambda ignored: |
---|
3730 | + mw0.put_signature(self.signature)) |
---|
3731 | + |
---|
3732 | + # We shouldn't be able to write the offsets to the remote server |
---|
3733 | + # until the offset table is finished; IOW, until we have written |
---|
3734 | + # the verification key. |
---|
3735 | + d.addCallback(lambda ignored: |
---|
3736 | + self.shouldFail(LayoutInvalid, "offsets before verification key", |
---|
3737 | + None, |
---|
3738 | + mw0.finish_publishing)) |
---|
3739 | + |
---|
3740 | + d.addCallback(lambda ignored: |
---|
3741 | + mw0.put_verification_key(self.verification_key)) |
---|
3742 | + return d |
---|
3743 | + |
---|
3744 | + |
---|
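The ordering rules exercised by `test_write_enforces_order_constraints` can be pictured as a prerequisite table. The following is a toy Python 3 sketch, not the `MDMFSlotWriteProxy` implementation: `OrderCheckingWriter`, `PREREQUISITES` and the local `LayoutInvalid` class are hypothetical stand-ins, and the table only encodes the dependencies this test checks.

```python
class LayoutInvalid(Exception):
    """Local stand-in for the exception the tests expect."""

# Each step lists the steps that must already have been performed.
PREREQUISITES = {
    "encprivkey": [],
    "blockhashes": ["encprivkey"],
    "sharehashes": ["blockhashes"],
    "root_hash": ["blockhashes", "sharehashes"],
    "signature": ["root_hash"],
    "verification_key": ["signature"],
    "finish_publishing": ["verification_key"],
}

class OrderCheckingWriter(object):
    def __init__(self):
        self.done = set()

    def put(self, step):
        """Record a step, refusing it if a prerequisite is missing."""
        missing = [p for p in PREREQUISITES[step] if p not in self.done]
        if missing:
            raise LayoutInvalid("%s requires %s first"
                                % (step, ", ".join(missing)))
        self.done.add(step)

def is_rejected(writer, step):
    """Return True if the writer refuses the step in its current state."""
    try:
        writer.put(step)
    except LayoutInvalid:
        return True
    return False
```

A rejected step leaves the writer's state unchanged, so the caller can supply the missing piece and retry, which is the pattern the test follows.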
3745 | + def test_end_to_end(self): |
---|
3746 | + mw = self._make_new_mw("si1", 0) |
---|
3747 | + # Write a share using the mutable writer, and make sure that the |
---|
3748 | + # reader knows how to read everything back to us. |
---|
3749 | + d = defer.succeed(None) |
---|
3750 | + for i in xrange(6): |
---|
3751 | + d.addCallback(lambda ignored, i=i: |
---|
3752 | + mw.put_block(self.block, i, self.salt)) |
---|
3753 | + d.addCallback(lambda ignored: |
---|
3754 | + mw.put_encprivkey(self.encprivkey)) |
---|
3755 | + d.addCallback(lambda ignored: |
---|
3756 | + mw.put_blockhashes(self.block_hash_tree)) |
---|
3757 | + d.addCallback(lambda ignored: |
---|
3758 | + mw.put_sharehashes(self.share_hash_chain)) |
---|
3759 | + d.addCallback(lambda ignored: |
---|
3760 | + mw.put_root_hash(self.root_hash)) |
---|
3761 | + d.addCallback(lambda ignored: |
---|
3762 | + mw.put_signature(self.signature)) |
---|
3763 | + d.addCallback(lambda ignored: |
---|
3764 | + mw.put_verification_key(self.verification_key)) |
---|
3765 | + d.addCallback(lambda ignored: |
---|
3766 | + mw.finish_publishing()) |
---|
3767 | + |
---|
3768 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3769 | + def _check_block_and_salt((block, salt)): |
---|
3770 | + self.failUnlessEqual(block, self.block) |
---|
3771 | + self.failUnlessEqual(salt, self.salt) |
---|
3772 | + |
---|
3773 | + for i in xrange(6): |
---|
3774 | + d.addCallback(lambda ignored, i=i: |
---|
3775 | + mr.get_block_and_salt(i)) |
---|
3776 | + d.addCallback(_check_block_and_salt) |
---|
3777 | + |
---|
3778 | + d.addCallback(lambda ignored: |
---|
3779 | + mr.get_encprivkey()) |
---|
3780 | + d.addCallback(lambda encprivkey: |
---|
3781 | + self.failUnlessEqual(self.encprivkey, encprivkey)) |
---|
3782 | + |
---|
3783 | + d.addCallback(lambda ignored: |
---|
3784 | + mr.get_blockhashes()) |
---|
3785 | + d.addCallback(lambda blockhashes: |
---|
3786 | + self.failUnlessEqual(self.block_hash_tree, blockhashes)) |
---|
3787 | + |
---|
3788 | + d.addCallback(lambda ignored: |
---|
3789 | + mr.get_sharehashes()) |
---|
3790 | + d.addCallback(lambda sharehashes: |
---|
3791 | + self.failUnlessEqual(self.share_hash_chain, sharehashes)) |
---|
3792 | + |
---|
3793 | + d.addCallback(lambda ignored: |
---|
3794 | + mr.get_signature()) |
---|
3795 | + d.addCallback(lambda signature: |
---|
3796 | + self.failUnlessEqual(signature, self.signature)) |
---|
3797 | + |
---|
3798 | + d.addCallback(lambda ignored: |
---|
3799 | + mr.get_verification_key()) |
---|
3800 | + d.addCallback(lambda verification_key: |
---|
3801 | + self.failUnlessEqual(verification_key, self.verification_key)) |
---|
3802 | + |
---|
3803 | + d.addCallback(lambda ignored: |
---|
3804 | + mr.get_seqnum()) |
---|
3805 | + d.addCallback(lambda seqnum: |
---|
3806 | + self.failUnlessEqual(seqnum, 0)) |
---|
3807 | + |
---|
3808 | + d.addCallback(lambda ignored: |
---|
3809 | + mr.get_root_hash()) |
---|
3810 | + d.addCallback(lambda root_hash: |
---|
3811 | + self.failUnlessEqual(self.root_hash, root_hash)) |
---|
3812 | + |
---|
3813 | + d.addCallback(lambda ignored: |
---|
3814 | + mr.get_encoding_parameters()) |
---|
3815 | + def _check_encoding_parameters((k, n, segsize, datalen)): |
---|
3816 | + self.failUnlessEqual(k, 3) |
---|
3817 | + self.failUnlessEqual(n, 10) |
---|
3818 | + self.failUnlessEqual(segsize, 6) |
---|
3819 | + self.failUnlessEqual(datalen, 36) |
---|
3820 | + d.addCallback(_check_encoding_parameters) |
---|
3821 | + |
---|
3822 | + d.addCallback(lambda ignored: |
---|
3823 | + mr.get_checkstring()) |
---|
3824 | + d.addCallback(lambda checkstring: |
---|
3825 | + self.failUnlessEqual(checkstring, mw.get_checkstring())) |
---|
3826 | + return d |
---|
3827 | + |
---|
3828 | + |
---|
3829 | + def test_is_sdmf(self): |
---|
3830 | + # The MDMFSlotReadProxy should also know how to read SDMF files, |
---|
3831 | + # since it will encounter them on the grid. Callers use the |
---|
3832 | + # is_sdmf method to test this. |
---|
3833 | + self.write_sdmf_share_to_server("si1") |
---|
3834 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3835 | + d = mr.is_sdmf() |
---|
3836 | + d.addCallback(lambda issdmf: |
---|
3837 | + self.failUnless(issdmf)) |
---|
3838 | + return d |
---|
3839 | + |
---|
3840 | + |
---|
3841 | + def test_reads_sdmf(self): |
---|
3842 | + # The slot read proxy should, naturally, know how to tell us |
---|
3843 | + # about data in the SDMF format |
---|
3844 | + self.write_sdmf_share_to_server("si1") |
---|
3845 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3846 | + d = defer.succeed(None) |
---|
3847 | + d.addCallback(lambda ignored: |
---|
3848 | + mr.is_sdmf()) |
---|
3849 | + d.addCallback(lambda issdmf: |
---|
3850 | + self.failUnless(issdmf)) |
---|
3851 | + |
---|
3852 | + # What do we need to read? |
---|
3853 | + # - The sharedata |
---|
3854 | + # - The salt |
---|
3855 | + d.addCallback(lambda ignored: |
---|
3856 | + mr.get_block_and_salt(0)) |
---|
3857 | + def _check_block_and_salt(results): |
---|
3858 | + block, salt = results |
---|
3859 | + # Our original file is 36 bytes long. Then each share is 12 |
---|
3860 | + # bytes in size. The share is composed entirely of the |
---|
3861 | + # letter a. self.block contains 2 as, so 6 * self.block is |
---|
3862 | + # what we are looking for. |
---|
3863 | + self.failUnlessEqual(block, self.block * 6) |
---|
3864 | + self.failUnlessEqual(salt, self.salt) |
---|
3865 | + d.addCallback(_check_block_and_salt) |
---|
3866 | + |
---|
3867 | + # - The blockhashes |
---|
3868 | + d.addCallback(lambda ignored: |
---|
3869 | + mr.get_blockhashes()) |
---|
3870 | + d.addCallback(lambda blockhashes: |
---|
3871 | + self.failUnlessEqual(self.block_hash_tree, |
---|
3872 | + blockhashes, |
---|
3873 | + blockhashes)) |
---|
3874 | + # - The sharehashes |
---|
3875 | + d.addCallback(lambda ignored: |
---|
3876 | + mr.get_sharehashes()) |
---|
3877 | + d.addCallback(lambda sharehashes: |
---|
3878 | + self.failUnlessEqual(self.share_hash_chain, |
---|
3879 | + sharehashes)) |
---|
3880 | + # - The keys |
---|
3881 | + d.addCallback(lambda ignored: |
---|
3882 | + mr.get_encprivkey()) |
---|
3883 | + d.addCallback(lambda encprivkey: |
---|
3884 | + self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey)) |
---|
3885 | + d.addCallback(lambda ignored: |
---|
3886 | + mr.get_verification_key()) |
---|
3887 | + d.addCallback(lambda verification_key: |
---|
3888 | + self.failUnlessEqual(verification_key, |
---|
3889 | + self.verification_key, |
---|
3890 | + verification_key)) |
---|
3891 | + # - The signature |
---|
3892 | + d.addCallback(lambda ignored: |
---|
3893 | + mr.get_signature()) |
---|
3894 | + d.addCallback(lambda signature: |
---|
3895 | + self.failUnlessEqual(signature, self.signature, signature)) |
---|
3896 | + |
---|
3897 | + # - The sequence number |
---|
3898 | + d.addCallback(lambda ignored: |
---|
3899 | + mr.get_seqnum()) |
---|
3900 | + d.addCallback(lambda seqnum: |
---|
3901 | + self.failUnlessEqual(seqnum, 0, seqnum)) |
---|
3902 | + |
---|
3903 | + # - The root hash |
---|
3904 | + d.addCallback(lambda ignored: |
---|
3905 | + mr.get_root_hash()) |
---|
3906 | + d.addCallback(lambda root_hash: |
---|
3907 | + self.failUnlessEqual(root_hash, self.root_hash, root_hash)) |
---|
3908 | + return d |
---|
3909 | + |
---|
3910 | + |
---|
3911 | + def test_only_reads_one_segment_sdmf(self): |
---|
3912 | + # SDMF shares have only one segment, so it doesn't make sense to |
---|
3913 | + # read more segments than that. The reader should know this and |
---|
3914 | + # complain if we try to do that. |
---|
3915 | + self.write_sdmf_share_to_server("si1") |
---|
3916 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
3917 | + d = defer.succeed(None) |
---|
3918 | + d.addCallback(lambda ignored: |
---|
3919 | + mr.is_sdmf()) |
---|
3920 | + d.addCallback(lambda issdmf: |
---|
3921 | + self.failUnless(issdmf)) |
---|
3922 | + d.addCallback(lambda ignored: |
---|
3923 | + self.shouldFail(LayoutInvalid, "test bad segment", |
---|
3924 | + None, |
---|
3925 | + mr.get_block_and_salt, 1)) |
---|
3926 | + return d |
---|
3927 | + |
---|
3928 | + |
---|
3929 | + def test_read_with_prefetched_mdmf_data(self): |
---|
3930 | + # The MDMFSlotReadProxy will prefill certain fields if you pass |
---|
3931 | + # it data that you have already fetched. This is useful for |
---|
3932 | + # cases like the Servermap, which prefetches ~2kb of data while |
---|
3933 | + # finding out which shares are on the remote peer so that it |
---|
3934 | + # doesn't waste round trips. |
---|
3935 | + mdmf_data = self.build_test_mdmf_share() |
---|
3936 | + self.write_test_share_to_server("si1") |
---|
3937 | + def _make_mr(ignored, length): |
---|
3938 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length]) |
---|
3939 | + return mr |
---|
3940 | + |
---|
3941 | + d = defer.succeed(None) |
---|
3942 | + # This should be enough to fill in both the encoding parameters |
---|
3943 | + # and the table of offsets, which will complete the version |
---|
3944 | + # information tuple. |
---|
3945 | + d.addCallback(_make_mr, 107) |
---|
3946 | + d.addCallback(lambda mr: |
---|
3947 | + mr.get_verinfo()) |
---|
3948 | + def _check_verinfo(verinfo): |
---|
3949 | + self.failUnless(verinfo) |
---|
3950 | + self.failUnlessEqual(len(verinfo), 9) |
---|
3951 | + (seqnum, |
---|
3952 | + root_hash, |
---|
3953 | + salt_hash, |
---|
3954 | + segsize, |
---|
3955 | + datalen, |
---|
3956 | + k, |
---|
3957 | + n, |
---|
3958 | + prefix, |
---|
3959 | + offsets) = verinfo |
---|
3960 | + self.failUnlessEqual(seqnum, 0) |
---|
3961 | + self.failUnlessEqual(root_hash, self.root_hash) |
---|
3962 | + self.failUnlessEqual(segsize, 6) |
---|
3963 | + self.failUnlessEqual(datalen, 36) |
---|
3964 | + self.failUnlessEqual(k, 3) |
---|
3965 | + self.failUnlessEqual(n, 10) |
---|
3966 | + expected_prefix = struct.pack(MDMFSIGNABLEHEADER, |
---|
3967 | + 1, |
---|
3968 | + seqnum, |
---|
3969 | + root_hash, |
---|
3970 | + k, |
---|
3971 | + n, |
---|
3972 | + segsize, |
---|
3973 | + datalen) |
---|
3974 | + self.failUnlessEqual(expected_prefix, prefix) |
---|
3975 | + self.failUnlessEqual(self.rref.read_count, 0) |
---|
3976 | + d.addCallback(_check_verinfo) |
---|
3977 | + # This is not enough data to read a block and a salt, so the |
---|
3978 | + # wrapper should attempt to read this from the remote server. |
---|
3979 | + d.addCallback(_make_mr, 107) |
---|
3980 | + d.addCallback(lambda mr: |
---|
3981 | + mr.get_block_and_salt(0)) |
---|
3982 | + def _check_block_and_salt((block, salt)): |
---|
3983 | + self.failUnlessEqual(block, self.block) |
---|
3984 | + self.failUnlessEqual(salt, self.salt) |
---|
3985 | + self.failUnlessEqual(self.rref.read_count, 1) |
---|
3986 | + # This should be enough data to read one block. |
---|
3987 | + d.addCallback(_make_mr, 249) |
---|
3988 | + d.addCallback(lambda mr: |
---|
3989 | + mr.get_block_and_salt(0)) |
---|
3990 | + d.addCallback(_check_block_and_salt) |
---|
3991 | + return d |
---|
3992 | + |
---|
3993 | + |
---|
3994 | + def test_read_with_prefetched_sdmf_data(self): |
---|
3995 | + sdmf_data = self.build_test_sdmf_share() |
---|
3996 | + self.write_sdmf_share_to_server("si1") |
---|
3997 | + def _make_mr(ignored, length): |
---|
3998 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length]) |
---|
3999 | + return mr |
---|
4000 | + |
---|
4001 | + d = defer.succeed(None) |
---|
4002 | + # This should be enough to get us the encoding parameters, |
---|
4003 | + # offset table, and everything else we need to build a verinfo |
---|
4004 | + # string. |
---|
4005 | + d.addCallback(_make_mr, 107) |
---|
4006 | + d.addCallback(lambda mr: |
---|
4007 | + mr.get_verinfo()) |
---|
4008 | + def _check_verinfo(verinfo): |
---|
4009 | + self.failUnless(verinfo) |
---|
4010 | + self.failUnlessEqual(len(verinfo), 9) |
---|
4011 | + (seqnum, |
---|
4012 | + root_hash, |
---|
4013 | + salt, |
---|
4014 | + segsize, |
---|
4015 | + datalen, |
---|
4016 | + k, |
---|
4017 | + n, |
---|
4018 | + prefix, |
---|
4019 | + offsets) = verinfo |
---|
4020 | + self.failUnlessEqual(seqnum, 0) |
---|
4021 | + self.failUnlessEqual(root_hash, self.root_hash) |
---|
4022 | + self.failUnlessEqual(salt, self.salt) |
---|
4023 | + self.failUnlessEqual(segsize, 36) |
---|
4024 | + self.failUnlessEqual(datalen, 36) |
---|
4025 | + self.failUnlessEqual(k, 3) |
---|
4026 | + self.failUnlessEqual(n, 10) |
---|
4027 | + expected_prefix = struct.pack(SIGNED_PREFIX, |
---|
4028 | + 0, |
---|
4029 | + seqnum, |
---|
4030 | + root_hash, |
---|
4031 | + salt, |
---|
4032 | + k, |
---|
4033 | + n, |
---|
4034 | + segsize, |
---|
4035 | + datalen) |
---|
4036 | + self.failUnlessEqual(expected_prefix, prefix) |
---|
4037 | + self.failUnlessEqual(self.rref.read_count, 0) |
---|
4038 | + d.addCallback(_check_verinfo) |
---|
4039 | + # This shouldn't be enough to read any share data. |
---|
4040 | + d.addCallback(_make_mr, 107) |
---|
4041 | + d.addCallback(lambda mr: |
---|
4042 | + mr.get_block_and_salt(0)) |
---|
4043 | + def _check_block_and_salt((block, salt)): |
---|
4044 | + self.failUnlessEqual(block, self.block * 6) |
---|
4045 | + self.failUnlessEqual(salt, self.salt) |
---|
4046 | + # TODO: Fix the read routine so that it reads only the data |
---|
4047 | + # that it has cached if it can't read all of it. |
---|
4048 | + self.failUnlessEqual(self.rref.read_count, 2) |
---|
4049 | + |
---|
4050 | + # This should be enough to read share data. |
---|
4051 | + d.addCallback(_make_mr, self.offsets['share_data']) |
---|
4052 | + d.addCallback(lambda mr: |
---|
4053 | + mr.get_block_and_salt(0)) |
---|
4054 | + d.addCallback(_check_block_and_salt) |
---|
4055 | + return d |
---|
4056 | + |
---|
4057 | + |
---|
4058 | + def test_read_with_empty_mdmf_file(self): |
---|
4059 | + # Some tests upload a file with no contents to test things |
---|
4060 | + # unrelated to the actual handling of the content of the file. |
---|
4061 | + # The reader should behave intelligently in these cases. |
---|
4062 | + self.write_test_share_to_server("si1", empty=True) |
---|
4063 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
4064 | + # We should be able to get the encoding parameters, and they |
---|
4065 | + # should be correct. |
---|
4066 | + d = defer.succeed(None) |
---|
4067 | + d.addCallback(lambda ignored: |
---|
4068 | + mr.get_encoding_parameters()) |
---|
4069 | + def _check_encoding_parameters(params): |
---|
4070 | + self.failUnlessEqual(len(params), 4) |
---|
4071 | + k, n, segsize, datalen = params |
---|
4072 | + self.failUnlessEqual(k, 3) |
---|
4073 | + self.failUnlessEqual(n, 10) |
---|
4074 | + self.failUnlessEqual(segsize, 0) |
---|
4075 | + self.failUnlessEqual(datalen, 0) |
---|
4076 | + d.addCallback(_check_encoding_parameters) |
---|
4077 | + |
---|
4078 | + # We should not be able to fetch a block, since there are no |
---|
4079 | + # blocks to fetch |
---|
4080 | + d.addCallback(lambda ignored: |
---|
4081 | + self.shouldFail(LayoutInvalid, "get block on empty file", |
---|
4082 | + None, |
---|
4083 | + mr.get_block_and_salt, 0)) |
---|
4084 | + return d |
---|
4085 | + |
---|
4086 | + |
---|
4087 | + def test_read_with_empty_sdmf_file(self): |
---|
4088 | + self.write_sdmf_share_to_server("si1", empty=True) |
---|
4089 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
4090 | + # We should be able to get the encoding parameters, and they |
---|
4091 | + # should be correct |
---|
4092 | + d = defer.succeed(None) |
---|
4093 | + d.addCallback(lambda ignored: |
---|
4094 | + mr.get_encoding_parameters()) |
---|
4095 | + def _check_encoding_parameters(params): |
---|
4096 | + self.failUnlessEqual(len(params), 4) |
---|
4097 | + k, n, segsize, datalen = params |
---|
4098 | + self.failUnlessEqual(k, 3) |
---|
4099 | + self.failUnlessEqual(n, 10) |
---|
4100 | + self.failUnlessEqual(segsize, 0) |
---|
4101 | + self.failUnlessEqual(datalen, 0) |
---|
4102 | + d.addCallback(_check_encoding_parameters) |
---|
4103 | + |
---|
4104 | + # It does not make sense to get a block in this format, so we |
---|
4105 | + # should not be able to. |
---|
4106 | + d.addCallback(lambda ignored: |
---|
4107 | + self.shouldFail(LayoutInvalid, "get block on an empty file", |
---|
4108 | + None, |
---|
4109 | + mr.get_block_and_salt, 0)) |
---|
4110 | + return d |
---|
4111 | + |
---|
4112 | + |
---|
4113 | + def test_verinfo_with_sdmf_file(self): |
---|
4114 | + self.write_sdmf_share_to_server("si1") |
---|
4115 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
4116 | + # We should be able to get the version information. |
---|
4117 | + d = defer.succeed(None) |
---|
4118 | + d.addCallback(lambda ignored: |
---|
4119 | + mr.get_verinfo()) |
---|
4120 | + def _check_verinfo(verinfo): |
---|
4121 | + self.failUnless(verinfo) |
---|
4122 | + self.failUnlessEqual(len(verinfo), 9) |
---|
4123 | + (seqnum, |
---|
4124 | + root_hash, |
---|
4125 | + salt, |
---|
4126 | + segsize, |
---|
4127 | + datalen, |
---|
4128 | + k, |
---|
4129 | + n, |
---|
4130 | + prefix, |
---|
4131 | + offsets) = verinfo |
---|
4132 | + self.failUnlessEqual(seqnum, 0) |
---|
4133 | + self.failUnlessEqual(root_hash, self.root_hash) |
---|
4134 | + self.failUnlessEqual(salt, self.salt) |
---|
4135 | + self.failUnlessEqual(segsize, 36) |
---|
4136 | + self.failUnlessEqual(datalen, 36) |
---|
4137 | + self.failUnlessEqual(k, 3) |
---|
4138 | + self.failUnlessEqual(n, 10) |
---|
4139 | + expected_prefix = struct.pack(">BQ32s16s BBQQ", |
---|
4140 | + 0, |
---|
4141 | + seqnum, |
---|
4142 | + root_hash, |
---|
4143 | + salt, |
---|
4144 | + k, |
---|
4145 | + n, |
---|
4146 | + segsize, |
---|
4147 | + datalen) |
---|
4148 | + self.failUnlessEqual(prefix, expected_prefix) |
---|
4149 | + self.failUnlessEqual(offsets, self.offsets) |
---|
4150 | + d.addCallback(_check_verinfo) |
---|
4151 | + return d |
---|
4152 | + |
---|
4153 | + |
---|
4154 | + def test_verinfo_with_mdmf_file(self): |
---|
4155 | + self.write_test_share_to_server("si1") |
---|
4156 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
4157 | + d = defer.succeed(None) |
---|
4158 | + d.addCallback(lambda ignored: |
---|
4159 | + mr.get_verinfo()) |
---|
4160 | + def _check_verinfo(verinfo): |
---|
4161 | + self.failUnless(verinfo) |
---|
4162 | + self.failUnlessEqual(len(verinfo), 9) |
---|
4163 | + (seqnum, |
---|
4164 | + root_hash, |
---|
4165 | + IV, |
---|
4166 | + segsize, |
---|
4167 | + datalen, |
---|
4168 | + k, |
---|
4169 | + n, |
---|
4170 | + prefix, |
---|
4171 | + offsets) = verinfo |
---|
4172 | + self.failUnlessEqual(seqnum, 0) |
---|
4173 | + self.failUnlessEqual(root_hash, self.root_hash) |
---|
4174 | + self.failIf(IV) |
---|
4175 | + self.failUnlessEqual(segsize, 6) |
---|
4176 | + self.failUnlessEqual(datalen, 36) |
---|
4177 | + self.failUnlessEqual(k, 3) |
---|
4178 | + self.failUnlessEqual(n, 10) |
---|
4179 | + expected_prefix = struct.pack(">BQ32s BBQQ", |
---|
4180 | + 1, |
---|
4181 | + seqnum, |
---|
4182 | + root_hash, |
---|
4183 | + k, |
---|
4184 | + n, |
---|
4185 | + segsize, |
---|
4186 | + datalen) |
---|
4187 | + self.failUnlessEqual(prefix, expected_prefix) |
---|
4188 | + self.failUnlessEqual(offsets, self.offsets) |
---|
4189 | + d.addCallback(_check_verinfo) |
---|
4190 | + return d |
---|
4191 | + |
---|
4192 | + |
---|
4193 | + def test_reader_queue(self): |
---|
4194 | + self.write_test_share_to_server('si1') |
---|
4195 | + mr = MDMFSlotReadProxy(self.rref, "si1", 0) |
---|
4196 | + d1 = mr.get_block_and_salt(0, queue=True) |
---|
4197 | + d2 = mr.get_blockhashes(queue=True) |
---|
4198 | + d3 = mr.get_sharehashes(queue=True) |
---|
4199 | + d4 = mr.get_signature(queue=True) |
---|
4200 | + d5 = mr.get_verification_key(queue=True) |
---|
4201 | + dl = defer.DeferredList([d1, d2, d3, d4, d5]) |
---|
4202 | + mr.flush() |
---|
4203 | + def _print(results): |
---|
4204 | + self.failUnlessEqual(len(results), 5) |
---|
4205 | + # We have one read for version information and offsets, and |
---|
4206 | + # one for everything else. |
---|
4207 | + self.failUnlessEqual(self.rref.read_count, 2) |
---|
4208 | + block, salt = results[0][1] # results[0][0] is a boolean that says |
---|
4209 | + # whether or not the operation |
---|
4210 | + # worked. |
---|
4211 | + self.failUnlessEqual(self.block, block) |
---|
4212 | + self.failUnlessEqual(self.salt, salt) |
---|
4213 | + |
---|
4214 | + blockhashes = results[1][1] |
---|
4215 | + self.failUnlessEqual(self.block_hash_tree, blockhashes) |
---|
4216 | + |
---|
4217 | + sharehashes = results[2][1] |
---|
4218 | + self.failUnlessEqual(self.share_hash_chain, sharehashes) |
---|
4219 | + |
---|
4220 | + signature = results[3][1] |
---|
4221 | + self.failUnlessEqual(self.signature, signature) |
---|
4222 | + |
---|
4223 | + verification_key = results[4][1] |
---|
4224 | + self.failUnlessEqual(self.verification_key, verification_key) |
---|
4225 | + dl.addCallback(_print) |
---|
4226 | + return dl |
---|
4227 | + |
---|
4228 | + |
---|
4229 | + def test_sdmf_writer(self): |
---|
4230 | + # Go through the motions of writing an SDMF share to the storage |
---|
4231 | + # server. Then read the storage server to see that the share got |
---|
4232 | + # written in the way that we think it should have. |
---|
4233 | + |
---|
4234 | + # We do this first so that the necessary instance variables get |
---|
4235 | + # set the way we want them for the tests below. |
---|
4236 | + data = self.build_test_sdmf_share() |
---|
4237 | + sdmfr = SDMFSlotWriteProxy(0, |
---|
4238 | + self.rref, |
---|
4239 | + "si1", |
---|
4240 | + self.secrets, |
---|
4241 | + 0, 3, 10, 36, 36) |
---|
4242 | + # Put the block and salt. |
---|
4243 | + sdmfr.put_block(self.blockdata, 0, self.salt) |
---|
4244 | + |
---|
4245 | + # Put the encprivkey |
---|
4246 | + sdmfr.put_encprivkey(self.encprivkey) |
---|
4247 | + |
---|
4248 | + # Put the block and share hash chains |
---|
4249 | + sdmfr.put_blockhashes(self.block_hash_tree) |
---|
4250 | + sdmfr.put_sharehashes(self.share_hash_chain) |
---|
4251 | + sdmfr.put_root_hash(self.root_hash) |
---|
4252 | + |
---|
4253 | + # Put the signature |
---|
4254 | + sdmfr.put_signature(self.signature) |
---|
4255 | + |
---|
4256 | + # Put the verification key |
---|
4257 | + sdmfr.put_verification_key(self.verification_key) |
---|
4258 | + |
---|
4259 | + # Now check to make sure that nothing has been written yet. |
---|
4260 | + self.failUnlessEqual(self.rref.write_count, 0) |
---|
4261 | + |
---|
4262 | + # Now finish publishing |
---|
4263 | + d = sdmfr.finish_publishing() |
---|
4264 | + def _then(ignored): |
---|
4265 | + self.failUnlessEqual(self.rref.write_count, 1) |
---|
4266 | + read = self.ss.remote_slot_readv |
---|
4267 | + self.failUnlessEqual(read("si1", [0], [(0, len(data))]), |
---|
4268 | + {0: [data]}) |
---|
4269 | + d.addCallback(_then) |
---|
4270 | + return d |
---|
4271 | + |
---|
4272 | + |
---|
4273 | + def test_sdmf_writer_preexisting_share(self): |
---|
4274 | + data = self.build_test_sdmf_share() |
---|
4275 | + self.write_sdmf_share_to_server("si1") |
---|
4276 | + |
---|
4277 | + # Now there is a share on the storage server. To successfully |
---|
4278 | + # write, we need to set the checkstring correctly. When we |
---|
4279 | + # don't, no write should occur. |
---|
4280 | + sdmfw = SDMFSlotWriteProxy(0, |
---|
4281 | + self.rref, |
---|
4282 | + "si1", |
---|
4283 | + self.secrets, |
---|
4284 | + 1, 3, 10, 36, 36) |
---|
4285 | + sdmfw.put_block(self.blockdata, 0, self.salt) |
---|
4286 | + |
---|
4287 | + # Put the encprivkey |
---|
4288 | + sdmfw.put_encprivkey(self.encprivkey) |
---|
4289 | + |
---|
4290 | + # Put the block and share hash chains |
---|
4291 | + sdmfw.put_blockhashes(self.block_hash_tree) |
---|
4292 | + sdmfw.put_sharehashes(self.share_hash_chain) |
---|
4293 | + |
---|
4294 | + # Put the root hash |
---|
4295 | + sdmfw.put_root_hash(self.root_hash) |
---|
4296 | + |
---|
4297 | + # Put the signature |
---|
4298 | + sdmfw.put_signature(self.signature) |
---|
4299 | + |
---|
4300 | + # Put the verification key |
---|
4301 | + sdmfw.put_verification_key(self.verification_key) |
---|
4302 | + |
---|
4303 | + # We shouldn't have a checkstring yet |
---|
4304 | + self.failUnlessEqual(sdmfw.get_checkstring(), "") |
---|
4305 | + |
---|
4306 | + d = sdmfw.finish_publishing() |
---|
4307 | + def _then(results): |
---|
4308 | + self.failIf(results[0]) |
---|
4309 | + # this is the correct checkstring |
---|
4310 | + self._expected_checkstring = results[1][0][0] |
---|
4311 | + return self._expected_checkstring |
---|
4312 | + |
---|
4313 | + d.addCallback(_then) |
---|
4314 | + d.addCallback(sdmfw.set_checkstring) |
---|
4315 | + d.addCallback(lambda ignored: |
---|
4316 | + sdmfw.get_checkstring()) |
---|
4317 | + d.addCallback(lambda checkstring: |
---|
4318 | + self.failUnlessEqual(checkstring, self._expected_checkstring)) |
---|
4319 | + d.addCallback(lambda ignored: |
---|
4320 | + sdmfw.finish_publishing()) |
---|
4321 | + def _then_again(results): |
---|
4322 | + self.failUnless(results[0]) |
---|
4323 | + read = self.ss.remote_slot_readv |
---|
4324 | + self.failUnlessEqual(read("si1", [0], [(1, 8)]), |
---|
4325 | + {0: [struct.pack(">Q", 1)]}) |
---|
4326 | + self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]), |
---|
4327 | + {0: [data[9:]]}) |
---|
4328 | + d.addCallback(_then_again) |
---|
4329 | + return d |
---|
4330 | + |
---|
4331 | + |
---|
4332 | class Stats(unittest.TestCase): |
---|
4333 | |
---|
4334 | def setUp(self): |
---|
4335 | } |
---|
4336 | [immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one |
---|
4337 | Kevan Carstensen <kevan@isnotajoke.com>**20100810000619 |
---|
4338 | Ignore-this: 93e536c0f8efb705310f13ff64621527 |
---|
4339 | ] { |
---|
4340 | hunk ./src/allmydata/immutable/filenode.py 8 |
---|
4341 | now = time.time |
---|
4342 | from zope.interface import implements, Interface |
---|
4343 | from twisted.internet import defer |
---|
4344 | -from twisted.internet.interfaces import IConsumer |
---|
4345 | |
---|
4346 | hunk ./src/allmydata/immutable/filenode.py 9 |
---|
4347 | -from allmydata.interfaces import IImmutableFileNode, IUploadResults |
---|
4348 | from allmydata import uri |
---|
4349 | hunk ./src/allmydata/immutable/filenode.py 10 |
---|
4350 | +from twisted.internet.interfaces import IConsumer |
---|
4351 | +from twisted.protocols import basic |
---|
4352 | +from foolscap.api import eventually |
---|
4353 | +from allmydata.interfaces import IImmutableFileNode, ICheckable, \ |
---|
4354 | + IDownloadTarget, IUploadResults |
---|
4355 | +from allmydata.util import dictutil, log, base32, consumer |
---|
4356 | +from allmydata.immutable.checker import Checker |
---|
4357 | from allmydata.check_results import CheckResults, CheckAndRepairResults |
---|
4358 | from allmydata.util.dictutil import DictOfSets |
---|
4359 | from pycryptopp.cipher.aes import AES |
---|
4360 | hunk ./src/allmydata/immutable/filenode.py 296 |
---|
4361 | return self._cnode.check_and_repair(monitor, verify, add_lease) |
---|
4362 | def check(self, monitor, verify=False, add_lease=False): |
---|
4363 | return self._cnode.check(monitor, verify, add_lease) |
---|
4364 | + |
---|
4365 | + def get_best_readable_version(self): |
---|
4366 | + """ |
---|
4367 | + Return an IReadable of the best version of this file. Since |
---|
4368 | + immutable files can have only one version, we just return the |
---|
4369 | + current filenode. |
---|
4370 | + """ |
---|
4371 | + return defer.succeed(self) |
---|
4372 | + |
---|
4373 | + |
---|
4374 | + def download_best_version(self): |
---|
4375 | + """ |
---|
4376 | + Download the best version of this file, returning its contents |
---|
4377 | + as a bytestring. Since there is only one version of an immutable |
---|
4378 | + file, we download and return the contents of this file. |
---|
4379 | + """ |
---|
4380 | + d = consumer.download_to_data(self) |
---|
4381 | + return d |
---|
4382 | + |
---|
4383 | + # for an immutable file, download_to_data (specified in IReadable) |
---|
4384 | + # is the same as download_best_version (specified in IFileNode). For |
---|
4385 | + # mutable files, the difference is more meaningful, since they can |
---|
4386 | + # have multiple versions. |
---|
4387 | + download_to_data = download_best_version |
---|
4388 | + |
---|
4389 | + |
---|
4390 | + # get_size() (IReadable), get_current_size() (IFilesystemNode), and |
---|
4391 | + # get_size_of_best_version() (IFileNode) are all the same for immutable |
---|
4392 | + # files. |
---|
4393 | + get_size_of_best_version = get_current_size |
---|
4394 | } |
---|
4395 | [immutable/literal.py: implement the same interfaces as other filenodes |
---|
4396 | Kevan Carstensen <kevan@isnotajoke.com>**20100810000633 |
---|
4397 | Ignore-this: b50dd5df2d34ecd6477b8499a27aef13 |
---|
4398 | ] hunk ./src/allmydata/immutable/literal.py 106 |
---|
4399 | d.addCallback(lambda lastSent: consumer) |
---|
4400 | return d |
---|
4401 | |
---|
4402 | + # IReadable, IFileNode, IFilesystemNode |
---|
4403 | + def get_best_readable_version(self): |
---|
4404 | + return defer.succeed(self) |
---|
4405 | + |
---|
4406 | + |
---|
4407 | + def download_best_version(self): |
---|
4408 | + return defer.succeed(self.u.data) |
---|
4409 | + |
---|
4410 | + |
---|
4411 | + download_to_data = download_best_version |
---|
4412 | + get_size_of_best_version = get_current_size |
---|
4413 | + |
---|
4414 | [mutable/filenode.py: add versions and partial-file updates to the mutable file node |
---|
4415 | Kevan Carstensen <kevan@isnotajoke.com>**20100811233049 |
---|
4416 | Ignore-this: edf9f6d5d2833909568757ba2dbeedff |
---|
4417 | |
---|
4418 | One of the goals of MDMF as a GSoC project is to lay the groundwork for |
---|
4419 | LDMF, a format that will allow Tahoe-LAFS to deal with and encourage |
---|
4420 | multiple versions of a single cap on the grid. In line with this, there |
---|
4421 | is now a distinction between an overriding mutable file (which can be |
---|
4422 | thought to correspond to the cap/unique identifier for that mutable |
---|
4423 | file) and versions of the mutable file (which we can download, update, |
---|
4424 | and so on). All download, upload, and modification operations end up |
---|
4425 | happening on a particular version of a mutable file, but there are |
---|
4426 | shortcut methods on the object representing the overriding mutable file |
---|
4427 | that perform these operations on the best version of the mutable file |
---|
4428 | (which is what code should be doing until we have LDMF and better |
---|
4429 | support for other paradigms). |
---|
4430 | |
---|
4431 | Another goal of MDMF was to take advantage of segmentation to give |
---|
4432 | callers more efficient partial file updates or appends. This patch |
---|
4433 | implements methods that do that, too. |
---|
4434 | |
---|
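The node/version split described above can be pictured with a minimal sketch. This is not the code in mutable/filenode.py: SketchNode and SketchVersion are hypothetical names, the sketch is synchronous where the real methods return Deferreds, and it assumes that the "best" version is simply the one with the highest sequence number.

```python
# Illustrative sketch only. SketchNode and SketchVersion are hypothetical
# stand-ins for the node and version objects; the real methods return
# Deferreds, while these return plain values.

class SketchVersion(object):
    """One concrete version of a mutable file."""
    def __init__(self, seqnum, contents):
        self.seqnum = seqnum
        self.contents = contents

    def get_size(self):
        return len(self.contents)

    def download_to_data(self):
        return self.contents


class SketchNode(object):
    """The overriding mutable file: it stands for the cap and knows its versions."""
    def __init__(self, versions):
        self._versions = list(versions)

    def get_best_readable_version(self):
        # Assumption: "best" means the highest sequence number.
        if not self._versions:
            raise ValueError("no recoverable versions")
        return max(self._versions, key=lambda v: v.seqnum)

    # Shortcut methods: each one picks the best version and delegates to it.
    def download_best_version(self):
        return self.get_best_readable_version().download_to_data()

    def get_size_of_best_version(self):
        return self.get_best_readable_version().get_size()


node = SketchNode([SketchVersion(1, b"old"), SketchVersion(2, b"newer")])
```

Callers that only care about the current contents use the shortcut methods on the node; callers that need a specific version ask for the version object and operate on that directly.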
4435 | ] { |
---|
4436 | hunk ./src/allmydata/mutable/filenode.py 7 |
---|
4437 | from zope.interface import implements |
---|
4438 | from twisted.internet import defer, reactor |
---|
4439 | from foolscap.api import eventually |
---|
4440 | -from allmydata.interfaces import IMutableFileNode, \ |
---|
4441 | - ICheckable, ICheckResults, NotEnoughSharesError |
---|
4442 | -from allmydata.util import hashutil, log |
---|
4443 | +from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \ |
---|
4444 | + NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \ |
---|
4445 | + IMutableFileVersion, IWritable |
---|
4446 | +from allmydata import hashtree |
---|
4447 | +from allmydata.util import hashutil, log, consumer, deferredutil, mathutil |
---|
4448 | from allmydata.util.assertutil import precondition |
---|
4449 | from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI |
---|
4450 | from allmydata.monitor import Monitor |
---|
4451 | hunk ./src/allmydata/mutable/filenode.py 17 |
---|
4452 | from pycryptopp.cipher.aes import AES |
---|
4453 | |
---|
4454 | -from allmydata.mutable.publish import Publish |
---|
4455 | +from allmydata.mutable.publish import Publish, MutableFileHandle, \ |
---|
4456 | + MutableData,\ |
---|
4457 | + DEFAULT_MAX_SEGMENT_SIZE, \ |
---|
4458 | + TransformingUploadable |
---|
4459 | from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \ |
---|
4460 | ResponseCache, UncoordinatedWriteError |
---|
4461 | from allmydata.mutable.servermap import ServerMap, ServermapUpdater |
---|
4462 | hunk ./src/allmydata/mutable/filenode.py 72 |
---|
4463 | self._sharemap = {} # known shares, shnum-to-[nodeids] |
---|
4464 | self._cache = ResponseCache() |
---|
4465 | self._most_recent_size = None |
---|
4466 | + # filled in after __init__ if we're being created for the first time; |
---|
4467 | + # filled in by the servermap updater before publishing, otherwise. |
---|
4468 | + # set to this default value in case neither of those things happen, |
---|
4469 | + # or in case the servermap can't find any shares to tell us what |
---|
4470 | + # to publish as. |
---|
4471 | + # TODO: Set this back to None, and find out why the tests fail |
---|
4472 | + # with it set to None. |
---|
4473 | + self._protocol_version = None |
---|
4474 | |
---|
4475 | # all users of this MutableFileNode go through the serializer. This |
---|
4476 | # takes advantage of the fact that Deferreds discard the callbacks |
---|
4477 | hunk ./src/allmydata/mutable/filenode.py 136 |
---|
4478 | return self._upload(initial_contents, None) |
---|
4479 | |
---|
4480 | def _get_initial_contents(self, contents): |
---|
4481 | - if isinstance(contents, str): |
---|
4482 | - return contents |
---|
4483 | if contents is None: |
---|
4484 | hunk ./src/allmydata/mutable/filenode.py 137 |
---|
4485 | - return "" |
---|
4486 | + return MutableData("") |
---|
4487 | + |
---|
4488 | + if IMutableUploadable.providedBy(contents): |
---|
4489 | + return contents |
---|
4490 | + |
---|
4491 | assert callable(contents), "%s should be callable, not %s" % \ |
---|
4492 | (contents, type(contents)) |
---|
4493 | return contents(self) |
---|
4494 | hunk ./src/allmydata/mutable/filenode.py 211 |
---|
4495 | |
---|
4496 | def get_size(self): |
---|
4497 | return self._most_recent_size |
---|
4498 | + |
---|
4499 | def get_current_size(self): |
---|
4500 | d = self.get_size_of_best_version() |
---|
4501 | d.addCallback(self._stash_size) |
---|
4502 | hunk ./src/allmydata/mutable/filenode.py 216 |
---|
4503 | return d |
---|
4504 | + |
---|
4505 | def _stash_size(self, size): |
---|
4506 | self._most_recent_size = size |
---|
4507 | return size |
---|
4508 | hunk ./src/allmydata/mutable/filenode.py 275 |
---|
4509 | return cmp(self.__class__, them.__class__) |
---|
4510 | return cmp(self._uri, them._uri) |
---|
4511 | |
---|
4512 | - def _do_serialized(self, cb, *args, **kwargs): |
---|
4513 | - # note: to avoid deadlock, this callable is *not* allowed to invoke |
---|
4514 | - # other serialized methods within this (or any other) |
---|
4515 | - # MutableFileNode. The callable should be a bound method of this same |
---|
4516 | - # MFN instance. |
---|
4517 | - d = defer.Deferred() |
---|
4518 | - self._serializer.addCallback(lambda ignore: cb(*args, **kwargs)) |
---|
4519 | - # we need to put off d.callback until this Deferred is finished being |
---|
4520 | - # processed. Otherwise the caller's subsequent activities (like, |
---|
4521 | - # doing other things with this node) can cause reentrancy problems in |
---|
4522 | - # the Deferred code itself |
---|
4523 | - self._serializer.addBoth(lambda res: eventually(d.callback, res)) |
---|
4524 | - # add a log.err just in case something really weird happens, because |
---|
4525 | - # self._serializer stays around forever, therefore we won't see the |
---|
4526 | - # usual Unhandled Error in Deferred that would give us a hint. |
---|
4527 | - self._serializer.addErrback(log.err) |
---|
4528 | - return d |
---|
4529 | |
---|
4530 | ################################# |
---|
4531 | # ICheckable |
---|
4532 | hunk ./src/allmydata/mutable/filenode.py 300 |
---|
4533 | |
---|
4534 | |
---|
4535 | ################################# |
---|
4536 | - # IMutableFileNode |
---|
4537 | + # IFileNode |
---|
4538 | + |
---|
4539 | + def get_best_readable_version(self): |
---|
4540 | + """ |
---|
4541 | + I return a Deferred that fires with a MutableFileVersion |
---|
4542 | + representing the best readable version of the file that I |
---|
4543 | + represent. |
---|
4544 | + """ |
---|
4545 | + return self.get_readable_version() |
---|
4546 | + |
---|
4547 | + |
---|
4548 | + def get_readable_version(self, servermap=None, version=None): |
---|
4549 | + """ |
---|
4550 | + I return a Deferred that fires with a MutableFileVersion for my |
---|
4551 | + version argument, if there is a recoverable file of that version |
---|
4552 | + on the grid. If there is no recoverable version, I fire with an |
---|
4553 | + UnrecoverableFileError. |
---|
4554 | + |
---|
4555 | + If a servermap is provided, I look in there for the requested |
---|
4556 | + version. If no servermap is provided, I create and update a new |
---|
4557 | + one. |
---|
4558 | + |
---|
4559 | + If no version is provided, then I return a MutableFileVersion |
---|
4560 | + representing the best recoverable version of the file. |
---|
4561 | + """ |
---|
4562 | + d = self._get_version_from_servermap(MODE_READ, servermap, version) |
---|
4563 | + def _build_version((servermap, their_version)): |
---|
4564 | + assert their_version in servermap.recoverable_versions() |
---|
4565 | + assert their_version in servermap.make_versionmap() |
---|
4566 | + |
---|
4567 | + mfv = MutableFileVersion(self, |
---|
4568 | + servermap, |
---|
4569 | + their_version, |
---|
4570 | + self._storage_index, |
---|
4571 | + self._storage_broker, |
---|
4572 | + self._readkey, |
---|
4573 | + history=self._history) |
---|
4574 | + assert mfv.is_readonly() |
---|
4575 | + # our caller can use this to download the contents of the |
---|
4576 | + # mutable file. |
---|
4577 | + return mfv |
---|
4578 | + return d.addCallback(_build_version) |
---|
4579 | + |
---|
4580 | + |
---|
4581 | + def _get_version_from_servermap(self, |
---|
4582 | + mode, |
---|
4583 | + servermap=None, |
---|
4584 | + version=None): |
---|
4585 | + """ |
---|
4586 | + I return a Deferred that fires with (servermap, version). |
---|
4587 | + |
---|
4588 | + This function performs validation and a servermap update. If it |
---|
4589 | + returns (servermap, version), the caller can assume that: |
---|
4590 | + - servermap was last updated in mode. |
---|
4591 | + - version is recoverable, and corresponds to the servermap. |
---|
4592 | + |
---|
4593 | + If version and servermap are provided to me, I will validate |
---|
4594 | + that version exists in the servermap, and that the servermap was |
---|
4595 | + updated correctly. |
---|
4596 | + |
---|
4597 | + If version is not provided, but servermap is, I will validate |
---|
4598 | + the servermap and return the best recoverable version that I can |
---|
4599 | + find in the servermap. |
---|
4600 | + |
---|
4601 | + If the version is provided but the servermap isn't, I will |
---|
4602 | + obtain a servermap that has been updated in the correct mode and |
---|
4603 | + validate that version is found and recoverable. |
---|
4604 | + |
---|
4605 | + If neither servermap nor version are provided, I will obtain a |
---|
4606 | + servermap updated in the correct mode, and return the best |
---|
4607 | + recoverable version that I can find in there. |
---|
4608 | + """ |
---|
4609 | + # XXX: wording ^^^^ |
---|
4610 | + if servermap and servermap.last_update_mode == mode: |
---|
4611 | + d = defer.succeed(servermap) |
---|
4612 | + else: |
---|
4613 | + d = self._get_servermap(mode) |
---|
4614 | + |
---|
4615 | + def _get_version(servermap, v): |
---|
4616 | + if v and v not in servermap.recoverable_versions(): |
---|
4617 | + v = None |
---|
4618 | + elif not v: |
---|
4619 | + v = servermap.best_recoverable_version() |
---|
4620 | + if not v: |
---|
4621 | + raise UnrecoverableFileError("no recoverable versions") |
---|
4622 | + |
---|
4623 | + return (servermap, v) |
---|
4624 | + return d.addCallback(_get_version, version) |
---|
4625 | + |
---|
4626 | |
---|
4627 | def download_best_version(self): |
---|
4628 | hunk ./src/allmydata/mutable/filenode.py 391 |
---|
4629 | + """ |
---|
4630 | + I return a Deferred that fires with the contents of the best |
---|
4631 | + version of this mutable file. |
---|
4632 | + """ |
---|
4633 | return self._do_serialized(self._download_best_version) |
---|
4634 | hunk ./src/allmydata/mutable/filenode.py 396 |
---|
4635 | + |
---|
4636 | + |
---|
4637 | def _download_best_version(self): |
---|
4638 | hunk ./src/allmydata/mutable/filenode.py 399 |
---|
4639 | - servermap = ServerMap() |
---|
4640 | - d = self._try_once_to_download_best_version(servermap, MODE_READ) |
---|
4641 | - def _maybe_retry(f): |
---|
4642 | - f.trap(NotEnoughSharesError) |
---|
4643 | - # the download is worth retrying once. Make sure to use the |
---|
4644 | - # old servermap, since it is what remembers the bad shares, |
---|
4645 | - # but use MODE_WRITE to make it look for even more shares. |
---|
4646 | - # TODO: consider allowing this to retry multiple times.. this |
---|
4647 | - # approach will let us tolerate about 8 bad shares, I think. |
---|
4648 | - return self._try_once_to_download_best_version(servermap, |
---|
4649 | - MODE_WRITE) |
---|
4650 | + """ |
---|
4651 | + I am the serialized sibling of download_best_version. |
---|
4652 | + """ |
---|
4653 | + d = self.get_best_readable_version() |
---|
4654 | + d.addCallback(self._record_size) |
---|
4655 | + d.addCallback(lambda version: version.download_to_data()) |
---|
4656 | + |
---|
4657 | + # It is possible that the download will fail because there |
---|
4658 | + # aren't enough shares to be had. If so, we will try again after |
---|
4659 | + # updating the servermap in MODE_WRITE, which may find more |
---|
4660 | + # shares than updating in MODE_READ, as we just did. We can do |
---|
4661 | + # this by getting the best mutable version and downloading from |
---|
4662 | + # that -- the best mutable version will be a MutableFileVersion |
---|
4663 | + # with a servermap that was last updated in MODE_WRITE, as we |
---|
4664 | + # want. If this fails, then we give up. |
---|
4665 | + def _maybe_retry(failure): |
---|
4666 | + failure.trap(NotEnoughSharesError) |
---|
4667 | + |
---|
4668 | + d = self.get_best_mutable_version() |
---|
4669 | + d.addCallback(self._record_size) |
---|
4670 | + d.addCallback(lambda version: version.download_to_data()) |
---|
4671 | + return d |
---|
4672 | + |
---|
4673 | d.addErrback(_maybe_retry) |
---|
4674 | return d |
---|
4675 | hunk ./src/allmydata/mutable/filenode.py 424 |
---|
4676 | - def _try_once_to_download_best_version(self, servermap, mode): |
---|
4677 | - d = self._update_servermap(servermap, mode) |
---|
4678 | - d.addCallback(self._once_updated_download_best_version, servermap) |
---|
4679 | - return d |
---|
4680 | - def _once_updated_download_best_version(self, ignored, servermap): |
---|
4681 | - goal = servermap.best_recoverable_version() |
---|
4682 | - if not goal: |
---|
4683 | - raise UnrecoverableFileError("no recoverable versions") |
---|
4684 | - return self._try_once_to_download_version(servermap, goal) |
---|
4685 | + |
---|
4686 | + |
---|
4687 | + def _record_size(self, mfv): |
---|
4688 | + """ |
---|
4689 | + I record the size of a mutable file version. |
---|
4690 | + """ |
---|
4691 | + self._most_recent_size = mfv.get_size() |
---|
4692 | + return mfv |
---|
4693 | + |
---|
4694 | |
---|
4695 | def get_size_of_best_version(self): |
---|
4696 | hunk ./src/allmydata/mutable/filenode.py 435 |
---|
4697 | - d = self.get_servermap(MODE_READ) |
---|
4698 | - def _got_servermap(smap): |
---|
4699 | - ver = smap.best_recoverable_version() |
---|
4700 | - if not ver: |
---|
4701 | - raise UnrecoverableFileError("no recoverable version") |
---|
4702 | - return smap.size_of_version(ver) |
---|
4703 | - d.addCallback(_got_servermap) |
---|
4704 | - return d |
---|
4705 | + """ |
---|
4706 | + I return the size of the best version of this mutable file. |
---|
4707 | |
---|
4708 | hunk ./src/allmydata/mutable/filenode.py 438 |
---|
4709 | + This is equivalent to calling get_size() on the result of |
---|
4710 | + get_best_readable_version(). |
---|
4711 | + """ |
---|
4712 | + d = self.get_best_readable_version() |
---|
4713 | + return d.addCallback(lambda mfv: mfv.get_size()) |
---|
4714 | + |
---|
4715 | + |
---|
4716 | + ################################# |
---|
4717 | + # IMutableFileNode |
---|
4718 | + |
---|
4719 | + def get_best_mutable_version(self, servermap=None): |
---|
4720 | + """ |
---|
4721 | + I return a Deferred that fires with a MutableFileVersion |
---|
4722 | + representing the best readable version of the file that I |
---|
4723 | + represent. I am like get_best_readable_version, except that I |
---|
4724 | + will try to make a writable version if I can. |
---|
4725 | + """ |
---|
4726 | + return self.get_mutable_version(servermap=servermap) |
---|
4727 | + |
---|
4728 | + |
---|
4729 | + def get_mutable_version(self, servermap=None, version=None): |
---|
4730 | + """ |
---|
4731 | + I return a version of this mutable file. I return a Deferred |
---|
4732 | + that fires with a MutableFileVersion. |
---|
4733 | + |
---|
4734 | + If version is provided, the Deferred will fire with a |
---|
4735 | + MutableFileVersion initialized with that version. Otherwise, it |
---|
4736 | + will fire with the best version that I can recover. |
---|
4737 | + |
---|
4738 | + If servermap is provided, I will use that to find versions |
---|
4739 | + instead of performing my own servermap update. |
---|
4740 | + """ |
---|
4741 | + if self.is_readonly(): |
---|
4742 | + return self.get_readable_version(servermap=servermap, |
---|
4743 | + version=version) |
---|
4744 | + |
---|
4745 | + # get_mutable_version => write intent, so we require that the |
---|
4746 | + # servermap is updated in MODE_WRITE |
---|
4747 | + d = self._get_version_from_servermap(MODE_WRITE, servermap, version) |
---|
4748 | + def _build_version((servermap, smap_version)): |
---|
4749 | + # these should have been set by the servermap update. |
---|
4750 | + assert self._secret_holder |
---|
4751 | + assert self._writekey |
---|
4752 | + |
---|
4753 | + mfv = MutableFileVersion(self, |
---|
4754 | + servermap, |
---|
4755 | + smap_version, |
---|
4756 | + self._storage_index, |
---|
4757 | + self._storage_broker, |
---|
4758 | + self._readkey, |
---|
4759 | + self._writekey, |
---|
4760 | + self._secret_holder, |
---|
4761 | + history=self._history) |
---|
4762 | + assert not mfv.is_readonly() |
---|
4763 | + return mfv |
---|
4764 | + |
---|
4765 | + return d.addCallback(_build_version) |
---|
4766 | + |
---|
4767 | + |
---|
4768 | + # XXX: I'm uncomfortable with the difference between upload and |
---|
4769 | + # overwrite, which, FWICT, is basically that you don't have to |
---|
4770 | + # do a servermap update before you overwrite. We split them up |
---|
4771 | + # that way anyway, so I guess there's no real difficulty in |
---|
4772 | + # offering both ways to callers, but it also makes the |
---|
4773 | + # public-facing API cluttered, and makes it hard to discern the |
---|
4774 | + # right way of doing things. |
---|
4775 | + |
---|
4776 | + # In general, we leave it to callers to ensure that they aren't |
---|
4777 | + # going to cause UncoordinatedWriteErrors when working with |
---|
4778 | + # MutableFileVersions. We know that the next three operations |
---|
4779 | + # (upload, overwrite, and modify) will all operate on the same |
---|
4780 | + # version, so we say that only one of them can be going on at once, |
---|
4781 | + # and serialize them to ensure that that actually happens, since as |
---|
4782 | + # the caller in this situation it is our job to do that. |
---|
4783 | def overwrite(self, new_contents): |
---|
4784 | hunk ./src/allmydata/mutable/filenode.py 513 |
---|
4785 | + """ |
---|
4786 | + I overwrite the contents of the best recoverable version of this |
---|
4787 | + mutable file with new_contents. This is equivalent to calling |
---|
4788 | + overwrite on the result of get_best_mutable_version with |
---|
4789 | + new_contents as an argument. I return a Deferred that eventually |
---|
4790 | + fires with the results of my replacement process. |
---|
4791 | + """ |
---|
4792 | return self._do_serialized(self._overwrite, new_contents) |
---|
4793 | hunk ./src/allmydata/mutable/filenode.py 521 |
---|
4794 | + |
---|
4795 | + |
---|
4796 | def _overwrite(self, new_contents): |
---|
4797 | hunk ./src/allmydata/mutable/filenode.py 524 |
---|
4798 | + """ |
---|
4799 | + I am the serialized sibling of overwrite. |
---|
4800 | + """ |
---|
4801 | + d = self.get_best_mutable_version() |
---|
4802 | + return d.addCallback(lambda mfv: mfv.overwrite(new_contents)) |
---|
4803 | + |
---|
4804 | + |
---|
4805 | + |
---|
4806 | + def upload(self, new_contents, servermap): |
---|
4807 | + """ |
---|
4808 | + I overwrite the contents of the best recoverable version of this |
---|
4809 | + mutable file with new_contents, using servermap instead of |
---|
4810 | + creating/updating our own servermap. I return a Deferred that |
---|
4811 | + fires with the results of my upload. |
---|
4812 | + """ |
---|
4813 | + return self._do_serialized(self._upload, new_contents, servermap) |
---|
4814 | + |
---|
4815 | + |
---|
4816 | + def _upload(self, new_contents, servermap): |
---|
4817 | + """ |
---|
4818 | + I am the serialized sibling of upload. |
---|
4819 | + """ |
---|
4820 | + d = self.get_best_mutable_version(servermap) |
---|
4821 | + return d.addCallback(lambda mfv: mfv.overwrite(new_contents)) |
---|
4822 | + |
---|
4823 | + |
---|
4824 | + def modify(self, modifier, backoffer=None): |
---|
4825 | + """ |
---|
4826 | + I modify the contents of the best recoverable version of this |
---|
4827 | + mutable file with the modifier. This is equivalent to calling |
---|
4828 | + modify on the result of get_best_mutable_version. I return a |
---|
4829 | + Deferred that eventually fires with an UploadResults instance |
---|
4830 | + describing this process. |
---|
4831 | + """ |
---|
4832 | + return self._do_serialized(self._modify, modifier, backoffer) |
---|
4833 | + |
---|
4834 | + |
---|
4835 | + def _modify(self, modifier, backoffer): |
---|
4836 | + """ |
---|
4837 | + I am the serialized sibling of modify. |
---|
4838 | + """ |
---|
4839 | + d = self.get_best_mutable_version() |
---|
4840 | + return d.addCallback(lambda mfv: mfv.modify(modifier, backoffer)) |
---|
4841 | + |
---|
4842 | + |
---|
4843 | + def download_version(self, servermap, version, fetch_privkey=False): |
---|
4844 | + """ |
---|
4845 | + Download the specified version of this mutable file. I return a |
---|
4846 | + Deferred that fires with the contents of the specified version |
---|
4847 | + as a bytestring, or errbacks if the file is not recoverable. |
---|
4848 | + """ |
---|
4849 | + d = self.get_readable_version(servermap, version) |
---|
4850 | + return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey)) |
---|
4851 | + |
---|
4852 | + |
---|
4853 | + def get_servermap(self, mode): |
---|
4854 | + """ |
---|
4855 | + I return a Deferred that fires with a servermap updated in mode. |
---|
4856 | + |
---|
4857 | + mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or |
---|
4858 | + MODE_ANYTHING. See servermap.py for more on what these mean. |
---|
4859 | + """ |
---|
4860 | + return self._do_serialized(self._get_servermap, mode) |
---|
4861 | + |
---|
4862 | + |
---|
4863 | + def _get_servermap(self, mode): |
---|
4864 | + """ |
---|
4865 | + I am a serialized twin to get_servermap. |
---|
4866 | + """ |
---|
4867 | servermap = ServerMap() |
---|
4868 | hunk ./src/allmydata/mutable/filenode.py 594 |
---|
4869 | - d = self._update_servermap(servermap, mode=MODE_WRITE) |
---|
4870 | - d.addCallback(lambda ignored: self._upload(new_contents, servermap)) |
---|
4871 | + return self._update_servermap(servermap, mode) |
---|
4872 | + |
---|
4873 | + |
---|
4874 | + def _update_servermap(self, servermap, mode): |
---|
4875 | + u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap, |
---|
4876 | + mode) |
---|
4877 | + if self._history: |
---|
4878 | + self._history.notify_mapupdate(u.get_status()) |
---|
4879 | + return u.update() |
---|
4880 | + |
---|
4881 | + |
---|
4882 | + def set_version(self, version): |
---|
4883 | + # I can be set in two ways: |
---|
4884 | + # 1. When the node is created. |
---|
4885 | + # 2. (for an existing share) when the Servermap is updated |
---|
4886 | + # before I am read. |
---|
4887 | + assert version in (MDMF_VERSION, SDMF_VERSION) |
---|
4888 | + self._protocol_version = version |
---|
4889 | + |
---|
4890 | + |
---|
4891 | + def get_version(self): |
---|
4892 | + return self._protocol_version |
---|
4893 | + |
---|
4894 | + |
---|
4895 | + def _do_serialized(self, cb, *args, **kwargs): |
---|
4896 | + # note: to avoid deadlock, this callable is *not* allowed to invoke |
---|
4897 | + # other serialized methods within this (or any other) |
---|
4898 | + # MutableFileNode. The callable should be a bound method of this same |
---|
4899 | + # MFN instance. |
---|
4900 | + d = defer.Deferred() |
---|
4901 | + self._serializer.addCallback(lambda ignore: cb(*args, **kwargs)) |
---|
4902 | + # we need to put off d.callback until this Deferred is finished being |
---|
4903 | + # processed. Otherwise the caller's subsequent activities (like, |
---|
4904 | + # doing other things with this node) can cause reentrancy problems in |
---|
4905 | + # the Deferred code itself |
---|
4906 | + self._serializer.addBoth(lambda res: eventually(d.callback, res)) |
---|
4907 | + # add a log.err just in case something really weird happens, because |
---|
4908 | + # self._serializer stays around forever, therefore we won't see the |
---|
4909 | + # usual Unhandled Error in Deferred that would give us a hint. |
---|
4910 | + self._serializer.addErrback(log.err) |
---|
4911 | return d |
---|
4912 | |
---|
4913 | |
---|
4914 | hunk ./src/allmydata/mutable/filenode.py 637 |
---|
4915 | + def _upload(self, new_contents, servermap): |
---|
4916 | + """ |
---|
4917 | + A MutableFileNode still has to have some way of getting |
---|
4918 | + published initially, which is what I am here for. After that, |
---|
4919 | + all publishing, updating, modifying and so on happens through |
---|
4920 | + MutableFileVersions. |
---|
4921 | + """ |
---|
4922 | + assert self._pubkey, "update_servermap must be called before publish" |
---|
4923 | + |
---|
4924 | + p = Publish(self, self._storage_broker, servermap) |
---|
4925 | + if self._history: |
---|
4926 | + self._history.notify_publish(p.get_status(), |
---|
4927 | + new_contents.get_size()) |
---|
4928 | + d = p.publish(new_contents) |
---|
4929 | + d.addCallback(self._did_upload, new_contents.get_size()) |
---|
4930 | + return d |
---|
4931 | + |
---|
4932 | + |
---|
4933 | + def _did_upload(self, res, size): |
---|
4934 | + self._most_recent_size = size |
---|
4935 | + return res |
---|
4936 | + |
---|
4937 | + |
---|
4938 | +class MutableFileVersion: |
---|
4939 | + """ |
---|
4940 | + I represent a specific version (most likely the best version) of a |
---|
4941 | + mutable file. |
---|
4942 | + |
---|
4943 | + Since I implement IReadable, instances which hold a |
---|
4944 | + reference to an instance of me are guaranteed the ability (absent |
---|
4945 | + connection difficulties or unrecoverable versions) to read the file |
---|
4946 | + that I represent. Depending on whether I was initialized with a |
---|
4947 | + write capability or not, I may also provide callers the ability to |
---|
4948 | + overwrite or modify the contents of the mutable file that I |
---|
4949 | + reference. |
---|
4950 | + """ |
---|
4951 | + implements(IMutableFileVersion, IWritable) |
---|
4952 | + |
---|
4953 | + def __init__(self, |
---|
4954 | + node, |
---|
4955 | + servermap, |
---|
4956 | + version, |
---|
4957 | + storage_index, |
---|
4958 | + storage_broker, |
---|
4959 | + readcap, |
---|
4960 | + writekey=None, |
---|
4961 | + write_secrets=None, |
---|
4962 | + history=None): |
---|
4963 | + |
---|
4964 | + self._node = node |
---|
4965 | + self._servermap = servermap |
---|
4966 | + self._version = version |
---|
4967 | + self._storage_index = storage_index |
---|
4968 | + self._write_secrets = write_secrets |
---|
4969 | + self._history = history |
---|
4970 | + self._storage_broker = storage_broker |
---|
4971 | + |
---|
4972 | + #assert isinstance(readcap, IURI) |
---|
4973 | + self._readcap = readcap |
---|
4974 | + |
---|
4975 | + self._writekey = writekey |
---|
4976 | + self._serializer = defer.succeed(None) |
---|
4977 | + self._size = None |
---|
4978 | + |
---|
4979 | + |
---|
4980 | + def get_sequence_number(self): |
---|
4981 | + """ |
---|
4982 | + Get the sequence number of the mutable version that I represent. |
---|
4983 | + """ |
---|
4984 | + return self._version[0] # verinfo[0] == the sequence number |
---|
4985 | + |
---|
4986 | + |
---|
4987 | + # TODO: Terminology? |
---|
4988 | + def get_writekey(self): |
---|
4989 | + """ |
---|
4990 | + I return a writekey or None if I don't have a writekey. |
---|
4991 | + """ |
---|
4992 | + return self._writekey |
---|
4993 | + |
---|
4994 | + |
---|
4995 | + def overwrite(self, new_contents): |
---|
4996 | + """ |
---|
4997 | + I overwrite the contents of this mutable file version with the |
---|
4998 | + data in new_contents. |
---|
4999 | + """ |
---|
5000 | + assert not self.is_readonly() |
---|
5001 | + |
---|
5002 | + return self._do_serialized(self._overwrite, new_contents) |
---|
5003 | + |
---|
5004 | + |
---|
5005 | + def _overwrite(self, new_contents): |
---|
5006 | + assert IMutableUploadable.providedBy(new_contents) |
---|
5007 | + assert self._servermap.last_update_mode == MODE_WRITE |
---|
5008 | + |
---|
5009 | + return self._upload(new_contents) |
---|
5010 | + |
---|
5011 | + |
---|
5012 | def modify(self, modifier, backoffer=None): |
---|
5013 | """I use a modifier callback to apply a change to the mutable file. |
---|
5014 | I implement the following pseudocode:: |
---|
5015 | hunk ./src/allmydata/mutable/filenode.py 774 |
---|
5016 | backoffer should not invoke any methods on this MutableFileNode |
---|
5017 | instance, and it needs to be highly conscious of deadlock issues. |
---|
5018 | """ |
---|
5019 | + assert not self.is_readonly() |
---|
5020 | + |
---|
5021 | return self._do_serialized(self._modify, modifier, backoffer) |
---|
5022 | hunk ./src/allmydata/mutable/filenode.py 777 |
---|
5023 | + |
---|
5024 | + |
---|
5025 | def _modify(self, modifier, backoffer): |
---|
5026 | hunk ./src/allmydata/mutable/filenode.py 780 |
---|
5027 | - servermap = ServerMap() |
---|
5028 | if backoffer is None: |
---|
5029 | backoffer = BackoffAgent().delay |
---|
5030 | hunk ./src/allmydata/mutable/filenode.py 782 |
---|
5031 | - return self._modify_and_retry(servermap, modifier, backoffer, True) |
---|
5032 | - def _modify_and_retry(self, servermap, modifier, backoffer, first_time): |
---|
5033 | - d = self._modify_once(servermap, modifier, first_time) |
---|
5034 | + return self._modify_and_retry(modifier, backoffer, True) |
---|
5035 | + |
---|
5036 | + |
---|
5037 | + def _modify_and_retry(self, modifier, backoffer, first_time): |
---|
5038 | + """ |
---|
5039 | + I try to apply modifier to the contents of this version of the |
---|
5040 | + mutable file. If I succeed, I return an UploadResults instance |
---|
5041 | + describing my success. If I fail, I try again after waiting for |
---|
5042 | + a little bit. |
---|
5043 | + """ |
---|
5044 | + log.msg("doing modify") |
---|
5045 | + d = self._modify_once(modifier, first_time) |
---|
5046 | def _retry(f): |
---|
5047 | f.trap(UncoordinatedWriteError) |
---|
5048 | d2 = defer.maybeDeferred(backoffer, self, f) |
---|
5049 | hunk ./src/allmydata/mutable/filenode.py 798 |
---|
5050 | d2.addCallback(lambda ignored: |
---|
5051 | - self._modify_and_retry(servermap, modifier, |
---|
5052 | + self._modify_and_retry(modifier, |
---|
5053 | backoffer, False)) |
---|
5054 | return d2 |
---|
5055 | d.addErrback(_retry) |
---|
5056 | hunk ./src/allmydata/mutable/filenode.py 803 |
---|
5057 | return d |
---|
5058 | - def _modify_once(self, servermap, modifier, first_time): |
---|
5059 | - d = self._update_servermap(servermap, MODE_WRITE) |
---|
5060 | - d.addCallback(self._once_updated_download_best_version, servermap) |
---|
5061 | + |
---|
5062 | + |
---|
5063 | + def _modify_once(self, modifier, first_time): |
---|
5064 | + """ |
---|
5065 | + I attempt to apply a modifier to the contents of the mutable |
---|
5066 | + file. |
---|
5067 | + """ |
---|
5068 | + # XXX: This is wrong -- we could get more servers if we updated |
---|
5069 | + # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to |
---|
5070 | + # assert that the last update wasn't MODE_READ |
---|
5071 | + assert self._servermap.last_update_mode == MODE_WRITE |
---|
5072 | + |
---|
5073 | + # download_to_data is serialized, so we have to call this to |
---|
5074 | + # avoid deadlock. |
---|
5075 | + d = self._try_to_download_data() |
---|
5076 | def _apply(old_contents): |
---|
5077 | hunk ./src/allmydata/mutable/filenode.py 819 |
---|
5078 | - new_contents = modifier(old_contents, servermap, first_time) |
---|
5079 | + new_contents = modifier(old_contents, self._servermap, first_time) |
---|
5080 | + precondition((isinstance(new_contents, str) or |
---|
5081 | + new_contents is None), |
---|
5082 | + "Modifier function must return a string " |
---|
5083 | + "or None") |
---|
5084 | + |
---|
5085 | if new_contents is None or new_contents == old_contents: |
---|
5086 | hunk ./src/allmydata/mutable/filenode.py 826 |
---|
5087 | + log.msg("no changes") |
---|
5088 | # no changes need to be made |
---|
5089 | if first_time: |
---|
5090 | return |
---|
5091 | hunk ./src/allmydata/mutable/filenode.py 834 |
---|
5092 | # recovery when it observes UCWE, we need to do a second |
---|
5093 | # publish. See #551 for details. We'll basically loop until |
---|
5094 | # we managed an uncontested publish. |
---|
5095 | - new_contents = old_contents |
---|
5096 | - precondition(isinstance(new_contents, str), |
---|
5097 | - "Modifier function must return a string or None") |
---|
5098 | - return self._upload(new_contents, servermap) |
---|
5099 | + old_uploadable = MutableData(old_contents) |
---|
5100 | + new_contents = old_uploadable |
---|
5101 | + else: |
---|
5102 | + new_contents = MutableData(new_contents) |
---|
5103 | + |
---|
5104 | + return self._upload(new_contents) |
---|
5105 | d.addCallback(_apply) |
---|
5106 | return d |
---|
5107 | |
---|
5108 | hunk ./src/allmydata/mutable/filenode.py 843 |
---|
5109 | - def get_servermap(self, mode): |
---|
5110 | - return self._do_serialized(self._get_servermap, mode) |
---|
5111 | - def _get_servermap(self, mode): |
---|
5112 | - servermap = ServerMap() |
---|
5113 | - return self._update_servermap(servermap, mode) |
---|
5114 | - def _update_servermap(self, servermap, mode): |
---|
5115 | - u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap, |
---|
5116 | - mode) |
---|
5117 | - if self._history: |
---|
5118 | - self._history.notify_mapupdate(u.get_status()) |
---|
5119 | - return u.update() |
---|
5120 | |
---|
5121 | hunk ./src/allmydata/mutable/filenode.py 844 |
---|
5122 | - def download_version(self, servermap, version, fetch_privkey=False): |
---|
5123 | - return self._do_serialized(self._try_once_to_download_version, |
---|
5124 | - servermap, version, fetch_privkey) |
---|
5125 | - def _try_once_to_download_version(self, servermap, version, |
---|
5126 | - fetch_privkey=False): |
---|
5127 | - r = Retrieve(self, servermap, version, fetch_privkey) |
---|
5128 | + def is_readonly(self): |
---|
5129 | + """ |
---|
5130 | + I return True if this MutableFileVersion provides no write |
---|
5131 | + access to the file that it encapsulates, and False if it |
---|
5132 | + provides the ability to modify the file. |
---|
5133 | + """ |
---|
5134 | + return self._writekey is None |
---|
5135 | + |
---|
5136 | + |
---|
5137 | + def is_mutable(self): |
---|
5138 | + """ |
---|
5139 | + I return True, since mutable files are always mutable by |
---|
5140 | + somebody. |
---|
5141 | + """ |
---|
5142 | + return True |
---|
5143 | + |
---|
5144 | + |
---|
5145 | + def get_storage_index(self): |
---|
5146 | + """ |
---|
5147 | + I return the storage index of the reference that I encapsulate. |
---|
5148 | + """ |
---|
5149 | + return self._storage_index |
---|
5150 | + |
---|
5151 | + |
---|
5152 | + def get_size(self): |
---|
5153 | + """ |
---|
5154 | + I return the length, in bytes, of this readable object. |
---|
5155 | + """ |
---|
5156 | + return self._servermap.size_of_version(self._version) |
---|
5157 | + |
---|
5158 | + |
---|
5159 | + def download_to_data(self, fetch_privkey=False): |
---|
5160 | + """ |
---|
5161 | + I return a Deferred that fires with the contents of this |
---|
5162 | + readable object as a byte string. |
---|
5163 | + |
---|
5164 | + """ |
---|
5165 | + c = consumer.MemoryConsumer() |
---|
5166 | + d = self.read(c, fetch_privkey=fetch_privkey) |
---|
5167 | + d.addCallback(lambda mc: "".join(mc.chunks)) |
---|
5168 | + return d |
---|
5169 | + |
---|
5170 | + |
---|
5171 | + def _try_to_download_data(self): |
---|
5172 | + """ |
---|
5173 | + I am an unserialized cousin of download_to_data; I am called |
---|
5174 | + from the children of modify() to download the data associated |
---|
5175 | + with this mutable version. |
---|
5176 | + """ |
---|
5177 | + c = consumer.MemoryConsumer() |
---|
5178 | + # modify will almost certainly write, so we need the privkey. |
---|
5179 | + d = self._read(c, fetch_privkey=True) |
---|
5180 | + d.addCallback(lambda mc: "".join(mc.chunks)) |
---|
5181 | + return d |
---|
5182 | + |
---|
5183 | + |
---|
5184 | + def _update_servermap(self, mode=MODE_READ): |
---|
5185 | + """ |
---|
5186 | + I update our Servermap according to my mode argument. I return a |
---|
5187 | + Deferred that fires with None when this has finished. The |
---|
5188 | + updated Servermap will be at self._servermap in that case. |
---|
5189 | + """ |
---|
5190 | + d = self._node.get_servermap(mode) |
---|
5191 | + |
---|
5192 | + def _got_servermap(servermap): |
---|
5193 | + assert servermap.last_update_mode == mode |
---|
5194 | + |
---|
5195 | + self._servermap = servermap |
---|
5196 | + d.addCallback(_got_servermap) |
---|
5197 | + return d |
---|
5198 | + |
---|
5199 | + |
---|
5200 | + def read(self, consumer, offset=0, size=None, fetch_privkey=False): |
---|
5201 | + """ |
---|
5202 | + I read a portion (possibly all) of the mutable file that I |
---|
5203 | + reference into consumer. |
---|
5204 | + """ |
---|
5205 | + return self._do_serialized(self._read, consumer, offset, size, |
---|
5206 | + fetch_privkey) |
---|
5207 | + |
---|
5208 | + |
---|
5209 | + def _read(self, consumer, offset=0, size=None, fetch_privkey=False): |
---|
5210 | + """ |
---|
5211 | + I am the serialized companion of read. |
---|
5212 | + """ |
---|
5213 | + r = Retrieve(self._node, self._servermap, self._version, fetch_privkey) |
---|
5214 | if self._history: |
---|
5215 | self._history.notify_retrieve(r.get_status()) |
---|
5216 | hunk ./src/allmydata/mutable/filenode.py 932 |
---|
5217 | - d = r.download() |
---|
5218 | - d.addCallback(self._downloaded_version) |
---|
5219 | + d = r.download(consumer, offset, size) |
---|
5220 | return d |
---|
5221 | hunk ./src/allmydata/mutable/filenode.py 934 |
---|
5222 | - def _downloaded_version(self, data): |
---|
5223 | - self._most_recent_size = len(data) |
---|
5224 | - return data |
---|
5225 | |
---|
5226 | hunk ./src/allmydata/mutable/filenode.py 935 |
---|
5227 | - def upload(self, new_contents, servermap): |
---|
5228 | - return self._do_serialized(self._upload, new_contents, servermap) |
---|
5229 | - def _upload(self, new_contents, servermap): |
---|
5230 | - assert self._pubkey, "update_servermap must be called before publish" |
---|
5231 | - p = Publish(self, self._storage_broker, servermap) |
---|
5232 | + |
---|
5233 | + def _do_serialized(self, cb, *args, **kwargs): |
---|
5234 | + # note: to avoid deadlock, this callable is *not* allowed to invoke |
---|
5235 | + # other serialized methods within this (or any other) |
---|
5236 | + # MutableFileNode. The callable should be a bound method of this same |
---|
5237 | + # MFN instance. |
---|
5238 | + d = defer.Deferred() |
---|
5239 | + self._serializer.addCallback(lambda ignore: cb(*args, **kwargs)) |
---|
5240 | + # we need to put off d.callback until this Deferred is finished being |
---|
5241 | + # processed. Otherwise the caller's subsequent activities (like, |
---|
5242 | + # doing other things with this node) can cause reentrancy problems in |
---|
5243 | + # the Deferred code itself |
---|
5244 | + self._serializer.addBoth(lambda res: eventually(d.callback, res)) |
---|
5245 | + # add a log.err just in case something really weird happens, because |
---|
5246 | + # self._serializer stays around forever, therefore we won't see the |
---|
5247 | + # usual Unhandled Error in Deferred that would give us a hint. |
---|
5248 | + self._serializer.addErrback(log.err) |
---|
5249 | + return d |
---|
5250 | + |
---|
5251 | + |
---|
5252 | + def _upload(self, new_contents): |
---|
5253 | + #assert self._pubkey, "update_servermap must be called before publish" |
---|
5254 | + p = Publish(self._node, self._storage_broker, self._servermap) |
---|
5255 | if self._history: |
---|
5256 | hunk ./src/allmydata/mutable/filenode.py 959 |
---|
5257 | - self._history.notify_publish(p.get_status(), len(new_contents)) |
---|
5258 | + self._history.notify_publish(p.get_status(), |
---|
5259 | + new_contents.get_size()) |
---|
5260 | d = p.publish(new_contents) |
---|
5261 | hunk ./src/allmydata/mutable/filenode.py 962 |
---|
5262 | - d.addCallback(self._did_upload, len(new_contents)) |
---|
5263 | + d.addCallback(self._did_upload, new_contents.get_size()) |
---|
5264 | return d |
---|
5265 | hunk ./src/allmydata/mutable/filenode.py 964 |
---|
5266 | + |
---|
5267 | + |
---|
5268 | def _did_upload(self, res, size): |
---|
5269 | hunk ./src/allmydata/mutable/filenode.py 967 |
---|
5270 | - self._most_recent_size = size |
---|
5271 | + self._size = size |
---|
5272 | return res |
---|
5273 | hunk ./src/allmydata/mutable/filenode.py 969 |
---|
5274 | + |
---|
5275 | + def update(self, data, offset): |
---|
5276 | + """ |
---|
5277 | + Do an update of this mutable file version by writing data at |
---|
5278 | + offset within the file. If offset is the EOF, this is an append |
---|
5279 | + operation. I return a Deferred that fires with the results of |
---|
5280 | + the update operation when it has completed. |
---|
5281 | + |
---|
5282 | + In cases where update does not append any data, or where it does |
---|
5283 | + not append so many blocks that the block count crosses a |
---|
5284 | + power-of-two boundary, this operation will use roughly |
---|
5285 | + O(data.get_size()) memory/bandwidth/CPU to perform the update. |
---|
5286 | + Otherwise, it must download, re-encode, and upload the entire |
---|
5287 | + file again, which will use O(filesize) resources. |
---|
5288 | + """ |
---|
5289 | + return self._do_serialized(self._update, data, offset) |
---|
5290 | + |
---|
5291 | + |
---|
5292 | + def _update(self, data, offset): |
---|
5293 | + """ |
---|
5294 | + I update the mutable file version represented by this particular |
---|
5295 | + IMutableFileVersion by writing the data in data at the offset |
---|
5296 | + offset. I return a Deferred that fires when this has been |
---|
5297 | + completed. |
---|
5298 | + """ |
---|
5299 | + # We have two cases here: |
---|
5300 | + # 1. The new data will add few enough segments so that it does |
---|
5301 | + # not cross into the next power-of-two boundary. |
---|
5302 | + # 2. It adds enough segments to cross that boundary. |
---|
5303 | + # |
---|
5304 | + # In the former case, we can modify the file in place. In the |
---|
5305 | + # latter case, we need to re-encode the file. |
---|
5306 | + new_size = data.get_size() + offset |
---|
5307 | + old_size = self.get_size() |
---|
5308 | + segment_size = self._version[3] |
---|
5309 | + num_old_segments = mathutil.div_ceil(old_size, |
---|
5310 | + segment_size) |
---|
5311 | + num_new_segments = mathutil.div_ceil(new_size, |
---|
5312 | + segment_size) |
---|
5313 | + log.msg("got %d old segments, %d new segments" % \ |
---|
5314 | + (num_old_segments, num_new_segments)) |
---|
5315 | + |
---|
5316 | + # We also do a whole file re-encode if the file is an SDMF file. |
---|
5317 | + if self._version[2]: # version[2] == SDMF salt, which MDMF lacks |
---|
5318 | + log.msg("doing re-encode instead of in-place update") |
---|
5319 | + return self._do_modify_update(data, offset) |
---|
5320 | + |
---|
5321 | + log.msg("updating in place") |
---|
5322 | + d = self._do_update_update(data, offset) |
---|
5323 | + d.addCallback(self._decode_and_decrypt_segments, data, offset) |
---|
5324 | + d.addCallback(self._build_uploadable_and_finish, data, offset) |
---|
5325 | + return d |
---|
5326 | + |
---|
5327 | + |
---|
5328 | + def _do_modify_update(self, data, offset): |
---|
5329 | + """ |
---|
5330 | + I perform a file update by modifying the contents of the file |
---|
5331 | + after downloading it, then reuploading it. I am less efficient |
---|
5332 | + than _do_update_update, but am necessary for certain updates. |
---|
5333 | + """ |
---|
5334 | + def m(old, servermap, first_time): |
---|
5335 | + start = offset |
---|
5336 | + rest = offset + data.get_size() |
---|
5337 | + new = old[:start] |
---|
5338 | + new += "".join(data.read(data.get_size())) |
---|
5339 | + new += old[rest:] |
---|
5340 | + return new |
---|
5341 | + return self._modify(m, None) |
---|
5342 | + |
---|
5343 | + |
---|
5344 | + def _do_update_update(self, data, offset): |
---|
5345 | + """ |
---|
5346 | + I start the Servermap update that gets us the data we need to |
---|
5347 | + continue the update process. I return a Deferred that fires when |
---|
5348 | + the servermap update is done. |
---|
5349 | + """ |
---|
5350 | + assert IMutableUploadable.providedBy(data) |
---|
5351 | + assert self.is_mutable() |
---|
5352 | + # offset == self.get_size() is valid and means that we are |
---|
5353 | + # appending data to the file. |
---|
5354 | + assert offset <= self.get_size() |
---|
5355 | + |
---|
5356 | + datasize = data.get_size() |
---|
5357 | + # We'll need the segment that the data starts in, regardless of |
---|
5358 | + # what we'll do later. |
---|
5359 | + start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE) |
---|
5360 | + start_segment -= 1 |
---|
5361 | + |
---|
5362 | + # We only need the end segment if the data we append does not go |
---|
5363 | + # beyond the current end-of-file. |
---|
5364 | + end_segment = start_segment |
---|
5365 | + if offset + data.get_size() < self.get_size(): |
---|
5366 | + end_data = offset + data.get_size() |
---|
5367 | + end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE) |
---|
5368 | + end_segment -= 1 |
---|
5369 | + self._start_segment = start_segment |
---|
5370 | + self._end_segment = end_segment |
---|
5371 | + |
---|
5372 | + # Now ask for the servermap to be updated in MODE_WRITE with |
---|
5373 | + # this update range. |
---|
5374 | + u = ServermapUpdater(self._node, self._storage_broker, Monitor(), |
---|
5375 | + self._servermap, |
---|
5376 | + mode=MODE_WRITE, |
---|
5377 | + update_range=(start_segment, end_segment)) |
---|
5378 | + return u.update() |
---|
5379 | + |
---|
5380 | + |
---|
5381 | + def _decode_and_decrypt_segments(self, ignored, data, offset): |
---|
5382 | + """ |
---|
5383 | + After the servermap update, I take the encrypted and encoded |
---|
5384 | + data that the servermap fetched while doing its update and |
---|
5385 | + transform it into decoded-and-decrypted plaintext that can be |
---|
5386 | + used by the new uploadable. I return a Deferred that fires with |
---|
5387 | + the segments. |
---|
5388 | + """ |
---|
5389 | + r = Retrieve(self._node, self._servermap, self._version) |
---|
5390 | + # decode: takes in our blocks and salts from the servermap, |
---|
5391 | + # returns a Deferred that fires with the corresponding plaintext |
---|
5392 | + # segments. Does not download -- simply takes advantage of |
---|
5393 | + # existing infrastructure within the Retrieve class to avoid |
---|
5394 | + # duplicating code. |
---|
5395 | + sm = self._servermap |
---|
5396 | + # XXX: If the methods in the servermap don't work as |
---|
5397 | + # abstractions, you should rewrite them instead of going around |
---|
5398 | + # them. |
---|
5399 | + update_data = sm.update_data |
---|
5400 | + start_segments = {} # shnum -> start segment |
---|
5401 | + end_segments = {} # shnum -> end segment |
---|
5402 | + blockhashes = {} # shnum -> blockhash tree |
---|
5403 | + for (shnum, data) in update_data.iteritems(): |
---|
5404 | + data = [d[1] for d in data if d[0] == self._version] |
---|
5405 | + |
---|
5406 | + # Every data entry in our list should now be share shnum for |
---|
5407 | + # a particular version of the mutable file, so all of the |
---|
5408 | + # entries should be identical. |
---|
5409 | + datum = data[0] |
---|
5410 | + assert filter(lambda x: x != datum, data) == [] |
---|
5411 | + |
---|
5412 | + blockhashes[shnum] = datum[0] |
---|
5413 | + start_segments[shnum] = datum[1] |
---|
5414 | + end_segments[shnum] = datum[2] |
---|
5415 | + |
---|
5416 | + d1 = r.decode(start_segments, self._start_segment) |
---|
5417 | + d2 = r.decode(end_segments, self._end_segment) |
---|
5418 | + d3 = defer.succeed(blockhashes) |
---|
5419 | + return deferredutil.gatherResults([d1, d2, d3]) |
---|
5420 | + |
---|
5421 | + |
---|
5422 | + def _build_uploadable_and_finish(self, segments_and_bht, data, offset): |
---|
5423 | + """ |
---|
5424 | + After the process has the plaintext segments, I build the |
---|
5425 | + TransformingUploadable that the publisher will eventually |
---|
5426 | + re-upload to the grid. I then invoke the publisher with that |
---|
5427 | + uploadable, and return a Deferred that fires when the publish |
---|
5428 | + operation has completed without issue. |
---|
5429 | + """ |
---|
5430 | + u = TransformingUploadable(data, offset, |
---|
5431 | + self._version[3], |
---|
5432 | + segments_and_bht[0], |
---|
5433 | + segments_and_bht[1]) |
---|
5434 | + p = Publish(self._node, self._storage_broker, self._servermap) |
---|
5435 | + return p.update(u, offset, segments_and_bht[2], self._version) |
---|
5436 | } |
---|
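The update-range arithmetic in the patch above can be tried in isolation. The following is a minimal sketch, not the patch's code: `update_range` is a hypothetical helper name, and `div_ceil` here is a stand-in assumed to behave like `mathutil.div_ceil` for non-negative integers. Only the segment-size constant and the formulas are taken from the patch.

```python
KiB = 1024
DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB  # same value the publish.py patch defines


def div_ceil(n, d):
    """Integer ceiling division; stand-in for mathutil.div_ceil (assumed)."""
    return (n + d - 1) // d


def update_range(offset, new_data_size, current_size,
                 segment_size=DEFAULT_MAX_SEGMENT_SIZE):
    """Mirror the start/end segment arithmetic from the patch (hypothetical helper)."""
    # offset == current_size is valid and means "append".
    assert offset <= current_size
    start_segment = div_ceil(offset, segment_size) - 1
    end_segment = start_segment
    # The end segment is only recomputed when the write stops before EOF.
    if offset + new_data_size < current_size:
        end_data = offset + new_data_size
        end_segment = div_ceil(end_data, segment_size) - 1
    return start_segment, end_segment


seg = DEFAULT_MAX_SEGMENT_SIZE
print(update_range(seg + 1, 10, 4 * seg))  # write just inside the second segment
print(update_range(1, 2 * seg, 4 * seg))   # write spanning three segments
print(update_range(0, 10, 4 * seg))        # write at the very start of the file
print(update_range(seg, 10, 4 * seg))      # write starting exactly on a boundary
```

With this formula an offset of 0 yields a start segment of -1, and an offset that falls exactly on a segment boundary yields the index of the preceding segment. The publish.py patch in this bundle special-cases `offset == 0` when it computes its own `starting_segment`.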
5437 | [mutable/publish.py: Modify the publish process to support MDMF |
---|
5438 | Kevan Carstensen <kevan@isnotajoke.com>**20100811233101 |
---|
5439 | Ignore-this: c2eb57cf67da7af5ad02be793e918bc6 |
---|
5440 | |
---|
5441 | The inner workings of the publishing process needed to be reworked to a |
---|
5442 | large extent to cope with segmented mutable files, and to cope with |
---|
5443 | partial-file updates of mutable files. This patch does that. It also |
---|
5444 | introduces wrappers for uploadable data, allowing the use of |
---|
5445 | filehandle-like objects as data sources, in addition to strings. This |
---|
5446 | reduces memory inefficiency when dealing with large files through the |
---|
5447 | webapi, and clarifies update code there. |
---|
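As a rough illustration of the uploadable-wrapper idea described in this message, the sketch below wraps a filehandle-like object behind `get_size()` and `read(length)`, so that data can be consumed one segment at a time instead of being held as one large string. The class and function names are hypothetical and are not the classes this patch adds. The interface shape (a `read` that returns a list of strings which the caller joins) is inferred from how the publisher code in this patch calls its data source.

```python
import io
import os


class FileHandleUploadable(object):
    """Hypothetical wrapper exposing get_size() and read() over a filehandle."""

    def __init__(self, filehandle):
        self._fh = filehandle
        # Measure the size once by seeking to the end, then rewind.
        self._fh.seek(0, os.SEEK_END)
        self._size = self._fh.tell()
        self._fh.seek(0)

    def get_size(self):
        return self._size

    def read(self, length):
        # Return a list of chunks; the caller is expected to join them.
        return [self._fh.read(length)]


class StringUploadable(FileHandleUploadable):
    """Hypothetical convenience wrapper for data that is already in memory."""

    def __init__(self, data):
        FileHandleUploadable.__init__(self, io.BytesIO(data))


def iter_segments(uploadable, segment_size):
    """Yield segment_size-byte segments; the last one may be shorter."""
    remaining = uploadable.get_size()
    while remaining > 0:
        want = min(segment_size, remaining)
        segment = b"".join(uploadable.read(want))
        assert len(segment) == want
        remaining -= want
        yield segment


segments = list(iter_segments(StringUploadable(b"x" * 10), 4))
print([len(s) for s in segments])
```

Because only one segment is read at a time, peak memory use in a consumer like `iter_segments` is bounded by the segment size rather than by the file size, which is the benefit the message attributes to the wrappers.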
5448 | ] { |
---|
5449 | hunk ./src/allmydata/mutable/publish.py 4 |
---|
5450 | |
---|
5451 | |
---|
5452 | import os, struct, time |
---|
5453 | +from StringIO import StringIO |
---|
5454 | from itertools import count |
---|
5455 | from zope.interface import implements |
---|
5456 | from twisted.internet import defer |
---|
5457 | hunk ./src/allmydata/mutable/publish.py 9 |
---|
5458 | from twisted.python import failure |
---|
5459 | -from allmydata.interfaces import IPublishStatus |
---|
5460 | +from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \ |
---|
5461 | + IMutableUploadable |
---|
5462 | from allmydata.util import base32, hashutil, mathutil, idlib, log |
---|
5463 | from allmydata import hashtree, codec |
---|
5464 | from allmydata.storage.server import si_b2a |
---|
5465 | hunk ./src/allmydata/mutable/publish.py 21 |
---|
5466 | UncoordinatedWriteError, NotEnoughServersError |
---|
5467 | from allmydata.mutable.servermap import ServerMap |
---|
5468 | from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \ |
---|
5469 | - unpack_checkstring, SIGNED_PREFIX |
---|
5470 | + unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \ |
---|
5471 | + SDMFSlotWriteProxy |
---|
5472 | + |
---|
5473 | +KiB = 1024 |
---|
5474 | +DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB |
---|
5475 | +PUSHING_BLOCKS_STATE = 0 |
---|
5476 | +PUSHING_EVERYTHING_ELSE_STATE = 1 |
---|
5477 | +DONE_STATE = 2 |
---|
5478 | |
---|
5479 | class PublishStatus: |
---|
5480 | implements(IPublishStatus) |
---|
5481 | hunk ./src/allmydata/mutable/publish.py 118 |
---|
5482 | self._status.set_helper(False) |
---|
5483 | self._status.set_progress(0.0) |
---|
5484 | self._status.set_active(True) |
---|
5485 | + self._version = self._node.get_version() |
---|
5486 | + assert self._version in (SDMF_VERSION, MDMF_VERSION) |
---|
5487 | + |
---|
5488 | |
---|
5489 | def get_status(self): |
---|
5490 | return self._status |
---|
5491 | hunk ./src/allmydata/mutable/publish.py 132 |
---|
5492 | kwargs["facility"] = "tahoe.mutable.publish" |
---|
5493 | return log.msg(*args, **kwargs) |
---|
5494 | |
---|
5495 | + |
---|
5496 | + def update(self, data, offset, blockhashes, version): |
---|
5497 | + """ |
---|
5498 | + I replace the contents of this file with the contents of data, |
---|
5499 | + starting at offset. I return a Deferred that fires with None |
---|
5500 | + when the replacement has been completed, or with an error if |
---|
5501 | + something went wrong during the process. |
---|
5502 | + |
---|
5503 | + Note that this process will not upload new shares. If the file |
---|
5504 | + being updated is in need of repair, callers will have to repair |
---|
5505 | + it on their own. |
---|
5506 | + """ |
---|
5507 | + # How this works: |
---|
5508 | + # 1. Make peer assignments. We'll assign each share that we know |
---|
5509 | + # about on the grid to the peer that currently holds that |
---|
5510 | + # share, and will not place any new shares. |
---|
5511 | + # 2. Set up encoding parameters. Most of these will stay the same |
---|
5512 | + # -- datalength will change, as will some of the offsets. |
---|
5513 | + # 3. Upload the new segments. |
---|
5514 | + # 4. Be done. |
---|
5515 | + assert IMutableUploadable.providedBy(data) |
---|
5516 | + |
---|
5517 | + self.data = data |
---|
5518 | + |
---|
5519 | + # XXX: Use the MutableFileVersion instead. |
---|
5520 | + self.datalength = self._node.get_size() |
---|
5521 | + if data.get_size() > self.datalength: |
---|
5522 | + self.datalength = data.get_size() |
---|
5523 | + |
---|
5524 | + self.log("starting update") |
---|
5525 | + self.log("adding new data of length %d at offset %d" % \ |
---|
5526 | + (data.get_size(), offset)) |
---|
5527 | + self.log("new data length is %d" % self.datalength) |
---|
5528 | + self._status.set_size(self.datalength) |
---|
5529 | + self._status.set_status("Started") |
---|
5530 | + self._started = time.time() |
---|
5531 | + |
---|
5532 | + self.done_deferred = defer.Deferred() |
---|
5533 | + |
---|
5534 | + self._writekey = self._node.get_writekey() |
---|
5535 | + assert self._writekey, "need write capability to publish" |
---|
5536 | + |
---|
5537 | + # first, which servers will we publish to? We require that the |
---|
5538 | + # servermap was updated in MODE_WRITE, so we can depend upon the |
---|
5539 | + # peerlist computed by that process instead of computing our own. |
---|
5540 | + assert self._servermap |
---|
5541 | + assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK) |
---|
5542 | + # we will push a version that is one larger than anything present |
---|
5543 | + # in the grid, according to the servermap. |
---|
5544 | + self._new_seqnum = self._servermap.highest_seqnum() + 1 |
---|
5545 | + self._status.set_servermap(self._servermap) |
---|
5546 | + |
---|
5547 | + self.log(format="new seqnum will be %(seqnum)d", |
---|
5548 | + seqnum=self._new_seqnum, level=log.NOISY) |
---|
5549 | + |
---|
5550 | + # We're updating an existing file, so all of the following |
---|
5551 | + # should be available. |
---|
5552 | + self.readkey = self._node.get_readkey() |
---|
5553 | + self.required_shares = self._node.get_required_shares() |
---|
5554 | + assert self.required_shares is not None |
---|
5555 | + self.total_shares = self._node.get_total_shares() |
---|
5556 | + assert self.total_shares is not None |
---|
5557 | + self._status.set_encoding(self.required_shares, self.total_shares) |
---|
5558 | + |
---|
5559 | + self._pubkey = self._node.get_pubkey() |
---|
5560 | + assert self._pubkey |
---|
5561 | + self._privkey = self._node.get_privkey() |
---|
5562 | + assert self._privkey |
---|
5563 | + self._encprivkey = self._node.get_encprivkey() |
---|
5564 | + |
---|
5565 | + sb = self._storage_broker |
---|
5566 | + full_peerlist = sb.get_servers_for_index(self._storage_index) |
---|
5567 | + self.full_peerlist = full_peerlist # for use later, immutable |
---|
5568 | + self.bad_peers = set() # peerids who have errbacked/refused requests |
---|
5569 | + |
---|
5570 | + # This will set self.segment_size, self.num_segments, and |
---|
5571 | + # self.fec. TODO: Does it know how to do the offset? Probably |
---|
5572 | + # not. So do that part next. |
---|
5573 | + self.setup_encoding_parameters(offset=offset) |
---|
5574 | + |
---|
5575 | + # if we experience any surprises (writes which were rejected because |
---|
5576 | + # our test vector did not match, or shares which we didn't expect to |
---|
5577 | + # see), we set this flag and report an UncoordinatedWriteError at the |
---|
5578 | + # end of the publish process. |
---|
5579 | + self.surprised = False |
---|
5580 | + |
---|
5581 | + # we keep track of three tables. The first is our goal: which share |
---|
5582 | + # we want to see on which servers. This is initially populated by the |
---|
5583 | + # existing servermap. |
---|
5584 | + self.goal = set() # pairs of (peerid, shnum) tuples |
---|
5585 | + |
---|
5586 | + # the second table is our list of outstanding queries: those which |
---|
5587 | + # are in flight and may or may not be delivered, accepted, or |
---|
5588 | + # acknowledged. Items are added to this table when the request is |
---|
5589 | + # sent, and removed when the response returns (or errbacks). |
---|
5590 | + self.outstanding = set() # (peerid, shnum) tuples |
---|
5591 | + |
---|
5592 | + # the third is a table of successes: shares which have actually been |
---|
5593 | + # placed. These are populated when responses come back with success. |
---|
5594 | + # When self.placed == self.goal, we're done. |
---|
5595 | + self.placed = set() # (peerid, shnum) tuples |
---|
5596 | + |
---|
5597 | + # we also keep a mapping from peerid to RemoteReference. Each time we |
---|
5598 | + # pull a connection out of the full peerlist, we add it to this for |
---|
5599 | + # use later. |
---|
5600 | + self.connections = {} |
---|
5601 | + |
---|
5602 | + self.bad_share_checkstrings = {} |
---|
5603 | + |
---|
5604 | + # This is set at the last step of the publishing process. |
---|
5605 | + self.versioninfo = "" |
---|
5606 | + |
---|
5607 | + # we use the servermap to populate the initial goal: this way we will |
---|
5608 | + # try to update each existing share in place. Since we're |
---|
5609 | + # updating, we ignore damaged and missing shares -- callers must |
---|
5610 | + # do a repair to repair and recreate these. |
---|
5611 | + for (peerid, shnum) in self._servermap.servermap: |
---|
5612 | + self.goal.add( (peerid, shnum) ) |
---|
5613 | + self.connections[peerid] = self._servermap.connections[peerid] |
---|
5614 | + self.writers = {} |
---|
5615 | + |
---|
5616 | + # SDMF files are updated differently. |
---|
5617 | + self._version = MDMF_VERSION |
---|
5618 | + writer_class = MDMFSlotWriteProxy |
---|
5619 | + |
---|
5620 | + # For each (peerid, shnum) in self.goal, we make a |
---|
5621 | + # write proxy for that peer. We'll use this to write |
---|
5622 | + # shares to the peer. |
---|
5623 | + for key in self.goal: |
---|
5624 | + peerid, shnum = key |
---|
5625 | + write_enabler = self._node.get_write_enabler(peerid) |
---|
5626 | + renew_secret = self._node.get_renewal_secret(peerid) |
---|
5627 | + cancel_secret = self._node.get_cancel_secret(peerid) |
---|
5628 | + secrets = (write_enabler, renew_secret, cancel_secret) |
---|
5629 | + |
---|
5630 | + self.writers[shnum] = writer_class(shnum, |
---|
5631 | + self.connections[peerid], |
---|
5632 | + self._storage_index, |
---|
5633 | + secrets, |
---|
5634 | + self._new_seqnum, |
---|
5635 | + self.required_shares, |
---|
5636 | + self.total_shares, |
---|
5637 | + self.segment_size, |
---|
5638 | + self.datalength) |
---|
5639 | + self.writers[shnum].peerid = peerid |
---|
5640 | + assert (peerid, shnum) in self._servermap.servermap |
---|
5641 | + old_versionid, old_timestamp = self._servermap.servermap[key] |
---|
5642 | + (old_seqnum, old_root_hash, old_salt, old_segsize, |
---|
5643 | + old_datalength, old_k, old_N, old_prefix, |
---|
5644 | + old_offsets_tuple) = old_versionid |
---|
5645 | + self.writers[shnum].set_checkstring(old_seqnum, |
---|
5646 | + old_root_hash, |
---|
5647 | + old_salt) |
---|
5648 | + |
---|
5649 | + # Our remote shares will not have a complete checkstring until |
---|
5650 | + # after we are done writing share data and have started to write |
---|
5651 | + # blocks. In the meantime, we need to know what to look for when |
---|
5652 | + # writing, so that we can detect UncoordinatedWriteErrors. |
---|
5653 | + self._checkstring = self.writers.values()[0].get_checkstring() |
---|
5654 | + |
---|
5655 | + # Now, we start pushing shares. |
---|
5656 | + self._status.timings["setup"] = time.time() - self._started |
---|
5657 | + # First, we encrypt, encode, and publish the shares that we need |
---|
5658 | + # to encrypt, encode, and publish. |
---|
5659 | + |
---|
5660 | + # Our update process fetched these for us. We need to update |
---|
5661 | + # them in place as publishing happens. |
---|
5662 | + self.blockhashes = {} # (shnum, [blockhashes]) |
---|
5663 | + for (i, bht) in blockhashes.iteritems(): |
---|
5664 | + # We need to extract the leaves from our old hash tree. |
---|
5665 | + old_segcount = mathutil.div_ceil(version[4], |
---|
5666 | + version[3]) |
---|
5667 | + h = hashtree.IncompleteHashTree(old_segcount) |
---|
5668 | + bht = dict(enumerate(bht)) |
---|
5669 | + h.set_hashes(bht) |
---|
5670 | + leaves = h[h.get_leaf_index(0):] |
---|
5671 | + for j in xrange(self.num_segments - len(leaves)): |
---|
5672 | + leaves.append(None) |
---|
5673 | + |
---|
5674 | + assert len(leaves) >= self.num_segments |
---|
5675 | + self.blockhashes[i] = leaves |
---|
5676 | + # This list will now be the leaves that were set during the |
---|
5677 | + # initial upload + enough empty hashes to make it a |
---|
5678 | + # power-of-two. If we exceed a power of two boundary, we |
---|
5679 | + # should be encoding the file over again, and should not be |
---|
5680 | + # here. So, we have |
---|
5681 | + #assert len(self.blockhashes[i]) == \ |
---|
5682 | + # hashtree.roundup_pow2(self.num_segments), \ |
---|
5683 | + # len(self.blockhashes[i]) |
---|
5684 | + # XXX: Except this doesn't work. Figure out why. |
---|
5685 | + |
---|
5686 | + # These are filled in later, after we've modified the block hash |
---|
5687 | + # tree suitably. |
---|
5688 | + self.sharehash_leaves = None # eventually [sharehashes] |
---|
5689 | + self.sharehashes = {} # shnum -> [sharehash leaves necessary to |
---|
5690 | + # validate the share] |
---|
5691 | + |
---|
5692 | + d = defer.succeed(None) |
---|
5693 | + self.log("Starting push") |
---|
5694 | + |
---|
5695 | + self._state = PUSHING_BLOCKS_STATE |
---|
5696 | + self._push() |
---|
5697 | + |
---|
5698 | + return self.done_deferred |
---|
5699 | + |
---|
5700 | + |
---|
5701 | def publish(self, newdata): |
---|
5702 | """Publish the filenode's current contents. Returns a Deferred that |
---|
5703 | fires (with None) when the publish has done as much work as it's ever |
---|
5704 | hunk ./src/allmydata/mutable/publish.py 345 |
---|
5705 | simultaneous write. |
---|
5706 | """ |
---|
5707 | |
---|
5708 | - # 1: generate shares (SDMF: files are small, so we can do it in RAM) |
---|
5709 | - # 2: perform peer selection, get candidate servers |
---|
5710 | - # 2a: send queries to n+epsilon servers, to determine current shares |
---|
5711 | - # 2b: based upon responses, create target map |
---|
5712 | - # 3: send slot_testv_and_readv_and_writev messages |
---|
5713 | - # 4: as responses return, update share-dispatch table |
---|
5714 | - # 4a: may need to run recovery algorithm |
---|
5715 | - # 5: when enough responses are back, we're done |
---|
5716 | + # 0. Set up encoding parameters, encoder, and other such things. |
---|
5717 | + # 1. Encrypt, encode, and publish segments. |
---|
5718 | + assert IMutableUploadable.providedBy(newdata) |
---|
5719 | |
---|
5720 | hunk ./src/allmydata/mutable/publish.py 349 |
---|
5721 | - self.log("starting publish, datalen is %s" % len(newdata)) |
---|
5722 | - self._status.set_size(len(newdata)) |
---|
5723 | + self.data = newdata |
---|
5724 | + self.datalength = newdata.get_size() |
---|
5725 | + #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE: |
---|
5726 | + # self._version = MDMF_VERSION |
---|
5727 | + #else: |
---|
5728 | + # self._version = SDMF_VERSION |
---|
5729 | + |
---|
5730 | + self.log("starting publish, datalen is %s" % self.datalength) |
---|
5731 | + self._status.set_size(self.datalength) |
---|
5732 | self._status.set_status("Started") |
---|
5733 | self._started = time.time() |
---|
5734 | |
---|
5735 | hunk ./src/allmydata/mutable/publish.py 405 |
---|
5736 | self.full_peerlist = full_peerlist # for use later, immutable |
---|
5737 | self.bad_peers = set() # peerids who have errbacked/refused requests |
---|
5738 | |
---|
5739 | - self.newdata = newdata |
---|
5740 | - self.salt = os.urandom(16) |
---|
5741 | - |
---|
5742 | + # This will set self.segment_size, self.num_segments, and |
---|
5743 | + # self.fec. |
---|
5744 | self.setup_encoding_parameters() |
---|
5745 | |
---|
5746 | # if we experience any surprises (writes which were rejected because |
---|
5747 | hunk ./src/allmydata/mutable/publish.py 415 |
---|
5748 | # end of the publish process. |
---|
5749 | self.surprised = False |
---|
5750 | |
---|
5751 | - # as a failsafe, refuse to iterate through self.loop more than a |
---|
5752 | - # thousand times. |
---|
5753 | - self.looplimit = 1000 |
---|
5754 | - |
---|
5755 | # we keep track of three tables. The first is our goal: which share |
---|
5756 | # we want to see on which servers. This is initially populated by the |
---|
5757 | # existing servermap. |
---|
5758 | hunk ./src/allmydata/mutable/publish.py 438 |
---|
5759 | |
---|
5760 | self.bad_share_checkstrings = {} |
---|
5761 | |
---|
5762 | + # This is set at the last step of the publishing process. |
---|
5763 | + self.versioninfo = "" |
---|
5764 | + |
---|
5765 | # we use the servermap to populate the initial goal: this way we will |
---|
5766 | # try to update each existing share in place. |
---|
5767 | for (peerid, shnum) in self._servermap.servermap: |
---|
5768 | hunk ./src/allmydata/mutable/publish.py 454 |
---|
5769 | self.bad_share_checkstrings[key] = old_checkstring |
---|
5770 | self.connections[peerid] = self._servermap.connections[peerid] |
---|
5771 | |
---|
5772 | - # create the shares. We'll discard these as they are delivered. SDMF: |
---|
5773 | - # we're allowed to hold everything in memory. |
---|
5774 | + # TODO: Make this part do peer selection. |
---|
5775 | + self.update_goal() |
---|
5776 | + self.writers = {} |
---|
5777 | + if self._version == MDMF_VERSION: |
---|
5778 | + writer_class = MDMFSlotWriteProxy |
---|
5779 | + else: |
---|
5780 | + writer_class = SDMFSlotWriteProxy |
---|
5781 | |
---|
5782 | hunk ./src/allmydata/mutable/publish.py 462 |
---|
5783 | + # For each (peerid, shnum) in self.goal, we make a |
---|
5784 | + # write proxy for that peer. We'll use this to write |
---|
5785 | + # shares to the peer. |
---|
5786 | + for key in self.goal: |
---|
5787 | + peerid, shnum = key |
---|
5788 | + write_enabler = self._node.get_write_enabler(peerid) |
---|
5789 | + renew_secret = self._node.get_renewal_secret(peerid) |
---|
5790 | + cancel_secret = self._node.get_cancel_secret(peerid) |
---|
5791 | + secrets = (write_enabler, renew_secret, cancel_secret) |
---|
5792 | + |
---|
5793 | + self.writers[shnum] = writer_class(shnum, |
---|
5794 | + self.connections[peerid], |
---|
5795 | + self._storage_index, |
---|
5796 | + secrets, |
---|
5797 | + self._new_seqnum, |
---|
5798 | + self.required_shares, |
---|
5799 | + self.total_shares, |
---|
5800 | + self.segment_size, |
---|
5801 | + self.datalength) |
---|
5802 | + self.writers[shnum].peerid = peerid |
---|
5803 | + if (peerid, shnum) in self._servermap.servermap: |
---|
5804 | + old_versionid, old_timestamp = self._servermap.servermap[key] |
---|
5805 | + (old_seqnum, old_root_hash, old_salt, old_segsize, |
---|
5806 | + old_datalength, old_k, old_N, old_prefix, |
---|
5807 | + old_offsets_tuple) = old_versionid |
---|
5808 | + self.writers[shnum].set_checkstring(old_seqnum, |
---|
5809 | + old_root_hash, |
---|
5810 | + old_salt) |
---|
5811 | + elif (peerid, shnum) in self.bad_share_checkstrings: |
---|
5812 | + old_checkstring = self.bad_share_checkstrings[(peerid, shnum)] |
---|
5813 | + self.writers[shnum].set_checkstring(old_checkstring) |
---|
5814 | + |
---|
5815 | + # Our remote shares will not have a complete checkstring until |
---|
5816 | + # after we are done writing share data and have started to write |
---|
5817 | + # blocks. In the meantime, we need to know what to look for when |
---|
5818 | + # writing, so that we can detect UncoordinatedWriteErrors. |
---|
5819 | + self._checkstring = self.writers.values()[0].get_checkstring() |
---|
5820 | + |
---|
5821 | + # Now, we start pushing shares. |
---|
5822 | self._status.timings["setup"] = time.time() - self._started |
---|
5823 | hunk ./src/allmydata/mutable/publish.py 502 |
---|
5824 | - d = self._encrypt_and_encode() |
---|
5825 | - d.addCallback(self._generate_shares) |
---|
5826 | - def _start_pushing(res): |
---|
5827 | - self._started_pushing = time.time() |
---|
5828 | - return res |
---|
5829 | - d.addCallback(_start_pushing) |
---|
5830 | - d.addCallback(self.loop) # trigger delivery |
---|
5831 | - d.addErrback(self._fatal_error) |
---|
5832 | + # First, we encrypt, encode, and publish the shares that we need |
---|
5833 | + # to encrypt, encode, and publish. |
---|
5834 | + |
---|
5835 | + # This will eventually hold the block hash chain for each share |
---|
5836 | + # that we publish. We define it this way so that empty publishes |
---|
5837 | + # will still have something to write to the remote slot. |
---|
5838 | + self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)]) |
---|
5839 | + for i in xrange(self.total_shares): |
---|
5840 | + blocks = self.blockhashes[i] |
---|
5841 | + for j in xrange(self.num_segments): |
---|
5842 | + blocks.append(None) |
---|
5843 | + self.sharehash_leaves = None # eventually [sharehashes] |
---|
5844 | + self.sharehashes = {} # shnum -> [sharehash leaves necessary to |
---|
5845 | + # validate the share] |
---|
5846 | + |
---|
5847 | + d = defer.succeed(None) |
---|
5848 | + self.log("Starting push") |
---|
5849 | + |
---|
5850 | + self._state = PUSHING_BLOCKS_STATE |
---|
5851 | + self._push() |
---|
5852 | |
---|
5853 | return self.done_deferred |
---|
5854 | |
---|
5855 | hunk ./src/allmydata/mutable/publish.py 525 |
---|
5856 | - def setup_encoding_parameters(self): |
---|
5857 | - segment_size = len(self.newdata) |
---|
5858 | + |
---|
5859 | + def _update_status(self): |
---|
5860 | + self._status.set_status("Sending Shares: %d placed out of %d, " |
---|
5861 | + "%d messages outstanding" % |
---|
5862 | + (len(self.placed), |
---|
5863 | + len(self.goal), |
---|
5864 | + len(self.outstanding))) |
---|
5865 | + self._status.set_progress(1.0 * len(self.placed) / len(self.goal)) |
---|
5866 | + |
---|
5867 | + |
---|
5868 | + def setup_encoding_parameters(self, offset=0): |
---|
5869 | + if self._version == MDMF_VERSION: |
---|
5870 | + segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default |
---|
5871 | + else: |
---|
5872 | + segment_size = self.datalength # SDMF is only one segment |
---|
5873 | # this must be a multiple of self.required_shares |
---|
5874 | segment_size = mathutil.next_multiple(segment_size, |
---|
5875 | self.required_shares) |
---|
5876 | hunk ./src/allmydata/mutable/publish.py 544 |
---|
5877 | self.segment_size = segment_size |
---|
5878 | + |
---|
5879 | + # Calculate the starting segment for the upload. |
---|
5880 | if segment_size: |
---|
5881 | hunk ./src/allmydata/mutable/publish.py 547 |
---|
5882 | - self.num_segments = mathutil.div_ceil(len(self.newdata), |
---|
5883 | + self.num_segments = mathutil.div_ceil(self.datalength, |
---|
5884 | segment_size) |
---|
5885 | hunk ./src/allmydata/mutable/publish.py 549 |
---|
5886 | + self.starting_segment = mathutil.div_ceil(offset, |
---|
5887 | + segment_size) |
---|
5888 | + self.starting_segment -= 1 |
---|
5889 | + if offset == 0: |
---|
5890 | + self.starting_segment = 0 |
---|
5891 | + |
---|
5892 | else: |
---|
5893 | self.num_segments = 0 |
---|
5894 | hunk ./src/allmydata/mutable/publish.py 557 |
---|
5895 | - assert self.num_segments in [0, 1,] # SDMF restrictions |
---|
5896 | + self.starting_segment = 0 |
---|
5897 | + |
---|
5898 | + |
---|
5899 | + self.log("building encoding parameters for file") |
---|
5900 | + self.log("got segsize %d" % self.segment_size) |
---|
5901 | + self.log("got %d segments" % self.num_segments) |
---|
5902 | + |
---|
5903 | + if self._version == SDMF_VERSION: |
---|
5904 | + assert self.num_segments in (0, 1) # SDMF |
---|
5905 | + # calculate the tail segment size. |
---|
5906 | + |
---|
5907 | + if segment_size and self.datalength: |
---|
5908 | + self.tail_segment_size = self.datalength % segment_size |
---|
5909 | + self.log("got tail segment size %d" % self.tail_segment_size) |
---|
5910 | + else: |
---|
5911 | + self.tail_segment_size = 0 |
---|
5912 | + |
---|
5913 | + if self.tail_segment_size == 0 and segment_size: |
---|
5914 | + # The tail segment is the same size as the other segments. |
---|
5915 | + self.tail_segment_size = segment_size |
---|
5916 | + |
---|
5917 | + # Make FEC encoders |
---|
5918 | + fec = codec.CRSEncoder() |
---|
5919 | + fec.set_params(self.segment_size, |
---|
5920 | + self.required_shares, self.total_shares) |
---|
5921 | + self.piece_size = fec.get_block_size() |
---|
5922 | + self.fec = fec |
---|
5923 | + |
---|
5924 | + if self.tail_segment_size == self.segment_size: |
---|
5925 | + self.tail_fec = self.fec |
---|
5926 | + else: |
---|
5927 | + tail_fec = codec.CRSEncoder() |
---|
5928 | + tail_fec.set_params(self.tail_segment_size, |
---|
5929 | + self.required_shares, |
---|
5930 | + self.total_shares) |
---|
5931 | + self.tail_fec = tail_fec |
---|
5932 | + |
---|
5933 | + self._current_segment = self.starting_segment |
---|
5934 | + self.end_segment = self.num_segments - 1 |
---|
5935 | + # Now figure out where the last segment should be. |
---|
5936 | + if self.data.get_size() != self.datalength: |
---|
5937 | + end = self.data.get_size() |
---|
5938 | + self.end_segment = mathutil.div_ceil(end, |
---|
5939 | + segment_size) |
---|
5940 | + self.end_segment -= 1 |
---|
5941 | + self.log("got start segment %d" % self.starting_segment) |
---|
5942 | + self.log("got end segment %d" % self.end_segment) |
---|
5943 | + |
---|
5944 | + |
---|
5945 | + def _push(self, ignored=None): |
---|
5946 | + """ |
---|
5947 | + I manage state transitions. In particular, I check that we still |
---|
5948 | + have enough writers to complete the upload |
---|
5949 | + successfully. |
---|
5950 | + """ |
---|
5951 | + # Can we still successfully publish this file? |
---|
5952 | + # TODO: Keep track of outstanding queries before aborting the |
---|
5953 | + # process. |
---|
5954 | + if len(self.writers) <= self.required_shares or self.surprised: |
---|
5955 | + return self._failure() |
---|
5956 | + |
---|
5957 | + # Figure out what we need to do next. Each of these needs to |
---|
5958 | + # return a deferred so that we don't block execution when this |
---|
5959 | + # is first called in the upload method. |
---|
5960 | + if self._state == PUSHING_BLOCKS_STATE: |
---|
5961 | + return self.push_segment(self._current_segment) |
---|
5962 | + |
---|
5963 | + elif self._state == PUSHING_EVERYTHING_ELSE_STATE: |
---|
5964 | + return self.push_everything_else() |
---|
5965 | + |
---|
5966 | + # If we make it to this point, we were successful in placing the |
---|
5967 | + # file. |
---|
5968 | + return self._done(None) |
---|
5969 | + |
---|
5970 | + |
---|
5971 | + def push_segment(self, segnum): |
---|
5972 | + if self.num_segments == 0 and self._version == SDMF_VERSION: |
---|
5973 | + self._add_dummy_salts() |
---|
5974 | |
---|
5975 | hunk ./src/allmydata/mutable/publish.py 636 |
---|
5976 | - def _fatal_error(self, f): |
---|
5977 | - self.log("error during loop", failure=f, level=log.UNUSUAL) |
---|
5978 | - self._done(f) |
---|
5979 | + if segnum > self.end_segment: |
---|
5980 | + # We don't have any more segments to push. |
---|
5981 | + self._state = PUSHING_EVERYTHING_ELSE_STATE |
---|
5982 | + return self._push() |
---|
5983 | + |
---|
5984 | + d = self._encode_segment(segnum) |
---|
5985 | + d.addCallback(self._push_segment, segnum) |
---|
5986 | + def _increment_segnum(ign): |
---|
5987 | + self._current_segment += 1 |
---|
5988 | + # XXX: I don't think we need to do addBoth here -- any errBacks |
---|
5989 | + # should be handled within push_segment. |
---|
5990 | + d.addBoth(_increment_segnum) |
---|
5991 | + d.addBoth(self._turn_barrier) |
---|
5992 | + d.addBoth(self._push) |
---|
5993 | + |
---|
5994 | + |
---|
5995 | + def _turn_barrier(self, result): |
---|
5996 | + """ |
---|
5997 | + I help the publish process avoid the recursion limit issues |
---|
5998 | + described in #237. |
---|
5999 | + """ |
---|
6000 | + return fireEventually(result) |
---|
6001 | + |
---|
6002 | + |
---|
6003 | + def _add_dummy_salts(self): |
---|
6004 | + """ |
---|
6005 | + SDMF files need a salt even if they're empty, or the signature |
---|
6006 | + won't make sense. This method adds a dummy salt to each of our |
---|
6007 | + SDMF writers so that they can write the signature later. |
---|
6008 | + """ |
---|
6009 | + salt = os.urandom(16) |
---|
6010 | + assert self._version == SDMF_VERSION |
---|
6011 | + |
---|
6012 | + for writer in self.writers.itervalues(): |
---|
6013 | + writer.put_salt(salt) |
---|
6014 | + |
---|
6015 | + |
---|
6016 | + def _encode_segment(self, segnum): |
---|
6017 | + """ |
---|
6018 | + I encrypt and encode the segment segnum. |
---|
6019 | + """ |
---|
6020 | + started = time.time() |
---|
6021 | + |
---|
6022 | + if segnum + 1 == self.num_segments: |
---|
6023 | + segsize = self.tail_segment_size |
---|
6024 | + else: |
---|
6025 | + segsize = self.segment_size |
---|
6026 | + |
---|
6027 | + |
---|
6028 | + self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments)) |
---|
6029 | + data = self.data.read(segsize) |
---|
6030 | + # XXX: This is dumb. Why return a list? |
---|
6031 | + data = "".join(data) |
---|
6032 | + |
---|
6033 | + assert len(data) == segsize, len(data) |
---|
6034 | + |
---|
6035 | + salt = os.urandom(16) |
---|
6036 | + |
---|
6037 | + key = hashutil.ssk_readkey_data_hash(salt, self.readkey) |
---|
6038 | + self._status.set_status("Encrypting") |
---|
6039 | + enc = AES(key) |
---|
6040 | + crypttext = enc.process(data) |
---|
6041 | + assert len(crypttext) == len(data) |
---|
6042 | + |
---|
6043 | + now = time.time() |
---|
6044 | + self._status.timings["encrypt"] = now - started |
---|
6045 | + started = now |
---|
6046 | + |
---|
6047 | + # now apply FEC |
---|
6048 | + if segnum + 1 == self.num_segments: |
---|
6049 | + fec = self.tail_fec |
---|
6050 | + else: |
---|
6051 | + fec = self.fec |
---|
6052 | + |
---|
6053 | + self._status.set_status("Encoding") |
---|
6054 | + crypttext_pieces = [None] * self.required_shares |
---|
6055 | + piece_size = fec.get_block_size() |
---|
6056 | + for i in range(len(crypttext_pieces)): |
---|
6057 | + offset = i * piece_size |
---|
6058 | + piece = crypttext[offset:offset+piece_size] |
---|
6059 | + piece = piece + "\x00"*(piece_size - len(piece)) # padding |
---|
6060 | + crypttext_pieces[i] = piece |
---|
6061 | + assert len(piece) == piece_size |
---|
6062 | + d = fec.encode(crypttext_pieces) |
---|
6063 | + def _done_encoding(res): |
---|
6064 | + elapsed = time.time() - started |
---|
6065 | + self._status.timings["encode"] = elapsed |
---|
6066 | + return (res, salt) |
---|
6067 | + d.addCallback(_done_encoding) |
---|
6068 | + return d |
---|
6069 | + |
---|
6070 | + |
---|
6071 | + def _push_segment(self, encoded_and_salt, segnum): |
---|
6072 | + """ |
---|
6073 | + I push (data, salt) as segment number segnum. |
---|
6074 | + """ |
---|
6075 | + results, salt = encoded_and_salt |
---|
6076 | + shares, shareids = results |
---|
6077 | + started = time.time() |
---|
6078 | + self._status.set_status("Pushing segment") |
---|
6079 | + for i in xrange(len(shares)): |
---|
6080 | + sharedata = shares[i] |
---|
6081 | + shareid = shareids[i] |
---|
6082 | + if self._version == MDMF_VERSION: |
---|
6083 | + hashed = salt + sharedata |
---|
6084 | + else: |
---|
6085 | + hashed = sharedata |
---|
6086 | + block_hash = hashutil.block_hash(hashed) |
---|
6087 | + old_hash = self.blockhashes[shareid][segnum] |
---|
6088 | + self.blockhashes[shareid][segnum] = block_hash |
---|
6089 | + # find the writer for this share |
---|
6090 | + writer = self.writers[shareid] |
---|
6091 | + writer.put_block(sharedata, segnum, salt) |
---|
6092 | + |
---|
6093 | + |
---|
6094 | + def push_everything_else(self): |
---|
6095 | + """ |
---|
6096 | + I put everything else associated with a share. |
---|
6097 | + """ |
---|
6098 | + self._pack_started = time.time() |
---|
6099 | + self.push_encprivkey() |
---|
6100 | + self.push_blockhashes() |
---|
6101 | + self.push_sharehashes() |
---|
6102 | + self.push_toplevel_hashes_and_signature() |
---|
6103 | + d = self.finish_publishing() |
---|
6104 | + def _change_state(ignored): |
---|
6105 | + self._state = DONE_STATE |
---|
6106 | + d.addCallback(_change_state) |
---|
6107 | + d.addCallback(self._push) |
---|
6108 | + return d |
---|
6109 | + |
---|
6110 | + |
---|
6111 | + def push_encprivkey(self): |
---|
6112 | + encprivkey = self._encprivkey |
---|
6113 | + self._status.set_status("Pushing encrypted private key") |
---|
6114 | + for writer in self.writers.itervalues(): |
---|
6115 | + writer.put_encprivkey(encprivkey) |
---|
6116 | + |
---|
6117 | + |
---|
6118 | + def push_blockhashes(self): |
---|
6119 | + self.sharehash_leaves = [None] * len(self.blockhashes) |
---|
6120 | + self._status.set_status("Building and pushing block hash tree") |
---|
6121 | + for shnum, blockhashes in self.blockhashes.iteritems(): |
---|
6122 | + t = hashtree.HashTree(blockhashes) |
---|
6123 | + self.blockhashes[shnum] = list(t) |
---|
6124 | + # set the leaf for future use. |
---|
6125 | + self.sharehash_leaves[shnum] = t[0] |
---|
6126 | + |
---|
6127 | + writer = self.writers[shnum] |
---|
6128 | + writer.put_blockhashes(self.blockhashes[shnum]) |
---|
6129 | + |
---|
6130 | + |
---|
6131 | + def push_sharehashes(self): |
---|
6132 | + self._status.set_status("Building and pushing share hash chain") |
---|
6133 | + share_hash_tree = hashtree.HashTree(self.sharehash_leaves) |
---|
6134 | + share_hash_chain = {} |
---|
6135 | + for shnum in xrange(len(self.sharehash_leaves)): |
---|
6136 | + needed_indices = share_hash_tree.needed_hashes(shnum) |
---|
6137 | + self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i]) |
---|
6138 | + for i in needed_indices] ) |
---|
6139 | + writer = self.writers[shnum] |
---|
6140 | + writer.put_sharehashes(self.sharehashes[shnum]) |
---|
6141 | + self.root_hash = share_hash_tree[0] |
---|
6142 | + |
---|
6143 | + |
---|
6144 | + def push_toplevel_hashes_and_signature(self): |
---|
6145 | + # We need to do three things here: |
---|
6146 | + # - Push the root hash and salt hash |
---|
6147 | + # - Get the checkstring of the resulting layout; sign that. |
---|
6148 | + # - Push the signature |
---|
6149 | + self._status.set_status("Pushing root hashes and signature") |
---|
6150 | + for shnum in xrange(self.total_shares): |
---|
6151 | + writer = self.writers[shnum] |
---|
6152 | + writer.put_root_hash(self.root_hash) |
---|
6153 | + self._update_checkstring() |
---|
6154 | + self._make_and_place_signature() |
---|
6155 | + |
---|
6156 | + |
---|
6157 | + def _update_checkstring(self): |
---|
6158 | + """ |
---|
6159 | + After putting the root hash, MDMF files will have the |
---|
6160 | + checkstring written to the storage server. This means that we |
---|
6161 | + can update our copy of the checkstring so we can detect |
---|
6162 | + uncoordinated writes. SDMF files will have the same checkstring, |
---|
6163 | + so we need not do anything. |
---|
6164 | + """ |
---|
6165 | + self._checkstring = self.writers.values()[0].get_checkstring() |
---|
6166 | + |
---|
6167 | + |
---|
6168 | + def _make_and_place_signature(self): |
---|
6169 | + """ |
---|
6170 | + I create and place the signature. |
---|
6171 | + """ |
---|
6172 | + started = time.time() |
---|
6173 | + self._status.set_status("Signing prefix") |
---|
6174 | + signable = self.writers[0].get_signable() |
---|
6175 | + self.signature = self._privkey.sign(signable) |
---|
6176 | + |
---|
6177 | + for (shnum, writer) in self.writers.iteritems(): |
---|
6178 | + writer.put_signature(self.signature) |
---|
6179 | + self._status.timings['sign'] = time.time() - started |
---|
6180 | + |
---|
6181 | + |
---|
6182 | + def finish_publishing(self): |
---|
6183 | + # We're almost done -- we just need to put the verification key |
---|
6184 | + # and the offsets |
---|
6185 | + started = time.time() |
---|
6186 | + self._status.set_status("Pushing shares") |
---|
6187 | + self._started_pushing = started |
---|
6188 | + ds = [] |
---|
6189 | + verification_key = self._pubkey.serialize() |
---|
6190 | + |
---|
6191 | + |
---|
6192 | + # TODO: Bad, since we remove from this same dict. We need to |
---|
6193 | + # make a copy, or just use a non-iterated value. |
---|
6194 | + for (shnum, writer) in self.writers.iteritems(): |
---|
6195 | + writer.put_verification_key(verification_key) |
---|
6196 | + d = writer.finish_publishing() |
---|
6197 | + # Add the (peerid, shnum) tuple to our list of outstanding |
---|
6198 | + # queries. This gets used by _loop if some of our queries |
---|
6199 | + # fail to place shares. |
---|
6200 | + self.outstanding.add((writer.peerid, writer.shnum)) |
---|
6201 | + d.addCallback(self._got_write_answer, writer, started) |
---|
6202 | + d.addErrback(self._connection_problem, writer) |
---|
6203 | + ds.append(d) |
---|
6204 | + self._record_verinfo() |
---|
6205 | + self._status.timings['pack'] = time.time() - started |
---|
6206 | + return defer.DeferredList(ds) |
---|
6207 | + |
---|
6208 | + |
---|
6209 | + def _record_verinfo(self): |
---|
6210 | + self.versioninfo = self.writers.values()[0].get_verinfo() |
---|
6211 | + |
---|
6212 | + |
---|
6213 | + def _connection_problem(self, f, writer): |
---|
6214 | + """ |
---|
6215 | + We ran into a connection problem while working with writer, and |
---|
6216 | + need to deal with that. |
---|
6217 | + """ |
---|
6218 | + self.log("found problem: %s" % str(f)) |
---|
6219 | + self._last_failure = f |
---|
6220 | + del(self.writers[writer.shnum]) |
---|
6221 | |
---|
6222 | hunk ./src/allmydata/mutable/publish.py 879 |
---|
6223 | - def _update_status(self): |
---|
6224 | - self._status.set_status("Sending Shares: %d placed out of %d, " |
---|
6225 | - "%d messages outstanding" % |
---|
6226 | - (len(self.placed), |
---|
6227 | - len(self.goal), |
---|
6228 | - len(self.outstanding))) |
---|
6229 | - self._status.set_progress(1.0 * len(self.placed) / len(self.goal)) |
---|
6230 | |
---|
6231 | hunk ./src/allmydata/mutable/publish.py 880 |
---|
6232 | - def loop(self, ignored=None): |
---|
6233 | - self.log("entering loop", level=log.NOISY) |
---|
6234 | - if not self._running: |
---|
6235 | - return |
---|
6236 | - |
---|
6237 | - self.looplimit -= 1 |
---|
6238 | - if self.looplimit <= 0: |
---|
6239 | - raise LoopLimitExceededError("loop limit exceeded") |
---|
6240 | - |
---|
6241 | - if self.surprised: |
---|
6242 | - # don't send out any new shares, just wait for the outstanding |
---|
6243 | - # ones to be retired. |
---|
6244 | - self.log("currently surprised, so don't send any new shares", |
---|
6245 | - level=log.NOISY) |
---|
6246 | - else: |
---|
6247 | - self.update_goal() |
---|
6248 | - # how far are we from our goal? |
---|
6249 | - needed = self.goal - self.placed - self.outstanding |
---|
6250 | - self._update_status() |
---|
6251 | - |
---|
6252 | - if needed: |
---|
6253 | - # we need to send out new shares |
---|
6254 | - self.log(format="need to send %(needed)d new shares", |
---|
6255 | - needed=len(needed), level=log.NOISY) |
---|
6256 | - self._send_shares(needed) |
---|
6257 | - return |
---|
6258 | - |
---|
6259 | - if self.outstanding: |
---|
6260 | - # queries are still pending, keep waiting |
---|
6261 | - self.log(format="%(outstanding)d queries still outstanding", |
---|
6262 | - outstanding=len(self.outstanding), |
---|
6263 | - level=log.NOISY) |
---|
6264 | - return |
---|
6265 | - |
---|
6266 | - # no queries outstanding, no placements needed: we're done |
---|
6267 | - self.log("no queries outstanding, no placements needed: done", |
---|
6268 | - level=log.OPERATIONAL) |
---|
6269 | - now = time.time() |
---|
6270 | - elapsed = now - self._started_pushing |
---|
6271 | - self._status.timings["push"] = elapsed |
---|
6272 | - return self._done(None) |
---|
6273 | - |
---|
6274 | def log_goal(self, goal, message=""): |
---|
6275 | logmsg = [message] |
---|
6276 | for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]): |
---|
6277 | hunk ./src/allmydata/mutable/publish.py 961 |
---|
6278 | self.log_goal(self.goal, "after update: ") |
---|
6279 | |
---|
6280 | |
---|
6281 | + def _got_write_answer(self, answer, writer, started): |
---|
6282 | + if not answer: |
---|
6283 | + # SDMF writers only pretend to write when callers set their |
---|
6284 | + # blocks, salts, and so on -- they actually just write once, |
---|
6285 | + # at the end of the upload process. In fake writes, they |
---|
6286 | + # return defer.succeed(None). If we see that, we shouldn't |
---|
6287 | + # bother checking it. |
---|
6288 | + return |
---|
6289 | |
---|
6290 | hunk ./src/allmydata/mutable/publish.py 970 |
---|
6291 | - def _encrypt_and_encode(self): |
---|
6292 | - # this returns a Deferred that fires with a list of (sharedata, |
---|
6293 | - # sharenum) tuples. TODO: cache the ciphertext, only produce the |
---|
6294 | - # shares that we care about. |
---|
6295 | - self.log("_encrypt_and_encode") |
---|
6296 | - |
---|
6297 | - self._status.set_status("Encrypting") |
---|
6298 | - started = time.time() |
---|
6299 | - |
---|
6300 | - key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey) |
---|
6301 | - enc = AES(key) |
---|
6302 | - crypttext = enc.process(self.newdata) |
---|
6303 | - assert len(crypttext) == len(self.newdata) |
---|
6304 | + peerid = writer.peerid |
---|
6305 | + lp = self.log("_got_write_answer from %s, share %d" % |
---|
6306 | + (idlib.shortnodeid_b2a(peerid), writer.shnum)) |
---|
6307 | |
---|
6308 | now = time.time() |
---|
6309 | hunk ./src/allmydata/mutable/publish.py 975 |
---|
6310 | - self._status.timings["encrypt"] = now - started |
---|
6311 | - started = now |
---|
6312 | - |
---|
6313 | - # now apply FEC |
---|
6314 | - |
---|
6315 | - self._status.set_status("Encoding") |
---|
6316 | - fec = codec.CRSEncoder() |
---|
6317 | - fec.set_params(self.segment_size, |
---|
6318 | - self.required_shares, self.total_shares) |
---|
6319 | - piece_size = fec.get_block_size() |
---|
6320 | - crypttext_pieces = [None] * self.required_shares |
---|
6321 | - for i in range(len(crypttext_pieces)): |
---|
6322 | - offset = i * piece_size |
---|
6323 | - piece = crypttext[offset:offset+piece_size] |
---|
6324 | - piece = piece + "\x00"*(piece_size - len(piece)) # padding |
---|
6325 | - crypttext_pieces[i] = piece |
---|
6326 | - assert len(piece) == piece_size |
---|
6327 | - |
---|
6328 | - d = fec.encode(crypttext_pieces) |
---|
6329 | - def _done_encoding(res): |
---|
6330 | - elapsed = time.time() - started |
---|
6331 | - self._status.timings["encode"] = elapsed |
---|
6332 | - return res |
---|
6333 | - d.addCallback(_done_encoding) |
---|
6334 | - return d |
---|
6335 | - |
---|
6336 | - def _generate_shares(self, shares_and_shareids): |
---|
6337 | - # this sets self.shares and self.root_hash |
---|
6338 | - self.log("_generate_shares") |
---|
6339 | - self._status.set_status("Generating Shares") |
---|
6340 | - started = time.time() |
---|
6341 | - |
---|
6342 | - # we should know these by now |
---|
6343 | - privkey = self._privkey |
---|
6344 | - encprivkey = self._encprivkey |
---|
6345 | - pubkey = self._pubkey |
---|
6346 | - |
---|
6347 | - (shares, share_ids) = shares_and_shareids |
---|
6348 | - |
---|
6349 | - assert len(shares) == len(share_ids) |
---|
6350 | - assert len(shares) == self.total_shares |
---|
6351 | - all_shares = {} |
---|
6352 | - block_hash_trees = {} |
---|
6353 | - share_hash_leaves = [None] * len(shares) |
---|
6354 | - for i in range(len(shares)): |
---|
6355 | - share_data = shares[i] |
---|
6356 | - shnum = share_ids[i] |
---|
6357 | - all_shares[shnum] = share_data |
---|
6358 | - |
---|
6359 | - # build the block hash tree. SDMF has only one leaf. |
---|
6360 | - leaves = [hashutil.block_hash(share_data)] |
---|
6361 | - t = hashtree.HashTree(leaves) |
---|
6362 | - block_hash_trees[shnum] = list(t) |
---|
6363 | - share_hash_leaves[shnum] = t[0] |
---|
6364 | - for leaf in share_hash_leaves: |
---|
6365 | - assert leaf is not None |
---|
6366 | - share_hash_tree = hashtree.HashTree(share_hash_leaves) |
---|
6367 | - share_hash_chain = {} |
---|
6368 | - for shnum in range(self.total_shares): |
---|
6369 | - needed_hashes = share_hash_tree.needed_hashes(shnum) |
---|
6370 | - share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i]) |
---|
6371 | - for i in needed_hashes ] ) |
---|
6372 | - root_hash = share_hash_tree[0] |
---|
6373 | - assert len(root_hash) == 32 |
---|
6374 | - self.log("my new root_hash is %s" % base32.b2a(root_hash)) |
---|
6375 | - self._new_version_info = (self._new_seqnum, root_hash, self.salt) |
---|
6376 | - |
---|
6377 | - prefix = pack_prefix(self._new_seqnum, root_hash, self.salt, |
---|
6378 | - self.required_shares, self.total_shares, |
---|
6379 | - self.segment_size, len(self.newdata)) |
---|
6380 | - |
---|
6381 | - # now pack the beginning of the share. All shares are the same up |
---|
6382 | - # to the signature, then they have divergent share hash chains, |
---|
6383 | - # then completely different block hash trees + salt + share data, |
---|
6384 | - # then they all share the same encprivkey at the end. The sizes |
---|
6385 | - # of everything are the same for all shares. |
---|
6386 | - |
---|
6387 | - sign_started = time.time() |
---|
6388 | - signature = privkey.sign(prefix) |
---|
6389 | - self._status.timings["sign"] = time.time() - sign_started |
---|
6390 | - |
---|
6391 | - verification_key = pubkey.serialize() |
---|
6392 | - |
---|
6393 | - final_shares = {} |
---|
6394 | - for shnum in range(self.total_shares): |
---|
6395 | - final_share = pack_share(prefix, |
---|
6396 | - verification_key, |
---|
6397 | - signature, |
---|
6398 | - share_hash_chain[shnum], |
---|
6399 | - block_hash_trees[shnum], |
---|
6400 | - all_shares[shnum], |
---|
6401 | - encprivkey) |
---|
6402 | - final_shares[shnum] = final_share |
---|
6403 | - elapsed = time.time() - started |
---|
6404 | - self._status.timings["pack"] = elapsed |
---|
6405 | - self.shares = final_shares |
---|
6406 | - self.root_hash = root_hash |
---|
6407 | - |
---|
6408 | - # we also need to build up the version identifier for what we're |
---|
6409 | - # pushing. Extract the offsets from one of our shares. |
---|
6410 | - assert final_shares |
---|
6411 | - offsets = unpack_header(final_shares.values()[0])[-1] |
---|
6412 | - offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] ) |
---|
6413 | - verinfo = (self._new_seqnum, root_hash, self.salt, |
---|
6414 | - self.segment_size, len(self.newdata), |
---|
6415 | - self.required_shares, self.total_shares, |
---|
6416 | - prefix, offsets_tuple) |
---|
6417 | - self.versioninfo = verinfo |
---|
6418 | - |
---|
6419 | - |
---|
6420 | - |
---|
6421 | - def _send_shares(self, needed): |
---|
6422 | - self.log("_send_shares") |
---|
6423 | - |
---|
6424 | - # we're finally ready to send out our shares. If we encounter any |
---|
6425 | - # surprises here, it's because somebody else is writing at the same |
---|
6426 | - # time. (Note: in the future, when we remove the _query_peers() step |
---|
6427 | - # and instead speculate about [or remember] which shares are where, |
---|
6428 | - # surprises here are *not* indications of UncoordinatedWriteError, |
---|
6429 | - # and we'll need to respond to them more gracefully.) |
---|
6430 | - |
---|
6431 | - # needed is a set of (peerid, shnum) tuples. The first thing we do is |
---|
6432 | - # organize it by peerid. |
---|
6433 | - |
---|
6434 | - peermap = DictOfSets() |
---|
6435 | - for (peerid, shnum) in needed: |
---|
6436 | - peermap.add(peerid, shnum) |
---|
6437 | - |
---|
6438 | - # the next thing is to build up a bunch of test vectors. The |
---|
6439 | - # semantics of Publish are that we perform the operation if the world |
---|
6440 | - # hasn't changed since the ServerMap was constructed (more or less). |
---|
6441 | - # For every share we're trying to place, we create a test vector that |
---|
6442 | - # tests to see if the server*share still corresponds to the |
---|
6443 | - # map. |
---|
6444 | - |
---|
6445 | - all_tw_vectors = {} # maps peerid to tw_vectors |
---|
6446 | - sm = self._servermap.servermap |
---|
6447 | - |
---|
6448 | - for key in needed: |
---|
6449 | - (peerid, shnum) = key |
---|
6450 | - |
---|
6451 | - if key in sm: |
---|
6452 | - # an old version of that share already exists on the |
---|
6453 | - # server, according to our servermap. We will create a |
---|
6454 | - # request that attempts to replace it. |
---|
6455 | - old_versionid, old_timestamp = sm[key] |
---|
6456 | - (old_seqnum, old_root_hash, old_salt, old_segsize, |
---|
6457 | - old_datalength, old_k, old_N, old_prefix, |
---|
6458 | - old_offsets_tuple) = old_versionid |
---|
6459 | - old_checkstring = pack_checkstring(old_seqnum, |
---|
6460 | - old_root_hash, |
---|
6461 | - old_salt) |
---|
6462 | - testv = (0, len(old_checkstring), "eq", old_checkstring) |
---|
6463 | - |
---|
6464 | - elif key in self.bad_share_checkstrings: |
---|
6465 | - old_checkstring = self.bad_share_checkstrings[key] |
---|
6466 | - testv = (0, len(old_checkstring), "eq", old_checkstring) |
---|
6467 | - |
---|
6468 | - else: |
---|
6469 | - # add a testv that requires the share not exist |
---|
6470 | - |
---|
6471 | - # Unfortunately, foolscap-0.2.5 has a bug in the way inbound |
---|
6472 | - # constraints are handled. If the same object is referenced |
---|
6473 | - # multiple times inside the arguments, foolscap emits a |
---|
6474 | - # 'reference' token instead of a distinct copy of the |
---|
6475 | - # argument. The bug is that these 'reference' tokens are not |
---|
6476 | - # accepted by the inbound constraint code. To work around |
---|
6477 | - # this, we need to prevent python from interning the |
---|
6478 | - # (constant) tuple, by creating a new copy of this vector |
---|
6479 | - # each time. |
---|
6480 | - |
---|
6481 | - # This bug is fixed in foolscap-0.2.6, and even though this |
---|
6482 | - # version of Tahoe requires foolscap-0.3.1 or newer, we are |
---|
6483 | - # supposed to be able to interoperate with older versions of |
---|
6484 | - # Tahoe which are allowed to use older versions of foolscap, |
---|
6485 | - # including foolscap-0.2.5 . In addition, I've seen other |
---|
6486 | - # foolscap problems triggered by 'reference' tokens (see #541 |
---|
6487 | - # for details). So we must keep this workaround in place. |
---|
6488 | - |
---|
6489 | - #testv = (0, 1, 'eq', "") |
---|
6490 | - testv = tuple([0, 1, 'eq', ""]) |
---|
6491 | - |
---|
6492 | - testvs = [testv] |
---|
6493 | - # the write vector is simply the share |
---|
6494 | - writev = [(0, self.shares[shnum])] |
---|
6495 | - |
---|
6496 | - if peerid not in all_tw_vectors: |
---|
6497 | - all_tw_vectors[peerid] = {} |
---|
6498 | - # maps shnum to (testvs, writevs, new_length) |
---|
6499 | - assert shnum not in all_tw_vectors[peerid] |
---|
6500 | - |
---|
6501 | - all_tw_vectors[peerid][shnum] = (testvs, writev, None) |
---|
6502 | - |
---|
6503 | - # we read the checkstring back from each share, however we only use |
---|
6504 | - # it to detect whether there was a new share that we didn't know |
---|
6505 | - # about. The success or failure of the write will tell us whether |
---|
6506 | - # there was a collision or not. If there is a collision, the first |
---|
6507 | - # thing we'll do is update the servermap, which will find out what |
---|
6508 | - # happened. We could conceivably reduce a roundtrip by using the |
---|
6509 | - # readv checkstring to populate the servermap, but really we'd have |
---|
6510 | - # to read enough data to validate the signatures too, so it wouldn't |
---|
6511 | - # be an overall win. |
---|
6512 | - read_vector = [(0, struct.calcsize(SIGNED_PREFIX))] |
---|
6513 | - |
---|
6514 | - # ok, send the messages! |
---|
6515 | - self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY) |
---|
6516 | - started = time.time() |
---|
6517 | - for (peerid, tw_vectors) in all_tw_vectors.items(): |
---|
6518 | - |
---|
6519 | - write_enabler = self._node.get_write_enabler(peerid) |
---|
6520 | - renew_secret = self._node.get_renewal_secret(peerid) |
---|
6521 | - cancel_secret = self._node.get_cancel_secret(peerid) |
---|
6522 | - secrets = (write_enabler, renew_secret, cancel_secret) |
---|
6523 | - shnums = tw_vectors.keys() |
---|
6524 | - |
---|
6525 | - for shnum in shnums: |
---|
6526 | - self.outstanding.add( (peerid, shnum) ) |
---|
6527 | + elapsed = now - started |
---|
6528 | |
---|
6529 | hunk ./src/allmydata/mutable/publish.py 977 |
---|
6530 | - d = self._do_testreadwrite(peerid, secrets, |
---|
6531 | - tw_vectors, read_vector) |
---|
6532 | - d.addCallbacks(self._got_write_answer, self._got_write_error, |
---|
6533 | - callbackArgs=(peerid, shnums, started), |
---|
6534 | - errbackArgs=(peerid, shnums, started)) |
---|
6535 | - # tolerate immediate errback, like with DeadReferenceError |
---|
6536 | - d.addBoth(fireEventually) |
---|
6537 | - d.addCallback(self.loop) |
---|
6538 | - d.addErrback(self._fatal_error) |
---|
6539 | + self._status.add_per_server_time(peerid, elapsed) |
---|
6540 | |
---|
6541 | hunk ./src/allmydata/mutable/publish.py 979 |
---|
6542 | - self._update_status() |
---|
6543 | - self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY) |
---|
6544 | + wrote, read_data = answer |
---|
6545 | |
---|
6546 | hunk ./src/allmydata/mutable/publish.py 981 |
---|
6547 | - def _do_testreadwrite(self, peerid, secrets, |
---|
6548 | - tw_vectors, read_vector): |
---|
6549 | - storage_index = self._storage_index |
---|
6550 | - ss = self.connections[peerid] |
---|
6551 | + surprise_shares = set(read_data.keys()) - set([writer.shnum]) |
---|
6552 | |
---|
6553 | hunk ./src/allmydata/mutable/publish.py 983 |
---|
6554 | - #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName |
---|
6555 | - d = ss.callRemote("slot_testv_and_readv_and_writev", |
---|
6556 | - storage_index, |
---|
6557 | - secrets, |
---|
6558 | - tw_vectors, |
---|
6559 | - read_vector) |
---|
6560 | - return d |
---|
6561 | + # We need to remove from surprise_shares any shares that we are |
---|
6562 | + # knowingly also writing to that peer from other writers. |
---|
6563 | |
---|
6564 | hunk ./src/allmydata/mutable/publish.py 986 |
---|
6565 | - def _got_write_answer(self, answer, peerid, shnums, started): |
---|
6566 | - lp = self.log("_got_write_answer from %s" % |
---|
6567 | - idlib.shortnodeid_b2a(peerid)) |
---|
6568 | - for shnum in shnums: |
---|
6569 | - self.outstanding.discard( (peerid, shnum) ) |
---|
6570 | + # TODO: Precompute this. |
---|
6571 | + known_shnums = [x.shnum for x in self.writers.values() |
---|
6572 | + if x.peerid == peerid] |
---|
6573 | + surprise_shares -= set(known_shnums) |
---|
6574 | + self.log("found the following surprise shares: %s" % |
---|
6575 | + str(surprise_shares)) |
---|
6576 | |
---|
6577 | hunk ./src/allmydata/mutable/publish.py 993 |
---|
6578 | - now = time.time() |
---|
6579 | - elapsed = now - started |
---|
6580 | - self._status.add_per_server_time(peerid, elapsed) |
---|
6581 | - |
---|
6582 | - wrote, read_data = answer |
---|
6583 | - |
---|
6584 | - surprise_shares = set(read_data.keys()) - set(shnums) |
---|
6585 | + # Now surprise_shares contains all of the shares that we did not |
---|
6586 | + # expect to be there. |
---|
6587 | |
---|
6588 | surprised = False |
---|
6589 | for shnum in surprise_shares: |
---|
6590 | hunk ./src/allmydata/mutable/publish.py 1000 |
---|
6591 | # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX) |
---|
6592 | checkstring = read_data[shnum][0] |
---|
6593 | - their_version_info = unpack_checkstring(checkstring) |
---|
6594 | - if their_version_info == self._new_version_info: |
---|
6595 | + # What we want to do here is to see if their (seqnum, |
---|
6596 | + # roothash, salt) is the same as our (seqnum, roothash, |
---|
6597 | + # salt), or the equivalent for MDMF. The best way to do this |
---|
6598 | + # is to store a packed representation of our checkstring |
---|
6599 | + # somewhere, then not bother unpacking the other |
---|
6600 | + # checkstring. |
---|
6601 | + if checkstring == self._checkstring: |
---|
6602 | # they have the right share, somehow |
---|
6603 | |
---|
6604 | if (peerid,shnum) in self.goal: |
---|
6605 | hunk ./src/allmydata/mutable/publish.py 1085 |
---|
6606 | self.log("our testv failed, so the write did not happen", |
---|
6607 | parent=lp, level=log.WEIRD, umid="8sc26g") |
---|
6608 | self.surprised = True |
---|
6609 | - self.bad_peers.add(peerid) # don't ask them again |
---|
6610 | + self.bad_peers.add(writer) # don't ask them again |
---|
6611 | # use the checkstring to add information to the log message |
---|
6612 | for (shnum,readv) in read_data.items(): |
---|
6613 | checkstring = readv[0] |
---|
6614 | hunk ./src/allmydata/mutable/publish.py 1107 |
---|
6615 | # if expected_version==None, then we didn't expect to see a |
---|
6616 | # share on that peer, and the 'surprise_shares' clause above |
---|
6617 | # will have logged it. |
---|
6618 | - # self.loop() will take care of finding new homes |
---|
6619 | return |
---|
6620 | |
---|
6621 | hunk ./src/allmydata/mutable/publish.py 1109 |
---|
6622 | - for shnum in shnums: |
---|
6623 | - self.placed.add( (peerid, shnum) ) |
---|
6624 | - # and update the servermap |
---|
6625 | - self._servermap.add_new_share(peerid, shnum, |
---|
6626 | + # and update the servermap |
---|
6627 | + # self.versioninfo is set during the last phase of publishing. |
---|
6628 | + # If we get there, we know that responses correspond to placed |
---|
6629 | + # shares, and can safely execute these statements. |
---|
6630 | + if self.versioninfo: |
---|
6631 | + self.log("wrote successfully: adding new share to servermap") |
---|
6632 | + self._servermap.add_new_share(peerid, writer.shnum, |
---|
6633 | self.versioninfo, started) |
---|
6634 | hunk ./src/allmydata/mutable/publish.py 1117 |
---|
6635 | - |
---|
6636 | - # self.loop() will take care of checking to see if we're done |
---|
6637 | + self.placed.add( (peerid, writer.shnum) ) |
---|
6638 | + self._update_status() |
---|
6639 | + # the next method in the deferred chain will check to see if |
---|
6640 | + # we're done and successful. |
---|
6641 | return |
---|
6642 | |
---|
6643 | hunk ./src/allmydata/mutable/publish.py 1123 |
---|
6644 | - def _got_write_error(self, f, peerid, shnums, started): |
---|
6645 | - for shnum in shnums: |
---|
6646 | - self.outstanding.discard( (peerid, shnum) ) |
---|
6647 | - self.bad_peers.add(peerid) |
---|
6648 | - if self._first_write_error is None: |
---|
6649 | - self._first_write_error = f |
---|
6650 | - self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s", |
---|
6651 | - shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid), |
---|
6652 | - failure=f, |
---|
6653 | - level=log.UNUSUAL) |
---|
6654 | - # self.loop() will take care of checking to see if we're done |
---|
6655 | - return |
---|
6656 | - |
---|
6657 | |
---|
6658 | def _done(self, res): |
---|
6659 | if not self._running: |
---|
6660 | hunk ./src/allmydata/mutable/publish.py 1130 |
---|
6661 | self._running = False |
---|
6662 | now = time.time() |
---|
6663 | self._status.timings["total"] = now - self._started |
---|
6664 | + |
---|
6665 | + elapsed = now - self._started_pushing |
---|
6666 | + self._status.timings['push'] = elapsed |
---|
6667 | + |
---|
6668 | self._status.set_active(False) |
---|
6669 | hunk ./src/allmydata/mutable/publish.py 1135 |
---|
6670 | - if isinstance(res, failure.Failure): |
---|
6671 | - self.log("Publish done, with failure", failure=res, |
---|
6672 | - level=log.WEIRD, umid="nRsR9Q") |
---|
6673 | - self._status.set_status("Failed") |
---|
6674 | - elif self.surprised: |
---|
6675 | - self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL) |
---|
6676 | - self._status.set_status("UncoordinatedWriteError") |
---|
6677 | - # deliver a failure |
---|
6678 | - res = failure.Failure(UncoordinatedWriteError()) |
---|
6679 | - # TODO: recovery |
---|
6680 | - else: |
---|
6681 | - self.log("Publish done, success") |
---|
6682 | - self._status.set_status("Finished") |
---|
6683 | - self._status.set_progress(1.0) |
---|
6684 | + self.log("Publish done, success") |
---|
6685 | + self._status.set_status("Finished") |
---|
6686 | + self._status.set_progress(1.0) |
---|
6687 | eventually(self.done_deferred.callback, res) |
---|
6688 | |
---|
6689 | hunk ./src/allmydata/mutable/publish.py 1140 |
---|
6690 | + def _failure(self): |
---|
6691 | + |
---|
6692 | + if not self.surprised: |
---|
6693 | + # We ran out of servers |
---|
6694 | + self.log("Publish ran out of good servers, " |
---|
6695 | + "last failure was: %s" % str(self._last_failure)) |
---|
6696 | + e = NotEnoughServersError("Ran out of non-bad servers, " |
---|
6697 | + "last failure was %s" % |
---|
6698 | + str(self._last_failure)) |
---|
6699 | + else: |
---|
6700 | + # We ran into shares that we didn't recognize, which means |
---|
6701 | + # that we need to return an UncoordinatedWriteError. |
---|
6702 | + self.log("Publish failed with UncoordinatedWriteError") |
---|
6703 | + e = UncoordinatedWriteError() |
---|
6704 | + f = failure.Failure(e) |
---|
6705 | + eventually(self.done_deferred.callback, f) |
---|
6706 | + |
---|
6707 | + |
---|
6708 | +class MutableFileHandle: |
---|
6709 | + """ |
---|
6710 | + I am a mutable uploadable built around a filehandle-like object, |
---|
6711 | + usually either a StringIO instance or a handle to an actual file. |
---|
6712 | + """ |
---|
6713 | + implements(IMutableUploadable) |
---|
6714 | + |
---|
6715 | + def __init__(self, filehandle): |
---|
6716 | + # The filehandle is defined as a generally file-like object. We |
---|
6717 | + # check for read and close here, and also rely on seek and tell. |
---|
6718 | + assert hasattr(filehandle, "read") |
---|
6719 | + assert hasattr(filehandle, "close") |
---|
6720 | + |
---|
6721 | + self._filehandle = filehandle |
---|
6722 | + # We must start reading at the beginning of the file, or we risk |
---|
6723 | + # encountering errors when the data read does not match the size |
---|
6724 | + # reported to the uploader. |
---|
6725 | + self._filehandle.seek(0) |
---|
6726 | + |
---|
6727 | + # We have not yet read anything, so our position is 0. |
---|
6728 | + self._marker = 0 |
---|
6729 | + |
---|
6730 | + |
---|
6731 | + def get_size(self): |
---|
6732 | + """ |
---|
6733 | + I return the amount of data in my filehandle. |
---|
6734 | + """ |
---|
6735 | + if not hasattr(self, "_size"): |
---|
6736 | + old_position = self._filehandle.tell() |
---|
6737 | + # Seek to the end of the file by seeking 0 bytes from the |
---|
6738 | + # file's end |
---|
6739 | + self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+ |
---|
6740 | + self._size = self._filehandle.tell() |
---|
6741 | + # Restore the previous position, in case this was called |
---|
6742 | + # after a read. |
---|
6743 | + self._filehandle.seek(old_position) |
---|
6744 | + assert self._filehandle.tell() == old_position |
---|
6745 | + |
---|
6746 | + assert hasattr(self, "_size") |
---|
6747 | + return self._size |
---|
6748 | + |
---|
6749 | + |
---|
6750 | + def pos(self): |
---|
6751 | + """ |
---|
6752 | + I return the position of my read marker -- i.e., how much data I |
---|
6753 | + have already read and returned to callers. |
---|
6754 | + """ |
---|
6755 | + return self._marker |
---|
6756 | + |
---|
6757 | + |
---|
6758 | + def read(self, length): |
---|
6759 | + """ |
---|
6760 | + I return some data (up to length bytes) from my filehandle. |
---|
6761 | + |
---|
6762 | + In most cases, I return length bytes, but sometimes I won't -- |
---|
6763 | + for example, if I am asked to read beyond the end of a file, or |
---|
6764 | + an error occurs. |
---|
6765 | + """ |
---|
6766 | + results = self._filehandle.read(length) |
---|
6767 | + self._marker += len(results) |
---|
6768 | + return [results] |
---|
6769 | + |
---|
6770 | + |
---|
6771 | + def close(self): |
---|
6772 | + """ |
---|
6773 | + I close the underlying filehandle. Any further operations on the |
---|
6774 | + filehandle fail at this point. |
---|
6775 | + """ |
---|
6776 | + self._filehandle.close() |
---|
6777 | + |
---|
6778 | + |
---|
6779 | +class MutableData(MutableFileHandle): |
---|
6780 | + """ |
---|
6781 | + I am a mutable uploadable built around a string, which I then cast |
---|
6782 | + into a StringIO and treat as a filehandle. |
---|
6783 | + """ |
---|
6784 | + |
---|
6785 | + def __init__(self, s): |
---|
6786 | + # Take a string and return a file-like uploadable. |
---|
6787 | + assert isinstance(s, str) |
---|
6788 | + |
---|
6789 | + MutableFileHandle.__init__(self, StringIO(s)) |
---|
6790 | + |
---|
6791 | + |
---|
6792 | +class TransformingUploadable: |
---|
6793 | + """ |
---|
6794 | + I am an IMutableUploadable that wraps another IMutableUploadable, |
---|
6795 | + and some segments that are already on the grid. When I am called to |
---|
6796 | + read, I handle merging of boundary segments. |
---|
6797 | + """ |
---|
6798 | + implements(IMutableUploadable) |
---|
6799 | + |
---|
6800 | + |
---|
6801 | + def __init__(self, data, offset, segment_size, start, end): |
---|
6802 | + assert IMutableUploadable.providedBy(data) |
---|
6803 | + |
---|
6804 | + self._newdata = data |
---|
6805 | + self._offset = offset |
---|
6806 | + self._segment_size = segment_size |
---|
6807 | + self._start = start |
---|
6808 | + self._end = end |
---|
6809 | + |
---|
6810 | + self._read_marker = 0 |
---|
6811 | + |
---|
6812 | + self._first_segment_offset = offset % segment_size |
---|
6813 | + |
---|
6814 | + num = self.log("TransformingUploadable: starting", parent=None) |
---|
6815 | + self._log_number = num |
---|
6816 | + self.log("got fso: %d" % self._first_segment_offset) |
---|
6817 | + self.log("got offset: %d" % self._offset) |
---|
6818 | + |
---|
6819 | + |
---|
6820 | + def log(self, *args, **kwargs): |
---|
6821 | + if 'parent' not in kwargs: |
---|
6822 | + kwargs['parent'] = self._log_number |
---|
6823 | + if "facility" not in kwargs: |
---|
6824 | + kwargs["facility"] = "tahoe.mutable.transforminguploadable" |
---|
6825 | + return log.msg(*args, **kwargs) |
---|
6826 | + |
---|
6827 | + |
---|
6828 | + def get_size(self): |
---|
6829 | + return self._offset + self._newdata.get_size() |
---|
6830 | + |
---|
6831 | + |
---|
6832 | + def read(self, length): |
---|
6833 | + # We can get data from 3 sources here. |
---|
6834 | + # 1. The first of the segments provided to us. |
---|
6835 | + # 2. The data that we're replacing things with. |
---|
6836 | + # 3. The last of the segments provided to us. |
---|
6837 | + |
---|
6838 | + # are we in state 0? |
---|
6839 | + self.log("reading %d bytes" % length) |
---|
6840 | + |
---|
6841 | + old_start_data = "" |
---|
6842 | + old_data_length = self._first_segment_offset - self._read_marker |
---|
6843 | + if old_data_length > 0: |
---|
6844 | + if old_data_length > length: |
---|
6845 | + old_data_length = length |
---|
6846 | + self.log("returning %d bytes of old start data" % old_data_length) |
---|
6847 | + |
---|
6848 | + old_data_end = old_data_length + self._read_marker |
---|
6849 | + old_start_data = self._start[self._read_marker:old_data_end] |
---|
6850 | + length -= old_data_length |
---|
6851 | + else: |
---|
6852 | + # otherwise calculations later get screwed up. |
---|
6853 | + old_data_length = 0 |
---|
6854 | + |
---|
6855 | + # Is there enough new data to satisfy this read? If not, we need |
---|
6856 | + # to pad the end of the data with data from our last segment. |
---|
6857 | + old_end_length = length - \ |
---|
6858 | + (self._newdata.get_size() - self._newdata.pos()) |
---|
6859 | + old_end_data = "" |
---|
6860 | + if old_end_length > 0: |
---|
6861 | + self.log("reading %d bytes of old end data" % old_end_length) |
---|
6862 | + |
---|
6863 | + # TODO: We're not explicitly checking for tail segment size |
---|
6864 | + # here. Is that a problem? |
---|
6865 | + old_data_offset = (length - old_end_length + \ |
---|
6866 | + old_data_length) % self._segment_size |
---|
6867 | + self.log("reading at offset %d" % old_data_offset) |
---|
6868 | + old_end = old_data_offset + old_end_length |
---|
6869 | + old_end_data = self._end[old_data_offset:old_end] |
---|
6870 | + length -= old_end_length |
---|
6871 | + assert length == self._newdata.get_size() - self._newdata.pos() |
---|
6872 | + |
---|
6873 | + self.log("reading %d bytes of new data" % length) |
---|
6874 | + new_data = self._newdata.read(length) |
---|
6875 | + new_data = "".join(new_data) |
---|
6876 | + |
---|
6877 | + self._read_marker += len(old_start_data + new_data + old_end_data) |
---|
6878 | + |
---|
6879 | + return old_start_data + new_data + old_end_data |
---|
6880 | |
---|
6881 | hunk ./src/allmydata/mutable/publish.py 1331 |
---|
6882 | + def close(self): |
---|
6883 | + pass |
---|
6884 | } |
---|
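The size and read-marker bookkeeping that MutableFileHandle performs in the patch above can be sketched in isolation. This is a minimal illustration rather than the patch's code: the class name `FileHandleUploadable` is hypothetical, it is written for Python 3 with `io.BytesIO` (the patch itself targets Python 2 and StringIO), and it omits the zope interface declaration.

```python
import io
import os


class FileHandleUploadable:
    """Hypothetical stand-in for MutableFileHandle, for illustration only."""

    def __init__(self, filehandle):
        self._filehandle = filehandle
        # Start at the beginning so the bytes read match the reported size.
        self._filehandle.seek(0)
        self._marker = 0   # number of bytes handed back to callers so far
        self._size = None  # computed lazily by get_size()

    def get_size(self):
        if self._size is None:
            old_position = self._filehandle.tell()
            # Seek 0 bytes from the end, then ask where we are.
            self._filehandle.seek(0, os.SEEK_END)
            self._size = self._filehandle.tell()
            # Restore the position in case a read already happened.
            self._filehandle.seek(old_position)
        return self._size

    def pos(self):
        return self._marker

    def read(self, length):
        data = self._filehandle.read(length)
        self._marker += len(data)
        # Like the patch, return a list of strings rather than one string.
        return [data]


uploadable = FileHandleUploadable(io.BytesIO(b"hello world"))
first = uploadable.read(5)
size = uploadable.get_size()
rest = uploadable.read(100)
```

Calling `get_size()` between two reads does not disturb the read position, which is the property the patch asserts after restoring `old_position`.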
6885 | [mutable/retrieve.py: Modify the retrieval process to support MDMF |
---|
6886 | Kevan Carstensen <kevan@isnotajoke.com>**20100811233125 |
---|
6887 | Ignore-this: bb5f95e1d0e8bb734d43d5ed1550ce |
---|
6888 | |
---|
6889 | The logic behind a mutable file download had to be adapted to work with |
---|
6890 | segmented mutable files; this patch performs those adaptations. It also |
---|
6891 | exposes some decoding and decrypting functionality to make partial-file |
---|
6892 | updates a little easier, and supports efficient random-access downloads |
---|
6893 | of parts of an MDMF file. |
---|
6894 | ] { |
---|
6895 | hunk ./src/allmydata/mutable/retrieve.py 7 |
---|
6896 | from zope.interface import implements |
---|
6897 | from twisted.internet import defer |
---|
6898 | from twisted.python import failure |
---|
6899 | +from twisted.internet.interfaces import IPushProducer, IConsumer |
---|
6900 | from foolscap.api import DeadReferenceError, eventually, fireEventually |
---|
6901 | hunk ./src/allmydata/mutable/retrieve.py 9 |
---|
6902 | -from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError |
---|
6903 | -from allmydata.util import hashutil, idlib, log |
---|
6904 | +from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \ |
---|
6905 | + MDMF_VERSION, SDMF_VERSION |
---|
6906 | +from allmydata.util import hashutil, idlib, log, mathutil |
---|
6907 | from allmydata import hashtree, codec |
---|
6908 | from allmydata.storage.server import si_b2a |
---|
6909 | from pycryptopp.cipher.aes import AES |
---|
6910 | hunk ./src/allmydata/mutable/retrieve.py 18 |
---|
6911 | from pycryptopp.publickey import rsa |
---|
6912 | |
---|
6913 | from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError |
---|
6914 | -from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data |
---|
6915 | +from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \ |
---|
6916 | + MDMFSlotReadProxy |
---|
6917 | |
---|
6918 | class RetrieveStatus: |
---|
6919 | implements(IRetrieveStatus) |
---|
6920 | hunk ./src/allmydata/mutable/retrieve.py 86 |
---|
6921 | # times, and each will have a separate response chain. However the |
---|
6922 | # Retrieve object will remain tied to a specific version of the file, and |
---|
6923 | # will use a single ServerMap instance. |
---|
6924 | + implements(IPushProducer) |
---|
6925 | |
---|
6926 | hunk ./src/allmydata/mutable/retrieve.py 88 |
---|
6927 | - def __init__(self, filenode, servermap, verinfo, fetch_privkey=False): |
---|
6928 | + def __init__(self, filenode, servermap, verinfo, fetch_privkey=False, |
---|
6929 | + verify=False): |
---|
6930 | self._node = filenode |
---|
6931 | assert self._node.get_pubkey() |
---|
6932 | self._storage_index = filenode.get_storage_index() |
---|
6933 | hunk ./src/allmydata/mutable/retrieve.py 107 |
---|
6934 | self.verinfo = verinfo |
---|
6935 | # during repair, we may be called upon to grab the private key, since |
---|
6936 | # it wasn't picked up during a verify=False checker run, and we'll |
---|
6937 | - # need it for repair to generate the a new version. |
---|
6938 | - self._need_privkey = fetch_privkey |
---|
6939 | - if self._node.get_privkey(): |
---|
6940 | + # need it for repair to generate a new version. |
---|
6941 | + self._need_privkey = fetch_privkey or verify |
---|
6942 | + if self._node.get_privkey() and not verify: |
---|
6943 | self._need_privkey = False |
---|
6944 | |
---|
6945 | hunk ./src/allmydata/mutable/retrieve.py 112 |
---|
6946 | + if self._need_privkey: |
---|
6947 | + # TODO: Evaluate the need for this. We'll use it if we want |
---|
6948 | + # to limit how many queries are on the wire for the privkey |
---|
6949 | + # at once. |
---|
6950 | + self._privkey_query_markers = [] # one Marker for each time we've |
---|
6951 | + # tried to get the privkey. |
---|
6952 | + |
---|
6953 | + # verify means that we are using the downloader logic to verify all |
---|
6954 | + # of our shares. This tells the downloader a few things. |
---|
6955 | + # |
---|
6956 | + # 1. We need to download all of the shares. |
---|
6957 | + # 2. We don't need to decode or decrypt the shares, since our |
---|
6958 | + # caller doesn't care about the plaintext, only the |
---|
6959 | + # information about which shares are or are not valid. |
---|
6960 | + # 3. When we are validating readers, we need to validate the |
---|
6961 | + # signature on the prefix. Do we? We already do this in the |
---|
6962 | + # servermap update? |
---|
6963 | + self._verify = False |
---|
6964 | + if verify: |
---|
6965 | + self._verify = True |
---|
6966 | + |
---|
6967 | self._status = RetrieveStatus() |
---|
6968 | self._status.set_storage_index(self._storage_index) |
---|
6969 | self._status.set_helper(False) |
---|
6970 | hunk ./src/allmydata/mutable/retrieve.py 142 |
---|
6971 | offsets_tuple) = self.verinfo |
---|
6972 | self._status.set_size(datalength) |
---|
6973 | self._status.set_encoding(k, N) |
---|
6974 | + self.readers = {} |
---|
6975 | + self._paused = False |
---|
6976 | + self._pause_deferred = None |
---|
6977 | + self._offset = None |
---|
6978 | + self._read_length = None |
---|
6979 | + self.log("got seqnum %d" % self.verinfo[0]) |
---|
6980 | + |
---|
6981 | |
---|
6982 | def get_status(self): |
---|
6983 | return self._status |
---|
6984 | hunk ./src/allmydata/mutable/retrieve.py 160 |
---|
6985 | kwargs["facility"] = "tahoe.mutable.retrieve" |
---|
6986 | return log.msg(*args, **kwargs) |
---|
6987 | |
---|
6988 | - def download(self): |
---|
6989 | + |
---|
6990 | + ################### |
---|
6991 | + # IPushProducer |
---|
6992 | + |
---|
6993 | + def pauseProducing(self): |
---|
6994 | + """ |
---|
6995 | + I am called by my download target if we have produced too much |
---|
6996 | + data for it to handle. I make the downloader stop producing new |
---|
6997 | + data until my resumeProducing method is called. |
---|
6998 | + """ |
---|
6999 | + if self._paused: |
---|
7000 | + return |
---|
7001 | + |
---|
7002 | + self._old_status = self._status.get_status() |
---|
7003 | + self._status.set_status("Paused") |
---|
7004 | + |
---|
7005 | + # fired when the download is unpaused. |
---|
7006 | + self._pause_deferred = defer.Deferred() |
---|
7007 | + self._paused = True |
---|
7008 | + |
---|
7009 | + |
---|
7010 | + def resumeProducing(self): |
---|
7011 | + """ |
---|
7012 | + I am called by my download target once it is ready to begin |
---|
7013 | + receiving data again. |
---|
7014 | + """ |
---|
7015 | + if not self._paused: |
---|
7016 | + return |
---|
7017 | + |
---|
7018 | + self._paused = False |
---|
7019 | + p = self._pause_deferred |
---|
7020 | + self._pause_deferred = None |
---|
7021 | + self._status.set_status(self._old_status) |
---|
7022 | + |
---|
7023 | + eventually(p.callback, None) |
---|
7024 | + |
---|
7025 | + |
---|
7026 | + def _check_for_paused(self, res): |
---|
7027 | + """ |
---|
7028 | + I am called just before a write to the consumer. I return a |
---|
7029 | + Deferred that eventually fires with the data that is to be |
---|
7030 | + written to the consumer. If the download has not been paused, |
---|
7031 | + the Deferred fires immediately. Otherwise, the Deferred fires |
---|
7032 | + when the downloader is unpaused. |
---|
7033 | + """ |
---|
7034 | + if self._paused: |
---|
7035 | + d = defer.Deferred() |
---|
7036 | + self._pause_deferred.addCallback(lambda ignored: d.callback(res)) |
---|
7037 | + return d |
---|
7038 | + return defer.succeed(res) |
---|
7039 | + |
---|
7040 | + |
---|
7041 | + def download(self, consumer=None, offset=0, size=None): |
---|
7042 | + assert IConsumer.providedBy(consumer) or self._verify |
---|
7043 | + |
---|
7044 | + if consumer: |
---|
7045 | + self._consumer = consumer |
---|
7046 | + # we provide IPushProducer, so streaming=True, per |
---|
7047 | + # IConsumer. |
---|
7048 | + self._consumer.registerProducer(self, streaming=True) |
---|
7049 | + |
---|
7050 | self._done_deferred = defer.Deferred() |
---|
7051 | self._started = time.time() |
---|
7052 | self._status.set_status("Retrieving Shares") |
---|
7053 | hunk ./src/allmydata/mutable/retrieve.py 225 |
---|
7054 | |
---|
7055 | + self._offset = offset |
---|
7056 | + self._read_length = size |
---|
7057 | + |
---|
7058 | # first, which servers can we use? |
---|
7059 | versionmap = self.servermap.make_versionmap() |
---|
7060 | shares = versionmap[self.verinfo] |
---|
7061 | hunk ./src/allmydata/mutable/retrieve.py 235 |
---|
7062 | self.remaining_sharemap = DictOfSets() |
---|
7063 | for (shnum, peerid, timestamp) in shares: |
---|
7064 | self.remaining_sharemap.add(shnum, peerid) |
---|
7065 | + # If the servermap update fetched anything, it fetched at least 1 |
---|
7066 | + # KiB, so we ask for that much. |
---|
7067 | + # TODO: Change the cache methods to allow us to fetch all of the |
---|
7068 | + # data that they have, then change this method to do that. |
---|
7069 | + any_cache, timestamp = self._node._read_from_cache(self.verinfo, |
---|
7070 | + shnum, |
---|
7071 | + 0, |
---|
7072 | + 1000) |
---|
7073 | + ss = self.servermap.connections[peerid] |
---|
7074 | + reader = MDMFSlotReadProxy(ss, |
---|
7075 | + self._storage_index, |
---|
7076 | + shnum, |
---|
7077 | + any_cache) |
---|
7078 | + reader.peerid = peerid |
---|
7079 | + self.readers[shnum] = reader |
---|
7080 | + |
---|
7081 | |
---|
7082 | self.shares = {} # maps shnum to validated blocks |
---|
7083 | hunk ./src/allmydata/mutable/retrieve.py 253 |
---|
7084 | + self._active_readers = [] # list of active readers for this dl. |
---|
7085 | + self._validated_readers = set() # set of readers that we have |
---|
7086 | + # validated the prefix of |
---|
7087 | + self._block_hash_trees = {} # shnum => hashtree |
---|
7088 | |
---|
7089 | # how many shares do we need? |
---|
7090 | hunk ./src/allmydata/mutable/retrieve.py 259 |
---|
7091 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
7092 | + (seqnum, |
---|
7093 | + root_hash, |
---|
7094 | + IV, |
---|
7095 | + segsize, |
---|
7096 | + datalength, |
---|
7097 | + k, |
---|
7098 | + N, |
---|
7099 | + prefix, |
---|
7100 | offsets_tuple) = self.verinfo |
---|
7101 | hunk ./src/allmydata/mutable/retrieve.py 268 |
---|
7102 | - assert len(self.remaining_sharemap) >= k |
---|
7103 | - # we start with the lowest shnums we have available, since FEC is |
---|
7104 | - # faster if we're using "primary shares" |
---|
7105 | - self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k]) |
---|
7106 | - for shnum in self.active_shnums: |
---|
7107 | - # we use an arbitrary peer who has the share. If shares are |
---|
7108 | - # doubled up (more than one share per peer), we could make this |
---|
7109 | - # run faster by spreading the load among multiple peers. But the |
---|
7110 | - # algorithm to do that is more complicated than I want to write |
---|
7111 | - # right now, and a well-provisioned grid shouldn't have multiple |
---|
7112 | - # shares per peer. |
---|
7113 | - peerid = list(self.remaining_sharemap[shnum])[0] |
---|
7114 | - self.get_data(shnum, peerid) |
---|
7115 | |
---|
7116 | hunk ./src/allmydata/mutable/retrieve.py 269 |
---|
7117 | - # control flow beyond this point: state machine. Receiving responses |
---|
7118 | - # from queries is the input. We might send out more queries, or we |
---|
7119 | - # might produce a result. |
---|
7120 | |
---|
7121 | hunk ./src/allmydata/mutable/retrieve.py 270 |
---|
7122 | + # We need one share hash tree for the entire file; its leaves |
---|
7123 | + # are the roots of the block hash trees for the shares that |
---|
7124 | + # comprise it, and its root is in the verinfo. |
---|
7125 | + self.share_hash_tree = hashtree.IncompleteHashTree(N) |
---|
7126 | + self.share_hash_tree.set_hashes({0: root_hash}) |
---|
7127 | + |
---|
7128 | + # This will set up both the segment decoder and the tail segment |
---|
7129 | + # decoder, as well as a variety of other instance variables that |
---|
7130 | + # the download process will use. |
---|
7131 | + self._setup_encoding_parameters() |
---|
7132 | + assert len(self.remaining_sharemap) >= k |
---|
7133 | + |
---|
7134 | + self.log("starting download") |
---|
7135 | + self._paused = False |
---|
7136 | + self._started_fetching = time.time() |
---|
7137 | + |
---|
7138 | + self._add_active_peers() |
---|
7139 | + # The download process beyond this is a state machine. |
---|
7140 | + # _add_active_peers will select the peers that we want to use |
---|
7141 | + # for the download, and then attempt to start downloading. After |
---|
7142 | + # each segment, it will check for doneness, reacting to broken |
---|
7143 | + # peers and corrupt shares as necessary. If it runs out of good |
---|
7144 | + # peers before downloading all of the segments, _done_deferred |
---|
7145 | + # will errback. Otherwise, it will eventually callback with the |
---|
7146 | + # contents of the mutable file. |
---|
7147 | return self._done_deferred |
---|
7148 | |
---|
7149 | hunk ./src/allmydata/mutable/retrieve.py 297 |
---|
7150 | - def get_data(self, shnum, peerid): |
---|
7151 | - self.log(format="sending sh#%(shnum)d request to [%(peerid)s]", |
---|
7152 | - shnum=shnum, |
---|
7153 | - peerid=idlib.shortnodeid_b2a(peerid), |
---|
7154 | - level=log.NOISY) |
---|
7155 | - ss = self.servermap.connections[peerid] |
---|
7156 | - started = time.time() |
---|
7157 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
7158 | + |
---|
7159 | + def decode(self, blocks_and_salts, segnum): |
---|
7160 | + """ |
---|
7161 | + I am a helper method that the mutable file update process uses |
---|
7162 | + as a shortcut to decode and decrypt the segments that it needs |
---|
7163 | + to fetch in order to perform a file update. I take in a |
---|
7164 | + collection of blocks and salts, and pick some of those to make a |
---|
7165 | + segment with. I return the plaintext associated with that |
---|
7166 | + segment. |
---|
7167 | + """ |
---|
7168 | + # shnum => block hash tree. Unused, but _setup_encoding_parameters will |
---|
7169 | + # want to set this. |
---|
7170 | + # XXX: Make it so that it won't set this if we're just decoding. |
---|
7171 | + self._block_hash_trees = {} |
---|
7172 | + self._setup_encoding_parameters() |
---|
7173 | + # This is the form expected by decode. |
---|
7174 | + blocks_and_salts = blocks_and_salts.items() |
---|
7175 | + blocks_and_salts = [(True, [d]) for d in blocks_and_salts] |
---|
7176 | + |
---|
7177 | + d = self._decode_blocks(blocks_and_salts, segnum) |
---|
7178 | + d.addCallback(self._decrypt_segment) |
---|
7179 | + return d |
---|
7180 | + |
---|
7181 | + |
---|
7182 | + def _setup_encoding_parameters(self): |
---|
7183 | + """ |
---|
7184 | + I set up the encoding parameters, including k, n, the number |
---|
7185 | + of segments associated with this file, and the segment decoder. |
---|
7186 | + """ |
---|
7187 | + (seqnum, |
---|
7188 | + root_hash, |
---|
7189 | + IV, |
---|
7190 | + segsize, |
---|
7191 | + datalength, |
---|
7192 | + k, |
---|
7193 | + n, |
---|
7194 | + known_prefix, |
---|
7195 | offsets_tuple) = self.verinfo |
---|
7196 | hunk ./src/allmydata/mutable/retrieve.py 335 |
---|
7197 | - offsets = dict(offsets_tuple) |
---|
7198 | + self._required_shares = k |
---|
7199 | + self._total_shares = n |
---|
7200 | + self._segment_size = segsize |
---|
7201 | + self._data_length = datalength |
---|
7202 | |
---|
7203 | hunk ./src/allmydata/mutable/retrieve.py 340 |
---|
7204 | - # we read the checkstring, to make sure that the data we grab is from |
---|
7205 | - # the right version. |
---|
7206 | - readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ] |
---|
7207 | + if not IV: |
---|
7208 | + self._version = MDMF_VERSION |
---|
7209 | + else: |
---|
7210 | + self._version = SDMF_VERSION |
---|
7211 | |
---|
7212 | hunk ./src/allmydata/mutable/retrieve.py 345 |
---|
7213 | - # We also read the data, and the hashes necessary to validate them |
---|
7214 | - # (share_hash_chain, block_hash_tree, share_data). We don't read the |
---|
7215 | - # signature or the pubkey, since that was handled during the |
---|
7216 | - # servermap phase, and we'll be comparing the share hash chain |
---|
7217 | - # against the roothash that was validated back then. |
---|
7218 | + if datalength and segsize: |
---|
7219 | + self._num_segments = mathutil.div_ceil(datalength, segsize) |
---|
7220 | + self._tail_data_size = datalength % segsize |
---|
7221 | + else: |
---|
7222 | + self._num_segments = 0 |
---|
7223 | + self._tail_data_size = 0 |
---|
7224 | |
---|
7225 | hunk ./src/allmydata/mutable/retrieve.py 352 |
---|
7226 | - readv.append( (offsets['share_hash_chain'], |
---|
7227 | - offsets['enc_privkey'] - offsets['share_hash_chain'] ) ) |
---|
7228 | + self._segment_decoder = codec.CRSDecoder() |
---|
7229 | + self._segment_decoder.set_params(segsize, k, n) |
---|
7230 | |
---|
7231 | hunk ./src/allmydata/mutable/retrieve.py 355 |
---|
7232 | - # if we need the private key (for repair), we also fetch that |
---|
7233 | - if self._need_privkey: |
---|
7234 | - readv.append( (offsets['enc_privkey'], |
---|
7235 | - offsets['EOF'] - offsets['enc_privkey']) ) |
---|
7236 | + if not self._tail_data_size: |
---|
7237 | + self._tail_data_size = segsize |
---|
7238 | + |
---|
7239 | + self._tail_segment_size = mathutil.next_multiple(self._tail_data_size, |
---|
7240 | + self._required_shares) |
---|
7241 | + if self._tail_segment_size == self._segment_size: |
---|
7242 | + self._tail_decoder = self._segment_decoder |
---|
7243 | + else: |
---|
7244 | + self._tail_decoder = codec.CRSDecoder() |
---|
7245 | + self._tail_decoder.set_params(self._tail_segment_size, |
---|
7246 | + self._required_shares, |
---|
7247 | + self._total_shares) |
---|
7248 | |
---|
7249 | hunk ./src/allmydata/mutable/retrieve.py 368 |
---|
7250 | - m = Marker() |
---|
7251 | - self._outstanding_queries[m] = (peerid, shnum, started) |
---|
7252 | + self.log("got encoding parameters: " |
---|
7253 | + "k: %d " |
---|
7254 | + "n: %d " |
---|
7255 | + "%d segments of %d bytes each (%d byte tail segment)" % \ |
---|
7256 | + (k, n, self._num_segments, self._segment_size, |
---|
7257 | + self._tail_segment_size)) |
---|
7258 | |
---|
7259 | hunk ./src/allmydata/mutable/retrieve.py 375 |
---|
7260 | - # ask the cache first |
---|
7261 | - got_from_cache = False |
---|
7262 | - datavs = [] |
---|
7263 | - for (offset, length) in readv: |
---|
7264 | - (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum, |
---|
7265 | - offset, length) |
---|
7266 | - if data is not None: |
---|
7267 | - datavs.append(data) |
---|
7268 | - if len(datavs) == len(readv): |
---|
7269 | - self.log("got data from cache") |
---|
7270 | - got_from_cache = True |
---|
7271 | - d = fireEventually({shnum: datavs}) |
---|
7272 | - # datavs is a dict mapping shnum to a pair of strings |
---|
7273 | + for i in xrange(self._total_shares): |
---|
7274 | + # So we don't have to do this later. |
---|
7275 | + self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments) |
---|
7276 | + |
---|
7277 | + # Our last task is to tell the downloader where to start and |
---|
7278 | + # where to stop. We use three parameters for that: |
---|
7279 | + # - self._start_segment: the segment that we need to start |
---|
7280 | + # downloading from. |
---|
7281 | + # - self._current_segment: the next segment that we need to |
---|
7282 | + # download. |
---|
7283 | + # - self._last_segment: The last segment that we were asked to |
---|
7284 | + # download. |
---|
7285 | + # |
---|
7286 | + # We say that the download is complete when |
---|
7287 | + # self._current_segment > self._last_segment. We use |
---|
7288 | + # self._start_segment and self._last_segment to know when to |
---|
7289 | + # strip things off of segments, and how much to strip. |
---|
7290 | + if self._offset: |
---|
7291 | + self.log("got offset: %d" % self._offset) |
---|
7292 | + # our start segment is the first segment containing the |
---|
7293 | + # offset we were given. |
---|
7294 | + start = mathutil.div_ceil(self._offset + 1, |
---|
7295 | + self._segment_size) |
---|
7296 | + # this gets us the first segment after self._offset. Then |
---|
7297 | + # our start segment is the one before it. |
---|
7298 | + start -= 1 |
---|
7299 | + |
---|
7300 | + assert start < self._num_segments |
---|
7301 | + self._start_segment = start |
---|
7302 | + self.log("got start segment: %d" % self._start_segment) |
---|
7303 | else: |
---|
7304 | hunk ./src/allmydata/mutable/retrieve.py 406 |
---|
7305 | - d = self._do_read(ss, peerid, self._storage_index, [shnum], readv) |
---|
7306 | - self.remaining_sharemap.discard(shnum, peerid) |
---|
7307 | + self._start_segment = 0 |
---|
7308 | |
---|
7309 | hunk ./src/allmydata/mutable/retrieve.py 408 |
---|
7310 | - d.addCallback(self._got_results, m, peerid, started, got_from_cache) |
---|
7311 | - d.addErrback(self._query_failed, m, peerid) |
---|
7312 | - # errors that aren't handled by _query_failed (and errors caused by |
---|
7313 | - # _query_failed) get logged, but we still want to check for doneness. |
---|
7314 | - def _oops(f): |
---|
7315 | - self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s", |
---|
7316 | - shnum=shnum, |
---|
7317 | - peerid=idlib.shortnodeid_b2a(peerid), |
---|
7318 | - failure=f, |
---|
7319 | - level=log.WEIRD, umid="W0xnQA") |
---|
7320 | - d.addErrback(_oops) |
---|
7321 | - d.addBoth(self._check_for_done) |
---|
7322 | - # any error during _check_for_done means the download fails. If the |
---|
7323 | - # download is successful, _check_for_done will fire _done by itself. |
---|
7324 | - d.addErrback(self._done) |
---|
7325 | - d.addErrback(log.err) |
---|
7326 | - return d # purely for testing convenience |
---|
7327 | |
---|
7328 | hunk ./src/allmydata/mutable/retrieve.py 409 |
---|
7329 | - def _do_read(self, ss, peerid, storage_index, shnums, readv): |
---|
7330 | - # isolate the callRemote to a separate method, so tests can subclass |
---|
7331 | - # Publish and override it |
---|
7332 | - d = ss.callRemote("slot_readv", storage_index, shnums, readv) |
---|
7333 | - return d |
---|
7334 | + if self._read_length: |
---|
7335 | + # our end segment is the last segment containing part of the |
---|
7336 | + # data that we were asked to read. |
---|
7337 | + self.log("got read length %d" % self._read_length) |
---|
7338 | + end_data = self._offset + self._read_length |
---|
7339 | + end = mathutil.div_ceil(end_data, |
---|
7340 | + self._segment_size) |
---|
7341 | + end -= 1 |
---|
7342 | + assert end < self._num_segments |
---|
7343 | + self._last_segment = end |
---|
7344 | + self.log("got end segment: %d" % self._last_segment) |
---|
7345 | + else: |
---|
7346 | + self._last_segment = self._num_segments - 1 |
---|
7347 | |
---|
7348 | hunk ./src/allmydata/mutable/retrieve.py 423 |
---|
7349 | - def remove_peer(self, peerid): |
---|
7350 | - for shnum in list(self.remaining_sharemap.keys()): |
---|
7351 | - self.remaining_sharemap.discard(shnum, peerid) |
---|
7352 | + self._current_segment = self._start_segment |
---|
7353 | |
---|
7354 | hunk ./src/allmydata/mutable/retrieve.py 425 |
---|
7355 | - def _got_results(self, datavs, marker, peerid, started, got_from_cache): |
---|
7356 | - now = time.time() |
---|
7357 | - elapsed = now - started |
---|
7358 | - if not got_from_cache: |
---|
7359 | - self._status.add_fetch_timing(peerid, elapsed) |
---|
7360 | - self.log(format="got results (%(shares)d shares) from [%(peerid)s]", |
---|
7361 | - shares=len(datavs), |
---|
7362 | - peerid=idlib.shortnodeid_b2a(peerid), |
---|
7363 | - level=log.NOISY) |
---|
7364 | - self._outstanding_queries.pop(marker, None) |
---|
7365 | - if not self._running: |
---|
7366 | - return |
---|
7367 | + def _add_active_peers(self): |
---|
7368 | + """ |
---|
7369 | + I populate self._active_readers with enough active readers to |
---|
7370 | + retrieve the contents of this mutable file. I am called before |
---|
7371 | + downloading starts, and (eventually) after each validation |
---|
7372 | + error, connection error, or other problem in the download. |
---|
7373 | + """ |
---|
7374 | + # TODO: It would be cool to investigate other heuristics for |
---|
7375 | + # reader selection. For instance, the cost (in time the user |
---|
7376 | + # spends waiting for their file) of selecting a really slow peer |
---|
7377 | + # that happens to have a primary share is probably more than |
---|
7378 | + # selecting a really fast peer that doesn't have a primary |
---|
7379 | + # share. Maybe the servermap could be extended to provide this |
---|
7380 | + # information; it could keep track of latency information while |
---|
7381 | + # it gathers more important data, and then this routine could |
---|
7382 | + # use that to select active readers. |
---|
7383 | + # |
---|
7384 | + # (these and other questions would be easier to answer with a |
---|
7385 | + # robust, configurable tahoe-lafs simulator, which modeled node |
---|
7386 | + # failures, differences in node speed, and other characteristics |
---|
7387 | + # that we expect storage servers to have. You could have |
---|
7388 | + # presets for really stable grids (like allmydata.com) and |
---|
7389 | + # friendnets, make it easy to configure your own settings, and |
---|
7390 | + # then simulate the effect of big changes on these use cases |
---|
7391 | + # instead of just reasoning about what the effect might be. Out |
---|
7392 | + # of scope for MDMF, though.) |
---|
7393 | |
---|
7394 | hunk ./src/allmydata/mutable/retrieve.py 452 |
---|
7395 | - # note that we only ask for a single share per query, so we only |
---|
7396 | - # expect a single share back. On the other hand, we use the extra |
---|
7397 | - # shares if we get them.. seems better than an assert(). |
---|
7398 | + # We need at least self._required_shares readers to download a |
---|
7399 | + # segment. |
---|
7400 | + if self._verify: |
---|
7401 | + needed = self._total_shares |
---|
7402 | + else: |
---|
7403 | + needed = self._required_shares - len(self._active_readers) |
---|
7404 | + # XXX: Why don't format= log messages work here? |
---|
7405 | + self.log("adding %d peers to the active peers list" % needed) |
---|
7406 | |
---|
7407 | hunk ./src/allmydata/mutable/retrieve.py 461 |
---|
7408 | - for shnum,datav in datavs.items(): |
---|
7409 | - (prefix, hash_and_data) = datav[:2] |
---|
7410 | - try: |
---|
7411 | - self._got_results_one_share(shnum, peerid, |
---|
7412 | - prefix, hash_and_data) |
---|
7413 | - except CorruptShareError, e: |
---|
7414 | - # log it and give the other shares a chance to be processed |
---|
7415 | - f = failure.Failure() |
---|
7416 | - self.log(format="bad share: %(f_value)s", |
---|
7417 | - f_value=str(f.value), failure=f, |
---|
7418 | - level=log.WEIRD, umid="7fzWZw") |
---|
7419 | - self.notify_server_corruption(peerid, shnum, str(e)) |
---|
7420 | - self.remove_peer(peerid) |
---|
7421 | - self.servermap.mark_bad_share(peerid, shnum, prefix) |
---|
7422 | - self._bad_shares.add( (peerid, shnum) ) |
---|
7423 | - self._status.problems[peerid] = f |
---|
7424 | - self._last_failure = f |
---|
7425 | - pass |
---|
7426 | - if self._need_privkey and len(datav) > 2: |
---|
7427 | - lp = None |
---|
7428 | - self._try_to_validate_privkey(datav[2], peerid, shnum, lp) |
---|
7429 | - # all done! |
---|
7430 | + # We favor lower numbered shares, since FEC is faster with |
---|
7431 | + # primary shares than with other shares, and lower-numbered |
---|
7432 | + # shares are more likely to be primary than higher numbered |
---|
7433 | + # shares. |
---|
7434 | + active_shnums = set(sorted(self.remaining_sharemap.keys())) |
---|
7435 | + # We shouldn't consider adding shares that we already have; this |
---|
7436 | + # will cause problems later. |
---|
7437 | + active_shnums -= set([reader.shnum for reader in self._active_readers]) |
---|
7438 | + active_shnums = sorted(active_shnums)[:needed] |
---|
7439 | + if len(active_shnums) < needed and not self._verify: |
---|
7440 | + # We don't have enough readers to retrieve the file; fail. |
---|
7441 | + return self._failed() |
---|
7442 | |
---|
7443 | hunk ./src/allmydata/mutable/retrieve.py 474 |
---|
7444 | - def notify_server_corruption(self, peerid, shnum, reason): |
---|
7445 | - ss = self.servermap.connections[peerid] |
---|
7446 | - ss.callRemoteOnly("advise_corrupt_share", |
---|
7447 | - "mutable", self._storage_index, shnum, reason) |
---|
7448 | + for shnum in active_shnums: |
---|
7449 | + self._active_readers.append(self.readers[shnum]) |
---|
7450 | + self.log("added reader for share %d" % shnum) |
---|
7451 | + assert len(self._active_readers) >= self._required_shares |
---|
7452 | + # Conceptually, this is part of the _add_active_peers step. It |
---|
7453 | + # validates the prefixes of newly added readers to make sure |
---|
7454 | + # that they match what we are expecting for self.verinfo. If |
---|
7455 | + # validation is successful, _validate_active_prefixes will call |
---|
7456 | + # _download_current_segment for us. If validation is |
---|
7457 | + # unsuccessful, then _validate_active_prefixes will remove the peer and |
---|
7458 | + # call _add_active_peers again, where we will attempt to rectify |
---|
7459 | + # the problem by choosing another peer. |
---|
7460 | + return self._validate_active_prefixes() |
---|
7461 | |
---|
7462 | hunk ./src/allmydata/mutable/retrieve.py 488 |
---|
7463 | - def _got_results_one_share(self, shnum, peerid, |
---|
7464 | - got_prefix, got_hash_and_data): |
---|
7465 | - self.log("_got_results: got shnum #%d from peerid %s" |
---|
7466 | - % (shnum, idlib.shortnodeid_b2a(peerid))) |
---|
7467 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
7468 | - offsets_tuple) = self.verinfo |
---|
7469 | - assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix)) |
---|
7470 | - if got_prefix != prefix: |
---|
7471 | - msg = "someone wrote to the data since we read the servermap: prefix changed" |
---|
7472 | - raise UncoordinatedWriteError(msg) |
---|
7473 | - (share_hash_chain, block_hash_tree, |
---|
7474 | - share_data) = unpack_share_data(self.verinfo, got_hash_and_data) |
---|
7475 | |
---|
7476 | hunk ./src/allmydata/mutable/retrieve.py 489 |
---|
7477 | - assert isinstance(share_data, str) |
---|
7478 | - # build the block hash tree. SDMF has only one leaf. |
---|
7479 | - leaves = [hashutil.block_hash(share_data)] |
---|
7480 | - t = hashtree.HashTree(leaves) |
---|
7481 | - if list(t) != block_hash_tree: |
---|
7482 | - raise CorruptShareError(peerid, shnum, "block hash tree failure") |
---|
7483 | - share_hash_leaf = t[0] |
---|
7484 | - t2 = hashtree.IncompleteHashTree(N) |
---|
7485 | - # root_hash was checked by the signature |
---|
7486 | - t2.set_hashes({0: root_hash}) |
---|
7487 | - try: |
---|
7488 | - t2.set_hashes(hashes=share_hash_chain, |
---|
7489 | - leaves={shnum: share_hash_leaf}) |
---|
7490 | - except (hashtree.BadHashError, hashtree.NotEnoughHashesError, |
---|
7491 | - IndexError), e: |
---|
7492 | - msg = "corrupt hashes: %s" % (e,) |
---|
7493 | - raise CorruptShareError(peerid, shnum, msg) |
---|
7494 | - self.log(" data valid! len=%d" % len(share_data)) |
---|
7495 | - # each query comes down to this: placing validated share data into |
---|
7496 | - # self.shares |
---|
7497 | - self.shares[shnum] = share_data |
---|
7498 | + def _validate_active_prefixes(self): |
---|
7499 | + """ |
---|
7500 | + I check to make sure that the prefixes on the peers that I am |
---|
7501 | + currently reading from match the prefix that we want to see, as |
---|
7502 | + recorded in self.verinfo. |
---|
7503 | |
---|
7504 | hunk ./src/allmydata/mutable/retrieve.py 495 |
---|
7505 | - def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp): |
---|
7506 | + If I find that all of the active peers have acceptable prefixes, |
---|
7507 | + I pass control to _download_current_segment, which will use |
---|
7508 | + those peers to do cool things. If I find that some of the active |
---|
7509 | + peers have unacceptable prefixes, I will remove them from active |
---|
7510 | + peers (and from further consideration) and call |
---|
7511 | + _add_active_peers to attempt to rectify the situation. I keep |
---|
7512 | + track of which peers I have already validated so that I don't |
---|
7513 | + need to do so again. |
---|
7514 | + """ |
---|
7515 | + assert self._active_readers, "No more active readers" |
---|
7516 | |
---|
7517 | hunk ./src/allmydata/mutable/retrieve.py 506 |
---|
7518 | - alleged_privkey_s = self._node._decrypt_privkey(enc_privkey) |
---|
7519 | - alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s) |
---|
7520 | - if alleged_writekey != self._node.get_writekey(): |
---|
7521 | - self.log("invalid privkey from %s shnum %d" % |
---|
7522 | - (idlib.nodeid_b2a(peerid)[:8], shnum), |
---|
7523 | - parent=lp, level=log.WEIRD, umid="YIw4tA") |
---|
7524 | - return |
---|
7525 | + ds = [] |
---|
7526 | + new_readers = set(self._active_readers) - self._validated_readers |
---|
7527 | + self.log('validating %d newly-added active readers' % len(new_readers)) |
---|
7528 | |
---|
7529 | hunk ./src/allmydata/mutable/retrieve.py 510 |
---|
7530 | - # it's good |
---|
7531 | - self.log("got valid privkey from shnum %d on peerid %s" % |
---|
7532 | - (shnum, idlib.shortnodeid_b2a(peerid)), |
---|
7533 | - parent=lp) |
---|
7534 | - privkey = rsa.create_signing_key_from_string(alleged_privkey_s) |
---|
7535 | - self._node._populate_encprivkey(enc_privkey) |
---|
7536 | - self._node._populate_privkey(privkey) |
---|
7537 | - self._need_privkey = False |
---|
7538 | + for reader in new_readers: |
---|
7539 | + # We force a remote read here -- otherwise, we are relying |
---|
7540 | + # on cached data that we already verified as valid, and we |
---|
7541 | + # won't detect an uncoordinated write that has occurred |
---|
7542 | + # since the last servermap update. |
---|
7543 | + d = reader.get_prefix(force_remote=True) |
---|
7544 | + d.addCallback(self._try_to_validate_prefix, reader) |
---|
7545 | + ds.append(d) |
---|
7546 | + dl = defer.DeferredList(ds, consumeErrors=True) |
---|
7547 | + def _check_results(results): |
---|
7548 | + # Each result in results will be of the form (success, msg). |
---|
7549 | + # We don't care about msg, but success will tell us whether |
---|
7550 | + # or not the checkstring validated. If it didn't, we need to |
---|
7551 | + # remove the offending (peer,share) from our active readers, |
---|
7552 | + # and ensure that active readers is again populated. |
---|
7553 | + bad_readers = [] |
---|
7554 | + for i, result in enumerate(results): |
---|
7555 | + if not result[0]: |
---|
7556 | + reader = self._active_readers[i] |
---|
7557 | + f = result[1] |
---|
7558 | + assert isinstance(f, failure.Failure) |
---|
7559 | |
---|
7560 | hunk ./src/allmydata/mutable/retrieve.py 532 |
---|
7561 | - def _query_failed(self, f, marker, peerid): |
---|
7562 | - self.log(format="query to [%(peerid)s] failed", |
---|
7563 | - peerid=idlib.shortnodeid_b2a(peerid), |
---|
7564 | - level=log.NOISY) |
---|
7565 | - self._status.problems[peerid] = f |
---|
7566 | - self._outstanding_queries.pop(marker, None) |
---|
7567 | - if not self._running: |
---|
7568 | - return |
---|
7569 | - self._last_failure = f |
---|
7570 | - self.remove_peer(peerid) |
---|
7571 | - level = log.WEIRD |
---|
7572 | - if f.check(DeadReferenceError): |
---|
7573 | - level = log.UNUSUAL |
---|
7574 | - self.log(format="error during query: %(f_value)s", |
---|
7575 | - f_value=str(f.value), failure=f, level=level, umid="gOJB5g") |
---|
7576 | + self.log("The reader %s failed to " |
---|
7577 | + "properly validate: %s" % \ |
---|
7578 | + (reader, str(f.value))) |
---|
7579 | + bad_readers.append((reader, f)) |
---|
7580 | + else: |
---|
7581 | + reader = self._active_readers[i] |
---|
7582 | + self.log("the reader %s checks out, so we'll use it" % \ |
---|
7583 | + reader) |
---|
7584 | + self._validated_readers.add(reader) |
---|
7585 | + # Each time we validate a reader, we check to see if |
---|
7586 | + # we need the private key. If we do, we politely ask |
---|
7587 | + # for it and then continue computing. If we find |
---|
7588 | + # that we haven't gotten it at the end of |
---|
7589 | + # segment decoding, then we'll take more drastic |
---|
7590 | + # measures. |
---|
7591 | + if self._need_privkey and not self._node.is_readonly(): |
---|
7592 | + d = reader.get_encprivkey() |
---|
7593 | + d.addCallback(self._try_to_validate_privkey, reader) |
---|
7594 | + if bad_readers: |
---|
7595 | + # We do them all at once, or else we screw up list indexing. |
---|
7596 | + for (reader, f) in bad_readers: |
---|
7597 | + self._mark_bad_share(reader, f) |
---|
7598 | + if self._verify: |
---|
7599 | + if len(self._active_readers) >= self._required_shares: |
---|
7600 | + return self._download_current_segment() |
---|
7601 | + else: |
---|
7602 | + return self._failed() |
---|
7603 | + else: |
---|
7604 | + return self._add_active_peers() |
---|
7605 | + else: |
---|
7606 | + return self._download_current_segment() |
---|
7607 | + # The next step will assert that it has enough active |
---|
7608 | + # readers to fetch shares; we just need to remove it. |
---|
7609 | + dl.addCallback(_check_results) |
---|
7610 | + return dl |
---|
7611 | |
---|
7612 | hunk ./src/allmydata/mutable/retrieve.py 568 |
---|
7613 | - def _check_for_done(self, res): |
---|
7614 | - # exit paths: |
---|
7615 | - # return : keep waiting, no new queries |
---|
7616 | - # return self._send_more_queries(outstanding) : send some more queries |
---|
7617 | - # fire self._done(plaintext) : download successful |
---|
7618 | - # raise exception : download fails |
---|
7619 | |
---|
7620 | hunk ./src/allmydata/mutable/retrieve.py 569 |
---|
7621 | - self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s", |
---|
7622 | - running=self._running, decoding=self._decoding, |
---|
7623 | - level=log.NOISY) |
---|
7624 | - if not self._running: |
---|
7625 | - return |
---|
7626 | - if self._decoding: |
---|
7627 | - return |
---|
7628 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
7629 | + def _try_to_validate_prefix(self, prefix, reader): |
---|
7630 | + """ |
---|
7631 | + I check that the prefix returned by a candidate server for |
---|
7632 | + retrieval matches the prefix that the servermap knows about |
---|
7633 | + (and, hence, the prefix that was validated earlier). If it does, |
---|
7634 | + I return True, which means that I approve of the use of the |
---|
7635 | + candidate server for segment retrieval. If it doesn't, I return |
---|
7636 | + False, which means that another server must be chosen. |
---|
7637 | + """ |
---|
7638 | + (seqnum, |
---|
7639 | + root_hash, |
---|
7640 | + IV, |
---|
7641 | + segsize, |
---|
7642 | + datalength, |
---|
7643 | + k, |
---|
7644 | + N, |
---|
7645 | + known_prefix, |
---|
7646 | offsets_tuple) = self.verinfo |
---|
7647 | hunk ./src/allmydata/mutable/retrieve.py 587 |
---|
7648 | + if known_prefix != prefix: |
---|
7649 | + self.log("prefix from share %d doesn't match" % reader.shnum) |
---|
7650 | + raise UncoordinatedWriteError("Mismatched prefix -- this could " |
---|
7651 | + "indicate an uncoordinated write") |
---|
7652 | + # Otherwise, we're okay -- no issues. |
---|
7653 | |
---|
7654 | hunk ./src/allmydata/mutable/retrieve.py 593 |
---|
7655 | - if len(self.shares) < k: |
---|
7656 | - # we don't have enough shares yet |
---|
7657 | - return self._maybe_send_more_queries(k) |
---|
7658 | - if self._need_privkey: |
---|
7659 | - # we got k shares, but none of them had a valid privkey. TODO: |
---|
7660 | - # look further. Adding code to do this is a bit complicated, and |
---|
7661 | - # I want to avoid that complication, and this should be pretty |
---|
7662 | - # rare (k shares with bitflips in the enc_privkey but not in the |
---|
7663 | - # data blocks). If we actually do get here, the subsequent repair |
---|
7664 | - # will fail for lack of a privkey. |
---|
7665 | - self.log("got k shares but still need_privkey, bummer", |
---|
7666 | - level=log.WEIRD, umid="MdRHPA") |
---|
7667 | |
---|
7668 | hunk ./src/allmydata/mutable/retrieve.py 594 |
---|
7669 | - # we have enough to finish. All the shares have had their hashes |
---|
7670 | - # checked, so if something fails at this point, we don't know how |
---|
7671 | - # to fix it, so the download will fail. |
---|
7672 | + def _remove_reader(self, reader): |
---|
7673 | + """ |
---|
7674 | + At various points, we will wish to remove a peer from |
---|
7675 | + consideration and/or use. These include, but are not necessarily |
---|
7676 | + limited to: |
---|
7677 | |
---|
7678 | hunk ./src/allmydata/mutable/retrieve.py 600 |
---|
7679 | - self._decoding = True # avoid reentrancy |
---|
7680 | - self._status.set_status("decoding") |
---|
7681 | - now = time.time() |
---|
7682 | - elapsed = now - self._started |
---|
7683 | - self._status.timings["fetch"] = elapsed |
---|
7684 | + - A connection error. |
---|
7685 | + - A mismatched prefix (that is, a prefix that does not match |
---|
7686 | + our conception of the version information string). |
---|
7687 | + - A failing block hash, salt hash, or share hash, which can |
---|
7688 | + indicate disk failure/bit flips, or network trouble. |
---|
7689 | |
---|
7690 | hunk ./src/allmydata/mutable/retrieve.py 606 |
---|
7691 | - d = defer.maybeDeferred(self._decode) |
---|
7692 | - d.addCallback(self._decrypt, IV, self._node.get_readkey()) |
---|
7693 | - d.addBoth(self._done) |
---|
7694 | - return d # purely for test convenience |
---|
7695 | + This method will do that. I will make sure that the |
---|
7696 | + (shnum,reader) combination represented by my reader argument is |
---|
7697 | + not used for anything else during this download. I will not |
---|
7698 | + advise the reader of any corruption, something that my callers |
---|
7699 | + may wish to do on their own. |
---|
7700 | + """ |
---|
7701 | + # TODO: When you're done writing this, see if this is ever |
---|
7702 | + # actually used for something that _mark_bad_share isn't. I have |
---|
7703 | + # a feeling that they will be used for very similar things, and |
---|
7704 | + # that having them both here is just going to be an epic amount |
---|
7705 | + # of code duplication. |
---|
7706 | + # |
---|
7707 | + # (well, okay, not epic, but meaningful) |
---|
7708 | + self.log("removing reader %s" % reader) |
---|
7709 | + # Remove the reader from _active_readers |
---|
7710 | + self._active_readers.remove(reader) |
---|
7711 | + # TODO: self.readers.remove(reader)? |
---|
7712 | + for shnum in list(self.remaining_sharemap.keys()): |
---|
7713 | + self.remaining_sharemap.discard(shnum, reader.peerid) |
---|
7714 | |
---|
7715 | hunk ./src/allmydata/mutable/retrieve.py 626 |
---|
7716 | - def _maybe_send_more_queries(self, k): |
---|
7717 | - # we don't have enough shares yet. Should we send out more queries? |
---|
7718 | - # There are some number of queries outstanding, each for a single |
---|
7719 | - # share. If we can generate 'needed_shares' additional queries, we do |
---|
7720 | - # so. If we can't, then we know this file is a goner, and we raise |
---|
7721 | - # NotEnoughSharesError. |
---|
7722 | - self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, " |
---|
7723 | - "outstanding=%(outstanding)d"), |
---|
7724 | - have=len(self.shares), k=k, |
---|
7725 | - outstanding=len(self._outstanding_queries), |
---|
7726 | - level=log.NOISY) |
---|
7727 | |
---|
7728 | hunk ./src/allmydata/mutable/retrieve.py 627 |
---|
7729 | - remaining_shares = k - len(self.shares) |
---|
7730 | - needed = remaining_shares - len(self._outstanding_queries) |
---|
7731 | - if not needed: |
---|
7732 | - # we have enough queries in flight already |
---|
7733 | + def _mark_bad_share(self, reader, f): |
---|
7734 | + """ |
---|
7735 | + I mark the (peerid, shnum) encapsulated by my reader argument as |
---|
7736 | + a bad share, which means that it will not be used anywhere else. |
---|
7737 | |
---|
7738 | hunk ./src/allmydata/mutable/retrieve.py 632 |
---|
7739 | - # TODO: but if they've been in flight for a long time, and we |
---|
7740 | - # have reason to believe that new queries might respond faster |
---|
7741 | - # (i.e. we've seen other queries come back faster, then consider |
---|
7742 | - # sending out new queries. This could help with peers which have |
---|
7743 | - # silently gone away since the servermap was updated, for which |
---|
7744 | - # we're still waiting for the 15-minute TCP disconnect to happen. |
---|
7745 | - self.log("enough queries are in flight, no more are needed", |
---|
7746 | - level=log.NOISY) |
---|
7747 | - return |
---|
7748 | + There are several reasons to want to mark something as a bad |
---|
7749 | + share. These include: |
---|
7750 | + |
---|
7751 | + - A connection error to the peer. |
---|
7752 | + - A mismatched prefix (that is, a prefix that does not match |
---|
7753 | + our local conception of the version information string). |
---|
7754 | + - A failing block hash, salt hash, share hash, or other |
---|
7755 | + integrity check. |
---|
7756 | |
---|
7757 | hunk ./src/allmydata/mutable/retrieve.py 641 |
---|
7758 | - outstanding_shnums = set([shnum |
---|
7759 | - for (peerid, shnum, started) |
---|
7760 | - in self._outstanding_queries.values()]) |
---|
7761 | - # prefer low-numbered shares, they are more likely to be primary |
---|
7762 | - available_shnums = sorted(self.remaining_sharemap.keys()) |
---|
7763 | - for shnum in available_shnums: |
---|
7764 | - if shnum in outstanding_shnums: |
---|
7765 | - # skip ones that are already in transit |
---|
7766 | - continue |
---|
7767 | - if shnum not in self.remaining_sharemap: |
---|
7768 | - # no servers for that shnum. note that DictOfSets removes |
---|
7769 | - # empty sets from the dict for us. |
---|
7770 | - continue |
---|
7771 | - peerid = list(self.remaining_sharemap[shnum])[0] |
---|
7772 | - # get_data will remove that peerid from the sharemap, and add the |
---|
7773 | - # query to self._outstanding_queries |
---|
7774 | - self._status.set_status("Retrieving More Shares") |
---|
7775 | - self.get_data(shnum, peerid) |
---|
7776 | - needed -= 1 |
---|
7777 | - if not needed: |
---|
7778 | + This method will ensure that readers that we wish to mark bad |
---|
7779 | + (for these reasons or other reasons) are not used for the rest |
---|
7780 | + of the download. Additionally, it will attempt to tell the |
---|
7781 | + remote peer (with no guarantee of success) that its share is |
---|
7782 | + corrupt. |
---|
7783 | + """ |
---|
7784 | + self.log("marking share %d on server %s as bad" % \ |
---|
7785 | + (reader.shnum, reader)) |
---|
7786 | + prefix = self.verinfo[-2] |
---|
7787 | + self.servermap.mark_bad_share(reader.peerid, |
---|
7788 | + reader.shnum, |
---|
7789 | + prefix) |
---|
7790 | + self._remove_reader(reader) |
---|
7791 | + self._bad_shares.add((reader.peerid, reader.shnum, f)) |
---|
7792 | + self._status.problems[reader.peerid] = f |
---|
7793 | + self._last_failure = f |
---|
7794 | + self.notify_server_corruption(reader.peerid, reader.shnum, |
---|
7795 | + str(f.value)) |
---|
7796 | + |
---|
7797 | + |
---|
7798 | + def _download_current_segment(self): |
---|
7799 | + """ |
---|
7800 | + I download, validate, decode, decrypt, and assemble the segment |
---|
7801 | + that this Retrieve is currently responsible for downloading. |
---|
7802 | + """ |
---|
7803 | + assert len(self._active_readers) >= self._required_shares |
---|
7804 | + if self._current_segment <= self._last_segment: |
---|
7805 | + d = self._process_segment(self._current_segment) |
---|
7806 | + else: |
---|
7807 | + d = defer.succeed(None) |
---|
7808 | + d.addBoth(self._turn_barrier) |
---|
7809 | + d.addCallback(self._check_for_done) |
---|
7810 | + return d |
---|
7811 | + |
---|
7812 | + |
---|
7813 | + def _turn_barrier(self, result): |
---|
7814 | + """ |
---|
7815 | + I help the download process avoid the recursion limit issues |
---|
7816 | + discussed in #237. |
---|
7817 | + """ |
---|
7818 | + return fireEventually(result) |
---|
7819 | + |
---|
7820 | + |
---|
7821 | + def _process_segment(self, segnum): |
---|
7822 | + """ |
---|
7823 | + I download, validate, decode, and decrypt one segment of the |
---|
7824 | + file that this Retrieve is retrieving. This means coordinating |
---|
7825 | + the process of getting k blocks of that file, validating them, |
---|
7826 | + assembling them into one segment with the decoder, and then |
---|
7827 | + decrypting them. |
---|
7828 | + """ |
---|
7829 | + self.log("processing segment %d" % segnum) |
---|
7830 | + |
---|
7831 | + # TODO: The old code uses a marker. Should this code do that |
---|
7832 | + # too? What did the Marker do? |
---|
7833 | + assert len(self._active_readers) >= self._required_shares |
---|
7834 | + |
---|
7835 | + # We need to ask each of our active readers for its block and |
---|
7836 | + # salt. We will then validate those. If validation is |
---|
7837 | + # successful, we will assemble the results into plaintext. |
---|
7838 | + ds = [] |
---|
7839 | + for reader in self._active_readers: |
---|
7840 | + started = time.time() |
---|
7841 | + d = reader.get_block_and_salt(segnum, queue=True) |
---|
7842 | + d2 = self._get_needed_hashes(reader, segnum) |
---|
7843 | + dl = defer.DeferredList([d, d2], consumeErrors=True) |
---|
7844 | + dl.addCallback(self._validate_block, segnum, reader, started) |
---|
7845 | + dl.addErrback(self._validation_or_decoding_failed, [reader]) |
---|
7846 | + ds.append(dl) |
---|
7847 | + reader.flush() |
---|
7848 | + dl = defer.DeferredList(ds) |
---|
7849 | + if self._verify: |
---|
7850 | + dl.addCallback(lambda ignored: "") |
---|
7851 | + dl.addCallback(self._set_segment) |
---|
7852 | + else: |
---|
7853 | + dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum) |
---|
7854 | + return dl |
---|
7855 | + |
---|
7856 | + |
---|
7857 | + def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum): |
---|
7858 | + """ |
---|
7859 | + I take the results of fetching and validating the blocks from a |
---|
7860 | + callback chain in another method. If the results are such that |
---|
7861 | + they tell me that validation and fetching succeeded without |
---|
7862 | + incident, I will proceed with decoding and decryption. |
---|
7863 | + Otherwise, I will do nothing. |
---|
7864 | + """ |
---|
7865 | + self.log("trying to decode and decrypt segment %d" % segnum) |
---|
7866 | + failures = False |
---|
7867 | + for block_and_salt in blocks_and_salts: |
---|
7868 | + if not block_and_salt[0] or block_and_salt[1] is None: |
---|
7869 | + self.log("some validation operations failed; not proceeding") |
---|
7870 | + failures = True |
---|
7871 | break |
---|
7872 | hunk ./src/allmydata/mutable/retrieve.py 735 |
---|
7873 | + if not failures: |
---|
7874 | + self.log("everything looks ok, building segment %d" % segnum) |
---|
7875 | + d = self._decode_blocks(blocks_and_salts, segnum) |
---|
7876 | + d.addCallback(self._decrypt_segment) |
---|
7877 | + d.addErrback(self._validation_or_decoding_failed, |
---|
7878 | + self._active_readers) |
---|
7879 | + # check to see whether we've been paused before writing |
---|
7880 | + # anything. |
---|
7881 | + d.addCallback(self._check_for_paused) |
---|
7882 | + d.addCallback(self._set_segment) |
---|
7883 | + return d |
---|
7884 | + else: |
---|
7885 | + return defer.succeed(None) |
---|
7886 | + |
---|
7887 | + |
---|
7888 | + def _set_segment(self, segment): |
---|
7889 | + """ |
---|
7890 | + Given a plaintext segment, I register that segment with the |
---|
7891 | + target that is handling the file download. |
---|
7892 | + """ |
---|
7893 | + self.log("got plaintext for segment %d" % self._current_segment) |
---|
7894 | + if self._current_segment == self._start_segment: |
---|
7895 | + # We're on the first segment. It's possible that we want |
---|
7896 | + # only some part of the end of this segment, and that we |
---|
7897 | + # just downloaded the whole thing to get that part. If so, |
---|
7898 | + # we need to account for that and give the reader just the |
---|
7899 | + # data that they want. |
---|
7900 | + n = self._offset % self._segment_size |
---|
7901 | + self.log("stripping %d bytes off of the first segment" % n) |
---|
7902 | + self.log("original segment length: %d" % len(segment)) |
---|
7903 | + segment = segment[n:] |
---|
7904 | + self.log("new segment length: %d" % len(segment)) |
---|
7905 | + |
---|
7906 | + if self._current_segment == self._last_segment and self._read_length is not None: |
---|
7907 | + # We're on the last segment. It's possible that we only want |
---|
7908 | + # part of the beginning of this segment, and that we |
---|
7909 | + # downloaded the whole thing anyway. Make sure to give the |
---|
7910 | + # caller only the portion of the segment that they want to |
---|
7911 | + # receive. |
---|
7912 | + extra = self._read_length |
---|
7913 | + if self._start_segment != self._last_segment: |
---|
7914 | + extra -= self._segment_size - \ |
---|
7915 | + (self._offset % self._segment_size) |
---|
7916 | + extra %= self._segment_size |
---|
7917 | + self.log("original segment length: %d" % len(segment)) |
---|
7918 | + segment = segment[:extra] |
---|
7919 | + self.log("new segment length: %d" % len(segment)) |
---|
7920 | + self.log("only taking %d bytes of the last segment" % extra) |
---|
7921 | + |
---|
7922 | + if not self._verify: |
---|
7923 | + self._consumer.write(segment) |
---|
7924 | + else: |
---|
7925 | + # we don't care about the plaintext if we are doing a verify. |
---|
7926 | + segment = None |
---|
7927 | + self._current_segment += 1 |
---|
7928 | |
---|
7929 | hunk ./src/allmydata/mutable/retrieve.py 791 |
---|
7930 | - # at this point, we have as many outstanding queries as we can. If |
---|
7931 | - # needed!=0 then we might not have enough to recover the file. |
---|
7932 | - if needed: |
---|
7933 | - format = ("ran out of peers: " |
---|
7934 | - "have %(have)d shares (k=%(k)d), " |
---|
7935 | - "%(outstanding)d queries in flight, " |
---|
7936 | - "need %(need)d more, " |
---|
7937 | - "found %(bad)d bad shares") |
---|
7938 | - args = {"have": len(self.shares), |
---|
7939 | - "k": k, |
---|
7940 | - "outstanding": len(self._outstanding_queries), |
---|
7941 | - "need": needed, |
---|
7942 | - "bad": len(self._bad_shares), |
---|
7943 | - } |
---|
7944 | - self.log(format=format, |
---|
7945 | - level=log.WEIRD, umid="ezTfjw", **args) |
---|
7946 | - err = NotEnoughSharesError("%s, last failure: %s" % |
---|
7947 | - (format % args, self._last_failure)) |
---|
7948 | - if self._bad_shares: |
---|
7949 | - self.log("We found some bad shares this pass. You should " |
---|
7950 | - "update the servermap and try again to check " |
---|
7951 | - "more peers", |
---|
7952 | - level=log.WEIRD, umid="EFkOlA") |
---|
7953 | - err.servermap = self.servermap |
---|
7954 | - raise err |
---|
7955 | |
---|
7956 | hunk ./src/allmydata/mutable/retrieve.py 792 |
---|
7957 | + def _validation_or_decoding_failed(self, f, readers): |
---|
7958 | + """ |
---|
7959 | + I am called when a block or a salt fails to correctly validate, or when |
---|
7960 | + the decryption or decoding operation fails for some reason. I react to |
---|
7961 | + this failure by notifying the remote server of corruption, and then |
---|
7962 | + removing the remote peer from further activity. |
---|
7963 | + """ |
---|
7964 | + assert isinstance(readers, list) |
---|
7965 | + bad_shnums = [reader.shnum for reader in readers] |
---|
7966 | + |
---|
7967 | + self.log("validation or decoding failed on share(s) %s, peer(s) %s" |
---|
7968 | + ", segment %d: %s" % \ |
---|
7969 | + (bad_shnums, readers, self._current_segment, str(f))) |
---|
7970 | + for reader in readers: |
---|
7971 | + self._mark_bad_share(reader, f) |
---|
7972 | return |
---|
7973 | |
---|
7974 | hunk ./src/allmydata/mutable/retrieve.py 809 |
---|
7975 | - def _decode(self): |
---|
7976 | - started = time.time() |
---|
7977 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
7978 | - offsets_tuple) = self.verinfo |
---|
7979 | |
---|
7980 | hunk ./src/allmydata/mutable/retrieve.py 810 |
---|
7981 | - # shares_dict is a dict mapping shnum to share data, but the codec |
---|
7982 | - # wants two lists. |
---|
7983 | - shareids = []; shares = [] |
---|
7984 | - for shareid, share in self.shares.items(): |
---|
7985 | + def _validate_block(self, results, segnum, reader, started): |
---|
7986 | + """ |
---|
7987 | + I validate a block from one share on a remote server. |
---|
7988 | + """ |
---|
7989 | + # Grab the part of the block hash tree that is necessary to |
---|
7990 | + # validate this block, then generate the block hash root. |
---|
7991 | + self.log("validating share %d for segment %d" % (reader.shnum, |
---|
7992 | + segnum)) |
---|
7993 | + self._status.add_fetch_timing(reader.peerid, started) |
---|
7994 | + self._status.set_status("Validating blocks for segment %d" % segnum) |
---|
7995 | + # Did we fail to fetch either of the things that we were |
---|
7996 | + # supposed to? Fail if so. |
---|
7997 | + if not (results[0][0] and results[1][0]): |
---|
7998 | + # handled by the errback handler. |
---|
7999 | + |
---|
8000 | + # These all get batched into one query, so the resulting |
---|
8001 | + # failure should be the same for all of them, so we can just |
---|
8002 | + # use the first one. |
---|
8003 | + assert isinstance(results[0][1], failure.Failure) |
---|
8004 | + |
---|
8005 | + f = results[0][1] |
---|
8006 | + raise CorruptShareError(reader.peerid, |
---|
8007 | + reader.shnum, |
---|
8008 | + "Connection error: %s" % str(f)) |
---|
8009 | + |
---|
8010 | + block_and_salt, block_and_sharehashes = results |
---|
8011 | + block, salt = block_and_salt[1] |
---|
8012 | + blockhashes, sharehashes = block_and_sharehashes[1] |
---|
8013 | + |
---|
8014 | + blockhashes = dict(enumerate(blockhashes[1])) |
---|
8015 | + self.log("the reader gave me the following blockhashes: %s" % \ |
---|
8016 | + blockhashes.keys()) |
---|
8017 | + self.log("the reader gave me the following sharehashes: %s" % \ |
---|
8018 | + sharehashes[1].keys()) |
---|
8019 | + bht = self._block_hash_trees[reader.shnum] |
---|
8020 | + |
---|
8021 | + if bht.needed_hashes(segnum, include_leaf=True): |
---|
8022 | + try: |
---|
8023 | + bht.set_hashes(blockhashes) |
---|
8024 | + except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \ |
---|
8025 | + IndexError), e: |
---|
8026 | + raise CorruptShareError(reader.peerid, |
---|
8027 | + reader.shnum, |
---|
8028 | + "block hash tree failure: %s" % e) |
---|
8029 | + |
---|
8030 | + if self._version == MDMF_VERSION: |
---|
8031 | + blockhash = hashutil.block_hash(salt + block) |
---|
8032 | + else: |
---|
8033 | + blockhash = hashutil.block_hash(block) |
---|
8034 | + # If this works without an error, then validation is |
---|
8035 | + # successful. |
---|
8036 | + try: |
---|
8037 | + bht.set_hashes(leaves={segnum: blockhash}) |
---|
8038 | + except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \ |
---|
8039 | + IndexError), e: |
---|
8040 | + raise CorruptShareError(reader.peerid, |
---|
8041 | + reader.shnum, |
---|
8042 | + "block hash tree failure: %s" % e) |
---|
8043 | + |
---|
8044 | + # Reaching this point means that we know that this segment |
---|
8045 | + # is correct. Now we need to check to see whether the share |
---|
8046 | + # hash chain is also correct. |
---|
8047 | + # SDMF wrote share hash chains that didn't contain the |
---|
8048 | + # leaves, which would be produced from the block hash tree. |
---|
8049 | + # So we need to validate the block hash tree first. If |
---|
8050 | + # successful, then bht[0] will contain the root for the |
---|
8051 | + # shnum, which will be a leaf in the share hash tree, which |
---|
8052 | + # will allow us to validate the rest of the tree. |
---|
8053 | + if self.share_hash_tree.needed_hashes(reader.shnum, |
---|
8054 | + include_leaf=True) or \ |
---|
8055 | + self._verify: |
---|
8056 | + try: |
---|
8057 | + self.share_hash_tree.set_hashes(hashes=sharehashes[1], |
---|
8058 | + leaves={reader.shnum: bht[0]}) |
---|
8059 | + except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \ |
---|
8060 | + IndexError), e: |
---|
8061 | + raise CorruptShareError(reader.peerid, |
---|
8062 | + reader.shnum, |
---|
8063 | + "corrupt hashes: %s" % e) |
---|
8064 | + |
---|
8065 | + self.log('share %d is valid for segment %d' % (reader.shnum, |
---|
8066 | + segnum)) |
---|
8067 | + return {reader.shnum: (block, salt)} |
---|
8068 | + |
---|
8069 | + |
---|
8070 | + def _get_needed_hashes(self, reader, segnum): |
---|
8071 | + """ |
---|
8072 | + I get the hashes needed to validate segnum from the reader, then return |
---|
8073 | + to my caller when this is done. |
---|
8074 | + """ |
---|
8075 | + bht = self._block_hash_trees[reader.shnum] |
---|
8076 | + needed = bht.needed_hashes(segnum, include_leaf=True) |
---|
8077 | + # The root of the block hash tree is also a leaf in the share |
---|
8078 | + # hash tree. So we don't need to fetch it from the remote |
---|
8079 | + # server. In the case of files with one segment, this means that |
---|
8080 | + # we won't fetch any block hash tree from the remote server, |
---|
8081 | + # since the hash of each share of the file is the entire block |
---|
8082 | + # hash tree, and is a leaf in the share hash tree. This is fine, |
---|
8083 | + # since any share corruption will be detected in the share hash |
---|
8084 | + # tree. |
---|
8085 | + #needed.discard(0) |
---|
8086 | + self.log("getting blockhashes for segment %d, share %d: %s" % \ |
---|
8087 | + (segnum, reader.shnum, str(needed))) |
---|
8088 | + d1 = reader.get_blockhashes(needed, queue=True, force_remote=True) |
---|
8089 | + if self.share_hash_tree.needed_hashes(reader.shnum): |
---|
8090 | + need = self.share_hash_tree.needed_hashes(reader.shnum) |
---|
8091 | + self.log("also need sharehashes for share %d: %s" % (reader.shnum, |
---|
8092 | + str(need))) |
---|
8093 | + d2 = reader.get_sharehashes(need, queue=True, force_remote=True) |
---|
8094 | + else: |
---|
8095 | + d2 = defer.succeed({}) # the logic in the next method |
---|
8096 | + # expects a dict |
---|
8097 | + dl = defer.DeferredList([d1, d2], consumeErrors=True) |
---|
8098 | + return dl |
---|
8099 | + |
---|
8100 | + |
---|
8101 | + def _decode_blocks(self, blocks_and_salts, segnum): |
---|
8102 | + """ |
---|
8103 | + I take a list of k blocks and salts, and decode that into a |
---|
8104 | + single encrypted segment. |
---|
8105 | + """ |
---|
8106 | + d = {} |
---|
8107 | + # We want to merge our dictionaries to the form |
---|
8108 | + # {shnum: blocks_and_salts} |
---|
8109 | + # |
---|
8110 | + # The dictionaries come from _validate_block that way, so we just |
---|
8111 | + # need to merge them. |
---|
8112 | + for block_and_salt in blocks_and_salts: |
---|
8113 | + d.update(block_and_salt[1]) |
---|
8114 | + |
---|
8115 | + # All of these blocks should have the same salt; in SDMF, it is |
---|
8116 | + # the file-wide IV, while in MDMF it is the per-segment salt. In |
---|
8117 | + # either case, we just need to get one of them and use it. |
---|
8118 | + # |
---|
8119 | + # d.items()[0] is like (shnum, (block, salt)) |
---|
8120 | + # d.items()[0][1] is like (block, salt) |
---|
8121 | + # d.items()[0][1][1] is the salt. |
---|
8122 | + salt = d.items()[0][1][1] |
---|
8123 | + # Next, extract just the blocks from the dict. We'll use the |
---|
8124 | + # salt in the next step. |
---|
8125 | + share_and_shareids = [(k, v[0]) for k, v in d.items()] |
---|
8126 | + d2 = dict(share_and_shareids) |
---|
8127 | + shareids = [] |
---|
8128 | + shares = [] |
---|
8129 | + for shareid, share in d2.items(): |
---|
8130 | shareids.append(shareid) |
---|
8131 | shares.append(share) |
---|
8132 | |
---|
8133 | hunk ./src/allmydata/mutable/retrieve.py 958 |
---|
8134 | - assert len(shareids) >= k, len(shareids) |
---|
8135 | + self._status.set_status("Decoding") |
---|
8136 | + started = time.time() |
---|
8137 | + assert len(shareids) >= self._required_shares, len(shareids) |
---|
8138 | # zfec really doesn't want extra shares |
---|
8139 | hunk ./src/allmydata/mutable/retrieve.py 962 |
---|
8140 | - shareids = shareids[:k] |
---|
8141 | - shares = shares[:k] |
---|
8142 | - |
---|
8143 | - fec = codec.CRSDecoder() |
---|
8144 | - fec.set_params(segsize, k, N) |
---|
8145 | - |
---|
8146 | - self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares))) |
---|
8147 | - self.log("about to decode, shareids=%s" % (shareids,)) |
---|
8148 | - d = defer.maybeDeferred(fec.decode, shares, shareids) |
---|
8149 | - def _done(buffers): |
---|
8150 | - self._status.timings["decode"] = time.time() - started |
---|
8151 | - self.log(" decode done, %d buffers" % len(buffers)) |
---|
8152 | + shareids = shareids[:self._required_shares] |
---|
8153 | + shares = shares[:self._required_shares] |
---|
8154 | + self.log("decoding segment %d" % segnum) |
---|
8155 | + if segnum == self._num_segments - 1: |
---|
8156 | + d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids) |
---|
8157 | + else: |
---|
8158 | + d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids) |
---|
8159 | + def _process(buffers): |
---|
8160 | segment = "".join(buffers) |
---|
8161 | hunk ./src/allmydata/mutable/retrieve.py 971 |
---|
8162 | + self.log(format="now decoding segment %(segnum)s of %(numsegs)s", |
---|
8163 | + segnum=segnum, |
---|
8164 | + numsegs=self._num_segments, |
---|
8165 | + level=log.NOISY) |
---|
8166 | self.log(" joined length %d, datalength %d" % |
---|
8167 | hunk ./src/allmydata/mutable/retrieve.py 976 |
---|
8168 | - (len(segment), datalength)) |
---|
8169 | - segment = segment[:datalength] |
---|
8170 | + (len(segment), self._data_length)) |
---|
8171 | + if segnum == self._num_segments - 1: |
---|
8172 | + size_to_use = self._tail_data_size |
---|
8173 | + else: |
---|
8174 | + size_to_use = self._segment_size |
---|
8175 | + segment = segment[:size_to_use] |
---|
8176 | self.log(" segment len=%d" % len(segment)) |
---|
8177 | hunk ./src/allmydata/mutable/retrieve.py 983 |
---|
8178 | - return segment |
---|
8179 | - def _err(f): |
---|
8180 | - self.log(" decode failed: %s" % f) |
---|
8181 | - return f |
---|
8182 | - d.addCallback(_done) |
---|
8183 | - d.addErrback(_err) |
---|
8184 | + self._status.timings.setdefault("decode", 0) |
---|
8185 | + self._status.timings['decode'] = time.time() - started |
---|
8186 | + return segment, salt |
---|
8187 | + d.addCallback(_process) |
---|
8188 | return d |
---|
8189 | |
---|
8190 | hunk ./src/allmydata/mutable/retrieve.py 989 |
---|
8191 | - def _decrypt(self, crypttext, IV, readkey): |
---|
8192 | + |
---|
8193 | + def _decrypt_segment(self, segment_and_salt): |
---|
8194 | + """ |
---|
8195 | + I take a single segment and its salt, and decrypt it. I return |
---|
8196 | + the plaintext of the segment that is in my argument. |
---|
8197 | + """ |
---|
8198 | + segment, salt = segment_and_salt |
---|
8199 | self._status.set_status("decrypting") |
---|
8200 | hunk ./src/allmydata/mutable/retrieve.py 997 |
---|
8201 | + self.log("decrypting segment %d" % self._current_segment) |
---|
8202 | started = time.time() |
---|
8203 | hunk ./src/allmydata/mutable/retrieve.py 999 |
---|
8204 | - key = hashutil.ssk_readkey_data_hash(IV, readkey) |
---|
8205 | + key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey()) |
---|
8206 | decryptor = AES(key) |
---|
8207 | hunk ./src/allmydata/mutable/retrieve.py 1001 |
---|
8208 | - plaintext = decryptor.process(crypttext) |
---|
8209 | - self._status.timings["decrypt"] = time.time() - started |
---|
8210 | + plaintext = decryptor.process(segment) |
---|
8211 | + self._status.timings.setdefault("decrypt", 0) |
---|
8212 | + self._status.timings['decrypt'] = time.time() - started |
---|
8213 | return plaintext |
---|
8214 | |
---|
8215 | hunk ./src/allmydata/mutable/retrieve.py 1006 |
---|
8216 | - def _done(self, res): |
---|
8217 | - if not self._running: |
---|
8218 | + |
---|
8219 | + def notify_server_corruption(self, peerid, shnum, reason): |
---|
8220 | + ss = self.servermap.connections[peerid] |
---|
8221 | + ss.callRemoteOnly("advise_corrupt_share", |
---|
8222 | + "mutable", self._storage_index, shnum, reason) |
---|
8223 | + |
---|
8224 | + |
---|
8225 | + def _try_to_validate_privkey(self, enc_privkey, reader): |
---|
8226 | + alleged_privkey_s = self._node._decrypt_privkey(enc_privkey) |
---|
8227 | + alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s) |
---|
8228 | + if alleged_writekey != self._node.get_writekey(): |
---|
8229 | + self.log("invalid privkey from %s shnum %d" % |
---|
8230 | + (reader, reader.shnum), |
---|
8231 | + level=log.WEIRD, umid="YIw4tA") |
---|
8232 | + if self._verify: |
---|
8233 | + self.servermap.mark_bad_share(reader.peerid, reader.shnum, |
---|
8234 | + self.verinfo[-2]) |
---|
8235 | + e = CorruptShareError(reader.peerid, |
---|
8236 | + reader.shnum, |
---|
8237 | + "invalid privkey") |
---|
8238 | + f = failure.Failure(e) |
---|
8239 | + self._bad_shares.add((reader.peerid, reader.shnum, f)) |
---|
8240 | return |
---|
8241 | hunk ./src/allmydata/mutable/retrieve.py 1029 |
---|
8242 | + |
---|
8243 | + # it's good |
---|
8244 | + self.log("got valid privkey from shnum %d on reader %s" % |
---|
8245 | + (reader.shnum, reader)) |
---|
8246 | + privkey = rsa.create_signing_key_from_string(alleged_privkey_s) |
---|
8247 | + self._node._populate_encprivkey(enc_privkey) |
---|
8248 | + self._node._populate_privkey(privkey) |
---|
8249 | + self._need_privkey = False |
---|
8250 | + |
---|
8251 | + |
---|
8252 | + def _check_for_done(self, res): |
---|
8253 | + """ |
---|
8254 | + I check to see if this Retrieve object has successfully finished |
---|
8255 | + its work. |
---|
8256 | + |
---|
8257 | + I can exit in the following ways: |
---|
8258 | + - If there are no more segments to download, then I exit by |
---|
8259 | + causing self._done_deferred to fire with the plaintext |
---|
8260 | + content requested by the caller. |
---|
8261 | + - If there are still segments to be downloaded, and there |
---|
8262 | + are enough active readers (readers which have not broken |
---|
8263 | + and have not given us corrupt data) to continue |
---|
8264 | + downloading, I send control back to |
---|
8265 | + _download_current_segment. |
---|
8266 | + - If there are still segments to be downloaded but there are |
---|
8267 | + not enough active peers to download them, I ask |
---|
8268 | + _add_active_peers to add more peers. If it is successful, |
---|
8269 | + it will call _download_current_segment. If there are not |
---|
8270 | + enough peers to retrieve the file, then that will cause |
---|
8271 | + _done_deferred to errback. |
---|
8272 | + """ |
---|
8273 | + self.log("checking for doneness") |
---|
8274 | + if self._current_segment > self._last_segment: |
---|
8275 | + # No more segments to download, we're done. |
---|
8276 | + self.log("got plaintext, done") |
---|
8277 | + return self._done() |
---|
8278 | + |
---|
8279 | + if len(self._active_readers) >= self._required_shares: |
---|
8280 | + # More segments to download, but we have enough good peers |
---|
8281 | + # in self._active_readers that we can do that without issue, |
---|
8282 | + # so go nab the next segment. |
---|
8283 | + self.log("not done yet: on segment %d of %d" % \ |
---|
8284 | + (self._current_segment + 1, self._num_segments)) |
---|
8285 | + return self._download_current_segment() |
---|
8286 | + |
---|
8287 | + self.log("not done yet: on segment %d of %d, need to add peers" % \ |
---|
8288 | + (self._current_segment + 1, self._num_segments)) |
---|
8289 | + return self._add_active_peers() |
---|
8290 | + |
---|
8291 | + |
---|
8292 | + def _done(self): |
---|
8293 | + """ |
---|
8294 | + I am called by _check_for_done when the download process has |
---|
8295 | + finished successfully. After making some useful logging |
---|
8296 | + statements, I return the decrypted contents to the owner of this |
---|
8297 | + Retrieve object through self._done_deferred. |
---|
8298 | + """ |
---|
8299 | self._running = False |
---|
8300 | self._status.set_active(False) |
---|
8301 | hunk ./src/allmydata/mutable/retrieve.py 1088 |
---|
8302 | - self._status.timings["total"] = time.time() - self._started |
---|
8303 | - # res is either the new contents, or a Failure |
---|
8304 | - if isinstance(res, failure.Failure): |
---|
8305 | - self.log("Retrieve done, with failure", failure=res, |
---|
8306 | - level=log.UNUSUAL) |
---|
8307 | - self._status.set_status("Failed") |
---|
8308 | + now = time.time() |
---|
8309 | + self._status.timings['total'] = now - self._started |
---|
8310 | + self._status.timings['fetch'] = now - self._started_fetching |
---|
8311 | + |
---|
8312 | + if self._verify: |
---|
8313 | + ret = list(self._bad_shares) |
---|
8314 | + self.log("done verifying, found %d bad shares" % len(ret)) |
---|
8315 | else: |
---|
8316 | hunk ./src/allmydata/mutable/retrieve.py 1096 |
---|
8317 | - self.log("Retrieve done, success!") |
---|
8318 | - self._status.set_status("Finished") |
---|
8319 | - self._status.set_progress(1.0) |
---|
8320 | - # remember the encoding parameters, use them again next time |
---|
8321 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
8322 | - offsets_tuple) = self.verinfo |
---|
8323 | - self._node._populate_required_shares(k) |
---|
8324 | - self._node._populate_total_shares(N) |
---|
8325 | - eventually(self._done_deferred.callback, res) |
---|
8326 | + # TODO: upload status here? |
---|
8327 | + ret = self._consumer |
---|
8328 | + self._consumer.unregisterProducer() |
---|
8329 | + eventually(self._done_deferred.callback, ret) |
---|
8330 | + |
---|
8331 | |
---|
8332 | hunk ./src/allmydata/mutable/retrieve.py 1102 |
---|
8333 | + def _failed(self): |
---|
8334 | + """ |
---|
8335 | + I am called by _add_active_peers when there are not enough |
---|
8336 | + active peers left to complete the download. After making some |
---|
8337 | + useful logging statements, I return an exception to that effect |
---|
8338 | + to the caller of this Retrieve object through |
---|
8339 | + self._done_deferred. |
---|
8340 | + """ |
---|
8341 | + self._running = False |
---|
8342 | + self._status.set_active(False) |
---|
8343 | + now = time.time() |
---|
8344 | + self._status.timings['total'] = now - self._started |
---|
8345 | + self._status.timings['fetch'] = now - self._started_fetching |
---|
8346 | + |
---|
8347 | + if self._verify: |
---|
8348 | + ret = list(self._bad_shares) |
---|
8349 | + else: |
---|
8350 | + format = ("ran out of peers: " |
---|
8351 | + "have %(have)d of %(total)d segments, " |
---|
8352 | + "found %(bad)d bad shares, " |
---|
8353 | + "encoding %(k)d-of-%(n)d") |
---|
8354 | + args = {"have": self._current_segment, |
---|
8355 | + "total": self._num_segments, |
---|
8356 | + "need": self._last_segment, |
---|
8357 | + "k": self._required_shares, |
---|
8358 | + "n": self._total_shares, |
---|
8359 | + "bad": len(self._bad_shares)} |
---|
8360 | + e = NotEnoughSharesError("%s, last failure: %s" % \ |
---|
8361 | + (format % args, str(self._last_failure))) |
---|
8362 | + f = failure.Failure(e) |
---|
8363 | + ret = f |
---|
8364 | + eventually(self._done_deferred.callback, ret) |
---|
8365 | } |
---|
8366 | [mutable/servermap.py: Alter the servermap updater to work with MDMF files |
---|
8367 | Kevan Carstensen <kevan@isnotajoke.com>**20100811233309 |
---|
8368 | Ignore-this: 5d2c922283c12cad93a5346e978cd691 |
---|
8369 | |
---|
8370 | These modifications were basically all to the end of having the |
---|
8371 | servermap updater use the unified MDMF + SDMF read interface whenever |
---|
8372 | possible -- this reduces the complexity of the code, making it easier to |
---|
8373 | read and maintain. To do this, I needed to modify the process of |
---|
8374 | updating the servermap a little bit. |
---|
8375 | |
---|
8376 | To support partial-file updates, I also modified the servermap updater |
---|
8377 | to fetch the block hash trees and certain segments of files while it |
---|
8378 | performed a servermap update (this can be done without adding any new |
---|
8379 | roundtrips because of batch-read functionality that the read proxy has). |
---|
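The update data described above is recorded per share number, with one entry for each version seen for that share. A minimal sketch of that bookkeeping, assuming a hypothetical stand-alone class (`UpdateDataStore` and its method names are illustrative and are not part of this patch; the real methods live on `ServerMap`):

```python
class UpdateDataStore(object):
    """Hypothetical stand-in for the update-data part of a servermap."""

    def __init__(self):
        # shnum -> list of (verinfo, data) pairs, one pair per version
        self.update_data = {}

    def set_update_data(self, shnum, verinfo, data):
        # Append rather than overwrite, so several versions of the
        # same share can coexist.
        self.update_data.setdefault(shnum, []).append((verinfo, data))

    def get_update_data(self, shnum, verinfo):
        # Return the data recorded for this exact (shnum, verinfo) pair.
        for stored_verinfo, data in self.update_data.get(shnum, []):
            if stored_verinfo == verinfo:
                return data
        raise KeyError((shnum, verinfo))


store = UpdateDataStore()
store.set_update_data(0, "verinfo-a", "blockhashes-a")
store.set_update_data(0, "verinfo-b", "blockhashes-b")
print(store.get_update_data(0, "verinfo-b"))
```

Unlike the lookup in the patch, this sketch raises a KeyError for an unknown (shnum, verinfo) pair instead of relying on list indexing; that is a design choice of the sketch, not a claim about the patch's behavior.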
8380 | |
---|
8381 | ] { |
---|
8382 | hunk ./src/allmydata/mutable/servermap.py 2 |
---|
8383 | |
---|
8384 | -import sys, time |
---|
8385 | +import sys, time, struct |
---|
8386 | from zope.interface import implements |
---|
8387 | from itertools import count |
---|
8388 | from twisted.internet import defer |
---|
8389 | hunk ./src/allmydata/mutable/servermap.py 7 |
---|
8390 | from twisted.python import failure |
---|
8391 | -from foolscap.api import DeadReferenceError, RemoteException, eventually |
---|
8392 | -from allmydata.util import base32, hashutil, idlib, log |
---|
8393 | +from foolscap.api import DeadReferenceError, RemoteException, eventually, \ |
---|
8394 | + fireEventually |
---|
8395 | +from allmydata.util import base32, hashutil, idlib, log, deferredutil |
---|
8396 | from allmydata.storage.server import si_b2a |
---|
8397 | from allmydata.interfaces import IServermapUpdaterStatus |
---|
8398 | from pycryptopp.publickey import rsa |
---|
8399 | hunk ./src/allmydata/mutable/servermap.py 17 |
---|
8400 | from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \ |
---|
8401 | DictOfSets, CorruptShareError, NeedMoreDataError |
---|
8402 | from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \ |
---|
8403 | - SIGNED_PREFIX_LENGTH |
---|
8404 | + SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy |
---|
8405 | |
---|
8406 | class UpdateStatus: |
---|
8407 | implements(IServermapUpdaterStatus) |
---|
8408 | hunk ./src/allmydata/mutable/servermap.py 124 |
---|
8409 | self.bad_shares = {} # maps (peerid,shnum) to old checkstring |
---|
8410 | self.last_update_mode = None |
---|
8411 | self.last_update_time = 0 |
---|
8412 | + self.update_data = {} # shnum => [(verinfo, data), ...] |
---|
8413 | |
---|
8414 | def copy(self): |
---|
8415 | s = ServerMap() |
---|
8416 | hunk ./src/allmydata/mutable/servermap.py 255 |
---|
8417 | """Return a set of versionids, one for each version that is currently |
---|
8418 | recoverable.""" |
---|
8419 | versionmap = self.make_versionmap() |
---|
8420 | - |
---|
8421 | recoverable_versions = set() |
---|
8422 | for (verinfo, shares) in versionmap.items(): |
---|
8423 | (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
8424 | hunk ./src/allmydata/mutable/servermap.py 340 |
---|
8425 | return False |
---|
8426 | |
---|
8427 | |
---|
8428 | + def get_update_data_for_share_and_verinfo(self, shnum, verinfo): |
---|
8429 | + """ |
---|
8430 | + I return the update data for the given shnum and verinfo. |
---|
8431 | + """ |
---|
8432 | + update_data = self.update_data[shnum] |
---|
8433 | + update_datum = [i[1] for i in update_data if i[0] == verinfo][0] |
---|
8434 | + return update_datum |
---|
8435 | + |
---|
8436 | + |
---|
8437 | + def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data): |
---|
8438 | + """ |
---|
8439 | + I record the update data for the given shnum and verinfo. |
---|
8440 | + """ |
---|
8441 | + self.update_data.setdefault(shnum , []).append((verinfo, data)) |
---|
8442 | + |
---|
8443 | + |
---|
8444 | class ServermapUpdater: |
---|
8445 | def __init__(self, filenode, storage_broker, monitor, servermap, |
---|
8446 | hunk ./src/allmydata/mutable/servermap.py 358 |
---|
8447 | - mode=MODE_READ, add_lease=False): |
---|
8448 | + mode=MODE_READ, add_lease=False, update_range=None): |
---|
8449 | """I update a servermap, locating a sufficient number of useful |
---|
8450 | shares and remembering where they are located. |
---|
8451 | |
---|
8452 | hunk ./src/allmydata/mutable/servermap.py 390 |
---|
8453 | # * if we need the encrypted private key, we want [-1216ish:] |
---|
8454 | # * but we can't read from negative offsets |
---|
8455 | # * the offset table tells us the 'ish', also the positive offset |
---|
8456 | - # A future version of the SMDF slot format should consider using |
---|
8457 | - # fixed-size slots so we can retrieve less data. For now, we'll just |
---|
8458 | - # read 2000 bytes, which also happens to read enough actual data to |
---|
8459 | - # pre-fetch a 9-entry dirnode. |
---|
8460 | + # MDMF: |
---|
8461 | + # * Checkstring? [0:72] |
---|
8462 | + # * If we want to validate the checkstring, then [0:72], [143:?] -- |
---|
8463 | + # the offset table will tell us for sure. |
---|
8464 | + # * If we need the verification key, we have to consult the offset |
---|
8465 | + # table as well. |
---|
8466 | + # At this point, we don't know which we are. Our filenode can |
---|
8467 | + # tell us, but it might be lying -- in some cases, we're |
---|
8468 | + # responsible for telling it which kind of file it is. |
---|
8469 | self._read_size = 4000 |
---|
8470 | if mode == MODE_CHECK: |
---|
8471 | # we use unpack_prefix_and_signature, so we need 1k |
---|
8472 | hunk ./src/allmydata/mutable/servermap.py 410 |
---|
8473 | # to ask for it during the check, we'll have problems doing the |
---|
8474 | # publish. |
---|
8475 | |
---|
8476 | + self.fetch_update_data = False |
---|
8477 | + if mode == MODE_WRITE and update_range: |
---|
8478 | + # We're updating the servermap in preparation for an |
---|
8479 | + # in-place file update, so we need to fetch some additional |
---|
8480 | + # data from each share that we find. |
---|
8481 | + assert len(update_range) == 2 |
---|
8482 | + |
---|
8483 | + self.start_segment = update_range[0] |
---|
8484 | + self.end_segment = update_range[1] |
---|
8485 | + self.fetch_update_data = True |
---|
8486 | + |
---|
8487 | prefix = si_b2a(self._storage_index)[:5] |
---|
8488 | self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)", |
---|
8489 | si=prefix, mode=mode) |
---|
8490 | hunk ./src/allmydata/mutable/servermap.py 459 |
---|
8491 | self._queries_completed = 0 |
---|
8492 | |
---|
8493 | sb = self._storage_broker |
---|
8494 | + # All of the peers, permuted by the storage index, as usual. |
---|
8495 | full_peerlist = sb.get_servers_for_index(self._storage_index) |
---|
8496 | self.full_peerlist = full_peerlist # for use later, immutable |
---|
8497 | self.extra_peers = full_peerlist[:] # peers are removed as we use them |
---|
8498 | hunk ./src/allmydata/mutable/servermap.py 466 |
---|
8499 | self._good_peers = set() # peers who had some shares |
---|
8500 | self._empty_peers = set() # peers who don't have any shares |
---|
8501 | self._bad_peers = set() # peers to whom our queries failed |
---|
8502 | + self._readers = {} # peerid -> dict(share readers), filled in |
---|
8503 | + # after responses come in. |
---|
8504 | |
---|
8505 | k = self._node.get_required_shares() |
---|
8506 | hunk ./src/allmydata/mutable/servermap.py 470 |
---|
8507 | + # For what cases can these conditions work? |
---|
8508 | if k is None: |
---|
8509 | # make a guess |
---|
8510 | k = 3 |
---|
8511 | hunk ./src/allmydata/mutable/servermap.py 483 |
---|
8512 | self.num_peers_to_query = k + self.EPSILON |
---|
8513 | |
---|
8514 | if self.mode == MODE_CHECK: |
---|
8515 | + # We want to query all of the peers. |
---|
8516 | initial_peers_to_query = dict(full_peerlist) |
---|
8517 | must_query = set(initial_peers_to_query.keys()) |
---|
8518 | self.extra_peers = [] |
---|
8519 | hunk ./src/allmydata/mutable/servermap.py 491 |
---|
8520 | # we're planning to replace all the shares, so we want a good |
---|
8521 | # chance of finding them all. We will keep searching until we've |
---|
8522 | # seen epsilon that don't have a share. |
---|
8523 | + # We don't query all of the peers because that could take a while. |
---|
8524 | self.num_peers_to_query = N + self.EPSILON |
---|
8525 | initial_peers_to_query, must_query = self._build_initial_querylist() |
---|
8526 | self.required_num_empty_peers = self.EPSILON |
---|
8527 | hunk ./src/allmydata/mutable/servermap.py 501 |
---|
8528 | # might also avoid the round trip required to read the encrypted |
---|
8529 | # private key. |
---|
8530 | |
---|
8531 | - else: |
---|
8532 | + else: # MODE_READ, MODE_ANYTHING |
---|
8533 | + # 2k peers is good enough. |
---|
8534 | initial_peers_to_query, must_query = self._build_initial_querylist() |
---|
8535 | |
---|
8536 | # this is a set of peers that we are required to get responses from: |
---|
8537 | hunk ./src/allmydata/mutable/servermap.py 517 |
---|
8538 | # before we can consider ourselves finished, and self.extra_peers |
---|
8539 | # contains the overflow (peers that we should tap if we don't get |
---|
8540 | # enough responses) |
---|
8541 | + # I guess that self._must_query is a subset of |
---|
8542 | + # initial_peers_to_query? |
---|
8543 | + assert set(must_query).issubset(set(initial_peers_to_query)) |
---|
8544 | |
---|
8545 | self._send_initial_requests(initial_peers_to_query) |
---|
8546 | self._status.timings["initial_queries"] = time.time() - self._started |
---|
8547 | hunk ./src/allmydata/mutable/servermap.py 576 |
---|
8548 | # errors that aren't handled by _query_failed (and errors caused by |
---|
8549 | # _query_failed) get logged, but we still want to check for doneness. |
---|
8550 | d.addErrback(log.err) |
---|
8551 | - d.addBoth(self._check_for_done) |
---|
8552 | d.addErrback(self._fatal_error) |
---|
8553 | hunk ./src/allmydata/mutable/servermap.py 577 |
---|
8554 | + d.addCallback(self._check_for_done) |
---|
8555 | return d |
---|
8556 | |
---|
8557 | def _do_read(self, ss, peerid, storage_index, shnums, readv): |
---|
8558 | hunk ./src/allmydata/mutable/servermap.py 596 |
---|
8559 | d = ss.callRemote("slot_readv", storage_index, shnums, readv) |
---|
8560 | return d |
---|
8561 | |
---|
8562 | + |
---|
8563 | + def _got_corrupt_share(self, e, shnum, peerid, data, lp): |
---|
8564 | + """ |
---|
8565 | + I am called when a remote server returns a corrupt share in |
---|
8566 | + response to one of our queries. By corrupt, I mean a share |
---|
8567 | + without a valid signature. I then record the failure, notify the |
---|
8568 | + server of the corruption, and record the share as bad. |
---|
8569 | + """ |
---|
8570 | + f = failure.Failure(e) |
---|
8571 | + self.log(format="bad share: %(f_value)s", f_value=str(f), |
---|
8572 | + failure=f, parent=lp, level=log.WEIRD, umid="h5llHg") |
---|
8573 | + # Notify the server that its share is corrupt. |
---|
8574 | + self.notify_server_corruption(peerid, shnum, str(e)) |
---|
8575 | + # By flagging this as a bad peer, we won't count any of |
---|
8576 | + # the other shares on that peer as valid, though if we |
---|
8577 | + # happen to find a valid version string amongst those |
---|
8578 | + # shares, we'll keep track of it so that we don't need |
---|
8579 | + # to validate the signature on those again. |
---|
8580 | + self._bad_peers.add(peerid) |
---|
8581 | + self._last_failure = f |
---|
8582 | + # XXX: Use the reader for this? |
---|
8583 | + checkstring = data[:SIGNED_PREFIX_LENGTH] |
---|
8584 | + self._servermap.mark_bad_share(peerid, shnum, checkstring) |
---|
8585 | + self._servermap.problems.append(f) |
---|
8586 | + |
---|
8587 | + |
---|
8588 | + def _cache_good_sharedata(self, verinfo, shnum, now, data): |
---|
8589 | + """ |
---|
8590 | + If one of my queries returns successfully (which means that we |
---|
8591 | + were able to and successfully did validate the signature), I |
---|
8592 | + cache the data that we initially fetched from the storage |
---|
8593 | + server. This will help reduce the number of roundtrips that need |
---|
8594 | + to occur when the file is downloaded, or when the file is |
---|
8595 | + updated. |
---|
8596 | + """ |
---|
8597 | + if verinfo: |
---|
8598 | + self._node._add_to_cache(verinfo, shnum, 0, data, now) |
---|
8599 | + |
---|
8600 | + |
---|
8601 | def _got_results(self, datavs, peerid, readsize, stuff, started): |
---|
8602 | lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares", |
---|
8603 | peerid=idlib.shortnodeid_b2a(peerid), |
---|
8604 | hunk ./src/allmydata/mutable/servermap.py 642 |
---|
8605 | level=log.NOISY) |
---|
8606 | now = time.time() |
---|
8607 | elapsed = now - started |
---|
8608 | - self._queries_outstanding.discard(peerid) |
---|
8609 | - self._servermap.reachable_peers.add(peerid) |
---|
8610 | - self._must_query.discard(peerid) |
---|
8611 | - self._queries_completed += 1 |
---|
8612 | + def _done_processing(ignored=None): |
---|
8613 | + self._queries_outstanding.discard(peerid) |
---|
8614 | + self._servermap.reachable_peers.add(peerid) |
---|
8615 | + self._must_query.discard(peerid) |
---|
8616 | + self._queries_completed += 1 |
---|
8617 | if not self._running: |
---|
8618 | self.log("but we're not running, so we'll ignore it", parent=lp, |
---|
8619 | level=log.NOISY) |
---|
8620 | hunk ./src/allmydata/mutable/servermap.py 650 |
---|
8621 | + _done_processing() |
---|
8622 | self._status.add_per_server_time(peerid, "late", started, elapsed) |
---|
8623 | return |
---|
8624 | self._status.add_per_server_time(peerid, "query", started, elapsed) |
---|
8625 | hunk ./src/allmydata/mutable/servermap.py 660 |
---|
8626 | else: |
---|
8627 | self._empty_peers.add(peerid) |
---|
8628 | |
---|
8629 | - last_verinfo = None |
---|
8630 | - last_shnum = None |
---|
8631 | + ss, storage_index = stuff |
---|
8632 | + ds = [] |
---|
8633 | + |
---|
8634 | for shnum,datav in datavs.items(): |
---|
8635 | data = datav[0] |
---|
8636 | hunk ./src/allmydata/mutable/servermap.py 665 |
---|
8637 | - try: |
---|
8638 | - verinfo = self._got_results_one_share(shnum, data, peerid, lp) |
---|
8639 | - last_verinfo = verinfo |
---|
8640 | - last_shnum = shnum |
---|
8641 | - self._node._add_to_cache(verinfo, shnum, 0, data, now) |
---|
8642 | - except CorruptShareError, e: |
---|
8643 | - # log it and give the other shares a chance to be processed |
---|
8644 | - f = failure.Failure() |
---|
8645 | - self.log(format="bad share: %(f_value)s", f_value=str(f.value), |
---|
8646 | - failure=f, parent=lp, level=log.WEIRD, umid="h5llHg") |
---|
8647 | - self.notify_server_corruption(peerid, shnum, str(e)) |
---|
8648 | - self._bad_peers.add(peerid) |
---|
8649 | - self._last_failure = f |
---|
8650 | - checkstring = data[:SIGNED_PREFIX_LENGTH] |
---|
8651 | - self._servermap.mark_bad_share(peerid, shnum, checkstring) |
---|
8652 | - self._servermap.problems.append(f) |
---|
8653 | - pass |
---|
8654 | + reader = MDMFSlotReadProxy(ss, |
---|
8655 | + storage_index, |
---|
8656 | + shnum, |
---|
8657 | + data) |
---|
8658 | + self._readers.setdefault(peerid, dict())[shnum] = reader |
---|
8659 | + # our goal, with each response, is to validate the version |
---|
8660 | + # information and share data as best we can at this point -- |
---|
8661 | + # we do this by validating the signature. To do this, we |
---|
8662 | + # need to do the following: |
---|
8663 | + # - If we don't already have the public key, fetch the |
---|
8664 | + # public key. We use this to validate the signature. |
---|
8665 | + if not self._node.get_pubkey(): |
---|
8666 | + # fetch and set the public key. |
---|
8667 | + d = reader.get_verification_key(queue=True) |
---|
8668 | + d.addCallback(lambda results, shnum=shnum, peerid=peerid: |
---|
8669 | + self._try_to_set_pubkey(results, peerid, shnum, lp)) |
---|
8670 | + # XXX: Make self._pubkey_query_failed? |
---|
8671 | + d.addErrback(lambda error, shnum=shnum, peerid=peerid: |
---|
8672 | + self._got_corrupt_share(error, shnum, peerid, data, lp)) |
---|
8673 | + else: |
---|
8674 | + # we already have the public key. |
---|
8675 | + d = defer.succeed(None) |
---|
8676 | |
---|
8677 | hunk ./src/allmydata/mutable/servermap.py 688 |
---|
8678 | - self._status.timings["cumulative_verify"] += (time.time() - now) |
---|
8679 | + # Neither of these two branches returns anything of |
---|
8680 | + # consequence, so the first entry in our deferredlist will |
---|
8681 | + # be None. |
---|
8682 | |
---|
8683 | hunk ./src/allmydata/mutable/servermap.py 692 |
---|
8684 | - if self._need_privkey and last_verinfo: |
---|
8685 | - # send them a request for the privkey. We send one request per |
---|
8686 | - # server. |
---|
8687 | - lp2 = self.log("sending privkey request", |
---|
8688 | - parent=lp, level=log.NOISY) |
---|
8689 | - (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
8690 | - offsets_tuple) = last_verinfo |
---|
8691 | - o = dict(offsets_tuple) |
---|
8692 | + # - Next, we need the version information. We almost |
---|
8693 | + # certainly got this by reading the first thousand or so |
---|
8694 | + # bytes of the share on the storage server, so we |
---|
8695 | + # shouldn't need to fetch anything at this step. |
---|
8696 | + d2 = reader.get_verinfo() |
---|
8697 | + d2.addErrback(lambda error, shnum=shnum, peerid=peerid: |
---|
8698 | + self._got_corrupt_share(error, shnum, peerid, data, lp)) |
---|
8699 | + # - Next, we need the signature. For an SDMF share, it is |
---|
8700 | + # likely that we fetched this when doing our initial fetch |
---|
8701 | + # to get the version information. In MDMF, this lives at |
---|
8702 | + # the end of the share, so unless the file is quite small, |
---|
8703 | + # we'll need to do a remote fetch to get it. |
---|
8704 | + d3 = reader.get_signature(queue=True) |
---|
8705 | + d3.addErrback(lambda error, shnum=shnum, peerid=peerid: |
---|
8706 | + self._got_corrupt_share(error, shnum, peerid, data, lp)) |
---|
8707 | + # Once we have all three of these responses, we can move on |
---|
8708 | + # to validating the signature |
---|
8709 | |
---|
8710 | hunk ./src/allmydata/mutable/servermap.py 710 |
---|
8711 | - self._queries_outstanding.add(peerid) |
---|
8712 | - readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ] |
---|
8713 | - ss = self._servermap.connections[peerid] |
---|
8714 | - privkey_started = time.time() |
---|
8715 | - d = self._do_read(ss, peerid, self._storage_index, |
---|
8716 | - [last_shnum], readv) |
---|
8717 | - d.addCallback(self._got_privkey_results, peerid, last_shnum, |
---|
8718 | - privkey_started, lp2) |
---|
8719 | - d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2) |
---|
8720 | - d.addErrback(log.err) |
---|
8721 | - d.addCallback(self._check_for_done) |
---|
8722 | - d.addErrback(self._fatal_error) |
---|
8723 | + # Does the node already have a privkey? If not, we'll try to |
---|
8724 | + # fetch it here. |
---|
8725 | + if self._need_privkey: |
---|
8726 | + d4 = reader.get_encprivkey(queue=True) |
---|
8727 | + d4.addCallback(lambda results, shnum=shnum, peerid=peerid: |
---|
8728 | + self._try_to_validate_privkey(results, peerid, shnum, lp)) |
---|
8729 | + d4.addErrback(lambda error, shnum=shnum, peerid=peerid: |
---|
8730 | + self._privkey_query_failed(error, peerid, shnum, lp)) |
---|
8731 | + else: |
---|
8732 | + d4 = defer.succeed(None) |
---|
8733 | + |
---|
8734 | + |
---|
8735 | + if self.fetch_update_data: |
---|
8736 | + # fetch the block hash tree and first + last segment, as |
---|
8737 | + # configured earlier. |
---|
8738 | + # Then set them in wherever we happen to want to set |
---|
8739 | + # them. |
---|
8740 | + update_ds = [] |
---|
8741 | + # XXX: We do this above, too. Is there a good way to |
---|
8742 | + # make the two routines share the value without |
---|
8743 | + # introducing more roundtrips? |
---|
8744 | + update_ds.append(reader.get_verinfo()) |
---|
8745 | + update_ds.append(reader.get_blockhashes(queue=True)) |
---|
8746 | + update_ds.append(reader.get_block_and_salt(self.start_segment, |
---|
8747 | + queue=True)) |
---|
8748 | + update_ds.append(reader.get_block_and_salt(self.end_segment, |
---|
8749 | + queue=True)) |
---|
8750 | + d5 = deferredutil.gatherResults(update_ds) |
---|
8751 | + d5.addCallback(self._got_update_results_one_share, shnum) |
---|
8752 | + else: |
---|
8753 | + d5 = defer.succeed(None) |
---|
8754 | |
---|
8755 | hunk ./src/allmydata/mutable/servermap.py 742 |
---|
8756 | + dl = defer.DeferredList([d, d2, d3, d4, d5]) |
---|
8757 | + dl.addBoth(self._turn_barrier) |
---|
8758 | + reader.flush() |
---|
8759 | + dl.addCallback(lambda results, shnum=shnum, peerid=peerid: |
---|
8760 | + self._got_signature_one_share(results, shnum, peerid, lp)) |
---|
8761 | + dl.addErrback(lambda error, shnum=shnum, data=data: |
---|
8762 | + self._got_corrupt_share(error, shnum, peerid, data, lp)) |
---|
8763 | + dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data: |
---|
8764 | + self._cache_good_sharedata(verinfo, shnum, now, data)) |
---|
8765 | + ds.append(dl) |
---|
8766 | + # dl is a deferred list that will fire when all of the shares |
---|
8767 | + # that we found on this peer are done processing. When dl fires, |
---|
8768 | + # we know that processing is done, so we can decrement the |
---|
8769 | + # semaphore-like thing that we incremented earlier. |
---|
8770 | + dl = defer.DeferredList(ds, fireOnOneErrback=True) |
---|
8771 | + # Are we done? Done means that there are no more queries to |
---|
8772 | + # send, that there are no outstanding queries, and that we |
---|
8773 | + # haven't received any responses that are still processing. If we |
---|
8774 | + # are done, self._check_for_done will cause the done deferred |
---|
8775 | + # that we returned to our caller to fire, which tells them that |
---|
8776 | + # they have a complete servermap, and that we won't be touching |
---|
8777 | + # the servermap anymore. |
---|
8778 | + dl.addCallback(_done_processing) |
---|
8779 | + dl.addCallback(self._check_for_done) |
---|
8780 | + dl.addErrback(self._fatal_error) |
---|
8781 | # all done! |
---|
8782 | self.log("_got_results done", parent=lp, level=log.NOISY) |
---|
8783 | hunk ./src/allmydata/mutable/servermap.py 769 |
---|
8784 | + return dl |
---|
8785 | + |
---|
8786 | + |
---|
8787 | + def _turn_barrier(self, result): |
---|
8788 | + """ |
---|
8789 | + I help the servermap updater avoid the recursion limit issues |
---|
8790 | + discussed in #237. |
---|
8791 | + """ |
---|
8792 | + return fireEventually(result) |
---|
8793 | + |
---|
8794 | + |
---|
8795 | + def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp): |
---|
8796 | + if self._node.get_pubkey(): |
---|
8797 | + return # don't go through this again if we don't have to |
---|
8798 | + fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s) |
---|
8799 | + assert len(fingerprint) == 32 |
---|
8800 | + if fingerprint != self._node.get_fingerprint(): |
---|
8801 | + raise CorruptShareError(peerid, shnum, |
---|
8802 | + "pubkey doesn't match fingerprint") |
---|
8803 | + self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s)) |
---|
8804 | + assert self._node.get_pubkey() |
---|
8805 | + |
---|
8806 | |
---|
8807 | def notify_server_corruption(self, peerid, shnum, reason): |
---|
8808 | ss = self._servermap.connections[peerid] |
---|
8809 | hunk ./src/allmydata/mutable/servermap.py 797 |
---|
8810 | ss.callRemoteOnly("advise_corrupt_share", |
---|
8811 | "mutable", self._storage_index, shnum, reason) |
---|
8812 | |
---|
8813 | - def _got_results_one_share(self, shnum, data, peerid, lp): |
---|
8814 | + |
---|
8815 | + def _got_signature_one_share(self, results, shnum, peerid, lp): |
---|
8816 | + # It is our job to give versioninfo to our caller. We need to |
---|
8817 | + # raise CorruptShareError if the share is corrupt for any |
---|
8818 | + # reason, something that our caller will handle. |
---|
8819 | self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s", |
---|
8820 | shnum=shnum, |
---|
8821 | peerid=idlib.shortnodeid_b2a(peerid), |
---|
8822 | hunk ./src/allmydata/mutable/servermap.py 807 |
---|
8823 | level=log.NOISY, |
---|
8824 | parent=lp) |
---|
8825 | + if not self._running: |
---|
8826 | + # We can't process the results, since we can't touch the |
---|
8827 | + # servermap anymore. |
---|
8828 | + self.log("but we're not running anymore.") |
---|
8829 | + return None |
---|
8830 | |
---|
8831 | hunk ./src/allmydata/mutable/servermap.py 813 |
---|
8832 | - # this might raise NeedMoreDataError, if the pubkey and signature |
---|
8833 | - # live at some weird offset. That shouldn't happen, so I'm going to |
---|
8834 | - # treat it as a bad share. |
---|
8835 | - (seqnum, root_hash, IV, k, N, segsize, datalength, |
---|
8836 | - pubkey_s, signature, prefix) = unpack_prefix_and_signature(data) |
---|
8837 | - |
---|
8838 | - if not self._node.get_pubkey(): |
---|
8839 | - fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s) |
---|
8840 | - assert len(fingerprint) == 32 |
---|
8841 | - if fingerprint != self._node.get_fingerprint(): |
---|
8842 | - raise CorruptShareError(peerid, shnum, |
---|
8843 | - "pubkey doesn't match fingerprint") |
---|
8844 | - self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s)) |
---|
8845 | - |
---|
8846 | - if self._need_privkey: |
---|
8847 | - self._try_to_extract_privkey(data, peerid, shnum, lp) |
---|
8848 | - |
---|
8849 | - (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N, |
---|
8850 | - ig_segsize, ig_datalen, offsets) = unpack_header(data) |
---|
8851 | + _, verinfo, signature, __, ___ = results |
---|
8852 | + (seqnum, |
---|
8853 | + root_hash, |
---|
8854 | + saltish, |
---|
8855 | + segsize, |
---|
8856 | + datalen, |
---|
8857 | + k, |
---|
8858 | + n, |
---|
8859 | + prefix, |
---|
8860 | + offsets) = verinfo[1] |
---|
8861 | offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] ) |
---|
8862 | |
---|
8863 | hunk ./src/allmydata/mutable/servermap.py 825 |
---|
8864 | - verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, |
---|
8865 | + # XXX: This should be done for us in the method, so |
---|
8866 | + # presumably you can go in there and fix it. |
---|
8867 | + verinfo = (seqnum, |
---|
8868 | + root_hash, |
---|
8869 | + saltish, |
---|
8870 | + segsize, |
---|
8871 | + datalen, |
---|
8872 | + k, |
---|
8873 | + n, |
---|
8874 | + prefix, |
---|
8875 | offsets_tuple) |
---|
8876 | hunk ./src/allmydata/mutable/servermap.py 836 |
---|
8877 | + # This tuple uniquely identifies a share on the grid; we use it |
---|
8878 | + # to keep track of the ones that we've already seen. |
---|
8879 | |
---|
8880 | if verinfo not in self._valid_versions: |
---|
8881 | hunk ./src/allmydata/mutable/servermap.py 840 |
---|
8882 | - # it's a new pair. Verify the signature. |
---|
8883 | - valid = self._node.get_pubkey().verify(prefix, signature) |
---|
8884 | + # This is a new version tuple, and we need to validate it |
---|
8885 | + # against the public key before keeping track of it. |
---|
8886 | + assert self._node.get_pubkey() |
---|
8887 | + valid = self._node.get_pubkey().verify(prefix, signature[1]) |
---|
8888 | if not valid: |
---|
8889 | hunk ./src/allmydata/mutable/servermap.py 845 |
---|
8890 | - raise CorruptShareError(peerid, shnum, "signature is invalid") |
---|
8891 | + raise CorruptShareError(peerid, shnum, |
---|
8892 | + "signature is invalid") |
---|
8893 | |
---|
8894 | hunk ./src/allmydata/mutable/servermap.py 848 |
---|
8895 | - # ok, it's a valid verinfo. Add it to the list of validated |
---|
8896 | - # versions. |
---|
8897 | - self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d" |
---|
8898 | - % (seqnum, base32.b2a(root_hash)[:4], |
---|
8899 | - idlib.shortnodeid_b2a(peerid), shnum, |
---|
8900 | - k, N, segsize, datalength), |
---|
8901 | - parent=lp) |
---|
8902 | - self._valid_versions.add(verinfo) |
---|
8903 | - # We now know that this is a valid candidate verinfo. |
---|
8904 | + # ok, it's a valid verinfo. Add it to the list of validated |
---|
8905 | + # versions. |
---|
8906 | + self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d" |
---|
8907 | + % (seqnum, base32.b2a(root_hash)[:4], |
---|
8908 | + idlib.shortnodeid_b2a(peerid), shnum, |
---|
8909 | + k, n, segsize, datalen), |
---|
8910 | + parent=lp) |
---|
8911 | + self._valid_versions.add(verinfo) |
---|
8912 | + # We now know that this is a valid candidate verinfo. Whether or |
---|
8913 | + # not this instance of it is valid is a matter for the next |
---|
8914 | + # statement; at this point, we just know that if we see this |
---|
8915 | + # version info again, that its signature checks out and that |
---|
8916 | + # we're okay to skip the signature-checking step. |
---|
8917 | |
---|
8918 | hunk ./src/allmydata/mutable/servermap.py 862 |
---|
8919 | + # (peerid, shnum) are bound in the method invocation. |
---|
8920 | if (peerid, shnum) in self._servermap.bad_shares: |
---|
8921 | # we've been told that the rest of the data in this share is |
---|
8922 | # unusable, so don't add it to the servermap. |
---|
8923 | hunk ./src/allmydata/mutable/servermap.py 875 |
---|
8924 | self._servermap.add_new_share(peerid, shnum, verinfo, timestamp) |
---|
8925 | # and the versionmap |
---|
8926 | self.versionmap.add(verinfo, (shnum, peerid, timestamp)) |
---|
8927 | + |
---|
8928 | + # It's our job to set the protocol version of our parent |
---|
8929 | + # filenode if it isn't already set. |
---|
8930 | + if not self._node.get_version(): |
---|
8931 | + # The first byte of the prefix is the version. |
---|
8932 | + v = struct.unpack(">B", prefix[:1])[0] |
---|
8933 | + self.log("got version %d" % v) |
---|
8934 | + self._node.set_version(v) |
---|
8935 | + |
---|
8936 | return verinfo |
---|
8937 | |
---|
8938 | hunk ./src/allmydata/mutable/servermap.py 886 |
---|
8939 | - def _deserialize_pubkey(self, pubkey_s): |
---|
8940 | - verifier = rsa.create_verifying_key_from_string(pubkey_s) |
---|
8941 | - return verifier |
---|
8942 | |
---|
8943 | hunk ./src/allmydata/mutable/servermap.py 887 |
---|
8944 | - def _try_to_extract_privkey(self, data, peerid, shnum, lp): |
---|
8945 | - try: |
---|
8946 | - r = unpack_share(data) |
---|
8947 | - except NeedMoreDataError, e: |
---|
8948 | - # this share won't help us. oh well. |
---|
8949 | - offset = e.encprivkey_offset |
---|
8950 | - length = e.encprivkey_length |
---|
8951 | - self.log("shnum %d on peerid %s: share was too short (%dB) " |
---|
8952 | - "to get the encprivkey; [%d:%d] ought to hold it" % |
---|
8953 | - (shnum, idlib.shortnodeid_b2a(peerid), len(data), |
---|
8954 | - offset, offset+length), |
---|
8955 | - parent=lp) |
---|
8956 | - # NOTE: if uncoordinated writes are taking place, someone might |
---|
8957 | - # change the share (and most probably move the encprivkey) before |
---|
8958 | - # we get a chance to do one of these reads and fetch it. This |
---|
8959 | - # will cause us to see a NotEnoughSharesError(unable to fetch |
---|
8960 | - # privkey) instead of an UncoordinatedWriteError . This is a |
---|
8961 | - # nuisance, but it will go away when we move to DSA-based mutable |
---|
8962 | - # files (since the privkey will be small enough to fit in the |
---|
8963 | - # write cap). |
---|
8964 | + def _got_update_results_one_share(self, results, share): |
---|
8965 | + """ |
---|
8966 | + I record the update data fetched for one share in the servermap. |
---|
8967 | + """ |
---|
8968 | + assert len(results) == 4 |
---|
8969 | + verinfo, blockhashes, start, end = results |
---|
8970 | + (seqnum, |
---|
8971 | + root_hash, |
---|
8972 | + saltish, |
---|
8973 | + segsize, |
---|
8974 | + datalen, |
---|
8975 | + k, |
---|
8976 | + n, |
---|
8977 | + prefix, |
---|
8978 | + offsets) = verinfo |
---|
8979 | + offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] ) |
---|
8980 | |
---|
8981 | hunk ./src/allmydata/mutable/servermap.py 904 |
---|
8982 | - return |
---|
8983 | + # XXX: This should be done for us in the method, so |
---|
8984 | + # presumably you can go in there and fix it. |
---|
8985 | + verinfo = (seqnum, |
---|
8986 | + root_hash, |
---|
8987 | + saltish, |
---|
8988 | + segsize, |
---|
8989 | + datalen, |
---|
8990 | + k, |
---|
8991 | + n, |
---|
8992 | + prefix, |
---|
8993 | + offsets_tuple) |
---|
8994 | |
---|
8995 | hunk ./src/allmydata/mutable/servermap.py 916 |
---|
8996 | - (seqnum, root_hash, IV, k, N, segsize, datalen, |
---|
8997 | - pubkey, signature, share_hash_chain, block_hash_tree, |
---|
8998 | - share_data, enc_privkey) = r |
---|
8999 | + update_data = (blockhashes, start, end) |
---|
9000 | + self._servermap.set_update_data_for_share_and_verinfo(share, |
---|
9001 | + verinfo, |
---|
9002 | + update_data) |
---|
9003 | |
---|
9004 | hunk ./src/allmydata/mutable/servermap.py 921 |
---|
9005 | - return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp) |
---|
9006 | + |
---|
9007 | + def _deserialize_pubkey(self, pubkey_s): |
---|
9008 | + verifier = rsa.create_verifying_key_from_string(pubkey_s) |
---|
9009 | + return verifier |
---|
9010 | |
---|
9011 | hunk ./src/allmydata/mutable/servermap.py 926 |
---|
9012 | - def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp): |
---|
9013 | |
---|
9014 | hunk ./src/allmydata/mutable/servermap.py 927 |
---|
9015 | + def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp): |
---|
9016 | + """ |
---|
9017 | + Given an encrypted privkey from a remote server, I decrypt it and |
---|
9018 | + check its writekey hash against my node's writekey. If it is valid, |
---|
9019 | + then I set the privkey and encprivkey properties of the node. |
---|
9020 | + """ |
---|
9021 | alleged_privkey_s = self._node._decrypt_privkey(enc_privkey) |
---|
9022 | alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s) |
---|
9023 | if alleged_writekey != self._node.get_writekey(): |
---|
9024 | hunk ./src/allmydata/mutable/servermap.py 1005 |
---|
9025 | self._queries_completed += 1 |
---|
9026 | self._last_failure = f |
---|
9027 | |
---|
9028 | - def _got_privkey_results(self, datavs, peerid, shnum, started, lp): |
---|
9029 | - now = time.time() |
---|
9030 | - elapsed = now - started |
---|
9031 | - self._status.add_per_server_time(peerid, "privkey", started, elapsed) |
---|
9032 | - self._queries_outstanding.discard(peerid) |
---|
9033 | - if not self._need_privkey: |
---|
9034 | - return |
---|
9035 | - if shnum not in datavs: |
---|
9036 | - self.log("privkey wasn't there when we asked it", |
---|
9037 | - level=log.WEIRD, umid="VA9uDQ") |
---|
9038 | - return |
---|
9039 | - datav = datavs[shnum] |
---|
9040 | - enc_privkey = datav[0] |
---|
9041 | - self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp) |
---|
9042 | |
---|
9043 | def _privkey_query_failed(self, f, peerid, shnum, lp): |
---|
9044 | self._queries_outstanding.discard(peerid) |
---|
9045 | hunk ./src/allmydata/mutable/servermap.py 1019 |
---|
9046 | self._servermap.problems.append(f) |
---|
9047 | self._last_failure = f |
---|
9048 | |
---|
9049 | + |
---|
9050 | def _check_for_done(self, res): |
---|
9051 | # exit paths: |
---|
9052 | # return self._send_more_queries(outstanding) : send some more queries |
---|
9053 | hunk ./src/allmydata/mutable/servermap.py 1025 |
---|
9054 | # return self._done() : all done |
---|
9055 | # return : keep waiting, no new queries |
---|
9056 | - |
---|
9057 | lp = self.log(format=("_check_for_done, mode is '%(mode)s', " |
---|
9058 | "%(outstanding)d queries outstanding, " |
---|
9059 | "%(extra)d extra peers available, " |
---|
9060 | hunk ./src/allmydata/mutable/servermap.py 1216 |
---|
9061 | |
---|
9062 | def _done(self): |
---|
9063 | if not self._running: |
---|
9064 | + self.log("not running; we're already done") |
---|
9065 | return |
---|
9066 | self._running = False |
---|
9067 | now = time.time() |
---|
9068 | hunk ./src/allmydata/mutable/servermap.py 1231 |
---|
9069 | self._servermap.last_update_time = self._started |
---|
9070 | # the servermap will not be touched after this |
---|
9071 | self.log("servermap: %s" % self._servermap.summarize_versions()) |
---|
9072 | + |
---|
9073 | eventually(self._done_deferred.callback, self._servermap) |
---|
9074 | |
---|
9075 | def _fatal_error(self, f): |
---|
9076 | } |
---|
9077 | [tests: |
---|
9078 | Kevan Carstensen <kevan@isnotajoke.com>**20100811233331 |
---|
9079 | Ignore-this: 2c2c6049abe088edce8fc54f248a2225 |
---|
9080 | |
---|
9081 | - A lot of existing tests relied on aspects of the mutable file |
---|
9082 | implementation that were changed. This patch updates those tests |
---|
9083 | to work with the changes. |
---|
9084 | - This patch also adds tests for new features. |
---|
9085 | ] { |
---|
9086 | hunk ./src/allmydata/test/common.py 12 |
---|
9087 | from allmydata import uri, dirnode, client |
---|
9088 | from allmydata.introducer.server import IntroducerNode |
---|
9089 | from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \ |
---|
9090 | - FileTooLargeError, NotEnoughSharesError, ICheckable |
---|
9091 | + FileTooLargeError, NotEnoughSharesError, ICheckable, \ |
---|
9092 | + IMutableUploadable |
---|
9093 | from allmydata.check_results import CheckResults, CheckAndRepairResults, \ |
---|
9094 | DeepCheckResults, DeepCheckAndRepairResults |
---|
9095 | from allmydata.mutable.common import CorruptShareError |
---|
9096 | hunk ./src/allmydata/test/common.py 18 |
---|
9097 | from allmydata.mutable.layout import unpack_header |
---|
9098 | +from allmydata.mutable.publish import MutableData |
---|
9099 | from allmydata.storage.server import storage_index_to_dir |
---|
9100 | from allmydata.storage.mutable import MutableShareFile |
---|
9101 | from allmydata.util import hashutil, log, fileutil, pollmixin |
---|
9102 | hunk ./src/allmydata/test/common.py 152 |
---|
9103 | consumer.write(data[start:end]) |
---|
9104 | return consumer |
---|
9105 | |
---|
9106 | + |
---|
9107 | + def get_best_readable_version(self): |
---|
9108 | + return defer.succeed(self) |
---|
9109 | + |
---|
9110 | + |
---|
9111 | + download_best_version = download_to_data |
---|
9112 | + |
---|
9113 | + |
---|
9114 | + def download_to_data(self): |
---|
9115 | + return download_to_data(self) |
---|
9116 | + |
---|
9117 | + |
---|
9118 | + def get_size_of_best_version(self): |
---|
9119 | + return defer.succeed(self.get_size()) |
---|
9120 | + |
---|
9121 | + |
---|
9122 | def make_chk_file_cap(size): |
---|
9123 | return uri.CHKFileURI(key=os.urandom(16), |
---|
9124 | uri_extension_hash=os.urandom(32), |
---|
9125 | hunk ./src/allmydata/test/common.py 198 |
---|
9126 | self.init_from_cap(make_mutable_file_cap()) |
---|
9127 | def create(self, contents, key_generator=None, keysize=None): |
---|
9128 | initial_contents = self._get_initial_contents(contents) |
---|
9129 | - if len(initial_contents) > self.MUTABLE_SIZELIMIT: |
---|
9130 | - raise FileTooLargeError("SDMF is limited to one segment, and " |
---|
9131 | - "%d > %d" % (len(initial_contents), |
---|
9132 | - self.MUTABLE_SIZELIMIT)) |
---|
9133 | - self.all_contents[self.storage_index] = initial_contents |
---|
9134 | + data = initial_contents.read(initial_contents.get_size()) |
---|
9135 | + data = "".join(data) |
---|
9136 | + self.all_contents[self.storage_index] = data |
---|
9137 | return defer.succeed(self) |
---|
9138 | def _get_initial_contents(self, contents): |
---|
9139 | hunk ./src/allmydata/test/common.py 203 |
---|
9140 | - if isinstance(contents, str): |
---|
9141 | - return contents |
---|
9142 | if contents is None: |
---|
9143 | hunk ./src/allmydata/test/common.py 204 |
---|
9144 | - return "" |
---|
9145 | + return MutableData("") |
---|
9146 | + |
---|
9147 | + if IMutableUploadable.providedBy(contents): |
---|
9148 | + return contents |
---|
9149 | + |
---|
9150 | assert callable(contents), "%s should be callable, not %s" % \ |
---|
9151 | (contents, type(contents)) |
---|
9152 | return contents(self) |
---|
9153 | hunk ./src/allmydata/test/common.py 314 |
---|
9154 | return d |
---|
9155 | |
---|
9156 | def download_best_version(self): |
---|
9157 | + return defer.succeed(self._download_best_version()) |
---|
9158 | + |
---|
9159 | + |
---|
9160 | + def _download_best_version(self, ignored=None): |
---|
9161 | if isinstance(self.my_uri, uri.LiteralFileURI): |
---|
9162 | hunk ./src/allmydata/test/common.py 319 |
---|
9163 | - return defer.succeed(self.my_uri.data) |
---|
9164 | + return self.my_uri.data |
---|
9165 | if self.storage_index not in self.all_contents: |
---|
9166 | hunk ./src/allmydata/test/common.py 321 |
---|
9167 | - return defer.fail(NotEnoughSharesError(None, 0, 3)) |
---|
9168 | - return defer.succeed(self.all_contents[self.storage_index]) |
---|
9169 | + raise NotEnoughSharesError(None, 0, 3) |
---|
9170 | + return self.all_contents[self.storage_index] |
---|
9171 | + |
---|
9172 | |
---|
9173 | def overwrite(self, new_contents): |
---|
9174 | hunk ./src/allmydata/test/common.py 326 |
---|
9175 | - if len(new_contents) > self.MUTABLE_SIZELIMIT: |
---|
9176 | - raise FileTooLargeError("SDMF is limited to one segment, and " |
---|
9177 | - "%d > %d" % (len(new_contents), |
---|
9178 | - self.MUTABLE_SIZELIMIT)) |
---|
9179 | assert not self.is_readonly() |
---|
9180 | hunk ./src/allmydata/test/common.py 327 |
---|
9181 | - self.all_contents[self.storage_index] = new_contents |
---|
9182 | + new_data = new_contents.read(new_contents.get_size()) |
---|
9183 | + new_data = "".join(new_data) |
---|
9184 | + self.all_contents[self.storage_index] = new_data |
---|
9185 | return defer.succeed(None) |
---|
9186 | def modify(self, modifier): |
---|
9187 | # this does not implement FileTooLargeError, but the real one does |
---|
9188 | hunk ./src/allmydata/test/common.py 337 |
---|
9189 | def _modify(self, modifier): |
---|
9190 | assert not self.is_readonly() |
---|
9191 | old_contents = self.all_contents[self.storage_index] |
---|
9192 | - self.all_contents[self.storage_index] = modifier(old_contents, None, True) |
---|
9193 | + new_data = modifier(old_contents, None, True) |
---|
9194 | + self.all_contents[self.storage_index] = new_data |
---|
9195 | return None |
---|
9196 | |
---|
9197 | hunk ./src/allmydata/test/common.py 341 |
---|
9198 | + # As actually implemented, MutableFilenode and MutableFileVersion |
---|
9199 | + # are distinct. However, nothing in the webapi uses (yet) that |
---|
9200 | + # distinction -- it just uses the unified download interface |
---|
9201 | + # provided by get_best_readable_version and read. When we start |
---|
9202 | + # doing cooler things like LDMF, we will want to revise this code to |
---|
9203 | + # be less simplistic. |
---|
9204 | + def get_best_readable_version(self): |
---|
9205 | + return defer.succeed(self) |
---|
9206 | + |
---|
9207 | + |
---|
9208 | + def get_best_mutable_version(self): |
---|
9209 | + return defer.succeed(self) |
---|
9210 | + |
---|
9211 | + # Ditto for this, which is an implementation of IWritable. |
---|
9212 | + # XXX: Declare that the same is implemented. |
---|
9213 | + def update(self, data, offset): |
---|
9214 | + assert not self.is_readonly() |
---|
9215 | + def modifier(old, servermap, first_time): |
---|
9216 | + new = old[:offset] + "".join(data.read(data.get_size())) |
---|
9217 | + new += old[len(new):] |
---|
9218 | + return new |
---|
9219 | + return self.modify(modifier) |
---|
9220 | + |
---|
9221 | + |
---|
9222 | + def read(self, consumer, offset=0, size=None): |
---|
9223 | + data = self._download_best_version() |
---|
9224 | + if size: |
---|
9225 | + data = data[offset:offset+size] |
---|
9226 | + consumer.write(data) |
---|
9227 | + return defer.succeed(consumer) |
---|
9228 | + |
---|
9229 | + |
---|
9230 | def make_mutable_file_cap(): |
---|
9231 | return uri.WriteableSSKFileURI(writekey=os.urandom(16), |
---|
9232 | fingerprint=os.urandom(32)) |
---|
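The fake update() in the hunk above overwrites bytes starting at offset and keeps any old data that extends past the written region. A minimal sketch of that splice on plain strings follows. It assumes the new data has already been joined into a single string, and splice is a hypothetical helper name that does not appear in the patch.

```python
def splice(old, data, offset):
    """Overwrite `old` with `data` starting at `offset`.

    Mirrors the modifier inside the fake update(): bytes before `offset`
    are kept, `data` is written, and whatever part of `old` lies beyond
    the written region survives unchanged.
    """
    new = old[:offset] + data
    new += old[len(new):]
    return new


middle = splice("0123456789", "abc", 2)    # overwrite in the middle
past_end = splice("0123456789", "abc", 9)  # write runs past the old end
```

Because slicing clamps to the string length, an offset beyond the end of the old contents appends the data without padding; whether the real IWritable implementation behaves the same way is not shown in this patch.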
9233 | hunk ./src/allmydata/test/test_checker.py 11 |
---|
9234 | from allmydata.test.no_network import GridTestMixin |
---|
9235 | from allmydata.immutable.upload import Data |
---|
9236 | from allmydata.test.common_web import WebRenderingMixin |
---|
9237 | +from allmydata.mutable.publish import MutableData |
---|
9238 | |
---|
9239 | class FakeClient: |
---|
9240 | def get_storage_broker(self): |
---|
9241 | hunk ./src/allmydata/test/test_checker.py 291 |
---|
9242 | def _stash_immutable(ur): |
---|
9243 | self.imm = c0.create_node_from_uri(ur.uri) |
---|
9244 | d.addCallback(_stash_immutable) |
---|
9245 | - d.addCallback(lambda ign: c0.create_mutable_file("contents")) |
---|
9246 | + d.addCallback(lambda ign: |
---|
9247 | + c0.create_mutable_file(MutableData("contents"))) |
---|
9248 | def _stash_mutable(node): |
---|
9249 | self.mut = node |
---|
9250 | d.addCallback(_stash_mutable) |
---|
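Throughout these test hunks, raw strings passed to create_mutable_file and overwrite are wrapped in MutableData. The fakes in this patch consume such an object with data.read(data.get_size()) and then "".join(...) the result, which suggests read() returns a sequence of chunks. The sketch below is a hypothetical stand-in built only from that observed usage; the real class in allmydata.mutable.publish may differ in detail.

```python
class FakeUploadable(object):
    """Hypothetical stand-in for a string-backed uploadable.

    Only the two calls that the fakes in this patch rely on are
    modelled: get_size() and read(length).
    """

    def __init__(self, s):
        self._data = s
        self._pos = 0

    def get_size(self):
        # Total number of bytes available, independent of read position.
        return len(self._data)

    def read(self, length):
        # Return a list of chunks so that callers can "".join() them.
        chunk = self._data[self._pos:self._pos + length]
        self._pos += len(chunk)
        return [chunk]


uploadable = FakeUploadable("contents")
joined = "".join(uploadable.read(uploadable.get_size()))
```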
9251 | hunk ./src/allmydata/test/test_cli.py 11 |
---|
9252 | from allmydata.util import fileutil, hashutil, base32 |
---|
9253 | from allmydata import uri |
---|
9254 | from allmydata.immutable import upload |
---|
9255 | +from allmydata.mutable.publish import MutableData |
---|
9256 | from allmydata.dirnode import normalize |
---|
9257 | |
---|
9258 | # Test that the scripts can be imported -- although the actual tests of their |
---|
9259 | hunk ./src/allmydata/test/test_cli.py 644 |
---|
9260 | |
---|
9261 | d = self.do_cli("create-alias", etudes_arg) |
---|
9262 | def _check_create_unicode((rc, out, err)): |
---|
9263 | - self.failUnlessReallyEqual(rc, 0) |
---|
9264 | + #self.failUnlessReallyEqual(rc, 0) |
---|
9265 | self.failUnlessReallyEqual(err, "") |
---|
9266 | self.failUnlessIn("Alias %s created" % quote_output(u"\u00E9tudes"), out) |
---|
9267 | |
---|
9268 | hunk ./src/allmydata/test/test_cli.py 1975 |
---|
9269 | self.set_up_grid() |
---|
9270 | c0 = self.g.clients[0] |
---|
9271 | DATA = "data" * 100 |
---|
9272 | - d = c0.create_mutable_file(DATA) |
---|
9273 | + DATA_uploadable = MutableData(DATA) |
---|
9274 | + d = c0.create_mutable_file(DATA_uploadable) |
---|
9275 | def _stash_uri(n): |
---|
9276 | self.uri = n.get_uri() |
---|
9277 | d.addCallback(_stash_uri) |
---|
9278 | hunk ./src/allmydata/test/test_cli.py 2077 |
---|
9279 | upload.Data("literal", |
---|
9280 | convergence=""))) |
---|
9281 | d.addCallback(_stash_uri, "small") |
---|
9282 | - d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1")) |
---|
9283 | + d.addCallback(lambda ign: |
---|
9284 | + c0.create_mutable_file(MutableData(DATA+"1"))) |
---|
9285 | d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn)) |
---|
9286 | d.addCallback(_stash_uri, "mutable") |
---|
9287 | |
---|
9288 | hunk ./src/allmydata/test/test_cli.py 2096 |
---|
9289 | # root/small |
---|
9290 | # root/mutable |
---|
9291 | |
---|
9292 | + # We haven't broken anything yet, so this should all be healthy. |
---|
9293 | d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose", |
---|
9294 | self.rooturi)) |
---|
9295 | def _check2((rc, out, err)): |
---|
9296 | hunk ./src/allmydata/test/test_cli.py 2111 |
---|
9297 | in lines, out) |
---|
9298 | d.addCallback(_check2) |
---|
9299 | |
---|
9300 | + # Similarly, all of these results should be as we expect them to |
---|
9301 | + # be for a healthy file layout. |
---|
9302 | d.addCallback(lambda ign: self.do_cli("stats", self.rooturi)) |
---|
9303 | def _check_stats((rc, out, err)): |
---|
9304 | self.failUnlessReallyEqual(err, "") |
---|
9305 | hunk ./src/allmydata/test/test_cli.py 2128 |
---|
9306 | self.failUnlessIn(" 317-1000 : 1 (1000 B, 1000 B)", lines) |
---|
9307 | d.addCallback(_check_stats) |
---|
9308 | |
---|
9309 | + # Now we break things. |
---|
9310 | def _clobber_shares(ignored): |
---|
9311 | shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"]) |
---|
9312 | self.failUnlessReallyEqual(len(shares), 10) |
---|
9313 | hunk ./src/allmydata/test/test_cli.py 2153 |
---|
9314 | |
---|
9315 | d.addCallback(lambda ign: |
---|
9316 | self.do_cli("deep-check", "--verbose", self.rooturi)) |
---|
9317 | + # This should reveal the missing share, but not the corrupt |
---|
9318 | + # share, since we didn't tell the deep check operation to also |
---|
9319 | + # verify. |
---|
9320 | def _check3((rc, out, err)): |
---|
9321 | self.failUnlessReallyEqual(err, "") |
---|
9322 | self.failUnlessReallyEqual(rc, 0) |
---|
9323 | hunk ./src/allmydata/test/test_cli.py 2204 |
---|
9324 | "--verbose", "--verify", "--repair", |
---|
9325 | self.rooturi)) |
---|
9326 | def _check6((rc, out, err)): |
---|
9327 | + # We've just repaired the directory. There is no reason for |
---|
9328 | + # that repair to be unsuccessful. |
---|
9329 | self.failUnlessReallyEqual(err, "") |
---|
9330 | self.failUnlessReallyEqual(rc, 0) |
---|
9331 | lines = out.splitlines() |
---|
9332 | hunk ./src/allmydata/test/test_deepcheck.py 9 |
---|
9333 | from twisted.internet import threads # CLI tests use deferToThread |
---|
9334 | from allmydata.immutable import upload |
---|
9335 | from allmydata.mutable.common import UnrecoverableFileError |
---|
9336 | +from allmydata.mutable.publish import MutableData |
---|
9337 | from allmydata.util import idlib |
---|
9338 | from allmydata.util import base32 |
---|
9339 | from allmydata.scripts import runner |
---|
9340 | hunk ./src/allmydata/test/test_deepcheck.py 38 |
---|
9341 | self.basedir = "deepcheck/MutableChecker/good" |
---|
9342 | self.set_up_grid() |
---|
9343 | CONTENTS = "a little bit of data" |
---|
9344 | - d = self.g.clients[0].create_mutable_file(CONTENTS) |
---|
9345 | + CONTENTS_uploadable = MutableData(CONTENTS) |
---|
9346 | + d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable) |
---|
9347 | def _created(node): |
---|
9348 | self.node = node |
---|
9349 | self.fileurl = "uri/" + urllib.quote(node.get_uri()) |
---|
9350 | hunk ./src/allmydata/test/test_deepcheck.py 61 |
---|
9351 | self.basedir = "deepcheck/MutableChecker/corrupt" |
---|
9352 | self.set_up_grid() |
---|
9353 | CONTENTS = "a little bit of data" |
---|
9354 | - d = self.g.clients[0].create_mutable_file(CONTENTS) |
---|
9355 | + CONTENTS_uploadable = MutableData(CONTENTS) |
---|
9356 | + d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable) |
---|
9357 | def _stash_and_corrupt(node): |
---|
9358 | self.node = node |
---|
9359 | self.fileurl = "uri/" + urllib.quote(node.get_uri()) |
---|
9360 | hunk ./src/allmydata/test/test_deepcheck.py 99 |
---|
9361 | self.basedir = "deepcheck/MutableChecker/delete_share" |
---|
9362 | self.set_up_grid() |
---|
9363 | CONTENTS = "a little bit of data" |
---|
9364 | - d = self.g.clients[0].create_mutable_file(CONTENTS) |
---|
9365 | + CONTENTS_uploadable = MutableData(CONTENTS) |
---|
9366 | + d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable) |
---|
9367 | def _stash_and_delete(node): |
---|
9368 | self.node = node |
---|
9369 | self.fileurl = "uri/" + urllib.quote(node.get_uri()) |
---|
9370 | hunk ./src/allmydata/test/test_deepcheck.py 223 |
---|
9371 | self.root = n |
---|
9372 | self.root_uri = n.get_uri() |
---|
9373 | d.addCallback(_created_root) |
---|
9374 | - d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents")) |
---|
9375 | + d.addCallback(lambda ign: |
---|
9376 | + c0.create_mutable_file(MutableData("mutable file contents"))) |
---|
9377 | d.addCallback(lambda n: self.root.set_node(u"mutable", n)) |
---|
9378 | def _created_mutable(n): |
---|
9379 | self.mutable = n |
---|
9380 | hunk ./src/allmydata/test/test_deepcheck.py 965 |
---|
9381 | def create_mangled(self, ignored, name): |
---|
9382 | nodetype, mangletype = name.split("-", 1) |
---|
9383 | if nodetype == "mutable": |
---|
9384 | - d = self.g.clients[0].create_mutable_file("mutable file contents") |
---|
9385 | + mutable_uploadable = MutableData("mutable file contents") |
---|
9386 | + d = self.g.clients[0].create_mutable_file(mutable_uploadable) |
---|
9387 | d.addCallback(lambda n: self.root.set_node(unicode(name), n)) |
---|
9388 | elif nodetype == "large": |
---|
9389 | large = upload.Data("Lots of data\n" * 1000 + name + "\n", None) |
---|
9390 | hunk ./src/allmydata/test/test_dirnode.py 1304 |
---|
9391 | implements(IMutableFileNode) |
---|
9392 | counter = 0 |
---|
9393 | def __init__(self, initial_contents=""): |
---|
9394 | - self.data = self._get_initial_contents(initial_contents) |
---|
9395 | + data = self._get_initial_contents(initial_contents) |
---|
9396 | + self.data = data.read(data.get_size()) |
---|
9397 | + self.data = "".join(self.data) |
---|
9398 | + |
---|
9399 | counter = FakeMutableFile.counter |
---|
9400 | FakeMutableFile.counter += 1 |
---|
9401 | writekey = hashutil.ssk_writekey_hash(str(counter)) |
---|
9402 | hunk ./src/allmydata/test/test_dirnode.py 1354 |
---|
9403 | pass |
---|
9404 | |
---|
9405 | def modify(self, modifier): |
---|
9406 | - self.data = modifier(self.data, None, True) |
---|
9407 | + data = modifier(self.data, None, True) |
---|
9408 | + self.data = data |
---|
9409 | return defer.succeed(None) |
---|
9410 | |
---|
9411 | class FakeNodeMaker(NodeMaker): |
---|
9412 | hunk ./src/allmydata/test/test_filenode.py 98 |
---|
9413 | def _check_segment(res): |
---|
9414 | self.failUnlessEqual(res, DATA[1:1+5]) |
---|
9415 | d.addCallback(_check_segment) |
---|
9416 | + d.addCallback(lambda ignored: fn1.get_best_readable_version()) |
---|
9417 | + d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2)) |
---|
9418 | + d.addCallback(lambda ignored: |
---|
9419 | + fn1.get_size_of_best_version()) |
---|
9420 | + d.addCallback(lambda size: |
---|
9421 | + self.failUnlessEqual(size, len(DATA))) |
---|
9422 | + d.addCallback(lambda ignored: |
---|
9423 | + fn1.download_to_data()) |
---|
9424 | + d.addCallback(lambda data: |
---|
9425 | + self.failUnlessEqual(data, DATA)) |
---|
9426 | + d.addCallback(lambda ignored: |
---|
9427 | + fn1.download_best_version()) |
---|
9428 | + d.addCallback(lambda data: |
---|
9429 | + self.failUnlessEqual(data, DATA)) |
---|
9430 | |
---|
9431 | return d |
---|
9432 | |
---|
9433 | hunk ./src/allmydata/test/test_hung_server.py 10 |
---|
9434 | from allmydata.util.consumer import download_to_data |
---|
9435 | from allmydata.immutable import upload |
---|
9436 | from allmydata.mutable.common import UnrecoverableFileError |
---|
9437 | +from allmydata.mutable.publish import MutableData |
---|
9438 | from allmydata.storage.common import storage_index_to_dir |
---|
9439 | from allmydata.test.no_network import GridTestMixin |
---|
9440 | from allmydata.test.common import ShouldFailMixin |
---|
9441 | hunk ./src/allmydata/test/test_hung_server.py 108 |
---|
9442 | self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()] |
---|
9443 | |
---|
9444 | if mutable: |
---|
9445 | - d = nm.create_mutable_file(mutable_plaintext) |
---|
9446 | + uploadable = MutableData(mutable_plaintext) |
---|
9447 | + d = nm.create_mutable_file(uploadable) |
---|
9448 | def _uploaded_mutable(node): |
---|
9449 | self.uri = node.get_uri() |
---|
9450 | self.shares = self.find_uri_shares(self.uri) |
---|
9451 | hunk ./src/allmydata/test/test_immutable.py 4 |
---|
9452 | from allmydata.test import common |
---|
9453 | from allmydata.interfaces import NotEnoughSharesError |
---|
9454 | from allmydata.util.consumer import download_to_data |
---|
9455 | -from twisted.internet import defer |
---|
9456 | +from twisted.internet import defer, base |
---|
9457 | from twisted.trial import unittest |
---|
9458 | import random |
---|
9459 | |
---|
9460 | hunk ./src/allmydata/test/test_immutable.py 143 |
---|
9461 | d.addCallback(_after_attempt) |
---|
9462 | return d |
---|
9463 | |
---|
9464 | + def test_download_to_data(self): |
---|
9465 | + d = self.n.download_to_data() |
---|
9466 | + d.addCallback(lambda data: |
---|
9467 | + self.failUnlessEqual(data, common.TEST_DATA)) |
---|
9468 | + return d |
---|
9469 | |
---|
9470 | hunk ./src/allmydata/test/test_immutable.py 149 |
---|
9471 | + |
---|
9472 | + def test_download_best_version(self): |
---|
9473 | + d = self.n.download_best_version() |
---|
9474 | + d.addCallback(lambda data: |
---|
9475 | + self.failUnlessEqual(data, common.TEST_DATA)) |
---|
9476 | + return d |
---|
9477 | + |
---|
9478 | + |
---|
9479 | + def test_get_best_readable_version(self): |
---|
9480 | + d = self.n.get_best_readable_version() |
---|
9481 | + d.addCallback(lambda n2: |
---|
9482 | + self.failUnlessEqual(n2, self.n)) |
---|
9483 | + return d |
---|
9484 | + |
---|
9485 | + def test_get_size_of_best_version(self): |
---|
9486 | + d = self.n.get_size_of_best_version() |
---|
9487 | + d.addCallback(lambda size: |
---|
9488 | + self.failUnlessEqual(size, len(common.TEST_DATA))) |
---|
9489 | + return d |
---|
9490 | + |
---|
9491 | + |
---|
9492 | # XXX extend these tests to show bad behavior of various kinds from servers: |
---|
9493 | # raising exception from each remove_foo() method, for example |
---|
9494 | |
---|
9495 | hunk ./src/allmydata/test/test_mutable.py 2 |
---|
9496 | |
---|
9497 | -import struct |
---|
9498 | +import struct, os |
---|
9499 | from cStringIO import StringIO |
---|
9500 | from twisted.trial import unittest |
---|
9501 | from twisted.internet import defer, reactor |
---|
9502 | hunk ./src/allmydata/test/test_mutable.py 8 |
---|
9503 | from allmydata import uri, client |
---|
9504 | from allmydata.nodemaker import NodeMaker |
---|
9505 | -from allmydata.util import base32 |
---|
9506 | +from allmydata.util import base32, consumer, mathutil |
---|
9507 | from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \ |
---|
9508 | ssk_pubkey_fingerprint_hash |
---|
9509 | hunk ./src/allmydata/test/test_mutable.py 11 |
---|
9510 | +from allmydata.util.deferredutil import gatherResults |
---|
9511 | from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \ |
---|
9512 | hunk ./src/allmydata/test/test_mutable.py 13 |
---|
9513 | - NotEnoughSharesError |
---|
9514 | + NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION |
---|
9515 | from allmydata.monitor import Monitor |
---|
9516 | from allmydata.test.common import ShouldFailMixin |
---|
9517 | from allmydata.test.no_network import GridTestMixin |
---|
9518 | hunk ./src/allmydata/test/test_mutable.py 27 |
---|
9519 | NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \ |
---|
9520 | NotEnoughServersError, CorruptShareError |
---|
9521 | from allmydata.mutable.retrieve import Retrieve |
---|
9522 | -from allmydata.mutable.publish import Publish |
---|
9523 | +from allmydata.mutable.publish import Publish, MutableFileHandle, \ |
---|
9524 | + MutableData, \ |
---|
9525 | + DEFAULT_MAX_SEGMENT_SIZE |
---|
9526 | from allmydata.mutable.servermap import ServerMap, ServermapUpdater |
---|
9527 | hunk ./src/allmydata/test/test_mutable.py 31 |
---|
9528 | -from allmydata.mutable.layout import unpack_header, unpack_share |
---|
9529 | +from allmydata.mutable.layout import unpack_header, unpack_share, \ |
---|
9530 | + MDMFSlotReadProxy |
---|
9531 | from allmydata.mutable.repairer import MustForceRepairError |
---|
9532 | |
---|
9533 | import allmydata.test.common_util as testutil |
---|
9534 | hunk ./src/allmydata/test/test_mutable.py 101 |
---|
9535 | self.storage = storage |
---|
9536 | self.queries = 0 |
---|
9537 | def callRemote(self, methname, *args, **kwargs): |
---|
9538 | + self.queries += 1 |
---|
9539 | def _call(): |
---|
9540 | meth = getattr(self, methname) |
---|
9541 | return meth(*args, **kwargs) |
---|
9542 | hunk ./src/allmydata/test/test_mutable.py 108 |
---|
9543 | d = fireEventually() |
---|
9544 | d.addCallback(lambda res: _call()) |
---|
9545 | return d |
---|
9546 | + |
---|
9547 | def callRemoteOnly(self, methname, *args, **kwargs): |
---|
9548 | hunk ./src/allmydata/test/test_mutable.py 110 |
---|
9549 | + self.queries += 1 |
---|
9550 | d = self.callRemote(methname, *args, **kwargs) |
---|
9551 | d.addBoth(lambda ignore: None) |
---|
9552 | pass |
---|
9553 | hunk ./src/allmydata/test/test_mutable.py 158 |
---|
9554 | chr(ord(original[byte_offset]) ^ 0x01) + |
---|
9555 | original[byte_offset+1:]) |
---|
9556 | |
---|
9557 | +def add_two(original, byte_offset): |
---|
9558 | + # Flipping the low bit of the version byte isn't enough, because 1 |
---|
9559 | + # is also a valid version number. XOR with 0x02 (i.e. add two to 0 or 1). |
---|
9560 | + return (original[:byte_offset] + |
---|
9561 | + chr(ord(original[byte_offset]) ^ 0x02) + |
---|
9562 | + original[byte_offset+1:]) |
---|
9563 | + |
---|
9564 | def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0): |
---|
9565 | # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a |
---|
9566 | # list of shnums to corrupt. |
---|
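The reason for a separate add_two helper can be seen by applying both helpers to a version byte. The sketch below restates the two helpers on plain strings and assumes that 0 and 1 are the only valid version bytes; that set is an assumption made for illustration and is not stated in this hunk.

```python
def flip_bit(original, byte_offset):
    # Toggle the lowest bit of the byte at byte_offset.
    return (original[:byte_offset] +
            chr(ord(original[byte_offset]) ^ 0x01) +
            original[byte_offset + 1:])


def add_two(original, byte_offset):
    # Toggle the 0x02 bit; for a byte value of 0 or 1 this adds two.
    return (original[:byte_offset] +
            chr(ord(original[byte_offset]) ^ 0x02) +
            original[byte_offset + 1:])


ASSUMED_VALID_VERSIONS = (0, 1)  # assumption, for illustration only

share = "\x00" + "rest of share"
flipped_version = ord(flip_bit(share, 0)[0])
bumped_version = ord(add_two(share, 0)[0])
```

Under that assumption, flip_bit turns version 0 into version 1, which still looks valid, while add_two always lands outside the valid set.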
9567 | hunk ./src/allmydata/test/test_mutable.py 168 |
---|
9568 | + ds = [] |
---|
9569 | for peerid in s._peers: |
---|
9570 | shares = s._peers[peerid] |
---|
9571 | for shnum in shares: |
---|
9572 | hunk ./src/allmydata/test/test_mutable.py 176 |
---|
9573 | and shnum not in shnums_to_corrupt): |
---|
9574 | continue |
---|
9575 | data = shares[shnum] |
---|
9576 | - (version, |
---|
9577 | - seqnum, |
---|
9578 | - root_hash, |
---|
9579 | - IV, |
---|
9580 | - k, N, segsize, datalen, |
---|
9581 | - o) = unpack_header(data) |
---|
9582 | - if isinstance(offset, tuple): |
---|
9583 | - offset1, offset2 = offset |
---|
9584 | - else: |
---|
9585 | - offset1 = offset |
---|
9586 | - offset2 = 0 |
---|
9587 | - if offset1 == "pubkey": |
---|
9588 | - real_offset = 107 |
---|
9589 | - elif offset1 in o: |
---|
9590 | - real_offset = o[offset1] |
---|
9591 | - else: |
---|
9592 | - real_offset = offset1 |
---|
9593 | - real_offset = int(real_offset) + offset2 + offset_offset |
---|
9594 | - assert isinstance(real_offset, int), offset |
---|
9595 | - shares[shnum] = flip_bit(data, real_offset) |
---|
9596 | - return res |
---|
9597 | + # We're feeding the reader all of the share data, so it |
---|
9598 | + # won't need to use the rref that we didn't provide, nor the |
---|
9599 | + # storage index that we didn't provide. We do this because |
---|
9600 | + # the reader will work for both MDMF and SDMF. |
---|
9601 | + reader = MDMFSlotReadProxy(None, None, shnum, data) |
---|
9602 | + # We need to get the offsets for the next part. |
---|
9603 | + d = reader.get_verinfo() |
---|
9604 | + def _do_corruption(verinfo, data, shnum): |
---|
9605 | + (seqnum, |
---|
9606 | + root_hash, |
---|
9607 | + IV, |
---|
9608 | + segsize, |
---|
9609 | + datalen, |
---|
9610 | + k, n, prefix, o) = verinfo |
---|
9611 | + if isinstance(offset, tuple): |
---|
9612 | + offset1, offset2 = offset |
---|
9613 | + else: |
---|
9614 | + offset1 = offset |
---|
9615 | + offset2 = 0 |
---|
9616 | + if offset1 == "pubkey" and IV: |
---|
9617 | + real_offset = 107 |
---|
9618 | + elif offset1 == "share_data" and not IV: |
---|
9619 | + real_offset = 107 |
---|
9620 | + elif offset1 in o: |
---|
9621 | + real_offset = o[offset1] |
---|
9622 | + else: |
---|
9623 | + real_offset = offset1 |
---|
9624 | + real_offset = int(real_offset) + offset2 + offset_offset |
---|
9625 | + assert isinstance(real_offset, int), offset |
---|
9626 | + if offset1 == 0: # verbyte |
---|
9627 | + f = add_two |
---|
9628 | + else: |
---|
9629 | + f = flip_bit |
---|
9630 | + shares[shnum] = f(data, real_offset) |
---|
9631 | + d.addCallback(_do_corruption, data, shnum) |
---|
9632 | + ds.append(d) |
---|
9633 | + dl = defer.DeferredList(ds) |
---|
9634 | + dl.addCallback(lambda ignored: res) |
---|
9635 | + return dl |
---|
9636 | |
---|
9637 | def make_storagebroker(s=None, num_peers=10): |
---|
9638 | if not s: |
---|
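The offset handling inside _do_corruption can be read as a small pure function. The sketch below restates it outside the Deferred callback; resolve_offset and the sample offsets table are hypothetical, and has_iv stands in for the truthiness test on IV that the patch uses to pick between the two hard-coded 107 cases.

```python
def resolve_offset(offset, offsets, has_iv, offset_offset=0):
    """Turn a symbolic or numeric offset into an absolute byte offset."""
    if isinstance(offset, tuple):
        offset1, offset2 = offset
    else:
        offset1, offset2 = offset, 0
    if offset1 == "pubkey" and has_iv:
        real_offset = 107
    elif offset1 == "share_data" and not has_iv:
        real_offset = 107
    elif offset1 in offsets:
        # Named field: look it up in the share's offset table.
        real_offset = offsets[offset1]
    else:
        # Anything else is treated as a literal byte offset.
        real_offset = offset1
    return int(real_offset) + offset2 + offset_offset


# Hypothetical offset table, not taken from a real share.
sample_offsets = {"signature": 200, "share_data": 500}

named = resolve_offset("signature", sample_offsets, has_iv=True)
nudged = resolve_offset(("share_data", 3), sample_offsets, has_iv=True)
```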
9639 | hunk ./src/allmydata/test/test_mutable.py 257 |
---|
9640 | self.failUnlessEqual(len(shnums), 1) |
---|
9641 | d.addCallback(_created) |
---|
9642 | return d |
---|
9643 | + test_create.timeout = 15 |
---|
9644 | + |
---|
9645 | + |
---|
9646 | + def test_create_mdmf(self): |
---|
9647 | + d = self.nodemaker.create_mutable_file(version=MDMF_VERSION) |
---|
9648 | + def _created(n): |
---|
9649 | + self.failUnless(isinstance(n, MutableFileNode)) |
---|
9650 | + self.failUnlessEqual(n.get_storage_index(), n._storage_index) |
---|
9651 | + sb = self.nodemaker.storage_broker |
---|
9652 | + peer0 = sorted(sb.get_all_serverids())[0] |
---|
9653 | + shnums = self._storage._peers[peer0].keys() |
---|
9654 | + self.failUnlessEqual(len(shnums), 1) |
---|
9655 | + d.addCallback(_created) |
---|
9656 | + return d |
---|
9657 | + |
---|
9658 | |
---|
9659 | def test_serialize(self): |
---|
9660 | n = MutableFileNode(None, None, {"k": 3, "n": 10}, None) |
---|
9661 | hunk ./src/allmydata/test/test_mutable.py 302 |
---|
9662 | d.addCallback(lambda smap: smap.dump(StringIO())) |
---|
9663 | d.addCallback(lambda sio: |
---|
9664 | self.failUnless("3-of-10" in sio.getvalue())) |
---|
9665 | - d.addCallback(lambda res: n.overwrite("contents 1")) |
---|
9666 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 1"))) |
---|
9667 | d.addCallback(lambda res: self.failUnlessIdentical(res, None)) |
---|
9668 | d.addCallback(lambda res: n.download_best_version()) |
---|
9669 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1")) |
---|
9670 | hunk ./src/allmydata/test/test_mutable.py 309 |
---|
9671 | d.addCallback(lambda res: n.get_size_of_best_version()) |
---|
9672 | d.addCallback(lambda size: |
---|
9673 | self.failUnlessEqual(size, len("contents 1"))) |
---|
9674 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
9675 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
9676 | d.addCallback(lambda res: n.download_best_version()) |
---|
9677 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2")) |
---|
9678 | d.addCallback(lambda res: n.get_servermap(MODE_WRITE)) |
---|
9679 | hunk ./src/allmydata/test/test_mutable.py 313 |
---|
9680 | - d.addCallback(lambda smap: n.upload("contents 3", smap)) |
---|
9681 | + d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap)) |
---|
9682 | d.addCallback(lambda res: n.download_best_version()) |
---|
9683 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3")) |
---|
9684 | d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING)) |
---|
9685 | hunk ./src/allmydata/test/test_mutable.py 325 |
---|
9686 | # mapupdate-to-retrieve data caching (i.e. make the shares larger |
---|
9687 | # than the default readsize, which is 2000 bytes). A 15kB file |
---|
9688 | # will have 5kB shares. |
---|
9689 | - d.addCallback(lambda res: n.overwrite("large size file" * 1000)) |
---|
9690 | + d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000))) |
---|
9691 | d.addCallback(lambda res: n.download_best_version()) |
---|
9692 | d.addCallback(lambda res: |
---|
9693 | self.failUnlessEqual(res, "large size file" * 1000)) |
---|
9694 | hunk ./src/allmydata/test/test_mutable.py 333 |
---|
9695 | d.addCallback(_created) |
---|
9696 | return d |
---|
9697 | |
---|
9698 | + |
---|
9699 | + def test_upload_and_download_mdmf(self): |
---|
9700 | + d = self.nodemaker.create_mutable_file(version=MDMF_VERSION) |
---|
9701 | + def _created(n): |
---|
9702 | + d = defer.succeed(None) |
---|
9703 | + d.addCallback(lambda ignored: |
---|
9704 | + n.get_servermap(MODE_READ)) |
---|
9705 | + def _then(servermap): |
---|
9706 | + dumped = servermap.dump(StringIO()) |
---|
9707 | + self.failUnlessIn("3-of-10", dumped.getvalue()) |
---|
9708 | + d.addCallback(_then) |
---|
9709 | + # Now overwrite the contents with some new contents. We want |
---|
9710 | + # to make them big enough to force the file to be uploaded |
---|
9711 | + # in more than one segment. |
---|
9712 | + big_contents = "contents1" * 100000 # about 900 KiB |
---|
9713 | + big_contents_uploadable = MutableData(big_contents) |
---|
9714 | + d.addCallback(lambda ignored: |
---|
9715 | + n.overwrite(big_contents_uploadable)) |
---|
9716 | + d.addCallback(lambda ignored: |
---|
9717 | + n.download_best_version()) |
---|
9718 | + d.addCallback(lambda data: |
---|
9719 | + self.failUnlessEqual(data, big_contents)) |
---|
9720 | + # Overwrite the contents again with some new contents. As |
---|
9721 | + # before, they need to be big enough to force multiple |
---|
9722 | + # segments, so that we make the downloader deal with |
---|
9723 | + # multiple segments. |
---|
9724 | + bigger_contents = "contents2" * 1000000 # about 9MiB |
---|
9725 | + bigger_contents_uploadable = MutableData(bigger_contents) |
---|
9726 | + d.addCallback(lambda ignored: |
---|
9727 | + n.overwrite(bigger_contents_uploadable)) |
---|
9728 | + d.addCallback(lambda ignored: |
---|
9729 | + n.download_best_version()) |
---|
9730 | + d.addCallback(lambda data: |
---|
9731 | + self.failUnlessEqual(data, bigger_contents)) |
---|
9732 | + return d |
---|
9733 | + d.addCallback(_created) |
---|
9734 | + return d |
---|
9735 | + |
---|
9736 | + |
---|
9737 | + def test_mdmf_write_count(self): |
---|
9738 | + # Publishing an MDMF file should only cause one write for each |
---|
9739 | + # share that is to be published. Otherwise, we introduce |
---|
9740 | + # undesirable semantics that are a regression from SDMF. |
---|
9741 | + upload = MutableData("MDMF" * 100000) # about 400 KiB |
---|
9742 | + d = self.nodemaker.create_mutable_file(upload, |
---|
9743 | + version=MDMF_VERSION) |
---|
9744 | + def _check_server_write_counts(ignored): |
---|
9745 | + sb = self.nodemaker.storage_broker |
---|
9746 | + peers = sb.test_servers.values() |
---|
9747 | + for peer in peers: |
---|
9748 | + self.failUnlessEqual(peer.queries, 1) |
---|
9749 | + d.addCallback(_check_server_write_counts) |
---|
9750 | + return d |
---|
9751 | + |
---|
9752 | + |
---|
9753 | def test_create_with_initial_contents(self): |
---|
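test_mdmf_write_count relies on each fake server counting the remote calls it receives. A minimal sketch of that idea follows, using hypothetical names (CountingServer, call_remote, write_share, publish) and a synchronous call in place of the Deferred-based callRemote used by the real fake.

```python
class CountingServer(object):
    """Hypothetical fake server that counts every remote call."""

    def __init__(self):
        self.queries = 0
        self.shares = {}

    def call_remote(self, methname, *args, **kwargs):
        self.queries += 1
        return getattr(self, methname)(*args, **kwargs)

    def write_share(self, shnum, data):
        self.shares[shnum] = data


def publish(servers, shares):
    # One write per share: each server receives exactly one call.
    for server, (shnum, data) in zip(servers, shares):
        server.call_remote("write_share", shnum, data)


servers = [CountingServer() for _ in range(3)]
publish(servers, [(0, "aaa"), (1, "bbb"), (2, "ccc")])
query_counts = [s.queries for s in servers]
```

A publisher that wrote each segment with its own call would push these counters above one, which is the regression the test guards against.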
9754 | hunk ./src/allmydata/test/test_mutable.py 389 |
---|
9755 | - d = self.nodemaker.create_mutable_file("contents 1") |
---|
9756 | + upload1 = MutableData("contents 1") |
---|
9757 | + d = self.nodemaker.create_mutable_file(upload1) |
---|
9758 | def _created(n): |
---|
9759 | d = n.download_best_version() |
---|
9760 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1")) |
---|
9761 | hunk ./src/allmydata/test/test_mutable.py 394 |
---|
9762 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
9763 | + upload2 = MutableData("contents 2") |
---|
9764 | + d.addCallback(lambda res: n.overwrite(upload2)) |
---|
9765 | d.addCallback(lambda res: n.download_best_version()) |
---|
9766 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2")) |
---|
9767 | return d |
---|
9768 | hunk ./src/allmydata/test/test_mutable.py 401 |
---|
9769 | d.addCallback(_created) |
---|
9770 | return d |
---|
9771 | + test_create_with_initial_contents.timeout = 15 |
---|
9772 | + |
---|
9773 | + |
---|
9774 | + def test_create_mdmf_with_initial_contents(self): |
---|
9775 | + initial_contents = "foobarbaz" * 131072 # 9 * 128 KiB = 1152 KiB |
---|
9776 | + initial_contents_uploadable = MutableData(initial_contents) |
---|
9777 | + d = self.nodemaker.create_mutable_file(initial_contents_uploadable, |
---|
9778 | + version=MDMF_VERSION) |
---|
9779 | + def _created(n): |
---|
9780 | + d = n.download_best_version() |
---|
9781 | + d.addCallback(lambda data: |
---|
9782 | + self.failUnlessEqual(data, initial_contents)) |
---|
9783 | + uploadable2 = MutableData(initial_contents + "foobarbaz") |
---|
9784 | + d.addCallback(lambda ignored: |
---|
9785 | + n.overwrite(uploadable2)) |
---|
9786 | + d.addCallback(lambda ignored: |
---|
9787 | + n.download_best_version()) |
---|
9788 | + d.addCallback(lambda data: |
---|
9789 | + self.failUnlessEqual(data, initial_contents + |
---|
9790 | + "foobarbaz")) |
---|
9791 | + return d |
---|
9792 | + d.addCallback(_created) |
---|
9793 | + return d |
---|
9794 | + test_create_mdmf_with_initial_contents.timeout = 20 |
---|
9795 | + |
---|
9796 | |
---|
9797 | def test_create_with_initial_contents_function(self): |
---|
9798 | data = "initial contents" |
---|
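The MDMF tests above pick payload sizes that are meant to span several segments. The arithmetic can be checked with a ceiling division. The segment size used here, 128 KiB, is an assumption for illustration; the value of DEFAULT_MAX_SEGMENT_SIZE is not shown in this section.

```python
def div_ceil(n, d):
    """Smallest integer >= n / d, for positive integers."""
    return (n + d - 1) // d


ASSUMED_SEGMENT_SIZE = 128 * 1024  # assumption, not taken from the patch

# Payload lengths in bytes, computed from the literals used in the tests.
payload_sizes = {
    "contents1 * 100000": 9 * 100000,
    "contents2 * 1000000": 9 * 1000000,
    "MDMF * 100000": 4 * 100000,
    "foobarbaz * 131072": 9 * 131072,
}

segment_counts = dict(
    (name, div_ceil(size, ASSUMED_SEGMENT_SIZE))
    for name, size in payload_sizes.items()
)
```

Under that assumption every payload needs more than one segment, and appending one more "foobarbaz" to the last payload pushes it from an exact multiple of the segment size into an extra, mostly empty segment.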
9799 | hunk ./src/allmydata/test/test_mutable.py 434 |
---|
9800 | key = n.get_writekey() |
---|
9801 | self.failUnless(isinstance(key, str), key) |
---|
9802 | self.failUnlessEqual(len(key), 16) # AES key size |
---|
9803 | - return data |
---|
9804 | + return MutableData(data) |
---|
9805 | d = self.nodemaker.create_mutable_file(_make_contents) |
---|
9806 | def _created(n): |
---|
9807 | return n.download_best_version() |
---|
9808 | hunk ./src/allmydata/test/test_mutable.py 442 |
---|
9809 | d.addCallback(lambda data2: self.failUnlessEqual(data2, data)) |
---|
9810 | return d |
---|
9811 | |
---|
9812 | + |
---|
9813 | + def test_create_mdmf_with_initial_contents_function(self): |
---|
9814 | + data = "initial contents" * 100000 |
---|
9815 | + def _make_contents(n): |
---|
9816 | + self.failUnless(isinstance(n, MutableFileNode)) |
---|
9817 | + key = n.get_writekey() |
---|
9818 | + self.failUnless(isinstance(key, str), key) |
---|
9819 | + self.failUnlessEqual(len(key), 16) |
---|
9820 | + return MutableData(data) |
---|
9821 | + d = self.nodemaker.create_mutable_file(_make_contents, |
---|
9822 | + version=MDMF_VERSION) |
---|
9823 | + d.addCallback(lambda n: |
---|
9824 | + n.download_best_version()) |
---|
9825 | + d.addCallback(lambda data2: |
---|
9826 | + self.failUnlessEqual(data2, data)) |
---|
9827 | + return d |
---|
9828 | + |
---|
9829 | + |
---|
9830 | def test_create_with_too_large_contents(self): |
---|
9831 | BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1) |
---|
9832 | hunk ./src/allmydata/test/test_mutable.py 462 |
---|
9833 | - d = self.nodemaker.create_mutable_file(BIG) |
---|
9834 | + BIG_uploadable = MutableData(BIG) |
---|
9835 | + d = self.nodemaker.create_mutable_file(BIG_uploadable) |
---|
9836 | def _created(n): |
---|
9837 | hunk ./src/allmydata/test/test_mutable.py 465 |
---|
9838 | - d = n.overwrite(BIG) |
---|
9839 | + other_BIG_uploadable = MutableData(BIG) |
---|
9840 | + d = n.overwrite(other_BIG_uploadable) |
---|
9841 | return d |
---|
9842 | d.addCallback(_created) |
---|
9843 | return d |
---|
9844 | hunk ./src/allmydata/test/test_mutable.py 480 |
---|
9845 | |
---|
9846 | def test_modify(self): |
---|
9847 | def _modifier(old_contents, servermap, first_time): |
---|
9848 | - return old_contents + "line2" |
---|
9849 | + new_contents = old_contents + "line2" |
---|
9850 | + return new_contents |
---|
9851 | def _non_modifier(old_contents, servermap, first_time): |
---|
9852 | return old_contents |
---|
9853 | def _none_modifier(old_contents, servermap, first_time): |
---|
9854 | hunk ./src/allmydata/test/test_mutable.py 489 |
---|
9855 | def _error_modifier(old_contents, servermap, first_time): |
---|
9856 | raise ValueError("oops") |
---|
9857 | def _toobig_modifier(old_contents, servermap, first_time): |
---|
9858 | - return "b" * (self.OLD_MAX_SEGMENT_SIZE+1) |
---|
9859 | + new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1) |
---|
9860 | + return new_content |
---|
9861 | calls = [] |
---|
9862 | def _ucw_error_modifier(old_contents, servermap, first_time): |
---|
9863 | # simulate an UncoordinatedWriteError once |
---|
9864 | hunk ./src/allmydata/test/test_mutable.py 497 |
---|
9865 | calls.append(1) |
---|
9866 | if len(calls) <= 1: |
---|
9867 | raise UncoordinatedWriteError("simulated") |
---|
9868 | - return old_contents + "line3" |
---|
9869 | + new_contents = old_contents + "line3" |
---|
9870 | + return new_contents |
---|
9871 | def _ucw_error_non_modifier(old_contents, servermap, first_time): |
---|
9872 | # simulate an UncoordinatedWriteError once, and don't actually |
---|
9873 | # modify the contents on subsequent invocations |
---|
9874 | hunk ./src/allmydata/test/test_mutable.py 507 |
---|
9875 | raise UncoordinatedWriteError("simulated") |
---|
9876 | return old_contents |
---|
9877 | |
---|
9878 | - d = self.nodemaker.create_mutable_file("line1") |
---|
9879 | + initial_contents = "line1" |
---|
9880 | + d = self.nodemaker.create_mutable_file(MutableData(initial_contents)) |
---|
9881 | def _created(n): |
---|
9882 | d = n.modify(_modifier) |
---|
9883 | d.addCallback(lambda res: n.download_best_version()) |
---|
9884 | hunk ./src/allmydata/test/test_mutable.py 565 |
---|
9885 | return d |
---|
9886 | d.addCallback(_created) |
---|
9887 | return d |
---|
9888 | + test_modify.timeout = 15 |
---|
9889 | + |
---|
9890 | |
---|
9891 | def test_modify_backoffer(self): |
---|
9892 | def _modifier(old_contents, servermap, first_time): |
---|
9893 | hunk ./src/allmydata/test/test_mutable.py 592 |
---|
9894 | giveuper._delay = 0.1 |
---|
9895 | giveuper.factor = 1 |
---|
9896 | |
---|
9897 | - d = self.nodemaker.create_mutable_file("line1") |
---|
9898 | + d = self.nodemaker.create_mutable_file(MutableData("line1")) |
---|
9899 | def _created(n): |
---|
9900 | d = n.modify(_modifier) |
---|
9901 | d.addCallback(lambda res: n.download_best_version()) |
---|
9902 | hunk ./src/allmydata/test/test_mutable.py 642 |
---|
9903 | d.addCallback(lambda smap: smap.dump(StringIO())) |
---|
9904 | d.addCallback(lambda sio: |
---|
9905 | self.failUnless("3-of-10" in sio.getvalue())) |
---|
9906 | - d.addCallback(lambda res: n.overwrite("contents 1")) |
---|
9907 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 1"))) |
---|
9908 | d.addCallback(lambda res: self.failUnlessIdentical(res, None)) |
---|
9909 | d.addCallback(lambda res: n.download_best_version()) |
---|
9910 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1")) |
---|
9911 | hunk ./src/allmydata/test/test_mutable.py 646 |
---|
9912 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
9913 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
9914 | d.addCallback(lambda res: n.download_best_version()) |
---|
9915 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2")) |
---|
9916 | d.addCallback(lambda res: n.get_servermap(MODE_WRITE)) |
---|
9917 | hunk ./src/allmydata/test/test_mutable.py 650 |
---|
9918 | - d.addCallback(lambda smap: n.upload("contents 3", smap)) |
---|
9919 | + d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap)) |
---|
9920 | d.addCallback(lambda res: n.download_best_version()) |
---|
9921 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3")) |
---|
9922 | d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING)) |
---|
9923 | hunk ./src/allmydata/test/test_mutable.py 663 |
---|
9924 | return d |
---|
9925 | |
---|
9926 | |
---|
9927 | -class MakeShares(unittest.TestCase): |
---|
9928 | - def test_encrypt(self): |
---|
9929 | - nm = make_nodemaker() |
---|
9930 | - CONTENTS = "some initial contents" |
---|
9931 | - d = nm.create_mutable_file(CONTENTS) |
---|
9932 | - def _created(fn): |
---|
9933 | - p = Publish(fn, nm.storage_broker, None) |
---|
9934 | - p.salt = "SALT" * 4 |
---|
9935 | - p.readkey = "\x00" * 16 |
---|
9936 | - p.newdata = CONTENTS |
---|
9937 | - p.required_shares = 3 |
---|
9938 | - p.total_shares = 10 |
---|
9939 | - p.setup_encoding_parameters() |
---|
9940 | - return p._encrypt_and_encode() |
---|
9941 | +class PublishMixin: |
---|
9942 | + def publish_one(self): |
---|
9943 | + # publish a file and create shares, which can then be manipulated |
---|
9944 | + # later. |
---|
9945 | + self.CONTENTS = "New contents go here" * 1000 |
---|
9946 | + self.uploadable = MutableData(self.CONTENTS) |
---|
9947 | + self._storage = FakeStorage() |
---|
9948 | + self._nodemaker = make_nodemaker(self._storage) |
---|
9949 | + self._storage_broker = self._nodemaker.storage_broker |
---|
9950 | + d = self._nodemaker.create_mutable_file(self.uploadable) |
---|
9951 | + def _created(node): |
---|
9952 | + self._fn = node |
---|
9953 | + self._fn2 = self._nodemaker.create_from_cap(node.get_uri()) |
---|
9954 | d.addCallback(_created) |
---|
9955 | hunk ./src/allmydata/test/test_mutable.py 677 |
---|
9956 | - def _done(shares_and_shareids): |
---|
9957 | - (shares, share_ids) = shares_and_shareids |
---|
9958 | - self.failUnlessEqual(len(shares), 10) |
---|
9959 | - for sh in shares: |
---|
9960 | - self.failUnless(isinstance(sh, str)) |
---|
9961 | - self.failUnlessEqual(len(sh), 7) |
---|
9962 | - self.failUnlessEqual(len(share_ids), 10) |
---|
9963 | - d.addCallback(_done) |
---|
9964 | return d |
---|
9965 | |
---|
9966 | hunk ./src/allmydata/test/test_mutable.py 679 |
---|
9967 | - def test_generate(self): |
---|
9968 | - nm = make_nodemaker() |
---|
9969 | - CONTENTS = "some initial contents" |
---|
9970 | - d = nm.create_mutable_file(CONTENTS) |
---|
9971 | - def _created(fn): |
---|
9972 | - self._fn = fn |
---|
9973 | - p = Publish(fn, nm.storage_broker, None) |
---|
9974 | - self._p = p |
---|
9975 | - p.newdata = CONTENTS |
---|
9976 | - p.required_shares = 3 |
---|
9977 | - p.total_shares = 10 |
---|
9978 | - p.setup_encoding_parameters() |
---|
9979 | - p._new_seqnum = 3 |
---|
9980 | - p.salt = "SALT" * 4 |
---|
9981 | - # make some fake shares |
---|
9982 | - shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) ) |
---|
9983 | - p._privkey = fn.get_privkey() |
---|
9984 | - p._encprivkey = fn.get_encprivkey() |
---|
9985 | - p._pubkey = fn.get_pubkey() |
---|
9986 | - return p._generate_shares(shares_and_ids) |
---|
9987 | + def publish_mdmf(self): |
---|
9988 | + # like publish_one, except that the result is guaranteed to be |
---|
9989 | + # an MDMF file. |
---|
9990 | + # self.CONTENTS should have more than one segment. |
---|
9991 | + self.CONTENTS = "This is an MDMF file" * 100000 |
---|
9992 | + self.uploadable = MutableData(self.CONTENTS) |
---|
9993 | + self._storage = FakeStorage() |
---|
9994 | + self._nodemaker = make_nodemaker(self._storage) |
---|
9995 | + self._storage_broker = self._nodemaker.storage_broker |
---|
9996 | + d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION) |
---|
9997 | + def _created(node): |
---|
9998 | + self._fn = node |
---|
9999 | + self._fn2 = self._nodemaker.create_from_cap(node.get_uri()) |
---|
10000 | d.addCallback(_created) |
---|
10001 | hunk ./src/allmydata/test/test_mutable.py 693 |
---|
10002 | - def _generated(res): |
---|
10003 | - p = self._p |
---|
10004 | - final_shares = p.shares |
---|
10005 | - root_hash = p.root_hash |
---|
10006 | - self.failUnlessEqual(len(root_hash), 32) |
---|
10007 | - self.failUnless(isinstance(final_shares, dict)) |
---|
10008 | - self.failUnlessEqual(len(final_shares), 10) |
---|
10009 | - self.failUnlessEqual(sorted(final_shares.keys()), range(10)) |
---|
10010 | - for i,sh in final_shares.items(): |
---|
10011 | - self.failUnless(isinstance(sh, str)) |
---|
10012 | - # feed the share through the unpacker as a sanity-check |
---|
10013 | - pieces = unpack_share(sh) |
---|
10014 | - (u_seqnum, u_root_hash, IV, k, N, segsize, datalen, |
---|
10015 | - pubkey, signature, share_hash_chain, block_hash_tree, |
---|
10016 | - share_data, enc_privkey) = pieces |
---|
10017 | - self.failUnlessEqual(u_seqnum, 3) |
---|
10018 | - self.failUnlessEqual(u_root_hash, root_hash) |
---|
10019 | - self.failUnlessEqual(k, 3) |
---|
10020 | - self.failUnlessEqual(N, 10) |
---|
10021 | - self.failUnlessEqual(segsize, 21) |
---|
10022 | - self.failUnlessEqual(datalen, len(CONTENTS)) |
---|
10023 | - self.failUnlessEqual(pubkey, p._pubkey.serialize()) |
---|
10024 | - sig_material = struct.pack(">BQ32s16s BBQQ", |
---|
10025 | - 0, p._new_seqnum, root_hash, IV, |
---|
10026 | - k, N, segsize, datalen) |
---|
10027 | - self.failUnless(p._pubkey.verify(sig_material, signature)) |
---|
10028 | - #self.failUnlessEqual(signature, p._privkey.sign(sig_material)) |
---|
10029 | - self.failUnless(isinstance(share_hash_chain, dict)) |
---|
10030 | - self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++ |
---|
10031 | - for shnum,share_hash in share_hash_chain.items(): |
---|
10032 | - self.failUnless(isinstance(shnum, int)) |
---|
10033 | - self.failUnless(isinstance(share_hash, str)) |
---|
10034 | - self.failUnlessEqual(len(share_hash), 32) |
---|
10035 | - self.failUnless(isinstance(block_hash_tree, list)) |
---|
10036 | - self.failUnlessEqual(len(block_hash_tree), 1) # very small tree |
---|
10037 | - self.failUnlessEqual(IV, "SALT"*4) |
---|
10038 | - self.failUnlessEqual(len(share_data), len("%07d" % 1)) |
---|
10039 | - self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey()) |
---|
10040 | - d.addCallback(_generated) |
---|
10041 | return d |
---|
10042 | |
---|
10043 | hunk ./src/allmydata/test/test_mutable.py 695 |
---|
10044 | - # TODO: when we publish to 20 peers, we should get one share per peer on 10 |
---|
10045 | - # when we publish to 3 peers, we should get either 3 or 4 shares per peer |
---|
10046 | - # when we publish to zero peers, we should get a NotEnoughSharesError |
---|
10047 | |
---|
10048 | hunk ./src/allmydata/test/test_mutable.py 696 |
---|
10049 | -class PublishMixin: |
---|
10050 | - def publish_one(self): |
---|
10051 | - # publish a file and create shares, which can then be manipulated |
---|
10052 | - # later. |
---|
10053 | - self.CONTENTS = "New contents go here" * 1000 |
---|
10054 | + def publish_sdmf(self): |
---|
10055 | + # like publish_one, except that the result is guaranteed to be |
---|
10056 | + # an SDMF file |
---|
10057 | + self.CONTENTS = "This is an SDMF file" * 1000 |
---|
10058 | + self.uploadable = MutableData(self.CONTENTS) |
---|
10059 | self._storage = FakeStorage() |
---|
10060 | self._nodemaker = make_nodemaker(self._storage) |
---|
10061 | self._storage_broker = self._nodemaker.storage_broker |
---|
10062 | hunk ./src/allmydata/test/test_mutable.py 704 |
---|
10063 | - d = self._nodemaker.create_mutable_file(self.CONTENTS) |
---|
10064 | + d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION) |
---|
10065 | def _created(node): |
---|
10066 | self._fn = node |
---|
10067 | self._fn2 = self._nodemaker.create_from_cap(node.get_uri()) |
---|
10068 | hunk ./src/allmydata/test/test_mutable.py 711 |
---|
10069 | d.addCallback(_created) |
---|
10070 | return d |
---|
10071 | |
---|
10072 | - def publish_multiple(self): |
---|
10073 | + |
---|
10074 | + def publish_multiple(self, version=0): |
---|
10075 | self.CONTENTS = ["Contents 0", |
---|
10076 | "Contents 1", |
---|
10077 | "Contents 2", |
---|
10078 | hunk ./src/allmydata/test/test_mutable.py 718 |
---|
10079 | "Contents 3a", |
---|
10080 | "Contents 3b"] |
---|
10081 | + self.uploadables = [MutableData(d) for d in self.CONTENTS] |
---|
10082 | self._copied_shares = {} |
---|
10083 | self._storage = FakeStorage() |
---|
10084 | self._nodemaker = make_nodemaker(self._storage) |
---|
10085 | hunk ./src/allmydata/test/test_mutable.py 722 |
---|
10086 | - d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1 |
---|
10087 | + d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1 |
---|
10088 | def _created(node): |
---|
10089 | self._fn = node |
---|
10090 | # now create multiple versions of the same file, and accumulate |
---|
10091 | hunk ./src/allmydata/test/test_mutable.py 729 |
---|
10092 | # their shares, so we can mix and match them later. |
---|
10093 | d = defer.succeed(None) |
---|
10094 | d.addCallback(self._copy_shares, 0) |
---|
10095 | - d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2 |
---|
10096 | + d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2 |
---|
10097 | d.addCallback(self._copy_shares, 1) |
---|
10098 | hunk ./src/allmydata/test/test_mutable.py 731 |
---|
10099 | - d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3 |
---|
10100 | + d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3 |
---|
10101 | d.addCallback(self._copy_shares, 2) |
---|
10102 | hunk ./src/allmydata/test/test_mutable.py 733 |
---|
10103 | - d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a |
---|
10104 | + d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a |
---|
10105 | d.addCallback(self._copy_shares, 3) |
---|
10106 | # now we replace all the shares with version s3, and upload a new |
---|
10107 | # version to get s4b. |
---|
10108 | hunk ./src/allmydata/test/test_mutable.py 739 |
---|
10109 | rollback = dict([(i,2) for i in range(10)]) |
---|
10110 | d.addCallback(lambda res: self._set_versions(rollback)) |
---|
10111 | - d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b |
---|
10112 | + d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b |
---|
10113 | d.addCallback(self._copy_shares, 4) |
---|
10114 | # we leave the storage in state 4 |
---|
10115 | return d |
---|
10116 | hunk ./src/allmydata/test/test_mutable.py 746 |
---|
10117 | d.addCallback(_created) |
---|
10118 | return d |
---|
10119 | |
---|
10120 | + |
---|
10121 | def _copy_shares(self, ignored, index): |
---|
10122 | shares = self._storage._peers |
---|
10123 | # we need a deep copy |
---|
10124 | hunk ./src/allmydata/test/test_mutable.py 770 |
---|
10125 | shares[peerid][shnum] = oldshares[index][peerid][shnum] |
---|
10126 | |
---|
10127 | |
---|
10128 | + |
---|
10129 | + |
---|
10130 | class Servermap(unittest.TestCase, PublishMixin): |
---|
10131 | def setUp(self): |
---|
10132 | return self.publish_one() |
---|
10133 | hunk ./src/allmydata/test/test_mutable.py 776 |
---|
10134 | |
---|
10135 | - def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None): |
---|
10136 | + def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None, |
---|
10137 | + update_range=None): |
---|
10138 | if fn is None: |
---|
10139 | fn = self._fn |
---|
10140 | if sb is None: |
---|
10141 | hunk ./src/allmydata/test/test_mutable.py 783 |
---|
10142 | sb = self._storage_broker |
---|
10143 | smu = ServermapUpdater(fn, sb, Monitor(), |
---|
10144 | - ServerMap(), mode) |
---|
10145 | + ServerMap(), mode, update_range=update_range) |
---|
10146 | d = smu.update() |
---|
10147 | return d |
---|
10148 | |
---|
10149 | hunk ./src/allmydata/test/test_mutable.py 849 |
---|
10150 | # create a new file, which is large enough to knock the privkey out |
---|
10151 | # of the early part of the file |
---|
10152 | LARGE = "These are Larger contents" * 200 # about 5KB |
---|
10153 | - d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE)) |
---|
10154 | + LARGE_uploadable = MutableData(LARGE) |
---|
10155 | + d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable)) |
---|
10156 | def _created(large_fn): |
---|
10157 | large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri()) |
---|
10158 | return self.make_servermap(MODE_WRITE, large_fn2) |
---|
10159 | hunk ./src/allmydata/test/test_mutable.py 858 |
---|
10160 | d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10)) |
---|
10161 | return d |
---|
10162 | |
---|
10163 | + |
---|
10164 | def test_mark_bad(self): |
---|
10165 | d = defer.succeed(None) |
---|
10166 | ms = self.make_servermap |
---|
10167 | hunk ./src/allmydata/test/test_mutable.py 904 |
---|
10168 | self._storage._peers = {} # delete all shares |
---|
10169 | ms = self.make_servermap |
---|
10170 | d = defer.succeed(None) |
---|
10171 | - |
---|
10172 | +# |
---|
10173 | d.addCallback(lambda res: ms(mode=MODE_CHECK)) |
---|
10174 | d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm)) |
---|
10175 | |
---|
10176 | hunk ./src/allmydata/test/test_mutable.py 956 |
---|
10177 | return d |
---|
10178 | |
---|
10179 | |
---|
10180 | + def test_servermapupdater_finds_mdmf_files(self): |
---|
10181 | + # setUp published a file with publish_one(); publish_mdmf() below |
---|
10182 | + # replaces it with an MDMF file. We just need to make sure that when we |
---|
10183 | + # run the ServermapUpdater, the file is reported to have one recoverable version. |
---|
10184 | + d = defer.succeed(None) |
---|
10185 | + d.addCallback(lambda ignored: |
---|
10186 | + self.publish_mdmf()) |
---|
10187 | + d.addCallback(lambda ignored: |
---|
10188 | + self.make_servermap(mode=MODE_CHECK)) |
---|
10189 | + # Calling make_servermap also updates the servermap in the mode |
---|
10190 | + # that we specify, so we just need to see what it says. |
---|
10191 | + def _check_servermap(sm): |
---|
10192 | + self.failUnlessEqual(len(sm.recoverable_versions()), 1) |
---|
10193 | + d.addCallback(_check_servermap) |
---|
10194 | + return d |
---|
10195 | + |
---|
10196 | + |
---|
10197 | + def test_fetch_update(self): |
---|
10198 | + d = defer.succeed(None) |
---|
10199 | + d.addCallback(lambda ignored: |
---|
10200 | + self.publish_mdmf()) |
---|
10201 | + d.addCallback(lambda ignored: |
---|
10202 | + self.make_servermap(mode=MODE_WRITE, update_range=(1, 2))) |
---|
10203 | + def _check_servermap(sm): |
---|
10204 | + # 10 shares |
---|
10205 | + self.failUnlessEqual(len(sm.update_data), 10) |
---|
10206 | + # one version |
---|
10207 | + for data in sm.update_data.itervalues(): |
---|
10208 | + self.failUnlessEqual(len(data), 1) |
---|
10209 | + d.addCallback(_check_servermap) |
---|
10210 | + return d |
---|
10211 | + |
---|
10212 | + |
---|
10213 | + def test_servermapupdater_finds_sdmf_files(self): |
---|
10214 | + d = defer.succeed(None) |
---|
10215 | + d.addCallback(lambda ignored: |
---|
10216 | + self.publish_sdmf()) |
---|
10217 | + d.addCallback(lambda ignored: |
---|
10218 | + self.make_servermap(mode=MODE_CHECK)) |
---|
10219 | + d.addCallback(lambda servermap: |
---|
10220 | + self.failUnlessEqual(len(servermap.recoverable_versions()), 1)) |
---|
10221 | + return d |
---|
10222 | + |
---|
10223 | |
---|
10224 | class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin): |
---|
10225 | def setUp(self): |
---|
10226 | hunk ./src/allmydata/test/test_mutable.py 1039 |
---|
10227 | if version is None: |
---|
10228 | version = servermap.best_recoverable_version() |
---|
10229 | r = Retrieve(self._fn, servermap, version) |
---|
10230 | - return r.download() |
---|
10231 | + c = consumer.MemoryConsumer() |
---|
10232 | + d = r.download(consumer=c) |
---|
10233 | + d.addCallback(lambda mc: "".join(mc.chunks)) |
---|
10234 | + return d |
---|
10235 | + |
---|
10236 | |
---|
10237 | def test_basic(self): |
---|
10238 | d = self.make_servermap() |
---|
10239 | hunk ./src/allmydata/test/test_mutable.py 1120 |
---|
10240 | return d |
---|
10241 | test_no_servers_download.timeout = 15 |
---|
10242 | |
---|
10243 | + |
---|
10244 | def _test_corrupt_all(self, offset, substring, |
---|
10245 | hunk ./src/allmydata/test/test_mutable.py 1122 |
---|
10246 | - should_succeed=False, corrupt_early=True, |
---|
10247 | - failure_checker=None): |
---|
10248 | + should_succeed=False, |
---|
10249 | + corrupt_early=True, |
---|
10250 | + failure_checker=None, |
---|
10251 | + fetch_privkey=False): |
---|
10252 | d = defer.succeed(None) |
---|
10253 | if corrupt_early: |
---|
10254 | d.addCallback(corrupt, self._storage, offset) |
---|
10255 | hunk ./src/allmydata/test/test_mutable.py 1142 |
---|
10256 | self.failUnlessIn(substring, "".join(allproblems)) |
---|
10257 | return servermap |
---|
10258 | if should_succeed: |
---|
10259 | - d1 = self._fn.download_version(servermap, ver) |
---|
10260 | + d1 = self._fn.download_version(servermap, ver, |
---|
10261 | + fetch_privkey) |
---|
10262 | d1.addCallback(lambda new_contents: |
---|
10263 | self.failUnlessEqual(new_contents, self.CONTENTS)) |
---|
10264 | else: |
---|
10265 | hunk ./src/allmydata/test/test_mutable.py 1150 |
---|
10266 | d1 = self.shouldFail(NotEnoughSharesError, |
---|
10267 | "_corrupt_all(offset=%s)" % (offset,), |
---|
10268 | substring, |
---|
10269 | - self._fn.download_version, servermap, ver) |
---|
10270 | + self._fn.download_version, servermap, |
---|
10271 | + ver, |
---|
10272 | + fetch_privkey) |
---|
10273 | if failure_checker: |
---|
10274 | d1.addCallback(failure_checker) |
---|
10275 | d1.addCallback(lambda res: servermap) |
---|
10276 | hunk ./src/allmydata/test/test_mutable.py 1161 |
---|
10277 | return d |
---|
10278 | |
---|
10279 | def test_corrupt_all_verbyte(self): |
---|
10280 | - # when the version byte is not 0, we hit an UnknownVersionError error |
---|
10281 | - # in unpack_share(). |
---|
10282 | + # when the version byte is not 0 or 1, we hit an UnknownVersionError |
---|
10283 | + # in unpack_share(). |
---|
10284 | d = self._test_corrupt_all(0, "UnknownVersionError") |
---|
10285 | def _check_servermap(servermap): |
---|
10286 | # and the dump should mention the problems |
---|
10287 | hunk ./src/allmydata/test/test_mutable.py 1168 |
---|
10288 | s = StringIO() |
---|
10289 | dump = servermap.dump(s).getvalue() |
---|
10290 | - self.failUnless("10 PROBLEMS" in dump, dump) |
---|
10291 | + self.failUnless("30 PROBLEMS" in dump, dump) |
---|
10292 | d.addCallback(_check_servermap) |
---|
10293 | return d |
---|
10294 | |
---|
10295 | hunk ./src/allmydata/test/test_mutable.py 1238 |
---|
10296 | return self._test_corrupt_all("enc_privkey", None, should_succeed=True) |
---|
10297 | |
---|
10298 | |
---|
10299 | + def test_corrupt_all_encprivkey_late(self): |
---|
10300 | + # this should work for the same reason as above, but we corrupt |
---|
10301 | + # after the servermap update to exercise the error handling |
---|
10302 | + # code. |
---|
10303 | + # We need to remove the privkey from the node, or the retrieve |
---|
10304 | + # process won't know to update it. |
---|
10305 | + self._fn._privkey = None |
---|
10306 | + return self._test_corrupt_all("enc_privkey", |
---|
10307 | + None, # this shouldn't fail |
---|
10308 | + should_succeed=True, |
---|
10309 | + corrupt_early=False, |
---|
10310 | + fetch_privkey=True) |
---|
10311 | + |
---|
10312 | + |
---|
10313 | def test_corrupt_all_seqnum_late(self): |
---|
10314 | # corrupting the seqnum between mapupdate and retrieve should result |
---|
10315 | # in NotEnoughSharesError, since each share will look invalid |
---|
10316 | hunk ./src/allmydata/test/test_mutable.py 1258 |
---|
10317 | def _check(res): |
---|
10318 | f = res[0] |
---|
10319 | self.failUnless(f.check(NotEnoughSharesError)) |
---|
10320 | - self.failUnless("someone wrote to the data since we read the servermap" in str(f)) |
---|
10321 | + self.failUnless("uncoordinated write" in str(f)) |
---|
10322 | return self._test_corrupt_all(1, "ran out of peers", |
---|
10323 | corrupt_early=False, |
---|
10324 | failure_checker=_check) |
---|
10325 | hunk ./src/allmydata/test/test_mutable.py 1302 |
---|
10326 | in str(servermap.problems[0])) |
---|
10327 | ver = servermap.best_recoverable_version() |
---|
10328 | r = Retrieve(self._fn, servermap, ver) |
---|
10329 | - return r.download() |
---|
10330 | + c = consumer.MemoryConsumer() |
---|
10331 | + return r.download(c) |
---|
10332 | d.addCallback(_do_retrieve) |
---|
10333 | hunk ./src/allmydata/test/test_mutable.py 1305 |
---|
10334 | + d.addCallback(lambda mc: "".join(mc.chunks)) |
---|
10335 | d.addCallback(lambda new_contents: |
---|
10336 | self.failUnlessEqual(new_contents, self.CONTENTS)) |
---|
10337 | return d |
---|
10338 | hunk ./src/allmydata/test/test_mutable.py 1310 |
---|
10339 | |
---|
10340 | - def test_corrupt_some(self): |
---|
10341 | - # corrupt the data of first five shares (so the servermap thinks |
---|
10342 | - # they're good but retrieve marks them as bad), so that the |
---|
10343 | - # MODE_READ set of 6 will be insufficient, forcing node.download to |
---|
10344 | - # retry with more servers. |
---|
10345 | - corrupt(None, self._storage, "share_data", range(5)) |
---|
10346 | - d = self.make_servermap() |
---|
10347 | + |
---|
10348 | + def _test_corrupt_some(self, offset, mdmf=False): |
---|
10349 | + if mdmf: |
---|
10350 | + d = self.publish_mdmf() |
---|
10351 | + else: |
---|
10352 | + d = defer.succeed(None) |
---|
10353 | + d.addCallback(lambda ignored: |
---|
10354 | + corrupt(None, self._storage, offset, range(5))) |
---|
10355 | + d.addCallback(lambda ignored: |
---|
10356 | + self.make_servermap()) |
---|
10357 | def _do_retrieve(servermap): |
---|
10358 | ver = servermap.best_recoverable_version() |
---|
10359 | self.failUnless(ver) |
---|
10360 | hunk ./src/allmydata/test/test_mutable.py 1326 |
---|
10361 | return self._fn.download_best_version() |
---|
10362 | d.addCallback(_do_retrieve) |
---|
10363 | d.addCallback(lambda new_contents: |
---|
10364 | - self.failUnlessEqual(new_contents, self.CONTENTS)) |
---|
10365 | + self.failUnlessEqual(new_contents, self.CONTENTS)) |
---|
10366 | return d |
---|
10367 | |
---|
10368 | hunk ./src/allmydata/test/test_mutable.py 1329 |
---|
10369 | + |
---|
10370 | + def test_corrupt_some(self): |
---|
10371 | + # corrupt the data of first five shares (so the servermap thinks |
---|
10372 | + # they're good but retrieve marks them as bad), so that the |
---|
10373 | + # MODE_READ set of 6 will be insufficient, forcing node.download to |
---|
10374 | + # retry with more servers. |
---|
10375 | + return self._test_corrupt_some("share_data") |
---|
10376 | + |
---|
10377 | + |
---|
10378 | def test_download_fails(self): |
---|
10379 | hunk ./src/allmydata/test/test_mutable.py 1339 |
---|
10380 | - corrupt(None, self._storage, "signature") |
---|
10381 | - d = self.shouldFail(UnrecoverableFileError, "test_download_anyway", |
---|
10382 | + d = corrupt(None, self._storage, "signature") |
---|
10383 | + d.addCallback(lambda ignored: |
---|
10384 | + self.shouldFail(UnrecoverableFileError, "test_download_anyway", |
---|
10385 | "no recoverable versions", |
---|
10386 | hunk ./src/allmydata/test/test_mutable.py 1343 |
---|
10387 | - self._fn.download_best_version) |
---|
10388 | + self._fn.download_best_version)) |
---|
10389 | return d |
---|
10390 | |
---|
10391 | |
---|
10392 | hunk ./src/allmydata/test/test_mutable.py 1347 |
---|
10393 | + |
---|
10394 | + def test_corrupt_mdmf_block_hash_tree(self): |
---|
10395 | + d = self.publish_mdmf() |
---|
10396 | + d.addCallback(lambda ignored: |
---|
10397 | + self._test_corrupt_all(("block_hash_tree", 12 * 32), |
---|
10398 | + "block hash tree failure", |
---|
10399 | + corrupt_early=True, |
---|
10400 | + should_succeed=False)) |
---|
10401 | + return d |
---|
10402 | + |
---|
10403 | + |
---|
10404 | + def test_corrupt_mdmf_block_hash_tree_late(self): |
---|
10405 | + d = self.publish_mdmf() |
---|
10406 | + d.addCallback(lambda ignored: |
---|
10407 | + self._test_corrupt_all(("block_hash_tree", 12 * 32), |
---|
10408 | + "block hash tree failure", |
---|
10409 | + corrupt_early=False, |
---|
10410 | + should_succeed=False)) |
---|
10411 | + return d |
---|
10412 | + |
---|
10413 | + |
---|
10414 | + def test_corrupt_mdmf_share_data(self): |
---|
10415 | + d = self.publish_mdmf() |
---|
10416 | + d.addCallback(lambda ignored: |
---|
10417 | + # TODO: Find out what the block size is and corrupt a |
---|
10418 | + # specific block, rather than just guessing. |
---|
10419 | + self._test_corrupt_all(("share_data", 12 * 40), |
---|
10420 | + "block hash tree failure", |
---|
10421 | + corrupt_early=True, |
---|
10422 | + should_succeed=False)) |
---|
10423 | + return d |
---|
10424 | + |
---|
10425 | + |
---|
10426 | + def test_corrupt_some_mdmf(self): |
---|
10427 | + return self._test_corrupt_some(("share_data", 12 * 40), |
---|
10428 | + mdmf=True) |
---|
10429 | + |
---|
10430 | + |
---|
10431 | class CheckerMixin: |
---|
10432 | def check_good(self, r, where): |
---|
10433 | self.failUnless(r.is_healthy(), where) |
---|
10434 | hunk ./src/allmydata/test/test_mutable.py 1415 |
---|
10435 | d.addCallback(self.check_good, "test_check_good") |
---|
10436 | return d |
---|
10437 | |
---|
10438 | + def test_check_mdmf_good(self): |
---|
10439 | + d = self.publish_mdmf() |
---|
10440 | + d.addCallback(lambda ignored: |
---|
10441 | + self._fn.check(Monitor())) |
---|
10442 | + d.addCallback(self.check_good, "test_check_mdmf_good") |
---|
10443 | + return d |
---|
10444 | + |
---|
10445 | def test_check_no_shares(self): |
---|
10446 | for shares in self._storage._peers.values(): |
---|
10447 | shares.clear() |
---|
10448 | hunk ./src/allmydata/test/test_mutable.py 1429 |
---|
10449 | d.addCallback(self.check_bad, "test_check_no_shares") |
---|
10450 | return d |
---|
10451 | |
---|
10452 | + def test_check_mdmf_no_shares(self): |
---|
10453 | + d = self.publish_mdmf() |
---|
10454 | + def _then(ignored): |
---|
10455 | + for share in self._storage._peers.values(): |
---|
10456 | + share.clear() |
---|
10457 | + d.addCallback(_then) |
---|
10458 | + d.addCallback(lambda ignored: |
---|
10459 | + self._fn.check(Monitor())) |
---|
10460 | + d.addCallback(self.check_bad, "test_check_mdmf_no_shares") |
---|
10461 | + return d |
---|
10462 | + |
---|
10463 | def test_check_not_enough_shares(self): |
---|
10464 | for shares in self._storage._peers.values(): |
---|
10465 | for shnum in shares.keys(): |
---|
10466 | hunk ./src/allmydata/test/test_mutable.py 1449 |
---|
10467 | d.addCallback(self.check_bad, "test_check_not_enough_shares") |
---|
10468 | return d |
---|
10469 | |
---|
10470 | + def test_check_mdmf_not_enough_shares(self): |
---|
10471 | + d = self.publish_mdmf() |
---|
10472 | + def _then(ignored): |
---|
10473 | + for shares in self._storage._peers.values(): |
---|
10474 | + for shnum in shares.keys(): |
---|
10475 | + if shnum > 0: |
---|
10476 | + del shares[shnum] |
---|
10477 | + d.addCallback(_then) |
---|
10478 | + d.addCallback(lambda ignored: |
---|
10479 | + self._fn.check(Monitor())) |
---|
10480 | + d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares") |
---|
10481 | + return d |
---|
10482 | + |
---|
10483 | + |
---|
10484 | def test_check_all_bad_sig(self): |
---|
10485 | hunk ./src/allmydata/test/test_mutable.py 1464 |
---|
10486 | - corrupt(None, self._storage, 1) # bad sig |
---|
10487 | - d = self._fn.check(Monitor()) |
---|
10488 | + d = corrupt(None, self._storage, 1) # bad sig |
---|
10489 | + d.addCallback(lambda ignored: |
---|
10490 | + self._fn.check(Monitor())) |
---|
10491 | d.addCallback(self.check_bad, "test_check_all_bad_sig") |
---|
10492 | return d |
---|
10493 | |
---|
10494 | hunk ./src/allmydata/test/test_mutable.py 1470 |
---|
10495 | + def test_check_mdmf_all_bad_sig(self): |
---|
10496 | + d = self.publish_mdmf() |
---|
10497 | + d.addCallback(lambda ignored: |
---|
10498 | + corrupt(None, self._storage, 1)) |
---|
10499 | + d.addCallback(lambda ignored: |
---|
10500 | + self._fn.check(Monitor())) |
---|
10501 | + d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig") |
---|
10502 | + return d |
---|
10503 | + |
---|
10504 | def test_check_all_bad_blocks(self): |
---|
10505 | hunk ./src/allmydata/test/test_mutable.py 1480 |
---|
10506 | - corrupt(None, self._storage, "share_data", [9]) # bad blocks |
---|
10507 | + d = corrupt(None, self._storage, "share_data", [9]) # bad blocks |
---|
10508 | # the Checker won't notice this.. it doesn't look at actual data |
---|
10509 | hunk ./src/allmydata/test/test_mutable.py 1482 |
---|
10510 | - d = self._fn.check(Monitor()) |
---|
10511 | + d.addCallback(lambda ignored: |
---|
10512 | + self._fn.check(Monitor())) |
---|
10513 | d.addCallback(self.check_good, "test_check_all_bad_blocks") |
---|
10514 | return d |
---|
10515 | |
---|
10516 | hunk ./src/allmydata/test/test_mutable.py 1487 |
---|
10517 | + |
---|
10518 | + def test_check_mdmf_all_bad_blocks(self): |
---|
10519 | + d = self.publish_mdmf() |
---|
10520 | + d.addCallback(lambda ignored: |
---|
10521 | + corrupt(None, self._storage, "share_data")) |
---|
10522 | + d.addCallback(lambda ignored: |
---|
10523 | + self._fn.check(Monitor())) |
---|
10524 | + d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks") |
---|
10525 | + return d |
---|
10526 | + |
---|
10527 | def test_verify_good(self): |
---|
10528 | d = self._fn.check(Monitor(), verify=True) |
---|
10529 | d.addCallback(self.check_good, "test_verify_good") |
---|
10530 | hunk ./src/allmydata/test/test_mutable.py 1501 |
---|
10531 | return d |
---|
10532 | + test_verify_good.timeout = 15 |
---|
10533 | |
---|
10534 | def test_verify_all_bad_sig(self): |
---|
10535 | hunk ./src/allmydata/test/test_mutable.py 1504 |
---|
10536 | - corrupt(None, self._storage, 1) # bad sig |
---|
10537 | - d = self._fn.check(Monitor(), verify=True) |
---|
10538 | + d = corrupt(None, self._storage, 1) # bad sig |
---|
10539 | + d.addCallback(lambda ignored: |
---|
10540 | + self._fn.check(Monitor(), verify=True)) |
---|
10541 | d.addCallback(self.check_bad, "test_verify_all_bad_sig") |
---|
10542 | return d |
---|
10543 | |
---|
10544 | hunk ./src/allmydata/test/test_mutable.py 1511 |
---|
10545 | def test_verify_one_bad_sig(self): |
---|
10546 | - corrupt(None, self._storage, 1, [9]) # bad sig |
---|
10547 | - d = self._fn.check(Monitor(), verify=True) |
---|
10548 | + d = corrupt(None, self._storage, 1, [9]) # bad sig |
---|
10549 | + d.addCallback(lambda ignored: |
---|
10550 | + self._fn.check(Monitor(), verify=True)) |
---|
10551 | d.addCallback(self.check_bad, "test_verify_one_bad_sig") |
---|
10552 | return d |
---|
10553 | |
---|
10554 | hunk ./src/allmydata/test/test_mutable.py 1518 |
---|
10555 | def test_verify_one_bad_block(self): |
---|
10556 | - corrupt(None, self._storage, "share_data", [9]) # bad blocks |
---|
10557 | + d = corrupt(None, self._storage, "share_data", [9]) # bad blocks |
---|
10558 | # the Verifier *will* notice this, since it examines every byte |
---|
10559 | hunk ./src/allmydata/test/test_mutable.py 1520 |
---|
10560 | - d = self._fn.check(Monitor(), verify=True) |
---|
10561 | + d.addCallback(lambda ignored: |
---|
10562 | + self._fn.check(Monitor(), verify=True)) |
---|
10563 | d.addCallback(self.check_bad, "test_verify_one_bad_block") |
---|
10564 | d.addCallback(self.check_expected_failure, |
---|
10565 | CorruptShareError, "block hash tree failure", |
---|
10566 | hunk ./src/allmydata/test/test_mutable.py 1529 |
---|
10567 | return d |
---|
10568 | |
---|
10569 | def test_verify_one_bad_sharehash(self): |
---|
10570 | - corrupt(None, self._storage, "share_hash_chain", [9], 5) |
---|
10571 | - d = self._fn.check(Monitor(), verify=True) |
---|
10572 | + d = corrupt(None, self._storage, "share_hash_chain", [9], 5) |
---|
10573 | + d.addCallback(lambda ignored: |
---|
10574 | + self._fn.check(Monitor(), verify=True)) |
---|
10575 | d.addCallback(self.check_bad, "test_verify_one_bad_sharehash") |
---|
10576 | d.addCallback(self.check_expected_failure, |
---|
10577 | CorruptShareError, "corrupt hashes", |
---|
10578 | hunk ./src/allmydata/test/test_mutable.py 1539 |
---|
10579 | return d |
---|
10580 | |
---|
10581 | def test_verify_one_bad_encprivkey(self): |
---|
10582 | - corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey |
---|
10583 | - d = self._fn.check(Monitor(), verify=True) |
---|
10584 | + d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey |
---|
10585 | + d.addCallback(lambda ignored: |
---|
10586 | + self._fn.check(Monitor(), verify=True)) |
---|
10587 | d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey") |
---|
10588 | d.addCallback(self.check_expected_failure, |
---|
10589 | CorruptShareError, "invalid privkey", |
---|
10590 | hunk ./src/allmydata/test/test_mutable.py 1549 |
---|
10591 | return d |
---|
10592 | |
---|
10593 | def test_verify_one_bad_encprivkey_uncheckable(self): |
---|
10594 | - corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey |
---|
10595 | + d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey |
---|
10596 | readonly_fn = self._fn.get_readonly() |
---|
10597 | # a read-only node has no way to validate the privkey |
---|
10598 | hunk ./src/allmydata/test/test_mutable.py 1552 |
---|
10599 | - d = readonly_fn.check(Monitor(), verify=True) |
---|
10600 | + d.addCallback(lambda ignored: |
---|
10601 | + readonly_fn.check(Monitor(), verify=True)) |
---|
10602 | d.addCallback(self.check_good, |
---|
10603 | "test_verify_one_bad_encprivkey_uncheckable") |
---|
10604 | return d |
---|
10605 | hunk ./src/allmydata/test/test_mutable.py 1558 |
---|
10606 | |
---|
10607 | + |
---|
10608 | + def test_verify_mdmf_good(self): |
---|
10609 | + d = self.publish_mdmf() |
---|
10610 | + d.addCallback(lambda ignored: |
---|
10611 | + self._fn.check(Monitor(), verify=True)) |
---|
10612 | + d.addCallback(self.check_good, "test_verify_mdmf_good") |
---|
10613 | + return d |
---|
10614 | + |
---|
10615 | + |
---|
10616 | + def test_verify_mdmf_one_bad_block(self): |
---|
10617 | + d = self.publish_mdmf() |
---|
10618 | + d.addCallback(lambda ignored: |
---|
10619 | + corrupt(None, self._storage, "share_data", [1])) |
---|
10620 | + d.addCallback(lambda ignored: |
---|
10621 | + self._fn.check(Monitor(), verify=True)) |
---|
10622 | + # We should find one bad block here |
---|
10623 | + d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block") |
---|
10624 | + d.addCallback(self.check_expected_failure, |
---|
10625 | + CorruptShareError, "block hash tree failure", |
---|
10626 | + "test_verify_mdmf_one_bad_block") |
---|
10627 | + return d |
---|
10628 | + |
---|
10629 | + |
---|
10630 | + def test_verify_mdmf_bad_encprivkey(self): |
---|
10631 | + d = self.publish_mdmf() |
---|
10632 | + d.addCallback(lambda ignored: |
---|
10633 | + corrupt(None, self._storage, "enc_privkey", [1])) |
---|
10634 | + d.addCallback(lambda ignored: |
---|
10635 | + self._fn.check(Monitor(), verify=True)) |
---|
10636 | + d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey") |
---|
10637 | + d.addCallback(self.check_expected_failure, |
---|
10638 | + CorruptShareError, "privkey", |
---|
10639 | + "test_verify_mdmf_bad_encprivkey") |
---|
10640 | + return d |
---|
10641 | + |
---|
10642 | + |
---|
10643 | + def test_verify_mdmf_bad_sig(self): |
---|
10644 | + d = self.publish_mdmf() |
---|
10645 | + d.addCallback(lambda ignored: |
---|
10646 | + corrupt(None, self._storage, 1, [1])) |
---|
10647 | + d.addCallback(lambda ignored: |
---|
10648 | + self._fn.check(Monitor(), verify=True)) |
---|
10649 | + d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig") |
---|
10650 | + return d |
---|
10651 | + |
---|
10652 | + |
---|
10653 | + def test_verify_mdmf_bad_encprivkey_uncheckable(self): |
---|
10654 | + d = self.publish_mdmf() |
---|
10655 | + d.addCallback(lambda ignored: |
---|
10656 | + corrupt(None, self._storage, "enc_privkey", [1])) |
---|
10657 | + d.addCallback(lambda ignored: |
---|
10658 | + self._fn.get_readonly()) |
---|
10659 | + d.addCallback(lambda fn: |
---|
10660 | + fn.check(Monitor(), verify=True)) |
---|
10661 | + d.addCallback(self.check_good, |
---|
10662 | + "test_verify_mdmf_bad_encprivkey_uncheckable") |
---|
10663 | + return d |
---|
10664 | + |
---|
10665 | + |
---|
10666 | class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin): |
---|
10667 | |
---|
10668 | def get_shares(self, s): |
---|
10669 | hunk ./src/allmydata/test/test_mutable.py 1682 |
---|
10670 | current_shares = self.old_shares[-1] |
---|
10671 | self.failUnlessEqual(old_shares, current_shares) |
---|
10672 | |
---|
10673 | + |
---|
10674 | def test_unrepairable_0shares(self): |
---|
10675 | d = self.publish_one() |
---|
10676 | def _delete_all_shares(ign): |
---|
10677 | hunk ./src/allmydata/test/test_mutable.py 1697 |
---|
10678 | d.addCallback(_check) |
---|
10679 | return d |
---|
10680 | |
---|
10681 | + def test_mdmf_unrepairable_0shares(self): |
---|
10682 | + d = self.publish_mdmf() |
---|
10683 | + def _delete_all_shares(ign): |
---|
10684 | + shares = self._storage._peers |
---|
10685 | + for peerid in shares: |
---|
10686 | + shares[peerid] = {} |
---|
10687 | + d.addCallback(_delete_all_shares) |
---|
10688 | + d.addCallback(lambda ign: self._fn.check(Monitor())) |
---|
10689 | + d.addCallback(lambda check_results: self._fn.repair(check_results)) |
---|
10690 | + d.addCallback(lambda crr: self.failIf(crr.get_successful())) |
---|
10691 | + return d |
---|
10692 | + |
---|
10693 | + |
---|
10694 | def test_unrepairable_1share(self): |
---|
10695 | d = self.publish_one() |
---|
10696 | def _delete_all_shares(ign): |
---|
10697 | hunk ./src/allmydata/test/test_mutable.py 1726 |
---|
10698 | d.addCallback(_check) |
---|
10699 | return d |
---|
10700 | |
---|
10701 | + def test_mdmf_unrepairable_1share(self): |
---|
10702 | + d = self.publish_mdmf() |
---|
10703 | + def _delete_all_shares(ign): |
---|
10704 | + shares = self._storage._peers |
---|
10705 | + for peerid in shares: |
---|
10706 | + for shnum in list(shares[peerid]): |
---|
10707 | + if shnum > 0: |
---|
10708 | + del shares[peerid][shnum] |
---|
10709 | + d.addCallback(_delete_all_shares) |
---|
10710 | + d.addCallback(lambda ign: self._fn.check(Monitor())) |
---|
10711 | + d.addCallback(lambda check_results: self._fn.repair(check_results)) |
---|
10712 | + def _check(crr): |
---|
10713 | + self.failUnlessEqual(crr.get_successful(), False) |
---|
10714 | + d.addCallback(_check) |
---|
10715 | + return d |
---|
10716 | + |
---|
10717 | + def test_repairable_5shares(self): |
---|
10718 | + d = self.publish_mdmf() |
---|
10719 | + def _delete_all_shares(ign): |
---|
10720 | + shares = self._storage._peers |
---|
10721 | + for peerid in shares: |
---|
10722 | + for shnum in list(shares[peerid]): |
---|
10723 | + if shnum > 4: |
---|
10724 | + del shares[peerid][shnum] |
---|
10725 | + d.addCallback(_delete_all_shares) |
---|
10726 | + d.addCallback(lambda ign: self._fn.check(Monitor())) |
---|
10727 | + d.addCallback(lambda check_results: self._fn.repair(check_results)) |
---|
10728 | + def _check(crr): |
---|
10729 | + self.failUnlessEqual(crr.get_successful(), True) |
---|
10730 | + d.addCallback(_check) |
---|
10731 | + return d |
---|
10732 | + |
---|
10733 | + def test_mdmf_repairable_5shares(self): |
---|
10734 | + d = self.publish_mdmf() |
---|
10735 | + def _delete_some_shares(ign): |
---|
10736 | + shares = self._storage._peers |
---|
10737 | + for peerid in shares: |
---|
10738 | + for shnum in list(shares[peerid]): |
---|
10739 | + if shnum > 5: |
---|
10740 | + del shares[peerid][shnum] |
---|
10741 | + d.addCallback(_delete_some_shares) |
---|
10742 | + d.addCallback(lambda ign: self._fn.check(Monitor())) |
---|
10743 | + def _check(cr): |
---|
10744 | + self.failIf(cr.is_healthy()) |
---|
10745 | + self.failUnless(cr.is_recoverable()) |
---|
10746 | + return cr |
---|
10747 | + d.addCallback(_check) |
---|
10748 | + d.addCallback(lambda check_results: self._fn.repair(check_results)) |
---|
10749 | + def _check1(crr): |
---|
10750 | + self.failUnlessEqual(crr.get_successful(), True) |
---|
10751 | + d.addCallback(_check1) |
---|
10752 | + return d |
---|
10753 | + |
---|
10754 | + |
---|
10755 | def test_merge(self): |
---|
10756 | self.old_shares = [] |
---|
10757 | d = self.publish_multiple() |
---|
10758 | hunk ./src/allmydata/test/test_mutable.py 1894 |
---|
10759 | class MultipleEncodings(unittest.TestCase): |
---|
10760 | def setUp(self): |
---|
10761 | self.CONTENTS = "New contents go here" |
---|
10762 | + self.uploadable = MutableData(self.CONTENTS) |
---|
10763 | self._storage = FakeStorage() |
---|
10764 | self._nodemaker = make_nodemaker(self._storage, num_peers=20) |
---|
10765 | self._storage_broker = self._nodemaker.storage_broker |
---|
10766 | hunk ./src/allmydata/test/test_mutable.py 1898 |
---|
10767 | - d = self._nodemaker.create_mutable_file(self.CONTENTS) |
---|
10768 | + d = self._nodemaker.create_mutable_file(self.uploadable) |
---|
10769 | def _created(node): |
---|
10770 | self._fn = node |
---|
10771 | d.addCallback(_created) |
---|
10772 | hunk ./src/allmydata/test/test_mutable.py 1904 |
---|
10773 | return d |
---|
10774 | |
---|
10775 | - def _encode(self, k, n, data): |
---|
10776 | + def _encode(self, k, n, data, version=SDMF_VERSION): |
---|
10777 | # encode 'data' into a peerid->shares dict. |
---|
10778 | |
---|
10779 | fn = self._fn |
---|
10780 | hunk ./src/allmydata/test/test_mutable.py 1920 |
---|
10781 | # and set the encoding parameters to something completely different |
---|
10782 | fn2._required_shares = k |
---|
10783 | fn2._total_shares = n |
---|
10784 | + # Normally a servermap update would occur before a publish. |
---|
10785 | + # Here, it doesn't, so we have to do it ourselves. |
---|
10786 | + fn2.set_version(version) |
---|
10787 | |
---|
10788 | s = self._storage |
---|
10789 | s._peers = {} # clear existing storage |
---|
10790 | hunk ./src/allmydata/test/test_mutable.py 1927 |
---|
10791 | p2 = Publish(fn2, self._storage_broker, None) |
---|
10792 | - d = p2.publish(data) |
---|
10793 | + uploadable = MutableData(data) |
---|
10794 | + d = p2.publish(uploadable) |
---|
10795 | def _published(res): |
---|
10796 | shares = s._peers |
---|
10797 | s._peers = {} |
---|
10798 | hunk ./src/allmydata/test/test_mutable.py 2230 |
---|
10799 | self.basedir = "mutable/Problems/test_publish_surprise" |
---|
10800 | self.set_up_grid() |
---|
10801 | nm = self.g.clients[0].nodemaker |
---|
10802 | - d = nm.create_mutable_file("contents 1") |
---|
10803 | + d = nm.create_mutable_file(MutableData("contents 1")) |
---|
10804 | def _created(n): |
---|
10805 | d = defer.succeed(None) |
---|
10806 | d.addCallback(lambda res: n.get_servermap(MODE_WRITE)) |
---|
10807 | hunk ./src/allmydata/test/test_mutable.py 2240 |
---|
10808 | d.addCallback(_got_smap1) |
---|
10809 | # then modify the file, leaving the old map untouched |
---|
10810 | d.addCallback(lambda res: log.msg("starting winning write")) |
---|
10811 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
10812 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
10813 | # now attempt to modify the file with the old servermap. This |
---|
10814 | # will look just like an uncoordinated write, in which every |
---|
10815 | # single share got updated between our mapupdate and our publish |
---|
10816 | hunk ./src/allmydata/test/test_mutable.py 2249 |
---|
10817 | self.shouldFail(UncoordinatedWriteError, |
---|
10818 | "test_publish_surprise", None, |
---|
10819 | n.upload, |
---|
10820 | - "contents 2a", self.old_map)) |
---|
10821 | + MutableData("contents 2a"), self.old_map)) |
---|
10822 | return d |
---|
10823 | d.addCallback(_created) |
---|
10824 | return d |
---|
10825 | hunk ./src/allmydata/test/test_mutable.py 2258 |
---|
10826 | self.basedir = "mutable/Problems/test_retrieve_surprise" |
---|
10827 | self.set_up_grid() |
---|
10828 | nm = self.g.clients[0].nodemaker |
---|
10829 | - d = nm.create_mutable_file("contents 1") |
---|
10830 | + d = nm.create_mutable_file(MutableData("contents 1")) |
---|
10831 | def _created(n): |
---|
10832 | d = defer.succeed(None) |
---|
10833 | d.addCallback(lambda res: n.get_servermap(MODE_READ)) |
---|
10834 | hunk ./src/allmydata/test/test_mutable.py 2268 |
---|
10835 | d.addCallback(_got_smap1) |
---|
10836 | # then modify the file, leaving the old map untouched |
---|
10837 | d.addCallback(lambda res: log.msg("starting winning write")) |
---|
10838 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
10839 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
10840 | # now attempt to retrieve the old version with the old servermap. |
---|
10841 | # This will look like someone has changed the file since we |
---|
10842 | # updated the servermap. |
---|
10843 | hunk ./src/allmydata/test/test_mutable.py 2277 |
---|
10844 | d.addCallback(lambda res: |
---|
10845 | self.shouldFail(NotEnoughSharesError, |
---|
10846 | "test_retrieve_surprise", |
---|
10847 | - "ran out of peers: have 0 shares (k=3)", |
---|
10848 | + "ran out of peers: have 0 of 1", |
---|
10849 | n.download_version, |
---|
10850 | self.old_map, |
---|
10851 | self.old_map.best_recoverable_version(), |
---|
10852 | hunk ./src/allmydata/test/test_mutable.py 2286 |
---|
10853 | d.addCallback(_created) |
---|
10854 | return d |
---|
10855 | |
---|
10856 | + |
---|
10857 | def test_unexpected_shares(self): |
---|
10858 | # upload the file, take a servermap, shut down one of the servers, |
---|
10859 | # upload it again (causing shares to appear on a new server), then |
---|
10860 | hunk ./src/allmydata/test/test_mutable.py 2296 |
---|
10861 | self.basedir = "mutable/Problems/test_unexpected_shares" |
---|
10862 | self.set_up_grid() |
---|
10863 | nm = self.g.clients[0].nodemaker |
---|
10864 | - d = nm.create_mutable_file("contents 1") |
---|
10865 | + d = nm.create_mutable_file(MutableData("contents 1")) |
---|
10866 | def _created(n): |
---|
10867 | d = defer.succeed(None) |
---|
10868 | d.addCallback(lambda res: n.get_servermap(MODE_WRITE)) |
---|
10869 | hunk ./src/allmydata/test/test_mutable.py 2308 |
---|
10870 | self.g.remove_server(peer0) |
---|
10871 | # then modify the file, leaving the old map untouched |
---|
10872 | log.msg("starting winning write") |
---|
10873 | - return n.overwrite("contents 2") |
---|
10874 | + return n.overwrite(MutableData("contents 2")) |
---|
10875 | d.addCallback(_got_smap1) |
---|
10876 | # now attempt to modify the file with the old servermap. This |
---|
10877 | # will look just like an uncoordinated write, in which every |
---|
10878 | hunk ./src/allmydata/test/test_mutable.py 2318 |
---|
10879 | self.shouldFail(UncoordinatedWriteError, |
---|
10880 | "test_surprise", None, |
---|
10881 | n.upload, |
---|
10882 | - "contents 2a", self.old_map)) |
---|
10883 | + MutableData("contents 2a"), self.old_map)) |
---|
10884 | return d |
---|
10885 | d.addCallback(_created) |
---|
10886 | return d |
---|
10887 | hunk ./src/allmydata/test/test_mutable.py 2322 |
---|
10888 | + test_unexpected_shares.timeout = 15 |
---|
10889 | |
---|
10890 | def test_bad_server(self): |
---|
10891 | # Break one server, then create the file: the initial publish should |
---|
10892 | hunk ./src/allmydata/test/test_mutable.py 2358 |
---|
10893 | d.addCallback(_break_peer0) |
---|
10894 | # now "create" the file, using the pre-established key, and let the |
---|
10895 | # initial publish finally happen |
---|
10896 | - d.addCallback(lambda res: nm.create_mutable_file("contents 1")) |
---|
10897 | + d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1"))) |
---|
10898 | # that ought to work |
---|
10899 | def _got_node(n): |
---|
10900 | d = n.download_best_version() |
---|
10901 | hunk ./src/allmydata/test/test_mutable.py 2367 |
---|
10902 | def _break_peer1(res): |
---|
10903 | self.connection1.broken = True |
---|
10904 | d.addCallback(_break_peer1) |
---|
10905 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
10906 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
10907 | # that ought to work too |
---|
10908 | d.addCallback(lambda res: n.download_best_version()) |
---|
10909 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2")) |
---|
10910 | hunk ./src/allmydata/test/test_mutable.py 2399 |
---|
10911 | peerids = [serverid for (serverid,ss) in sb.get_all_servers()] |
---|
10912 | self.g.break_server(peerids[0]) |
---|
10913 | |
---|
10914 | - d = nm.create_mutable_file("contents 1") |
---|
10915 | + d = nm.create_mutable_file(MutableData("contents 1")) |
---|
10916 | def _created(n): |
---|
10917 | d = n.download_best_version() |
---|
10918 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1")) |
---|
10919 | hunk ./src/allmydata/test/test_mutable.py 2407 |
---|
10920 | def _break_second_server(res): |
---|
10921 | self.g.break_server(peerids[1]) |
---|
10922 | d.addCallback(_break_second_server) |
---|
10923 | - d.addCallback(lambda res: n.overwrite("contents 2")) |
---|
10924 | + d.addCallback(lambda res: n.overwrite(MutableData("contents 2"))) |
---|
10925 | # that ought to work too |
---|
10926 | d.addCallback(lambda res: n.download_best_version()) |
---|
10927 | d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2")) |
---|
10928 | hunk ./src/allmydata/test/test_mutable.py 2426 |
---|
10929 | d = self.shouldFail(NotEnoughServersError, |
---|
10930 | "test_publish_all_servers_bad", |
---|
10931 | "Ran out of non-bad servers", |
---|
10932 | - nm.create_mutable_file, "contents") |
---|
10933 | + nm.create_mutable_file, MutableData("contents")) |
---|
10934 | return d |
---|
10935 | |
---|
10936 | def test_publish_no_servers(self): |
---|
10937 | hunk ./src/allmydata/test/test_mutable.py 2438 |
---|
10938 | d = self.shouldFail(NotEnoughServersError, |
---|
10939 | "test_publish_no_servers", |
---|
10940 | "Ran out of non-bad servers", |
---|
10941 | - nm.create_mutable_file, "contents") |
---|
10942 | + nm.create_mutable_file, MutableData("contents")) |
---|
10943 | return d |
---|
10944 | test_publish_no_servers.timeout = 30 |
---|
10945 | |
---|
10946 | hunk ./src/allmydata/test/test_mutable.py 2456 |
---|
10947 | # we need some contents that are large enough to push the privkey out |
---|
10948 | # of the early part of the file |
---|
10949 | LARGE = "These are Larger contents" * 2000 # about 50KB |
---|
10950 | - d = nm.create_mutable_file(LARGE) |
---|
10951 | + LARGE_uploadable = MutableData(LARGE) |
---|
10952 | + d = nm.create_mutable_file(LARGE_uploadable) |
---|
10953 | def _created(n): |
---|
10954 | self.uri = n.get_uri() |
---|
10955 | self.n2 = nm.create_from_cap(self.uri) |
---|
10956 | hunk ./src/allmydata/test/test_mutable.py 2492 |
---|
10957 | self.basedir = "mutable/Problems/test_privkey_query_missing" |
---|
10958 | self.set_up_grid(num_servers=20) |
---|
10959 | nm = self.g.clients[0].nodemaker |
---|
10960 | - LARGE = "These are Larger contents" * 2000 # about 50KB |
---|
10961 | + LARGE = "These are Larger contents" * 2000 # about 50KiB |
---|
10962 | + LARGE_uploadable = MutableData(LARGE) |
---|
10963 | nm._node_cache = DevNullDictionary() # disable the nodecache |
---|
10964 | |
---|
10965 | hunk ./src/allmydata/test/test_mutable.py 2496 |
---|
10966 | - d = nm.create_mutable_file(LARGE) |
---|
10967 | + d = nm.create_mutable_file(LARGE_uploadable) |
---|
10968 | def _created(n): |
---|
10969 | self.uri = n.get_uri() |
---|
10970 | self.n2 = nm.create_from_cap(self.uri) |
---|
10971 | hunk ./src/allmydata/test/test_mutable.py 2506 |
---|
10972 | d.addCallback(_created) |
---|
10973 | d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE)) |
---|
10974 | return d |
---|
10975 | + |
---|
10976 | + |
---|
10977 | + def test_block_and_hash_query_error(self): |
---|
10978 | + # This tests for what happens when a query to a remote server |
---|
10979 | + # fails in either the hash validation step or the block getting |
---|
10980 | + # step (because of batching, this is the same actual query). |
---|
10981 | + # We need to have the storage server persist up until the point |
---|
10982 | + # that its prefix is validated, then suddenly die. This |
---|
10983 | + # exercises some exception handling code in Retrieve. |
---|
10984 | + self.basedir = "mutable/Problems/test_block_and_hash_query_error" |
---|
10985 | + self.set_up_grid(num_servers=20) |
---|
10986 | + nm = self.g.clients[0].nodemaker |
---|
10987 | + CONTENTS = "contents" * 2000 |
---|
10988 | + CONTENTS_uploadable = MutableData(CONTENTS) |
---|
10989 | + d = nm.create_mutable_file(CONTENTS_uploadable) |
---|
10990 | + def _created(node): |
---|
10991 | + self._node = node |
---|
10992 | + d.addCallback(_created) |
---|
10993 | + d.addCallback(lambda ignored: |
---|
10994 | + self._node.get_servermap(MODE_READ)) |
---|
10995 | + def _then(servermap): |
---|
10996 | + # we have our servermap. Now we set up the servers like the |
---|
10997 | + # tests above -- the first one that gets a read call should |
---|
10998 | + # start throwing errors, but only after returning its prefix |
---|
10999 | + # for validation. Since we'll download without fetching the |
---|
11000 | + # private key, the next query to the remote server will be |
---|
11001 | + # for either a block and salt or for hashes, either of which |
---|
11002 | + # will exercise the error handling code. |
---|
11003 | + killer = FirstServerGetsKilled() |
---|
11004 | + for (serverid, ss) in nm.storage_broker.get_all_servers(): |
---|
11005 | + ss.post_call_notifier = killer.notify |
---|
11006 | + ver = servermap.best_recoverable_version() |
---|
11007 | + assert ver |
---|
11008 | + return self._node.download_version(servermap, ver) |
---|
11009 | + d.addCallback(_then) |
---|
11010 | + d.addCallback(lambda data: |
---|
11011 | + self.failUnlessEqual(data, CONTENTS)) |
---|
11012 | + return d |
---|
11013 | + |
---|
11014 | + |
---|
11015 | +class FileHandle(unittest.TestCase): |
---|
11016 | + def setUp(self): |
---|
11017 | + self.test_data = "Test Data" * 50000 |
---|
11018 | + self.sio = StringIO(self.test_data) |
---|
11019 | + self.uploadable = MutableFileHandle(self.sio) |
---|
11020 | + |
---|
11021 | + |
---|
11022 | + def test_filehandle_read(self): |
---|
11023 | + self.basedir = "mutable/FileHandle/test_filehandle_read" |
---|
11024 | + chunk_size = 10 |
---|
11025 | + for i in xrange(0, len(self.test_data), chunk_size): |
---|
11026 | + data = self.uploadable.read(chunk_size) |
---|
11027 | + data = "".join(data) |
---|
11028 | + start = i |
---|
11029 | + end = i + chunk_size |
---|
11030 | + self.failUnlessEqual(data, self.test_data[start:end]) |
---|
11031 | + |
---|
11032 | + |
---|
11033 | + def test_filehandle_get_size(self): |
---|
11034 | + self.basedir = "mutable/FileHandle/test_filehandle_get_size" |
---|
11035 | + actual_size = len(self.test_data) |
---|
11036 | + size = self.uploadable.get_size() |
---|
11037 | + self.failUnlessEqual(size, actual_size) |
---|
11038 | + |
---|
11039 | + |
---|
11040 | + def test_filehandle_get_size_out_of_order(self): |
---|
11041 | + # We should be able to call get_size whenever we want without |
---|
11042 | + # disturbing the location of the seek pointer. |
---|
11043 | + chunk_size = 100 |
---|
11044 | + data = self.uploadable.read(chunk_size) |
---|
11045 | + self.failUnlessEqual("".join(data), self.test_data[:chunk_size]) |
---|
11046 | + |
---|
11047 | + # Now get the size. |
---|
11048 | + size = self.uploadable.get_size() |
---|
11049 | + self.failUnlessEqual(size, len(self.test_data)) |
---|
11050 | + |
---|
11051 | + # Now get more data. We should be right where we left off. |
---|
11052 | + more_data = self.uploadable.read(chunk_size) |
---|
11053 | + start = chunk_size |
---|
11054 | + end = chunk_size * 2 |
---|
11055 | + self.failUnlessEqual("".join(more_data), self.test_data[start:end]) |
---|
11056 | + |
---|
11057 | + |
---|
11058 | + def test_filehandle_file(self): |
---|
11059 | + # Make sure that the MutableFileHandle works on a file as well |
---|
11060 | + # as a StringIO object, since in some cases it will be asked to |
---|
11061 | + # deal with files. |
---|
11062 | + self.basedir = self.mktemp() |
---|
11063 | + # necessary? What am I doing wrong here? |
---|
11064 | + os.mkdir(self.basedir) |
---|
11065 | + f_path = os.path.join(self.basedir, "test_file") |
---|
11066 | + f = open(f_path, "w") |
---|
11067 | + f.write(self.test_data) |
---|
11068 | + f.close() |
---|
11069 | + f = open(f_path, "r") |
---|
11070 | + |
---|
11071 | + uploadable = MutableFileHandle(f) |
---|
11072 | + |
---|
11073 | + data = uploadable.read(len(self.test_data)) |
---|
11074 | + self.failUnlessEqual("".join(data), self.test_data) |
---|
11075 | + size = uploadable.get_size() |
---|
11076 | + self.failUnlessEqual(size, len(self.test_data)) |
---|
11077 | + |
---|
11078 | + |
---|
11079 | + def test_close(self): |
---|
11080 | + # Make sure that the MutableFileHandle closes its handle when |
---|
11081 | + # told to do so. |
---|
11082 | + self.uploadable.close() |
---|
11083 | + self.failUnless(self.sio.closed) |
---|
11084 | + |
---|
11085 | + |
---|
11086 | +class DataHandle(unittest.TestCase): |
---|
11087 | + def setUp(self): |
---|
11088 | + self.test_data = "Test Data" * 50000 |
---|
11089 | + self.uploadable = MutableData(self.test_data) |
---|
11090 | + |
---|
11091 | + |
---|
11092 | + def test_datahandle_read(self): |
---|
11093 | + chunk_size = 10 |
---|
11094 | + for i in xrange(0, len(self.test_data), chunk_size): |
---|
11095 | + data = self.uploadable.read(chunk_size) |
---|
11096 | + data = "".join(data) |
---|
11097 | + start = i |
---|
11098 | + end = i + chunk_size |
---|
11099 | + self.failUnlessEqual(data, self.test_data[start:end]) |
---|
11100 | + |
---|
11101 | + |
---|
11102 | + def test_datahandle_get_size(self): |
---|
11103 | + actual_size = len(self.test_data) |
---|
11104 | + size = self.uploadable.get_size() |
---|
11105 | + self.failUnlessEqual(size, actual_size) |
---|
11106 | + |
---|
11107 | + |
---|
11108 | + def test_datahandle_get_size_out_of_order(self): |
---|
11109 | + # We should be able to call get_size whenever we want without |
---|
11110 | + # disturbing the location of the seek pointer. |
---|
11111 | + chunk_size = 100 |
---|
11112 | + data = self.uploadable.read(chunk_size) |
---|
11113 | + self.failUnlessEqual("".join(data), self.test_data[:chunk_size]) |
---|
11114 | + |
---|
11115 | + # Now get the size. |
---|
11116 | + size = self.uploadable.get_size() |
---|
11117 | + self.failUnlessEqual(size, len(self.test_data)) |
---|
11118 | + |
---|
11119 | + # Now get more data. We should be right where we left off. |
---|
11120 | + more_data = self.uploadable.read(chunk_size) |
---|
11121 | + start = chunk_size |
---|
11122 | + end = chunk_size * 2 |
---|
11123 | + self.failUnlessEqual("".join(more_data), self.test_data[start:end]) |
---|
11124 | + |
---|
11125 | + |
---|
11126 | +class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \ |
---|
11127 | + PublishMixin): |
---|
11128 | + def setUp(self): |
---|
11129 | + GridTestMixin.setUp(self) |
---|
11130 | + self.basedir = self.mktemp() |
---|
11131 | + self.set_up_grid() |
---|
11132 | + self.c = self.g.clients[0] |
---|
11133 | + self.nm = self.c.nodemaker |
---|
11134 | + self.data = "test data" * 100000 # about 900 KiB; MDMF |
---|
11135 | + self.small_data = "test data" * 10 # about 90 B; SDMF |
---|
11136 | + return self.do_upload() |
---|
11137 | + |
---|
11138 | + |
---|
11139 | + def do_upload(self): |
---|
11140 | + d1 = self.nm.create_mutable_file(MutableData(self.data), |
---|
11141 | + version=MDMF_VERSION) |
---|
11142 | + d2 = self.nm.create_mutable_file(MutableData(self.small_data)) |
---|
11143 | + dl = gatherResults([d1, d2]) |
---|
11144 | + def _then((n1, n2)): |
---|
11145 | + assert isinstance(n1, MutableFileNode) |
---|
11146 | + assert isinstance(n2, MutableFileNode) |
---|
11147 | + |
---|
11148 | + self.mdmf_node = n1 |
---|
11149 | + self.sdmf_node = n2 |
---|
11150 | + dl.addCallback(_then) |
---|
11151 | + return dl |
---|
11152 | + |
---|
11153 | + |
---|
11154 | + def test_get_readonly_mutable_version(self): |
---|
11155 | + # Attempting to get a mutable version of a mutable file from a |
---|
11156 | + # filenode initialized with a readcap should return a readonly |
---|
11157 | + # version of that same node. |
---|
11158 | + ro = self.mdmf_node.get_readonly() |
---|
11159 | + d = ro.get_best_mutable_version() |
---|
11160 | + d.addCallback(lambda version: |
---|
11161 | + self.failUnless(version.is_readonly())) |
---|
11162 | + d.addCallback(lambda ignored: |
---|
11163 | + self.sdmf_node.get_readonly()) |
---|
11164 | + d.addCallback(lambda version: |
---|
11165 | + self.failUnless(version.is_readonly())) |
---|
11166 | + return d |
---|
11167 | + |
---|
11168 | + |
---|
11169 | + def test_get_sequence_number(self): |
---|
11170 | + d = self.mdmf_node.get_best_readable_version() |
---|
11171 | + d.addCallback(lambda bv: |
---|
11172 | + self.failUnlessEqual(bv.get_sequence_number(), 1)) |
---|
11173 | + d.addCallback(lambda ignored: |
---|
11174 | + self.sdmf_node.get_best_readable_version()) |
---|
11175 | + d.addCallback(lambda bv: |
---|
11176 | + self.failUnlessEqual(bv.get_sequence_number(), 1)) |
---|
11177 | + # Now update. The sequence number should be 2 in both |
---|
11178 | + # cases. |
---|
11179 | + def _do_update(ignored): |
---|
11180 | + new_data = MutableData("foo bar baz" * 100000) |
---|
11181 | + new_small_data = MutableData("foo bar baz" * 10) |
---|
11182 | + d1 = self.mdmf_node.overwrite(new_data) |
---|
11183 | + d2 = self.sdmf_node.overwrite(new_small_data) |
---|
11184 | + dl = gatherResults([d1, d2]) |
---|
11185 | + return dl |
---|
11186 | + d.addCallback(_do_update) |
---|
11187 | + d.addCallback(lambda ignored: |
---|
11188 | + self.mdmf_node.get_best_readable_version()) |
---|
11189 | + d.addCallback(lambda bv: |
---|
11190 | + self.failUnlessEqual(bv.get_sequence_number(), 2)) |
---|
11191 | + d.addCallback(lambda ignored: |
---|
11192 | + self.sdmf_node.get_best_readable_version()) |
---|
11193 | + d.addCallback(lambda bv: |
---|
11194 | + self.failUnlessEqual(bv.get_sequence_number(), 2)) |
---|
11195 | + return d |
---|
11196 | + |
---|
11197 | + |
---|
11198 | + def test_get_writekey(self): |
---|
11199 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11200 | + d.addCallback(lambda bv: |
---|
11201 | + self.failUnlessEqual(bv.get_writekey(), |
---|
11202 | + self.mdmf_node.get_writekey())) |
---|
11203 | + d.addCallback(lambda ignored: |
---|
11204 | + self.sdmf_node.get_best_mutable_version()) |
---|
11205 | + d.addCallback(lambda bv: |
---|
11206 | + self.failUnlessEqual(bv.get_writekey(), |
---|
11207 | + self.sdmf_node.get_writekey())) |
---|
11208 | + return d |
---|
11209 | + |
---|
11210 | + |
---|
11211 | + def test_get_storage_index(self): |
---|
11212 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11213 | + d.addCallback(lambda bv: |
---|
11214 | + self.failUnlessEqual(bv.get_storage_index(), |
---|
11215 | + self.mdmf_node.get_storage_index())) |
---|
11216 | + d.addCallback(lambda ignored: |
---|
11217 | + self.sdmf_node.get_best_mutable_version()) |
---|
11218 | + d.addCallback(lambda bv: |
---|
11219 | + self.failUnlessEqual(bv.get_storage_index(), |
---|
11220 | + self.sdmf_node.get_storage_index())) |
---|
11221 | + return d |
---|
11222 | + |
---|
11223 | + |
---|
11224 | + def test_get_readonly_version(self): |
---|
11225 | + d = self.mdmf_node.get_best_readable_version() |
---|
11226 | + d.addCallback(lambda bv: |
---|
11227 | + self.failUnless(bv.is_readonly())) |
---|
11228 | + d.addCallback(lambda ignored: |
---|
11229 | + self.sdmf_node.get_best_readable_version()) |
---|
11230 | + d.addCallback(lambda bv: |
---|
11231 | + self.failUnless(bv.is_readonly())) |
---|
11232 | + return d |
---|
11233 | + |
---|
11234 | + |
---|
11235 | + def test_get_mutable_version(self): |
---|
11236 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11237 | + d.addCallback(lambda bv: |
---|
11238 | + self.failIf(bv.is_readonly())) |
---|
11239 | + d.addCallback(lambda ignored: |
---|
11240 | + self.sdmf_node.get_best_mutable_version()) |
---|
11241 | + d.addCallback(lambda bv: |
---|
11242 | + self.failIf(bv.is_readonly())) |
---|
11243 | + return d |
---|
11244 | + |
---|
11245 | + |
---|
11246 | + def test_toplevel_overwrite(self): |
---|
11247 | + new_data = MutableData("foo bar baz" * 100000) |
---|
11248 | + new_small_data = MutableData("foo bar baz" * 10) |
---|
11249 | + d = self.mdmf_node.overwrite(new_data) |
---|
11250 | + d.addCallback(lambda ignored: |
---|
11251 | + self.mdmf_node.download_best_version()) |
---|
11252 | + d.addCallback(lambda data: |
---|
11253 | + self.failUnlessEqual(data, "foo bar baz" * 100000)) |
---|
11254 | + d.addCallback(lambda ignored: |
---|
11255 | + self.sdmf_node.overwrite(new_small_data)) |
---|
11256 | + d.addCallback(lambda ignored: |
---|
11257 | + self.sdmf_node.download_best_version()) |
---|
11258 | + d.addCallback(lambda data: |
---|
11259 | + self.failUnlessEqual(data, "foo bar baz" * 10)) |
---|
11260 | + return d |
---|
11261 | + |
---|
11262 | + |
---|
11263 | + def test_toplevel_modify(self): |
---|
11264 | + def modifier(old_contents, servermap, first_time): |
---|
11265 | + return old_contents + "modified" |
---|
11266 | + d = self.mdmf_node.modify(modifier) |
---|
11267 | + d.addCallback(lambda ignored: |
---|
11268 | + self.mdmf_node.download_best_version()) |
---|
11269 | + d.addCallback(lambda data: |
---|
11270 | + self.failUnlessIn("modified", data)) |
---|
11271 | + d.addCallback(lambda ignored: |
---|
11272 | + self.sdmf_node.modify(modifier)) |
---|
11273 | + d.addCallback(lambda ignored: |
---|
11274 | + self.sdmf_node.download_best_version()) |
---|
11275 | + d.addCallback(lambda data: |
---|
11276 | + self.failUnlessIn("modified", data)) |
---|
11277 | + return d |
---|
11278 | + |
---|
11279 | + |
---|
11280 | + def test_version_modify(self): |
---|
11281 | + # TODO: When we can publish multiple versions, alter this test |
---|
11282 | + # to modify a version other than the best usable version, then |
---|
11283 | + # test to see that the best recoverable version is that. |
---|
11284 | + def modifier(old_contents, servermap, first_time): |
---|
11285 | + return old_contents + "modified" |
---|
11286 | + d = self.mdmf_node.modify(modifier) |
---|
11287 | + d.addCallback(lambda ignored: |
---|
11288 | + self.mdmf_node.download_best_version()) |
---|
11289 | + d.addCallback(lambda data: |
---|
11290 | + self.failUnlessIn("modified", data)) |
---|
11291 | + d.addCallback(lambda ignored: |
---|
11292 | + self.sdmf_node.modify(modifier)) |
---|
11293 | + d.addCallback(lambda ignored: |
---|
11294 | + self.sdmf_node.download_best_version()) |
---|
11295 | + d.addCallback(lambda data: |
---|
11296 | + self.failUnlessIn("modified", data)) |
---|
11297 | + return d |
---|
11298 | + |
---|
11299 | + |
---|
11300 | + def test_download_version(self): |
---|
11301 | + d = self.publish_multiple() |
---|
11302 | + # We want to have two recoverable versions on the grid. |
---|
11303 | + d.addCallback(lambda res: |
---|
11304 | + self._set_versions({0:0,2:0,4:0,6:0,8:0, |
---|
11305 | + 1:1,3:1,5:1,7:1,9:1})) |
---|
11306 | + # Now try to download each version. We should get the plaintext |
---|
11307 | + # associated with that version. |
---|
11308 | + d.addCallback(lambda ignored: |
---|
11309 | + self._fn.get_servermap(mode=MODE_READ)) |
---|
11310 | + def _got_servermap(smap): |
---|
11311 | + versions = smap.recoverable_versions() |
---|
11312 | + assert len(versions) == 2 |
---|
11313 | + |
---|
11314 | + self.servermap = smap |
---|
11315 | + self.version1, self.version2 = versions |
---|
11316 | + assert self.version1 != self.version2 |
---|
11317 | + |
---|
11318 | + self.version1_seqnum = self.version1[0] |
---|
11319 | + self.version2_seqnum = self.version2[0] |
---|
11320 | + self.version1_index = self.version1_seqnum - 1 |
---|
11321 | + self.version2_index = self.version2_seqnum - 1 |
---|
11322 | + |
---|
11323 | + d.addCallback(_got_servermap) |
---|
11324 | + d.addCallback(lambda ignored: |
---|
11325 | + self._fn.download_version(self.servermap, self.version1)) |
---|
11326 | + d.addCallback(lambda results: |
---|
11327 | + self.failUnlessEqual(self.CONTENTS[self.version1_index], |
---|
11328 | + results)) |
---|
11329 | + d.addCallback(lambda ignored: |
---|
11330 | + self._fn.download_version(self.servermap, self.version2)) |
---|
11331 | + d.addCallback(lambda results: |
---|
11332 | + self.failUnlessEqual(self.CONTENTS[self.version2_index], |
---|
11333 | + results)) |
---|
11334 | + return d |
---|
11335 | + |
---|
11336 | + |
---|
11337 | + def test_partial_read(self): |
---|
11338 | + # read only a few bytes at a time, and see that the results are |
---|
11339 | + # what we expect. |
---|
11340 | + d = self.mdmf_node.get_best_readable_version() |
---|
11341 | + def _read_data(version): |
---|
11342 | + c = consumer.MemoryConsumer() |
---|
11343 | + d2 = defer.succeed(None) |
---|
11344 | + for i in xrange(0, len(self.data), 10000): |
---|
11345 | + d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000)) |
---|
11346 | + d2.addCallback(lambda ignored: |
---|
11347 | + self.failUnlessEqual(self.data, "".join(c.chunks))) |
---|
11348 | + return d2 |
---|
11349 | + d.addCallback(_read_data) |
---|
11350 | + return d |
---|
11351 | + |
---|
11352 | + |
---|
11353 | + def test_read(self): |
---|
11354 | + d = self.mdmf_node.get_best_readable_version() |
---|
11355 | + def _read_data(version): |
---|
11356 | + c = consumer.MemoryConsumer() |
---|
11357 | + d2 = defer.succeed(None) |
---|
11358 | + d2.addCallback(lambda ignored: version.read(c)) |
---|
11359 | + d2.addCallback(lambda ignored: |
---|
11360 | + self.failUnlessEqual("".join(c.chunks), self.data)) |
---|
11361 | + return d2 |
---|
11362 | + d.addCallback(_read_data) |
---|
11363 | + return d |
---|
11364 | + |
---|
11365 | + |
---|
11366 | + def test_download_best_version(self): |
---|
11367 | + d = self.mdmf_node.download_best_version() |
---|
11368 | + d.addCallback(lambda data: |
---|
11369 | + self.failUnlessEqual(data, self.data)) |
---|
11370 | + d.addCallback(lambda ignored: |
---|
11371 | + self.sdmf_node.download_best_version()) |
---|
11372 | + d.addCallback(lambda data: |
---|
11373 | + self.failUnlessEqual(data, self.small_data)) |
---|
11374 | + return d |
---|
11375 | + |
---|
11376 | + |
---|
11377 | +class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin): |
---|
11378 | + def setUp(self): |
---|
11379 | + GridTestMixin.setUp(self) |
---|
11380 | + self.basedir = self.mktemp() |
---|
11381 | + self.set_up_grid() |
---|
11382 | + self.c = self.g.clients[0] |
---|
11383 | + self.nm = self.c.nodemaker |
---|
11384 | + self.data = "test data" * 100000 # about 900 KiB; MDMF |
---|
11385 | + self.small_data = "test data" * 10 # about 90 B; SDMF |
---|
11386 | + return self.do_upload() |
---|
11387 | + |
---|
11388 | + |
---|
11389 | + def do_upload(self): |
---|
11390 | + d1 = self.nm.create_mutable_file(MutableData(self.data), |
---|
11391 | + version=MDMF_VERSION) |
---|
11392 | + d2 = self.nm.create_mutable_file(MutableData(self.small_data)) |
---|
11393 | + dl = gatherResults([d1, d2]) |
---|
11394 | + def _then((n1, n2)): |
---|
11395 | + assert isinstance(n1, MutableFileNode) |
---|
11396 | + assert isinstance(n2, MutableFileNode) |
---|
11397 | + |
---|
11398 | + self.mdmf_node = n1 |
---|
11399 | + self.sdmf_node = n2 |
---|
11400 | + dl.addCallback(_then) |
---|
11401 | + return dl |
---|
11402 | + |
---|
11403 | + |
---|
11404 | + def test_append(self): |
---|
11405 | + # We should be able to append data to the end of a mutable |
---|
11406 | + # file and get what we expect. |
---|
11407 | + new_data = self.data + "appended" |
---|
11408 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11409 | + d.addCallback(lambda mv: |
---|
11410 | + mv.update(MutableData("appended"), len(self.data))) |
---|
11411 | + d.addCallback(lambda ignored: |
---|
11412 | + self.mdmf_node.download_best_version()) |
---|
11413 | + d.addCallback(lambda results: |
---|
11414 | + self.failUnlessEqual(results, new_data)) |
---|
11415 | + return d |
---|
11416 | + test_append.timeout = 15 |
---|
11417 | + |
---|
11418 | + |
---|
11419 | + def test_replace(self): |
---|
11420 | + # We should be able to replace data in the middle of a mutable |
---|
11421 | + # file and get what we expect back. |
---|
11422 | + new_data = self.data[:100] |
---|
11423 | + new_data += "appended" |
---|
11424 | + new_data += self.data[108:] |
---|
11425 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11426 | + d.addCallback(lambda mv: |
---|
11427 | + mv.update(MutableData("appended"), 100)) |
---|
11428 | + d.addCallback(lambda ignored: |
---|
11429 | + self.mdmf_node.download_best_version()) |
---|
11430 | + d.addCallback(lambda results: |
---|
11431 | + self.failUnlessEqual(results, new_data)) |
---|
11432 | + return d |
---|
11433 | + |
---|
11434 | + |
---|
11435 | + def test_replace_and_extend(self): |
---|
11436 | + # We should be able to replace data in the middle of a mutable |
---|
11437 | + # file and extend that mutable file and get what we expect. |
---|
11438 | + new_data = self.data[:100] |
---|
11439 | + new_data += "modified " * 100000 |
---|
11440 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11441 | + d.addCallback(lambda mv: |
---|
11442 | + mv.update(MutableData("modified " * 100000), 100)) |
---|
11443 | + d.addCallback(lambda ignored: |
---|
11444 | + self.mdmf_node.download_best_version()) |
---|
11445 | + d.addCallback(lambda results: |
---|
11446 | + self.failUnlessEqual(results, new_data)) |
---|
11447 | + return d |
---|
11448 | + |
---|
11449 | + |
---|
11450 | + def test_append_power_of_two(self): |
---|
11451 | + # If we attempt to extend a mutable file so that its segment |
---|
11452 | + # count crosses a power-of-two boundary, the update operation |
---|
11453 | + # should know how to reencode the file. |
---|
11454 | + |
---|
11455 | + # Note that the data populating self.mdmf_node is about 900 KiB |
---|
11456 | + # long -- this is 7 segments in the default segment size. So we |
---|
11457 | + # need to add 2 segments worth of data to push it over a |
---|
11458 | + # power-of-two boundary. |
---|
11459 | + segment = "a" * DEFAULT_MAX_SEGMENT_SIZE |
---|
11460 | + new_data = self.data + (segment * 2) |
---|
11461 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11462 | + d.addCallback(lambda mv: |
---|
11463 | + mv.update(MutableData(segment * 2), len(self.data))) |
---|
11464 | + d.addCallback(lambda ignored: |
---|
11465 | + self.mdmf_node.download_best_version()) |
---|
11466 | + d.addCallback(lambda results: |
---|
11467 | + self.failUnlessEqual(results, new_data)) |
---|
11468 | + return d |
---|
11469 | + test_append_power_of_two.timeout = 15 |
---|
11470 | + |
---|
11471 | + |
---|
11472 | + def test_update_sdmf(self): |
---|
11473 | + # Running update on a single-segment file should still work. |
---|
11474 | + new_data = self.small_data + "appended" |
---|
11475 | + d = self.sdmf_node.get_best_mutable_version() |
---|
11476 | + d.addCallback(lambda mv: |
---|
11477 | + mv.update(MutableData("appended"), len(self.small_data))) |
---|
11478 | + d.addCallback(lambda ignored: |
---|
11479 | + self.sdmf_node.download_best_version()) |
---|
11480 | + d.addCallback(lambda results: |
---|
11481 | + self.failUnlessEqual(results, new_data)) |
---|
11482 | + return d |
---|
11483 | + |
---|
11484 | + def test_replace_in_last_segment(self): |
---|
11485 | + # The wrapper should know how to handle the tail segment |
---|
11486 | + # appropriately. |
---|
11487 | + replace_offset = len(self.data) - 100 |
---|
11488 | + new_data = self.data[:replace_offset] + "replaced" |
---|
11489 | + rest_offset = replace_offset + len("replaced") |
---|
11490 | + new_data += self.data[rest_offset:] |
---|
11491 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11492 | + d.addCallback(lambda mv: |
---|
11493 | + mv.update(MutableData("replaced"), replace_offset)) |
---|
11494 | + d.addCallback(lambda ignored: |
---|
11495 | + self.mdmf_node.download_best_version()) |
---|
11496 | + d.addCallback(lambda results: |
---|
11497 | + self.failUnlessEqual(results, new_data)) |
---|
11498 | + return d |
---|
11499 | + |
---|
11500 | + |
---|
11501 | + def test_multiple_segment_replace(self): |
---|
11502 | + replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE |
---|
11503 | + new_data = self.data[:replace_offset] |
---|
11504 | + new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE |
---|
11505 | + new_data += 2 * new_segment |
---|
11506 | + new_data += "replaced" |
---|
11507 | + rest_offset = len(new_data) |
---|
11508 | + new_data += self.data[rest_offset:] |
---|
11509 | + d = self.mdmf_node.get_best_mutable_version() |
---|
11510 | + d.addCallback(lambda mv: |
---|
11511 | + mv.update(MutableData((2 * new_segment) + "replaced"), |
---|
11512 | + replace_offset)) |
---|
11513 | + d.addCallback(lambda ignored: |
---|
11514 | + self.mdmf_node.download_best_version()) |
---|
11515 | + d.addCallback(lambda results: |
---|
11516 | + self.failUnlessEqual(results, new_data)) |
---|
11517 | + return d |
---|
11518 | hunk ./src/allmydata/test/test_sftp.py 32 |
---|
11519 | |
---|
11520 | from allmydata.util.consumer import download_to_data |
---|
11521 | from allmydata.immutable import upload |
---|
11522 | +from allmydata.mutable import publish |
---|
11523 | from allmydata.test.no_network import GridTestMixin |
---|
11524 | from allmydata.test.common import ShouldFailMixin |
---|
11525 | from allmydata.test.common_util import ReallyEqualMixin |
---|
11526 | hunk ./src/allmydata/test/test_sftp.py 84 |
---|
11527 | return d |
---|
11528 | |
---|
11529 | def _set_up_tree(self): |
---|
11530 | - d = self.client.create_mutable_file("mutable file contents") |
---|
11531 | + u = publish.MutableData("mutable file contents") |
---|
11532 | + d = self.client.create_mutable_file(u) |
---|
11533 | d.addCallback(lambda node: self.root.set_node(u"mutable", node)) |
---|
11534 | def _created_mutable(n): |
---|
11535 | self.mutable = n |
---|
11536 | hunk ./src/allmydata/test/test_sftp.py 1334 |
---|
11537 | d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {})) |
---|
11538 | d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {})) |
---|
11539 | return d |
---|
11540 | + test_makeDirectory.timeout = 15 |
---|
11541 | |
---|
11542 | def test_execCommand_and_openShell(self): |
---|
11543 | class FakeProtocol: |
---|
11544 | hunk ./src/allmydata/test/test_system.py 25 |
---|
11545 | from allmydata.monitor import Monitor |
---|
11546 | from allmydata.mutable.common import NotWriteableError |
---|
11547 | from allmydata.mutable import layout as mutable_layout |
---|
11548 | +from allmydata.mutable.publish import MutableData |
---|
11549 | from foolscap.api import DeadReferenceError |
---|
11550 | from twisted.python.failure import Failure |
---|
11551 | from twisted.web.client import getPage |
---|
11552 | hunk ./src/allmydata/test/test_system.py 463 |
---|
11553 | def test_mutable(self): |
---|
11554 | self.basedir = "system/SystemTest/test_mutable" |
---|
11555 | DATA = "initial contents go here." # 25 bytes % 3 != 0 |
---|
11556 | + DATA_uploadable = MutableData(DATA) |
---|
11557 | NEWDATA = "new contents yay" |
---|
11558 | hunk ./src/allmydata/test/test_system.py 465 |
---|
11559 | + NEWDATA_uploadable = MutableData(NEWDATA) |
---|
11560 | NEWERDATA = "this is getting old" |
---|
11561 | hunk ./src/allmydata/test/test_system.py 467 |
---|
11562 | + NEWERDATA_uploadable = MutableData(NEWERDATA) |
---|
11563 | |
---|
11564 | d = self.set_up_nodes(use_key_generator=True) |
---|
11565 | |
---|
11566 | hunk ./src/allmydata/test/test_system.py 474 |
---|
11567 | def _create_mutable(res): |
---|
11568 | c = self.clients[0] |
---|
11569 | log.msg("starting create_mutable_file") |
---|
11570 | - d1 = c.create_mutable_file(DATA) |
---|
11571 | + d1 = c.create_mutable_file(DATA_uploadable) |
---|
11572 | def _done(res): |
---|
11573 | log.msg("DONE: %s" % (res,)) |
---|
11574 | self._mutable_node_1 = res |
---|
11575 | hunk ./src/allmydata/test/test_system.py 561 |
---|
11576 | self.failUnlessEqual(res, DATA) |
---|
11577 | # replace the data |
---|
11578 | log.msg("starting replace1") |
---|
11579 | - d1 = newnode.overwrite(NEWDATA) |
---|
11580 | + d1 = newnode.overwrite(NEWDATA_uploadable) |
---|
11581 | d1.addCallback(lambda res: newnode.download_best_version()) |
---|
11582 | return d1 |
---|
11583 | d.addCallback(_check_download_3) |
---|
11584 | hunk ./src/allmydata/test/test_system.py 575 |
---|
11585 | newnode2 = self.clients[3].create_node_from_uri(uri) |
---|
11586 | self._newnode3 = self.clients[3].create_node_from_uri(uri) |
---|
11587 | log.msg("starting replace2") |
---|
11588 | - d1 = newnode1.overwrite(NEWERDATA) |
---|
11589 | + d1 = newnode1.overwrite(NEWERDATA_uploadable) |
---|
11590 | d1.addCallback(lambda res: newnode2.download_best_version()) |
---|
11591 | return d1 |
---|
11592 | d.addCallback(_check_download_4) |
---|
11593 | hunk ./src/allmydata/test/test_system.py 645 |
---|
11594 | def _check_empty_file(res): |
---|
11595 | # make sure we can create empty files, this usually screws up the |
---|
11596 | # segsize math |
---|
11597 | - d1 = self.clients[2].create_mutable_file("") |
---|
11598 | + d1 = self.clients[2].create_mutable_file(MutableData("")) |
---|
11599 | d1.addCallback(lambda newnode: newnode.download_best_version()) |
---|
11600 | d1.addCallback(lambda res: self.failUnlessEqual("", res)) |
---|
11601 | return d1 |
---|
11602 | hunk ./src/allmydata/test/test_system.py 676 |
---|
11603 | self.key_generator_svc.key_generator.pool_size + size_delta) |
---|
11604 | |
---|
11605 | d.addCallback(check_kg_poolsize, 0) |
---|
11606 | - d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world')) |
---|
11607 | + d.addCallback(lambda junk: |
---|
11608 | + self.clients[3].create_mutable_file(MutableData('hello, world'))) |
---|
11609 | d.addCallback(check_kg_poolsize, -1) |
---|
11610 | d.addCallback(lambda junk: self.clients[3].create_dirnode()) |
---|
11611 | d.addCallback(check_kg_poolsize, -2) |
---|
11612 | hunk ./src/allmydata/test/test_web.py 750 |
---|
11613 | self.PUT, base + "/@@name=/blah.txt", "") |
---|
11614 | return d |
---|
11615 | |
---|
11616 | + |
---|
11617 | def test_GET_DIRURL_named_bad(self): |
---|
11618 | base = "/file/%s" % urllib.quote(self._foo_uri) |
---|
11619 | d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad", |
---|
11620 | hunk ./src/allmydata/test/test_web.py 898 |
---|
11621 | return d |
---|
11622 | |
---|
11623 | def test_PUT_NEWFILEURL_mutable_toobig(self): |
---|
11624 | - d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig", |
---|
11625 | - "413 Request Entity Too Large", |
---|
11626 | - "SDMF is limited to one segment, and 10001 > 10000", |
---|
11627 | - self.PUT, |
---|
11628 | - self.public_url + "/foo/new.txt?mutable=true", |
---|
11629 | - "b" * (self.s.MUTABLE_SIZELIMIT+1)) |
---|
11630 | + # It is okay to upload large mutable files, so we should be able |
---|
11631 | + # to do that. |
---|
11632 | + d = self.PUT(self.public_url + "/foo/new.txt?mutable=true", |
---|
11633 | + "b" * (self.s.MUTABLE_SIZELIMIT + 1)) |
---|
11634 | return d |
---|
11635 | |
---|
11636 | def test_PUT_NEWFILEURL_replace(self): |
---|
11637 | hunk ./src/allmydata/test/test_web.py 1684 |
---|
11638 | return d |
---|
11639 | |
---|
11640 | def test_POST_upload_no_link_mutable_toobig(self): |
---|
11641 | - d = self.shouldFail2(error.Error, |
---|
11642 | - "test_POST_upload_no_link_mutable_toobig", |
---|
11643 | - "413 Request Entity Too Large", |
---|
11644 | - "SDMF is limited to one segment, and 10001 > 10000", |
---|
11645 | - self.POST, |
---|
11646 | - "/uri", t="upload", mutable="true", |
---|
11647 | - file=("new.txt", |
---|
11648 | - "b" * (self.s.MUTABLE_SIZELIMIT+1)) ) |
---|
11649 | + # The SDMF size limit is no longer in place, so we should be |
---|
11650 | + # able to upload mutable files that are as large as we want them |
---|
11651 | + # to be. |
---|
11652 | + d = self.POST("/uri", t="upload", mutable="true", |
---|
11653 | + file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1))) |
---|
11654 | return d |
---|
11655 | |
---|
11656 | def test_POST_upload_mutable(self): |
---|
11657 | hunk ./src/allmydata/test/test_web.py 1815 |
---|
11658 | self.failUnlessReallyEqual(headers["content-type"], ["text/plain"]) |
---|
11659 | d.addCallback(_got_headers) |
---|
11660 | |
---|
11661 | - # make sure that size errors are displayed correctly for overwrite |
---|
11662 | - d.addCallback(lambda res: |
---|
11663 | - self.shouldFail2(error.Error, |
---|
11664 | - "test_POST_upload_mutable-toobig", |
---|
11665 | - "413 Request Entity Too Large", |
---|
11666 | - "SDMF is limited to one segment, and 10001 > 10000", |
---|
11667 | - self.POST, |
---|
11668 | - self.public_url + "/foo", t="upload", |
---|
11669 | - mutable="true", |
---|
11670 | - file=("new.txt", |
---|
11671 | - "b" * (self.s.MUTABLE_SIZELIMIT+1)), |
---|
11672 | - )) |
---|
11673 | - |
---|
11674 | + # make sure that outdated size limits aren't enforced anymore. |
---|
11675 | + d.addCallback(lambda ignored: |
---|
11676 | + self.POST(self.public_url + "/foo", t="upload", |
---|
11677 | + mutable="true", |
---|
11678 | + file=("new.txt", |
---|
11679 | + "b" * (self.s.MUTABLE_SIZELIMIT+1)))) |
---|
11680 | d.addErrback(self.dump_error) |
---|
11681 | return d |
---|
11682 | |
---|
11683 | hunk ./src/allmydata/test/test_web.py 1825 |
---|
11684 | def test_POST_upload_mutable_toobig(self): |
---|
11685 | - d = self.shouldFail2(error.Error, |
---|
11686 | - "test_POST_upload_mutable_toobig", |
---|
11687 | - "413 Request Entity Too Large", |
---|
11688 | - "SDMF is limited to one segment, and 10001 > 10000", |
---|
11689 | - self.POST, |
---|
11690 | - self.public_url + "/foo", |
---|
11691 | - t="upload", mutable="true", |
---|
11692 | - file=("new.txt", |
---|
11693 | - "b" * (self.s.MUTABLE_SIZELIMIT+1)) ) |
---|
11694 | + # SDMF had a size limit that was removed a while ago. MDMF has |
---|
11695 | + # never had a size limit. Test to make sure that we do not |
---|
11696 | + # encounter errors when trying to upload large mutable files, |
---|
11697 | + # since there should be no coded prohibitions regarding large |
---|
11698 | + # mutable files. |
---|
11699 | + d = self.POST(self.public_url + "/foo", |
---|
11700 | + t="upload", mutable="true", |
---|
11701 | + file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1))) |
---|
11702 | return d |
---|
11703 | |
---|
11704 | def dump_error(self, f): |
---|
11705 | hunk ./src/allmydata/test/test_web.py 2956 |
---|
11706 | d.addCallback(_done) |
---|
11707 | return d |
---|
11708 | |
---|
11709 | + |
---|
11710 | + def test_PUT_update_at_offset(self): |
---|
11711 | + file_contents = "test file" * 100000 # about 900 KiB |
---|
11712 | + d = self.PUT("/uri?mutable=true", file_contents) |
---|
11713 | + def _then(filecap): |
---|
11714 | + self.filecap = filecap |
---|
11715 | + new_data = file_contents[:100] |
---|
11716 | + new = "replaced and so on" |
---|
11717 | + new_data += new |
---|
11718 | + new_data += file_contents[len(new_data):] |
---|
11719 | + assert len(new_data) == len(file_contents) |
---|
11720 | + self.new_data = new_data |
---|
11721 | + d.addCallback(_then) |
---|
11722 | + d.addCallback(lambda ignored: |
---|
11723 | + self.PUT("/uri/%s?replace=True&offset=100" % self.filecap, |
---|
11724 | + "replaced and so on")) |
---|
11725 | + def _get_data(filecap): |
---|
11726 | + n = self.s.create_node_from_uri(filecap) |
---|
11727 | + return n.download_best_version() |
---|
11728 | + d.addCallback(_get_data) |
---|
11729 | + d.addCallback(lambda results: |
---|
11730 | + self.failUnlessEqual(results, self.new_data)) |
---|
11731 | + # Now try appending things to the file |
---|
11732 | + d.addCallback(lambda ignored: |
---|
11733 | + self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)), |
---|
11734 | + "puppies" * 100)) |
---|
11735 | + d.addCallback(_get_data) |
---|
11736 | + d.addCallback(lambda results: |
---|
11737 | + self.failUnlessEqual(results, self.new_data + ("puppies" * 100))) |
---|
11738 | + return d |
---|
11739 | + |
---|
11740 | + |
---|
11741 | + def test_PUT_update_at_offset_immutable(self): |
---|
11742 | + file_contents = "Test file" * 100000 |
---|
11743 | + d = self.PUT("/uri", file_contents) |
---|
11744 | + def _then(filecap): |
---|
11745 | + self.filecap = filecap |
---|
11746 | + d.addCallback(_then) |
---|
11747 | + d.addCallback(lambda ignored: |
---|
11748 | + self.shouldHTTPError("test immutable update", |
---|
11749 | + 400, "Bad Request", |
---|
11750 | + "immutable", |
---|
11751 | + self.PUT, |
---|
11752 | + "/uri/%s?offset=50" % self.filecap, |
---|
11753 | + "foo")) |
---|
11754 | + return d |
---|
11755 | + |
---|
11756 | + |
---|
11757 | def test_bad_method(self): |
---|
11758 | url = self.webish_url + self.public_url + "/foo/bar.txt" |
---|
11759 | d = self.shouldHTTPError("test_bad_method", |
---|
11760 | hunk ./src/allmydata/test/test_web.py 3257 |
---|
11761 | def _stash_mutable_uri(n, which): |
---|
11762 | self.uris[which] = n.get_uri() |
---|
11763 | assert isinstance(self.uris[which], str) |
---|
11764 | - d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3")) |
---|
11765 | + d.addCallback(lambda ign: |
---|
11766 | + c0.create_mutable_file(publish.MutableData(DATA+"3"))) |
---|
11767 | d.addCallback(_stash_mutable_uri, "corrupt") |
---|
11768 | d.addCallback(lambda ign: |
---|
11769 | c0.upload(upload.Data("literal", convergence=""))) |
---|
11770 | hunk ./src/allmydata/test/test_web.py 3404 |
---|
11771 | def _stash_mutable_uri(n, which): |
---|
11772 | self.uris[which] = n.get_uri() |
---|
11773 | assert isinstance(self.uris[which], str) |
---|
11774 | - d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3")) |
---|
11775 | + d.addCallback(lambda ign: |
---|
11776 | + c0.create_mutable_file(publish.MutableData(DATA+"3"))) |
---|
11777 | d.addCallback(_stash_mutable_uri, "corrupt") |
---|
11778 | |
---|
11779 | def _compute_fileurls(ignored): |
---|
11780 | hunk ./src/allmydata/test/test_web.py 4067 |
---|
11781 | def _stash_mutable_uri(n, which): |
---|
11782 | self.uris[which] = n.get_uri() |
---|
11783 | assert isinstance(self.uris[which], str) |
---|
11784 | - d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2")) |
---|
11785 | + d.addCallback(lambda ign: |
---|
11786 | + c0.create_mutable_file(publish.MutableData(DATA+"2"))) |
---|
11787 | d.addCallback(_stash_mutable_uri, "mutable") |
---|
11788 | |
---|
11789 | def _compute_fileurls(ignored): |
---|
11790 | hunk ./src/allmydata/test/test_web.py 4167 |
---|
11791 | convergence=""))) |
---|
11792 | d.addCallback(_stash_uri, "small") |
---|
11793 | |
---|
11794 | - d.addCallback(lambda ign: c0.create_mutable_file("mutable")) |
---|
11795 | + d.addCallback(lambda ign: |
---|
11796 | + c0.create_mutable_file(publish.MutableData("mutable"))) |
---|
11797 | d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn)) |
---|
11798 | d.addCallback(_stash_uri, "mutable") |
---|
11799 | |
---|
11800 | } |
---|
11801 | |
---|
11802 | Context: |
---|
11803 | |
---|
11804 | [web download-status: tolerate DYHBs that haven't retired yet. Fixes #1160. |
---|
11805 | Brian Warner <warner@lothar.com>**20100809225100 |
---|
11806 | Ignore-this: cb0add71adde0a2e24f4bcc00abf9938 |
---|
11807 | |
---|
11808 | Also add a better unit test for it. |
---|
11809 | ] |
---|
11810 | [immutable/filenode.py: put off DownloadStatus creation until first read() call |
---|
11811 | Brian Warner <warner@lothar.com>**20100809225055 |
---|
11812 | Ignore-this: 48564598f236eb73e96cd2d2a21a2445 |
---|
11813 | |
---|
11814 | This avoids spamming the "recent uploads and downloads" /status page from |
---|
11815 | FileNode instances that were created for a directory read but which nobody is |
---|
11816 | ever going to read from. I also cleaned up the way DownloadStatus instances |
---|
11817 | are made to only ever do it in the CiphertextFileNode, not in the |
---|
11818 | higher-level plaintext FileNode. Also fixed DownloadStatus handling of read |
---|
11819 | size, thanks to David-Sarah for the catch. |
---|
11820 | ] |
---|
11821 | [Share: hush log entries in the main loop() after the fetch has been completed. |
---|
11822 | Brian Warner <warner@lothar.com>**20100809204359 |
---|
11823 | Ignore-this: 72b9e262980edf5a967873ebbe1e9479 |
---|
11824 | ] |
---|
11825 | [test_runner.py: correct and simplify normalization of package directory for case-insensitive filesystems. |
---|
11826 | david-sarah@jacaranda.org**20100808185005 |
---|
11827 | Ignore-this: fba96e967d4e7f33f301c7d56b577de |
---|
11828 | ] |
---|
11829 | [test_runner.py: make test_path work for test-from-installdir. |
---|
11830 | david-sarah@jacaranda.org**20100808171340 |
---|
11831 | Ignore-this: 46328d769ae6ec8d191c3cddacc91dc9 |
---|
11832 | ] |
---|
11833 | [src/allmydata/__init__.py: make the package paths more accurate when we fail to get them from setuptools. |
---|
11834 | david-sarah@jacaranda.org**20100808171235 |
---|
11835 | Ignore-this: 8d534d2764d64f7434880bd70696cd75 |
---|
11836 | ] |
---|
11837 | [test_runner.py: another try at calculating the rootdir correctly for test-from-egg and test-from-prefixdir. |
---|
11838 | david-sarah@jacaranda.org**20100808154307 |
---|
11839 | Ignore-this: 66737313935f2a0313d1de9b2ed68d0 |
---|
11840 | ] |
---|
11841 | [test_runner.py: calculate the location of bin/tahoe correctly for test-from-prefixdir (by copying code from misc/build_helpers/run_trial.py). Also fix the false-positive check for Unicode paths in test_the_right_code, which was causing skips that should have been failures. |
---|
11842 | david-sarah@jacaranda.org**20100808042817 |
---|
11843 | Ignore-this: 1b7dfff07cbfb1a74f94141b18da2c3f |
---|
11844 | ] |
---|
11845 | [TAG allmydata-tahoe-1.8.0c1 |
---|
11846 | david-sarah@jacaranda.org**20100807004546 |
---|
11847 | Ignore-this: 484ff2513774f3b48ca49c992e878b89 |
---|
11848 | ] |
---|
11849 | [how_to_make_a_tahoe-lafs_release.txt: add step to check that release will report itself as the intended version. |
---|
11850 | david-sarah@jacaranda.org**20100807004254 |
---|
11851 | Ignore-this: 7709322e883f4118f38c7f042f5a9a2 |
---|
11852 | ] |
---|
11853 | [relnotes.txt: 1.8.0c1 release |
---|
11854 | david-sarah@jacaranda.org**20100807003646 |
---|
11855 | Ignore-this: 1994ffcaf55089eb05e96c23c037dfee |
---|
11856 | ] |
---|
11857 | [NEWS, quickstart.html and known_issues.txt for 1.8.0c1 release. |
---|
11858 | david-sarah@jacaranda.org**20100806235111 |
---|
11859 | Ignore-this: 777cea943685cf2d48b6147a7648fca0 |
---|
11860 | ] |
---|
11861 | [TAG allmydata-tahoe-1.8.0rc1 |
---|
11862 | warner@lothar.com**20100806080450] |
---|
11863 | [update NEWS and other docs in preparation for 1.8.0rc1 |
---|
11864 | Brian Warner <warner@lothar.com>**20100806080228 |
---|
11865 | Ignore-this: 6ebdf11806f6dfbfde0b61115421a459 |
---|
11866 | |
---|
11867 | in particular, merge the various 1.8.0b1/b2 sections, and remove the |
---|
11868 | datestamp. NEWS gets updated just before a release, doesn't need to precisely |
---|
11869 | describe pre-release candidates, and the datestamp gets updated just before |
---|
11870 | the final release is tagged |
---|
11871 | |
---|
11872 | Also, I removed the BOM from some files. My toolchain made it hard to retain, |
---|
11873 | and BOMs in UTF-8 don't make a whole lot of sense anyway. Sorry if that |
---|
11874 | messes anything up. |
---|
11875 | ] |
---|
11876 | [downloader.Segmentation: unregisterProducer when asked to stopProducing, this |
---|
11877 | Brian Warner <warner@lothar.com>**20100806070705 |
---|
11878 | Ignore-this: a0a71dcf83df8a6f727deb9a61fa4fdf |
---|
11879 | seems to avoid the #1155 log message which reveals the URI (and filecap). |
---|
11880 | |
---|
11881 | Also add an [ERROR] marker to the flog entry, since unregisterProducer also |
---|
11882 | makes interrupted downloads appear "200 OK"; this makes it more obvious that |
---|
11883 | the download did not complete. |
---|
11884 | ] |
---|
11885 | [TAG allmydata-tahoe-1.8.0b2 |
---|
11886 | david-sarah@jacaranda.org**20100806052415 |
---|
11887 | Ignore-this: 2c1af8df5e25a6ebd90a32b49b8486dc |
---|
11888 | ] |
---|
11889 | [relnotes.txt and docs/known_issues.txt for 1.8.0beta2. |
---|
11890 | david-sarah@jacaranda.org**20100806040823 |
---|
11891 | Ignore-this: 862ad55d93ee37259ded9e2c9da78eb9 |
---|
11892 | ] |
---|
11893 | [test_util.py: use SHA-256 from pycryptopp instead of MD5 from hashlib (for uses in which any hash will do), since hashlib was only added to the stdlib in Python 2.5. |
---|
11894 | david-sarah@jacaranda.org**20100806050051 |
---|
11895 | Ignore-this: 552049b5d190a5ca775a8240030dbe3f |
---|
11896 | ] |
---|
11897 | [test_runner.py: increase timeout to cater for Francois' ARM buildslave. |
---|
11898 | david-sarah@jacaranda.org**20100806042601 |
---|
11899 | Ignore-this: 6ee618cf00ac1c99cb7ddb60fd7ef078 |
---|
11900 | ] |
---|
11901 | [test_util.py: remove use of 'a if p else b' syntax that requires Python 2.5. |
---|
11902 | david-sarah@jacaranda.org**20100806041616 |
---|
11903 | Ignore-this: 5fecba9aa530ef352797fcfa70d5c592 |
---|
11904 | ] |
---|
11905 | [NEWS and docs/quickstart.html for 1.8.0beta2. |
---|
11906 | david-sarah@jacaranda.org**20100806035112 |
---|
11907 | Ignore-this: 3a593cfdc2ae265da8f64c6c8aebae4 |
---|
11908 | ] |
---|
11909 | [docs/quickstart.html: remove link to tahoe-lafs-ticket798-1.8.0b.zip, due to appname regression. refs #1159 |
---|
11910 | david-sarah@jacaranda.org**20100806002435 |
---|
11911 | Ignore-this: bad61b30cdcc3d93b4165d5800047b85 |
---|
11912 | ] |
---|
11913 | [test_download.DownloadTest.test_simultaneous_goodguess: enable some disabled |
---|
11914 | Brian Warner <warner@lothar.com>**20100805185507 |
---|
11915 | Ignore-this: ac53d44643805412238ccbfae920d20c |
---|
11916 | checks that used to fail but work now. |
---|
11917 | ] |
---|
11918 | [DownloadNode: fix lost-progress in fetch_failed, tolerate cancel when no segment-fetch is active. Fixes #1154. |
---|
11919 | Brian Warner <warner@lothar.com>**20100805185507 |
---|
11920 | Ignore-this: 35fd36b273b21b6dca12ab3d11ee7d2d |
---|
11921 | |
---|
11922 | The lost-progress bug occurred when two simultaneous read() calls fetched |
---|
11923 | different segments, and the first one failed (due to corruption, or the other |
---|
11924 | bugs in #1154): the second read() would never complete. While in this state, |
---|
11925 | cancelling the second read (by having its consumer call stopProducing) would |
---|
11926 | trigger the cancel-intolerance bug. Finally, in downloader.node.Cancel, |
---|
11927 | prevent late cancels by adding an 'active' flag |
---|
11928 | ] |
---|
11929 | [util/spans.py: __nonzero__ cannot return a long either. for #1154 |
---|
11930 | Brian Warner <warner@lothar.com>**20100805185507 |
---|
11931 | Ignore-this: 6f87fead8252e7a820bffee74a1c51a2 |
---|
11932 | ] |
---|
11933 | [test_storage.py: change skip note for test_large_share to say that Windows doesn't support sparse files. refs #569 |
---|
11934 | david-sarah@jacaranda.org**20100805022612 |
---|
11935 | Ignore-this: 85c807a536dc4eeb8bf14980028bb05b |
---|
11936 | ] |
---|
11937 | [One fix for bug #1154: webapi GETs with a 'Range' header broke new-downloader. |
---|
11938 | Brian Warner <warner@lothar.com>**20100804184549 |
---|
11939 | Ignore-this: ffa3e703093a905b416af125a7923b7b |
---|
11940 | |
---|
11941 | The Range header causes n.read() to be called with an offset= of type 'long', |
---|
11942 | which eventually got used in a Spans/DataSpans object's __len__ method. |
---|
11943 | Apparently python doesn't permit __len__() to return longs, only ints. |
---|
11944 | Rewrote Spans/DataSpans to use s.len() instead of len(s) aka s.__len__() . |
---|
11945 | Added a test in test_download. Note that test_web didn't catch this because |
---|
11946 | it uses mock FileNodes for speed: it's probably time to rewrite that. |
---|
11947 | |
---|
11948 | There is still an unresolved error-recovery problem in #1154, so I'm not |
---|
11949 | closing the ticket quite yet. |
---|
11950 | ] |
---|
11951 | [test_download: minor cleanup |
---|
11952 | Brian Warner <warner@lothar.com>**20100804175555 |
---|
11953 | Ignore-this: f4aec3c77f6a0d7f7b2c07f302755cc1 |
---|
11954 | ] |
---|
11955 | [fetcher.py: improve comments |
---|
11956 | Brian Warner <warner@lothar.com>**20100804072814 |
---|
11957 | Ignore-this: 8bf74c21aef55cf0b0642e55ee4e7c5f |
---|
11958 | ] |
---|
11959 | [lazily create DownloadNode upon first read()/get_segment() |
---|
11960 | Brian Warner <warner@lothar.com>**20100804072808 |
---|
11961 | Ignore-this: 4bb1c49290cefac1dadd9d42fac46ba2 |
---|
11962 | ] |
---|
11963 | [test_hung_server: update comments, remove dead "stage_4_d" code |
---|
11964 | Brian Warner <warner@lothar.com>**20100804072800 |
---|
11965 | Ignore-this: 4d18b374b568237603466f93346d00db |
---|
11966 | ] |
---|
11967 | [copy the rest of David-Sarah's changes to make my tree match 1.8.0beta |
---|
11968 | Brian Warner <warner@lothar.com>**20100804072752 |
---|
11969 | Ignore-this: 9ac7f21c9b27e53452371096146be5bb |
---|
11970 | ] |
---|
11971 | [ShareFinder: add 10s OVERDUE timer, send new requests to replace overdue ones |
---|
11972 | Brian Warner <warner@lothar.com>**20100804072741 |
---|
11973 | Ignore-this: 7fa674edbf239101b79b341bb2944349 |
---|
11974 | |
---|
11975 | The fixed 10-second timer will eventually be replaced with a per-server |
---|
11976 | value, calculated based on observed response times. |
---|
11977 | |
---|
11978 | test_hung_server.py: enhance to exercise DYHB=OVERDUE state. Split existing |
---|
11979 | mutable+immutable tests into two pieces for clarity. Reenabled several tests. |
---|
11980 | Deleted the now-obsolete "test_failover_during_stage_4". |
---|
11981 | ] |
---|
11982 | [Rewrite immutable downloader (#798). This patch adds and updates unit tests. |
---|
11983 | Brian Warner <warner@lothar.com>**20100804072710 |
---|
11984 | Ignore-this: c3c838e124d67b39edaa39e002c653e1 |
---|
11985 | ] |
---|
11986 | [Rewrite immutable downloader (#798). This patch includes higher-level |
---|
11987 | Brian Warner <warner@lothar.com>**20100804072702 |
---|
11988 | Ignore-this: 40901ddb07d73505cb58d06d9bff73d9 |
---|
11989 | integration into the NodeMaker, and updates the web-status display to handle |
---|
11990 | the new download events. |
---|
11991 | ] |
---|
11992 | [Rewrite immutable downloader (#798). This patch rearranges the rest of src/allmydata/immutable/ . |
---|
11993 | Brian Warner <warner@lothar.com>**20100804072639 |
---|
11994 | Ignore-this: 302b1427a39985bfd11ccc14a1199ea4 |
---|
11995 | ] |
---|
11996 | [Rewrite immutable downloader (#798). This patch adds the new downloader itself. |
---|
11997 | Brian Warner <warner@lothar.com>**20100804072629 |
---|
11998 | Ignore-this: e9102460798123dd55ddca7653f4fc16 |
---|
11999 | ] |
---|
12000 | [util/observer.py: add EventStreamObserver |
---|
12001 | Brian Warner <warner@lothar.com>**20100804072612 |
---|
12002 | Ignore-this: fb9d205f34a6db7580b9be33414dfe21 |
---|
12003 | ] |
---|
12004 | [Add a byte-spans utility class, like perl's Set::IntSpan for .newsrc files. |
---|
12005 | Brian Warner <warner@lothar.com>**20100804072600 |
---|
12006 | Ignore-this: bbad42104aeb2f26b8dd0779de546128 |
---|
12007 | Also a data-spans class, which records a byte (instead of a bit) for each |
---|
12008 | index. |
---|
12009 | ] |
---|
12010 | [check-umids: oops, forgot to add the tool |
---|
12011 | Brian Warner <warner@lothar.com>**20100804071713 |
---|
12012 | Ignore-this: bbeb74d075414f3713fabbdf66189faf |
---|
12013 | ] |
---|
12014 | [coverage tools: ignore errors, display lines-uncovered in elisp mode. Fix Makefile paths. |
---|
12015 | "Brian Warner <warner@lothar.com>"**20100804071131] |
---|
12016 | [check-umids: new tool to check uniqueness of umids |
---|
12017 | "Brian Warner <warner@lothar.com>"**20100804071042] |
---|
12018 | [misc/simulators/sizes.py: update, we now use SHA256 (not SHA1), so large-file overhead grows to 0.5% |
---|
12019 | "Brian Warner <warner@lothar.com>"**20100804070942] |
---|
12020 | [storage-overhead: try to fix, probably still broken |
---|
12021 | "Brian Warner <warner@lothar.com>"**20100804070815] |
---|
12022 | [docs/quickstart.html: link to 1.8.0beta zip, and note 'bin\tahoe' on Windows. |
---|
12023 | david-sarah@jacaranda.org**20100803233254 |
---|
12024 | Ignore-this: 3c11f249efc42a588e3a7056349739ed |
---|
12025 | ] |
---|
12026 | [docs: relnotes.txt for 1.8.0β |
---|
12027 | zooko@zooko.com**20100803154913 |
---|
12028 | Ignore-this: d9101f72572b18da3cfac3c0e272c907 |
---|
12029 | ] |
---|
12030 | [test_storage.py: avoid spurious test failure by accepting either 'Next crawl in 59 minutes' or 'Next crawl in 60 minutes'. fixes #1140 |
---|
12031 | david-sarah@jacaranda.org**20100803102058 |
---|
12032 | Ignore-this: aa2419fc295727e4fbccec3c7b780e76 |
---|
12033 | ] |
---|
12034 | [misc/build_helpers/show-tool-versions.py: get sys.std{out,err}.encoding and 'as' version correctly, and improve formatting. |
---|
12035 | david-sarah@jacaranda.org**20100803101128 |
---|
12036 | Ignore-this: 4fd2907d86da58eb220e104010e9c6a |
---|
12037 | ] |
---|
12038 | [misc/build_helpers/show-tool-versions.py: avoid error message when 'as -version' does not create a.out. |
---|
12039 | david-sarah@jacaranda.org**20100803094812 |
---|
12040 | Ignore-this: 38fc2d639f30b4e123b9551e6931998d |
---|
12041 | ] |
---|
12042 | [CLI: further improve consistency of basedir options and add tests. addresses #118 |
---|
12043 | david-sarah@jacaranda.org**20100803085416 |
---|
12044 | Ignore-this: d8f8f55738abb5ea44ed4cf24d750efe |
---|
12045 | ] |
---|
12046 | [CLI: make the synopsis for 'tahoe unlink' say unlink instead of rm. |
---|
12047 | david-sarah@jacaranda.org**20100803085359 |
---|
12048 | Ignore-this: c35d3f99f906dfab61df8f5e81a42c92 |
---|
12049 | ] |
---|
12050 | [CLI: make all of the option descriptions imperative sentences. |
---|
12051 | david-sarah@jacaranda.org**20100803084801 |
---|
12052 | Ignore-this: ec80c7d2a10c6452d190fee4e1a60739 |
---|
12053 | ] |
---|
12054 | [test_cli.py: make 'tahoe mkdir' tests slightly less dumb (check for 'URI:' in the output). |
---|
12055 | david-sarah@jacaranda.org**20100803084720 |
---|
12056 | Ignore-this: 31a4ae4fb5f7c123bc6b6e36a9e3911e |
---|
12057 | ] |
---|
12058 | [test_cli.py: use u-escapes instead of UTF-8. |
---|
12059 | david-sarah@jacaranda.org**20100803083538 |
---|
12060 | Ignore-this: a48af66942defe8491c6e1811c7809b5 |
---|
12061 | ] |
---|
12062 | [NEWS: remove XXX comment and separate description of #890. |
---|
12063 | david-sarah@jacaranda.org**20100803050827 |
---|
12064 | Ignore-this: 6d308f34dc9d929d3d0811f7a1f5c786 |
---|
12065 | ] |
---|
12066 | [docs: more updates to NEWS for 1.8.0β |
---|
12067 | zooko@zooko.com**20100803044618 |
---|
12068 | Ignore-this: 8193a1be38effe2bdcc632fdb570e9fc |
---|
12069 | ] |
---|
12070 | [docs: incomplete beginnings of a NEWS update for v1.8β |
---|
12071 | zooko@zooko.com**20100802072840 |
---|
12072 | Ignore-this: cb00fcd4f1e0eaed8c8341014a2ba4d4 |
---|
12073 | ] |
---|
12074 | [docs/quickstart.html: extra step to open a new Command Prompt or log out/in on Windows. |
---|
12075 | david-sarah@jacaranda.org**20100803004938 |
---|
12076 | Ignore-this: 1334a2cd01f77e0c9eddaeccfeff2370 |
---|
12077 | ] |
---|
12078 | [update bundled zetuptools with doc changes, change to script setup for Windows XP, and to have the 'develop' command run script setup. |
---|
12079 | david-sarah@jacaranda.org**20100803003815 |
---|
12080 | Ignore-this: 73c86e154f4d3f7cc9855eb31a20b1ed |
---|
12081 | ] |
---|
12082 | [bundled setuptools/command/scriptsetup.py: use SendMessageTimeoutW, to test whether that broadcasts environment changes any better. |
---|
12083 | david-sarah@jacaranda.org**20100802224505 |
---|
12084 | Ignore-this: 7788f7c2f9355e7852a376ec94182056 |
---|
12085 | ] |
---|
12086 | [bundled zetuptoolz: add missing setuptools/command/scriptsetup.py |
---|
12087 | david-sarah@jacaranda.org**20100802072129 |
---|
12088 | Ignore-this: 794b1c411f6cdec76eeb716223a55d0 |
---|
12089 | ] |
---|
12090 | [test_runner.py: add test_run_with_python_options, which checks that the Windows script changes haven't broken 'python <options> bin/tahoe'. |
---|
12091 | david-sarah@jacaranda.org**20100802062558 |
---|
12092 | Ignore-this: 812a2ccb7d9c7a8e01d5ca04d875aba5 |
---|
12093 | ] |
---|
12094 | [test_runner.py: fix missing import of get_filesystem_encoding |
---|
12095 | david-sarah@jacaranda.org**20100802060902 |
---|
12096 | Ignore-this: 2e9e439b7feb01e0c3c94b54e802503b |
---|
12097 | ] |
---|
12098 | [Bundle setuptools-0.6c16dev (with Windows script changes, and the change to only warn if site.py wasn't generated by setuptools) instead of 0.6c15dev. addresses #565, #1073, #1074 |
---|
12099 | david-sarah@jacaranda.org**20100802060602 |
---|
12100 | Ignore-this: 34ee2735e49e2c05b57e353d48f83050 |
---|
12101 | ] |
---|
12102 | [.darcs-boringfile: changes needed to take account of egg directories being bundled. Also, make _trial_temp a prefix rather than exact match. |
---|
12103 | david-sarah@jacaranda.org**20100802050313 |
---|
12104 | Ignore-this: 8de6a8dbaba014ba88dec6c792fc5a9d |
---|
12105 | ] |
---|
12106 | [.darcs-boringfile: changes needed to take account of pyscript wrappers on Windows. |
---|
12107 | david-sarah@jacaranda.org**20100802050128 |
---|
12108 | Ignore-this: 7366b631e2095166696e6da5765d9180 |
---|
12109 | ] |
---|
12110 | [misc/build_helpers/run_trial.py: check that the root from which the module we are testing was loaded is the current directory. This version of the patch folds in later fixes to the logic for calculating the directories to compare, and improvements to error messages. addresses #1137 |
---|
12111 | david-sarah@jacaranda.org**20100802045535 |
---|
12112 | Ignore-this: 9d3c1447f0539c6308127413098eb646 |
---|
12113 | ] |
---|
12114 | [Skip option arguments to the python interpreter when reconstructing Unicode argv on Windows. |
---|
12115 | david-sarah@jacaranda.org**20100728062731 |
---|
12116 | Ignore-this: 2b17fc43860bcc02a66bb6e5e050ea7c |
---|
12117 | ] |
---|
12118 | [windows/fixups.py: improve comments and reference some relevant Python bugs. |
---|
12119 | david-sarah@jacaranda.org**20100727181921 |
---|
12120 | Ignore-this: 32e61cf98dfc2e3dac60b750dda6429b |
---|
12121 | ] |
---|
12122 | [windows/fixups.py: make errors reported to original_stderr have enough information to debug even if we can't see the traceback. |
---|
12123 | david-sarah@jacaranda.org**20100726221904 |
---|
12124 | Ignore-this: e30b4629a7aa5d71554237c7e809c080 |
---|
12125 | ] |
---|
12126 | [windows/fixups.py: fix paste-o in name of Unicode stderr wrapper. |
---|
12127 | david-sarah@jacaranda.org**20100726214736 |
---|
12128 | Ignore-this: cb220931f1683eb53b0c7269e18a38be |
---|
12129 | ] |
---|
12130 | [windows/fixups.py: Don't rely on buggy MSVCRT library for Unicode output, use the Win32 API instead. This should make it work on XP. Also, change how we handle the case where sys.stdout and sys.stderr are redirected, since the .encoding attribute isn't necessarily writeable. |
---|
12131 | david-sarah@jacaranda.org**20100726045019 |
---|
12132 | Ignore-this: 69267abc5065cbd5b86ca71fe4921fb6 |
---|
12133 | ] |
---|
12134 | [test_runner.py: change to code for locating the bin/tahoe script that was missed when rebasing the patch for #1074. |
---|
12135 | david-sarah@jacaranda.org**20100725182008 |
---|
12136 | Ignore-this: d891a93989ecc3f4301a17110c3d196c |
---|
12137 | ] |
---|
12138 | [Add missing windows/fixups.py (for setting up Unicode args and output on Windows). |
---|
12139 | david-sarah@jacaranda.org**20100725092849 |
---|
12140 | Ignore-this: 35a1e8aeb4e1dea6e81433bf0825a6f6 |
---|
12141 | ] |
---|
12142 | [Changes to Tahoe needed to work with new zetuptoolz (that does not use .exe wrappers on Windows), and to support Unicode arguments and stdout/stderr -- v5 |
---|
12143 | david-sarah@jacaranda.org**20100725083216 |
---|
12144 | Ignore-this: 5041a634b1328f041130658233f6a7ce |
---|
12145 | ] |
---|
12146 | [scripts/common.py: fix an error introduced when rebasing to the ticket798 branch, which caused base directories to be duplicated in self.basedirs. |
---|
12147 | david-sarah@jacaranda.org**20100802064929 |
---|
12148 | Ignore-this: 116fd437d1f91a647879fe8d9510f513 |
---|
12149 | ] |
---|
12150 | [Basedir/node directory option improvements for ticket798 branch. addresses #188, #706, #715, #772, #890 |
---|
12151 | david-sarah@jacaranda.org**20100802043004 |
---|
12152 | Ignore-this: d19fc24349afa19833406518595bfdf7 |
---|
12153 | ] |
---|
12154 | [scripts/create_node.py: allow nickname to be Unicode. Also ensure webport is validly encoded in config file. |
---|
12155 | david-sarah@jacaranda.org**20100802000212 |
---|
12156 | Ignore-this: fb236169280507dd1b3b70d459155f6e |
---|
12157 | ] |
---|
12158 | [test_runner.py: Fix error in message arguments to 'fail' calls. |
---|
12159 | david-sarah@jacaranda.org**20100802013526 |
---|
12160 | Ignore-this: 3bfdef19ae3cf993194811367da5d020 |
---|
12161 | ] |
---|
12162 | [Additional Unicode basedir changes for ticket798 branch. |
---|
12163 | david-sarah@jacaranda.org**20100802010552 |
---|
12164 | Ignore-this: 7090d8c6b04eb6275345a55e75142028 |
---|
12165 | ] |
---|
12166 | [Unicode basedir changes for ticket798 branch. |
---|
12167 | david-sarah@jacaranda.org**20100801235310 |
---|
12168 | Ignore-this: a00717eaeae8650847b5395801e04c45 |
---|
12169 | ] |
---|
12170 | [fileutil: change WindowsError to OSError in abspath_expanduser_unicode, because WindowsError might not exist. |
---|
12171 | david-sarah@jacaranda.org**20100725222603 |
---|
12172 | Ignore-this: e125d503670ed049a9ade0322faa0c51 |
---|
12173 | ] |
---|
12174 | [test_system: correct a failure in _test_runner caused by Unicode basedir patch on non-Unicode platforms. |
---|
12175 | david-sarah@jacaranda.org**20100724032123 |
---|
12176 | Ignore-this: 399b3953104fdd1bbed3f7564d163553 |
---|
12177 | ] |
---|
12178 | [Fix test failures due to Unicode basedir patches. |
---|
12179 | david-sarah@jacaranda.org**20100725010318 |
---|
12180 | Ignore-this: fe92cd439eb3e60a56c007ae452784ed |
---|
12181 | ] |
---|
12182 | [util.encodingutil: change quote_output to do less unnecessary escaping, and to use double-quotes more consistently when needed. This version avoids u-escaping for characters that are representable in the output encoding, when double quotes are used, and includes tests. fixes #1135 |
---|
12183 | david-sarah@jacaranda.org**20100723075314 |
---|
12184 | Ignore-this: b82205834d17db61612dd16436b7c5a2 |
---|
12185 | ] |
---|
12186 | [Replace uses of os.path.abspath with abspath_expanduser_unicode where necessary. This makes basedir paths consistently represented as Unicode. |
---|
12187 | david-sarah@jacaranda.org**20100722001418 |
---|
12188 | Ignore-this: 9f8cb706540e695550e0dbe303c01f52 |
---|
12189 | ] |
---|
12190 | [util.fileutil, test.test_util: add abspath_expanduser_unicode function, to work around <http://bugs.python.org/issue3426>. util.encodingutil: add a convenience function argv_to_abspath. |
---|
12191 | david-sarah@jacaranda.org**20100721231507 |
---|
12192 | Ignore-this: eee6904d1f65a733ff35190879844d08 |
---|
12193 | ] |
---|
12194 | [setup: increase requirement on foolscap from >= 0.4.1 to >= 0.5.1 to avoid the foolscap performance bug with transferring large mutable files |
---|
12195 | zooko@zooko.com**20100802071748 |
---|
12196 | Ignore-this: 53b5b8571ebfee48e6b11e3f3a5efdb7 |
---|
12197 | ] |
---|
12198 | [upload: tidy up logging messages |
---|
12199 | zooko@zooko.com**20100802070212 |
---|
12200 | Ignore-this: b3532518326f6d808d085da52c14b661 |
---|
12201 | reformat code to be less than 100 chars wide, refactor formatting of logging messages, add log levels to some logging messages, M-x whitespace-cleanup |
---|
12202 | ] |
---|
12203 | [tests: remove debug print |
---|
12204 | zooko@zooko.com**20100802063339 |
---|
12205 | Ignore-this: b13b8c15e946556bffca9d7ad7c890f5 |
---|
12206 | ] |
---|
12207 | [docs: update the list of forums to announce Tahoe-LAFS to, add empty checkboxes |
---|
12208 | zooko@zooko.com**20100802063314 |
---|
12209 | Ignore-this: 89d0e8bd43f1749a9e85fcee2205bb04 |
---|
12210 | ] |
---|
12211 | [immutable: tidy-up some code by using a set instead of list to hold homeless_shares |
---|
12212 | zooko@zooko.com**20100802062004 |
---|
12213 | Ignore-this: a70bda3cf6c48ab0f0688756b015cf8d |
---|
12214 | ] |
---|
12215 | [setup: fix a couple instances of hard-coded 'allmydata-tahoe' in the scripts, tighten the tests (as suggested by David-Sarah) |
---|
12216 | zooko@zooko.com**20100801164207 |
---|
12217 | Ignore-this: 50265b562193a9a3797293123ed8ba5c |
---|
12218 | ] |
---|
12219 | [setup: replace hardcoded 'allmydata-tahoe' with allmydata.__appname__ |
---|
12220 | zooko@zooko.com**20100801160517 |
---|
12221 | Ignore-this: 55e1a98515300d228f02df10975f7ba |
---|
12222 | ] |
---|
12223 | [NEWS: describe #1055 |
---|
12224 | zooko@zooko.com**20100801034338 |
---|
12225 | Ignore-this: 3a16cfa387c2b245c610ea1d1ad8d7f1 |
---|
12226 | ] |
---|
12227 | [immutable: use PrefixingLogMixin to organize logging in Tahoe2PeerSelector and add more detailed messages about peer |
---|
12228 | zooko@zooko.com**20100719082000 |
---|
12229 | Ignore-this: e034c4988b327f7e138a106d913a3082 |
---|
12230 | ] |
---|
12231 | [benchmarking: update bench_dirnode to be correct and use the shiniest new pyutil.benchutil features concerning what units you measure in |
---|
12232 | zooko@zooko.com**20100719044948 |
---|
12233 | Ignore-this: b72059e4ff921741b490e6b47ec687c6 |
---|
12234 | ] |
---|
12235 | [trivial: rename and add in-line doc to clarify "used_peers" => "upload_servers" |
---|
12236 | zooko@zooko.com**20100719044744 |
---|
12237 | Ignore-this: 93c42081676e0dea181e55187cfc506d |
---|
12238 | ] |
---|
12239 | [abbreviate time edge case python2.5 unit test |
---|
12240 | jacob.lyles@gmail.com**20100729210638 |
---|
12241 | Ignore-this: 80f9b1dc98ee768372a50be7d0ef66af |
---|
12242 | ] |
---|
12243 | [docs: add Jacob Lyles to CREDITS |
---|
12244 | zooko@zooko.com**20100730230500 |
---|
12245 | Ignore-this: 9dbbd6a591b4b1a5a8dcb69b7b757792 |
---|
12246 | ] |
---|
12247 | [web: don't use %d formatting on a potentially large negative float -- there is a bug in Python 2.5 in that case |
---|
12248 | jacob.lyles@gmail.com**20100730220550 |
---|
12249 | Ignore-this: 7080eb4bddbcce29cba5447f8f4872ee |
---|
12250 | fixes #1055 |
---|
12251 | ] |
---|
12252 | [test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 -- fix .todo reference. |
---|
12253 | david-sarah@jacaranda.org**20100729152927 |
---|
12254 | Ignore-this: c8fe1047edcc83c87b9feb47f4aa587b |
---|
12255 | ] |
---|
12256 | [test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 for consistency. |
---|
12257 | david-sarah@jacaranda.org**20100729142250 |
---|
12258 | Ignore-this: bc3aad5919ae9079ceb9968ad0f5ea5a |
---|
12259 | ] |
---|
12260 | [docs: fix licensing typo that was earlier fixed in [20090921164651-92b7f-7f97b58101d93dc588445c52a9aaa56a2c7ae336] |
---|
12261 | zooko@zooko.com**20100729052923 |
---|
12262 | Ignore-this: a975d79115911688e5469d4d869e1664 |
---|
12263 | I wish we didn't have copies of this licensing text in several different files so that changes can be accidentally omitted from some of them. |
---|
12264 | ] |
---|
12265 | [misc/build_helpers/run-with-pythonpath.py: fix stale comment, and remove 'trial' example that is not the right way to run trial. |
---|
12266 | david-sarah@jacaranda.org**20100726225729 |
---|
12267 | Ignore-this: a61f55557ad69a1633bfb2b8172cce97 |
---|
12268 | ] |
---|
12269 | [docs/specifications/dirnodes.txt: 'mesh'->'grid'. |
---|
12270 | david-sarah@jacaranda.org**20100723061616 |
---|
12271 | Ignore-this: 887bcf921ef00afba8e05e9239035bca |
---|
12272 | ] |
---|
12273 | [docs/specifications/dirnodes.txt: bring layer terminology up-to-date with architecture.txt, and a few other updates (e.g. note that the MAC is no longer verified, and that URIs can be unknown). Also 'Tahoe'->'Tahoe-LAFS'. |
---|
12274 | david-sarah@jacaranda.org**20100723054703 |
---|
12275 | Ignore-this: f3b98183e7d0a0f391225b8b93ac6c37 |
---|
12276 | ] |
---|
12277 | [docs: use current cap to Zooko's wiki page in example text |
---|
12278 | zooko@zooko.com**20100721010543 |
---|
12279 | Ignore-this: 4f36f36758f9fdbaf9eb73eac23b6652 |
---|
12280 | fixes #1134 |
---|
12281 | ] |
---|
12282 | [__init__.py: silence DeprecationWarning about BaseException.message globally. fixes #1129 |
---|
12283 | david-sarah@jacaranda.org**20100720011939 |
---|
12284 | Ignore-this: 38808986ba79cb2786b010504a22f89 |
---|
12285 | ] |
---|
12286 | [test_runner: test that 'tahoe --version' outputs no noise (e.g. DeprecationWarnings). |
---|
12287 | david-sarah@jacaranda.org**20100720011345 |
---|
12288 | Ignore-this: dd358b7b2e5d57282cbe133e8069702e |
---|
12289 | ] |
---|
12290 | [TAG allmydata-tahoe-1.7.1 |
---|
12291 | zooko@zooko.com**20100719131352 |
---|
12292 | Ignore-this: 6942056548433dc653a746703819ad8c |
---|
12293 | ] |
---|
12294 | Patch bundle hash: |
---|
12295 | 8ebc519acc2ab0a4fde7985febe8abf350684a4e |
---|