Ticket #393: 393status33.dpatch

File 393status33.dpatch, 544.6 KB (added by kevan at 2010-08-14T00:22:22Z)
1Mon Aug  9 16:25:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
3 
4  The checker and repairer required minimal changes to work with the MDMF
5  modifications made elsewhere. The checker duplicated a lot of the code
6  that was already in the downloader, so I modified the downloader
7  slightly to expose this functionality to the checker and removed the
8  duplicated code. The repairer only required a minor change to deal with
9  data representation.
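
  For reference, the new verification path is small. A minimal sketch,
  assuming `node`, `servermap`, and `best_version` came out of a
  MODE_CHECK servermap update, as in MutableChecker.check() below:

      from allmydata.mutable.retrieve import Retrieve

      def verify_all_shares(node, servermap, best_version):
          # verify=True tells the downloader to fetch all N shares and
          # to verify them instead of decrypting/decoding; the Deferred
          # fires with the list of bad shares found along the way.
          r = Retrieve(node, servermap, best_version, verify=True)
          return r.download()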
10
11Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
12  * interfaces.py: Add #993 interfaces
13
14Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
15  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
16
17Mon Aug  9 16:36:23 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
18  * nodemaker.py: Make nodemaker expose a way to create MDMF files
19
20Mon Aug  9 16:40:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
21  * mutable/layout.py and interfaces.py: add MDMF writer and reader
22 
23  The MDMF writer is responsible for keeping state as plaintext is
24  gradually processed into share data by the upload process. When the
25  upload finishes, it will write all of its share data to a remote server,
26  reporting its status back to the publisher.
27 
28  The MDMF reader is responsible for abstracting, from the downloader,
29  an MDMF file as it sits on the grid; specifically, it receives and
30  responds to requests for arbitrary data within the MDMF file.
31 
32  The interfaces.py file has also been modified to contain an interface
33  for the writer.
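
  To make the writer protocol concrete before the patch itself: a hedged
  sketch of the call sequence a publisher might drive against an
  IMutableSlotWriter. put_root_hash() lives on the concrete SDMF write
  proxy in this patch rather than in the interface itself, and `sign`
  stands in for the publisher's real signing step.

      # writer: an IMutableSlotWriter (e.g. SDMFSlotWriteProxy)
      writer.set_checkstring(checkstring)       # expected remote state
      writer.put_block(block_data, 0, salt)     # one block per segment
      writer.put_encprivkey(encprivkey)
      writer.put_blockhashes(block_hashes)      # a list of hashes
      writer.put_sharehashes(share_hash_chain)  # a dict of {shnum: hash}
      writer.put_root_hash(root_hash)
      writer.put_signature(sign(writer.get_signable()))
      writer.put_verification_key(verification_key)
      d = writer.finish_publishing()            # one remote write for SDMF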
34
35Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
36  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
37
38Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
39  * immutable/literal.py: implement the same interfaces as other filenodes
40
41Wed Aug 11 16:30:49 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
42  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
43 
44  One of the goals of MDMF as a GSoC project is to lay the groundwork for
45  LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
46  multiple versions of a single cap on the grid. In line with this, there
47  is now a distinction between an overriding mutable file (which can be
48  thought of as corresponding to the cap/unique identifier for that mutable
49  file) and versions of the mutable file (which we can download, update,
50  and so on). All download, upload, and modification operations end up
51  happening on a particular version of a mutable file, but there are
52  shortcut methods on the object representing the overriding mutable file
53  that perform these operations on the best version of the mutable file
54  (which is what code should be doing until we have LDMF and better
55  support for other paradigms).
56 
57  Another goal of MDMF was to take advantage of segmentation to give
58  callers more efficient partial file updates or appends. This patch
59  implements methods that do that, too.
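
  A minimal sketch of the resulting API, assuming `node` is an
  IMutableFileNode reached through a write cap (MutableData is the
  uploadable wrapper added by the publish.py patch below):

      from allmydata.mutable.publish import MutableData

      d = node.get_best_mutable_version()
      def _append(version):
          # update() starts writing at the given offset and extends the
          # file as needed, so writing at the current end is an append.
          return version.update(MutableData("appended data"),
                                version.get_size())
      d.addCallback(_append)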
60 
61
62Wed Aug 11 16:31:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
63  * mutable/publish.py: Modify the publish process to support MDMF
64 
65  The inner workings of the publishing process needed to be reworked to a
66  large extent to cope with segmented mutable files, and to cope with
67  partial-file updates of mutable files. This patch does that. It also
68  introduces wrappers for uploadable data, allowing the use of
69  filehandle-like objects as data sources, in addition to strings. This
70  reduces memory usage when dealing with large files through the
71  webapi, and clarifies the update code there.
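
  A hedged sketch of the two wrappers in use (the node and file name are
  placeholders):

      from allmydata.mutable.publish import MutableData, MutableFileHandle

      # From a string held in memory:
      d = node.overwrite(MutableData("brand new contents"))
      # ...or from a filehandle-like object, so a large file never has
      # to be held in memory all at once:
      d = node.overwrite(MutableFileHandle(open("big.dat", "rb")))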
72
73Wed Aug 11 16:31:25 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
74  * mutable/retrieve.py: Modify the retrieval process to support MDMF
75 
76  The logic behind a mutable file download had to be adapted to work with
77  segmented mutable files; this patch performs those adaptations. It also
78  exposes some decoding and decrypting functionality to make partial-file
79  updates a little easier, and supports efficient random-access downloads
80  of parts of an MDMF file.
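
  For example, a random-access read of part of a large MDMF file might
  look like the following sketch, where `version` is an IReadable and
  MemoryConsumer is the simple download-to-memory consumer from
  allmydata.util.consumer:

      from allmydata.util.consumer import MemoryConsumer

      consumer = MemoryConsumer()
      # Only the segments overlapping bytes [10000, 10000+4096) need to
      # be fetched and decoded, rather than the whole file.
      d = version.read(consumer, offset=10000, size=4096)
      d.addCallback(lambda ign: "".join(consumer.chunks))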
81
82Wed Aug 11 16:33:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
83  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
84 
85  These modifications were mostly aimed at having the servermap updater
86  use the unified MDMF + SDMF read interface whenever possible -- this
87  reduces the complexity of the code, making it easier to read and
88  maintain. To do this, I needed to modify the servermap update process
89  slightly.
90 
91  To support partial-file updates, I also modified the servermap updater
92  to fetch the block hash trees and certain segments of files while it
93  performed a servermap update (this can be done without adding any new
94  roundtrips because of batch-read functionality that the read proxy has).
95 
96
97Thu Aug 12 16:14:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
98  * client.py: learn how to create different kinds of mutable files
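
  A hedged sketch of what this presumably looks like at the client level,
  mirroring the nodemaker change later in this bundle (the `version=`
  parameter name is taken from that patch; `client` is a placeholder):

      from allmydata.interfaces import MDMF_VERSION
      from allmydata.mutable.publish import MutableData

      d = client.create_mutable_file(MutableData("initial contents"),
                                     version=MDMF_VERSION)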
99
100Fri Aug 13 16:49:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
101  * scripts: tell 'tahoe put' about MDMF
102
103Fri Aug 13 16:50:38 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
104  * tests:
105 
106      - A lot of existing tests relied on aspects of the mutable file
107        implementation that were changed. This patch updates those tests
108        to work with the changes.
109      - This patch also adds tests for new features.
110
111Fri Aug 13 16:51:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
112  * web: Alter the webapi to get along with and take advantage of the MDMF changes
113 
114  The main benefit that the webapi gets from MDMF, at least initially, is
115  the ability to do a streaming download of an MDMF mutable file. It also
116  exposes a way (through the PUT verb) to append to or otherwise modify
117  (in-place) an MDMF mutable file.
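
  A hedged sketch of an in-place modification through the webapi (Python
  2, matching the codebase; the `offset` query parameter and port are
  assumptions, since this excerpt does not show the webapi code itself):

      import urllib2

      url = "http://127.0.0.1:3456/uri/%s?offset=1000" % write_cap
      req = urllib2.Request(url, data="bytes to splice in at offset 1000")
      req.get_method = lambda: "PUT"  # urllib2 issues PUT via this idiom
      urllib2.urlopen(req)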
118
119New patches:
120
121[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
122Kevan Carstensen <kevan@isnotajoke.com>**20100809232514
123 Ignore-this: 1bcef2f262c868f61e57cc19a3cac89a
124 
125 The checker and repairer required minimal changes to work with the MDMF
126 modifications made elsewhere. The checker duplicated a lot of the code
127 that was already in the downloader, so I modified the downloader
128 slightly to expose this functionality to the checker and removed the
129 duplicated code. The repairer only required a minor change to deal with
130 data representation.
131] {
132hunk ./src/allmydata/mutable/checker.py 12
133 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
134 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
135 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
136+from allmydata.mutable.retrieve import Retrieve # for verifying
137 
138 class MutableChecker:
139 
140hunk ./src/allmydata/mutable/checker.py 29
141 
142     def check(self, verify=False, add_lease=False):
143         servermap = ServerMap()
144+        # Updating the servermap in MODE_CHECK will stand a good chance
145+        # of finding all of the shares, and getting a good idea of
146+        # recoverability, etc, without verifying.
147         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
148                              servermap, MODE_CHECK, add_lease=add_lease)
149         if self._history:
150hunk ./src/allmydata/mutable/checker.py 55
151         if num_recoverable:
152             self.best_version = servermap.best_recoverable_version()
153 
154+        # The file is unhealthy and needs to be repaired if:
155+        # - There are unrecoverable versions.
156         if servermap.unrecoverable_versions():
157             self.need_repair = True
158hunk ./src/allmydata/mutable/checker.py 59
159+        # - There isn't a recoverable version.
160         if num_recoverable != 1:
161             self.need_repair = True
162hunk ./src/allmydata/mutable/checker.py 62
163+        # - The best recoverable version is missing some shares.
164         if self.best_version:
165             available_shares = servermap.shares_available()
166             (num_distinct_shares, k, N) = available_shares[self.best_version]
167hunk ./src/allmydata/mutable/checker.py 73
168 
169     def _verify_all_shares(self, servermap):
170         # read every byte of each share
171+        #
172+        # This logic is going to be very nearly the same as the
173+        # downloader. I bet we could pass the downloader a flag that
174+        # makes it do this, and piggyback onto that instead of
175+        # duplicating a bunch of code.
176+        #
177+        # Like:
178+        #  r = Retrieve(blah, blah, blah, verify=True)
179+        #  d = r.download()
180+        #  (wait, wait, wait, d.callback)
181+        # 
182+        #  Then, when it has finished, we can check the servermap (which
183+        #  we provided to Retrieve) to figure out which shares are bad,
184+        #  since the Retrieve process will have updated the servermap as
185+        #  it went along.
186+        #
187+        #  By passing the verify=True flag to the constructor, we are
188+        #  telling the downloader a few things.
189+        #
190+        #  1. It needs to download all N shares, not just K shares.
191+        #  2. It doesn't need to decrypt or decode the shares, only
192+        #     verify them.
193         if not self.best_version:
194             return
195hunk ./src/allmydata/mutable/checker.py 97
196-        versionmap = servermap.make_versionmap()
197-        shares = versionmap[self.best_version]
198-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
199-         offsets_tuple) = self.best_version
200-        offsets = dict(offsets_tuple)
201-        readv = [ (0, offsets["EOF"]) ]
202-        dl = []
203-        for (shnum, peerid, timestamp) in shares:
204-            ss = servermap.connections[peerid]
205-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
206-            d.addCallback(self._got_answer, peerid, servermap)
207-            dl.append(d)
208-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
209 
210hunk ./src/allmydata/mutable/checker.py 98
211-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
212-        # isolate the callRemote to a separate method, so tests can subclass
213-        # Publish and override it
214-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
215+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
216+        d = r.download()
217+        d.addCallback(self._process_bad_shares)
218         return d
219 
220hunk ./src/allmydata/mutable/checker.py 103
221-    def _got_answer(self, datavs, peerid, servermap):
222-        for shnum,datav in datavs.items():
223-            data = datav[0]
224-            try:
225-                self._got_results_one_share(shnum, peerid, data)
226-            except CorruptShareError:
227-                f = failure.Failure()
228-                self.need_repair = True
229-                self.bad_shares.append( (peerid, shnum, f) )
230-                prefix = data[:SIGNED_PREFIX_LENGTH]
231-                servermap.mark_bad_share(peerid, shnum, prefix)
232-                ss = servermap.connections[peerid]
233-                self.notify_server_corruption(ss, shnum, str(f.value))
234-
235-    def check_prefix(self, peerid, shnum, data):
236-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
237-         offsets_tuple) = self.best_version
238-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
239-        if got_prefix != prefix:
240-            raise CorruptShareError(peerid, shnum,
241-                                    "prefix mismatch: share changed while we were reading it")
242-
243-    def _got_results_one_share(self, shnum, peerid, data):
244-        self.check_prefix(peerid, shnum, data)
245-
246-        # the [seqnum:signature] pieces are validated by _compare_prefix,
247-        # which checks their signature against the pubkey known to be
248-        # associated with this file.
249 
250hunk ./src/allmydata/mutable/checker.py 104
251-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
252-         share_hash_chain, block_hash_tree, share_data,
253-         enc_privkey) = unpack_share(data)
254-
255-        # validate [share_hash_chain,block_hash_tree,share_data]
256-
257-        leaves = [hashutil.block_hash(share_data)]
258-        t = hashtree.HashTree(leaves)
259-        if list(t) != block_hash_tree:
260-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
261-        share_hash_leaf = t[0]
262-        t2 = hashtree.IncompleteHashTree(N)
263-        # root_hash was checked by the signature
264-        t2.set_hashes({0: root_hash})
265-        try:
266-            t2.set_hashes(hashes=share_hash_chain,
267-                          leaves={shnum: share_hash_leaf})
268-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
269-                IndexError), e:
270-            msg = "corrupt hashes: %s" % (e,)
271-            raise CorruptShareError(peerid, shnum, msg)
272-
273-        # validate enc_privkey: only possible if we have a write-cap
274-        if not self._node.is_readonly():
275-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
276-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
277-            if alleged_writekey != self._node.get_writekey():
278-                raise CorruptShareError(peerid, shnum, "invalid privkey")
279+    def _process_bad_shares(self, bad_shares):
280+        if bad_shares:
281+            self.need_repair = True
282+        self.bad_shares = bad_shares
283 
284hunk ./src/allmydata/mutable/checker.py 109
285-    def notify_server_corruption(self, ss, shnum, reason):
286-        ss.callRemoteOnly("advise_corrupt_share",
287-                          "mutable", self._storage_index, shnum, reason)
288 
289     def _count_shares(self, smap, version):
290         available_shares = smap.shares_available()
291hunk ./src/allmydata/mutable/repairer.py 5
292 from zope.interface import implements
293 from twisted.internet import defer
294 from allmydata.interfaces import IRepairResults, ICheckResults
295+from allmydata.mutable.publish import MutableData
296 
297 class RepairResults:
298     implements(IRepairResults)
299hunk ./src/allmydata/mutable/repairer.py 108
300             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
301 
302         d = self.node.download_version(smap, best_version, fetch_privkey=True)
303+        d.addCallback(lambda data:
304+            MutableData(data))
305         d.addCallback(self.node.upload, smap)
306         d.addCallback(self.get_results, smap)
307         return d
308}
309[interfaces.py: Add #993 interfaces
310Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
311 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
312] {
313hunk ./src/allmydata/interfaces.py 495
314 class MustNotBeUnknownRWError(CapConstraintError):
315     """Cannot add an unknown child cap specified in a rw_uri field."""
316 
317+
318+class IReadable(Interface):
319+    """I represent a readable object -- either an immutable file, or a
320+    specific version of a mutable file.
321+    """
322+
323+    def is_readonly():
324+        """Return True if this reference provides mutable access to the given
325+        file or directory (i.e. if you can modify it), or False if not. Note
326+        that even if this reference is read-only, someone else may hold a
327+        read-write reference to it.
328+
329+        For an IReadable returned by get_best_readable_version(), this will
330+        always return True, but for instances of subinterfaces such as
331+        IMutableFileVersion, it may return False."""
332+
333+    def is_mutable():
334+        """Return True if this file or directory is mutable (by *somebody*,
335+    not necessarily you), False if it is immutable. Note that a file
336+        might be mutable overall, but your reference to it might be
337+        read-only. On the other hand, all references to an immutable file
338+        will be read-only; there are no read-write references to an immutable
339+        file."""
340+
341+    def get_storage_index():
342+        """Return the storage index of the file."""
343+
344+    def get_size():
345+        """Return the length (in bytes) of this readable object."""
346+
347+    def download_to_data():
348+        """Download all of the file contents. I return a Deferred that fires
349+        with the contents as a byte string."""
350+
351+    def read(consumer, offset=0, size=None):
352+        """Download a portion (possibly all) of the file's contents, making
353+        them available to the given IConsumer. Return a Deferred that fires
354+        (with the consumer) when the consumer is unregistered (either because
355+        the last byte has been given to it, or because the consumer threw an
356+        exception during write(), possibly because it no longer wants to
357+        receive data). The portion downloaded will start at 'offset' and
358+        contain 'size' bytes (or the remainder of the file if size==None).
359+
360+        The consumer will be used in non-streaming mode: an IPullProducer
361+        will be attached to it.
362+
363+        The consumer will not receive data right away: several network trips
364+        must occur first. The order of events will be::
365+
366+         consumer.registerProducer(p, streaming)
367+          (if streaming == False)::
368+           consumer does p.resumeProducing()
369+            consumer.write(data)
370+           consumer does p.resumeProducing()
371+            consumer.write(data).. (repeat until all data is written)
372+         consumer.unregisterProducer()
373+         deferred.callback(consumer)
374+
375+        If a download error occurs, or an exception is raised by
376+        consumer.registerProducer() or consumer.write(), I will call
377+        consumer.unregisterProducer() and then deliver the exception via
378+        deferred.errback(). To cancel the download, the consumer should call
379+        p.stopProducing(), which will result in an exception being delivered
380+        via deferred.errback().
381+
382+        See src/allmydata/util/consumer.py for an example of a simple
383+        download-to-memory consumer.
384+        """
385+
386+
387+class IWritable(Interface):
388+    """
389+    I define methods that callers can use to update SDMF and MDMF
390+    mutable files on a Tahoe-LAFS grid.
391+    """
392+    # XXX: For the moment, we have only this. It is possible that we
393+    #      want to move overwrite() and modify() in here too.
394+    def update(data, offset):
395+        """
396+        I write the data from my data argument to the mutable file,
397+        starting at offset. I continue writing data until my data
398+        argument is exhausted, appending data to the file as necessary.
399+        """
400+        # assert IMutableUploadable.providedBy(data)
401+        # to append data: offset=node.get_size_of_best_version()
402+        # do we want to support compacting MDMF?
403+        # for an MDMF file, this can be done with O(data.get_size())
404+        # memory. For an SDMF file, any modification takes
405+        # O(node.get_size_of_best_version()).
406+
407+
408+class IMutableFileVersion(IReadable):
409+    """I provide access to a particular version of a mutable file. The
410+    access is read/write if I was obtained from a filenode derived from
411+    a write cap, or read-only if the filenode was derived from a read cap.
412+    """
413+
414+    def get_sequence_number():
415+        """Return the sequence number of this version."""
416+
417+    def get_servermap():
418+        """Return the IMutableFileServerMap instance that was used to create
419+        this object.
420+        """
421+
422+    def get_writekey():
423+        """Return this filenode's writekey, or None if the node does not have
424+        write-capability. This may be used to assist with data structures
425+        that need to make certain data available only to writers, such as the
426+        read-write child caps in dirnodes. The recommended process is to have
427+        reader-visible data be submitted to the filenode in the clear (where
428+        it will be encrypted by the filenode using the readkey), but encrypt
429+        writer-visible data using this writekey.
430+        """
431+
432+    # TODO: Can this be overwrite instead of replace?
433+    def replace(new_contents):
434+        """Replace the contents of the mutable file, provided that no other
435+        node has published (or is attempting to publish, concurrently) a
436+        newer version of the file than this one.
437+
438+        I will avoid modifying any share that is different than the version
439+        given by get_sequence_number(). However, if another node is writing
440+        to the file at the same time as me, I may manage to update some shares
441+        while they update others. If I see any evidence of this, I will signal
442+        UncoordinatedWriteError, and the file will be left in an inconsistent
443+        state (possibly the version you provided, possibly the old version,
444+        possibly somebody else's version, and possibly a mix of shares from
445+        all of these).
446+
447+        The recommended response to UncoordinatedWriteError is to either
448+        return it to the caller (since they failed to coordinate their
449+        writes), or to attempt some sort of recovery. It may be sufficient to
450+        wait a random interval (with exponential backoff) and repeat your
451+        operation. If I do not signal UncoordinatedWriteError, then I was
452+        able to write the new version without incident.
453+
454+        I return a Deferred that fires (with a PublishStatus object) when the
455+        update has completed.
456+        """
457+
458+    def modify(modifier_cb):
459+        """Modify the contents of the file, by downloading this version,
460+        applying the modifier function (or bound method), then uploading
461+        the new version. This will succeed as long as no other node
462+        publishes a version between the download and the upload.
463+        I return a Deferred that fires (with a PublishStatus object) when
464+        the update is complete.
465+
466+        The modifier callable will be given three arguments: a string (with
467+        the old contents), a 'first_time' boolean, and a servermap. As with
468+        download_to_data(), the old contents will be from this version,
469+        but the modifier can use the servermap to make other decisions
470+        (such as refusing to apply the delta if there are multiple parallel
471+        versions, or if there is evidence of a newer unrecoverable version).
472+        'first_time' will be True the first time the modifier is called,
473+        and False on any subsequent calls.
474+
475+        The callable should return a string with the new contents. The
476+        callable must be prepared to be called multiple times, and must
477+        examine the input string to see if the change that it wants to make
478+        is already present in the old version. If it does not need to make
479+        any changes, it can either return None, or return its input string.
480+
481+        If the modifier raises an exception, it will be returned in the
482+        errback.
483+        """
484+
485+
486 # The hierarchy looks like this:
487 #  IFilesystemNode
488 #   IFileNode
489hunk ./src/allmydata/interfaces.py 754
490     def raise_error():
491         """Raise any error associated with this node."""
492 
493+    # XXX: These may not be appropriate outside the context of an IReadable.
494     def get_size():
495         """Return the length (in bytes) of the data this node represents. For
496         directory nodes, I return the size of the backing store. I return
497hunk ./src/allmydata/interfaces.py 771
498 class IFileNode(IFilesystemNode):
499     """I am a node which represents a file: a sequence of bytes. I am not a
500     container, like IDirectoryNode."""
501+    def get_best_readable_version():
502+        """Return a Deferred that fires with an IReadable for the 'best'
503+        available version of the file. The IReadable provides only read
504+        access, even if this filenode was derived from a write cap.
505 
506hunk ./src/allmydata/interfaces.py 776
507-class IImmutableFileNode(IFileNode):
508-    def read(consumer, offset=0, size=None):
509-        """Download a portion (possibly all) of the file's contents, making
510-        them available to the given IConsumer. Return a Deferred that fires
511-        (with the consumer) when the consumer is unregistered (either because
512-        the last byte has been given to it, or because the consumer threw an
513-        exception during write(), possibly because it no longer wants to
514-        receive data). The portion downloaded will start at 'offset' and
515-        contain 'size' bytes (or the remainder of the file if size==None).
516-
517-        The consumer will be used in non-streaming mode: an IPullProducer
518-        will be attached to it.
519+        For an immutable file, there is only one version. For a mutable
520+        file, the 'best' version is the recoverable version with the
521+        highest sequence number. If no uncoordinated writes have occurred,
522+        and if enough shares are available, then this will be the most
523+        recent version that has been uploaded. If no version is recoverable,
524+        the Deferred will errback with an UnrecoverableFileError.
525+        """
526 
527hunk ./src/allmydata/interfaces.py 784
528-        The consumer will not receive data right away: several network trips
529-        must occur first. The order of events will be::
530+    def download_best_version():
531+        """Download the contents of the version that would be returned
532+        by get_best_readable_version(). This is equivalent to calling
533+        download_to_data() on the IReadable given by that method.
534 
535hunk ./src/allmydata/interfaces.py 789
536-         consumer.registerProducer(p, streaming)
537-          (if streaming == False)::
538-           consumer does p.resumeProducing()
539-            consumer.write(data)
540-           consumer does p.resumeProducing()
541-            consumer.write(data).. (repeat until all data is written)
542-         consumer.unregisterProducer()
543-         deferred.callback(consumer)
544+        I return a Deferred that fires with a byte string when the file
545+        has been fully downloaded. To support streaming download, use
546+        the 'read' method of IReadable. If no version is recoverable,
547+        the Deferred will errback with an UnrecoverableFileError.
548+        """
549 
550hunk ./src/allmydata/interfaces.py 795
551-        If a download error occurs, or an exception is raised by
552-        consumer.registerProducer() or consumer.write(), I will call
553-        consumer.unregisterProducer() and then deliver the exception via
554-        deferred.errback(). To cancel the download, the consumer should call
555-        p.stopProducing(), which will result in an exception being delivered
556-        via deferred.errback().
557+    def get_size_of_best_version():
558+        """Find the size of the version that would be returned by
559+        get_best_readable_version().
560 
561hunk ./src/allmydata/interfaces.py 799
562-        See src/allmydata/util/consumer.py for an example of a simple
563-        download-to-memory consumer.
564+        I return a Deferred that fires with an integer. If no version
565+        is recoverable, the Deferred will errback with an
566+        UnrecoverableFileError.
567         """
568 
569hunk ./src/allmydata/interfaces.py 804
570+
571+class IImmutableFileNode(IFileNode, IReadable):
572+    """I am a node representing an immutable file. Immutable files have
573+    only one version"""
574+
575+
576 class IMutableFileNode(IFileNode):
577     """I provide access to a 'mutable file', which retains its identity
578     regardless of what contents are put in it.
579hunk ./src/allmydata/interfaces.py 869
580     only be retrieved and updated all-at-once, as a single big string. Future
581     versions of our mutable files will remove this restriction.
582     """
583-
584-    def download_best_version():
585-        """Download the 'best' available version of the file, meaning one of
586-        the recoverable versions with the highest sequence number. If no
587+    def get_best_mutable_version():
588+        """Return a Deferred that fires with an IMutableFileVersion for
589+        the 'best' available version of the file. The best version is
590+        the recoverable version with the highest sequence number. If no
591         uncoordinated writes have occurred, and if enough shares are
592hunk ./src/allmydata/interfaces.py 874
593-        available, then this will be the most recent version that has been
594-        uploaded.
595+        available, then this will be the most recent version that has
596+        been uploaded.
597 
598hunk ./src/allmydata/interfaces.py 877
599-        I update an internal servermap with MODE_READ, determine which
600-        version of the file is indicated by
601-        servermap.best_recoverable_version(), and return a Deferred that
602-        fires with its contents. If no version is recoverable, the Deferred
603-        will errback with UnrecoverableFileError.
604-        """
605-
606-    def get_size_of_best_version():
607-        """Find the size of the version that would be downloaded with
608-        download_best_version(), without actually downloading the whole file.
609-
610-        I return a Deferred that fires with an integer.
611+        If no version is recoverable, the Deferred will errback with an
612+        UnrecoverableFileError.
613         """
614 
615     def overwrite(new_contents):
616hunk ./src/allmydata/interfaces.py 917
617         errback.
618         """
619 
620-
621     def get_servermap(mode):
622         """Return a Deferred that fires with an IMutableFileServerMap
623         instance, updated using the given mode.
624hunk ./src/allmydata/interfaces.py 970
625         writer-visible data using this writekey.
626         """
627 
628+    def set_version(version):
629+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
630+        we upload in SDMF for reasons of compatibility. If you want to
631+        change this, set_version will let you do that.
632+
633+        To say that this file should be uploaded in SDMF, pass in a 0. To
634+        say that the file should be uploaded as MDMF, pass in a 1.
635+        """
636+
637+    def get_version():
638+        """Returns the mutable file protocol version."""
639+
640 class NotEnoughSharesError(Exception):
641     """Download was unable to get enough shares"""
642 
643hunk ./src/allmydata/interfaces.py 1786
644         """The upload is finished, and whatever filehandle was in use may be
645         closed."""
646 
647+
648+class IMutableUploadable(Interface):
649+    """
650+    I represent content that is due to be uploaded to a mutable filecap.
651+    """
652+    # This is somewhat simpler than the IUploadable interface above
653+    # because mutable files do not need to be concerned with possibly
654+    # generating a CHK, nor with per-file keys. It is a subset of the
655+    # methods in IUploadable, though, so we could just as well implement
656+    # the mutable uploadables as IUploadables that don't happen to use
657+    # those methods (with the understanding that the unused methods will
658+    # never be called on such objects)
659+    def get_size():
660+        """
661+        Returns a Deferred that fires with the size of the content held
662+        by the uploadable.
663+        """
664+
665+    def read(length):
666+        """
667+        Returns a list of strings which, when concatenated, are the next
668+        length bytes of the file, or fewer if there are fewer bytes
669+        between the current location and the end of the file.
670+        """
671+
672+    def close():
673+        """
674+        The process that used the Uploadable is finished using it, so
675+        the uploadable may be closed.
676+        """
677+
678 class IUploadResults(Interface):
679     """I am returned by upload() methods. I contain a number of public
680     attributes which can be read to determine the results of the upload. Some
681}
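A minimal sketch of an IMutableUploadable implementation wrapping a string,
written against the interface just added (illustrative only; this is not
the MutableData class from the publish.py patch):

    from zope.interface import implements
    from twisted.internet import defer
    from allmydata.interfaces import IMutableUploadable

    class StringUploadable:
        implements(IMutableUploadable)
        def __init__(self, s):
            self._data = s
            self._pos = 0
        def get_size(self):
            # per the interface, fires a Deferred with the content size
            return defer.succeed(len(self._data))
        def read(self, length):
            # returns a list of strings totalling at most `length` bytes
            chunk = self._data[self._pos:self._pos+length]
            self._pos += len(chunk)
            return [chunk]
        def close(self):
            pass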
682[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
683Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
684 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
685] {
686hunk ./src/allmydata/frontends/sftpd.py 33
687 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
688      NoSuchChildError, ChildOfWrongTypeError
689 from allmydata.mutable.common import NotWriteableError
690+from allmydata.mutable.publish import MutableFileHandle
691 from allmydata.immutable.upload import FileHandle
692 from allmydata.dirnode import update_metadata
693 from allmydata.util.fileutil import EncryptedTemporaryFile
694hunk ./src/allmydata/frontends/sftpd.py 664
695         else:
696             assert IFileNode.providedBy(filenode), filenode
697 
698-            if filenode.is_mutable():
699-                self.async.addCallback(lambda ign: filenode.download_best_version())
700-                def _downloaded(data):
701-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
702-                    self.consumer.write(data)
703-                    self.consumer.finish()
704-                    return None
705-                self.async.addCallback(_downloaded)
706-            else:
707-                download_size = filenode.get_size()
708-                assert download_size is not None, "download_size is None"
709+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
710+
711+            def _read(version):
712+                if noisy: self.log("_read", level=NOISY)
713+                download_size = version.get_size()
714+                assert download_size is not None
715+
716                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
717hunk ./src/allmydata/frontends/sftpd.py 672
718-                def _read(ign):
719-                    if noisy: self.log("_read immutable", level=NOISY)
720-                    filenode.read(self.consumer, 0, None)
721-                self.async.addCallback(_read)
722+
723+                version.read(self.consumer, 0, None)
724+            self.async.addCallback(_read)
725 
726         eventually(self.async.callback, None)
727 
728hunk ./src/allmydata/frontends/sftpd.py 818
729                     assert parent and childname, (parent, childname, self.metadata)
730                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
731 
732-                d2.addCallback(lambda ign: self.consumer.get_current_size())
733-                d2.addCallback(lambda size: self.consumer.read(0, size))
734-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
735+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
736             else:
737                 def _add_file(ign):
738                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
739}
740[nodemaker.py: Make nodemaker expose a way to create MDMF files
741Kevan Carstensen <kevan@isnotajoke.com>**20100809233623
742 Ignore-this: a8a7c4283bb94be9fabb6fe3f2ca54b6
743] {
744hunk ./src/allmydata/nodemaker.py 3
745 import weakref
746 from zope.interface import implements
747-from allmydata.interfaces import INodeMaker
748+from allmydata.util.assertutil import precondition
749+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
750+                                 SDMF_VERSION, MDMF_VERSION
751 from allmydata.immutable.literal import LiteralFileNode
752 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
753 from allmydata.immutable.upload import Data
754hunk ./src/allmydata/nodemaker.py 10
755 from allmydata.mutable.filenode import MutableFileNode
756+from allmydata.mutable.publish import MutableData
757 from allmydata.dirnode import DirectoryNode, pack_children
758 from allmydata.unknown import UnknownNode
759 from allmydata import uri
760hunk ./src/allmydata/nodemaker.py 93
761             return self._create_dirnode(filenode)
762         return None
763 
764-    def create_mutable_file(self, contents=None, keysize=None):
765+    def create_mutable_file(self, contents=None, keysize=None,
766+                            version=SDMF_VERSION):
767         n = MutableFileNode(self.storage_broker, self.secret_holder,
768                             self.default_encoding_parameters, self.history)
769hunk ./src/allmydata/nodemaker.py 97
770+        n.set_version(version)
771         d = self.key_generator.generate(keysize)
772         d.addCallback(n.create_with_keys, contents)
773         d.addCallback(lambda res: n)
774hunk ./src/allmydata/nodemaker.py 103
775         return d
776 
777-    def create_new_mutable_directory(self, initial_children={}):
778+    def create_new_mutable_directory(self, initial_children={},
779+                                     version=SDMF_VERSION):
780+        # initial_children must have metadata (i.e. {} instead of None)
781+        for (name, (node, metadata)) in initial_children.iteritems():
782+            precondition(isinstance(metadata, dict),
783+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
784+            node.raise_error()
785         d = self.create_mutable_file(lambda n:
786hunk ./src/allmydata/nodemaker.py 111
787-                                     pack_children(initial_children, n.get_writekey()))
788+                                     MutableData(pack_children(initial_children,
789+                                                    n.get_writekey())),
790+                                     version)
791         d.addCallback(self._create_dirnode)
792         return d
793 
794}
795[mutable/layout.py and interfaces.py: add MDMF writer and reader
796Kevan Carstensen <kevan@isnotajoke.com>**20100809234004
797 Ignore-this: 90db36ee3318dbbd4397baebc6014f86
798 
799 The MDMF writer is responsible for keeping state as plaintext is
800 gradually processed into share data by the upload process. When the
801 upload finishes, it will write all of its share data to a remote server,
802 reporting its status back to the publisher.
803 
804 The MDMF reader is responsible for abstracting, from the downloader,
805 an MDMF file as it sits on the grid; specifically, it receives and
806 responds to requests for arbitrary data within the MDMF file.
807 
808 The interfaces.py file has also been modified to contain an interface
809 for the writer.
810] {
811hunk ./src/allmydata/interfaces.py 7
812      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
813 
814 HASH_SIZE=32
815+SALT_SIZE=16
816+
817+SDMF_VERSION=0
818+MDMF_VERSION=1
819 
820 Hash = StringConstraint(maxLength=HASH_SIZE,
821                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
822hunk ./src/allmydata/interfaces.py 420
823         """
824 
825 
826+class IMutableSlotWriter(Interface):
827+    """
828+    The interface for a writer around a mutable slot on a remote server.
829+    """
830+    def set_checkstring(checkstring, *args):
831+        """
832+        Set the checkstring that I will pass to the remote server when
833+        writing.
834+
835+            @param checkstring A packed checkstring to use.
836+
837+        Note that implementations can differ in which semantics they
838+        wish to support for set_checkstring -- they can, for example,
839+        build the checkstring themselves from its constituents, or
840+        some other thing.
841+        """
842+
843+    def get_checkstring():
844+        """
845+        Get the checkstring that I think currently exists on the remote
846+        server.
847+        """
848+
849+    def put_block(data, segnum, salt):
850+        """
851+        Add a block and salt to the share.
852+        """
853+
854+    def put_encprivkey(encprivkey):
855+        """
856+        Add the encrypted private key to the share.
857+        """
858+
859+    def put_blockhashes(blockhashes=list):
860+        """
861+        Add the block hash tree to the share.
862+        """
863+
864+    def put_sharehashes(sharehashes=dict):
865+        """
866+        Add the share hash chain to the share.
867+        """
868+
869+    def get_signable():
870+        """
871+        Return the part of the share that needs to be signed.
872+        """
873+
874+    def put_signature(signature):
875+        """
876+        Add the signature to the share.
877+        """
878+
879+    def put_verification_key(verification_key):
880+        """
881+        Add the verification key to the share.
882+        """
883+
884+    def finish_publishing():
885+        """
886+        Do anything necessary to finish writing the share to a remote
887+        server. I require that no further publishing needs to take place
888+        after this method has been called.
889+        """
890+
891+
892 class IURI(Interface):
893     def init_from_string(uri):
894         """Accept a string (as created by my to_string() method) and populate
895hunk ./src/allmydata/mutable/layout.py 4
896 
897 import struct
898 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
899+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
900+                                 MDMF_VERSION, IMutableSlotWriter
901+from allmydata.util import mathutil, observer
902+from twisted.python import failure
903+from twisted.internet import defer
904+from zope.interface import implements
905+
906+
907+# These strings describe the format of the packed structs they help process
908+# Here's what they mean:
909+#
910+#  PREFIX:
911+#    >: Big-endian byte order; the most significant byte is first (leftmost).
912+#    B: The version information; an 8 bit version identifier. Stored as
913+#       an unsigned char. This is currently 0 (SDMF); our modifications
914+#       will turn it into 1 (MDMF).
915+#    Q: The sequence number; this is sort of like a revision history for
916+#       mutable files; they start at 1 and increase as they are changed after
917+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
918+#       length.
919+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
920+#       characters = 32 bytes to store the value.
921+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
922+#       16 characters.
923+#
924+#  SIGNED_PREFIX additions, things that are covered by the signature:
925+#    B: The "k" encoding parameter. We store this as an 8-bit character,
926+#       which is convenient because our erasure coding scheme cannot
927+#       encode if you ask for more than 255 pieces.
928+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
929+#       same reasons as above.
930+#    Q: The segment size of the uploaded file. This will essentially be the
931+#       length of the file in SDMF. An unsigned long long, so we can store
932+#       files of quite large size.
933+#    Q: The data length of the uploaded file. Modulo padding, this will
934+#       be the same as the segment size field. Like the segment size
935+#       field, it is an unsigned long long and can be quite large.
936+#
937+#   HEADER additions:
938+#     L: The offset of the signature of this. An unsigned long.
939+#     L: The offset of the share hash chain. An unsigned long.
940+#     L: The offset of the block hash tree. An unsigned long.
941+#     L: The offset of the share data. An unsigned long.
942+#     Q: The offset of the encrypted private key. An unsigned long long, to
943+#        account for the possibility of a lot of share data.
944+#     Q: The offset of the EOF. An unsigned long long, to account for the
945+#        possibility of a lot of share data.
946+#
947+#  After all of these, we have the following:
948+#    - The verification key: Occupies the space between the end of the header
949+#      and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']]).
950+#    - The signature, which goes from the signature offset to the share hash
951+#      chain offset.
952+#    - The share hash chain, which goes from the share hash chain offset to
953+#      the block hash tree offset.
954+#    - The share data, which goes from the share data offset to the encrypted
955+#      private key offset.
956+#    - The encrypted private key offset, which goes until the end of the file.
957+#
958+#  The block hash tree in this encoding has only one leaf, so the offset of
959+#  the share data will be 32 bytes more than the offset of the block hash tree.
960+#  Given this, we may need to check to see how many bytes a reasonably sized
961+#  block hash tree will take up.
962 
963 PREFIX = ">BQ32s16s" # each version has a different prefix
964 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
965hunk ./src/allmydata/mutable/layout.py 73
966 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
967 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
968 HEADER_LENGTH = struct.calcsize(HEADER)
969+OFFSETS = ">LLLLQQ"
970+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
971 
972 def unpack_header(data):
973     o = {}
974hunk ./src/allmydata/mutable/layout.py 194
975     return (share_hash_chain, block_hash_tree, share_data)
976 
977 
978-def pack_checkstring(seqnum, root_hash, IV):
979+def pack_checkstring(seqnum, root_hash, IV, version=0):
980     return struct.pack(PREFIX,
981hunk ./src/allmydata/mutable/layout.py 196
982-                       0, # version,
983+                       version,
984                        seqnum,
985                        root_hash,
986                        IV)
987hunk ./src/allmydata/mutable/layout.py 269
988                            encprivkey])
989     return final_share
990 
991+def pack_prefix(seqnum, root_hash, IV,
992+                required_shares, total_shares,
993+                segment_size, data_length):
994+    prefix = struct.pack(SIGNED_PREFIX,
995+                         0, # version,
996+                         seqnum,
997+                         root_hash,
998+                         IV,
999+                         required_shares,
1000+                         total_shares,
1001+                         segment_size,
1002+                         data_length,
1003+                         )
1004+    return prefix
1005+
1006+
1007+class SDMFSlotWriteProxy:
1008+    implements(IMutableSlotWriter)
1009+    """
1010+    I represent a remote write slot for an SDMF mutable file. I build a
1011+    share in memory, and then write it in one piece to the remote
1012+    server. This mimics how SDMF shares were built before MDMF (and the
1013+    new MDMF uploader), but provides that functionality in a way that
1014+    allows the MDMF uploader to be built without much special-casing for
1015+    file format, which makes the uploader code more readable.
1016+    """
1017+    def __init__(self,
1018+                 shnum,
1019+                 rref, # a remote reference to a storage server
1020+                 storage_index,
1021+                 secrets, # (write_enabler, renew_secret, cancel_secret)
1022+                 seqnum, # the sequence number of the mutable file
1023+                 required_shares,
1024+                 total_shares,
1025+                 segment_size,
1026+                 data_length): # the length of the original file
1027+        self.shnum = shnum
1028+        self._rref = rref
1029+        self._storage_index = storage_index
1030+        self._secrets = secrets
1031+        self._seqnum = seqnum
1032+        self._required_shares = required_shares
1033+        self._total_shares = total_shares
1034+        self._segment_size = segment_size
1035+        self._data_length = data_length
1036+
1037+        # This is an SDMF file, so it should have only one segment, so,
1038+        # modulo padding of the data length, the segment size and the
1039+        # data length should be the same.
1040+        expected_segment_size = mathutil.next_multiple(data_length,
1041+                                                       self._required_shares)
1042+        assert expected_segment_size == segment_size
1043+
1044+        self._block_size = self._segment_size / self._required_shares
1045+
1046+        # This is meant to mimic how SDMF files were built before MDMF
1047+        # entered the picture: we generate each share in its entirety,
1048+        # then push it off to the storage server in one write. When
1049+        # callers call set_*, they are just populating this dict.
1050+        # finish_publishing will stitch these pieces together into a
1051+        # coherent share, and then write the coherent share to the
1052+        # storage server.
1053+        self._share_pieces = {}
1054+
1055+        # This tells the write logic what checkstring to use when
1056+        # writing remote shares.
1057+        self._testvs = []
1058+
1059+        self._readvs = [(0, struct.calcsize(PREFIX))]
1060+
1061+
1062+    def set_checkstring(self, checkstring_or_seqnum,
1063+                              root_hash=None,
1064+                              salt=None):
1065+        """
1066+        Set the checkstring that I will pass to the remote server when
1067+        writing.
1068+
1069+            @param checkstring_or_seqnum: A packed checkstring to use,
1070+                   or a sequence number. I will treat this as a checkstring unless root_hash and salt are also given, in which case I will treat it as a sequence number.
1071+
1072+        Note that implementations can differ in which semantics they
1073+        wish to support for set_checkstring -- they can, for example,
1074+        build the checkstring themselves from its constituents, or
1075+        some other thing.
1076+        """
1077+        if root_hash and salt:
1078+            checkstring = struct.pack(PREFIX,
1079+                                      0,
1080+                                      checkstring_or_seqnum,
1081+                                      root_hash,
1082+                                      salt)
1083+        else:
1084+            checkstring = checkstring_or_seqnum
1085+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
1086+
1087+
1088+    def get_checkstring(self):
1089+        """
1090+        Get the checkstring that I think currently exists on the remote
1091+        server.
1092+        """
1093+        if self._testvs:
1094+            return self._testvs[0][3]
1095+        return ""
1096+
1097+
1098+    def put_block(self, data, segnum, salt):
1099+        """
1100+        Add a block and salt to the share.
1101+        """
1102+        # SDMF files have only one segment
1103+        assert segnum == 0
1104+        assert len(data) == self._block_size
1105+        assert len(salt) == SALT_SIZE
1106+
1107+        self._share_pieces['sharedata'] = data
1108+        self._share_pieces['salt'] = salt
1109+
1110+        # TODO: Figure out something intelligent to return.
1111+        return defer.succeed(None)
1112+
1113+
1114+    def put_encprivkey(self, encprivkey):
1115+        """
1116+        Add the encrypted private key to the share.
1117+        """
1118+        self._share_pieces['encprivkey'] = encprivkey
1119+
1120+        return defer.succeed(None)
1121+
1122+
1123+    def put_blockhashes(self, blockhashes):
1124+        """
1125+        Add the block hash tree to the share.
1126+        """
1127+        assert isinstance(blockhashes, list)
1128+        for h in blockhashes:
1129+            assert len(h) == HASH_SIZE
1130+
1131+        # serialize the blockhashes, then set them.
1132+        blockhashes_s = "".join(blockhashes)
1133+        self._share_pieces['block_hash_tree'] = blockhashes_s
1134+
1135+        return defer.succeed(None)
1136+
1137+
1138+    def put_sharehashes(self, sharehashes):
1139+        """
1140+        Add the share hash chain to the share.
1141+        """
1142+        assert isinstance(sharehashes, dict)
1143+        for h in sharehashes.itervalues():
1144+            assert len(h) == HASH_SIZE
1145+
1146+        # serialize the sharehashes, then set them.
1147+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
1148+                                 for i in sorted(sharehashes.keys())])
1149+        self._share_pieces['share_hash_chain'] = sharehashes_s
1150+
1151+        return defer.succeed(None)
1152+
1153+
1154+    def put_root_hash(self, root_hash):
1155+        """
1156+        Add the root hash to the share.
1157+        """
1158+        assert len(root_hash) == HASH_SIZE
1159+
1160+        self._share_pieces['root_hash'] = root_hash
1161+
1162+        return defer.succeed(None)
1163+
1164+
1165+    def put_salt(self, salt):
1166+        """
1167+        Add a salt to an empty SDMF file.
1168+        """
1169+        assert len(salt) == SALT_SIZE
1170+
1171+        self._share_pieces['salt'] = salt
1172+        self._share_pieces['sharedata'] = ""
1173+
1174+
1175+    def get_signable(self):
1176+        """
1177+        Return the part of the share that needs to be signed.
1178+
1179+        SDMF writers need to sign the packed representation of the
1180+        first eight fields of the remote share, that is:
1181+            - version number (0)
1182+            - sequence number
1183+            - root of the share hash tree
1184+            - salt
1185+            - k
1186+            - n
1187+            - segsize
1188+            - datalen
1189+
1190+        This method is responsible for returning that to callers.
1191+        """
1192+        return struct.pack(SIGNED_PREFIX,
1193+                           0,
1194+                           self._seqnum,
1195+                           self._share_pieces['root_hash'],
1196+                           self._share_pieces['salt'],
1197+                           self._required_shares,
1198+                           self._total_shares,
1199+                           self._segment_size,
1200+                           self._data_length)
1201+
1202+
1203+    def put_signature(self, signature):
1204+        """
1205+        Add the signature to the share.
1206+        """
1207+        self._share_pieces['signature'] = signature
1208+
1209+        return defer.succeed(None)
1210+
1211+
1212+    def put_verification_key(self, verification_key):
1213+        """
1214+        Add the verification key to the share.
1215+        """
1216+        self._share_pieces['verification_key'] = verification_key
1217+
1218+        return defer.succeed(None)
1219+
1220+
1221+    def get_verinfo(self):
1222+        """
1223+        I return my verinfo tuple. This is used by the ServermapUpdater
1224+        to keep track of versions of mutable files.
1225+
1226+        The verinfo tuple for MDMF files contains:
1227+            - seqnum
1228+            - root hash
1229+            - a blank (nothing)
1230+            - segsize
1231+            - datalen
1232+            - k
1233+            - n
1234+            - prefix (the thing that you sign)
1235+            - a tuple of offsets
1236+
1237+        We include the nonce in MDMF to simplify processing of version
1238+        information tuples.
1239+
1240+        The verinfo tuple for SDMF files is the same, but contains a
1241+        16-byte IV instead of a hash of salts.
1242+        """
1243+        return (self._seqnum,
1244+                self._share_pieces['root_hash'],
1245+                self._share_pieces['salt'],
1246+                self._segment_size,
1247+                self._data_length,
1248+                self._required_shares,
1249+                self._total_shares,
1250+                self.get_signable(),
1251+                self._get_offsets_tuple())
1252+
1253+    def _get_offsets_dict(self):
1254+        post_offset = HEADER_LENGTH
1255+        offsets = {}
1256+
1257+        verification_key_length = len(self._share_pieces['verification_key'])
1258+        o1 = offsets['signature'] = post_offset + verification_key_length
1259+
1260+        signature_length = len(self._share_pieces['signature'])
1261+        o2 = offsets['share_hash_chain'] = o1 + signature_length
1262+
1263+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
1264+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
1265+
1266+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
1267+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
1268+
1269+        share_data_length = len(self._share_pieces['sharedata'])
1270+        o5 = offsets['enc_privkey'] = o4 + share_data_length
1271+
1272+        encprivkey_length = len(self._share_pieces['encprivkey'])
1273+        offsets['EOF'] = o5 + encprivkey_length
1274+        return offsets
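    # For reference, the layout implied by the computation above is
    # strictly sequential: header, verification key, signature, share
    # hash chain, block hash tree, share data, encrypted private key,
    # EOF -- each offset is the previous offset plus that field's length.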
1275+
1276+
1277+    def _get_offsets_tuple(self):
1278+        offsets = self._get_offsets_dict()
1279+        return tuple([(key, value) for key, value in offsets.items()])
1280+
1281+
1282+    def _pack_offsets(self):
1283+        offsets = self._get_offsets_dict()
1284+        return struct.pack(">LLLLQQ",
1285+                           offsets['signature'],
1286+                           offsets['share_hash_chain'],
1287+                           offsets['block_hash_tree'],
1288+                           offsets['share_data'],
1289+                           offsets['enc_privkey'],
1290+                           offsets['EOF'])
1291+
1292+
1293+    def finish_publishing(self):
1294+        """
1295+        Do anything necessary to finish writing the share to a remote
1296+        server. I require that no further publishing needs to take place
1297+        after this method has been called.
1298+        """
1299+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
1300+                  "share_hash_chain", "block_hash_tree"]:
1301+            assert k in self._share_pieces
1302+        # This is the only method that actually writes something to the
1303+        # remote server.
1304+        # First, we need to pack the share into data that we can write
1305+        # to the remote server in one write.
1306+        offsets = self._pack_offsets()
1307+        prefix = self.get_signable()
1308+        final_share = "".join([prefix,
1309+                               offsets,
1310+                               self._share_pieces['verification_key'],
1311+                               self._share_pieces['signature'],
1312+                               self._share_pieces['share_hash_chain'],
1313+                               self._share_pieces['block_hash_tree'],
1314+                               self._share_pieces['sharedata'],
1315+                               self._share_pieces['encprivkey']])
1316+
1317+        # Our only data vector is going to be writing the final share,
1318+        # in its entirety.
1319+        datavs = [(0, final_share)]
1320+
1321+        if not self._testvs:
1322+            # Our caller has not provided us with another checkstring
1323+            # yet, so we assume that we are writing a new share, and set
1324+            # a test vector that will allow a new share to be written.
1325+            self._testvs = []
1326+            self._testvs.append(tuple([0, 1, "eq", ""]))
1327+            new_share = True
1328+
1329+        tw_vectors = {}
1330+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
1331+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
1332+                                     self._storage_index,
1333+                                     self._secrets,
1334+                                     tw_vectors,
1335+                                     # TODO is it useful to read something?
1336+                                     self._readvs)
1337+
1338+
1339+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
1340+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
1341+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
1342+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
1343+MDMFCHECKSTRING = ">BQ32s"
1344+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
1345+MDMFOFFSETS = ">QQQQQQ"
1346+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
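# For reference, struct.calcsize(MDMFHEADER) == 107 bytes: 1 (version) +
# 8 (seqnum) + 32 (root hash) + 1 (k) + 1 (N) + 8 (segsize) + 8 (datalen)
# make up the 59 signable bytes, plus 6 * 8 == 48 bytes of offsets.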
1347+
1348+class MDMFSlotWriteProxy:
1349+    """
1350+    I represent a remote write slot for an MDMF mutable file.
1351+
1352+    I abstract away from my caller the details of block and salt
1353+    management, and the implementation of the on-disk format for MDMF
1354+    shares.
1355+    """
1356+    implements(IMutableSlotWriter)
1357+
1358+    # Expected layout, MDMF:
1359+    # offset:     size:       name:
1360+    #-- signed part --
1361+    # 0           1           version number (01)
1362+    # 1           8           sequence number
1363+    # 9           32          share tree root hash
1364+    # 41          1           The "k" encoding parameter
1365+    # 42          1           The "N" encoding parameter
1366+    # 43          8           The segment size of the uploaded file
1367+    # 51          8           The data length of the original plaintext
1368+    #-- end signed part --
1369+    # 59          8           The offset of the encrypted private key
1370+    # 67          8           The offset of the block hash tree
1371+    # 75          8           The offset of the share hash chain
1372+    # 83          8           The offset of the signature
1373+    # 91          8           The offset of the verification key
1374+    # 99          8           The offset of the EOF
1375+    #
1376+    # followed by salts and share data, the encrypted private key, the
1377+    # block hash tree, the share hash chain, a signature over the first
1378+    # seven fields, and a verification key.
1379+    #
1380+    # The checkstring is the first three fields -- the version number,
1381+    # sequence number, and root hash. This is consistent in meaning
1382+    # with what we have for SDMF files, except that instead of using
1383+    # the literal salt, we use a value derived from all of the salts --
1384+    # the root of the share hash tree.
1385+    #
1386+    # The salt is stored before the block for each segment. The block
1387+    # hash tree is computed over the combination of block and salt for
1388+    # each segment. In this way, we get integrity checking for both
1389+    # block and salt with the current block hash tree arrangement.
1390+    #
1391+    # The ordering of the offsets is different to reflect the dependencies
1392+    # that we'll run into with an MDMF file. The expected write flow is
1393+    # something like this:
1394+    #
1395+    #   0: Initialize with the sequence number, encoding parameters and
1396+    #      data length. From this, we can deduce the number of segments,
1397+    #      and where they should go.. We can also figure out where the
1398+    #      encrypted private key should go, because we can figure out how
1399+    #      big the share data will be.
1400+    #
1401+    #   1: Encrypt, encode, and upload the file in chunks. Do something
1402+    #      like
1403+    #
1404+    #       put_block(data, segnum, salt)
1405+    #
1406+    #      to write a block and a salt to the disk. We can do both of
1407+    #      these operations now because we have enough of the offsets to
1408+    #      know where to put them.
1409+    #
1410+    #   2: Put the encrypted private key. Use:
1411+    #
1412+    #        put_encprivkey(encprivkey)
1413+    #
1414+    #      Now that we know the length of the private key, we can fill
1415+    #      in the offset for the block hash tree.
1416+    #
1417+    #   3: We're now in a position to upload the block hash tree for
1418+    #      a share. Put that using something like:
1419+    #       
1420+    #        put_blockhashes(block_hash_tree)
1421+    #
1422+    #      Note that block_hash_tree is a list of hashes -- we'll take
1423+    #      care of the details of serializing that appropriately. When
1424+    #      we get the block hash tree, we are also in a position to
1425+    #      calculate the offset for the share hash chain, and fill that
1426+    #      into the offsets table.
1427+    #
1428+    #   4: We're now in a position to upload the share hash chain for
1429+    #      a share. Do that with something like:
1430+    #
1431+    #        put_sharehashes(share_hash_chain)
1432+    #
1433+    #      share_hash_chain should be a dictionary mapping shnums to
1434+    #      32-byte hashes -- the wrapper handles serialization.
1435+    #      We'll know where to put the signature at this point, also.
1436+    #      (There is no separate salt hash tree to upload: since each
1437+    #      block hash covers its block and salt together, the salts
1438+    #      are validated through the block hash tree.)
1439+    #      The root of this tree will be put explicitly in the next
1440+    #      step.
1441+    #
1442+    #      TODO: Why? Why not just include it in the tree here?
1443+    #
1444+    #   5: Before putting the signature, we must first put the
1445+    #      root_hash. Do this with:
1446+    #
1447+    #        put_root_hash(root_hash)
1448+    #
1449+    #      In terms of knowing where to put this value, it was always
1450+    #      possible to place it, but it makes sense semantically to
1451+    #      place it after the share hash tree, so that's why you do it
1452+    #      in this order.
1453+    #
1454+    #   6: With the root hash put, we can now sign the header. Use:
1455+    #
1456+    #        get_signable()
1457+    #
1458+    #      to get the part of the header that you want to sign, and use:
1459+    #
1460+    #        put_signature(signature)
1461+    #
1462+    #      to write your signature to the remote server.
1463+    #
1464+    #   7: Add the verification key, and finish. Do:
1465+    #
1466+    #        put_verification_key(key)
1467+    #
1468+    #      and
1469+    #
1470+    #        finish_publishing()
1479+    #
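    # A condensed sketch of that flow (hypothetical caller code; the
    # blocks, salts, hashes, keys, and sign() function are assumed to
    # be supplied by the publisher):
    #
    #   for segnum in range(num_segments):
    #       writer.put_block(blocks[segnum], segnum, salts[segnum])
    #   writer.put_encprivkey(encprivkey)
    #   writer.put_blockhashes(block_hash_tree)
    #   writer.put_sharehashes(share_hash_chain)
    #   writer.put_root_hash(root_hash)
    #   writer.put_signature(sign(writer.get_signable()))
    #   writer.put_verification_key(verification_key)
    #   d = writer.finish_publishing()
    #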
1480+    # Checkstring management:
1481+    #
1482+    # To write to a mutable slot, we have to provide test vectors to ensure
1483+    # that we are writing to the same data that we think we are. These
1484+    # vectors allow us to detect uncoordinated writes; that is, writes
1485+    # where both we and some other shareholder are writing to the
1486+    # mutable slot, and to report those back to the parts of the program
1487+    # doing the writing.
1488+    #
1489+    # With SDMF, this was easy -- all of the share data was written in
1490+    # one go, so it was easy to detect uncoordinated writes, and we only
1491+    # had to do it once. With MDMF, not all of the file is written at
1492+    # once.
1493+    #
1494+    # If a share is new, we write out as much of the header as we can
1495+    # before writing out anything else. This gives other writers a
1496+    # canary that they can use to detect uncoordinated writes, and, if
1497+    # they do the same thing, gives us the same canary. We then update
1498+    # the share. We won't be able to write out one field of the header
1499+    # -- the root of the share hash tree -- until we finish writing out
1500+    # the share. We only require the writer to provide the initial
1501+    # checkstring, and keep track of what it should be after updates
1502+    # ourselves.
1503+    #
1504+    # If we haven't written anything yet, then on the first write (which
1505+    # will probably be a block + salt of a share), we'll also write out
1506+    # the header. On subsequent passes, we'll expect to see the header.
1507+    # This changes in one place: when we write out the root of the
1508+    # share hash tree (via put_root_hash), since that value changes
1509+    # the header.
1515+    def __init__(self,
1516+                 shnum,
1517+                 rref, # a remote reference to a storage server
1518+                 storage_index,
1519+                 secrets, # (write_enabler, renew_secret, cancel_secret)
1520+                 seqnum, # the sequence number of the mutable file
1521+                 required_shares,
1522+                 total_shares,
1523+                 segment_size,
1524+                 data_length): # the length of the original file
1525+        self.shnum = shnum
1526+        self._rref = rref
1527+        self._storage_index = storage_index
1528+        self._seqnum = seqnum
1529+        self._required_shares = required_shares
1530+        assert self.shnum >= 0 and self.shnum < total_shares
1531+        self._total_shares = total_shares
1532+        # We build up the offset table as we write things. It is the
1533+        # last thing we write to the remote server.
1534+        self._offsets = {}
1535+        self._testvs = []
1536+        # This is a list of write vectors that will be sent to our
1537+        # remote server once we are directed to write things there.
1538+        self._writevs = []
1539+        self._secrets = secrets
1540+        # The segment size needs to be a multiple of the k parameter --
1541+        # any padding should have been carried out by the publisher
1542+        # already.
1543+        assert segment_size % required_shares == 0
1544+        self._segment_size = segment_size
1545+        self._data_length = data_length
1546+
1547+        # These are set later -- we define them here so that we can
1548+        # check for their existence easily
1549+
1550+        # This is the root of the share hash tree -- the Merkle tree
1551+        # over the roots of the block hash trees computed for shares in
1552+        # this upload.
1553+        self._root_hash = None
1554+
1555+        # We haven't yet written anything to the remote bucket. By
1556+        # setting this, we tell the _write method as much. The write
1557+        # method will then know that it also needs to add a write vector
1558+        # for the checkstring (or what we have of it) to the first write
1559+        # request. We'll then record that value for future use.  If
1560+        # we're expecting something to be there already, we need to call
1561+        # set_checkstring before we write anything to tell the first
1562+        # write about that.
1563+        self._written = False
1564+
1565+        # When writing data to the storage servers, we get a read vector
1566+        # for free. We'll read the checkstring, which will help us
1567+        # figure out what's gone wrong if a write fails.
1568+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
1569+
1570+        # We calculate the number of segments because it tells us where
1571+        # the salted share data ends (and the encrypted private key
1572+        # begins), and because it provides a useful amount of bounds checking.
1573+        self._num_segments = mathutil.div_ceil(self._data_length,
1574+                                               self._segment_size)
1575+        self._block_size = self._segment_size / self._required_shares
1576+        # We also calculate the share size, to help us with block
1577+        # constraints later.
1578+        tail_size = self._data_length % self._segment_size
1579+        if not tail_size:
1580+            self._tail_block_size = self._block_size
1581+        else:
1582+            self._tail_block_size = mathutil.next_multiple(tail_size,
1583+                                                           self._required_shares)
1584+            self._tail_block_size /= self._required_shares
1585+
1586+        # We already know where the sharedata starts: right after the end
1587+        # of the header (which is defined as the signable part + the offsets).
1588+        # We can also calculate where the encrypted private key begins
1589+        # from what we now know.
1590+        self._actual_block_size = self._block_size + SALT_SIZE
1591+        data_size = self._actual_block_size * (self._num_segments - 1)
1592+        data_size += self._tail_block_size
1593+        data_size += SALT_SIZE
1594+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
1595+        self._offsets['enc_privkey'] += data_size
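        # For example (illustrative numbers): with 3 segments, a full
        # tail, block_size == 1000, and SALT_SIZE == 16, data_size is
        # 2 * (1000 + 16) + 1000 + 16 == 3048, so the encrypted private
        # key starts at byte 107 + 3048 == 3155.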
1596+        # We'll wait for the rest. Callers can now call my "put_block" and
1597+        # "set_checkstring" methods.
1598+
1599+
1600+    def set_checkstring(self,
1601+                        seqnum_or_checkstring,
1602+                        root_hash=None,
1603+                        salt=None):
1604+        """
1605+        Set the checkstring for the given shnum.
1606+
1607+        This can be invoked in one of two ways.
1608+
1609+        With one argument, I assume that you are giving me a literal
1610+        checkstring -- e.g., the output of get_checkstring. I will then
1611+        set that checkstring as it is. This form is used by unit tests.
1612+
1613+        With two arguments, I assume that you are giving me a sequence
1614+        number and root hash to make a checkstring from. In that case, I
1615+        will build a checkstring and set it for you. This form is used
1616+        by the publisher.
1617+
1618+        By default, I assume that I am writing new shares to the grid.
1619+        If you don't explicitly set your own checkstring, I will use
1620+        one that requires that the remote share not exist. You will want
1621+        to use this method if you are updating a share in-place;
1622+        otherwise, writes will fail.
1623+        """
1624+        # You're allowed to overwrite checkstrings with this method;
1625+        # I assume that users know what they are doing when they call
1626+        # it.
1627+        if root_hash:
1628+            checkstring = struct.pack(MDMFCHECKSTRING,
1629+                                      1,
1630+                                      seqnum_or_checkstring,
1631+                                      root_hash)
1632+        else:
1633+            checkstring = seqnum_or_checkstring
1634+
1635+        if checkstring == "":
1636+            # We special-case this, since len("") = 0, but we need
1637+            # length of 1 for the case of an empty share to work on the
1638+            # storage server, which is what a checkstring that is the
1639+            # empty string means.
1640+            self._testvs = []
1641+        else:
1642+            self._testvs = []
1643+            self._testvs.append((0, len(checkstring), "eq", checkstring))
1644+
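    # Example (illustrative values): a publisher updating an existing
    # share would typically build the checkstring from its parts:
    #
    #   writer.set_checkstring(seqnum, root_hash)
    #
    # while a unit test that captured a literal checkstring earlier can
    # replay it verbatim:
    #
    #   writer.set_checkstring(old_checkstring)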
1645+
1646+    def __repr__(self):
1647+        return "MDMFSlotWriteProxy for share %d" % self.shnum
1648+
1649+
1650+    def get_checkstring(self):
1651+        """
1652+        Given a share number, I return a representation of what the
1653+        checkstring for that share on the server will look like.
1654+
1655+        I am mostly used for tests.
1656+        """
1657+        if self._root_hash:
1658+            roothash = self._root_hash
1659+        else:
1660+            roothash = "\x00" * 32
1661+        return struct.pack(MDMFCHECKSTRING,
1662+                           1,
1663+                           self._seqnum,
1664+                           roothash)
1665+
1666+
1667+    def put_block(self, data, segnum, salt):
1668+        """
1669+        I queue a write vector for the data, salt, and segment number
1670+        provided to me. I return None, as I do not actually cause
1671+        anything to be written yet.
1672+        """
1673+        if segnum >= self._num_segments:
1674+            raise LayoutInvalid("I won't overwrite the private key")
1675+        if len(salt) != SALT_SIZE:
1676+            raise LayoutInvalid("I was given a salt of size %d, but "
1677+                                "I wanted a salt of size %d")
1678+        if segnum + 1 == self._num_segments:
1679+            if len(data) != self._tail_block_size:
1680+                raise LayoutInvalid("I was given the wrong size block to write")
1681+        elif len(data) != self._block_size:
1682+            raise LayoutInvalid("I was given the wrong size block to write")
1683+
1684+        # We want to write at MDMFHEADERSIZE + segnum * (block size + salt size).
1685+
1686+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
1687+        data = salt + data
1688+
1689+        self._writevs.append(tuple([offset, data]))
1690+
1691+
1692+    def put_encprivkey(self, encprivkey):
1693+        """
1694+        I queue a write vector for the encrypted private key provided to
1695+        me.
1696+        """
1697+        assert self._offsets
1698+        assert self._offsets['enc_privkey']
1699+        # You shouldn't re-write the encprivkey after the block hash
1700+        # tree is written, since that could cause the private key to run
1701+        # into the block hash tree. Before it writes the block hash
1702+        # tree, the block hash tree writing method writes the offset of
1703+        # the share hash chain. So that's a good indicator of whether or
1704+        # not the block hash tree has been written.
1705+        if "share_hash_chain" in self._offsets:
1706+            raise LayoutInvalid("You must write this before the block hash tree")
1707+
1708+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
1709+            len(encprivkey)
1710+        self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
1711+
1712+
1713+    def put_blockhashes(self, blockhashes):
1714+        """
1715+        I queue a write vector to put the block hash tree in blockhashes
1716+        onto the remote server.
1717+
1718+        The encrypted private key must be queued before the block hash
1719+        tree, since we need to know how large it is to know where the
1720+        block hash tree should go. The block hash tree must be put
1721+        before the share hash chain, since its size determines the
1722+        offset of the share hash chain.
1723+        """
1724+        assert self._offsets
1725+        assert isinstance(blockhashes, list)
1726+        if "block_hash_tree" not in self._offsets:
1727+            raise LayoutInvalid("You must put the encrypted private key "
1728+                                "before you put the block hash tree")
1729+        # If written, the share hash chain causes the signature offset
1730+        # to be defined.
1731+        if "signature" in self._offsets:
1732+            raise LayoutInvalid("You must put the block hash tree before "
1733+                                "you put the share hash chain")
1734+        blockhashes_s = "".join(blockhashes)
1735+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
1736+
1737+        self._writevs.append(tuple([self._offsets['block_hash_tree'],
1738+                                  blockhashes_s]))
1739+
1740+
1741+    def put_sharehashes(self, sharehashes):
1742+        """
1743+        I queue a write vector to put the share hash chain in my
1744+        argument onto the remote server.
1745+
1746+        The block hash tree must be queued before the share hash chain,
1747+        since we need to know where the block hash tree ends before we
1748+        can know where the share hash chain starts. The share hash chain
1749+        must be put before the signature, since the length of the packed
1750+        share hash chain determines the offset of the signature. Also,
1751+        semantically, you must know the share hashes before you can
1752+        compute the root hash over them that gets signed.
1753+        """
1754+        assert isinstance(sharehashes, dict)
1755+        if "share_hash_chain" not in self._offsets:
1756+            raise LayoutInvalid("You need to put the salt hash tree before "
1757+                                "you can put the share hash chain")
1758+        # The signature comes after the share hash chain. If the
1759+        # signature has already been written, we must not write another
1760+        # share hash chain. The signature writes the verification key
1761+        # offset when it gets sent to the remote server, so we look for
1762+        # that.
1763+        if "verification_key" in self._offsets:
1764+            raise LayoutInvalid("You must write the share hash chain "
1765+                                "before you write the signature")
1766+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
1767+                                  for i in sorted(sharehashes.keys())])
1768+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
1769+        self._writevs.append(tuple([self._offsets['share_hash_chain'],
1770+                            sharehashes_s]))
1771+
1772+
1773+    def put_root_hash(self, roothash):
1774+        """
1775+        Put the root hash (the root of the share hash tree) in the
1776+        remote slot.
1777+        """
1778+        # It does not make sense to be able to put the root
1779+        # hash without first putting the share hashes, since you need
1780+        # the share hashes to generate the root hash.
1781+        #
1782+        # Signature is defined by the routine that places the share hash
1783+        # chain, so it's a good thing to look for in finding out whether
1784+        # or not the share hash chain exists on the remote server.
1785+        if "signature" not in self._offsets:
1786+            raise LayoutInvalid("You need to put the share hash chain "
1787+                                "before you can put the root share hash")
1788+        if len(roothash) != HASH_SIZE:
1789+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
1790+                                 % HASH_SIZE)
1791+        self._root_hash = roothash
1792+        # To write this value, we update the checkstring on the remote
1793+        # server, which includes it.
1794+        checkstring = self.get_checkstring()
1795+        self._writevs.append(tuple([0, checkstring]))
1796+        # This write, if successful, changes the checkstring, so we need
1797+        # to update our internal checkstring to be consistent with the
1798+        # one on the server.
1799+
1800+
1801+    def get_signable(self):
1802+        """
1803+        Get the first seven fields of the mutable file; the parts that
1804+        are signed.
1805+        """
1806+        if not self._root_hash:
1807+            raise LayoutInvalid("You need to set the root hash "
1808+                                "before getting something to "
1809+                                "sign")
1810+        return struct.pack(MDMFSIGNABLEHEADER,
1811+                           1,
1812+                           self._seqnum,
1813+                           self._root_hash,
1814+                           self._required_shares,
1815+                           self._total_shares,
1816+                           self._segment_size,
1817+                           self._data_length)
1818+
1819+
1820+    def put_signature(self, signature):
1821+        """
1822+        I queue a write vector for the signature of the MDMF share.
1823+
1824+        I require that the root hash and share hash chain have been put
1825+        to the grid before I will write the signature to the grid.
1826+        """
1827+        if "signature" not in self._offsets:
1828+            raise LayoutInvalid("You must put the share hash chain "
1829+        # It does not make sense to put a signature without first
1830+        # putting the root hash and the salt hash (since otherwise
1831+        # the signature would be incomplete), so we don't allow that.
1832+                       "before putting the signature")
1833+        if not self._root_hash:
1834+            raise LayoutInvalid("You must complete the signed prefix "
1835+                                "before computing a signature")
1836+        # If we put the signature after we put the verification key, we
1837+        # could end up running into the verification key, and will
1838+        # probably screw up the offsets as well. So we don't allow that.
1839+        # The method that writes the verification key defines the EOF
1840+        # offset before writing the verification key, so look for that.
1841+        if "EOF" in self._offsets:
1842+            raise LayoutInvalid("You must write the signature before the verification key")
1843+
1844+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
1845+        self._writevs.append(tuple([self._offsets['signature'], signature]))
1846+
1847+
1848+    def put_verification_key(self, verification_key):
1849+        """
1850+        I queue a write vector for the verification key.
1851+
1852+        I require that the signature have been written to the storage
1853+        server before I allow the verification key to be written to the
1854+        remote server.
1855+        """
1856+        if "verification_key" not in self._offsets:
1857+            raise LayoutInvalid("You must put the signature before you "
1858+                                "can put the verification key")
1859+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
1860+        self._writevs.append(tuple([self._offsets['verification_key'],
1861+                            verification_key]))
1862+
1863+
1864+    def _get_offsets_tuple(self):
1865+        return tuple([(key, value) for key, value in self._offsets.items()])
1866+
1867+
1868+    def get_verinfo(self):
1869+        return (self._seqnum,
1870+                self._root_hash,
1871+                self._required_shares,
1872+                self._total_shares,
1873+                self._segment_size,
1874+                self._data_length,
1875+                self.get_signable(),
1876+                self._get_offsets_tuple())
1877+
1878+
1879+    def finish_publishing(self):
1880+        """
1881+        I add a write vector for the offsets table, and then cause all
1882+        of the write vectors that I've dealt with so far to be published
1883+        to the remote server, ending the write process.
1884+        """
1885+        if "EOF" not in self._offsets:
1886+            raise LayoutInvalid("You must put the verification key before "
1887+                                "you can publish the offsets")
1888+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
1889+        offsets = struct.pack(MDMFOFFSETS,
1890+                              self._offsets['enc_privkey'],
1891+                              self._offsets['block_hash_tree'],
1892+                              self._offsets['share_hash_chain'],
1893+                              self._offsets['signature'],
1894+                              self._offsets['verification_key'],
1895+                              self._offsets['EOF'])
1896+        self._writevs.append(tuple([offsets_offset, offsets]))
1897+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
1898+        params = struct.pack(">BBQQ",
1899+                             self._required_shares,
1900+                             self._total_shares,
1901+                             self._segment_size,
1902+                             self._data_length)
1903+        self._writevs.append(tuple([encoding_parameters_offset, params]))
1904+        return self._write(self._writevs)
1905+
1906+
1907+    def _write(self, datavs, on_failure=None, on_success=None):
1908+        """I write the data vectors in datavs to the remote slot."""
1909+        tw_vectors = {}
1910+        new_share = False
1911+        if not self._testvs:
1912+            self._testvs = []
1913+            self._testvs.append(tuple([0, 1, "eq", ""]))
1914+            new_share = True
1915+        if not self._written:
1916+            # Write a new checkstring to the share when we write it, so
1917+            # that we have something to check later.
1918+            new_checkstring = self.get_checkstring()
1919+            datavs.append((0, new_checkstring))
1920+            def _first_write():
1921+                self._written = True
1922+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
1923+            on_success = _first_write
1924+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
1925+        datalength = sum([len(x[1]) for x in datavs])
1926+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
1927+                                  self._storage_index,
1928+                                  self._secrets,
1929+                                  tw_vectors,
1930+                                  self._readv)
1931+        def _result(results):
1932+            if isinstance(results, failure.Failure) or not results[0]:
1933+                # Do nothing; the write was unsuccessful.
1934+                if on_failure: on_failure()
1935+            else:
1936+                if on_success: on_success()
1937+            return results
1938+        d.addCallback(_result)
1939+        return d
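    # Note on test-vector semantics (per the storage server's
    # test-and-set API): each (offset, length, "eq", specimen) entry
    # asks the server to read 'length' bytes at 'offset' and compare
    # them to 'specimen'; the write vectors are applied only if every
    # test passes. The (0, 1, "eq", "") vector passes only when the
    # share doesn't exist yet, which is how we assert a new share.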
1940+
1941+
1942+class MDMFSlotReadProxy:
1943+    """
1944+    I read from a mutable slot filled with data written in the MDMF data
1945+    format (which is described above).
1946+
1947+    I can be initialized with some amount of data, which I will use (if
1948+    it is valid) to eliminate some of the need to fetch it from servers.
1949+    """
1950+    def __init__(self,
1951+                 rref,
1952+                 storage_index,
1953+                 shnum,
1954+                 data=""):
1955+        # Start the initialization process.
1956+        self._rref = rref
1957+        self._storage_index = storage_index
1958+        self.shnum = shnum
1959+
1960+        # Before doing anything, the reader is probably going to want to
1961+        # verify that the signature is correct. To do that, they'll need
1962+        # the verification key, and the signature. To get those, we'll
1963+        # need the offset table. So fetch the offset table on the
1964+        # assumption that that will be the first thing that a reader is
1965+        # going to do.
1966+
1967+        # The fact that these encoding parameters are None tells us
1968+        # that we haven't yet fetched them from the remote share, so we
1969+        # should. We could just not set them, but the checks will be
1970+        # easier to read if we don't have to use hasattr.
1971+        self._version_number = None
1972+        self._sequence_number = None
1973+        self._root_hash = None
1974+        # Filled in if we're dealing with an SDMF file. Unused
1975+        # otherwise.
1976+        self._salt = None
1977+        self._required_shares = None
1978+        self._total_shares = None
1979+        self._segment_size = None
1980+        self._data_length = None
1981+        self._offsets = None
1982+
1983+        # If the user has chosen to initialize us with some data, we'll
1984+        # try to satisfy subsequent data requests with that data before
1985+        # asking the storage server for it (see _read).
1986+        self._data = data
1987+        # The way callers interact with cache in the filenode returns
1988+        # None if there isn't any cached data, but the way we index the
1989+        # cached data requires a string, so convert None to "".
1990+        if self._data is None:
1991+            self._data = ""
1992+
1993+        self._queue_observers = observer.ObserverList()
1994+        self._queue_errbacks = observer.ObserverList()
1995+        self._readvs = []
1996+
1997+
1998+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
1999+        """
2000+        I fetch the offset table and the header from the remote slot if
2001+        I don't already have them. If I do have them, I do nothing and
2002+        return an empty Deferred.
2003+        """
2004+        if self._offsets:
2005+            return defer.succeed(None)
2006+        # At this point, we may be either SDMF or MDMF. Fetching 107
2007+        # bytes is enough to get the header and the offset table for
2008+        # either format: the SDMF prefix plus offsets (75 + 32) and the
2009+        # MDMF header plus offsets (59 + 48) both come to exactly 107
2010+        # bytes. One fetch is cheaper than a second roundtrip.
2011+        readvs = [(0, 107)]
2012+        d = self._read(readvs, force_remote)
2013+        d.addCallback(self._process_encoding_parameters)
2014+        d.addCallback(self._process_offsets)
2015+        return d
2016+
2017+
2018+    def _process_encoding_parameters(self, encoding_parameters):
2019+        assert self.shnum in encoding_parameters
2020+        encoding_parameters = encoding_parameters[self.shnum][0]
2021+        # The first byte is the version number. It will tell us what
2022+        # to do next.
2023+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
2024+        if verno == MDMF_VERSION:
2025+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
2026+            (verno,
2027+             seqnum,
2028+             root_hash,
2029+             k,
2030+             n,
2031+             segsize,
2032+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
2033+                                      encoding_parameters[:read_size])
2034+            if segsize == 0 and datalen == 0:
2035+                # Empty file, no segments.
2036+                self._num_segments = 0
2037+            else:
2038+                self._num_segments = mathutil.div_ceil(datalen, segsize)
2039+
2040+        elif verno == SDMF_VERSION:
2041+            read_size = SIGNED_PREFIX_LENGTH
2042+            (verno,
2043+             seqnum,
2044+             root_hash,
2045+             salt,
2046+             k,
2047+             n,
2048+             segsize,
2049+             datalen) = struct.unpack(">BQ32s16s BBQQ",
2050+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
2051+            self._salt = salt
2052+            if segsize == 0 and datalen == 0:
2053+                # empty file
2054+                self._num_segments = 0
2055+            else:
2056+                # non-empty SDMF files have one segment.
2057+                self._num_segments = 1
2058+        else:
2059+            raise UnknownVersionError("You asked me to read mutable file "
2060+                                      "version %d, but I only understand "
2061+                                      "%d and %d" % (verno, SDMF_VERSION,
2062+                                                     MDMF_VERSION))
2063+
2064+        self._version_number = verno
2065+        self._sequence_number = seqnum
2066+        self._root_hash = root_hash
2067+        self._required_shares = k
2068+        self._total_shares = n
2069+        self._segment_size = segsize
2070+        self._data_length = datalen
2071+
2072+        self._block_size = self._segment_size / self._required_shares
2073+        # We can upload empty files, and need to account for this fact
2074+        # so as to avoid zero-division and zero-modulo errors.
2075+        if datalen > 0:
2076+            tail_size = self._data_length % self._segment_size
2077+        else:
2078+            tail_size = 0
2079+        if not tail_size:
2080+            self._tail_block_size = self._block_size
2081+        else:
2082+            self._tail_block_size = mathutil.next_multiple(tail_size,
2083+                                                    self._required_shares)
2084+            self._tail_block_size /= self._required_shares
2085+
2086+        return encoding_parameters
2087+
2088+
2089+    def _process_offsets(self, offsets):
2090+        if self._version_number == 0:
2091+            read_size = OFFSETS_LENGTH
2092+            read_offset = SIGNED_PREFIX_LENGTH
2093+            end = read_size + read_offset
2094+            (signature,
2095+             share_hash_chain,
2096+             block_hash_tree,
2097+             share_data,
2098+             enc_privkey,
2099+             EOF) = struct.unpack(">LLLLQQ",
2100+                                  offsets[read_offset:end])
2101+            self._offsets = {}
2102+            self._offsets['signature'] = signature
2103+            self._offsets['share_data'] = share_data
2104+            self._offsets['block_hash_tree'] = block_hash_tree
2105+            self._offsets['share_hash_chain'] = share_hash_chain
2106+            self._offsets['enc_privkey'] = enc_privkey
2107+            self._offsets['EOF'] = EOF
2108+
2109+        elif self._version_number == 1:
2110+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
2111+            read_length = MDMFOFFSETS_LENGTH
2112+            end = read_offset + read_length
2113+            (encprivkey,
2114+             blockhashes,
2115+             sharehashes,
2116+             signature,
2117+             verification_key,
2118+             eof) = struct.unpack(MDMFOFFSETS,
2119+                                  offsets[read_offset:end])
2120+            self._offsets = {}
2121+            self._offsets['enc_privkey'] = encprivkey
2122+            self._offsets['block_hash_tree'] = blockhashes
2123+            self._offsets['share_hash_chain'] = sharehashes
2124+            self._offsets['signature'] = signature
2125+            self._offsets['verification_key'] = verification_key
2126+            self._offsets['EOF'] = eof
2127+
2128+
2129+    def get_block_and_salt(self, segnum, queue=False):
2130+        """
2131+        I return (block, salt), where block is the block data and
2132+        salt is the salt used to encrypt that segment.
2133+        """
2134+        d = self._maybe_fetch_offsets_and_header()
2135+        def _then(ignored):
2136+            if self._version_number == 1:
2137+                base_share_offset = MDMFHEADERSIZE
2138+            else:
2139+                base_share_offset = self._offsets['share_data']
2140+
2141+            if segnum + 1 > self._num_segments:
2142+                raise LayoutInvalid("Not a valid segment number")
2143+
2144+            if self._version_number == 0:
2145+                share_offset = base_share_offset + self._block_size * segnum
2146+            else:
2147+                share_offset = base_share_offset + (self._block_size + \
2148+                                                    SALT_SIZE) * segnum
2149+            if segnum + 1 == self._num_segments:
2150+                data = self._tail_block_size
2151+            else:
2152+                data = self._block_size
2153+
2154+            if self._version_number == 1:
2155+                data += SALT_SIZE
2156+
2157+            readvs = [(share_offset, data)]
2158+            return readvs
2159+        d.addCallback(_then)
2160+        d.addCallback(lambda readvs:
2161+            self._read(readvs, queue=queue))
2162+        def _process_results(results):
2163+            assert self.shnum in results
2164+            if self._version_number == 0:
2165+                # We only read the share data, but we know the salt from
2166+                # when we fetched the header
2167+                data = results[self.shnum]
2168+                if not data:
2169+                    data = ""
2170+                else:
2171+                    assert len(data) == 1
2172+                    data = data[0]
2173+                salt = self._salt
2174+            else:
2175+                data = results[self.shnum]
2176+                if not data:
2177+                    salt = data = ""
2178+                else:
2179+                    salt_and_data = results[self.shnum][0]
2180+                    salt = salt_and_data[:SALT_SIZE]
2181+                    data = salt_and_data[SALT_SIZE:]
2182+            return data, salt
2183+        d.addCallback(_process_results)
2184+        return d
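    # Worked example (illustrative numbers, taking SALT_SIZE == 16): for
    # an MDMF share with block_size == 1000, segment 2 starts at
    # MDMFHEADERSIZE + 2 * (1000 + 16) == 107 + 2032 == 2139, and the
    # readv covers the 16 salt bytes followed by the 1000-byte block.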
2185+
2186+
2187+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
2188+        """
2189+        I return the block hash tree
2190+
2191+        I take an optional argument, needed, which is a set of indices
2192+        corresponding to hashes that I should fetch. If this argument is
2193+        missing, I will fetch the entire block hash tree; otherwise, I
2194+        may attempt to fetch fewer hashes, based on what needed says
2195+        that I should do. Note that I may fetch as many hashes as I
2196+        want, so long as the set of hashes that I do fetch is a superset
2197+        of the ones that I am asked for, so callers should be prepared
2198+        to tolerate additional hashes.
2199+        """
2200+        # TODO: Return only the parts of the block hash tree necessary
2201+        # to validate the blocknum provided?
2202+        # This is a good idea, but it is hard to implement correctly. It
2203+        # is bad to fetch any one block hash more than once, so we
2204+        # probably just want to fetch the whole thing at once and then
2205+        # serve it.
2206+        if needed == set([]):
2207+            return defer.succeed([])
2208+        d = self._maybe_fetch_offsets_and_header()
2209+        def _then(ignored):
2210+            blockhashes_offset = self._offsets['block_hash_tree']
2211+            if self._version_number == 1:
2212+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
2213+            else:
2214+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
2215+            readvs = [(blockhashes_offset, blockhashes_length)]
2216+            return readvs
2217+        d.addCallback(_then)
2218+        d.addCallback(lambda readvs:
2219+            self._read(readvs, queue=queue, force_remote=force_remote))
2220+        def _build_block_hash_tree(results):
2221+            assert self.shnum in results
2222+
2223+            rawhashes = results[self.shnum][0]
2224+            results = [rawhashes[i:i+HASH_SIZE]
2225+                       for i in range(0, len(rawhashes), HASH_SIZE)]
2226+            return results
2227+        d.addCallback(_build_block_hash_tree)
2228+        return d
2229+
2230+
2231+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
2232+        """
2233+        I return the part of the share hash chain needed to validate
2234+        this share.
2235+
2236+        I take an optional argument, needed. Needed is a set of indices
2237+        that correspond to the hashes that I should fetch. If needed is
2238+        not present, I will fetch and return the entire share hash
2239+        chain. Otherwise, I may fetch and return any part of the share
2240+        hash chain that is a superset of the part that I am asked to
2241+        fetch. Callers should be prepared to deal with more hashes than
2242+        they've asked for.
2243+        """
2244+        if needed == set([]):
2245+            return defer.succeed([])
2246+        d = self._maybe_fetch_offsets_and_header()
2247+
2248+        def _make_readvs(ignored):
2249+            sharehashes_offset = self._offsets['share_hash_chain']
2250+            if self._version_number == 0:
2251+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
2252+            else:
2253+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
2254+            readvs = [(sharehashes_offset, sharehashes_length)]
2255+            return readvs
2256+        d.addCallback(_make_readvs)
2257+        d.addCallback(lambda readvs:
2258+            self._read(readvs, queue=queue, force_remote=force_remote))
2259+        def _build_share_hash_chain(results):
2260+            assert self.shnum in results
2261+
2262+            sharehashes = results[self.shnum][0]
2263+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
2264+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
2265+            results = dict([struct.unpack(">H32s", data)
2266+                            for data in results])
2267+            return results
2268+        d.addCallback(_build_share_hash_chain)
2269+        return d
2270+
2271+
2272+    def get_encprivkey(self, queue=False):
2273+        """
2274+        I return the encrypted private key.
2275+        """
2276+        d = self._maybe_fetch_offsets_and_header()
2277+
2278+        def _make_readvs(ignored):
2279+            privkey_offset = self._offsets['enc_privkey']
2280+            if self._version_number == 0:
2281+                privkey_length = self._offsets['EOF'] - privkey_offset
2282+            else:
2283+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
2284+            readvs = [(privkey_offset, privkey_length)]
2285+            return readvs
2286+        d.addCallback(_make_readvs)
2287+        d.addCallback(lambda readvs:
2288+            self._read(readvs, queue=queue))
2289+        def _process_results(results):
2290+            assert self.shnum in results
2291+            privkey = results[self.shnum][0]
2292+            return privkey
2293+        d.addCallback(_process_results)
2294+        return d
2295+
2296+
2297+    def get_signature(self, queue=False):
2298+        """
2299+        I return the signature of my share.
2300+        """
2301+        d = self._maybe_fetch_offsets_and_header()
2302+
2303+        def _make_readvs(ignored):
2304+            signature_offset = self._offsets['signature']
2305+            if self._version_number == 1:
2306+                signature_length = self._offsets['verification_key'] - signature_offset
2307+            else:
2308+                signature_length = self._offsets['share_hash_chain'] - signature_offset
2309+            readvs = [(signature_offset, signature_length)]
2310+            return readvs
2311+        d.addCallback(_make_readvs)
2312+        d.addCallback(lambda readvs:
2313+            self._read(readvs, queue=queue))
2314+        def _process_results(results):
2315+            assert self.shnum in results
2316+            signature = results[self.shnum][0]
2317+            return signature
2318+        d.addCallback(_process_results)
2319+        return d
2320+
2321+
2322+    def get_verification_key(self, queue=False):
2323+        """
2324+        I return the verification key.
2325+        """
2326+        d = self._maybe_fetch_offsets_and_header()
2327+
2328+        def _make_readvs(ignored):
2329+            if self._version_number == 1:
2330+                vk_offset = self._offsets['verification_key']
2331+                vk_length = self._offsets['EOF'] - vk_offset
2332+            else:
2333+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
2334+                vk_length = self._offsets['signature'] - vk_offset
2335+            readvs = [(vk_offset, vk_length)]
2336+            return readvs
2337+        d.addCallback(_make_readvs)
2338+        d.addCallback(lambda readvs:
2339+            self._read(readvs, queue=queue))
2340+        def _process_results(results):
2341+            assert self.shnum in results
2342+            verification_key = results[self.shnum][0]
2343+            return verification_key
2344+        d.addCallback(_process_results)
2345+        return d
2346+
2347+
2348+    def get_encoding_parameters(self):
2349+        """
2350+        I return (k, n, segsize, datalen)
2351+        """
2352+        d = self._maybe_fetch_offsets_and_header()
2353+        d.addCallback(lambda ignored:
2354+            (self._required_shares,
2355+             self._total_shares,
2356+             self._segment_size,
2357+             self._data_length))
2358+        return d
2359+
2360+
2361+    def get_seqnum(self):
2362+        """
2363+        I return the sequence number for this share.
2364+        """
2365+        d = self._maybe_fetch_offsets_and_header()
2366+        d.addCallback(lambda ignored:
2367+            self._sequence_number)
2368+        return d
2369+
2370+
2371+    def get_root_hash(self):
2372+        """
2373+        I return the root of the block hash tree
2374+        """
2375+        d = self._maybe_fetch_offsets_and_header()
2376+        d.addCallback(lambda ignored: self._root_hash)
2377+        return d
2378+
2379+
2380+    def get_checkstring(self):
2381+        """
2382+        I return the packed representation of the following:
2383+
2384+            - version number
2385+            - sequence number
2386+            - root hash
2387+            - the salt (SDMF only; MDMF checkstrings omit this field)
2388+
2389+        which my users use as a checkstring to detect other writers.
2390+        """
2391+        d = self._maybe_fetch_offsets_and_header()
2392+        def _build_checkstring(ignored):
2393+            if self._salt:
2394+                checkstring = struct.pack(PREFIX,
2395+                                         self._version_number,
2396+                                         self._sequence_number,
2397+                                         self._root_hash,
2398+                                         self._salt)
2399+            else:
2400+                checkstring = struct.pack(MDMFCHECKSTRING,
2401+                                          self._version_number,
2402+                                          self._sequence_number,
2403+                                          self._root_hash)
2404+
2405+            return checkstring
2406+        d.addCallback(_build_checkstring)
2407+        return d
2408+
2409+
2410+    def get_prefix(self, force_remote):
2411+        d = self._maybe_fetch_offsets_and_header(force_remote)
2412+        d.addCallback(lambda ignored:
2413+            self._build_prefix())
2414+        return d
2415+
2416+
2417+    def _build_prefix(self):
2418+        # The prefix is another name for the part of the remote share
2419+        # that gets signed. It consists of everything up to and
2420+        # including the datalength, packed by struct.
2421+        if self._version_number == SDMF_VERSION:
2422+            return struct.pack(SIGNED_PREFIX,
2423+                           self._version_number,
2424+                           self._sequence_number,
2425+                           self._root_hash,
2426+                           self._salt,
2427+                           self._required_shares,
2428+                           self._total_shares,
2429+                           self._segment_size,
2430+                           self._data_length)
2431+
2432+        else:
2433+            return struct.pack(MDMFSIGNABLEHEADER,
2434+                           self._version_number,
2435+                           self._sequence_number,
2436+                           self._root_hash,
2437+                           self._required_shares,
2438+                           self._total_shares,
2439+                           self._segment_size,
2440+                           self._data_length)
2441+
2442+
2443+    def _get_offsets_tuple(self):
2444+        # The offsets are another component of the version information
2445+        # tuple. We return a copy of our offsets dictionary, which is
2446+        # what callers compare and iterate over.
2447+        return self._offsets.copy()
2448+
2449+
2450+    def get_verinfo(self):
2451+        """
2452+        I return my verinfo tuple. This is used by the ServermapUpdater
2453+        to keep track of versions of mutable files.
2454+
2455+        The verinfo tuple for MDMF files contains:
2456+            - seqnum
2457+            - root hash
2458+            - a blank (nothing)
2459+            - segsize
2460+            - datalen
2461+            - k
2462+            - n
2463+            - prefix (the thing that you sign)
2464+            - a tuple of offsets
2465+
2466+        We include the blank in MDMF so the tuple has the same shape
2467+        for both formats, which simplifies processing of version
2468+        information tuples.
2469+
2470+        The SDMF verinfo tuple carries the 16-byte IV (salt) in place of the blank.
2471+        """
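+        # Illustration only: with the 3-of-10, 36-byte, 6-segment
+        # parameters used in this patch's tests, an MDMF verinfo tuple
+        # would look like:
+        #   (0, root_hash, None, 6, 36, 3, 10, prefix, offsets)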
2472+        d = self._maybe_fetch_offsets_and_header()
2473+        def _build_verinfo(ignored):
2474+            if self._version_number == SDMF_VERSION:
2475+                salt_to_use = self._salt
2476+            else:
2477+                salt_to_use = None
2478+            return (self._sequence_number,
2479+                    self._root_hash,
2480+                    salt_to_use,
2481+                    self._segment_size,
2482+                    self._data_length,
2483+                    self._required_shares,
2484+                    self._total_shares,
2485+                    self._build_prefix(),
2486+                    self._get_offsets_tuple())
2487+        d.addCallback(_build_verinfo)
2488+        return d
2489+
2490+
2491+    def flush(self):
2492+        """
2493+        I flush my queue of read vectors.
2494+        """
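+        # Note: I do not return a Deferred. Callers that queued reads
+        # receive their results through the Deferreds created by
+        # _read(..., queue=True), which fire via the observer lists
+        # below.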
2495+        d = self._read(self._readvs)
2496+        def _then(results):
2497+            self._readvs = []
2498+            if isinstance(results, failure.Failure):
2499+                self._queue_errbacks.notify(results)
2500+            else:
2501+                self._queue_observers.notify(results)
2502+            self._queue_observers = observer.ObserverList()
2503+            self._queue_errbacks = observer.ObserverList()
2504+        d.addBoth(_then)
2505+
2506+
2507+    def _read(self, readvs, force_remote=False, queue=False):
2508+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
2509+        # TODO: It's entirely possible to tweak this so that it just
2510+        # fulfills the requests that it can, and not demand that all
2511+        # requests are satisfiable before running it.
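+        # (A read vector is a list of (offset, length) pairs; results
+        # come back as {shnum: [data, ...]}, one string per pair. For
+        # example, _read([(0, 1), (1, 8)]) would fetch the version
+        # byte and the sequence number.)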
2512+        if not unsatisfiable and not force_remote:
2513+            results = [self._data[offset:offset+length]
2514+                       for (offset, length) in readvs]
2515+            results = {self.shnum: results}
2516+            return defer.succeed(results)
2517+        else:
2518+            if queue:
2519+                start = len(self._readvs)
2520+                self._readvs += readvs
2521+                end = len(self._readvs)
2522+                def _get_results(results, start, end):
2523+                    if self.shnum not in results:
2524+                        return {self.shnum: [""]}
2525+                    return {self.shnum: results[self.shnum][start:end]}
2526+                d = defer.Deferred()
2527+                d.addCallback(_get_results, start, end)
2528+                self._queue_observers.subscribe(d.callback)
2529+                self._queue_errbacks.subscribe(d.errback)
2530+                return d
2531+            return self._rref.callRemote("slot_readv",
2532+                                         self._storage_index,
2533+                                         [self.shnum],
2534+                                         readvs)
2535+
2536+
2537+    def is_sdmf(self):
2538+        """I report whether my remote file is SDMF (True) or MDMF (False)
2539+        """
2540+        d = self._maybe_fetch_offsets_and_header()
2541+        d.addCallback(lambda ignored:
2542+            self._version_number == SDMF_VERSION)
2543+        return d
2544+
2545+
2546+class LayoutInvalid(Exception):
2547+    """
2548+    This isn't a valid MDMF mutable file
2549+    """
2550hunk ./src/allmydata/test/test_storage.py 2
2551 
2552-import time, os.path, stat, re, simplejson, struct
2553+import time, os.path, stat, re, simplejson, struct, shutil
2554 
2555 from twisted.trial import unittest
2556 
2557hunk ./src/allmydata/test/test_storage.py 22
2558 from allmydata.storage.expirer import LeaseCheckingCrawler
2559 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
2560      ReadBucketProxy
2561-from allmydata.interfaces import BadWriteEnablerError
2562-from allmydata.test.common import LoggingServiceParent
2563+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
2564+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
2565+                                     SIGNED_PREFIX, MDMFHEADER, \
2566+                                     MDMFOFFSETS, SDMFSlotWriteProxy
2567+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
2568+                                 SDMF_VERSION
2569+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
2570 from allmydata.test.common_web import WebRenderingMixin
2571 from allmydata.web.storage import StorageStatus, remove_prefix
2572 
2573hunk ./src/allmydata/test/test_storage.py 106
2574 
2575 class RemoteBucket:
2576 
2577+    def __init__(self):
2578+        self.read_count = 0
2579+        self.write_count = 0
2580+
2581     def callRemote(self, methname, *args, **kwargs):
2582         def _call():
2583             meth = getattr(self.target, "remote_" + methname)
2584hunk ./src/allmydata/test/test_storage.py 114
2585             return meth(*args, **kwargs)
2586+
2587+        if methname == "slot_readv":
2588+            self.read_count += 1
2589+        if "writev" in methname:
2590+            self.write_count += 1
2591+
2592         return defer.maybeDeferred(_call)
2593 
2594hunk ./src/allmydata/test/test_storage.py 122
2595+
2596 class BucketProxy(unittest.TestCase):
2597     def make_bucket(self, name, size):
2598         basedir = os.path.join("storage", "BucketProxy", name)
2599hunk ./src/allmydata/test/test_storage.py 1313
2600         self.failUnless(os.path.exists(prefixdir), prefixdir)
2601         self.failIf(os.path.exists(bucketdir), bucketdir)
2602 
2603+
2604+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
2605+    def setUp(self):
2606+        self.sparent = LoggingServiceParent()
2607+        self._lease_secret = itertools.count()
2608+        self.ss = self.create("MDMFProxies storage test server")
2609+        self.rref = RemoteBucket()
2610+        self.rref.target = self.ss
2611+        self.secrets = (self.write_enabler("we_secret"),
2612+                        self.renew_secret("renew_secret"),
2613+                        self.cancel_secret("cancel_secret"))
2614+        self.segment = "aaaaaa"
2615+        self.block = "aa"
2616+        self.salt = "a" * 16
2617+        self.block_hash = "a" * 32
2618+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
2619+        self.share_hash = self.block_hash
2620+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
2621+        self.signature = "foobarbaz"
2622+        self.verification_key = "vvvvvv"
2623+        self.encprivkey = "private"
2624+        self.root_hash = self.block_hash
2625+        self.salt_hash = self.root_hash
2626+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
2627+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
2628+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
2629+        # blockhashes and salt hashes are serialized in the same way,
2630+        # except that we lop off the first salt hash and store it in
2631+        # the header.
2632+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
2633+
2634+
2635+    def tearDown(self):
2636+        self.sparent.stopService()
2637+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
2638+
2639+
2640+    def write_enabler(self, we_tag):
2641+        return hashutil.tagged_hash("we_blah", we_tag)
2642+
2643+
2644+    def renew_secret(self, tag):
2645+        return hashutil.tagged_hash("renew_blah", str(tag))
2646+
2647+
2648+    def cancel_secret(self, tag):
2649+        return hashutil.tagged_hash("cancel_blah", str(tag))
2650+
2651+
2652+    def workdir(self, name):
2653+        basedir = os.path.join("storage", "MutableServer", name)
2654+        return basedir
2655+
2656+
2657+    def create(self, name):
2658+        workdir = self.workdir(name)
2659+        ss = StorageServer(workdir, "\x00" * 20)
2660+        ss.setServiceParent(self.sparent)
2661+        return ss
2662+
2663+
2664+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
2665+        # Start with the checkstring
2666+        data = struct.pack(">BQ32s",
2667+                           1,
2668+                           0,
2669+                           self.root_hash)
2670+        self.checkstring = data
2671+        # Next, the encoding parameters
2672+        if tail_segment:
2673+            data += struct.pack(">BBQQ",
2674+                                3,
2675+                                10,
2676+                                6,
2677+                                33)
2678+        elif empty:
2679+            data += struct.pack(">BBQQ",
2680+                                3,
2681+                                10,
2682+                                0,
2683+                                0)
2684+        else:
2685+            data += struct.pack(">BBQQ",
2686+                                3,
2687+                                10,
2688+                                6,
2689+                                36)
2690+        # Now we'll build the share data, then compute the offsets.
2691+        sharedata = ""
2692+        if not tail_segment and not empty:
2693+            for i in xrange(6):
2694+                sharedata += self.salt + self.block
2695+        elif tail_segment:
2696+            for i in xrange(5):
2697+                sharedata += self.salt + self.block
2698+            sharedata += self.salt + "a"
2699+
2700+        # The encrypted private key comes after the shares + salts
2701+        offset_size = struct.calcsize(MDMFOFFSETS)
2702+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
2703+        # The blockhashes come after the private key
2704+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
2705+        # The sharehashes come after the block hashes
2706+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
2707+        # The signature comes after the share hash chain
2708+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
2709+        # The verification key comes after the signature
2710+        verification_offset = signature_offset + len(self.signature)
2711+        # The EOF comes after the verification key
2712+        eof_offset = verification_offset + len(self.verification_key)
2713+        data += struct.pack(MDMFOFFSETS,
2714+                            encrypted_private_key_offset,
2715+                            blockhashes_offset,
2716+                            sharehashes_offset,
2717+                            signature_offset,
2718+                            verification_offset,
2719+                            eof_offset)
2720+        self.offsets = {}
2721+        self.offsets['enc_privkey'] = encrypted_private_key_offset
2722+        self.offsets['block_hash_tree'] = blockhashes_offset
2723+        self.offsets['share_hash_chain'] = sharehashes_offset
2724+        self.offsets['signature'] = signature_offset
2725+        self.offsets['verification_key'] = verification_offset
2726+        self.offsets['EOF'] = eof_offset
2727+        # Next, we'll add in the salts and share data,
2728+        data += sharedata
2729+        # the private key,
2730+        data += self.encprivkey
2731+        # the block hash tree,
2732+        data += self.block_hash_tree_s
2733+        # the share hash chain,
2734+        data += self.share_hash_chain_s
2735+        # the signature,
2736+        data += self.signature
2737+        # and the verification key
2738+        data += self.verification_key
2739+        return data
2740+
2741+
2742+    def write_test_share_to_server(self,
2743+                                   storage_index,
2744+                                   tail_segment=False,
2745+                                   empty=False):
2746+        """
2747+        I write some test share data to self.ss for the read tests to read.
2748+
2749+        If tail_segment=True, then I will write a share that has a
2750+        smaller tail segment than other segments.
2751+        """
2752+        write = self.ss.remote_slot_testv_and_readv_and_writev
2753+        data = self.build_test_mdmf_share(tail_segment, empty)
2754+        # Finally, we write the whole thing to the storage server in one
2755+        # pass.
2756+        testvs = [(0, 1, "eq", "")]
2757+        tws = {}
2758+        tws[0] = (testvs, [(0, data)], None)
2759+        readv = [(0, 1)]
2760+        results = write(storage_index, self.secrets, tws, readv)
2761+        self.failUnless(results[0])
2762+
2763+
2764+    def build_test_sdmf_share(self, empty=False):
2765+        if empty:
2766+            sharedata = ""
2767+        else:
2768+            sharedata = self.segment * 6
2769+        self.sharedata = sharedata
2770+        blocksize = len(sharedata) / 3
2771+        block = sharedata[:blocksize]
2772+        self.blockdata = block
2773+        prefix = struct.pack(">BQ32s16s BBQQ",
2774+                             0, # version,
2775+                             0,
2776+                             self.root_hash,
2777+                             self.salt,
2778+                             3,
2779+                             10,
2780+                             len(sharedata),
2781+                             len(sharedata),
2782+                            )
2783+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
2784+        signature_offset = post_offset + len(self.verification_key)
2785+        sharehashes_offset = signature_offset + len(self.signature)
2786+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
2787+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
2788+        encprivkey_offset = sharedata_offset + len(block)
2789+        eof_offset = encprivkey_offset + len(self.encprivkey)
2790+        offsets = struct.pack(">LLLLQQ",
2791+                              signature_offset,
2792+                              sharehashes_offset,
2793+                              blockhashes_offset,
2794+                              sharedata_offset,
2795+                              encprivkey_offset,
2796+                              eof_offset)
2797+        final_share = "".join([prefix,
2798+                           offsets,
2799+                           self.verification_key,
2800+                           self.signature,
2801+                           self.share_hash_chain_s,
2802+                           self.block_hash_tree_s,
2803+                           block,
2804+                           self.encprivkey])
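+        # Note the field order: unlike the MDMF test share built above,
+        # an SDMF share stores the verification key and signature before
+        # the share data, and the encrypted private key last.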
2805+        self.offsets = {}
2806+        self.offsets['signature'] = signature_offset
2807+        self.offsets['share_hash_chain'] = sharehashes_offset
2808+        self.offsets['block_hash_tree'] = blockhashes_offset
2809+        self.offsets['share_data'] = sharedata_offset
2810+        self.offsets['enc_privkey'] = encprivkey_offset
2811+        self.offsets['EOF'] = eof_offset
2812+        return final_share
2813+
2814+
2815+    def write_sdmf_share_to_server(self,
2816+                                   storage_index,
2817+                                   empty=False):
2818+        # Some tests need SDMF shares to verify that we can still read
2819+        # them. This method writes a synthetic (hand-packed) SDMF share.
2820+        assert self.rref
2821+        write = self.ss.remote_slot_testv_and_readv_and_writev
2822+        share = self.build_test_sdmf_share(empty)
2823+        testvs = [(0, 1, "eq", "")]
2824+        tws = {}
2825+        tws[0] = (testvs, [(0, share)], None)
2826+        readv = []
2827+        results = write(storage_index, self.secrets, tws, readv)
2828+        self.failUnless(results[0])
2829+
2830+
2831+    def test_read(self):
2832+        self.write_test_share_to_server("si1")
2833+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2834+        # Check that every method equals what we expect it to.
2835+        d = defer.succeed(None)
2836+        def _check_block_and_salt((block, salt)):
2837+            self.failUnlessEqual(block, self.block)
2838+            self.failUnlessEqual(salt, self.salt)
2839+
2840+        for i in xrange(6):
2841+            d.addCallback(lambda ignored, i=i:
2842+                mr.get_block_and_salt(i))
2843+            d.addCallback(_check_block_and_salt)
2844+
2845+        d.addCallback(lambda ignored:
2846+            mr.get_encprivkey())
2847+        d.addCallback(lambda encprivkey:
2848+            self.failUnlessEqual(self.encprivkey, encprivkey))
2849+
2850+        d.addCallback(lambda ignored:
2851+            mr.get_blockhashes())
2852+        d.addCallback(lambda blockhashes:
2853+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
2854+
2855+        d.addCallback(lambda ignored:
2856+            mr.get_sharehashes())
2857+        d.addCallback(lambda sharehashes:
2858+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
2859+
2860+        d.addCallback(lambda ignored:
2861+            mr.get_signature())
2862+        d.addCallback(lambda signature:
2863+            self.failUnlessEqual(signature, self.signature))
2864+
2865+        d.addCallback(lambda ignored:
2866+            mr.get_verification_key())
2867+        d.addCallback(lambda verification_key:
2868+            self.failUnlessEqual(verification_key, self.verification_key))
2869+
2870+        d.addCallback(lambda ignored:
2871+            mr.get_seqnum())
2872+        d.addCallback(lambda seqnum:
2873+            self.failUnlessEqual(seqnum, 0))
2874+
2875+        d.addCallback(lambda ignored:
2876+            mr.get_root_hash())
2877+        d.addCallback(lambda root_hash:
2878+            self.failUnlessEqual(self.root_hash, root_hash))
2879+
2880+        d.addCallback(lambda ignored:
2881+            mr.get_seqnum())
2882+        d.addCallback(lambda seqnum:
2883+            self.failUnlessEqual(0, seqnum))
2884+
2885+        d.addCallback(lambda ignored:
2886+            mr.get_encoding_parameters())
2887+        def _check_encoding_parameters((k, n, segsize, datalen)):
2888+            self.failUnlessEqual(k, 3)
2889+            self.failUnlessEqual(n, 10)
2890+            self.failUnlessEqual(segsize, 6)
2891+            self.failUnlessEqual(datalen, 36)
2892+        d.addCallback(_check_encoding_parameters)
2893+
2894+        d.addCallback(lambda ignored:
2895+            mr.get_checkstring())
2896+        d.addCallback(lambda checkstring:
2897+            self.failUnlessEqual(checkstring, self.checkstring))
2898+        return d
2899+
2900+
2901+    def test_read_with_different_tail_segment_size(self):
2902+        self.write_test_share_to_server("si1", tail_segment=True)
2903+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2904+        d = mr.get_block_and_salt(5)
2905+        def _check_tail_segment(results):
2906+            block, salt = results
2907+            self.failUnlessEqual(len(block), 1)
2908+            self.failUnlessEqual(block, "a")
2909+        d.addCallback(_check_tail_segment)
2910+        return d
2911+
2912+
2913+    def test_get_block_with_invalid_segnum(self):
2914+        self.write_test_share_to_server("si1")
2915+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2916+        d = defer.succeed(None)
2917+        d.addCallback(lambda ignored:
2918+            self.shouldFail(LayoutInvalid, "test invalid segnum",
2919+                            None,
2920+                            mr.get_block_and_salt, 7))
2921+        return d
2922+
2923+
2924+    def test_get_encoding_parameters_first(self):
2925+        self.write_test_share_to_server("si1")
2926+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2927+        d = mr.get_encoding_parameters()
2928+        def _check_encoding_parameters((k, n, segment_size, datalen)):
2929+            self.failUnlessEqual(k, 3)
2930+            self.failUnlessEqual(n, 10)
2931+            self.failUnlessEqual(segment_size, 6)
2932+            self.failUnlessEqual(datalen, 36)
2933+        d.addCallback(_check_encoding_parameters)
2934+        return d
2935+
2936+
2937+    def test_get_seqnum_first(self):
2938+        self.write_test_share_to_server("si1")
2939+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2940+        d = mr.get_seqnum()
2941+        d.addCallback(lambda seqnum:
2942+            self.failUnlessEqual(seqnum, 0))
2943+        return d
2944+
2945+
2946+    def test_get_root_hash_first(self):
2947+        self.write_test_share_to_server("si1")
2948+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2949+        d = mr.get_root_hash()
2950+        d.addCallback(lambda root_hash:
2951+            self.failUnlessEqual(root_hash, self.root_hash))
2952+        return d
2953+
2954+
2955+    def test_get_checkstring_first(self):
2956+        self.write_test_share_to_server("si1")
2957+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
2958+        d = mr.get_checkstring()
2959+        d.addCallback(lambda checkstring:
2960+            self.failUnlessEqual(checkstring, self.checkstring))
2961+        return d
2962+
2963+
2964+    def test_write_read_vectors(self):
2965+        # When we write, the storage server returns the results of our
2966+        # read vector along with the write's result. If a write fails
2967+        # because the test vectors failed, this read data can help us
2968+        # diagnose the problem. This test ensures that the read vector
2969+        # works appropriately.
2970+        mw = self._make_new_mw("si1", 0)
2971+
2972+        for i in xrange(6):
2973+            mw.put_block(self.block, i, self.salt)
2974+        mw.put_encprivkey(self.encprivkey)
2975+        mw.put_blockhashes(self.block_hash_tree)
2976+        mw.put_sharehashes(self.share_hash_chain)
2977+        mw.put_root_hash(self.root_hash)
2978+        mw.put_signature(self.signature)
2979+        mw.put_verification_key(self.verification_key)
2980+        d = mw.finish_publishing()
2981+        def _then(results):
2982+            self.failUnlessEqual(len(results), 2)
2983+            result, readv = results
2984+            self.failUnless(result)
2985+            self.failIf(readv)
2986+            self.old_checkstring = mw.get_checkstring()
2987+            mw.set_checkstring("")
2988+        d.addCallback(_then)
2989+        d.addCallback(lambda ignored:
2990+            mw.finish_publishing())
2991+        def _then_again(results):
2992+            self.failUnlessEqual(len(results), 2)
2993+            result, readvs = results
2994+            self.failIf(result)
2995+            self.failUnlessIn(0, readvs)
2996+            readv = readvs[0][0]
2997+            self.failUnlessEqual(readv, self.old_checkstring)
2998+        d.addCallback(_then_again)
2999+        # The checkstring remains the same for the rest of the process.
3000+        return d
3001+
3002+
3003+    def test_blockhashes_after_share_hash_chain(self):
3004+        mw = self._make_new_mw("si1", 0)
3005+        d = defer.succeed(None)
3006+        # Put everything up to and including the share hash chain
3007+        for i in xrange(6):
3008+            d.addCallback(lambda ignored, i=i:
3009+                mw.put_block(self.block, i, self.salt))
3010+        d.addCallback(lambda ignored:
3011+            mw.put_encprivkey(self.encprivkey))
3012+        d.addCallback(lambda ignored:
3013+            mw.put_blockhashes(self.block_hash_tree))
3014+        d.addCallback(lambda ignored:
3015+            mw.put_sharehashes(self.share_hash_chain))
3016+
3017+        # Now try to put the block hash tree again.
3018+        d.addCallback(lambda ignored:
3019+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
3020+                            None,
3021+                            mw.put_blockhashes, self.block_hash_tree))
3022+        return d
3023+
3024+
3025+    def test_encprivkey_after_blockhashes(self):
3026+        mw = self._make_new_mw("si1", 0)
3027+        d = defer.succeed(None)
3028+        # Put everything up to and including the block hash tree
3029+        for i in xrange(6):
3030+            d.addCallback(lambda ignored, i=i:
3031+                mw.put_block(self.block, i, self.salt))
3032+        d.addCallback(lambda ignored:
3033+            mw.put_encprivkey(self.encprivkey))
3034+        d.addCallback(lambda ignored:
3035+            mw.put_blockhashes(self.block_hash_tree))
3036+        d.addCallback(lambda ignored:
3037+            self.shouldFail(LayoutInvalid, "out of order private key",
3038+                            None,
3039+                            mw.put_encprivkey, self.encprivkey))
3040+        return d
3041+
3042+
3043+    def test_share_hash_chain_after_signature(self):
3044+        mw = self._make_new_mw("si1", 0)
3045+        d = defer.succeed(None)
3046+        # Put everything up to and including the signature
3047+        for i in xrange(6):
3048+            d.addCallback(lambda ignored, i=i:
3049+                mw.put_block(self.block, i, self.salt))
3050+        d.addCallback(lambda ignored:
3051+            mw.put_encprivkey(self.encprivkey))
3052+        d.addCallback(lambda ignored:
3053+            mw.put_blockhashes(self.block_hash_tree))
3054+        d.addCallback(lambda ignored:
3055+            mw.put_sharehashes(self.share_hash_chain))
3056+        d.addCallback(lambda ignored:
3057+            mw.put_root_hash(self.root_hash))
3058+        d.addCallback(lambda ignored:
3059+            mw.put_signature(self.signature))
3060+        # Now try to put the share hash chain again. This should fail
3061+        d.addCallback(lambda ignored:
3062+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
3063+                            None,
3064+                            mw.put_sharehashes, self.share_hash_chain))
3065+        return d
3066+
3067+
3068+    def test_signature_after_verification_key(self):
3069+        mw = self._make_new_mw("si1", 0)
3070+        d = defer.succeed(None)
3071+        # Put everything up to and including the verification key.
3072+        for i in xrange(6):
3073+            d.addCallback(lambda ignored, i=i:
3074+                mw.put_block(self.block, i, self.salt))
3075+        d.addCallback(lambda ignored:
3076+            mw.put_encprivkey(self.encprivkey))
3077+        d.addCallback(lambda ignored:
3078+            mw.put_blockhashes(self.block_hash_tree))
3079+        d.addCallback(lambda ignored:
3080+            mw.put_sharehashes(self.share_hash_chain))
3081+        d.addCallback(lambda ignored:
3082+            mw.put_root_hash(self.root_hash))
3083+        d.addCallback(lambda ignored:
3084+            mw.put_signature(self.signature))
3085+        d.addCallback(lambda ignored:
3086+            mw.put_verification_key(self.verification_key))
3087+        # Now try to put the signature again. This should fail
3088+        d.addCallback(lambda ignored:
3089+            self.shouldFail(LayoutInvalid, "signature after verification",
3090+                            None,
3091+                            mw.put_signature, self.signature))
3092+        return d
3093+
3094+
3095+    def test_uncoordinated_write(self):
3096+        # Make two mutable writers, both pointing to the same storage
3097+        # server, both at the same storage index, and try writing to the
3098+        # same share.
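+        # The second write should fail: a fresh writer's test vector
+        # expects to find an empty checkstring in the slot, and that
+        # check no longer passes once the first writer has published.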
3099+        mw1 = self._make_new_mw("si1", 0)
3100+        mw2 = self._make_new_mw("si1", 0)
3101+
3102+        def _check_success(results):
3103+            result, readvs = results
3104+            self.failUnless(result)
3105+
3106+        def _check_failure(results):
3107+            result, readvs = results
3108+            self.failIf(result)
3109+
3110+        def _write_share(mw):
3111+            for i in xrange(6):
3112+                mw.put_block(self.block, i, self.salt)
3113+            mw.put_encprivkey(self.encprivkey)
3114+            mw.put_blockhashes(self.block_hash_tree)
3115+            mw.put_sharehashes(self.share_hash_chain)
3116+            mw.put_root_hash(self.root_hash)
3117+            mw.put_signature(self.signature)
3118+            mw.put_verification_key(self.verification_key)
3119+            return mw.finish_publishing()
3120+        d = _write_share(mw1)
3121+        d.addCallback(_check_success)
3122+        d.addCallback(lambda ignored:
3123+            _write_share(mw2))
3124+        d.addCallback(_check_failure)
3125+        return d
3126+
3127+
3128+    def test_invalid_salt_size(self):
3129+        # Salts need to be 16 bytes in size. Writes that attempt to
3130+        # write more or less than this should be rejected.
3131+        mw = self._make_new_mw("si1", 0)
3132+        invalid_salt = "a" * 17 # 17 bytes
3133+        another_invalid_salt = "b" * 15 # 15 bytes
3134+        d = defer.succeed(None)
3135+        d.addCallback(lambda ignored:
3136+            self.shouldFail(LayoutInvalid, "salt too big",
3137+                            None,
3138+                            mw.put_block, self.block, 0, invalid_salt))
3139+        d.addCallback(lambda ignored:
3140+            self.shouldFail(LayoutInvalid, "salt too small",
3141+                            None,
3142+                            mw.put_block, self.block, 0,
3143+                            another_invalid_salt))
3144+        return d
3145+
3146+
3147+    def test_write_test_vectors(self):
3148+        # If we give the write proxy a bogus test vector at
3149+        # any point during the process, it should fail to write when we
3150+        # tell it to write.
3151+        def _check_failure(results):
3152+            self.failUnlessEqual(len(results), 2)
3153+            res, readv = results
3154+            self.failIf(res)
3155+
3156+        def _check_success(results):
3157+            self.failUnlessEqual(len(results), 2)
3158+            res, readv = results
3159+            self.failUnless(res)
3160+
3161+        mw = self._make_new_mw("si1", 0)
3162+        mw.set_checkstring("this is a lie")
3163+        for i in xrange(6):
3164+            mw.put_block(self.block, i, self.salt)
3165+        mw.put_encprivkey(self.encprivkey)
3166+        mw.put_blockhashes(self.block_hash_tree)
3167+        mw.put_sharehashes(self.share_hash_chain)
3168+        mw.put_root_hash(self.root_hash)
3169+        mw.put_signature(self.signature)
3170+        mw.put_verification_key(self.verification_key)
3171+        d = mw.finish_publishing()
3172+        d.addCallback(_check_failure)
3173+        d.addCallback(lambda ignored:
3174+            mw.set_checkstring(""))
3175+        d.addCallback(lambda ignored:
3176+            mw.finish_publishing())
3177+        d.addCallback(_check_success)
3178+        return d
3179+
3180+
3181+    def serialize_blockhashes(self, blockhashes):
3182+        return "".join(blockhashes)
3183+
3184+
3185+    def serialize_sharehashes(self, sharehashes):
3186+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
3187+                        for i in sorted(sharehashes.keys())])
3188+        return ret
3189+
3190+
3191+    def test_write(self):
3192+        # This translates to a file with 6 6-byte segments, and with 2-byte
3193+        # blocks.
3194+        mw = self._make_new_mw("si1", 0)
3195+        # Test writing some blocks.
3196+        read = self.ss.remote_slot_readv
3197+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
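+        # For reference, the fixed offsets used in the header checks
+        # below assume this MDMF header layout:
+        #   0   version number (1 byte)
+        #   1   sequence number (8 bytes)
+        #   9   root hash (32 bytes)
+        #   41  k (1 byte)
+        #   42  N (1 byte)
+        #   43  segment size (8 bytes)
+        #   51  data length (8 bytes)
+        #   59  offset table (six 8-byte offsets), ending at 107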
3198+        written_block_size = 2 + len(self.salt)
3199+        written_block = self.block + self.salt
3200+        for i in xrange(6):
3201+            mw.put_block(self.block, i, self.salt)
3202+
3203+        mw.put_encprivkey(self.encprivkey)
3204+        mw.put_blockhashes(self.block_hash_tree)
3205+        mw.put_sharehashes(self.share_hash_chain)
3206+        mw.put_root_hash(self.root_hash)
3207+        mw.put_signature(self.signature)
3208+        mw.put_verification_key(self.verification_key)
3209+        d = mw.finish_publishing()
3210+        def _check_publish(results):
3211+            self.failUnlessEqual(len(results), 2)
3212+            result, ign = results
3213+            self.failUnless(result, "publish failed")
3214+            for i in xrange(6):
3215+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
3216+                                {0: [written_block]})
3217+
3218+            expected_private_key_offset = expected_sharedata_offset + \
3219+                                      len(written_block) * 6
3220+            self.failUnlessEqual(len(self.encprivkey), 7)
3221+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
3222+                                 {0: [self.encprivkey]})
3223+
3224+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
3225+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
3226+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
3227+                                 {0: [self.block_hash_tree_s]})
3228+
3229+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
3230+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
3231+                                 {0: [self.share_hash_chain_s]})
3232+
3233+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
3234+                                 {0: [self.root_hash]})
3235+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
3236+            self.failUnlessEqual(len(self.signature), 9)
3237+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
3238+                                 {0: [self.signature]})
3239+
3240+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
3241+            self.failUnlessEqual(len(self.verification_key), 6)
3242+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
3243+                                 {0: [self.verification_key]})
3244+
3245+            signable = mw.get_signable()
3246+            verno, seq, roothash, k, n, segsize, datalen = \
3247+                                            struct.unpack(">BQ32sBBQQ",
3248+                                                          signable)
3249+            self.failUnlessEqual(verno, 1)
3250+            self.failUnlessEqual(seq, 0)
3251+            self.failUnlessEqual(roothash, self.root_hash)
3252+            self.failUnlessEqual(k, 3)
3253+            self.failUnlessEqual(n, 10)
3254+            self.failUnlessEqual(segsize, 6)
3255+            self.failUnlessEqual(datalen, 36)
3256+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
3257+
3258+            # Check the version number to make sure that it is correct.
3259+            expected_version_number = struct.pack(">B", 1)
3260+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
3261+                                 {0: [expected_version_number]})
3262+            # Check the sequence number to make sure that it is correct
3263+            expected_sequence_number = struct.pack(">Q", 0)
3264+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
3265+                                 {0: [expected_sequence_number]})
3266+            # Check that the encoding parameters (k, N, segment size, data
3267+            # length) are what they should be: 3, 10, 6, and 36.
3268+            expected_k = struct.pack(">B", 3)
3269+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
3270+                                 {0: [expected_k]})
3271+            expected_n = struct.pack(">B", 10)
3272+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
3273+                                 {0: [expected_n]})
3274+            expected_segment_size = struct.pack(">Q", 6)
3275+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
3276+                                 {0: [expected_segment_size]})
3277+            expected_data_length = struct.pack(">Q", 36)
3278+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
3279+                                 {0: [expected_data_length]})
3280+            expected_offset = struct.pack(">Q", expected_private_key_offset)
3281+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
3282+                                 {0: [expected_offset]})
3283+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
3284+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
3285+                                 {0: [expected_offset]})
3286+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
3287+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
3288+                                 {0: [expected_offset]})
3289+            expected_offset = struct.pack(">Q", expected_signature_offset)
3290+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
3291+                                 {0: [expected_offset]})
3292+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
3293+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
3294+                                 {0: [expected_offset]})
3295+            expected_offset = struct.pack(">Q", expected_eof_offset)
3296+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
3297+                                 {0: [expected_offset]})
3298+        d.addCallback(_check_publish)
3299+        return d
3300+
3301+    def _make_new_mw(self, si, share, datalength=36):
3302+        # This is a file of size 36 bytes. Since it has a segment
3303+        # size of 6, we know that it has six 6-byte segments, which
3304+        # will be split into blocks of 2 bytes because our FEC k
3305+        # parameter is 3.
3306+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
3307+                                6, datalength)
3308+        return mw
3309+
3310+
3311+    def test_write_rejected_with_too_many_blocks(self):
3312+        mw = self._make_new_mw("si0", 0)
3313+
3314+        # Try writing too many blocks. We should not be able to write
3315+        # more than 6 blocks into each share.
3317+        d = defer.succeed(None)
3318+        for i in xrange(6):
3319+            d.addCallback(lambda ignored, i=i:
3320+                mw.put_block(self.block, i, self.salt))
3321+        d.addCallback(lambda ignored:
3322+            self.shouldFail(LayoutInvalid, "too many blocks",
3323+                            None,
3324+                            mw.put_block, self.block, 7, self.salt))
3325+        return d
3326+
3327+
3328+    def test_write_rejected_with_invalid_salt(self):
3329+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
3330+        # less should cause an error.
3331+        mw = self._make_new_mw("si1", 0)
3332+        bad_salt = "a" * 17 # 17 bytes
3333+        d = defer.succeed(None)
3334+        d.addCallback(lambda ignored:
3335+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
3336+                            None, mw.put_block, self.block, 0, bad_salt))
3337+        return d
3338+
3339+
3340+    def test_write_rejected_with_invalid_root_hash(self):
3341+        # Try writing an invalid root hash. This should be SHA256d, and
3342+        # 32 bytes long as a result.
3343+        mw = self._make_new_mw("si2", 0)
3344+        # 17 bytes != 32 bytes
3345+        invalid_root_hash = "a" * 17
3346+        d = defer.succeed(None)
3347+        # Before this test can work, we need to put some blocks + salts,
3348+        # a block hash tree, and a share hash tree. Otherwise, we'll see
3349+        # failures that match what we are looking for, but are caused by
3350+        # the constraints imposed on operation ordering.
3351+        for i in xrange(6):
3352+            d.addCallback(lambda ignored, i=i:
3353+                mw.put_block(self.block, i, self.salt))
3354+        d.addCallback(lambda ignored:
3355+            mw.put_encprivkey(self.encprivkey))
3356+        d.addCallback(lambda ignored:
3357+            mw.put_blockhashes(self.block_hash_tree))
3358+        d.addCallback(lambda ignored:
3359+            mw.put_sharehashes(self.share_hash_chain))
3360+        d.addCallback(lambda ignored:
3361+            self.shouldFail(LayoutInvalid, "invalid root hash",
3362+                            None, mw.put_root_hash, invalid_root_hash))
3363+        return d
3364+
3365+
3366+    def test_write_rejected_with_invalid_blocksize(self):
3367+        # The blocksize implied by the writer that we get from
3368+        # _make_new_mw is 2 bytes -- any more or any less than this
3369+        # should be cause for failure, unless it is the tail segment,
3370+        # in which case it may not be a failure.
3371+        invalid_block = "a"
3372+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
3373+                                             # one byte blocks
3374+        # 1 byte != 2 bytes
3375+        d = defer.succeed(None)
3376+        d.addCallback(lambda ignored, invalid_block=invalid_block:
3377+            self.shouldFail(LayoutInvalid, "test blocksize too small",
3378+                            None, mw.put_block, invalid_block, 0,
3379+                            self.salt))
3380+        invalid_block = invalid_block * 3
3381+        # 3 bytes != 2 bytes
3382+        d.addCallback(lambda ignored:
3383+            self.shouldFail(LayoutInvalid, "test blocksize too large",
3384+                            None,
3385+                            mw.put_block, invalid_block, 0, self.salt))
3386+        for i in xrange(5):
3387+            d.addCallback(lambda ignored, i=i:
3388+                mw.put_block(self.block, i, self.salt))
3389+        # Try to put an invalid tail segment
3390+        d.addCallback(lambda ignored:
3391+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
3392+                            None,
3393+                            mw.put_block, self.block, 5, self.salt))
3394+        valid_block = "a"
3395+        d.addCallback(lambda ignored:
3396+            mw.put_block(valid_block, 5, self.salt))
3397+        return d
3398+
3399+
3400+    def test_write_enforces_order_constraints(self):
3401+        # We require that the MDMFSlotWriteProxy be interacted with in a
3402+        # specific way.
3403+        # That way is:
3404+        # 0: __init__
3405+        # 1: write blocks and salts
3406+        # 2: Write the encrypted private key
3407+        # 3: Write the block hashes
3408+        # 4: Write the share hashes
3409+        # 5: Write the root hash
3410+        # 6: Write the signature and verification key
3411+        # 7: Write the file.
3412+        #
3413+        # Some of these can be performed out-of-order, and some can't.
3414+        # The dependencies that I want to test here are:
3415+        #  - Private key before block hashes
3416+        #  - share hashes and block hashes before root hash
3417+        #  - root hash before signature
3418+        #  - signature before verification key
3419+        mw0 = self._make_new_mw("si0", 0)
3420+        # Write some shares
3421+        d = defer.succeed(None)
3422+        for i in xrange(6):
3423+            d.addCallback(lambda ignored, i=i:
3424+                mw0.put_block(self.block, i, self.salt))
3425+        # Try to write the block hashes before writing the encrypted
3426+        # private key
3427+        d.addCallback(lambda ignored:
3428+            self.shouldFail(LayoutInvalid, "block hashes before key",
3429+                            None, mw0.put_blockhashes,
3430+                            self.block_hash_tree))
3431+
3432+        # Write the private key.
3433+        d.addCallback(lambda ignored:
3434+            mw0.put_encprivkey(self.encprivkey))
3435+
3436+
3437+        # Try to write the share hash chain without writing the block
3438+        # hash tree
3439+        d.addCallback(lambda ignored:
3440+            self.shouldFail(LayoutInvalid, "share hash chain before "
3441+                                           "block hash tree",
3442+                            None,
3443+                            mw0.put_sharehashes, self.share_hash_chain))
3444+
3445+        # Try to write the root hash without writing either the
3446+        # block hashes or the share hashes
3447+        d.addCallback(lambda ignored:
3448+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
3449+                            None,
3450+                            mw0.put_root_hash, self.root_hash))
3451+
3452+        # Now write the block hashes and try again
3453+        d.addCallback(lambda ignored:
3454+            mw0.put_blockhashes(self.block_hash_tree))
3455+
3456+        d.addCallback(lambda ignored:
3457+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
3458+                            None, mw0.put_root_hash, self.root_hash))
3459+
3460+        # We haven't yet put the root hash on the share, so we shouldn't
3461+        # be able to sign it.
3462+        d.addCallback(lambda ignored:
3463+            self.shouldFail(LayoutInvalid, "signature before root hash",
3464+                            None, mw0.put_signature, self.signature))
3465+
3466+        d.addCallback(lambda ignored:
3467+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
3468+
3469+        # ..and, since that fails, we also shouldn't be able to put the
3470+        # verification key.
3471+        d.addCallback(lambda ignored:
3472+            self.shouldFail(LayoutInvalid, "key before signature",
3473+                            None, mw0.put_verification_key,
3474+                            self.verification_key))
3475+
3476+        # Now write the share hashes.
3477+        d.addCallback(lambda ignored:
3478+            mw0.put_sharehashes(self.share_hash_chain))
3479+        # We should be able to write the root hash now too
3480+        d.addCallback(lambda ignored:
3481+            mw0.put_root_hash(self.root_hash))
3482+
3483+        # We should still be unable to put the verification key
3484+        d.addCallback(lambda ignored:
3485+            self.shouldFail(LayoutInvalid, "key before signature",
3486+                            None, mw0.put_verification_key,
3487+                            self.verification_key))
3488+
3489+        d.addCallback(lambda ignored:
3490+            mw0.put_signature(self.signature))
3491+
3492+        # We shouldn't be able to write the offsets to the remote server
3493+        # until the offset table is finished; IOW, until we have written
3494+        # the verification key.
3495+        d.addCallback(lambda ignored:
3496+            self.shouldFail(LayoutInvalid, "offsets before verification key",
3497+                            None,
3498+                            mw0.finish_publishing))
3499+
3500+        d.addCallback(lambda ignored:
3501+            mw0.put_verification_key(self.verification_key))
3502+        return d
3503+
3504+
3505+    def test_end_to_end(self):
3506+        mw = self._make_new_mw("si1", 0)
3507+        # Write a share using the mutable writer, and make sure that the
3508+        # reader knows how to read everything back to us.
3509+        d = defer.succeed(None)
3510+        for i in xrange(6):
3511+            d.addCallback(lambda ignored, i=i:
3512+                mw.put_block(self.block, i, self.salt))
3513+        d.addCallback(lambda ignored:
3514+            mw.put_encprivkey(self.encprivkey))
3515+        d.addCallback(lambda ignored:
3516+            mw.put_blockhashes(self.block_hash_tree))
3517+        d.addCallback(lambda ignored:
3518+            mw.put_sharehashes(self.share_hash_chain))
3519+        d.addCallback(lambda ignored:
3520+            mw.put_root_hash(self.root_hash))
3521+        d.addCallback(lambda ignored:
3522+            mw.put_signature(self.signature))
3523+        d.addCallback(lambda ignored:
3524+            mw.put_verification_key(self.verification_key))
3525+        d.addCallback(lambda ignored:
3526+            mw.finish_publishing())
3527+
3528+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3529+        def _check_block_and_salt((block, salt)):
3530+            self.failUnlessEqual(block, self.block)
3531+            self.failUnlessEqual(salt, self.salt)
3532+
3533+        for i in xrange(6):
3534+            d.addCallback(lambda ignored, i=i:
3535+                mr.get_block_and_salt(i))
3536+            d.addCallback(_check_block_and_salt)
3537+
3538+        d.addCallback(lambda ignored:
3539+            mr.get_encprivkey())
3540+        d.addCallback(lambda encprivkey:
3541+            self.failUnlessEqual(self.encprivkey, encprivkey))
3542+
3543+        d.addCallback(lambda ignored:
3544+            mr.get_blockhashes())
3545+        d.addCallback(lambda blockhashes:
3546+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
3547+
3548+        d.addCallback(lambda ignored:
3549+            mr.get_sharehashes())
3550+        d.addCallback(lambda sharehashes:
3551+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
3552+
3553+        d.addCallback(lambda ignored:
3554+            mr.get_signature())
3555+        d.addCallback(lambda signature:
3556+            self.failUnlessEqual(signature, self.signature))
3557+
3558+        d.addCallback(lambda ignored:
3559+            mr.get_verification_key())
3560+        d.addCallback(lambda verification_key:
3561+            self.failUnlessEqual(verification_key, self.verification_key))
3562+
3563+        d.addCallback(lambda ignored:
3564+            mr.get_seqnum())
3565+        d.addCallback(lambda seqnum:
3566+            self.failUnlessEqual(seqnum, 0))
3567+
3568+        d.addCallback(lambda ignored:
3569+            mr.get_root_hash())
3570+        d.addCallback(lambda root_hash:
3571+            self.failUnlessEqual(self.root_hash, root_hash))
3572+
3573+        d.addCallback(lambda ignored:
3574+            mr.get_encoding_parameters())
3575+        def _check_encoding_parameters((k, n, segsize, datalen)):
3576+            self.failUnlessEqual(k, 3)
3577+            self.failUnlessEqual(n, 10)
3578+            self.failUnlessEqual(segsize, 6)
3579+            self.failUnlessEqual(datalen, 36)
3580+        d.addCallback(_check_encoding_parameters)
3581+
3582+        d.addCallback(lambda ignored:
3583+            mr.get_checkstring())
3584+        d.addCallback(lambda checkstring:
3585+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
3586+        return d
3587+
3588+
3589+    def test_is_sdmf(self):
3590+        # The MDMFSlotReadProxy should also know how to read SDMF files,
3591+        # since it will encounter them on the grid. Callers use the
3592+        # is_sdmf method to test this.
3593+        self.write_sdmf_share_to_server("si1")
3594+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3595+        d = mr.is_sdmf()
3596+        d.addCallback(lambda issdmf:
3597+            self.failUnless(issdmf))
3598+        return d
3599+
3600+
3601+    def test_reads_sdmf(self):
3602+        # The slot read proxy should, naturally, know how to tell us
3603+        # about data in the SDMF format
3604+        self.write_sdmf_share_to_server("si1")
3605+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3606+        d = defer.succeed(None)
3607+        d.addCallback(lambda ignored:
3608+            mr.is_sdmf())
3609+        d.addCallback(lambda issdmf:
3610+            self.failUnless(issdmf))
3611+
3612+        # What do we need to read?
3613+        #  - The sharedata
3614+        #  - The salt
3615+        d.addCallback(lambda ignored:
3616+            mr.get_block_and_salt(0))
3617+        def _check_block_and_salt(results):
3618+            block, salt = results
3619+            # Our original file is 36 bytes long. With k = 3, each
3620+            # share is 12 bytes and is composed entirely of the letter
3621+            # a. self.block contains 2 a's, so 6 * self.block is what
3622+            # we are looking for.
3623+            self.failUnlessEqual(block, self.block * 6)
3624+            self.failUnlessEqual(salt, self.salt)
3625+        d.addCallback(_check_block_and_salt)
3626+
3627+        #  - The blockhashes
3628+        d.addCallback(lambda ignored:
3629+            mr.get_blockhashes())
3630+        d.addCallback(lambda blockhashes:
3631+            self.failUnlessEqual(self.block_hash_tree,
3632+                                 blockhashes,
3633+                                 blockhashes))
3634+        #  - The sharehashes
3635+        d.addCallback(lambda ignored:
3636+            mr.get_sharehashes())
3637+        d.addCallback(lambda sharehashes:
3638+            self.failUnlessEqual(self.share_hash_chain,
3639+                                 sharehashes))
3640+        #  - The keys
3641+        d.addCallback(lambda ignored:
3642+            mr.get_encprivkey())
3643+        d.addCallback(lambda encprivkey:
3644+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
3645+        d.addCallback(lambda ignored:
3646+            mr.get_verification_key())
3647+        d.addCallback(lambda verification_key:
3648+            self.failUnlessEqual(verification_key,
3649+                                 self.verification_key,
3650+                                 verification_key))
3651+        #  - The signature
3652+        d.addCallback(lambda ignored:
3653+            mr.get_signature())
3654+        d.addCallback(lambda signature:
3655+            self.failUnlessEqual(signature, self.signature, signature))
3656+
3657+        #  - The sequence number
3658+        d.addCallback(lambda ignored:
3659+            mr.get_seqnum())
3660+        d.addCallback(lambda seqnum:
3661+            self.failUnlessEqual(seqnum, 0, seqnum))
3662+
3663+        #  - The root hash
3664+        d.addCallback(lambda ignored:
3665+            mr.get_root_hash())
3666+        d.addCallback(lambda root_hash:
3667+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
3668+        return d
3669+
3670+
3671+    def test_only_reads_one_segment_sdmf(self):
3672+        # SDMF shares have only one segment, so it doesn't make sense to
3673+        # read more segments than that. The reader should know this and
3674+        # complain if we try to do that.
3675+        self.write_sdmf_share_to_server("si1")
3676+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3677+        d = defer.succeed(None)
3678+        d.addCallback(lambda ignored:
3679+            mr.is_sdmf())
3680+        d.addCallback(lambda issdmf:
3681+            self.failUnless(issdmf))
3682+        d.addCallback(lambda ignored:
3683+            self.shouldFail(LayoutInvalid, "test bad segment",
3684+                            None,
3685+                            mr.get_block_and_salt, 1))
3686+        return d
3687+
3688+
3689+    def test_read_with_prefetched_mdmf_data(self):
3690+        # The MDMFSlotReadProxy will prefill certain fields if you pass
3691+        # it data that you have already fetched. This is useful for
3692+        # cases like the Servermap, which prefetches ~2kb of data while
3693+        # finding out which shares are on the remote peer so that it
3694+        # doesn't waste round trips.
3695+        mdmf_data = self.build_test_mdmf_share()
3696+        self.write_test_share_to_server("si1")
3697+        def _make_mr(ignored, length):
3698+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
3699+            return mr
3700+
3701+        d = defer.succeed(None)
3702+        # This should be enough to fill in both the encoding parameters
3703+        # and the table of offsets, which will complete the version
3704+        # information tuple.
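+        # (107 bytes is struct.calcsize(MDMFHEADER): the 59-byte
+        # signable header plus the 48-byte offset table.)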
3705+        d.addCallback(_make_mr, 107)
3706+        d.addCallback(lambda mr:
3707+            mr.get_verinfo())
3708+        def _check_verinfo(verinfo):
3709+            self.failUnless(verinfo)
3710+            self.failUnlessEqual(len(verinfo), 9)
3711+            (seqnum,
3712+             root_hash,
3713+             salt_hash,
3714+             segsize,
3715+             datalen,
3716+             k,
3717+             n,
3718+             prefix,
3719+             offsets) = verinfo
3720+            self.failUnlessEqual(seqnum, 0)
3721+            self.failUnlessEqual(root_hash, self.root_hash)
3722+            self.failUnlessEqual(segsize, 6)
3723+            self.failUnlessEqual(datalen, 36)
3724+            self.failUnlessEqual(k, 3)
3725+            self.failUnlessEqual(n, 10)
3726+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
3727+                                          1,
3728+                                          seqnum,
3729+                                          root_hash,
3730+                                          k,
3731+                                          n,
3732+                                          segsize,
3733+                                          datalen)
3734+            self.failUnlessEqual(expected_prefix, prefix)
3735+            self.failUnlessEqual(self.rref.read_count, 0)
3736+        d.addCallback(_check_verinfo)
3737+        # This is not enough data to read a block and its salt, so the
3738+        # wrapper should read them from the remote server.
3739+        d.addCallback(_make_mr, 107)
3740+        d.addCallback(lambda mr:
3741+            mr.get_block_and_salt(0))
3742+        def _check_block_and_salt((block, salt)):
3743+            self.failUnlessEqual(block, self.block)
3744+            self.failUnlessEqual(salt, self.salt)
3745+            self.failUnlessEqual(self.rref.read_count, 1)
3746+        # This should be enough data to read one block.
3747+        d.addCallback(_make_mr, 249)
3748+        d.addCallback(lambda mr:
3749+            mr.get_block_and_salt(0))
3750+        d.addCallback(_check_block_and_salt)
3751+        return d
3752+
3753+
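From the caller's side, the prefetch path these assertions pin down amounts to handing the proxy whatever bytes have already been fetched. A minimal sketch, assuming only the constructor shape the test itself uses; remote_ref and prefix_bytes are hypothetical stand-ins for a real RemoteReference and the roughly 2KB a servermap update reads while enumerating shares:

    from allmydata.mutable.layout import MDMFSlotReadProxy

    def verinfo_without_roundtrip(remote_ref, storage_index, shnum, prefix_bytes):
        # Handing the already-fetched prefix to the proxy prefills its cache.
        mr = MDMFSlotReadProxy(remote_ref, storage_index, shnum, prefix_bytes)
        # If prefix_bytes covers the header and offset table (107 bytes in
        # the test above), this fires without another remote read.
        return mr.get_verinfo()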
3754+    def test_read_with_prefetched_sdmf_data(self):
3755+        sdmf_data = self.build_test_sdmf_share()
3756+        self.write_sdmf_share_to_server("si1")
3757+        def _make_mr(ignored, length):
3758+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
3759+            return mr
3760+
3761+        d = defer.succeed(None)
3762+        # This should be enough to get us the encoding parameters,
3763+        # offset table, and everything else we need to build a verinfo
3764+        # string.
3765+        d.addCallback(_make_mr, 107)
3766+        d.addCallback(lambda mr:
3767+            mr.get_verinfo())
3768+        def _check_verinfo(verinfo):
3769+            self.failUnless(verinfo)
3770+            self.failUnlessEqual(len(verinfo), 9)
3771+            (seqnum,
3772+             root_hash,
3773+             salt,
3774+             segsize,
3775+             datalen,
3776+             k,
3777+             n,
3778+             prefix,
3779+             offsets) = verinfo
3780+            self.failUnlessEqual(seqnum, 0)
3781+            self.failUnlessEqual(root_hash, self.root_hash)
3782+            self.failUnlessEqual(salt, self.salt)
3783+            self.failUnlessEqual(segsize, 36)
3784+            self.failUnlessEqual(datalen, 36)
3785+            self.failUnlessEqual(k, 3)
3786+            self.failUnlessEqual(n, 10)
3787+            expected_prefix = struct.pack(SIGNED_PREFIX,
3788+                                          0,
3789+                                          seqnum,
3790+                                          root_hash,
3791+                                          salt,
3792+                                          k,
3793+                                          n,
3794+                                          segsize,
3795+                                          datalen)
3796+            self.failUnlessEqual(expected_prefix, prefix)
3797+            self.failUnlessEqual(self.rref.read_count, 0)
3798+        d.addCallback(_check_verinfo)
3799+        # This shouldn't be enough to read any share data.
3800+        d.addCallback(_make_mr, 107)
3801+        d.addCallback(lambda mr:
3802+            mr.get_block_and_salt(0))
3803+        def _check_block_and_salt((block, salt)):
3804+            self.failUnlessEqual(block, self.block * 6)
3805+            self.failUnlessEqual(salt, self.salt)
3806+            # TODO: Fix the read routine so that it reads only the data
3807+            #       that it has cached if it can't read all of it.
3808+            self.failUnlessEqual(self.rref.read_count, 2)
3809+
3810+        # This should be enough to read share data.
3811+        d.addCallback(_make_mr, self.offsets['share_data'])
3812+        d.addCallback(lambda mr:
3813+            mr.get_block_and_salt(0))
3814+        d.addCallback(_check_block_and_salt)
3815+        return d
3816+
3817+
3818+    def test_read_with_empty_mdmf_file(self):
3819+        # Some tests upload a file with no contents to test things
3820+        # unrelated to the actual handling of the content of the file.
3821+        # The reader should behave intelligently in these cases.
3822+        self.write_test_share_to_server("si1", empty=True)
3823+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3824+        # We should be able to get the encoding parameters, and they
3825+        # should be correct.
3826+        d = defer.succeed(None)
3827+        d.addCallback(lambda ignored:
3828+            mr.get_encoding_parameters())
3829+        def _check_encoding_parameters(params):
3830+            self.failUnlessEqual(len(params), 4)
3831+            k, n, segsize, datalen = params
3832+            self.failUnlessEqual(k, 3)
3833+            self.failUnlessEqual(n, 10)
3834+            self.failUnlessEqual(segsize, 0)
3835+            self.failUnlessEqual(datalen, 0)
3836+        d.addCallback(_check_encoding_parameters)
3837+
3838+        # We should not be able to fetch a block, since there are no
3839+        # blocks to fetch
3840+        d.addCallback(lambda ignored:
3841+            self.shouldFail(LayoutInvalid, "get block on empty file",
3842+                            None,
3843+                            mr.get_block_and_salt, 0))
3844+        return d
3845+
3846+
3847+    def test_read_with_empty_sdmf_file(self):
3848+        self.write_sdmf_share_to_server("si1", empty=True)
3849+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3850+        # We should be able to get the encoding parameters, and they
3851+        # should be correct
3852+        d = defer.succeed(None)
3853+        d.addCallback(lambda ignored:
3854+            mr.get_encoding_parameters())
3855+        def _check_encoding_parameters(params):
3856+            self.failUnlessEqual(len(params), 4)
3857+            k, n, segsize, datalen = params
3858+            self.failUnlessEqual(k, 3)
3859+            self.failUnlessEqual(n, 10)
3860+            self.failUnlessEqual(segsize, 0)
3861+            self.failUnlessEqual(datalen, 0)
3862+        d.addCallback(_check_encoding_parameters)
3863+
3864+        # It does not make sense to get a block in this format, so we
3865+        # should not be able to.
3866+        d.addCallback(lambda ignored:
3867+            self.shouldFail(LayoutInvalid, "get block on an empty file",
3868+                            None,
3869+                            mr.get_block_and_salt, 0))
3870+        return d
3871+
3872+
3873+    def test_verinfo_with_sdmf_file(self):
3874+        self.write_sdmf_share_to_server("si1")
3875+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3876+        # We should be able to get the version information.
3877+        d = defer.succeed(None)
3878+        d.addCallback(lambda ignored:
3879+            mr.get_verinfo())
3880+        def _check_verinfo(verinfo):
3881+            self.failUnless(verinfo)
3882+            self.failUnlessEqual(len(verinfo), 9)
3883+            (seqnum,
3884+             root_hash,
3885+             salt,
3886+             segsize,
3887+             datalen,
3888+             k,
3889+             n,
3890+             prefix,
3891+             offsets) = verinfo
3892+            self.failUnlessEqual(seqnum, 0)
3893+            self.failUnlessEqual(root_hash, self.root_hash)
3894+            self.failUnlessEqual(salt, self.salt)
3895+            self.failUnlessEqual(segsize, 36)
3896+            self.failUnlessEqual(datalen, 36)
3897+            self.failUnlessEqual(k, 3)
3898+            self.failUnlessEqual(n, 10)
3899+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
3900+                                          0,
3901+                                          seqnum,
3902+                                          root_hash,
3903+                                          salt,
3904+                                          k,
3905+                                          n,
3906+                                          segsize,
3907+                                          datalen)
3908+            self.failUnlessEqual(prefix, expected_prefix)
3909+            self.failUnlessEqual(offsets, self.offsets)
3910+        d.addCallback(_check_verinfo)
3911+        return d
3912+
3913+
3914+    def test_verinfo_with_mdmf_file(self):
3915+        self.write_test_share_to_server("si1")
3916+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3917+        d = defer.succeed(None)
3918+        d.addCallback(lambda ignored:
3919+            mr.get_verinfo())
3920+        def _check_verinfo(verinfo):
3921+            self.failUnless(verinfo)
3922+            self.failUnlessEqual(len(verinfo), 9)
3923+            (seqnum,
3924+             root_hash,
3925+             IV,
3926+             segsize,
3927+             datalen,
3928+             k,
3929+             n,
3930+             prefix,
3931+             offsets) = verinfo
3932+            self.failUnlessEqual(seqnum, 0)
3933+            self.failUnlessEqual(root_hash, self.root_hash)
3934+            self.failIf(IV)
3935+            self.failUnlessEqual(segsize, 6)
3936+            self.failUnlessEqual(datalen, 36)
3937+            self.failUnlessEqual(k, 3)
3938+            self.failUnlessEqual(n, 10)
3939+            expected_prefix = struct.pack(">BQ32s BBQQ",
3940+                                          1,
3941+                                          seqnum,
3942+                                          root_hash,
3943+                                          k,
3944+                                          n,
3945+                                          segsize,
3946+                                          datalen)
3947+            self.failUnlessEqual(prefix, expected_prefix)
3948+            self.failUnlessEqual(offsets, self.offsets)
3949+        d.addCallback(_check_verinfo)
3950+        return d
3951+
3952+
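The two struct formats exercised by these tests capture the on-disk difference between the prefixes: SDMF's signed prefix has version byte 0 and carries a single 16-byte salt, while MDMF's has version byte 1 and no salt field, because MDMF keeps a salt next to each block (the (block, salt) pairs fetched above). A quick self-contained check of that 16-byte difference:

    import struct

    SDMF_PREFIX = ">BQ32s16s BBQQ"  # version, seqnum, root hash, salt, k, N, segsize, datalen
    MDMF_PREFIX = ">BQ32s BBQQ"     # same fields, minus the 16-byte salt

    assert struct.calcsize(SDMF_PREFIX) - struct.calcsize(MDMF_PREFIX) == 16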
3953+    def test_reader_queue(self):
3954+        self.write_test_share_to_server('si1')
3955+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
3956+        d1 = mr.get_block_and_salt(0, queue=True)
3957+        d2 = mr.get_blockhashes(queue=True)
3958+        d3 = mr.get_sharehashes(queue=True)
3959+        d4 = mr.get_signature(queue=True)
3960+        d5 = mr.get_verification_key(queue=True)
3961+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
3962+        mr.flush()
3963+        def _print(results):
3964+            self.failUnlessEqual(len(results), 5)
3965+            # We have one read for version information and offsets, and
3966+            # one for everything else.
3967+            self.failUnlessEqual(self.rref.read_count, 2)
3968+            block, salt = results[0][1] # each result is a (success, value)
3969+                                           # pair from the DeferredList;
3970+                                           # [1] is the value.
3971+            self.failUnlessEqual(self.block, block)
3972+            self.failUnlessEqual(self.salt, salt)
3973+
3974+            blockhashes = results[1][1]
3975+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
3976+
3977+            sharehashes = results[2][1]
3978+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
3979+
3980+            signature = results[3][1]
3981+            self.failUnlessEqual(self.signature, signature)
3982+
3983+            verification_key = results[4][1]
3984+            self.failUnlessEqual(self.verification_key, verification_key)
3985+        dl.addCallback(_print)
3986+        return dl
3987+
3988+
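The batching contract asserted here, five logical fetches for two remote reads, is what the queue=True flag exists for. A sketch of the caller-side pattern, using only the methods the test exercises:

    from twisted.internet import defer
    from allmydata.mutable.layout import MDMFSlotReadProxy

    def fetch_share_pieces(remote_ref, storage_index, shnum):
        mr = MDMFSlotReadProxy(remote_ref, storage_index, shnum)
        # queue=True records each request without touching the wire...
        ds = [mr.get_block_and_salt(0, queue=True),
              mr.get_blockhashes(queue=True),
              mr.get_sharehashes(queue=True),
              mr.get_signature(queue=True),
              mr.get_verification_key(queue=True)]
        dl = defer.DeferredList(ds)
        # ...and flush() coalesces them; the test above shows this costs
        # one read for verinfo/offsets plus one for everything else.
        mr.flush()
        return dl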
3989+    def test_sdmf_writer(self):
3990+        # Go through the motions of writing an SDMF share to the storage
3991+        # server. Then read the storage server to see that the share got
3992+        # written in the way that we think it should have.
3993+
3994+        # We do this first so that the necessary instance variables get
3995+        # set the way we want them for the tests below.
3996+        data = self.build_test_sdmf_share()
3997+        sdmfr = SDMFSlotWriteProxy(0,
3998+                                   self.rref,
3999+                                   "si1",
4000+                                   self.secrets,
4001+                                   0, 3, 10, 36, 36)
4002+        # Put the block and salt.
4003+        sdmfr.put_block(self.blockdata, 0, self.salt)
4004+
4005+        # Put the encprivkey
4006+        sdmfr.put_encprivkey(self.encprivkey)
4007+
4008+        # Put the block and share hash chains
4009+        sdmfr.put_blockhashes(self.block_hash_tree)
4010+        sdmfr.put_sharehashes(self.share_hash_chain)
4011+        sdmfr.put_root_hash(self.root_hash)
4012+
4013+        # Put the signature
4014+        sdmfr.put_signature(self.signature)
4015+
4016+        # Put the verification key
4017+        sdmfr.put_verification_key(self.verification_key)
4018+
4019+        # Now check to make sure that nothing has been written yet.
4020+        self.failUnlessEqual(self.rref.write_count, 0)
4021+
4022+        # Now finish publishing
4023+        d = sdmfr.finish_publishing()
4024+        def _then(ignored):
4025+            self.failUnlessEqual(self.rref.write_count, 1)
4026+            read = self.ss.remote_slot_readv
4027+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
4028+                                 {0: [data]})
4029+        d.addCallback(_then)
4030+        return d
4031+
4032+
4033+    def test_sdmf_writer_preexisting_share(self):
4034+        data = self.build_test_sdmf_share()
4035+        self.write_sdmf_share_to_server("si1")
4036+
4037+        # Now there is a share on the storage server. To successfully
4038+        # write, we need to set the checkstring correctly. When we
4039+        # don't, no write should occur.
4040+        sdmfw = SDMFSlotWriteProxy(0,
4041+                                   self.rref,
4042+                                   "si1",
4043+                                   self.secrets,
4044+                                   1, 3, 10, 36, 36)
4045+        sdmfw.put_block(self.blockdata, 0, self.salt)
4046+
4047+        # Put the encprivkey
4048+        sdmfw.put_encprivkey(self.encprivkey)
4049+
4050+        # Put the block and share hash chains
4051+        sdmfw.put_blockhashes(self.block_hash_tree)
4052+        sdmfw.put_sharehashes(self.share_hash_chain)
4053+
4054+        # Put the root hash
4055+        sdmfw.put_root_hash(self.root_hash)
4056+
4057+        # Put the signature
4058+        sdmfw.put_signature(self.signature)
4059+
4060+        # Put the verification key
4061+        sdmfw.put_verification_key(self.verification_key)
4062+
4063+        # We shouldn't have a checkstring yet
4064+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
4065+
4066+        d = sdmfw.finish_publishing()
4067+        def _then(results):
4068+            self.failIf(results[0])
4069+            # this is the correct checkstring
4070+            self._expected_checkstring = results[1][0][0]
4071+            return self._expected_checkstring
4072+
4073+        d.addCallback(_then)
4074+        d.addCallback(sdmfw.set_checkstring)
4075+        d.addCallback(lambda ignored:
4076+            sdmfw.get_checkstring())
4077+        d.addCallback(lambda checkstring:
4078+            self.failUnlessEqual(checkstring, self._expected_checkstring))
4079+        d.addCallback(lambda ignored:
4080+            sdmfw.finish_publishing())
4081+        def _then_again(results):
4082+            self.failUnless(results[0])
4083+            read = self.ss.remote_slot_readv
4084+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
4085+                                 {0: [struct.pack(">Q", 1)]})
4086+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
4087+                                 {0: [data[9:]]})
4088+        d.addCallback(_then_again)
4089+        return d
4090+
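The sequence above, a failed first write, adopting the server's reported checkstring, then a successful retry, is the SDMF test-and-set protocol from the writer's side. A minimal sketch of that recovery loop, assuming only the finish_publishing() and set_checkstring() behavior the test demonstrates:

    def publish_with_checkstring_retry(sdmfw):
        d = sdmfw.finish_publishing()
        def _maybe_retry(results):
            if results[0]:
                return results  # the test vector matched; the write landed
            # results[1] reports the share's current state; adopt its
            # checkstring so the next test-and-set matches, then retry.
            sdmfw.set_checkstring(results[1][0][0])
            return sdmfw.finish_publishing()
        d.addCallback(_maybe_retry)
        return d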
4091+
4092 class Stats(unittest.TestCase):
4093 
4094     def setUp(self):
4095}
4096[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
4097Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
4098 Ignore-this: 93e536c0f8efb705310f13ff64621527
4099] {
4100hunk ./src/allmydata/immutable/filenode.py 8
4101 now = time.time
4102 from zope.interface import implements, Interface
4103 from twisted.internet import defer
4104-from twisted.internet.interfaces import IConsumer
4105 
4106hunk ./src/allmydata/immutable/filenode.py 9
4107-from allmydata.interfaces import IImmutableFileNode, IUploadResults
4108 from allmydata import uri
4109hunk ./src/allmydata/immutable/filenode.py 10
4110+from twisted.internet.interfaces import IConsumer
4111+from twisted.protocols import basic
4112+from foolscap.api import eventually
4113+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
4114+     IDownloadTarget, IUploadResults
4115+from allmydata.util import dictutil, log, base32, consumer
4116+from allmydata.immutable.checker import Checker
4117 from allmydata.check_results import CheckResults, CheckAndRepairResults
4118 from allmydata.util.dictutil import DictOfSets
4119 from pycryptopp.cipher.aes import AES
4120hunk ./src/allmydata/immutable/filenode.py 296
4121         return self._cnode.check_and_repair(monitor, verify, add_lease)
4122     def check(self, monitor, verify=False, add_lease=False):
4123         return self._cnode.check(monitor, verify, add_lease)
4124+
4125+    def get_best_readable_version(self):
4126+        """
4127+        Return an IReadable of the best version of this file. Since
4128+        immutable files can have only one version, we just return the
4129+        current filenode.
4130+        """
4131+        return defer.succeed(self)
4132+
4133+
4134+    def download_best_version(self):
4135+        """
4136+        Download the best version of this file, returning its contents
4137+        as a bytestring. Since there is only one version of an immutable
4138+        file, we download and return the contents of this file.
4139+        """
4140+        d = consumer.download_to_data(self)
4141+        return d
4142+
4143+    # for an immutable file, download_to_data (specified in IReadable)
4144+    # is the same as download_best_version (specified in IFileNode). For
4145+    # mutable files, the difference is more meaningful, since they can
4146+    # have multiple versions.
4147+    download_to_data = download_best_version
4148+
4149+
4150+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
4151+    # get_size_of_best_version(IFileNode) are all the same for immutable
4152+    # files.
4153+    get_size_of_best_version = get_current_size
4154}
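The point of adding these methods to immutable nodes is interface uniformity: reading code no longer needs to branch on mutability. A sketch of the calling convention this enables, where node may be an immutable, literal, or mutable filenode:

    def read_whole_file(node):
        # Immutable and literal nodes return themselves as their own
        # "best version"; mutable nodes return a MutableFileVersion.
        d = node.get_best_readable_version()
        d.addCallback(lambda version: version.download_to_data())
        return d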
4155[immutable/literal.py: implement the same interfaces as other filenodes
4156Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
4157 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
4158] hunk ./src/allmydata/immutable/literal.py 106
4159         d.addCallback(lambda lastSent: consumer)
4160         return d
4161 
4162+    # IReadable, IFileNode, IFilesystemNode
4163+    def get_best_readable_version(self):
4164+        return defer.succeed(self)
4165+
4166+
4167+    def download_best_version(self):
4168+        return defer.succeed(self.u.data)
4169+
4170+
4171+    download_to_data = download_best_version
4172+    get_size_of_best_version = get_current_size
4173+
4174[mutable/filenode.py: add versions and partial-file updates to the mutable file node
4175Kevan Carstensen <kevan@isnotajoke.com>**20100811233049
4176 Ignore-this: edf9f6d5d2833909568757ba2dbeedff
4177 
4178 One of the goals of MDMF as a GSoC project is to lay the groundwork for
4179 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
4180 multiple versions of a single cap on the grid. In line with this, there
4181 is a now a distinction between an overriding mutable file (which can be
4182 thought to correspond to the cap/unique identifier for that mutable
4183 file) and versions of the mutable file (which we can download, update,
4184 and so on). All download, upload, and modification operations end up
4185 happening on a particular version of a mutable file, but there are
4186 shortcut methods on the object representing the overriding mutable file
4187 that perform these operations on the best version of the mutable file
4188 (which is what code should be doing until we have LDMF and better
4189 support for other paradigms).
4190 
4191 Another goal of MDMF was to take advantage of segmentation to give
4192 callers more efficient partial file updates or appends. This patch
4193 implements methods that do that, too.
4194 
4195] {
4196hunk ./src/allmydata/mutable/filenode.py 7
4197 from zope.interface import implements
4198 from twisted.internet import defer, reactor
4199 from foolscap.api import eventually
4200-from allmydata.interfaces import IMutableFileNode, \
4201-     ICheckable, ICheckResults, NotEnoughSharesError
4202-from allmydata.util import hashutil, log
4203+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
4204+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
4205+     IMutableFileVersion, IWritable
4206+from allmydata import hashtree
4207+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
4208 from allmydata.util.assertutil import precondition
4209 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
4210 from allmydata.monitor import Monitor
4211hunk ./src/allmydata/mutable/filenode.py 17
4212 from pycryptopp.cipher.aes import AES
4213 
4214-from allmydata.mutable.publish import Publish
4215+from allmydata.mutable.publish import Publish, MutableFileHandle, \
4216+                                      MutableData,\
4217+                                      DEFAULT_MAX_SEGMENT_SIZE, \
4218+                                      TransformingUploadable
4219 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
4220      ResponseCache, UncoordinatedWriteError
4221 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
4222hunk ./src/allmydata/mutable/filenode.py 72
4223         self._sharemap = {} # known shares, shnum-to-[nodeids]
4224         self._cache = ResponseCache()
4225         self._most_recent_size = None
4226+        # filled in after __init__ if we're being created for the first time;
4227+        # filled in by the servermap updater before publishing, otherwise.
4228+        # set to this default value in case neither of those things happen,
4229+        # or in case the servermap can't find any shares to tell us what
4230+        # to publish as.
4231+        # TODO: Set this back to None, and find out why the tests fail
4232+        #       with it set to None.
4233+        self._protocol_version = None
4234 
4235         # all users of this MutableFileNode go through the serializer. This
4236         # takes advantage of the fact that Deferreds discard the callbacks
4237hunk ./src/allmydata/mutable/filenode.py 136
4238         return self._upload(initial_contents, None)
4239 
4240     def _get_initial_contents(self, contents):
4241-        if isinstance(contents, str):
4242-            return contents
4243         if contents is None:
4244hunk ./src/allmydata/mutable/filenode.py 137
4245-            return ""
4246+            return MutableData("")
4247+
4248+        if IMutableUploadable.providedBy(contents):
4249+            return contents
4250+
4251         assert callable(contents), "%s should be callable, not %s" % \
4252                (contents, type(contents))
4253         return contents(self)
4254hunk ./src/allmydata/mutable/filenode.py 211
4255 
4256     def get_size(self):
4257         return self._most_recent_size
4258+
4259     def get_current_size(self):
4260         d = self.get_size_of_best_version()
4261         d.addCallback(self._stash_size)
4262hunk ./src/allmydata/mutable/filenode.py 216
4263         return d
4264+
4265     def _stash_size(self, size):
4266         self._most_recent_size = size
4267         return size
4268hunk ./src/allmydata/mutable/filenode.py 275
4269             return cmp(self.__class__, them.__class__)
4270         return cmp(self._uri, them._uri)
4271 
4272-    def _do_serialized(self, cb, *args, **kwargs):
4273-        # note: to avoid deadlock, this callable is *not* allowed to invoke
4274-        # other serialized methods within this (or any other)
4275-        # MutableFileNode. The callable should be a bound method of this same
4276-        # MFN instance.
4277-        d = defer.Deferred()
4278-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
4279-        # we need to put off d.callback until this Deferred is finished being
4280-        # processed. Otherwise the caller's subsequent activities (like,
4281-        # doing other things with this node) can cause reentrancy problems in
4282-        # the Deferred code itself
4283-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
4284-        # add a log.err just in case something really weird happens, because
4285-        # self._serializer stays around forever, therefore we won't see the
4286-        # usual Unhandled Error in Deferred that would give us a hint.
4287-        self._serializer.addErrback(log.err)
4288-        return d
4289 
4290     #################################
4291     # ICheckable
4292hunk ./src/allmydata/mutable/filenode.py 300
4293 
4294 
4295     #################################
4296-    # IMutableFileNode
4297+    # IFileNode
4298+
4299+    def get_best_readable_version(self):
4300+        """
4301+        I return a Deferred that fires with a MutableFileVersion
4302+        representing the best readable version of the file that I
4303+        represent.
4304+        """
4305+        return self.get_readable_version()
4306+
4307+
4308+    def get_readable_version(self, servermap=None, version=None):
4309+        """
4310+        I return a Deferred that fires with a MutableFileVersion for my
4311+        version argument, if there is a recoverable file of that version
4312+        on the grid. If there is no recoverable version, I fire with an
4313+        UnrecoverableFileError.
4314+
4315+        If a servermap is provided, I look in there for the requested
4316+        version. If no servermap is provided, I create and update a new
4317+        one.
4318+
4319+        If no version is provided, then I return a MutableFileVersion
4320+        representing the best recoverable version of the file.
4321+        """
4322+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
4323+        def _build_version((servermap, their_version)):
4324+            assert their_version in servermap.recoverable_versions()
4325+            assert their_version in servermap.make_versionmap()
4326+
4327+            mfv = MutableFileVersion(self,
4328+                                     servermap,
4329+                                     their_version,
4330+                                     self._storage_index,
4331+                                     self._storage_broker,
4332+                                     self._readkey,
4333+                                     history=self._history)
4334+            assert mfv.is_readonly()
4335+            # our caller can use this to download the contents of the
4336+            # mutable file.
4337+            return mfv
4338+        return d.addCallback(_build_version)
4339+
4340+
4341+    def _get_version_from_servermap(self,
4342+                                    mode,
4343+                                    servermap=None,
4344+                                    version=None):
4345+        """
4346+        I return a Deferred that fires with (servermap, version).
4347+
4348+        This function performs validation and a servermap update. If it
4349+        returns (servermap, version), the caller can assume that:
4350+            - servermap was last updated in mode.
4351+            - version is recoverable, and corresponds to the servermap.
4352+
4353+        If version and servermap are provided to me, I will validate
4354+        that version exists in the servermap, and that the servermap was
4355+        updated correctly.
4356+
4357+        If version is not provided, but servermap is, I will validate
4358+        the servermap and return the best recoverable version that I can
4359+        find in the servermap.
4360+
4361+        If the version is provided but the servermap isn't, I will
4362+        obtain a servermap that has been updated in the correct mode and
4363+        validate that version is found and recoverable.
4364+
4365+        If neither servermap nor version are provided, I will obtain a
4366+        servermap updated in the correct mode, and return the best
4367+        recoverable version that I can find in there.
4368+        """
4369+        # XXX: wording ^^^^
4370+        if servermap and servermap.last_update_mode == mode:
4371+            d = defer.succeed(servermap)
4372+        else:
4373+            d = self._get_servermap(mode)
4374+
4375+        def _get_version(servermap, v):
4376+            if v and v not in servermap.recoverable_versions():
4377+                v = None
4378+            elif not v:
4379+                v = servermap.best_recoverable_version()
4380+            if not v:
4381+                raise UnrecoverableFileError("no recoverable versions")
4382+
4383+            return (servermap, v)
4384+        return d.addCallback(_get_version, version)
4385+
4386 
4387     def download_best_version(self):
4388hunk ./src/allmydata/mutable/filenode.py 391
4389+        """
4390+        I return a Deferred that fires with the contents of the best
4391+        version of this mutable file.
4392+        """
4393         return self._do_serialized(self._download_best_version)
4394hunk ./src/allmydata/mutable/filenode.py 396
4395+
4396+
4397     def _download_best_version(self):
4398hunk ./src/allmydata/mutable/filenode.py 399
4399-        servermap = ServerMap()
4400-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
4401-        def _maybe_retry(f):
4402-            f.trap(NotEnoughSharesError)
4403-            # the download is worth retrying once. Make sure to use the
4404-            # old servermap, since it is what remembers the bad shares,
4405-            # but use MODE_WRITE to make it look for even more shares.
4406-            # TODO: consider allowing this to retry multiple times.. this
4407-            # approach will let us tolerate about 8 bad shares, I think.
4408-            return self._try_once_to_download_best_version(servermap,
4409-                                                           MODE_WRITE)
4410+        """
4411+        I am the serialized sibling of download_best_version.
4412+        """
4413+        d = self.get_best_readable_version()
4414+        d.addCallback(self._record_size)
4415+        d.addCallback(lambda version: version.download_to_data())
4416+
4417+        # It is possible that the download will fail because there
4418+        # aren't enough shares to be had. If so, we will try again after
4419+        # updating the servermap in MODE_WRITE, which may find more
4420+        # shares than updating in MODE_READ, as we just did. We can do
4421+        # this by getting the best mutable version and downloading from
4422+        # that -- the best mutable version will be a MutableFileVersion
4423+        # with a servermap that was last updated in MODE_WRITE, as we
4424+        # want. If this fails, then we give up.
4425+        def _maybe_retry(failure):
4426+            failure.trap(NotEnoughSharesError)
4427+
4428+            d = self.get_best_mutable_version()
4429+            d.addCallback(self._record_size)
4430+            d.addCallback(lambda version: version.download_to_data())
4431+            return d
4432+
4433         d.addErrback(_maybe_retry)
4434         return d
4435hunk ./src/allmydata/mutable/filenode.py 424
4436-    def _try_once_to_download_best_version(self, servermap, mode):
4437-        d = self._update_servermap(servermap, mode)
4438-        d.addCallback(self._once_updated_download_best_version, servermap)
4439-        return d
4440-    def _once_updated_download_best_version(self, ignored, servermap):
4441-        goal = servermap.best_recoverable_version()
4442-        if not goal:
4443-            raise UnrecoverableFileError("no recoverable versions")
4444-        return self._try_once_to_download_version(servermap, goal)
4445+
4446+
4447+    def _record_size(self, mfv):
4448+        """
4449+        I record the size of a mutable file version.
4450+        """
4451+        self._most_recent_size = mfv.get_size()
4452+        return mfv
4453+
4454 
4455     def get_size_of_best_version(self):
4456hunk ./src/allmydata/mutable/filenode.py 435
4457-        d = self.get_servermap(MODE_READ)
4458-        def _got_servermap(smap):
4459-            ver = smap.best_recoverable_version()
4460-            if not ver:
4461-                raise UnrecoverableFileError("no recoverable version")
4462-            return smap.size_of_version(ver)
4463-        d.addCallback(_got_servermap)
4464-        return d
4465+        """
4466+        I return the size of the best version of this mutable file.
4467 
4468hunk ./src/allmydata/mutable/filenode.py 438
4469+        This is equivalent to calling get_size() on the result of
4470+        get_best_readable_version().
4471+        """
4472+        d = self.get_best_readable_version()
4473+        return d.addCallback(lambda mfv: mfv.get_size())
4474+
4475+
4476+    #################################
4477+    # IMutableFileNode
4478+
4479+    def get_best_mutable_version(self, servermap=None):
4480+        """
4481+        I return a Deferred that fires with a MutableFileVersion
4482+        representing the best readable version of the file that I
4483+        represent. I am like get_best_readable_version, except that I
4484+        will try to make a writable version if I can.
4485+        """
4486+        return self.get_mutable_version(servermap=servermap)
4487+
4488+
4489+    def get_mutable_version(self, servermap=None, version=None):
4490+        """
4491+        I return a version of this mutable file. I return a Deferred
4492+        that fires with a MutableFileVersion
4493+
4494+        If version is provided, the Deferred will fire with a
4495+        MutableFileVersion initailized with that version. Otherwise, it
4496+        will fire with the best version that I can recover.
4497+
4498+        If servermap is provided, I will use that to find versions
4499+        instead of performing my own servermap update.
4500+        """
4501+        if self.is_readonly():
4502+            return self.get_readable_version(servermap=servermap,
4503+                                             version=version)
4504+
4505+        # get_mutable_version => write intent, so we require that the
4506+        # servermap is updated in MODE_WRITE
4507+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
4508+        def _build_version((servermap, smap_version)):
4509+            # these should have been set by the servermap update.
4510+            assert self._secret_holder
4511+            assert self._writekey
4512+
4513+            mfv = MutableFileVersion(self,
4514+                                     servermap,
4515+                                     smap_version,
4516+                                     self._storage_index,
4517+                                     self._storage_broker,
4518+                                     self._readkey,
4519+                                     self._writekey,
4520+                                     self._secret_holder,
4521+                                     history=self._history)
4522+            assert not mfv.is_readonly()
4523+            return mfv
4524+
4525+        return d.addCallback(_build_version)
4526+
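Combined with the update() method defined further down, the version objects returned here allow appends without re-uploading the whole file. A hedged usage sketch; MutableData is the IMutableUploadable wrapper this patch imports from mutable/publish.py:

    from allmydata.mutable.publish import MutableData

    def append_to_mutable_file(node, more_bytes):
        # Requires a write cap; updating at EOF is defined as an append.
        d = node.get_best_mutable_version()
        d.addCallback(lambda version:
            version.update(MutableData(more_bytes), version.get_size()))
        return d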
4527+
4528+    # XXX: I'm uncomfortable with the difference between upload and
4529+    #      overwrite, which, FWICT, is basically that you don't have to
4530+    #      do a servermap update before you overwrite. We split them up
4531+    #      that way anyway, so I guess there's no real difficulty in
4532+    #      offering both ways to callers, but it also makes the
4533+    #      public-facing API cluttered, and makes it hard to discern the
4534+    #      right way of doing things.
4535+
4536+    # In general, we leave it to callers to ensure that they aren't
4537+    # going to cause UncoordinatedWriteErrors when working with
4538+    # MutableFileVersions. We know that the next three operations
4539+    # (upload, overwrite, and modify) will all operate on the same
4540+    # version, so we allow only one of them to run at a time and
4541+    # serialize them to ensure that; as the caller in this situation,
4542+    # that is our job.
4543     def overwrite(self, new_contents):
4544hunk ./src/allmydata/mutable/filenode.py 513
4545+        """
4546+        I overwrite the contents of the best recoverable version of this
4547+        mutable file with new_contents. This is equivalent to calling
4548+        overwrite on the result of get_best_mutable_version with
4549+        new_contents as an argument. I return a Deferred that eventually
4550+        fires with the results of my replacement process.
4551+        """
4552         return self._do_serialized(self._overwrite, new_contents)
4553hunk ./src/allmydata/mutable/filenode.py 521
4554+
4555+
4556     def _overwrite(self, new_contents):
4557hunk ./src/allmydata/mutable/filenode.py 524
4558+        """
4559+        I am the serialized sibling of overwrite.
4560+        """
4561+        d = self.get_best_mutable_version()
4562+        return d.addCallback(lambda mfv: mfv.overwrite(new_contents))
4563+
4564+
4565+
4566+    def upload(self, new_contents, servermap):
4567+        """
4568+        I overwrite the contents of the best recoverable version of this
4569+        mutable file with new_contents, using servermap instead of
4570+        creating/updating our own servermap. I return a Deferred that
4571+        fires with the results of my upload.
4572+        """
4573+        return self._do_serialized(self._upload, new_contents, servermap)
4574+
4575+
4576+    def _upload(self, new_contents, servermap):
4577+        """
4578+        I am the serialized sibling of upload.
4579+        """
4580+        d = self.get_best_mutable_version(servermap)
4581+        return d.addCallback(lambda mfv: mfv.overwrite(new_contents))
4582+
4583+
4584+    def modify(self, modifier, backoffer=None):
4585+        """
4586+        I modify the contents of the best recoverable version of this
4587+        mutable file with the modifier. This is equivalent to calling
4588+        modify on the result of get_best_mutable_version. I return a
4589+        Deferred that eventually fires with an UploadResults instance
4590+        describing this process.
4591+        """
4592+        return self._do_serialized(self._modify, modifier, backoffer)
4593+
4594+
4595+    def _modify(self, modifier, backoffer):
4596+        """
4597+        I am the serialized sibling of modify.
4598+        """
4599+        d = self.get_best_mutable_version()
4600+        return d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
4601+
4602+
4603+    def download_version(self, servermap, version, fetch_privkey=False):
4604+        """
4605+        Download the specified version of this mutable file. I return a
4606+        Deferred that fires with the contents of the specified version
4607+        as a bytestring, or errbacks if the file is not recoverable.
4608+        """
4609+        d = self.get_readable_version(servermap, version)
4610+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
4611+
4612+
4613+    def get_servermap(self, mode):
4614+        """
4615+        I return a servermap that has been updated in mode.
4616+
4617+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
4618+        MODE_ANYTHING. See servermap.py for more on what these mean.
4619+        """
4620+        return self._do_serialized(self._get_servermap, mode)
4621+
4622+
4623+    def _get_servermap(self, mode):
4624+        """
4625+        I am a serialized twin to get_servermap.
4626+        """
4627         servermap = ServerMap()
4628hunk ./src/allmydata/mutable/filenode.py 594
4629-        d = self._update_servermap(servermap, mode=MODE_WRITE)
4630-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
4631+        return self._update_servermap(servermap, mode)
4632+
4633+
4634+    def _update_servermap(self, servermap, mode):
4635+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
4636+                             mode)
4637+        if self._history:
4638+            self._history.notify_mapupdate(u.get_status())
4639+        return u.update()
4640+
4641+
4642+    def set_version(self, version):
4643+        # I can be set in two ways:
4644+        #  1. When the node is created.
4645+        #  2. (for an existing share) when the Servermap is updated
4646+        #     before I am read.
4647+        assert version in (MDMF_VERSION, SDMF_VERSION)
4648+        self._protocol_version = version
4649+
4650+
4651+    def get_version(self):
4652+        return self._protocol_version
4653+
4654+
4655+    def _do_serialized(self, cb, *args, **kwargs):
4656+        # note: to avoid deadlock, this callable is *not* allowed to invoke
4657+        # other serialized methods within this (or any other)
4658+        # MutableFileNode. The callable should be a bound method of this same
4659+        # MFN instance.
4660+        d = defer.Deferred()
4661+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
4662+        # we need to put off d.callback until this Deferred is finished being
4663+        # processed. Otherwise the caller's subsequent activities (like,
4664+        # doing other things with this node) can cause reentrancy problems in
4665+        # the Deferred code itself
4666+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
4667+        # add a log.err just in case something really weird happens, because
4668+        # self._serializer stays around forever, therefore we won't see the
4669+        # usual Unhandled Error in Deferred that would give us a hint.
4670+        self._serializer.addErrback(log.err)
4671         return d
4672 
4673 
4674hunk ./src/allmydata/mutable/filenode.py 637
4675+    def _upload(self, new_contents, servermap):
4676+        """
4677+        A MutableFileNode still has to have some way of getting
4678+        published initially, which is what I am here for. After that,
4679+        all publishing, updating, modifying and so on happens through
4680+        MutableFileVersions.
4681+        """
4682+        assert self._pubkey, "update_servermap must be called before publish"
4683+
4684+        p = Publish(self, self._storage_broker, servermap)
4685+        if self._history:
4686+            self._history.notify_publish(p.get_status(),
4687+                                         new_contents.get_size())
4688+        d = p.publish(new_contents)
4689+        d.addCallback(self._did_upload, new_contents.get_size())
4690+        return d
4691+
4692+
4693+    def _did_upload(self, res, size):
4694+        self._most_recent_size = size
4695+        return res
4696+
4697+
4698+class MutableFileVersion:
4699+    """
4700+    I represent a specific version (most likely the best version) of a
4701+    mutable file.
4702+
4703+    Since I implement IReadable, instances which hold a
4704+    reference to an instance of me are guaranteed the ability (absent
4705+    connection difficulties or unrecoverable versions) to read the file
4706+    that I represent. Depending on whether I was initialized with a
4707+    write capability or not, I may also provide callers the ability to
4708+    overwrite or modify the contents of the mutable file that I
4709+    reference.
4710+    """
4711+    implements(IMutableFileVersion, IWritable)
4712+
4713+    def __init__(self,
4714+                 node,
4715+                 servermap,
4716+                 version,
4717+                 storage_index,
4718+                 storage_broker,
4719+                 readcap,
4720+                 writekey=None,
4721+                 write_secrets=None,
4722+                 history=None):
4723+
4724+        self._node = node
4725+        self._servermap = servermap
4726+        self._version = version
4727+        self._storage_index = storage_index
4728+        self._write_secrets = write_secrets
4729+        self._history = history
4730+        self._storage_broker = storage_broker
4731+
4732+        #assert isinstance(readcap, IURI)
4733+        self._readcap = readcap
4734+
4735+        self._writekey = writekey
4736+        self._serializer = defer.succeed(None)
4737+        self._size = None
4738+
4739+
4740+    def get_sequence_number(self):
4741+        """
4742+        Get the sequence number of the mutable version that I represent.
4743+        """
4744+        return self._version[0] # verinfo[0] == the sequence number
4745+
4746+
4747+    # TODO: Terminology?
4748+    def get_writekey(self):
4749+        """
4750+        I return a writekey or None if I don't have a writekey.
4751+        """
4752+        return self._writekey
4753+
4754+
4755+    def overwrite(self, new_contents):
4756+        """
4757+        I overwrite the contents of this mutable file version with the
4758+        data in new_contents.
4759+        """
4760+        assert not self.is_readonly()
4761+
4762+        return self._do_serialized(self._overwrite, new_contents)
4763+
4764+
4765+    def _overwrite(self, new_contents):
4766+        assert IMutableUploadable.providedBy(new_contents)
4767+        assert self._servermap.last_update_mode == MODE_WRITE
4768+
4769+        return self._upload(new_contents)
4770+
4771+
4772     def modify(self, modifier, backoffer=None):
4773         """I use a modifier callback to apply a change to the mutable file.
4774         I implement the following pseudocode::
4775hunk ./src/allmydata/mutable/filenode.py 774
4776         backoffer should not invoke any methods on this MutableFileNode
4777         instance, and it needs to be highly conscious of deadlock issues.
4778         """
4779+        assert not self.is_readonly()
4780+
4781         return self._do_serialized(self._modify, modifier, backoffer)
4782hunk ./src/allmydata/mutable/filenode.py 777
4783+
4784+
4785     def _modify(self, modifier, backoffer):
4786hunk ./src/allmydata/mutable/filenode.py 780
4787-        servermap = ServerMap()
4788         if backoffer is None:
4789             backoffer = BackoffAgent().delay
4790hunk ./src/allmydata/mutable/filenode.py 782
4791-        return self._modify_and_retry(servermap, modifier, backoffer, True)
4792-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
4793-        d = self._modify_once(servermap, modifier, first_time)
4794+        return self._modify_and_retry(modifier, backoffer, True)
4795+
4796+
4797+    def _modify_and_retry(self, modifier, backoffer, first_time):
4798+        """
4799+        I try to apply modifier to the contents of this version of the
4800+        mutable file. If I succeed, I return an UploadResults instance
4801+        describing my success. If I fail, I try again after waiting for
4802+        a little bit.
4803+        """
4804+        log.msg("doing modify")
4805+        d = self._modify_once(modifier, first_time)
4806         def _retry(f):
4807             f.trap(UncoordinatedWriteError)
4808             d2 = defer.maybeDeferred(backoffer, self, f)
4809hunk ./src/allmydata/mutable/filenode.py 798
4810             d2.addCallback(lambda ignored:
4811-                           self._modify_and_retry(servermap, modifier,
4812+                           self._modify_and_retry(modifier,
4813                                                   backoffer, False))
4814             return d2
4815         d.addErrback(_retry)
4816hunk ./src/allmydata/mutable/filenode.py 803
4817         return d
4818-    def _modify_once(self, servermap, modifier, first_time):
4819-        d = self._update_servermap(servermap, MODE_WRITE)
4820-        d.addCallback(self._once_updated_download_best_version, servermap)
4821+
4822+
4823+    def _modify_once(self, modifier, first_time):
4824+        """
4825+        I attempt to apply a modifier to the contents of the mutable
4826+        file.
4827+        """
4828+        # XXX: This is wrong -- we could get more servers if we updated
4829+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
4830+        # assert that the last update wasn't MODE_READ
4831+        assert self._servermap.last_update_mode == MODE_WRITE
4832+
4833+        # download_to_data is serialized, so we have to call this to
4834+        # avoid deadlock.
4835+        d = self._try_to_download_data()
4836         def _apply(old_contents):
4837hunk ./src/allmydata/mutable/filenode.py 819
4838-            new_contents = modifier(old_contents, servermap, first_time)
4839+            new_contents = modifier(old_contents, self._servermap, first_time)
4840+            precondition((isinstance(new_contents, str) or
4841+                          new_contents is None),
4842+                         "Modifier function must return a string "
4843+                         "or None")
4844+
4845             if new_contents is None or new_contents == old_contents:
4846hunk ./src/allmydata/mutable/filenode.py 826
4847+                log.msg("no changes")
4848                 # no changes need to be made
4849                 if first_time:
4850                     return
4851hunk ./src/allmydata/mutable/filenode.py 834
4852                 # recovery when it observes UCWE, we need to do a second
4853                 # publish. See #551 for details. We'll basically loop until
4854                 # we managed an uncontested publish.
4855-                new_contents = old_contents
4856-            precondition(isinstance(new_contents, str),
4857-                         "Modifier function must return a string or None")
4858-            return self._upload(new_contents, servermap)
4859+                old_uploadable = MutableData(old_contents)
4860+                new_contents = old_uploadable
4861+            else:
4862+                new_contents = MutableData(new_contents)
4863+
4864+            return self._upload(new_contents)
4865         d.addCallback(_apply)
4866         return d
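The retry loop above delegates its pacing entirely to the backoffer callable, which is invoked with the version object and the UncoordinatedWriteError failure and must return a Deferred. A minimal fixed-delay sketch (the patch's default BackoffAgent is presumably smarter than this):

    from twisted.internet import defer, reactor

    def fixed_backoffer(version, failure, _delay=1.0):
        # Invoked after an UncoordinatedWriteError; fire after a pause so
        # _modify_and_retry() can run the modifier against fresh state.
        d = defer.Deferred()
        reactor.callLater(_delay, d.callback, None)
        return d

    # usage: version.modify(my_modifier, backoffer=fixed_backoffer)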
4867 
4868hunk ./src/allmydata/mutable/filenode.py 843
4869-    def get_servermap(self, mode):
4870-        return self._do_serialized(self._get_servermap, mode)
4871-    def _get_servermap(self, mode):
4872-        servermap = ServerMap()
4873-        return self._update_servermap(servermap, mode)
4874-    def _update_servermap(self, servermap, mode):
4875-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
4876-                             mode)
4877-        if self._history:
4878-            self._history.notify_mapupdate(u.get_status())
4879-        return u.update()
4880 
4881hunk ./src/allmydata/mutable/filenode.py 844
4882-    def download_version(self, servermap, version, fetch_privkey=False):
4883-        return self._do_serialized(self._try_once_to_download_version,
4884-                                   servermap, version, fetch_privkey)
4885-    def _try_once_to_download_version(self, servermap, version,
4886-                                      fetch_privkey=False):
4887-        r = Retrieve(self, servermap, version, fetch_privkey)
4888+    def is_readonly(self):
4889+        """
4890+        I return True if this MutableFileVersion provides no write
4891+        access to the file that it encapsulates, and False if it
4892+        provides the ability to modify the file.
4893+        """
4894+        return self._writekey is None
4895+
4896+
4897+    def is_mutable(self):
4898+        """
4899+        I return True, since mutable files are always mutable by
4900+        somebody.
4901+        """
4902+        return True
4903+
4904+
4905+    def get_storage_index(self):
4906+        """
4907+        I return the storage index of the reference that I encapsulate.
4908+        """
4909+        return self._storage_index
4910+
4911+
4912+    def get_size(self):
4913+        """
4914+        I return the length, in bytes, of this readable object.
4915+        """
4916+        return self._servermap.size_of_version(self._version)
4917+
4918+
4919+    def download_to_data(self, fetch_privkey=False):
4920+        """
4921+        I return a Deferred that fires with the contents of this
4922+        readable object as a byte string.
4923+
4924+        """
4925+        c = consumer.MemoryConsumer()
4926+        d = self.read(c, fetch_privkey=fetch_privkey)
4927+        d.addCallback(lambda mc: "".join(mc.chunks))
4928+        return d
4929+
4930+
4931+    def _try_to_download_data(self):
4932+        """
4933+        I am an unserialized cousin of download_to_data; I am called
4934+        from the children of modify() to download the data associated
4935+        with this mutable version.
4936+        """
4937+        c = consumer.MemoryConsumer()
4938+        # modify will almost certainly write, so we need the privkey.
4939+        d = self._read(c, fetch_privkey=True)
4940+        d.addCallback(lambda mc: "".join(mc.chunks))
4941+        return d
4942+
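The MemoryConsumer pattern used by download_to_data generalizes to ranged reads, since read() accepts offset and size arguments. A sketch:

    from allmydata.util import consumer

    def read_range(version, offset, size):
        # Pull size bytes starting at offset into memory, reusing the
        # consumer machinery that download_to_data() uses for whole files.
        c = consumer.MemoryConsumer()
        d = version.read(c, offset=offset, size=size)
        d.addCallback(lambda mc: "".join(mc.chunks))
        return d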
4943+
4944+    def _update_servermap(self, mode=MODE_READ):
4945+        """
4946+        I update our Servermap according to my mode argument. I return a
4947+        Deferred that fires with None when this has finished. The
4948+        updated Servermap will be at self._servermap in that case.
4949+        """
4950+        d = self._node.get_servermap(mode)
4951+
4952+        def _got_servermap(servermap):
4953+            assert servermap.last_update_mode == mode
4954+
4955+            self._servermap = servermap
4956+        d.addCallback(_got_servermap)
4957+        return d
4958+
4959+
4960+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
4961+        """
4962+        I read a portion (possibly all) of the mutable file that I
4963+        reference into consumer.
4964+        """
4965+        return self._do_serialized(self._read, consumer, offset, size,
4966+                                   fetch_privkey)
4967+
4968+
4969+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
4970+        """
4971+        I am the serialized companion of read.
4972+        """
4973+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
4974         if self._history:
4975             self._history.notify_retrieve(r.get_status())
4976hunk ./src/allmydata/mutable/filenode.py 932
4977-        d = r.download()
4978-        d.addCallback(self._downloaded_version)
4979+        d = r.download(consumer, offset, size)
4980         return d
4981hunk ./src/allmydata/mutable/filenode.py 934
4982-    def _downloaded_version(self, data):
4983-        self._most_recent_size = len(data)
4984-        return data
4985 
4986hunk ./src/allmydata/mutable/filenode.py 935
4987-    def upload(self, new_contents, servermap):
4988-        return self._do_serialized(self._upload, new_contents, servermap)
4989-    def _upload(self, new_contents, servermap):
4990-        assert self._pubkey, "update_servermap must be called before publish"
4991-        p = Publish(self, self._storage_broker, servermap)
4992+
4993+    def _do_serialized(self, cb, *args, **kwargs):
4994+        # note: to avoid deadlock, this callable is *not* allowed to invoke
4995+        # other serialized methods within this (or any other)
4996+        # MutableFileNode. The callable should be a bound method of this same
4997+        # MFN instance.
4998+        d = defer.Deferred()
4999+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
5000+        # we need to put off d.callback until this Deferred is finished being
5001+        # processed. Otherwise the caller's subsequent activities (like,
5002+        # doing other things with this node) can cause reentrancy problems in
5003+        # the Deferred code itself
5004+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
5005+        # add a log.err just in case something really weird happens, because
5006+        # self._serializer stays around forever, therefore we won't see the
5007+        # usual Unhandled Error in Deferred that would give us a hint.
5008+        self._serializer.addErrback(log.err)
5009+        return d
5010+
5011+
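The serialization idiom in _do_serialized reduces to a few lines of Twisted. A minimal sketch, assuming only twisted.internet.defer; unlike the real method, it omits the eventually() indirection that guards against reentrancy and the log.err failsafe:

    from twisted.internet import defer

    class Serializer:
        def __init__(self):
            # one long-lived Deferred chain; queued calls run in order
            self._serializer = defer.succeed(None)
        def run(self, cb, *args, **kwargs):
            d = defer.Deferred()
            self._serializer.addCallback(lambda ign: cb(*args, **kwargs))
            self._serializer.addBoth(d.callback)
            return d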
5012+    def _upload(self, new_contents):
5013+        #assert self._pubkey, "update_servermap must be called before publish"
5014+        p = Publish(self._node, self._storage_broker, self._servermap)
5015         if self._history:
5016hunk ./src/allmydata/mutable/filenode.py 959
5017-            self._history.notify_publish(p.get_status(), len(new_contents))
5018+            self._history.notify_publish(p.get_status(),
5019+                                         new_contents.get_size())
5020         d = p.publish(new_contents)
5021hunk ./src/allmydata/mutable/filenode.py 962
5022-        d.addCallback(self._did_upload, len(new_contents))
5023+        d.addCallback(self._did_upload, new_contents.get_size())
5024         return d
5025hunk ./src/allmydata/mutable/filenode.py 964
5026+
5027+
5028     def _did_upload(self, res, size):
5029hunk ./src/allmydata/mutable/filenode.py 967
5030-        self._most_recent_size = size
5031+        self._size = size
5032         return res
5033hunk ./src/allmydata/mutable/filenode.py 969
5034+
5035+    def update(self, data, offset):
5036+        """
5037+        Do an update of this mutable file version by inserting data at
5038+        offset within the file. If offset is the EOF, this is an append
5039+        operation. I return a Deferred that fires with the results of
5040+        the update operation when it has completed.
5041+
5042+        In cases where update does not append any data, or where it does
5043+        not append so many blocks that the block count crosses a
5044+        power-of-two boundary, this operation will use roughly
5045+        O(data.get_size()) memory/bandwidth/CPU to perform the update.
5046+        Otherwise, it must download, re-encode, and upload the entire
5047+        file again, which will use O(filesize) resources.
5048+        """
5049+        return self._do_serialized(self._update, data, offset)
5050+
5051+
5052+    def _update(self, data, offset):
5053+        """
5054+        I update the mutable file version represented by this particular
5055+        IMutableVersion by inserting the given data at the given
5056+        offset. I return a Deferred that fires when this has been
5057+        completed.
5058+        """
5059+        # We have two cases here:
5060+        # 1. The new data will add few enough segments so that it does
5061+        #    not cross into the next power-of-two boundary.
5062+        # 2. It doesn't.
5063+        #
5064+        # In the former case, we can modify the file in place. In the
5065+        # latter case, we need to re-encode the file.
5066+        new_size = data.get_size() + offset
5067+        old_size = self.get_size()
5068+        segment_size = self._version[3]
5069+        num_old_segments = mathutil.div_ceil(old_size,
5070+                                             segment_size)
5071+        num_new_segments = mathutil.div_ceil(new_size,
5072+                                             segment_size)
5073+        log.msg("got %d old segments, %d new segments" % \
5074+                        (num_old_segments, num_new_segments))
5075+
5076+        # We also do a whole file re-encode if the file is an SDMF file.
5077+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
5078+            log.msg("doing re-encode instead of in-place update")
5079+            return self._do_modify_update(data, offset)
5080+
5081+        log.msg("updating in place")
5082+        d = self._do_update_update(data, offset)
5083+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
5084+        d.addCallback(self._build_uploadable_and_finish, data, offset)
5085+        return d
5086+
5087+
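A worked example of the power-of-two rule mentioned in the docstring above. Here div_ceil reimplements allmydata.util.mathutil.div_ceil, the 128 KiB segment size matches DEFAULT_MAX_SEGMENT_SIZE, and tree_leaves is a hypothetical helper:

    def div_ceil(n, d):
        return (n + d - 1) // d

    def tree_leaves(size, segment_size=128 * 1024):
        # smallest power of two >= the segment count
        n = div_ceil(size, segment_size)
        p = 1
        while p < n:
            p *= 2
        return p

    seg = 128 * 1024
    assert tree_leaves(5 * seg) == tree_leaves(7 * seg) == 8  # update in place
    assert tree_leaves(9 * seg) == 16   # crossed a boundary: re-encode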
5088+    def _do_modify_update(self, data, offset):
5089+        """
5090+        I perform a file update by modifying the contents of the file
5091+        after downloading it, then reuploading it. I am less efficient
5092+        than _do_update_update, but am necessary for certain updates.
5093+        """
5094+        def m(old, servermap, first_time):
5095+            start = offset
5096+            rest = offset + data.get_size()
5097+            new = old[:start]
5098+            new += "".join(data.read(data.get_size()))
5099+            new += old[rest:]
5100+            return new
5101+        return self._modify(m, None)
5102+
5103+
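The modifier function m above is a plain string splice; for example:

    old = "abcdefgh"
    data = "XY"
    offset = 3
    new = old[:offset] + data + old[offset + len(data):]
    assert new == "abcXYfgh"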
5104+    def _do_update_update(self, data, offset):
5105+        """
5106+        I start the Servermap update that gets us the data we need to
5107+        continue the update process. I return a Deferred that fires when
5108+        the servermap update is done.
5109+        """
5110+        assert IMutableUploadable.providedBy(data)
5111+        assert self.is_mutable()
5112+        # offset == self.get_size() is valid and means that we are
5113+        # appending data to the file.
5114+        assert offset <= self.get_size()
5115+
5116+        datasize = data.get_size()
5117+        # We'll need the segment that the data starts in, regardless of
5118+        # what we'll do later.
5119+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
5120+        start_segment -= 1
5121+
5122+        # We only need a distinct end segment if the new data ends
5123+        # before the current end-of-file.
5124+        end_segment = start_segment
5125+        if offset + data.get_size() < self.get_size():
5126+            end_data = offset + data.get_size()
5127+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
5128+            end_segment -= 1
5129+        self._start_segment = start_segment
5130+        self._end_segment = end_segment
5131+
5132+        # Now ask for the servermap to be updated in MODE_WRITE with
5133+        # this update range.
5134+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
5135+                             self._servermap,
5136+                             mode=MODE_WRITE,
5137+                             update_range=(start_segment, end_segment))
5138+        return u.update()
5139+
5140+
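To make the segment arithmetic above concrete (div_ceil as in allmydata.util.mathutil; note that under this formula an offset lying exactly on a segment boundary maps to the preceding segment):

    def div_ceil(n, d):
        return (n + d - 1) // d

    DEFAULT_MAX_SEGMENT_SIZE = 128 * 1024
    offset = 300 * 1024              # update starts at 300 KiB
    start_segment = div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE) - 1
    assert start_segment == 2        # byte 307200 lies in segment 2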
5141+    def _decode_and_decrypt_segments(self, ignored, data, offset):
5142+        """
5143+        After the servermap update, I take the encrypted and encoded
5144+        data that the servermap fetched while doing its update and
5145+        transform it into decoded-and-decrypted plaintext that can be
5146+        used by the new uploadable. I return a Deferred that fires with
5147+        the segments.
5148+        """
5149+        r = Retrieve(self._node, self._servermap, self._version)
5150+        # decode: takes in our blocks and salts from the servermap,
5151+        # returns a Deferred that fires with the corresponding plaintext
5152+        # segments. Does not download -- simply takes advantage of
5153+        # existing infrastructure within the Retrieve class to avoid
5154+        # duplicating code.
5155+        sm = self._servermap
5156+        # XXX: If the methods in the servermap don't work as
5157+        # abstractions, you should rewrite them instead of going around
5158+        # them.
5159+        update_data = sm.update_data
5160+        start_segments = {} # shnum -> start segment
5161+        end_segments = {} # shnum -> end segment
5162+        blockhashes = {} # shnum -> blockhash tree
5163+        for (shnum, datav) in update_data.iteritems():
5164+            datav = [d[1] for d in datav if d[0] == self._version]
5165+
5166+            # Every entry in datav should now be the update data for
5167+            # share shnum of a particular version of the mutable file,
5168+            # so all of the entries should be identical.
5169+            datum = datav[0]
5170+            assert filter(lambda x: x != datum, datav) == []
5171+
5172+            blockhashes[shnum] = datum[0]
5173+            start_segments[shnum] = datum[1]
5174+            end_segments[shnum] = datum[2]
5175+
5176+        d1 = r.decode(start_segments, self._start_segment)
5177+        d2 = r.decode(end_segments, self._end_segment)
5178+        d3 = defer.succeed(blockhashes)
5179+        return deferredutil.gatherResults([d1, d2, d3])
5180+
5181+
5182+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
5183+        """
5184+        After the process has the plaintext segments, I build the
5185+        TransformingUploadable that the publisher will eventually
5186+        re-upload to the grid. I then invoke the publisher with that
5187+        uploadable, and return a Deferred that fires when the publish
5188+        operation has completed without issue.
5189+        """
5190+        u = TransformingUploadable(data, offset,
5191+                                   self._version[3],
5192+                                   segments_and_bht[0],
5193+                                   segments_and_bht[1])
5194+        p = Publish(self._node, self._storage_broker, self._servermap)
5195+        return p.update(u, offset, segments_and_bht[2], self._version)
5196}
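Taken together, the filenode changes above give callers a version-oriented API. A hypothetical append might look roughly like this, assuming the node exposes get_best_mutable_version() and that MutableData is the string-backed uploadable wrapper added by the publish.py patch below (both assumptions; error handling omitted):

    from allmydata.mutable.publish import MutableData  # assumed location

    def append(node, text):
        d = node.get_best_mutable_version()
        def _got_version(version):
            return version.update(MutableData(text), version.get_size())
        d.addCallback(_got_version)
        return d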
5197[mutable/publish.py: Modify the publish process to support MDMF
5198Kevan Carstensen <kevan@isnotajoke.com>**20100811233101
5199 Ignore-this: c2eb57cf67da7af5ad02be793e918bc6
5200 
5201 The inner workings of the publishing process needed to be reworked to a
5202 large extent to cope with segmented mutable files, and to cope with
5203 partial-file updates of mutable files. This patch does that. It also
5204 introduces wrappers for uploadable data, allowing the use of
5205 filehandle-like objects as data sources, in addition to strings. This
5206 reduces memory usage when dealing with large files through the
5207 webapi, and clarifies the update code there.
5208] {
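The uploadable wrappers described above present a uniform get_size()/read() interface over both strings and filehandle-like objects. A minimal string-backed sketch (a simplification, not the patch's actual class):

    from StringIO import StringIO

    class StringUploadable:
        def __init__(self, s):
            self._size = len(s)
            self._filehandle = StringIO(s)
        def get_size(self):
            return self._size
        def read(self, length):
            # like the patch's uploadables, return a list of strings
            return [self._filehandle.read(length)]
        def close(self):
            self._filehandle.close()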
5209hunk ./src/allmydata/mutable/publish.py 4
5210 
5211 
5212 import os, struct, time
5213+from StringIO import StringIO
5214 from itertools import count
5215 from zope.interface import implements
5216 from twisted.internet import defer
5217hunk ./src/allmydata/mutable/publish.py 9
5218 from twisted.python import failure
5219-from allmydata.interfaces import IPublishStatus
5220+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
5221+                                 IMutableUploadable
5222 from allmydata.util import base32, hashutil, mathutil, idlib, log
5223 from allmydata import hashtree, codec
5224 from allmydata.storage.server import si_b2a
5225hunk ./src/allmydata/mutable/publish.py 21
5226      UncoordinatedWriteError, NotEnoughServersError
5227 from allmydata.mutable.servermap import ServerMap
5228 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
5229-     unpack_checkstring, SIGNED_PREFIX
5230+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
5231+     SDMFSlotWriteProxy
5232+
5233+KiB = 1024
5234+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
5235+PUSHING_BLOCKS_STATE = 0
5236+PUSHING_EVERYTHING_ELSE_STATE = 1
5237+DONE_STATE = 2
5238 
5239 class PublishStatus:
5240     implements(IPublishStatus)
5241hunk ./src/allmydata/mutable/publish.py 118
5242         self._status.set_helper(False)
5243         self._status.set_progress(0.0)
5244         self._status.set_active(True)
5245+        self._version = self._node.get_version()
5246+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
5247+
5248 
5249     def get_status(self):
5250         return self._status
5251hunk ./src/allmydata/mutable/publish.py 132
5252             kwargs["facility"] = "tahoe.mutable.publish"
5253         return log.msg(*args, **kwargs)
5254 
5255+
5256+    def update(self, data, offset, blockhashes, version):
5257+        """
5258+        I replace the contents of this file with the contents of data,
5259+        starting at offset. I return a Deferred that fires with None
5260+        when the replacement has been completed, or with an error if
5261+        something went wrong during the process.
5262+
5263+        Note that this process will not upload new shares. If the file
5264+        being updated is in need of repair, callers will have to repair
5265+        it on their own.
5266+        """
5267+        # How this works:
5268+        # 1. Make peer assignments. We'll assign each share that we know
5269+        #    about on the grid to the peer that currently holds it, and
5270+        #    will not place any new shares.
5271+        # 2. Set up encoding parameters. Most of these will stay the same
5272+        #    -- datalength will change, as will some of the offsets.
5273+        # 3. Upload the new segments.
5274+        # 4. Be done.
5275+        assert IMutableUploadable.providedBy(data)
5276+
5277+        self.data = data
5278+
5279+        # XXX: Use the MutableFileVersion instead.
5280+        self.datalength = self._node.get_size()
5281+        if data.get_size() > self.datalength:
5282+            self.datalength = data.get_size()
5283+
5284+        self.log("starting update")
5285+        self.log("adding new data of length %d at offset %d" % \
5286+                    (data.get_size(), offset))
5287+        self.log("new data length is %d" % self.datalength)
5288+        self._status.set_size(self.datalength)
5289+        self._status.set_status("Started")
5290+        self._started = time.time()
5291+
5292+        self.done_deferred = defer.Deferred()
5293+
5294+        self._writekey = self._node.get_writekey()
5295+        assert self._writekey, "need write capability to publish"
5296+
5297+        # first, which servers will we publish to? We require that the
5298+        # servermap was updated in MODE_WRITE, so we can depend upon the
5299+        # peerlist computed by that process instead of computing our own.
5300+        assert self._servermap
5301+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
5302+        # we will push a version that is one larger than anything present
5303+        # in the grid, according to the servermap.
5304+        self._new_seqnum = self._servermap.highest_seqnum() + 1
5305+        self._status.set_servermap(self._servermap)
5306+
5307+        self.log(format="new seqnum will be %(seqnum)d",
5308+                 seqnum=self._new_seqnum, level=log.NOISY)
5309+
5310+        # We're updating an existing file, so all of the following
5311+        # should be available.
5312+        self.readkey = self._node.get_readkey()
5313+        self.required_shares = self._node.get_required_shares()
5314+        assert self.required_shares is not None
5315+        self.total_shares = self._node.get_total_shares()
5316+        assert self.total_shares is not None
5317+        self._status.set_encoding(self.required_shares, self.total_shares)
5318+
5319+        self._pubkey = self._node.get_pubkey()
5320+        assert self._pubkey
5321+        self._privkey = self._node.get_privkey()
5322+        assert self._privkey
5323+        self._encprivkey = self._node.get_encprivkey()
5324+
5325+        sb = self._storage_broker
5326+        full_peerlist = sb.get_servers_for_index(self._storage_index)
5327+        self.full_peerlist = full_peerlist # for use later, immutable
5328+        self.bad_peers = set() # peerids who have errbacked/refused requests
5329+
5330+        # This will set self.segment_size, self.num_segments, and
5331+        # self.fec. TODO: Does it know how to do the offset? Probably
5332+        # not. So do that part next.
5333+        self.setup_encoding_parameters(offset=offset)
5334+
5335+        # if we experience any surprises (writes which were rejected because
5336+        # our test vector did not match, or shares which we didn't expect to
5337+        # see), we set this flag and report an UncoordinatedWriteError at the
5338+        # end of the publish process.
5339+        self.surprised = False
5340+
5341+        # we keep track of three tables. The first is our goal: which share
5342+        # we want to see on which servers. This is initially populated by the
5343+        # existing servermap.
5344+        self.goal = set() # pairs of (peerid, shnum) tuples
5345+
5346+        # the second table is our list of outstanding queries: those which
5347+        # are in flight and may or may not be delivered, accepted, or
5348+        # acknowledged. Items are added to this table when the request is
5349+        # sent, and removed when the response returns (or errbacks).
5350+        self.outstanding = set() # (peerid, shnum) tuples
5351+
5352+        # the third is a table of successes: share which have actually been
5353+        # placed. These are populated when responses come back with success.
5354+        # When self.placed == self.goal, we're done.
5355+        self.placed = set() # (peerid, shnum) tuples
5356+
5357+        # we also keep a mapping from peerid to RemoteReference. Each time we
5358+        # pull a connection out of the full peerlist, we add it to this for
5359+        # use later.
5360+        self.connections = {}
5361+
5362+        self.bad_share_checkstrings = {}
5363+
5364+        # This is set at the last step of the publishing process.
5365+        self.versioninfo = ""
5366+
5367+        # we use the servermap to populate the initial goal: this way we will
5368+        # try to update each existing share in place. Since we're
5369+        # updating, we ignore damaged and missing shares -- callers must
5370+        # do a repair to repair and recreate these.
5371+        for (peerid, shnum) in self._servermap.servermap:
5372+            self.goal.add( (peerid, shnum) )
5373+            self.connections[peerid] = self._servermap.connections[peerid]
5374+        self.writers = {}
5375+
5376+        # SDMF files are updated differently; update() only sees MDMF.
5377+        self._version = MDMF_VERSION
5378+        writer_class = MDMFSlotWriteProxy
5379+
5380+        # For each (peerid, shnum) in self.goal, we make a
5381+        # write proxy for that peer. We'll use this to write
5382+        # shares to the peer.
5383+        for key in self.goal:
5384+            peerid, shnum = key
5385+            write_enabler = self._node.get_write_enabler(peerid)
5386+            renew_secret = self._node.get_renewal_secret(peerid)
5387+            cancel_secret = self._node.get_cancel_secret(peerid)
5388+            secrets = (write_enabler, renew_secret, cancel_secret)
5389+
5390+            self.writers[shnum] =  writer_class(shnum,
5391+                                                self.connections[peerid],
5392+                                                self._storage_index,
5393+                                                secrets,
5394+                                                self._new_seqnum,
5395+                                                self.required_shares,
5396+                                                self.total_shares,
5397+                                                self.segment_size,
5398+                                                self.datalength)
5399+            self.writers[shnum].peerid = peerid
5400+            assert (peerid, shnum) in self._servermap.servermap
5401+            old_versionid, old_timestamp = self._servermap.servermap[key]
5402+            (old_seqnum, old_root_hash, old_salt, old_segsize,
5403+             old_datalength, old_k, old_N, old_prefix,
5404+             old_offsets_tuple) = old_versionid
5405+            self.writers[shnum].set_checkstring(old_seqnum,
5406+                                                old_root_hash,
5407+                                                old_salt)
5408+
5409+        # Our remote shares will not have a complete checkstring until
5410+        # after we are done writing share data and have started to write
5411+        # blocks. In the meantime, we need to know what to look for when
5412+        # writing, so that we can detect UncoordinatedWriteErrors.
5413+        self._checkstring = self.writers.values()[0].get_checkstring()
5414+
5415+        # Now, we start pushing shares.
5416+        self._status.timings["setup"] = time.time() - self._started
5417+        # First, we encrypt, encode, and publish the shares that we need
5418+        # to encrypt, encode, and publish.
5419+
5420+        # Our update process fetched these for us. We need to update
5421+        # them in place as publishing happens.
5422+        self.blockhashes = {} # shnum -> [blockhashes]
5423+        for (i, bht) in blockhashes.iteritems():
5424+            # We need to extract the leaves from our old hash tree.
5425+            old_segcount = mathutil.div_ceil(version[4],
5426+                                             version[3])
5427+            h = hashtree.IncompleteHashTree(old_segcount)
5428+            bht = dict(enumerate(bht))
5429+            h.set_hashes(bht)
5430+            leaves = h[h.get_leaf_index(0):]
5431+            for j in xrange(self.num_segments - len(leaves)):
5432+                leaves.append(None)
5433+
5434+            assert len(leaves) >= self.num_segments
5435+            self.blockhashes[i] = leaves
5436+            # This list will now be the leaves that were set during the
5437+            # initial upload + enough empty hashes to make it a
5438+            # power-of-two. If we exceed a power of two boundary, we
5439+            # should be encoding the file over again, and should not be
5440+            # here. So, we have
5441+            #assert len(self.blockhashes[i]) == \
5442+            #    hashtree.roundup_pow2(self.num_segments), \
5443+            #        len(self.blockhashes[i])
5444+            # XXX: Except this doesn't work. Figure out why.
5445+
5446+        # These are filled in later, after we've modified the block hash
5447+        # tree suitably.
5448+        self.sharehash_leaves = None # eventually [sharehashes]
5449+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
5450+                              # validate the share]
5451+
5452+        d = defer.succeed(None)
5453+        self.log("Starting push")
5454+
5455+        self._state = PUSHING_BLOCKS_STATE
5456+        self._push()
5457+
5458+        return self.done_deferred
5459+
5460+
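The goal/outstanding/placed bookkeeping described above reduces to set arithmetic over (peerid, shnum) pairs; for example:

    goal = set([("peerA", 0), ("peerB", 1), ("peerC", 2)])
    outstanding = set([("peerC", 2)])           # queries in flight
    placed = set([("peerA", 0), ("peerB", 1)])  # confirmed writes
    needed = goal - placed - outstanding
    assert needed == set()   # done once the outstanding query retires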
5461     def publish(self, newdata):
5462         """Publish the filenode's current contents.  Returns a Deferred that
5463         fires (with None) when the publish has done as much work as it's ever
5464hunk ./src/allmydata/mutable/publish.py 345
5465         simultaneous write.
5466         """
5467 
5468-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
5469-        # 2: perform peer selection, get candidate servers
5470-        #  2a: send queries to n+epsilon servers, to determine current shares
5471-        #  2b: based upon responses, create target map
5472-        # 3: send slot_testv_and_readv_and_writev messages
5473-        # 4: as responses return, update share-dispatch table
5474-        # 4a: may need to run recovery algorithm
5475-        # 5: when enough responses are back, we're done
5476+        # 0. Setup encoding parameters, encoder, and other such things.
5477+        # 1. Encrypt, encode, and publish segments.
5478+        assert IMutableUploadable.providedBy(newdata)
5479 
5480hunk ./src/allmydata/mutable/publish.py 349
5481-        self.log("starting publish, datalen is %s" % len(newdata))
5482-        self._status.set_size(len(newdata))
5483+        self.data = newdata
5484+        self.datalength = newdata.get_size()
5485+        #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
5486+        #    self._version = MDMF_VERSION
5487+        #else:
5488+        #    self._version = SDMF_VERSION
5489+
5490+        self.log("starting publish, datalen is %s" % self.datalength)
5491+        self._status.set_size(self.datalength)
5492         self._status.set_status("Started")
5493         self._started = time.time()
5494 
5495hunk ./src/allmydata/mutable/publish.py 405
5496         self.full_peerlist = full_peerlist # for use later, immutable
5497         self.bad_peers = set() # peerids who have errbacked/refused requests
5498 
5499-        self.newdata = newdata
5500-        self.salt = os.urandom(16)
5501-
5502+        # This will set self.segment_size, self.num_segments, and
5503+        # self.fec.
5504         self.setup_encoding_parameters()
5505 
5506         # if we experience any surprises (writes which were rejected because
5507hunk ./src/allmydata/mutable/publish.py 415
5508         # end of the publish process.
5509         self.surprised = False
5510 
5511-        # as a failsafe, refuse to iterate through self.loop more than a
5512-        # thousand times.
5513-        self.looplimit = 1000
5514-
5515         # we keep track of three tables. The first is our goal: which share
5516         # we want to see on which servers. This is initially populated by the
5517         # existing servermap.
5518hunk ./src/allmydata/mutable/publish.py 438
5519 
5520         self.bad_share_checkstrings = {}
5521 
5522+        # This is set at the last step of the publishing process.
5523+        self.versioninfo = ""
5524+
5525         # we use the servermap to populate the initial goal: this way we will
5526         # try to update each existing share in place.
5527         for (peerid, shnum) in self._servermap.servermap:
5528hunk ./src/allmydata/mutable/publish.py 454
5529             self.bad_share_checkstrings[key] = old_checkstring
5530             self.connections[peerid] = self._servermap.connections[peerid]
5531 
5532-        # create the shares. We'll discard these as they are delivered. SDMF:
5533-        # we're allowed to hold everything in memory.
5534+        # TODO: Make this part do peer selection.
5535+        self.update_goal()
5536+        self.writers = {}
5537+        if self._version == MDMF_VERSION:
5538+            writer_class = MDMFSlotWriteProxy
5539+        else:
5540+            writer_class = SDMFSlotWriteProxy
5541 
5542hunk ./src/allmydata/mutable/publish.py 462
5543+        # For each (peerid, shnum) in self.goal, we make a
5544+        # write proxy for that peer. We'll use this to write
5545+        # shares to the peer.
5546+        for key in self.goal:
5547+            peerid, shnum = key
5548+            write_enabler = self._node.get_write_enabler(peerid)
5549+            renew_secret = self._node.get_renewal_secret(peerid)
5550+            cancel_secret = self._node.get_cancel_secret(peerid)
5551+            secrets = (write_enabler, renew_secret, cancel_secret)
5552+
5553+            self.writers[shnum] =  writer_class(shnum,
5554+                                                self.connections[peerid],
5555+                                                self._storage_index,
5556+                                                secrets,
5557+                                                self._new_seqnum,
5558+                                                self.required_shares,
5559+                                                self.total_shares,
5560+                                                self.segment_size,
5561+                                                self.datalength)
5562+            self.writers[shnum].peerid = peerid
5563+            if (peerid, shnum) in self._servermap.servermap:
5564+                old_versionid, old_timestamp = self._servermap.servermap[key]
5565+                (old_seqnum, old_root_hash, old_salt, old_segsize,
5566+                 old_datalength, old_k, old_N, old_prefix,
5567+                 old_offsets_tuple) = old_versionid
5568+                self.writers[shnum].set_checkstring(old_seqnum,
5569+                                                    old_root_hash,
5570+                                                    old_salt)
5571+            elif (peerid, shnum) in self.bad_share_checkstrings:
5572+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
5573+                self.writers[shnum].set_checkstring(old_checkstring)
5574+
5575+        # Our remote shares will not have a complete checkstring until
5576+        # after we are done writing share data and have started to write
5577+        # blocks. In the meantime, we need to know what to look for when
5578+        # writing, so that we can detect UncoordinatedWriteErrors.
5579+        self._checkstring = self.writers.values()[0].get_checkstring()
5580+
5581+        # Now, we start pushing shares.
5582         self._status.timings["setup"] = time.time() - self._started
5583hunk ./src/allmydata/mutable/publish.py 502
5584-        d = self._encrypt_and_encode()
5585-        d.addCallback(self._generate_shares)
5586-        def _start_pushing(res):
5587-            self._started_pushing = time.time()
5588-            return res
5589-        d.addCallback(_start_pushing)
5590-        d.addCallback(self.loop) # trigger delivery
5591-        d.addErrback(self._fatal_error)
5592+        # First, we encrypt, encode, and publish the shares that we need
5593+        # to encrypt, encode, and publish.
5594+
5595+        # This will eventually hold the block hash chain for each share
5596+        # that we publish. We define it this way so that empty publishes
5597+        # will still have something to write to the remote slot.
5598+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
5599+        for i in xrange(self.total_shares):
5600+            blocks = self.blockhashes[i]
5601+            for j in xrange(self.num_segments):
5602+                blocks.append(None)
5603+        self.sharehash_leaves = None # eventually [sharehashes]
5604+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
5605+                              # validate the share]
5606+
5607+        d = defer.succeed(None)
5608+        self.log("Starting push")
5609+
5610+        self._state = PUSHING_BLOCKS_STATE
5611+        self._push()
5612 
5613         return self.done_deferred
5614 
5615hunk ./src/allmydata/mutable/publish.py 525
5616-    def setup_encoding_parameters(self):
5617-        segment_size = len(self.newdata)
5618+
5619+    def _update_status(self):
5620+        self._status.set_status("Sending Shares: %d placed out of %d, "
5621+                                "%d messages outstanding" %
5622+                                (len(self.placed),
5623+                                 len(self.goal),
5624+                                 len(self.outstanding)))
5625+        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
5626+
5627+
5628+    def setup_encoding_parameters(self, offset=0):
5629+        if self._version == MDMF_VERSION:
5630+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
5631+        else:
5632+            segment_size = self.datalength # SDMF is only one segment
5633         # this must be a multiple of self.required_shares
5634         segment_size = mathutil.next_multiple(segment_size,
5635                                               self.required_shares)
5636hunk ./src/allmydata/mutable/publish.py 544
5637         self.segment_size = segment_size
5638+
5639+        # Calculate the starting segment for the upload.
5640         if segment_size:
5641hunk ./src/allmydata/mutable/publish.py 547
5642-            self.num_segments = mathutil.div_ceil(len(self.newdata),
5643+            self.num_segments = mathutil.div_ceil(self.datalength,
5644                                                   segment_size)
5645hunk ./src/allmydata/mutable/publish.py 549
5646+            self.starting_segment = mathutil.div_ceil(offset,
5647+                                                      segment_size)
5648+            self.starting_segment -= 1
5649+            if offset == 0:
5650+                self.starting_segment = 0
5651+
5652         else:
5653             self.num_segments = 0
5654hunk ./src/allmydata/mutable/publish.py 557
5655-        assert self.num_segments in [0, 1,] # SDMF restrictions
5656+            self.starting_segment = 0
5657+
5658+
5659+        self.log("building encoding parameters for file")
5660+        self.log("got segsize %d" % self.segment_size)
5661+        self.log("got %d segments" % self.num_segments)
5662+
5663+        if self._version == SDMF_VERSION:
5664+            assert self.num_segments in (0, 1) # SDMF
5665+        # calculate the tail segment size.
5666+
5667+        if segment_size and self.datalength:
5668+            self.tail_segment_size = self.datalength % segment_size
5669+            self.log("got tail segment size %d" % self.tail_segment_size)
5670+        else:
5671+            self.tail_segment_size = 0
5672+
5673+        if self.tail_segment_size == 0 and segment_size:
5674+            # The tail segment is the same size as the other segments.
5675+            self.tail_segment_size = segment_size
5676+
5677+        # Make FEC encoders
5678+        fec = codec.CRSEncoder()
5679+        fec.set_params(self.segment_size,
5680+                       self.required_shares, self.total_shares)
5681+        self.piece_size = fec.get_block_size()
5682+        self.fec = fec
5683+
5684+        if self.tail_segment_size == self.segment_size:
5685+            self.tail_fec = self.fec
5686+        else:
5687+            tail_fec = codec.CRSEncoder()
5688+            tail_fec.set_params(self.tail_segment_size,
5689+                                self.required_shares,
5690+                                self.total_shares)
5691+            self.tail_fec = tail_fec
5692+
5693+        self._current_segment = self.starting_segment
5694+        self.end_segment = self.num_segments - 1
5695+        # Now figure out where the last segment should be.
5696+        if self.data.get_size() != self.datalength:
5697+            end = self.data.get_size()
5698+            self.end_segment = mathutil.div_ceil(end,
5699+                                                 segment_size)
5700+            self.end_segment -= 1
5701+        self.log("got start segment %d" % self.starting_segment)
5702+        self.log("got end segment %d" % self.end_segment)
5703+
5704+
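A worked example of the segment bookkeeping above; for simplicity this ignores the round-up of segment_size to a multiple of required_shares, and div_ceil is as in allmydata.util.mathutil:

    def div_ceil(n, d):
        return (n + d - 1) // d

    KiB = 1024
    segment_size = 128 * KiB
    datalength = 300 * KiB

    num_segments = div_ceil(datalength, segment_size)   # 3 segments
    tail_segment_size = datalength % segment_size
    if tail_segment_size == 0:
        tail_segment_size = segment_size

    assert num_segments == 3
    assert tail_segment_size == 44 * KiB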
5705+    def _push(self, ignored=None):
5706+        """
5707+        I manage state transitions. In particular, I check that we
5708+        still have enough writers to complete the upload
5709+        successfully.
5710+        """
5711+        # Can we still successfully publish this file?
5712+        # TODO: Keep track of outstanding queries before aborting the
5713+        #       process.
5714+        if len(self.writers) <= self.required_shares or self.surprised:
5715+            return self._failure()
5716+
5717+        # Figure out what we need to do next. Each of these needs to
5718+        # return a deferred so that we don't block execution when this
5719+        # is first called in the upload method.
5720+        if self._state == PUSHING_BLOCKS_STATE:
5721+            return self.push_segment(self._current_segment)
5722+
5723+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
5724+            return self.push_everything_else()
5725+
5726+        # If we make it to this point, we were successful in placing the
5727+        # file.
5728+        return self._done(None)
5729+
5730+
5731+    def push_segment(self, segnum):
5732+        if self.num_segments == 0 and self._version == SDMF_VERSION:
5733+            self._add_dummy_salts()
5734 
5735hunk ./src/allmydata/mutable/publish.py 636
5736-    def _fatal_error(self, f):
5737-        self.log("error during loop", failure=f, level=log.UNUSUAL)
5738-        self._done(f)
5739+        if segnum > self.end_segment:
5740+            # We don't have any more segments to push.
5741+            self._state = PUSHING_EVERYTHING_ELSE_STATE
5742+            return self._push()
5743+
5744+        d = self._encode_segment(segnum)
5745+        d.addCallback(self._push_segment, segnum)
5746+        def _increment_segnum(ign):
5747+            self._current_segment += 1
5748+        # XXX: I don't think we need to do addBoth here -- any errBacks
5749+        # should be handled within push_segment.
5750+        d.addBoth(_increment_segnum)
5751+        d.addBoth(self._turn_barrier)
5752+        return d.addBoth(self._push) # _push expects a Deferred back
5753+
5754+
5755+    def _turn_barrier(self, result):
5756+        """
5757+        I help the publish process avoid the recursion limit issues
5758+        described in #237.
5759+        """
5760+        return fireEventually(result)
5761+
5762+
5763+    def _add_dummy_salts(self):
5764+        """
5765+        SDMF files need a salt even if they're empty, or the signature
5766+        won't make sense. This method adds a dummy salt to each of our
5767+        SDMF writers so that they can write the signature later.
5768+        """
5769+        salt = os.urandom(16)
5770+        assert self._version == SDMF_VERSION
5771+
5772+        for writer in self.writers.itervalues():
5773+            writer.put_salt(salt)
5774+
5775+
5776+    def _encode_segment(self, segnum):
5777+        """
5778+        I encrypt and encode the segment segnum.
5779+        """
5780+        started = time.time()
5781+
5782+        if segnum + 1 == self.num_segments:
5783+            segsize = self.tail_segment_size
5784+        else:
5785+            segsize = self.segment_size
5786+
5787+
5788+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
5789+        data = self.data.read(segsize)
5790+        # XXX: This is dumb. Why return a list?
5791+        data = "".join(data)
5792+
5793+        assert len(data) == segsize, len(data)
5794+
5795+        salt = os.urandom(16)
5796+
5797+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
5798+        self._status.set_status("Encrypting")
5799+        enc = AES(key)
5800+        crypttext = enc.process(data)
5801+        assert len(crypttext) == len(data)
5802+
5803+        now = time.time()
5804+        self._status.timings["encrypt"] = now - started
5805+        started = now
5806+
5807+        # now apply FEC
5808+        if segnum + 1 == self.num_segments:
5809+            fec = self.tail_fec
5810+        else:
5811+            fec = self.fec
5812+
5813+        self._status.set_status("Encoding")
5814+        crypttext_pieces = [None] * self.required_shares
5815+        piece_size = fec.get_block_size()
5816+        for i in range(len(crypttext_pieces)):
5817+            offset = i * piece_size
5818+            piece = crypttext[offset:offset+piece_size]
5819+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
5820+            crypttext_pieces[i] = piece
5821+            assert len(piece) == piece_size
5822+        d = fec.encode(crypttext_pieces)
5823+        def _done_encoding(res):
5824+            elapsed = time.time() - started
5825+            self._status.timings["encode"] = elapsed
5826+            return (res, salt)
5827+        d.addCallback(_done_encoding)
5828+        return d
5829+
5830+
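The padding step in _encode_segment can be exercised standalone; pad_pieces is a hypothetical helper mirroring the loop above (the real encoder is allmydata.codec.CRSEncoder):

    def pad_pieces(crypttext, k, piece_size):
        # split crypttext into k zero-padded pieces for fec.encode()
        pieces = []
        for i in range(k):
            piece = crypttext[i * piece_size:(i + 1) * piece_size]
            pieces.append(piece + "\x00" * (piece_size - len(piece)))
        return pieces

    pieces = pad_pieces("x" * 10, 3, 4)
    assert [len(p) for p in pieces] == [4, 4, 4]
    assert pieces[2] == "xx\x00\x00"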
5831+    def _push_segment(self, encoded_and_salt, segnum):
5832+        """
5833+        I push (data, salt) as segment number segnum.
5834+        """
5835+        results, salt = encoded_and_salt
5836+        shares, shareids = results
5837+        started = time.time()
5838+        self._status.set_status("Pushing segment")
5839+        for i in xrange(len(shares)):
5840+            sharedata = shares[i]
5841+            shareid = shareids[i]
5842+            if self._version == MDMF_VERSION:
5843+                hashed = salt + sharedata
5844+            else:
5845+                hashed = sharedata
5846+            block_hash = hashutil.block_hash(hashed)
5847+            old_hash = self.blockhashes[shareid][segnum]
5848+            self.blockhashes[shareid][segnum] = block_hash
5849+            # find the writer for this share
5850+            writer = self.writers[shareid]
5851+            writer.put_block(sharedata, segnum, salt)
5852+
5853+
5854+    def push_everything_else(self):
5855+        """
5856+        I put everything else associated with a share.
5857+        """
5858+        self._pack_started = time.time()
5859+        self.push_encprivkey()
5860+        self.push_blockhashes()
5861+        self.push_sharehashes()
5862+        self.push_toplevel_hashes_and_signature()
5863+        d = self.finish_publishing()
5864+        def _change_state(ignored):
5865+            self._state = DONE_STATE
5866+        d.addCallback(_change_state)
5867+        d.addCallback(self._push)
5868+        return d
5869+
5870+
5871+    def push_encprivkey(self):
5872+        encprivkey = self._encprivkey
5873+        self._status.set_status("Pushing encrypted private key")
5874+        for writer in self.writers.itervalues():
5875+            writer.put_encprivkey(encprivkey)
5876+
5877+
5878+    def push_blockhashes(self):
5879+        self.sharehash_leaves = [None] * len(self.blockhashes)
5880+        self._status.set_status("Building and pushing block hash tree")
5881+        for shnum, blockhashes in self.blockhashes.iteritems():
5882+            t = hashtree.HashTree(blockhashes)
5883+            self.blockhashes[shnum] = list(t)
5884+            # the block hash tree root is this share's sharehash leaf.
5885+            self.sharehash_leaves[shnum] = t[0]
5886+
5887+            writer = self.writers[shnum]
5888+            writer.put_blockhashes(self.blockhashes[shnum])
5889+
5890+
5891+    def push_sharehashes(self):
5892+        self._status.set_status("Building and pushing share hash chain")
5893+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
5894+        share_hash_chain = {}
5895+        for shnum in xrange(len(self.sharehash_leaves)):
5896+            needed_indices = share_hash_tree.needed_hashes(shnum)
5897+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
5898+                                             for i in needed_indices] )
5899+            writer = self.writers[shnum]
5900+            writer.put_sharehashes(self.sharehashes[shnum])
5901+        self.root_hash = share_hash_tree[0]
5902+
5903+
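In miniature, the two-level structure built by push_blockhashes and push_sharehashes: each share's block hash tree root becomes one leaf of the share hash tree, whose root is then signed. This toy version hashes pairs directly with sha256; the patch uses allmydata.hashtree.HashTree, where index 0 is the root:

    import hashlib

    def h(x):
        return hashlib.sha256(x).digest()

    block_roots = [h(h("share0-block0") + h("share0-block1")),
                   h(h("share1-block0") + h("share1-block1"))]
    root_hash = h(block_roots[0] + block_roots[1])
    assert len(root_hash) == 32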
5904+    def push_toplevel_hashes_and_signature(self):
5905+        # We need to do three things here:
5906+        #   - Push the root hash and salt hash
5907+        #   - Get the checkstring of the resulting layout; sign that.
5908+        #   - Push the signature
5909+        self._status.set_status("Pushing root hashes and signature")
5910+        for shnum in xrange(self.total_shares):
5911+            writer = self.writers[shnum]
5912+            writer.put_root_hash(self.root_hash)
5913+        self._update_checkstring()
5914+        self._make_and_place_signature()
5915+
5916+
5917+    def _update_checkstring(self):
5918+        """
5919+        After putting the root hash, MDMF files will have the
5920+        checkstring written to the storage server. This means that we
5921+        can update our copy of the checkstring so we can detect
5922+        uncoordinated writes. SDMF files will have the same checkstring,
5923+        so we need not do anything.
5924+        """
5925+        self._checkstring = self.writers.values()[0].get_checkstring()
5926+
5927+
5928+    def _make_and_place_signature(self):
5929+        """
5930+        I create and place the signature.
5931+        """
5932+        started = time.time()
5933+        self._status.set_status("Signing prefix")
5934+        signable = self.writers.values()[0].get_signable()
5935+        self.signature = self._privkey.sign(signable)
5936+
5937+        for (shnum, writer) in self.writers.iteritems():
5938+            writer.put_signature(self.signature)
5939+        self._status.timings['sign'] = time.time() - started
5940+
5941+
5942+    def finish_publishing(self):
5943+        # We're almost done -- we just need to put the verification key
5944+        # and the offsets
5945+        started = time.time()
5946+        self._status.set_status("Pushing shares")
5947+        self._started_pushing = started
5948+        ds = []
5949+        verification_key = self._pubkey.serialize()
5950+
5951+
5952+        # TODO: Bad, since we remove from this same dict. We need to
5953+        # make a copy, or just use a non-iterated value.
5954+        for (shnum, writer) in self.writers.iteritems():
5955+            writer.put_verification_key(verification_key)
5956+            d = writer.finish_publishing()
5957+            # Add the (peerid, shnum) tuple to our list of outstanding
5958+            # queries. This gets used by _loop if some of our queries
5959+            # fail to place shares.
5960+            self.outstanding.add((writer.peerid, writer.shnum))
5961+            d.addCallback(self._got_write_answer, writer, started)
5962+            d.addErrback(self._connection_problem, writer)
5963+            ds.append(d)
5964+        self._record_verinfo()
5965+        self._status.timings['pack'] = time.time() - started
5966+        return defer.DeferredList(ds)
5967+
5968+
5969+    def _record_verinfo(self):
5970+        self.versioninfo = self.writers.values()[0].get_verinfo()
5971+
5972+
5973+    def _connection_problem(self, f, writer):
5974+        """
5975+        We ran into a connection problem while working with writer, and
5976+        need to deal with that.
5977+        """
5978+        self.log("found problem: %s" % str(f))
5979+        self._last_failure = f
5980+        del(self.writers[writer.shnum])
5981 
5982hunk ./src/allmydata/mutable/publish.py 879
5983-    def _update_status(self):
5984-        self._status.set_status("Sending Shares: %d placed out of %d, "
5985-                                "%d messages outstanding" %
5986-                                (len(self.placed),
5987-                                 len(self.goal),
5988-                                 len(self.outstanding)))
5989-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
5990 
5991hunk ./src/allmydata/mutable/publish.py 880
5992-    def loop(self, ignored=None):
5993-        self.log("entering loop", level=log.NOISY)
5994-        if not self._running:
5995-            return
5996-
5997-        self.looplimit -= 1
5998-        if self.looplimit <= 0:
5999-            raise LoopLimitExceededError("loop limit exceeded")
6000-
6001-        if self.surprised:
6002-            # don't send out any new shares, just wait for the outstanding
6003-            # ones to be retired.
6004-            self.log("currently surprised, so don't send any new shares",
6005-                     level=log.NOISY)
6006-        else:
6007-            self.update_goal()
6008-            # how far are we from our goal?
6009-            needed = self.goal - self.placed - self.outstanding
6010-            self._update_status()
6011-
6012-            if needed:
6013-                # we need to send out new shares
6014-                self.log(format="need to send %(needed)d new shares",
6015-                         needed=len(needed), level=log.NOISY)
6016-                self._send_shares(needed)
6017-                return
6018-
6019-        if self.outstanding:
6020-            # queries are still pending, keep waiting
6021-            self.log(format="%(outstanding)d queries still outstanding",
6022-                     outstanding=len(self.outstanding),
6023-                     level=log.NOISY)
6024-            return
6025-
6026-        # no queries outstanding, no placements needed: we're done
6027-        self.log("no queries outstanding, no placements needed: done",
6028-                 level=log.OPERATIONAL)
6029-        now = time.time()
6030-        elapsed = now - self._started_pushing
6031-        self._status.timings["push"] = elapsed
6032-        return self._done(None)
6033-
6034     def log_goal(self, goal, message=""):
6035         logmsg = [message]
6036         for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]):
6037hunk ./src/allmydata/mutable/publish.py 961
6038             self.log_goal(self.goal, "after update: ")
6039 
6040 
6041+    def _got_write_answer(self, answer, writer, started):
6042+        if not answer:
6043+            # SDMF writers only pretend to write when callers set their
6044+            # blocks, salts, and so on -- they actually just write once,
6045+            # at the end of the upload process. In fake writes, they
6046+            # return defer.succeed(None). If we see that, we shouldn't
6047+            # bother checking it.
6048+            return
6049 
6050hunk ./src/allmydata/mutable/publish.py 970
6051-    def _encrypt_and_encode(self):
6052-        # this returns a Deferred that fires with a list of (sharedata,
6053-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
6054-        # shares that we care about.
6055-        self.log("_encrypt_and_encode")
6056-
6057-        self._status.set_status("Encrypting")
6058-        started = time.time()
6059-
6060-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
6061-        enc = AES(key)
6062-        crypttext = enc.process(self.newdata)
6063-        assert len(crypttext) == len(self.newdata)
6064+        peerid = writer.peerid
6065+        lp = self.log("_got_write_answer from %s, share %d" %
6066+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
6067 
6068         now = time.time()
6069hunk ./src/allmydata/mutable/publish.py 975
6070-        self._status.timings["encrypt"] = now - started
6071-        started = now
6072-
6073-        # now apply FEC
6074-
6075-        self._status.set_status("Encoding")
6076-        fec = codec.CRSEncoder()
6077-        fec.set_params(self.segment_size,
6078-                       self.required_shares, self.total_shares)
6079-        piece_size = fec.get_block_size()
6080-        crypttext_pieces = [None] * self.required_shares
6081-        for i in range(len(crypttext_pieces)):
6082-            offset = i * piece_size
6083-            piece = crypttext[offset:offset+piece_size]
6084-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
6085-            crypttext_pieces[i] = piece
6086-            assert len(piece) == piece_size
6087-
6088-        d = fec.encode(crypttext_pieces)
6089-        def _done_encoding(res):
6090-            elapsed = time.time() - started
6091-            self._status.timings["encode"] = elapsed
6092-            return res
6093-        d.addCallback(_done_encoding)
6094-        return d
6095-
6096-    def _generate_shares(self, shares_and_shareids):
6097-        # this sets self.shares and self.root_hash
6098-        self.log("_generate_shares")
6099-        self._status.set_status("Generating Shares")
6100-        started = time.time()
6101-
6102-        # we should know these by now
6103-        privkey = self._privkey
6104-        encprivkey = self._encprivkey
6105-        pubkey = self._pubkey
6106-
6107-        (shares, share_ids) = shares_and_shareids
6108-
6109-        assert len(shares) == len(share_ids)
6110-        assert len(shares) == self.total_shares
6111-        all_shares = {}
6112-        block_hash_trees = {}
6113-        share_hash_leaves = [None] * len(shares)
6114-        for i in range(len(shares)):
6115-            share_data = shares[i]
6116-            shnum = share_ids[i]
6117-            all_shares[shnum] = share_data
6118-
6119-            # build the block hash tree. SDMF has only one leaf.
6120-            leaves = [hashutil.block_hash(share_data)]
6121-            t = hashtree.HashTree(leaves)
6122-            block_hash_trees[shnum] = list(t)
6123-            share_hash_leaves[shnum] = t[0]
6124-        for leaf in share_hash_leaves:
6125-            assert leaf is not None
6126-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
6127-        share_hash_chain = {}
6128-        for shnum in range(self.total_shares):
6129-            needed_hashes = share_hash_tree.needed_hashes(shnum)
6130-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
6131-                                              for i in needed_hashes ] )
6132-        root_hash = share_hash_tree[0]
6133-        assert len(root_hash) == 32
6134-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
6135-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
6136-
6137-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
6138-                             self.required_shares, self.total_shares,
6139-                             self.segment_size, len(self.newdata))
6140-
6141-        # now pack the beginning of the share. All shares are the same up
6142-        # to the signature, then they have divergent share hash chains,
6143-        # then completely different block hash trees + salt + share data,
6144-        # then they all share the same encprivkey at the end. The sizes
6145-        # of everything are the same for all shares.
6146-
6147-        sign_started = time.time()
6148-        signature = privkey.sign(prefix)
6149-        self._status.timings["sign"] = time.time() - sign_started
6150-
6151-        verification_key = pubkey.serialize()
6152-
6153-        final_shares = {}
6154-        for shnum in range(self.total_shares):
6155-            final_share = pack_share(prefix,
6156-                                     verification_key,
6157-                                     signature,
6158-                                     share_hash_chain[shnum],
6159-                                     block_hash_trees[shnum],
6160-                                     all_shares[shnum],
6161-                                     encprivkey)
6162-            final_shares[shnum] = final_share
6163-        elapsed = time.time() - started
6164-        self._status.timings["pack"] = elapsed
6165-        self.shares = final_shares
6166-        self.root_hash = root_hash
6167-
6168-        # we also need to build up the version identifier for what we're
6169-        # pushing. Extract the offsets from one of our shares.
6170-        assert final_shares
6171-        offsets = unpack_header(final_shares.values()[0])[-1]
6172-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
6173-        verinfo = (self._new_seqnum, root_hash, self.salt,
6174-                   self.segment_size, len(self.newdata),
6175-                   self.required_shares, self.total_shares,
6176-                   prefix, offsets_tuple)
6177-        self.versioninfo = verinfo
6178-
6179-
6180-
6181-    def _send_shares(self, needed):
6182-        self.log("_send_shares")
6183-
6184-        # we're finally ready to send out our shares. If we encounter any
6185-        # surprises here, it's because somebody else is writing at the same
6186-        # time. (Note: in the future, when we remove the _query_peers() step
6187-        # and instead speculate about [or remember] which shares are where,
6188-        # surprises here are *not* indications of UncoordinatedWriteError,
6189-        # and we'll need to respond to them more gracefully.)
6190-
6191-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
6192-        # organize it by peerid.
6193-
6194-        peermap = DictOfSets()
6195-        for (peerid, shnum) in needed:
6196-            peermap.add(peerid, shnum)
6197-
6198-        # the next thing is to build up a bunch of test vectors. The
6199-        # semantics of Publish are that we perform the operation if the world
6200-        # hasn't changed since the ServerMap was constructed (more or less).
6201-        # For every share we're trying to place, we create a test vector that
6202-        # tests to see if the server*share still corresponds to the
6203-        # map.
6204-
6205-        all_tw_vectors = {} # maps peerid to tw_vectors
6206-        sm = self._servermap.servermap
6207-
6208-        for key in needed:
6209-            (peerid, shnum) = key
6210-
6211-            if key in sm:
6212-                # an old version of that share already exists on the
6213-                # server, according to our servermap. We will create a
6214-                # request that attempts to replace it.
6215-                old_versionid, old_timestamp = sm[key]
6216-                (old_seqnum, old_root_hash, old_salt, old_segsize,
6217-                 old_datalength, old_k, old_N, old_prefix,
6218-                 old_offsets_tuple) = old_versionid
6219-                old_checkstring = pack_checkstring(old_seqnum,
6220-                                                   old_root_hash,
6221-                                                   old_salt)
6222-                testv = (0, len(old_checkstring), "eq", old_checkstring)
6223-
6224-            elif key in self.bad_share_checkstrings:
6225-                old_checkstring = self.bad_share_checkstrings[key]
6226-                testv = (0, len(old_checkstring), "eq", old_checkstring)
6227-
6228-            else:
6229-                # add a testv that requires the share not exist
6230-
6231-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
6232-                # constraints are handled. If the same object is referenced
6233-                # multiple times inside the arguments, foolscap emits a
6234-                # 'reference' token instead of a distinct copy of the
6235-                # argument. The bug is that these 'reference' tokens are not
6236-                # accepted by the inbound constraint code. To work around
6237-                # this, we need to prevent python from interning the
6238-                # (constant) tuple, by creating a new copy of this vector
6239-                # each time.
6240-
6241-                # This bug is fixed in foolscap-0.2.6, and even though this
6242-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
6243-                # supposed to be able to interoperate with older versions of
6244-                # Tahoe which are allowed to use older versions of foolscap,
6245-                # including foolscap-0.2.5 . In addition, I've seen other
6246-                # foolscap problems triggered by 'reference' tokens (see #541
6247-                # for details). So we must keep this workaround in place.
6248-
6249-                #testv = (0, 1, 'eq', "")
6250-                testv = tuple([0, 1, 'eq', ""])
6251-
6252-            testvs = [testv]
6253-            # the write vector is simply the share
6254-            writev = [(0, self.shares[shnum])]
6255-
6256-            if peerid not in all_tw_vectors:
6257-                all_tw_vectors[peerid] = {}
6258-                # maps shnum to (testvs, writevs, new_length)
6259-            assert shnum not in all_tw_vectors[peerid]
6260-
6261-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
6262-
6263-        # we read the checkstring back from each share, however we only use
6264-        # it to detect whether there was a new share that we didn't know
6265-        # about. The success or failure of the write will tell us whether
6266-        # there was a collision or not. If there is a collision, the first
6267-        # thing we'll do is update the servermap, which will find out what
6268-        # happened. We could conceivably reduce a roundtrip by using the
6269-        # readv checkstring to populate the servermap, but really we'd have
6270-        # to read enough data to validate the signatures too, so it wouldn't
6271-        # be an overall win.
6272-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
6273-
6274-        # ok, send the messages!
6275-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
6276-        started = time.time()
6277-        for (peerid, tw_vectors) in all_tw_vectors.items():
6278-
6279-            write_enabler = self._node.get_write_enabler(peerid)
6280-            renew_secret = self._node.get_renewal_secret(peerid)
6281-            cancel_secret = self._node.get_cancel_secret(peerid)
6282-            secrets = (write_enabler, renew_secret, cancel_secret)
6283-            shnums = tw_vectors.keys()
6284-
6285-            for shnum in shnums:
6286-                self.outstanding.add( (peerid, shnum) )
6287+        elapsed = now - started
6288 
6289hunk ./src/allmydata/mutable/publish.py 977
6290-            d = self._do_testreadwrite(peerid, secrets,
6291-                                       tw_vectors, read_vector)
6292-            d.addCallbacks(self._got_write_answer, self._got_write_error,
6293-                           callbackArgs=(peerid, shnums, started),
6294-                           errbackArgs=(peerid, shnums, started))
6295-            # tolerate immediate errback, like with DeadReferenceError
6296-            d.addBoth(fireEventually)
6297-            d.addCallback(self.loop)
6298-            d.addErrback(self._fatal_error)
6299+        self._status.add_per_server_time(peerid, elapsed)
6300 
6301hunk ./src/allmydata/mutable/publish.py 979
6302-        self._update_status()
6303-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
6304+        wrote, read_data = answer
6305 
6306hunk ./src/allmydata/mutable/publish.py 981
6307-    def _do_testreadwrite(self, peerid, secrets,
6308-                          tw_vectors, read_vector):
6309-        storage_index = self._storage_index
6310-        ss = self.connections[peerid]
6311+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
6312 
6313hunk ./src/allmydata/mutable/publish.py 983
6314-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
6315-        d = ss.callRemote("slot_testv_and_readv_and_writev",
6316-                          storage_index,
6317-                          secrets,
6318-                          tw_vectors,
6319-                          read_vector)
6320-        return d
6321+        # We need to remove from surprise_shares any shares that we are
6322+        # knowingly also writing to that peer from other writers.
6323 
6324hunk ./src/allmydata/mutable/publish.py 986
6325-    def _got_write_answer(self, answer, peerid, shnums, started):
6326-        lp = self.log("_got_write_answer from %s" %
6327-                      idlib.shortnodeid_b2a(peerid))
6328-        for shnum in shnums:
6329-            self.outstanding.discard( (peerid, shnum) )
6330+        # TODO: Precompute this.
6331+        known_shnums = [x.shnum for x in self.writers.values()
6332+                        if x.peerid == peerid]
6333+        surprise_shares -= set(known_shnums)
6334+        self.log("found the following surprise shares: %s" %
6335+                 str(surprise_shares))
6336 
6337hunk ./src/allmydata/mutable/publish.py 993
6338-        now = time.time()
6339-        elapsed = now - started
6340-        self._status.add_per_server_time(peerid, elapsed)
6341-
6342-        wrote, read_data = answer
6343-
6344-        surprise_shares = set(read_data.keys()) - set(shnums)
6345+        # Now surprise_shares contains all of the shares that we did not
6346+        # expect to be there.
6347 
6348         surprised = False
6349         for shnum in surprise_shares:
6350hunk ./src/allmydata/mutable/publish.py 1000
6351             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
6352             checkstring = read_data[shnum][0]
6353-            their_version_info = unpack_checkstring(checkstring)
6354-            if their_version_info == self._new_version_info:
6355+            # What we want to do here is to see if their (seqnum,
6356+            # roothash, salt) is the same as our (seqnum, roothash,
6357+            # salt), or the equivalent for MDMF. The best way to do this
6358+            # is to store a packed representation of our checkstring
6359+            # somewhere, then not bother unpacking the other
6360+            # checkstring.
6361+            if checkstring == self._checkstring:
6362                 # they have the right share, somehow
6363 
6364                 if (peerid,shnum) in self.goal:
6365hunk ./src/allmydata/mutable/publish.py 1085
6366             self.log("our testv failed, so the write did not happen",
6367                      parent=lp, level=log.WEIRD, umid="8sc26g")
6368             self.surprised = True
6369-            self.bad_peers.add(peerid) # don't ask them again
6370+            self.bad_peers.add(writer) # don't ask them again
6371             # use the checkstring to add information to the log message
6372             for (shnum,readv) in read_data.items():
6373                 checkstring = readv[0]
6374hunk ./src/allmydata/mutable/publish.py 1107
6375                 # if expected_version==None, then we didn't expect to see a
6376                 # share on that peer, and the 'surprise_shares' clause above
6377                 # will have logged it.
6378-            # self.loop() will take care of finding new homes
6379             return
6380 
6381hunk ./src/allmydata/mutable/publish.py 1109
6382-        for shnum in shnums:
6383-            self.placed.add( (peerid, shnum) )
6384-            # and update the servermap
6385-            self._servermap.add_new_share(peerid, shnum,
6386+        # and update the servermap
6387+        # self.versioninfo is set during the last phase of publishing.
6388+        # If we get there, we know that responses correspond to placed
6389+        # shares, and can safely execute these statements.
6390+        if self.versioninfo:
6391+            self.log("wrote successfully: adding new share to servermap")
6392+            self._servermap.add_new_share(peerid, writer.shnum,
6393                                           self.versioninfo, started)
6394hunk ./src/allmydata/mutable/publish.py 1117
6395-
6396-        # self.loop() will take care of checking to see if we're done
6397+            self.placed.add( (peerid, writer.shnum) )
6398+        self._update_status()
6399+        # the next method in the deferred chain will check to see if
6400+        # we're done and successful.
6401         return
6402 
6403hunk ./src/allmydata/mutable/publish.py 1123
6404-    def _got_write_error(self, f, peerid, shnums, started):
6405-        for shnum in shnums:
6406-            self.outstanding.discard( (peerid, shnum) )
6407-        self.bad_peers.add(peerid)
6408-        if self._first_write_error is None:
6409-            self._first_write_error = f
6410-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
6411-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
6412-                 failure=f,
6413-                 level=log.UNUSUAL)
6414-        # self.loop() will take care of checking to see if we're done
6415-        return
6416-
6417 
6418     def _done(self, res):
6419         if not self._running:
6420hunk ./src/allmydata/mutable/publish.py 1130
6421         self._running = False
6422         now = time.time()
6423         self._status.timings["total"] = now - self._started
6424+
6425+        elapsed = now - self._started_pushing
6426+        self._status.timings['push'] = elapsed
6427+
6428         self._status.set_active(False)
6429hunk ./src/allmydata/mutable/publish.py 1135
6430-        if isinstance(res, failure.Failure):
6431-            self.log("Publish done, with failure", failure=res,
6432-                     level=log.WEIRD, umid="nRsR9Q")
6433-            self._status.set_status("Failed")
6434-        elif self.surprised:
6435-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
6436-            self._status.set_status("UncoordinatedWriteError")
6437-            # deliver a failure
6438-            res = failure.Failure(UncoordinatedWriteError())
6439-            # TODO: recovery
6440-        else:
6441-            self.log("Publish done, success")
6442-            self._status.set_status("Finished")
6443-            self._status.set_progress(1.0)
6444+        self.log("Publish done, success")
6445+        self._status.set_status("Finished")
6446+        self._status.set_progress(1.0)
6447         eventually(self.done_deferred.callback, res)
6448 
6449hunk ./src/allmydata/mutable/publish.py 1140
6450+    def _failure(self):
6451+
6452+        if not self.surprised:
6453+            # We ran out of servers
6454+            self.log("Publish ran out of good servers, "
6455+                     "last failure was: %s" % str(self._last_failure))
6456+            e = NotEnoughServersError("Ran out of non-bad servers, "
6457+                                      "last failure was %s" %
6458+                                      str(self._last_failure))
6459+        else:
6460+            # We ran into shares that we didn't recognize, which means
6461+            # that we need to return an UncoordinatedWriteError.
6462+            self.log("Publish failed with UncoordinatedWriteError")
6463+            e = UncoordinatedWriteError()
6464+        f = failure.Failure(e)
6465+        eventually(self.done_deferred.callback, f)
6466+
6467+
6468+class MutableFileHandle:
6469+    """
6470+    I am a mutable uploadable built around a filehandle-like object,
6471+    usually either a StringIO instance or a handle to an actual file.
6472+    """
6473+    implements(IMutableUploadable)
6474+
6475+    def __init__(self, filehandle):
6476+        # The filehandle is defined as a generally file-like object that
6477+        # has these two methods. We don't care beyond that.
6478+        assert hasattr(filehandle, "read")
6479+        assert hasattr(filehandle, "close")
6480+
6481+        self._filehandle = filehandle
6482+        # We must start reading at the beginning of the file, or we risk
6483+        # encountering errors when the data read does not match the size
6484+        # reported to the uploader.
6485+        self._filehandle.seek(0)
6486+
6487+        # We have not yet read anything, so our position is 0.
6488+        self._marker = 0
6489+
6490+
6491+    def get_size(self):
6492+        """
6493+        I return the amount of data in my filehandle.
6494+        """
6495+        if not hasattr(self, "_size"):
6496+            old_position = self._filehandle.tell()
6497+            # Seek to the end of the file by seeking 0 bytes from the
6498+            # file's end
6499+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
6500+            self._size = self._filehandle.tell()
6501+            # Restore the previous position, in case this was called
6502+            # after a read.
6503+            self._filehandle.seek(old_position)
6504+            assert self._filehandle.tell() == old_position
6505+
6506+        assert hasattr(self, "_size")
6507+        return self._size
6508+
6509+
6510+    def pos(self):
6511+        """
6512+        I return the position of my read marker -- i.e., how much data I
6513+        have already read and returned to callers.
6514+        """
6515+        return self._marker
6516+
6517+
6518+    def read(self, length):
6519+        """
6520+        I return some data (up to length bytes) from my filehandle.
6521+
6522+        In most cases, I return length bytes, but sometimes I won't --
6523+        for example, if I am asked to read beyond the end of a file, or
6524+        an error occurs.
6525+        """
6526+        results = self._filehandle.read(length)
6527+        self._marker += len(results)
6528+        return [results]
6529+
6530+
6531+    def close(self):
6532+        """
6533+        I close the underlying filehandle. Any further operations on the
6534+        filehandle fail at this point.
6535+        """
6536+        self._filehandle.close()
6537+
6538+
6539+class MutableData(MutableFileHandle):
6540+    """
6541+    I am a mutable uploadable built around a string, which I wrap in a
6542+    StringIO and treat as a filehandle.
6543+    """
6544+
6545+    def __init__(self, s):
6546+        # Take a string and return a file-like uploadable.
6547+        assert isinstance(s, str)
6548+
6549+        MutableFileHandle.__init__(self, StringIO(s))
6550+
6551+
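A quick usage sketch of these two uploadables (illustrative only, not
part of the patch; it assumes the patched tree is importable):

    data = MutableData("abcdefghij")
    assert data.get_size() == 10          # computed via seek/tell, then cached
    first = "".join(data.read(4))         # read() returns a list of strings
    assert first == "abcd" and data.pos() == 4
    rest = "".join(data.read(100))        # short read past EOF is allowed
    assert rest == "efghij"
    data.close()                          # closes the underlying StringIO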
6552+class TransformingUploadable:
6553+    """
6554+    I am an IMutableUploadable that wraps another IMutableUploadable,
6555+    and some segments that are already on the grid. When I am called to
6556+    read, I handle merging of boundary segments.
6557+    """
6558+    implements(IMutableUploadable)
6559+
6560+
6561+    def __init__(self, data, offset, segment_size, start, end):
6562+        assert IMutableUploadable.providedBy(data)
6563+
6564+        self._newdata = data
6565+        self._offset = offset
6566+        self._segment_size = segment_size
6567+        self._start = start
6568+        self._end = end
6569+
6570+        self._read_marker = 0
6571+
6572+        self._first_segment_offset = offset % segment_size
6573+
6574+        num = self.log("TransformingUploadable: starting", parent=None)
6575+        self._log_number = num
6576+        self.log("got fso: %d" % self._first_segment_offset)
6577+        self.log("got offset: %d" % self._offset)
6578+
6579+
6580+    def log(self, *args, **kwargs):
6581+        if 'parent' not in kwargs:
6582+            kwargs['parent'] = self._log_number
6583+        if "facility" not in kwargs:
6584+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
6585+        return log.msg(*args, **kwargs)
6586+
6587+
6588+    def get_size(self):
6589+        return self._offset + self._newdata.get_size()
6590+
6591+
6592+    def read(self, length):
6593+        # We can get data from 3 sources here.
6594+        #   1. The first of the segments provided to us.
6595+        #   2. The data that we're replacing things with.
6596+        #   3. The last of the segments provided to us.
6597+
6598+        # First: do we owe data from source 1 (the old start segment)?
6599+        self.log("reading %d bytes" % length)
6600+
6601+        old_start_data = ""
6602+        old_data_length = self._first_segment_offset - self._read_marker
6603+        if old_data_length > 0:
6604+            if old_data_length > length:
6605+                old_data_length = length
6606+            self.log("returning %d bytes of old start data" % old_data_length)
6607+
6608+            old_data_end = old_data_length + self._read_marker
6609+            old_start_data = self._start[self._read_marker:old_data_end]
6610+            length -= old_data_length
6611+        else:
6612+            # otherwise calculations later get screwed up.
6613+            old_data_length = 0
6614+
6615+        # Is there enough new data to satisfy this read? If not, we need
6616+        # to pad the end of the data with data from our last segment.
6617+        old_end_length = length - \
6618+            (self._newdata.get_size() - self._newdata.pos())
6619+        old_end_data = ""
6620+        if old_end_length > 0:
6621+            self.log("reading %d bytes of old end data" % old_end_length)
6622+
6623+            # TODO: We're not explicitly checking for tail segment size
6624+            # here. Is that a problem?
6625+            old_data_offset = (length - old_end_length + \
6626+                               old_data_length) % self._segment_size
6627+            self.log("reading at offset %d" % old_data_offset)
6628+            old_end = old_data_offset + old_end_length
6629+            old_end_data = self._end[old_data_offset:old_end]
6630+            length -= old_end_length
6631+            assert length == self._newdata.get_size() - self._newdata.pos()
6632+
6633+        self.log("reading %d bytes of new data" % length)
6634+        new_data = self._newdata.read(length)
6635+        new_data = "".join(new_data)
6636+
6637+        self._read_marker += len(old_start_data + new_data + old_end_data)
6638+
6639+        return old_start_data + new_data + old_end_data
6640 
6641hunk ./src/allmydata/mutable/publish.py 1331
6642+    def close(self):
6643+        pass
6644}
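To make TransformingUploadable's three-source read concrete, here is a
hedged worked example (invented sizes and strings; it has to run inside
the patched tree, since the constructor uses the tahoe logger):

    # segment_size=10; replace 8 bytes at offset=23, which spans old
    # segments 2 and 3 (handed in as start/end by the update code).
    start = "ABCDEFGHIJ"                 # old segment 2 (bytes 20..29)
    end = "KLMNOPQRST"                   # old segment 3 (bytes 30..39)
    t = TransformingUploadable(MutableData("12345678"), 23, 10, start, end)
    out = t.read(20)
    # 3 leading old bytes + 8 new bytes + 9 trailing old bytes:
    assert out == "ABC" + "12345678" + "LMNOPQRST"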
6645[mutable/retrieve.py: Modify the retrieval process to support MDMF
6646Kevan Carstensen <kevan@isnotajoke.com>**20100811233125
6647 Ignore-this: bb5f95e1d0e8bb734d43d5ed1550ce
6648 
6649 The logic behind a mutable file download had to be adapted to work with
6650 segmented mutable files; this patch performs those adaptations. It also
6651 exposes some decoding and decrypting functionality to make partial-file
6652 updates a little easier, and supports efficient random-access downloads
6653 of parts of an MDMF file.
6654] {
6655hunk ./src/allmydata/mutable/retrieve.py 7
6656 from zope.interface import implements
6657 from twisted.internet import defer
6658 from twisted.python import failure
6659+from twisted.internet.interfaces import IPushProducer, IConsumer
6660 from foolscap.api import DeadReferenceError, eventually, fireEventually
6661hunk ./src/allmydata/mutable/retrieve.py 9
6662-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
6663-from allmydata.util import hashutil, idlib, log
6664+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
6665+                                 MDMF_VERSION, SDMF_VERSION
6666+from allmydata.util import hashutil, idlib, log, mathutil
6667 from allmydata import hashtree, codec
6668 from allmydata.storage.server import si_b2a
6669 from pycryptopp.cipher.aes import AES
6670hunk ./src/allmydata/mutable/retrieve.py 18
6671 from pycryptopp.publickey import rsa
6672 
6673 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
6674-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
6675+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
6676+                                     MDMFSlotReadProxy
6677 
6678 class RetrieveStatus:
6679     implements(IRetrieveStatus)
6680hunk ./src/allmydata/mutable/retrieve.py 86
6681     # times, and each will have a separate response chain. However the
6682     # Retrieve object will remain tied to a specific version of the file, and
6683     # will use a single ServerMap instance.
6684+    implements(IPushProducer)
6685 
6686hunk ./src/allmydata/mutable/retrieve.py 88
6687-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
6688+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
6689+                 verify=False):
6690         self._node = filenode
6691         assert self._node.get_pubkey()
6692         self._storage_index = filenode.get_storage_index()
6693hunk ./src/allmydata/mutable/retrieve.py 107
6694         self.verinfo = verinfo
6695         # during repair, we may be called upon to grab the private key, since
6696         # it wasn't picked up during a verify=False checker run, and we'll
6697-        # need it for repair to generate the a new version.
6698-        self._need_privkey = fetch_privkey
6699-        if self._node.get_privkey():
6700+        # need it for repair to generate a new version.
6701+        self._need_privkey = fetch_privkey or verify
6702+        if self._node.get_privkey() and not verify:
6703             self._need_privkey = False
6704 
6705hunk ./src/allmydata/mutable/retrieve.py 112
6706+        if self._need_privkey:
6707+            # TODO: Evaluate the need for this. We'll use it if we want
6708+            # to limit how many queries are on the wire for the privkey
6709+            # at once.
6710+            self._privkey_query_markers = [] # one Marker for each time we've
6711+                                             # tried to get the privkey.
6712+
6713+        # verify means that we are using the downloader logic to verify all
6714+        # of our shares. This tells the downloader a few things.
6715+        #
6716+        # 1. We need to download all of the shares.
6717+        # 2. We don't need to decode or decrypt the shares, since our
6718+        #    caller doesn't care about the plaintext, only the
6719+        #    information about which shares are or are not valid.
6720+        # 3. When we are validating readers, we need to validate the
6721+        #    signature on the prefix. Do we? We already do this in the
6722+        #    servermap update?
6723+        self._verify = False
6724+        if verify:
6725+            self._verify = True
6726+
6727         self._status = RetrieveStatus()
6728         self._status.set_storage_index(self._storage_index)
6729         self._status.set_helper(False)
6730hunk ./src/allmydata/mutable/retrieve.py 142
6731          offsets_tuple) = self.verinfo
6732         self._status.set_size(datalength)
6733         self._status.set_encoding(k, N)
6734+        self.readers = {}
6735+        self._paused = False
6736+        self._pause_deferred = None
6737+        self._offset = None
6738+        self._read_length = None
6739+        self.log("got seqnum %d" % self.verinfo[0])
6740+
6741 
6742     def get_status(self):
6743         return self._status
6744hunk ./src/allmydata/mutable/retrieve.py 160
6745             kwargs["facility"] = "tahoe.mutable.retrieve"
6746         return log.msg(*args, **kwargs)
6747 
6748-    def download(self):
6749+
6750+    ###################
6751+    # IPushProducer
6752+
6753+    def pauseProducing(self):
6754+        """
6755+        I am called by my download target if we have produced too much
6756+        data for it to handle. I make the downloader stop producing new
6757+        data until my resumeProducing method is called.
6758+        """
6759+        if self._paused:
6760+            return
6761+
6762+        self._old_status = self._status.get_status()
6763+        self._status.set_status("Paused")
6764+
6765+        # fired when the download is unpaused.
6766+        self._pause_deferred = defer.Deferred()
6767+        self._paused = True
6768+
6769+
6770+    def resumeProducing(self):
6771+        """
6772+        I am called by my download target once it is ready to begin
6773+        receiving data again.
6774+        """
6775+        if not self._paused:
6776+            return
6777+
6778+        self._paused = False
6779+        p = self._pause_deferred
6780+        self._pause_deferred = None
6781+        self._status.set_status(self._old_status)
6782+
6783+        eventually(p.callback, None)
6784+
6785+
6786+    def _check_for_paused(self, res):
6787+        """
6788+        I am called just before a write to the consumer. I return a
6789+        Deferred that eventually fires with the data that is to be
6790+        written to the consumer. If the download has not been paused,
6791+        the Deferred fires immediately. Otherwise, the Deferred fires
6792+        when the downloader is unpaused.
6793+        """
6794+        if self._paused:
6795+            d = defer.Deferred()
6796+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
6797+            return d
6798+        return defer.succeed(res)
6799+
6800+
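The pause machinery above is a deferred gate; a minimal standalone
sketch of the same pattern (toy names, not part of the patch):

    from twisted.internet import defer

    class Gate:
        def __init__(self):
            self._paused = False
            self._pause_deferred = None
        def pause(self):
            if not self._paused:
                self._paused = True
                self._pause_deferred = defer.Deferred()
        def resume(self):
            if self._paused:
                self._paused = False
                p, self._pause_deferred = self._pause_deferred, None
                p.callback(None)
        def check(self, res):
            # mirrors _check_for_paused: pass res through now, or park it
            # until resume() fires the pause deferred
            if self._paused:
                d = defer.Deferred()
                self._pause_deferred.addCallback(lambda ign: d.callback(res))
                return d
            return defer.succeed(res)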
6801+    def download(self, consumer=None, offset=0, size=None):
6802+        assert IConsumer.providedBy(consumer) or self._verify
6803+
6804+        if consumer:
6805+            self._consumer = consumer
6806+            # we provide IPushProducer, so streaming=True, per
6807+            # IConsumer.
6808+            self._consumer.registerProducer(self, streaming=True)
6809+
6810         self._done_deferred = defer.Deferred()
6811         self._started = time.time()
6812         self._status.set_status("Retrieving Shares")
6813hunk ./src/allmydata/mutable/retrieve.py 225
6814 
6815+        self._offset = offset
6816+        self._read_length = size
6817+
6818         # first, which servers can we use?
6819         versionmap = self.servermap.make_versionmap()
6820         shares = versionmap[self.verinfo]
6821hunk ./src/allmydata/mutable/retrieve.py 235
6822         self.remaining_sharemap = DictOfSets()
6823         for (shnum, peerid, timestamp) in shares:
6824             self.remaining_sharemap.add(shnum, peerid)
6825+            # If the servermap update fetched anything, it fetched at least 1
6826+            # KiB, so we ask for that much.
6827+            # TODO: Change the cache methods to allow us to fetch all of the
6828+            # data that they have, then change this method to do that.
6829+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
6830+                                                               shnum,
6831+                                                               0,
6832+                                                               1000)
6833+            ss = self.servermap.connections[peerid]
6834+            reader = MDMFSlotReadProxy(ss,
6835+                                       self._storage_index,
6836+                                       shnum,
6837+                                       any_cache)
6838+            reader.peerid = peerid
6839+            self.readers[shnum] = reader
6840+
6841 
6842         self.shares = {} # maps shnum to validated blocks
6843hunk ./src/allmydata/mutable/retrieve.py 253
6844+        self._active_readers = [] # list of active readers for this dl.
6845+        self._validated_readers = set() # set of readers that we have
6846+                                        # validated the prefix of
6847+        self._block_hash_trees = {} # shnum => hashtree
6848 
6849         # how many shares do we need?
6850hunk ./src/allmydata/mutable/retrieve.py 259
6851-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
6852+        (seqnum,
6853+         root_hash,
6854+         IV,
6855+         segsize,
6856+         datalength,
6857+         k,
6858+         N,
6859+         prefix,
6860          offsets_tuple) = self.verinfo
6861hunk ./src/allmydata/mutable/retrieve.py 268
6862-        assert len(self.remaining_sharemap) >= k
6863-        # we start with the lowest shnums we have available, since FEC is
6864-        # faster if we're using "primary shares"
6865-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
6866-        for shnum in self.active_shnums:
6867-            # we use an arbitrary peer who has the share. If shares are
6868-            # doubled up (more than one share per peer), we could make this
6869-            # run faster by spreading the load among multiple peers. But the
6870-            # algorithm to do that is more complicated than I want to write
6871-            # right now, and a well-provisioned grid shouldn't have multiple
6872-            # shares per peer.
6873-            peerid = list(self.remaining_sharemap[shnum])[0]
6874-            self.get_data(shnum, peerid)
6875 
6876hunk ./src/allmydata/mutable/retrieve.py 269
6877-        # control flow beyond this point: state machine. Receiving responses
6878-        # from queries is the input. We might send out more queries, or we
6879-        # might produce a result.
6880 
6881hunk ./src/allmydata/mutable/retrieve.py 270
6882+        # We need one share hash tree for the entire file; its leaves
6883+        # are the roots of the block hash trees for the shares that
6884+        # comprise it, and its root is in the verinfo.
6885+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
6886+        self.share_hash_tree.set_hashes({0: root_hash})
6887+
6888+        # This will set up both the segment decoder and the tail segment
6889+        # decoder, as well as a variety of other instance variables that
6890+        # the download process will use.
6891+        self._setup_encoding_parameters()
6892+        assert len(self.remaining_sharemap) >= k
6893+
6894+        self.log("starting download")
6895+        self._paused = False
6896+        self._started_fetching = time.time()
6897+
6898+        self._add_active_peers()
6899+        # The download process beyond this is a state machine.
6900+        # _add_active_peers will select the peers that we want to use
6901+        # for the download, and then attempt to start downloading. After
6902+        # each segment, it will check for doneness, reacting to broken
6903+        # peers and corrupt shares as necessary. If it runs out of good
6904+        # peers before downloading all of the segments, _done_deferred
6905+        # will errback.  Otherwise, it will eventually callback with the
6906+        # contents of the mutable file.
6907         return self._done_deferred
6908 
6909hunk ./src/allmydata/mutable/retrieve.py 297
6910-    def get_data(self, shnum, peerid):
6911-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
6912-                 shnum=shnum,
6913-                 peerid=idlib.shortnodeid_b2a(peerid),
6914-                 level=log.NOISY)
6915-        ss = self.servermap.connections[peerid]
6916-        started = time.time()
6917-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
6918+
6919+    def decode(self, blocks_and_salts, segnum):
6920+        """
6921+        I am a helper method that the mutable file update process uses
6922+        as a shortcut to decode and decrypt the segments that it needs
6923+        to fetch in order to perform a file update. I take in a
6924+        collection of blocks and salts, and pick some of those to make a
6925+        segment with. I return the plaintext associated with that
6926+        segment.
6927+        """
6928+        # shnum => block hash tree. Unused, but _setup_encoding_parameters
6929+        # will want to set this.
6930+        # XXX: Make it so that it won't set this if we're just decoding.
6931+        self._block_hash_trees = {}
6932+        self._setup_encoding_parameters()
6933+        # This is the form expected by decode.
6934+        blocks_and_salts = blocks_and_salts.items()
6935+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
6936+
6937+        d = self._decode_blocks(blocks_and_salts, segnum)
6938+        d.addCallback(self._decrypt_segment)
6939+        return d
6940+
6941+
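A hedged sketch of how the update path might use this shortcut (the
caller name, dict shape, and apply_update are assumptions drawn from
the docstring, not part of the patch):

    # blocks_and_salts: shnum -> (block, salt) for the segment that the
    # updater wants to splice new data into.
    d = r.decode(blocks_and_salts, segnum)
    d.addCallback(lambda plaintext: apply_update(plaintext))  # hypothetical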
6942+    def _setup_encoding_parameters(self):
6943+        """
6944+        I set up the encoding parameters, including k, n, the number
6945+        of segments associated with this file, and the segment decoder.
6946+        """
6947+        (seqnum,
6948+         root_hash,
6949+         IV,
6950+         segsize,
6951+         datalength,
6952+         k,
6953+         n,
6954+         known_prefix,
6955          offsets_tuple) = self.verinfo
6956hunk ./src/allmydata/mutable/retrieve.py 335
6957-        offsets = dict(offsets_tuple)
6958+        self._required_shares = k
6959+        self._total_shares = n
6960+        self._segment_size = segsize
6961+        self._data_length = datalength
6962 
6963hunk ./src/allmydata/mutable/retrieve.py 340
6964-        # we read the checkstring, to make sure that the data we grab is from
6965-        # the right version.
6966-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
6967+        if not IV:
6968+            self._version = MDMF_VERSION
6969+        else:
6970+            self._version = SDMF_VERSION
6971 
6972hunk ./src/allmydata/mutable/retrieve.py 345
6973-        # We also read the data, and the hashes necessary to validate them
6974-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
6975-        # signature or the pubkey, since that was handled during the
6976-        # servermap phase, and we'll be comparing the share hash chain
6977-        # against the roothash that was validated back then.
6978+        if datalength and segsize:
6979+            self._num_segments = mathutil.div_ceil(datalength, segsize)
6980+            self._tail_data_size = datalength % segsize
6981+        else:
6982+            self._num_segments = 0
6983+            self._tail_data_size = 0
6984 
6985hunk ./src/allmydata/mutable/retrieve.py 352
6986-        readv.append( (offsets['share_hash_chain'],
6987-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
6988+        self._segment_decoder = codec.CRSDecoder()
6989+        self._segment_decoder.set_params(segsize, k, n)
6990 
6991hunk ./src/allmydata/mutable/retrieve.py 355
6992-        # if we need the private key (for repair), we also fetch that
6993-        if self._need_privkey:
6994-            readv.append( (offsets['enc_privkey'],
6995-                           offsets['EOF'] - offsets['enc_privkey']) )
6996+        if not self._tail_data_size:
6997+            self._tail_data_size = segsize
6998+
6999+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
7000+                                                         self._required_shares)
7001+        if self._tail_segment_size == self._segment_size:
7002+            self._tail_decoder = self._segment_decoder
7003+        else:
7004+            self._tail_decoder = codec.CRSDecoder()
7005+            self._tail_decoder.set_params(self._tail_segment_size,
7006+                                          self._required_shares,
7007+                                          self._total_shares)
7008 
7009hunk ./src/allmydata/mutable/retrieve.py 368
7010-        m = Marker()
7011-        self._outstanding_queries[m] = (peerid, shnum, started)
7012+        self.log("got encoding parameters: "
7013+                 "k: %d "
7014+                 "n: %d "
7015+                 "%d segments of %d bytes each (%d byte tail segment)" % \
7016+                 (k, n, self._num_segments, self._segment_size,
7017+                  self._tail_segment_size))
7018 
7019hunk ./src/allmydata/mutable/retrieve.py 375
7020-        # ask the cache first
7021-        got_from_cache = False
7022-        datavs = []
7023-        for (offset, length) in readv:
7024-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
7025-                                                            offset, length)
7026-            if data is not None:
7027-                datavs.append(data)
7028-        if len(datavs) == len(readv):
7029-            self.log("got data from cache")
7030-            got_from_cache = True
7031-            d = fireEventually({shnum: datavs})
7032-            # datavs is a dict mapping shnum to a pair of strings
7033+        for i in xrange(self._total_shares):
7034+            # So we don't have to do this later.
7035+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
7036+
7037+        # Our last task is to tell the downloader where to start and
7038+        # where to stop. We use three parameters for that:
7039+        #   - self._start_segment: the segment that we need to start
7040+        #     downloading from.
7041+        #   - self._current_segment: the next segment that we need to
7042+        #     download.
7043+        #   - self._last_segment: The last segment that we were asked to
7044+        #     download.
7045+        #
7046+        #  We say that the download is complete when
7047+        #  self._current_segment > self._last_segment. We use
7048+        #  self._start_segment and self._last_segment to know when to
7049+        #  strip things off of segments, and how much to strip.
7050+        if self._offset:
7051+            self.log("got offset: %d" % self._offset)
7052+            # our start segment is the first segment containing the
7053+            # offset we were given.
7054+            # integer division gets us the segment containing
7055+            # self._offset, even when the offset falls exactly on a
7056+            # segment boundary (div_ceil minus one would back up a
7057+            # full segment in that case).
7058+            start = self._offset // self._segment_size
7059+
7060+            assert start < self._num_segments
7061+            self._start_segment = start
7062+            self.log("got start segment: %d" % self._start_segment)
7063         else:
7064hunk ./src/allmydata/mutable/retrieve.py 406
7065-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
7066-        self.remaining_sharemap.discard(shnum, peerid)
7067+            self._start_segment = 0
7068 
7069hunk ./src/allmydata/mutable/retrieve.py 408
7070-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
7071-        d.addErrback(self._query_failed, m, peerid)
7072-        # errors that aren't handled by _query_failed (and errors caused by
7073-        # _query_failed) get logged, but we still want to check for doneness.
7074-        def _oops(f):
7075-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
7076-                     shnum=shnum,
7077-                     peerid=idlib.shortnodeid_b2a(peerid),
7078-                     failure=f,
7079-                     level=log.WEIRD, umid="W0xnQA")
7080-        d.addErrback(_oops)
7081-        d.addBoth(self._check_for_done)
7082-        # any error during _check_for_done means the download fails. If the
7083-        # download is successful, _check_for_done will fire _done by itself.
7084-        d.addErrback(self._done)
7085-        d.addErrback(log.err)
7086-        return d # purely for testing convenience
7087 
7088hunk ./src/allmydata/mutable/retrieve.py 409
7089-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
7090-        # isolate the callRemote to a separate method, so tests can subclass
7091-        # Publish and override it
7092-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
7093-        return d
7094+        if self._read_length:
7095+            # our end segment is the last segment containing part of the
7096+            # data that we were asked to read.
7097+            self.log("got read length %d" % self._read_length)
7098+            end_data = self._offset + self._read_length
7099+            end = mathutil.div_ceil(end_data,
7100+                                    self._segment_size)
7101+            end -= 1
7102+            assert end < self._num_segments
7103+            self._last_segment = end
7104+            self.log("got end segment: %d" % self._last_segment)
7105+        else:
7106+            self._last_segment = self._num_segments - 1
7107 
7108hunk ./src/allmydata/mutable/retrieve.py 423
7109-    def remove_peer(self, peerid):
7110-        for shnum in list(self.remaining_sharemap.keys()):
7111-            self.remaining_sharemap.discard(shnum, peerid)
7112+        self._current_segment = self._start_segment
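A worked instance of this bookkeeping with invented sizes (the
front-stripping amount is inferred from the comments above):

    # datalength=250000, segsize=131072, k=3:
    #   num_segments      = div_ceil(250000, 131072) = 2
    #   tail_data_size    = 250000 % 131072          = 118928
    #   tail_segment_size = next_multiple(118928, 3) = 118929
    # download(offset=140000, size=5000) then computes:
    #   start_segment = 140000 // 131072             = 1
    #   last_segment  = div_ceil(145000, 131072) - 1 = 1
    # so only segment 1 is fetched, and 8928 bytes (140000 - 131072)
    # are stripped from its front before data reaches the consumer.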
7113 
7114hunk ./src/allmydata/mutable/retrieve.py 425
7115-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
7116-        now = time.time()
7117-        elapsed = now - started
7118-        if not got_from_cache:
7119-            self._status.add_fetch_timing(peerid, elapsed)
7120-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
7121-                 shares=len(datavs),
7122-                 peerid=idlib.shortnodeid_b2a(peerid),
7123-                 level=log.NOISY)
7124-        self._outstanding_queries.pop(marker, None)
7125-        if not self._running:
7126-            return
7127+    def _add_active_peers(self):
7128+        """
7129+        I populate self._active_readers with enough active readers to
7130+        retrieve the contents of this mutable file. I am called before
7131+        downloading starts, and (eventually) after each validation
7132+        error, connection error, or other problem in the download.
7133+        """
7134+        # TODO: It would be cool to investigate other heuristics for
7135+        # reader selection. For instance, the cost (in time the user
7136+        # spends waiting for their file) of selecting a really slow peer
7137+        # that happens to have a primary share is probably more than
7138+        # selecting a really fast peer that doesn't have a primary
7139+        # share. Maybe the servermap could be extended to provide this
7140+        # information; it could keep track of latency information while
7141+        # it gathers more important data, and then this routine could
7142+        # use that to select active readers.
7143+        #
7144+        # (these and other questions would be easier to answer with a
7145+        #  robust, configurable tahoe-lafs simulator, which modeled node
7146+        #  failures, differences in node speed, and other characteristics
7147+        #  that we expect storage servers to have.  You could have
7148+        #  presets for really stable grids (like allmydata.com),
7149+        #  friendnets, make it easy to configure your own settings, and
7150+        #  then simulate the effect of big changes on these use cases
7151+        #  instead of just reasoning about what the effect might be. Out
7152+        #  of scope for MDMF, though.)
7153 
7154hunk ./src/allmydata/mutable/retrieve.py 452
7155-        # note that we only ask for a single share per query, so we only
7156-        # expect a single share back. On the other hand, we use the extra
7157-        # shares if we get them.. seems better than an assert().
7158+        # We need at least self._required_shares readers to download a
7159+        # segment.
7160+        if self._verify:
7161+            needed = self._total_shares
7162+        else:
7163+            needed = self._required_shares - len(self._active_readers)
7164+        # XXX: Why don't format= log messages work here?
7165+        self.log("adding %d peers to the active peers list" % needed)
7166 
7167hunk ./src/allmydata/mutable/retrieve.py 461
7168-        for shnum,datav in datavs.items():
7169-            (prefix, hash_and_data) = datav[:2]
7170-            try:
7171-                self._got_results_one_share(shnum, peerid,
7172-                                            prefix, hash_and_data)
7173-            except CorruptShareError, e:
7174-                # log it and give the other shares a chance to be processed
7175-                f = failure.Failure()
7176-                self.log(format="bad share: %(f_value)s",
7177-                         f_value=str(f.value), failure=f,
7178-                         level=log.WEIRD, umid="7fzWZw")
7179-                self.notify_server_corruption(peerid, shnum, str(e))
7180-                self.remove_peer(peerid)
7181-                self.servermap.mark_bad_share(peerid, shnum, prefix)
7182-                self._bad_shares.add( (peerid, shnum) )
7183-                self._status.problems[peerid] = f
7184-                self._last_failure = f
7185-                pass
7186-            if self._need_privkey and len(datav) > 2:
7187-                lp = None
7188-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
7189-        # all done!
7190+        # We favor lower numbered shares, since FEC is faster with
7191+        # primary shares than with other shares, and lower-numbered
7192+        # shares are more likely to be primary than higher numbered
7193+        # shares.
7194+        active_shnums = set(self.remaining_sharemap.keys())
7195+        # We shouldn't consider adding shares that we already have; doing
7196+        # so would cause problems later.
7197+        active_shnums -= set([reader.shnum for reader in self._active_readers])
7198+        active_shnums = sorted(active_shnums)[:needed]
7199+        if len(active_shnums) < needed and not self._verify:
7200+            # We don't have enough readers to retrieve the file; fail.
7201+            return self._failed()
7202 
7203hunk ./src/allmydata/mutable/retrieve.py 474
7204-    def notify_server_corruption(self, peerid, shnum, reason):
7205-        ss = self.servermap.connections[peerid]
7206-        ss.callRemoteOnly("advise_corrupt_share",
7207-                          "mutable", self._storage_index, shnum, reason)
7208+        for shnum in active_shnums:
7209+            self._active_readers.append(self.readers[shnum])
7210+            self.log("added reader for share %d" % shnum)
7211+        assert len(self._active_readers) >= self._required_shares
7212+        # Conceptually, this is part of the _add_active_peers step. It
7213+        # validates the prefixes of newly added readers to make sure
7214+        # that they match what we are expecting for self.verinfo. If
7215+        # validation is successful, _validate_active_prefixes will call
7216+        # _download_current_segment for us. If validation is
7217+        # unsuccessful, then _validate_active_prefixes will remove the peer and
7218+        # call _add_active_peers again, where we will attempt to rectify
7219+        # the problem by choosing another peer.
7220+        return self._validate_active_prefixes()
7221 
7222hunk ./src/allmydata/mutable/retrieve.py 488
7223-    def _got_results_one_share(self, shnum, peerid,
7224-                               got_prefix, got_hash_and_data):
7225-        self.log("_got_results: got shnum #%d from peerid %s"
7226-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
7227-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7228-         offsets_tuple) = self.verinfo
7229-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
7230-        if got_prefix != prefix:
7231-            msg = "someone wrote to the data since we read the servermap: prefix changed"
7232-            raise UncoordinatedWriteError(msg)
7233-        (share_hash_chain, block_hash_tree,
7234-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
7235 
7236hunk ./src/allmydata/mutable/retrieve.py 489
7237-        assert isinstance(share_data, str)
7238-        # build the block hash tree. SDMF has only one leaf.
7239-        leaves = [hashutil.block_hash(share_data)]
7240-        t = hashtree.HashTree(leaves)
7241-        if list(t) != block_hash_tree:
7242-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
7243-        share_hash_leaf = t[0]
7244-        t2 = hashtree.IncompleteHashTree(N)
7245-        # root_hash was checked by the signature
7246-        t2.set_hashes({0: root_hash})
7247-        try:
7248-            t2.set_hashes(hashes=share_hash_chain,
7249-                          leaves={shnum: share_hash_leaf})
7250-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
7251-                IndexError), e:
7252-            msg = "corrupt hashes: %s" % (e,)
7253-            raise CorruptShareError(peerid, shnum, msg)
7254-        self.log(" data valid! len=%d" % len(share_data))
7255-        # each query comes down to this: placing validated share data into
7256-        # self.shares
7257-        self.shares[shnum] = share_data
7258+    def _validate_active_prefixes(self):
7259+        """
7260+        I check to make sure that the prefixes on the peers that I am
7261+        currently reading from match the prefix that we want to see, as
7262+        given in self.verinfo.
7263 
7264hunk ./src/allmydata/mutable/retrieve.py 495
7265-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
7266+        If I find that all of the active peers have acceptable prefixes,
7267+        I pass control to _download_current_segment, which will use
7268+        those peers to do cool things. If I find that some of the active
7269+        peers have unacceptable prefixes, I will remove them from active
7270+        peers (and from further consideration) and call
7271+        _add_active_peers to attempt to rectify the situation. I keep
7272+        track of which peers I have already validated so that I don't
7273+        need to do so again.
7274+        """
7275+        assert self._active_readers, "No more active readers"
7276 
7277hunk ./src/allmydata/mutable/retrieve.py 506
7278-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
7279-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
7280-        if alleged_writekey != self._node.get_writekey():
7281-            self.log("invalid privkey from %s shnum %d" %
7282-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
7283-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
7284-            return
7285+        ds = []
7286+        new_readers = [r for r in self._active_readers if r not in self._validated_readers]
7287+        self.log('validating %d newly-added active readers' % len(new_readers))
7288 
7289hunk ./src/allmydata/mutable/retrieve.py 510
7290-        # it's good
7291-        self.log("got valid privkey from shnum %d on peerid %s" %
7292-                 (shnum, idlib.shortnodeid_b2a(peerid)),
7293-                 parent=lp)
7294-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
7295-        self._node._populate_encprivkey(enc_privkey)
7296-        self._node._populate_privkey(privkey)
7297-        self._need_privkey = False
7298+        for reader in new_readers:
7299+            # We force a remote read here -- otherwise, we are relying
7300+            # on cached data that we already verified as valid, and we
7301+            # won't detect an uncoordinated write that has occurred
7302+            # since the last servermap update.
7303+            d = reader.get_prefix(force_remote=True)
7304+            d.addCallback(self._try_to_validate_prefix, reader)
7305+            ds.append(d)
7306+        dl = defer.DeferredList(ds, consumeErrors=True)
7307+        def _check_results(results):
7308+            # Each result in results will be of the form (success, msg).
7309+            # We don't care about msg, but success will tell us whether
7310+            # or not the checkstring validated. If it didn't, we need to
7311+            # remove the offending (peer,share) from our active readers,
7312+            # and ensure that active readers is again populated.
7313+            bad_readers = []
7314+            for i, result in enumerate(results):
7315+                if not result[0]:
7316+                    reader = new_readers[i]
7317+                    f = result[1]
7318+                    assert isinstance(f, failure.Failure)
7319 
7320hunk ./src/allmydata/mutable/retrieve.py 532
7321-    def _query_failed(self, f, marker, peerid):
7322-        self.log(format="query to [%(peerid)s] failed",
7323-                 peerid=idlib.shortnodeid_b2a(peerid),
7324-                 level=log.NOISY)
7325-        self._status.problems[peerid] = f
7326-        self._outstanding_queries.pop(marker, None)
7327-        if not self._running:
7328-            return
7329-        self._last_failure = f
7330-        self.remove_peer(peerid)
7331-        level = log.WEIRD
7332-        if f.check(DeadReferenceError):
7333-            level = log.UNUSUAL
7334-        self.log(format="error during query: %(f_value)s",
7335-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
7336+                    self.log("The reader %s failed to "
7337+                             "properly validate: %s" % \
7338+                             (reader, str(f.value)))
7339+                    bad_readers.append((reader, f))
7340+                else:
7341+                    reader = new_readers[i]
7342+                    self.log("the reader %s checks out, so we'll use it" % \
7343+                             reader)
7344+                    self._validated_readers.add(reader)
7345+                    # Each time we validate a reader, we check to see if
7346+                    # we need the private key. If we do, we politely ask
7347+                    # for it and then continue computing. If we find
7348+                    # that we haven't gotten it at the end of
7349+                    # segment decoding, then we'll take more drastic
7350+                    # measures.
7351+                    if self._need_privkey and not self._node.is_readonly():
7352+                        d = reader.get_encprivkey()
7353+                        d.addCallback(self._try_to_validate_privkey, reader)
7354+            if bad_readers:
7355+                # We do them all at once, or else we screw up list indexing.
7356+                for (reader, f) in bad_readers:
7357+                    self._mark_bad_share(reader, f)
7358+                if self._verify:
7359+                    if len(self._active_readers) >= self._required_shares:
7360+                        return self._download_current_segment()
7361+                    else:
7362+                        return self._failed()
7363+                else:
7364+                    return self._add_active_peers()
7365+            else:
7366+                return self._download_current_segment()
7367+            # (Whichever branch we take, the next step asserts that it
7368+            # has enough active readers to fetch shares.)
7369+        dl.addCallback(_check_results)
7370+        return dl
7371 
7372hunk ./src/allmydata/mutable/retrieve.py 568
7373-    def _check_for_done(self, res):
7374-        # exit paths:
7375-        #  return : keep waiting, no new queries
7376-        #  return self._send_more_queries(outstanding) : send some more queries
7377-        #  fire self._done(plaintext) : download successful
7378-        #  raise exception : download fails
7379 
7380hunk ./src/allmydata/mutable/retrieve.py 569
7381-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
7382-                 running=self._running, decoding=self._decoding,
7383-                 level=log.NOISY)
7384-        if not self._running:
7385-            return
7386-        if self._decoding:
7387-            return
7388-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7389+    def _try_to_validate_prefix(self, prefix, reader):
7390+        """
7391+        I check that the prefix returned by a candidate server for
7392+        retrieval matches the prefix that the servermap knows about
7393+        (and, hence, the prefix that was validated earlier). If it does,
7394+        I return True, which means that I approve of the use of the
7395+        candidate server for segment retrieval. If it doesn't, I return
7396+        False, which means that another server must be chosen.
7397+        """
7398+        (seqnum,
7399+         root_hash,
7400+         IV,
7401+         segsize,
7402+         datalength,
7403+         k,
7404+         N,
7405+         known_prefix,
7406          offsets_tuple) = self.verinfo
7407hunk ./src/allmydata/mutable/retrieve.py 587
7408+        if known_prefix != prefix:
7409+            self.log("prefix from share %d doesn't match" % reader.shnum)
7410+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
7411+                                          "indicate an uncoordinated write")
7412+        # Otherwise, we're okay -- no issues.
7413 
7414hunk ./src/allmydata/mutable/retrieve.py 593
7415-        if len(self.shares) < k:
7416-            # we don't have enough shares yet
7417-            return self._maybe_send_more_queries(k)
7418-        if self._need_privkey:
7419-            # we got k shares, but none of them had a valid privkey. TODO:
7420-            # look further. Adding code to do this is a bit complicated, and
7421-            # I want to avoid that complication, and this should be pretty
7422-            # rare (k shares with bitflips in the enc_privkey but not in the
7423-            # data blocks). If we actually do get here, the subsequent repair
7424-            # will fail for lack of a privkey.
7425-            self.log("got k shares but still need_privkey, bummer",
7426-                     level=log.WEIRD, umid="MdRHPA")
7427 
7428hunk ./src/allmydata/mutable/retrieve.py 594
7429-        # we have enough to finish. All the shares have had their hashes
7430-        # checked, so if something fails at this point, we don't know how
7431-        # to fix it, so the download will fail.
7432+    def _remove_reader(self, reader):
7433+        """
7434+        At various points, we will wish to remove a peer from
7435+        consideration and/or use. These include, but are not necessarily
7436+        limited to:
7437 
7438hunk ./src/allmydata/mutable/retrieve.py 600
7439-        self._decoding = True # avoid reentrancy
7440-        self._status.set_status("decoding")
7441-        now = time.time()
7442-        elapsed = now - self._started
7443-        self._status.timings["fetch"] = elapsed
7444+            - A connection error.
7445+            - A mismatched prefix (that is, a prefix that does not match
7446+              our conception of the version information string).
7447+            - A failing block hash, salt hash, or share hash, which can
7448+              indicate disk failure/bit flips, or network trouble.
7449 
7450hunk ./src/allmydata/mutable/retrieve.py 606
7451-        d = defer.maybeDeferred(self._decode)
7452-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
7453-        d.addBoth(self._done)
7454-        return d # purely for test convenience
7455+        This method will do that. I will make sure that the
7456+        (shnum,reader) combination represented by my reader argument is
7457+        not used for anything else during this download. I will not
7458+        advise the reader of any corruption, something that my callers
7459+        may wish to do on their own.
7460+        """
7461+        # TODO: When you're done writing this, see if this is ever
7462+        # actually used for something that _mark_bad_share isn't. I have
7463+        # a feeling that they will be used for very similar things, and
7464+        # that having them both here is just going to be an epic amount
7465+        # of code duplication.
7466+        #
7467+        # (well, okay, not epic, but meaningful)
7468+        self.log("removing reader %s" % reader)
7469+        # Remove the reader from _active_readers
7470+        self._active_readers.remove(reader)
7471+        # TODO: self.readers.remove(reader)?
7472+        for shnum in list(self.remaining_sharemap.keys()):
7473+            self.remaining_sharemap.discard(shnum, reader.peerid)
7474 
7475hunk ./src/allmydata/mutable/retrieve.py 626
7476-    def _maybe_send_more_queries(self, k):
7477-        # we don't have enough shares yet. Should we send out more queries?
7478-        # There are some number of queries outstanding, each for a single
7479-        # share. If we can generate 'needed_shares' additional queries, we do
7480-        # so. If we can't, then we know this file is a goner, and we raise
7481-        # NotEnoughSharesError.
7482-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
7483-                         "outstanding=%(outstanding)d"),
7484-                 have=len(self.shares), k=k,
7485-                 outstanding=len(self._outstanding_queries),
7486-                 level=log.NOISY)
7487 
7488hunk ./src/allmydata/mutable/retrieve.py 627
7489-        remaining_shares = k - len(self.shares)
7490-        needed = remaining_shares - len(self._outstanding_queries)
7491-        if not needed:
7492-            # we have enough queries in flight already
7493+    def _mark_bad_share(self, reader, f):
7494+        """
7495+        I mark the (peerid, shnum) encapsulated by my reader argument as
7496+        a bad share, which means that it will not be used anywhere else.
7497 
7498hunk ./src/allmydata/mutable/retrieve.py 632
7499-            # TODO: but if they've been in flight for a long time, and we
7500-            # have reason to believe that new queries might respond faster
7501-            # (i.e. we've seen other queries come back faster, then consider
7502-            # sending out new queries. This could help with peers which have
7503-            # silently gone away since the servermap was updated, for which
7504-            # we're still waiting for the 15-minute TCP disconnect to happen.
7505-            self.log("enough queries are in flight, no more are needed",
7506-                     level=log.NOISY)
7507-            return
7508+        There are several reasons to want to mark something as a bad
7509+        share. These include:
7510+
7511+            - A connection error to the peer.
7512+            - A mismatched prefix (that is, a prefix that does not match
7513+              our local conception of the version information string).
7514+            - A failing block hash, salt hash, share hash, or other
7515+              integrity check.
7516 
7517hunk ./src/allmydata/mutable/retrieve.py 641
7518-        outstanding_shnums = set([shnum
7519-                                  for (peerid, shnum, started)
7520-                                  in self._outstanding_queries.values()])
7521-        # prefer low-numbered shares, they are more likely to be primary
7522-        available_shnums = sorted(self.remaining_sharemap.keys())
7523-        for shnum in available_shnums:
7524-            if shnum in outstanding_shnums:
7525-                # skip ones that are already in transit
7526-                continue
7527-            if shnum not in self.remaining_sharemap:
7528-                # no servers for that shnum. note that DictOfSets removes
7529-                # empty sets from the dict for us.
7530-                continue
7531-            peerid = list(self.remaining_sharemap[shnum])[0]
7532-            # get_data will remove that peerid from the sharemap, and add the
7533-            # query to self._outstanding_queries
7534-            self._status.set_status("Retrieving More Shares")
7535-            self.get_data(shnum, peerid)
7536-            needed -= 1
7537-            if not needed:
7538+        This method will ensure that readers that we wish to mark bad
7539+        (for these reasons or other reasons) are not used for the rest
7540+        of the download. Additionally, it will attempt to tell the
7541+        remote peer (with no guarantee of success) that its share is
7542+        corrupt.
7543+        """
7544+        self.log("marking share %d on server %s as bad" % \
7545+                 (reader.shnum, reader))
7546+        prefix = self.verinfo[-2]
7547+        self.servermap.mark_bad_share(reader.peerid,
7548+                                      reader.shnum,
7549+                                      prefix)
7550+        self._remove_reader(reader)
7551+        self._bad_shares.add((reader.peerid, reader.shnum, f))
7552+        self._status.problems[reader.peerid] = f
7553+        self._last_failure = f
7554+        self.notify_server_corruption(reader.peerid, reader.shnum,
7555+                                      str(f.value))
7556+
7557+
7558+    def _download_current_segment(self):
7559+        """
7560+        I download, validate, decode, decrypt, and assemble the segment
7561+        that this Retrieve is currently responsible for downloading.
7562+        """
7563+        assert len(self._active_readers) >= self._required_shares
7564+        if self._current_segment <= self._last_segment:
7565+            d = self._process_segment(self._current_segment)
7566+        else:
7567+            d = defer.succeed(None)
7568+        d.addBoth(self._turn_barrier)
7569+        d.addCallback(self._check_for_done)
7570+        return d
7571+
7572+
7573+    def _turn_barrier(self, result):
7574+        """
7575+        I help the download process avoid the recursion limit issues
7576+        discussed in #237.
7577+        """
7578+        return fireEventually(result)
7579+
7580+
7581+    def _process_segment(self, segnum):
7582+        """
7583+        I download, validate, decode, and decrypt one segment of the
7584+        file that this Retrieve is retrieving. This means coordinating
7585+        the process of getting k blocks of that file, validating them,
7586+        assembling them into one segment with the decoder, and then
7587+        decrypting them.
7588+        """
7589+        self.log("processing segment %d" % segnum)
7590+
7591+        # TODO: The old code uses a marker. Should this code do that
7592+        # too? What did the Marker do?
7593+        assert len(self._active_readers) >= self._required_shares
7594+
7595+        # We need to ask each of our active readers for its block and
7596+        # salt. We will then validate those. If validation is
7597+        # successful, we will assemble the results into plaintext.
7598+        ds = []
7599+        for reader in self._active_readers:
7600+            started = time.time()
7601+            d = reader.get_block_and_salt(segnum, queue=True)
7602+            d2 = self._get_needed_hashes(reader, segnum)
7603+            dl = defer.DeferredList([d, d2], consumeErrors=True)
7604+            dl.addCallback(self._validate_block, segnum, reader, started)
7605+            dl.addErrback(self._validation_or_decoding_failed, [reader])
7606+            ds.append(dl)
7607+            reader.flush()
7608+        dl = defer.DeferredList(ds)
7609+        if self._verify:
7610+            dl.addCallback(lambda ignored: "")
7611+            dl.addCallback(self._set_segment)
7612+        else:
7613+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
7614+        return dl
7615+
7616+
7617+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
7618+        """
7619+        I take the results of fetching and validating the blocks from a
7620+        callback chain in another method. If the results are such that
7621+        they tell me that validation and fetching succeeded without
7622+        incident, I will proceed with decoding and decryption.
7623+        Otherwise, I will do nothing.
7624+        """
7625+        self.log("trying to decode and decrypt segment %d" % segnum)
7626+        failures = False
7627+        for block_and_salt in blocks_and_salts:
7628+            if not block_and_salt[0] or block_and_salt[1] is None:
7629+                self.log("some validation operations failed; not proceeding")
7630+                failures = True
7631                 break
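For reference, each entry that the DeferredList hands to
_maybe_decode_and_decrypt_segment is a (success, value) pair, which is what
the [0]/[1] indexing above relies on. A standalone illustration:

    from twisted.internet import defer
    from twisted.python import failure

    ok = defer.succeed(("block", "salt"))
    bad = defer.fail(failure.Failure(ValueError("corrupt")))
    dl = defer.DeferredList([ok, bad], consumeErrors=True)
    def _show(results):
        # results == [(True, ("block", "salt")), (False, <Failure>)]
        for (success, value) in results:
            print success, value
    dl.addCallback(_show)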
7632hunk ./src/allmydata/mutable/retrieve.py 735
7633+        if not failures:
7634+            self.log("everything looks ok, building segment %d" % segnum)
7635+            d = self._decode_blocks(blocks_and_salts, segnum)
7636+            d.addCallback(self._decrypt_segment)
7637+            d.addErrback(self._validation_or_decoding_failed,
7638+                         self._active_readers)
7639+            # check to see whether we've been paused before writing
7640+            # anything.
7641+            d.addCallback(self._check_for_paused)
7642+            d.addCallback(self._set_segment)
7643+            return d
7644+        else:
7645+            return defer.succeed(None)
7646+
7647+
7648+    def _set_segment(self, segment):
7649+        """
7650+        Given a plaintext segment, I register that segment with the
7651+        target that is handling the file download.
7652+        """
7653+        self.log("got plaintext for segment %d" % self._current_segment)
7654+        if self._current_segment == self._start_segment:
7655+            # We're on the first segment. It's possible that we want
7656+            # only some part of the end of this segment, and that we
7657+            # just downloaded the whole thing to get that part. If so,
7658+            # we need to account for that and give the reader just the
7659+            # data that they want.
7660+            n = self._offset % self._segment_size
7661+            self.log("stripping %d bytes off of the first segment" % n)
7662+            self.log("original segment length: %d" % len(segment))
7663+            segment = segment[n:]
7664+            self.log("new segment length: %d" % len(segment))
7665+
7666+        if self._current_segment == self._last_segment and self._read_length is not None:
7667+            # We're on the last segment. It's possible that we only want
7668+            # part of the beginning of this segment, and that we
7669+            # downloaded the whole thing anyway. Make sure to give the
7670+            # caller only the portion of the segment that they want to
7671+            # receive.
7672+            extra = self._read_length
7673+            if self._start_segment != self._last_segment:
7674+                extra -= self._segment_size - \
7675+                            (self._offset % self._segment_size)
7676+            extra %= self._segment_size
7677+            self.log("original segment length: %d" % len(segment))
7678+            segment = segment[:extra]
7679+            self.log("new segment length: %d" % len(segment))
7680+            self.log("only taking %d bytes of the last segment" % extra)
7681+
7682+        if not self._verify:
7683+            self._consumer.write(segment)
7684+        else:
7685+            # we don't care about the plaintext if we are doing a verify.
7686+            segment = None
7687+        self._current_segment += 1
7688 
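The first/last segment trimming above is easier to check with concrete
numbers. A standalone sketch (the segment size, offset, and read length are
made up; the arithmetic mirrors _set_segment):

    segment_size = 4000
    offset = 5000        # caller asked for bytes [5000, 14000)
    read_length = 9000
    start_segment = offset // segment_size                     # 1
    last_segment = (offset + read_length - 1) // segment_size  # 3

    n = offset % segment_size        # 1000 bytes stripped from seg 1
    extra = read_length
    if start_segment != last_segment:
        extra -= segment_size - (offset % segment_size)        # 9000-3000
    extra %= segment_size            # keep 2000 bytes of seg 3

    # (4000-1000) from seg 1 + 4000 from seg 2 + 2000 from seg 3
    assert (segment_size - n) + segment_size + extra == read_length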
7689hunk ./src/allmydata/mutable/retrieve.py 791
7690-        # at this point, we have as many outstanding queries as we can. If
7691-        # needed!=0 then we might not have enough to recover the file.
7692-        if needed:
7693-            format = ("ran out of peers: "
7694-                      "have %(have)d shares (k=%(k)d), "
7695-                      "%(outstanding)d queries in flight, "
7696-                      "need %(need)d more, "
7697-                      "found %(bad)d bad shares")
7698-            args = {"have": len(self.shares),
7699-                    "k": k,
7700-                    "outstanding": len(self._outstanding_queries),
7701-                    "need": needed,
7702-                    "bad": len(self._bad_shares),
7703-                    }
7704-            self.log(format=format,
7705-                     level=log.WEIRD, umid="ezTfjw", **args)
7706-            err = NotEnoughSharesError("%s, last failure: %s" %
7707-                                      (format % args, self._last_failure))
7708-            if self._bad_shares:
7709-                self.log("We found some bad shares this pass. You should "
7710-                         "update the servermap and try again to check "
7711-                         "more peers",
7712-                         level=log.WEIRD, umid="EFkOlA")
7713-                err.servermap = self.servermap
7714-            raise err
7715 
7716hunk ./src/allmydata/mutable/retrieve.py 792
7717+    def _validation_or_decoding_failed(self, f, readers):
7718+        """
7719+        I am called when a block or a salt fails to correctly validate, or when
7720+        the decryption or decoding operation fails for some reason.  I react to
7721+        this failure by notifying the remote server of corruption, and then
7722+        removing the remote peer from further activity.
7723+        """
7724+        assert isinstance(readers, list)
7725+        bad_shnums = [reader.shnum for reader in readers]
7726+
7727+        self.log("validation or decoding failed on share(s) %s, peer(s) %s "
7728+                 ", segment %d: %s" % \
7729+                 (bad_shnums, readers, self._current_segment, str(f)))
7730+        for reader in readers:
7731+            self._mark_bad_share(reader, f)
7732         return
7733 
7734hunk ./src/allmydata/mutable/retrieve.py 809
7735-    def _decode(self):
7736-        started = time.time()
7737-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7738-         offsets_tuple) = self.verinfo
7739 
7740hunk ./src/allmydata/mutable/retrieve.py 810
7741-        # shares_dict is a dict mapping shnum to share data, but the codec
7742-        # wants two lists.
7743-        shareids = []; shares = []
7744-        for shareid, share in self.shares.items():
7745+    def _validate_block(self, results, segnum, reader, started):
7746+        """
7747+        I validate a block from one share on a remote server.
7748+        """
7749+        # Grab the part of the block hash tree that is necessary to
7750+        # validate this block, then generate the block hash root.
7751+        self.log("validating share %d for segment %d" % (reader.shnum,
7752+                                                             segnum))
7753+        self._status.add_fetch_timing(reader.peerid, started)
7754+        self._status.set_status("Valdiating blocks for segment %d" % segnum)
7755+        # Did we fail to fetch either of the things that we were
7756+        # supposed to? Fail if so.
7757+        if not results[0][0] or not results[1][0]:
7758+            # handled by the errback handler.
7759+
7760+            # These all get batched into one query, so the resulting
7761+            # failure should be the same for all of them, so we can just
7762+            # use the first one.
7763+            assert isinstance(results[0][1], failure.Failure)
7764+
7765+            f = results[0][1]
7766+            raise CorruptShareError(reader.peerid,
7767+                                    reader.shnum,
7768+                                    "Connection error: %s" % str(f))
7769+
7770+        block_and_salt, block_and_sharehashes = results
7771+        block, salt = block_and_salt[1]
7772+        blockhashes, sharehashes = block_and_sharehashes[1]
7773+
7774+        blockhashes = dict(enumerate(blockhashes[1]))
7775+        self.log("the reader gave me the following blockhashes: %s" % \
7776+                 blockhashes.keys())
7777+        self.log("the reader gave me the following sharehashes: %s" % \
7778+                 sharehashes[1].keys())
7779+        bht = self._block_hash_trees[reader.shnum]
7780+
7781+        if bht.needed_hashes(segnum, include_leaf=True):
7782+            try:
7783+                bht.set_hashes(blockhashes)
7784+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
7785+                    IndexError), e:
7786+                raise CorruptShareError(reader.peerid,
7787+                                        reader.shnum,
7788+                                        "block hash tree failure: %s" % e)
7789+
7790+        if self._version == MDMF_VERSION:
7791+            blockhash = hashutil.block_hash(salt + block)
7792+        else:
7793+            blockhash = hashutil.block_hash(block)
7794+        # If this works without an error, then validation is
7795+        # successful.
7796+        try:
7797+            bht.set_hashes(leaves={segnum: blockhash})
7798+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
7799+                IndexError), e:
7800+            raise CorruptShareError(reader.peerid,
7801+                                    reader.shnum,
7802+                                    "block hash tree failure: %s" % e)
7803+
7804+        # Reaching this point means that we know that this segment
7805+        # is correct. Now we need to check to see whether the share
7806+        # hash chain is also correct.
7807+        # SDMF wrote share hash chains that didn't contain the
7808+        # leaves, which would be produced from the block hash tree.
7809+        # So we need to validate the block hash tree first. If
7810+        # successful, then bht[0] will contain the root for the
7811+        # shnum, which will be a leaf in the share hash tree, which
7812+        # will allow us to validate the rest of the tree.
7813+        if self.share_hash_tree.needed_hashes(reader.shnum,
7814+                                              include_leaf=True) or \
7815+                                              self._verify:
7816+            try:
7817+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
7818+                                            leaves={reader.shnum: bht[0]})
7819+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
7820+                    IndexError), e:
7821+                raise CorruptShareError(reader.peerid,
7822+                                        reader.shnum,
7823+                                        "corrupt hashes: %s" % e)
7824+
7825+        self.log('share %d is valid for segment %d' % (reader.shnum,
7826+                                                       segnum))
7827+        return {reader.shnum: (block, salt)}
7828+
7829+
7830+    def _get_needed_hashes(self, reader, segnum):
7831+        """
7832+        I get the hashes needed to validate segnum from the reader, then return
7833+        to my caller when this is done.
7834+        """
7835+        bht = self._block_hash_trees[reader.shnum]
7836+        needed = bht.needed_hashes(segnum, include_leaf=True)
7837+        # The root of the block hash tree is also a leaf in the share
7838+        # hash tree. So we don't need to fetch it from the remote
7839+        # server. In the case of files with one segment, this means that
7840+        # we won't fetch any block hash tree from the remote server,
7841+        # since the hash of each share of the file is the entire block
7842+        # hash tree, and is a leaf in the share hash tree. This is fine,
7843+        # since any share corruption will be detected in the share hash
7844+        # tree.
7845+        #needed.discard(0)
7846+        self.log("getting blockhashes for segment %d, share %d: %s" % \
7847+                 (segnum, reader.shnum, str(needed)))
7848+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
7849+        if self.share_hash_tree.needed_hashes(reader.shnum):
7850+            need = self.share_hash_tree.needed_hashes(reader.shnum)
7851+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
7852+                                                                 str(need)))
7853+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
7854+        else:
7855+            d2 = defer.succeed({}) # the logic in the next method
7856+                                   # expects a dict
7857+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
7858+        return dl
7859+
7860+
7861+    def _decode_blocks(self, blocks_and_salts, segnum):
7862+        """
7863+        I take a list of k blocks and salts, and decode that into a
7864+        single encrypted segment.
7865+        """
7866+        d = {}
7867+        # We want to merge our dictionaries to the form
7868+        # {shnum: blocks_and_salts}
7869+        #
7870+        # The dictionaries come from validate block that way, so we just
7871+        # need to merge them.
7872+        for block_and_salt in blocks_and_salts:
7873+            d.update(block_and_salt[1])
7874+
7875+        # All of these blocks should have the same salt; in SDMF, it is
7876+        # the file-wide IV, while in MDMF it is the per-segment salt. In
7877+        # either case, we just need to get one of them and use it.
7878+        #
7879+        # d.items()[0] is like (shnum, (block, salt))
7880+        # d.items()[0][1] is like (block, salt)
7881+        # d.items()[0][1][1] is the salt.
7882+        salt = d.items()[0][1][1]
7883+        # Next, extract just the blocks from the dict. We'll use the
7884+        # salt in the next step.
7885+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
7886+        d2 = dict(share_and_shareids)
7887+        shareids = []
7888+        shares = []
7889+        for shareid, share in d2.items():
7890             shareids.append(shareid)
7891             shares.append(share)
7892 
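To make the d.items()[0][1][1] indexing above concrete: after the merge, d
maps each share number to its (block, salt) pair, and every pair for one
segment carries the same salt. A standalone illustration with fabricated
data:

    d = {0: ("block-0", "seg-salt"),
         1: ("block-1", "seg-salt")}
    salt = d.items()[0][1][1]                       # "seg-salt"
    share_and_shareids = [(k, v[0]) for k, v in d.items()]
    assert sorted(share_and_shareids) == [(0, "block-0"), (1, "block-1")]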
7893hunk ./src/allmydata/mutable/retrieve.py 958
7894-        assert len(shareids) >= k, len(shareids)
7895+        self._status.set_status("Decoding")
7896+        started = time.time()
7897+        assert len(shareids) >= self._required_shares, len(shareids)
7898         # zfec really doesn't want extra shares
7899hunk ./src/allmydata/mutable/retrieve.py 962
7900-        shareids = shareids[:k]
7901-        shares = shares[:k]
7902-
7903-        fec = codec.CRSDecoder()
7904-        fec.set_params(segsize, k, N)
7905-
7906-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
7907-        self.log("about to decode, shareids=%s" % (shareids,))
7908-        d = defer.maybeDeferred(fec.decode, shares, shareids)
7909-        def _done(buffers):
7910-            self._status.timings["decode"] = time.time() - started
7911-            self.log(" decode done, %d buffers" % len(buffers))
7912+        shareids = shareids[:self._required_shares]
7913+        shares = shares[:self._required_shares]
7914+        self.log("decoding segment %d" % segnum)
7915+        if segnum == self._num_segments - 1:
7916+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
7917+        else:
7918+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
7919+        def _process(buffers):
7920             segment = "".join(buffers)
7921hunk ./src/allmydata/mutable/retrieve.py 971
7922+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
7923+                     segnum=segnum,
7924+                     numsegs=self._num_segments,
7925+                     level=log.NOISY)
7926             self.log(" joined length %d, datalength %d" %
7927hunk ./src/allmydata/mutable/retrieve.py 976
7928-                     (len(segment), datalength))
7929-            segment = segment[:datalength]
7930+                     (len(segment), self._data_length))
7931+            if segnum == self._num_segments - 1:
7932+                size_to_use = self._tail_data_size
7933+            else:
7934+                size_to_use = self._segment_size
7935+            segment = segment[:size_to_use]
7936             self.log(" segment len=%d" % len(segment))
7937hunk ./src/allmydata/mutable/retrieve.py 983
7938-            return segment
7939-        def _err(f):
7940-            self.log(" decode failed: %s" % f)
7941-            return f
7942-        d.addCallback(_done)
7943-        d.addErrback(_err)
7944+            self._status.timings.setdefault("decode", 0)
7945+            self._status.timings['decode'] += time.time() - started
7946+            return segment, salt
7947+        d.addCallback(_process)
7948         return d
7949 
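The _segment_decoder and _tail_decoder used above are prepared elsewhere in
this patch; the call shape mirrors the old code removed in this hunk. A
hedged sketch of that decoder usage (segsize, k, N, shares, and shareids are
assumed to come from the surrounding context, as before):

    from allmydata import codec
    from twisted.internet import defer

    fec = codec.CRSDecoder()
    fec.set_params(segsize, k, N)
    # zfec wants exactly k shares, with their share ids:
    d = defer.maybeDeferred(fec.decode, shares[:k], shareids[:k])
    d.addCallback(lambda buffers: "".join(buffers))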
7950hunk ./src/allmydata/mutable/retrieve.py 989
7951-    def _decrypt(self, crypttext, IV, readkey):
7952+
7953+    def _decrypt_segment(self, segment_and_salt):
7954+        """
7955+        I take a single segment and its salt, and decrypt it. I return
7956+        the plaintext of the segment that is in my argument.
7957+        """
7958+        segment, salt = segment_and_salt
7959         self._status.set_status("decrypting")
7960hunk ./src/allmydata/mutable/retrieve.py 997
7961+        self.log("decrypting segment %d" % self._current_segment)
7962         started = time.time()
7963hunk ./src/allmydata/mutable/retrieve.py 999
7964-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
7965+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
7966         decryptor = AES(key)
7967hunk ./src/allmydata/mutable/retrieve.py 1001
7968-        plaintext = decryptor.process(crypttext)
7969-        self._status.timings["decrypt"] = time.time() - started
7970+        plaintext = decryptor.process(segment)
7971+        self._status.timings.setdefault("decrypt", 0)
7972+        self._status.timings['decrypt'] += time.time() - started
7973         return plaintext
7974 
7975hunk ./src/allmydata/mutable/retrieve.py 1006
7976-    def _done(self, res):
7977-        if not self._running:
7978+
7979+    def notify_server_corruption(self, peerid, shnum, reason):
7980+        ss = self.servermap.connections[peerid]
7981+        ss.callRemoteOnly("advise_corrupt_share",
7982+                          "mutable", self._storage_index, shnum, reason)
7983+
7984+
7985+    def _try_to_validate_privkey(self, enc_privkey, reader):
7986+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
7987+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
7988+        if alleged_writekey != self._node.get_writekey():
7989+            self.log("invalid privkey from %s shnum %d" %
7990+                     (reader, reader.shnum),
7991+                     level=log.WEIRD, umid="YIw4tA")
7992+            if self._verify:
7993+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
7994+                                              self.verinfo[-2])
7995+                e = CorruptShareError(reader.peerid,
7996+                                      reader.shnum,
7997+                                      "invalid privkey")
7998+                f = failure.Failure(e)
7999+                self._bad_shares.add((reader.peerid, reader.shnum, f))
8000             return
8001hunk ./src/allmydata/mutable/retrieve.py 1029
8002+
8003+        # it's good
8004+        self.log("got valid privkey from shnum %d on reader %s" %
8005+                 (reader.shnum, reader))
8006+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8007+        self._node._populate_encprivkey(enc_privkey)
8008+        self._node._populate_privkey(privkey)
8009+        self._need_privkey = False
8010+
8011+
8012+    def _check_for_done(self, res):
8013+        """
8014+        I check to see if this Retrieve object has successfully finished
8015+        its work.
8016+
8017+        I can exit in the following ways:
8018+            - If there are no more segments to download, then I exit by
8019+              causing self._done_deferred to fire with the plaintext
8020+              content requested by the caller.
8021+            - If there are still segments to be downloaded, and there
8022+              are enough active readers (readers which have not broken
8023+              and have not given us corrupt data) to continue
8024+              downloading, I send control back to
8025+              _download_current_segment.
8026+            - If there are still segments to be downloaded but there are
8027+              not enough active peers to download them, I ask
8028+              _add_active_peers to add more peers. If it is successful,
8029+              it will call _download_current_segment. If there are not
8030+              enough peers to retrieve the file, then that will cause
8031+              _done_deferred to errback.
8032+        """
8033+        self.log("checking for doneness")
8034+        if self._current_segment > self._last_segment:
8035+            # No more segments to download, we're done.
8036+            self.log("got plaintext, done")
8037+            return self._done()
8038+
8039+        if len(self._active_readers) >= self._required_shares:
8040+            # More segments to download, but we have enough good peers
8041+            # in self._active_readers that we can do that without issue,
8042+            # so go nab the next segment.
8043+            self.log("not done yet: on segment %d of %d" % \
8044+                     (self._current_segment + 1, self._num_segments))
8045+            return self._download_current_segment()
8046+
8047+        self.log("not done yet: on segment %d of %d, need to add peers" % \
8048+                 (self._current_segment + 1, self._num_segments))
8049+        return self._add_active_peers()
8050+
8051+
8052+    def _done(self):
8053+        """
8054+        I am called by _check_for_done when the download process has
8055+        finished successfully. After making some useful logging
8056+        statements, I return the decrypted contents to the owner of this
8057+        Retrieve object through self._done_deferred.
8058+        """
8059         self._running = False
8060         self._status.set_active(False)
8061hunk ./src/allmydata/mutable/retrieve.py 1088
8062-        self._status.timings["total"] = time.time() - self._started
8063-        # res is either the new contents, or a Failure
8064-        if isinstance(res, failure.Failure):
8065-            self.log("Retrieve done, with failure", failure=res,
8066-                     level=log.UNUSUAL)
8067-            self._status.set_status("Failed")
8068+        now = time.time()
8069+        self._status.timings['total'] = now - self._started
8070+        self._status.timings['fetch'] = now - self._started_fetching
8071+
8072+        if self._verify:
8073+            ret = list(self._bad_shares)
8074+            self.log("done verifying, found %d bad shares" % len(ret))
8075         else:
8076hunk ./src/allmydata/mutable/retrieve.py 1096
8077-            self.log("Retrieve done, success!")
8078-            self._status.set_status("Finished")
8079-            self._status.set_progress(1.0)
8080-            # remember the encoding parameters, use them again next time
8081-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8082-             offsets_tuple) = self.verinfo
8083-            self._node._populate_required_shares(k)
8084-            self._node._populate_total_shares(N)
8085-        eventually(self._done_deferred.callback, res)
8086+            # TODO: upload status here?
8087+            ret = self._consumer
8088+            self._consumer.unregisterProducer()
8089+        eventually(self._done_deferred.callback, ret)
8090+
8091 
8092hunk ./src/allmydata/mutable/retrieve.py 1102
8093+    def _failed(self):
8094+        """
8095+        I am called by _add_active_peers when there are not enough
8096+        active peers left to complete the download. After making some
8097+        useful logging statements, I return an exception to that effect
8098+        to the caller of this Retrieve object through
8099+        self._done_deferred.
8100+        """
8101+        self._running = False
8102+        self._status.set_active(False)
8103+        now = time.time()
8104+        self._status.timings['total'] = now - self._started
8105+        self._status.timings['fetch'] = now - self._started_fetching
8106+
8107+        if self._verify:
8108+            ret = list(self._bad_shares)
8109+        else:
8110+            format = ("ran out of peers: "
8111+                      "have %(have)d of %(total)d segments "
8112+                      "found %(bad)d bad shares "
8113+                      "encoding %(k)d-of-%(n)d")
8114+            args = {"have": self._current_segment,
8115+                    "total": self._num_segments,
8116+                    "need": self._last_segment,
8117+                    "k": self._required_shares,
8118+                    "n": self._total_shares,
8119+                    "bad": len(self._bad_shares)}
8120+            e = NotEnoughSharesError("%s, last failure: %s" % \
8121+                                     (format % args, str(self._last_failure)))
8122+            f = failure.Failure(e)
8123+            ret = f
8124+        eventually(self._done_deferred.callback, ret)
8125}
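Taken together, the new Retrieve methods form a per-segment loop: fetch and
validate k blocks, decode and decrypt them, bounce through the reactor, then
check for doneness. A minimal standalone skeleton of that control flow, with
invented names, just to show the shape:

    from twisted.internet import defer

    class SegmentLoop:
        def __init__(self, num_segments):
            self._current = 0
            self._last = num_segments - 1
            self._done_deferred = defer.Deferred()

        def _download_current_segment(self):
            d = defer.maybeDeferred(self._process_segment, self._current)
            d.addCallback(self._check_for_done)
            return d

        def _process_segment(self, segnum):
            # fetch k blocks, validate, decode, decrypt (elided)
            self._current += 1

        def _check_for_done(self, ignored):
            if self._current > self._last:
                return self._done_deferred.callback("plaintext")
            return self._download_current_segment()

The real loop also inserts fireEventually() between iterations (the
_turn_barrier method) so that deeply segmented files don't hit the #237
recursion limit.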
8126[mutable/servermap.py: Alter the servermap updater to work with MDMF files
8127Kevan Carstensen <kevan@isnotajoke.com>**20100811233309
8128 Ignore-this: 5d2c922283c12cad93a5346e978cd691
8129 
8130 These modifications were basically all to the end of having the
8131 servermap updater use the unified MDMF + SDMF read interface whenever
8132 possible -- this reduces the complexity of the code, making it easier to
8133 read and maintain. To do this, I needed to modify the process of
8134 updating the servermap a little bit.
8135 
8136 To support partial-file updates, I also modified the servermap updater
8137 to fetch the block hash trees and certain segments of files while it
8138 performed a servermap update (this can be done without adding any new
8139 roundtrips because of batch-read functionality that the read proxy has).
8140 
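The batch-read functionality mentioned above amounts to queueing several
logical reads on a share's read proxy and then flushing them as a single
remote read. A hedged sketch of the pattern as it appears in the hunks below
(constructing the MDMFSlotReadProxy and choosing the segment numbers are
elided):

    from allmydata.util import deferredutil

    ds = [reader.get_verinfo(),
          reader.get_blockhashes(queue=True),
          reader.get_block_and_salt(start_segment, queue=True),
          reader.get_block_and_salt(end_segment, queue=True)]
    d = deferredutil.gatherResults(ds)
    reader.flush()   # one remote call services every queued request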
8141] {
8142hunk ./src/allmydata/mutable/servermap.py 2
8143 
8144-import sys, time
8145+import sys, time, struct
8146 from zope.interface import implements
8147 from itertools import count
8148 from twisted.internet import defer
8149hunk ./src/allmydata/mutable/servermap.py 7
8150 from twisted.python import failure
8151-from foolscap.api import DeadReferenceError, RemoteException, eventually
8152-from allmydata.util import base32, hashutil, idlib, log
8153+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
8154+                         fireEventually
8155+from allmydata.util import base32, hashutil, idlib, log, deferredutil
8156 from allmydata.storage.server import si_b2a
8157 from allmydata.interfaces import IServermapUpdaterStatus
8158 from pycryptopp.publickey import rsa
8159hunk ./src/allmydata/mutable/servermap.py 17
8160 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
8161      DictOfSets, CorruptShareError, NeedMoreDataError
8162 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
8163-     SIGNED_PREFIX_LENGTH
8164+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
8165 
8166 class UpdateStatus:
8167     implements(IServermapUpdaterStatus)
8168hunk ./src/allmydata/mutable/servermap.py 124
8169         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
8170         self.last_update_mode = None
8171         self.last_update_time = 0
8172+        self.update_data = {} # (verinfo,shnum) => data
8173 
8174     def copy(self):
8175         s = ServerMap()
8176hunk ./src/allmydata/mutable/servermap.py 255
8177         """Return a set of versionids, one for each version that is currently
8178         recoverable."""
8179         versionmap = self.make_versionmap()
8180-
8181         recoverable_versions = set()
8182         for (verinfo, shares) in versionmap.items():
8183             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8184hunk ./src/allmydata/mutable/servermap.py 340
8185         return False
8186 
8187 
8188+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
8189+        """
8190+        I return the update data for the given shnum
8191+        """
8192+        update_data = self.update_data[shnum]
8193+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
8194+        return update_datum
8195+
8196+
8197+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
8198+        """
8199+        I record the given update data for the given shnum and verinfo.
8200+        """
8201+        self.update_data.setdefault(shnum, []).append((verinfo, data))
8202+
8203+
8204 class ServermapUpdater:
8205     def __init__(self, filenode, storage_broker, monitor, servermap,
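Usage of the two ServerMap accessors added above, with fabricated values:

    from allmydata.mutable.servermap import ServerMap

    sm = ServerMap()
    verinfo = ("fake", "verinfo", "tuple")   # any comparable value works
    data = ("blockhashes", "first-segment", "last-segment")
    sm.set_update_data_for_share_and_verinfo(3, verinfo, data)
    assert sm.get_update_data_for_share_and_verinfo(3, verinfo) == data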
8206hunk ./src/allmydata/mutable/servermap.py 358
8207-                 mode=MODE_READ, add_lease=False):
8208+                 mode=MODE_READ, add_lease=False, update_range=None):
8209         """I update a servermap, locating a sufficient number of useful
8210         shares and remembering where they are located.
8211 
8212hunk ./src/allmydata/mutable/servermap.py 390
8213         #  * if we need the encrypted private key, we want [-1216ish:]
8214         #   * but we can't read from negative offsets
8215         #   * the offset table tells us the 'ish', also the positive offset
8216-        # A future version of the SMDF slot format should consider using
8217-        # fixed-size slots so we can retrieve less data. For now, we'll just
8218-        # read 2000 bytes, which also happens to read enough actual data to
8219-        # pre-fetch a 9-entry dirnode.
8220+        # MDMF:
8221+        #  * Checkstring? [0:72]
8222+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
8223+        #    the offset table will tell us for sure.
8224+        #  * If we need the verification key, we have to consult the offset
8225+        #    table as well.
8226+        # At this point, we don't know which we are. Our filenode can
8227+        # tell us, but it might be lying -- in some cases, we're
8228+        # responsible for telling it which kind of file it is.
8229         self._read_size = 4000
8230         if mode == MODE_CHECK:
8231             # we use unpack_prefix_and_signature, so we need 1k
8232hunk ./src/allmydata/mutable/servermap.py 410
8233         # to ask for it during the check, we'll have problems doing the
8234         # publish.
8235 
8236+        self.fetch_update_data = False
8237+        if mode == MODE_WRITE and update_range:
8238+            # We're updating the servermap in preparation for an
8239+            # in-place file update, so we need to fetch some additional
8240+            # data from each share that we find.
8241+            assert len(update_range) == 2
8242+
8243+            self.start_segment = update_range[0]
8244+            self.end_segment = update_range[1]
8245+            self.fetch_update_data = True
8246+
8247         prefix = si_b2a(self._storage_index)[:5]
8248         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
8249                                    si=prefix, mode=mode)
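For callers, the new keyword looks like this (a hedged sketch; the filenode,
storage broker, monitor, and servermap come from the usual places, and the
segment numbers are made up):

    from allmydata.mutable.common import MODE_WRITE

    # fetch update data for segments 2 through 7 while updating the map:
    u = ServermapUpdater(filenode, storage_broker, monitor, servermap,
                         mode=MODE_WRITE, update_range=(2, 7))
    d = u.update()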
8250hunk ./src/allmydata/mutable/servermap.py 459
8251         self._queries_completed = 0
8252 
8253         sb = self._storage_broker
8254+        # All of the peers, permuted by the storage index, as usual.
8255         full_peerlist = sb.get_servers_for_index(self._storage_index)
8256         self.full_peerlist = full_peerlist # for use later, immutable
8257         self.extra_peers = full_peerlist[:] # peers are removed as we use them
8258hunk ./src/allmydata/mutable/servermap.py 466
8259         self._good_peers = set() # peers who had some shares
8260         self._empty_peers = set() # peers who don't have any shares
8261         self._bad_peers = set() # peers to whom our queries failed
8262+        self._readers = {} # peerid -> dict(sharewriters), filled in
8263+                           # after responses come in.
8264 
8265         k = self._node.get_required_shares()
8266hunk ./src/allmydata/mutable/servermap.py 470
8267+        # For what cases can these conditions work?
8268         if k is None:
8269             # make a guess
8270             k = 3
8271hunk ./src/allmydata/mutable/servermap.py 483
8272         self.num_peers_to_query = k + self.EPSILON
8273 
8274         if self.mode == MODE_CHECK:
8275+            # We want to query all of the peers.
8276             initial_peers_to_query = dict(full_peerlist)
8277             must_query = set(initial_peers_to_query.keys())
8278             self.extra_peers = []
8279hunk ./src/allmydata/mutable/servermap.py 491
8280             # we're planning to replace all the shares, so we want a good
8281             # chance of finding them all. We will keep searching until we've
8282             # seen epsilon that don't have a share.
8283+            # We don't query all of the peers because that could take a while.
8284             self.num_peers_to_query = N + self.EPSILON
8285             initial_peers_to_query, must_query = self._build_initial_querylist()
8286             self.required_num_empty_peers = self.EPSILON
8287hunk ./src/allmydata/mutable/servermap.py 501
8288             # might also avoid the round trip required to read the encrypted
8289             # private key.
8290 
8291-        else:
8292+        else: # MODE_READ, MODE_ANYTHING
8293+            # 2k peers is good enough.
8294             initial_peers_to_query, must_query = self._build_initial_querylist()
8295 
8296         # this is a set of peers that we are required to get responses from:
8297hunk ./src/allmydata/mutable/servermap.py 517
8298         # before we can consider ourselves finished, and self.extra_peers
8299         # contains the overflow (peers that we should tap if we don't get
8300         # enough responses)
8301+        # self._must_query must always be a subset of
8302+        # initial_peers_to_query; the assertion below guards that.
8303+        assert set(must_query).issubset(set(initial_peers_to_query))
8304 
8305         self._send_initial_requests(initial_peers_to_query)
8306         self._status.timings["initial_queries"] = time.time() - self._started
8307hunk ./src/allmydata/mutable/servermap.py 576
8308         # errors that aren't handled by _query_failed (and errors caused by
8309         # _query_failed) get logged, but we still want to check for doneness.
8310         d.addErrback(log.err)
8311-        d.addBoth(self._check_for_done)
8312         d.addErrback(self._fatal_error)
8313hunk ./src/allmydata/mutable/servermap.py 577
8314+        d.addCallback(self._check_for_done)
8315         return d
8316 
8317     def _do_read(self, ss, peerid, storage_index, shnums, readv):
8318hunk ./src/allmydata/mutable/servermap.py 596
8319         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
8320         return d
8321 
8322+
8323+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
8324+        """
8325+        I am called when a remote server returns a corrupt share in
8326+        response to one of our queries. By corrupt, I mean a share
8327+        without a valid signature. I then record the failure, notify the
8328+        server of the corruption, and record the share as bad.
8329+        """
8330+        f = failure.Failure(e)
8331+        self.log(format="bad share: %(f_value)s", f_value=str(f),
8332+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
8333+        # Notify the server that its share is corrupt.
8334+        self.notify_server_corruption(peerid, shnum, str(e))
8335+        # By flagging this as a bad peer, we won't count any of
8336+        # the other shares on that peer as valid, though if we
8337+        # happen to find a valid version string amongst those
8338+        # shares, we'll keep track of it so that we don't need
8339+        # to validate the signature on those again.
8340+        self._bad_peers.add(peerid)
8341+        self._last_failure = f
8342+        # XXX: Use the reader for this?
8343+        checkstring = data[:SIGNED_PREFIX_LENGTH]
8344+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
8345+        self._servermap.problems.append(f)
8346+
8347+
8348+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
8349+        """
8350+        If one of my queries returns successfully (which means that we
8351+        were able to and successfully did validate the signature), I
8352+        cache the data that we initially fetched from the storage
8353+        server. This will help reduce the number of roundtrips that need
8354+        to occur when the file is downloaded, or when the file is
8355+        updated.
8356+        """
8357+        if verinfo:
8358+            self._node._add_to_cache(verinfo, shnum, 0, data, now)
8359+
8360+
8361     def _got_results(self, datavs, peerid, readsize, stuff, started):
8362         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
8363                       peerid=idlib.shortnodeid_b2a(peerid),
8364hunk ./src/allmydata/mutable/servermap.py 642
8365                       level=log.NOISY)
8366         now = time.time()
8367         elapsed = now - started
8368-        self._queries_outstanding.discard(peerid)
8369-        self._servermap.reachable_peers.add(peerid)
8370-        self._must_query.discard(peerid)
8371-        self._queries_completed += 1
8372+        def _done_processing(ignored=None):
8373+            self._queries_outstanding.discard(peerid)
8374+            self._servermap.reachable_peers.add(peerid)
8375+            self._must_query.discard(peerid)
8376+            self._queries_completed += 1
8377         if not self._running:
8378             self.log("but we're not running, so we'll ignore it", parent=lp,
8379                      level=log.NOISY)
8380hunk ./src/allmydata/mutable/servermap.py 650
8381+            _done_processing()
8382             self._status.add_per_server_time(peerid, "late", started, elapsed)
8383             return
8384         self._status.add_per_server_time(peerid, "query", started, elapsed)
8385hunk ./src/allmydata/mutable/servermap.py 660
8386         else:
8387             self._empty_peers.add(peerid)
8388 
8389-        last_verinfo = None
8390-        last_shnum = None
8391+        ss, storage_index = stuff
8392+        ds = []
8393+
8394         for shnum,datav in datavs.items():
8395             data = datav[0]
8396hunk ./src/allmydata/mutable/servermap.py 665
8397-            try:
8398-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
8399-                last_verinfo = verinfo
8400-                last_shnum = shnum
8401-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
8402-            except CorruptShareError, e:
8403-                # log it and give the other shares a chance to be processed
8404-                f = failure.Failure()
8405-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
8406-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
8407-                self.notify_server_corruption(peerid, shnum, str(e))
8408-                self._bad_peers.add(peerid)
8409-                self._last_failure = f
8410-                checkstring = data[:SIGNED_PREFIX_LENGTH]
8411-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
8412-                self._servermap.problems.append(f)
8413-                pass
8414+            reader = MDMFSlotReadProxy(ss,
8415+                                       storage_index,
8416+                                       shnum,
8417+                                       data)
8418+            self._readers.setdefault(peerid, dict())[shnum] = reader
8419+            # our goal, with each response, is to validate the version
8420+            # information and share data as best we can at this point --
8421+            # we do this by validating the signature. To do this, we
8422+            # need to do the following:
8423+            #   - If we don't already have the public key, fetch the
8424+            #     public key. We use this to validate the signature.
8425+            if not self._node.get_pubkey():
8426+                # fetch and set the public key.
8427+                d = reader.get_verification_key(queue=True)
8428+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
8429+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
8430+                # XXX: Make self._pubkey_query_failed?
8431+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
8432+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
8433+            else:
8434+                # we already have the public key.
8435+                d = defer.succeed(None)
8436 
8437hunk ./src/allmydata/mutable/servermap.py 688
8438-        self._status.timings["cumulative_verify"] += (time.time() - now)
8439+            # Neither of these two branches return anything of
8440+            # consequence, so the first entry in our deferredlist will
8441+            # be None.
8442 
8443hunk ./src/allmydata/mutable/servermap.py 692
8444-        if self._need_privkey and last_verinfo:
8445-            # send them a request for the privkey. We send one request per
8446-            # server.
8447-            lp2 = self.log("sending privkey request",
8448-                           parent=lp, level=log.NOISY)
8449-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8450-             offsets_tuple) = last_verinfo
8451-            o = dict(offsets_tuple)
8452+            # - Next, we need the version information. We almost
8453+            #   certainly got this by reading the first thousand or so
8454+            #   bytes of the share on the storage server, so we
8455+            #   shouldn't need to fetch anything at this step.
8456+            d2 = reader.get_verinfo()
8457+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
8458+                self._got_corrupt_share(error, shnum, peerid, data, lp))
8459+            # - Next, we need the signature. For an SDMF share, it is
8460+            #   likely that we fetched this when doing our initial fetch
8461+            #   to get the version information. In MDMF, this lives at
8462+            #   the end of the share, so unless the file is quite small,
8463+            #   we'll need to do a remote fetch to get it.
8464+            d3 = reader.get_signature(queue=True)
8465+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
8466+                self._got_corrupt_share(error, shnum, peerid, data, lp))
8467+            #  Once we have all three of these responses, we can move on
8468+            #  to validating the signature
8469 
8470hunk ./src/allmydata/mutable/servermap.py 710
8471-            self._queries_outstanding.add(peerid)
8472-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
8473-            ss = self._servermap.connections[peerid]
8474-            privkey_started = time.time()
8475-            d = self._do_read(ss, peerid, self._storage_index,
8476-                              [last_shnum], readv)
8477-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
8478-                          privkey_started, lp2)
8479-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
8480-            d.addErrback(log.err)
8481-            d.addCallback(self._check_for_done)
8482-            d.addErrback(self._fatal_error)
8483+            # Does the node already have a privkey? If not, we'll try to
8484+            # fetch it here.
8485+            if self._need_privkey:
8486+                d4 = reader.get_encprivkey(queue=True)
8487+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
8488+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
8489+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
8490+                    self._privkey_query_failed(error, shnum, data, lp))
8491+            else:
8492+                d4 = defer.succeed(None)
8493+
8494+
8495+            if self.fetch_update_data:
8496+                # fetch the block hash tree and first + last segment, as
8497+                # configured earlier.
8498+                # Then set them in wherever we happen to want to set
8499+                # them.
8500+                update_ds = []
8501+                # XXX: We do this above, too. Is there a good way to
8502+                # make the two routines share the value without
8503+                # introducing more roundtrips?
8504+                update_ds.append(reader.get_verinfo())
8505+                update_ds.append(reader.get_blockhashes(queue=True))
8506+                update_ds.append(reader.get_block_and_salt(self.start_segment,
8507+                                                           queue=True))
8508+                update_ds.append(reader.get_block_and_salt(self.end_segment,
8509+                                                           queue=True))
8510+                d5 = deferredutil.gatherResults(update_ds)
8511+                d5.addCallback(self._got_update_results_one_share, shnum)
8512+            else:
8513+                d5 = defer.succeed(None)
8514 
8515hunk ./src/allmydata/mutable/servermap.py 742
8516+            dl = defer.DeferredList([d, d2, d3, d4, d5])
8517+            dl.addBoth(self._turn_barrier)
8518+            reader.flush() # send every read queued above in one batched remote call
8519+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
8520+                self._got_signature_one_share(results, shnum, peerid, lp))
8521+            dl.addErrback(lambda error, shnum=shnum, peerid=peerid, data=data:
8522+               self._got_corrupt_share(error, shnum, peerid, data, lp))
8523+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
8524+                self._cache_good_sharedata(verinfo, shnum, now, data))
8525+            ds.append(dl)
8526+        # dl fires when all of the shares that we found on this peer
8527+        # are done processing. At that point we can remove this peer
8528+        # from the set of outstanding queries (the semaphore-like
8529+        # thing that we incremented earlier).
8530+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
8531+        # Are we done? Done means that there are no more queries to
8532+        # send, that there are no outstanding queries, and that no
8533+        # responses we have received are still being processed. If we
8534+        # are done, self._check_for_done will cause the done deferred
8535+        # that we returned to our caller to fire, which tells them that
8536+        # they have a complete servermap, and that we won't be touching
8537+        # the servermap anymore.
8538+        dl.addCallback(_done_processing)
8539+        dl.addCallback(self._check_for_done)
8540+        dl.addErrback(self._fatal_error)
8541         # all done!
8542         self.log("_got_results done", parent=lp, level=log.NOISY)
8543hunk ./src/allmydata/mutable/servermap.py 769
8544+        return dl
8545+
8546+
8547+    def _turn_barrier(self, result):
8548+        """
8549+        I help the servermap updater avoid the recursion limit issues
8550+        discussed in #237.
8551+        """
8552+        return fireEventually(result)
8553+
8554+
8555+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
8556+        if self._node.get_pubkey():
8557+            return # don't go through this again if we don't have to
8558+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
8559+        assert len(fingerprint) == 32
8560+        if fingerprint != self._node.get_fingerprint():
8561+            raise CorruptShareError(peerid, shnum,
8562+                                "pubkey doesn't match fingerprint")
8563+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
8564+        assert self._node.get_pubkey()
8565+
8566 
8567     def notify_server_corruption(self, peerid, shnum, reason):
8568         ss = self._servermap.connections[peerid]
8569hunk ./src/allmydata/mutable/servermap.py 797
8570         ss.callRemoteOnly("advise_corrupt_share",
8571                           "mutable", self._storage_index, shnum, reason)
8572 
8573-    def _got_results_one_share(self, shnum, data, peerid, lp):
8574+
8575+    def _got_signature_one_share(self, results, shnum, peerid, lp):
8576+        # It is our job to give the verinfo to our caller. We need to
8577+        # raise CorruptShareError if the share is corrupt for any
8578+        # reason, something that our caller will handle.
8579         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
8580                  shnum=shnum,
8581                  peerid=idlib.shortnodeid_b2a(peerid),
8582hunk ./src/allmydata/mutable/servermap.py 807
8583                  level=log.NOISY,
8584                  parent=lp)
8585+        if not self._running:
8586+            # We can't process the results, since we can't touch the
8587+            # servermap anymore.
8588+            self.log("but we're not running anymore.")
8589+            return None
8590 
8591hunk ./src/allmydata/mutable/servermap.py 813
8592-        # this might raise NeedMoreDataError, if the pubkey and signature
8593-        # live at some weird offset. That shouldn't happen, so I'm going to
8594-        # treat it as a bad share.
8595-        (seqnum, root_hash, IV, k, N, segsize, datalength,
8596-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
8597-
8598-        if not self._node.get_pubkey():
8599-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
8600-            assert len(fingerprint) == 32
8601-            if fingerprint != self._node.get_fingerprint():
8602-                raise CorruptShareError(peerid, shnum,
8603-                                        "pubkey doesn't match fingerprint")
8604-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
8605-
8606-        if self._need_privkey:
8607-            self._try_to_extract_privkey(data, peerid, shnum, lp)
8608-
8609-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
8610-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
8611+        _, verinfo, signature, __, ___ = results # (success, value) pairs from the DeferredList
8612+        (seqnum,
8613+         root_hash,
8614+         saltish,
8615+         segsize,
8616+         datalen,
8617+         k,
8618+         n,
8619+         prefix,
8620+         offsets) = verinfo[1]
8621         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8622 
8623hunk ./src/allmydata/mutable/servermap.py 825
8624-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8625+        # XXX: get_verinfo should hand us the offsets as a tuple in the
8626+        # first place; fix it there rather than rebuilding it here.
8627+        verinfo = (seqnum,
8628+                   root_hash,
8629+                   saltish,
8630+                   segsize,
8631+                   datalen,
8632+                   k,
8633+                   n,
8634+                   prefix,
8635                    offsets_tuple)
8636hunk ./src/allmydata/mutable/servermap.py 836
8637+        # This tuple uniquely identifies a version of the file; we use
8638+        # it to keep track of the versions we have already validated.
8639 
8640         if verinfo not in self._valid_versions:
8641hunk ./src/allmydata/mutable/servermap.py 840
8642-            # it's a new pair. Verify the signature.
8643-            valid = self._node.get_pubkey().verify(prefix, signature)
8644+            # This is a new version tuple, and we need to validate it
8645+            # against the public key before keeping track of it.
8646+            assert self._node.get_pubkey()
8647+            valid = self._node.get_pubkey().verify(prefix, signature[1])
8648             if not valid:
8649hunk ./src/allmydata/mutable/servermap.py 845
8650-                raise CorruptShareError(peerid, shnum, "signature is invalid")
8651+                raise CorruptShareError(peerid, shnum,
8652+                                        "signature is invalid")
8653 
8654hunk ./src/allmydata/mutable/servermap.py 848
8655-            # ok, it's a valid verinfo. Add it to the list of validated
8656-            # versions.
8657-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
8658-                     % (seqnum, base32.b2a(root_hash)[:4],
8659-                        idlib.shortnodeid_b2a(peerid), shnum,
8660-                        k, N, segsize, datalength),
8661-                     parent=lp)
8662-            self._valid_versions.add(verinfo)
8663-        # We now know that this is a valid candidate verinfo.
8664+        # ok, it's a valid verinfo. Add it to the list of validated
8665+        # versions.
8666+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
8667+                 % (seqnum, base32.b2a(root_hash)[:4],
8668+                    idlib.shortnodeid_b2a(peerid), shnum,
8669+                    k, n, segsize, datalen),
8670+                 parent=lp)
8671+        self._valid_versions.add(verinfo)
8672+        # We now know that this verinfo is valid. Whether this
8673+        # particular share is usable is decided by the next statement;
8674+        # at this point, we just know that if we see this verinfo
8675+        # again, its signature has already checked out and we can skip
8676+        # the signature-checking step.
8677 
8678hunk ./src/allmydata/mutable/servermap.py 862
8679+        # (peerid, shnum) are bound in the method invocation.
8680         if (peerid, shnum) in self._servermap.bad_shares:
8681             # we've been told that the rest of the data in this share is
8682             # unusable, so don't add it to the servermap.
8683hunk ./src/allmydata/mutable/servermap.py 875
8684         self._servermap.add_new_share(peerid, shnum, verinfo, timestamp)
8685         # and the versionmap
8686         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
8687+
8688+        # It's our job to set the protocol version of our parent
8689+        # filenode if it isn't already set.
8690+        if not self._node.get_version():
8691+            # The first byte of the prefix is the version.
8692+            v = struct.unpack(">B", prefix[:1])[0]
8693+            self.log("got version %d" % v)
8694+            self._node.set_version(v)
8695+
8696         return verinfo
8697 
8698hunk ./src/allmydata/mutable/servermap.py 886
8699-    def _deserialize_pubkey(self, pubkey_s):
8700-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
8701-        return verifier
8702 
8703hunk ./src/allmydata/mutable/servermap.py 887
8704-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
8705-        try:
8706-            r = unpack_share(data)
8707-        except NeedMoreDataError, e:
8708-            # this share won't help us. oh well.
8709-            offset = e.encprivkey_offset
8710-            length = e.encprivkey_length
8711-            self.log("shnum %d on peerid %s: share was too short (%dB) "
8712-                     "to get the encprivkey; [%d:%d] ought to hold it" %
8713-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
8714-                      offset, offset+length),
8715-                     parent=lp)
8716-            # NOTE: if uncoordinated writes are taking place, someone might
8717-            # change the share (and most probably move the encprivkey) before
8718-            # we get a chance to do one of these reads and fetch it. This
8719-            # will cause us to see a NotEnoughSharesError(unable to fetch
8720-            # privkey) instead of an UncoordinatedWriteError . This is a
8721-            # nuisance, but it will go away when we move to DSA-based mutable
8722-            # files (since the privkey will be small enough to fit in the
8723-            # write cap).
8724+    def _got_update_results_one_share(self, results, share):
8725+        """
8726+        I record the update data fetched for this share in the servermap.
8727+        """
8728+        assert len(results) == 4
8729+        verinfo, blockhashes, start, end = results
8730+        (seqnum,
8731+         root_hash,
8732+         saltish,
8733+         segsize,
8734+         datalen,
8735+         k,
8736+         n,
8737+         prefix,
8738+         offsets) = verinfo
8739+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8740 
8741hunk ./src/allmydata/mutable/servermap.py 904
8742-            return
8743+        # XXX: as above, get_verinfo should hand us the offsets as a
8744+        # tuple in the first place.
8745+        verinfo = (seqnum,
8746+                   root_hash,
8747+                   saltish,
8748+                   segsize,
8749+                   datalen,
8750+                   k,
8751+                   n,
8752+                   prefix,
8753+                   offsets_tuple)
8754 
8755hunk ./src/allmydata/mutable/servermap.py 916
8756-        (seqnum, root_hash, IV, k, N, segsize, datalen,
8757-         pubkey, signature, share_hash_chain, block_hash_tree,
8758-         share_data, enc_privkey) = r
8759+        update_data = (blockhashes, start, end)
8760+        self._servermap.set_update_data_for_share_and_verinfo(share,
8761+                                                              verinfo,
8762+                                                              update_data)
8763 
8764hunk ./src/allmydata/mutable/servermap.py 921
8765-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
8766+
8767+    def _deserialize_pubkey(self, pubkey_s):
8768+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
8769+        return verifier
8770 
8771hunk ./src/allmydata/mutable/servermap.py 926
8772-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
8773 
8774hunk ./src/allmydata/mutable/servermap.py 927
8775+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
8776+        """
8777+        Given an encrypted private key from a remote server, I derive its
8778+        writekey and check it against the writekey stored in my node. If
8779+        they match, I set the privkey and encprivkey properties of the node.
8780+        """
8781         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8782         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8783         if alleged_writekey != self._node.get_writekey():
8784hunk ./src/allmydata/mutable/servermap.py 1005
8785         self._queries_completed += 1
8786         self._last_failure = f
8787 
8788-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
8789-        now = time.time()
8790-        elapsed = now - started
8791-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
8792-        self._queries_outstanding.discard(peerid)
8793-        if not self._need_privkey:
8794-            return
8795-        if shnum not in datavs:
8796-            self.log("privkey wasn't there when we asked it",
8797-                     level=log.WEIRD, umid="VA9uDQ")
8798-            return
8799-        datav = datavs[shnum]
8800-        enc_privkey = datav[0]
8801-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
8802 
8803     def _privkey_query_failed(self, f, peerid, shnum, lp):
8804         self._queries_outstanding.discard(peerid)
8805hunk ./src/allmydata/mutable/servermap.py 1019
8806         self._servermap.problems.append(f)
8807         self._last_failure = f
8808 
8809+
8810     def _check_for_done(self, res):
8811         # exit paths:
8812         #  return self._send_more_queries(outstanding) : send some more queries
8813hunk ./src/allmydata/mutable/servermap.py 1025
8814         #  return self._done() : all done
8815         #  return : keep waiting, no new queries
8816-
8817         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
8818                               "%(outstanding)d queries outstanding, "
8819                               "%(extra)d extra peers available, "
8820hunk ./src/allmydata/mutable/servermap.py 1216
8821 
8822     def _done(self):
8823         if not self._running:
8824+            self.log("not running; we're already done")
8825             return
8826         self._running = False
8827         now = time.time()
8828hunk ./src/allmydata/mutable/servermap.py 1231
8829         self._servermap.last_update_time = self._started
8830         # the servermap will not be touched after this
8831         self.log("servermap: %s" % self._servermap.summarize_versions())
8832+
8833         eventually(self._done_deferred.callback, self._servermap)
8834 
8835     def _fatal_error(self, f):
8836}
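
The _turn_barrier/fireEventually dance above deserves a standalone illustration. When a long chain of Deferreds fires synchronously, each callback historically unwound on the same call stack, so a servermap update touching many shares could exceed Python's recursion limit (the problem that ticket #237 tracks). Re-firing the chain from a fresh reactor turn between steps keeps the stack shallow. The following is a minimal sketch of the idiom, not code from the patch: fire_eventually is a hand-rolled stand-in for the fireEventually helper that servermap.py imports, and the counting chain is purely illustrative.

    from twisted.internet import defer, reactor

    def fire_eventually(value=None):
        # Return a Deferred that fires with `value` from a later reactor
        # turn, unwinding the current call stack first.
        d = defer.Deferred()
        reactor.callLater(0, d.callback, value)
        return d

    def count_up(n):
        d = defer.succeed(0)
        for _ in xrange(n):
            d.addCallback(lambda v: v + 1)
            d.addBoth(fire_eventually) # the turn barrier
        return d

    def _done(v):
        print "final value:", v
        reactor.stop()

    count_up(1000).addCallback(_done)
    reactor.run()
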
8837[client.py: learn how to create different kinds of mutable files
8838Kevan Carstensen <kevan@isnotajoke.com>**20100812231410
8839 Ignore-this: 6b0e1205cf882fad2e9d1ba144082b02
8840] {
8841hunk ./src/allmydata/client.py 25
8842 from allmydata.util.time_format import parse_duration, parse_date
8843 from allmydata.stats import StatsProvider
8844 from allmydata.history import History
8845-from allmydata.interfaces import IStatsProducer, RIStubClient
8846+from allmydata.interfaces import IStatsProducer, RIStubClient, \
8847+                                 SDMF_VERSION, MDMF_VERSION
8848 from allmydata.nodemaker import NodeMaker
8849 
8850 
8851hunk ./src/allmydata/client.py 357
8852                                    self.terminator,
8853                                    self.get_encoding_parameters(),
8854                                    self._key_generator)
8855+        default = self.get_config("mutable", "format", default="sdmf")
8856+        if default == "mdmf":
8857+            self.mutable_file_default = MDMF_VERSION
8858+        else:
8859+            self.mutable_file_default = SDMF_VERSION
8860 
8861     def get_history(self):
8862         return self.history
8863hunk ./src/allmydata/client.py 500
8864     def create_immutable_dirnode(self, children, convergence=None):
8865         return self.nodemaker.create_immutable_directory(children, convergence)
8866 
8867-    def create_mutable_file(self, contents=None, keysize=None):
8868-        return self.nodemaker.create_mutable_file(contents, keysize)
8869+    def create_mutable_file(self, contents=None, keysize=None, version=None):
8870+        if not version:
8871+            version = self.mutable_file_default
8872+        return self.nodemaker.create_mutable_file(contents, keysize,
8873+                                                  version=version)
8874 
8875     def upload(self, uploadable):
8876         uploader = self.getServiceNamed("uploader")
8877}
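
For reference, the knob that the get_config call above reads lives in the client's tahoe.cfg. The section and option names come straight from that call; note that, per the code above, anything other than "mdmf" (including typos) silently falls back to SDMF:

    [mutable]
    # "sdmf" (the default) or "mdmf"; any other value falls back to SDMF
    format = mdmf
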
8878[scripts: tell 'tahoe put' about MDMF
8879Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
8880 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
8881] {
8882hunk ./src/allmydata/scripts/cli.py 156
8883     optFlags = [
8884         ("mutable", "m", "Create a mutable file instead of an immutable one."),
8885         ]
8886+    optParameters = [
8887+        ("mutable-type", None, False, "Create a mutable file in the given format. Valid formats are 'sdmf' for SDMF and 'mdmf' for MDMF"),
8888+        ]
8889 
8890     def parseArgs(self, arg1=None, arg2=None):
8891         # see Examples below
8892hunk ./src/allmydata/scripts/tahoe_put.py 21
8893     from_file = options.from_file
8894     to_file = options.to_file
8895     mutable = options['mutable']
8896+    mutable_type = False
8897+
8898+    if mutable:
8899+        mutable_type = options['mutable-type']
8900     if options['quiet']:
8901         verbosity = 0
8902     else:
8903hunk ./src/allmydata/scripts/tahoe_put.py 33
8904     stdout = options.stdout
8905     stderr = options.stderr
8906 
8907+    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
8908+        # Don't try to pass unsupported types to the webapi
8909+        print >>stderr, "error: %s is an invalid format" % mutable_type
8910+        return 1
8911+
8912     if nodeurl[-1] != "/":
8913         nodeurl += "/"
8914     if to_file:
8915hunk ./src/allmydata/scripts/tahoe_put.py 76
8916         url = nodeurl + "uri"
8917     if mutable:
8918         url += "?mutable=true"
8919+    if mutable_type:
8920+        assert mutable
8921+        url += "&mutable-type=%s" % mutable_type
8922+
8923     if from_file:
8924         infileobj = open(os.path.expanduser(from_file), "rb")
8925     else:
8926}
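
Taken together with the client.py patch, the new flag is exercised like this (the alias and file names here are hypothetical). Per the URL construction above, the second command issues its PUT to /uri?mutable=true&mutable-type=mdmf:

    % tahoe put --mutable local.txt tahoe:uploaded.txt
    % tahoe put --mutable --mutable-type=mdmf local.txt tahoe:uploaded-mdmf.txt
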
8927[tests:
8928Kevan Carstensen <kevan@isnotajoke.com>**20100813235038
8929 Ignore-this: 21580ac66b79ebd11dd95cd5c18b42d9
8930 
8931     - A lot of existing tests relied on aspects of the mutable file
8932       implementation that were changed. This patch updates those tests
8933       to work with the changes.
8934     - This patch also adds tests for new features.
8935] {
8936hunk ./src/allmydata/test/common.py 12
8937 from allmydata import uri, dirnode, client
8938 from allmydata.introducer.server import IntroducerNode
8939 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
8940-     FileTooLargeError, NotEnoughSharesError, ICheckable
8941+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
8942+     IMutableUploadable, SDMF_VERSION, MDMF_VERSION
8943 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
8944      DeepCheckResults, DeepCheckAndRepairResults
8945 from allmydata.mutable.common import CorruptShareError
8946hunk ./src/allmydata/test/common.py 18
8947 from allmydata.mutable.layout import unpack_header
8948+from allmydata.mutable.publish import MutableData
8949 from allmydata.storage.server import storage_index_to_dir
8950 from allmydata.storage.mutable import MutableShareFile
8951 from allmydata.util import hashutil, log, fileutil, pollmixin
8952hunk ./src/allmydata/test/common.py 152
8953         consumer.write(data[start:end])
8954         return consumer
8955 
8956+
8957+    def get_best_readable_version(self):
8958+        return defer.succeed(self)
8959+
8960+
8961+    def download_to_data(self):
8962+        return download_to_data(self)
8963+
8964+
8965+    download_best_version = download_to_data
8966+
8967+
8968+    def get_size_of_best_version(self):
8969+        return defer.succeed(self.get_size())
8970+
8971+
8972 def make_chk_file_cap(size):
8973     return uri.CHKFileURI(key=os.urandom(16),
8974                           uri_extension_hash=os.urandom(32),
8975hunk ./src/allmydata/test/common.py 192
8976     MUTABLE_SIZELIMIT = 10000
8977     all_contents = {}
8978     bad_shares = {}
8979+    file_types = {} # storage index => MDMF_VERSION or SDMF_VERSION
8980 
8981     def __init__(self, storage_broker, secret_holder,
8982                  default_encoding_parameters, history):
8983hunk ./src/allmydata/test/common.py 199
8984         self.init_from_cap(make_mutable_file_cap())
8985     def create(self, contents, key_generator=None, keysize=None):
8986         initial_contents = self._get_initial_contents(contents)
8987-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
8988-            raise FileTooLargeError("SDMF is limited to one segment, and "
8989-                                    "%d > %d" % (len(initial_contents),
8990-                                                 self.MUTABLE_SIZELIMIT))
8991-        self.all_contents[self.storage_index] = initial_contents
8992+        data = initial_contents.read(initial_contents.get_size())
8993+        data = "".join(data)
8994+        self.all_contents[self.storage_index] = data
8995         return defer.succeed(self)
8996     def _get_initial_contents(self, contents):
8997hunk ./src/allmydata/test/common.py 204
8998-        if isinstance(contents, str):
8999-            return contents
9000         if contents is None:
9001hunk ./src/allmydata/test/common.py 205
9002-            return ""
9003+            return MutableData("")
9004+
9005+        if IMutableUploadable.providedBy(contents):
9006+            return contents
9007+
9008         assert callable(contents), "%s should be callable, not %s" % \
9009                (contents, type(contents))
9010         return contents(self)
9011hunk ./src/allmydata/test/common.py 257
9012     def get_storage_index(self):
9013         return self.storage_index
9014 
9015+    def get_servermap(self, mode):
9016+        return defer.succeed(None)
9017+
9018+    def set_version(self, version):
9019+        assert version in (SDMF_VERSION, MDMF_VERSION)
9020+        self.file_types[self.storage_index] = version
9021+
9022+    def get_version(self):
9023+        assert self.storage_index in self.file_types
9024+        return self.file_types[self.storage_index]
9025+
9026     def check(self, monitor, verify=False, add_lease=False):
9027         r = CheckResults(self.my_uri, self.storage_index)
9028         is_bad = self.bad_shares.get(self.storage_index, None)
9029hunk ./src/allmydata/test/common.py 326
9030         return d
9031 
9032     def download_best_version(self):
9033+        return defer.maybeDeferred(self._download_best_version)
9034+
9035+
9036+    def _download_best_version(self, ignored=None):
9037         if isinstance(self.my_uri, uri.LiteralFileURI):
9038hunk ./src/allmydata/test/common.py 331
9039-            return defer.succeed(self.my_uri.data)
9040+            return self.my_uri.data
9041         if self.storage_index not in self.all_contents:
9042hunk ./src/allmydata/test/common.py 333
9043-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9044-        return defer.succeed(self.all_contents[self.storage_index])
9045+            raise NotEnoughSharesError(None, 0, 3)
9046+        return self.all_contents[self.storage_index]
9047+
9048 
9049     def overwrite(self, new_contents):
9050hunk ./src/allmydata/test/common.py 338
9051-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9052-            raise FileTooLargeError("SDMF is limited to one segment, and "
9053-                                    "%d > %d" % (len(new_contents),
9054-                                                 self.MUTABLE_SIZELIMIT))
9055         assert not self.is_readonly()
9056hunk ./src/allmydata/test/common.py 339
9057-        self.all_contents[self.storage_index] = new_contents
9058+        new_data = new_contents.read(new_contents.get_size())
9059+        new_data = "".join(new_data)
9060+        self.all_contents[self.storage_index] = new_data
9061         return defer.succeed(None)
9062     def modify(self, modifier):
9063         # this does not implement FileTooLargeError, but the real one does
9064hunk ./src/allmydata/test/common.py 349
9065     def _modify(self, modifier):
9066         assert not self.is_readonly()
9067         old_contents = self.all_contents[self.storage_index]
9068-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9069+        new_data = modifier(old_contents, None, True)
9070+        self.all_contents[self.storage_index] = new_data
9071         return None
9072 
9073hunk ./src/allmydata/test/common.py 353
9074+    # As actually implemented, MutableFileNode and MutableFileVersion
9075+    # are distinct. However, nothing in the webapi uses (yet) that
9076+    # distinction -- it just uses the unified download interface
9077+    # provided by get_best_readable_version and read. When we start
9078+    # doing cooler things like LDMF, we will want to revise this code to
9079+    # be less simplistic.
9080+    def get_best_readable_version(self):
9081+        return defer.succeed(self)
9082+
9083+
9084+    def get_best_mutable_version(self):
9085+        return defer.succeed(self)
9086+
9087+    # Ditto for update(), which is an implementation of IWritable.
9088+    # XXX: declare IWritable in this class's implements() call.
9089+    def update(self, data, offset):
9090+        assert not self.is_readonly()
9091+        def modifier(old, servermap, first_time):
9092+            new = old[:offset] + "".join(data.read(data.get_size()))
9093+            new += old[len(new):]
9094+            return new
9095+        return self.modify(modifier)
9096+
9097+
9098+    def read(self, consumer, offset=0, size=None):
9099+        data = self._download_best_version()
9100+        end = offset + size if size is not None else None
9101+        data = data[offset:end]
9102+        consumer.write(data)
9103+        return defer.succeed(consumer)
9104+
9105+
9106 def make_mutable_file_cap():
9107     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
9108                                    fingerprint=os.urandom(32))
9109hunk ./src/allmydata/test/test_checker.py 11
9110 from allmydata.test.no_network import GridTestMixin
9111 from allmydata.immutable.upload import Data
9112 from allmydata.test.common_web import WebRenderingMixin
9113+from allmydata.mutable.publish import MutableData
9114 
9115 class FakeClient:
9116     def get_storage_broker(self):
9117hunk ./src/allmydata/test/test_checker.py 291
9118         def _stash_immutable(ur):
9119             self.imm = c0.create_node_from_uri(ur.uri)
9120         d.addCallback(_stash_immutable)
9121-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9122+        d.addCallback(lambda ign:
9123+            c0.create_mutable_file(MutableData("contents")))
9124         def _stash_mutable(node):
9125             self.mut = node
9126         d.addCallback(_stash_mutable)
9127hunk ./src/allmydata/test/test_cli.py 11
9128 from allmydata.util import fileutil, hashutil, base32
9129 from allmydata import uri
9130 from allmydata.immutable import upload
9131+from allmydata.mutable.publish import MutableData
9132 from allmydata.dirnode import normalize
9133 
9134 # Test that the scripts can be imported -- although the actual tests of their
9135hunk ./src/allmydata/test/test_cli.py 644
9136 
9137         d = self.do_cli("create-alias", etudes_arg)
9138         def _check_create_unicode((rc, out, err)):
9139-            self.failUnlessReallyEqual(rc, 0)
9140+            self.failUnlessReallyEqual(rc, 0)
9141             self.failUnlessReallyEqual(err, "")
9142             self.failUnlessIn("Alias %s created" % quote_output(u"\u00E9tudes"), out)
9143 
9144hunk ./src/allmydata/test/test_cli.py 949
9145         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
9146         return d
9147 
9148+    def test_mutable_type(self):
9149+        self.basedir = "cli/Put/mutable_type"
9150+        self.set_up_grid()
9151+        data = "data" * 100000
9152+        fn1 = os.path.join(self.basedir, "data")
9153+        fileutil.write(fn1, data)
9154+        d = self.do_cli("create-alias", "tahoe")
9155+        d.addCallback(lambda ignored:
9156+            self.do_cli("put", "--mutable", "--mutable-type=mdmf",
9157+                        fn1, "tahoe:uploaded.txt"))
9158+        d.addCallback(lambda ignored:
9159+            self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
9160+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
9161+        d.addCallback(lambda ignored:
9162+            self.do_cli("put", "--mutable", "--mutable-type=sdmf",
9163+                        fn1, "tahoe:uploaded2.txt"))
9164+        d.addCallback(lambda ignored:
9165+            self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
9166+        d.addCallback(lambda (rc, json, err):
9167+            self.failUnlessIn("sdmf", json))
9168+        return d
9169+
9170+    def test_mutable_type_unlinked(self):
9171+        self.basedir = "cli/Put/mutable_type_unlinked"
9172+        self.set_up_grid()
9173+        data = "data" * 100000
9174+        fn1 = os.path.join(self.basedir, "data")
9175+        fileutil.write(fn1, data)
9176+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
9177+        d.addCallback(lambda (rc, cap, err):
9178+            self.do_cli("ls", "--json", cap))
9179+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
9180+        d.addCallback(lambda ignored:
9181+            self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
9182+        d.addCallback(lambda (rc, cap, err):
9183+            self.do_cli("ls", "--json", cap))
9184+        d.addCallback(lambda (rc, json, err):
9185+            self.failUnlessIn("sdmf", json))
9186+        return d
9187+
9188+    def test_mutable_type_invalid_format(self):
9189+        self.basedir = "cli/Put/mutable_type_invalid_format"
9190+        self.set_up_grid()
9191+        data = "data" * 100000
9192+        fn1 = os.path.join(self.basedir, "data")
9193+        fileutil.write(fn1, data)
9194+        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
9195+        def _check_failure((rc, out, err)):
9196+            self.failIfEqual(rc, 0)
9197+            self.failUnlessIn("invalid", err)
9198+        d.addCallback(_check_failure)
9199+        return d
9200+
9201     def test_put_with_nonexistent_alias(self):
9202         # when invoked with an alias that doesn't exist, 'tahoe put'
9203         # should output a useful error message, not a stack trace
9204hunk ./src/allmydata/test/test_cli.py 2028
9205         self.set_up_grid()
9206         c0 = self.g.clients[0]
9207         DATA = "data" * 100
9208-        d = c0.create_mutable_file(DATA)
9209+        DATA_uploadable = MutableData(DATA)
9210+        d = c0.create_mutable_file(DATA_uploadable)
9211         def _stash_uri(n):
9212             self.uri = n.get_uri()
9213         d.addCallback(_stash_uri)
9214hunk ./src/allmydata/test/test_cli.py 2130
9215                                            upload.Data("literal",
9216                                                         convergence="")))
9217         d.addCallback(_stash_uri, "small")
9218-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9219+        d.addCallback(lambda ign:
9220+            c0.create_mutable_file(MutableData(DATA+"1")))
9221         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9222         d.addCallback(_stash_uri, "mutable")
9223 
9224hunk ./src/allmydata/test/test_cli.py 2149
9225         # root/small
9226         # root/mutable
9227 
9228+        # We haven't broken anything yet, so this should all be healthy.
9229         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9230                                               self.rooturi))
9231         def _check2((rc, out, err)):
9232hunk ./src/allmydata/test/test_cli.py 2164
9233                             in lines, out)
9234         d.addCallback(_check2)
9235 
9236+        # Similarly, all of these results should be as we expect them to
9237+        # be for a healthy file layout.
9238         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9239         def _check_stats((rc, out, err)):
9240             self.failUnlessReallyEqual(err, "")
9241hunk ./src/allmydata/test/test_cli.py 2181
9242             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9243         d.addCallback(_check_stats)
9244 
9245+        # Now we break things.
9246         def _clobber_shares(ignored):
9247             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
9248             self.failUnlessReallyEqual(len(shares), 10)
9249hunk ./src/allmydata/test/test_cli.py 2206
9250 
9251         d.addCallback(lambda ign:
9252                       self.do_cli("deep-check", "--verbose", self.rooturi))
9253+        # This should reveal the missing share, but not the corrupt
9254+        # share, since we didn't tell the deep check operation to also
9255+        # verify.
9256         def _check3((rc, out, err)):
9257             self.failUnlessReallyEqual(err, "")
9258             self.failUnlessReallyEqual(rc, 0)
9259hunk ./src/allmydata/test/test_cli.py 2257
9260                                   "--verbose", "--verify", "--repair",
9261                                   self.rooturi))
9262         def _check6((rc, out, err)):
9263+            # We've just repaired the directory. There is no reason for
9264+            # that repair to be unsuccessful.
9265             self.failUnlessReallyEqual(err, "")
9266             self.failUnlessReallyEqual(rc, 0)
9267             lines = out.splitlines()
9268hunk ./src/allmydata/test/test_deepcheck.py 9
9269 from twisted.internet import threads # CLI tests use deferToThread
9270 from allmydata.immutable import upload
9271 from allmydata.mutable.common import UnrecoverableFileError
9272+from allmydata.mutable.publish import MutableData
9273 from allmydata.util import idlib
9274 from allmydata.util import base32
9275 from allmydata.scripts import runner
9276hunk ./src/allmydata/test/test_deepcheck.py 38
9277         self.basedir = "deepcheck/MutableChecker/good"
9278         self.set_up_grid()
9279         CONTENTS = "a little bit of data"
9280-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9281+        CONTENTS_uploadable = MutableData(CONTENTS)
9282+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9283         def _created(node):
9284             self.node = node
9285             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9286hunk ./src/allmydata/test/test_deepcheck.py 61
9287         self.basedir = "deepcheck/MutableChecker/corrupt"
9288         self.set_up_grid()
9289         CONTENTS = "a little bit of data"
9290-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9291+        CONTENTS_uploadable = MutableData(CONTENTS)
9292+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9293         def _stash_and_corrupt(node):
9294             self.node = node
9295             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9296hunk ./src/allmydata/test/test_deepcheck.py 99
9297         self.basedir = "deepcheck/MutableChecker/delete_share"
9298         self.set_up_grid()
9299         CONTENTS = "a little bit of data"
9300-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9301+        CONTENTS_uploadable = MutableData(CONTENTS)
9302+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9303         def _stash_and_delete(node):
9304             self.node = node
9305             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9306hunk ./src/allmydata/test/test_deepcheck.py 223
9307             self.root = n
9308             self.root_uri = n.get_uri()
9309         d.addCallback(_created_root)
9310-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9311+        d.addCallback(lambda ign:
9312+            c0.create_mutable_file(MutableData("mutable file contents")))
9313         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9314         def _created_mutable(n):
9315             self.mutable = n
9316hunk ./src/allmydata/test/test_deepcheck.py 965
9317     def create_mangled(self, ignored, name):
9318         nodetype, mangletype = name.split("-", 1)
9319         if nodetype == "mutable":
9320-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9321+            mutable_uploadable = MutableData("mutable file contents")
9322+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9323             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9324         elif nodetype == "large":
9325             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9326hunk ./src/allmydata/test/test_dirnode.py 1304
9327     implements(IMutableFileNode)
9328     counter = 0
9329     def __init__(self, initial_contents=""):
9330-        self.data = self._get_initial_contents(initial_contents)
9331+        data = self._get_initial_contents(initial_contents)
9332+        self.data = data.read(data.get_size())
9333+        self.data = "".join(self.data)
9334+
9335         counter = FakeMutableFile.counter
9336         FakeMutableFile.counter += 1
9337         writekey = hashutil.ssk_writekey_hash(str(counter))
9338hunk ./src/allmydata/test/test_dirnode.py 1354
9339         pass
9340 
9341     def modify(self, modifier):
9342-        self.data = modifier(self.data, None, True)
9343+        data = modifier(self.data, None, True)
9344+        self.data = data
9345         return defer.succeed(None)
9346 
9347 class FakeNodeMaker(NodeMaker):
9348hunk ./src/allmydata/test/test_filenode.py 98
9349         def _check_segment(res):
9350             self.failUnlessEqual(res, DATA[1:1+5])
9351         d.addCallback(_check_segment)
9352+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
9353+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
9354+        d.addCallback(lambda ignored:
9355+            fn1.get_size_of_best_version())
9356+        d.addCallback(lambda size:
9357+            self.failUnlessEqual(size, len(DATA)))
9358+        d.addCallback(lambda ignored:
9359+            fn1.download_to_data())
9360+        d.addCallback(lambda data:
9361+            self.failUnlessEqual(data, DATA))
9362+        d.addCallback(lambda ignored:
9363+            fn1.download_best_version())
9364+        d.addCallback(lambda data:
9365+            self.failUnlessEqual(data, DATA))
9366 
9367         return d
9368 
9369hunk ./src/allmydata/test/test_hung_server.py 10
9370 from allmydata.util.consumer import download_to_data
9371 from allmydata.immutable import upload
9372 from allmydata.mutable.common import UnrecoverableFileError
9373+from allmydata.mutable.publish import MutableData
9374 from allmydata.storage.common import storage_index_to_dir
9375 from allmydata.test.no_network import GridTestMixin
9376 from allmydata.test.common import ShouldFailMixin
9377hunk ./src/allmydata/test/test_hung_server.py 108
9378         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9379 
9380         if mutable:
9381-            d = nm.create_mutable_file(mutable_plaintext)
9382+            uploadable = MutableData(mutable_plaintext)
9383+            d = nm.create_mutable_file(uploadable)
9384             def _uploaded_mutable(node):
9385                 self.uri = node.get_uri()
9386                 self.shares = self.find_uri_shares(self.uri)
9387hunk ./src/allmydata/test/test_immutable.py 4
9388 from allmydata.test import common
9389 from allmydata.interfaces import NotEnoughSharesError
9390 from allmydata.util.consumer import download_to_data
9391-from twisted.internet import defer
9392+from twisted.internet import defer, base
9393 from twisted.trial import unittest
9394 import random
9395 
9396hunk ./src/allmydata/test/test_immutable.py 143
9397         d.addCallback(_after_attempt)
9398         return d
9399 
9400+    def test_download_to_data(self):
9401+        d = self.n.download_to_data()
9402+        d.addCallback(lambda data:
9403+            self.failUnlessEqual(data, common.TEST_DATA))
9404+        return d
9405 
9406hunk ./src/allmydata/test/test_immutable.py 149
9407+
9408+    def test_download_best_version(self):
9409+        d = self.n.download_best_version()
9410+        d.addCallback(lambda data:
9411+            self.failUnlessEqual(data, common.TEST_DATA))
9412+        return d
9413+
9414+
9415+    def test_get_best_readable_version(self):
9416+        d = self.n.get_best_readable_version()
9417+        d.addCallback(lambda n2:
9418+            self.failUnlessEqual(n2, self.n))
9419+        return d
9420+
9421+    def test_get_size_of_best_version(self):
9422+        d = self.n.get_size_of_best_version()
9423+        d.addCallback(lambda size:
9424+            self.failUnlessEqual(size, len(common.TEST_DATA)))
9425+        return d
9426+
9427+
9428 # XXX extend these tests to show bad behavior of various kinds from servers:
9429 # raising exception from each remove_foo() method, for example
9430 
9431hunk ./src/allmydata/test/test_mutable.py 2
9432 
9433-import struct
9434+import struct, os
9435 from cStringIO import StringIO
9436 from twisted.trial import unittest
9437 from twisted.internet import defer, reactor
9438hunk ./src/allmydata/test/test_mutable.py 8
9439 from allmydata import uri, client
9440 from allmydata.nodemaker import NodeMaker
9441-from allmydata.util import base32
9442+from allmydata.util import base32, consumer, mathutil
9443 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
9444      ssk_pubkey_fingerprint_hash
9445hunk ./src/allmydata/test/test_mutable.py 11
9446+from allmydata.util.deferredutil import gatherResults
9447 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
9448hunk ./src/allmydata/test/test_mutable.py 13
9449-     NotEnoughSharesError
9450+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
9451 from allmydata.monitor import Monitor
9452 from allmydata.test.common import ShouldFailMixin
9453 from allmydata.test.no_network import GridTestMixin
9454hunk ./src/allmydata/test/test_mutable.py 27
9455      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
9456      NotEnoughServersError, CorruptShareError
9457 from allmydata.mutable.retrieve import Retrieve
9458-from allmydata.mutable.publish import Publish
9459+from allmydata.mutable.publish import Publish, MutableFileHandle, \
9460+                                      MutableData, \
9461+                                      DEFAULT_MAX_SEGMENT_SIZE
9462 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
9463hunk ./src/allmydata/test/test_mutable.py 31
9464-from allmydata.mutable.layout import unpack_header, unpack_share
9465+from allmydata.mutable.layout import unpack_header, unpack_share, \
9466+                                     MDMFSlotReadProxy
9467 from allmydata.mutable.repairer import MustForceRepairError
9468 
9469 import allmydata.test.common_util as testutil
9470hunk ./src/allmydata/test/test_mutable.py 101
9471         self.storage = storage
9472         self.queries = 0
9473     def callRemote(self, methname, *args, **kwargs):
9474+        self.queries += 1
9475         def _call():
9476             meth = getattr(self, methname)
9477             return meth(*args, **kwargs)
9478hunk ./src/allmydata/test/test_mutable.py 108
9479         d = fireEventually()
9480         d.addCallback(lambda res: _call())
9481         return d
9482+
9483     def callRemoteOnly(self, methname, *args, **kwargs):
9484hunk ./src/allmydata/test/test_mutable.py 110
9485+        # callRemote() below already increments self.queries once
9486         d = self.callRemote(methname, *args, **kwargs)
9487         d.addBoth(lambda ignore: None)
9488         pass
9489hunk ./src/allmydata/test/test_mutable.py 158
9490             chr(ord(original[byte_offset]) ^ 0x01) +
9491             original[byte_offset+1:])
9492 
9493+def add_two(original, byte_offset):
9494+    # Flipping the low bit isn't enough, because 1 is a valid version
9495+    # number; XORing with 0x02 maps both 0 and 1 to invalid versions.
9496+    return (original[:byte_offset] +
9497+            chr(ord(original[byte_offset]) ^ 0x02) +
9498+            original[byte_offset+1:])
9499+
9500 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
9501     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
9502     # list of shnums to corrupt.
9503hunk ./src/allmydata/test/test_mutable.py 168
9504+    ds = []
9505     for peerid in s._peers:
9506         shares = s._peers[peerid]
9507         for shnum in shares:
9508hunk ./src/allmydata/test/test_mutable.py 176
9509                 and shnum not in shnums_to_corrupt):
9510                 continue
9511             data = shares[shnum]
9512-            (version,
9513-             seqnum,
9514-             root_hash,
9515-             IV,
9516-             k, N, segsize, datalen,
9517-             o) = unpack_header(data)
9518-            if isinstance(offset, tuple):
9519-                offset1, offset2 = offset
9520-            else:
9521-                offset1 = offset
9522-                offset2 = 0
9523-            if offset1 == "pubkey":
9524-                real_offset = 107
9525-            elif offset1 in o:
9526-                real_offset = o[offset1]
9527-            else:
9528-                real_offset = offset1
9529-            real_offset = int(real_offset) + offset2 + offset_offset
9530-            assert isinstance(real_offset, int), offset
9531-            shares[shnum] = flip_bit(data, real_offset)
9532-    return res
9533+            # We're feeding the reader all of the share data, so it
9534+            # won't need to use the rref that we didn't provide, nor the
9535+            # storage index that we didn't provide. We do this because
9536+            # the reader will work for both MDMF and SDMF.
9537+            reader = MDMFSlotReadProxy(None, None, shnum, data)
9538+            # We need to get the offsets for the next part.
9539+            d = reader.get_verinfo()
9540+            def _do_corruption(verinfo, data, shnum):
9541+                (seqnum,
9542+                 root_hash,
9543+                 IV,
9544+                 segsize,
9545+                 datalen,
9546+                 k, n, prefix, o) = verinfo
9547+                if isinstance(offset, tuple):
9548+                    offset1, offset2 = offset
9549+                else:
9550+                    offset1 = offset
9551+                    offset2 = 0
9552+                if offset1 == "pubkey" and IV:
9553+                    real_offset = 107
9554+                elif offset1 == "share_data" and not IV:
9555+                    real_offset = 107
9556+                elif offset1 in o:
9557+                    real_offset = o[offset1]
9558+                else:
9559+                    real_offset = offset1
9560+                real_offset = int(real_offset) + offset2 + offset_offset
9561+                assert isinstance(real_offset, int), offset
9562+                if offset1 == 0: # verbyte
9563+                    f = add_two
9564+                else:
9565+                    f = flip_bit
9566+                shares[shnum] = f(data, real_offset)
9567+            d.addCallback(_do_corruption, data, shnum)
9568+            ds.append(d)
9569+    dl = defer.DeferredList(ds)
9570+    dl.addCallback(lambda ignored: res)
9571+    return dl
9572 
9573 def make_storagebroker(s=None, num_peers=10):
9574     if not s:
9575hunk ./src/allmydata/test/test_mutable.py 257
9576             self.failUnlessEqual(len(shnums), 1)
9577         d.addCallback(_created)
9578         return d
9579+    test_create.timeout = 15
9580+
9581+
9582+    def test_create_mdmf(self):
9583+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
9584+        def _created(n):
9585+            self.failUnless(isinstance(n, MutableFileNode))
9586+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
9587+            sb = self.nodemaker.storage_broker
9588+            peer0 = sorted(sb.get_all_serverids())[0]
9589+            shnums = self._storage._peers[peer0].keys()
9590+            self.failUnlessEqual(len(shnums), 1)
9591+        d.addCallback(_created)
9592+        return d
9593+
9594 
9595     def test_serialize(self):
9596         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
9597hunk ./src/allmydata/test/test_mutable.py 302
9598             d.addCallback(lambda smap: smap.dump(StringIO()))
9599             d.addCallback(lambda sio:
9600                           self.failUnless("3-of-10" in sio.getvalue()))
9601-            d.addCallback(lambda res: n.overwrite("contents 1"))
9602+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
9603             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9604             d.addCallback(lambda res: n.download_best_version())
9605             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9606hunk ./src/allmydata/test/test_mutable.py 309
9607             d.addCallback(lambda res: n.get_size_of_best_version())
9608             d.addCallback(lambda size:
9609                           self.failUnlessEqual(size, len("contents 1")))
9610-            d.addCallback(lambda res: n.overwrite("contents 2"))
9611+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
9612             d.addCallback(lambda res: n.download_best_version())
9613             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9614             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9615hunk ./src/allmydata/test/test_mutable.py 313
9616-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9617+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
9618             d.addCallback(lambda res: n.download_best_version())
9619             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9620             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9621hunk ./src/allmydata/test/test_mutable.py 325
9622             # mapupdate-to-retrieve data caching (i.e. make the shares larger
9623             # than the default readsize, which is 2000 bytes). A 15kB file
9624             # will have 5kB shares.
9625-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
9626+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
9627             d.addCallback(lambda res: n.download_best_version())
9628             d.addCallback(lambda res:
9629                           self.failUnlessEqual(res, "large size file" * 1000))
9630hunk ./src/allmydata/test/test_mutable.py 333
9631         d.addCallback(_created)
9632         return d
9633 
9634+
9635+    def test_upload_and_download_mdmf(self):
9636+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
9637+        def _created(n):
9638+            d = defer.succeed(None)
9639+            d.addCallback(lambda ignored:
9640+                n.get_servermap(MODE_READ))
9641+            def _then(servermap):
9642+                dumped = servermap.dump(StringIO())
9643+                self.failUnlessIn("3-of-10", dumped.getvalue())
9644+            d.addCallback(_then)
9645+            # Now overwrite the contents with some new contents. We want
9646+            # to make them big enough to force the file to be uploaded
9647+            # in more than one segment.
9648+            big_contents = "contents1" * 100000 # about 900 KiB
9649+            big_contents_uploadable = MutableData(big_contents)
9650+            d.addCallback(lambda ignored:
9651+                n.overwrite(big_contents_uploadable))
9652+            d.addCallback(lambda ignored:
9653+                n.download_best_version())
9654+            d.addCallback(lambda data:
9655+                self.failUnlessEqual(data, big_contents))
9656+            # Overwrite the contents again with some new contents. As
9657+            # before, they need to be big enough to force multiple
9658+            # segments, so that we make the downloader deal with
9659+            # multiple segments.
9660+            bigger_contents = "contents2" * 1000000 # about 9MiB
9661+            bigger_contents_uploadable = MutableData(bigger_contents)
9662+            d.addCallback(lambda ignored:
9663+                n.overwrite(bigger_contents_uploadable))
9664+            d.addCallback(lambda ignored:
9665+                n.download_best_version())
9666+            d.addCallback(lambda data:
9667+                self.failUnlessEqual(data, bigger_contents))
9668+            return d
9669+        d.addCallback(_created)
9670+        return d
9671+
9672+
9673+    def test_mdmf_write_count(self):
9674+        # Publishing an MDMF file should only cause one write for each
9675+        # share that is to be published. Otherwise, we introduce
9676+        # undesirable semantics that are a regression from SDMF
9677+        upload = MutableData("MDMF" * 100000) # about 400 KiB
9678+        d = self.nodemaker.create_mutable_file(upload,
9679+                                               version=MDMF_VERSION)
9680+        def _check_server_write_counts(ignored):
9681+            sb = self.nodemaker.storage_broker
9682+            peers = sb.test_servers.values()
9683+            for peer in peers:
9684+                self.failUnlessEqual(peer.queries, 1)
9685+        d.addCallback(_check_server_write_counts)
9686+        return d
9687+
9688+
9689     def test_create_with_initial_contents(self):
9690hunk ./src/allmydata/test/test_mutable.py 389
9691-        d = self.nodemaker.create_mutable_file("contents 1")
9692+        upload1 = MutableData("contents 1")
9693+        d = self.nodemaker.create_mutable_file(upload1)
9694         def _created(n):
9695             d = n.download_best_version()
9696             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9697hunk ./src/allmydata/test/test_mutable.py 394
9698-            d.addCallback(lambda res: n.overwrite("contents 2"))
9699+            upload2 = MutableData("contents 2")
9700+            d.addCallback(lambda res: n.overwrite(upload2))
9701             d.addCallback(lambda res: n.download_best_version())
9702             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9703             return d
9704hunk ./src/allmydata/test/test_mutable.py 401
9705         d.addCallback(_created)
9706         return d
9707+    test_create_with_initial_contents.timeout = 15
9708+
9709+
9710+    def test_create_mdmf_with_initial_contents(self):
9711+        initial_contents = "foobarbaz" * 131072 # 900KiB
9712+        initial_contents_uploadable = MutableData(initial_contents)
9713+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
9714+                                               version=MDMF_VERSION)
9715+        def _created(n):
9716+            d = n.download_best_version()
9717+            d.addCallback(lambda data:
9718+                self.failUnlessEqual(data, initial_contents))
9719+            uploadable2 = MutableData(initial_contents + "foobarbaz")
9720+            d.addCallback(lambda ignored:
9721+                n.overwrite(uploadable2))
9722+            d.addCallback(lambda ignored:
9723+                n.download_best_version())
9724+            d.addCallback(lambda data:
9725+                self.failUnlessEqual(data, initial_contents +
9726+                                           "foobarbaz"))
9727+            return d
9728+        d.addCallback(_created)
9729+        return d
9730+    test_create_mdmf_with_initial_contents.timeout = 20
9731+
9732 
9733     def test_create_with_initial_contents_function(self):
9734         data = "initial contents"
9735hunk ./src/allmydata/test/test_mutable.py 434
9736             key = n.get_writekey()
9737             self.failUnless(isinstance(key, str), key)
9738             self.failUnlessEqual(len(key), 16) # AES key size
9739-            return data
9740+            return MutableData(data)
9741         d = self.nodemaker.create_mutable_file(_make_contents)
9742         def _created(n):
9743             return n.download_best_version()
9744hunk ./src/allmydata/test/test_mutable.py 442
9745         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
9746         return d
9747 
9748+
9749+    def test_create_mdmf_with_initial_contents_function(self):
9750+        data = "initial contents" * 100000
9751+        def _make_contents(n):
9752+            self.failUnless(isinstance(n, MutableFileNode))
9753+            key = n.get_writekey()
9754+            self.failUnless(isinstance(key, str), key)
9755+            self.failUnlessEqual(len(key), 16) # AES key size
9756+            return MutableData(data)
9757+        d = self.nodemaker.create_mutable_file(_make_contents,
9758+                                               version=MDMF_VERSION)
9759+        d.addCallback(lambda n:
9760+            n.download_best_version())
9761+        d.addCallback(lambda data2:
9762+            self.failUnlessEqual(data2, data))
9763+        return d
9764+
9765+
9766     def test_create_with_too_large_contents(self):
9767         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9768hunk ./src/allmydata/test/test_mutable.py 462
9769-        d = self.nodemaker.create_mutable_file(BIG)
9770+        BIG_uploadable = MutableData(BIG)
9771+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
9772         def _created(n):
9773hunk ./src/allmydata/test/test_mutable.py 465
9774-            d = n.overwrite(BIG)
9775+            other_BIG_uploadable = MutableData(BIG)
9776+            d = n.overwrite(other_BIG_uploadable)
9777             return d
9778         d.addCallback(_created)
9779         return d
9780hunk ./src/allmydata/test/test_mutable.py 480
9781 
9782     def test_modify(self):
9783         def _modifier(old_contents, servermap, first_time):
9784-            return old_contents + "line2"
9785+            new_contents = old_contents + "line2"
9786+            return new_contents
9787         def _non_modifier(old_contents, servermap, first_time):
9788             return old_contents
9789         def _none_modifier(old_contents, servermap, first_time):
9790hunk ./src/allmydata/test/test_mutable.py 489
9791         def _error_modifier(old_contents, servermap, first_time):
9792             raise ValueError("oops")
9793         def _toobig_modifier(old_contents, servermap, first_time):
9794-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
9795+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9796+            return new_content
9797         calls = []
9798         def _ucw_error_modifier(old_contents, servermap, first_time):
9799             # simulate an UncoordinatedWriteError once
9800hunk ./src/allmydata/test/test_mutable.py 497
9801             calls.append(1)
9802             if len(calls) <= 1:
9803                 raise UncoordinatedWriteError("simulated")
9804-            return old_contents + "line3"
9805+            new_contents = old_contents + "line3"
9806+            return new_contents
9807         def _ucw_error_non_modifier(old_contents, servermap, first_time):
9808             # simulate an UncoordinatedWriteError once, and don't actually
9809             # modify the contents on subsequent invocations
9810hunk ./src/allmydata/test/test_mutable.py 507
9811                 raise UncoordinatedWriteError("simulated")
9812             return old_contents
9813 
9814-        d = self.nodemaker.create_mutable_file("line1")
9815+        initial_contents = "line1"
9816+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
9817         def _created(n):
9818             d = n.modify(_modifier)
9819             d.addCallback(lambda res: n.download_best_version())
9820hunk ./src/allmydata/test/test_mutable.py 565
9821             return d
9822         d.addCallback(_created)
9823         return d
9824+    test_modify.timeout = 15
9825+
9826 
9827     def test_modify_backoffer(self):
9828         def _modifier(old_contents, servermap, first_time):
9829hunk ./src/allmydata/test/test_mutable.py 592
9830         giveuper._delay = 0.1
9831         giveuper.factor = 1
9832 
9833-        d = self.nodemaker.create_mutable_file("line1")
9834+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
9835         def _created(n):
9836             d = n.modify(_modifier)
9837             d.addCallback(lambda res: n.download_best_version())
9838hunk ./src/allmydata/test/test_mutable.py 642
9839             d.addCallback(lambda smap: smap.dump(StringIO()))
9840             d.addCallback(lambda sio:
9841                           self.failUnless("3-of-10" in sio.getvalue()))
9842-            d.addCallback(lambda res: n.overwrite("contents 1"))
9843+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
9844             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9845             d.addCallback(lambda res: n.download_best_version())
9846             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9847hunk ./src/allmydata/test/test_mutable.py 646
9848-            d.addCallback(lambda res: n.overwrite("contents 2"))
9849+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
9850             d.addCallback(lambda res: n.download_best_version())
9851             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9852             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9853hunk ./src/allmydata/test/test_mutable.py 650
9854-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9855+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
9856             d.addCallback(lambda res: n.download_best_version())
9857             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9858             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9859hunk ./src/allmydata/test/test_mutable.py 663
9860         return d
9861 
9862 
9863-class MakeShares(unittest.TestCase):
9864-    def test_encrypt(self):
9865-        nm = make_nodemaker()
9866-        CONTENTS = "some initial contents"
9867-        d = nm.create_mutable_file(CONTENTS)
9868-        def _created(fn):
9869-            p = Publish(fn, nm.storage_broker, None)
9870-            p.salt = "SALT" * 4
9871-            p.readkey = "\x00" * 16
9872-            p.newdata = CONTENTS
9873-            p.required_shares = 3
9874-            p.total_shares = 10
9875-            p.setup_encoding_parameters()
9876-            return p._encrypt_and_encode()
9877+class PublishMixin:
9878+    def publish_one(self):
9879+        # publish a file and create shares, which can then be manipulated
9880+        # later.
9881+        self.CONTENTS = "New contents go here" * 1000
9882+        self.uploadable = MutableData(self.CONTENTS)
9883+        self._storage = FakeStorage()
9884+        self._nodemaker = make_nodemaker(self._storage)
9885+        self._storage_broker = self._nodemaker.storage_broker
9886+        d = self._nodemaker.create_mutable_file(self.uploadable)
9887+        def _created(node):
9888+            self._fn = node
9889+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9890         d.addCallback(_created)
9891hunk ./src/allmydata/test/test_mutable.py 677
9892-        def _done(shares_and_shareids):
9893-            (shares, share_ids) = shares_and_shareids
9894-            self.failUnlessEqual(len(shares), 10)
9895-            for sh in shares:
9896-                self.failUnless(isinstance(sh, str))
9897-                self.failUnlessEqual(len(sh), 7)
9898-            self.failUnlessEqual(len(share_ids), 10)
9899-        d.addCallback(_done)
9900         return d
9901 
9902hunk ./src/allmydata/test/test_mutable.py 679
9903-    def test_generate(self):
9904-        nm = make_nodemaker()
9905-        CONTENTS = "some initial contents"
9906-        d = nm.create_mutable_file(CONTENTS)
9907-        def _created(fn):
9908-            self._fn = fn
9909-            p = Publish(fn, nm.storage_broker, None)
9910-            self._p = p
9911-            p.newdata = CONTENTS
9912-            p.required_shares = 3
9913-            p.total_shares = 10
9914-            p.setup_encoding_parameters()
9915-            p._new_seqnum = 3
9916-            p.salt = "SALT" * 4
9917-            # make some fake shares
9918-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
9919-            p._privkey = fn.get_privkey()
9920-            p._encprivkey = fn.get_encprivkey()
9921-            p._pubkey = fn.get_pubkey()
9922-            return p._generate_shares(shares_and_ids)
9923+    def publish_mdmf(self):
9924+        # like publish_one, except that the result is guaranteed to be
9925+        # an MDMF file.
9926+        # self.CONTENTS should be large enough to span more than one segment.
9927+        self.CONTENTS = "This is an MDMF file" * 100000
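+        # (20 bytes * 100000 = 2 MB of contents, which should be well
+        # over any plausible segment size)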
9928+        self.uploadable = MutableData(self.CONTENTS)
9929+        self._storage = FakeStorage()
9930+        self._nodemaker = make_nodemaker(self._storage)
9931+        self._storage_broker = self._nodemaker.storage_broker
9932+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
9933+        def _created(node):
9934+            self._fn = node
9935+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9936         d.addCallback(_created)
9937hunk ./src/allmydata/test/test_mutable.py 693
9938-        def _generated(res):
9939-            p = self._p
9940-            final_shares = p.shares
9941-            root_hash = p.root_hash
9942-            self.failUnlessEqual(len(root_hash), 32)
9943-            self.failUnless(isinstance(final_shares, dict))
9944-            self.failUnlessEqual(len(final_shares), 10)
9945-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
9946-            for i,sh in final_shares.items():
9947-                self.failUnless(isinstance(sh, str))
9948-                # feed the share through the unpacker as a sanity-check
9949-                pieces = unpack_share(sh)
9950-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
9951-                 pubkey, signature, share_hash_chain, block_hash_tree,
9952-                 share_data, enc_privkey) = pieces
9953-                self.failUnlessEqual(u_seqnum, 3)
9954-                self.failUnlessEqual(u_root_hash, root_hash)
9955-                self.failUnlessEqual(k, 3)
9956-                self.failUnlessEqual(N, 10)
9957-                self.failUnlessEqual(segsize, 21)
9958-                self.failUnlessEqual(datalen, len(CONTENTS))
9959-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
9960-                sig_material = struct.pack(">BQ32s16s BBQQ",
9961-                                           0, p._new_seqnum, root_hash, IV,
9962-                                           k, N, segsize, datalen)
9963-                self.failUnless(p._pubkey.verify(sig_material, signature))
9964-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
9965-                self.failUnless(isinstance(share_hash_chain, dict))
9966-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
9967-                for shnum,share_hash in share_hash_chain.items():
9968-                    self.failUnless(isinstance(shnum, int))
9969-                    self.failUnless(isinstance(share_hash, str))
9970-                    self.failUnlessEqual(len(share_hash), 32)
9971-                self.failUnless(isinstance(block_hash_tree, list))
9972-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
9973-                self.failUnlessEqual(IV, "SALT"*4)
9974-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
9975-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
9976-        d.addCallback(_generated)
9977         return d
9978 
9979hunk ./src/allmydata/test/test_mutable.py 695
9980-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
9981-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
9982-    # when we publish to zero peers, we should get a NotEnoughSharesError
9983 
9984hunk ./src/allmydata/test/test_mutable.py 696
9985-class PublishMixin:
9986-    def publish_one(self):
9987-        # publish a file and create shares, which can then be manipulated
9988-        # later.
9989-        self.CONTENTS = "New contents go here" * 1000
9990+    def publish_sdmf(self):
9991+        # like publish_one, except that the result is guaranteed to be
9992+        # an SDMF file
9993+        self.CONTENTS = "This is an SDMF file" * 1000
9994+        self.uploadable = MutableData(self.CONTENTS)
9995         self._storage = FakeStorage()
9996         self._nodemaker = make_nodemaker(self._storage)
9997         self._storage_broker = self._nodemaker.storage_broker
9998hunk ./src/allmydata/test/test_mutable.py 704
9999-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10000+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
10001         def _created(node):
10002             self._fn = node
10003             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10004hunk ./src/allmydata/test/test_mutable.py 711
10005         d.addCallback(_created)
10006         return d
10007 
10008-    def publish_multiple(self):
10009+
10010+    def publish_multiple(self, version=0):
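+        # version=0 selects SDMF (SDMF_VERSION is 0 in interfaces.py);
+        # pass MDMF_VERSION instead to build the same share history out
+        # of MDMF shares.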
10011         self.CONTENTS = ["Contents 0",
10012                          "Contents 1",
10013                          "Contents 2",
10014hunk ./src/allmydata/test/test_mutable.py 718
10015                          "Contents 3a",
10016                          "Contents 3b"]
10017+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
10018         self._copied_shares = {}
10019         self._storage = FakeStorage()
10020         self._nodemaker = make_nodemaker(self._storage)
10021hunk ./src/allmydata/test/test_mutable.py 722
10022-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
10023+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
10024         def _created(node):
10025             self._fn = node
10026             # now create multiple versions of the same file, and accumulate
10027hunk ./src/allmydata/test/test_mutable.py 729
10028             # their shares, so we can mix and match them later.
10029             d = defer.succeed(None)
10030             d.addCallback(self._copy_shares, 0)
10031-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
10032+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
10033             d.addCallback(self._copy_shares, 1)
10034hunk ./src/allmydata/test/test_mutable.py 731
10035-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
10036+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
10037             d.addCallback(self._copy_shares, 2)
10038hunk ./src/allmydata/test/test_mutable.py 733
10039-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
10040+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
10041             d.addCallback(self._copy_shares, 3)
10042             # now we replace all the shares with version s3, and upload a new
10043             # version to get s4b.
10044hunk ./src/allmydata/test/test_mutable.py 739
10045             rollback = dict([(i,2) for i in range(10)])
10046             d.addCallback(lambda res: self._set_versions(rollback))
10047-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
10048+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
10049             d.addCallback(self._copy_shares, 4)
10050             # we leave the storage in state 4
10051             return d
10052hunk ./src/allmydata/test/test_mutable.py 746
10053         d.addCallback(_created)
10054         return d
10055 
10056+
10057     def _copy_shares(self, ignored, index):
10058         shares = self._storage._peers
10059         # we need a deep copy
10060hunk ./src/allmydata/test/test_mutable.py 770
10061                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
10062 
10063 
10064+
10065+
10066 class Servermap(unittest.TestCase, PublishMixin):
10067     def setUp(self):
10068         return self.publish_one()
10069hunk ./src/allmydata/test/test_mutable.py 776
10070 
10071-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
10072+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
10073+                       update_range=None):
10074         if fn is None:
10075             fn = self._fn
10076         if sb is None:
10077hunk ./src/allmydata/test/test_mutable.py 783
10078             sb = self._storage_broker
10079         smu = ServermapUpdater(fn, sb, Monitor(),
10080-                               ServerMap(), mode)
10081+                               ServerMap(), mode, update_range=update_range)
10082         d = smu.update()
10083         return d
10084 
10085hunk ./src/allmydata/test/test_mutable.py 849
10086         # create a new file, which is large enough to knock the privkey out
10087         # of the early part of the file
10088         LARGE = "These are Larger contents" * 200 # about 5KB
10089-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
10090+        LARGE_uploadable = MutableData(LARGE)
10091+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
10092         def _created(large_fn):
10093             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
10094             return self.make_servermap(MODE_WRITE, large_fn2)
10095hunk ./src/allmydata/test/test_mutable.py 858
10096         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
10097         return d
10098 
10099+
10100     def test_mark_bad(self):
10101         d = defer.succeed(None)
10102         ms = self.make_servermap
10103hunk ./src/allmydata/test/test_mutable.py 904
10104         self._storage._peers = {} # delete all shares
10105         ms = self.make_servermap
10106         d = defer.succeed(None)
10109         d.addCallback(lambda res: ms(mode=MODE_CHECK))
10110         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
10111 
10112hunk ./src/allmydata/test/test_mutable.py 956
10113         return d
10114 
10115 
10116+    def test_servermapupdater_finds_mdmf_files(self):
10117+        # Publish an MDMF file, then make sure that when we run the
10118+        # ServermapUpdater, the file is reported to have one
10119+        # recoverable version.
10120+        d = defer.succeed(None)
10121+        d.addCallback(lambda ignored:
10122+            self.publish_mdmf())
10123+        d.addCallback(lambda ignored:
10124+            self.make_servermap(mode=MODE_CHECK))
10125+        # Calling make_servermap also updates the servermap in the mode
10126+        # that we specify, so we just need to see what it says.
10127+        def _check_servermap(sm):
10128+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
10129+        d.addCallback(_check_servermap)
10130+        return d
10131+
10132+
10133+    def test_fetch_update(self):
10134+        d = defer.succeed(None)
10135+        d.addCallback(lambda ignored:
10136+            self.publish_mdmf())
10137+        d.addCallback(lambda ignored:
10138+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
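+        # update_range asks the ServermapUpdater to also fetch the data
+        # needed to modify bytes 1-2 of the file in place; that data is
+        # recorded per-share in sm.update_data, which we inspect below.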
10139+        def _check_servermap(sm):
10140+            # 10 shares
10141+            self.failUnlessEqual(len(sm.update_data), 10)
10142+            # one version
10143+            for data in sm.update_data.itervalues():
10144+                self.failUnlessEqual(len(data), 1)
10145+        d.addCallback(_check_servermap)
10146+        return d
10147+
10148+
10149+    def test_servermapupdater_finds_sdmf_files(self):
10150+        d = defer.succeed(None)
10151+        d.addCallback(lambda ignored:
10152+            self.publish_sdmf())
10153+        d.addCallback(lambda ignored:
10154+            self.make_servermap(mode=MODE_CHECK))
10155+        d.addCallback(lambda servermap:
10156+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
10157+        return d
10158+
10159 
10160 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
10161     def setUp(self):
10162hunk ./src/allmydata/test/test_mutable.py 1039
10163         if version is None:
10164             version = servermap.best_recoverable_version()
10165         r = Retrieve(self._fn, servermap, version)
10166-        return r.download()
10167+        c = consumer.MemoryConsumer()
10168+        d = r.download(consumer=c)
10169+        d.addCallback(lambda mc: "".join(mc.chunks))
10170+        return d
10171+
10172 
10173     def test_basic(self):
10174         d = self.make_servermap()
10175hunk ./src/allmydata/test/test_mutable.py 1120
10176         return d
10177     test_no_servers_download.timeout = 15
10178 
10179+
10180     def _test_corrupt_all(self, offset, substring,
10181hunk ./src/allmydata/test/test_mutable.py 1122
10182-                          should_succeed=False, corrupt_early=True,
10183-                          failure_checker=None):
10184+                          should_succeed=False,
10185+                          corrupt_early=True,
10186+                          failure_checker=None,
10187+                          fetch_privkey=False):
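+        # fetch_privkey=True asks download_version to fetch the encrypted
+        # private key as part of the download, so corruption of that field
+        # can be exercised after the servermap update.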
10188         d = defer.succeed(None)
10189         if corrupt_early:
10190             d.addCallback(corrupt, self._storage, offset)
10191hunk ./src/allmydata/test/test_mutable.py 1142
10192                     self.failUnlessIn(substring, "".join(allproblems))
10193                 return servermap
10194             if should_succeed:
10195-                d1 = self._fn.download_version(servermap, ver)
10196+                d1 = self._fn.download_version(servermap, ver,
10197+                                               fetch_privkey)
10198                 d1.addCallback(lambda new_contents:
10199                                self.failUnlessEqual(new_contents, self.CONTENTS))
10200             else:
10201hunk ./src/allmydata/test/test_mutable.py 1150
10202                 d1 = self.shouldFail(NotEnoughSharesError,
10203                                      "_corrupt_all(offset=%s)" % (offset,),
10204                                      substring,
10205-                                     self._fn.download_version, servermap, ver)
10206+                                     self._fn.download_version, servermap,
10207+                                                                ver,
10208+                                                                fetch_privkey)
10209             if failure_checker:
10210                 d1.addCallback(failure_checker)
10211             d1.addCallback(lambda res: servermap)
10212hunk ./src/allmydata/test/test_mutable.py 1161
10213         return d
10214 
10215     def test_corrupt_all_verbyte(self):
10216-        # when the version byte is not 0, we hit an UnknownVersionError error
10217-        # in unpack_share().
10218+        # when the version byte is not 0 or 1, we hit an
10219+        # UnknownVersionError in unpack_share().
10220         d = self._test_corrupt_all(0, "UnknownVersionError")
10221         def _check_servermap(servermap):
10222             # and the dump should mention the problems
10223hunk ./src/allmydata/test/test_mutable.py 1168
10224             s = StringIO()
10225             dump = servermap.dump(s).getvalue()
10226-            self.failUnless("10 PROBLEMS" in dump, dump)
10227+            self.failUnless("30 PROBLEMS" in dump, dump)
10228         d.addCallback(_check_servermap)
10229         return d
10230 
10231hunk ./src/allmydata/test/test_mutable.py 1238
10232         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
10233 
10234 
10235+    def test_corrupt_all_encprivkey_late(self):
10236+        # this should work for the same reason as above, but we corrupt
10237+        # after the servermap update to exercise the error handling
10238+        # code.
10239+        # We need to remove the privkey from the node, or the retrieve
10240+        # process won't know to update it.
10241+        self._fn._privkey = None
10242+        return self._test_corrupt_all("enc_privkey",
10243+                                      None, # this shouldn't fail
10244+                                      should_succeed=True,
10245+                                      corrupt_early=False,
10246+                                      fetch_privkey=True)
10247+
10248+
10249     def test_corrupt_all_seqnum_late(self):
10250         # corrupting the seqnum between mapupdate and retrieve should result
10251         # in NotEnoughSharesError, since each share will look invalid
10252hunk ./src/allmydata/test/test_mutable.py 1258
10253         def _check(res):
10254             f = res[0]
10255             self.failUnless(f.check(NotEnoughSharesError))
10256-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
10257+            self.failUnless("uncoordinated write" in str(f))
10258         return self._test_corrupt_all(1, "ran out of peers",
10259                                       corrupt_early=False,
10260                                       failure_checker=_check)
10261hunk ./src/allmydata/test/test_mutable.py 1302
10262                             in str(servermap.problems[0]))
10263             ver = servermap.best_recoverable_version()
10264             r = Retrieve(self._fn, servermap, ver)
10265-            return r.download()
10266+            c = consumer.MemoryConsumer()
10267+            return r.download(c)
10268         d.addCallback(_do_retrieve)
10269hunk ./src/allmydata/test/test_mutable.py 1305
10270+        d.addCallback(lambda mc: "".join(mc.chunks))
10271         d.addCallback(lambda new_contents:
10272                       self.failUnlessEqual(new_contents, self.CONTENTS))
10273         return d
10274hunk ./src/allmydata/test/test_mutable.py 1310
10275 
10276-    def test_corrupt_some(self):
10277-        # corrupt the data of first five shares (so the servermap thinks
10278-        # they're good but retrieve marks them as bad), so that the
10279-        # MODE_READ set of 6 will be insufficient, forcing node.download to
10280-        # retry with more servers.
10281-        corrupt(None, self._storage, "share_data", range(5))
10282-        d = self.make_servermap()
10283+
10284+    def _test_corrupt_some(self, offset, mdmf=False):
10285+        if mdmf:
10286+            d = self.publish_mdmf()
10287+        else:
10288+            d = defer.succeed(None)
10289+        d.addCallback(lambda ignored:
10290+            corrupt(None, self._storage, offset, range(5)))
10291+        d.addCallback(lambda ignored:
10292+            self.make_servermap())
10293         def _do_retrieve(servermap):
10294             ver = servermap.best_recoverable_version()
10295             self.failUnless(ver)
10296hunk ./src/allmydata/test/test_mutable.py 1326
10297             return self._fn.download_best_version()
10298         d.addCallback(_do_retrieve)
10299         d.addCallback(lambda new_contents:
10300-                      self.failUnlessEqual(new_contents, self.CONTENTS))
10301+            self.failUnlessEqual(new_contents, self.CONTENTS))
10302         return d
10303 
10304hunk ./src/allmydata/test/test_mutable.py 1329
10305+
10306+    def test_corrupt_some(self):
10307+        # corrupt the data of first five shares (so the servermap thinks
10308+        # they're good but retrieve marks them as bad), so that the
10309+        # MODE_READ set of 6 will be insufficient, forcing node.download to
10310+        # retry with more servers.
10311+        return self._test_corrupt_some("share_data")
10312+
10313+
10314     def test_download_fails(self):
10315hunk ./src/allmydata/test/test_mutable.py 1339
10316-        corrupt(None, self._storage, "signature")
10317-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10318+        d = corrupt(None, self._storage, "signature")
10319+        d.addCallback(lambda ignored:
10320+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
10321                             "no recoverable versions",
10322hunk ./src/allmydata/test/test_mutable.py 1343
10323-                            self._fn.download_best_version)
10324+                            self._fn.download_best_version))
10325         return d
10326 
10327 
10328hunk ./src/allmydata/test/test_mutable.py 1347
10329+
10330+    def test_corrupt_mdmf_block_hash_tree(self):
10331+        d = self.publish_mdmf()
10332+        d.addCallback(lambda ignored:
10333+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10334+                                   "block hash tree failure",
10335+                                   corrupt_early=True,
10336+                                   should_succeed=False))
10337+        return d
10338+
10339+
10340+    def test_corrupt_mdmf_block_hash_tree_late(self):
10341+        d = self.publish_mdmf()
10342+        d.addCallback(lambda ignored:
10343+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
10344+                                   "block hash tree failure",
10345+                                   corrupt_early=False,
10346+                                   should_succeed=False))
10347+        return d
10348+
10349+
10350+    def test_corrupt_mdmf_share_data(self):
10351+        d = self.publish_mdmf()
10352+        d.addCallback(lambda ignored:
10353+            # TODO: Find out what the block size is and corrupt a
10354+            # specific block, rather than just guessing.
10355+            self._test_corrupt_all(("share_data", 12 * 40),
10356+                                    "block hash tree failure",
10357+                                    corrupt_early=True,
10358+                                    should_succeed=False))
10359+        return d
10360+
10361+
10362+    def test_corrupt_some_mdmf(self):
10363+        return self._test_corrupt_some(("share_data", 12 * 40),
10364+                                       mdmf=True)
10365+
10366+
10367 class CheckerMixin:
10368     def check_good(self, r, where):
10369         self.failUnless(r.is_healthy(), where)
10370hunk ./src/allmydata/test/test_mutable.py 1415
10371         d.addCallback(self.check_good, "test_check_good")
10372         return d
10373 
10374+    def test_check_mdmf_good(self):
10375+        d = self.publish_mdmf()
10376+        d.addCallback(lambda ignored:
10377+            self._fn.check(Monitor()))
10378+        d.addCallback(self.check_good, "test_check_mdmf_good")
10379+        return d
10380+
10381     def test_check_no_shares(self):
10382         for shares in self._storage._peers.values():
10383             shares.clear()
10384hunk ./src/allmydata/test/test_mutable.py 1429
10385         d.addCallback(self.check_bad, "test_check_no_shares")
10386         return d
10387 
10388+    def test_check_mdmf_no_shares(self):
10389+        d = self.publish_mdmf()
10390+        def _then(ignored):
10391+            for shares in self._storage._peers.values():
10392+                shares.clear()
10393+        d.addCallback(_then)
10394+        d.addCallback(lambda ignored:
10395+            self._fn.check(Monitor()))
10396+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
10397+        return d
10398+
10399     def test_check_not_enough_shares(self):
10400         for shares in self._storage._peers.values():
10401             for shnum in shares.keys():
10402hunk ./src/allmydata/test/test_mutable.py 1449
10403         d.addCallback(self.check_bad, "test_check_not_enough_shares")
10404         return d
10405 
10406+    def test_check_mdmf_not_enough_shares(self):
10407+        d = self.publish_mdmf()
10408+        def _then(ignored):
10409+            for shares in self._storage._peers.values():
10410+                for shnum in shares.keys():
10411+                    if shnum > 0:
10412+                        del shares[shnum]
10413+        d.addCallback(_then)
10414+        d.addCallback(lambda ignored:
10415+            self._fn.check(Monitor()))
10416+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
10417+        return d
10418+
10419+
10420     def test_check_all_bad_sig(self):
10421hunk ./src/allmydata/test/test_mutable.py 1464
10422-        corrupt(None, self._storage, 1) # bad sig
10423-        d = self._fn.check(Monitor())
10424+        d = corrupt(None, self._storage, 1) # bad sig
10425+        d.addCallback(lambda ignored:
10426+            self._fn.check(Monitor()))
10427         d.addCallback(self.check_bad, "test_check_all_bad_sig")
10428         return d
10429 
10430hunk ./src/allmydata/test/test_mutable.py 1470
10431+    def test_check_mdmf_all_bad_sig(self):
10432+        d = self.publish_mdmf()
10433+        d.addCallback(lambda ignored:
10434+            corrupt(None, self._storage, 1))
10435+        d.addCallback(lambda ignored:
10436+            self._fn.check(Monitor()))
10437+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
10438+        return d
10439+
10440     def test_check_all_bad_blocks(self):
10441hunk ./src/allmydata/test/test_mutable.py 1480
10442-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
10443+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
10444         # the Checker won't notice this.. it doesn't look at actual data
10445hunk ./src/allmydata/test/test_mutable.py 1482
10446-        d = self._fn.check(Monitor())
10447+        d.addCallback(lambda ignored:
10448+            self._fn.check(Monitor()))
10449         d.addCallback(self.check_good, "test_check_all_bad_blocks")
10450         return d
10451 
10452hunk ./src/allmydata/test/test_mutable.py 1487
10453+
10454+    def test_check_mdmf_all_bad_blocks(self):
10455+        d = self.publish_mdmf()
10456+        d.addCallback(lambda ignored:
10457+            corrupt(None, self._storage, "share_data"))
10458+        d.addCallback(lambda ignored:
10459+            self._fn.check(Monitor()))
10460+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
10461+        return d
10462+
10463     def test_verify_good(self):
10464         d = self._fn.check(Monitor(), verify=True)
10465         d.addCallback(self.check_good, "test_verify_good")
10466hunk ./src/allmydata/test/test_mutable.py 1501
10467         return d
10468+    test_verify_good.timeout = 15
10469 
10470     def test_verify_all_bad_sig(self):
10471hunk ./src/allmydata/test/test_mutable.py 1504
10472-        corrupt(None, self._storage, 1) # bad sig
10473-        d = self._fn.check(Monitor(), verify=True)
10474+        d = corrupt(None, self._storage, 1) # bad sig
10475+        d.addCallback(lambda ignored:
10476+            self._fn.check(Monitor(), verify=True))
10477         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
10478         return d
10479 
10480hunk ./src/allmydata/test/test_mutable.py 1511
10481     def test_verify_one_bad_sig(self):
10482-        corrupt(None, self._storage, 1, [9]) # bad sig
10483-        d = self._fn.check(Monitor(), verify=True)
10484+        d = corrupt(None, self._storage, 1, [9]) # bad sig
10485+        d.addCallback(lambda ignored:
10486+            self._fn.check(Monitor(), verify=True))
10487         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
10488         return d
10489 
10490hunk ./src/allmydata/test/test_mutable.py 1518
10491     def test_verify_one_bad_block(self):
10492-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
10493+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
10494         # the Verifier *will* notice this, since it examines every byte
10495hunk ./src/allmydata/test/test_mutable.py 1520
10496-        d = self._fn.check(Monitor(), verify=True)
10497+        d.addCallback(lambda ignored:
10498+            self._fn.check(Monitor(), verify=True))
10499         d.addCallback(self.check_bad, "test_verify_one_bad_block")
10500         d.addCallback(self.check_expected_failure,
10501                       CorruptShareError, "block hash tree failure",
10502hunk ./src/allmydata/test/test_mutable.py 1529
10503         return d
10504 
10505     def test_verify_one_bad_sharehash(self):
10506-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
10507-        d = self._fn.check(Monitor(), verify=True)
10508+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
10509+        d.addCallback(lambda ignored:
10510+            self._fn.check(Monitor(), verify=True))
10511         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
10512         d.addCallback(self.check_expected_failure,
10513                       CorruptShareError, "corrupt hashes",
10514hunk ./src/allmydata/test/test_mutable.py 1539
10515         return d
10516 
10517     def test_verify_one_bad_encprivkey(self):
10518-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10519-        d = self._fn.check(Monitor(), verify=True)
10520+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10521+        d.addCallback(lambda ignored:
10522+            self._fn.check(Monitor(), verify=True))
10523         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
10524         d.addCallback(self.check_expected_failure,
10525                       CorruptShareError, "invalid privkey",
10526hunk ./src/allmydata/test/test_mutable.py 1549
10527         return d
10528 
10529     def test_verify_one_bad_encprivkey_uncheckable(self):
10530-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10531+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
10532         readonly_fn = self._fn.get_readonly()
10533         # a read-only node has no way to validate the privkey
10534hunk ./src/allmydata/test/test_mutable.py 1552
10535-        d = readonly_fn.check(Monitor(), verify=True)
10536+        d.addCallback(lambda ignored:
10537+            readonly_fn.check(Monitor(), verify=True))
10538         d.addCallback(self.check_good,
10539                       "test_verify_one_bad_encprivkey_uncheckable")
10540         return d
10541hunk ./src/allmydata/test/test_mutable.py 1558
10542 
10543+
10544+    def test_verify_mdmf_good(self):
10545+        d = self.publish_mdmf()
10546+        d.addCallback(lambda ignored:
10547+            self._fn.check(Monitor(), verify=True))
10548+        d.addCallback(self.check_good, "test_verify_mdmf_good")
10549+        return d
10550+
10551+
10552+    def test_verify_mdmf_one_bad_block(self):
10553+        d = self.publish_mdmf()
10554+        d.addCallback(lambda ignored:
10555+            corrupt(None, self._storage, "share_data", [1]))
10556+        d.addCallback(lambda ignored:
10557+            self._fn.check(Monitor(), verify=True))
10558+        # We should find one bad block here
10559+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
10560+        d.addCallback(self.check_expected_failure,
10561+                      CorruptShareError, "block hash tree failure",
10562+                      "test_verify_mdmf_one_bad_block")
10563+        return d
10564+
10565+
10566+    def test_verify_mdmf_bad_encprivkey(self):
10567+        d = self.publish_mdmf()
10568+        d.addCallback(lambda ignored:
10569+            corrupt(None, self._storage, "enc_privkey", [1]))
10570+        d.addCallback(lambda ignored:
10571+            self._fn.check(Monitor(), verify=True))
10572+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
10573+        d.addCallback(self.check_expected_failure,
10574+                      CorruptShareError, "privkey",
10575+                      "test_verify_mdmf_bad_encprivkey")
10576+        return d
10577+
10578+
10579+    def test_verify_mdmf_bad_sig(self):
10580+        d = self.publish_mdmf()
10581+        d.addCallback(lambda ignored:
10582+            corrupt(None, self._storage, 1, [1]))
10583+        d.addCallback(lambda ignored:
10584+            self._fn.check(Monitor(), verify=True))
10585+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
10586+        return d
10587+
10588+
10589+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
10590+        d = self.publish_mdmf()
10591+        d.addCallback(lambda ignored:
10592+            corrupt(None, self._storage, "enc_privkey", [1]))
10593+        d.addCallback(lambda ignored:
10594+            self._fn.get_readonly())
10595+        d.addCallback(lambda fn:
10596+            fn.check(Monitor(), verify=True))
10597+        d.addCallback(self.check_good,
10598+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
10599+        return d
10600+
10601+
10602 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
10603 
10604     def get_shares(self, s):
10605hunk ./src/allmydata/test/test_mutable.py 1682
10606         current_shares = self.old_shares[-1]
10607         self.failUnlessEqual(old_shares, current_shares)
10608 
10609+
10610     def test_unrepairable_0shares(self):
10611         d = self.publish_one()
10612         def _delete_all_shares(ign):
10613hunk ./src/allmydata/test/test_mutable.py 1697
10614         d.addCallback(_check)
10615         return d
10616 
10617+    def test_mdmf_unrepairable_0shares(self):
10618+        d = self.publish_mdmf()
10619+        def _delete_all_shares(ign):
10620+            shares = self._storage._peers
10621+            for peerid in shares:
10622+                shares[peerid] = {}
10623+        d.addCallback(_delete_all_shares)
10624+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10625+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10626+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
10627+        return d
10628+
10629+
10630     def test_unrepairable_1share(self):
10631         d = self.publish_one()
10632         def _delete_all_shares(ign):
10633hunk ./src/allmydata/test/test_mutable.py 1726
10634         d.addCallback(_check)
10635         return d
10636 
10637+    def test_mdmf_unrepairable_1share(self):
10638+        d = self.publish_mdmf()
10639+        def _delete_all_shares(ign):
10640+            shares = self._storage._peers
10641+            for peerid in shares:
10642+                for shnum in list(shares[peerid]):
10643+                    if shnum > 0:
10644+                        del shares[peerid][shnum]
10645+        d.addCallback(_delete_all_shares)
10646+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10647+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10648+        def _check(crr):
10649+            self.failUnlessEqual(crr.get_successful(), False)
10650+        d.addCallback(_check)
10651+        return d
10652+
10653+    def test_repairable_5shares(self):
10654+        d = self.publish_sdmf()
10655+        def _delete_some_shares(ign):
10656+            shares = self._storage._peers
10657+            for peerid in shares:
10658+                for shnum in list(shares[peerid]):
10659+                    if shnum > 4:
10660+                        del shares[peerid][shnum]
10661+        d.addCallback(_delete_some_shares)
10662+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10663+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10664+        def _check(crr):
10665+            self.failUnlessEqual(crr.get_successful(), True)
10666+        d.addCallback(_check)
10667+        return d
10668+
10669+    def test_mdmf_repairable_5shares(self):
10670+        d = self.publish_mdmf()
10671+        def _delete_some_shares(ign):
10672+            shares = self._storage._peers
10673+            for peerid in shares:
10674+                for shnum in list(shares[peerid]):
10675+                    if shnum > 5:
10676+                        del shares[peerid][shnum]
10677+        d.addCallback(_delete_some_shares)
10678+        d.addCallback(lambda ign: self._fn.check(Monitor()))
10679+        def _check(cr):
10680+            self.failIf(cr.is_healthy())
10681+            self.failUnless(cr.is_recoverable())
10682+            return cr
10683+        d.addCallback(_check)
10684+        d.addCallback(lambda check_results: self._fn.repair(check_results))
10685+        def _check1(crr):
10686+            self.failUnlessEqual(crr.get_successful(), True)
10687+        d.addCallback(_check1)
10688+        return d
10689+
10690+
10691     def test_merge(self):
10692         self.old_shares = []
10693         d = self.publish_multiple()
10694hunk ./src/allmydata/test/test_mutable.py 1894
10695 class MultipleEncodings(unittest.TestCase):
10696     def setUp(self):
10697         self.CONTENTS = "New contents go here"
10698+        self.uploadable = MutableData(self.CONTENTS)
10699         self._storage = FakeStorage()
10700         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
10701         self._storage_broker = self._nodemaker.storage_broker
10702hunk ./src/allmydata/test/test_mutable.py 1898
10703-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10704+        d = self._nodemaker.create_mutable_file(self.uploadable)
10705         def _created(node):
10706             self._fn = node
10707         d.addCallback(_created)
10708hunk ./src/allmydata/test/test_mutable.py 1904
10709         return d
10710 
10711-    def _encode(self, k, n, data):
10712+    def _encode(self, k, n, data, version=SDMF_VERSION):
10713         # encode 'data' into a peerid->shares dict.
10714 
10715         fn = self._fn
10716hunk ./src/allmydata/test/test_mutable.py 1920
10717         # and set the encoding parameters to something completely different
10718         fn2._required_shares = k
10719         fn2._total_shares = n
10720+        # Normally a servermap update would occur before a publish.
10721+        # Here, it doesn't, so we have to do it ourselves.
10722+        fn2.set_version(version)
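+        # (set_version tells the node which share format the Publish
+        # below should generate; a real servermap update would have
+        # recorded this for us)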
10723 
10724         s = self._storage
10725         s._peers = {} # clear existing storage
10726hunk ./src/allmydata/test/test_mutable.py 1927
10727         p2 = Publish(fn2, self._storage_broker, None)
10728-        d = p2.publish(data)
10729+        uploadable = MutableData(data)
10730+        d = p2.publish(uploadable)
10731         def _published(res):
10732             shares = s._peers
10733             s._peers = {}
10734hunk ./src/allmydata/test/test_mutable.py 2230
10735         self.basedir = "mutable/Problems/test_publish_surprise"
10736         self.set_up_grid()
10737         nm = self.g.clients[0].nodemaker
10738-        d = nm.create_mutable_file("contents 1")
10739+        d = nm.create_mutable_file(MutableData("contents 1"))
10740         def _created(n):
10741             d = defer.succeed(None)
10742             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10743hunk ./src/allmydata/test/test_mutable.py 2240
10744             d.addCallback(_got_smap1)
10745             # then modify the file, leaving the old map untouched
10746             d.addCallback(lambda res: log.msg("starting winning write"))
10747-            d.addCallback(lambda res: n.overwrite("contents 2"))
10748+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10749             # now attempt to modify the file with the old servermap. This
10750             # will look just like an uncoordinated write, in which every
10751             # single share got updated between our mapupdate and our publish
10752hunk ./src/allmydata/test/test_mutable.py 2249
10753                           self.shouldFail(UncoordinatedWriteError,
10754                                           "test_publish_surprise", None,
10755                                           n.upload,
10756-                                          "contents 2a", self.old_map))
10757+                                          MutableData("contents 2a"), self.old_map))
10758             return d
10759         d.addCallback(_created)
10760         return d
10761hunk ./src/allmydata/test/test_mutable.py 2258
10762         self.basedir = "mutable/Problems/test_retrieve_surprise"
10763         self.set_up_grid()
10764         nm = self.g.clients[0].nodemaker
10765-        d = nm.create_mutable_file("contents 1")
10766+        d = nm.create_mutable_file(MutableData("contents 1"))
10767         def _created(n):
10768             d = defer.succeed(None)
10769             d.addCallback(lambda res: n.get_servermap(MODE_READ))
10770hunk ./src/allmydata/test/test_mutable.py 2268
10771             d.addCallback(_got_smap1)
10772             # then modify the file, leaving the old map untouched
10773             d.addCallback(lambda res: log.msg("starting winning write"))
10774-            d.addCallback(lambda res: n.overwrite("contents 2"))
10775+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10776             # now attempt to retrieve the old version with the old servermap.
10777             # This will look like someone has changed the file since we
10778             # updated the servermap.
10779hunk ./src/allmydata/test/test_mutable.py 2277
10780             d.addCallback(lambda res:
10781                           self.shouldFail(NotEnoughSharesError,
10782                                           "test_retrieve_surprise",
10783-                                          "ran out of peers: have 0 shares (k=3)",
10784+                                          "ran out of peers: have 0 of 1",
10785                                           n.download_version,
10786                                           self.old_map,
10787                                           self.old_map.best_recoverable_version(),
10788hunk ./src/allmydata/test/test_mutable.py 2286
10789         d.addCallback(_created)
10790         return d
10791 
10792+
10793     def test_unexpected_shares(self):
10794         # upload the file, take a servermap, shut down one of the servers,
10795         # upload it again (causing shares to appear on a new server), then
10796hunk ./src/allmydata/test/test_mutable.py 2296
10797         self.basedir = "mutable/Problems/test_unexpected_shares"
10798         self.set_up_grid()
10799         nm = self.g.clients[0].nodemaker
10800-        d = nm.create_mutable_file("contents 1")
10801+        d = nm.create_mutable_file(MutableData("contents 1"))
10802         def _created(n):
10803             d = defer.succeed(None)
10804             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10805hunk ./src/allmydata/test/test_mutable.py 2308
10806                 self.g.remove_server(peer0)
10807                 # then modify the file, leaving the old map untouched
10808                 log.msg("starting winning write")
10809-                return n.overwrite("contents 2")
10810+                return n.overwrite(MutableData("contents 2"))
10811             d.addCallback(_got_smap1)
10812             # now attempt to modify the file with the old servermap. This
10813             # will look just like an uncoordinated write, in which every
10814hunk ./src/allmydata/test/test_mutable.py 2318
10815                           self.shouldFail(UncoordinatedWriteError,
10816                                           "test_surprise", None,
10817                                           n.upload,
10818-                                          "contents 2a", self.old_map))
10819+                                          MutableData("contents 2a"), self.old_map))
10820             return d
10821         d.addCallback(_created)
10822         return d
10823hunk ./src/allmydata/test/test_mutable.py 2322
10824+    test_unexpected_shares.timeout = 15
10825 
10826     def test_bad_server(self):
10827         # Break one server, then create the file: the initial publish should
10828hunk ./src/allmydata/test/test_mutable.py 2358
10829         d.addCallback(_break_peer0)
10830         # now "create" the file, using the pre-established key, and let the
10831         # initial publish finally happen
10832-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
10833+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
10834         # that ought to work
10835         def _got_node(n):
10836             d = n.download_best_version()
10837hunk ./src/allmydata/test/test_mutable.py 2367
10838             def _break_peer1(res):
10839                 self.connection1.broken = True
10840             d.addCallback(_break_peer1)
10841-            d.addCallback(lambda res: n.overwrite("contents 2"))
10842+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10843             # that ought to work too
10844             d.addCallback(lambda res: n.download_best_version())
10845             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10846hunk ./src/allmydata/test/test_mutable.py 2399
10847         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
10848         self.g.break_server(peerids[0])
10849 
10850-        d = nm.create_mutable_file("contents 1")
10851+        d = nm.create_mutable_file(MutableData("contents 1"))
10852         def _created(n):
10853             d = n.download_best_version()
10854             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10855hunk ./src/allmydata/test/test_mutable.py 2407
10856             def _break_second_server(res):
10857                 self.g.break_server(peerids[1])
10858             d.addCallback(_break_second_server)
10859-            d.addCallback(lambda res: n.overwrite("contents 2"))
10860+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10861             # that ought to work too
10862             d.addCallback(lambda res: n.download_best_version())
10863             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10864hunk ./src/allmydata/test/test_mutable.py 2426
10865         d = self.shouldFail(NotEnoughServersError,
10866                             "test_publish_all_servers_bad",
10867                             "Ran out of non-bad servers",
10868-                            nm.create_mutable_file, "contents")
10869+                            nm.create_mutable_file, MutableData("contents"))
10870         return d
10871 
10872     def test_publish_no_servers(self):
10873hunk ./src/allmydata/test/test_mutable.py 2438
10874         d = self.shouldFail(NotEnoughServersError,
10875                             "test_publish_no_servers",
10876                             "Ran out of non-bad servers",
10877-                            nm.create_mutable_file, "contents")
10878+                            nm.create_mutable_file, MutableData("contents"))
10879         return d
10880     test_publish_no_servers.timeout = 30
10881 
10882hunk ./src/allmydata/test/test_mutable.py 2456
10883         # we need some contents that are large enough to push the privkey out
10884         # of the early part of the file
10885         LARGE = "These are Larger contents" * 2000 # about 50KB
10886-        d = nm.create_mutable_file(LARGE)
10887+        LARGE_uploadable = MutableData(LARGE)
10888+        d = nm.create_mutable_file(LARGE_uploadable)
10889         def _created(n):
10890             self.uri = n.get_uri()
10891             self.n2 = nm.create_from_cap(self.uri)
10892hunk ./src/allmydata/test/test_mutable.py 2492
10893         self.basedir = "mutable/Problems/test_privkey_query_missing"
10894         self.set_up_grid(num_servers=20)
10895         nm = self.g.clients[0].nodemaker
10896-        LARGE = "These are Larger contents" * 2000 # about 50KB
10897+        LARGE = "These are Larger contents" * 2000 # about 50KiB
10898+        LARGE_uploadable = MutableData(LARGE)
10899         nm._node_cache = DevNullDictionary() # disable the nodecache
10900 
10901hunk ./src/allmydata/test/test_mutable.py 2496
10902-        d = nm.create_mutable_file(LARGE)
10903+        d = nm.create_mutable_file(LARGE_uploadable)
10904         def _created(n):
10905             self.uri = n.get_uri()
10906             self.n2 = nm.create_from_cap(self.uri)
10907hunk ./src/allmydata/test/test_mutable.py 2506
10908         d.addCallback(_created)
10909         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
10910         return d
10911+
10912+
10913+    def test_block_and_hash_query_error(self):
10914+        # This tests what happens when a query to a remote server
10915+        # fails in either the hash validation step or the block-getting
10916+        # step (because of batching, these are the same actual query).
10917+        # We need the storage server to remain responsive until the
10918+        # point at which its prefix is validated, then suddenly die.
10919+        # This exercises some exception handling code in Retrieve.
10920+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
10921+        self.set_up_grid(num_servers=20)
10922+        nm = self.g.clients[0].nodemaker
10923+        CONTENTS = "contents" * 2000
10924+        CONTENTS_uploadable = MutableData(CONTENTS)
10925+        d = nm.create_mutable_file(CONTENTS_uploadable)
10926+        def _created(node):
10927+            self._node = node
10928+        d.addCallback(_created)
10929+        d.addCallback(lambda ignored:
10930+            self._node.get_servermap(MODE_READ))
10931+        def _then(servermap):
10932+            # we have our servermap. Now we set up the servers like the
10933+            # tests above -- the first one that gets a read call should
10934+            # start throwing errors, but only after returning its prefix
10935+            # for validation. Since we'll download without fetching the
10936+            # private key, the next query to the remote server will be
10937+            # for either a block and salt or for hashes, either of which
10938+            # will exercise the error handling code.
10939+            killer = FirstServerGetsKilled()
10940+            for (serverid, ss) in nm.storage_broker.get_all_servers():
10941+                ss.post_call_notifier = killer.notify
10942+            ver = servermap.best_recoverable_version()
10943+            assert ver
10944+            return self._node.download_version(servermap, ver)
10945+        d.addCallback(_then)
10946+        d.addCallback(lambda data:
10947+            self.failUnlessEqual(data, CONTENTS))
10948+        return d
10949+
10950+
10951+class FileHandle(unittest.TestCase):
10952+    def setUp(self):
10953+        self.test_data = "Test Data" * 50000
10954+        self.sio = StringIO(self.test_data)
10955+        self.uploadable = MutableFileHandle(self.sio)
10956+
10957+
10958+    def test_filehandle_read(self):
10959+        self.basedir = "mutable/FileHandle/test_filehandle_read"
10960+        chunk_size = 10
10961+        for i in xrange(0, len(self.test_data), chunk_size):
10962+            data = self.uploadable.read(chunk_size)
10963+            data = "".join(data)
10964+            start = i
10965+            end = i + chunk_size
10966+            self.failUnlessEqual(data, self.test_data[start:end])
10967+
10968+
10969+    def test_filehandle_get_size(self):
10970+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
10971+        actual_size = len(self.test_data)
10972+        size = self.uploadable.get_size()
10973+        self.failUnlessEqual(size, actual_size)
10974+
10975+
10976+    def test_filehandle_get_size_out_of_order(self):
10977+        # We should be able to call get_size whenever we want without
10978+        # disturbing the location of the seek pointer.
10979+        chunk_size = 100
10980+        data = self.uploadable.read(chunk_size)
10981+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
10982+
10983+        # Now get the size.
10984+        size = self.uploadable.get_size()
10985+        self.failUnlessEqual(size, len(self.test_data))
10986+
10987+        # Now get more data. We should be right where we left off.
10988+        more_data = self.uploadable.read(chunk_size)
10989+        start = chunk_size
10990+        end = chunk_size * 2
10991+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
10992+
10993+
10994+    def test_filehandle_file(self):
10995+        # Make sure that the MutableFileHandle works on a file as well
10996+        # as a StringIO object, since in some cases it will be asked to
10997+        # deal with files.
10998+        self.basedir = self.mktemp()
10999+        # mktemp() only returns a pathname; we must create the directory.
11000+        os.mkdir(self.basedir)
11001+        f_path = os.path.join(self.basedir, "test_file")
11002+        f = open(f_path, "w")
11003+        f.write(self.test_data)
11004+        f.close()
11005+        f = open(f_path, "r")
11006+
11007+        uploadable = MutableFileHandle(f)
11008+
11009+        data = uploadable.read(len(self.test_data))
11010+        self.failUnlessEqual("".join(data), self.test_data)
11011+        size = uploadable.get_size()
11012+        self.failUnlessEqual(size, len(self.test_data))
11013+
11014+
11015+    def test_close(self):
11016+        # Make sure that the MutableFileHandle closes its handle when
11017+        # told to do so.
11018+        self.uploadable.close()
11019+        self.failUnless(self.sio.closed)
11020+
11021+
11022+class DataHandle(unittest.TestCase):
11023+    def setUp(self):
11024+        self.test_data = "Test Data" * 50000
11025+        self.uploadable = MutableData(self.test_data)
11026+
11027+
11028+    def test_datahandle_read(self):
11029+        chunk_size = 10
11030+        for i in xrange(0, len(self.test_data), chunk_size):
11031+            data = self.uploadable.read(chunk_size)
11032+            data = "".join(data)
11033+            start = i
11034+            end = i + chunk_size
11035+            self.failUnlessEqual(data, self.test_data[start:end])
11036+
11037+
11038+    def test_datahandle_get_size(self):
11039+        actual_size = len(self.test_data)
11040+        size = self.uploadable.get_size()
11041+        self.failUnlessEqual(size, actual_size)
11042+
11043+
11044+    def test_datahandle_get_size_out_of_order(self):
11045+        # We should be able to call get_size whenever we want without
11046+        # disturbing the location of the seek pointer.
11047+        chunk_size = 100
11048+        data = self.uploadable.read(chunk_size)
11049+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11050+
11051+        # Now get the size.
11052+        size = self.uploadable.get_size()
11053+        self.failUnlessEqual(size, len(self.test_data))
11054+
11055+        # Now get more data. We should be right where we left off.
11056+        more_data = self.uploadable.read(chunk_size)
11057+        start = chunk_size
11058+        end = chunk_size * 2
11059+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11060+
11061+
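(A minimal, hypothetical sketch of the uploadable contract that the
FileHandle and DataHandle tests above exercise -- not the real
MutableData implementation: read() returns a list of byte chunks,
hence the "".join() calls in the tests; get_size() must not disturb
the read position; close() closes the underlying handle.)

    class SketchUploadable:
        def __init__(self, data):
            self._data = data
            self._offset = 0

        def get_size(self):
            # independent of self._offset, so callable at any time
            return len(self._data)

        def read(self, length):
            # return a *list* of chunks, as the tests expect
            chunk = self._data[self._offset:self._offset + length]
            self._offset += len(chunk)
            return [chunk]

        def close(self):
            pass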
11062+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
11063+              PublishMixin):
11064+    def setUp(self):
11065+        GridTestMixin.setUp(self)
11066+        self.basedir = self.mktemp()
11067+        self.set_up_grid()
11068+        self.c = self.g.clients[0]
11069+        self.nm = self.c.nodemaker
11070+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11071+        self.small_data = "test data" * 10 # about 90 B; SDMF
11072+        return self.do_upload()
11073+
11074+
11075+    def do_upload(self):
11076+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11077+                                         version=MDMF_VERSION)
11078+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11079+        dl = gatherResults([d1, d2])
11080+        def _then((n1, n2)):
11081+            assert isinstance(n1, MutableFileNode)
11082+            assert isinstance(n2, MutableFileNode)
11083+
11084+            self.mdmf_node = n1
11085+            self.sdmf_node = n2
11086+        dl.addCallback(_then)
11087+        return dl
11088+
11089+
11090+    def test_get_readonly_mutable_version(self):
11091+        # Attempting to get a mutable version of a mutable file from a
11092+        # filenode initialized with a readcap should return a readonly
11093+        # version of that same node.
11094+        ro = self.mdmf_node.get_readonly()
11095+        d = ro.get_best_mutable_version()
11096+        d.addCallback(lambda version:
11097+            self.failUnless(version.is_readonly()))
11098+        d.addCallback(lambda ignored:
11099+            self.sdmf_node.get_readonly())
11100+        d.addCallback(lambda version:
11101+            self.failUnless(version.is_readonly()))
11102+        return d
11103+
11104+
11105+    def test_get_sequence_number(self):
11106+        d = self.mdmf_node.get_best_readable_version()
11107+        d.addCallback(lambda bv:
11108+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11109+        d.addCallback(lambda ignored:
11110+            self.sdmf_node.get_best_readable_version())
11111+        d.addCallback(lambda bv:
11112+            self.failUnlessEqual(bv.get_sequence_number(), 1))
11113+        # Now update. The sequence number should then be 2 in both
11114+        # cases.
11115+        def _do_update(ignored):
11116+            new_data = MutableData("foo bar baz" * 100000)
11117+            new_small_data = MutableData("foo bar baz" * 10)
11118+            d1 = self.mdmf_node.overwrite(new_data)
11119+            d2 = self.sdmf_node.overwrite(new_small_data)
11120+            dl = gatherResults([d1, d2])
11121+            return dl
11122+        d.addCallback(_do_update)
11123+        d.addCallback(lambda ignored:
11124+            self.mdmf_node.get_best_readable_version())
11125+        d.addCallback(lambda bv:
11126+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11127+        d.addCallback(lambda ignored:
11128+            self.sdmf_node.get_best_readable_version())
11129+        d.addCallback(lambda bv:
11130+            self.failUnlessEqual(bv.get_sequence_number(), 2))
11131+        return d
11132+
11133+
11134+    def test_get_writekey(self):
11135+        d = self.mdmf_node.get_best_mutable_version()
11136+        d.addCallback(lambda bv:
11137+            self.failUnlessEqual(bv.get_writekey(),
11138+                                 self.mdmf_node.get_writekey()))
11139+        d.addCallback(lambda ignored:
11140+            self.sdmf_node.get_best_mutable_version())
11141+        d.addCallback(lambda bv:
11142+            self.failUnlessEqual(bv.get_writekey(),
11143+                                 self.sdmf_node.get_writekey()))
11144+        return d
11145+
11146+
11147+    def test_get_storage_index(self):
11148+        d = self.mdmf_node.get_best_mutable_version()
11149+        d.addCallback(lambda bv:
11150+            self.failUnlessEqual(bv.get_storage_index(),
11151+                                 self.mdmf_node.get_storage_index()))
11152+        d.addCallback(lambda ignored:
11153+            self.sdmf_node.get_best_mutable_version())
11154+        d.addCallback(lambda bv:
11155+            self.failUnlessEqual(bv.get_storage_index(),
11156+                                 self.sdmf_node.get_storage_index()))
11157+        return d
11158+
11159+
11160+    def test_get_readonly_version(self):
11161+        d = self.mdmf_node.get_best_readable_version()
11162+        d.addCallback(lambda bv:
11163+            self.failUnless(bv.is_readonly()))
11164+        d.addCallback(lambda ignored:
11165+            self.sdmf_node.get_best_readable_version())
11166+        d.addCallback(lambda bv:
11167+            self.failUnless(bv.is_readonly()))
11168+        return d
11169+
11170+
11171+    def test_get_mutable_version(self):
11172+        d = self.mdmf_node.get_best_mutable_version()
11173+        d.addCallback(lambda bv:
11174+            self.failIf(bv.is_readonly()))
11175+        d.addCallback(lambda ignored:
11176+            self.sdmf_node.get_best_mutable_version())
11177+        d.addCallback(lambda bv:
11178+            self.failIf(bv.is_readonly()))
11179+        return d
11180+
11181+
11182+    def test_toplevel_overwrite(self):
11183+        new_data = MutableData("foo bar baz" * 100000)
11184+        new_small_data = MutableData("foo bar baz" * 10)
11185+        d = self.mdmf_node.overwrite(new_data)
11186+        d.addCallback(lambda ignored:
11187+            self.mdmf_node.download_best_version())
11188+        d.addCallback(lambda data:
11189+            self.failUnlessEqual(data, "foo bar baz" * 100000))
11190+        d.addCallback(lambda ignored:
11191+            self.sdmf_node.overwrite(new_small_data))
11192+        d.addCallback(lambda ignored:
11193+            self.sdmf_node.download_best_version())
11194+        d.addCallback(lambda data:
11195+            self.failUnlessEqual(data, "foo bar baz" * 10))
11196+        return d
11197+
11198+
11199+    def test_toplevel_modify(self):
11200+        def modifier(old_contents, servermap, first_time):
11201+            return old_contents + "modified"
11202+        d = self.mdmf_node.modify(modifier)
11203+        d.addCallback(lambda ignored:
11204+            self.mdmf_node.download_best_version())
11205+        d.addCallback(lambda data:
11206+            self.failUnlessIn("modified", data))
11207+        d.addCallback(lambda ignored:
11208+            self.sdmf_node.modify(modifier))
11209+        d.addCallback(lambda ignored:
11210+            self.sdmf_node.download_best_version())
11211+        d.addCallback(lambda data:
11212+            self.failUnlessIn("modified", data))
11213+        return d
11214+
11215+
11216+    def test_version_modify(self):
11217+        # TODO: When we can publish multiple versions, alter this test
11218+        # to modify a version other than the best usable version, then
11219+        # check that the best recoverable version is the one we modified.
11220+        def modifier(old_contents, servermap, first_time):
11221+            return old_contents + "modified"
11222+        d = self.mdmf_node.modify(modifier)
11223+        d.addCallback(lambda ignored:
11224+            self.mdmf_node.download_best_version())
11225+        d.addCallback(lambda data:
11226+            self.failUnlessIn("modified", data))
11227+        d.addCallback(lambda ignored:
11228+            self.sdmf_node.modify(modifier))
11229+        d.addCallback(lambda ignored:
11230+            self.sdmf_node.download_best_version())
11231+        d.addCallback(lambda data:
11232+            self.failUnlessIn("modified", data))
11233+        return d
11234+
11235+
11236+    def test_download_version(self):
11237+        d = self.publish_multiple()
11238+        # We want to have two recoverable versions on the grid.
11239+        d.addCallback(lambda res:
11240+                      self._set_versions({0:0,2:0,4:0,6:0,8:0,
11241+                                          1:1,3:1,5:1,7:1,9:1}))
11242+        # Now try to download each version. We should get the plaintext
11243+        # associated with that version.
11244+        d.addCallback(lambda ignored:
11245+            self._fn.get_servermap(mode=MODE_READ))
11246+        def _got_servermap(smap):
11247+            versions = smap.recoverable_versions()
11248+            assert len(versions) == 2
11249+
11250+            self.servermap = smap
11251+            self.version1, self.version2 = versions
11252+            assert self.version1 != self.version2
11253+
11254+            self.version1_seqnum = self.version1[0]
11255+            self.version2_seqnum = self.version2[0]
11256+            self.version1_index = self.version1_seqnum - 1
11257+            self.version2_index = self.version2_seqnum - 1
11258+
11259+        d.addCallback(_got_servermap)
11260+        d.addCallback(lambda ignored:
11261+            self._fn.download_version(self.servermap, self.version1))
11262+        d.addCallback(lambda results:
11263+            self.failUnlessEqual(self.CONTENTS[self.version1_index],
11264+                                 results))
11265+        d.addCallback(lambda ignored:
11266+            self._fn.download_version(self.servermap, self.version2))
11267+        d.addCallback(lambda results:
11268+            self.failUnlessEqual(self.CONTENTS[self.version2_index],
11269+                                 results))
11270+        return d
11271+
11272+
11273+    def test_partial_read(self):
11274+        # read only a few bytes at a time, and see that the results are
11275+        # what we expect.
11276+        d = self.mdmf_node.get_best_readable_version()
11277+        def _read_data(version):
11278+            c = consumer.MemoryConsumer()
11279+            d2 = defer.succeed(None)
11280+            for i in xrange(0, len(self.data), 10000):
11281+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
11282+            d2.addCallback(lambda ignored:
11283+                self.failUnlessEqual(self.data, "".join(c.chunks)))
11284+            return d2
11285+        d.addCallback(_read_data)
11286+        return d
11287+
11288+
11289+    def test_read(self):
11290+        d = self.mdmf_node.get_best_readable_version()
11291+        def _read_data(version):
11292+            c = consumer.MemoryConsumer()
11293+            d2 = defer.succeed(None)
11294+            d2.addCallback(lambda ignored: version.read(c))
11295+            d2.addCallback(lambda ignored:
11296+                self.failUnlessEqual("".join(c.chunks), self.data))
11297+            return d2
11298+        d.addCallback(_read_data)
11299+        return d
11300+
11301+
11302+    def test_download_best_version(self):
11303+        d = self.mdmf_node.download_best_version()
11304+        d.addCallback(lambda data:
11305+            self.failUnlessEqual(data, self.data))
11306+        d.addCallback(lambda ignored:
11307+            self.sdmf_node.download_best_version())
11308+        d.addCallback(lambda data:
11309+            self.failUnlessEqual(data, self.small_data))
11310+        return d
11311+
11312+
11313+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
11314+    def setUp(self):
11315+        GridTestMixin.setUp(self)
11316+        self.basedir = self.mktemp()
11317+        self.set_up_grid()
11318+        self.c = self.g.clients[0]
11319+        self.nm = self.c.nodemaker
11320+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11321+        self.small_data = "test data" * 10 # about 90 B; SDMF
11322+        return self.do_upload()
11323+
11324+
11325+    def do_upload(self):
11326+        d1 = self.nm.create_mutable_file(MutableData(self.data),
11327+                                         version=MDMF_VERSION)
11328+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
11329+        dl = gatherResults([d1, d2])
11330+        def _then((n1, n2)):
11331+            assert isinstance(n1, MutableFileNode)
11332+            assert isinstance(n2, MutableFileNode)
11333+
11334+            self.mdmf_node = n1
11335+            self.sdmf_node = n2
11336+        dl.addCallback(_then)
11337+        return dl
11338+
11339+
11340+    def test_append(self):
11341+        # We should be able to append data to the end of a mutable
11342+        # file and get what we expect.
11343+        new_data = self.data + "appended"
11344+        d = self.mdmf_node.get_best_mutable_version()
11345+        d.addCallback(lambda mv:
11346+            mv.update(MutableData("appended"), len(self.data)))
11347+        d.addCallback(lambda ignored:
11348+            self.mdmf_node.download_best_version())
11349+        d.addCallback(lambda results:
11350+            self.failUnlessEqual(results, new_data))
11351+        return d
11352+    test_append.timeout = 15
11353+
11354+
11355+    def test_replace(self):
11356+        # We should be able to replace data in the middle of a mutable
11357+        # file and get what we expect back.
11358+        new_data = self.data[:100]
11359+        new_data += "appended"
11360+        new_data += self.data[108:]
11361+        d = self.mdmf_node.get_best_mutable_version()
11362+        d.addCallback(lambda mv:
11363+            mv.update(MutableData("appended"), 100))
11364+        d.addCallback(lambda ignored:
11365+            self.mdmf_node.download_best_version())
11366+        d.addCallback(lambda results:
11367+            self.failUnlessEqual(results, new_data))
11368+        return d
11369+
11370+
11371+    def test_replace_and_extend(self):
11372+        # We should be able to replace data in the middle of a mutable
11373+        # file and extend that mutable file and get what we expect.
11374+        new_data = self.data[:100]
11375+        new_data += "modified " * 100000
11376+        d = self.mdmf_node.get_best_mutable_version()
11377+        d.addCallback(lambda mv:
11378+            mv.update(MutableData("modified " * 100000), 100))
11379+        d.addCallback(lambda ignored:
11380+            self.mdmf_node.download_best_version())
11381+        d.addCallback(lambda results:
11382+            self.failUnlessEqual(results, new_data))
11383+        return d
11384+
11385+
11386+    def test_append_power_of_two(self):
11387+        # If we attempt to extend a mutable file so that its segment
11388+        # count crosses a power-of-two boundary, the update operation
11389+        # should know how to reencode the file.
11390+
11391+        # Note that the data populating self.mdmf_node is about 900 KiB
11392+        # long -- at the default maximum segment size of 128 KiB, that
11393+        # is 7 segments. Adding 2 more segments' worth of data pushes
11394+        # the segment count to 9, across the power-of-two boundary at 8.
11395+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
11396+        new_data = self.data + (segment * 2)
11397+        d = self.mdmf_node.get_best_mutable_version()
11398+        d.addCallback(lambda mv:
11399+            mv.update(MutableData(segment * 2), len(self.data)))
11400+        d.addCallback(lambda ignored:
11401+            self.mdmf_node.download_best_version())
11402+        d.addCallback(lambda results:
11403+            self.failUnlessEqual(results, new_data))
11404+        return d
11405+    test_append_power_of_two.timeout = 15
11406+
11407+
11408+    def test_update_sdmf(self):
11409+        # Running update on a single-segment file should still work.
11410+        new_data = self.small_data + "appended"
11411+        d = self.sdmf_node.get_best_mutable_version()
11412+        d.addCallback(lambda mv:
11413+            mv.update(MutableData("appended"), len(self.small_data)))
11414+        d.addCallback(lambda ignored:
11415+            self.sdmf_node.download_best_version())
11416+        d.addCallback(lambda results:
11417+            self.failUnlessEqual(results, new_data))
11418+        return d
11419+
11420+    def test_replace_in_last_segment(self):
11421+        # The wrapper should know how to handle the tail segment
11422+        # appropriately.
11423+        replace_offset = len(self.data) - 100
11424+        new_data = self.data[:replace_offset] + "replaced"
11425+        rest_offset = replace_offset + len("replaced")
11426+        new_data += self.data[rest_offset:]
11427+        d = self.mdmf_node.get_best_mutable_version()
11428+        d.addCallback(lambda mv:
11429+            mv.update(MutableData("replaced"), replace_offset))
11430+        d.addCallback(lambda ignored:
11431+            self.mdmf_node.download_best_version())
11432+        d.addCallback(lambda results:
11433+            self.failUnlessEqual(results, new_data))
11434+        return d
11435+
11436+
11437+    def test_multiple_segment_replace(self):
11438+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
11439+        new_data = self.data[:replace_offset]
11440+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
11441+        new_data += 2 * new_segment
11442+        new_data += "replaced"
11443+        rest_offset = len(new_data)
11444+        new_data += self.data[rest_offset:]
11445+        d = self.mdmf_node.get_best_mutable_version()
11446+        d.addCallback(lambda mv:
11447+            mv.update(MutableData((2 * new_segment) + "replaced"),
11448+                      replace_offset))
11449+        d.addCallback(lambda ignored:
11450+            self.mdmf_node.download_best_version())
11451+        d.addCallback(lambda results:
11452+            self.failUnlessEqual(results, new_data))
11453+        return d
11454hunk ./src/allmydata/test/test_sftp.py 32
11455 
11456 from allmydata.util.consumer import download_to_data
11457 from allmydata.immutable import upload
11458+from allmydata.mutable import publish
11459 from allmydata.test.no_network import GridTestMixin
11460 from allmydata.test.common import ShouldFailMixin
11461 from allmydata.test.common_util import ReallyEqualMixin
11462hunk ./src/allmydata/test/test_sftp.py 84
11463         return d
11464 
11465     def _set_up_tree(self):
11466-        d = self.client.create_mutable_file("mutable file contents")
11467+        u = publish.MutableData("mutable file contents")
11468+        d = self.client.create_mutable_file(u)
11469         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
11470         def _created_mutable(n):
11471             self.mutable = n
11472hunk ./src/allmydata/test/test_sftp.py 1334
11473         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
11474         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
11475         return d
11476+    test_makeDirectory.timeout = 15
11477 
11478     def test_execCommand_and_openShell(self):
11479         class FakeProtocol:
11480hunk ./src/allmydata/test/test_system.py 25
11481 from allmydata.monitor import Monitor
11482 from allmydata.mutable.common import NotWriteableError
11483 from allmydata.mutable import layout as mutable_layout
11484+from allmydata.mutable.publish import MutableData
11485 from foolscap.api import DeadReferenceError
11486 from twisted.python.failure import Failure
11487 from twisted.web.client import getPage
11488hunk ./src/allmydata/test/test_system.py 463
11489     def test_mutable(self):
11490         self.basedir = "system/SystemTest/test_mutable"
11491         DATA = "initial contents go here."  # 25 bytes % 3 != 0
11492+        DATA_uploadable = MutableData(DATA)
11493         NEWDATA = "new contents yay"
11494hunk ./src/allmydata/test/test_system.py 465
11495+        NEWDATA_uploadable = MutableData(NEWDATA)
11496         NEWERDATA = "this is getting old"
11497hunk ./src/allmydata/test/test_system.py 467
11498+        NEWERDATA_uploadable = MutableData(NEWERDATA)
11499 
11500         d = self.set_up_nodes(use_key_generator=True)
11501 
11502hunk ./src/allmydata/test/test_system.py 474
11503         def _create_mutable(res):
11504             c = self.clients[0]
11505             log.msg("starting create_mutable_file")
11506-            d1 = c.create_mutable_file(DATA)
11507+            d1 = c.create_mutable_file(DATA_uploadable)
11508             def _done(res):
11509                 log.msg("DONE: %s" % (res,))
11510                 self._mutable_node_1 = res
11511hunk ./src/allmydata/test/test_system.py 561
11512             self.failUnlessEqual(res, DATA)
11513             # replace the data
11514             log.msg("starting replace1")
11515-            d1 = newnode.overwrite(NEWDATA)
11516+            d1 = newnode.overwrite(NEWDATA_uploadable)
11517             d1.addCallback(lambda res: newnode.download_best_version())
11518             return d1
11519         d.addCallback(_check_download_3)
11520hunk ./src/allmydata/test/test_system.py 575
11521             newnode2 = self.clients[3].create_node_from_uri(uri)
11522             self._newnode3 = self.clients[3].create_node_from_uri(uri)
11523             log.msg("starting replace2")
11524-            d1 = newnode1.overwrite(NEWERDATA)
11525+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
11526             d1.addCallback(lambda res: newnode2.download_best_version())
11527             return d1
11528         d.addCallback(_check_download_4)
11529hunk ./src/allmydata/test/test_system.py 645
11530         def _check_empty_file(res):
11531             # make sure we can create empty files, this usually screws up the
11532             # segsize math
11533-            d1 = self.clients[2].create_mutable_file("")
11534+            d1 = self.clients[2].create_mutable_file(MutableData(""))
11535             d1.addCallback(lambda newnode: newnode.download_best_version())
11536             d1.addCallback(lambda res: self.failUnlessEqual("", res))
11537             return d1
11538hunk ./src/allmydata/test/test_system.py 676
11539                                  self.key_generator_svc.key_generator.pool_size + size_delta)
11540 
11541         d.addCallback(check_kg_poolsize, 0)
11542-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
11543+        d.addCallback(lambda junk:
11544+            self.clients[3].create_mutable_file(MutableData('hello, world')))
11545         d.addCallback(check_kg_poolsize, -1)
11546         d.addCallback(lambda junk: self.clients[3].create_dirnode())
11547         d.addCallback(check_kg_poolsize, -2)
11548hunk ./src/allmydata/test/test_web.py 28
11549 from allmydata.util.encodingutil import to_str
11550 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
11551      create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
11552-from allmydata.interfaces import IMutableFileNode
11553+from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
11554 from allmydata.mutable import servermap, publish, retrieve
11555 import allmydata.test.common_util as testutil
11556 from allmydata.test.no_network import GridTestMixin
11557hunk ./src/allmydata/test/test_web.py 57
11558         return FakeCHKFileNode(cap)
11559     def _create_mutable(self, cap):
11560         return FakeMutableFileNode(None, None, None, None).init_from_cap(cap)
11561-    def create_mutable_file(self, contents="", keysize=None):
11562+    def create_mutable_file(self, contents="", keysize=None,
11563+                            version=SDMF_VERSION):
11564         n = FakeMutableFileNode(None, None, None, None)
11565hunk ./src/allmydata/test/test_web.py 60
11566+        n.set_version(version)
11567         return n.create(contents)
11568 
11569 class FakeUploader(service.Service):
11570hunk ./src/allmydata/test/test_web.py 153
11571         self.nodemaker = FakeNodeMaker(None, self._secret_holder, None,
11572                                        self.uploader, None,
11573                                        None, None)
11574+        self.mutable_file_default = SDMF_VERSION
11575 
11576     def startService(self):
11577         return service.MultiService.startService(self)
11578hunk ./src/allmydata/test/test_web.py 756
11579                              self.PUT, base + "/@@name=/blah.txt", "")
11580         return d
11581 
11582+
11583     def test_GET_DIRURL_named_bad(self):
11584         base = "/file/%s" % urllib.quote(self._foo_uri)
11585         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
11586hunk ./src/allmydata/test/test_web.py 872
11587                                                       self.NEWFILE_CONTENTS))
11588         return d
11589 
11590+    def test_PUT_NEWFILEURL_unlinked_mdmf(self):
11591+        # this should get us a few segments of an MDMF mutable file,
11592+        # which we can then test for.
11593+        contents = self.NEWFILE_CONTENTS * 300000
11594+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
11595+                     contents)
11596+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11597+        d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
11598+        return d
11599+
11600+    def test_PUT_NEWFILEURL_unlinked_sdmf(self):
11601+        contents = self.NEWFILE_CONTENTS * 300000
11602+        d = self.PUT("/uri?mutable=true&mutable-type=sdmf",
11603+                     contents)
11604+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11605+        d.addCallback(lambda json: self.failUnlessIn("sdmf", json))
11606+        return d
11607+
11608     def test_PUT_NEWFILEURL_range_bad(self):
11609         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
11610         target = self.public_url + "/foo/new.txt"
11611hunk ./src/allmydata/test/test_web.py 922
11612         return d
11613 
11614     def test_PUT_NEWFILEURL_mutable_toobig(self):
11615-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
11616-                             "413 Request Entity Too Large",
11617-                             "SDMF is limited to one segment, and 10001 > 10000",
11618-                             self.PUT,
11619-                             self.public_url + "/foo/new.txt?mutable=true",
11620-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
11621+        # It is okay to upload large mutable files, so we should be able
11622+        # to do that.
11623+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
11624+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
11625         return d
11626 
11627     def test_PUT_NEWFILEURL_replace(self):
11628hunk ./src/allmydata/test/test_web.py 1020
11629         d.addCallback(_check1)
11630         return d
11631 
11632+    def test_GET_FILEURL_json_mutable_type(self):
11633+        # The JSON should include mutable-type, which says whether the
11634+        # file is SDMF or MDMF
11635+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
11636+                     self.NEWFILE_CONTENTS * 300000)
11637+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11638+        def _got_json(json, version):
11639+            data = simplejson.loads(json)
11640+            assert "filenode" == data[0]
11641+            data = data[1]
11642+            assert isinstance(data, dict)
11643+
11644+            self.failUnlessIn("mutable-type", data)
11645+            self.failUnlessEqual(data['mutable-type'], version)
11646+
11647+        d.addCallback(_got_json, "mdmf")
11648+        # Now make an SDMF file and check that it is reported correctly.
11649+        d.addCallback(lambda ignored:
11650+            self.PUT("/uri?mutable=true&mutable-type=sdmf",
11651+                      self.NEWFILE_CONTENTS * 300000))
11652+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11653+        d.addCallback(_got_json, "sdmf")
11654+        return d
11655+
11656     def test_GET_FILEURL_json_missing(self):
11657         d = self.GET(self.public_url + "/foo/missing?json")
11658         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
11659hunk ./src/allmydata/test/test_web.py 1082
11660         d.addBoth(self.should404, "test_GET_FILEURL_uri_missing")
11661         return d
11662 
11663-    def test_GET_DIRECTORY_html_banner(self):
11664+    def test_GET_DIRECTORY_html(self):
11665         d = self.GET(self.public_url + "/foo", followRedirect=True)
11666         def _check(res):
11667             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
11668hunk ./src/allmydata/test/test_web.py 1086
11669+            self.failUnlessIn("mutable-type-mdmf", res)
11670+            self.failUnlessIn("mutable-type-sdmf", res)
11671         d.addCallback(_check)
11672         return d
11673 
11674hunk ./src/allmydata/test/test_web.py 1091
11675+    def test_GET_root_html(self):
11676+        # make sure that we have the option to upload an unlinked
11677+        # mutable file in SDMF and MDMF formats.
11678+        d = self.GET("/")
11679+        def _got_html(html):
11680+            # These are radio buttons that allow the user to toggle
11681+            # whether a particular mutable file is MDMF or SDMF.
11682+            self.failUnlessIn("mutable-type-mdmf", html)
11683+            self.failUnlessIn("mutable-type-sdmf", html)
11684+        d.addCallback(_got_html)
11685+        return d
11686+
11687     def test_GET_DIRURL(self):
11688         # the addSlash means we get a redirect here
11689         # from /uri/$URI/foo/ , we need ../../../ to get back to the root
11690hunk ./src/allmydata/test/test_web.py 1192
11691         d.addCallback(self.failUnlessIsFooJSON)
11692         return d
11693 
11694+    def test_GET_DIRURL_json_mutable_type(self):
11695+        d = self.PUT(self.public_url + \
11696+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
11697+                     self.NEWFILE_CONTENTS * 300000)
11698+        d.addCallback(lambda ignored:
11699+            self.PUT(self.public_url + \
11700+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
11701+                     self.NEWFILE_CONTENTS * 300000))
11702+        # Now we have an MDMF and SDMF file in the directory. If we GET
11703+        # its JSON, we should see their encodings.
11704+        d.addCallback(lambda ignored:
11705+            self.GET(self.public_url + "/foo?t=json"))
11706+        def _got_json(json):
11707+            data = simplejson.loads(json)
11708+            assert data[0] == "dirnode"
11709+
11710+            data = data[1]
11711+            kids = data['children']
11712+
11713+            mdmf_data = kids['mdmf.txt'][1]
11714+            self.failUnlessIn("mutable-type", mdmf_data)
11715+            self.failUnlessEqual(mdmf_data['mutable-type'], "mdmf")
11716+
11717+            sdmf_data = kids['sdmf.txt'][1]
11718+            self.failUnlessIn("mutable-type", sdmf_data)
11719+            self.failUnlessEqual(sdmf_data['mutable-type'], "sdmf")
11720+        d.addCallback(_got_json)
11721+        return d
11722+
11723 
11724     def test_POST_DIRURL_manifest_no_ophandle(self):
11725         d = self.shouldFail2(error.Error,
11726hunk ./src/allmydata/test/test_web.py 1775
11727         return d
11728 
11729     def test_POST_upload_no_link_mutable_toobig(self):
11730-        d = self.shouldFail2(error.Error,
11731-                             "test_POST_upload_no_link_mutable_toobig",
11732-                             "413 Request Entity Too Large",
11733-                             "SDMF is limited to one segment, and 10001 > 10000",
11734-                             self.POST,
11735-                             "/uri", t="upload", mutable="true",
11736-                             file=("new.txt",
11737-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
11738+        # The SDMF size limit is no longer in place, so we should be
11739+        # able to upload mutable files that are as large as we want them
11740+        # to be.
11741+        d = self.POST("/uri", t="upload", mutable="true",
11742+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
11743         return d
11744 
11745hunk ./src/allmydata/test/test_web.py 1782
11746+
11747+    def test_POST_upload_mutable_type_unlinked(self):
11748+        d = self.POST("/uri?t=upload&mutable=true&mutable-type=sdmf",
11749+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
11750+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11751+        def _got_json(json, version):
11752+            data = simplejson.loads(json)
11753+            data = data[1]
11754+
11755+            self.failUnlessIn("mutable-type", data)
11756+            self.failUnlessEqual(data['mutable-type'], version)
11757+        d.addCallback(_got_json, "sdmf")
11758+        d.addCallback(lambda ignored:
11759+            self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
11760+                      file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
11761+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
11762+        d.addCallback(_got_json, "mdmf")
11763+        return d
11764+
11765+    def test_POST_upload_mutable_type(self):
11766+        d = self.POST(self.public_url + \
11767+                      "/foo?t=upload&mutable=true&mutable-type=sdmf",
11768+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
11769+        fn = self._foo_node
11770+        def _got_cap(filecap, filename):
11771+            filenameu = unicode(filename)
11772+            self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
11773+            return self.GET(self.public_url + "/foo/%s?t=json" % filename)
11774+        d.addCallback(_got_cap, "sdmf.txt")
11775+        def _got_json(json, version):
11776+            data = simplejson.loads(json)
11777+            data = data[1]
11778+
11779+            self.failUnlessIn("mutable-type", data)
11780+            self.failUnlessEqual(data['mutable-type'], version)
11781+        d.addCallback(_got_json, "sdmf")
11782+        d.addCallback(lambda ignored:
11783+            self.POST(self.public_url + \
11784+                      "/foo?t=upload&mutable=true&mutable-type=mdmf",
11785+                      file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
11786+        d.addCallback(_got_cap, "mdmf.txt")
11787+        d.addCallback(_got_json, "mdmf")
11788+        return d
11789+
11790     def test_POST_upload_mutable(self):
11791         # this creates a mutable file
11792         d = self.POST(self.public_url + "/foo", t="upload", mutable="true",
11793hunk ./src/allmydata/test/test_web.py 1950
11794             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
11795         d.addCallback(_got_headers)
11796 
11797-        # make sure that size errors are displayed correctly for overwrite
11798-        d.addCallback(lambda res:
11799-                      self.shouldFail2(error.Error,
11800-                                       "test_POST_upload_mutable-toobig",
11801-                                       "413 Request Entity Too Large",
11802-                                       "SDMF is limited to one segment, and 10001 > 10000",
11803-                                       self.POST,
11804-                                       self.public_url + "/foo", t="upload",
11805-                                       mutable="true",
11806-                                       file=("new.txt",
11807-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
11808-                                       ))
11809-
11810+        # make sure that outdated size limits aren't enforced anymore.
11811+        d.addCallback(lambda ignored:
11812+            self.POST(self.public_url + "/foo", t="upload",
11813+                      mutable="true",
11814+                      file=("new.txt",
11815+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
11816         d.addErrback(self.dump_error)
11817         return d
11818 
11819hunk ./src/allmydata/test/test_web.py 1960
11820     def test_POST_upload_mutable_toobig(self):
11821-        d = self.shouldFail2(error.Error,
11822-                             "test_POST_upload_mutable_toobig",
11823-                             "413 Request Entity Too Large",
11824-                             "SDMF is limited to one segment, and 10001 > 10000",
11825-                             self.POST,
11826-                             self.public_url + "/foo",
11827-                             t="upload", mutable="true",
11828-                             file=("new.txt",
11829-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
11830+        # SDMF had a size limit that was removed a while ago. MDMF has
11831+        # never had a size limit. Test to make sure that we do not
11832+        # encounter errors when trying to upload large mutable files,
11833+        # since the code should no longer prohibit them.
11835+        d = self.POST(self.public_url + "/foo",
11836+                      t="upload", mutable="true",
11837+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
11838         return d
11839 
11840     def dump_error(self, f):
11841hunk ./src/allmydata/test/test_web.py 2970
11842                                                       contents))
11843         return d
11844 
11845+    def test_PUT_NEWFILEURL_mdmf(self):
11846+        new_contents = self.NEWFILE_CONTENTS * 300000
11847+        d = self.PUT(self.public_url + \
11848+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
11849+                     new_contents)
11850+        d.addCallback(lambda ignored:
11851+            self.GET(self.public_url + "/foo/mdmf.txt?t=json"))
11852+        def _got_json(json):
11853+            data = simplejson.loads(json)
11854+            data = data[1]
11855+            self.failUnlessIn("mutable-type", data)
11856+            self.failUnlessEqual(data['mutable-type'], "mdmf")
11857+        d.addCallback(_got_json)
11858+        return d
11859+
11860+    def test_PUT_NEWFILEURL_sdmf(self):
11861+        new_contents = self.NEWFILE_CONTENTS * 300000
11862+        d = self.PUT(self.public_url + \
11863+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
11864+                     new_contents)
11865+        d.addCallback(lambda ignored:
11866+            self.GET(self.public_url + "/foo/sdmf.txt?t=json"))
11867+        def _got_json(json):
11868+            data = simplejson.loads(json)
11869+            data = data[1]
11870+            self.failUnlessIn("mutable-type", data)
11871+            self.failUnlessEqual(data['mutable-type'], "sdmf")
11872+        d.addCallback(_got_json)
11873+        return d
11874+
11875     def test_PUT_NEWFILEURL_uri_replace(self):
11876         contents, n, new_uri = self.makefile(8)
11877         d = self.PUT(self.public_url + "/foo/bar.txt?t=uri", new_uri)
11878hunk ./src/allmydata/test/test_web.py 3121
11879         d.addCallback(_done)
11880         return d
11881 
11882+
11883+    def test_PUT_update_at_offset(self):
11884+        file_contents = "test file" * 100000 # about 900 KiB
11885+        d = self.PUT("/uri?mutable=true", file_contents)
11886+        def _then(filecap):
11887+            self.filecap = filecap
11888+            new_data = file_contents[:100]
11889+            new = "replaced and so on"
11890+            new_data += new
11891+            new_data += file_contents[len(new_data):]
11892+            assert len(new_data) == len(file_contents)
11893+            self.new_data = new_data
11894+        d.addCallback(_then)
11895+        d.addCallback(lambda ignored:
11896+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
11897+                     "replaced and so on"))
11898+        def _get_data(filecap):
11899+            n = self.s.create_node_from_uri(filecap)
11900+            return n.download_best_version()
11901+        d.addCallback(_get_data)
11902+        d.addCallback(lambda results:
11903+            self.failUnlessEqual(results, self.new_data))
11904+        # Now try appending things to the file
11905+        d.addCallback(lambda ignored:
11906+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
11907+                     "puppies" * 100))
11908+        d.addCallback(_get_data)
11909+        d.addCallback(lambda results:
11910+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
11911+        return d
11912+
11913+
11914+    def test_PUT_update_at_offset_immutable(self):
11915+        file_contents = "Test file" * 100000
11916+        d = self.PUT("/uri", file_contents)
11917+        def _then(filecap):
11918+            self.filecap = filecap
11919+        d.addCallback(_then)
11920+        d.addCallback(lambda ignored:
11921+            self.shouldHTTPError("test immutable update",
11922+                                 400, "Bad Request",
11923+                                 "immutable",
11924+                                 self.PUT,
11925+                                 "/uri/%s?offset=50" % self.filecap,
11926+                                 "foo"))
11927+        return d
11928+
11929+
11930     def test_bad_method(self):
11931         url = self.webish_url + self.public_url + "/foo/bar.txt"
11932         d = self.shouldHTTPError("test_bad_method",
11933hunk ./src/allmydata/test/test_web.py 3422
11934         def _stash_mutable_uri(n, which):
11935             self.uris[which] = n.get_uri()
11936             assert isinstance(self.uris[which], str)
11937-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
11938+        d.addCallback(lambda ign:
11939+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
11940         d.addCallback(_stash_mutable_uri, "corrupt")
11941         d.addCallback(lambda ign:
11942                       c0.upload(upload.Data("literal", convergence="")))
11943hunk ./src/allmydata/test/test_web.py 3569
11944         def _stash_mutable_uri(n, which):
11945             self.uris[which] = n.get_uri()
11946             assert isinstance(self.uris[which], str)
11947-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
11948+        d.addCallback(lambda ign:
11949+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
11950         d.addCallback(_stash_mutable_uri, "corrupt")
11951 
11952         def _compute_fileurls(ignored):
11953hunk ./src/allmydata/test/test_web.py 4232
11954         def _stash_mutable_uri(n, which):
11955             self.uris[which] = n.get_uri()
11956             assert isinstance(self.uris[which], str)
11957-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
11958+        d.addCallback(lambda ign:
11959+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
11960         d.addCallback(_stash_mutable_uri, "mutable")
11961 
11962         def _compute_fileurls(ignored):
11963hunk ./src/allmydata/test/test_web.py 4332
11964                                                         convergence="")))
11965         d.addCallback(_stash_uri, "small")
11966 
11967-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
11968+        d.addCallback(lambda ign:
11969+            c0.create_mutable_file(publish.MutableData("mutable")))
11970         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
11971         d.addCallback(_stash_uri, "mutable")
11972 
11973}
11974[web: Alter the webapi to get along with and take advantage of the MDMF changes
11975Kevan Carstensen <kevan@isnotajoke.com>**20100813235106
11976 Ignore-this: d211fadf6fba0bb17d43aa0dc962516d
11977 
11978 The main benefit that the webapi gets from MDMF, at least initially, is
11979 the ability to do a streaming download of an MDMF mutable file. It also
11980 exposes a way (through the PUT verb) to append to or otherwise modify
11981 (in-place) an MDMF mutable file.
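 
 As an illustrative sketch of that PUT interface (the cap and offset
 here are placeholders, not values taken from this patch), a client
 can modify an existing mutable file in place with:
 
   PUT /uri/$FILECAP?replace=True&offset=100
   <request body: replacement bytes>
 
 which writes the request body into the file starting at byte 100;
 using an offset equal to the file's current length appends the body
 instead, as the test_PUT_update_at_offset test below exercises.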
11982] {
11983hunk ./src/allmydata/web/common.py 12
11984 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
11985      FileTooLargeError, NotEnoughSharesError, NoSharesError, \
11986      EmptyPathnameComponentError, MustBeDeepImmutableError, \
11987-     MustBeReadonlyError, MustNotBeUnknownRWError
11988+     MustBeReadonlyError, MustNotBeUnknownRWError, SDMF_VERSION, MDMF_VERSION
11989 from allmydata.mutable.common import UnrecoverableFileError
11990 from allmydata.util import abbreviate
11991 from allmydata.util.encodingutil import to_str
11992hunk ./src/allmydata/web/common.py 34
11993     else:
11994         return boolean_of_arg(replace)
11995 
11996+
11997+def parse_mutable_type_arg(arg):
11998+    if not arg:
11999+        return None # interpreted by the caller as "let the nodemaker decide"
12000+
12001+    arg = arg.lower()
12002+    assert arg in ("mdmf", "sdmf")
12003+
12004+    if arg == "mdmf":
12005+        return MDMF_VERSION
12006+
12007+    return SDMF_VERSION
12008+
12009+
12010+def parse_offset_arg(offset):
12011+    # XXX: This will raise a ValueError when invoked on something that
12012+    # does not parse as an integer. Is that okay? Or do we want a
12013+    # better error message? Since this call is going to be used by
12014+    # programmers and their tools rather than by users (through the
12015+    # wui), raising ValueError is probably acceptable.
12016+    offset = int(offset)
12017+    return offset
12018+
12019+
12020 def get_root(ctx_or_req):
12021     req = IRequest(ctx_or_req)
12022     # the addSlash=True gives us one extra (empty) segment
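(A hypothetical call sequence for the two helpers above, mirroring how
the filenode handlers later in this patch use them; the literal "100"
is an illustrative value:

  version = parse_mutable_type_arg(get_arg(req, "mutable-type", None))
  # -> None (let the nodemaker decide), SDMF_VERSION, or MDMF_VERSION
  offset = parse_offset_arg("100")  # -> 100; non-integers raise ValueError
)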
12023hunk ./src/allmydata/web/directory.py 19
12024 from allmydata.uri import from_string_dirnode
12025 from allmydata.interfaces import IDirectoryNode, IFileNode, IFilesystemNode, \
12026      IImmutableFileNode, IMutableFileNode, ExistingChildError, \
12027-     NoSuchChildError, EmptyPathnameComponentError
12028+     NoSuchChildError, EmptyPathnameComponentError, SDMF_VERSION, MDMF_VERSION
12029 from allmydata.monitor import Monitor, OperationCancelledError
12030 from allmydata import dirnode
12031 from allmydata.web.common import text_plain, WebError, \
12032hunk ./src/allmydata/web/directory.py 788
12033             T.input(type="submit", value="Upload"),
12034             " Mutable?:",
12035             T.input(type="checkbox", name="mutable"),
12036+            T.input(type='radio', name='mutable-type', id='mutable-type-sdmf', value='sdmf', checked='checked'), T.label(for_="mutable-type-sdmf")["SDMF"],
12037+            T.input(type='radio', name='mutable-type', id='mutable-type-mdmf',
12038+                    value='mdmf'), T.label(for_="mutable-type-mdmf")["MDMF (experimental)"],
12039             ]]
12040         forms.append(T.div(class_="freeform-form")[upload])
12041 
12042hunk ./src/allmydata/web/directory.py 826
12043                 kiddata = ("filenode", {'size': childnode.get_size(),
12044                                         'mutable': childnode.is_mutable(),
12045                                         })
12046+                if childnode.is_mutable() and \
12047+                    childnode.get_version() is not None:
12048+                    mutable_type = childnode.get_version()
12049+                    assert mutable_type in (SDMF_VERSION, MDMF_VERSION)
12050+
12051+                    if mutable_type == MDMF_VERSION:
12052+                        mutable_type = "mdmf"
12053+                    else:
12054+                        mutable_type = "sdmf"
12055+                    kiddata[1]['mutable-type'] = mutable_type
12056+
12057             elif IDirectoryNode.providedBy(childnode):
12058                 kiddata = ("dirnode", {'mutable': childnode.is_mutable()})
12059             else:
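(Illustrative only -- hypothetical values, not output captured from
this patch: with the hunk above, a mutable child's entry in its parent
directory's t=json output would look something like

  ["filenode", {"size": 512,
                "mutable": true,
                "mutable-type": "mdmf"}]

where "mutable-type" is "mdmf" or "sdmf" according to the child's
version, matching what the test_GET_DIRURL_json_mutable_type test
asserts.)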
12060hunk ./src/allmydata/web/filenode.py 9
12061 from nevow import url, rend
12062 from nevow.inevow import IRequest
12063 
12064-from allmydata.interfaces import ExistingChildError
12065+from allmydata.interfaces import ExistingChildError, SDMF_VERSION, MDMF_VERSION
12066 from allmydata.monitor import Monitor
12067 from allmydata.immutable.upload import FileHandle
12068hunk ./src/allmydata/web/filenode.py 12
12069+from allmydata.mutable.publish import MutableFileHandle
12070+from allmydata.mutable.common import MODE_READ
12071 from allmydata.util import log, base32
12072 
12073 from allmydata.web.common import text_plain, WebError, RenderMixin, \
12074hunk ./src/allmydata/web/filenode.py 18
12075      boolean_of_arg, get_arg, should_create_intermediate_directories, \
12076-     MyExceptionHandler, parse_replace_arg
12077+     MyExceptionHandler, parse_replace_arg, parse_offset_arg, \
12078+     parse_mutable_type_arg
12079 from allmydata.web.check_results import CheckResults, \
12080      CheckAndRepairResults, LiteralCheckResults
12081 from allmydata.web.info import MoreInfo
12082hunk ./src/allmydata/web/filenode.py 29
12083         # a new file is being uploaded in our place.
12084         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
12085         if mutable:
12086-            req.content.seek(0)
12087-            data = req.content.read()
12088-            d = client.create_mutable_file(data)
12089+            mutable_type = parse_mutable_type_arg(get_arg(req,
12090+                                                          "mutable-type",
12091+                                                          None))
12092+            data = MutableFileHandle(req.content)
12093+            d = client.create_mutable_file(data, version=mutable_type)
12094             def _uploaded(newnode):
12095                 d2 = self.parentnode.set_node(self.name, newnode,
12096                                               overwrite=replace)
12097hunk ./src/allmydata/web/filenode.py 66
12098         d.addCallback(lambda res: childnode.get_uri())
12099         return d
12100 
12101-    def _read_data_from_formpost(self, req):
12102-        # SDMF: files are small, and we can only upload data, so we read
12103-        # the whole file into memory before uploading.
12104-        contents = req.fields["file"]
12105-        contents.file.seek(0)
12106-        data = contents.file.read()
12107-        return data
12108 
12109     def replace_me_with_a_formpost(self, req, client, replace):
12110         # create a new file, maybe mutable, maybe immutable
12111hunk ./src/allmydata/web/filenode.py 71
12112         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
12113 
12114+        # get the file contents from the formpost; both branches use them
12115+        contents = req.fields["file"]
12116         if mutable:
12117hunk ./src/allmydata/web/filenode.py 74
12118-            data = self._read_data_from_formpost(req)
12119-            d = client.create_mutable_file(data)
12120+            mutable_type = parse_mutable_type_arg(get_arg(req, "mutable-type",
12121+                                                          None))
12122+            uploadable = MutableFileHandle(contents.file)
12123+            d = client.create_mutable_file(uploadable, version=mutable_type)
12124             def _uploaded(newnode):
12125                 d2 = self.parentnode.set_node(self.name, newnode,
12126                                               overwrite=replace)
12127hunk ./src/allmydata/web/filenode.py 85
12128                 return d2
12129             d.addCallback(_uploaded)
12130             return d
12131-        # create an immutable file
12132-        contents = req.fields["file"]
12133+
12134         uploadable = FileHandle(contents.file, convergence=client.convergence)
12135         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
12136         d.addCallback(lambda newnode: newnode.get_uri())
12137hunk ./src/allmydata/web/filenode.py 91
12138         return d
12139 
12140+
12141 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
12142     def __init__(self, client, parentnode, name):
12143         rend.Page.__init__(self)
12144hunk ./src/allmydata/web/filenode.py 174
12145             # properly. So we assume that at least the browser will agree
12146             # with itself, and echo back the same bytes that we were given.
12147             filename = get_arg(req, "filename", self.name) or "unknown"
12148-            if self.node.is_mutable():
12149-                # some day: d = self.node.get_best_version()
12150-                d = makeMutableDownloadable(self.node)
12151-            else:
12152-                d = defer.succeed(self.node)
12153+            d = self.node.get_best_readable_version()
12154             d.addCallback(lambda dn: FileDownloader(dn, filename))
12155             return d
12156         if t == "json":
12157hunk ./src/allmydata/web/filenode.py 178
12158-            if self.parentnode and self.name:
12159-                d = self.parentnode.get_metadata_for(self.name)
12160+            # We update the servermap here to make sure that fields
12161+            # like size and mutable-type (which depend on the file on
12162+            # the grid, not just on the cap) are filled in. The latter
12163+            # is used by tests in particular.
12164+            #
12165+            # TODO: Teach the servermap to update in a mode designed
12166+            # specifically to fill in these fields, and then update it
12167+            # in that mode.
12168+            if self.node.is_mutable():
12169+                d = self.node.get_servermap(MODE_READ)
12170             else:
12171                 d = defer.succeed(None)
12172hunk ./src/allmydata/web/filenode.py 190
12173+            if self.parentnode and self.name:
12174+                d.addCallback(lambda ignored:
12175+                    self.parentnode.get_metadata_for(self.name))
12176+            else:
12177+                d.addCallback(lambda ignored: None)
12178             d.addCallback(lambda md: FileJSONMetadata(ctx, self.node, md))
12179             return d
12180         if t == "info":
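As a sketch of the client's view of the change above, fetching t=json for a mutable file and reading the new field might look like this (assumes a local node on the default port and an existing mutable filecap):

    import urllib, urllib2, simplejson
    cap = "URI:SSK:..."  # placeholder: a mutable filecap
    url = "http://127.0.0.1:3456/uri/%s?t=json" % urllib.quote(cap)
    ftype, fields = simplejson.loads(urllib2.urlopen(url).read())
    print ftype                        # "filenode"
    print fields.get("mutable-type")   # "sdmf" or "mdmf"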
12181hunk ./src/allmydata/web/filenode.py 211
12182         if t:
12183             raise WebError("GET file: bad t=%s" % t)
12184         filename = get_arg(req, "filename", self.name) or "unknown"
12185-        if self.node.is_mutable():
12186-            # some day: d = self.node.get_best_version()
12187-            d = makeMutableDownloadable(self.node)
12188-        else:
12189-            d = defer.succeed(self.node)
12190+        d = self.node.get_best_readable_version()
12191         d.addCallback(lambda dn: FileDownloader(dn, filename))
12192         return d
12193 
12194hunk ./src/allmydata/web/filenode.py 219
12195         req = IRequest(ctx)
12196         t = get_arg(req, "t", "").strip()
12197         replace = parse_replace_arg(get_arg(req, "replace", "true"))
12198+        offset = parse_offset_arg(get_arg(req, "offset", -1))
12199 
12200         if not t:
12201hunk ./src/allmydata/web/filenode.py 222
12202-            if self.node.is_mutable():
12203+            if self.node.is_mutable() and offset >= 0:
12204+                return self.update_my_contents(req, offset)
12205+
12206+            elif self.node.is_mutable():
12207                 return self.replace_my_contents(req)
12208             if not replace:
12209                 # this is the early trap: if someone else modifies the
12210hunk ./src/allmydata/web/filenode.py 232
12211                 # directory while we're uploading, the add_file(overwrite=)
12212                 # call in replace_me_with_a_child will do the late trap.
12213                 raise ExistingChildError()
12214+            if offset >= 0:
12215+                raise WebError("PUT to a file: append operation invoked "
12216+                               "on an immutable cap")
12217+
12218+
12219             assert self.parentnode and self.name
12220             return self.replace_me_with_a_child(req, self.client, replace)
12221         if t == "uri":
12222hunk ./src/allmydata/web/filenode.py 299
12223 
12224     def replace_my_contents(self, req):
12225         req.content.seek(0)
12226-        new_contents = req.content.read()
12227+        new_contents = MutableFileHandle(req.content)
12228         d = self.node.overwrite(new_contents)
12229         d.addCallback(lambda res: self.node.get_uri())
12230         return d
12231hunk ./src/allmydata/web/filenode.py 304
12232 
12233+
12234+    def update_my_contents(self, req, offset):
12235+        req.content.seek(0)
12236+        added_contents = MutableFileHandle(req.content)
12237+
12238+        d = self.node.get_best_mutable_version()
12239+        d.addCallback(lambda mv:
12240+            mv.update(added_contents, offset))
12241+        d.addCallback(lambda ignored:
12242+            self.node.get_uri())
12243+        return d
12244+
12245+
12246     def replace_my_contents_with_a_formpost(self, req):
12247         # we have a mutable file. Get the data from the formpost, and replace
12248         # the mutable file's contents with it.
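The update_my_contents method above is the whole of the new partial-update path: get the best mutable version, then hand it a MutableFileHandle and an offset. Outside the web layer the same flow looks like this (a sketch, assuming node is a mutable filenode obtained elsewhere and f is an open file of new data):

    from allmydata.mutable.publish import MutableFileHandle

    def update_at(node, f, offset):
        d = node.get_best_mutable_version()
        d.addCallback(lambda mv: mv.update(MutableFileHandle(f), offset))
        d.addCallback(lambda ignored: node.get_uri())
        return d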
12249hunk ./src/allmydata/web/filenode.py 320
12250-        new_contents = self._read_data_from_formpost(req)
12251+        new_contents = req.fields['file']
12252+        new_contents = MutableFileHandle(new_contents.file)
12253+
12254         d = self.node.overwrite(new_contents)
12255         d.addCallback(lambda res: self.node.get_uri())
12256         return d
12257hunk ./src/allmydata/web/filenode.py 327
12258 
12259-class MutableDownloadable:
12260-    #implements(IDownloadable)
12261-    def __init__(self, size, node):
12262-        self.size = size
12263-        self.node = node
12264-    def get_size(self):
12265-        return self.size
12266-    def is_mutable(self):
12267-        return True
12268-    def read(self, consumer, offset=0, size=None):
12269-        d = self.node.download_best_version()
12270-        d.addCallback(self._got_data, consumer, offset, size)
12271-        return d
12272-    def _got_data(self, contents, consumer, offset, size):
12273-        start = offset
12274-        if size is not None:
12275-            end = offset+size
12276-        else:
12277-            end = self.size
12278-        # SDMF: we can write the whole file in one big chunk
12279-        consumer.write(contents[start:end])
12280-        return consumer
12281-
12282-def makeMutableDownloadable(n):
12283-    d = defer.maybeDeferred(n.get_size_of_best_version)
12284-    d.addCallback(MutableDownloadable, n)
12285-    return d
12286 
12287 class FileDownloader(rend.Page):
12288     # since we override the rendering process (to let the tahoe Downloader
12289hunk ./src/allmydata/web/filenode.py 492
12290     data[1]['mutable'] = filenode.is_mutable()
12291     if edge_metadata is not None:
12292         data[1]['metadata'] = edge_metadata
12293+
12294+    if filenode.is_mutable() and filenode.get_version() is not None:
12295+        mutable_type = filenode.get_version()
12296+        assert mutable_type in (MDMF_VERSION, SDMF_VERSION)
12297+        if mutable_type == MDMF_VERSION:
12298+            mutable_type = "mdmf"
12299+        else:
12300+            mutable_type = "sdmf"
12301+        data[1]['mutable-type'] = mutable_type
12302+
12303     return text_plain(simplejson.dumps(data, indent=1) + "\n", ctx)
12304 
12305 def FileURI(ctx, filenode):
12306hunk ./src/allmydata/web/root.py 19
12307 from allmydata.web import filenode, directory, unlinked, status, operations
12308 from allmydata.web import reliability, storage
12309 from allmydata.web.common import abbreviate_size, getxmlfile, WebError, \
12310-     get_arg, RenderMixin, boolean_of_arg
12311+     get_arg, RenderMixin, boolean_of_arg, parse_mutable_type_arg
12312 
12313 
12314 class URIHandler(RenderMixin, rend.Page):
12315hunk ./src/allmydata/web/root.py 50
12316         if t == "":
12317             mutable = boolean_of_arg(get_arg(req, "mutable", "false").strip())
12318             if mutable:
12319-                return unlinked.PUTUnlinkedSSK(req, self.client)
12320+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
12321+                                                         None))
12322+                return unlinked.PUTUnlinkedSSK(req, self.client, version)
12323             else:
12324                 return unlinked.PUTUnlinkedCHK(req, self.client)
12325         if t == "mkdir":
12326hunk ./src/allmydata/web/root.py 70
12327         if t in ("", "upload"):
12328             mutable = bool(get_arg(req, "mutable", "").strip())
12329             if mutable:
12330-                return unlinked.POSTUnlinkedSSK(req, self.client)
12331+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
12332+                                                         None))
12333+                return unlinked.POSTUnlinkedSSK(req, self.client, version)
12334             else:
12335                 return unlinked.POSTUnlinkedCHK(req, self.client)
12336         if t == "mkdir":
12337hunk ./src/allmydata/web/root.py 332
12338                   T.input(type="file", name="file", class_="freeform-input-file")],
12339             T.input(type="hidden", name="t", value="upload"),
12340             T.div[T.input(type="checkbox", name="mutable"), T.label(for_="mutable")["Create mutable file"],
12341+                  T.input(type='radio', name='mutable-type', value='sdmf', id='mutable-type-sdmf', checked="checked"), T.label(for_="mutable-type-sdmf")["SDMF"],
12342+                  T.input(type='radio', name='mutable-type', value='mdmf', id='mutable-type-mdmf'), T.label(for_='mutable-type-mdmf')['MDMF (experimental)'],
12343                   " ", T.input(type="submit", value="Upload!")],
12344             ]]
12345         return T.div[form]
12346hunk ./src/allmydata/web/unlinked.py 7
12347 from twisted.internet import defer
12348 from nevow import rend, url, tags as T
12349 from allmydata.immutable.upload import FileHandle
12350+from allmydata.mutable.publish import MutableFileHandle
12351 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
12352      convert_children_json, WebError
12353 from allmydata.web import status
12354hunk ./src/allmydata/web/unlinked.py 20
12355     # that fires with the URI of the new file
12356     return d
12357 
12358-def PUTUnlinkedSSK(req, client):
12359+def PUTUnlinkedSSK(req, client, version):
12360     # SDMF: files are small, and we can only upload data
12361     req.content.seek(0)
12362hunk ./src/allmydata/web/unlinked.py 23
12363-    data = req.content.read()
12364-    d = client.create_mutable_file(data)
12365+    data = MutableFileHandle(req.content)
12366+    d = client.create_mutable_file(data, version=version)
12367     d.addCallback(lambda n: n.get_uri())
12368     return d
12369 
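With the version argument threaded through, a client can ask for an MDMF file when creating an unlinked mutable file; a minimal sketch against a local node on the default port:

    import httplib
    conn = httplib.HTTPConnection("127.0.0.1", 3456)
    conn.request("PUT", "/uri?mutable=true&mutable-type=mdmf",
                 "initial contents")
    print conn.getresponse().read()  # writecap of the new MDMF file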
12370hunk ./src/allmydata/web/unlinked.py 83
12371                       ["/uri/" + res.uri])
12372         return d
12373 
12374-def POSTUnlinkedSSK(req, client):
12375+def POSTUnlinkedSSK(req, client, version):
12376     # "POST /uri", to create an unlinked file.
12377     # SDMF: files are small, and we can only upload data
12378hunk ./src/allmydata/web/unlinked.py 86
12379-    contents = req.fields["file"]
12380-    contents.file.seek(0)
12381-    data = contents.file.read()
12382-    d = client.create_mutable_file(data)
12383+    contents = req.fields["file"].file
12384+    data = MutableFileHandle(contents)
12385+    d = client.create_mutable_file(data, version=version)
12386     d.addCallback(lambda n: n.get_uri())
12387     return d
12388 
12389}
12390
12391Context:
12392
12393[docs: NEWS: edit English usage, remove ticket numbers for regressions vs. 1.7.1 that were fixed again before 1.8.0c2
12394zooko@zooko.com**20100811071758
12395 Ignore-this: 993f5a1e6a9535f5b7a0bd77b93b66d0
12396] 
12397[docs: NEWS: more detail about new-downloader
12398zooko@zooko.com**20100811071303
12399 Ignore-this: 9f07da4dce9d794ce165aae287f29a1e
12400] 
12401[TAG allmydata-tahoe-1.8.0c2
12402david-sarah@jacaranda.org**20100810073847
12403 Ignore-this: c37f732b0e45f9ebfdc2f29c0899aeec
12404] 
12405[quickstart.html: update tarball link.
12406david-sarah@jacaranda.org**20100810073832
12407 Ignore-this: 4fcf9a7ec9d0de297c8ed4f29af50d71
12408] 
12409[webapi.txt: fix grammatical error.
12410david-sarah@jacaranda.org**20100810064127
12411 Ignore-this: 64f66aa71682195f82ac1066fe947e35
12412] 
12413[relnotes.txt: update revision of NEWS.
12414david-sarah@jacaranda.org**20100810063243
12415 Ignore-this: cf9eb342802d19f3a8004acd123fd46e
12416] 
12417[NEWS, relnotes and known-issues for 1.8.0c2.
12418david-sarah@jacaranda.org**20100810062851
12419 Ignore-this: bf319506558f6ba053fd896823c96a20
12420] 
12421[DownloadStatus: put real numbers in progress/status rows, not placeholders.
12422Brian Warner <warner@lothar.com>**20100810060603
12423 Ignore-this: 1f9dcd47c06cb356fc024d7bb8e24115
12424 Improve tests.
12425] 
12426[web download-status: tolerate DYHBs that haven't retired yet. Fixes #1160.
12427Brian Warner <warner@lothar.com>**20100809225100
12428 Ignore-this: cb0add71adde0a2e24f4bcc00abf9938
12429 
12430 Also add a better unit test for it.
12431] 
12432[immutable/filenode.py: put off DownloadStatus creation until first read() call
12433Brian Warner <warner@lothar.com>**20100809225055
12434 Ignore-this: 48564598f236eb73e96cd2d2a21a2445
12435 
12436 This avoids spamming the "recent uploads and downloads" /status page from
12437 FileNode instances that were created for a directory read but which nobody is
12438 ever going to read from. I also cleaned up the way DownloadStatus instances
12439 are made to only ever do it in the CiphertextFileNode, not in the
12440 higher-level plaintext FileNode. Also fixed DownloadStatus handling of read
12441 size, thanks to David-Sarah for the catch.
12442] 
12443[Share: hush log entries in the main loop() after the fetch has been completed.
12444Brian Warner <warner@lothar.com>**20100809204359
12445 Ignore-this: 72b9e262980edf5a967873ebbe1e9479
12446] 
12447[test_runner.py: correct and simplify normalization of package directory for case-insensitive filesystems.
12448david-sarah@jacaranda.org**20100808185005
12449 Ignore-this: fba96e967d4e7f33f301c7d56b577de
12450] 
12451[test_runner.py: make test_path work for test-from-installdir.
12452david-sarah@jacaranda.org**20100808171340
12453 Ignore-this: 46328d769ae6ec8d191c3cddacc91dc9
12454] 
12455[src/allmydata/__init__.py: make the package paths more accurate when we fail to get them from setuptools.
12456david-sarah@jacaranda.org**20100808171235
12457 Ignore-this: 8d534d2764d64f7434880bd70696cd75
12458] 
12459[test_runner.py: another try at calculating the rootdir correctly for test-from-egg and test-from-prefixdir.
12460david-sarah@jacaranda.org**20100808154307
12461 Ignore-this: 66737313935f2a0313d1de9b2ed68d0
12462] 
12463[test_runner.py: calculate the location of bin/tahoe correctly for test-from-prefixdir (by copying code from misc/build_helpers/run_trial.py). Also fix the false-positive check for Unicode paths in test_the_right_code, which was causing skips that should have been failures.
12464david-sarah@jacaranda.org**20100808042817
12465 Ignore-this: 1b7dfff07cbfb1a74f94141b18da2c3f
12466] 
12467[TAG allmydata-tahoe-1.8.0c1
12468david-sarah@jacaranda.org**20100807004546
12469 Ignore-this: 484ff2513774f3b48ca49c992e878b89
12470] 
12471[how_to_make_a_tahoe-lafs_release.txt: add step to check that release will report itself as the intended version.
12472david-sarah@jacaranda.org**20100807004254
12473 Ignore-this: 7709322e883f4118f38c7f042f5a9a2
12474] 
12475[relnotes.txt: 1.8.0c1 release
12476david-sarah@jacaranda.org**20100807003646
12477 Ignore-this: 1994ffcaf55089eb05e96c23c037dfee
12478] 
12479[NEWS, quickstart.html and known_issues.txt for 1.8.0c1 release.
12480david-sarah@jacaranda.org**20100806235111
12481 Ignore-this: 777cea943685cf2d48b6147a7648fca0
12482] 
12483[TAG allmydata-tahoe-1.8.0rc1
12484warner@lothar.com**20100806080450] 
12485[update NEWS and other docs in preparation for 1.8.0rc1
12486Brian Warner <warner@lothar.com>**20100806080228
12487 Ignore-this: 6ebdf11806f6dfbfde0b61115421a459
12488 
12489 in particular, merge the various 1.8.0b1/b2 sections, and remove the
12490 datestamp. NEWS gets updated just before a release, doesn't need to precisely
12491 describe pre-release candidates, and the datestamp gets updated just before
12492 the final release is tagged
12493 
12494 Also, I removed the BOM from some files. My toolchain made it hard to retain,
12495 and BOMs in UTF-8 don't make a whole lot of sense anyway. Sorry if that
12496 messes anything up.
12497] 
12498[downloader.Segmentation: unregisterProducer when asked to stopProducing, this
12499Brian Warner <warner@lothar.com>**20100806070705
12500 Ignore-this: a0a71dcf83df8a6f727deb9a61fa4fdf
12501 seems to avoid the #1155 log message which reveals the URI (and filecap).
12502 
12503 Also add an [ERROR] marker to the flog entry, since unregisterProducer also
12504 makes interrupted downloads appear "200 OK"; this makes it more obvious that
12505 the download did not complete.
12506] 
12507[TAG allmydata-tahoe-1.8.0b2
12508david-sarah@jacaranda.org**20100806052415
12509 Ignore-this: 2c1af8df5e25a6ebd90a32b49b8486dc
12510] 
12511[relnotes.txt and docs/known_issues.txt for 1.8.0beta2.
12512david-sarah@jacaranda.org**20100806040823
12513 Ignore-this: 862ad55d93ee37259ded9e2c9da78eb9
12514] 
12515[test_util.py: use SHA-256 from pycryptopp instead of MD5 from hashlib (for uses in which any hash will do), since hashlib was only added to the stdlib in Python 2.5.
12516david-sarah@jacaranda.org**20100806050051
12517 Ignore-this: 552049b5d190a5ca775a8240030dbe3f
12518] 
12519[test_runner.py: increase timeout to cater for Francois' ARM buildslave.
12520david-sarah@jacaranda.org**20100806042601
12521 Ignore-this: 6ee618cf00ac1c99cb7ddb60fd7ef078
12522] 
12523[test_util.py: remove use of 'a if p else b' syntax that requires Python 2.5.
12524david-sarah@jacaranda.org**20100806041616
12525 Ignore-this: 5fecba9aa530ef352797fcfa70d5c592
12526] 
12527[NEWS and docs/quickstart.html for 1.8.0beta2.
12528david-sarah@jacaranda.org**20100806035112
12529 Ignore-this: 3a593cfdc2ae265da8f64c6c8aebae4
12530] 
12531[docs/quickstart.html: remove link to tahoe-lafs-ticket798-1.8.0b.zip, due to appname regression. refs #1159
12532david-sarah@jacaranda.org**20100806002435
12533 Ignore-this: bad61b30cdcc3d93b4165d5800047b85
12534] 
12535[test_download.DownloadTest.test_simultaneous_goodguess: enable some disabled
12536Brian Warner <warner@lothar.com>**20100805185507
12537 Ignore-this: ac53d44643805412238ccbfae920d20c
12538 checks that used to fail but work now.
12539] 
12540[DownloadNode: fix lost-progress in fetch_failed, tolerate cancel when no segment-fetch is active. Fixes #1154.
12541Brian Warner <warner@lothar.com>**20100805185507
12542 Ignore-this: 35fd36b273b21b6dca12ab3d11ee7d2d
12543 
12544  The lost-progress bug occurred when two simultaneous read() calls fetched
12545 different segments, and the first one failed (due to corruption, or the other
12546 bugs in #1154): the second read() would never complete. While in this state,
12547  cancelling the second read by having its consumer call stopProducing() would
12548 trigger the cancel-intolerance bug. Finally, in downloader.node.Cancel,
12549 prevent late cancels by adding an 'active' flag
12550] 
12551[util/spans.py: __nonzero__ cannot return a long either. for #1154
12552Brian Warner <warner@lothar.com>**20100805185507
12553 Ignore-this: 6f87fead8252e7a820bffee74a1c51a2
12554] 
12555[test_storage.py: change skip note for test_large_share to say that Windows doesn't support sparse files. refs #569
12556david-sarah@jacaranda.org**20100805022612
12557 Ignore-this: 85c807a536dc4eeb8bf14980028bb05b
12558] 
12559[One fix for bug #1154: webapi GETs with a 'Range' header broke new-downloader.
12560Brian Warner <warner@lothar.com>**20100804184549
12561 Ignore-this: ffa3e703093a905b416af125a7923b7b
12562 
12563 The Range header causes n.read() to be called with an offset= of type 'long',
12564 which eventually got used in a Spans/DataSpans object's __len__ method.
12565 Apparently python doesn't permit __len__() to return longs, only ints.
12566 Rewrote Spans/DataSpans to use s.len() instead of len(s) aka s.__len__() .
12567 Added a test in test_download. Note that test_web didn't catch this because
12568 it uses mock FileNodes for speed: it's probably time to rewrite that.
12569 
12570 There is still an unresolved error-recovery problem in #1154, so I'm not
12571 closing the ticket quite yet.
12572] 
12573[test_download: minor cleanup
12574Brian Warner <warner@lothar.com>**20100804175555
12575 Ignore-this: f4aec3c77f6a0d7f7b2c07f302755cc1
12576] 
12577[fetcher.py: improve comments
12578Brian Warner <warner@lothar.com>**20100804072814
12579 Ignore-this: 8bf74c21aef55cf0b0642e55ee4e7c5f
12580] 
12581[lazily create DownloadNode upon first read()/get_segment()
12582Brian Warner <warner@lothar.com>**20100804072808
12583 Ignore-this: 4bb1c49290cefac1dadd9d42fac46ba2
12584] 
12585[test_hung_server: update comments, remove dead "stage_4_d" code
12586Brian Warner <warner@lothar.com>**20100804072800
12587 Ignore-this: 4d18b374b568237603466f93346d00db
12588] 
12589[copy the rest of David-Sarah's changes to make my tree match 1.8.0beta
12590Brian Warner <warner@lothar.com>**20100804072752
12591 Ignore-this: 9ac7f21c9b27e53452371096146be5bb
12592] 
12593[ShareFinder: add 10s OVERDUE timer, send new requests to replace overdue ones
12594Brian Warner <warner@lothar.com>**20100804072741
12595 Ignore-this: 7fa674edbf239101b79b341bb2944349
12596 
12597 The fixed 10-second timer will eventually be replaced with a per-server
12598 value, calculated based on observed response times.
12599 
12600 test_hung_server.py: enhance to exercise DYHB=OVERDUE state. Split existing
12601 mutable+immutable tests into two pieces for clarity. Reenabled several tests.
12602 Deleted the now-obsolete "test_failover_during_stage_4".
12603] 
12604[Rewrite immutable downloader (#798). This patch adds and updates unit tests.
12605Brian Warner <warner@lothar.com>**20100804072710
12606 Ignore-this: c3c838e124d67b39edaa39e002c653e1
12607] 
12608[Rewrite immutable downloader (#798). This patch includes higher-level
12609Brian Warner <warner@lothar.com>**20100804072702
12610 Ignore-this: 40901ddb07d73505cb58d06d9bff73d9
12611 integration into the NodeMaker, and updates the web-status display to handle
12612 the new download events.
12613] 
12614[Rewrite immutable downloader (#798). This patch rearranges the rest of src/allmydata/immutable/ .
12615Brian Warner <warner@lothar.com>**20100804072639
12616 Ignore-this: 302b1427a39985bfd11ccc14a1199ea4
12617] 
12618[Rewrite immutable downloader (#798). This patch adds the new downloader itself.
12619Brian Warner <warner@lothar.com>**20100804072629
12620 Ignore-this: e9102460798123dd55ddca7653f4fc16
12621] 
12622[util/observer.py: add EventStreamObserver
12623Brian Warner <warner@lothar.com>**20100804072612
12624 Ignore-this: fb9d205f34a6db7580b9be33414dfe21
12625] 
12626[Add a byte-spans utility class, like perl's Set::IntSpan for .newsrc files.
12627Brian Warner <warner@lothar.com>**20100804072600
12628 Ignore-this: bbad42104aeb2f26b8dd0779de546128
12629 Also a data-spans class, which records a byte (instead of a bit) for each
12630 index.
12631] 
12632[check-umids: oops, forgot to add the tool
12633Brian Warner <warner@lothar.com>**20100804071713
12634 Ignore-this: bbeb74d075414f3713fabbdf66189faf
12635] 
12636[coverage tools: ignore errors, display lines-uncovered in elisp mode. Fix Makefile paths.
12637"Brian Warner <warner@lothar.com>"**20100804071131] 
12638[check-umids: new tool to check uniqueness of umids
12639"Brian Warner <warner@lothar.com>"**20100804071042] 
12640[misc/simulators/sizes.py: update, we now use SHA256 (not SHA1), so large-file overhead grows to 0.5%
12641"Brian Warner <warner@lothar.com>"**20100804070942] 
12642[storage-overhead: try to fix, probably still broken
12643"Brian Warner <warner@lothar.com>"**20100804070815] 
12644[docs/quickstart.html: link to 1.8.0beta zip, and note 'bin\tahoe' on Windows.
12645david-sarah@jacaranda.org**20100803233254
12646 Ignore-this: 3c11f249efc42a588e3a7056349739ed
12647] 
12648[docs: relnotes.txt for 1.8.0β
12649zooko@zooko.com**20100803154913
12650 Ignore-this: d9101f72572b18da3cfac3c0e272c907
12651] 
12652[test_storage.py: avoid spurious test failure by accepting either 'Next crawl in 59 minutes' or 'Next crawl in 60 minutes'. fixes #1140
12653david-sarah@jacaranda.org**20100803102058
12654 Ignore-this: aa2419fc295727e4fbccec3c7b780e76
12655] 
12656[misc/build_helpers/show-tool-versions.py: get sys.std{out,err}.encoding and 'as' version correctly, and improve formatting.
12657david-sarah@jacaranda.org**20100803101128
12658 Ignore-this: 4fd2907d86da58eb220e104010e9c6a
12659] 
12660[misc/build_helpers/show-tool-versions.py: avoid error message when 'as -version' does not create a.out.
12661david-sarah@jacaranda.org**20100803094812
12662 Ignore-this: 38fc2d639f30b4e123b9551e6931998d
12663] 
12664[CLI: further improve consistency of basedir options and add tests. addresses #118
12665david-sarah@jacaranda.org**20100803085416
12666 Ignore-this: d8f8f55738abb5ea44ed4cf24d750efe
12667] 
12668[CLI: make the synopsis for 'tahoe unlink' say unlink instead of rm.
12669david-sarah@jacaranda.org**20100803085359
12670 Ignore-this: c35d3f99f906dfab61df8f5e81a42c92
12671] 
12672[CLI: make all of the option descriptions imperative sentences.
12673david-sarah@jacaranda.org**20100803084801
12674 Ignore-this: ec80c7d2a10c6452d190fee4e1a60739
12675] 
12676[test_cli.py: make 'tahoe mkdir' tests slightly less dumb (check for 'URI:' in the output).
12677david-sarah@jacaranda.org**20100803084720
12678 Ignore-this: 31a4ae4fb5f7c123bc6b6e36a9e3911e
12679] 
12680[test_cli.py: use u-escapes instead of UTF-8.
12681david-sarah@jacaranda.org**20100803083538
12682 Ignore-this: a48af66942defe8491c6e1811c7809b5
12683] 
12684[NEWS: remove XXX comment and separate description of #890.
12685david-sarah@jacaranda.org**20100803050827
12686 Ignore-this: 6d308f34dc9d929d3d0811f7a1f5c786
12687] 
12688[docs: more updates to NEWS for 1.8.0β
12689zooko@zooko.com**20100803044618
12690 Ignore-this: 8193a1be38effe2bdcc632fdb570e9fc
12691] 
12692[docs: incomplete beginnings of a NEWS update for v1.8β
12693zooko@zooko.com**20100802072840
12694 Ignore-this: cb00fcd4f1e0eaed8c8341014a2ba4d4
12695] 
12696[docs/quickstart.html: extra step to open a new Command Prompt or log out/in on Windows.
12697david-sarah@jacaranda.org**20100803004938
12698 Ignore-this: 1334a2cd01f77e0c9eddaeccfeff2370
12699] 
12700 [update bundled zetuptoolz with doc changes, change to script setup for Windows XP, and to have the 'develop' command run script setup.
12701david-sarah@jacaranda.org**20100803003815
12702 Ignore-this: 73c86e154f4d3f7cc9855eb31a20b1ed
12703] 
12704[bundled setuptools/command/scriptsetup.py: use SendMessageTimeoutW, to test whether that broadcasts environment changes any better.
12705david-sarah@jacaranda.org**20100802224505
12706 Ignore-this: 7788f7c2f9355e7852a376ec94182056
12707] 
12708[bundled zetuptoolz: add missing setuptools/command/scriptsetup.py
12709david-sarah@jacaranda.org**20100802072129
12710 Ignore-this: 794b1c411f6cdec76eeb716223a55d0
12711] 
12712[test_runner.py: add test_run_with_python_options, which checks that the Windows script changes haven't broken 'python <options> bin/tahoe'.
12713david-sarah@jacaranda.org**20100802062558
12714 Ignore-this: 812a2ccb7d9c7a8e01d5ca04d875aba5
12715] 
12716[test_runner.py: fix missing import of get_filesystem_encoding
12717david-sarah@jacaranda.org**20100802060902
12718 Ignore-this: 2e9e439b7feb01e0c3c94b54e802503b
12719] 
12720[Bundle setuptools-0.6c16dev (with Windows script changes, and the change to only warn if site.py wasn't generated by setuptools) instead of 0.6c15dev. addresses #565, #1073, #1074
12721david-sarah@jacaranda.org**20100802060602
12722 Ignore-this: 34ee2735e49e2c05b57e353d48f83050
12723] 
12724[.darcs-boringfile: changes needed to take account of egg directories being bundled. Also, make _trial_temp a prefix rather than exact match.
12725david-sarah@jacaranda.org**20100802050313
12726 Ignore-this: 8de6a8dbaba014ba88dec6c792fc5a9d
12727] 
12728[.darcs-boringfile: changes needed to take account of pyscript wrappers on Windows.
12729david-sarah@jacaranda.org**20100802050128
12730 Ignore-this: 7366b631e2095166696e6da5765d9180
12731] 
12732 [misc/build_helpers/run_trial.py: check that the root from which the module we are testing was loaded is the current directory. This version of the patch folds in later fixes to the logic for calculating the directories to compare, and improvements to error messages. addresses #1137
12733david-sarah@jacaranda.org**20100802045535
12734 Ignore-this: 9d3c1447f0539c6308127413098eb646
12735] 
12736[Skip option arguments to the python interpreter when reconstructing Unicode argv on Windows.
12737david-sarah@jacaranda.org**20100728062731
12738 Ignore-this: 2b17fc43860bcc02a66bb6e5e050ea7c
12739] 
12740[windows/fixups.py: improve comments and reference some relevant Python bugs.
12741david-sarah@jacaranda.org**20100727181921
12742 Ignore-this: 32e61cf98dfc2e3dac60b750dda6429b
12743] 
12744[windows/fixups.py: make errors reported to original_stderr have enough information to debug even if we can't see the traceback.
12745david-sarah@jacaranda.org**20100726221904
12746 Ignore-this: e30b4629a7aa5d71554237c7e809c080
12747] 
12748[windows/fixups.py: fix paste-o in name of Unicode stderr wrapper.
12749david-sarah@jacaranda.org**20100726214736
12750 Ignore-this: cb220931f1683eb53b0c7269e18a38be
12751] 
12752[windows/fixups.py: Don't rely on buggy MSVCRT library for Unicode output, use the Win32 API instead. This should make it work on XP. Also, change how we handle the case where sys.stdout and sys.stderr are redirected, since the .encoding attribute isn't necessarily writeable.
12753david-sarah@jacaranda.org**20100726045019
12754 Ignore-this: 69267abc5065cbd5b86ca71fe4921fb6
12755] 
12756[test_runner.py: change to code for locating the bin/tahoe script that was missed when rebasing the patch for #1074.
12757david-sarah@jacaranda.org**20100725182008
12758 Ignore-this: d891a93989ecc3f4301a17110c3d196c
12759] 
12760[Add missing windows/fixups.py (for setting up Unicode args and output on Windows).
12761david-sarah@jacaranda.org**20100725092849
12762 Ignore-this: 35a1e8aeb4e1dea6e81433bf0825a6f6
12763] 
12764[Changes to Tahoe needed to work with new zetuptoolz (that does not use .exe wrappers on Windows), and to support Unicode arguments and stdout/stderr -- v5
12765david-sarah@jacaranda.org**20100725083216
12766 Ignore-this: 5041a634b1328f041130658233f6a7ce
12767] 
12768[scripts/common.py: fix an error introduced when rebasing to the ticket798 branch, which caused base directories to be duplicated in self.basedirs.
12769david-sarah@jacaranda.org**20100802064929
12770 Ignore-this: 116fd437d1f91a647879fe8d9510f513
12771] 
12772[Basedir/node directory option improvements for ticket798 branch. addresses #188, #706, #715, #772, #890
12773david-sarah@jacaranda.org**20100802043004
12774 Ignore-this: d19fc24349afa19833406518595bfdf7
12775] 
12776[scripts/create_node.py: allow nickname to be Unicode. Also ensure webport is validly encoded in config file.
12777david-sarah@jacaranda.org**20100802000212
12778 Ignore-this: fb236169280507dd1b3b70d459155f6e
12779] 
12780[test_runner.py: Fix error in message arguments to 'fail' calls.
12781david-sarah@jacaranda.org**20100802013526
12782 Ignore-this: 3bfdef19ae3cf993194811367da5d020
12783] 
12784[Additional Unicode basedir changes for ticket798 branch.
12785david-sarah@jacaranda.org**20100802010552
12786 Ignore-this: 7090d8c6b04eb6275345a55e75142028
12787] 
12788[Unicode basedir changes for ticket798 branch.
12789david-sarah@jacaranda.org**20100801235310
12790 Ignore-this: a00717eaeae8650847b5395801e04c45
12791] 
12792[fileutil: change WindowsError to OSError in abspath_expanduser_unicode, because WindowsError might not exist.
12793david-sarah@jacaranda.org**20100725222603
12794 Ignore-this: e125d503670ed049a9ade0322faa0c51
12795] 
12796[test_system: correct a failure in _test_runner caused by Unicode basedir patch on non-Unicode platforms.
12797david-sarah@jacaranda.org**20100724032123
12798 Ignore-this: 399b3953104fdd1bbed3f7564d163553
12799] 
12800[Fix test failures due to Unicode basedir patches.
12801david-sarah@jacaranda.org**20100725010318
12802 Ignore-this: fe92cd439eb3e60a56c007ae452784ed
12803] 
12804[util.encodingutil: change quote_output to do less unnecessary escaping, and to use double-quotes more consistently when needed. This version avoids u-escaping for characters that are representable in the output encoding, when double quotes are used, and includes tests. fixes #1135
12805david-sarah@jacaranda.org**20100723075314
12806 Ignore-this: b82205834d17db61612dd16436b7c5a2
12807] 
12808[Replace uses of os.path.abspath with abspath_expanduser_unicode where necessary. This makes basedir paths consistently represented as Unicode.
12809david-sarah@jacaranda.org**20100722001418
12810 Ignore-this: 9f8cb706540e695550e0dbe303c01f52
12811] 
12812[util.fileutil, test.test_util: add abspath_expanduser_unicode function, to work around <http://bugs.python.org/issue3426>. util.encodingutil: add a convenience function argv_to_abspath.
12813david-sarah@jacaranda.org**20100721231507
12814 Ignore-this: eee6904d1f65a733ff35190879844d08
12815] 
12816[setup: increase requirement on foolscap from >= 0.4.1 to >= 0.5.1 to avoid the foolscap performance bug with transferring large mutable files
12817zooko@zooko.com**20100802071748
12818 Ignore-this: 53b5b8571ebfee48e6b11e3f3a5efdb7
12819] 
12820[upload: tidy up logging messages
12821zooko@zooko.com**20100802070212
12822 Ignore-this: b3532518326f6d808d085da52c14b661
12823 reformat code to be less than 100 chars wide, refactor formatting of logging messages, add log levels to some logging messages, M-x whitespace-cleanup
12824] 
12825[tests: remove debug print
12826zooko@zooko.com**20100802063339
12827 Ignore-this: b13b8c15e946556bffca9d7ad7c890f5
12828] 
12829[docs: update the list of forums to announce Tahoe-LAFS too, add empty checkboxes
12830zooko@zooko.com**20100802063314
12831 Ignore-this: 89d0e8bd43f1749a9e85fcee2205bb04
12832] 
12833[immutable: tidy-up some code by using a set instead of list to hold homeless_shares
12834zooko@zooko.com**20100802062004
12835 Ignore-this: a70bda3cf6c48ab0f0688756b015cf8d
12836] 
12837[setup: fix a couple instances of hard-coded 'allmydata-tahoe' in the scripts, tighten the tests (as suggested by David-Sarah)
12838zooko@zooko.com**20100801164207
12839 Ignore-this: 50265b562193a9a3797293123ed8ba5c
12840] 
12841[setup: replace hardcoded 'allmydata-tahoe' with allmydata.__appname__
12842zooko@zooko.com**20100801160517
12843 Ignore-this: 55e1a98515300d228f02df10975f7ba
12844] 
12845[NEWS: describe #1055
12846zooko@zooko.com**20100801034338
12847 Ignore-this: 3a16cfa387c2b245c610ea1d1ad8d7f1
12848] 
12849[immutable: use PrefixingLogMixin to organize logging in Tahoe2PeerSelector and add more detailed messages about peer
12850zooko@zooko.com**20100719082000
12851 Ignore-this: e034c4988b327f7e138a106d913a3082
12852] 
12853[benchmarking: update bench_dirnode to be correct and use the shiniest new pyutil.benchutil features concerning what units you measure in
12854zooko@zooko.com**20100719044948
12855 Ignore-this: b72059e4ff921741b490e6b47ec687c6
12856] 
12857[trivial: rename and add in-line doc to clarify "used_peers" => "upload_servers"
12858zooko@zooko.com**20100719044744
12859 Ignore-this: 93c42081676e0dea181e55187cfc506d
12860] 
12861[abbreviate time edge case python2.5 unit test
12862jacob.lyles@gmail.com**20100729210638
12863 Ignore-this: 80f9b1dc98ee768372a50be7d0ef66af
12864] 
12865[docs: add Jacob Lyles to CREDITS
12866zooko@zooko.com**20100730230500
12867 Ignore-this: 9dbbd6a591b4b1a5a8dcb69b7b757792
12868] 
12869[web: don't use %d formatting on a potentially large negative float -- there is a bug in Python 2.5 in that case
12870jacob.lyles@gmail.com**20100730220550
12871 Ignore-this: 7080eb4bddbcce29cba5447f8f4872ee
12872 fixes #1055
12873] 
12874[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 -- fix .todo reference.
12875david-sarah@jacaranda.org**20100729152927
12876 Ignore-this: c8fe1047edcc83c87b9feb47f4aa587b
12877] 
12878[test_upload.py: rename test_problem_layout_ticket1124 to test_problem_layout_ticket_1124 for consistency.
12879david-sarah@jacaranda.org**20100729142250
12880 Ignore-this: bc3aad5919ae9079ceb9968ad0f5ea5a
12881] 
12882[docs: fix licensing typo that was earlier fixed in [20090921164651-92b7f-7f97b58101d93dc588445c52a9aaa56a2c7ae336]
12883zooko@zooko.com**20100729052923
12884 Ignore-this: a975d79115911688e5469d4d869e1664
12885  I wish we didn't have copies of this licensing text in several different files, where changes can be accidentally omitted from some of them.
12886] 
12887[misc/build_helpers/run-with-pythonpath.py: fix stale comment, and remove 'trial' example that is not the right way to run trial.
12888david-sarah@jacaranda.org**20100726225729
12889 Ignore-this: a61f55557ad69a1633bfb2b8172cce97
12890] 
12891[docs/specifications/dirnodes.txt: 'mesh'->'grid'.
12892david-sarah@jacaranda.org**20100723061616
12893 Ignore-this: 887bcf921ef00afba8e05e9239035bca
12894] 
12895[docs/specifications/dirnodes.txt: bring layer terminology up-to-date with architecture.txt, and a few other updates (e.g. note that the MAC is no longer verified, and that URIs can be unknown). Also 'Tahoe'->'Tahoe-LAFS'.
12896david-sarah@jacaranda.org**20100723054703
12897 Ignore-this: f3b98183e7d0a0f391225b8b93ac6c37
12898] 
12899[docs: use current cap to Zooko's wiki page in example text
12900zooko@zooko.com**20100721010543
12901 Ignore-this: 4f36f36758f9fdbaf9eb73eac23b6652
12902 fixes #1134
12903] 
12904[__init__.py: silence DeprecationWarning about BaseException.message globally. fixes #1129
12905david-sarah@jacaranda.org**20100720011939
12906 Ignore-this: 38808986ba79cb2786b010504a22f89
12907] 
12908[test_runner: test that 'tahoe --version' outputs no noise (e.g. DeprecationWarnings).
12909david-sarah@jacaranda.org**20100720011345
12910 Ignore-this: dd358b7b2e5d57282cbe133e8069702e
12911] 
12912[TAG allmydata-tahoe-1.7.1
12913zooko@zooko.com**20100719131352
12914 Ignore-this: 6942056548433dc653a746703819ad8c
12915] 
12916Patch bundle hash:
1291766ae88ffe3000d70093a0ae4709ca6df5a92b595