Ticket #393: 393status43.dpatch

File 393status43.dpatch, 743.3 KB (added by kevan, at 2011-05-16T01:16:42Z)

add more tests, fix failing tests, fix broken pausing in mutable downloader

Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * interfaces.py: Add #993 interfaces

Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes

Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one

Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * immutable/literal.py: implement the same interfaces as other filenodes

Fri Aug 13 16:49:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * scripts: tell 'tahoe put' about MDMF
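 
  A usage sketch of the option this patch adds (the flag names come from
  the cli.py hunk below; the filename and alias are placeholders):
 
    tahoe put --mutable --mutable-type=mdmf foo.txt tahoe:foo.txt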

Sat Aug 14 01:10:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * web: Alter the webapi to get along with and take advantage of the MDMF changes
 
  The main benefit that the webapi gets from MDMF, at least initially, is
  the ability to do a streaming download of an MDMF mutable file. It also
  exposes a way (through the PUT verb) to append to or otherwise modify
  (in-place) an MDMF mutable file.
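 
  A sketch of the resulting HTTP interface, pieced together from the
  hunks below (the MDMF cap in these URLs is a placeholder):
 
    PUT /uri?mutable=true&mutable-type=mdmf   # create an unlinked MDMF file
    GET /uri/URI:MDMF:xxx:yyy                 # stream the file's contents
    PUT /uri/URI:MDMF:xxx:yyy?offset=16384    # modify in place from byte 16384
 
  Passing an offset equal to the current filesize appends to the file.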

Sat Aug 14 15:57:11 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * client.py: learn how to create different kinds of mutable files

Wed Aug 18 17:32:16 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
 
  The checker and repairer required minimal changes to work with the MDMF
  modifications made elsewhere. The checker duplicated a lot of the code
  that was already in the downloader, so I modified the downloader
  slightly to expose this functionality to the checker and removed the
  duplicated code. The repairer only required a minor change to deal with
  data representation.

Wed Aug 18 17:32:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
 
  One of the goals of MDMF as a GSoC project is to lay the groundwork for
  LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
  multiple versions of a single cap on the grid. In line with this, there
  is now a distinction between an overriding mutable file (which can be
  thought of as corresponding to the cap/unique identifier for that
  mutable file) and versions of the mutable file (which we can download,
  update, and so on). All download, upload, and modification operations
  end up happening on a particular version of a mutable file, but there
  are shortcut methods on the object representing the overriding mutable
  file that perform these operations on the best version of the mutable
  file (which is what code should be doing until we have LDMF and better
  support for other paradigms).
 
  Another goal of MDMF was to take advantage of segmentation to give
  callers more efficient partial file updates or appends. This patch
  implements methods that do that, too.
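 
  A usage sketch of the version-oriented API (the method names are those
  defined in the interfaces patch in this series; `node`,
  `new_data_file`, and `offset` are assumed to exist in the caller's
  context):
 
    from allmydata.mutable.publish import MutableFileHandle
 
    d = node.get_best_mutable_version()
    def _got_version(version):
        # partial-file update: write the wrapped data into the file at
        # `offset`, appending if the write runs past the current end
        return version.update(MutableFileHandle(new_data_file), offset)
    d.addCallback(_got_version)
 
    # shortcut methods on the filenode act on the best version
    d.addCallback(lambda ign: node.download_best_version())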
 

Wed Aug 18 17:33:42 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: Modify the publish process to support MDMF
 
  The inner workings of the publishing process needed to be reworked to a
  large extent to cope with segmented mutable files, and to cope with
  partial-file updates of mutable files. This patch does that. It also
  introduces wrappers for uploadable data, allowing the use of
  filehandle-like objects as data sources, in addition to strings. This
  reduces memory overhead when dealing with large files through the
  webapi, and clarifies the update code there.
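 
  A sketch of how the new wrappers are used (this pattern appears in the
  sftpd and webapi patches later in this bundle):
 
    from allmydata.mutable.publish import MutableFileHandle
 
    # wrap an open filehandle-like object instead of reading the whole
    # file into a string first
    uploadable = MutableFileHandle(open("big-file.bin", "rb"))
    d = filenode.overwrite(uploadable)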

Wed Aug 18 17:35:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * nodemaker.py: Make nodemaker expose a way to create MDMF files

Sat Aug 14 15:56:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * docs: update docs to mention MDMF

Wed Aug 18 17:33:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/layout.py and interfaces.py: add MDMF writer and reader
 
  The MDMF writer is responsible for keeping state as plaintext is
  gradually processed into share data by the upload process. When the
  upload finishes, it will write all of its share data to a remote server,
  reporting its status back to the publisher.
 
  The MDMF reader is responsible for abstracting an MDMF file as it sits
  on the grid from the downloader; specifically, by receiving and
  responding to requests for arbitrary data within the MDMF file.
 
  The interfaces.py file has also been modified to contain an interface
  for the writer.
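 
  A rough sketch of the division of labor (the names below are
  illustrative assumptions, since the layout.py hunks themselves are not
  reproduced in this excerpt):
 
    # download side: the reader answers requests for arbitrary pieces of
    # the share, e.g. one block and its salt
    d = reader.get_block_and_salt(segnum)
 
    # upload side: the writer accumulates share data as segments are
    # encoded, then pushes everything to the server in one batch
    writer.put_block(block, segnum, salt)
    d = writer.finish_publishing()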

Wed Aug 18 17:34:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/retrieve.py: Modify the retrieval process to support MDMF
 
  The logic behind a mutable file download had to be adapted to work with
  segmented mutable files; this patch performs those adaptations. It also
  exposes some decoding and decrypting functionality to make partial-file
  updates a little easier, and supports efficient random-access downloads
  of parts of an MDMF file.
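 
  Random access goes through IReadable.read(); a sketch (MemoryConsumer
  is the simple download-to-memory consumer in
  src/allmydata/util/consumer.py; `filenode` is assumed to exist):
 
    from allmydata.util.consumer import MemoryConsumer
 
    d = filenode.get_best_readable_version()
    def _read_range(version):
        # fetch 1024 bytes starting at byte 4096, without downloading
        # the rest of the file
        return version.read(MemoryConsumer(), offset=4096, size=1024)
    d.addCallback(_read_range)
    d.addCallback(lambda consumer: "".join(consumer.chunks))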

Wed Aug 18 17:34:39 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
 
  Almost all of these modifications are in service of having the
  servermap updater use the unified MDMF + SDMF read interface whenever
  possible -- this reduces the complexity of the code, making it easier
  to read and maintain. To do this, I needed to modify the process of
  updating the servermap a little bit.
 
  To support partial-file updates, I also modified the servermap updater
  to fetch the block hash trees and certain segments of files while it
  performed a servermap update (this can be done without adding any new
  roundtrips because of batch-read functionality that the read proxy has).
 

Wed Aug 18 17:35:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * tests:
 
      - A lot of existing tests relied on aspects of the mutable file
        implementation that were changed. This patch updates those tests
        to work with the changes.
      - This patch also adds tests for new features.

Sun Feb 20 15:02:01 PST 2011  "Brian Warner <warner@lothar.com>"
  * resolve conflicts between 393-MDMF patches and trunk as of 1.8.2

Sun Feb 20 17:46:59 PST 2011  "Brian Warner <warner@lothar.com>"
  * mutable/filenode.py: fix create_mutable_file('string')

Sun Feb 20 21:56:00 PST 2011  "Brian Warner <warner@lothar.com>"
  * resolve more conflicts with current trunk

Sun Feb 20 22:10:04 PST 2011  "Brian Warner <warner@lothar.com>"
  * update MDMF code with StorageFarmBroker changes

Fri Feb 25 17:04:33 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/filenode: Clean up servermap handling in MutableFileVersion
 
  We want to update the servermap before attempting to modify a file,
  which we now do. This introduced code duplication, which was addressed
  by refactoring the servermap update into its own method, and then
  eliminating duplicate servermap updates throughout the
  MutableFileVersion.

Sun Feb 27 15:16:43 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * web: Use the string "replace" to trigger whole-file replacement when processing an offset parameter.

Sun Feb 27 16:34:26 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * docs/configuration.rst: fix more conflicts between #393 and trunk

Sun Feb 27 17:06:37 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/layout: remove references to the salt hash tree.

Sun Feb 27 18:10:56 PST 2011  warner@lothar.com
  * test_mutable.py: add test to exercise fencepost bug

Mon Feb 28 00:33:27 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish: account for offsets on segment boundaries.

Mon Feb 28 19:08:07 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * tahoe-put: raise UsageError when given a nonsensical mutable type, move option validation code to the option parser.

Fri Mar  4 17:08:58 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * web: use None instead of False in the case of no offset, use object identity comparison to check whether or not an offset was specified.

Mon Mar  7 00:17:13 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/filenode: remove incorrect comments about segment boundaries

Mon Mar  7 00:22:29 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable: use integer division where appropriate

Sun May  1 15:41:25 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/layout.py: reorder on-disk format to put variable-length fields at the end of the share, after a predictably long preamble

Sun May  1 15:42:49 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * uri.py: Add MDMF cap

Sun May  1 15:45:23 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * nodemaker, mutable/filenode: train nodemaker and filenode to handle MDMF caps

Sun May 15 15:59:46 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/retrieve: fix typo in paused check

Sun May 15 16:00:08 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * scripts/tahoe_put.py: teach tahoe put about MDMF caps

Sun May 15 16:00:38 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * test/common.py: fix some MDMF-related bugs in common test fixtures

Sun May 15 16:00:54 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_cli: Alter existing MDMF tests to test for MDMF caps

Sun May 15 16:02:07 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: write a test for pausing during retrieval, write support structure for that test

Sun May 15 16:03:26 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: implement cap type checking

Sun May 15 16:03:58 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_web: add MDMF cap tests

Sun May 15 16:04:21 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * web/filenode.py: complain if a PUT is requested with a readonly cap

Sun May 15 16:04:44 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
  * web/info.py: Display mutable type information when describing a mutable file

New patches:

[interfaces.py: Add #993 interfaces
Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
] {
hunk ./src/allmydata/interfaces.py 499
 class MustNotBeUnknownRWError(CapConstraintError):
     """Cannot add an unknown child cap specified in a rw_uri field."""
 
+
+class IReadable(Interface):
+    """I represent a readable object -- either an immutable file, or a
+    specific version of a mutable file.
+    """
+
+    def is_readonly():
+        """Return True if this reference provides read-only access to the
+        given file or directory (i.e. you cannot modify it), or False if it
+        provides read-write access. Note that even if this reference is
+        read-only, someone else may hold a read-write reference to it.
+
+        For an IReadable returned by get_best_readable_version(), this will
+        always return True, but for instances of subinterfaces such as
+        IMutableFileVersion, it may return False."""
+
+    def is_mutable():
+        """Return True if this file or directory is mutable (by *somebody*,
+        not necessarily you), False if it is immutable. Note that a file
+        might be mutable overall, but your reference to it might be
+        read-only. On the other hand, all references to an immutable file
+        will be read-only; there are no read-write references to an immutable
+        file."""
+
+    def get_storage_index():
+        """Return the storage index of the file."""
+
+    def get_size():
+        """Return the length (in bytes) of this readable object."""
+
+    def download_to_data():
+        """Download all of the file contents. I return a Deferred that fires
+        with the contents as a byte string."""
+
+    def read(consumer, offset=0, size=None):
+        """Download a portion (possibly all) of the file's contents, making
+        them available to the given IConsumer. Return a Deferred that fires
+        (with the consumer) when the consumer is unregistered (either because
+        the last byte has been given to it, or because the consumer threw an
+        exception during write(), possibly because it no longer wants to
+        receive data). The portion downloaded will start at 'offset' and
+        contain 'size' bytes (or the remainder of the file if size==None).
+
+        The consumer will be used in non-streaming mode: an IPullProducer
+        will be attached to it.
+
+        The consumer will not receive data right away: several network trips
+        must occur first. The order of events will be::
+
+         consumer.registerProducer(p, streaming)
+          (if streaming == False)::
+           consumer does p.resumeProducing()
+            consumer.write(data)
+           consumer does p.resumeProducing()
+            consumer.write(data).. (repeat until all data is written)
+         consumer.unregisterProducer()
+         deferred.callback(consumer)
+
+        If a download error occurs, or an exception is raised by
+        consumer.registerProducer() or consumer.write(), I will call
+        consumer.unregisterProducer() and then deliver the exception via
+        deferred.errback(). To cancel the download, the consumer should call
+        p.stopProducing(), which will result in an exception being delivered
+        via deferred.errback().
+
+        See src/allmydata/util/consumer.py for an example of a simple
+        download-to-memory consumer.
+        """
+
+
+class IWritable(Interface):
+    """
+    I define methods that callers can use to update SDMF and MDMF
+    mutable files on a Tahoe-LAFS grid.
+    """
+    # XXX: For the moment, we have only this. It is possible that we
+    #      want to move overwrite() and modify() in here too.
+    def update(data, offset):
+        """
+        I write the data from my data argument to the MDMF file,
+        starting at offset. I continue writing data until my data
+        argument is exhausted, appending data to the file as necessary.
+        """
+        # assert IMutableUploadable.providedBy(data)
+        # to append data: offset=node.get_size_of_best_version()
+        # do we want to support compacting MDMF?
+        # for an MDMF file, this can be done with O(data.get_size())
+        # memory. For an SDMF file, any modification takes
+        # O(node.get_size_of_best_version()).
+
+
+class IMutableFileVersion(IReadable):
+    """I provide access to a particular version of a mutable file. The
+    access is read/write if I was obtained from a filenode derived from
+    a write cap, or read-only if the filenode was derived from a read cap.
+    """
+
+    def get_sequence_number():
+        """Return the sequence number of this version."""
+
+    def get_servermap():
+        """Return the IMutableFileServerMap instance that was used to create
+        this object.
+        """
+
+    def get_writekey():
+        """Return this filenode's writekey, or None if the node does not have
+        write-capability. This may be used to assist with data structures
+        that need to make certain data available only to writers, such as the
+        read-write child caps in dirnodes. The recommended process is to have
+        reader-visible data be submitted to the filenode in the clear (where
+        it will be encrypted by the filenode using the readkey), but encrypt
+        writer-visible data using this writekey.
+        """
+
+    # TODO: Can this be overwrite instead of replace?
+    def replace(new_contents):
+        """Replace the contents of the mutable file, provided that no other
+        node has published (or is attempting to publish, concurrently) a
+        newer version of the file than this one.
+
+        I will avoid modifying any share that is different than the version
+        given by get_sequence_number(). However, if another node is writing
+        to the file at the same time as me, I may manage to update some shares
+        while they update others. If I see any evidence of this, I will signal
+        UncoordinatedWriteError, and the file will be left in an inconsistent
+        state (possibly the version you provided, possibly the old version,
+        possibly somebody else's version, and possibly a mix of shares from
+        all of these).
+
+        The recommended response to UncoordinatedWriteError is to either
+        return it to the caller (since they failed to coordinate their
+        writes), or to attempt some sort of recovery. It may be sufficient to
+        wait a random interval (with exponential backoff) and repeat your
+        operation. If I do not signal UncoordinatedWriteError, then I was
+        able to write the new version without incident.
+
+        I return a Deferred that fires (with a PublishStatus object) when the
+        update has completed.
+        """
+
+    def modify(modifier_cb):
+        """Modify the contents of the file, by downloading this version,
+        applying the modifier function (or bound method), then uploading
+        the new version. This will succeed as long as no other node
+        publishes a version between the download and the upload.
+        I return a Deferred that fires (with a PublishStatus object) when
+        the update is complete.
+
+        The modifier callable will be given three arguments: a string (with
+        the old contents), a 'first_time' boolean, and a servermap. As with
+        download_to_data(), the old contents will be from this version,
+        but the modifier can use the servermap to make other decisions
+        (such as refusing to apply the delta if there are multiple parallel
+        versions, or if there is evidence of a newer unrecoverable version).
+        'first_time' will be True the first time the modifier is called,
+        and False on any subsequent calls.
+
+        The callable should return a string with the new contents. The
+        callable must be prepared to be called multiple times, and must
+        examine the input string to see if the change that it wants to make
+        is already present in the old version. If it does not need to make
+        any changes, it can either return None, or return its input string.
+
+        If the modifier raises an exception, it will be returned in the
+        errback.
+        """
+
+
 # The hierarchy looks like this:
 #  IFilesystemNode
 #   IFileNode
hunk ./src/allmydata/interfaces.py 758
     def raise_error():
         """Raise any error associated with this node."""
 
+    # XXX: These may not be appropriate outside the context of an IReadable.
     def get_size():
         """Return the length (in bytes) of the data this node represents. For
         directory nodes, I return the size of the backing store. I return
hunk ./src/allmydata/interfaces.py 775
 class IFileNode(IFilesystemNode):
     """I am a node which represents a file: a sequence of bytes. I am not a
     container, like IDirectoryNode."""
+    def get_best_readable_version():
+        """Return a Deferred that fires with an IReadable for the 'best'
+        available version of the file. The IReadable provides only read
+        access, even if this filenode was derived from a write cap.
 
hunk ./src/allmydata/interfaces.py 780
-class IImmutableFileNode(IFileNode):
-    def read(consumer, offset=0, size=None):
-        """Download a portion (possibly all) of the file's contents, making
-        them available to the given IConsumer. Return a Deferred that fires
-        (with the consumer) when the consumer is unregistered (either because
-        the last byte has been given to it, or because the consumer threw an
-        exception during write(), possibly because it no longer wants to
-        receive data). The portion downloaded will start at 'offset' and
-        contain 'size' bytes (or the remainder of the file if size==None).
-
-        The consumer will be used in non-streaming mode: an IPullProducer
-        will be attached to it.
+        For an immutable file, there is only one version. For a mutable
+        file, the 'best' version is the recoverable version with the
+        highest sequence number. If no uncoordinated writes have occurred,
+        and if enough shares are available, then this will be the most
+        recent version that has been uploaded. If no version is recoverable,
+        the Deferred will errback with an UnrecoverableFileError.
+        """
 
hunk ./src/allmydata/interfaces.py 788
-        The consumer will not receive data right away: several network trips
-        must occur first. The order of events will be::
+    def download_best_version():
+        """Download the contents of the version that would be returned
+        by get_best_readable_version(). This is equivalent to calling
+        download_to_data() on the IReadable given by that method.
 
hunk ./src/allmydata/interfaces.py 793
-         consumer.registerProducer(p, streaming)
-          (if streaming == False)::
-           consumer does p.resumeProducing()
-            consumer.write(data)
-           consumer does p.resumeProducing()
-            consumer.write(data).. (repeat until all data is written)
-         consumer.unregisterProducer()
-         deferred.callback(consumer)
+        I return a Deferred that fires with a byte string when the file
+        has been fully downloaded. To support streaming download, use
+        the 'read' method of IReadable. If no version is recoverable,
+        the Deferred will errback with an UnrecoverableFileError.
+        """
 
hunk ./src/allmydata/interfaces.py 799
-        If a download error occurs, or an exception is raised by
-        consumer.registerProducer() or consumer.write(), I will call
-        consumer.unregisterProducer() and then deliver the exception via
-        deferred.errback(). To cancel the download, the consumer should call
-        p.stopProducing(), which will result in an exception being delivered
-        via deferred.errback().
+    def get_size_of_best_version():
+        """Find the size of the version that would be returned by
+        get_best_readable_version().
 
hunk ./src/allmydata/interfaces.py 803
-        See src/allmydata/util/consumer.py for an example of a simple
-        download-to-memory consumer.
+        I return a Deferred that fires with an integer. If no version
+        is recoverable, the Deferred will errback with an
+        UnrecoverableFileError.
         """
 
hunk ./src/allmydata/interfaces.py 808
+
+class IImmutableFileNode(IFileNode, IReadable):
+    """I am a node representing an immutable file. Immutable files have
+    only one version"""
+
+
 class IMutableFileNode(IFileNode):
     """I provide access to a 'mutable file', which retains its identity
     regardless of what contents are put in it.
hunk ./src/allmydata/interfaces.py 873
     only be retrieved and updated all-at-once, as a single big string. Future
     versions of our mutable files will remove this restriction.
     """
-
-    def download_best_version():
-        """Download the 'best' available version of the file, meaning one of
-        the recoverable versions with the highest sequence number. If no
+    def get_best_mutable_version():
+        """Return a Deferred that fires with an IMutableFileVersion for
+        the 'best' available version of the file. The best version is
+        the recoverable version with the highest sequence number. If no
        uncoordinated writes have occurred, and if enough shares are
hunk ./src/allmydata/interfaces.py 878
-        available, then this will be the most recent version that has been
-        uploaded.
+        available, then this will be the most recent version that has
+        been uploaded.
 
hunk ./src/allmydata/interfaces.py 881
-        I update an internal servermap with MODE_READ, determine which
-        version of the file is indicated by
-        servermap.best_recoverable_version(), and return a Deferred that
-        fires with its contents. If no version is recoverable, the Deferred
-        will errback with UnrecoverableFileError.
-        """
-
-    def get_size_of_best_version():
-        """Find the size of the version that would be downloaded with
-        download_best_version(), without actually downloading the whole file.
-
-        I return a Deferred that fires with an integer.
+        If no version is recoverable, the Deferred will errback with an
+        UnrecoverableFileError.
         """
 
     def overwrite(new_contents):
hunk ./src/allmydata/interfaces.py 921
         errback.
         """
 
-
     def get_servermap(mode):
         """Return a Deferred that fires with an IMutableFileServerMap
         instance, updated using the given mode.
hunk ./src/allmydata/interfaces.py 974
         writer-visible data using this writekey.
         """
 
+    def set_version(version):
+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
+        we upload in SDMF for reasons of compatibility. If you want to
+        change this, set_version will let you do that.
+
+        To say that this file should be uploaded in SDMF, pass in a 0. To
+        say that the file should be uploaded as MDMF, pass in a 1.
+        """
+
+    def get_version():
+        """Returns the mutable file protocol version."""
+
 class NotEnoughSharesError(Exception):
     """Download was unable to get enough shares"""
 
hunk ./src/allmydata/interfaces.py 1822
         """The upload is finished, and whatever filehandle was in use may be
         closed."""
 
+
+class IMutableUploadable(Interface):
+    """
+    I represent content that is due to be uploaded to a mutable filecap.
+    """
+    # This is somewhat simpler than the IUploadable interface above
+    # because mutable files do not need to be concerned with possibly
+    # generating a CHK, nor with per-file keys. It is a subset of the
+    # methods in IUploadable, though, so we could just as well implement
+    # the mutable uploadables as IUploadables that don't happen to use
+    # those methods (with the understanding that the unused methods will
+    # never be called on such objects)
+    def get_size():
+        """
+        Returns a Deferred that fires with the size of the content held
+        by the uploadable.
+        """
+
+    def read(length):
+        """
+        Returns a list of strings which, when concatenated, are the next
+        length bytes of the file, or fewer if there are fewer bytes
+        between the current location and the end of the file.
+        """
+
+    def close():
+        """
+        The process that used the Uploadable is finished using it, so
+        the uploadable may be closed.
+        """
+
 class IUploadResults(Interface):
     """I am returned by upload() methods. I contain a number of public
     attributes which can be read to determine the results of the upload. Some
}
[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
] {
hunk ./src/allmydata/frontends/sftpd.py 33
 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
      NoSuchChildError, ChildOfWrongTypeError
 from allmydata.mutable.common import NotWriteableError
+from allmydata.mutable.publish import MutableFileHandle
 from allmydata.immutable.upload import FileHandle
 from allmydata.dirnode import update_metadata
 from allmydata.util.fileutil import EncryptedTemporaryFile
hunk ./src/allmydata/frontends/sftpd.py 667
         else:
             assert IFileNode.providedBy(filenode), filenode
 
-            if filenode.is_mutable():
-                self.async.addCallback(lambda ign: filenode.download_best_version())
-                def _downloaded(data):
-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
-                    self.consumer.write(data)
-                    self.consumer.finish()
-                    return None
-                self.async.addCallback(_downloaded)
-            else:
-                download_size = filenode.get_size()
-                assert download_size is not None, "download_size is None"
+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
+
+            def _read(version):
+                if noisy: self.log("_read", level=NOISY)
+                download_size = version.get_size()
+                assert download_size is not None
+
                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
hunk ./src/allmydata/frontends/sftpd.py 675
-                def _read(ign):
-                    if noisy: self.log("_read immutable", level=NOISY)
-                    filenode.read(self.consumer, 0, None)
-                self.async.addCallback(_read)
+
+                version.read(self.consumer, 0, None)
+            self.async.addCallback(_read)
 
         eventually(self.async.callback, None)
 
hunk ./src/allmydata/frontends/sftpd.py 821
                     assert parent and childname, (parent, childname, self.metadata)
                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
 
-                d2.addCallback(lambda ign: self.consumer.get_current_size())
-                d2.addCallback(lambda size: self.consumer.read(0, size))
-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
             else:
                 def _add_file(ign):
                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
}
[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
 Ignore-this: 93e536c0f8efb705310f13ff64621527
] {
hunk ./src/allmydata/immutable/filenode.py 8
 now = time.time
 from zope.interface import implements, Interface
 from twisted.internet import defer
-from twisted.internet.interfaces import IConsumer
 
hunk ./src/allmydata/immutable/filenode.py 9
-from allmydata.interfaces import IImmutableFileNode, IUploadResults
 from allmydata import uri
hunk ./src/allmydata/immutable/filenode.py 10
+from twisted.internet.interfaces import IConsumer
+from twisted.protocols import basic
+from foolscap.api import eventually
+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
+     IDownloadTarget, IUploadResults
+from allmydata.util import dictutil, log, base32, consumer
+from allmydata.immutable.checker import Checker
 from allmydata.check_results import CheckResults, CheckAndRepairResults
 from allmydata.util.dictutil import DictOfSets
 from pycryptopp.cipher.aes import AES
hunk ./src/allmydata/immutable/filenode.py 296
         return self._cnode.check_and_repair(monitor, verify, add_lease)
     def check(self, monitor, verify=False, add_lease=False):
         return self._cnode.check(monitor, verify, add_lease)
+
+    def get_best_readable_version(self):
+        """
+        Return an IReadable of the best version of this file. Since
+        immutable files can have only one version, we just return the
+        current filenode.
+        """
+        return defer.succeed(self)
+
+
+    def download_best_version(self):
+        """
+        Download the best version of this file, returning its contents
+        as a bytestring. Since there is only one version of an immutable
+        file, we download and return the contents of this file.
+        """
+        d = consumer.download_to_data(self)
+        return d
+
+    # for an immutable file, download_to_data (specified in IReadable)
+    # is the same as download_best_version (specified in IFileNode). For
+    # mutable files, the difference is more meaningful, since they can
+    # have multiple versions.
+    download_to_data = download_best_version
+
+
+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
+    # get_size_of_best_version(IFileNode) are all the same for immutable
+    # files.
+    get_size_of_best_version = get_current_size
}
[immutable/literal.py: implement the same interfaces as other filenodes
Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
] hunk ./src/allmydata/immutable/literal.py 106
         d.addCallback(lambda lastSent: consumer)
         return d
 
+    # IReadable, IFileNode, IFilesystemNode
+    def get_best_readable_version(self):
+        return defer.succeed(self)
+
+
+    def download_best_version(self):
+        return defer.succeed(self.u.data)
+
+
+    download_to_data = download_best_version
+    get_size_of_best_version = get_current_size
+
[scripts: tell 'tahoe put' about MDMF
Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
] {
hunk ./src/allmydata/scripts/cli.py 160
     optFlags = [
         ("mutable", "m", "Create a mutable file instead of an immutable one."),
         ]
+    optParameters = [
+        ("mutable-type", None, False, "Create a mutable file in the given format. Valid formats are 'sdmf' for SDMF and 'mdmf' for MDMF"),
+        ]
 
     def parseArgs(self, arg1=None, arg2=None):
         # see Examples below
hunk ./src/allmydata/scripts/tahoe_put.py 21
     from_file = options.from_file
     to_file = options.to_file
     mutable = options['mutable']
+    mutable_type = False
+
+    if mutable:
+        mutable_type = options['mutable-type']
     if options['quiet']:
         verbosity = 0
     else:
hunk ./src/allmydata/scripts/tahoe_put.py 33
     stdout = options.stdout
     stderr = options.stderr
 
+    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
+        # Don't try to pass unsupported types to the webapi
+        print >>stderr, "error: %s is an invalid format" % mutable_type
+        return 1
+
     if nodeurl[-1] != "/":
         nodeurl += "/"
     if to_file:
hunk ./src/allmydata/scripts/tahoe_put.py 76
         url = nodeurl + "uri"
     if mutable:
         url += "?mutable=true"
+    if mutable_type:
+        assert mutable
+        url += "&mutable-type=%s" % mutable_type
+
     if from_file:
         infileobj = open(os.path.expanduser(from_file), "rb")
     else:
}
[web: Alter the webapi to get along with and take advantage of the MDMF changes
Kevan Carstensen <kevan@isnotajoke.com>**20100814081012
 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
 
 The main benefit that the webapi gets from MDMF, at least initially, is
 the ability to do a streaming download of an MDMF mutable file. It also
 exposes a way (through the PUT verb) to append to or otherwise modify
 (in-place) an MDMF mutable file.
] {
hunk ./src/allmydata/web/common.py 12
 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
      FileTooLargeError, NotEnoughSharesError, NoSharesError, \
      EmptyPathnameComponentError, MustBeDeepImmutableError, \
-     MustBeReadonlyError, MustNotBeUnknownRWError
+     MustBeReadonlyError, MustNotBeUnknownRWError, SDMF_VERSION, MDMF_VERSION
 from allmydata.mutable.common import UnrecoverableFileError
 from allmydata.util import abbreviate
 from allmydata.util.encodingutil import to_str, quote_output
hunk ./src/allmydata/web/common.py 35
     else:
         return boolean_of_arg(replace)
 
+
+def parse_mutable_type_arg(arg):
+    if not arg:
+        return None # interpreted by the caller as "let the nodemaker decide"
+
+    arg = arg.lower()
+    assert arg in ("mdmf", "sdmf")
+
+    if arg == "mdmf":
+        return MDMF_VERSION
+
+    return SDMF_VERSION
+
+
+def parse_offset_arg(offset):
+    # XXX: This will raise a ValueError when invoked on something that
+    # is not an integer. Is that okay? Or do we want a better error
+    # message? Since this call is going to be used by programmers and
+    # their tools rather than users (through the wui), it is not
+    # inconsistent to return that, I guess.
+    offset = int(offset)
+    return offset
+
+
 def get_root(ctx_or_req):
     req = IRequest(ctx_or_req)
     # the addSlash=True gives us one extra (empty) segment
hunk ./src/allmydata/web/directory.py 19
 from allmydata.uri import from_string_dirnode
 from allmydata.interfaces import IDirectoryNode, IFileNode, IFilesystemNode, \
      IImmutableFileNode, IMutableFileNode, ExistingChildError, \
-     NoSuchChildError, EmptyPathnameComponentError
+     NoSuchChildError, EmptyPathnameComponentError, SDMF_VERSION, MDMF_VERSION
 from allmydata.monitor import Monitor, OperationCancelledError
 from allmydata import dirnode
 from allmydata.web.common import text_plain, WebError, \
hunk ./src/allmydata/web/directory.py 153
         if not t:
             # render the directory as HTML, using the docFactory and Nevow's
             # whole templating thing.
-            return DirectoryAsHTML(self.node)
+            return DirectoryAsHTML(self.node,
+                                   self.client.mutable_file_default)
 
         if t == "json":
             return DirectoryJSONMetadata(ctx, self.node)
hunk ./src/allmydata/web/directory.py 556
     docFactory = getxmlfile("directory.xhtml")
     addSlash = True
 
-    def __init__(self, node):
+    def __init__(self, node, default_mutable_format):
         rend.Page.__init__(self)
         self.node = node
 
hunk ./src/allmydata/web/directory.py 560
+        assert default_mutable_format in (MDMF_VERSION, SDMF_VERSION)
+        self.default_mutable_format = default_mutable_format
+
     def beforeRender(self, ctx):
         # attempt to get the dirnode's children, stashing them (or the
         # failure that results) for later use
hunk ./src/allmydata/web/directory.py 780
             ]]
         forms.append(T.div(class_="freeform-form")[mkdir])
 
+        # Build input elements for mutable file type. We do this outside
+        # of the list so we can check the appropriate format, based on
+        # the default configured in the client (which reflects the
+        # default configured in tahoe.cfg)
+        if self.default_mutable_format == MDMF_VERSION:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-mdmf', value='mdmf',
+                                 checked='checked')
+        else:
+            mdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-mdmf', value='mdmf')
+
+        if self.default_mutable_format == SDMF_VERSION:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-sdmf', value='sdmf',
+                                 checked="checked")
+        else:
+            sdmf_input = T.input(type='radio', name='mutable-type',
+                                 id='mutable-type-sdmf', value='sdmf')
+
         upload = T.form(action=".", method="post",
                         enctype="multipart/form-data")[
             T.fieldset[
hunk ./src/allmydata/web/directory.py 812
             T.input(type="submit", value="Upload"),
             " Mutable?:",
             T.input(type="checkbox", name="mutable"),
+            sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
+            mdmf_input,
+            T.label(for_="mutable-type-mdmf")["MDMF (experimental)"],
             ]]
         forms.append(T.div(class_="freeform-form")[upload])
 
hunk ./src/allmydata/web/directory.py 850
                 kiddata = ("filenode", {'size': childnode.get_size(),
                                         'mutable': childnode.is_mutable(),
                                         })
+                if childnode.is_mutable() and \
+                    childnode.get_version() is not None:
+                    mutable_type = childnode.get_version()
+                    assert mutable_type in (SDMF_VERSION, MDMF_VERSION)
+
+                    if mutable_type == MDMF_VERSION:
+                        mutable_type = "mdmf"
+                    else:
+                        mutable_type = "sdmf"
+                    kiddata[1]['mutable-type'] = mutable_type
+
             elif IDirectoryNode.providedBy(childnode):
                 kiddata = ("dirnode", {'mutable': childnode.is_mutable()})
             else:
hunk ./src/allmydata/web/filenode.py 9
 from nevow import url, rend
 from nevow.inevow import IRequest
 
-from allmydata.interfaces import ExistingChildError
+from allmydata.interfaces import ExistingChildError, SDMF_VERSION, MDMF_VERSION
 from allmydata.monitor import Monitor
 from allmydata.immutable.upload import FileHandle
+from allmydata.mutable.publish import MutableFileHandle
+from allmydata.mutable.common import MODE_READ
 from allmydata.util import log, base32
 
 from allmydata.web.common import text_plain, WebError, RenderMixin, \
hunk ./src/allmydata/web/filenode.py 18
      boolean_of_arg, get_arg, should_create_intermediate_directories, \
-     MyExceptionHandler, parse_replace_arg
+     MyExceptionHandler, parse_replace_arg, parse_offset_arg, \
+     parse_mutable_type_arg
 from allmydata.web.check_results import CheckResults, \
      CheckAndRepairResults, LiteralCheckResults
 from allmydata.web.info import MoreInfo
hunk ./src/allmydata/web/filenode.py 29
         # a new file is being uploaded in our place.
         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
         if mutable:
-            req.content.seek(0)
-            data = req.content.read()
-            d = client.create_mutable_file(data)
+            mutable_type = parse_mutable_type_arg(get_arg(req,
+                                                          "mutable-type",
+                                                          None))
+            data = MutableFileHandle(req.content)
+            d = client.create_mutable_file(data, version=mutable_type)
             def _uploaded(newnode):
                 d2 = self.parentnode.set_node(self.name, newnode,
                                               overwrite=replace)
hunk ./src/allmydata/web/filenode.py 66
         d.addCallback(lambda res: childnode.get_uri())
         return d
 
-    def _read_data_from_formpost(self, req):
-        # SDMF: files are small, and we can only upload data, so we read
-        # the whole file into memory before uploading.
-        contents = req.fields["file"]
-        contents.file.seek(0)
-        data = contents.file.read()
-        return data
 
     def replace_me_with_a_formpost(self, req, client, replace):
         # create a new file, maybe mutable, maybe immutable
hunk ./src/allmydata/web/filenode.py 71
         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
 
+        # create an immutable file
+        contents = req.fields["file"]
        if mutable:
hunk ./src/allmydata/web/filenode.py 74
-            data = self._read_data_from_formpost(req)
-            d = client.create_mutable_file(data)
+            mutable_type = parse_mutable_type_arg(get_arg(req, "mutable-type",
+                                                          None))
+            uploadable = MutableFileHandle(contents.file)
+            d = client.create_mutable_file(uploadable, version=mutable_type)
             def _uploaded(newnode):
                 d2 = self.parentnode.set_node(self.name, newnode,
                                               overwrite=replace)
hunk ./src/allmydata/web/filenode.py 85
                 return d2
             d.addCallback(_uploaded)
             return d
-        # create an immutable file
-        contents = req.fields["file"]
+
         uploadable = FileHandle(contents.file, convergence=client.convergence)
         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
         d.addCallback(lambda newnode: newnode.get_uri())
hunk ./src/allmydata/web/filenode.py 91
         return d
 
+
 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
     def __init__(self, client, parentnode, name):
         rend.Page.__init__(self)
hunk ./src/allmydata/web/filenode.py 174
             # properly. So we assume that at least the browser will agree
             # with itself, and echo back the same bytes that we were given.
             filename = get_arg(req, "filename", self.name) or "unknown"
-            if self.node.is_mutable():
-                # some day: d = self.node.get_best_version()
-                d = makeMutableDownloadable(self.node)
-            else:
-                d = defer.succeed(self.node)
+            d = self.node.get_best_readable_version()
             d.addCallback(lambda dn: FileDownloader(dn, filename))
             return d
         if t == "json":
hunk ./src/allmydata/web/filenode.py 178
-            if self.parentnode and self.name:
-                d = self.parentnode.get_metadata_for(self.name)
+            # We do this to make sure that fields like size and
+            # mutable-type (which depend on the file on the grid and not
+            # just on the cap) are filled in. The latter gets used in
+            # tests, in particular.
+            #
+            # TODO: Make it so that the servermap knows how to update in
+            # a mode specifically designed to fill in these fields, and
+            # then update it in that mode.
+            if self.node.is_mutable():
+                d = self.node.get_servermap(MODE_READ)
             else:
                 d = defer.succeed(None)
hunk ./src/allmydata/web/filenode.py 190
+            if self.parentnode and self.name:
+                d.addCallback(lambda ignored:
+                    self.parentnode.get_metadata_for(self.name))
+            else:
+                d.addCallback(lambda ignored: None)
             d.addCallback(lambda md: FileJSONMetadata(ctx, self.node, md))
             return d
         if t == "info":
hunk ./src/allmydata/web/filenode.py 211
         if t:
             raise WebError("GET file: bad t=%s" % t)
         filename = get_arg(req, "filename", self.name) or "unknown"
-        if self.node.is_mutable():
-            # some day: d = self.node.get_best_version()
-            d = makeMutableDownloadable(self.node)
-        else:
-            d = defer.succeed(self.node)
+        d = self.node.get_best_readable_version()
         d.addCallback(lambda dn: FileDownloader(dn, filename))
         return d
 
hunk ./src/allmydata/web/filenode.py 219
         req = IRequest(ctx)
         t = get_arg(req, "t", "").strip()
         replace = parse_replace_arg(get_arg(req, "replace", "true"))
+        offset = parse_offset_arg(get_arg(req, "offset", -1))
 
         if not t:
hunk ./src/allmydata/web/filenode.py 222
-            if self.node.is_mutable():
+            if self.node.is_mutable() and offset >= 0:
+                return self.update_my_contents(req, offset)
+
+            elif self.node.is_mutable():
                 return self.replace_my_contents(req)
             if not replace:
                 # this is the early trap: if someone else modifies the
hunk ./src/allmydata/web/filenode.py 232
                 # directory while we're uploading, the add_file(overwrite=)
                 # call in replace_me_with_a_child will do the late trap.
                 raise ExistingChildError()
+            if offset >= 0:
+                raise WebError("PUT to a file: append operation invoked "
+                               "on an immutable cap")
+
+
             assert self.parentnode and self.name
             return self.replace_me_with_a_child(req, self.client, replace)
         if t == "uri":
hunk ./src/allmydata/web/filenode.py 299
 
     def replace_my_contents(self, req):
         req.content.seek(0)
-        new_contents = req.content.read()
+        new_contents = MutableFileHandle(req.content)
         d = self.node.overwrite(new_contents)
         d.addCallback(lambda res: self.node.get_uri())
         return d
hunk ./src/allmydata/web/filenode.py 304
 
+
+    def update_my_contents(self, req, offset):
+        req.content.seek(0)
+        added_contents = MutableFileHandle(req.content)
+
+        d = self.node.get_best_mutable_version()
+        d.addCallback(lambda mv:
+            mv.update(added_contents, offset))
+        d.addCallback(lambda ignored:
+            self.node.get_uri())
+        return d
+
+
     def replace_my_contents_with_a_formpost(self, req):
         # we have a mutable file. Get the data from the formpost, and replace
         # the mutable file's contents with it.
hunk ./src/allmydata/web/filenode.py 320
-        new_contents = self._read_data_from_formpost(req)
+        new_contents = req.fields['file']
+        new_contents = MutableFileHandle(new_contents.file)
+
         d = self.node.overwrite(new_contents)
         d.addCallback(lambda res: self.node.get_uri())
         return d
hunk ./src/allmydata/web/filenode.py 327
 
-class MutableDownloadable:
-    #implements(IDownloadable)
-    def __init__(self, size, node):
-        self.size = size
-        self.node = node
-    def get_size(self):
-        return self.size
-    def is_mutable(self):
-        return True
-    def read(self, consumer, offset=0, size=None):
-        d = self.node.download_best_version()
-        d.addCallback(self._got_data, consumer, offset, size)
-        return d
-    def _got_data(self, contents, consumer, offset, size):
-        start = offset
-        if size is not None:
-            end = offset+size
-        else:
-            end = self.size
-        # SDMF: we can write the whole file in one big chunk
-        consumer.write(contents[start:end])
-        return consumer
-
-def makeMutableDownloadable(n):
-    d = defer.maybeDeferred(n.get_size_of_best_version)
-    d.addCallback(MutableDownloadable, n)
-    return d
 
 class FileDownloader(rend.Page):
     # since we override the rendering process (to let the tahoe Downloader
1133     # since we override the rendering process (to let the tahoe Downloader
1134hunk ./src/allmydata/web/filenode.py 509
1135     data[1]['mutable'] = filenode.is_mutable()
1136     if edge_metadata is not None:
1137         data[1]['metadata'] = edge_metadata
1138+
1139+    if filenode.is_mutable() and filenode.get_version() is not None:
1140+        mutable_type = filenode.get_version()
1141+        assert mutable_type in (MDMF_VERSION, SDMF_VERSION)
1142+        if mutable_type == MDMF_VERSION:
1143+            mutable_type = "mdmf"
1144+        else:
1145+            mutable_type = "sdmf"
1146+        data[1]['mutable-type'] = mutable_type
1147+
1148     return text_plain(simplejson.dumps(data, indent=1) + "\n", ctx)
1149 
1150 def FileURI(ctx, filenode):
1151hunk ./src/allmydata/web/root.py 15
1152 from allmydata import get_package_versions_string
1153 from allmydata import provisioning
1154 from allmydata.util import idlib, log
1155-from allmydata.interfaces import IFileNode
1156+from allmydata.interfaces import IFileNode, MDMF_VERSION, SDMF_VERSION
1157 from allmydata.web import filenode, directory, unlinked, status, operations
1158 from allmydata.web import reliability, storage
1159 from allmydata.web.common import abbreviate_size, getxmlfile, WebError, \
1160hunk ./src/allmydata/web/root.py 19
1161-     get_arg, RenderMixin, boolean_of_arg
1162+     get_arg, RenderMixin, boolean_of_arg, parse_mutable_type_arg
1163 
1164 
1165 class URIHandler(RenderMixin, rend.Page):
1166hunk ./src/allmydata/web/root.py 50
1167         if t == "":
1168             mutable = boolean_of_arg(get_arg(req, "mutable", "false").strip())
1169             if mutable:
1170-                return unlinked.PUTUnlinkedSSK(req, self.client)
1171+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
1172+                                                 None))
1173+                return unlinked.PUTUnlinkedSSK(req, self.client, version)
1174             else:
1175                 return unlinked.PUTUnlinkedCHK(req, self.client)
1176         if t == "mkdir":
1177hunk ./src/allmydata/web/root.py 70
1178         if t in ("", "upload"):
1179             mutable = bool(get_arg(req, "mutable", "").strip())
1180             if mutable:
1181-                return unlinked.POSTUnlinkedSSK(req, self.client)
1182+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
1183+                                                         None))
1184+                return unlinked.POSTUnlinkedSSK(req, self.client, version)
1185             else:
1186                 return unlinked.POSTUnlinkedCHK(req, self.client)
1187         if t == "mkdir":
1188hunk ./src/allmydata/web/root.py 324
1189 
1190     def render_upload_form(self, ctx, data):
1191         # this is a form where users can upload unlinked files
1192+        #
1193+        # for mutable files, users can choose the format by selecting
1194+        # MDMF or SDMF from a radio button. They can also configure a
1195+        # default format in tahoe.cfg, which they rightly expect us to
1196+        # obey. we convey to them that we are obeying their choice by
1197+        # ensuring that the one that they've chosen is selected in the
1198+        # interface.
1199+        if self.client.mutable_file_default == MDMF_VERSION:
1200+            mdmf_input = T.input(type='radio', name='mutable-type',
1201+                                 value='mdmf', id='mutable-type-mdmf',
1202+                                 checked='checked')
1203+        else:
1204+            mdmf_input = T.input(type='radio', name='mutable-type',
1205+                                 value='mdmf', id='mutable-type-mdmf')
1206+
1207+        if self.client.mutable_file_default == SDMF_VERSION:
1208+            sdmf_input = T.input(type='radio', name='mutable-type',
1209+                                 value='sdmf', id='mutable-type-sdmf',
1210+                                 checked='checked')
1211+        else:
1212+            sdmf_input = T.input(type='radio', name='mutable-type',
1213+                                 value='sdmf', id='mutable-type-sdmf')
1214+
1215+
1216         form = T.form(action="uri", method="post",
1217                       enctype="multipart/form-data")[
1218             T.fieldset[
1219hunk ./src/allmydata/web/root.py 356
1220                   T.input(type="file", name="file", class_="freeform-input-file")],
1221             T.input(type="hidden", name="t", value="upload"),
1222             T.div[T.input(type="checkbox", name="mutable"), T.label(for_="mutable")["Create mutable file"],
1223+                  sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
1224+                  mdmf_input,
1225+                  T.label(for_='mutable-type-mdmf')['MDMF (experimental)'],
1226                   " ", T.input(type="submit", value="Upload!")],
1227             ]]
1228         return T.div[form]
1229hunk ./src/allmydata/web/unlinked.py 7
1230 from twisted.internet import defer
1231 from nevow import rend, url, tags as T
1232 from allmydata.immutable.upload import FileHandle
1233+from allmydata.mutable.publish import MutableFileHandle
1234 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
1235      convert_children_json, WebError
1236 from allmydata.web import status
1237hunk ./src/allmydata/web/unlinked.py 20
1238     # that fires with the URI of the new file
1239     return d
1240 
1241-def PUTUnlinkedSSK(req, client):
1242+def PUTUnlinkedSSK(req, client, version):
1243     # SDMF: files are small, and we can only upload data
1244     req.content.seek(0)
1245hunk ./src/allmydata/web/unlinked.py 23
1246-    data = req.content.read()
1247-    d = client.create_mutable_file(data)
1248+    data = MutableFileHandle(req.content)
1249+    d = client.create_mutable_file(data, version=version)
1250     d.addCallback(lambda n: n.get_uri())
1251     return d
1252 
1253hunk ./src/allmydata/web/unlinked.py 83
1254                       ["/uri/" + res.uri])
1255         return d
1256 
1257-def POSTUnlinkedSSK(req, client):
1258+def POSTUnlinkedSSK(req, client, version):
1259     # "POST /uri", to create an unlinked file.
1260     # SDMF: files are small, and we can only upload data
1261hunk ./src/allmydata/web/unlinked.py 86
1262-    contents = req.fields["file"]
1263-    contents.file.seek(0)
1264-    data = contents.file.read()
1265-    d = client.create_mutable_file(data)
1266+    contents = req.fields["file"].file
1267+    data = MutableFileHandle(contents)
1268+    d = client.create_mutable_file(data, version=version)
1269     d.addCallback(lambda n: n.get_uri())
1270     return d
1271 
1272}
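(For illustration, not part of the patch: the new mutable-type argument can
be exercised against a running gateway roughly as follows. The host/port and
file contents are assumptions; "mdmf" and "sdmf" are the values the handlers
above accept, and omitting mutable-type falls back to the node's default.)

    import httplib

    conn = httplib.HTTPConnection("127.0.0.1", 3456)  # default webapi port
    # PUT an unlinked mutable file, requesting the MDMF format explicitly.
    conn.request("PUT", "/uri?mutable=true&mutable-type=mdmf",
                 "initial contents\n")
    resp = conn.getresponse()
    print resp.status, resp.read()  # on success, the body is the new cap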
1273[client.py: learn how to create different kinds of mutable files
1274Kevan Carstensen <kevan@isnotajoke.com>**20100814225711
1275 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
1276] {
1277hunk ./src/allmydata/client.py 25
1278 from allmydata.util.time_format import parse_duration, parse_date
1279 from allmydata.stats import StatsProvider
1280 from allmydata.history import History
1281-from allmydata.interfaces import IStatsProducer, RIStubClient
1282+from allmydata.interfaces import IStatsProducer, RIStubClient, \
1283+                                 SDMF_VERSION, MDMF_VERSION
1284 from allmydata.nodemaker import NodeMaker
1285 
1286 
1287hunk ./src/allmydata/client.py 357
1288                                    self.terminator,
1289                                    self.get_encoding_parameters(),
1290                                    self._key_generator)
1291+        default = self.get_config("client", "mutable.format", default="sdmf")
1292+        if default == "mdmf":
1293+            self.mutable_file_default = MDMF_VERSION
1294+        else:
1295+            self.mutable_file_default = SDMF_VERSION
1296 
1297     def get_history(self):
1298         return self.history
1299hunk ./src/allmydata/client.py 500
1300     def create_immutable_dirnode(self, children, convergence=None):
1301         return self.nodemaker.create_immutable_directory(children, convergence)
1302 
1303-    def create_mutable_file(self, contents=None, keysize=None):
1304-        return self.nodemaker.create_mutable_file(contents, keysize)
1305+    def create_mutable_file(self, contents=None, keysize=None, version=None):
1306+        if not version:
1307+            version = self.mutable_file_default
1308+        return self.nodemaker.create_mutable_file(contents, keysize,
1309+                                                  version=version)
1310 
1311     def upload(self, uploadable):
1312         uploader = self.getServiceNamed("uploader")
1313}
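(Illustrative sketch, not part of the patch: with this change a node-wide
default can be set with "mutable.format = mdmf" in the [client] section of
tahoe.cfg, and callers can override it per file. create_mdmf_file is a
hypothetical helper; MutableData comes from the publish changes elsewhere in
this bundle.)

    from allmydata.interfaces import MDMF_VERSION
    from allmydata.mutable.publish import MutableData

    def create_mdmf_file(client, contents):
        # version=None (the default) falls back to
        # client.mutable_file_default, which is read from
        # [client]mutable.format at startup.
        d = client.create_mutable_file(MutableData(contents),
                                       version=MDMF_VERSION)
        d.addCallback(lambda node: node.get_uri())
        return d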
1314[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
1315Kevan Carstensen <kevan@isnotajoke.com>**20100819003216
1316 Ignore-this: d3bd3260742be8964877f0a53543b01b
1317 
1318 The checker and repairer required minimal changes to work with the MDMF
1319 modifications made elsewhere. The checker duplicated a lot of the code
1320 that was already in the downloader, so I modified the downloader
1321 slightly to expose this functionality to the checker and removed the
1322 duplicated code. The repairer only required a minor change to deal with
1323 data representation.
1324] {
1325hunk ./src/allmydata/mutable/checker.py 2
1326 
1327-from twisted.internet import defer
1328-from twisted.python import failure
1329-from allmydata import hashtree
1330 from allmydata.uri import from_string
1331hunk ./src/allmydata/mutable/checker.py 3
1332-from allmydata.util import hashutil, base32, idlib, log
1333+from allmydata.util import base32, idlib, log
1334 from allmydata.check_results import CheckAndRepairResults, CheckResults
1335 
1336 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
1337hunk ./src/allmydata/mutable/checker.py 8
1338 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1339-from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
1340+from allmydata.mutable.retrieve import Retrieve # for verifying
1341 
1342 class MutableChecker:
1343 
1344hunk ./src/allmydata/mutable/checker.py 25
1345 
1346     def check(self, verify=False, add_lease=False):
1347         servermap = ServerMap()
1348+        # Updating the servermap in MODE_CHECK will stand a good chance
1349+        # of finding all of the shares, and getting a good idea of
1350+        # recoverability, etc, without verifying.
1351         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
1352                              servermap, MODE_CHECK, add_lease=add_lease)
1353         if self._history:
1354hunk ./src/allmydata/mutable/checker.py 51
1355         if num_recoverable:
1356             self.best_version = servermap.best_recoverable_version()
1357 
1358+        # The file is unhealthy and needs to be repaired if:
1359+        # - There are unrecoverable versions.
1360         if servermap.unrecoverable_versions():
1361             self.need_repair = True
1362hunk ./src/allmydata/mutable/checker.py 55
1363+        # - There isn't a recoverable version.
1364         if num_recoverable != 1:
1365             self.need_repair = True
1366hunk ./src/allmydata/mutable/checker.py 58
1367+        # - The best recoverable version is missing some shares.
1368         if self.best_version:
1369             available_shares = servermap.shares_available()
1370             (num_distinct_shares, k, N) = available_shares[self.best_version]
1371hunk ./src/allmydata/mutable/checker.py 69
1372 
1373     def _verify_all_shares(self, servermap):
1374         # read every byte of each share
1375+        #
1376+        # This logic is going to be very nearly the same as the
1377+        # downloader. I bet we could pass the downloader a flag that
1378+        # makes it do this, and piggyback onto that instead of
1379+        # duplicating a bunch of code.
1380+        #
1381+        # Like:
1382+        #  r = Retrieve(blah, blah, blah, verify=True)
1383+        #  d = r.download()
1384+        #  (wait, wait, wait, d.callback)
1385+        # 
1386+        #  Then, when it has finished, we can check the servermap (which
1387+        #  we provided to Retrieve) to figure out which shares are bad,
1388+        #  since the Retrieve process will have updated the servermap as
1389+        #  it went along.
1390+        #
1391+        #  By passing the verify=True flag to the constructor, we are
1392+        #  telling the downloader a few things.
1393+        #
1394+        #  1. It needs to download all N shares, not just K shares.
1395+        #  2. It doesn't need to decrypt or decode the shares, only
1396+        #     verify them.
1397         if not self.best_version:
1398             return
1399hunk ./src/allmydata/mutable/checker.py 93
1400-        versionmap = servermap.make_versionmap()
1401-        shares = versionmap[self.best_version]
1402-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1403-         offsets_tuple) = self.best_version
1404-        offsets = dict(offsets_tuple)
1405-        readv = [ (0, offsets["EOF"]) ]
1406-        dl = []
1407-        for (shnum, peerid, timestamp) in shares:
1408-            ss = servermap.connections[peerid]
1409-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1410-            d.addCallback(self._got_answer, peerid, servermap)
1411-            dl.append(d)
1412-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
1413 
1414hunk ./src/allmydata/mutable/checker.py 94
1415-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1416-        # isolate the callRemote to a separate method, so tests can subclass
1417-        # Publish and override it
1418-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1419+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
1420+        d = r.download()
1421+        d.addCallback(self._process_bad_shares)
1422         return d
1423 
1424hunk ./src/allmydata/mutable/checker.py 99
1425-    def _got_answer(self, datavs, peerid, servermap):
1426-        for shnum,datav in datavs.items():
1427-            data = datav[0]
1428-            try:
1429-                self._got_results_one_share(shnum, peerid, data)
1430-            except CorruptShareError:
1431-                f = failure.Failure()
1432-                self.need_repair = True
1433-                self.bad_shares.append( (peerid, shnum, f) )
1434-                prefix = data[:SIGNED_PREFIX_LENGTH]
1435-                servermap.mark_bad_share(peerid, shnum, prefix)
1436-                ss = servermap.connections[peerid]
1437-                self.notify_server_corruption(ss, shnum, str(f.value))
1438-
1439-    def check_prefix(self, peerid, shnum, data):
1440-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1441-         offsets_tuple) = self.best_version
1442-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
1443-        if got_prefix != prefix:
1444-            raise CorruptShareError(peerid, shnum,
1445-                                    "prefix mismatch: share changed while we were reading it")
1446-
1447-    def _got_results_one_share(self, shnum, peerid, data):
1448-        self.check_prefix(peerid, shnum, data)
1449-
1450-        # the [seqnum:signature] pieces are validated by _compare_prefix,
1451-        # which checks their signature against the pubkey known to be
1452-        # associated with this file.
1453 
1454hunk ./src/allmydata/mutable/checker.py 100
1455-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
1456-         share_hash_chain, block_hash_tree, share_data,
1457-         enc_privkey) = unpack_share(data)
1458-
1459-        # validate [share_hash_chain,block_hash_tree,share_data]
1460-
1461-        leaves = [hashutil.block_hash(share_data)]
1462-        t = hashtree.HashTree(leaves)
1463-        if list(t) != block_hash_tree:
1464-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
1465-        share_hash_leaf = t[0]
1466-        t2 = hashtree.IncompleteHashTree(N)
1467-        # root_hash was checked by the signature
1468-        t2.set_hashes({0: root_hash})
1469-        try:
1470-            t2.set_hashes(hashes=share_hash_chain,
1471-                          leaves={shnum: share_hash_leaf})
1472-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
1473-                IndexError), e:
1474-            msg = "corrupt hashes: %s" % (e,)
1475-            raise CorruptShareError(peerid, shnum, msg)
1476-
1477-        # validate enc_privkey: only possible if we have a write-cap
1478-        if not self._node.is_readonly():
1479-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
1480-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
1481-            if alleged_writekey != self._node.get_writekey():
1482-                raise CorruptShareError(peerid, shnum, "invalid privkey")
1483+    def _process_bad_shares(self, bad_shares):
1484+        if bad_shares:
1485+            self.need_repair = True
1486+        self.bad_shares = bad_shares
1487 
1488hunk ./src/allmydata/mutable/checker.py 105
1489-    def notify_server_corruption(self, ss, shnum, reason):
1490-        ss.callRemoteOnly("advise_corrupt_share",
1491-                          "mutable", self._storage_index, shnum, reason)
1492 
1493     def _count_shares(self, smap, version):
1494         available_shares = smap.shares_available()
1495hunk ./src/allmydata/mutable/repairer.py 5
1496 from zope.interface import implements
1497 from twisted.internet import defer
1498 from allmydata.interfaces import IRepairResults, ICheckResults
1499+from allmydata.mutable.publish import MutableData
1500 
1501 class RepairResults:
1502     implements(IRepairResults)
1503hunk ./src/allmydata/mutable/repairer.py 108
1504             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
1505 
1506         d = self.node.download_version(smap, best_version, fetch_privkey=True)
1507+        d.addCallback(lambda data:
1508+            MutableData(data))
1509         d.addCallback(self.node.upload, smap)
1510         d.addCallback(self.get_results, smap)
1511         return d
1512}
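(Illustrative sketch, not part of the patch: the verification path that the
checker now delegates to, written as a standalone helper. verify_shares is
hypothetical; judging from the checker code above, the Deferred returned by
Retrieve.download() in verify mode fires with the list of bad shares.)

    from allmydata.mutable.retrieve import Retrieve

    def verify_shares(node, servermap, best_version):
        # verify=True makes the downloader fetch all N shares and validate
        # them, without decrypting or decoding anything.
        r = Retrieve(node, servermap, best_version, verify=True)
        return r.download()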
1513[mutable/filenode.py: add versions and partial-file updates to the mutable file node
1514Kevan Carstensen <kevan@isnotajoke.com>**20100819003231
1515 Ignore-this: b7b5434201fdb9b48f902d7ab25ef45c
1516 
1517 One of the goals of MDMF as a GSoC project is to lay the groundwork for
1518 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
1519 multiple versions of a single cap on the grid. In line with this, there
1520 is now a distinction between an overriding mutable file (which can be
1521 thought to correspond to the cap/unique identifier for that mutable
1522 file) and versions of the mutable file (which we can download, update,
1523 and so on). All download, upload, and modification operations end up
1524 happening on a particular version of a mutable file, but there are
1525 shortcut methods on the object representing the overriding mutable file
1526 that perform these operations on the best version of the mutable file
1527 (which is what code should be doing until we have LDMF and better
1528 support for other paradigms).
1529 
1530 Another goal of MDMF was to take advantage of segmentation to give
1531 callers more efficient partial file updates or appends. This patch
1532 implements methods that do that, too.
1533 
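(Illustrative sketch, not part of the patch: the distinction described above,
as seen by a caller. node is assumed to be a writable MutableFileNode;
MutableData comes from the publish changes elsewhere in this bundle.)

    from allmydata.mutable.publish import MutableData

    # Shortcut method on the overriding file node; it acts on the best
    # recoverable version for you.
    d1 = node.download_best_version()

    # Or obtain an explicit version object and operate through it.
    d2 = node.get_best_mutable_version()
    d2.addCallback(lambda mfv: mfv.overwrite(MutableData("new contents\n")))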
1534] {
1535hunk ./src/allmydata/mutable/filenode.py 7
1536 from zope.interface import implements
1537 from twisted.internet import defer, reactor
1538 from foolscap.api import eventually
1539-from allmydata.interfaces import IMutableFileNode, \
1540-     ICheckable, ICheckResults, NotEnoughSharesError
1541-from allmydata.util import hashutil, log
1542+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
1543+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
1544+     IMutableFileVersion, IWritable
1545+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
1546 from allmydata.util.assertutil import precondition
1547 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
1548 from allmydata.monitor import Monitor
1549hunk ./src/allmydata/mutable/filenode.py 16
1550 from pycryptopp.cipher.aes import AES
1551 
1552-from allmydata.mutable.publish import Publish
1553+from allmydata.mutable.publish import Publish, MutableData,\
1554+                                      DEFAULT_MAX_SEGMENT_SIZE, \
1555+                                      TransformingUploadable
1556 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
1557      ResponseCache, UncoordinatedWriteError
1558 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1559hunk ./src/allmydata/mutable/filenode.py 70
1560         self._sharemap = {} # known shares, shnum-to-[nodeids]
1561         self._cache = ResponseCache()
1562         self._most_recent_size = None
1563+        # filled in after __init__ if we're being created for the first time;
1564+        # filled in by the servermap updater before publishing, otherwise.
1565+        # set to this default value in case neither of those things happen,
1566+        # or in case the servermap can't find any shares to tell us what
1567+        # to publish as.
1568+        # TODO: Set this back to None, and find out why the tests fail
1569+        #       with it set to None.
1570+        self._protocol_version = None
1571 
1572         # all users of this MutableFileNode go through the serializer. This
1573         # takes advantage of the fact that Deferreds discard the callbacks
1574hunk ./src/allmydata/mutable/filenode.py 134
1575         return self._upload(initial_contents, None)
1576 
1577     def _get_initial_contents(self, contents):
1578-        if isinstance(contents, str):
1579-            return contents
1580         if contents is None:
1581hunk ./src/allmydata/mutable/filenode.py 135
1582-            return ""
1583+            return MutableData("")
1584+
1585+        if IMutableUploadable.providedBy(contents):
1586+            return contents
1587+
1588         assert callable(contents), "%s should be callable, not %s" % \
1589                (contents, type(contents))
1590         return contents(self)
1591hunk ./src/allmydata/mutable/filenode.py 209
1592 
1593     def get_size(self):
1594         return self._most_recent_size
1595+
1596     def get_current_size(self):
1597         d = self.get_size_of_best_version()
1598         d.addCallback(self._stash_size)
1599hunk ./src/allmydata/mutable/filenode.py 214
1600         return d
1601+
1602     def _stash_size(self, size):
1603         self._most_recent_size = size
1604         return size
1605hunk ./src/allmydata/mutable/filenode.py 273
1606             return cmp(self.__class__, them.__class__)
1607         return cmp(self._uri, them._uri)
1608 
1609-    def _do_serialized(self, cb, *args, **kwargs):
1610-        # note: to avoid deadlock, this callable is *not* allowed to invoke
1611-        # other serialized methods within this (or any other)
1612-        # MutableFileNode. The callable should be a bound method of this same
1613-        # MFN instance.
1614-        d = defer.Deferred()
1615-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
1616-        # we need to put off d.callback until this Deferred is finished being
1617-        # processed. Otherwise the caller's subsequent activities (like,
1618-        # doing other things with this node) can cause reentrancy problems in
1619-        # the Deferred code itself
1620-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
1621-        # add a log.err just in case something really weird happens, because
1622-        # self._serializer stays around forever, therefore we won't see the
1623-        # usual Unhandled Error in Deferred that would give us a hint.
1624-        self._serializer.addErrback(log.err)
1625-        return d
1626 
1627     #################################
1628     # ICheckable
1629hunk ./src/allmydata/mutable/filenode.py 298
1630 
1631 
1632     #################################
1633-    # IMutableFileNode
1634+    # IFileNode
1635+
1636+    def get_best_readable_version(self):
1637+        """
1638+        I return a Deferred that fires with a MutableFileVersion
1639+        representing the best readable version of the file that I
1640+        represent
1641+        """
1642+        return self.get_readable_version()
1643+
1644+
1645+    def get_readable_version(self, servermap=None, version=None):
1646+        """
1647+        I return a Deferred that fires with a MutableFileVersion for my
1648+        version argument, if there is a recoverable file of that version
1649+        on the grid. If there is no recoverable version, I fire with an
1650+        UnrecoverableFileError.
1651+
1652+        If a servermap is provided, I look in there for the requested
1653+        version. If no servermap is provided, I create and update a new
1654+        one.
1655+
1656+        If no version is provided, then I return a MutableFileVersion
1657+        representing the best recoverable version of the file.
1658+        """
1659+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
1660+        def _build_version((servermap, their_version)):
1661+            assert their_version in servermap.recoverable_versions()
1662+            assert their_version in servermap.make_versionmap()
1663+
1664+            mfv = MutableFileVersion(self,
1665+                                     servermap,
1666+                                     their_version,
1667+                                     self._storage_index,
1668+                                     self._storage_broker,
1669+                                     self._readkey,
1670+                                     history=self._history)
1671+            assert mfv.is_readonly()
1672+            # our caller can use this to download the contents of the
1673+            # mutable file.
1674+            return mfv
1675+        return d.addCallback(_build_version)
1676+
1677+
1678+    def _get_version_from_servermap(self,
1679+                                    mode,
1680+                                    servermap=None,
1681+                                    version=None):
1682+        """
1683+        I return a Deferred that fires with (servermap, version).
1684+
1685+        This function performs validation and a servermap update. If it
1686+        returns (servermap, version), the caller can assume that:
1687+            - servermap was last updated in mode.
1688+            - version is recoverable, and corresponds to the servermap.
1689+
1690+        If version and servermap are provided to me, I will validate
1691+        that version exists in the servermap, and that the servermap was
1692+        updated correctly.
1693+
1694+        If version is not provided, but servermap is, I will validate
1695+        the servermap and return the best recoverable version that I can
1696+        find in the servermap.
1697+
1698+        If the version is provided but the servermap isn't, I will
1699+        obtain a servermap that has been updated in the correct mode and
1700+        validate that version is found and recoverable.
1701+
1702+        If neither servermap nor version are provided, I will obtain a
1703+        servermap updated in the correct mode, and return the best
1704+        recoverable version that I can find in there.
1705+        """
1706+        # XXX: wording ^^^^
1707+        if servermap and servermap.last_update_mode == mode:
1708+            d = defer.succeed(servermap)
1709+        else:
1710+            d = self._get_servermap(mode)
1711+
1712+        def _get_version(servermap, v):
1713+            if v and v not in servermap.recoverable_versions():
1714+                v = None
1715+            elif not v:
1716+                v = servermap.best_recoverable_version()
1717+            if not v:
1718+                raise UnrecoverableFileError("no recoverable versions")
1719+
1720+            return (servermap, v)
1721+        return d.addCallback(_get_version, version)
1722+
1723 
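(Illustrative sketch, not part of the patch: a read-only caller of the
methods above. read_best_version is a hypothetical helper.)

    def read_best_version(node):
        # get_best_readable_version performs a MODE_READ servermap update
        # under the hood; passing an existing servermap to
        # get_readable_version() would skip that step.
        d = node.get_best_readable_version()
        d.addCallback(lambda mfv: mfv.download_to_data())
        return d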
1724     def download_best_version(self):
1725hunk ./src/allmydata/mutable/filenode.py 389
1726+        """
1727+        I return a Deferred that fires with the contents of the best
1728+        version of this mutable file.
1729+        """
1730         return self._do_serialized(self._download_best_version)
1731hunk ./src/allmydata/mutable/filenode.py 394
1732+
1733+
1734     def _download_best_version(self):
1735hunk ./src/allmydata/mutable/filenode.py 397
1736-        servermap = ServerMap()
1737-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
1738-        def _maybe_retry(f):
1739-            f.trap(NotEnoughSharesError)
1740-            # the download is worth retrying once. Make sure to use the
1741-            # old servermap, since it is what remembers the bad shares,
1742-            # but use MODE_WRITE to make it look for even more shares.
1743-            # TODO: consider allowing this to retry multiple times.. this
1744-            # approach will let us tolerate about 8 bad shares, I think.
1745-            return self._try_once_to_download_best_version(servermap,
1746-                                                           MODE_WRITE)
1747+        """
1748+        I am the serialized sibling of download_best_version.
1749+        """
1750+        d = self.get_best_readable_version()
1751+        d.addCallback(self._record_size)
1752+        d.addCallback(lambda version: version.download_to_data())
1753+
1754+        # It is possible that the download will fail because there
1755+        # aren't enough shares to be had. If so, we will try again after
1756+        # updating the servermap in MODE_WRITE, which may find more
1757+        # shares than updating in MODE_READ, as we just did. We can do
1758+        # this by getting the best mutable version and downloading from
1759+        # that -- the best mutable version will be a MutableFileVersion
1760+        # with a servermap that was last updated in MODE_WRITE, as we
1761+        # want. If this fails, then we give up.
1762+        def _maybe_retry(failure):
1763+            failure.trap(NotEnoughSharesError)
1764+
1765+            d = self.get_best_mutable_version()
1766+            d.addCallback(self._record_size)
1767+            d.addCallback(lambda version: version.download_to_data())
1768+            return d
1769+
1770         d.addErrback(_maybe_retry)
1771         return d
1772hunk ./src/allmydata/mutable/filenode.py 422
1773-    def _try_once_to_download_best_version(self, servermap, mode):
1774-        d = self._update_servermap(servermap, mode)
1775-        d.addCallback(self._once_updated_download_best_version, servermap)
1776-        return d
1777-    def _once_updated_download_best_version(self, ignored, servermap):
1778-        goal = servermap.best_recoverable_version()
1779-        if not goal:
1780-            raise UnrecoverableFileError("no recoverable versions")
1781-        return self._try_once_to_download_version(servermap, goal)
1782+
1783+
1784+    def _record_size(self, mfv):
1785+        """
1786+        I record the size of a mutable file version.
1787+        """
1788+        self._most_recent_size = mfv.get_size()
1789+        return mfv
1790+
1791 
1792     def get_size_of_best_version(self):
1793hunk ./src/allmydata/mutable/filenode.py 433
1794-        d = self.get_servermap(MODE_READ)
1795-        def _got_servermap(smap):
1796-            ver = smap.best_recoverable_version()
1797-            if not ver:
1798-                raise UnrecoverableFileError("no recoverable version")
1799-            return smap.size_of_version(ver)
1800-        d.addCallback(_got_servermap)
1801-        return d
1802+        """
1803+        I return a Deferred firing with the size of this file's best version.
1804 
1805hunk ./src/allmydata/mutable/filenode.py 436
1806+        This is equivalent to calling get_size() on the result of
1807+        get_best_readable_version().
1808+        """
1809+        d = self.get_best_readable_version()
1810+        return d.addCallback(lambda mfv: mfv.get_size())
1811+
1812+
1813+    #################################
1814+    # IMutableFileNode
1815+
1816+    def get_best_mutable_version(self, servermap=None):
1817+        """
1818+        I return a Deferred that fires with a MutableFileVersion
1819+        representing the best readable version of the file that I
1820+        represent. I am like get_best_readable_version, except that I
1821+        will try to make a writable version if I can.
1822+        """
1823+        return self.get_mutable_version(servermap=servermap)
1824+
1825+
1826+    def get_mutable_version(self, servermap=None, version=None):
1827+        """
1828+        I return a Deferred that fires with a MutableFileVersion
1829+        representing a version of this mutable file.
1830+
1831+        If version is provided, the Deferred will fire with a
1832+        MutableFileVersion initialized with that version. Otherwise, it
1833+        will fire with the best version that I can recover.
1834+
1835+        If servermap is provided, I will use that to find versions
1836+        instead of performing my own servermap update.
1837+        """
1838+        if self.is_readonly():
1839+            return self.get_readable_version(servermap=servermap,
1840+                                             version=version)
1841+
1842+        # get_mutable_version => write intent, so we require that the
1843+        # servermap is updated in MODE_WRITE
1844+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
1845+        def _build_version((servermap, smap_version)):
1846+            # these should have been set by the servermap update.
1847+            assert self._secret_holder
1848+            assert self._writekey
1849+
1850+            mfv = MutableFileVersion(self,
1851+                                     servermap,
1852+                                     smap_version,
1853+                                     self._storage_index,
1854+                                     self._storage_broker,
1855+                                     self._readkey,
1856+                                     self._writekey,
1857+                                     self._secret_holder,
1858+                                     history=self._history)
1859+            assert not mfv.is_readonly()
1860+            return mfv
1861+
1862+        return d.addCallback(_build_version)
1863+
1864+
1865+    # XXX: I'm uncomfortable with the difference between upload and
1866+    #      overwrite, which, FWICT, is basically that you don't have to
1867+    #      do a servermap update before you overwrite. We split them up
1868+    #      that way anyway, so I guess there's no real difficulty in
1869+    #      offering both ways to callers, but it also makes the
1870+    #      public-facing API cluttery, and makes it hard to discern the
1871+    #      right way of doing things.
1872+
1873+    # In general, we leave it to callers to ensure that they aren't
1874+    # going to cause UncoordinatedWriteErrors when working with
1875+    # MutableFileVersions. We know that the next three operations
1876+    # (upload, overwrite, and modify) will all operate on the same
1877+    # version, so we say that only one of them can be going on at once,
1878+    # and serialize them to ensure that that actually happens, since as
1879+    # the caller in this situation it is our job to do that.
1880     def overwrite(self, new_contents):
1881hunk ./src/allmydata/mutable/filenode.py 511
1882+        """
1883+        I overwrite the contents of the best recoverable version of this
1884+        mutable file with new_contents. This is equivalent to calling
1885+        overwrite on the result of get_best_mutable_version with
1886+        new_contents as an argument. I return a Deferred that eventually
1887+        fires with the results of my replacement process.
1888+        """
1889         return self._do_serialized(self._overwrite, new_contents)
1890hunk ./src/allmydata/mutable/filenode.py 519
1891+
1892+
1893     def _overwrite(self, new_contents):
1894hunk ./src/allmydata/mutable/filenode.py 522
1895+        """
1896+        I am the serialized sibling of overwrite.
1897+        """
1898+        d = self.get_best_mutable_version()
1899+        d.addCallback(lambda mfv: mfv.overwrite(new_contents))
1900+        d.addCallback(self._did_upload, new_contents.get_size())
1901+        return d
1902+
1903+
1904+
1905+    def upload(self, new_contents, servermap):
1906+        """
1907+        I overwrite the contents of the best recoverable version of this
1908+        mutable file with new_contents, using servermap instead of
1909+        creating/updating our own servermap. I return a Deferred that
1910+        fires with the results of my upload.
1911+        """
1912+        return self._do_serialized(self._upload, new_contents, servermap)
1913+
1914+
1915+    def modify(self, modifier, backoffer=None):
1916+        """
1917+        I modify the contents of the best recoverable version of this
1918+        mutable file with the modifier. This is equivalent to calling
1919+        modify on the result of get_best_mutable_version. I return a
1920+        Deferred that eventually fires with an UploadResults instance
1921+        describing this process.
1922+        """
1923+        return self._do_serialized(self._modify, modifier, backoffer)
1924+
1925+
1926+    def _modify(self, modifier, backoffer):
1927+        """
1928+        I am the serialized sibling of modify.
1929+        """
1930+        d = self.get_best_mutable_version()
1931+        d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
1932+        return d
1933+
1934+
1935+    def download_version(self, servermap, version, fetch_privkey=False):
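(Illustrative sketch, not part of the patch: what a modifier callback for
modify() looks like. _append_line is hypothetical; per the precondition in
_modify_once below, it must return the complete new contents as a str, or
None to leave the file alone.)

    def _append_line(old_contents, servermap, first_time):
        # May run more than once: if the publish hits an
        # UncoordinatedWriteError, modify() backs off and retries.
        return old_contents + "another line\n"

    d = node.modify(_append_line)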
1936+        """
1937+        Download the specified version of this mutable file. I return a
1938+        Deferred that fires with the contents of the specified version
1939+        as a bytestring, or errbacks if the file is not recoverable.
1940+        """
1941+        d = self.get_readable_version(servermap, version)
1942+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
1943+
1944+
1945+    def get_servermap(self, mode):
1946+        """
1947+        I return a servermap that has been updated in mode.
1948+
1949+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
1950+        MODE_ANYTHING. See servermap.py for more on what these mean.
1951+        """
1952+        return self._do_serialized(self._get_servermap, mode)
1953+
1954+
1955+    def _get_servermap(self, mode):
1956+        """
1957+        I am a serialized twin to get_servermap.
1958+        """
1959         servermap = ServerMap()
1960hunk ./src/allmydata/mutable/filenode.py 587
1961-        d = self._update_servermap(servermap, mode=MODE_WRITE)
1962-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
1963+        d = self._update_servermap(servermap, mode)
1964+        # The servermap will tell us about the most recent size of the
1965+        # file, so we may as well set that so that callers might get
1966+        # more data about us.
1967+        if not self._most_recent_size:
1968+            d.addCallback(self._get_size_from_servermap)
1969+        return d
1970+
1971+
1972+    def _get_size_from_servermap(self, servermap):
1973+        """
1974+        I extract the size of the best version of this file and record
1975+        it in self._most_recent_size. I return the servermap that I was
1976+        given.
1977+        """
1978+        if servermap.recoverable_versions():
1979+            v = servermap.best_recoverable_version()
1980+            size = v[4] # verinfo[4] == size
1981+            self._most_recent_size = size
1982+        return servermap
1983+
1984+
1985+    def _update_servermap(self, servermap, mode):
1986+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
1987+                             mode)
1988+        if self._history:
1989+            self._history.notify_mapupdate(u.get_status())
1990+        return u.update()
1991+
1992+
1993+    def set_version(self, version):
1994+        # I can be set in two ways:
1995+        #  1. When the node is created.
1996+        #  2. (for an existing share) when the Servermap is updated
1997+        #     before I am read.
1998+        assert version in (MDMF_VERSION, SDMF_VERSION)
1999+        self._protocol_version = version
2000+
2001+
2002+    def get_version(self):
2003+        return self._protocol_version
2004+
2005+
2006+    def _do_serialized(self, cb, *args, **kwargs):
2007+        # note: to avoid deadlock, this callable is *not* allowed to invoke
2008+        # other serialized methods within this (or any other)
2009+        # MutableFileNode. The callable should be a bound method of this same
2010+        # MFN instance.
2011+        d = defer.Deferred()
2012+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
2013+        # we need to put off d.callback until this Deferred is finished being
2014+        # processed. Otherwise the caller's subsequent activities (like,
2015+        # doing other things with this node) can cause reentrancy problems in
2016+        # the Deferred code itself
2017+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
2018+        # add a log.err just in case something really weird happens, because
2019+        # self._serializer stays around forever, therefore we won't see the
2020+        # usual Unhandled Error in Deferred that would give us a hint.
2021+        self._serializer.addErrback(log.err)
2022         return d
2023 
2024 
2025hunk ./src/allmydata/mutable/filenode.py 649
2026+    def _upload(self, new_contents, servermap):
2027+        """
2028+        A MutableFileNode still has to have some way of getting
2029+        published initially, which is what I am here for. After that,
2030+        all publishing, updating, modifying and so on happens through
2031+        MutableFileVersions.
2032+        """
2033+        assert self._pubkey, "update_servermap must be called before publish"
2034+
2035+        p = Publish(self, self._storage_broker, servermap)
2036+        if self._history:
2037+            self._history.notify_publish(p.get_status(),
2038+                                         new_contents.get_size())
2039+        d = p.publish(new_contents)
2040+        d.addCallback(self._did_upload, new_contents.get_size())
2041+        return d
2042+
2043+
2044+    def _did_upload(self, res, size):
2045+        self._most_recent_size = size
2046+        return res
2047+
2048+
2049+class MutableFileVersion:
2050+    """
2051+    I represent a specific version (most likely the best version) of a
2052+    mutable file.
2053+
2054+    Since I implement IReadable, instances which hold a
2055+    reference to an instance of me are guaranteed the ability (absent
2056+    connection difficulties or unrecoverable versions) to read the file
2057+    that I represent. Depending on whether I was initialized with a
2058+    write capability or not, I may also provide callers the ability to
2059+    overwrite or modify the contents of the mutable file that I
2060+    reference.
2061+    """
2062+    implements(IMutableFileVersion, IWritable)
2063+
2064+    def __init__(self,
2065+                 node,
2066+                 servermap,
2067+                 version,
2068+                 storage_index,
2069+                 storage_broker,
2070+                 readcap,
2071+                 writekey=None,
2072+                 write_secrets=None,
2073+                 history=None):
2074+
2075+        self._node = node
2076+        self._servermap = servermap
2077+        self._version = version
2078+        self._storage_index = storage_index
2079+        self._write_secrets = write_secrets
2080+        self._history = history
2081+        self._storage_broker = storage_broker
2082+
2083+        #assert isinstance(readcap, IURI)
2084+        self._readcap = readcap
2085+
2086+        self._writekey = writekey
2087+        self._serializer = defer.succeed(None)
2088+
2089+
2090+    def get_sequence_number(self):
2091+        """
2092+        Get the sequence number of the mutable version that I represent.
2093+        """
2094+        return self._version[0] # verinfo[0] == the sequence number
2095+
2096+
2097+    # TODO: Terminology?
2098+    def get_writekey(self):
2099+        """
2100+        I return a writekey or None if I don't have a writekey.
2101+        """
2102+        return self._writekey
2103+
2104+
2105+    def overwrite(self, new_contents):
2106+        """
2107+        I overwrite the contents of this mutable file version with the
2108+        data in new_contents.
2109+        """
2110+        assert not self.is_readonly()
2111+
2112+        return self._do_serialized(self._overwrite, new_contents)
2113+
2114+
2115+    def _overwrite(self, new_contents):
2116+        assert IMutableUploadable.providedBy(new_contents)
2117+        assert self._servermap.last_update_mode == MODE_WRITE
2118+
2119+        return self._upload(new_contents)
2120+
2121+
2122     def modify(self, modifier, backoffer=None):
2123         """I use a modifier callback to apply a change to the mutable file.
2124         I implement the following pseudocode::
2125hunk ./src/allmydata/mutable/filenode.py 785
2126         backoffer should not invoke any methods on this MutableFileNode
2127         instance, and it needs to be highly conscious of deadlock issues.
2128         """
2129+        assert not self.is_readonly()
2130+
2131         return self._do_serialized(self._modify, modifier, backoffer)
2132hunk ./src/allmydata/mutable/filenode.py 788
2133+
2134+
2135     def _modify(self, modifier, backoffer):
2136hunk ./src/allmydata/mutable/filenode.py 791
2137-        servermap = ServerMap()
2138         if backoffer is None:
2139             backoffer = BackoffAgent().delay
2140hunk ./src/allmydata/mutable/filenode.py 793
2141-        return self._modify_and_retry(servermap, modifier, backoffer, True)
2142-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
2143-        d = self._modify_once(servermap, modifier, first_time)
2144+        return self._modify_and_retry(modifier, backoffer, True)
2145+
2146+
2147+    def _modify_and_retry(self, modifier, backoffer, first_time):
2148+        """
2149+        I try to apply modifier to the contents of this version of the
2150+        mutable file. If I succeed, I return an UploadResults instance
2151+        describing my success. If I fail, I try again after waiting for
2152+        a little bit.
2153+        """
2154+        log.msg("doing modify")
2155+        d = self._modify_once(modifier, first_time)
2156         def _retry(f):
2157             f.trap(UncoordinatedWriteError)
2158             d2 = defer.maybeDeferred(backoffer, self, f)
2159hunk ./src/allmydata/mutable/filenode.py 809
2160             d2.addCallback(lambda ignored:
2161-                           self._modify_and_retry(servermap, modifier,
2162+                           self._modify_and_retry(modifier,
2163                                                   backoffer, False))
2164             return d2
2165         d.addErrback(_retry)
2166hunk ./src/allmydata/mutable/filenode.py 814
2167         return d
2168-    def _modify_once(self, servermap, modifier, first_time):
2169-        d = self._update_servermap(servermap, MODE_WRITE)
2170-        d.addCallback(self._once_updated_download_best_version, servermap)
2171+
2172+
2173+    def _modify_once(self, modifier, first_time):
2174+        """
2175+        I attempt to apply a modifier to the contents of the mutable
2176+        file.
2177+        """
2178+        # XXX: This is wrong -- we could get more servers if we updated
2179+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
2180+        # assert that the last update wasn't MODE_READ
2181+        assert self._servermap.last_update_mode == MODE_WRITE
2182+
2183+        # download_to_data is serialized, so we have to call this to
2184+        # avoid deadlock.
2185+        d = self._try_to_download_data()
2186         def _apply(old_contents):
2187hunk ./src/allmydata/mutable/filenode.py 830
2188-            new_contents = modifier(old_contents, servermap, first_time)
2189+            new_contents = modifier(old_contents, self._servermap, first_time)
2190+            precondition((isinstance(new_contents, str) or
2191+                          new_contents is None),
2192+                         "Modifier function must return a string "
2193+                         "or None")
2194+
2195             if new_contents is None or new_contents == old_contents:
2196hunk ./src/allmydata/mutable/filenode.py 837
2197+                log.msg("no changes")
2198                 # no changes need to be made
2199                 if first_time:
2200                     return
2201hunk ./src/allmydata/mutable/filenode.py 845
2202                 # recovery when it observes UCWE, we need to do a second
2203                 # publish. See #551 for details. We'll basically loop until
2204                 # we managed an uncontested publish.
2205-                new_contents = old_contents
2206-            precondition(isinstance(new_contents, str),
2207-                         "Modifier function must return a string or None")
2208-            return self._upload(new_contents, servermap)
2209+                old_uploadable = MutableData(old_contents)
2210+                new_contents = old_uploadable
2211+            else:
2212+                new_contents = MutableData(new_contents)
2213+
2214+            return self._upload(new_contents)
2215         d.addCallback(_apply)
2216         return d
2217 
2218hunk ./src/allmydata/mutable/filenode.py 854
2219-    def get_servermap(self, mode):
2220-        return self._do_serialized(self._get_servermap, mode)
2221-    def _get_servermap(self, mode):
2222-        servermap = ServerMap()
2223-        return self._update_servermap(servermap, mode)
2224-    def _update_servermap(self, servermap, mode):
2225-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
2226-                             mode)
2227-        if self._history:
2228-            self._history.notify_mapupdate(u.get_status())
2229-        return u.update()
2230 
2231hunk ./src/allmydata/mutable/filenode.py 855
2232-    def download_version(self, servermap, version, fetch_privkey=False):
2233-        return self._do_serialized(self._try_once_to_download_version,
2234-                                   servermap, version, fetch_privkey)
2235-    def _try_once_to_download_version(self, servermap, version,
2236-                                      fetch_privkey=False):
2237-        r = Retrieve(self, servermap, version, fetch_privkey)
2238+    def is_readonly(self):
2239+        """
2240+        I return True if this MutableFileVersion provides no write
2241+        access to the file that it encapsulates, and False if it
2242+        provides the ability to modify the file.
2243+        """
2244+        return self._writekey is None
2245+
2246+
2247+    def is_mutable(self):
2248+        """
2249+        I return True, since mutable files are always mutable by
2250+        somebody.
2251+        """
2252+        return True
2253+
2254+
2255+    def get_storage_index(self):
2256+        """
2257+        I return the storage index of the reference that I encapsulate.
2258+        """
2259+        return self._storage_index
2260+
2261+
2262+    def get_size(self):
2263+        """
2264+        I return the length, in bytes, of this readable object.
2265+        """
2266+        return self._servermap.size_of_version(self._version)
2267+
2268+
2269+    def download_to_data(self, fetch_privkey=False):
2270+        """
2271+        I return a Deferred that fires with the contents of this
2272+        readable object as a byte string.
2273+
2274+        """
2275+        c = consumer.MemoryConsumer()
2276+        d = self.read(c, fetch_privkey=fetch_privkey)
2277+        d.addCallback(lambda mc: "".join(mc.chunks))
2278+        return d
2279+
2280+
2281+    def _try_to_download_data(self):
2282+        """
2283+        I am an unserialized cousin of download_to_data; I am called
2284+        from the children of modify() to download the data associated
2285+        with this mutable version.
2286+        """
2287+        c = consumer.MemoryConsumer()
2288+        # modify will almost certainly write, so we need the privkey.
2289+        d = self._read(c, fetch_privkey=True)
2290+        d.addCallback(lambda mc: "".join(mc.chunks))
2291+        return d
2292+
2293+
2294+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
2295+        """
2296+        I read a portion (possibly all) of the mutable file that I
2297+        reference into consumer.
2298+        """
2299+        return self._do_serialized(self._read, consumer, offset, size,
2300+                                   fetch_privkey)
2301+
2302+
2303+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
2304+        """
2305+        I am the serialized companion of read.
2306+        """
2307+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
2308         if self._history:
2309             self._history.notify_retrieve(r.get_status())
2310hunk ./src/allmydata/mutable/filenode.py 927
2311-        d = r.download()
2312-        d.addCallback(self._downloaded_version)
2313+        d = r.download(consumer, offset, size)
2314         return d
2315hunk ./src/allmydata/mutable/filenode.py 929
2316-    def _downloaded_version(self, data):
2317-        self._most_recent_size = len(data)
2318-        return data
2319 
2320hunk ./src/allmydata/mutable/filenode.py 930
2321-    def upload(self, new_contents, servermap):
2322-        return self._do_serialized(self._upload, new_contents, servermap)
2323-    def _upload(self, new_contents, servermap):
2324-        assert self._pubkey, "update_servermap must be called before publish"
2325-        p = Publish(self, self._storage_broker, servermap)
2326+
2327+    def _do_serialized(self, cb, *args, **kwargs):
2328+        # note: to avoid deadlock, this callable is *not* allowed to invoke
2329+        # other serialized methods within this (or any other)
2330+        # MutableFileNode. The callable should be a bound method of this same
2331+        # MFN instance.
2332+        d = defer.Deferred()
2333+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
2334+        # we need to put off d.callback until this Deferred is finished being
2335+        # processed. Otherwise the caller's subsequent activities (like,
2336+        # doing other things with this node) can cause reentrancy problems in
2337+        # the Deferred code itself
2338+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
2339+        # add a log.err just in case something really weird happens, because
2340+        # self._serializer stays around forever, therefore we won't see the
2341+        # usual Unhandled Error in Deferred that would give us a hint.
2342+        self._serializer.addErrback(log.err)
2343+        return d
2344+
2345+
2346+    def _upload(self, new_contents):
2347+        #assert self._pubkey, "update_servermap must be called before publish"
2348+        p = Publish(self._node, self._storage_broker, self._servermap)
2349         if self._history:
2350hunk ./src/allmydata/mutable/filenode.py 954
2351-            self._history.notify_publish(p.get_status(), len(new_contents))
2352+            self._history.notify_publish(p.get_status(),
2353+                                         new_contents.get_size())
2354         d = p.publish(new_contents)
2355hunk ./src/allmydata/mutable/filenode.py 957
2356-        d.addCallback(self._did_upload, len(new_contents))
2357+        d.addCallback(self._did_upload, new_contents.get_size())
2358         return d
2359hunk ./src/allmydata/mutable/filenode.py 959
2360+
2361+
2362     def _did_upload(self, res, size):
2363         self._most_recent_size = size
2364         return res
2365hunk ./src/allmydata/mutable/filenode.py 964
2366+
2367+    def update(self, data, offset):
2368+        """
2369+        Do an update of this mutable file version by writing data at
2370+        offset, overwriting what was there. If offset is the EOF, this
2371+        is an append operation. I return a Deferred that fires with the results of
2372+        the update operation when it has completed.
2373+
2374+        In cases where update does not append any data, or where it does
2375+        not append so many blocks that the block count crosses a
2376+        power-of-two boundary, this operation will use roughly
2377+        O(data.get_size()) memory/bandwidth/CPU to perform the update.
2378+        Otherwise, it must download, re-encode, and upload the entire
2379+        file again, which will use O(filesize) resources.
2380+        """
2381+        return self._do_serialized(self._update, data, offset)
2382+
2383+
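(Illustrative sketch, not part of the patch: version is assumed to be a
writable MutableFileVersion. An offset equal to the current size appends;
smaller offsets overwrite in place.)

    from allmydata.mutable.publish import MutableData

    # If the write stays within the power-of-two block-count boundary
    # described above, only the affected segments are re-encoded and
    # uploaded; otherwise the whole file is downloaded and re-encoded.
    d = version.update(MutableData("more data\n"), version.get_size())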
2384+    def _update(self, data, offset):
2385+        """
2386+        I update the mutable file version represented by this particular
2387+        IMutableVersion by inserting the data in data at the offset
2388+        offset. I return a Deferred that fires when this has been
2389+        completed.
2390+        """
2391+        # We have two cases here:
2392+        # 1. The new data will add few enough segments so that it does
2393+        #    not cross into the next power-of-two boundary.
2394+        # 2. It doesn't.
2395+        #
2396+        # In the former case, we can modify the file in place. In the
2397+        # latter case, we need to re-encode the file.
2398+        new_size = data.get_size() + offset
2399+        old_size = self.get_size()
2400+        segment_size = self._version[3]
2401+        num_old_segments = mathutil.div_ceil(old_size,
2402+                                             segment_size)
2403+        num_new_segments = mathutil.div_ceil(new_size,
2404+                                             segment_size)
2405+        log.msg("got %d old segments, %d new segments" % \
2406+                        (num_old_segments, num_new_segments))
2407+
2408+        # We also do a whole file re-encode if the file is an SDMF file.
2409+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
2410+            log.msg("doing re-encode instead of in-place update")
2411+            return self._do_modify_update(data, offset)
2412+
2413+        log.msg("updating in place")
2414+        d = self._do_update_update(data, offset)
2415+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
2416+        d.addCallback(self._build_uploadable_and_finish, data, offset)
2417+        return d
2418+
2419+
2420+    def _do_modify_update(self, data, offset):
2421+        """
2422+        I perform a file update by modifying the contents of the file
2423+        after downloading it, then reuploading it. I am less efficient
2424+        than _do_update_update, but am necessary for certain updates.
2425+        """
2426+        def m(old, servermap, first_time):
2427+            start = offset
2428+            rest = offset + data.get_size()
2429+            new = old[:start]
2430+            new += "".join(data.read(data.get_size()))
2431+            new += old[rest:]
2432+            return new
2433+        return self._modify(m, None)
2434+
2435+
2436+    def _do_update_update(self, data, offset):
2437+        """
2438+        I start the Servermap update that gets us the data we need to
2439+        continue the update process. I return a Deferred that fires when
2440+        the servermap update is done.
2441+        """
2442+        assert IMutableUploadable.providedBy(data)
2443+        assert self.is_mutable()
2444+        # offset == self.get_size() is valid and means that we are
2445+        # appending data to the file.
2446+        assert offset <= self.get_size()
2447+
2448+        # We'll need the segment that the data starts in, regardless of
2449+        # what we'll do later.
2450+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
2451+        start_segment -= 1
2452+
2453+        # We only need the end segment if the data we are writing does
2454+        # not extend past the current end-of-file.
2455+        end_segment = start_segment
2456+        if offset + data.get_size() < self.get_size():
2457+            end_data = offset + data.get_size()
2458+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
2459+            end_segment -= 1
2460+        self._start_segment = start_segment
2461+        self._end_segment = end_segment
2462+
2463+        # Now ask for the servermap to be updated in MODE_WRITE with
2464+        # this update range.
2465+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
2466+                             self._servermap,
2467+                             mode=MODE_WRITE,
2468+                             update_range=(start_segment, end_segment))
2469+        return u.update()
2470+
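# A worked instance of the segment arithmetic above, assuming the
# default 128 KiB (131072-byte) maximum segment size: for a 16-byte
# write at offset 200000 into a 400000-byte file,
#     start_segment = div_ceil(200000, 131072) - 1 = 2 - 1 = 1
#     end_data      = 200016 < 400000, so
#     end_segment   = div_ceil(200016, 131072) - 1 = 2 - 1 = 1
# and only segment 1 needs to be fetched for the in-place rewrite.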
2471+
2472+    def _decode_and_decrypt_segments(self, ignored, data, offset):
2473+        """
2474+        After the servermap update, I take the encrypted and encoded
2475+        data that the servermap fetched while doing its update and
2476+        transform it into decoded-and-decrypted plaintext that can be
2477+        used by the new uploadable. I return a Deferred that fires with
2478+        the segments.
2479+        """
2480+        r = Retrieve(self._node, self._servermap, self._version)
2481+        # decode: takes in our blocks and salts from the servermap,
2482+        # returns a Deferred that fires with the corresponding plaintext
2483+        # segments. Does not download -- simply takes advantage of
2484+        # existing infrastructure within the Retrieve class to avoid
2485+        # duplicating code.
2486+        sm = self._servermap
2487+        # XXX: If the methods in the servermap don't work as
2488+        # abstractions, you should rewrite them instead of going around
2489+        # them.
2490+        update_data = sm.update_data
2491+        start_segments = {} # shnum -> start segment
2492+        end_segments = {} # shnum -> end segment
2493+        blockhashes = {} # shnum -> blockhash tree
2494+        for (shnum, data) in update_data.iteritems():
2495+            data = [d[1] for d in data if d[0] == self._version]
2496+
2497+            # Every entry in our list should now be for share shnum of
2498+            # the version of the mutable file that we are updating, so
2499+            # all of the entries should be identical.
2500+            datum = data[0]
2501+            assert filter(lambda x: x != datum, data) == []
2502+
2503+            blockhashes[shnum] = datum[0]
2504+            start_segments[shnum] = datum[1]
2505+            end_segments[shnum] = datum[2]
2506+
2507+        d1 = r.decode(start_segments, self._start_segment)
2508+        d2 = r.decode(end_segments, self._end_segment)
2509+        d3 = defer.succeed(blockhashes)
2510+        return deferredutil.gatherResults([d1, d2, d3])
2511+
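# The shapes assumed by the loop above, sketched (illustrative):
#     sm.update_data: shnum -> [(verinfo, fetched), ...]
# and, after filtering on self._version, each fetched datum unpacks as
#     (blockhash-tree nodes, start-segment blocks+salts,
#      end-segment blocks+salts)
# which is what feeds Retrieve.decode() and the blockhashes dict.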
2512+
2513+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
2514+        """
2515+        After the process has the plaintext segments, I build the
2516+        TransformingUploadable that the publisher will eventually
2517+        re-upload to the grid. I then invoke the publisher with that
2518+        uploadable, and return a Deferred that fires when the publish
2519+        operation has completed without error.
2520+        """
2521+        u = TransformingUploadable(data, offset,
2522+                                   self._version[3],
2523+                                   segments_and_bht[0],
2524+                                   segments_and_bht[1])
2525+        p = Publish(self._node, self._storage_broker, self._servermap)
2526+        return p.update(u, offset, segments_and_bht[2], self._version)
2527}
2528[mutable/publish.py: Modify the publish process to support MDMF
2529Kevan Carstensen <kevan@isnotajoke.com>**20100819003342
2530 Ignore-this: 2bb379974927e2e20cff75bae8302d1d
2531 
2532 The inner workings of the publishing process needed to be reworked to a
2533 large extent to cope with segmented mutable files, and to cope with
2534 partial-file updates of mutable files. This patch does that. It also
2535 introduces wrappers for uploadable data, allowing the use of
2536 filehandle-like objects as data sources, in addition to strings. This
2537 reduces memory inefficiency when dealing with large files through the
2538 webapi, and clarifies update code there.
2539] {
2540hunk ./src/allmydata/mutable/publish.py 3
2541 
2542 
2543-import os, struct, time
2544+import os, time
2545+from StringIO import StringIO
2546 from itertools import count
2547 from zope.interface import implements
2548 from twisted.internet import defer
2549hunk ./src/allmydata/mutable/publish.py 9
2550 from twisted.python import failure
2551-from allmydata.interfaces import IPublishStatus
2552+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
2553+                                 IMutableUploadable
2554 from allmydata.util import base32, hashutil, mathutil, idlib, log
2555 from allmydata.util.dictutil import DictOfSets
2556 from allmydata import hashtree, codec
2557hunk ./src/allmydata/mutable/publish.py 21
2558 from allmydata.mutable.common import MODE_WRITE, MODE_CHECK, \
2559      UncoordinatedWriteError, NotEnoughServersError
2560 from allmydata.mutable.servermap import ServerMap
2561-from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
2562-     unpack_checkstring, SIGNED_PREFIX
2563+from allmydata.mutable.layout import unpack_checkstring, MDMFSlotWriteProxy, \
2564+                                     SDMFSlotWriteProxy
2565+
2566+KiB = 1024
2567+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
2568+PUSHING_BLOCKS_STATE = 0
2569+PUSHING_EVERYTHING_ELSE_STATE = 1
2570+DONE_STATE = 2
2571 
2572 class PublishStatus:
2573     implements(IPublishStatus)
2574hunk ./src/allmydata/mutable/publish.py 118
2575         self._status.set_helper(False)
2576         self._status.set_progress(0.0)
2577         self._status.set_active(True)
2578+        self._version = self._node.get_version()
2579+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
2580+
2581 
2582     def get_status(self):
2583         return self._status
2584hunk ./src/allmydata/mutable/publish.py 132
2585             kwargs["facility"] = "tahoe.mutable.publish"
2586         return log.msg(*args, **kwargs)
2587 
2588+
2589+    def update(self, data, offset, blockhashes, version):
2590+        """
2591+        I replace the contents of this file with the contents of data,
2592+        starting at offset. I return a Deferred that fires with None
2593+        when the replacement has been completed, or with an error if
2594+        something went wrong during the process.
2595+
2596+        Note that this process will not upload new shares. If the file
2597+        being updated is in need of repair, callers will have to repair
2598+        it on their own.
2599+        """
2600+        # How this works:
2601+        # 1. Make peer assignments. We'll assign each share that we know
2602+        #    about on the grid to the peer that currently holds it, and
2603+        #    will not place any new shares.
2604+        # 2. Set up encoding parameters. Most of these will stay the same
2605+        #    -- datalength will change, as will some of the offsets.
2606+        # 3. Upload the new segments.
2607+        # 4. Be done.
2608+        assert IMutableUploadable.providedBy(data)
2609+
2610+        self.data = data
2611+
2612+        # XXX: Use the MutableFileVersion instead.
2613+        self.datalength = self._node.get_size()
2614+        if data.get_size() > self.datalength:
2615+            self.datalength = data.get_size()
2616+
2617+        self.log("starting update")
2618+        self.log("adding new data of length %d at offset %d" % \
2619+                    (data.get_size(), offset))
2620+        self.log("new data length is %d" % self.datalength)
2621+        self._status.set_size(self.datalength)
2622+        self._status.set_status("Started")
2623+        self._started = time.time()
2624+
2625+        self.done_deferred = defer.Deferred()
2626+
2627+        self._writekey = self._node.get_writekey()
2628+        assert self._writekey, "need write capability to publish"
2629+
2630+        # first, which servers will we publish to? We require that the
2631+        # servermap was updated in MODE_WRITE, so we can depend upon the
2632+        # peerlist computed by that process instead of computing our own.
2633+        assert self._servermap
2634+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
2635+        # we will push a version that is one larger than anything present
2636+        # in the grid, according to the servermap.
2637+        self._new_seqnum = self._servermap.highest_seqnum() + 1
2638+        self._status.set_servermap(self._servermap)
2639+
2640+        self.log(format="new seqnum will be %(seqnum)d",
2641+                 seqnum=self._new_seqnum, level=log.NOISY)
2642+
2643+        # We're updating an existing file, so all of the following
2644+        # should be available.
2645+        self.readkey = self._node.get_readkey()
2646+        self.required_shares = self._node.get_required_shares()
2647+        assert self.required_shares is not None
2648+        self.total_shares = self._node.get_total_shares()
2649+        assert self.total_shares is not None
2650+        self._status.set_encoding(self.required_shares, self.total_shares)
2651+
2652+        self._pubkey = self._node.get_pubkey()
2653+        assert self._pubkey
2654+        self._privkey = self._node.get_privkey()
2655+        assert self._privkey
2656+        self._encprivkey = self._node.get_encprivkey()
2657+
2658+        sb = self._storage_broker
2659+        full_peerlist = sb.get_servers_for_index(self._storage_index)
2660+        self.full_peerlist = full_peerlist # for use later, immutable
2661+        self.bad_peers = set() # peerids who have errbacked/refused requests
2662+
2663+        # This will set self.segment_size, self.num_segments, and
2664+        # self.fec. TODO: Does it know how to do the offset? Probably
2665+        # not. So do that part next.
2666+        self.setup_encoding_parameters(offset=offset)
2667+
2668+        # if we experience any surprises (writes which were rejected because
2669+        # our test vector did not match, or shares which we didn't expect to
2670+        # see), we set this flag and report an UncoordinatedWriteError at the
2671+        # end of the publish process.
2672+        self.surprised = False
2673+
2674+        # we keep track of three tables. The first is our goal: which share
2675+        # we want to see on which servers. This is initially populated by the
2676+        # existing servermap.
2677+        self.goal = set() # pairs of (peerid, shnum) tuples
2678+
2679+        # the second table is our list of outstanding queries: those which
2680+        # are in flight and may or may not be delivered, accepted, or
2681+        # acknowledged. Items are added to this table when the request is
2682+        # sent, and removed when the response returns (or errbacks).
2683+        self.outstanding = set() # (peerid, shnum) tuples
2684+
2685+        # the third is a table of successes: shares which have actually been
2686+        # placed. These are populated when responses come back with success.
2687+        # When self.placed == self.goal, we're done.
2688+        self.placed = set() # (peerid, shnum) tuples
2689+
2690+        # we also keep a mapping from peerid to RemoteReference. Each time we
2691+        # pull a connection out of the full peerlist, we add it to this for
2692+        # use later.
2693+        self.connections = {}
2694+
2695+        self.bad_share_checkstrings = {}
2696+
2697+        # This is set at the last step of the publishing process.
2698+        self.versioninfo = ""
2699+
2700+        # we use the servermap to populate the initial goal: this way we will
2701+        # try to update each existing share in place. Since we're
2702+        # updating, we ignore damaged and missing shares -- callers must
2703+        # do a repair to repair and recreate these.
2704+        for (peerid, shnum) in self._servermap.servermap:
2705+            self.goal.add( (peerid, shnum) )
2706+            self.connections[peerid] = self._servermap.connections[peerid]
2707+        self.writers = {}
2708+
2709+        # SDMF files are updated differently (via the modify path), so
2710+        # this code path always writes MDMF.
2710+        self._version = MDMF_VERSION
2711+        writer_class = MDMFSlotWriteProxy
2712+
2713+        # For each (peerid, shnum) in self.goal, we make a
2714+        # write proxy for that peer. We'll use this to write
2715+        # shares to the peer.
2716+        for key in self.goal:
2717+            peerid, shnum = key
2718+            write_enabler = self._node.get_write_enabler(peerid)
2719+            renew_secret = self._node.get_renewal_secret(peerid)
2720+            cancel_secret = self._node.get_cancel_secret(peerid)
2721+            secrets = (write_enabler, renew_secret, cancel_secret)
2722+
2723+            self.writers[shnum] =  writer_class(shnum,
2724+                                                self.connections[peerid],
2725+                                                self._storage_index,
2726+                                                secrets,
2727+                                                self._new_seqnum,
2728+                                                self.required_shares,
2729+                                                self.total_shares,
2730+                                                self.segment_size,
2731+                                                self.datalength)
2732+            self.writers[shnum].peerid = peerid
2733+            assert (peerid, shnum) in self._servermap.servermap
2734+            old_versionid, old_timestamp = self._servermap.servermap[key]
2735+            (old_seqnum, old_root_hash, old_salt, old_segsize,
2736+             old_datalength, old_k, old_N, old_prefix,
2737+             old_offsets_tuple) = old_versionid
2738+            self.writers[shnum].set_checkstring(old_seqnum,
2739+                                                old_root_hash,
2740+                                                old_salt)
2741+
2742+        # Our remote shares will not have a complete checkstring until
2743+        # after we are done writing share data and have started to write
2744+        # blocks. In the meantime, we need to know what to look for when
2745+        # writing, so that we can detect UncoordinatedWriteErrors.
2746+        self._checkstring = self.writers.values()[0].get_checkstring()
2747+
2748+        # Now, we start pushing shares.
2749+        self._status.timings["setup"] = time.time() - self._started
2750+        # First, we encrypt, encode, and publish the segments that
2751+        # actually need to be pushed.
2752+
2753+        # Our update process fetched these for us. We need to update
2754+        # them in place as publishing happens.
2755+        self.blockhashes = {} # shnum -> [blockhashes]
2756+        for (i, bht) in blockhashes.iteritems():
2757+            # We need to extract the leaves from our old hash tree.
2758+            old_segcount = mathutil.div_ceil(version[4],
2759+                                             version[3])
2760+            h = hashtree.IncompleteHashTree(old_segcount)
2761+            bht = dict(enumerate(bht))
2762+            h.set_hashes(bht)
2763+            leaves = h[h.get_leaf_index(0):]
2764+            for j in xrange(self.num_segments - len(leaves)):
2765+                leaves.append(None)
2766+
2767+            assert len(leaves) >= self.num_segments
2768+            self.blockhashes[i] = leaves
2769+            # This list will now be the leaves that were set during the
2770+            # initial upload + enough empty hashes to make it a
2771+            # power-of-two. If we exceed a power of two boundary, we
2772+            # should be encoding the file over again, and should not be
2773+            # here. So, we have
2774+            #assert len(self.blockhashes[i]) == \
2775+            #    hashtree.roundup_pow2(self.num_segments), \
2776+            #        len(self.blockhashes[i])
2777+            # XXX: Except this doesn't work. Figure out why.
2778+
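# A note on the slice above: h[h.get_leaf_index(0):] relies on
# hashtree's flat array layout, where a tree padded to L leaves (L a
# power of two) stores them at indices L-1 .. 2L-2. With
# old_segcount = 3, for instance, the tree pads to 4 leaves,
# get_leaf_index(0) == 3, and the slice returns those 4 leaf hashes
# (pad leaves included) -- hence the follow-up padding with None out
# to self.num_segments.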
2779+        # These are filled in later, after we've modified the block hash
2780+        # tree suitably.
2781+        self.sharehash_leaves = None # eventually [sharehashes]
2782+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2783+                              # validate the share]
2784+
2785+        self.log("Starting push")
2786+
2787+        self._state = PUSHING_BLOCKS_STATE
2788+        self._push()
2789+
2790+        return self.done_deferred
2791+
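# A sketch (illustrative only) of how the filenode layer drives this
# entry point, per _build_uploadable_and_finish in mutable/filenode.py
# earlier in this patch:
#
#     u = TransformingUploadable(data, offset, segsize, start, end)
#     p = Publish(node, storage_broker, servermap)
#     d = p.update(u, offset, blockhashes, version)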
2792+
2793     def publish(self, newdata):
2794         """Publish the filenode's current contents.  Returns a Deferred that
2795         fires (with None) when the publish has done as much work as it's ever
2796hunk ./src/allmydata/mutable/publish.py 344
2797         simultaneous write.
2798         """
2799 
2800-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
2801-        # 2: perform peer selection, get candidate servers
2802-        #  2a: send queries to n+epsilon servers, to determine current shares
2803-        #  2b: based upon responses, create target map
2804-        # 3: send slot_testv_and_readv_and_writev messages
2805-        # 4: as responses return, update share-dispatch table
2806-        # 4a: may need to run recovery algorithm
2807-        # 5: when enough responses are back, we're done
2808+        # 0. Setup encoding parameters, encoder, and other such things.
2809+        # 1. Encrypt, encode, and publish segments.
2810+        assert IMutableUploadable.providedBy(newdata)
2811 
2812hunk ./src/allmydata/mutable/publish.py 348
2813-        self.log("starting publish, datalen is %s" % len(newdata))
2814-        self._status.set_size(len(newdata))
2815+        self.data = newdata
2816+        self.datalength = newdata.get_size()
2817+        #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
2818+        #    self._version = MDMF_VERSION
2819+        #else:
2820+        #    self._version = SDMF_VERSION
2821+
2822+        self.log("starting publish, datalen is %s" % self.datalength)
2823+        self._status.set_size(self.datalength)
2824         self._status.set_status("Started")
2825         self._started = time.time()
2826 
2827hunk ./src/allmydata/mutable/publish.py 405
2828         self.full_peerlist = full_peerlist # for use later, immutable
2829         self.bad_peers = set() # peerids who have errbacked/refused requests
2830 
2831-        self.newdata = newdata
2832-        self.salt = os.urandom(16)
2833-
2834+        # This will set self.segment_size, self.num_segments, and
2835+        # self.fec.
2836         self.setup_encoding_parameters()
2837 
2838         # if we experience any surprises (writes which were rejected because
2839hunk ./src/allmydata/mutable/publish.py 415
2840         # end of the publish process.
2841         self.surprised = False
2842 
2843-        # as a failsafe, refuse to iterate through self.loop more than a
2844-        # thousand times.
2845-        self.looplimit = 1000
2846-
2847         # we keep track of three tables. The first is our goal: which share
2848         # we want to see on which servers. This is initially populated by the
2849         # existing servermap.
2850hunk ./src/allmydata/mutable/publish.py 438
2851 
2852         self.bad_share_checkstrings = {}
2853 
2854+        # This is set at the last step of the publishing process.
2855+        self.versioninfo = ""
2856+
2857         # we use the servermap to populate the initial goal: this way we will
2858         # try to update each existing share in place.
2859         for (peerid, shnum) in self._servermap.servermap:
2860hunk ./src/allmydata/mutable/publish.py 454
2861             self.bad_share_checkstrings[key] = old_checkstring
2862             self.connections[peerid] = self._servermap.connections[peerid]
2863 
2864-        # create the shares. We'll discard these as they are delivered. SDMF:
2865-        # we're allowed to hold everything in memory.
2866+        # TODO: Make this part do peer selection.
2867+        self.update_goal()
2868+        self.writers = {}
2869+        if self._version == MDMF_VERSION:
2870+            writer_class = MDMFSlotWriteProxy
2871+        else:
2872+            writer_class = SDMFSlotWriteProxy
2873 
2874hunk ./src/allmydata/mutable/publish.py 462
2875+        # For each (peerid, shnum) in self.goal, we make a
2876+        # write proxy for that peer. We'll use this to write
2877+        # shares to the peer.
2878+        for key in self.goal:
2879+            peerid, shnum = key
2880+            write_enabler = self._node.get_write_enabler(peerid)
2881+            renew_secret = self._node.get_renewal_secret(peerid)
2882+            cancel_secret = self._node.get_cancel_secret(peerid)
2883+            secrets = (write_enabler, renew_secret, cancel_secret)
2884+
2885+            self.writers[shnum] =  writer_class(shnum,
2886+                                                self.connections[peerid],
2887+                                                self._storage_index,
2888+                                                secrets,
2889+                                                self._new_seqnum,
2890+                                                self.required_shares,
2891+                                                self.total_shares,
2892+                                                self.segment_size,
2893+                                                self.datalength)
2894+            self.writers[shnum].peerid = peerid
2895+            if (peerid, shnum) in self._servermap.servermap:
2896+                old_versionid, old_timestamp = self._servermap.servermap[key]
2897+                (old_seqnum, old_root_hash, old_salt, old_segsize,
2898+                 old_datalength, old_k, old_N, old_prefix,
2899+                 old_offsets_tuple) = old_versionid
2900+                self.writers[shnum].set_checkstring(old_seqnum,
2901+                                                    old_root_hash,
2902+                                                    old_salt)
2903+            elif (peerid, shnum) in self.bad_share_checkstrings:
2904+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
2905+                self.writers[shnum].set_checkstring(old_checkstring)
2906+
2907+        # Our remote shares will not have a complete checkstring until
2908+        # after we are done writing share data and have started to write
2909+        # blocks. In the meantime, we need to know what to look for when
2910+        # writing, so that we can detect UncoordinatedWriteErrors.
2911+        self._checkstring = self.writers.values()[0].get_checkstring()
2912+
2913+        # Now, we start pushing shares.
2914         self._status.timings["setup"] = time.time() - self._started
2915hunk ./src/allmydata/mutable/publish.py 502
2916-        d = self._encrypt_and_encode()
2917-        d.addCallback(self._generate_shares)
2918-        def _start_pushing(res):
2919-            self._started_pushing = time.time()
2920-            return res
2921-        d.addCallback(_start_pushing)
2922-        d.addCallback(self.loop) # trigger delivery
2923-        d.addErrback(self._fatal_error)
2924+        # First, we encrypt, encode, and publish the segments that
2925+        # actually need to be pushed.
2926+
2927+        # This will eventually hold the block hash chain for each share
2928+        # that we publish. We define it this way so that empty publishes
2929+        # will still have something to write to the remote slot.
2930+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
2931+        for i in xrange(self.total_shares):
2932+            blocks = self.blockhashes[i]
2933+            for j in xrange(self.num_segments):
2934+                blocks.append(None)
2935+        self.sharehash_leaves = None # eventually [sharehashes]
2936+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2937+                              # validate the share]
2938+
2939+        self.log("Starting push")
2940+
2941+        self._state = PUSHING_BLOCKS_STATE
2942+        self._push()
2943 
2944         return self.done_deferred
2945 
2946hunk ./src/allmydata/mutable/publish.py 524
2947-    def setup_encoding_parameters(self):
2948-        segment_size = len(self.newdata)
2949+
2950+    def _update_status(self):
2951+        self._status.set_status("Sending Shares: %d placed out of %d, "
2952+                                "%d messages outstanding" %
2953+                                (len(self.placed),
2954+                                 len(self.goal),
2955+                                 len(self.outstanding)))
2956+        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
2957+
2958+
2959+    def setup_encoding_parameters(self, offset=0):
2960+        if self._version == MDMF_VERSION:
2961+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
2962+        else:
2963+            segment_size = self.datalength # SDMF is only one segment
2964         # this must be a multiple of self.required_shares
2965         segment_size = mathutil.next_multiple(segment_size,
2966                                               self.required_shares)
2967hunk ./src/allmydata/mutable/publish.py 543
2968         self.segment_size = segment_size
2969+
2970+        # Calculate the starting segment for the upload.
2971         if segment_size:
2972hunk ./src/allmydata/mutable/publish.py 546
2973-            self.num_segments = mathutil.div_ceil(len(self.newdata),
2974+            self.num_segments = mathutil.div_ceil(self.datalength,
2975                                                   segment_size)
2976hunk ./src/allmydata/mutable/publish.py 548
2977+            self.starting_segment = mathutil.div_ceil(offset,
2978+                                                      segment_size)
2979+            self.starting_segment -= 1
2980+            if offset == 0:
2981+                self.starting_segment = 0
2982+
2983         else:
2984             self.num_segments = 0
2985hunk ./src/allmydata/mutable/publish.py 556
2986-        assert self.num_segments in [0, 1,] # SDMF restrictions
2987+            self.starting_segment = 0
2988+
2989+
2990+        self.log("building encoding parameters for file")
2991+        self.log("got segsize %d" % self.segment_size)
2992+        self.log("got %d segments" % self.num_segments)
2993+
2994+        if self._version == SDMF_VERSION:
2995+            assert self.num_segments in (0, 1) # SDMF
2996+        # calculate the tail segment size.
2997+
2998+        if segment_size and self.datalength:
2999+            self.tail_segment_size = self.datalength % segment_size
3000+            self.log("got tail segment size %d" % self.tail_segment_size)
3001+        else:
3002+            self.tail_segment_size = 0
3003+
3004+        if self.tail_segment_size == 0 and segment_size:
3005+            # The tail segment is the same size as the other segments.
3006+            self.tail_segment_size = segment_size
3007+
3008+        # Make FEC encoders
3009+        fec = codec.CRSEncoder()
3010+        fec.set_params(self.segment_size,
3011+                       self.required_shares, self.total_shares)
3012+        self.piece_size = fec.get_block_size()
3013+        self.fec = fec
3014+
3015+        if self.tail_segment_size == self.segment_size:
3016+            self.tail_fec = self.fec
3017+        else:
3018+            tail_fec = codec.CRSEncoder()
3019+            tail_fec.set_params(self.tail_segment_size,
3020+                                self.required_shares,
3021+                                self.total_shares)
3022+            self.tail_fec = tail_fec
3023+
3024+        self._current_segment = self.starting_segment
3025+        self.end_segment = self.num_segments - 1
3026+        # Now figure out where the last segment should be.
3027+        if self.data.get_size() != self.datalength:
3028+            end = self.data.get_size()
3029+            self.end_segment = mathutil.div_ceil(end,
3030+                                                 segment_size)
3031+            self.end_segment -= 1
3032+        self.log("got start segment %d" % self.starting_segment)
3033+        self.log("got end segment %d" % self.end_segment)
3034+
3035+
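# A worked instance of the parameters above for MDMF with k = 3
# required shares and a 300000-byte file:
#     segment_size      = next_multiple(131072, 3) = 131073
#     num_segments      = div_ceil(300000, 131073) = 3
#     tail_segment_size = 300000 % 131073          = 37854
# Were datalength an exact multiple of segment_size, the modulo would
# be 0 and the tail segment would be set to a full segment_size.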
3036+    def _push(self, ignored=None):
3037+        """
3038+        I manage state transitions. In particular, I check that we still
3039+        have enough writers to complete the upload
3040+        successfully.
3041+        """
3042+        # Can we still successfully publish this file?
3043+        # TODO: Keep track of outstanding queries before aborting the
3044+        #       process.
3045+        if len(self.writers) <= self.required_shares or self.surprised:
3046+            return self._failure()
3047+
3048+        # Figure out what we need to do next. Each of these needs to
3049+        # return a deferred so that we don't block execution when this
3050+        # is first called in the upload method.
3051+        if self._state == PUSHING_BLOCKS_STATE:
3052+            return self.push_segment(self._current_segment)
3053+
3054+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
3055+            return self.push_everything_else()
3056+
3057+        # If we make it to this point, we were successful in placing the
3058+        # file.
3059+        return self._done(None)
3060+
3061+
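# The state machine that _push drives, sketched:
#
#     PUSHING_BLOCKS_STATE
#         --(segnum > end_segment)--> PUSHING_EVERYTHING_ELSE_STATE
#     PUSHING_EVERYTHING_ELSE_STATE
#         --(hashes, signature, keys placed)--> DONE_STATE
#     any state
#         --(len(writers) <= required_shares, or surprise)--> _failure()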
3062+    def push_segment(self, segnum):
3063+        if self.num_segments == 0 and self._version == SDMF_VERSION:
3064+            self._add_dummy_salts()
3065 
3066hunk ./src/allmydata/mutable/publish.py 635
3067-    def _fatal_error(self, f):
3068-        self.log("error during loop", failure=f, level=log.UNUSUAL)
3069-        self._done(f)
3070+        if segnum > self.end_segment:
3071+            # We don't have any more segments to push.
3072+            self._state = PUSHING_EVERYTHING_ELSE_STATE
3073+            return self._push()
3074+
3075+        d = self._encode_segment(segnum)
3076+        d.addCallback(self._push_segment, segnum)
3077+        def _increment_segnum(ign):
3078+            self._current_segment += 1
3079+        # XXX: I don't think we need to do addBoth here -- any errBacks
3080+        # should be handled within push_segment.
3081+        d.addBoth(_increment_segnum)
3082+        d.addBoth(self._turn_barrier)
3083+        d.addBoth(self._push)
3084+
3085+
3086+    def _turn_barrier(self, result):
3087+        """
3088+        I help the publish process avoid the recursion limit issues
3089+        described in #237.
3090+        """
3091+        return fireEventually(result)
3092+
3093+
3094+    def _add_dummy_salts(self):
3095+        """
3096+        SDMF files need a salt even if they're empty, or the signature
3097+        won't make sense. This method adds a dummy salt to each of our
3098+        SDMF writers so that they can write the signature later.
3099+        """
3100+        salt = os.urandom(16)
3101+        assert self._version == SDMF_VERSION
3102+
3103+        for writer in self.writers.itervalues():
3104+            writer.put_salt(salt)
3105+
3106+
3107+    def _encode_segment(self, segnum):
3108+        """
3109+        I encrypt and encode the segment segnum.
3110+        """
3111+        started = time.time()
3112+
3113+        if segnum + 1 == self.num_segments:
3114+            segsize = self.tail_segment_size
3115+        else:
3116+            segsize = self.segment_size
3117+
3118+
3119+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
3120+        data = self.data.read(segsize)
3121+        # XXX: This is dumb. Why return a list?
3122+        data = "".join(data)
3123+
3124+        assert len(data) == segsize, len(data)
3125+
3126+        salt = os.urandom(16)
3127+
3128+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
3129+        self._status.set_status("Encrypting")
3130+        enc = AES(key)
3131+        crypttext = enc.process(data)
3132+        assert len(crypttext) == len(data)
3133+
3134+        now = time.time()
3135+        self._status.timings["encrypt"] = now - started
3136+        started = now
3137+
3138+        # now apply FEC
3139+        if segnum + 1 == self.num_segments:
3140+            fec = self.tail_fec
3141+        else:
3142+            fec = self.fec
3143+
3144+        self._status.set_status("Encoding")
3145+        crypttext_pieces = [None] * self.required_shares
3146+        piece_size = fec.get_block_size()
3147+        for i in range(len(crypttext_pieces)):
3148+            offset = i * piece_size
3149+            piece = crypttext[offset:offset+piece_size]
3150+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3151+            crypttext_pieces[i] = piece
3152+            assert len(piece) == piece_size
3153+        d = fec.encode(crypttext_pieces)
3154+        def _done_encoding(res):
3155+            elapsed = time.time() - started
3156+            self._status.timings["encode"] = elapsed
3157+            return (res, salt)
3158+        d.addCallback(_done_encoding)
3159+        return d
3160+
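# Note that, unlike SDMF's single whole-file salt, each MDMF segment
# above gets its own 16-byte salt, and the per-segment AES key is
# ssk_readkey_data_hash(salt, readkey); the salt travels with the
# encoded blocks so that _push_segment can fold it into the MDMF block
# hashes (block_hash(salt + sharedata)).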
3161+
3162+    def _push_segment(self, encoded_and_salt, segnum):
3163+        """
3164+        I push (data, salt) as segment number segnum.
3165+        """
3166+        results, salt = encoded_and_salt
3167+        shares, shareids = results
3168+        self._status.set_status("Pushing segment")
3169+        for i in xrange(len(shares)):
3170+            sharedata = shares[i]
3171+            shareid = shareids[i]
3172+            if self._version == MDMF_VERSION:
3173+                hashed = salt + sharedata
3174+            else:
3175+                hashed = sharedata
3176+            block_hash = hashutil.block_hash(hashed)
3177+            self.blockhashes[shareid][segnum] = block_hash
3178+            # find the writer for this share
3179+            writer = self.writers[shareid]
3180+            writer.put_block(sharedata, segnum, salt)
3181+
3182+
3183+    def push_everything_else(self):
3184+        """
3185+        I put everything else associated with a share.
3186+        """
3187+        self._pack_started = time.time()
3188+        self.push_encprivkey()
3189+        self.push_blockhashes()
3190+        self.push_sharehashes()
3191+        self.push_toplevel_hashes_and_signature()
3192+        d = self.finish_publishing()
3193+        def _change_state(ignored):
3194+            self._state = DONE_STATE
3195+        d.addCallback(_change_state)
3196+        d.addCallback(self._push)
3197+        return d
3198+
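# The fixed order in which the remainder of each share is completed
# (the blocks themselves were already placed by push_segment):
#     encprivkey -> block hash trees -> share hash chain -> root hash
#     -> signature -> verification key + offsets (finish_publishing)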
3199+
3200+    def push_encprivkey(self):
3201+        encprivkey = self._encprivkey
3202+        self._status.set_status("Pushing encrypted private key")
3203+        for writer in self.writers.itervalues():
3204+            writer.put_encprivkey(encprivkey)
3205+
3206+
3207+    def push_blockhashes(self):
3208+        self.sharehash_leaves = [None] * len(self.blockhashes)
3209+        self._status.set_status("Building and pushing block hash tree")
3210+        for shnum, blockhashes in self.blockhashes.iteritems():
3211+            t = hashtree.HashTree(blockhashes)
3212+            self.blockhashes[shnum] = list(t)
3213+            # set the leaf for future use.
3214+            self.sharehash_leaves[shnum] = t[0]
3215+
3216+            writer = self.writers[shnum]
3217+            writer.put_blockhashes(self.blockhashes[shnum])
3218+
3219+
3220+    def push_sharehashes(self):
3221+        self._status.set_status("Building and pushing share hash chain")
3222+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
3223+        for shnum in xrange(len(self.sharehash_leaves)):
3224+            needed_indices = share_hash_tree.needed_hashes(shnum)
3225+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
3226+                                             for i in needed_indices] )
3227+            writer = self.writers[shnum]
3228+            writer.put_sharehashes(self.sharehashes[shnum])
3229+        self.root_hash = share_hash_tree[0]
3230+
3231+
3232+    def push_toplevel_hashes_and_signature(self):
3233+        # We need to do three things here:
3234+        #   - Push the root hash and salt hash
3235+        #   - Get the checkstring of the resulting layout; sign that.
3236+        #   - Push the signature
3237+        self._status.set_status("Pushing root hashes and signature")
3238+        for shnum in xrange(self.total_shares):
3239+            writer = self.writers[shnum]
3240+            writer.put_root_hash(self.root_hash)
3241+        self._update_checkstring()
3242+        self._make_and_place_signature()
3243+
3244+
3245+    def _update_checkstring(self):
3246+        """
3247+        After putting the root hash, MDMF files will have the
3248+        checkstring written to the storage server. This means that we
3249+        can update our copy of the checkstring so we can detect
3250+        uncoordinated writes. SDMF files will have the same checkstring,
3251+        so we need not do anything.
3252+        """
3253+        self._checkstring = self.writers.values()[0].get_checkstring()
3254+
3255+
3256+    def _make_and_place_signature(self):
3257+        """
3258+        I create and place the signature.
3259+        """
3260+        started = time.time()
3261+        self._status.set_status("Signing prefix")
3262+        signable = self.writers[0].get_signable()
3263+        self.signature = self._privkey.sign(signable)
3264+
3265+        for (shnum, writer) in self.writers.iteritems():
3266+            writer.put_signature(self.signature)
3267+        self._status.timings['sign'] = time.time() - started
3268+
3269+
3270+    def finish_publishing(self):
3271+        # We're almost done -- we just need to put the verification key
3272+        # and the offsets
3273+        started = time.time()
3274+        self._status.set_status("Pushing shares")
3275+        self._started_pushing = started
3276+        ds = []
3277+        verification_key = self._pubkey.serialize()
3278+
3279+
3280+        # TODO: Bad, since we remove from this same dict. We need to
3281+        # make a copy, or just use a non-iterated value.
3282+        for (shnum, writer) in self.writers.iteritems():
3283+            writer.put_verification_key(verification_key)
3284+            d = writer.finish_publishing()
3285+            # Add the (peerid, shnum) tuple to our list of outstanding
3286+            # queries. This gets used by _loop if some of our queries
3287+            # fail to place shares.
3288+            self.outstanding.add((writer.peerid, writer.shnum))
3289+            d.addCallback(self._got_write_answer, writer, started)
3290+            d.addErrback(self._connection_problem, writer)
3291+            ds.append(d)
3292+        self._record_verinfo()
3293+        self._status.timings['pack'] = time.time() - started
3294+        return defer.DeferredList(ds)
3295+
3296+
3297+    def _record_verinfo(self):
3298+        self.versioninfo = self.writers.values()[0].get_verinfo()
3299+
3300+
3301+    def _connection_problem(self, f, writer):
3302+        """
3303+        We ran into a connection problem while working with writer, and
3304+        need to deal with that.
3305+        """
3306+        self.log("found problem: %s" % str(f))
3307+        self._last_failure = f
3308+        del(self.writers[writer.shnum])
3309 
3310hunk ./src/allmydata/mutable/publish.py 875
3311-    def _update_status(self):
3312-        self._status.set_status("Sending Shares: %d placed out of %d, "
3313-                                "%d messages outstanding" %
3314-                                (len(self.placed),
3315-                                 len(self.goal),
3316-                                 len(self.outstanding)))
3317-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
3318 
3319hunk ./src/allmydata/mutable/publish.py 876
3320-    def loop(self, ignored=None):
3321-        self.log("entering loop", level=log.NOISY)
3322-        if not self._running:
3323-            return
3324-
3325-        self.looplimit -= 1
3326-        if self.looplimit <= 0:
3327-            raise LoopLimitExceededError("loop limit exceeded")
3328-
3329-        if self.surprised:
3330-            # don't send out any new shares, just wait for the outstanding
3331-            # ones to be retired.
3332-            self.log("currently surprised, so don't send any new shares",
3333-                     level=log.NOISY)
3334-        else:
3335-            self.update_goal()
3336-            # how far are we from our goal?
3337-            needed = self.goal - self.placed - self.outstanding
3338-            self._update_status()
3339-
3340-            if needed:
3341-                # we need to send out new shares
3342-                self.log(format="need to send %(needed)d new shares",
3343-                         needed=len(needed), level=log.NOISY)
3344-                self._send_shares(needed)
3345-                return
3346-
3347-        if self.outstanding:
3348-            # queries are still pending, keep waiting
3349-            self.log(format="%(outstanding)d queries still outstanding",
3350-                     outstanding=len(self.outstanding),
3351-                     level=log.NOISY)
3352-            return
3353-
3354-        # no queries outstanding, no placements needed: we're done
3355-        self.log("no queries outstanding, no placements needed: done",
3356-                 level=log.OPERATIONAL)
3357-        now = time.time()
3358-        elapsed = now - self._started_pushing
3359-        self._status.timings["push"] = elapsed
3360-        return self._done(None)
3361-
3362     def log_goal(self, goal, message=""):
3363         logmsg = [message]
3364         for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]):
3365hunk ./src/allmydata/mutable/publish.py 957
3366             self.log_goal(self.goal, "after update: ")
3367 
3368 
3369+    def _got_write_answer(self, answer, writer, started):
3370+        if not answer:
3371+            # SDMF writers only pretend to write when callers set their
3372+            # blocks, salts, and so on -- they actually just write once,
3373+            # at the end of the upload process. In fake writes, they
3374+            # return defer.succeed(None). If we see that, we shouldn't
3375+            # bother checking it.
3376+            return
3377 
3378hunk ./src/allmydata/mutable/publish.py 966
3379-    def _encrypt_and_encode(self):
3380-        # this returns a Deferred that fires with a list of (sharedata,
3381-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
3382-        # shares that we care about.
3383-        self.log("_encrypt_and_encode")
3384-
3385-        self._status.set_status("Encrypting")
3386-        started = time.time()
3387-
3388-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
3389-        enc = AES(key)
3390-        crypttext = enc.process(self.newdata)
3391-        assert len(crypttext) == len(self.newdata)
3392+        peerid = writer.peerid
3393+        lp = self.log("_got_write_answer from %s, share %d" %
3394+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
3395 
3396         now = time.time()
3397hunk ./src/allmydata/mutable/publish.py 971
3398-        self._status.timings["encrypt"] = now - started
3399-        started = now
3400-
3401-        # now apply FEC
3402-
3403-        self._status.set_status("Encoding")
3404-        fec = codec.CRSEncoder()
3405-        fec.set_params(self.segment_size,
3406-                       self.required_shares, self.total_shares)
3407-        piece_size = fec.get_block_size()
3408-        crypttext_pieces = [None] * self.required_shares
3409-        for i in range(len(crypttext_pieces)):
3410-            offset = i * piece_size
3411-            piece = crypttext[offset:offset+piece_size]
3412-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3413-            crypttext_pieces[i] = piece
3414-            assert len(piece) == piece_size
3415-
3416-        d = fec.encode(crypttext_pieces)
3417-        def _done_encoding(res):
3418-            elapsed = time.time() - started
3419-            self._status.timings["encode"] = elapsed
3420-            return res
3421-        d.addCallback(_done_encoding)
3422-        return d
3423-
3424-    def _generate_shares(self, shares_and_shareids):
3425-        # this sets self.shares and self.root_hash
3426-        self.log("_generate_shares")
3427-        self._status.set_status("Generating Shares")
3428-        started = time.time()
3429-
3430-        # we should know these by now
3431-        privkey = self._privkey
3432-        encprivkey = self._encprivkey
3433-        pubkey = self._pubkey
3434-
3435-        (shares, share_ids) = shares_and_shareids
3436-
3437-        assert len(shares) == len(share_ids)
3438-        assert len(shares) == self.total_shares
3439-        all_shares = {}
3440-        block_hash_trees = {}
3441-        share_hash_leaves = [None] * len(shares)
3442-        for i in range(len(shares)):
3443-            share_data = shares[i]
3444-            shnum = share_ids[i]
3445-            all_shares[shnum] = share_data
3446-
3447-            # build the block hash tree. SDMF has only one leaf.
3448-            leaves = [hashutil.block_hash(share_data)]
3449-            t = hashtree.HashTree(leaves)
3450-            block_hash_trees[shnum] = list(t)
3451-            share_hash_leaves[shnum] = t[0]
3452-        for leaf in share_hash_leaves:
3453-            assert leaf is not None
3454-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
3455-        share_hash_chain = {}
3456-        for shnum in range(self.total_shares):
3457-            needed_hashes = share_hash_tree.needed_hashes(shnum)
3458-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
3459-                                              for i in needed_hashes ] )
3460-        root_hash = share_hash_tree[0]
3461-        assert len(root_hash) == 32
3462-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
3463-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
3464-
3465-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
3466-                             self.required_shares, self.total_shares,
3467-                             self.segment_size, len(self.newdata))
3468-
3469-        # now pack the beginning of the share. All shares are the same up
3470-        # to the signature, then they have divergent share hash chains,
3471-        # then completely different block hash trees + salt + share data,
3472-        # then they all share the same encprivkey at the end. The sizes
3473-        # of everything are the same for all shares.
3474-
3475-        sign_started = time.time()
3476-        signature = privkey.sign(prefix)
3477-        self._status.timings["sign"] = time.time() - sign_started
3478-
3479-        verification_key = pubkey.serialize()
3480-
3481-        final_shares = {}
3482-        for shnum in range(self.total_shares):
3483-            final_share = pack_share(prefix,
3484-                                     verification_key,
3485-                                     signature,
3486-                                     share_hash_chain[shnum],
3487-                                     block_hash_trees[shnum],
3488-                                     all_shares[shnum],
3489-                                     encprivkey)
3490-            final_shares[shnum] = final_share
3491-        elapsed = time.time() - started
3492-        self._status.timings["pack"] = elapsed
3493-        self.shares = final_shares
3494-        self.root_hash = root_hash
3495-
3496-        # we also need to build up the version identifier for what we're
3497-        # pushing. Extract the offsets from one of our shares.
3498-        assert final_shares
3499-        offsets = unpack_header(final_shares.values()[0])[-1]
3500-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
3501-        verinfo = (self._new_seqnum, root_hash, self.salt,
3502-                   self.segment_size, len(self.newdata),
3503-                   self.required_shares, self.total_shares,
3504-                   prefix, offsets_tuple)
3505-        self.versioninfo = verinfo
3506-
3507-
3508-
3509-    def _send_shares(self, needed):
3510-        self.log("_send_shares")
3511-
3512-        # we're finally ready to send out our shares. If we encounter any
3513-        # surprises here, it's because somebody else is writing at the same
3514-        # time. (Note: in the future, when we remove the _query_peers() step
3515-        # and instead speculate about [or remember] which shares are where,
3516-        # surprises here are *not* indications of UncoordinatedWriteError,
3517-        # and we'll need to respond to them more gracefully.)
3518-
3519-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
3520-        # organize it by peerid.
3521-
3522-        peermap = DictOfSets()
3523-        for (peerid, shnum) in needed:
3524-            peermap.add(peerid, shnum)
3525-
3526-        # the next thing is to build up a bunch of test vectors. The
3527-        # semantics of Publish are that we perform the operation if the world
3528-        # hasn't changed since the ServerMap was constructed (more or less).
3529-        # For every share we're trying to place, we create a test vector that
3530-        # tests to see if the server*share still corresponds to the
3531-        # map.
3532-
3533-        all_tw_vectors = {} # maps peerid to tw_vectors
3534-        sm = self._servermap.servermap
3535-
3536-        for key in needed:
3537-            (peerid, shnum) = key
3538-
3539-            if key in sm:
3540-                # an old version of that share already exists on the
3541-                # server, according to our servermap. We will create a
3542-                # request that attempts to replace it.
3543-                old_versionid, old_timestamp = sm[key]
3544-                (old_seqnum, old_root_hash, old_salt, old_segsize,
3545-                 old_datalength, old_k, old_N, old_prefix,
3546-                 old_offsets_tuple) = old_versionid
3547-                old_checkstring = pack_checkstring(old_seqnum,
3548-                                                   old_root_hash,
3549-                                                   old_salt)
3550-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3551-
3552-            elif key in self.bad_share_checkstrings:
3553-                old_checkstring = self.bad_share_checkstrings[key]
3554-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3555-
3556-            else:
3557-                # add a testv that requires the share not exist
3558-
3559-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
3560-                # constraints are handled. If the same object is referenced
3561-                # multiple times inside the arguments, foolscap emits a
3562-                # 'reference' token instead of a distinct copy of the
3563-                # argument. The bug is that these 'reference' tokens are not
3564-                # accepted by the inbound constraint code. To work around
3565-                # this, we need to prevent python from interning the
3566-                # (constant) tuple, by creating a new copy of this vector
3567-                # each time.
3568-
3569-                # This bug is fixed in foolscap-0.2.6, and even though this
3570-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
3571-                # supposed to be able to interoperate with older versions of
3572-                # Tahoe which are allowed to use older versions of foolscap,
3573-                # including foolscap-0.2.5 . In addition, I've seen other
3574-                # foolscap problems triggered by 'reference' tokens (see #541
3575-                # for details). So we must keep this workaround in place.
3576-
3577-                #testv = (0, 1, 'eq', "")
3578-                testv = tuple([0, 1, 'eq', ""])
3579-
3580-            testvs = [testv]
3581-            # the write vector is simply the share
3582-            writev = [(0, self.shares[shnum])]
3583-
3584-            if peerid not in all_tw_vectors:
3585-                all_tw_vectors[peerid] = {}
3586-                # maps shnum to (testvs, writevs, new_length)
3587-            assert shnum not in all_tw_vectors[peerid]
3588-
3589-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
3590-
3591-        # we read the checkstring back from each share, however we only use
3592-        # it to detect whether there was a new share that we didn't know
3593-        # about. The success or failure of the write will tell us whether
3594-        # there was a collision or not. If there is a collision, the first
3595-        # thing we'll do is update the servermap, which will find out what
3596-        # happened. We could conceivably reduce a roundtrip by using the
3597-        # readv checkstring to populate the servermap, but really we'd have
3598-        # to read enough data to validate the signatures too, so it wouldn't
3599-        # be an overall win.
3600-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
3601-
3602-        # ok, send the messages!
3603-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
3604-        started = time.time()
3605-        for (peerid, tw_vectors) in all_tw_vectors.items():
3606-
3607-            write_enabler = self._node.get_write_enabler(peerid)
3608-            renew_secret = self._node.get_renewal_secret(peerid)
3609-            cancel_secret = self._node.get_cancel_secret(peerid)
3610-            secrets = (write_enabler, renew_secret, cancel_secret)
3611-            shnums = tw_vectors.keys()
3612-
3613-            for shnum in shnums:
3614-                self.outstanding.add( (peerid, shnum) )
3615+        elapsed = now - started
3616 
3617hunk ./src/allmydata/mutable/publish.py 973
3618-            d = self._do_testreadwrite(peerid, secrets,
3619-                                       tw_vectors, read_vector)
3620-            d.addCallbacks(self._got_write_answer, self._got_write_error,
3621-                           callbackArgs=(peerid, shnums, started),
3622-                           errbackArgs=(peerid, shnums, started))
3623-            # tolerate immediate errback, like with DeadReferenceError
3624-            d.addBoth(fireEventually)
3625-            d.addCallback(self.loop)
3626-            d.addErrback(self._fatal_error)
3627+        self._status.add_per_server_time(peerid, elapsed)
3628 
3629hunk ./src/allmydata/mutable/publish.py 975
3630-        self._update_status()
3631-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
3632+        wrote, read_data = answer
3633 
3634hunk ./src/allmydata/mutable/publish.py 977
3635-    def _do_testreadwrite(self, peerid, secrets,
3636-                          tw_vectors, read_vector):
3637-        storage_index = self._storage_index
3638-        ss = self.connections[peerid]
3639+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
3640 
3641hunk ./src/allmydata/mutable/publish.py 979
3642-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
3643-        d = ss.callRemote("slot_testv_and_readv_and_writev",
3644-                          storage_index,
3645-                          secrets,
3646-                          tw_vectors,
3647-                          read_vector)
3648-        return d
3649+        # We need to remove from surprise_shares any shares that we
3650+        # know we are also writing to that peer through other writers.
3651 
3652hunk ./src/allmydata/mutable/publish.py 982
3653-    def _got_write_answer(self, answer, peerid, shnums, started):
3654-        lp = self.log("_got_write_answer from %s" %
3655-                      idlib.shortnodeid_b2a(peerid))
3656-        for shnum in shnums:
3657-            self.outstanding.discard( (peerid, shnum) )
3658+        # TODO: Precompute this.
3659+        known_shnums = [x.shnum for x in self.writers.values()
3660+                        if x.peerid == peerid]
3661+        surprise_shares -= set(known_shnums)
3662+        self.log("found the following surprise shares: %s" %
3663+                 str(surprise_shares))
3664 
3665hunk ./src/allmydata/mutable/publish.py 989
3666-        now = time.time()
3667-        elapsed = now - started
3668-        self._status.add_per_server_time(peerid, elapsed)
3669-
3670-        wrote, read_data = answer
3671-
3672-        surprise_shares = set(read_data.keys()) - set(shnums)
3673+        # Now surprise_shares contains all of the shares that we did not
3674+        # expect to be there.
3675 
3676         surprised = False
3677         for shnum in surprise_shares:
3678hunk ./src/allmydata/mutable/publish.py 996
3679             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
3680             checkstring = read_data[shnum][0]
3681-            their_version_info = unpack_checkstring(checkstring)
3682-            if their_version_info == self._new_version_info:
3683+            # What we want to do here is to see if their (seqnum,
3684+            # roothash, salt) is the same as our (seqnum, roothash,
3685+            # salt), or the equivalent for MDMF. The best way to do this
3686+            # is to store a packed representation of our checkstring
3687+            # somewhere, then not bother unpacking the other
3688+            # checkstring.
3689+            if checkstring == self._checkstring:
3690                 # they have the right share, somehow
3691 
3692                 if (peerid,shnum) in self.goal:
3693hunk ./src/allmydata/mutable/publish.py 1081
3694             self.log("our testv failed, so the write did not happen",
3695                      parent=lp, level=log.WEIRD, umid="8sc26g")
3696             self.surprised = True
3697-            self.bad_peers.add(peerid) # don't ask them again
3698+            self.bad_peers.add(writer) # don't ask them again
3699             # use the checkstring to add information to the log message
3700             for (shnum,readv) in read_data.items():
3701                 checkstring = readv[0]
3702hunk ./src/allmydata/mutable/publish.py 1103
3703                 # if expected_version==None, then we didn't expect to see a
3704                 # share on that peer, and the 'surprise_shares' clause above
3705                 # will have logged it.
3706-            # self.loop() will take care of finding new homes
3707             return
3708 
3709hunk ./src/allmydata/mutable/publish.py 1105
3710-        for shnum in shnums:
3711-            self.placed.add( (peerid, shnum) )
3712-            # and update the servermap
3713-            self._servermap.add_new_share(peerid, shnum,
3714+        # and update the servermap
3715+        # self.versioninfo is set during the last phase of publishing.
3716+        # If it is set, we know that responses correspond to placed
3717+        # shares, and can safely execute these statements.
3718+        if self.versioninfo:
3719+            self.log("wrote successfully: adding new share to servermap")
3720+            self._servermap.add_new_share(peerid, writer.shnum,
3721                                           self.versioninfo, started)
3722hunk ./src/allmydata/mutable/publish.py 1113
3723-
3724-        # self.loop() will take care of checking to see if we're done
3725+            self.placed.add( (peerid, writer.shnum) )
3726+        self._update_status()
3727+        # the next method in the deferred chain will check to see if
3728+        # we're done and successful.
3729         return
3730 
3731hunk ./src/allmydata/mutable/publish.py 1119
3732-    def _got_write_error(self, f, peerid, shnums, started):
3733-        for shnum in shnums:
3734-            self.outstanding.discard( (peerid, shnum) )
3735-        self.bad_peers.add(peerid)
3736-        if self._first_write_error is None:
3737-            self._first_write_error = f
3738-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
3739-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
3740-                 failure=f,
3741-                 level=log.UNUSUAL)
3742-        # self.loop() will take care of checking to see if we're done
3743-        return
3744-
3745 
3746     def _done(self, res):
3747         if not self._running:
3748hunk ./src/allmydata/mutable/publish.py 1126
3749         self._running = False
3750         now = time.time()
3751         self._status.timings["total"] = now - self._started
3752+
3753+        elapsed = now - self._started_pushing
3754+        self._status.timings['push'] = elapsed
3755+
3756         self._status.set_active(False)
3757hunk ./src/allmydata/mutable/publish.py 1131
3758-        if isinstance(res, failure.Failure):
3759-            self.log("Publish done, with failure", failure=res,
3760-                     level=log.WEIRD, umid="nRsR9Q")
3761-            self._status.set_status("Failed")
3762-        elif self.surprised:
3763-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
3764-            self._status.set_status("UncoordinatedWriteError")
3765-            # deliver a failure
3766-            res = failure.Failure(UncoordinatedWriteError())
3767-            # TODO: recovery
3768-        else:
3769-            self.log("Publish done, success")
3770-            self._status.set_status("Finished")
3771-            self._status.set_progress(1.0)
3772+        self.log("Publish done, success")
3773+        self._status.set_status("Finished")
3774+        self._status.set_progress(1.0)
3775         eventually(self.done_deferred.callback, res)
3776 
3777hunk ./src/allmydata/mutable/publish.py 1136
3778+    def _failure(self):
3779+
3780+        if not self.surprised:
3781+            # We ran out of servers
3782+            self.log("Publish ran out of good servers, "
3783+                     "last failure was: %s" % str(self._last_failure))
3784+            e = NotEnoughServersError("Ran out of non-bad servers, "
3785+                                      "last failure was %s" %
3786+                                      str(self._last_failure))
3787+        else:
3788+            # We ran into shares that we didn't recognize, which means
3789+            # that we need to return an UncoordinatedWriteError.
3790+            self.log("Publish failed with UncoordinatedWriteError")
3791+            e = UncoordinatedWriteError()
3792+        f = failure.Failure(e)
3793+        eventually(self.done_deferred.callback, f)
3794+
3795+
3796+class MutableFileHandle:
3797+    """
3798+    I am a mutable uploadable built around a filehandle-like object,
3799+    usually either a StringIO instance or a handle to an actual file.
3800+    """
3801+    implements(IMutableUploadable)
3802+
3803+    def __init__(self, filehandle):
3804+        # The filehandle can be any file-like object; all we require is
3805+        # that it have these two methods.
3806+        assert hasattr(filehandle, "read")
3807+        assert hasattr(filehandle, "close")
3808+
3809+        self._filehandle = filehandle
3810+        # We must start reading at the beginning of the file, or we risk
3811+        # encountering errors when the data read does not match the size
3812+        # reported to the uploader.
3813+        self._filehandle.seek(0)
3814+
3815+        # We have not yet read anything, so our position is 0.
3816+        self._marker = 0
3817+
3818+
3819+    def get_size(self):
3820+        """
3821+        I return the amount of data in my filehandle.
3822+        """
3823+        if not hasattr(self, "_size"):
3824+            old_position = self._filehandle.tell()
3825+            # Seek to the end of the file by seeking 0 bytes from the
3826+            # file's end
3827+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
3828+            self._size = self._filehandle.tell()
3829+            # Restore the previous position, in case this was called
3830+            # after a read.
3831+            self._filehandle.seek(old_position)
3832+            assert self._filehandle.tell() == old_position
3833+
3834+        assert hasattr(self, "_size")
3835+        return self._size
3836+
3837+
3838+    def pos(self):
3839+        """
3840+        I return the position of my read marker -- i.e., how much data I
3841+        have already read and returned to callers.
3842+        """
3843+        return self._marker
3844+
3845+
3846+    def read(self, length):
3847+        """
3848+        I return some data (up to length bytes) from my filehandle.
3849+
3850+        In most cases, I return length bytes, but sometimes I won't --
3851+        for example, if I am asked to read beyond the end of a file, or
3852+        an error occurs.
3853+        """
3854+        results = self._filehandle.read(length)
3855+        self._marker += len(results)
3856+        return [results]
3857+
3858+
3859+    def close(self):
3860+        """
3861+        I close the underlying filehandle. Any further operations on the
3862+        filehandle fail at this point.
3863+        """
3864+        self._filehandle.close()
3865+
3866+
3867+class MutableData(MutableFileHandle):
3868+    """
3869+    I am a mutable uploadable built around a string, which I wrap in a
3870+    StringIO and treat as a filehandle.
3871+    """
3872+
3873+    def __init__(self, s):
3874+        # Take a string and return a file-like uploadable.
3875+        assert isinstance(s, str)
3876+
3877+        MutableFileHandle.__init__(self, StringIO(s))
3878+
3879+
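A minimal sketch of the uploadable contract defined above (hypothetical
values; note that read() returns a list of strings, not a string):

    data = MutableData("abcdefghij")   # wraps the string in a StringIO
    assert data.get_size() == 10       # total number of bytes available
    chunk = "".join(data.read(4))      # read() hands back a list of strings
    assert chunk == "abcd"
    assert data.pos() == 4             # the read marker advanced by 4 bytes
    data.close()                       # closes the underlying StringIO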
3880+class TransformingUploadable:
3881+    """
3882+    I am an IMutableUploadable that wraps another IMutableUploadable,
3883+    and some segments that are already on the grid. When I am called to
3884+    read, I handle merging of boundary segments.
3885+    """
3886+    implements(IMutableUploadable)
3887+
3888+
3889+    def __init__(self, data, offset, segment_size, start, end):
3890+        assert IMutableUploadable.providedBy(data)
3891+
3892+        self._newdata = data
3893+        self._offset = offset
3894+        self._segment_size = segment_size
3895+        self._start = start
3896+        self._end = end
3897+
3898+        self._read_marker = 0
3899+
3900+        self._first_segment_offset = offset % segment_size
3901+
3902+        num = self.log("TransformingUploadable: starting", parent=None)
3903+        self._log_number = num
3904+        self.log("got fso: %d" % self._first_segment_offset)
3905+        self.log("got offset: %d" % self._offset)
3906+
3907+
3908+    def log(self, *args, **kwargs):
3909+        if 'parent' not in kwargs:
3910+            kwargs['parent'] = self._log_number
3911+        if "facility" not in kwargs:
3912+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
3913+        return log.msg(*args, **kwargs)
3914+
3915+
3916+    def get_size(self):
3917+        return self._offset + self._newdata.get_size()
3918+
3919+
3920+    def read(self, length):
3921+        # We can get data from 3 sources here.
3922+        #   1. The first of the segments provided to us.
3923+        #   2. The data that we're replacing things with.
3924+        #   3. The last of the segments provided to us.
3925+
3926+        # Are we still returning data from the first of the old segments?
3927+        self.log("reading %d bytes" % length)
3928+
3929+        old_start_data = ""
3930+        old_data_length = self._first_segment_offset - self._read_marker
3931+        if old_data_length > 0:
3932+            if old_data_length > length:
3933+                old_data_length = length
3934+            self.log("returning %d bytes of old start data" % old_data_length)
3935+
3936+            old_data_end = old_data_length + self._read_marker
3937+            old_start_data = self._start[self._read_marker:old_data_end]
3938+            length -= old_data_length
3939+        else:
3940+            # otherwise calculations later get screwed up.
3941+            old_data_length = 0
3942+
3943+        # Is there enough new data to satisfy this read? If not, we need
3944+        # to pad the end of the data with data from our last segment.
3945+        old_end_length = length - \
3946+            (self._newdata.get_size() - self._newdata.pos())
3947+        old_end_data = ""
3948+        if old_end_length > 0:
3949+            self.log("reading %d bytes of old end data" % old_end_length)
3950+
3951+            # TODO: We're not explicitly checking for tail segment size
3952+            # here. Is that a problem?
3953+            old_data_offset = (length - old_end_length + \
3954+                               old_data_length) % self._segment_size
3955+            self.log("reading at offset %d" % old_data_offset)
3956+            old_end = old_data_offset + old_end_length
3957+            old_end_data = self._end[old_data_offset:old_end]
3958+            length -= old_end_length
3959+            assert length == self._newdata.get_size() - self._newdata.pos()
3960+
3961+        self.log("reading %d bytes of new data" % length)
3962+        new_data = self._newdata.read(length)
3963+        new_data = "".join(new_data)
3964+
3965+        self._read_marker += len(old_start_data + new_data + old_end_data)
3966+
3967+        return old_start_data + new_data + old_end_data
3968 
3969hunk ./src/allmydata/mutable/publish.py 1327
3970+    def close(self):
3971+        pass
3972}
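To make the boundary-segment merge in TransformingUploadable.read concrete,
a small sketch with hypothetical values: a file with 10-byte segments is
updated at offset 13 with the three bytes "XYZ", and the single affected
old segment contained "ABCDEFGHIJ". Since 13 % 10 == 3, the first three
bytes of the old segment survive, the new data replaces the next three,
and the old tail is kept:

    old_segment = "ABCDEFGHIJ"
    t = TransformingUploadable(MutableData("XYZ"),  # the replacement bytes
                               13,                  # offset into the file
                               10,                  # segment size
                               old_segment,         # first affected old segment
                               old_segment)         # last affected old segment
    assert t.read(10) == "ABC" + "XYZ" + "GHIJ"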
3973[nodemaker.py: Make nodemaker expose a way to create MDMF files
3974Kevan Carstensen <kevan@isnotajoke.com>**20100819003509
3975 Ignore-this: a6701746d6b992fc07bc0556a2b4a61d
3976] {
3977hunk ./src/allmydata/nodemaker.py 3
3978 import weakref
3979 from zope.interface import implements
3980-from allmydata.interfaces import INodeMaker
3981+from allmydata.util.assertutil import precondition
3982+from allmydata.interfaces import INodeMaker, SDMF_VERSION
3983 from allmydata.immutable.literal import LiteralFileNode
3984 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
3985 from allmydata.immutable.upload import Data
3986hunk ./src/allmydata/nodemaker.py 9
3987 from allmydata.mutable.filenode import MutableFileNode
3988+from allmydata.mutable.publish import MutableData
3989 from allmydata.dirnode import DirectoryNode, pack_children
3990 from allmydata.unknown import UnknownNode
3991 from allmydata import uri
3992hunk ./src/allmydata/nodemaker.py 92
3993             return self._create_dirnode(filenode)
3994         return None
3995 
3996-    def create_mutable_file(self, contents=None, keysize=None):
3997+    def create_mutable_file(self, contents=None, keysize=None,
3998+                            version=SDMF_VERSION):
3999         n = MutableFileNode(self.storage_broker, self.secret_holder,
4000                             self.default_encoding_parameters, self.history)
4001hunk ./src/allmydata/nodemaker.py 96
4002+        n.set_version(version)
4003         d = self.key_generator.generate(keysize)
4004         d.addCallback(n.create_with_keys, contents)
4005         d.addCallback(lambda res: n)
4006hunk ./src/allmydata/nodemaker.py 103
4007         return d
4008 
4009     def create_new_mutable_directory(self, initial_children={}):
4010+        # mutable directories will always be SDMF for now, to preserve
4011+        # compatibility with older clients.
4012+        version = SDMF_VERSION
4013+        # initial_children must have metadata (i.e. {} instead of None)
4014+        for (name, (node, metadata)) in initial_children.iteritems():
4015+            precondition(isinstance(metadata, dict),
4016+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
4017+            node.raise_error()
4018         d = self.create_mutable_file(lambda n:
4019hunk ./src/allmydata/nodemaker.py 112
4020-                                     pack_children(initial_children, n.get_writekey()))
4021+                                     MutableData(pack_children(initial_children,
4022+                                                    n.get_writekey())),
4023+                                     version=version)
4024         d.addCallback(self._create_dirnode)
4025         return d
4026 
4027}
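A sketch of how a caller might use the new nodemaker API (assumes an
already-configured NodeMaker instance named nodemaker; the contents and
callback are placeholders):

    from allmydata.interfaces import MDMF_VERSION
    from allmydata.mutable.publish import MutableData

    d = nodemaker.create_mutable_file(MutableData("initial contents"),
                                      version=MDMF_VERSION)
    d.addCallback(lambda node: node.get_uri())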
4028[docs: update docs to mention MDMF
4029Kevan Carstensen <kevan@isnotajoke.com>**20100814225644
4030 Ignore-this: 1c3caa3cd44831007dcfbef297814308
4031] {
4032merger 0.0 (
4033hunk ./docs/configuration.rst 324
4034+Frontend Configuration
4035+======================
4036+
4037+The Tahoe client process can run a variety of frontend file-access protocols.
4038+You will use these to create and retrieve files from the virtual filesystem.
4039+Configuration details for each are documented in the following
4040+protocol-specific guides:
4041+
4042+HTTP
4043+
4044+    Tahoe runs a webserver by default on port 3456. This interface provides a
4045+    human-oriented "WUI", with pages to create, modify, and browse
4046+    directories and files, as well as a number of pages to check on the
4047+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
4048+    with a REST-ful HTTP interface that can be used by other programs
4049+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
4050+    details, and the ``web.port`` and ``web.static`` config variables above.
4051+    The `<frontends/download-status.rst>`_ document also describes a few WUI
4052+    status pages.
4053+
4054+CLI
4055+
4056+    The main "bin/tahoe" executable includes subcommands for manipulating the
4057+    filesystem, uploading/downloading files, and creating/running Tahoe
4058+    nodes. See `<frontends/CLI.rst>`_ for details.
4059+
4060+FTP, SFTP
4061+
4062+    Tahoe can also run both FTP and SFTP servers, and map a username/password
4063+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
4064+    for instructions on configuring these services, and the ``[ftpd]`` and
4065+    ``[sftpd]`` sections of ``tahoe.cfg``.
4066+
4067merger 0.0 (
4068replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
4069merger 0.0 (
4070hunk ./docs/configuration.rst 384
4071-shares.needed = (int, optional) aka "k", default 3
4072-shares.total = (int, optional) aka "N", N >= k, default 10
4073-shares.happy = (int, optional) 1 <= happy <= N, default 7
4074-
4075- These three values set the default encoding parameters. Each time a new file
4076- is uploaded, erasure-coding is used to break the ciphertext into separate
4077- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
4078- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
4079- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
4080- Setting k to 1 is equivalent to simple replication (uploading N copies of
4081- the file).
4082-
4083- These values control the tradeoff between storage overhead, performance, and
4084- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
4085- backend storage space (the actual value will be a bit more, because of other
4086- forms of overhead). Up to N-k shares can be lost before the file becomes
4087- unrecoverable, so assuming there are at least N servers, up to N-k servers
4088- can be offline without losing the file. So large N/k ratios are more
4089- reliable, and small N/k ratios use less disk space. Clearly, k must never be
4090- larger than N.
4091-
4092- Large values of N will slow down upload operations slightly, since more
4093- servers must be involved, and will slightly increase storage overhead due to
4094- the hash trees that are created. Large values of k will cause downloads to
4095- be marginally slower, because more servers must be involved. N cannot be
4096- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
4097- uses.
4098-
4099- shares.happy allows you control over the distribution of your immutable file.
4100- For a successful upload, shares are guaranteed to be initially placed on
4101- at least 'shares.happy' distinct servers, the correct functioning of any
4102- k of which is sufficient to guarantee the availability of the uploaded file.
4103- This value should not be larger than the number of servers on your grid.
4104-
4105- A value of shares.happy <= k is allowed, but does not provide any redundancy
4106- if some servers fail or lose shares.
4107-
4108- (Mutable files use a different share placement algorithm that does not
4109-  consider this parameter.)
4110-
4111-
4112-== Storage Server Configuration ==
4113-
4114-[storage]
4115-enabled = (boolean, optional)
4116-
4117- If this is True, the node will run a storage server, offering space to other
4118- clients. If it is False, the node will not run a storage server, meaning
4119- that no shares will be stored on this node. Use False for clients who
4120- do not wish to provide storage service. The default value is True.
4121-
4122-readonly = (boolean, optional)
4123-
4124- If True, the node will run a storage server but will not accept any shares,
4125- making it effectively read-only. Use this for storage servers which are
4126- being decommissioned: the storage/ directory could be mounted read-only,
4127- while shares are moved to other servers. Note that this currently only
4128- affects immutable shares. Mutable shares (used for directories) will be
4129- written and modified anyway. See ticket #390 for the current status of this
4130- bug. The default value is False.
4131-
4132-reserved_space = (str, optional)
4133-
4134- If provided, this value defines how much disk space is reserved: the storage
4135- server will not accept any share which causes the amount of free disk space
4136- to drop below this value. (The free space is measured by a call to statvfs(2)
4137- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
4138- user account under which the storage server runs.)
4139-
4140- This string contains a number, with an optional case-insensitive scale
4141- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
4142- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
4143- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
4144-
4145-expire.enabled =
4146-expire.mode =
4147-expire.override_lease_duration =
4148-expire.cutoff_date =
4149-expire.immutable =
4150-expire.mutable =
4151-
4152- These settings control garbage-collection, in which the server will delete
4153- shares that no longer have an up-to-date lease on them. Please see the
4154- neighboring "garbage-collection.txt" document for full details.
4155-
4156-
4157-== Running A Helper ==
4158+Running A Helper
4159+================
4160hunk ./docs/configuration.rst 424
4161+mutable.format = sdmf or mdmf
4162+
4163+ This value tells Tahoe-LAFS what the default mutable file format should
4164+ be. If mutable.format=sdmf, then newly created mutable files will be in
4165+ the old SDMF format. This is desirable for clients that operate on
4166+ grids where some peers run older versions of Tahoe-LAFS, as these older
4167+ versions cannot read the new MDMF mutable file format. If
4168+ mutable.format = mdmf, then newly created mutable files will use the
4169+ new MDMF format, which supports efficient in-place modification and
4170+ streaming downloads. You can overwrite this value using a special
4171+ mutable-type parameter in the webapi. If you do not specify a value
4172+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
4173+
4174+ Note that this parameter only applies to mutable files. Mutable
4175+ directories, which are stored as mutable files, are not controlled by
4176+ this parameter and will always use SDMF. We may revisit this decision
4177+ in future versions of Tahoe-LAFS.
4178)
4179)
4180)
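For reference, the stanza described above might look like this in tahoe.cfg
(placing it in the [client] section is an assumption based on the
surrounding configuration parameters):

    [client]
    # default format for newly-created mutable files: "sdmf" or "mdmf"
    mutable.format = mdmf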
4181hunk ./docs/frontends/webapi.rst 363
4182  writeable mutable file, that file's contents will be overwritten in-place. If
4183  it is a read-cap for a mutable file, an error will occur. If it is an
4184  immutable file, the old file will be discarded, and a new one will be put in
4185- its place.
4186+ its place. If the target file is a writable mutable file, you may also
4187+ specify an "offset" parameter -- a byte offset that determines where in
4188+ the mutable file the data from the HTTP request body is placed. This
4189+ operation is relatively efficient for MDMF mutable files, and is
4190+ relatively inefficient (but still supported) for SDMF mutable files.
4191 
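For example, a request along these lines (a sketch; $FILECAP stands for the
write-cap of an existing mutable file) would overwrite the bytes starting at
offset 1024 with the request body, leaving the rest of the file intact:

    PUT /uri/$FILECAP?offset=1024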
4192  When creating a new file, if "mutable=true" is in the query arguments, the
4193  operation will create a mutable file instead of an immutable one.
4194hunk ./docs/frontends/webapi.rst 388
4195 
4196  If "mutable=true" is in the query arguments, the operation will create a
4197  mutable file, and return its write-cap in the HTTP response. The default is
4198- to create an immutable file, returning the read-cap as a response.
4199+ to create an immutable file, returning the read-cap as a response. If
4200+ you create a mutable file, you can also use the "mutable-type" query
4201+ parameter. If "mutable-type=sdmf", then the mutable file will be created
4202+ in the old SDMF mutable file format. This is desirable for files that
4203+ need to be read by old clients. If "mutable-type=mdmf", then the file
4204+ will be created in the new MDMF mutable file format. MDMF mutable files
4205+ can be downloaded more efficiently, and modified in-place efficiently,
4206+ but are not compatible with older versions of Tahoe-LAFS. If no
4207+ "mutable-type" argument is given, the file is created in whatever
4208+ format was configured in tahoe.cfg.
4209 
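For instance, a sketch of a creation request using these parameters, which
would return the write-cap of a new MDMF mutable file in the response body:

    PUT /uri?mutable=true&mutable-type=mdmf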
4210 Creating A New Directory
4211 ------------------------
4212hunk ./docs/frontends/webapi.rst 1082
4213  If a "mutable=true" argument is provided, the operation will create a
4214  mutable file, and the response body will contain the write-cap instead of
4215  the upload results page. The default is to create an immutable file,
4216- returning the upload results page as a response.
4217+ returning the upload results page as a response. If you create a
4218+ mutable file, you may choose to specify the format of that mutable file
4219+ with the "mutable-type" parameter. If "mutable-type=mdmf", then the
4220+ file will be created as an MDMF mutable file. If "mutable-type=sdmf",
4221+ then the file will be created as an SDMF mutable file. If no value is
4222+ specified, the file will be created in whatever format is specified in
4223+ tahoe.cfg.
4224 
4225 
4226 ``POST /uri/$DIRCAP/[SUBDIRS../]?t=upload``
4227}
4228[mutable/layout.py and interfaces.py: add MDMF writer and reader
4229Kevan Carstensen <kevan@isnotajoke.com>**20100819003304
4230 Ignore-this: 44400fec923987b62830da2ed5075fb4
4231 
4232 The MDMF writer is responsible for keeping state as plaintext is
4233 gradually processed into share data by the upload process. When the
4234 upload finishes, it will write all of its share data to a remote server,
4235 reporting its status back to the publisher.
4236 
4237 The MDMF reader is responsible for abstracting an MDMF file as it sits
4238 on the grid from the downloader; specifically, by receiving and
4239 responding to requests for arbitrary data within the MDMF file.
4240 
4241 The interfaces.py file has also been modified to contain an interface
4242 for the writer.
4243] {
4244hunk ./src/allmydata/interfaces.py 7
4245      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
4246 
4247 HASH_SIZE=32
4248+SALT_SIZE=16
4249+
4250+SDMF_VERSION=0
4251+MDMF_VERSION=1
4252 
4253 Hash = StringConstraint(maxLength=HASH_SIZE,
4254                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
4255hunk ./src/allmydata/interfaces.py 424
4256         """
4257 
4258 
4259+class IMutableSlotWriter(Interface):
4260+    """
4261+    The interface for a writer around a mutable slot on a remote server.
4262+    """
4263+    def set_checkstring(checkstring, *args):
4264+        """
4265+        Set the checkstring that I will pass to the remote server when
4266+        writing.
4267+
4268+            @param checkstring: A packed checkstring to use.
4269+
4270+        Note that implementations can differ in which semantics they
4271+        wish to support for set_checkstring -- they can, for example,
4272+        build the checkstring themselves from its constituents, or
4273+        accept a pre-packed checkstring directly.
4274+        """
4275+
4276+    def get_checkstring():
4277+        """
4278+        Get the checkstring that I think currently exists on the remote
4279+        server.
4280+        """
4281+
4282+    def put_block(data, segnum, salt):
4283+        """
4284+        Add a block and salt to the share.
4285+        """
4286+
4287+    def put_encprivkey(encprivkey):
4288+        """
4289+        Add the encrypted private key to the share.
4290+        """
4291+
4292+    def put_blockhashes(blockhashes=list):
4293+        """
4294+        Add the block hash tree to the share.
4295+        """
4296+
4297+    def put_sharehashes(sharehashes=dict):
4298+        """
4299+        Add the share hash chain to the share.
4300+        """
4301+
4302+    def get_signable():
4303+        """
4304+        Return the part of the share that needs to be signed.
4305+        """
4306+
4307+    def put_signature(signature):
4308+        """
4309+        Add the signature to the share.
4310+        """
4311+
4312+    def put_verification_key(verification_key):
4313+        """
4314+        Add the verification key to the share.
4315+        """
4316+
4317+    def finish_publishing():
4318+        """
4319+        Do anything necessary to finish writing the share to a remote
4320+        server. I require that no further publishing needs to take place
4321+        after this method has been called.
4322+        """
4323+
4324+
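Taken together, these methods imply a publishing sequence roughly like the
following (a sketch only; writer is any IMutableSlotWriter implementation,
and names like encoded_segments, encprivkey, privkey, and verification_key
are placeholders):

    for segnum, (block, salt) in enumerate(encoded_segments):
        writer.put_block(block, segnum, salt)
    writer.put_encprivkey(encprivkey)
    writer.put_blockhashes(block_hash_tree)   # a list of 32-byte hashes
    writer.put_sharehashes(share_hash_chain)  # a dict of {shnum: hash}
    writer.put_signature(privkey.sign(writer.get_signable()))
    writer.put_verification_key(verification_key)
    d = writer.finish_publishing()            # Deferred; fires once written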
4325 class IURI(Interface):
4326     def init_from_string(uri):
4327         """Accept a string (as created by my to_string() method) and populate
4328hunk ./src/allmydata/mutable/layout.py 4
4329 
4330 import struct
4331 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
4332+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
4333+                                 MDMF_VERSION, IMutableSlotWriter
4334+from allmydata.util import mathutil, observer
4335+from twisted.python import failure
4336+from twisted.internet import defer
4337+from zope.interface import implements
4338+
4339+
4340+# These strings describe the format of the packed structs they help process
4341+# Here's what they mean:
4342+#
4343+#  PREFIX:
4344+#    >: Big-endian byte order; the most significant byte is first (leftmost).
4345+#    B: The version information; an 8 bit version identifier. Stored as
4346+#       an unsigned char. This is currently 0 (SDMF); our modifications
4347+#       will turn it into 1 (MDMF).
4348+#    Q: The sequence number; this is sort of like a revision number for
4349+#       mutable files; it starts at 1 and increases as the file is changed after
4350+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
4351+#       length.
4352+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4353+#       characters = 32 bytes to store the value.
4354+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4355+#       16 characters.
4356+#
4357+#  SIGNED_PREFIX additions, things that are covered by the signature:
4358+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4359+#       which is convenient because our erasure coding scheme cannot
4360+#       encode if you ask for more than 255 pieces.
4361+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4362+#       same reasons as above.
4363+#    Q: The segment size of the uploaded file. This will essentially be the
4364+#       length of the file in SDMF. An unsigned long long, so we can store
4365+#       files of quite large size.
4366+#    Q: The data length of the uploaded file. Modulo padding, this will be
4367+#       the same as the segment size field. Like the segment size field, it
4368+#       is an unsigned long long and can be quite large.
4369+#
4370+#   HEADER additions:
4371+#     L: The offset of the signature of this. An unsigned long.
4372+#     L: The offset of the share hash chain. An unsigned long.
4373+#     L: The offset of the block hash tree. An unsigned long.
4374+#     L: The offset of the share data. An unsigned long.
4375+#     Q: The offset of the encrypted private key. An unsigned long long, to
4376+#        account for the possibility of a lot of share data.
4377+#     Q: The offset of the EOF. An unsigned long long, to account for the
4378+#        possibility of a lot of share data.
4379+#
4380+#  After all of these, we have the following:
4381+#    - The verification key: Occupies the space between the end of the header
4382+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
4383+#    - The signature, which goes from the signature offset to the share hash
4384+#      chain offset.
4385+#    - The share hash chain, which goes from the share hash chain offset to
4386+#      the block hash tree offset.
4387+#    - The share data, which goes from the share data offset to the encrypted
4388+#      private key offset.
4389+#    - The encrypted private key, which goes from its offset to the end of the file.
4390+#
4391+#  The block hash tree in this encoding has only one hash in it, so the offset
4392+#  of the share data will be 32 bytes more than the offset of the block hash tree.
4393+#  Given this, we may need to check to see how many bytes a reasonably sized
4394+#  block hash tree will take up.
4395 
4396 PREFIX = ">BQ32s16s" # each version has a different prefix
4397 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
4398hunk ./src/allmydata/mutable/layout.py 73
4399 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4400 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4401 HEADER_LENGTH = struct.calcsize(HEADER)
4402+OFFSETS = ">LLLLQQ"
4403+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4404 
4405hunk ./src/allmydata/mutable/layout.py 76
4406+# These are still used for some tests.
4407 def unpack_header(data):
4408     o = {}
4409     (version,
4410hunk ./src/allmydata/mutable/layout.py 92
4411      o['EOF']) = struct.unpack(HEADER, data[:HEADER_LENGTH])
4412     return (version, seqnum, root_hash, IV, k, N, segsize, datalen, o)
4413 
4414-def unpack_prefix_and_signature(data):
4415-    assert len(data) >= HEADER_LENGTH, len(data)
4416-    prefix = data[:SIGNED_PREFIX_LENGTH]
4417-
4418-    (version,
4419-     seqnum,
4420-     root_hash,
4421-     IV,
4422-     k, N, segsize, datalen,
4423-     o) = unpack_header(data)
4424-
4425-    if version != 0:
4426-        raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4427-
4428-    if len(data) < o['share_hash_chain']:
4429-        raise NeedMoreDataError(o['share_hash_chain'],
4430-                                o['enc_privkey'], o['EOF']-o['enc_privkey'])
4431-
4432-    pubkey_s = data[HEADER_LENGTH:o['signature']]
4433-    signature = data[o['signature']:o['share_hash_chain']]
4434-
4435-    return (seqnum, root_hash, IV, k, N, segsize, datalen,
4436-            pubkey_s, signature, prefix)
4437-
4438 def unpack_share(data):
4439     assert len(data) >= HEADER_LENGTH
4440     o = {}
4441hunk ./src/allmydata/mutable/layout.py 139
4442             pubkey, signature, share_hash_chain, block_hash_tree,
4443             share_data, enc_privkey)
4444 
4445-def unpack_share_data(verinfo, hash_and_data):
4446-    (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, o_t) = verinfo
4447-
4448-    # hash_and_data starts with the share_hash_chain, so figure out what the
4449-    # offsets really are
4450-    o = dict(o_t)
4451-    o_share_hash_chain = 0
4452-    o_block_hash_tree = o['block_hash_tree'] - o['share_hash_chain']
4453-    o_share_data = o['share_data'] - o['share_hash_chain']
4454-    o_enc_privkey = o['enc_privkey'] - o['share_hash_chain']
4455-
4456-    share_hash_chain_s = hash_and_data[o_share_hash_chain:o_block_hash_tree]
4457-    share_hash_format = ">H32s"
4458-    hsize = struct.calcsize(share_hash_format)
4459-    assert len(share_hash_chain_s) % hsize == 0, len(share_hash_chain_s)
4460-    share_hash_chain = []
4461-    for i in range(0, len(share_hash_chain_s), hsize):
4462-        chunk = share_hash_chain_s[i:i+hsize]
4463-        (hid, h) = struct.unpack(share_hash_format, chunk)
4464-        share_hash_chain.append( (hid, h) )
4465-    share_hash_chain = dict(share_hash_chain)
4466-    block_hash_tree_s = hash_and_data[o_block_hash_tree:o_share_data]
4467-    assert len(block_hash_tree_s) % 32 == 0, len(block_hash_tree_s)
4468-    block_hash_tree = []
4469-    for i in range(0, len(block_hash_tree_s), 32):
4470-        block_hash_tree.append(block_hash_tree_s[i:i+32])
4471-
4472-    share_data = hash_and_data[o_share_data:o_enc_privkey]
4473-
4474-    return (share_hash_chain, block_hash_tree, share_data)
4475-
4476-
4477-def pack_checkstring(seqnum, root_hash, IV):
4478-    return struct.pack(PREFIX,
4479-                       0, # version,
4480-                       seqnum,
4481-                       root_hash,
4482-                       IV)
4483-
4484 def unpack_checkstring(checkstring):
4485     cs_len = struct.calcsize(PREFIX)
4486     version, seqnum, root_hash, IV = struct.unpack(PREFIX, checkstring[:cs_len])
4487hunk ./src/allmydata/mutable/layout.py 146
4488         raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4489     return (seqnum, root_hash, IV)
4490 
4491-def pack_prefix(seqnum, root_hash, IV,
4492-                required_shares, total_shares,
4493-                segment_size, data_length):
4494-    prefix = struct.pack(SIGNED_PREFIX,
4495-                         0, # version,
4496-                         seqnum,
4497-                         root_hash,
4498-                         IV,
4499-
4500-                         required_shares,
4501-                         total_shares,
4502-                         segment_size,
4503-                         data_length,
4504-                         )
4505-    return prefix
4506 
4507 def pack_offsets(verification_key_length, signature_length,
4508                  share_hash_chain_length, block_hash_tree_length,
4509hunk ./src/allmydata/mutable/layout.py 192
4510                            encprivkey])
4511     return final_share
4512 
4513+def pack_prefix(seqnum, root_hash, IV,
4514+                required_shares, total_shares,
4515+                segment_size, data_length):
4516+    prefix = struct.pack(SIGNED_PREFIX,
4517+                         0, # version,
4518+                         seqnum,
4519+                         root_hash,
4520+                         IV,
4521+                         required_shares,
4522+                         total_shares,
4523+                         segment_size,
4524+                         data_length,
4525+                         )
4526+    return prefix
4527+
4528+
4529+class SDMFSlotWriteProxy:
4530+    implements(IMutableSlotWriter)
4531+    """
4532+    I represent a remote write slot for an SDMF mutable file. I build a
4533+    share in memory, and then write it in one piece to the remote
4534+    server. This mimics how SDMF shares were built before MDMF (and the
4535+    new MDMF uploader), but provides that functionality in a way that
4536+    allows the MDMF uploader to be built without much special-casing for
4537+    file format, which makes the uploader code more readable.
4538+    """
4539+    def __init__(self,
4540+                 shnum,
4541+                 rref, # a remote reference to a storage server
4542+                 storage_index,
4543+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4544+                 seqnum, # the sequence number of the mutable file
4545+                 required_shares,
4546+                 total_shares,
4547+                 segment_size,
4548+                 data_length): # the length of the original file
4549+        self.shnum = shnum
4550+        self._rref = rref
4551+        self._storage_index = storage_index
4552+        self._secrets = secrets
4553+        self._seqnum = seqnum
4554+        self._required_shares = required_shares
4555+        self._total_shares = total_shares
4556+        self._segment_size = segment_size
4557+        self._data_length = data_length
4558+
4559+        # This is an SDMF file, so it should have only one segment, so,
4560+        # modulo padding of the data length, the segment size and the
4561+        # data length should be the same.
4562+        expected_segment_size = mathutil.next_multiple(data_length,
4563+                                                       self._required_shares)
4564+        assert expected_segment_size == segment_size
4565+
4566+        self._block_size = self._segment_size / self._required_shares
4567+
4568+        # This is meant to mimic how SDMF files were built before MDMF
4569+        # entered the picture: we generate each share in its entirety,
4570+        # then push it off to the storage server in one write. When
4571+        # callers call set_*, they are just populating this dict.
4572+        # finish_publishing will stitch these pieces together into a
4573+        # coherent share, and then write the coherent share to the
4574+        # storage server.
4575+        self._share_pieces = {}
4576+
4577+        # This tells the write logic what checkstring to use when
4578+        # writing remote shares.
4579+        self._testvs = []
4580+
4581+        self._readvs = [(0, struct.calcsize(PREFIX))]
4582+
4583+
4584+    def set_checkstring(self, checkstring_or_seqnum,
4585+                              root_hash=None,
4586+                              salt=None):
4587+        """
4588+        Set the checkstring that I will pass to the remote server when
4589+        writing.
4590+
4591+            @param checkstring_or_seqnum: A packed checkstring to use,
4592+                   or a sequence number, if root_hash and salt are also given.
4593+
4594+        Note that implementations can differ in which semantics they
4595+        wish to support for set_checkstring -- they can, for example,
4596+        build the checkstring themselves from its constituents, or
4597+        accept a pre-packed checkstring directly.
4598+        """
4599+        if root_hash and salt:
4600+            checkstring = struct.pack(PREFIX,
4601+                                      0,
4602+                                      checkstring_or_seqnum,
4603+                                      root_hash,
4604+                                      salt)
4605+        else:
4606+            checkstring = checkstring_or_seqnum
4607+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4608+
4609+
4610+    def get_checkstring(self):
4611+        """
4612+        Get the checkstring that I think currently exists on the remote
4613+        server.
4614+        """
4615+        if self._testvs:
4616+            return self._testvs[0][3]
4617+        return ""
4618+
4619+
4620+    def put_block(self, data, segnum, salt):
4621+        """
4622+        Add a block and salt to the share.
4623+        """
4624+        # SDMF files have only one segment
4625+        assert segnum == 0
4626+        assert len(data) == self._block_size
4627+        assert len(salt) == SALT_SIZE
4628+
4629+        self._share_pieces['sharedata'] = data
4630+        self._share_pieces['salt'] = salt
4631+
4632+        # TODO: Figure out something intelligent to return.
4633+        return defer.succeed(None)
4634+
4635+
4636+    def put_encprivkey(self, encprivkey):
4637+        """
4638+        Add the encrypted private key to the share.
4639+        """
4640+        self._share_pieces['encprivkey'] = encprivkey
4641+
4642+        return defer.succeed(None)
4643+
4644+
4645+    def put_blockhashes(self, blockhashes):
4646+        """
4647+        Add the block hash tree to the share.
4648+        """
4649+        assert isinstance(blockhashes, list)
4650+        for h in blockhashes:
4651+            assert len(h) == HASH_SIZE
4652+
4653+        # serialize the blockhashes, then set them.
4654+        blockhashes_s = "".join(blockhashes)
4655+        self._share_pieces['block_hash_tree'] = blockhashes_s
4656+
4657+        return defer.succeed(None)
4658+
4659+
4660+    def put_sharehashes(self, sharehashes):
4661+        """
4662+        Add the share hash chain to the share.
4663+        """
4664+        assert isinstance(sharehashes, dict)
4665+        for h in sharehashes.itervalues():
4666+            assert len(h) == HASH_SIZE
4667+
4668+        # serialize the sharehashes, then set them.
4669+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4670+                                 for i in sorted(sharehashes.keys())])
4671+        self._share_pieces['share_hash_chain'] = sharehashes_s
4672+
4673+        return defer.succeed(None)
4674+
4675+
4676+    def put_root_hash(self, root_hash):
4677+        """
4678+        Add the root hash to the share.
4679+        """
4680+        assert len(root_hash) == HASH_SIZE
4681+
4682+        self._share_pieces['root_hash'] = root_hash
4683+
4684+        return defer.succeed(None)
4685+
4686+
4687+    def put_salt(self, salt):
4688+        """
4689+        Add a salt to an empty SDMF file.
4690+        """
4691+        assert len(salt) == SALT_SIZE
4692+
4693+        self._share_pieces['salt'] = salt
4694+        self._share_pieces['sharedata'] = ""
4695+
4696+
4697+    def get_signable(self):
4698+        """
4699+        Return the part of the share that needs to be signed.
4700+
4701+        SDMF writers need to sign the packed representation of the
4702+        first eight fields of the remote share, that is:
4703+            - version number (0)
4704+            - sequence number
4705+            - root of the share hash tree
4706+            - salt
4707+            - k
4708+            - n
4709+            - segsize
4710+            - datalen
4711+
4712+        This method is responsible for returning that to callers.
4713+        """
4714+        return struct.pack(SIGNED_PREFIX,
4715+                           0,
4716+                           self._seqnum,
4717+                           self._share_pieces['root_hash'],
4718+                           self._share_pieces['salt'],
4719+                           self._required_shares,
4720+                           self._total_shares,
4721+                           self._segment_size,
4722+                           self._data_length)
4723+
4724+
4725+    def put_signature(self, signature):
4726+        """
4727+        Add the signature to the share.
4728+        """
4729+        self._share_pieces['signature'] = signature
4730+
4731+        return defer.succeed(None)
4732+
4733+
4734+    def put_verification_key(self, verification_key):
4735+        """
4736+        Add the verification key to the share.
4737+        """
4738+        self._share_pieces['verification_key'] = verification_key
4739+
4740+        return defer.succeed(None)
4741+
4742+
4743+    def get_verinfo(self):
4744+        """
4745+        I return my verinfo tuple. This is used by the ServermapUpdater
4746+        to keep track of versions of mutable files.
4747+
4748+        The verinfo tuple for MDMF files contains:
4749+            - seqnum
4750+            - root hash
4751+            - a blank (nothing)
4752+            - segsize
4753+            - datalen
4754+            - k
4755+            - n
4756+            - prefix (the thing that you sign)
4757+            - a tuple of offsets
4758+
4759+        We include the nonce in MDMF to simplify processing of version
4760+        information tuples.
4761+
4762+        The verinfo tuple for SDMF files is the same, but contains a
4763+        16-byte IV instead of a hash of salts.
4764+        """
4765+        return (self._seqnum,
4766+                self._share_pieces['root_hash'],
4767+                self._share_pieces['salt'],
4768+                self._segment_size,
4769+                self._data_length,
4770+                self._required_shares,
4771+                self._total_shares,
4772+                self.get_signable(),
4773+                self._get_offsets_tuple())
4774+
4775+    def _get_offsets_dict(self):
4776+        post_offset = HEADER_LENGTH
4777+        offsets = {}
4778+
4779+        verification_key_length = len(self._share_pieces['verification_key'])
4780+        o1 = offsets['signature'] = post_offset + verification_key_length
4781+
4782+        signature_length = len(self._share_pieces['signature'])
4783+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4784+
4785+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4786+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4787+
4788+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4789+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4790+
4791+        share_data_length = len(self._share_pieces['sharedata'])
4792+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4793+
4794+        encprivkey_length = len(self._share_pieces['encprivkey'])
4795+        offsets['EOF'] = o5 + encprivkey_length
4796+        return offsets
4797+
4798+
4799+    def _get_offsets_tuple(self):
4800+        offsets = self._get_offsets_dict()
4801+        return tuple([(key, value) for key, value in offsets.items()])
4802+
4803+
4804+    def _pack_offsets(self):
4805+        offsets = self._get_offsets_dict()
4806+        return struct.pack(">LLLLQQ",
4807+                           offsets['signature'],
4808+                           offsets['share_hash_chain'],
4809+                           offsets['block_hash_tree'],
4810+                           offsets['share_data'],
4811+                           offsets['enc_privkey'],
4812+                           offsets['EOF'])
4813+
4814+
4815+    def finish_publishing(self):
4816+        """
4817+        Do anything necessary to finish writing the share to a remote
4818+        server. I require that no further publishing needs to take place
4819+        after this method has been called.
4820+        """
4821+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4822+                  "share_hash_chain", "block_hash_tree"]:
4823+            assert k in self._share_pieces
4824+        # This is the only method that actually writes something to the
4825+        # remote server.
4826+        # First, we need to pack the share into data that we can write
4827+        # to the remote server in one write.
4828+        offsets = self._pack_offsets()
4829+        prefix = self.get_signable()
4830+        final_share = "".join([prefix,
4831+                               offsets,
4832+                               self._share_pieces['verification_key'],
4833+                               self._share_pieces['signature'],
4834+                               self._share_pieces['share_hash_chain'],
4835+                               self._share_pieces['block_hash_tree'],
4836+                               self._share_pieces['sharedata'],
4837+                               self._share_pieces['encprivkey']])
4838+
4839+        # Our only data vector is going to be writing the final share,
4840+        # in its entirety.
4841+        datavs = [(0, final_share)]
4842+
4843+        if not self._testvs:
4844+            # Our caller has not provided us with another checkstring
4845+            # yet, so we assume that we are writing a new share, and set
4846+            # a test vector that will allow a new share to be written.
4847+            self._testvs = []
4848+            self._testvs.append(tuple([0, 1, "eq", ""]))
4849+
4850+        tw_vectors = {}
4851+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4852+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4853+                                     self._storage_index,
4854+                                     self._secrets,
4855+                                     tw_vectors,
4856+                                     # TODO is it useful to read something?
4857+                                     self._readvs)
4858+
4859+
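For reference, a sketch of the shape of the arguments finish_publishing
hands to slot_testv_and_readv_and_writev (values are placeholders; 57 is
the packed PREFIX length used for the test vector):

    # maps shnum -> (test vector, write vector, new_length)
    tw_vectors = {0: ([(0, 57, "eq", expected_checkstring)],  # testv
                      [(0, final_share)],                     # datav
                      None)}                                  # new_length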
4860+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4861+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4862+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4863+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4864+MDMFCHECKSTRING = ">BQ32s"
4865+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4866+MDMFOFFSETS = ">QQQQQQ"
4867+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4868+
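A quick sanity check of these constants (sketch): the six 8-byte offsets
start right after the 59-byte fixed header, so the offset table in the
layout comment below runs from byte 59 through byte 107:

    >>> import struct
    >>> struct.calcsize(">BQ32sBBQQ")         # MDMFHEADERWITHOUTOFFSETS
    59
    >>> struct.calcsize(">QQQQQQ")            # MDMFOFFSETS: six offsets
    48
    >>> struct.calcsize(">BQ32sBBQQ QQQQQQ")  # MDMFHEADER: 59 + 48
    107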
4869+class MDMFSlotWriteProxy:
4870+    implements(IMutableSlotWriter)
4871+
4872+    """
4873+    I represent a remote write slot for an MDMF mutable file.
4874+
4875+    I abstract away from my caller the details of block and salt
4876+    management, and the implementation of the on-disk format for MDMF
4877+    shares.
4878+    """
4879+    # Expected layout, MDMF:
4880+    # offset:     size:       name:
4881+    #-- signed part --
4882+    # 0           1           version number (01)
4883+    # 1           8           sequence number
4884+    # 9           32          share tree root hash
4885+    # 41          1           The "k" encoding parameter
4886+    # 42          1           The "N" encoding parameter
4887+    # 43          8           The segment size of the uploaded file
4888+    # 51          8           The data length of the original plaintext
4889+    #-- end signed part --
4890+    # 59          8           The offset of the encrypted private key
4891+    # 83          8           The offset of the signature
4892+    # 91          8           The offset of the verification key
4893+    # 67          8           The offset of the block hash tree
4894+    # 75          8           The offset of the share hash chain
4895+    # 99          8           The offset of the EOF
4896+    #
4897+    # followed by salts and share data, the encrypted private key, the
4898+    # block hash tree, the salt hash tree, the share hash chain, a
4899+    # signature over the first eight fields, and a verification key.
4900+    #
4901+    # The checkstring is the first three fields -- the version number,
4902+    # sequence number, and root hash. This is consistent in meaning
4903+    # with what we have with SDMF files, except now instead of
4904+    # using the literal salt, we use a value derived from all of the
4905+    # salts -- the share hash root.
4906+    #
4907+    # The salt is stored before the block for each segment. The block
4908+    # hash tree is computed over the combination of block and salt for
4909+    # each segment. In this way, we get integrity checking for both
4910+    # block and salt with the current block hash tree arrangement.
4911+    #
4912+    # The ordering of the offsets is different to reflect the dependencies
4913+    # that we'll run into with an MDMF file. The expected write flow is
4914+    # something like this:
4915+    #
4916+    #   0: Initialize with the sequence number, encoding parameters and
4917+    #      data length. From this, we can deduce the number of segments,
4918+    #      and where they should go. We can also figure out where the
4919+    #      encrypted private key should go, because we can figure out how
4920+    #      big the share data will be.
4921+    #
4922+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4923+    #      like
4924+    #
4925+    #       put_block(data, segnum, salt)
4926+    #
4927+    #      to write a block and a salt to the disk. We can do both of
4928+    #      these operations now because we have enough of the offsets to
4929+    #      know where to put them.
4930+    #
4931+    #   2: Put the encrypted private key. Use:
4932+    #
4933+    #        put_encprivkey(encprivkey)
4934+    #
4935+    #      Now that we know the length of the private key, we can fill
4936+    #      in the offset for the block hash tree.
4937+    #
4938+    #   3: We're now in a position to upload the block hash tree for
4939+    #      a share. Put that using something like:
4940+    #       
4941+    #        put_blockhashes(block_hash_tree)
4942+    #
4943+    #      Note that block_hash_tree is a list of hashes -- we'll take
4944+    #      care of the details of serializing that appropriately. When
4945+    #      we get the block hash tree, we are also in a position to
4946+    #      calculate the offset for the share hash chain, and fill that
4947+    #      into the offsets table.
4948+    #
4949+    #   4: We're now in a position to upload the share hash chain for
4950+    #      a share. Do that with something like:
4951+    #
4952+    #        put_sharehashes(share_hash_chain)
4953+    #
4954+    #      share_hash_chain should be a dictionary mapping shnums to
4955+    #      32-byte hashes -- the wrapper handles serialization.
4956+    #      We'll know where to put the signature at this point, also.
4957+    #      The root of this tree will be put explicitly in the next
4958+    #      step.
4959+    #
4960+    #      TODO: Why? Why not just include it in the tree here?
4961+    #
4962+    #   5: Before putting the signature, we must first put the
4963+    #      root_hash. Do this with:
4964+    #
4965+    #        put_root_hash(root_hash)
4966+    #
4967+    #      In terms of knowing where to put this value, it was always
4968+    #      possible to place it, but it makes sense semantically to
4969+    #      place it after the share hash chain, so that's why you do it
4970+    #      in this order.
4971+    #
4972+    #   6: With the root hash put, we can now sign the header. Use:
4973+    #
4974+    #        get_signable()
4975+    #
4976+    #      to get the part of the header that you want to sign, and use:
4977+    #
4978+    #        put_signature(signature)
4979+    #
4980+    #      to write your signature to the remote server.
4981+    #
4982+    #   7: Add the verification key, and finish. Do:
4983+    #
4984+    #        put_verification_key(key)
4985+    #
4986+    #      and
4987+    #
4988+    #        finish_publishing()
4989+    #
4990+    #      (A concrete sketch of this sequence appears after this
4991+    #      class.)
4992+    #
5001+    # Checkstring management:
5002+    #
5003+    # To write to a mutable slot, we have to provide test vectors to ensure
5004+    # that we are writing to the same data that we think we are. These
5005+    # vectors allow us to detect uncoordinated writes -- that is, writes
5006+    # where we and some other writer are both writing to the mutable
5007+    # slot -- and to report those back to the parts of the program
5008+    # doing the writing.
5009+    #
5010+    # With SDMF, this was easy -- all of the share data was written in
5011+    # one go, so it was easy to detect uncoordinated writes, and we only
5012+    # had to do it once. With MDMF, not all of the file is written at
5013+    # once.
5014+    #
5015+    # If a share is new, we write out as much of the header as we can
5016+    # before writing out anything else. This gives other writers a
5017+    # canary that they can use to detect uncoordinated writes, and, if
5018+    # they do the same thing, gives us the same canary. We then update
5019+    # the share. We won't be able to write out one field of the header
5020+    # -- the share tree root hash -- until we finish writing out the
5021+    # share. We only require the writer to provide the
5022+    # initial checkstring, and keep track of what it should be after
5023+    # updates ourselves.
5024+    #
5025+    # If we haven't written anything yet, then on the first write (which
5026+    # will probably be a block + salt of a share), we'll also write out
5027+    # the header. On subsequent passes, we'll expect to see the header.
5028+    # This changes when we write out the root of the share hash tree,
5029+    # since that value will change the header. It is possible that we
5030+    # can just make that be written in one operation to minimize
5031+    # disruption.
5036+    def __init__(self,
5037+                 shnum,
5038+                 rref, # a remote reference to a storage server
5039+                 storage_index,
5040+                 secrets, # (write_enabler, renew_secret, cancel_secret)
5041+                 seqnum, # the sequence number of the mutable file
5042+                 required_shares,
5043+                 total_shares,
5044+                 segment_size,
5045+                 data_length): # the length of the original file
5046+        self.shnum = shnum
5047+        self._rref = rref
5048+        self._storage_index = storage_index
5049+        self._seqnum = seqnum
5050+        self._required_shares = required_shares
5051+        assert self.shnum >= 0 and self.shnum < total_shares
5052+        self._total_shares = total_shares
5053+        # We build up the offset table as we write things. It is the
5054+        # last thing we write to the remote server.
5055+        self._offsets = {}
5056+        self._testvs = []
5057+        # This is a list of write vectors that will be sent to our
5058+        # remote server once we are directed to write things there.
5059+        self._writevs = []
5060+        self._secrets = secrets
5061+        # The segment size needs to be a multiple of the k parameter --
5062+        # any padding should have been carried out by the publisher
5063+        # already.
5064+        assert segment_size % required_shares == 0
5065+        self._segment_size = segment_size
5066+        self._data_length = data_length
5067+
5068+        # These are set later -- we define them here so that we can
5069+        # check for their existence easily
5070+
5071+        # This is the root of the share hash tree -- the Merkle tree
5072+        # over the roots of the block hash trees computed for shares in
5073+        # this upload.
5074+        self._root_hash = None
5075+
5076+        # We haven't yet written anything to the remote bucket. By
5077+        # setting this, we tell the _write method as much. The write
5078+        # method will then know that it also needs to add a write vector
5079+        # for the checkstring (or what we have of it) to the first write
5080+        # request. We'll then record that value for future use.  If
5081+        # we're expecting something to be there already, we need to call
5082+        # set_checkstring before we write anything to tell the first
5083+        # write about that.
5084+        self._written = False
5085+
5086+        # When writing data to the storage servers, we get a read vector
5087+        # for free. We'll read the checkstring, which will help us
5088+        # figure out what's gone wrong if a write fails.
5089+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
5090+
5091+        # We calculate the number of segments because it tells us
5092+        # where each (salt, block) pair will live in the share, and
5093+        # also because it provides a useful amount of bounds checking.
5094+        self._num_segments = mathutil.div_ceil(self._data_length,
5095+                                               self._segment_size)
5096+        self._block_size = self._segment_size / self._required_shares
5097+        # We also calculate the tail block size, to help us with
5098+        # block size constraints later.
5099+        tail_size = self._data_length % self._segment_size
5100+        if not tail_size:
5101+            self._tail_block_size = self._block_size
5102+        else:
5103+            self._tail_block_size = mathutil.next_multiple(tail_size,
5104+                                                           self._required_shares)
5105+            self._tail_block_size /= self._required_shares
5106+
5107+        # We already know where the share data starts: right after the end
5108+        # of the header (which is defined as the signable part + the
5109+        # offsets). We can also calculate where the encrypted private key
5110+        # begins from what we now know.
5111+        self._actual_block_size = self._block_size + SALT_SIZE
5112+        data_size = self._actual_block_size * (self._num_segments - 1)
5113+        data_size += self._tail_block_size
5114+        data_size += SALT_SIZE
5115+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
5116+        self._offsets['enc_privkey'] += data_size
5117+        # We'll wait for the rest. Callers can now call my "put_block" and
5118+        # "set_checkstring" methods.
5119+
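    # As a worked example (numbers are illustrative; they match the
    # parameters used by the tests for this class): with k=3,
    # segment_size=6 and data_length=36 we get 6 segments, a 2-byte
    # block per segment, and an 18-byte (salt + block) unit per
    # segment, so the share data occupies 6 * 18 = 108 bytes and the
    # encrypted private key lands at offset 107 + 108 = 215.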
5120+
5121+    def set_checkstring(self,
5122+                        seqnum_or_checkstring,
5123+                        root_hash=None,
5124+                        salt=None):
5125+        """
5126+        Set the checkstring for the given shnum.
5127+
5128+        This can be invoked in one of two ways.
5129+
5130+        With one argument, I assume that you are giving me a literal
5131+        checkstring -- e.g., the output of get_checkstring. I will then
5132+        set that checkstring as it is. This form is used by unit tests.
5133+
5134+        With two arguments, I assume that you are giving me a sequence
5135+        number and root hash to make a checkstring from. In that case, I
5136+        will build a checkstring and set it for you. This form is used
5137+        by the publisher.
5138+
5139+        By default, I assume that I am writing new shares to the grid.
5140+        If you don't explicitly set your own checkstring, I will use
5141+        one that requires that the remote share not exist. You will want
5142+        to use this method if you are updating a share in-place;
5143+        otherwise, writes will fail.
5144+        """
5145+        # You're allowed to overwrite checkstrings with this method;
5146+        # I assume that users know what they are doing when they call
5147+        # it.
5148+        if root_hash:
5149+            checkstring = struct.pack(MDMFCHECKSTRING,
5150+                                      1,
5151+                                      seqnum_or_checkstring,
5152+                                      root_hash)
5153+        else:
5154+            checkstring = seqnum_or_checkstring
5155+
5156+        if checkstring == "":
5157+            # We special-case the empty checkstring: it means that the
5158+            # remote share should not exist yet, so we don't install a
5159+            # test vector here. _write will add its default
5160+            # (0, 1, "eq", "") vector, which asserts an empty share,
5161+            # on the first write.
5162+            self._testvs = []
5163+        else:
5164+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
5165+
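    # For illustration, both calling conventions (the writers and
    # values here are hypothetical):
    #
    #   w.set_checkstring(other_writer.get_checkstring())  # literal form
    #   w.set_checkstring(seqnum, root_hash)               # built form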
5166+
5167+    def __repr__(self):
5168+        return "MDMFSlotWriteProxy for share %d" % self.shnum
5169+
5170+
5171+    def get_checkstring(self):
5172+        """
5173+        I return a representation of what the checkstring for this
5174+        share will look like on the server.
5175+
5176+        I am mostly used for tests.
5177+        """
5178+        if self._root_hash:
5179+            roothash = self._root_hash
5180+        else:
5181+            roothash = "\x00" * 32
5182+        return struct.pack(MDMFCHECKSTRING,
5183+                           1,
5184+                           self._seqnum,
5185+                           roothash)
5186+
5187+
5188+    def put_block(self, data, segnum, salt):
5189+        """
5190+        I queue a write vector for the data, salt, and segment number
5191+        provided to me. I return None, as I do not actually cause
5192+        anything to be written yet.
5193+        """
5194+        if segnum >= self._num_segments:
5195+            raise LayoutInvalid("I won't overwrite the private key")
5196+        if len(salt) != SALT_SIZE:
5197+            raise LayoutInvalid("I was given a salt of size %d, but I "
5198+                                "wanted a salt of size %d" %
5199+                                (len(salt), SALT_SIZE))
5199+        if segnum + 1 == self._num_segments:
5200+            if len(data) != self._tail_block_size:
5201+                raise LayoutInvalid("I was given the wrong size block to write")
5202+        elif len(data) != self._block_size:
5203+            raise LayoutInvalid("I was given the wrong size block to write")
5204+
5205+        # We write at MDMFHEADERSIZE + segnum * (block_size + SALT_SIZE),
5206+        # since each block is stored with its salt in front of it.
5207+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
5208+        data = salt + data
5209+
5210+        self._writevs.append((offset, data))
5211+
5212+
5213+    def put_encprivkey(self, encprivkey):
5214+        """
5215+        I queue a write vector for the encrypted private key provided to
5216+        me.
5217+        """
5218+        assert self._offsets
5219+        assert self._offsets['enc_privkey']
5220+        # You shouldn't re-write the encprivkey after the block hash
5221+        # tree is written, since that could cause the private key to run
5222+        # into the block hash tree. Before it writes the block hash
5223+        # tree, put_blockhashes records the offset of the share hash
5224+        # chain. So that's a good indicator of whether or not the block
5225+        # hash tree has been written.
5226+        if "share_hash_chain" in self._offsets:
5227+            raise LayoutInvalid("You must write this before the block hash tree")
5228+
5229+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
5230+            len(encprivkey)
5231+        self._writevs.append((self._offsets['enc_privkey'], encprivkey))
5232+
5233+
5234+    def put_blockhashes(self, blockhashes):
5235+        """
5236+        I queue a write vector to put the block hash tree in blockhashes
5237+        onto the remote server.
5238+
5239+        The encrypted private key must be queued before the block hash
5240+        tree, since we need to know how large it is to know where the
5241+        block hash tree should go. The block hash tree must be put
5242+        before the share hash chain, since its size determines the
5243+        offset of the share hash chain.
5244+        """
5245+        assert self._offsets
5246+        assert isinstance(blockhashes, list)
5247+        if "block_hash_tree" not in self._offsets:
5248+            raise LayoutInvalid("You must put the encrypted private key "
5249+                                "before you put the block hash tree")
5250+        # If written, the share hash chain causes the signature offset
5251+        # to be defined.
5252+        if "signature" in self._offsets:
5253+            raise LayoutInvalid("You must put the block hash tree before "
5254+                                "you put the share hash chain")
5255+        blockhashes_s = "".join(blockhashes)
5256+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
5257+
5258+        self._writevs.append((self._offsets['block_hash_tree'],
5259+                              blockhashes_s))
5260+
5261+
5262+    def put_sharehashes(self, sharehashes):
5263+        """
5264+        I queue a write vector to put the share hash chain in my
5265+        argument onto the remote server.
5266+
5267+        The block hash tree must be queued before the share hash chain,
5268+        since we need to know where the block hash tree ends before we
5269+        can know where the share hash chain starts. The share hash chain
5270+        must be put before the signature, since the length of the packed
5271+        share hash chain determines the offset of the signature. Also,
5272+        semantically, the share hash chain is needed to compute the root
5273+        hash, which must be in place before a valid signature can be
5274+        generated.
5274+        """
5275+        assert isinstance(sharehashes, dict)
5276+        if "share_hash_chain" not in self._offsets:
5277+            raise LayoutInvalid("You need to put the block hash tree "
5278+                                "before you can put the share hash chain")
5279+        # The signature comes after the share hash chain. If the
5280+        # signature has already been written, we must not write another
5281+        # share hash chain. The signature writes the verification key
5282+        # offset when it gets sent to the remote server, so we look for
5283+        # that.
5284+        if "verification_key" in self._offsets:
5285+            raise LayoutInvalid("You must write the share hash chain "
5286+                                "before you write the signature")
5287+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5288+                                  for i in sorted(sharehashes.keys())])
5289+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
5290+        self._writevs.append((self._offsets['share_hash_chain'],
5291+                              sharehashes_s))
5292+
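    # For illustration: each chain entry is packed as a 2-byte share
    # number followed by a 32-byte hash, so a chain like {3: h3, 5: h5}
    # (hypothetical values) is serialized as
    #
    #   struct.pack(">H32s", 3, h3) + struct.pack(">H32s", 5, h5)
    #
    # which is exactly what MDMFSlotReadProxy.get_sharehashes unpacks.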
5293+
5294+    def put_root_hash(self, roothash):
5295+        """
5296+        Put the root hash (the root of the share hash tree) in the
5297+        remote slot.
5298+        """
5299+        # It does not make sense to be able to put the root
5300+        # hash without first putting the share hashes, since you need
5301+        # the share hashes to generate the root hash.
5302+        #
5303+        # Signature is defined by the routine that places the share hash
5304+        # chain, so it's a good thing to look for in finding out whether
5305+        # or not the share hash chain exists on the remote server.
5306+        if "signature" not in self._offsets:
5307+            raise LayoutInvalid("You need to put the share hash chain "
5308+                                "before you can put the root hash")
5309+        if len(roothash) != HASH_SIZE:
5310+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
5311+                                 % HASH_SIZE)
5312+        self._root_hash = roothash
5313+        # To write this value, we update the checkstring on the remote
5314+        # server, which includes it.
5315+        checkstring = self.get_checkstring()
5316+        self._writevs.append((0, checkstring))
5317+        # This write, if successful, changes the checkstring; _write
5318+        # records the new value in our test vectors on the first
5319+        # successful write.
5320+
5321+
5322+    def get_signable(self):
5323+        """
5324+        Get the first seven fields of the share header -- the part
5325+        that gets signed.
5326+        """
5327+        if not self._root_hash:
5328+            raise LayoutInvalid("You need to set the root hash "
5329+                                "before getting something to "
5330+                                "sign")
5331+        return struct.pack(MDMFSIGNABLEHEADER,
5332+                           1,
5333+                           self._seqnum,
5334+                           self._root_hash,
5335+                           self._required_shares,
5336+                           self._total_shares,
5337+                           self._segment_size,
5338+                           self._data_length)
5339+
5340+
5341+    def put_signature(self, signature):
5342+        """
5343+        I queue a write vector for the signature of the MDMF share.
5344+
5345+        I require that the root hash and share hash chain have been put
5346+        to the grid before I will write the signature to the grid.
5347+        """
5348+        # It does not make sense to put a signature without first
5349+        # putting the root hash (since otherwise the signature would be
5350+        # incomplete), so we don't allow that.
5351+        if "signature" not in self._offsets:
5352+            raise LayoutInvalid("You must put the share hash chain "
5353+                                "before putting the signature")
5354+        if not self._root_hash:
5355+            raise LayoutInvalid("You must complete the signed prefix "
5356+                                "before computing a signature")
5357+        # If we put the signature after we put the verification key, we
5358+        # could end up running into the verification key, and will
5359+        # probably screw up the offsets as well. So we don't allow that.
5360+        # The method that writes the verification key defines the EOF
5361+        # offset before writing the verification key, so look for that.
5362+        if "EOF" in self._offsets:
5363+            raise LayoutInvalid("You must write the signature before the verification key")
5364+
5365+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
5366+        self._writevs.append((self._offsets['signature'], signature))
5367+
5368+
5369+    def put_verification_key(self, verification_key):
5370+        """
5371+        I queue a write vector for the verification key.
5372+
5373+        I require that the signature have been written to the storage
5374+        server before I allow the verification key to be written to the
5375+        remote server.
5376+        """
5377+        if "verification_key" not in self._offsets:
5378+            raise LayoutInvalid("You must put the signature before you "
5379+                                "can put the verification key")
5380+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
5381+        self._writevs.append((self._offsets['verification_key'],
5382+                              verification_key))
5383+
5384+
5385+    def _get_offsets_tuple(self):
5386+        return tuple([(key, value) for key, value in self._offsets.items()])
5387+
5388+
5389+    def get_verinfo(self):
5390+        return (self._seqnum,
5391+                self._root_hash,
5392+                self._required_shares,
5393+                self._total_shares,
5394+                self._segment_size,
5395+                self._data_length,
5396+                self.get_signable(),
5397+                self._get_offsets_tuple())
5398+
5399+
5400+    def finish_publishing(self):
5401+        """
5402+        I add a write vector for the offsets table, and then cause all
5403+        of the write vectors that I've dealt with so far to be published
5404+        to the remote server, ending the write process.
5405+        """
5406+        if "EOF" not in self._offsets:
5407+            raise LayoutInvalid("You must put the verification key before "
5408+                                "you can publish the offsets")
5409+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
5410+        offsets = struct.pack(MDMFOFFSETS,
5411+                              self._offsets['enc_privkey'],
5412+                              self._offsets['block_hash_tree'],
5413+                              self._offsets['share_hash_chain'],
5414+                              self._offsets['signature'],
5415+                              self._offsets['verification_key'],
5416+                              self._offsets['EOF'])
5417+        self._writevs.append((offsets_offset, offsets))
5418+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5419+        params = struct.pack(">BBQQ",
5420+                             self._required_shares,
5421+                             self._total_shares,
5422+                             self._segment_size,
5423+                             self._data_length)
5424+        self._writevs.append((encoding_parameters_offset, params))
5425+        return self._write(self._writevs)
5426+
5427+
5428+    def _write(self, datavs, on_failure=None, on_success=None):
5429+        """I write the data vectors in datavs to the remote slot."""
5430+        tw_vectors = {}
5431+        if not self._testvs:
5432+            # No checkstring was set, so require that the share be new.
5433+            self._testvs = [(0, 1, "eq", "")]
5434+        if not self._written:
5435+            # Write a new checkstring to the share when we write it, so
5436+            # that we have something to check later.
5437+            new_checkstring = self.get_checkstring()
5438+            datavs.append((0, new_checkstring))
5439+            def _first_write():
5440+                self._written = True
5441+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5442+            on_success = _first_write
5443+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5444+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5445+                                  self._storage_index,
5446+                                  self._secrets,
5447+                                  tw_vectors,
5448+                                  self._readv)
5449+        def _result(results):
5450+            if isinstance(results, failure.Failure) or not results[0]:
5451+                # Do nothing; the write was unsuccessful.
5452+                if on_failure: on_failure()
5453+            else:
5454+                if on_success: on_success()
5455+            return results
5456+        d.addCallback(_result)
5457+        return d
5458+
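# A minimal sketch of the full write flow documented at the top of
# MDMFSlotWriteProxy, under the assumption that the caller has already
# encoded the file. The writer is an MDMFSlotWriteProxy; sign_cb (e.g.
# an RSA signing callable) and the other arguments are caller-supplied
# and hypothetical, not part of this module.
def _example_publish(writer, blocks_and_salts, encprivkey,
                     block_hash_tree, share_hash_chain, root_hash,
                     sign_cb, verification_key):
    # Queue each (block, salt) pair; nothing is sent over the wire yet.
    for segnum, (block, salt) in enumerate(blocks_and_salts):
        writer.put_block(block, segnum, salt)
    writer.put_encprivkey(encprivkey)
    writer.put_blockhashes(block_hash_tree)
    writer.put_sharehashes(share_hash_chain)
    writer.put_root_hash(root_hash)
    writer.put_signature(sign_cb(writer.get_signable()))
    writer.put_verification_key(verification_key)
    # A single slot_testv_and_readv_and_writev call sends everything.
    return writer.finish_publishing()
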
5459+
5460+class MDMFSlotReadProxy:
5461+    """
5462+    I read from a mutable slot filled with data written in the MDMF data
5463+    format (which is described above).
5464+
5465+    I can be initialized with some amount of data, which I will use (if
5466+    it is valid) to eliminate some of the need to fetch it from servers.
5467+    """
5468+    def __init__(self,
5469+                 rref,
5470+                 storage_index,
5471+                 shnum,
5472+                 data=""):
5473+        # Start the initialization process.
5474+        self._rref = rref
5475+        self._storage_index = storage_index
5476+        self.shnum = shnum
5477+
5478+        # Before doing anything, the reader is probably going to want to
5479+        # verify that the signature is correct. To do that, they'll need
5480+        # the verification key, and the signature. To get those, we'll
5481+        # need the offset table. So fetch the offset table on the
5482+        # assumption that it will be the first thing that a reader is
5483+        # going to do.
5484+
5485+        # The fact that these encoding parameters are None tells us
5486+        # that we haven't yet fetched them from the remote share, so we
5487+        # should. We could just not set them, but the checks will be
5488+        # easier to read if we don't have to use hasattr.
5489+        self._version_number = None
5490+        self._sequence_number = None
5491+        self._root_hash = None
5492+        # Filled in if we're dealing with an SDMF file. Unused
5493+        # otherwise.
5494+        self._salt = None
5495+        self._required_shares = None
5496+        self._total_shares = None
5497+        self._segment_size = None
5498+        self._data_length = None
5499+        self._offsets = None
5500+
5501+        # If the user has chosen to initialize us with some data, we'll
5502+        # try to satisfy subsequent data requests with that data before
5503+        # asking the storage server for it.
5504+        self._data = data
5505+        # The way callers interact with cache in the filenode returns
5506+        # None if there isn't any cached data, but the way we index the
5507+        # cached data requires a string, so convert None to "".
5508+        if self._data is None:
5509+            self._data = ""
5510+
5511+        self._queue_observers = observer.ObserverList()
5512+        self._queue_errbacks = observer.ObserverList()
5513+        self._readvs = []
5514+
5515+
5516+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5517+        """
5518+        I fetch the offset table and the header from the remote slot if
5519+        I don't already have them. If I do have them, I do nothing and
5520+        return an empty Deferred.
5521+        """
5522+        if self._offsets:
5523+            return defer.succeed(None)
5524+        # At this point, we may be either SDMF or MDMF. Fetching 107
5525+        # bytes is enough to get the header and offsets for both: SDMF
5526+        # needs a 75-byte signed prefix plus 32 bytes of offsets, and
5527+        # MDMF a 59-byte signable header plus 48 bytes of offsets. This
5528+        # is probably less expensive than the cost of a second roundtrip.
5529+        readvs = [(0, 107)]
5530+        d = self._read(readvs, force_remote)
5531+        d.addCallback(self._process_encoding_parameters)
5532+        d.addCallback(self._process_offsets)
5533+        return d
5534+
5535+
5536+    def _process_encoding_parameters(self, encoding_parameters):
5537+        assert self.shnum in encoding_parameters
5538+        encoding_parameters = encoding_parameters[self.shnum][0]
5539+        # The first byte is the version number. It will tell us what
5540+        # to do next.
5541+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5542+        if verno == MDMF_VERSION:
5543+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5544+            (verno,
5545+             seqnum,
5546+             root_hash,
5547+             k,
5548+             n,
5549+             segsize,
5550+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5551+                                      encoding_parameters[:read_size])
5552+            if segsize == 0 and datalen == 0:
5553+                # Empty file, no segments.
5554+                self._num_segments = 0
5555+            else:
5556+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5557+
5558+        elif verno == SDMF_VERSION:
5559+            read_size = SIGNED_PREFIX_LENGTH
5560+            (verno,
5561+             seqnum,
5562+             root_hash,
5563+             salt,
5564+             k,
5565+             n,
5566+             segsize,
5567+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5568+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5569+            self._salt = salt
5570+            if segsize == 0 and datalen == 0:
5571+                # empty file
5572+                self._num_segments = 0
5573+            else:
5574+                # non-empty SDMF files have one segment.
5575+                self._num_segments = 1
5576+        else:
5577+            raise UnknownVersionError("You asked me to read mutable file "
5578+                                      "version %d, but I only understand "
5579+                                      "%d and %d" % (verno, SDMF_VERSION,
5580+                                                     MDMF_VERSION))
5581+
5582+        self._version_number = verno
5583+        self._sequence_number = seqnum
5584+        self._root_hash = root_hash
5585+        self._required_shares = k
5586+        self._total_shares = n
5587+        self._segment_size = segsize
5588+        self._data_length = datalen
5589+
5590+        self._block_size = self._segment_size / self._required_shares
5591+        # We can upload empty files, and need to account for this fact
5592+        # so as to avoid zero-division and zero-modulo errors.
5593+        if datalen > 0:
5594+            tail_size = self._data_length % self._segment_size
5595+        else:
5596+            tail_size = 0
5597+        if not tail_size:
5598+            self._tail_block_size = self._block_size
5599+        else:
5600+            self._tail_block_size = mathutil.next_multiple(tail_size,
5601+                                                    self._required_shares)
5602+            self._tail_block_size /= self._required_shares
5603+
5604+        return encoding_parameters
5605+
5606+
5607+    def _process_offsets(self, offsets):
5608+        if self._version_number == 0:
5609+            read_size = OFFSETS_LENGTH
5610+            read_offset = SIGNED_PREFIX_LENGTH
5611+            end = read_size + read_offset
5612+            (signature,
5613+             share_hash_chain,
5614+             block_hash_tree,
5615+             share_data,
5616+             enc_privkey,
5617+             EOF) = struct.unpack(">LLLLQQ",
5618+                                  offsets[read_offset:end])
5619+            self._offsets = {}
5620+            self._offsets['signature'] = signature
5621+            self._offsets['share_data'] = share_data
5622+            self._offsets['block_hash_tree'] = block_hash_tree
5623+            self._offsets['share_hash_chain'] = share_hash_chain
5624+            self._offsets['enc_privkey'] = enc_privkey
5625+            self._offsets['EOF'] = EOF
5626+
5627+        elif self._version_number == 1:
5628+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5629+            read_length = MDMFOFFSETS_LENGTH
5630+            end = read_offset + read_length
5631+            (encprivkey,
5632+             blockhashes,
5633+             sharehashes,
5634+             signature,
5635+             verification_key,
5636+             eof) = struct.unpack(MDMFOFFSETS,
5637+                                  offsets[read_offset:end])
5638+            self._offsets = {}
5639+            self._offsets['enc_privkey'] = encprivkey
5640+            self._offsets['block_hash_tree'] = blockhashes
5641+            self._offsets['share_hash_chain'] = sharehashes
5642+            self._offsets['signature'] = signature
5643+            self._offsets['verification_key'] = verification_key
5644+            self._offsets['EOF'] = eof
5645+
5646+
5647+    def get_block_and_salt(self, segnum, queue=False):
5648+        """
5649+        I return (block, salt), where block is the block data and
5650+        salt is the salt used to encrypt that segment.
5651+        """
5652+        d = self._maybe_fetch_offsets_and_header()
5653+        def _then(ignored):
5654+            if self._version_number == 1:
5655+                base_share_offset = MDMFHEADERSIZE
5656+            else:
5657+                base_share_offset = self._offsets['share_data']
5658+
5659+            if segnum + 1 > self._num_segments:
5660+                raise LayoutInvalid("Not a valid segment number")
5661+
5662+            if self._version_number == 0:
5663+                share_offset = base_share_offset + self._block_size * segnum
5664+            else:
5665+                share_offset = (base_share_offset +
5666+                                (self._block_size + SALT_SIZE) * segnum)
5667+            if segnum + 1 == self._num_segments:
5668+                read_length = self._tail_block_size
5669+            else:
5670+                read_length = self._block_size
5671+
5672+            if self._version_number == 1:
5673+                # The read must also cover the 16-byte salt stored in
5674+                # front of each MDMF block.
5675+                read_length += SALT_SIZE
5676+
5677+            readvs = [(share_offset, read_length)]
5676+            return readvs
5677+        d.addCallback(_then)
5678+        d.addCallback(lambda readvs:
5679+            self._read(readvs, queue=queue))
5680+        def _process_results(results):
5681+            assert self.shnum in results
5682+            if self._version_number == 0:
5683+                # We only read the share data, but we know the salt from
5684+                # when we fetched the header
5685+                data = results[self.shnum]
5686+                if not data:
5687+                    data = ""
5688+                else:
5689+                    assert len(data) == 1
5690+                    data = data[0]
5691+                salt = self._salt
5692+            else:
5693+                data = results[self.shnum]
5694+                if not data:
5695+                    salt = data = ""
5696+                else:
5697+                    salt_and_data = results[self.shnum][0]
5698+                    salt = salt_and_data[:SALT_SIZE]
5699+                    data = salt_and_data[SALT_SIZE:]
5700+            return data, salt
5701+        d.addCallback(_process_results)
5702+        return d
5703+
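    # For illustration, with the MDMF parameters from the worked
    # example in the writer class (block_size=2, 18 bytes per salted
    # block), segment 2 is read from offset 107 + 18 * 2 = 143, and
    # the read covers 16 + 2 = 18 bytes: the salt, then the block.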
5704+
5705+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5706+        """
5707+        I return the block hash tree.
5708+
5709+        I take an optional argument, needed, which is a set of indices
5710+        that correspond to hashes that I should fetch. If this argument is
5711+        missing, I will fetch the entire block hash tree; otherwise, I
5712+        may attempt to fetch fewer hashes, based on what needed says
5713+        that I should do. Note that I may fetch as many hashes as I
5714+        want, so long as the set of hashes that I do fetch is a superset
5715+        of the ones that I am asked for, so callers should be prepared
5716+        to tolerate additional hashes.
5717+        """
5718+        # TODO: Return only the parts of the block hash tree necessary
5719+        # to validate the blocknum provided?
5720+        # This is a good idea, but it is hard to implement correctly. It
5721+        # is bad to fetch any one block hash more than once, so we
5722+        # probably just want to fetch the whole thing at once and then
5723+        # serve it.
5724+        if needed == set([]):
5725+            return defer.succeed([])
5726+        d = self._maybe_fetch_offsets_and_header()
5727+        def _then(ignored):
5728+            blockhashes_offset = self._offsets['block_hash_tree']
5729+            if self._version_number == 1:
5730+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5731+            else:
5732+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5733+            readvs = [(blockhashes_offset, blockhashes_length)]
5734+            return readvs
5735+        d.addCallback(_then)
5736+        d.addCallback(lambda readvs:
5737+            self._read(readvs, queue=queue, force_remote=force_remote))
5738+        def _build_block_hash_tree(results):
5739+            assert self.shnum in results
5740+
5741+            rawhashes = results[self.shnum][0]
5742+            results = [rawhashes[i:i+HASH_SIZE]
5743+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5744+            return results
5745+        d.addCallback(_build_block_hash_tree)
5746+        return d
5747+
5748+
5749+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5750+        """
5751+        I return the part of the share hash chain that is needed to
5752+        validate this share.
5753+
5754+        I take an optional argument, needed. Needed is a set of indices
5755+        that correspond to the hashes that I should fetch. If needed is
5756+        not present, I will fetch and return the entire share hash
5757+        chain. Otherwise, I may fetch and return any part of the share
5758+        hash chain that is a superset of the part that I am asked to
5759+        fetch. Callers should be prepared to deal with more hashes than
5760+        they've asked for.
5761+        """
5762+        if needed == set([]):
5763+            return defer.succeed([])
5764+        d = self._maybe_fetch_offsets_and_header()
5765+
5766+        def _make_readvs(ignored):
5767+            sharehashes_offset = self._offsets['share_hash_chain']
5768+            if self._version_number == 0:
5769+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5770+            else:
5771+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5772+            readvs = [(sharehashes_offset, sharehashes_length)]
5773+            return readvs
5774+        d.addCallback(_make_readvs)
5775+        d.addCallback(lambda readvs:
5776+            self._read(readvs, queue=queue, force_remote=force_remote))
5777+        def _build_share_hash_chain(results):
5778+            assert self.shnum in results
5779+
5780+            sharehashes = results[self.shnum][0]
5781+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5782+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5783+            results = dict([struct.unpack(">H32s", data)
5784+                            for data in results])
5785+            return results
5786+        d.addCallback(_build_share_hash_chain)
5787+        return d
5788+
5789+
5790+    def get_encprivkey(self, queue=False):
5791+        """
5792+        I return the encrypted private key.
5793+        """
5794+        d = self._maybe_fetch_offsets_and_header()
5795+
5796+        def _make_readvs(ignored):
5797+            privkey_offset = self._offsets['enc_privkey']
5798+            if self._version_number == 0:
5799+                privkey_length = self._offsets['EOF'] - privkey_offset
5800+            else:
5801+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5802+            readvs = [(privkey_offset, privkey_length)]
5803+            return readvs
5804+        d.addCallback(_make_readvs)
5805+        d.addCallback(lambda readvs:
5806+            self._read(readvs, queue=queue))
5807+        def _process_results(results):
5808+            assert self.shnum in results
5809+            privkey = results[self.shnum][0]
5810+            return privkey
5811+        d.addCallback(_process_results)
5812+        return d
5813+
5814+
5815+    def get_signature(self, queue=False):
5816+        """
5817+        I return the signature of my share.
5818+        """
5819+        d = self._maybe_fetch_offsets_and_header()
5820+
5821+        def _make_readvs(ignored):
5822+            signature_offset = self._offsets['signature']
5823+            if self._version_number == 1:
5824+                signature_length = self._offsets['verification_key'] - signature_offset
5825+            else:
5826+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5827+            readvs = [(signature_offset, signature_length)]
5828+            return readvs
5829+        d.addCallback(_make_readvs)
5830+        d.addCallback(lambda readvs:
5831+            self._read(readvs, queue=queue))
5832+        def _process_results(results):
5833+            assert self.shnum in results
5834+            signature = results[self.shnum][0]
5835+            return signature
5836+        d.addCallback(_process_results)
5837+        return d
5838+
5839+
5840+    def get_verification_key(self, queue=False):
5841+        """
5842+        I return the verification key.
5843+        """
5844+        d = self._maybe_fetch_offsets_and_header()
5845+
5846+        def _make_readvs(ignored):
5847+            if self._version_number == 1:
5848+                vk_offset = self._offsets['verification_key']
5849+                vk_length = self._offsets['EOF'] - vk_offset
5850+            else:
5851+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5852+                vk_length = self._offsets['signature'] - vk_offset
5853+            readvs = [(vk_offset, vk_length)]
5854+            return readvs
5855+        d.addCallback(_make_readvs)
5856+        d.addCallback(lambda readvs:
5857+            self._read(readvs, queue=queue))
5858+        def _process_results(results):
5859+            assert self.shnum in results
5860+            verification_key = results[self.shnum][0]
5861+            return verification_key
5862+        d.addCallback(_process_results)
5863+        return d
5864+
5865+
5866+    def get_encoding_parameters(self):
5867+        """
5868+        I return (k, n, segsize, datalen)
5869+        """
5870+        d = self._maybe_fetch_offsets_and_header()
5871+        d.addCallback(lambda ignored:
5872+            (self._required_shares,
5873+             self._total_shares,
5874+             self._segment_size,
5875+             self._data_length))
5876+        return d
5877+
5878+
5879+    def get_seqnum(self):
5880+        """
5881+        I return the sequence number for this share.
5882+        """
5883+        d = self._maybe_fetch_offsets_and_header()
5884+        d.addCallback(lambda ignored:
5885+            self._sequence_number)
5886+        return d
5887+
5888+
5889+    def get_root_hash(self):
5890+        """
5891+        I return the root of the share hash tree.
5892+        """
5893+        d = self._maybe_fetch_offsets_and_header()
5894+        d.addCallback(lambda ignored: self._root_hash)
5895+        return d
5896+
5897+
5898+    def get_checkstring(self):
5899+        """
5900+        I return the packed representation of the following:
5901+
5902+            - version number
5903+            - sequence number
5904+            - root hash
5905+            - salt (SDMF only)
5906+
5907+        which my users use as a checkstring to detect other writers.
5908+        """
5909+        d = self._maybe_fetch_offsets_and_header()
5910+        def _build_checkstring(ignored):
5911+            if self._salt:
5912+                checkstring = struct.pack(PREFIX,
5913+                                          self._version_number,
5914+                                          self._sequence_number,
5915+                                          self._root_hash,
5916+                                          self._salt)
5917+            else:
5918+                checkstring = struct.pack(MDMFCHECKSTRING,
5919+                                          self._version_number,
5920+                                          self._sequence_number,
5921+                                          self._root_hash)
5922+
5923+            return checkstring
5924+        d.addCallback(_build_checkstring)
5925+        return d
5926+
5927+
5928+    def get_prefix(self, force_remote):
5929+        d = self._maybe_fetch_offsets_and_header(force_remote)
5930+        d.addCallback(lambda ignored:
5931+            self._build_prefix())
5932+        return d
5933+
5934+
5935+    def _build_prefix(self):
5936+        # The prefix is another name for the part of the remote share
5937+        # that gets signed. It consists of everything up to and
5938+        # including the datalength, packed by struct.
5939+        if self._version_number == SDMF_VERSION:
5940+            return struct.pack(SIGNED_PREFIX,
5941+                           self._version_number,
5942+                           self._sequence_number,
5943+                           self._root_hash,
5944+                           self._salt,
5945+                           self._required_shares,
5946+                           self._total_shares,
5947+                           self._segment_size,
5948+                           self._data_length)
5949+
5950+        else:
5951+            return struct.pack(MDMFSIGNABLEHEADER,
5952+                           self._version_number,
5953+                           self._sequence_number,
5954+                           self._root_hash,
5955+                           self._required_shares,
5956+                           self._total_shares,
5957+                           self._segment_size,
5958+                           self._data_length)
5959+
5960+
5961+    def _get_offsets_tuple(self):
5962+        # The offsets tuple is another component of the version
5963+        # information tuple. Here it is a copy of our offsets
5964+        # dictionary, so that callers can't mutate our state through it.
5965+        return self._offsets.copy()
5966+
5967+
5968+    def get_verinfo(self):
5969+        """
5970+        I return my verinfo tuple. This is used by the ServermapUpdater
5971+        to keep track of versions of mutable files.
5972+
5973+        The verinfo tuple for MDMF files contains:
5974+            - seqnum
5975+            - root hash
5976+            - a blank (nothing)
5977+            - segsize
5978+            - datalen
5979+            - k
5980+            - n
5981+            - prefix (the thing that you sign)
5982+            - a tuple of offsets
5983+
5984+        We include the (empty) salt slot in MDMF so that version
5985+        information tuples have the same shape for both formats.
5986+
5987+        The verinfo tuple for SDMF files is the same, but carries the
5988+        file's 16-byte IV (salt) in that slot.
5989+        """
5990+        d = self._maybe_fetch_offsets_and_header()
5991+        def _build_verinfo(ignored):
5992+            if self._version_number == SDMF_VERSION:
5993+                salt_to_use = self._salt
5994+            else:
5995+                salt_to_use = None
5996+            return (self._sequence_number,
5997+                    self._root_hash,
5998+                    salt_to_use,
5999+                    self._segment_size,
6000+                    self._data_length,
6001+                    self._required_shares,
6002+                    self._total_shares,
6003+                    self._build_prefix(),
6004+                    self._get_offsets_tuple())
6005+        d.addCallback(_build_verinfo)
6006+        return d
6007+
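    # Illustrative shapes of the two tuples (field values made up):
    #
    #   MDMF: (seqnum, root_hash, None, segsize, datalen, k, n,
    #          prefix, offsets_tuple)
    #   SDMF: (seqnum, root_hash, salt, segsize, datalen, k, n,
    #          prefix, offsets_tuple)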
6008+
6009+    def flush(self):
6010+        """
6011+        I flush my queue of read vectors.
6012+        """
6013+        d = self._read(self._readvs)
6014+        def _then(results):
6015+            self._readvs = []
6016+            if isinstance(results, failure.Failure):
6017+                self._queue_errbacks.notify(results)
6018+            else:
6019+                self._queue_observers.notify(results)
6020+            self._queue_observers = observer.ObserverList()
6021+            self._queue_errbacks = observer.ObserverList()
6022+        d.addBoth(_then)
6023+
6024+
6025+    def _read(self, readvs, force_remote=False, queue=False):
6026+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
6027+        # TODO: It's entirely possible to tweak this so that it just
6028+        # fulfills the requests that it can, and not demand that all
6029+        # requests are satisfiable before running it.
6030+        if not unsatisfiable and not force_remote:
6031+            results = [self._data[offset:offset+length]
6032+                       for (offset, length) in readvs]
6033+            results = {self.shnum: results}
6034+            return defer.succeed(results)
6035+        else:
6036+            if queue:
6037+                start = len(self._readvs)
6038+                self._readvs += readvs
6039+                end = len(self._readvs)
6040+                def _get_results(results, start, end):
6041+                    if self.shnum not in results:
6042+                        return {self.shnum: [""]}
6043+                    return {self.shnum: results[self.shnum][start:end]}
6044+                d = defer.Deferred()
6045+                d.addCallback(_get_results, start, end)
6046+                self._queue_observers.subscribe(d.callback)
6047+                self._queue_errbacks.subscribe(d.errback)
6048+                return d
6049+            return self._rref.callRemote("slot_readv",
6050+                                         self._storage_index,
6051+                                         [self.shnum],
6052+                                         readvs)
6053+
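    # Illustrative use of the queueing machinery (r is assumed to be a
    # reader whose header has already been fetched): queued reads are
    # batched into a single slot_readv call when flush() runs.
    #
    #   d1 = r.get_signature(queue=True)
    #   d2 = r.get_verification_key(queue=True)
    #   r.flush()  # d1 and d2 fire when the batched read completes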
6054+
6055+    def is_sdmf(self):
6056+        """I tell my caller whether my remote file is SDMF (in which
6057+        case I return True) or MDMF (in which case I return False).
6058+        """
6058+        d = self._maybe_fetch_offsets_and_header()
6059+        d.addCallback(lambda ignored:
6060+            self._version_number == 0)
6061+        return d
6062+
6063+
6064+class LayoutInvalid(Exception):
6065+    """
6066+    This isn't a valid MDMF mutable file
6067+    """
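
# A minimal read-side sketch, mirroring the write flow above: fetch the
# signed prefix, the signature, and the verification key of one share,
# so that a caller-supplied verifier (not shown; an assumption here)
# can check them. rref, storage_index, and shnum are illustrative.
def _example_fetch_signed_parts(rref, storage_index, shnum):
    r = MDMFSlotReadProxy(rref, storage_index, shnum)
    pieces = {}
    d = r.get_prefix(False)
    def _got_prefix(prefix):
        pieces['prefix'] = prefix
        return r.get_signature()
    d.addCallback(_got_prefix)
    def _got_signature(signature):
        pieces['signature'] = signature
        return r.get_verification_key()
    d.addCallback(_got_signature)
    def _got_vk(verification_key):
        pieces['verification_key'] = verification_key
        return pieces
    d.addCallback(_got_vk)
    return d
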
6068merger 0.0 (
6069hunk ./src/allmydata/test/test_storage.py 3
6070-from allmydata.util import log
6071-
6072merger 0.0 (
6073hunk ./src/allmydata/test/test_storage.py 3
6074-import time, os.path, stat, re, simplejson, struct
6075+from allmydata.util import log
6076+
6077+import mock
6078hunk ./src/allmydata/test/test_storage.py 3
6079-import time, os.path, stat, re, simplejson, struct
6080+import time, os.path, stat, re, simplejson, struct, shutil
6081)
6082)
6083hunk ./src/allmydata/test/test_storage.py 23
6084 from allmydata.storage.expirer import LeaseCheckingCrawler
6085 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
6086      ReadBucketProxy
6087-from allmydata.interfaces import BadWriteEnablerError
6088-from allmydata.test.common import LoggingServiceParent
6089+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
6090+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
6091+                                     SIGNED_PREFIX, MDMFHEADER, \
6092+                                     MDMFOFFSETS, SDMFSlotWriteProxy
6093+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
6094+                                 SDMF_VERSION
6095+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
6096 from allmydata.test.common_web import WebRenderingMixin
6097 from allmydata.web.storage import StorageStatus, remove_prefix
6098 
6099hunk ./src/allmydata/test/test_storage.py 107
6100 
6101 class RemoteBucket:
6102 
6103+    def __init__(self):
6104+        self.read_count = 0
6105+        self.write_count = 0
6106+
6107     def callRemote(self, methname, *args, **kwargs):
6108         def _call():
6109             meth = getattr(self.target, "remote_" + methname)
6110hunk ./src/allmydata/test/test_storage.py 115
6111             return meth(*args, **kwargs)
6112+
6113+        if methname == "slot_readv":
6114+            self.read_count += 1
6115+        if "writev" in methname:
6116+            self.write_count += 1
6117+
6118         return defer.maybeDeferred(_call)
6119 
6120hunk ./src/allmydata/test/test_storage.py 123
6121+
6122 class BucketProxy(unittest.TestCase):
6123     def make_bucket(self, name, size):
6124         basedir = os.path.join("storage", "BucketProxy", name)
6125hunk ./src/allmydata/test/test_storage.py 1306
6126         self.failUnless(os.path.exists(prefixdir), prefixdir)
6127         self.failIf(os.path.exists(bucketdir), bucketdir)
6128 
6129+
6130+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
6131+    def setUp(self):
6132+        self.sparent = LoggingServiceParent()
6133+        self._lease_secret = itertools.count()
6134+        self.ss = self.create("MDMFProxies storage test server")
6135+        self.rref = RemoteBucket()
6136+        self.rref.target = self.ss
6137+        self.secrets = (self.write_enabler("we_secret"),
6138+                        self.renew_secret("renew_secret"),
6139+                        self.cancel_secret("cancel_secret"))
6140+        self.segment = "aaaaaa"
6141+        self.block = "aa"
6142+        self.salt = "a" * 16
6143+        self.block_hash = "a" * 32
6144+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
6145+        self.share_hash = self.block_hash
6146+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
6147+        self.signature = "foobarbaz"
6148+        self.verification_key = "vvvvvv"
6149+        self.encprivkey = "private"
6150+        self.root_hash = self.block_hash
6151+        self.salt_hash = self.root_hash
6152+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
6153+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
6154+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
6155+        # blockhashes and salt hashes are serialized in the same way,
6156+        # only we lop off the first element and store that in the
6157+        # header.
6158+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
6159+
6160+
6161+    def tearDown(self):
6162+        self.sparent.stopService()
6163+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
6164+
6165+
6166+    def write_enabler(self, we_tag):
6167+        return hashutil.tagged_hash("we_blah", we_tag)
6168+
6169+
6170+    def renew_secret(self, tag):
6171+        return hashutil.tagged_hash("renew_blah", str(tag))
6172+
6173+
6174+    def cancel_secret(self, tag):
6175+        return hashutil.tagged_hash("cancel_blah", str(tag))
6176+
6177+
6178+    def workdir(self, name):
6179+        basedir = os.path.join("storage", "MutableServer", name)
6180+        return basedir
6181+
6182+
6183+    def create(self, name):
6184+        workdir = self.workdir(name)
6185+        ss = StorageServer(workdir, "\x00" * 20)
6186+        ss.setServiceParent(self.sparent)
6187+        return ss
6188+
6189+
6190+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
6191+        # Start with the checkstring
6192+        data = struct.pack(">BQ32s",
6193+                           1,
6194+                           0,
6195+                           self.root_hash)
6196+        self.checkstring = data
6197+        # Next, the encoding parameters
6198+        if tail_segment:
6199+            data += struct.pack(">BBQQ",
6200+                                3,
6201+                                10,
6202+                                6,
6203+                                33)
6204+        elif empty:
6205+            data += struct.pack(">BBQQ",
6206+                                3,
6207+                                10,
6208+                                0,
6209+                                0)
6210+        else:
6211+            data += struct.pack(">BBQQ",
6212+                                3,
6213+                                10,
6214+                                6,
6215+                                36)
6216+        # Now we'll build the share data, and then the offsets.
6217+        sharedata = ""
6218+        if not tail_segment and not empty:
6219+            for i in xrange(6):
6220+                sharedata += self.salt + self.block
6221+        elif tail_segment:
6222+            for i in xrange(5):
6223+                sharedata += self.salt + self.block
6224+            sharedata += self.salt + "a"
6225+
6226+        # The encrypted private key comes after the shares + salts
6227+        offset_size = struct.calcsize(MDMFOFFSETS)
6228+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
6229+        # The blockhashes come after the private key
6230+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
6231+        # The sharehashes come after the block hashes
6232+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
6233+        # The signature comes after the share hash chain
6234+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
6235+        # The verification key comes after the signature
6236+        verification_offset = signature_offset + len(self.signature)
6237+        # The EOF comes after the verification key
6238+        eof_offset = verification_offset + len(self.verification_key)
6239+        data += struct.pack(MDMFOFFSETS,
6240+                            encrypted_private_key_offset,
6241+                            blockhashes_offset,
6242+                            sharehashes_offset,
6243+                            signature_offset,
6244+                            verification_offset,
6245+                            eof_offset)
6246+        self.offsets = {}
6247+        self.offsets['enc_privkey'] = encrypted_private_key_offset
6248+        self.offsets['block_hash_tree'] = blockhashes_offset
6249+        self.offsets['share_hash_chain'] = sharehashes_offset
6250+        self.offsets['signature'] = signature_offset
6251+        self.offsets['verification_key'] = verification_offset
6252+        self.offsets['EOF'] = eof_offset
6253+        # Next, we'll add in the salts and share data,
6254+        data += sharedata
6255+        # the private key,
6256+        data += self.encprivkey
6257+        # the block hash tree,
6258+        data += self.block_hash_tree_s
6259+        # the share hash chain,
6260+        data += self.share_hash_chain_s
6261+        # the signature,
6262+        data += self.signature
6263+        # and the verification key
6264+        data += self.verification_key
6265+        return data
6266+
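For reference, the offset arithmetic in this builder works out as follows: the checkstring plus encoding parameters occupy 59 bytes, and the six-field offset table another 48, so share data begins at byte 107. That is also the prefetch length that test_read_with_prefetched_mdmf_data slices off below. A quick check, assuming MDMFOFFSETS is six big-endian Q fields, as the pack call above implies:

    import struct
    header = struct.calcsize(">BQ32s") + struct.calcsize(">BBQQ")  # 41 + 18 = 59
    offsets = struct.calcsize(">QQQQQQ")                           # 48
    assert header + offsets == 107    # where the share data starts
    sharedata = 6 * (16 + 2)          # six (salt, block) pairs = 108
    assert header + offsets + sharedata == 215   # the enc_privkey offset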
6267+
6268+    def write_test_share_to_server(self,
6269+                                   storage_index,
6270+                                   tail_segment=False,
6271+                                   empty=False):
6272+        """
6273+        I write a test MDMF share to self.ss for the read tests to read.
6274+
6275+        If tail_segment=True, I write a share whose tail segment is
6276+        smaller than the other segments; if empty=True, an empty share.
6277+        """
6278+        write = self.ss.remote_slot_testv_and_readv_and_writev
6279+        data = self.build_test_mdmf_share(tail_segment, empty)
6280+        # Finally, we write the whole thing to the storage server in one
6281+        # pass.
6282+        testvs = [(0, 1, "eq", "")]
6283+        tws = {}
6284+        tws[0] = (testvs, [(0, data)], None)
6285+        readv = [(0, 1)]
6286+        results = write(storage_index, self.secrets, tws, readv)
6287+        self.failUnless(results[0])
6288+
6289+
6290+    def build_test_sdmf_share(self, empty=False):
6291+        if empty:
6292+            sharedata = ""
6293+        else:
6294+            sharedata = self.segment * 6
6295+        self.sharedata = sharedata
6296+        blocksize = len(sharedata) / 3
6297+        block = sharedata[:blocksize]
6298+        self.blockdata = block
6299+        prefix = struct.pack(">BQ32s16s BBQQ",
6300+                             0, # version,
6301+                             0,
6302+                             self.root_hash,
6303+                             self.salt,
6304+                             3,
6305+                             10,
6306+                             len(sharedata),
6307+                             len(sharedata),
6308+                            )
6309+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
6310+        signature_offset = post_offset + len(self.verification_key)
6311+        sharehashes_offset = signature_offset + len(self.signature)
6312+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
6313+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
6314+        encprivkey_offset = sharedata_offset + len(block)
6315+        eof_offset = encprivkey_offset + len(self.encprivkey)
6316+        offsets = struct.pack(">LLLLQQ",
6317+                              signature_offset,
6318+                              sharehashes_offset,
6319+                              blockhashes_offset,
6320+                              sharedata_offset,
6321+                              encprivkey_offset,
6322+                              eof_offset)
6323+        final_share = "".join([prefix,
6324+                           offsets,
6325+                           self.verification_key,
6326+                           self.signature,
6327+                           self.share_hash_chain_s,
6328+                           self.block_hash_tree_s,
6329+                           block,
6330+                           self.encprivkey])
6331+        self.offsets = {}
6332+        self.offsets['signature'] = signature_offset
6333+        self.offsets['share_hash_chain'] = sharehashes_offset
6334+        self.offsets['block_hash_tree'] = blockhashes_offset
6335+        self.offsets['share_data'] = sharedata_offset
6336+        self.offsets['enc_privkey'] = encprivkey_offset
6337+        self.offsets['EOF'] = eof_offset
6338+        return final_share
6339+
6340+
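The SDMF header built here is, coincidentally, also 107 bytes: a 75-byte signed prefix plus a 32-byte offset table. This is why the prefetch tests below can use the same 107-byte slice for both formats. A quick check, using the format strings above:

    import struct
    prefix = struct.calcsize(">BQ32s16sBBQQ")   # 75-byte signed prefix
    offsets = struct.calcsize(">LLLLQQ")        # 32-byte offset table
    assert prefix + offsets == 107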
6341+    def write_sdmf_share_to_server(self,
6342+                                   storage_index,
6343+                                   empty=False):
6344+        # Some tests need SDMF shares to verify that we can still read
6345+        # them. This method writes one: filler data in a valid SDMF layout.
6346+        assert self.rref
6347+        write = self.ss.remote_slot_testv_and_readv_and_writev
6348+        share = self.build_test_sdmf_share(empty)
6349+        testvs = [(0, 1, "eq", "")]
6350+        tws = {}
6351+        tws[0] = (testvs, [(0, share)], None)
6352+        readv = []
6353+        results = write(storage_index, self.secrets, tws, readv)
6354+        self.failUnless(results[0])
6355+
6356+
6357+    def test_read(self):
6358+        self.write_test_share_to_server("si1")
6359+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6360+        # Check that every accessor returns what we expect it to.
6361+        d = defer.succeed(None)
6362+        def _check_block_and_salt((block, salt)):
6363+            self.failUnlessEqual(block, self.block)
6364+            self.failUnlessEqual(salt, self.salt)
6365+
6366+        for i in xrange(6):
6367+            d.addCallback(lambda ignored, i=i:
6368+                mr.get_block_and_salt(i))
6369+            d.addCallback(_check_block_and_salt)
6370+
6371+        d.addCallback(lambda ignored:
6372+            mr.get_encprivkey())
6373+        d.addCallback(lambda encprivkey:
6374+            self.failUnlessEqual(self.encprivkey, encprivkey))
6375+
6376+        d.addCallback(lambda ignored:
6377+            mr.get_blockhashes())
6378+        d.addCallback(lambda blockhashes:
6379+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6380+
6381+        d.addCallback(lambda ignored:
6382+            mr.get_sharehashes())
6383+        d.addCallback(lambda sharehashes:
6384+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6385+
6386+        d.addCallback(lambda ignored:
6387+            mr.get_signature())
6388+        d.addCallback(lambda signature:
6389+            self.failUnlessEqual(signature, self.signature))
6390+
6391+        d.addCallback(lambda ignored:
6392+            mr.get_verification_key())
6393+        d.addCallback(lambda verification_key:
6394+            self.failUnlessEqual(verification_key, self.verification_key))
6395+
6396+        d.addCallback(lambda ignored:
6397+            mr.get_seqnum())
6398+        d.addCallback(lambda seqnum:
6399+            self.failUnlessEqual(seqnum, 0))
6400+
6401+        d.addCallback(lambda ignored:
6402+            mr.get_root_hash())
6403+        d.addCallback(lambda root_hash:
6404+            self.failUnlessEqual(self.root_hash, root_hash))
6405+
6406+        d.addCallback(lambda ignored:
6407+            mr.get_seqnum())
6408+        d.addCallback(lambda seqnum:
6409+            self.failUnlessEqual(0, seqnum))
6410+
6411+        d.addCallback(lambda ignored:
6412+            mr.get_encoding_parameters())
6413+        def _check_encoding_parameters((k, n, segsize, datalen)):
6414+            self.failUnlessEqual(k, 3)
6415+            self.failUnlessEqual(n, 10)
6416+            self.failUnlessEqual(segsize, 6)
6417+            self.failUnlessEqual(datalen, 36)
6418+        d.addCallback(_check_encoding_parameters)
6419+
6420+        d.addCallback(lambda ignored:
6421+            mr.get_checkstring())
6422+        d.addCallback(lambda checkstring:
6423+            self.failUnlessEqual(checkstring, self.checkstring))
6424+        return d
6425+
6426+
6427+    def test_read_with_different_tail_segment_size(self):
6428+        self.write_test_share_to_server("si1", tail_segment=True)
6429+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6430+        d = mr.get_block_and_salt(5)
6431+        def _check_tail_segment(results):
6432+            block, salt = results
6433+            self.failUnlessEqual(len(block), 1)
6434+            self.failUnlessEqual(block, "a")
6435+        d.addCallback(_check_tail_segment)
6436+        return d
6437+
6438+
6439+    def test_get_block_with_invalid_segnum(self):
6440+        self.write_test_share_to_server("si1")
6441+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6442+        d = defer.succeed(None)
6443+        d.addCallback(lambda ignored:
6444+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6445+                            None,
6446+                            mr.get_block_and_salt, 7))
6447+        return d
6448+
6449+
6450+    def test_get_encoding_parameters_first(self):
6451+        self.write_test_share_to_server("si1")
6452+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6453+        d = mr.get_encoding_parameters()
6454+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6455+            self.failUnlessEqual(k, 3)
6456+            self.failUnlessEqual(n, 10)
6457+            self.failUnlessEqual(segment_size, 6)
6458+            self.failUnlessEqual(datalen, 36)
6459+        d.addCallback(_check_encoding_parameters)
6460+        return d
6461+
6462+
6463+    def test_get_seqnum_first(self):
6464+        self.write_test_share_to_server("si1")
6465+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6466+        d = mr.get_seqnum()
6467+        d.addCallback(lambda seqnum:
6468+            self.failUnlessEqual(seqnum, 0))
6469+        return d
6470+
6471+
6472+    def test_get_root_hash_first(self):
6473+        self.write_test_share_to_server("si1")
6474+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6475+        d = mr.get_root_hash()
6476+        d.addCallback(lambda root_hash:
6477+            self.failUnlessEqual(root_hash, self.root_hash))
6478+        return d
6479+
6480+
6481+    def test_get_checkstring_first(self):
6482+        self.write_test_share_to_server("si1")
6483+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6484+        d = mr.get_checkstring()
6485+        d.addCallback(lambda checkstring:
6486+            self.failUnlessEqual(checkstring, self.checkstring))
6487+        return d
6488+
6489+
6490+    def test_write_read_vectors(self):
6491+        # When we write, the storage server returns a read vector along
6492+        # with the result of the write. If a write fails because the
6493+        # test vectors failed, this read vector can help us to diagnose
6494+        # the problem. This test ensures that the read vector is
6495+        # working appropriately.
6496+        mw = self._make_new_mw("si1", 0)
6497+
6498+        for i in xrange(6):
6499+            mw.put_block(self.block, i, self.salt)
6500+        mw.put_encprivkey(self.encprivkey)
6501+        mw.put_blockhashes(self.block_hash_tree)
6502+        mw.put_sharehashes(self.share_hash_chain)
6503+        mw.put_root_hash(self.root_hash)
6504+        mw.put_signature(self.signature)
6505+        mw.put_verification_key(self.verification_key)
6506+        d = mw.finish_publishing()
6507+        def _then(results):
6508+            self.failUnlessEqual(len(results), 2)
6509+            result, readv = results
6510+            self.failUnless(result)
6511+            self.failIf(readv)
6512+            self.old_checkstring = mw.get_checkstring()
6513+            mw.set_checkstring("")
6514+        d.addCallback(_then)
6515+        d.addCallback(lambda ignored:
6516+            mw.finish_publishing())
6517+        def _then_again(results):
6518+            self.failUnlessEqual(len(results), 2)
6519+            result, readvs = results
6520+            self.failIf(result)
6521+            self.failUnlessIn(0, readvs)
6522+            readv = readvs[0][0]
6523+            self.failUnlessEqual(readv, self.old_checkstring)
6524+        d.addCallback(_then_again)
6525+        # The checkstring remains the same for the rest of the process.
6526+        return d
6527+
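The test-vector mechanics used by both share-writing helpers follow the slot_testv_and_readv_and_writev contract: each share number maps to (test vectors, write vectors, new_length), and the server applies the writes only if every test vector matches. A minimal sketch, assuming `ss` is a StorageServer and `secrets` is the (write-enabler, renew, cancel) tuple from setUp:

    write = ss.remote_slot_testv_and_readv_and_writev
    testv = (0, 1, "eq", "")        # "1 byte at offset 0 equals empty",
                                    # i.e. the share must not exist yet
    tw_vectors = {0: ([testv], [(0, "some data")], None)}  # None: no truncate
    ok, readvs = write("si-example", secrets, tw_vectors, [(0, 9)])
    # ok is True iff every test vector matched; readvs maps extant share
    # numbers to the data read *before* any writes were applied.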
6528+
6529+    def test_blockhashes_after_share_hash_chain(self):
6530+        mw = self._make_new_mw("si1", 0)
6531+        d = defer.succeed(None)
6532+        # Put everything up to and including the share hash chain
6533+        for i in xrange(6):
6534+            d.addCallback(lambda ignored, i=i:
6535+                mw.put_block(self.block, i, self.salt))
6536+        d.addCallback(lambda ignored:
6537+            mw.put_encprivkey(self.encprivkey))
6538+        d.addCallback(lambda ignored:
6539+            mw.put_blockhashes(self.block_hash_tree))
6540+        d.addCallback(lambda ignored:
6541+            mw.put_sharehashes(self.share_hash_chain))
6542+
6543+        # Now try to put the block hash tree again.
6544+        d.addCallback(lambda ignored:
6545+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
6546+                            None,
6547+                            mw.put_blockhashes, self.block_hash_tree))
6548+        return d
6549+
6550+
6551+    def test_encprivkey_after_blockhashes(self):
6552+        mw = self._make_new_mw("si1", 0)
6553+        d = defer.succeed(None)
6554+        # Put everything up to and including the block hash tree
6555+        for i in xrange(6):
6556+            d.addCallback(lambda ignored, i=i:
6557+                mw.put_block(self.block, i, self.salt))
6558+        d.addCallback(lambda ignored:
6559+            mw.put_encprivkey(self.encprivkey))
6560+        d.addCallback(lambda ignored:
6561+            mw.put_blockhashes(self.block_hash_tree))
6562+        d.addCallback(lambda ignored:
6563+            self.shouldFail(LayoutInvalid, "out of order private key",
6564+                            None,
6565+                            mw.put_encprivkey, self.encprivkey))
6566+        return d
6567+
6568+
6569+    def test_share_hash_chain_after_signature(self):
6570+        mw = self._make_new_mw("si1", 0)
6571+        d = defer.succeed(None)
6572+        # Put everything up to and including the signature
6573+        for i in xrange(6):
6574+            d.addCallback(lambda ignored, i=i:
6575+                mw.put_block(self.block, i, self.salt))
6576+        d.addCallback(lambda ignored:
6577+            mw.put_encprivkey(self.encprivkey))
6578+        d.addCallback(lambda ignored:
6579+            mw.put_blockhashes(self.block_hash_tree))
6580+        d.addCallback(lambda ignored:
6581+            mw.put_sharehashes(self.share_hash_chain))
6582+        d.addCallback(lambda ignored:
6583+            mw.put_root_hash(self.root_hash))
6584+        d.addCallback(lambda ignored:
6585+            mw.put_signature(self.signature))
6586+        # Now try to put the share hash chain again. This should fail
6587+        d.addCallback(lambda ignored:
6588+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6589+                            None,
6590+                            mw.put_sharehashes, self.share_hash_chain))
6591+        return d
6592+
6593+
6594+    def test_signature_after_verification_key(self):
6595+        mw = self._make_new_mw("si1", 0)
6596+        d = defer.succeed(None)
6597+        # Put everything up to and including the verification key.
6598+        for i in xrange(6):
6599+            d.addCallback(lambda ignored, i=i:
6600+                mw.put_block(self.block, i, self.salt))
6601+        d.addCallback(lambda ignored:
6602+            mw.put_encprivkey(self.encprivkey))
6603+        d.addCallback(lambda ignored:
6604+            mw.put_blockhashes(self.block_hash_tree))
6605+        d.addCallback(lambda ignored:
6606+            mw.put_sharehashes(self.share_hash_chain))
6607+        d.addCallback(lambda ignored:
6608+            mw.put_root_hash(self.root_hash))
6609+        d.addCallback(lambda ignored:
6610+            mw.put_signature(self.signature))
6611+        d.addCallback(lambda ignored:
6612+            mw.put_verification_key(self.verification_key))
6613+        # Now try to put the signature again. This should fail
6614+        d.addCallback(lambda ignored:
6615+            self.shouldFail(LayoutInvalid, "signature after verification",
6616+                            None,
6617+                            mw.put_signature, self.signature))
6618+        return d
6619+
6620+
6621+    def test_uncoordinated_write(self):
6622+        # Make two mutable writers, both pointing to the same storage
6623+        # server, both at the same storage index, and try writing to the
6624+        # same share.
6625+        mw1 = self._make_new_mw("si1", 0)
6626+        mw2 = self._make_new_mw("si1", 0)
6627+
6628+        def _check_success(results):
6629+            result, readvs = results
6630+            self.failUnless(result)
6631+
6632+        def _check_failure(results):
6633+            result, readvs = results
6634+            self.failIf(result)
6635+
6636+        def _write_share(mw):
6637+            for i in xrange(6):
6638+                mw.put_block(self.block, i, self.salt)
6639+            mw.put_encprivkey(self.encprivkey)
6640+            mw.put_blockhashes(self.block_hash_tree)
6641+            mw.put_sharehashes(self.share_hash_chain)
6642+            mw.put_root_hash(self.root_hash)
6643+            mw.put_signature(self.signature)
6644+            mw.put_verification_key(self.verification_key)
6645+            return mw.finish_publishing()
6646+        d = _write_share(mw1)
6647+        d.addCallback(_check_success)
6648+        d.addCallback(lambda ignored:
6649+            _write_share(mw2))
6650+        d.addCallback(_check_failure)
6651+        return d
6652+
6653+
6654+    def test_invalid_salt_size(self):
6655+        # Salts need to be 16 bytes in size. Writes that attempt to
6656+        # write more or less than this should be rejected.
6657+        mw = self._make_new_mw("si1", 0)
6658+        invalid_salt = "a" * 17 # 17 bytes
6659+        another_invalid_salt = "b" * 15 # 15 bytes
6660+        d = defer.succeed(None)
6661+        d.addCallback(lambda ignored:
6662+            self.shouldFail(LayoutInvalid, "salt too big",
6663+                            None,
6664+                            mw.put_block, self.block, 0, invalid_salt))
6665+        d.addCallback(lambda ignored:
6666+            self.shouldFail(LayoutInvalid, "salt too small",
6667+                            None,
6668+                            mw.put_block, self.block, 0,
6669+                            another_invalid_salt))
6670+        return d
6671+
6672+
6673+    def test_write_test_vectors(self):
6674+        # If we give the write proxy a bogus test vector at
6675+        # any point during the process, it should fail to write when we
6676+        # tell it to write.
6677+        def _check_failure(results):
6678+            self.failUnlessEqual(len(results), 2)
6679+            res, readv = results
6680+            self.failIf(res)
6681+
6682+        def _check_success(results):
6683+            self.failUnlessEqual(len(results), 2)
6684+            res, readv = results
6685+            self.failUnless(res)
6686+
6687+        mw = self._make_new_mw("si1", 0)
6688+        mw.set_checkstring("this is a lie")
6689+        for i in xrange(6):
6690+            mw.put_block(self.block, i, self.salt)
6691+        mw.put_encprivkey(self.encprivkey)
6692+        mw.put_blockhashes(self.block_hash_tree)
6693+        mw.put_sharehashes(self.share_hash_chain)
6694+        mw.put_root_hash(self.root_hash)
6695+        mw.put_signature(self.signature)
6696+        mw.put_verification_key(self.verification_key)
6697+        d = mw.finish_publishing()
6698+        d.addCallback(_check_failure)
6699+        d.addCallback(lambda ignored:
6700+            mw.set_checkstring(""))
6701+        d.addCallback(lambda ignored:
6702+            mw.finish_publishing())
6703+        d.addCallback(_check_success)
6704+        return d
6705+
6706+
6707+    def serialize_blockhashes(self, blockhashes):
6708+        return "".join(blockhashes)
6709+
6710+
6711+    def serialize_sharehashes(self, sharehashes):
6712+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6713+                        for i in sorted(sharehashes.keys())])
6714+        return ret
6715+
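Each share-hash-chain entry is packed as ">H32s": a 2-byte node index followed by a 32-byte hash, 34 bytes per entry, so the six entries used in these tests serialize to (32 + 2) * 6 = 204 bytes (the read size checked in test_write below). A round-trip sketch of the encoding:

    import struct
    chain = dict((i, chr(65 + i) * 32) for i in range(6))
    blob = "".join(struct.pack(">H32s", i, chain[i]) for i in sorted(chain))
    assert len(blob) == (2 + 32) * 6
    decoded = {}
    for off in range(0, len(blob), 34):
        idx, h = struct.unpack(">H32s", blob[off:off + 34])
        decoded[idx] = h
    assert decoded == chain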
6716+
6717+    def test_write(self):
6718+        # This translates to a file with six 6-byte segments, each of
6719+        # which is split into 2-byte blocks.
6720+        mw = self._make_new_mw("si1", 0)
6721+        # Test writing some blocks.
6722+        read = self.ss.remote_slot_readv
6723+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6724+        written_block_size = 2 + len(self.salt)
6725+        written_block = self.block + self.salt
6726+        for i in xrange(6):
6727+            mw.put_block(self.block, i, self.salt)
6728+
6729+        mw.put_encprivkey(self.encprivkey)
6730+        mw.put_blockhashes(self.block_hash_tree)
6731+        mw.put_sharehashes(self.share_hash_chain)
6732+        mw.put_root_hash(self.root_hash)
6733+        mw.put_signature(self.signature)
6734+        mw.put_verification_key(self.verification_key)
6735+        d = mw.finish_publishing()
6736+        def _check_publish(results):
6737+            self.failUnlessEqual(len(results), 2)
6738+            result, ign = results
6739+            self.failUnless(result, "publish failed")
6740+            for i in xrange(6):
6741+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6742+                                {0: [written_block]})
6743+
6744+            expected_private_key_offset = expected_sharedata_offset + \
6745+                                      len(written_block) * 6
6746+            self.failUnlessEqual(len(self.encprivkey), 7)
6747+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6748+                                 {0: [self.encprivkey]})
6749+
6750+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6751+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6752+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6753+                                 {0: [self.block_hash_tree_s]})
6754+
6755+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6756+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6757+                                 {0: [self.share_hash_chain_s]})
6758+
6759+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6760+                                 {0: [self.root_hash]})
6761+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6762+            self.failUnlessEqual(len(self.signature), 9)
6763+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6764+                                 {0: [self.signature]})
6765+
6766+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
6767+            self.failUnlessEqual(len(self.verification_key), 6)
6768+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6769+                                 {0: [self.verification_key]})
6770+
6771+            signable = mw.get_signable()
6772+            verno, seq, roothash, k, n, segsize, datalen = \
6773+                                            struct.unpack(">BQ32sBBQQ",
6774+                                                          signable)
6775+            self.failUnlessEqual(verno, 1)
6776+            self.failUnlessEqual(seq, 0)
6777+            self.failUnlessEqual(roothash, self.root_hash)
6778+            self.failUnlessEqual(k, 3)
6779+            self.failUnlessEqual(n, 10)
6780+            self.failUnlessEqual(segsize, 6)
6781+            self.failUnlessEqual(datalen, 36)
6782+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6783+
6784+            # Check the version number to make sure that it is correct.
6785+            expected_version_number = struct.pack(">B", 1)
6786+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6787+                                 {0: [expected_version_number]})
6788+            # Check the sequence number to make sure that it is correct
6789+            expected_sequence_number = struct.pack(">Q", 0)
6790+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6791+                                 {0: [expected_sequence_number]})
6792+            # Check that the encoding parameters (k, N, segment size, data
6793+            # length) are what they should be. These are 3, 10, 6, 36.
6794+            expected_k = struct.pack(">B", 3)
6795+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6796+                                 {0: [expected_k]})
6797+            expected_n = struct.pack(">B", 10)
6798+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6799+                                 {0: [expected_n]})
6800+            expected_segment_size = struct.pack(">Q", 6)
6801+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6802+                                 {0: [expected_segment_size]})
6803+            expected_data_length = struct.pack(">Q", 36)
6804+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6805+                                 {0: [expected_data_length]})
6806+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6807+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6808+                                 {0: [expected_offset]})
6809+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6810+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6811+                                 {0: [expected_offset]})
6812+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6813+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6814+                                 {0: [expected_offset]})
6815+            expected_offset = struct.pack(">Q", expected_signature_offset)
6816+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6817+                                 {0: [expected_offset]})
6818+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6819+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6820+                                 {0: [expected_offset]})
6821+            expected_offset = struct.pack(">Q", expected_eof_offset)
6822+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6823+                                 {0: [expected_offset]})
6824+        d.addCallback(_check_publish)
6825+        return d
6826+
6827+    def _make_new_mw(self, si, share, datalength=36):
6828+        # This is a file of size 36 bytes. Since it has a segment
6829+        # size of 6, we know that it has 6 byte segments, which will
6830+        # be split into blocks of 2 bytes because our FEC k
6831+        # parameter is 3.
6832+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6833+                                6, datalength)
6834+        return mw
6835+
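The arithmetic behind these writer parameters, spelled out (plain arithmetic, not part of the patch): with k = 3 and segment size 6, each block is 6 / 3 = 2 bytes, and a 36-byte file has 36 / 6 = 6 segments.

    import math
    segsize, k, datalength = 6, 3, 36
    blocksize = segsize / k                                     # 2-byte blocks
    num_segments = int(math.ceil(datalength / float(segsize)))  # 6 segments
    assert (blocksize, num_segments) == (2, 6)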
6836+
6837+    def test_write_rejected_with_too_many_blocks(self):
6838+        mw = self._make_new_mw("si0", 0)
6839+
6840+        # Try writing too many blocks. We should not be able to write
6841+        # more than 6 blocks into each share.
6843+        d = defer.succeed(None)
6844+        for i in xrange(6):
6845+            d.addCallback(lambda ignored, i=i:
6846+                mw.put_block(self.block, i, self.salt))
6847+        d.addCallback(lambda ignored:
6848+            self.shouldFail(LayoutInvalid, "too many blocks",
6849+                            None,
6850+                            mw.put_block, self.block, 7, self.salt))
6851+        return d
6852+
6853+
6854+    def test_write_rejected_with_invalid_salt(self):
6855+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6856+        # less should cause an error.
6857+        mw = self._make_new_mw("si1", 0)
6858+        bad_salt = "a" * 17 # 17 bytes
6859+        d = defer.succeed(None)
6860+        d.addCallback(lambda ignored:
6861+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6862+                            None, mw.put_block, self.block, 7, bad_salt))
6863+        return d
6864+
6865+
6866+    def test_write_rejected_with_invalid_root_hash(self):
6867+        # Try writing an invalid root hash. This should be SHA256d, and
6868+        # 32 bytes long as a result.
6869+        mw = self._make_new_mw("si2", 0)
6870+        # 17 bytes != 32 bytes
6871+        invalid_root_hash = "a" * 17
6872+        d = defer.succeed(None)
6873+        # Before this test can work, we need to put some blocks + salts,
6874+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6875+        # failures that match what we are looking for, but are caused by
6876+        # the constraints imposed on operation ordering.
6877+        for i in xrange(6):
6878+            d.addCallback(lambda ignored, i=i:
6879+                mw.put_block(self.block, i, self.salt))
6880+        d.addCallback(lambda ignored:
6881+            mw.put_encprivkey(self.encprivkey))
6882+        d.addCallback(lambda ignored:
6883+            mw.put_blockhashes(self.block_hash_tree))
6884+        d.addCallback(lambda ignored:
6885+            mw.put_sharehashes(self.share_hash_chain))
6886+        d.addCallback(lambda ignored:
6887+            self.shouldFail(LayoutInvalid, "invalid root hash",
6888+                            None, mw.put_root_hash, invalid_root_hash))
6889+        return d
6890+
6891+
6892+    def test_write_rejected_with_invalid_blocksize(self):
6893+        # The blocksize implied by the writer that we get from
6894+        # _make_new_mw is 2 bytes -- any more or any less than this
6895+        # should be cause for failure, unless it is the tail segment,
6896+        # in which case it need not fail.
6897+        invalid_block = "a"
6898+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6899+                                             # one-byte blocks
6900+        # 1 byte != 2 bytes
6901+        d = defer.succeed(None)
6902+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6903+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6904+                            None, mw.put_block, invalid_block, 0,
6905+                            self.salt))
6906+        invalid_block = invalid_block * 3
6907+        # 3 bytes != 2 bytes
6908+        d.addCallback(lambda ignored:
6909+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6910+                            None,
6911+                            mw.put_block, invalid_block, 0, self.salt))
6912+        for i in xrange(5):
6913+            d.addCallback(lambda ignored, i=i:
6914+                mw.put_block(self.block, i, self.salt))
6915+        # Try to put an invalid tail segment
6916+        d.addCallback(lambda ignored:
6917+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6918+                            None,
6919+                            mw.put_block, self.block, 5, self.salt))
6920+        valid_block = "a"
6921+        d.addCallback(lambda ignored:
6922+            mw.put_block(valid_block, 5, self.salt))
6923+        return d
6924+
6925+
6926+    def test_write_enforces_order_constraints(self):
6927+        # We require that the MDMFSlotWriteProxy be interacted with in a
6928+        # specific way.
6929+        # That way is:
6930+        # 0: __init__
6931+        # 1: write blocks and salts
6932+        # 2: Write the encrypted private key
6933+        # 3: Write the block hashes
6934+        # 4: Write the share hashes
6935+        # 5: Write the root hash and salt hash
6936+        # 6: Write the signature and verification key
6937+        # 7: Write the file.
6938+        #
6939+        # Some of these can be performed out-of-order, and some can't.
6940+        # The dependencies that I want to test here are:
6941+        #  - Private key before block hashes
6942+        #  - share hashes and block hashes before root hash
6943+        #  - root hash before signature
6944+        #  - signature before verification key
6945+        mw0 = self._make_new_mw("si0", 0)
6946+        # Write some shares
6947+        d = defer.succeed(None)
6948+        for i in xrange(6):
6949+            d.addCallback(lambda ignored, i=i:
6950+                mw0.put_block(self.block, i, self.salt))
6951+        # Try to write the block hashes before writing the encrypted
6952+        # private key
6953+        d.addCallback(lambda ignored:
6954+            self.shouldFail(LayoutInvalid, "block hashes before key",
6955+                            None, mw0.put_blockhashes,
6956+                            self.block_hash_tree))
6957+
6958+        # Write the private key.
6959+        d.addCallback(lambda ignored:
6960+            mw0.put_encprivkey(self.encprivkey))
6961+
6962+
6963+        # Try to write the share hash chain without writing the block
6964+        # hash tree
6965+        d.addCallback(lambda ignored:
6966+            self.shouldFail(LayoutInvalid, "share hash chain before "
6967+                                           "block hash tree",
6968+                            None,
6969+                            mw0.put_sharehashes, self.share_hash_chain))
6970+
6971+        # Try to write the root hash without writing either the
6972+        # block hashes or the share hashes
6973+        d.addCallback(lambda ignored:
6974+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6975+                            None,
6976+                            mw0.put_root_hash, self.root_hash))
6977+
6978+        # Now write the block hashes and try again
6979+        d.addCallback(lambda ignored:
6980+            mw0.put_blockhashes(self.block_hash_tree))
6981+
6982+        d.addCallback(lambda ignored:
6983+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6984+                            None, mw0.put_root_hash, self.root_hash))
6985+
6986+        # We haven't yet put the root hash on the share, so we shouldn't
6987+        # be able to sign it.
6988+        d.addCallback(lambda ignored:
6989+            self.shouldFail(LayoutInvalid, "signature before root hash",
6990+                            None, mw0.put_signature, self.signature))
6991+
6992+        d.addCallback(lambda ignored:
6993+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6994+
6995+        # ...and, since that fails, we also shouldn't be able to put the
6996+        # verification key.
6997+        d.addCallback(lambda ignored:
6998+            self.shouldFail(LayoutInvalid, "key before signature",
6999+                            None, mw0.put_verification_key,
7000+                            self.verification_key))
7001+
7002+        # Now write the share hashes.
7003+        d.addCallback(lambda ignored:
7004+            mw0.put_sharehashes(self.share_hash_chain))
7005+        # We should be able to write the root hash now too
7006+        d.addCallback(lambda ignored:
7007+            mw0.put_root_hash(self.root_hash))
7008+
7009+        # We should still be unable to put the verification key
7010+        d.addCallback(lambda ignored:
7011+            self.shouldFail(LayoutInvalid, "key before signature",
7012+                            None, mw0.put_verification_key,
7013+                            self.verification_key))
7014+
7015+        d.addCallback(lambda ignored:
7016+            mw0.put_signature(self.signature))
7017+
7018+        # We shouldn't be able to write the offsets to the remote server
7019+        # until the offset table is finished; IOW, until we have written
7020+        # the verification key.
7021+        d.addCallback(lambda ignored:
7022+            self.shouldFail(LayoutInvalid, "offsets before verification key",
7023+                            None,
7024+                            mw0.finish_publishing))
7025+
7026+        d.addCallback(lambda ignored:
7027+            mw0.put_verification_key(self.verification_key))
7028+        return d
7029+
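The constraints this test pins down amount to a monotonic sequence of steps, with put_block repeatable at the front. A hypothetical sketch of that state machine, as an illustration of the rule rather than the MDMFSlotWriteProxy implementation:

    ORDER = ["blocks", "encprivkey", "blockhashes", "sharehashes",
             "root_hash", "signature", "verification_key", "publish"]

    class OrderedWriter(object):
        def __init__(self):
            self._next = 0   # index of the next step we expect

        def record(self, step):
            i = ORDER.index(step)
            repeatable_blocks = (i == 0 and self._next <= 1)
            if i != self._next and not repeatable_blocks:
                # the moral equivalent of LayoutInvalid
                raise ValueError("got %s, expected %s"
                                 % (step, ORDER[self._next]))
            self._next = max(self._next, i + 1)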
7030+
7031+    def test_end_to_end(self):
7032+        mw = self._make_new_mw("si1", 0)
7033+        # Write a share using the mutable writer, and make sure that the
7034+        # reader knows how to read everything back to us.
7035+        d = defer.succeed(None)
7036+        for i in xrange(6):
7037+            d.addCallback(lambda ignored, i=i:
7038+                mw.put_block(self.block, i, self.salt))
7039+        d.addCallback(lambda ignored:
7040+            mw.put_encprivkey(self.encprivkey))
7041+        d.addCallback(lambda ignored:
7042+            mw.put_blockhashes(self.block_hash_tree))
7043+        d.addCallback(lambda ignored:
7044+            mw.put_sharehashes(self.share_hash_chain))
7045+        d.addCallback(lambda ignored:
7046+            mw.put_root_hash(self.root_hash))
7047+        d.addCallback(lambda ignored:
7048+            mw.put_signature(self.signature))
7049+        d.addCallback(lambda ignored:
7050+            mw.put_verification_key(self.verification_key))
7051+        d.addCallback(lambda ignored:
7052+            mw.finish_publishing())
7053+
7054+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7055+        def _check_block_and_salt((block, salt)):
7056+            self.failUnlessEqual(block, self.block)
7057+            self.failUnlessEqual(salt, self.salt)
7058+
7059+        for i in xrange(6):
7060+            d.addCallback(lambda ignored, i=i:
7061+                mr.get_block_and_salt(i))
7062+            d.addCallback(_check_block_and_salt)
7063+
7064+        d.addCallback(lambda ignored:
7065+            mr.get_encprivkey())
7066+        d.addCallback(lambda encprivkey:
7067+            self.failUnlessEqual(self.encprivkey, encprivkey))
7068+
7069+        d.addCallback(lambda ignored:
7070+            mr.get_blockhashes())
7071+        d.addCallback(lambda blockhashes:
7072+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
7073+
7074+        d.addCallback(lambda ignored:
7075+            mr.get_sharehashes())
7076+        d.addCallback(lambda sharehashes:
7077+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
7078+
7079+        d.addCallback(lambda ignored:
7080+            mr.get_signature())
7081+        d.addCallback(lambda signature:
7082+            self.failUnlessEqual(signature, self.signature))
7083+
7084+        d.addCallback(lambda ignored:
7085+            mr.get_verification_key())
7086+        d.addCallback(lambda verification_key:
7087+            self.failUnlessEqual(verification_key, self.verification_key))
7088+
7089+        d.addCallback(lambda ignored:
7090+            mr.get_seqnum())
7091+        d.addCallback(lambda seqnum:
7092+            self.failUnlessEqual(seqnum, 0))
7093+
7094+        d.addCallback(lambda ignored:
7095+            mr.get_root_hash())
7096+        d.addCallback(lambda root_hash:
7097+            self.failUnlessEqual(self.root_hash, root_hash))
7098+
7099+        d.addCallback(lambda ignored:
7100+            mr.get_encoding_parameters())
7101+        def _check_encoding_parameters((k, n, segsize, datalen)):
7102+            self.failUnlessEqual(k, 3)
7103+            self.failUnlessEqual(n, 10)
7104+            self.failUnlessEqual(segsize, 6)
7105+            self.failUnlessEqual(datalen, 36)
7106+        d.addCallback(_check_encoding_parameters)
7107+
7108+        d.addCallback(lambda ignored:
7109+            mr.get_checkstring())
7110+        d.addCallback(lambda checkstring:
7111+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
7112+        return d
7113+
7114+
7115+    def test_is_sdmf(self):
7116+        # The MDMFSlotReadProxy should also know how to read SDMF files,
7117+        # since it will encounter them on the grid. Callers use the
7118+        # is_sdmf method to test this.
7119+        self.write_sdmf_share_to_server("si1")
7120+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7121+        d = mr.is_sdmf()
7122+        d.addCallback(lambda issdmf:
7123+            self.failUnless(issdmf))
7124+        return d
7125+
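A sketch of the distinction being tested (an illustration, not the proxy's actual code path): the first byte of a share is its version number, 0 for SDMF and 1 for MDMF, as the first pack calls in build_test_sdmf_share and build_test_mdmf_share show.

    import struct
    def looks_like_sdmf(share_data):
        (version,) = struct.unpack(">B", share_data[:1])
        return version == 0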
7126+
7127+    def test_reads_sdmf(self):
7128+        # The slot read proxy should, naturally, know how to tell us
7129+        # about data in the SDMF format
7130+        self.write_sdmf_share_to_server("si1")
7131+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7132+        d = defer.succeed(None)
7133+        d.addCallback(lambda ignored:
7134+            mr.is_sdmf())
7135+        d.addCallback(lambda issdmf:
7136+            self.failUnless(issdmf))
7137+
7138+        # What do we need to read?
7139+        #  - The sharedata
7140+        #  - The salt
7141+        d.addCallback(lambda ignored:
7142+            mr.get_block_and_salt(0))
7143+        def _check_block_and_salt(results):
7144+            block, salt = results
7145+            # Our original file is 36 bytes long, so each share is 12
7146+            # bytes in size. The share is composed entirely of the
7147+            # letter a. self.block contains two a's, so 6 * self.block
7148+            # is what we are looking for.
7149+            self.failUnlessEqual(block, self.block * 6)
7150+            self.failUnlessEqual(salt, self.salt)
7151+        d.addCallback(_check_block_and_salt)
7152+
7153+        #  - The blockhashes
7154+        d.addCallback(lambda ignored:
7155+            mr.get_blockhashes())
7156+        d.addCallback(lambda blockhashes:
7157+            self.failUnlessEqual(self.block_hash_tree,
7158+                                 blockhashes,
7159+                                 blockhashes))
7160+        #  - The sharehashes
7161+        d.addCallback(lambda ignored:
7162+            mr.get_sharehashes())
7163+        d.addCallback(lambda sharehashes:
7164+            self.failUnlessEqual(self.share_hash_chain,
7165+                                 sharehashes))
7166+        #  - The keys
7167+        d.addCallback(lambda ignored:
7168+            mr.get_encprivkey())
7169+        d.addCallback(lambda encprivkey:
7170+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
7171+        d.addCallback(lambda ignored:
7172+            mr.get_verification_key())
7173+        d.addCallback(lambda verification_key:
7174+            self.failUnlessEqual(verification_key,
7175+                                 self.verification_key,
7176+                                 verification_key))
7177+        #  - The signature
7178+        d.addCallback(lambda ignored:
7179+            mr.get_signature())
7180+        d.addCallback(lambda signature:
7181+            self.failUnlessEqual(signature, self.signature, signature))
7182+
7183+        #  - The sequence number
7184+        d.addCallback(lambda ignored:
7185+            mr.get_seqnum())
7186+        d.addCallback(lambda seqnum:
7187+            self.failUnlessEqual(seqnum, 0, seqnum))
7188+
7189+        #  - The root hash
7190+        d.addCallback(lambda ignored:
7191+            mr.get_root_hash())
7192+        d.addCallback(lambda root_hash:
7193+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
7194+        return d
7195+
7196+
7197+    def test_only_reads_one_segment_sdmf(self):
7198+        # SDMF shares have only one segment, so it doesn't make sense to
7199+        # read more segments than that. The reader should know this and
7200+        # complain if we try to do that.
7201+        self.write_sdmf_share_to_server("si1")
7202+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7203+        d = defer.succeed(None)
7204+        d.addCallback(lambda ignored:
7205+            mr.is_sdmf())
7206+        d.addCallback(lambda issdmf:
7207+            self.failUnless(issdmf))
7208+        d.addCallback(lambda ignored:
7209+            self.shouldFail(LayoutInvalid, "test bad segment",
7210+                            None,
7211+                            mr.get_block_and_salt, 1))
7212+        return d
7213+
7214+
7215+    def test_read_with_prefetched_mdmf_data(self):
7216+        # The MDMFSlotReadProxy will prefill certain fields if you pass
7217+        # it data that you have already fetched. This is useful for
7218+        # cases like the Servermap, which prefetches ~2kb of data while
7219+        # finding out which shares are on the remote peer so that it
7220+        # doesn't waste round trips.
7221+        mdmf_data = self.build_test_mdmf_share()
7222+        self.write_test_share_to_server("si1")
7223+        def _make_mr(ignored, length):
7224+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
7225+            return mr
7226+
7227+        d = defer.succeed(None)
7228+        # This should be enough to fill in both the encoding parameters
7229+        # and the table of offsets, which will complete the version
7230+        # information tuple.
7231+        d.addCallback(_make_mr, 107)
7232+        d.addCallback(lambda mr:
7233+            mr.get_verinfo())
7234+        def _check_verinfo(verinfo):
7235+            self.failUnless(verinfo)
7236+            self.failUnlessEqual(len(verinfo), 9)
7237+            (seqnum,
7238+             root_hash,
7239+             salt_hash,
7240+             segsize,
7241+             datalen,
7242+             k,
7243+             n,
7244+             prefix,
7245+             offsets) = verinfo
7246+            self.failUnlessEqual(seqnum, 0)
7247+            self.failUnlessEqual(root_hash, self.root_hash)
7248+            self.failUnlessEqual(segsize, 6)
7249+            self.failUnlessEqual(datalen, 36)
7250+            self.failUnlessEqual(k, 3)
7251+            self.failUnlessEqual(n, 10)
7252+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7253+                                          1,
7254+                                          seqnum,
7255+                                          root_hash,
7256+                                          k,
7257+                                          n,
7258+                                          segsize,
7259+                                          datalen)
7260+            self.failUnlessEqual(expected_prefix, prefix)
7261+            self.failUnlessEqual(self.rref.read_count, 0)
7262+        d.addCallback(_check_verinfo)
7263+        # This is not enough data to read a block and its salt, so the
7264+        # wrapper should attempt to read this from the remote server.
7265+        d.addCallback(_make_mr, 107)
7266+        d.addCallback(lambda mr:
7267+            mr.get_block_and_salt(0))
7268+        def _check_block_and_salt((block, salt)):
7269+            self.failUnlessEqual(block, self.block)
7270+            self.failUnlessEqual(salt, self.salt)
7271+            self.failUnlessEqual(self.rref.read_count, 1)
7272+        # This should be enough data to read one block.
7273+        d.addCallback(_make_mr, 249)
7274+        d.addCallback(lambda mr:
7275+            mr.get_block_and_salt(0))
7276+        d.addCallback(_check_block_and_salt)
7277+        return d
7278+
7279+
7280+    def test_read_with_prefetched_sdmf_data(self):
7281+        sdmf_data = self.build_test_sdmf_share()
7282+        self.write_sdmf_share_to_server("si1")
7283+        def _make_mr(ignored, length):
7284+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7285+            return mr
7286+
7287+        d = defer.succeed(None)
7288+        # This should be enough to get us the encoding parameters,
7289+        # offset table, and everything else we need to build a verinfo
7290+        # string.
7291+        d.addCallback(_make_mr, 107)
7292+        d.addCallback(lambda mr:
7293+            mr.get_verinfo())
7294+        def _check_verinfo(verinfo):
7295+            self.failUnless(verinfo)
7296+            self.failUnlessEqual(len(verinfo), 9)
7297+            (seqnum,
7298+             root_hash,
7299+             salt,
7300+             segsize,
7301+             datalen,
7302+             k,
7303+             n,
7304+             prefix,
7305+             offsets) = verinfo
7306+            self.failUnlessEqual(seqnum, 0)
7307+            self.failUnlessEqual(root_hash, self.root_hash)
7308+            self.failUnlessEqual(salt, self.salt)
7309+            self.failUnlessEqual(segsize, 36)
7310+            self.failUnlessEqual(datalen, 36)
7311+            self.failUnlessEqual(k, 3)
7312+            self.failUnlessEqual(n, 10)
7313+            expected_prefix = struct.pack(SIGNED_PREFIX,
7314+                                          0,
7315+                                          seqnum,
7316+                                          root_hash,
7317+                                          salt,
7318+                                          k,
7319+                                          n,
7320+                                          segsize,
7321+                                          datalen)
7322+            self.failUnlessEqual(expected_prefix, prefix)
7323+            self.failUnlessEqual(self.rref.read_count, 0)
7324+        d.addCallback(_check_verinfo)
7325+        # This shouldn't be enough to read any share data.
7326+        d.addCallback(_make_mr, 107)
7327+        d.addCallback(lambda mr:
7328+            mr.get_block_and_salt(0))
7329+        def _check_block_and_salt((block, salt)):
7330+            self.failUnlessEqual(block, self.block * 6)
7331+            self.failUnlessEqual(salt, self.salt)
7332+            # TODO: Fix the read routine so that it reads only the data
7333+            #       that it has cached if it can't read all of it.
7334+            self.failUnlessEqual(self.rref.read_count, 2)
7335+
7336+        # This should be enough to read share data.
7337+        d.addCallback(_make_mr, self.offsets['share_data'])
7338+        d.addCallback(lambda mr:
7339+            mr.get_block_and_salt(0))
7340+        d.addCallback(_check_block_and_salt)
7341+        return d
7342+
7343+
7344+    def test_read_with_empty_mdmf_file(self):
7345+        # Some tests upload a file with no contents to test things
7346+        # unrelated to the actual handling of the content of the file.
7347+        # The reader should behave intelligently in these cases.
7348+        self.write_test_share_to_server("si1", empty=True)
7349+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7350+        # We should be able to get the encoding parameters, and they
7351+        # should be correct.
7352+        d = defer.succeed(None)
7353+        d.addCallback(lambda ignored:
7354+            mr.get_encoding_parameters())
7355+        def _check_encoding_parameters(params):
7356+            self.failUnlessEqual(len(params), 4)
7357+            k, n, segsize, datalen = params
7358+            self.failUnlessEqual(k, 3)
7359+            self.failUnlessEqual(n, 10)
7360+            self.failUnlessEqual(segsize, 0)
7361+            self.failUnlessEqual(datalen, 0)
7362+        d.addCallback(_check_encoding_parameters)
7363+
7364+        # We should not be able to fetch a block, since there are no
7365+        # blocks to fetch
7366+        d.addCallback(lambda ignored:
7367+            self.shouldFail(LayoutInvalid, "get block on empty file",
7368+                            None,
7369+                            mr.get_block_and_salt, 0))
7370+        return d
7371+
7372+
7373+    def test_read_with_empty_sdmf_file(self):
7374+        self.write_sdmf_share_to_server("si1", empty=True)
7375+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7376+        # We should be able to get the encoding parameters, and they
7377+        # should be correct
7378+        d = defer.succeed(None)
7379+        d.addCallback(lambda ignored:
7380+            mr.get_encoding_parameters())
7381+        def _check_encoding_parameters(params):
7382+            self.failUnlessEqual(len(params), 4)
7383+            k, n, segsize, datalen = params
7384+            self.failUnlessEqual(k, 3)
7385+            self.failUnlessEqual(n, 10)
7386+            self.failUnlessEqual(segsize, 0)
7387+            self.failUnlessEqual(datalen, 0)
7388+        d.addCallback(_check_encoding_parameters)
7389+
7390+        # It does not make sense to get a block in this format, so we
7391+        # should not be able to.
7392+        d.addCallback(lambda ignored:
7393+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7394+                            None,
7395+                            mr.get_block_and_salt, 0))
7396+        return d
7397+
7398+
7399+    def test_verinfo_with_sdmf_file(self):
7400+        self.write_sdmf_share_to_server("si1")
7401+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7402+        # We should be able to get the version information.
7403+        d = defer.succeed(None)
7404+        d.addCallback(lambda ignored:
7405+            mr.get_verinfo())
7406+        def _check_verinfo(verinfo):
7407+            self.failUnless(verinfo)
7408+            self.failUnlessEqual(len(verinfo), 9)
7409+            (seqnum,
7410+             root_hash,
7411+             salt,
7412+             segsize,
7413+             datalen,
7414+             k,
7415+             n,
7416+             prefix,
7417+             offsets) = verinfo
7418+            self.failUnlessEqual(seqnum, 0)
7419+            self.failUnlessEqual(root_hash, self.root_hash)
7420+            self.failUnlessEqual(salt, self.salt)
7421+            self.failUnlessEqual(segsize, 36)
7422+            self.failUnlessEqual(datalen, 36)
7423+            self.failUnlessEqual(k, 3)
7424+            self.failUnlessEqual(n, 10)
7425+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7426+                                          0,
7427+                                          seqnum,
7428+                                          root_hash,
7429+                                          salt,
7430+                                          k,
7431+                                          n,
7432+                                          segsize,
7433+                                          datalen)
7434+            self.failUnlessEqual(prefix, expected_prefix)
7435+            self.failUnlessEqual(offsets, self.offsets)
7436+        d.addCallback(_check_verinfo)
7437+        return d
7438+
7439+
7440+    def test_verinfo_with_mdmf_file(self):
7441+        self.write_test_share_to_server("si1")
7442+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7443+        d = defer.succeed(None)
7444+        d.addCallback(lambda ignored:
7445+            mr.get_verinfo())
7446+        def _check_verinfo(verinfo):
7447+            self.failUnless(verinfo)
7448+            self.failUnlessEqual(len(verinfo), 9)
7449+            (seqnum,
7450+             root_hash,
7451+             IV,
7452+             segsize,
7453+             datalen,
7454+             k,
7455+             n,
7456+             prefix,
7457+             offsets) = verinfo
7458+            self.failUnlessEqual(seqnum, 0)
7459+            self.failUnlessEqual(root_hash, self.root_hash)
7460+            self.failIf(IV)
7461+            self.failUnlessEqual(segsize, 6)
7462+            self.failUnlessEqual(datalen, 36)
7463+            self.failUnlessEqual(k, 3)
7464+            self.failUnlessEqual(n, 10)
7465+            expected_prefix = struct.pack(">BQ32s BBQQ",
7466+                                          1,
7467+                                          seqnum,
7468+                                          root_hash,
7469+                                          k,
7470+                                          n,
7471+                                          segsize,
7472+                                          datalen)
7473+            self.failUnlessEqual(prefix, expected_prefix)
7474+            self.failUnlessEqual(offsets, self.offsets)
7475+        d.addCallback(_check_verinfo)
7476+        return d
7477+
7478+
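(A quick sketch, not part of the patch: the two struct format strings in the
verinfo tests above pin down the signed-prefix sizes exactly, and the values
follow directly from the field widths.)

    import struct
    # SDMF prefix: version byte 0, seqnum, root hash, per-file salt,
    # k, N, segment size, data length.
    assert struct.calcsize(">BQ32s16s BBQQ") == 75
    # MDMF prefix: version byte 1, same fields minus the 16-byte salt,
    # which MDMF stores per-segment instead.
    assert struct.calcsize(">BQ32s BBQQ") == 59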
7479+    def test_reader_queue(self):
7480+        self.write_test_share_to_server('si1')
7481+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7482+        d1 = mr.get_block_and_salt(0, queue=True)
7483+        d2 = mr.get_blockhashes(queue=True)
7484+        d3 = mr.get_sharehashes(queue=True)
7485+        d4 = mr.get_signature(queue=True)
7486+        d5 = mr.get_verification_key(queue=True)
7487+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7488+        mr.flush()
7489+        def _print(results):
7490+            self.failUnlessEqual(len(results), 5)
7491+            # We have one read for version information and offsets, and
7492+            # one for everything else.
7493+            self.failUnlessEqual(self.rref.read_count, 2)
7494+            block, salt = results[0][1] # each result is a (success,
7495+                                        # value) pair; [1] holds the
7496+                                        # validated block and salt.
7497+            self.failUnlessEqual(self.block, block)
7498+            self.failUnlessEqual(self.salt, salt)
7499+
7500+            blockhashes = results[1][1]
7501+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7502+
7503+            sharehashes = results[2][1]
7504+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7505+
7506+            signature = results[3][1]
7507+            self.failUnlessEqual(self.signature, signature)
7508+
7509+            verification_key = results[4][1]
7510+            self.failUnlessEqual(self.verification_key, verification_key)
7511+        dl.addCallback(_print)
7512+        return dl
7513+
7514+
7515+    def test_sdmf_writer(self):
7516+        # Go through the motions of writing an SDMF share to the storage
7517+        # server. Then read the storage server to see that the share got
7518+        # written in the way that we think it should have.
7519+
7520+        # We do this first so that the necessary instance variables get
7521+        # set the way we want them for the tests below.
7522+        data = self.build_test_sdmf_share()
7523+        sdmfr = SDMFSlotWriteProxy(0,
7524+                                   self.rref,
7525+                                   "si1",
7526+                                   self.secrets,
7527+                                   0, 3, 10, 36, 36)
7528+        # Put the block and salt.
7529+        sdmfr.put_block(self.blockdata, 0, self.salt)
7530+
7531+        # Put the encprivkey
7532+        sdmfr.put_encprivkey(self.encprivkey)
7533+
7534+        # Put the block and share hash chains
7535+        sdmfr.put_blockhashes(self.block_hash_tree)
7536+        sdmfr.put_sharehashes(self.share_hash_chain)
7537+        sdmfr.put_root_hash(self.root_hash)
7538+
7539+        # Put the signature
7540+        sdmfr.put_signature(self.signature)
7541+
7542+        # Put the verification key
7543+        sdmfr.put_verification_key(self.verification_key)
7544+
7545+        # Now check to make sure that nothing has been written yet.
7546+        self.failUnlessEqual(self.rref.write_count, 0)
7547+
7548+        # Now finish publishing
7549+        d = sdmfr.finish_publishing()
7550+        def _then(ignored):
7551+            self.failUnlessEqual(self.rref.write_count, 1)
7552+            read = self.ss.remote_slot_readv
7553+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7554+                                 {0: [data]})
7555+        d.addCallback(_then)
7556+        return d
7557+
7558+
7559+    def test_sdmf_writer_preexisting_share(self):
7560+        data = self.build_test_sdmf_share()
7561+        self.write_sdmf_share_to_server("si1")
7562+
7563+        # Now there is a share on the storage server. To successfully
7564+        # write, we need to set the checkstring correctly. When we
7565+        # don't, no write should occur.
7566+        sdmfw = SDMFSlotWriteProxy(0,
7567+                                   self.rref,
7568+                                   "si1",
7569+                                   self.secrets,
7570+                                   1, 3, 10, 36, 36)
7571+        sdmfw.put_block(self.blockdata, 0, self.salt)
7572+
7573+        # Put the encprivkey
7574+        sdmfw.put_encprivkey(self.encprivkey)
7575+
7576+        # Put the block and share hash chains
7577+        sdmfw.put_blockhashes(self.block_hash_tree)
7578+        sdmfw.put_sharehashes(self.share_hash_chain)
7579+
7580+        # Put the root hash
7581+        sdmfw.put_root_hash(self.root_hash)
7582+
7583+        # Put the signature
7584+        sdmfw.put_signature(self.signature)
7585+
7586+        # Put the verification key
7587+        sdmfw.put_verification_key(self.verification_key)
7588+
7589+        # We shouldn't have a checkstring yet
7590+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7591+
7592+        d = sdmfw.finish_publishing()
7593+        def _then(results):
7594+            self.failIf(results[0])
7595+            # this is the correct checkstring
7596+            self._expected_checkstring = results[1][0][0]
7597+            return self._expected_checkstring
7598+
7599+        d.addCallback(_then)
7600+        d.addCallback(sdmfw.set_checkstring)
7601+        d.addCallback(lambda ignored:
7602+            sdmfw.get_checkstring())
7603+        d.addCallback(lambda checkstring:
7604+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7605+        d.addCallback(lambda ignored:
7606+            sdmfw.finish_publishing())
7607+        def _then_again(results):
7608+            self.failUnless(results[0])
7609+            read = self.ss.remote_slot_readv
7610+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7611+                                 {0: [struct.pack(">Q", 1)]})
7612+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7613+                                 {0: [data[9:]]})
7614+        d.addCallback(_then_again)
7615+        return d
7616+
7617+
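(The test-and-set flow exercised above reduces to a small retry pattern.
A minimal sketch against the same SDMFSlotWriteProxy calls the test uses;
the _maybe_retry helper is hypothetical, not code from the patch.)

    d = sdmfw.finish_publishing()
    def _maybe_retry(results):
        # results[0] is the success flag; on failure, results[1][0][0]
        # carries the server's current checkstring, which we adopt
        # before publishing again.
        if not results[0]:
            sdmfw.set_checkstring(results[1][0][0])
            return sdmfw.finish_publishing()
        return results
    d.addCallback(_maybe_retry)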
7618 class Stats(unittest.TestCase):
7619 
7620     def setUp(self):
7621}
7622[mutable/retrieve.py: Modify the retrieval process to support MDMF
7623Kevan Carstensen <kevan@isnotajoke.com>**20100819003409
7624 Ignore-this: c03f4e41aaa0366a9bf44847f2caf9db
7625 
7626 The logic behind a mutable file download had to be adapted to work with
7627 segmented mutable files; this patch performs those adaptations. It also
7628 exposes some decoding and decrypting functionality to make partial-file
7629 updates a little easier, and supports efficient random-access downloads
7630 of parts of an MDMF file.
7631] {
7632hunk ./src/allmydata/mutable/retrieve.py 2
7633 
7634-import struct, time
7635+import time
7636 from itertools import count
7637 from zope.interface import implements
7638 from twisted.internet import defer
7639merger 0.0 (
7640hunk ./src/allmydata/mutable/retrieve.py 10
7641+from allmydata.util.dictutil import DictOfSets
7642hunk ./src/allmydata/mutable/retrieve.py 7
7643-from foolscap.api import DeadReferenceError, eventually, fireEventually
7644-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
7645-from allmydata.util import hashutil, idlib, log
7646+from twisted.internet.interfaces import IPushProducer, IConsumer
7647+from foolscap.api import eventually, fireEventually
7648+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
7649+                                 MDMF_VERSION, SDMF_VERSION
7650+from allmydata.util import hashutil, log, mathutil
7651)
7652hunk ./src/allmydata/mutable/retrieve.py 16
7653 from pycryptopp.publickey import rsa
7654 
7655 from allmydata.mutable.common import CorruptShareError, UncoordinatedWriteError
7656-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
7657+from allmydata.mutable.layout import MDMFSlotReadProxy
7658 
7659 class RetrieveStatus:
7660     implements(IRetrieveStatus)
7661hunk ./src/allmydata/mutable/retrieve.py 83
7662     # times, and each will have a separate response chain. However the
7663     # Retrieve object will remain tied to a specific version of the file, and
7664     # will use a single ServerMap instance.
7665+    implements(IPushProducer)
7666 
7667hunk ./src/allmydata/mutable/retrieve.py 85
7668-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
7669+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
7670+                 verify=False):
7671         self._node = filenode
7672         assert self._node.get_pubkey()
7673         self._storage_index = filenode.get_storage_index()
7674hunk ./src/allmydata/mutable/retrieve.py 104
7675         self.verinfo = verinfo
7676         # during repair, we may be called upon to grab the private key, since
7677         # it wasn't picked up during a verify=False checker run, and we'll
7678-        # need it for repair to generate the a new version.
7679-        self._need_privkey = fetch_privkey
7680-        if self._node.get_privkey():
7681+        # need it for repair to generate a new version.
7682+        self._need_privkey = fetch_privkey or verify
7683+        if self._node.get_privkey() and not verify:
7684             self._need_privkey = False
7685 
7686hunk ./src/allmydata/mutable/retrieve.py 109
7687+        if self._need_privkey:
7688+            # TODO: Evaluate the need for this. We'll use it if we want
7689+            # to limit how many queries are on the wire for the privkey
7690+            # at once.
7691+            self._privkey_query_markers = [] # one Marker for each time we've
7692+                                             # tried to get the privkey.
7693+
7694+        # verify means that we are using the downloader logic to verify all
7695+        # of our shares. This tells the downloader a few things.
7696+        #
7697+        # 1. We need to download all of the shares.
7698+        # 2. We don't need to decode or decrypt the shares, since our
7699+        #    caller doesn't care about the plaintext, only the
7700+        #    information about which shares are or are not valid.
7701+        # 3. When we are validating readers, we need to validate the
7702+        #    signature on the prefix. (This may be redundant, since the
7703+        #    servermap update already validates it.)
7704+        self._verify = False
7705+        if verify:
7706+            self._verify = True
7707+
7708         self._status = RetrieveStatus()
7709         self._status.set_storage_index(self._storage_index)
7710         self._status.set_helper(False)
7711hunk ./src/allmydata/mutable/retrieve.py 139
7712          offsets_tuple) = self.verinfo
7713         self._status.set_size(datalength)
7714         self._status.set_encoding(k, N)
7715+        self.readers = {}
7716+        self._paused = False
7717+        self._pause_deferred = None
7718+        self._offset = None
7719+        self._read_length = None
7720+        self.log("got seqnum %d" % self.verinfo[0])
7721+
7722 
7723     def get_status(self):
7724         return self._status
7725hunk ./src/allmydata/mutable/retrieve.py 157
7726             kwargs["facility"] = "tahoe.mutable.retrieve"
7727         return log.msg(*args, **kwargs)
7728 
7729-    def download(self):
7730+
7731+    ###################
7732+    # IPushProducer
7733+
7734+    def pauseProducing(self):
7735+        """
7736+        I am called by my download target if we have produced too much
7737+        data for it to handle. I make the downloader stop producing new
7738+        data until my resumeProducing method is called.
7739+        """
7740+        if self._paused:
7741+            return
7742+
7743+        # fired when the download is unpaused.
7744+        self._old_status = self._status.get_status()
7745+        self._status.set_status("Paused")
7746+
7747+        self._pause_deferred = defer.Deferred()
7748+        self._paused = True
7749+
7750+
7751+    def resumeProducing(self):
7752+        """
7753+        I am called by my download target once it is ready to begin
7754+        receiving data again.
7755+        """
7756+        if not self._paused:
7757+            return
7758+
7759+        self._paused = False
7760+        p = self._pause_deferred
7761+        self._pause_deferred = None
7762+        self._status.set_status(self._old_status)
7763+
7764+        eventually(p.callback, None)
7765+
7766+
7767+    def _check_for_paused(self, res):
7768+        """
7769+        I am called just before a write to the consumer. I return a
7770+        Deferred that eventually fires with the data that is to be
7771+        written to the consumer. If the download has not been paused,
7772+        the Deferred fires immediately. Otherwise, the Deferred fires
7773+        when the downloader is unpaused.
7774+        """
7775+        if self._paused:
7776+            d = defer.Deferred()
7777+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
7778+            return d
7779+        return defer.succeed(res)
7780+
7781+
7782+    def download(self, consumer=None, offset=0, size=None):
7783+        assert IConsumer.providedBy(consumer) or self._verify
7784+
7785+        if consumer:
7786+            self._consumer = consumer
7787+            # we provide IPushProducer, so streaming=True, per
7788+            # IConsumer.
7789+            self._consumer.registerProducer(self, streaming=True)
7790+
7791         self._done_deferred = defer.Deferred()
7792         self._started = time.time()
7793         self._status.set_status("Retrieving Shares")
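(With the new signature above, Retrieve acts as an IPushProducer feeding an
arbitrary IConsumer. A minimal caller-side sketch, assuming only the
download(consumer, offset, size) entry point shown here; the _Accumulator
class and read_range helper are hypothetical.)

    from zope.interface import implements
    from twisted.internet.interfaces import IConsumer

    class _Accumulator:
        implements(IConsumer)
        # Collects downloaded bytes in memory. A real consumer could
        # call self.producer.pauseProducing() from write() when it
        # falls behind, and producer.resumeProducing() to continue;
        # Retrieve honors both via the producer methods above.
        def __init__(self):
            self.chunks = []
            self.producer = None
        def registerProducer(self, producer, streaming):
            self.producer = producer  # streaming=True: IPushProducer
        def unregisterProducer(self):
            self.producer = None
        def write(self, data):
            self.chunks.append(data)

    def read_range(retrieve, offset, size):
        # Fetch bytes [offset, offset+size) of one mutable-file version.
        c = _Accumulator()
        d = retrieve.download(c, offset=offset, size=size)
        d.addCallback(lambda ignored: "".join(c.chunks))
        return d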
7794hunk ./src/allmydata/mutable/retrieve.py 222
7795 
7796+        self._offset = offset
7797+        self._read_length = size
7798+
7799         # first, which servers can we use?
7800         versionmap = self.servermap.make_versionmap()
7801         shares = versionmap[self.verinfo]
7802hunk ./src/allmydata/mutable/retrieve.py 232
7803         self.remaining_sharemap = DictOfSets()
7804         for (shnum, peerid, timestamp) in shares:
7805             self.remaining_sharemap.add(shnum, peerid)
7806+            # If the servermap update fetched anything, it fetched at least 1
7807+            # KiB, so we ask for that much.
7808+            # TODO: Change the cache methods to allow us to fetch all of the
7809+            # data that they have, then change this method to do that.
7810+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
7811+                                                               shnum,
7812+                                                               0,
7813+                                                               1000)
7814+            ss = self.servermap.connections[peerid]
7815+            reader = MDMFSlotReadProxy(ss,
7816+                                       self._storage_index,
7817+                                       shnum,
7818+                                       any_cache)
7819+            reader.peerid = peerid
7820+            self.readers[shnum] = reader
7821+
7822 
7823         self.shares = {} # maps shnum to validated blocks
7824hunk ./src/allmydata/mutable/retrieve.py 250
7825+        self._active_readers = [] # list of active readers for this dl.
7826+        self._validated_readers = set() # set of readers that we have
7827+                                        # validated the prefix of
7828+        self._block_hash_trees = {} # shnum => hashtree
7829 
7830         # how many shares do we need?
7831hunk ./src/allmydata/mutable/retrieve.py 256
7832-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7833+        (seqnum,
7834+         root_hash,
7835+         IV,
7836+         segsize,
7837+         datalength,
7838+         k,
7839+         N,
7840+         prefix,
7841          offsets_tuple) = self.verinfo
7842hunk ./src/allmydata/mutable/retrieve.py 265
7843-        assert len(self.remaining_sharemap) >= k
7844-        # we start with the lowest shnums we have available, since FEC is
7845-        # faster if we're using "primary shares"
7846-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
7847-        for shnum in self.active_shnums:
7848-            # we use an arbitrary peer who has the share. If shares are
7849-            # doubled up (more than one share per peer), we could make this
7850-            # run faster by spreading the load among multiple peers. But the
7851-            # algorithm to do that is more complicated than I want to write
7852-            # right now, and a well-provisioned grid shouldn't have multiple
7853-            # shares per peer.
7854-            peerid = list(self.remaining_sharemap[shnum])[0]
7855-            self.get_data(shnum, peerid)
7856 
7857hunk ./src/allmydata/mutable/retrieve.py 266
7858-        # control flow beyond this point: state machine. Receiving responses
7859-        # from queries is the input. We might send out more queries, or we
7860-        # might produce a result.
7861 
7862hunk ./src/allmydata/mutable/retrieve.py 267
7863+        # We need one share hash tree for the entire file; its leaves
7864+        # are the roots of the block hash trees for the shares that
7865+        # comprise it, and its root is in the verinfo.
7866+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
7867+        self.share_hash_tree.set_hashes({0: root_hash})
7868+
7869+        # This will set up both the segment decoder and the tail segment
7870+        # decoder, as well as a variety of other instance variables that
7871+        # the download process will use.
7872+        self._setup_encoding_parameters()
7873+        assert len(self.remaining_sharemap) >= k
7874+
7875+        self.log("starting download")
7876+        self._paused = False
7877+        self._started_fetching = time.time()
7878+
7879+        self._add_active_peers()
7880+        # The download process beyond this is a state machine.
7881+        # _add_active_peers will select the peers that we want to use
7882+        # for the download, and then attempt to start downloading. After
7883+        # each segment, it will check for doneness, reacting to broken
7884+        # peers and corrupt shares as necessary. If it runs out of good
7885+        # peers before downloading all of the segments, _done_deferred
7886+        # will errback.  Otherwise, it will eventually callback with the
7887+        # contents of the mutable file.
7888         return self._done_deferred
7889 
7890hunk ./src/allmydata/mutable/retrieve.py 294
7891-    def get_data(self, shnum, peerid):
7892-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
7893-                 shnum=shnum,
7894-                 peerid=idlib.shortnodeid_b2a(peerid),
7895-                 level=log.NOISY)
7896-        ss = self.servermap.connections[peerid]
7897-        started = time.time()
7898-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7899+
7900+    def decode(self, blocks_and_salts, segnum):
7901+        """
7902+        I am a helper method that the mutable file update process uses
7903+        as a shortcut to decode and decrypt the segments that it needs
7904+        to fetch in order to perform a file update. I take in a
7905+        collection of blocks and salts, and pick some of those to make a
7906+        segment with. I return the plaintext associated with that
7907+        segment.
7908+        """
7909+        # shnum => block hash tree. Unused, but _setup_encoding_parameters will
7910+        # want to set this.
7911+        # XXX: Make it so that it won't set this if we're just decoding.
7912+        self._block_hash_trees = {}
7913+        self._setup_encoding_parameters()
7914+        # This is the form _decode_blocks expects.
7915+        blocks_and_salts = blocks_and_salts.items()
7916+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
7917+
7918+        d = self._decode_blocks(blocks_and_salts, segnum)
7919+        d.addCallback(self._decrypt_segment)
7920+        return d
7921+
7922+
7923+    def _setup_encoding_parameters(self):
7924+        """
7925+        I set up the encoding parameters, including k, n, the number
7926+        of segments associated with this file, and the segment decoder.
7927+        """
7928+        (seqnum,
7929+         root_hash,
7930+         IV,
7931+         segsize,
7932+         datalength,
7933+         k,
7934+         n,
7935+         known_prefix,
7936          offsets_tuple) = self.verinfo
7937hunk ./src/allmydata/mutable/retrieve.py 332
7938-        offsets = dict(offsets_tuple)
7939+        self._required_shares = k
7940+        self._total_shares = n
7941+        self._segment_size = segsize
7942+        self._data_length = datalength
7943 
7944hunk ./src/allmydata/mutable/retrieve.py 337
7945-        # we read the checkstring, to make sure that the data we grab is from
7946-        # the right version.
7947-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
7948+        if not IV:
7949+            self._version = MDMF_VERSION
7950+        else:
7951+            self._version = SDMF_VERSION
7952 
7953hunk ./src/allmydata/mutable/retrieve.py 342
7954-        # We also read the data, and the hashes necessary to validate them
7955-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
7956-        # signature or the pubkey, since that was handled during the
7957-        # servermap phase, and we'll be comparing the share hash chain
7958-        # against the roothash that was validated back then.
7959+        if datalength and segsize:
7960+            self._num_segments = mathutil.div_ceil(datalength, segsize)
7961+            self._tail_data_size = datalength % segsize
7962+        else:
7963+            self._num_segments = 0
7964+            self._tail_data_size = 0
7965 
7966hunk ./src/allmydata/mutable/retrieve.py 349
7967-        readv.append( (offsets['share_hash_chain'],
7968-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
7969+        self._segment_decoder = codec.CRSDecoder()
7970+        self._segment_decoder.set_params(segsize, k, n)
7971 
7972hunk ./src/allmydata/mutable/retrieve.py 352
7973-        # if we need the private key (for repair), we also fetch that
7974-        if self._need_privkey:
7975-            readv.append( (offsets['enc_privkey'],
7976-                           offsets['EOF'] - offsets['enc_privkey']) )
7977+        if not self._tail_data_size:
7978+            self._tail_data_size = segsize
7979+
7980+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
7981+                                                         self._required_shares)
7982+        if self._tail_segment_size == self._segment_size:
7983+            self._tail_decoder = self._segment_decoder
7984+        else:
7985+            self._tail_decoder = codec.CRSDecoder()
7986+            self._tail_decoder.set_params(self._tail_segment_size,
7987+                                          self._required_shares,
7988+                                          self._total_shares)
7989 
7990hunk ./src/allmydata/mutable/retrieve.py 365
7991-        m = Marker()
7992-        self._outstanding_queries[m] = (peerid, shnum, started)
7993+        self.log("got encoding parameters: "
7994+                 "k: %d "
7995+                 "n: %d "
7996+                 "%d segments of %d bytes each (%d byte tail segment)" % \
7997+                 (k, n, self._num_segments, self._segment_size,
7998+                  self._tail_segment_size))
7999 
8000         # ask the cache first
8001         got_from_cache = False
8002merger 0.0 (
8003hunk ./src/allmydata/mutable/retrieve.py 376
8004-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
8005-                                                            offset, length)
8006+            data = self._node._read_from_cache(self.verinfo, shnum, offset, length)
8007hunk ./src/allmydata/mutable/retrieve.py 372
8008-        # ask the cache first
8009-        got_from_cache = False
8010-        datavs = []
8011-        for (offset, length) in readv:
8012-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
8013-                                                            offset, length)
8014-            if data is not None:
8015-                datavs.append(data)
8016-        if len(datavs) == len(readv):
8017-            self.log("got data from cache")
8018-            got_from_cache = True
8019-            d = fireEventually({shnum: datavs})
8020-            # datavs is a dict mapping shnum to a pair of strings
8021+        for i in xrange(self._total_shares):
8022+            # So we don't have to do this later.
8023+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
8024+
8025+        # Our last task is to tell the downloader where to start and
8026+        # where to stop. We use three parameters for that:
8027+        #   - self._start_segment: the segment that we need to start
8028+        #     downloading from.
8029+        #   - self._current_segment: the next segment that we need to
8030+        #     download.
8031+        #   - self._last_segment: The last segment that we were asked to
8032+        #     download.
8033+        #
8034+        #  We say that the download is complete when
8035+        #  self._current_segment > self._last_segment. We use
8036+        #  self._start_segment and self._last_segment to know when to
8037+        #  strip things off of segments, and how much to strip.
8038+        if self._offset:
8039+            self.log("got offset: %d" % self._offset)
8040+            # our start segment is the first segment containing the
8041+            # offset we were given.
8042+            start = mathutil.div_ceil(self._offset,
8043+                                      self._segment_size)
8044+            # this gets us the first segment after self._offset. Then
8045+            # our start segment is the one before it.
8046+            start -= 1
8047+
8048+            assert start < self._num_segments
8049+            self._start_segment = start
8050+            self.log("got start segment: %d" % self._start_segment)
8051)
8052hunk ./src/allmydata/mutable/retrieve.py 386
8053             d = fireEventually({shnum: datavs})
8054             # datavs is a dict mapping shnum to a pair of strings
8055         else:
8056-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
8057-        self.remaining_sharemap.discard(shnum, peerid)
8058+            self._start_segment = 0
8059 
8060hunk ./src/allmydata/mutable/retrieve.py 388
8061-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
8062-        d.addErrback(self._query_failed, m, peerid)
8063-        # errors that aren't handled by _query_failed (and errors caused by
8064-        # _query_failed) get logged, but we still want to check for doneness.
8065-        def _oops(f):
8066-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
8067-                     shnum=shnum,
8068-                     peerid=idlib.shortnodeid_b2a(peerid),
8069-                     failure=f,
8070-                     level=log.WEIRD, umid="W0xnQA")
8071-        d.addErrback(_oops)
8072-        d.addBoth(self._check_for_done)
8073-        # any error during _check_for_done means the download fails. If the
8074-        # download is successful, _check_for_done will fire _done by itself.
8075-        d.addErrback(self._done)
8076-        d.addErrback(log.err)
8077-        return d # purely for testing convenience
8078 
8079hunk ./src/allmydata/mutable/retrieve.py 389
8080-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
8081-        # isolate the callRemote to a separate method, so tests can subclass
8082-        # Publish and override it
8083-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
8084-        return d
8085+        if self._read_length:
8086+            # our end segment is the last segment containing part of the
8087+            # segment that we were asked to read.
8088+            self.log("got read length %d" % self._read_length)
8089+            end_data = self._offset + self._read_length
8090+            end = mathutil.div_ceil(end_data,
8091+                                    self._segment_size)
8092+            end -= 1
8093+            assert end < self._num_segments
8094+            self._last_segment = end
8095+            self.log("got end segment: %d" % self._last_segment)
8096+        else:
8097+            self._last_segment = self._num_segments - 1
8098 
8099hunk ./src/allmydata/mutable/retrieve.py 403
8100-    def remove_peer(self, peerid):
8101-        for shnum in list(self.remaining_sharemap.keys()):
8102-            self.remaining_sharemap.discard(shnum, peerid)
8103+        self._current_segment = self._start_segment
8104 
8105hunk ./src/allmydata/mutable/retrieve.py 405
8106-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
8107-        now = time.time()
8108-        elapsed = now - started
8109-        if not got_from_cache:
8110-            self._status.add_fetch_timing(peerid, elapsed)
8111-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
8112-                 shares=len(datavs),
8113-                 peerid=idlib.shortnodeid_b2a(peerid),
8114-                 level=log.NOISY)
8115-        self._outstanding_queries.pop(marker, None)
8116-        if not self._running:
8117-            return
8118+    def _add_active_peers(self):
8119+        """
8120+        I populate self._active_readers with enough active readers to
8121+        retrieve the contents of this mutable file. I am called before
8122+        downloading starts, and (eventually) after each validation
8123+        error, connection error, or other problem in the download.
8124+        """
8125+        # TODO: It would be cool to investigate other heuristics for
8126+        # reader selection. For instance, the cost (in time the user
8127+        # spends waiting for their file) of selecting a really slow peer
8128+        # that happens to have a primary share probably exceeds that of
8129+        # selecting a really fast peer that doesn't have a primary
8130+        # share. Maybe the servermap could be extended to provide this
8131+        # information; it could keep track of latency information while
8132+        # it gathers more important data, and then this routine could
8133+        # use that to select active readers.
8134+        #
8135+        # (these and other questions would be easier to answer with a
8136+        #  robust, configurable tahoe-lafs simulator, which modeled node
8137+        #  failures, differences in node speed, and other characteristics
8138+        #  that we expect storage servers to have.  You could have
8139+        #  presets for really stable grids (like allmydata.com),
8140+        #  friendnets, make it easy to configure your own settings, and
8141+        #  then simulate the effect of big changes on these use cases
8142+        #  instead of just reasoning about what the effect might be. Out
8143+        #  of scope for MDMF, though.)
8144 
8145hunk ./src/allmydata/mutable/retrieve.py 432
8146-        # note that we only ask for a single share per query, so we only
8147-        # expect a single share back. On the other hand, we use the extra
8148-        # shares if we get them.. seems better than an assert().
8149+        # We need at least self._required_shares readers to download a
8150+        # segment.
8151+        if self._verify:
8152+            needed = self._total_shares
8153+        else:
8154+            needed = self._required_shares - len(self._active_readers)
8155+        # XXX: Why don't format= log messages work here?
8156+        self.log("adding %d peers to the active peers list" % needed)
8157 
8158hunk ./src/allmydata/mutable/retrieve.py 441
8159-        for shnum,datav in datavs.items():
8160-            (prefix, hash_and_data) = datav[:2]
8161-            try:
8162-                self._got_results_one_share(shnum, peerid,
8163-                                            prefix, hash_and_data)
8164-            except CorruptShareError, e:
8165-                # log it and give the other shares a chance to be processed
8166-                f = failure.Failure()
8167-                self.log(format="bad share: %(f_value)s",
8168-                         f_value=str(f.value), failure=f,
8169-                         level=log.WEIRD, umid="7fzWZw")
8170-                self.notify_server_corruption(peerid, shnum, str(e))
8171-                self.remove_peer(peerid)
8172-                self.servermap.mark_bad_share(peerid, shnum, prefix)
8173-                self._bad_shares.add( (peerid, shnum) )
8174-                self._status.problems[peerid] = f
8175-                self._last_failure = f
8176-                pass
8177-            if self._need_privkey and len(datav) > 2:
8178-                lp = None
8179-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
8180-        # all done!
8181+        # We favor lower numbered shares, since FEC is faster with
8182+        # primary shares than with other shares, and lower-numbered
8183+        # shares are more likely to be primary than higher numbered
8184+        # shares.
8185+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
8186+        # We shouldn't consider adding shares that we already have; this
8187+        # will cause problems later.
8188+        active_shnums -= set([reader.shnum for reader in self._active_readers])
8189+        active_shnums = list(active_shnums)[:needed]
8190+        if len(active_shnums) < needed and not self._verify:
8191+            # We don't have enough readers to retrieve the file; fail.
8192+            return self._failed()
8193 
8194hunk ./src/allmydata/mutable/retrieve.py 454
8195-    def notify_server_corruption(self, peerid, shnum, reason):
8196-        ss = self.servermap.connections[peerid]
8197-        ss.callRemoteOnly("advise_corrupt_share",
8198-                          "mutable", self._storage_index, shnum, reason)
8199+        for shnum in active_shnums:
8200+            self._active_readers.append(self.readers[shnum])
8201+            self.log("added reader for share %d" % shnum)
8202+        assert len(self._active_readers) >= self._required_shares
8203+        # Conceptually, this is part of the _add_active_peers step. It
8204+        # validates the prefixes of newly added readers to make sure
8205+        # that they match what we are expecting for self.verinfo. If
8206+        # validation is successful, _validate_active_prefixes will call
8207+        # _download_current_segment for us. If validation is
8208+        # unsuccessful, then _validate_active_prefixes will remove the peer and
8209+        # call _add_active_peers again, where we will attempt to rectify
8210+        # the problem by choosing another peer.
8211+        return self._validate_active_prefixes()
8212 
8213hunk ./src/allmydata/mutable/retrieve.py 468
8214-    def _got_results_one_share(self, shnum, peerid,
8215-                               got_prefix, got_hash_and_data):
8216-        self.log("_got_results: got shnum #%d from peerid %s"
8217-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
8218-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8219-         offsets_tuple) = self.verinfo
8220-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
8221-        if got_prefix != prefix:
8222-            msg = "someone wrote to the data since we read the servermap: prefix changed"
8223-            raise UncoordinatedWriteError(msg)
8224-        (share_hash_chain, block_hash_tree,
8225-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
8226 
8227hunk ./src/allmydata/mutable/retrieve.py 469
8228-        assert isinstance(share_data, str)
8229-        # build the block hash tree. SDMF has only one leaf.
8230-        leaves = [hashutil.block_hash(share_data)]
8231-        t = hashtree.HashTree(leaves)
8232-        if list(t) != block_hash_tree:
8233-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
8234-        share_hash_leaf = t[0]
8235-        t2 = hashtree.IncompleteHashTree(N)
8236-        # root_hash was checked by the signature
8237-        t2.set_hashes({0: root_hash})
8238-        try:
8239-            t2.set_hashes(hashes=share_hash_chain,
8240-                          leaves={shnum: share_hash_leaf})
8241-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
8242-                IndexError), e:
8243-            msg = "corrupt hashes: %s" % (e,)
8244-            raise CorruptShareError(peerid, shnum, msg)
8245-        self.log(" data valid! len=%d" % len(share_data))
8246-        # each query comes down to this: placing validated share data into
8247-        # self.shares
8248-        self.shares[shnum] = share_data
8249+    def _validate_active_prefixes(self):
8250+        """
8251+        I check to make sure that the prefixes on the peers that I am
8252+        currently reading from match the prefix that we want to see, as
8253+        said in self.verinfo.
8254 
8255hunk ./src/allmydata/mutable/retrieve.py 475
8256-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
8257+        If I find that all of the active peers have acceptable prefixes,
8258+        I pass control to _download_current_segment, which will use
8259+        those peers to do cool things. If I find that some of the active
8260+        peers have unacceptable prefixes, I will remove them from active
8261+        peers (and from further consideration) and call
8262+        _add_active_peers to attempt to rectify the situation. I keep
8263+        track of which peers I have already validated so that I don't
8264+        need to do so again.
8265+        """
8266+        assert self._active_readers, "No more active readers"
8267 
8268hunk ./src/allmydata/mutable/retrieve.py 486
8269-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8270-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8271-        if alleged_writekey != self._node.get_writekey():
8272-            self.log("invalid privkey from %s shnum %d" %
8273-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
8274-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
8275-            return
8276+        ds = []
8277+        new_readers = set(self._active_readers) - self._validated_readers
8278+        self.log('validating %d newly-added active readers' % len(new_readers))
8279 
8280hunk ./src/allmydata/mutable/retrieve.py 490
8281-        # it's good
8282-        self.log("got valid privkey from shnum %d on peerid %s" %
8283-                 (shnum, idlib.shortnodeid_b2a(peerid)),
8284-                 parent=lp)
8285-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8286-        self._node._populate_encprivkey(enc_privkey)
8287-        self._node._populate_privkey(privkey)
8288-        self._need_privkey = False
8289+        for reader in new_readers:
8290+            # We force a remote read here -- otherwise, we are relying
8291+            # on cached data that we already verified as valid, and we
8292+            # won't detect an uncoordinated write that has occurred
8293+            # since the last servermap update.
8294+            d = reader.get_prefix(force_remote=True)
8295+            d.addCallback(self._try_to_validate_prefix, reader)
8296+            ds.append(d)
8297+        dl = defer.DeferredList(ds, consumeErrors=True)
8298+        def _check_results(results):
8299+            # Each result in results will be of the form (success, msg).
8300+            # We don't care about msg, but success will tell us whether
8301+            # or not the checkstring validated. If it didn't, we need to
8302+            # remove the offending (peer,share) from our active readers,
8303+            # and ensure that active readers is again populated.
8304+            bad_readers = []
8305+            for i, result in enumerate(results):
8306+                if not result[0]:
8307+                    reader = self._active_readers[i]
8308+                    f = result[1]
8309+                    assert isinstance(f, failure.Failure)
8310 
8311hunk ./src/allmydata/mutable/retrieve.py 512
8312-    def _query_failed(self, f, marker, peerid):
8313-        self.log(format="query to [%(peerid)s] failed",
8314-                 peerid=idlib.shortnodeid_b2a(peerid),
8315-                 level=log.NOISY)
8316-        self._status.problems[peerid] = f
8317-        self._outstanding_queries.pop(marker, None)
8318-        if not self._running:
8319-            return
8320-        self._last_failure = f
8321-        self.remove_peer(peerid)
8322-        level = log.WEIRD
8323-        if f.check(DeadReferenceError):
8324-            level = log.UNUSUAL
8325-        self.log(format="error during query: %(f_value)s",
8326-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
8327+                    self.log("The reader %s failed to "
8328+                             "properly validate: %s" % \
8329+                             (reader, str(f.value)))
8330+                    bad_readers.append((reader, f))
8331+                else:
8332+                    reader = self._active_readers[i]
8333+                    self.log("the reader %s checks out, so we'll use it" % \
8334+                             reader)
8335+                    self._validated_readers.add(reader)
8336+                    # Each time we validate a reader, we check to see if
8337+                    # we need the private key. If we do, we politely ask
8338+                    # for it and then continue computing. If we find
8339+                    # that we haven't gotten it at the end of
8340+                    # segment decoding, then we'll take more drastic
8341+                    # measures.
8342+                    if self._need_privkey and not self._node.is_readonly():
8343+                        d = reader.get_encprivkey()
8344+                        d.addCallback(self._try_to_validate_privkey, reader)
8345+            if bad_readers:
8346+                # We do them all at once, or else we screw up list indexing.
8347+                for (reader, f) in bad_readers:
8348+                    self._mark_bad_share(reader, f)
8349+                if self._verify:
8350+                    if len(self._active_readers) >= self._required_shares:
8351+                        return self._download_current_segment()
8352+                    else:
8353+                        return self._failed()
8354+                else:
8355+                    return self._add_active_peers()
8356+            else:
8357+                return self._download_current_segment()
8358+            # The next step will assert that it has enough active
8359+            # readers to fetch shares; bad readers must be removed first.
8360+        dl.addCallback(_check_results)
8361+        return dl
8362 
8363hunk ./src/allmydata/mutable/retrieve.py 548
8364-    def _check_for_done(self, res):
8365-        # exit paths:
8366-        #  return : keep waiting, no new queries
8367-        #  return self._send_more_queries(outstanding) : send some more queries
8368-        #  fire self._done(plaintext) : download successful
8369-        #  raise exception : download fails
8370 
8371hunk ./src/allmydata/mutable/retrieve.py 549
8372-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
8373-                 running=self._running, decoding=self._decoding,
8374-                 level=log.NOISY)
8375-        if not self._running:
8376-            return
8377-        if self._decoding:
8378-            return
8379-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8380+    def _try_to_validate_prefix(self, prefix, reader):
8381+        """
8382+        I check that the prefix returned by a candidate server for
8383+        retrieval matches the prefix that the servermap knows about
8384+        (and, hence, the prefix that was validated earlier). If it does,
8385+        I return True, which means that I approve of the use of the
8386+        candidate server for segment retrieval. If it doesn't, I return
8387+        False, which means that another server must be chosen.
8388+        """
8389+        (seqnum,
8390+         root_hash,
8391+         IV,
8392+         segsize,
8393+         datalength,
8394+         k,
8395+         N,
8396+         known_prefix,
8397          offsets_tuple) = self.verinfo
8398hunk ./src/allmydata/mutable/retrieve.py 567
8399+        if known_prefix != prefix:
8400+            self.log("prefix from share %d doesn't match" % reader.shnum)
8401+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
8402+                                          "indicate an uncoordinated write")
8403+        # Otherwise, we're okay -- no issues.
8404 
8405hunk ./src/allmydata/mutable/retrieve.py 573
8406-        if len(self.shares) < k:
8407-            # we don't have enough shares yet
8408-            return self._maybe_send_more_queries(k)
8409-        if self._need_privkey:
8410-            # we got k shares, but none of them had a valid privkey. TODO:
8411-            # look further. Adding code to do this is a bit complicated, and
8412-            # I want to avoid that complication, and this should be pretty
8413-            # rare (k shares with bitflips in the enc_privkey but not in the
8414-            # data blocks). If we actually do get here, the subsequent repair
8415-            # will fail for lack of a privkey.
8416-            self.log("got k shares but still need_privkey, bummer",
8417-                     level=log.WEIRD, umid="MdRHPA")
8418 
8419hunk ./src/allmydata/mutable/retrieve.py 574
8420-        # we have enough to finish. All the shares have had their hashes
8421-        # checked, so if something fails at this point, we don't know how
8422-        # to fix it, so the download will fail.
8423+    def _remove_reader(self, reader):
8424+        """
8425+        At various points, we will wish to remove a peer from
8426+        consideration and/or use. Reasons include, but are not
8427+        necessarily limited to:
8428 
8429hunk ./src/allmydata/mutable/retrieve.py 580
8430-        self._decoding = True # avoid reentrancy
8431-        self._status.set_status("decoding")
8432-        now = time.time()
8433-        elapsed = now - self._started
8434-        self._status.timings["fetch"] = elapsed
8435+            - A connection error.
8436+            - A mismatched prefix (that is, a prefix that does not match
8437+              our conception of the version information string).
8438+            - A failing block hash, salt hash, or share hash, which can
8439+              indicate disk failure/bit flips, or network trouble.
8440 
8441hunk ./src/allmydata/mutable/retrieve.py 586
8442-        d = defer.maybeDeferred(self._decode)
8443-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
8444-        d.addBoth(self._done)
8445-        return d # purely for test convenience
8446+        This method will do that. I will make sure that the
8447+        (shnum,reader) combination represented by my reader argument is
8448+        not used for anything else during this download. I will not
8449+        advise the reader of any corruption, something that my callers
8450+        may wish to do on their own.
8451+        """
8452+        # TODO: When you're done writing this, see if this is ever
8453+        # actually used for something that _mark_bad_share isn't. I have
8454+        # a feeling that they will be used for very similar things, and
8455+        # that having them both here is just going to be an epic amount
8456+        # of code duplication.
8457+        #
8458+        # (well, okay, not epic, but meaningful)
8459+        self.log("removing reader %s" % reader)
8460+        # Remove the reader from _active_readers
8461+        self._active_readers.remove(reader)
8462+        # TODO: self.readers.remove(reader)?
8463+        for shnum in list(self.remaining_sharemap.keys()):
8464+            self.remaining_sharemap.discard(shnum, reader.peerid)
8465 
8466hunk ./src/allmydata/mutable/retrieve.py 606
8467-    def _maybe_send_more_queries(self, k):
8468-        # we don't have enough shares yet. Should we send out more queries?
8469-        # There are some number of queries outstanding, each for a single
8470-        # share. If we can generate 'needed_shares' additional queries, we do
8471-        # so. If we can't, then we know this file is a goner, and we raise
8472-        # NotEnoughSharesError.
8473-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
8474-                         "outstanding=%(outstanding)d"),
8475-                 have=len(self.shares), k=k,
8476-                 outstanding=len(self._outstanding_queries),
8477-                 level=log.NOISY)
8478 
8479hunk ./src/allmydata/mutable/retrieve.py 607
8480-        remaining_shares = k - len(self.shares)
8481-        needed = remaining_shares - len(self._outstanding_queries)
8482-        if not needed:
8483-            # we have enough queries in flight already
8484+    def _mark_bad_share(self, reader, f):
8485+        """
8486+        I mark the (peerid, shnum) encapsulated by my reader argument as
8487+        a bad share, which means that it will not be used anywhere else.
8488 
8489hunk ./src/allmydata/mutable/retrieve.py 612
8490-            # TODO: but if they've been in flight for a long time, and we
8491-            # have reason to believe that new queries might respond faster
8492-            # (i.e. we've seen other queries come back faster, then consider
8493-            # sending out new queries. This could help with peers which have
8494-            # silently gone away since the servermap was updated, for which
8495-            # we're still waiting for the 15-minute TCP disconnect to happen.
8496-            self.log("enough queries are in flight, no more are needed",
8497-                     level=log.NOISY)
8498-            return
8499+        There are several reasons to want to mark something as a bad
8500+        share. These include:
8501+
8502+            - A connection error to the peer.
8503+            - A mismatched prefix (that is, a prefix that does not match
8504+              our local conception of the version information string).
8505+            - A failing block hash, salt hash, share hash, or other
8506+              integrity check.
8507 
8508hunk ./src/allmydata/mutable/retrieve.py 621
8509-        outstanding_shnums = set([shnum
8510-                                  for (peerid, shnum, started)
8511-                                  in self._outstanding_queries.values()])
8512-        # prefer low-numbered shares, they are more likely to be primary
8513-        available_shnums = sorted(self.remaining_sharemap.keys())
8514-        for shnum in available_shnums:
8515-            if shnum in outstanding_shnums:
8516-                # skip ones that are already in transit
8517-                continue
8518-            if shnum not in self.remaining_sharemap:
8519-                # no servers for that shnum. note that DictOfSets removes
8520-                # empty sets from the dict for us.
8521-                continue
8522-            peerid = list(self.remaining_sharemap[shnum])[0]
8523-            # get_data will remove that peerid from the sharemap, and add the
8524-            # query to self._outstanding_queries
8525-            self._status.set_status("Retrieving More Shares")
8526-            self.get_data(shnum, peerid)
8527-            needed -= 1
8528-            if not needed:
8529+        This method will ensure that readers that we wish to mark bad
8530+        (for these reasons or other reasons) are not used for the rest
8531+        of the download. Additionally, it will attempt to tell the
8532+        remote peer (with no guarantee of success) that its share is
8533+        corrupt.
8534+        """
8535+        self.log("marking share %d on server %s as bad" % \
8536+                 (reader.shnum, reader))
8537+        prefix = self.verinfo[-2]
8538+        self.servermap.mark_bad_share(reader.peerid,
8539+                                      reader.shnum,
8540+                                      prefix)
8541+        self._remove_reader(reader)
8542+        self._bad_shares.add((reader.peerid, reader.shnum, f))
8543+        self._status.problems[reader.peerid] = f
8544+        self._last_failure = f
8545+        self.notify_server_corruption(reader.peerid, reader.shnum,
8546+                                      str(f.value))
8547+
8548+
8549+    def _download_current_segment(self):
8550+        """
8551+        I download, validate, decode, decrypt, and assemble the segment
8552+        that this Retrieve is currently responsible for downloading.
8553+        """
8554+        assert len(self._active_readers) >= self._required_shares
8555+        if self._current_segment <= self._last_segment:
8556+            d = self._process_segment(self._current_segment)
8557+        else:
8558+            d = defer.succeed(None)
8559+        d.addBoth(self._turn_barrier)
8560+        d.addCallback(self._check_for_done)
8561+        return d
8562+
8563+
8564+    def _turn_barrier(self, result):
8565+        """
8566+        I help the download process avoid the recursion limit issues
8567+        discussed in #237.
8568+        """
8569+        return fireEventually(result)
8570+
8571+
8572+    def _process_segment(self, segnum):
8573+        """
8574+        I download, validate, decode, and decrypt one segment of the
8575+        file that this Retrieve is retrieving. This means coordinating
8576+        the process of getting k blocks of that file, validating them,
8577+        assembling them into one segment with the decoder, and then
8578+        decrypting them.
8579+        """
8580+        self.log("processing segment %d" % segnum)
8581+
8582+        # TODO: The old code uses a marker. Should this code do that
8583+        # too? What did the Marker do?
8584+        assert len(self._active_readers) >= self._required_shares
8585+
8586+        # We need to ask each of our active readers for its block and
8587+        # salt. We will then validate those. If validation is
8588+        # successful, we will assemble the results into plaintext.
8589+        ds = []
8590+        for reader in self._active_readers:
8591+            started = time.time()
8592+            d = reader.get_block_and_salt(segnum, queue=True)
8593+            d2 = self._get_needed_hashes(reader, segnum)
8594+            dl = defer.DeferredList([d, d2], consumeErrors=True)
8595+            dl.addCallback(self._validate_block, segnum, reader, started)
8596+            dl.addErrback(self._validation_or_decoding_failed, [reader])
8597+            ds.append(dl)
8598+            reader.flush()
8599+        dl = defer.DeferredList(ds)
8600+        if self._verify:
8601+            dl.addCallback(lambda ignored: "")
8602+            dl.addCallback(self._set_segment)
8603+        else:
8604+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
8605+        return dl
8606+
8607+
8608+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
8609+        """
8610+        I take the results of fetching and validating the blocks from a
8611+        callback chain in another method. If the results are such that
8612+        they tell me that validation and fetching succeeded without
8613+        incident, I will proceed with decoding and decryption.
8614+        Otherwise, I will do nothing.
8615+        """
8616+        self.log("trying to decode and decrypt segment %d" % segnum)
8617+        failures = False
8618+        for block_and_salt in blocks_and_salts:
8619+            if not block_and_salt[0] or block_and_salt[1] is None:
8620+                self.log("some validation operations failed; not proceeding")
8621+                failures = True
8622                 break
8623hunk ./src/allmydata/mutable/retrieve.py 715
8624+        if not failures:
8625+            self.log("everything looks ok, building segment %d" % segnum)
8626+            d = self._decode_blocks(blocks_and_salts, segnum)
8627+            d.addCallback(self._decrypt_segment)
8628+            d.addErrback(self._validation_or_decoding_failed,
8629+                         self._active_readers)
8630+            # check to see whether we've been paused before writing
8631+            # anything.
8632+            d.addCallback(self._check_for_paused)
8633+            d.addCallback(self._set_segment)
8634+            return d
8635+        else:
8636+            return defer.succeed(None)
8637+
8638+
8639+    def _set_segment(self, segment):
8640+        """
8641+        Given a plaintext segment, I register that segment with the
8642+        target that is handling the file download.
8643+        """
8644+        self.log("got plaintext for segment %d" % self._current_segment)
8645+        if self._current_segment == self._start_segment:
8646+            # We're on the first segment. It's possible that we want
8647+            # only some part of the end of this segment, and that we
8648+            # just downloaded the whole thing to get that part. If so,
8649+            # we need to account for that and give the reader just the
8650+            # data that they want.
8651+            n = self._offset % self._segment_size
8652+            self.log("stripping %d bytes off of the first segment" % n)
8653+            self.log("original segment length: %d" % len(segment))
8654+            segment = segment[n:]
8655+            self.log("new segment length: %d" % len(segment))
8656+
8657+        if self._current_segment == self._last_segment and self._read_length is not None:
8658+            # We're on the last segment. It's possible that we only want
8659+            # part of the beginning of this segment, and that we
8660+            # downloaded the whole thing anyway. Make sure to give the
8661+            # caller only the portion of the segment that they want to
8662+            # receive.
8663+            extra = self._read_length
8664+            if self._start_segment != self._last_segment:
8665+                extra -= self._segment_size - \
8666+                            (self._offset % self._segment_size)
8667+            extra %= self._segment_size
8668+            self.log("original segment length: %d" % len(segment))
8669+            segment = segment[:extra]
8670+            self.log("new segment length: %d" % len(segment))
8671+            self.log("only taking %d bytes of the last segment" % extra)
8672+
8673+        if not self._verify:
8674+            self._consumer.write(segment)
8675+        else:
8676+            # we don't care about the plaintext if we are doing a verify.
8677+            segment = None
8678+        self._current_segment += 1
8679 
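The two trims above are plain modular arithmetic; a worked example with
made-up numbers (1000-byte segments, a 1200-byte read starting at offset
2500) may help:

    segment_size = 1000                  # hypothetical encoding parameter
    offset, read_length = 2500, 1200     # hypothetical caller request

    start_segment = offset // segment_size                     # 2
    last_segment = (offset + read_length - 1) // segment_size  # 3

    n = offset % segment_size        # 500 bytes stripped from segment 2
    extra = read_length
    if start_segment != last_segment:
        extra -= segment_size - (offset % segment_size)        # 1200 - 500
    extra %= segment_size            # 700 bytes kept from segment 3
    # in this two-segment case the two pieces cover the whole read:
    assert (segment_size - n) + extra == read_length           # 500 + 700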
8680hunk ./src/allmydata/mutable/retrieve.py 771
8681-        # at this point, we have as many outstanding queries as we can. If
8682-        # needed!=0 then we might not have enough to recover the file.
8683-        if needed:
8684-            format = ("ran out of peers: "
8685-                      "have %(have)d shares (k=%(k)d), "
8686-                      "%(outstanding)d queries in flight, "
8687-                      "need %(need)d more, "
8688-                      "found %(bad)d bad shares")
8689-            args = {"have": len(self.shares),
8690-                    "k": k,
8691-                    "outstanding": len(self._outstanding_queries),
8692-                    "need": needed,
8693-                    "bad": len(self._bad_shares),
8694-                    }
8695-            self.log(format=format,
8696-                     level=log.WEIRD, umid="ezTfjw", **args)
8697-            err = NotEnoughSharesError("%s, last failure: %s" %
8698-                                      (format % args, self._last_failure))
8699-            if self._bad_shares:
8700-                self.log("We found some bad shares this pass. You should "
8701-                         "update the servermap and try again to check "
8702-                         "more peers",
8703-                         level=log.WEIRD, umid="EFkOlA")
8704-                err.servermap = self.servermap
8705-            raise err
8706 
8707hunk ./src/allmydata/mutable/retrieve.py 772
8708+    def _validation_or_decoding_failed(self, f, readers):
8709+        """
8710+        I am called when a block or a salt fails to correctly validate, or when
8711+        the decryption or decoding operation fails for some reason.  I react to
8712+        this failure by notifying the remote server of corruption, and then
8713+        removing the remote peer from further activity.
8714+        """
8715+        assert isinstance(readers, list)
8716+        bad_shnums = [reader.shnum for reader in readers]
8717+
8718+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
8719+                 "segment %d: %s" % \
8720+                 (bad_shnums, readers, self._current_segment, str(f)))
8721+        for reader in readers:
8722+            self._mark_bad_share(reader, f)
8723         return
8724 
8725hunk ./src/allmydata/mutable/retrieve.py 789
8726-    def _decode(self):
8727-        started = time.time()
8728-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8729-         offsets_tuple) = self.verinfo
8730 
8731hunk ./src/allmydata/mutable/retrieve.py 790
8732-        # shares_dict is a dict mapping shnum to share data, but the codec
8733-        # wants two lists.
8734-        shareids = []; shares = []
8735-        for shareid, share in self.shares.items():
8736+    def _validate_block(self, results, segnum, reader, started):
8737+        """
8738+        I validate a block from one share on a remote server.
8739+        """
8740+        # Grab the part of the block hash tree that is necessary to
8741+        # validate this block, then generate the block hash root.
8742+        self.log("validating share %d for segment %d" % (reader.shnum,
8743+                                                             segnum))
8744+        self._status.add_fetch_timing(reader.peerid, started)
8745+        self._status.set_status("Validating blocks for segment %d" % segnum)
8746+        # Did we fail to fetch either of the things that we were
8747+        # supposed to? Fail if so.
8748+        if not results[0][0] or not results[1][0]:
8749+            # handled by the errback handler.
8750+
8751+            # These all get batched into one query, so the resulting
8752+            # failure should be the same for all of them, so we can just
8753+            # use the first one.
8754+            assert isinstance(results[0][1], failure.Failure)
8755+
8756+            f = results[0][1]
8757+            raise CorruptShareError(reader.peerid,
8758+                                    reader.shnum,
8759+                                    "Connection error: %s" % str(f))
8760+
8761+        block_and_salt, block_and_sharehashes = results
8762+        block, salt = block_and_salt[1]
8763+        blockhashes, sharehashes = block_and_sharehashes[1]
8764+
8765+        blockhashes = dict(enumerate(blockhashes[1]))
8766+        self.log("the reader gave me the following blockhashes: %s" % \
8767+                 blockhashes.keys())
8768+        self.log("the reader gave me the following sharehashes: %s" % \
8769+                 sharehashes[1].keys())
8770+        bht = self._block_hash_trees[reader.shnum]
8771+
8772+        if bht.needed_hashes(segnum, include_leaf=True):
8773+            try:
8774+                bht.set_hashes(blockhashes)
8775+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8776+                    IndexError), e:
8777+                raise CorruptShareError(reader.peerid,
8778+                                        reader.shnum,
8779+                                        "block hash tree failure: %s" % e)
8780+
8781+        if self._version == MDMF_VERSION:
8782+            blockhash = hashutil.block_hash(salt + block)
8783+        else:
8784+            blockhash = hashutil.block_hash(block)
8785+        # If this works without an error, then validation is
8786+        # successful.
8787+        try:
8788+            bht.set_hashes(leaves={segnum: blockhash})
8789+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8790+                IndexError), e:
8791+            raise CorruptShareError(reader.peerid,
8792+                                    reader.shnum,
8793+                                    "block hash tree failure: %s" % e)
8794+
8795+        # Reaching this point means that we know that this segment
8796+        # is correct. Now we need to check to see whether the share
8797+        # hash chain is also correct.
8798+        # SDMF wrote share hash chains that didn't contain the
8799+        # leaves, which would be produced from the block hash tree.
8800+        # So we need to validate the block hash tree first. If
8801+        # successful, then bht[0] will contain the root for the
8802+        # shnum, which will be a leaf in the share hash tree, which
8803+        # will allow us to validate the rest of the tree.
8804+        if self.share_hash_tree.needed_hashes(reader.shnum,
8805+                                              include_leaf=True) or \
8806+                                              self._verify:
8807+            try:
8808+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
8809+                                            leaves={reader.shnum: bht[0]})
8810+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8811+                    IndexError), e:
8812+                raise CorruptShareError(reader.peerid,
8813+                                        reader.shnum,
8814+                                        "corrupt hashes: %s" % e)
8815+
8816+        self.log('share %d is valid for segment %d' % (reader.shnum,
8817+                                                       segnum))
8818+        return {reader.shnum: (block, salt)}
8819+
8820+
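# A sketch of the validation dance above, assuming allmydata.hashtree's
# HashTree (the publisher's complete tree) and IncompleteHashTree (the
# downloader's initially-empty copy) behave as this method uses them:
from allmydata import hashtree
from allmydata.util import hashutil

blocks = ["block %d" % i for i in range(4)]
full_tree = hashtree.HashTree([hashutil.block_hash(b) for b in blocks])

bht = hashtree.IncompleteHashTree(4)
needed = bht.needed_hashes(2, include_leaf=True)
# in the method above, these hashes come from the remote reader:
bht.set_hashes(dict([(i, full_tree[i]) for i in needed]))
# this call raises BadHashError/NotEnoughHashesError on a corrupt block;
# afterwards bht[0] is the root to check against the share hash tree:
bht.set_hashes(leaves={2: hashutil.block_hash(blocks[2])})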
8821+    def _get_needed_hashes(self, reader, segnum):
8822+        """
8823+        I get the hashes needed to validate segnum from the reader, then return
8824+        to my caller when this is done.
8825+        """
8826+        bht = self._block_hash_trees[reader.shnum]
8827+        needed = bht.needed_hashes(segnum, include_leaf=True)
8828+        # The root of the block hash tree is also a leaf in the share
8829+        # hash tree. So we don't need to fetch it from the remote
8830+        # server. In the case of files with one segment, this means that
8831+        # we won't fetch any block hash tree from the remote server,
8832+        # since the hash of each share of the file is the entire block
8833+        # hash tree, and is a leaf in the share hash tree. This is fine,
8834+        # since any share corruption will be detected in the share hash
8835+        # tree.
8836+        #needed.discard(0)
8837+        self.log("getting blockhashes for segment %d, share %d: %s" % \
8838+                 (segnum, reader.shnum, str(needed)))
8839+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
8840+        if self.share_hash_tree.needed_hashes(reader.shnum):
8841+            need = self.share_hash_tree.needed_hashes(reader.shnum)
8842+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
8843+                                                                 str(need)))
8844+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
8845+        else:
8846+            d2 = defer.succeed({}) # the logic in the next method
8847+                                   # expects a dict
8848+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
8849+        return dl
8850+
8851+
8852+    def _decode_blocks(self, blocks_and_salts, segnum):
8853+        """
8854+        I take a list of k blocks and salts, and decode that into a
8855+        single encrypted segment.
8856+        """
8857+        d = {}
8858+        # We want to merge our dictionaries to the form
8859+        # {shnum: blocks_and_salts}
8860+        #
8861+        # The dictionaries come from _validate_block in that form, so we just
8862+        # need to merge them.
8863+        for block_and_salt in blocks_and_salts:
8864+            d.update(block_and_salt[1])
8865+
8866+        # All of these blocks should have the same salt; in SDMF, it is
8867+        # the file-wide IV, while in MDMF it is the per-segment salt. In
8868+        # either case, we just need to get one of them and use it.
8869+        #
8870+        # d.items()[0] is like (shnum, (block, salt))
8871+        # d.items()[0][1] is like (block, salt)
8872+        # d.items()[0][1][1] is the salt.
8873+        salt = d.items()[0][1][1]
8874+        # Next, extract just the blocks from the dict. We'll use the
8875+        # salt in the next step.
8876+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
8877+        d2 = dict(share_and_shareids)
8878+        shareids = []
8879+        shares = []
8880+        for shareid, share in d2.items():
8881             shareids.append(shareid)
8882             shares.append(share)
8883 
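With made-up payloads, the DeferredList hands this method something like
the following, and the merge plus salt extraction reduce to:

    blocks_and_salts = [(True, {0: ("block-sh0", "salt")}),
                        (True, {3: ("block-sh3", "salt")}),
                        (True, {7: ("block-sh7", "salt")})]
    d = {}
    for (_, result) in blocks_and_salts:
        d.update(result)          # {0: (block, salt), 3: ..., 7: ...}
    salt = d.items()[0][1][1]     # every entry carries the same salt
    shareids = [k for k, v in d.items()]
    shares = [v[0] for k, v in d.items()]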
8884hunk ./src/allmydata/mutable/retrieve.py 938
8885-        assert len(shareids) >= k, len(shareids)
8886+        self._status.set_status("Decoding")
8887+        started = time.time()
8888+        assert len(shareids) >= self._required_shares, len(shareids)
8889         # zfec really doesn't want extra shares
8890hunk ./src/allmydata/mutable/retrieve.py 942
8891-        shareids = shareids[:k]
8892-        shares = shares[:k]
8893-
8894-        fec = codec.CRSDecoder()
8895-        fec.set_params(segsize, k, N)
8896-
8897-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
8898-        self.log("about to decode, shareids=%s" % (shareids,))
8899-        d = defer.maybeDeferred(fec.decode, shares, shareids)
8900-        def _done(buffers):
8901-            self._status.timings["decode"] = time.time() - started
8902-            self.log(" decode done, %d buffers" % len(buffers))
8903+        shareids = shareids[:self._required_shares]
8904+        shares = shares[:self._required_shares]
8905+        self.log("decoding segment %d" % segnum)
8906+        if segnum == self._num_segments - 1:
8907+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
8908+        else:
8909+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
8910+        def _process(buffers):
8911             segment = "".join(buffers)
8912hunk ./src/allmydata/mutable/retrieve.py 951
8913+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
8914+                     segnum=segnum,
8915+                     numsegs=self._num_segments,
8916+                     level=log.NOISY)
8917             self.log(" joined length %d, datalength %d" %
8918hunk ./src/allmydata/mutable/retrieve.py 956
8919-                     (len(segment), datalength))
8920-            segment = segment[:datalength]
8921+                     (len(segment), self._data_length))
8922+            if segnum == self._num_segments - 1:
8923+                size_to_use = self._tail_data_size
8924+            else:
8925+                size_to_use = self._segment_size
8926+            segment = segment[:size_to_use]
8927             self.log(" segment len=%d" % len(segment))
8928hunk ./src/allmydata/mutable/retrieve.py 963
8929-            return segment
8930-        def _err(f):
8931-            self.log(" decode failed: %s" % f)
8932-            return f
8933-        d.addCallback(_done)
8934-        d.addErrback(_err)
8935+            self._status.timings.setdefault("decode", 0)
8936+            self._status.timings["decode"] += time.time() - started
8937+            return segment, salt
8938+        d.addCallback(_process)
8939         return d
8940 
8941hunk ./src/allmydata/mutable/retrieve.py 969
8942-    def _decrypt(self, crypttext, IV, readkey):
8943+
8944+    def _decrypt_segment(self, segment_and_salt):
8945+        """
8946+        I take a single segment and its salt, and decrypt it. I return
8947+        the plaintext of the segment that is in my argument.
8948+        """
8949+        segment, salt = segment_and_salt
8950         self._status.set_status("decrypting")
8951hunk ./src/allmydata/mutable/retrieve.py 977
8952+        self.log("decrypting segment %d" % self._current_segment)
8953         started = time.time()
8954hunk ./src/allmydata/mutable/retrieve.py 979
8955-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
8956+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
8957         decryptor = AES(key)
8958hunk ./src/allmydata/mutable/retrieve.py 981
8959-        plaintext = decryptor.process(crypttext)
8960-        self._status.timings["decrypt"] = time.time() - started
8961+        plaintext = decryptor.process(segment)
8962+        self._status.timings.setdefault("decrypt", 0)
8963+        self._status.timings["decrypt"] += time.time() - started
8964         return plaintext
8965 
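Stripped of status bookkeeping, the step above reduces to one key
derivation plus one cipher pass; the AES class this module imports from
pycryptopp is a counter-mode cipher, so the same process() call serves
for encryption and decryption (decrypt_segment is a hypothetical name):

    from pycryptopp.cipher.aes import AES
    from allmydata.util import hashutil

    def decrypt_segment(segment, salt, readkey):
        # the salt is per-segment in MDMF and the file-wide IV in SDMF,
        # so each segment can be decrypted independently of the others
        key = hashutil.ssk_readkey_data_hash(salt, readkey)
        return AES(key).process(segment)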
8966hunk ./src/allmydata/mutable/retrieve.py 986
8967-    def _done(self, res):
8968-        if not self._running:
8969+
8970+    def notify_server_corruption(self, peerid, shnum, reason):
8971+        ss = self.servermap.connections[peerid]
8972+        ss.callRemoteOnly("advise_corrupt_share",
8973+                          "mutable", self._storage_index, shnum, reason)
8974+
8975+
8976+    def _try_to_validate_privkey(self, enc_privkey, reader):
8977+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8978+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8979+        if alleged_writekey != self._node.get_writekey():
8980+            self.log("invalid privkey from %s shnum %d" %
8981+                     (reader, reader.shnum),
8982+                     level=log.WEIRD, umid="YIw4tA")
8983+            if self._verify:
8984+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
8985+                                              self.verinfo[-2])
8986+                e = CorruptShareError(reader.peerid,
8987+                                      reader.shnum,
8988+                                      "invalid privkey")
8989+                f = failure.Failure(e)
8990+                self._bad_shares.add((reader.peerid, reader.shnum, f))
8991             return
8992hunk ./src/allmydata/mutable/retrieve.py 1009
8993+
8994+        # it's good
8995+        self.log("got valid privkey from shnum %d on reader %s" %
8996+                 (reader.shnum, reader))
8997+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8998+        self._node._populate_encprivkey(enc_privkey)
8999+        self._node._populate_privkey(privkey)
9000+        self._need_privkey = False
9001+
9002+
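# The check above, distilled (same calls, minus caching and logging):
# the fetched key is genuine iff hashing it reproduces the writekey we
# already hold from the write cap. privkey_is_valid is a hypothetical
# helper name.
def privkey_is_valid(node, enc_privkey):
    alleged_privkey_s = node._decrypt_privkey(enc_privkey)
    return hashutil.ssk_writekey_hash(alleged_privkey_s) == \
           node.get_writekey()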
9003+    def _check_for_done(self, res):
9004+        """
9005+        I check to see if this Retrieve object has successfully finished
9006+        its work.
9007+
9008+        I can exit in the following ways:
9009+            - If there are no more segments to download, then I exit by
9010+              causing self._done_deferred to fire with the plaintext
9011+              content requested by the caller.
9012+            - If there are still segments to be downloaded, and there
9013+              are enough active readers (readers which have not broken
9014+              and have not given us corrupt data) to continue
9015+              downloading, I send control back to
9016+              _download_current_segment.
9017+            - If there are still segments to be downloaded but there are
9018+              not enough active peers to download them, I ask
9019+              _add_active_peers to add more peers. If it is successful,
9020+              it will call _download_current_segment. If there are not
9021+              enough peers to retrieve the file, then that will cause
9022+              _done_deferred to errback.
9023+        """
9024+        self.log("checking for doneness")
9025+        if self._current_segment > self._last_segment:
9026+            # No more segments to download, we're done.
9027+            self.log("got plaintext, done")
9028+            return self._done()
9029+
9030+        if len(self._active_readers) >= self._required_shares:
9031+            # More segments to download, but we have enough good peers
9032+            # in self._active_readers that we can do that without issue,
9033+            # so go nab the next segment.
9034+            self.log("not done yet: on segment %d of %d" % \
9035+                     (self._current_segment + 1, self._num_segments))
9036+            return self._download_current_segment()
9037+
9038+        self.log("not done yet: on segment %d of %d, need to add peers" % \
9039+                 (self._current_segment + 1, self._num_segments))
9040+        return self._add_active_peers()
9041+
9042+
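# The resulting segment loop, reduced to its skeleton:
#
#   _download_current_segment -> _process_segment -> _check_for_done
#     -> past the last segment?  _done fires _done_deferred
#     -> enough active readers?  _download_current_segment, next segment
#     -> too few active readers? _add_active_peers, which either resumes
#                                the loop or errbacks through _failed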
9043+    def _done(self):
9044+        """
9045+        I am called by _check_for_done when the download process has
9046+        finished successfully. After making some useful logging
9047+        statements, I return the decrypted contents to the owner of this
9048+        Retrieve object through self._done_deferred.
9049+        """
9050         self._running = False
9051         self._status.set_active(False)
9052hunk ./src/allmydata/mutable/retrieve.py 1068
9053-        self._status.timings["total"] = time.time() - self._started
9054-        # res is either the new contents, or a Failure
9055-        if isinstance(res, failure.Failure):
9056-            self.log("Retrieve done, with failure", failure=res,
9057-                     level=log.UNUSUAL)
9058-            self._status.set_status("Failed")
9059+        now = time.time()
9060+        self._status.timings['total'] = now - self._started
9061+        self._status.timings['fetch'] = now - self._started_fetching
9062+
9063+        if self._verify:
9064+            ret = list(self._bad_shares)
9065+            self.log("done verifying, found %d bad shares" % len(ret))
9066         else:
9067hunk ./src/allmydata/mutable/retrieve.py 1076
9068-            self.log("Retrieve done, success!")
9069-            self._status.set_status("Finished")
9070-            self._status.set_progress(1.0)
9071-            # remember the encoding parameters, use them again next time
9072-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9073-             offsets_tuple) = self.verinfo
9074-            self._node._populate_required_shares(k)
9075-            self._node._populate_total_shares(N)
9076-        eventually(self._done_deferred.callback, res)
9077+            # TODO: upload status here?
9078+            ret = self._consumer
9079+            self._consumer.unregisterProducer()
9080+        eventually(self._done_deferred.callback, ret)
9081+
9082 
9083hunk ./src/allmydata/mutable/retrieve.py 1082
9084+    def _failed(self):
9085+        """
9086+        I am called by _add_active_peers when there are not enough
9087+        active peers left to complete the download. After making some
9088+        useful logging statements, I return an exception to that effect
9089+        to the caller of this Retrieve object through
9090+        self._done_deferred.
9091+        """
9092+        self._running = False
9093+        self._status.set_active(False)
9094+        now = time.time()
9095+        self._status.timings['total'] = now - self._started
9096+        self._status.timings['fetch'] = now - self._started_fetching
9097+
9098+        if self._verify:
9099+            ret = list(self._bad_shares)
9100+        else:
9101+            format = ("ran out of peers: "
9102+                      "have %(have)d of %(total)d segments, "
9103+                      "found %(bad)d bad shares, "
9104+                      "encoding %(k)d-of-%(n)d")
9105+            args = {"have": self._current_segment,
9106+                    "total": self._num_segments,
9108+                    "k": self._required_shares,
9109+                    "n": self._total_shares,
9110+                    "bad": len(self._bad_shares)}
9111+            e = NotEnoughSharesError("%s, last failure: %s" % \
9112+                                     (format % args, str(self._last_failure)))
9113+            f = failure.Failure(e)
9114+            ret = f
9115+        eventually(self._done_deferred.callback, ret)
9116}
9117[mutable/servermap.py: Alter the servermap updater to work with MDMF files
9118Kevan Carstensen <kevan@isnotajoke.com>**20100819003439
9119 Ignore-this: 7e408303194834bd59a2f27efab3bdb
9120 
9121 These modifications mostly serve to make the servermap updater
9122 use the unified MDMF + SDMF read interface whenever
9123 possible -- this reduces the complexity of the code, making it easier to
9124 read and maintain. To do this, I needed to modify the process of
9125 updating the servermap a little bit.
9126 
9127 To support partial-file updates, I also modified the servermap updater
9128 to fetch the block hash trees and certain segments of files while it
9129 performed a servermap update (this can be done without adding any new
9130 roundtrips because of batch-read functionality that the read proxy has).
9131 
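 The batch-read pattern, sketched with the MDMFSlotReadProxy calls that
 the hunks below use (batched_fetch is a hypothetical helper; queue=True
 accumulates requests locally, and flush() sends the whole batch as a
 single remote readv):
 
     from twisted.internet import defer
     from allmydata.mutable.layout import MDMFSlotReadProxy
 
     def batched_fetch(ss, storage_index, shnum, data):
         reader = MDMFSlotReadProxy(ss, storage_index, shnum, data)
         d1 = reader.get_verinfo()                # served from cached data
         d2 = reader.get_signature(queue=True)    # queued, nothing sent yet
         d3 = reader.get_blockhashes(queue=True)  # joins the same batch
         reader.flush()                           # one roundtrip for all
         return defer.DeferredList([d1, d2, d3])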
9132] {
9133hunk ./src/allmydata/mutable/servermap.py 2
9134 
9135-import sys, time
9136+import sys, time, struct
9137 from zope.interface import implements
9138 from itertools import count
9139 from twisted.internet import defer
9140merger 0.0 (
9141hunk ./src/allmydata/mutable/servermap.py 9
9142+from allmydata.util.dictutil import DictOfSets
9143hunk ./src/allmydata/mutable/servermap.py 7
9144-from foolscap.api import DeadReferenceError, RemoteException, eventually
9145-from allmydata.util import base32, hashutil, idlib, log
9146+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
9147+                         fireEventually
9148+from allmydata.util import base32, hashutil, idlib, log, deferredutil
9149)
9150merger 0.0 (
9151hunk ./src/allmydata/mutable/servermap.py 14
9152-     DictOfSets, CorruptShareError, NeedMoreDataError
9153+     CorruptShareError, NeedMoreDataError
9154hunk ./src/allmydata/mutable/servermap.py 14
9155-     DictOfSets, CorruptShareError, NeedMoreDataError
9156-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
9157-     SIGNED_PREFIX_LENGTH
9158+     DictOfSets, CorruptShareError
9159+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
9160)
9161hunk ./src/allmydata/mutable/servermap.py 123
9162         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
9163         self.last_update_mode = None
9164         self.last_update_time = 0
9165+        self.update_data = {} # (verinfo,shnum) => data
9166 
9167     def copy(self):
9168         s = ServerMap()
9169hunk ./src/allmydata/mutable/servermap.py 254
9170         """Return a set of versionids, one for each version that is currently
9171         recoverable."""
9172         versionmap = self.make_versionmap()
9173-
9174         recoverable_versions = set()
9175         for (verinfo, shares) in versionmap.items():
9176             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9177hunk ./src/allmydata/mutable/servermap.py 339
9178         return False
9179 
9180 
9181+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
9182+        """
9183+        I return the update data for the given shnum and verinfo.
9184+        """
9185+        update_data = self.update_data[shnum]
9186+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
9187+        return update_datum
9188+
9189+
9190+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
9191+        """
9192+        I record the given update data for the given shnum and verinfo.
9193+        """
9194+        self.update_data.setdefault(shnum, []).append((verinfo, data))
9195+
9196+
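# Illustrative use, with stand-ins for the 9-item verinfo tuple and the
# fetched update data (block hash tree plus boundary segments):
sm = ServerMap()
verinfo = ("verinfo-stand-in",)
sm.set_update_data_for_share_and_verinfo(3, verinfo, ("bht", "s0", "sN"))
assert sm.get_update_data_for_share_and_verinfo(3, verinfo) == \
       ("bht", "s0", "sN")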
9197 class ServermapUpdater:
9198     def __init__(self, filenode, storage_broker, monitor, servermap,
9199hunk ./src/allmydata/mutable/servermap.py 357
9200-                 mode=MODE_READ, add_lease=False):
9201+                 mode=MODE_READ, add_lease=False, update_range=None):
9202         """I update a servermap, locating a sufficient number of useful
9203         shares and remembering where they are located.
9204 
9205hunk ./src/allmydata/mutable/servermap.py 382
9206         self._servers_responded = set()
9207 
9208         # how much data should we read?
9209+        # SDMF:
9210         #  * if we only need the checkstring, then [0:75]
9211         #  * if we need to validate the checkstring sig, then [543ish:799ish]
9212         #  * if we need the verification key, then [107:436ish]
9213merger 0.0 (
9214hunk ./src/allmydata/mutable/servermap.py 392
9215-        # read 2000 bytes, which also happens to read enough actual data to
9216-        # pre-fetch a 9-entry dirnode.
9217+        # read 4000 bytes, which also happens to read enough actual data to
9218+        # pre-fetch an 18-entry dirnode.
9219hunk ./src/allmydata/mutable/servermap.py 390
9220-        # A future version of the SMDF slot format should consider using
9221-        # fixed-size slots so we can retrieve less data. For now, we'll just
9222-        # read 2000 bytes, which also happens to read enough actual data to
9223-        # pre-fetch a 9-entry dirnode.
9224+        # MDMF:
9225+        #  * Checkstring? [0:72]
9226+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
9227+        #    the offset table will tell us for sure.
9228+        #  * If we need the verification key, we have to consult the offset
9229+        #    table as well.
9230+        # At this point, we don't know which we are. Our filenode can
9231+        # tell us, but it might be lying -- in some cases, we're
9232+        # responsible for telling it which kind of file it is.
9233)
9234hunk ./src/allmydata/mutable/servermap.py 399
9235             # we use unpack_prefix_and_signature, so we need 1k
9236             self._read_size = 1000
9237         self._need_privkey = False
9238+
9239         if mode == MODE_WRITE and not self._node.get_privkey():
9240             self._need_privkey = True
9241         # check+repair: repair requires the privkey, so if we didn't happen
9242hunk ./src/allmydata/mutable/servermap.py 406
9243         # to ask for it during the check, we'll have problems doing the
9244         # publish.
9245 
9246+        self.fetch_update_data = False
9247+        if mode == MODE_WRITE and update_range:
9248+            # We're updating the servermap in preparation for an
9249+            # in-place file update, so we need to fetch some additional
9250+            # data from each share that we find.
9251+            assert len(update_range) == 2
9252+
9253+            self.start_segment = update_range[0]
9254+            self.end_segment = update_range[1]
9255+            self.fetch_update_data = True
9256+
9257         prefix = si_b2a(self._storage_index)[:5]
9258         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
9259                                    si=prefix, mode=mode)
9260merger 0.0 (
9261hunk ./src/allmydata/mutable/servermap.py 455
9262-        full_peerlist = sb.get_servers_for_index(self._storage_index)
9263+        full_peerlist = [(s.get_serverid(), s.get_rref())
9264+                         for s in sb.get_servers_for_psi(self._storage_index)]
9265hunk ./src/allmydata/mutable/servermap.py 455
9266+        # All of the peers, permuted by the storage index, as usual.
9267)
9268hunk ./src/allmydata/mutable/servermap.py 461
9269         self._good_peers = set() # peers who had some shares
9270         self._empty_peers = set() # peers who don't have any shares
9271         self._bad_peers = set() # peers to whom our queries failed
9272+        self._readers = {} # peerid -> dict(shnum -> reader), filled in
9273+                           # after responses come in.
9274 
9275         k = self._node.get_required_shares()
9276hunk ./src/allmydata/mutable/servermap.py 465
9277+        # For what cases can these conditions work?
9278         if k is None:
9279             # make a guess
9280             k = 3
9281hunk ./src/allmydata/mutable/servermap.py 478
9282         self.num_peers_to_query = k + self.EPSILON
9283 
9284         if self.mode == MODE_CHECK:
9285+            # We want to query all of the peers.
9286             initial_peers_to_query = dict(full_peerlist)
9287             must_query = set(initial_peers_to_query.keys())
9288             self.extra_peers = []
9289hunk ./src/allmydata/mutable/servermap.py 486
9290             # we're planning to replace all the shares, so we want a good
9291             # chance of finding them all. We will keep searching until we've
9292             # seen epsilon that don't have a share.
9293+            # We don't query all of the peers because that could take a while.
9294             self.num_peers_to_query = N + self.EPSILON
9295             initial_peers_to_query, must_query = self._build_initial_querylist()
9296             self.required_num_empty_peers = self.EPSILON
9297hunk ./src/allmydata/mutable/servermap.py 496
9298             # might also avoid the round trip required to read the encrypted
9299             # private key.
9300 
9301-        else:
9302+        else: # MODE_READ, MODE_ANYTHING
9303+            # 2k peers is good enough.
9304             initial_peers_to_query, must_query = self._build_initial_querylist()
9305 
9306         # this is a set of peers that we are required to get responses from:
9307hunk ./src/allmydata/mutable/servermap.py 512
9308         # before we can consider ourselves finished, and self.extra_peers
9309         # contains the overflow (peers that we should tap if we don't get
9310         # enough responses)
9311+        # I guess that self._must_query is a subset of
9312+        # initial_peers_to_query?
9313+        assert set(must_query).issubset(set(initial_peers_to_query))
9314 
9315         self._send_initial_requests(initial_peers_to_query)
9316         self._status.timings["initial_queries"] = time.time() - self._started
9317hunk ./src/allmydata/mutable/servermap.py 571
9318         # errors that aren't handled by _query_failed (and errors caused by
9319         # _query_failed) get logged, but we still want to check for doneness.
9320         d.addErrback(log.err)
9321-        d.addBoth(self._check_for_done)
9322         d.addErrback(self._fatal_error)
9323hunk ./src/allmydata/mutable/servermap.py 572
9324+        d.addCallback(self._check_for_done)
9325         return d
9326 
9327     def _do_read(self, ss, peerid, storage_index, shnums, readv):
9328hunk ./src/allmydata/mutable/servermap.py 591
9329         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
9330         return d
9331 
9332+
9333+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
9334+        """
9335+        I am called when a remote server returns a corrupt share in
9336+        response to one of our queries. By corrupt, I mean a share
9337+        without a valid signature. I then record the failure, notify the
9338+        server of the corruption, and record the share as bad.
9339+        """
9340+        f = failure.Failure(e)
9341+        self.log(format="bad share: %(f_value)s", f_value=str(f),
9342+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9343+        # Notify the server that its share is corrupt.
9344+        self.notify_server_corruption(peerid, shnum, str(e))
9345+        # By flagging this as a bad peer, we won't count any of
9346+        # the other shares on that peer as valid, though if we
9347+        # happen to find a valid version string amongst those
9348+        # shares, we'll keep track of it so that we don't need
9349+        # to validate the signature on those again.
9350+        self._bad_peers.add(peerid)
9351+        self._last_failure = f
9352+        # XXX: Use the reader for this?
9353+        checkstring = data[:SIGNED_PREFIX_LENGTH]
9354+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
9355+        self._servermap.problems.append(f)
9356+
9357+
9358+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
9359+        """
9360+        If one of my queries returns successfully (which means that we
9361+        were able to and successfully did validate the signature), I
9362+        cache the data that we initially fetched from the storage
9363+        server. This will help reduce the number of roundtrips that need
9364+        to occur when the file is downloaded, or when the file is
9365+        updated.
9366+        """
9367+        if verinfo:
9368+            self._node._add_to_cache(verinfo, shnum, 0, data, now)
9369+
9370+
9371     def _got_results(self, datavs, peerid, readsize, stuff, started):
9372         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
9373                       peerid=idlib.shortnodeid_b2a(peerid),
9374hunk ./src/allmydata/mutable/servermap.py 633
9375-                      numshares=len(datavs),
9376-                      level=log.NOISY)
9377+                      numshares=len(datavs))
9378         now = time.time()
9379         elapsed = now - started
9380hunk ./src/allmydata/mutable/servermap.py 636
9381-        self._queries_outstanding.discard(peerid)
9382-        self._servermap.reachable_peers.add(peerid)
9383-        self._must_query.discard(peerid)
9384-        self._queries_completed += 1
9385+        def _done_processing(ignored=None):
9386+            self._queries_outstanding.discard(peerid)
9387+            self._servermap.reachable_peers.add(peerid)
9388+            self._must_query.discard(peerid)
9389+            self._queries_completed += 1
9390         if not self._running:
9391hunk ./src/allmydata/mutable/servermap.py 642
9392-            self.log("but we're not running, so we'll ignore it", parent=lp,
9393-                     level=log.NOISY)
9394+            self.log("but we're not running, so we'll ignore it", parent=lp)
9395+            _done_processing()
9396             self._status.add_per_server_time(peerid, "late", started, elapsed)
9397             return
9398         self._status.add_per_server_time(peerid, "query", started, elapsed)
9399hunk ./src/allmydata/mutable/servermap.py 653
9400         else:
9401             self._empty_peers.add(peerid)
9402 
9403-        last_verinfo = None
9404-        last_shnum = None
9405+        ss, storage_index = stuff
9406+        ds = []
9407+
9408         for shnum,datav in datavs.items():
9409             data = datav[0]
9410             try:
9411merger 0.0 (
9412hunk ./src/allmydata/mutable/servermap.py 662
9413-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9414+                self._node._add_to_cache(verinfo, shnum, 0, data)
9415hunk ./src/allmydata/mutable/servermap.py 658
9416-            try:
9417-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
9418-                last_verinfo = verinfo
9419-                last_shnum = shnum
9420-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9421-            except CorruptShareError, e:
9422-                # log it and give the other shares a chance to be processed
9423-                f = failure.Failure()
9424-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
9425-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9426-                self.notify_server_corruption(peerid, shnum, str(e))
9427-                self._bad_peers.add(peerid)
9428-                self._last_failure = f
9429-                checkstring = data[:SIGNED_PREFIX_LENGTH]
9430-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
9431-                self._servermap.problems.append(f)
9432-                pass
9433+            reader = MDMFSlotReadProxy(ss,
9434+                                       storage_index,
9435+                                       shnum,
9436+                                       data)
9437+            self._readers.setdefault(peerid, dict())[shnum] = reader
9438+            # our goal, with each response, is to validate the version
9439+            # information and share data as best we can at this point --
9440+            # we do this by validating the signature. To do this, we
9441+            # need to do the following:
9442+            #   - If we don't already have the public key, fetch the
9443+            #     public key. We use this to validate the signature.
9444+            if not self._node.get_pubkey():
9445+                # fetch and set the public key.
9446+                d = reader.get_verification_key(queue=True)
9447+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
9448+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
9449+                # XXX: Make self._pubkey_query_failed?
9450+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
9451+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
9452+            else:
9453+                # we already have the public key.
9454+                d = defer.succeed(None)
9455)
9456hunk ./src/allmydata/mutable/servermap.py 676
9457                 self._servermap.problems.append(f)
9458                 pass
9459 
9460-        self._status.timings["cumulative_verify"] += (time.time() - now)
9461+            # Neither of these two branches returns anything of
9462+            # consequence, so the first entry in our deferredlist will
9463+            # be None.
9464 
9465hunk ./src/allmydata/mutable/servermap.py 680
9466-        if self._need_privkey and last_verinfo:
9467-            # send them a request for the privkey. We send one request per
9468-            # server.
9469-            lp2 = self.log("sending privkey request",
9470-                           parent=lp, level=log.NOISY)
9471-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9472-             offsets_tuple) = last_verinfo
9473-            o = dict(offsets_tuple)
9474+            # - Next, we need the version information. We almost
9475+            #   certainly got this by reading the first thousand or so
9476+            #   bytes of the share on the storage server, so we
9477+            #   shouldn't need to fetch anything at this step.
9478+            d2 = reader.get_verinfo()
9479+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
9480+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9481+            # - Next, we need the signature. For an SDMF share, it is
9482+            #   likely that we fetched this when doing our initial fetch
9483+            #   to get the version information. In MDMF, this lives at
9484+            #   the end of the share, so unless the file is quite small,
9485+            #   we'll need to do a remote fetch to get it.
9486+            d3 = reader.get_signature(queue=True)
9487+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
9488+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9489+            #  Once we have all three of these responses, we can move on
9490+            #  to validating the signature
9491 
9492hunk ./src/allmydata/mutable/servermap.py 698
9493-            self._queries_outstanding.add(peerid)
9494-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
9495-            ss = self._servermap.connections[peerid]
9496-            privkey_started = time.time()
9497-            d = self._do_read(ss, peerid, self._storage_index,
9498-                              [last_shnum], readv)
9499-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
9500-                          privkey_started, lp2)
9501-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
9502-            d.addErrback(log.err)
9503-            d.addCallback(self._check_for_done)
9504-            d.addErrback(self._fatal_error)
9505+            # Does the node already have a privkey? If not, we'll try to
9506+            # fetch it here.
9507+            if self._need_privkey:
9508+                d4 = reader.get_encprivkey(queue=True)
9509+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
9510+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
9511+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
9512+                    self._privkey_query_failed(error, shnum, data, lp))
9513+            else:
9514+                d4 = defer.succeed(None)
9515+
9516+
9517+            if self.fetch_update_data:
9518+                # fetch the block hash tree and first + last segment, as
9519+                # configured earlier.
9520+                # Then set them in wherever we happen to want to set
9521+                # them.
9522+                update_ds = []
9523+                # XXX: We do this above, too. Is there a good way to
9524+                # make the two routines share the value without
9525+                # introducing more roundtrips?
9526+                update_ds.append(reader.get_verinfo())
9527+                update_ds.append(reader.get_blockhashes(queue=True))
9528+                update_ds.append(reader.get_block_and_salt(self.start_segment,
9529+                                                           queue=True))
9530+                update_ds.append(reader.get_block_and_salt(self.end_segment,
9531+                                                           queue=True))
9532+                d5 = deferredutil.gatherResults(update_ds)
9533+                d5.addCallback(self._got_update_results_one_share, shnum)
9534+            else:
9535+                d5 = defer.succeed(None)
9536 
9537hunk ./src/allmydata/mutable/servermap.py 730
9538+            dl = defer.DeferredList([d, d2, d3, d4, d5])
9539+            dl.addBoth(self._turn_barrier)
9540+            reader.flush()
9541+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
9542+                self._got_signature_one_share(results, shnum, peerid, lp))
9543+            dl.addErrback(lambda error, shnum=shnum, data=data:
9544+               self._got_corrupt_share(error, shnum, peerid, data, lp))
9545+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
9546+                self._cache_good_sharedata(verinfo, shnum, now, data))
9547+            ds.append(dl)
9548+        # dl is a deferred list that will fire when all of the shares
9549+        # that we found on this peer are done processing. When dl fires,
9550+        # we know that processing is done, so we can decrement the
9551+        # semaphore-like thing that we incremented earlier.
9552+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
9553+        # Are we done? Done means that there are no more queries to
9554+        # send, that there are no outstanding queries, and that we
9555+        # haven't received any queries that are still processing. If we
9556+        # are done, self._check_for_done will cause the done deferred
9557+        # that we returned to our caller to fire, which tells them that
9558+        # they have a complete servermap, and that we won't be touching
9559+        # the servermap anymore.
9560+        dl.addCallback(_done_processing)
9561+        dl.addCallback(self._check_for_done)
9562+        dl.addErrback(self._fatal_error)
9563         # all done!
9564         self.log("_got_results done", parent=lp, level=log.NOISY)
9565hunk ./src/allmydata/mutable/servermap.py 757
9566+        return dl
9567+
9568+
9569+    def _turn_barrier(self, result):
9570+        """
9571+        I help the servermap updater avoid the recursion limit issues
9572+        discussed in #237.
9573+        """
9574+        return fireEventually(result)
9575+
9576+
9577+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
9578+        if self._node.get_pubkey():
9579+            return # don't go through this again if we don't have to
9580+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9581+        assert len(fingerprint) == 32
9582+        if fingerprint != self._node.get_fingerprint():
9583+            raise CorruptShareError(peerid, shnum,
9584+                                "pubkey doesn't match fingerprint")
9585+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9586+        assert self._node.get_pubkey()
9587+
9588 
9589     def notify_server_corruption(self, peerid, shnum, reason):
9590         ss = self._servermap.connections[peerid]
9591hunk ./src/allmydata/mutable/servermap.py 785
9592         ss.callRemoteOnly("advise_corrupt_share",
9593                           "mutable", self._storage_index, shnum, reason)
9594 
9595-    def _got_results_one_share(self, shnum, data, peerid, lp):
9596+
9597+    def _got_signature_one_share(self, results, shnum, peerid, lp):
9598+        # It is our job to give versioninfo to our caller. We need to
9599+        # raise CorruptShareError if the share is corrupt for any
9600+        # reason, something that our caller will handle.
9601         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
9602                  shnum=shnum,
9603                  peerid=idlib.shortnodeid_b2a(peerid),
9604hunk ./src/allmydata/mutable/servermap.py 795
9605                  level=log.NOISY,
9606                  parent=lp)
9607+        if not self._running:
9608+            # We can't process the results, since we can't touch the
9609+            # servermap anymore.
9610+            self.log("but we're not running anymore.")
9611+            return None
9612 
9613hunk ./src/allmydata/mutable/servermap.py 801
9614-        # this might raise NeedMoreDataError, if the pubkey and signature
9615-        # live at some weird offset. That shouldn't happen, so I'm going to
9616-        # treat it as a bad share.
9617-        (seqnum, root_hash, IV, k, N, segsize, datalength,
9618-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
9619-
9620-        if not self._node.get_pubkey():
9621-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9622-            assert len(fingerprint) == 32
9623-            if fingerprint != self._node.get_fingerprint():
9624-                raise CorruptShareError(peerid, shnum,
9625-                                        "pubkey doesn't match fingerprint")
9626-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9627-
9628-        if self._need_privkey:
9629-            self._try_to_extract_privkey(data, peerid, shnum, lp)
9630-
9631-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
9632-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
9633+        _, verinfo, signature, __, ___ = results
9634+        (seqnum,
9635+         root_hash,
9636+         saltish,
9637+         segsize,
9638+         datalen,
9639+         k,
9640+         n,
9641+         prefix,
9642+         offsets) = verinfo[1]
9643         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9644 
9645hunk ./src/allmydata/mutable/servermap.py 813
9646-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9647+        # XXX: This should be done for us in the method, so
9648+        # presumably you can go in there and fix it.
9649+        verinfo = (seqnum,
9650+                   root_hash,
9651+                   saltish,
9652+                   segsize,
9653+                   datalen,
9654+                   k,
9655+                   n,
9656+                   prefix,
9657                    offsets_tuple)
9658hunk ./src/allmydata/mutable/servermap.py 824
9659+        # This tuple uniquely identifies a share on the grid; we use it
9660+        # to keep track of the ones that we've already seen.
9661 
9662         if verinfo not in self._valid_versions:
9663hunk ./src/allmydata/mutable/servermap.py 828
9664-            # it's a new pair. Verify the signature.
9665-            valid = self._node.get_pubkey().verify(prefix, signature)
9666+            # This is a new version tuple, and we need to validate it
9667+            # against the public key before keeping track of it.
9668+            assert self._node.get_pubkey()
9669+            valid = self._node.get_pubkey().verify(prefix, signature[1])
9670             if not valid:
9671hunk ./src/allmydata/mutable/servermap.py 833
9672-                raise CorruptShareError(peerid, shnum, "signature is invalid")
9673+                raise CorruptShareError(peerid, shnum,
9674+                                        "signature is invalid")
9675 
9676hunk ./src/allmydata/mutable/servermap.py 836
9677-            # ok, it's a valid verinfo. Add it to the list of validated
9678-            # versions.
9679-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9680-                     % (seqnum, base32.b2a(root_hash)[:4],
9681-                        idlib.shortnodeid_b2a(peerid), shnum,
9682-                        k, N, segsize, datalength),
9683-                     parent=lp)
9684-            self._valid_versions.add(verinfo)
9685-        # We now know that this is a valid candidate verinfo.
9686+        # ok, it's a valid verinfo. Add it to the list of validated
9687+        # versions.
9688+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9689+                 % (seqnum, base32.b2a(root_hash)[:4],
9690+                    idlib.shortnodeid_b2a(peerid), shnum,
9691+                    k, n, segsize, datalen),
9692+                 parent=lp)
9693+        self._valid_versions.add(verinfo)
9694+        # We now know that this is a valid candidate verinfo. Whether
9695+        # this particular share is usable is a matter for the checks
9696+        # below; at this point we just know that if we see this version
9697+        # info again, its signature checks out and we can skip the
9698+        # signature-checking step.
9699 
9700hunk ./src/allmydata/mutable/servermap.py 850
9701+        # (peerid, shnum) are bound in the method invocation.
9702         if (peerid, shnum) in self._servermap.bad_shares:
9703             # we've been told that the rest of the data in this share is
9704             # unusable, so don't add it to the servermap.
9705hunk ./src/allmydata/mutable/servermap.py 863
9706         self._servermap.add_new_share(peerid, shnum, verinfo, timestamp)
9707         # and the versionmap
9708         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
9709+
9710+        # It's our job to set the protocol version of our parent
9711+        # filenode if it isn't already set.
9712+        if not self._node.get_version():
9713+            # The first byte of the prefix is the version.
9714+            v = struct.unpack(">B", prefix[:1])[0]
9715+            self.log("got version %d" % v)
9716+            self._node.set_version(v)
9717+
9718         return verinfo
9719 
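
The validation flow above, reduced to its essentials: the RSA check over the
prefix runs only the first time a given version tuple appears; later sightings
of the same tuple skip it. A standalone sketch (stand-in names, not the
servermap's real interface):

    from allmydata.mutable.common import CorruptShareError

    _valid_versions = set()

    def accept_version(verinfo, prefix, signature, pubkey, peerid, shnum):
        # Verify the signature only for version tuples we haven't seen yet;
        # a tuple already in the set has had its signature checked before.
        if verinfo not in _valid_versions:
            if not pubkey.verify(prefix, signature):
                raise CorruptShareError(peerid, shnum, "signature is invalid")
        _valid_versions.add(verinfo)
        return verinfo
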
9720hunk ./src/allmydata/mutable/servermap.py 874
9721-    def _deserialize_pubkey(self, pubkey_s):
9722-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9723-        return verifier
9724 
9725hunk ./src/allmydata/mutable/servermap.py 875
9726-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
9727-        try:
9728-            r = unpack_share(data)
9729-        except NeedMoreDataError, e:
9730-            # this share won't help us. oh well.
9731-            offset = e.encprivkey_offset
9732-            length = e.encprivkey_length
9733-            self.log("shnum %d on peerid %s: share was too short (%dB) "
9734-                     "to get the encprivkey; [%d:%d] ought to hold it" %
9735-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
9736-                      offset, offset+length),
9737-                     parent=lp)
9738-            # NOTE: if uncoordinated writes are taking place, someone might
9739-            # change the share (and most probably move the encprivkey) before
9740-            # we get a chance to do one of these reads and fetch it. This
9741-            # will cause us to see a NotEnoughSharesError(unable to fetch
9742-            # privkey) instead of an UncoordinatedWriteError . This is a
9743-            # nuisance, but it will go away when we move to DSA-based mutable
9744-            # files (since the privkey will be small enough to fit in the
9745-            # write cap).
9746+    def _got_update_results_one_share(self, results, share):
9747+        """
9748+        I record the results of an update query for a single share in the servermap.
9749+        """
9750+        assert len(results) == 4
9751+        verinfo, blockhashes, start, end = results
9752+        (seqnum,
9753+         root_hash,
9754+         saltish,
9755+         segsize,
9756+         datalen,
9757+         k,
9758+         n,
9759+         prefix,
9760+         offsets) = verinfo
9761+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9762 
9763hunk ./src/allmydata/mutable/servermap.py 892
9764-            return
9765+        # XXX: the method that builds verinfo should convert the offsets
9766+        # to a tuple for us; fix it there rather than doing so here.
9767+        verinfo = (seqnum,
9768+                   root_hash,
9769+                   saltish,
9770+                   segsize,
9771+                   datalen,
9772+                   k,
9773+                   n,
9774+                   prefix,
9775+                   offsets_tuple)
9776 
9777hunk ./src/allmydata/mutable/servermap.py 904
9778-        (seqnum, root_hash, IV, k, N, segsize, datalen,
9779-         pubkey, signature, share_hash_chain, block_hash_tree,
9780-         share_data, enc_privkey) = r
9781+        update_data = (blockhashes, start, end)
9782+        self._servermap.set_update_data_for_share_and_verinfo(share,
9783+                                                              verinfo,
9784+                                                              update_data)
9785 
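
In effect, _got_update_results_one_share just files the (blockhashes, start,
end) triple away under the (share, verinfo) pair. A toy stand-in for that
side of the servermap's bookkeeping (hypothetical names; the real ServerMap
may store this differently):

    class UpdateDataMap:
        def __init__(self):
            self._data = {}  # (share, verinfo) -> (blockhashes, start, end)

        def set_update_data_for_share_and_verinfo(self, share, verinfo, data):
            self._data[(share, verinfo)] = data

        def get_update_data_for_share_and_verinfo(self, share, verinfo):
            return self._data.get((share, verinfo))
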
9786hunk ./src/allmydata/mutable/servermap.py 909
9787-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9788+
9789+    def _deserialize_pubkey(self, pubkey_s):
9790+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9791+        return verifier
9792 
9793hunk ./src/allmydata/mutable/servermap.py 914
9794-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9795 
9796hunk ./src/allmydata/mutable/servermap.py 915
9797+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9798+        """
9799+        Given an encrypted private key from a remote server, I derive its
9800+        writekey and compare it against the one stored in my node. If they
9801+        match, I set the privkey and encprivkey properties of the node.
9802+        """
9803         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
9804         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
9805         if alleged_writekey != self._node.get_writekey():
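
The check itself is small: decrypt the candidate private key, hash it down to
a writekey, and compare against the node's own. Standalone, using the real
hash helper but abstracting the node's decryption (decrypt_fn is
hypothetical):

    from allmydata.util import hashutil

    def privkey_is_valid(enc_privkey, decrypt_fn, expected_writekey):
        # decrypt_fn stands in for the node's _decrypt_privkey method.
        alleged_privkey_s = decrypt_fn(enc_privkey)
        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
        return alleged_writekey == expected_writekey
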
9806hunk ./src/allmydata/mutable/servermap.py 993
9807         self._queries_completed += 1
9808         self._last_failure = f
9809 
9810-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
9811-        now = time.time()
9812-        elapsed = now - started
9813-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
9814-        self._queries_outstanding.discard(peerid)
9815-        if not self._need_privkey:
9816-            return
9817-        if shnum not in datavs:
9818-            self.log("privkey wasn't there when we asked it",
9819-                     level=log.WEIRD, umid="VA9uDQ")
9820-            return
9821-        datav = datavs[shnum]
9822-        enc_privkey = datav[0]
9823-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9824 
9825     def _privkey_query_failed(self, f, peerid, shnum, lp):
9826         self._queries_outstanding.discard(peerid)
9827hunk ./src/allmydata/mutable/servermap.py 1007
9828         self._servermap.problems.append(f)
9829         self._last_failure = f
9830 
9831+
9832     def _check_for_done(self, res):
9833         # exit paths:
9834         #  return self._send_more_queries(outstanding) : send some more queries
9835hunk ./src/allmydata/mutable/servermap.py 1013
9836         #  return self._done() : all done
9837         #  return : keep waiting, no new queries
9838-
9839         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
9840                               "%(outstanding)d queries outstanding, "
9841                               "%(extra)d extra peers available, "
9842hunk ./src/allmydata/mutable/servermap.py 1204
9843 
9844     def _done(self):
9845         if not self._running:
9846+            self.log("not running; we're already done")
9847             return
9848         self._running = False
9849         now = time.time()
9850hunk ./src/allmydata/mutable/servermap.py 1219
9851         self._servermap.last_update_time = self._started
9852         # the servermap will not be touched after this
9853         self.log("servermap: %s" % self._servermap.summarize_versions())
9854+
9855         eventually(self._done_deferred.callback, self._servermap)
9856 
9857     def _fatal_error(self, f):
9858}
9859[tests:
9860Kevan Carstensen <kevan@isnotajoke.com>**20100819003531
9861 Ignore-this: 314e8bbcce532ea4d5d2cecc9f31cca0
9862 
9863     - A lot of existing tests relied on aspects of the mutable file
9864       implementation that were changed. This patch updates those tests
9865       to work with the changes.
9866     - This patch also adds tests for new features.
9867] {
9868hunk ./src/allmydata/test/common.py 11
9869 from foolscap.api import flushEventualQueue, fireEventually
9870 from allmydata import uri, dirnode, client
9871 from allmydata.introducer.server import IntroducerNode
9872-from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
9873-     FileTooLargeError, NotEnoughSharesError, ICheckable
9874+from allmydata.interfaces import IMutableFileNode, IImmutableFileNode,\
9875+                                 NotEnoughSharesError, ICheckable, \
9876+                                 IMutableUploadable, SDMF_VERSION, \
9877+                                 MDMF_VERSION
9878 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
9879      DeepCheckResults, DeepCheckAndRepairResults
9880 from allmydata.mutable.common import CorruptShareError
9881hunk ./src/allmydata/test/common.py 19
9882 from allmydata.mutable.layout import unpack_header
9883+from allmydata.mutable.publish import MutableData
9884 from allmydata.storage.server import storage_index_to_dir
9885 from allmydata.storage.mutable import MutableShareFile
9886 from allmydata.util import hashutil, log, fileutil, pollmixin
9887hunk ./src/allmydata/test/common.py 153
9888         consumer.write(data[start:end])
9889         return consumer
9890 
9891+
9892+    def get_best_readable_version(self):
9893+        return defer.succeed(self)
9894+
9895+
9896+    download_best_version = download_to_data
9897+
9898+
9899+    def download_to_data(self):
9900+        return download_to_data(self)
9901+
9902+
9903+    def get_size_of_best_version(self):
9904+        return defer.succeed(self.get_size())
9905+
9906+
9907 def make_chk_file_cap(size):
9908     return uri.CHKFileURI(key=os.urandom(16),
9909                           uri_extension_hash=os.urandom(32),
9910hunk ./src/allmydata/test/common.py 193
9911     MUTABLE_SIZELIMIT = 10000
9912     all_contents = {}
9913     bad_shares = {}
9914+    file_types = {} # storage index => MDMF_VERSION or SDMF_VERSION
9915 
9916     def __init__(self, storage_broker, secret_holder,
9917                  default_encoding_parameters, history):
9918hunk ./src/allmydata/test/common.py 200
9919         self.init_from_cap(make_mutable_file_cap())
9920     def create(self, contents, key_generator=None, keysize=None):
9921         initial_contents = self._get_initial_contents(contents)
9922-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
9923-            raise FileTooLargeError("SDMF is limited to one segment, and "
9924-                                    "%d > %d" % (len(initial_contents),
9925-                                                 self.MUTABLE_SIZELIMIT))
9926-        self.all_contents[self.storage_index] = initial_contents
9927+        data = initial_contents.read(initial_contents.get_size())
9928+        data = "".join(data)
9929+        self.all_contents[self.storage_index] = data
9930         return defer.succeed(self)
9931     def _get_initial_contents(self, contents):
9932hunk ./src/allmydata/test/common.py 205
9933-        if isinstance(contents, str):
9934-            return contents
9935         if contents is None:
9936hunk ./src/allmydata/test/common.py 206
9937-            return ""
9938+            return MutableData("")
9939+
9940+        if IMutableUploadable.providedBy(contents):
9941+            return contents
9942+
9943         assert callable(contents), "%s should be callable, not %s" % \
9944                (contents, type(contents))
9945         return contents(self)
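
The contract change worth noting here: creation and overwrite now take an
IMutableUploadable rather than a raw string, and MutableData is the simplest
wrapper satisfying it. Its read/get_size surface, as the updated tests below
rely on it:

    from allmydata.mutable.publish import MutableData

    uploadable = MutableData("initial contents")
    assert uploadable.get_size() == len("initial contents")
    # read() returns a list of strings, hence the join.
    data = "".join(uploadable.read(uploadable.get_size()))
    assert data == "initial contents"
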
9946hunk ./src/allmydata/test/common.py 258
9947     def get_storage_index(self):
9948         return self.storage_index
9949 
9950+    def get_servermap(self, mode):
9951+        return defer.succeed(None)
9952+
9953+    def set_version(self, version):
9954+        assert version in (SDMF_VERSION, MDMF_VERSION)
9955+        self.file_types[self.storage_index] = version
9956+
9957+    def get_version(self):
9958+        assert self.storage_index in self.file_types
9959+        return self.file_types[self.storage_index]
9960+
9961     def check(self, monitor, verify=False, add_lease=False):
9962         r = CheckResults(self.my_uri, self.storage_index)
9963         is_bad = self.bad_shares.get(self.storage_index, None)
9964hunk ./src/allmydata/test/common.py 327
9965         return d
9966 
9967     def download_best_version(self):
9968+        return defer.succeed(self._download_best_version())
9969+
9970+
9971+    def _download_best_version(self, ignored=None):
9972         if isinstance(self.my_uri, uri.LiteralFileURI):
9973hunk ./src/allmydata/test/common.py 332
9974-            return defer.succeed(self.my_uri.data)
9975+            return self.my_uri.data
9976         if self.storage_index not in self.all_contents:
9977hunk ./src/allmydata/test/common.py 334
9978-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9979-        return defer.succeed(self.all_contents[self.storage_index])
9980+            raise NotEnoughSharesError(None, 0, 3)
9981+        return self.all_contents[self.storage_index]
9982+
9983 
9984     def overwrite(self, new_contents):
9985hunk ./src/allmydata/test/common.py 339
9986-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9987-            raise FileTooLargeError("SDMF is limited to one segment, and "
9988-                                    "%d > %d" % (len(new_contents),
9989-                                                 self.MUTABLE_SIZELIMIT))
9990         assert not self.is_readonly()
9991hunk ./src/allmydata/test/common.py 340
9992-        self.all_contents[self.storage_index] = new_contents
9993+        new_data = new_contents.read(new_contents.get_size())
9994+        new_data = "".join(new_data)
9995+        self.all_contents[self.storage_index] = new_data
9996         return defer.succeed(None)
9997     def modify(self, modifier):
9998         # this does not implement FileTooLargeError, but the real one does
9999hunk ./src/allmydata/test/common.py 350
10000     def _modify(self, modifier):
10001         assert not self.is_readonly()
10002         old_contents = self.all_contents[self.storage_index]
10003-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
10004+        new_data = modifier(old_contents, None, True)
10005+        self.all_contents[self.storage_index] = new_data
10006         return None
10007 
10008hunk ./src/allmydata/test/common.py 354
10009+    # As actually implemented, MutableFileNode and MutableFileVersion
10010+    # are distinct. However, nothing in the webapi yet uses that
10011+    # distinction -- it just uses the unified download interface
10012+    # provided by get_best_readable_version and read. When we start
10013+    # doing cooler things like LDMF, we will want to revise this code to
10014+    # be less simplistic.
10015+    def get_best_readable_version(self):
10016+        return defer.succeed(self)
10017+
10018+
10019+    def get_best_mutable_version(self):
10020+        return defer.succeed(self)
10021+
10022+    # Ditto for update, which is an implementation of IWritable.
10023+    # XXX: formally declare that this class implements IWritable.
10024+    def update(self, data, offset):
10025+        assert not self.is_readonly()
10026+        def modifier(old, servermap, first_time):
10027+            new = old[:offset] + "".join(data.read(data.get_size()))
10028+            new += old[len(new):]
10029+            return new
10030+        return self.modify(modifier)
10031+
10032+
10033+    def read(self, consumer, offset=0, size=None):
10034+        data = self._download_best_version()
10035+        if size is None:
10036+            size = len(data) - offset
10037+        data = data[offset:offset+size]
10038+        consumer.write(data)
10039+        return defer.succeed(consumer)
10039+
10040+
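
The fake node's update models MDMF's in-place semantics: bytes before the
offset survive, the new data is spliced in, and anything beyond it is kept.
A throwaway illustration of the splice above:

    old = "abcdefgh"
    offset, new_bytes = 2, "XY"
    new = old[:offset] + new_bytes
    new += old[len(new):]
    assert new == "abXYefgh"
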
10041 def make_mutable_file_cap():
10042     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
10043                                    fingerprint=os.urandom(32))
10044hunk ./src/allmydata/test/test_checker.py 11
10045 from allmydata.test.no_network import GridTestMixin
10046 from allmydata.immutable.upload import Data
10047 from allmydata.test.common_web import WebRenderingMixin
10048+from allmydata.mutable.publish import MutableData
10049 
10050 class FakeClient:
10051     def get_storage_broker(self):
10052hunk ./src/allmydata/test/test_checker.py 291
10053         def _stash_immutable(ur):
10054             self.imm = c0.create_node_from_uri(ur.uri)
10055         d.addCallback(_stash_immutable)
10056-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
10057+        d.addCallback(lambda ign:
10058+            c0.create_mutable_file(MutableData("contents")))
10059         def _stash_mutable(node):
10060             self.mut = node
10061         d.addCallback(_stash_mutable)
10062hunk ./src/allmydata/test/test_cli.py 13
10063 from allmydata.util import fileutil, hashutil, base32
10064 from allmydata import uri
10065 from allmydata.immutable import upload
10066+from allmydata.mutable.publish import MutableData
10067 from allmydata.dirnode import normalize
10068 
10069 # Test that the scripts can be imported.
10070hunk ./src/allmydata/test/test_cli.py 662
10071 
10072         d = self.do_cli("create-alias", etudes_arg)
10073         def _check_create_unicode((rc, out, err)):
10074-            self.failUnlessReallyEqual(rc, 0)
10075+            #self.failUnlessReallyEqual(rc, 0)
10076             self.failUnlessReallyEqual(err, "")
10077             self.failUnlessIn("Alias %s created" % quote_output(u"\u00E9tudes"), out)
10078 
10079hunk ./src/allmydata/test/test_cli.py 967
10080         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
10081         return d
10082 
10083+    def test_mutable_type(self):
10084+        self.basedir = "cli/Put/mutable_type"
10085+        self.set_up_grid()
10086+        data = "data" * 100000
10087+        fn1 = os.path.join(self.basedir, "data")
10088+        fileutil.write(fn1, data)
10089+        d = self.do_cli("create-alias", "tahoe")
10090+        d.addCallback(lambda ignored:
10091+            self.do_cli("put", "--mutable", "--mutable-type=mdmf",
10092+                        fn1, "tahoe:uploaded.txt"))
10093+        d.addCallback(lambda ignored:
10094+            self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
10095+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
10096+        d.addCallback(lambda ignored:
10097+            self.do_cli("put", "--mutable", "--mutable-type=sdmf",
10098+                        fn1, "tahoe:uploaded2.txt"))
10099+        d.addCallback(lambda ignored:
10100+            self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
10101+        d.addCallback(lambda (rc, json, err):
10102+            self.failUnlessIn("sdmf", json))
10103+        return d
10104+
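
These tests drive the same commands a user would run; the flow they exercise
looks roughly like this from a shell (file names and alias are illustrative):

    tahoe create-alias tahoe
    tahoe put --mutable --mutable-type=mdmf data.txt tahoe:uploaded.txt
    tahoe ls --json tahoe:uploaded.txt     # output should mention "mdmf"
    tahoe put --mutable --mutable-type=sdmf data.txt tahoe:uploaded2.txt
    tahoe ls --json tahoe:uploaded2.txt    # output should mention "sdmf"
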
10105+    def test_mutable_type_unlinked(self):
10106+        self.basedir = "cli/Put/mutable_type_unlinked"
10107+        self.set_up_grid()
10108+        data = "data" * 100000
10109+        fn1 = os.path.join(self.basedir, "data")
10110+        fileutil.write(fn1, data)
10111+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
10112+        d.addCallback(lambda (rc, cap, err):
10113+            self.do_cli("ls", "--json", cap))
10114+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
10115+        d.addCallback(lambda ignored:
10116+            self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
10117+        d.addCallback(lambda (rc, cap, err):
10118+            self.do_cli("ls", "--json", cap))
10119+        d.addCallback(lambda (rc, json, err):
10120+            self.failUnlessIn("sdmf", json))
10121+        return d
10122+
10123+    def test_mutable_type_invalid_format(self):
10124+        self.basedir = "cli/Put/mutable_type_invalid_format"
10125+        self.set_up_grid()
10126+        data = "data" * 100000
10127+        fn1 = os.path.join(self.basedir, "data")
10128+        fileutil.write(fn1, data)
10129+        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
10130+        def _check_failure((rc, out, err)):
10131+            self.failIfEqual(rc, 0)
10132+            self.failUnlessIn("invalid", err)
10133+        d.addCallback(_check_failure)
10134+        return d
10135+
10136     def test_put_with_nonexistent_alias(self):
10137         # when invoked with an alias that doesn't exist, 'tahoe put'
10138         # should output a useful error message, not a stack trace
10139hunk ./src/allmydata/test/test_cli.py 2136
10140         self.set_up_grid()
10141         c0 = self.g.clients[0]
10142         DATA = "data" * 100
10143-        d = c0.create_mutable_file(DATA)
10144+        DATA_uploadable = MutableData(DATA)
10145+        d = c0.create_mutable_file(DATA_uploadable)
10146         def _stash_uri(n):
10147             self.uri = n.get_uri()
10148         d.addCallback(_stash_uri)
10149hunk ./src/allmydata/test/test_cli.py 2238
10150                                            upload.Data("literal",
10151                                                         convergence="")))
10152         d.addCallback(_stash_uri, "small")
10153-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
10154+        d.addCallback(lambda ign:
10155+            c0.create_mutable_file(MutableData(DATA+"1")))
10156         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
10157         d.addCallback(_stash_uri, "mutable")
10158 
10159hunk ./src/allmydata/test/test_cli.py 2257
10160         # root/small
10161         # root/mutable
10162 
10163+        # We haven't broken anything yet, so this should all be healthy.
10164         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
10165                                               self.rooturi))
10166         def _check2((rc, out, err)):
10167hunk ./src/allmydata/test/test_cli.py 2272
10168                             in lines, out)
10169         d.addCallback(_check2)
10170 
10171+        # Similarly, all of these results should be as we expect them to
10172+        # be for a healthy file layout.
10173         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
10174         def _check_stats((rc, out, err)):
10175             self.failUnlessReallyEqual(err, "")
10176hunk ./src/allmydata/test/test_cli.py 2289
10177             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
10178         d.addCallback(_check_stats)
10179 
10180+        # Now we break things.
10181         def _clobber_shares(ignored):
10182             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
10183             self.failUnlessReallyEqual(len(shares), 10)
10184hunk ./src/allmydata/test/test_cli.py 2314
10185 
10186         d.addCallback(lambda ign:
10187                       self.do_cli("deep-check", "--verbose", self.rooturi))
10188+        # This should reveal the missing share, but not the corrupt
10189+        # share, since we didn't tell the deep check operation to also
10190+        # verify.
10191         def _check3((rc, out, err)):
10192             self.failUnlessReallyEqual(err, "")
10193             self.failUnlessReallyEqual(rc, 0)
10194hunk ./src/allmydata/test/test_cli.py 2365
10195                                   "--verbose", "--verify", "--repair",
10196                                   self.rooturi))
10197         def _check6((rc, out, err)):
10198+            # We've just repaired the directory. There is no reason for
10199+            # that repair to be unsuccessful.
10200             self.failUnlessReallyEqual(err, "")
10201             self.failUnlessReallyEqual(rc, 0)
10202             lines = out.splitlines()
10203hunk ./src/allmydata/test/test_deepcheck.py 9
10204 from twisted.internet import threads # CLI tests use deferToThread
10205 from allmydata.immutable import upload
10206 from allmydata.mutable.common import UnrecoverableFileError
10207+from allmydata.mutable.publish import MutableData
10208 from allmydata.util import idlib
10209 from allmydata.util import base32
10210 from allmydata.scripts import runner
10211hunk ./src/allmydata/test/test_deepcheck.py 38
10212         self.basedir = "deepcheck/MutableChecker/good"
10213         self.set_up_grid()
10214         CONTENTS = "a little bit of data"
10215-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10216+        CONTENTS_uploadable = MutableData(CONTENTS)
10217+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10218         def _created(node):
10219             self.node = node
10220             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10221hunk ./src/allmydata/test/test_deepcheck.py 61
10222         self.basedir = "deepcheck/MutableChecker/corrupt"
10223         self.set_up_grid()
10224         CONTENTS = "a little bit of data"
10225-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10226+        CONTENTS_uploadable = MutableData(CONTENTS)
10227+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10228         def _stash_and_corrupt(node):
10229             self.node = node
10230             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10231hunk ./src/allmydata/test/test_deepcheck.py 99
10232         self.basedir = "deepcheck/MutableChecker/delete_share"
10233         self.set_up_grid()
10234         CONTENTS = "a little bit of data"
10235-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10236+        CONTENTS_uploadable = MutableData(CONTENTS)
10237+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10238         def _stash_and_delete(node):
10239             self.node = node
10240             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10241hunk ./src/allmydata/test/test_deepcheck.py 223
10242             self.root = n
10243             self.root_uri = n.get_uri()
10244         d.addCallback(_created_root)
10245-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
10246+        d.addCallback(lambda ign:
10247+            c0.create_mutable_file(MutableData("mutable file contents")))
10248         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
10249         def _created_mutable(n):
10250             self.mutable = n
10251hunk ./src/allmydata/test/test_deepcheck.py 965
10252     def create_mangled(self, ignored, name):
10253         nodetype, mangletype = name.split("-", 1)
10254         if nodetype == "mutable":
10255-            d = self.g.clients[0].create_mutable_file("mutable file contents")
10256+            mutable_uploadable = MutableData("mutable file contents")
10257+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
10258             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
10259         elif nodetype == "large":
10260             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
10261hunk ./src/allmydata/test/test_dirnode.py 1304
10262     implements(IMutableFileNode)
10263     counter = 0
10264     def __init__(self, initial_contents=""):
10265-        self.data = self._get_initial_contents(initial_contents)
10266+        data = self._get_initial_contents(initial_contents)
10267+        self.data = data.read(data.get_size())
10268+        self.data = "".join(self.data)
10269+
10270         counter = FakeMutableFile.counter
10271         FakeMutableFile.counter += 1
10272         writekey = hashutil.ssk_writekey_hash(str(counter))
10273hunk ./src/allmydata/test/test_dirnode.py 1354
10274         pass
10275 
10276     def modify(self, modifier):
10277-        self.data = modifier(self.data, None, True)
10278+        data = modifier(self.data, None, True)
10279+        self.data = data
10280         return defer.succeed(None)
10281 
10282 class FakeNodeMaker(NodeMaker):
10283hunk ./src/allmydata/test/test_dirnode.py 1359
10284-    def create_mutable_file(self, contents="", keysize=None):
10285+    def create_mutable_file(self, contents="", keysize=None, version=None):
10286         return defer.succeed(FakeMutableFile(contents))
10287 
10288 class FakeClient2(Client):
10289hunk ./src/allmydata/test/test_filenode.py 98
10290         def _check_segment(res):
10291             self.failUnlessEqual(res, DATA[1:1+5])
10292         d.addCallback(_check_segment)
10293+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
10294+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
10295+        d.addCallback(lambda ignored:
10296+            fn1.get_size_of_best_version())
10297+        d.addCallback(lambda size:
10298+            self.failUnlessEqual(size, len(DATA)))
10299+        d.addCallback(lambda ignored:
10300+            fn1.download_to_data())
10301+        d.addCallback(lambda data:
10302+            self.failUnlessEqual(data, DATA))
10303+        d.addCallback(lambda ignored:
10304+            fn1.download_best_version())
10305+        d.addCallback(lambda data:
10306+            self.failUnlessEqual(data, DATA))
10307 
10308         return d
10309 
10310hunk ./src/allmydata/test/test_hung_server.py 10
10311 from allmydata.util.consumer import download_to_data
10312 from allmydata.immutable import upload
10313 from allmydata.mutable.common import UnrecoverableFileError
10314+from allmydata.mutable.publish import MutableData
10315 from allmydata.storage.common import storage_index_to_dir
10316 from allmydata.test.no_network import GridTestMixin
10317 from allmydata.test.common import ShouldFailMixin
10318hunk ./src/allmydata/test/test_hung_server.py 110
10319         self.servers = self.servers[5:] + self.servers[:5]
10320 
10321         if mutable:
10322-            d = nm.create_mutable_file(mutable_plaintext)
10323+            uploadable = MutableData(mutable_plaintext)
10324+            d = nm.create_mutable_file(uploadable)
10325             def _uploaded_mutable(node):
10326                 self.uri = node.get_uri()
10327                 self.shares = self.find_uri_shares(self.uri)
10328hunk ./src/allmydata/test/test_immutable.py 267
10329         d.addCallback(_after_attempt)
10330         return d
10331 
10332+    def test_download_to_data(self):
10333+        d = self.n.download_to_data()
10334+        d.addCallback(lambda data:
10335+            self.failUnlessEqual(data, common.TEST_DATA))
10336+        return d
10337 
10338hunk ./src/allmydata/test/test_immutable.py 273
10339+
10340+    def test_download_best_version(self):
10341+        d = self.n.download_best_version()
10342+        d.addCallback(lambda data:
10343+            self.failUnlessEqual(data, common.TEST_DATA))
10344+        return d
10345+
10346+
10347+    def test_get_best_readable_version(self):
10348+        d = self.n.get_best_readable_version()
10349+        d.addCallback(lambda n2:
10350+            self.failUnlessEqual(n2, self.n))
10351+        return d
10352+
10353+    def test_get_size_of_best_version(self):
10354+        d = self.n.get_size_of_best_version()
10355+        d.addCallback(lambda size:
10356+            self.failUnlessEqual(size, len(common.TEST_DATA)))
10357+        return d
10358+
10359+
10360 # XXX extend these tests to show bad behavior of various kinds from servers:
10361 # raising exception from each remove_foo() method, for example
10362 
10363hunk ./src/allmydata/test/test_mutable.py 2
10364 
10365-import struct
10366+import os
10367 from cStringIO import StringIO
10368 from twisted.trial import unittest
10369 from twisted.internet import defer, reactor
10370hunk ./src/allmydata/test/test_mutable.py 8
10371 from allmydata import uri, client
10372 from allmydata.nodemaker import NodeMaker
10373-from allmydata.util import base32
10374+from allmydata.util import base32, consumer
10375 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
10376      ssk_pubkey_fingerprint_hash
10377hunk ./src/allmydata/test/test_mutable.py 11
10378+from allmydata.util.deferredutil import gatherResults
10379 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
10380hunk ./src/allmydata/test/test_mutable.py 13
10381-     NotEnoughSharesError
10382+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
10383 from allmydata.monitor import Monitor
10384 from allmydata.test.common import ShouldFailMixin
10385 from allmydata.test.no_network import GridTestMixin
10386hunk ./src/allmydata/test/test_mutable.py 27
10387      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
10388      NotEnoughServersError, CorruptShareError
10389 from allmydata.mutable.retrieve import Retrieve
10390-from allmydata.mutable.publish import Publish
10391+from allmydata.mutable.publish import Publish, MutableFileHandle, \
10392+                                      MutableData, \
10393+                                      DEFAULT_MAX_SEGMENT_SIZE
10394 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
10395hunk ./src/allmydata/test/test_mutable.py 31
10396-from allmydata.mutable.layout import unpack_header, unpack_share
10397+from allmydata.mutable.layout import unpack_header, MDMFSlotReadProxy
10398 from allmydata.mutable.repairer import MustForceRepairError
10399 
10400 import allmydata.test.common_util as testutil
10401hunk ./src/allmydata/test/test_mutable.py 100
10402         self.storage = storage
10403         self.queries = 0
10404     def callRemote(self, methname, *args, **kwargs):
10405+        self.queries += 1
10406         def _call():
10407             meth = getattr(self, methname)
10408             return meth(*args, **kwargs)
10409hunk ./src/allmydata/test/test_mutable.py 107
10410         d = fireEventually()
10411         d.addCallback(lambda res: _call())
10412         return d
10413+
10414     def callRemoteOnly(self, methname, *args, **kwargs):
10415hunk ./src/allmydata/test/test_mutable.py 109
10416+        self.queries += 1
10417         d = self.callRemote(methname, *args, **kwargs)
10418         d.addBoth(lambda ignore: None)
10419         pass
10420hunk ./src/allmydata/test/test_mutable.py 157
10421             chr(ord(original[byte_offset]) ^ 0x01) +
10422             original[byte_offset+1:])
10423 
10424+def add_two(original, byte_offset):
10425+    # It isn't enough to simply flip a bit in the version number,
10426+    # because both 0 and 1 are valid versions. XORing with 0x02 (the
10427+    # same as adding two, for those values) yields 2 or 3, both invalid.
10428+    return (original[:byte_offset] +
10429+            chr(ord(original[byte_offset]) ^ 0x02) +
10430+            original[byte_offset+1:])
10430+
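
A quick sanity check of the add_two mapping (a throwaway snippet):

    for verbyte in (0, 1):                     # SDMF and MDMF version bytes
        assert verbyte ^ 0x02 == verbyte + 2   # 0 -> 2, 1 -> 3, both invalid
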
10431 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
10432     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
10433     # list of shnums to corrupt.
10434hunk ./src/allmydata/test/test_mutable.py 167
10435+    ds = []
10436     for peerid in s._peers:
10437         shares = s._peers[peerid]
10438         for shnum in shares:
10439hunk ./src/allmydata/test/test_mutable.py 175
10440                 and shnum not in shnums_to_corrupt):
10441                 continue
10442             data = shares[shnum]
10443-            (version,
10444-             seqnum,
10445-             root_hash,
10446-             IV,
10447-             k, N, segsize, datalen,
10448-             o) = unpack_header(data)
10449-            if isinstance(offset, tuple):
10450-                offset1, offset2 = offset
10451-            else:
10452-                offset1 = offset
10453-                offset2 = 0
10454-            if offset1 == "pubkey":
10455-                real_offset = 107
10456-            elif offset1 in o:
10457-                real_offset = o[offset1]
10458-            else:
10459-                real_offset = offset1
10460-            real_offset = int(real_offset) + offset2 + offset_offset
10461-            assert isinstance(real_offset, int), offset
10462-            shares[shnum] = flip_bit(data, real_offset)
10463-    return res
10464+            # We're feeding the reader all of the share data, so it
10465+            # won't need to use the rref that we didn't provide, nor the
10466+            # storage index that we didn't provide. We do this because
10467+            # the reader will work for both MDMF and SDMF.
10468+            reader = MDMFSlotReadProxy(None, None, shnum, data)
10469+            # We need to get the offsets for the next part.
10470+            d = reader.get_verinfo()
10471+            def _do_corruption(verinfo, data, shnum):
10472+                (seqnum,
10473+                 root_hash,
10474+                 IV,
10475+                 segsize,
10476+                 datalen,
10477+                 k, n, prefix, o) = verinfo
10478+                if isinstance(offset, tuple):
10479+                    offset1, offset2 = offset
10480+                else:
10481+                    offset1 = offset
10482+                    offset2 = 0
10483+                if offset1 == "pubkey" and IV:
10484+                    real_offset = 107
10485+                elif offset1 == "share_data" and not IV:
10486+                    real_offset = 107
10487+                elif offset1 in o:
10488+                    real_offset = o[offset1]
10489+                else:
10490+                    real_offset = offset1
10491+                real_offset = int(real_offset) + offset2 + offset_offset
10492+                assert isinstance(real_offset, int), offset
10493+                if offset1 == 0: # verbyte
10494+                    f = add_two
10495+                else:
10496+                    f = flip_bit
10497+                shares[shnum] = f(data, real_offset)
10498+            d.addCallback(_do_corruption, data, shnum)
10499+            ds.append(d)
10500+    dl = defer.DeferredList(ds)
10501+    dl.addCallback(lambda ignored: res)
10502+    return dl
10503 
10504 def make_storagebroker(s=None, num_peers=10):
10505     if not s:
10506hunk ./src/allmydata/test/test_mutable.py 256
10507             self.failUnlessEqual(len(shnums), 1)
10508         d.addCallback(_created)
10509         return d
10510+    test_create.timeout = 15
10511+
10512+
10513+    def test_create_mdmf(self):
10514+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10515+        def _created(n):
10516+            self.failUnless(isinstance(n, MutableFileNode))
10517+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
10518+            sb = self.nodemaker.storage_broker
10519+            peer0 = sorted(sb.get_all_serverids())[0]
10520+            shnums = self._storage._peers[peer0].keys()
10521+            self.failUnlessEqual(len(shnums), 1)
10522+        d.addCallback(_created)
10523+        return d
10524+
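
The version keyword threaded through these tests is the whole opt-in surface
at this layer; roughly, with the nodemaker this suite sets up (and assuming
SDMF remains the default when no version is given):

    from allmydata.interfaces import MDMF_VERSION

    d_sdmf = nodemaker.create_mutable_file()                      # SDMF default
    d_mdmf = nodemaker.create_mutable_file(version=MDMF_VERSION)  # opt in
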
10525 
10526     def test_serialize(self):
10527         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
10528hunk ./src/allmydata/test/test_mutable.py 301
10529             d.addCallback(lambda smap: smap.dump(StringIO()))
10530             d.addCallback(lambda sio:
10531                           self.failUnless("3-of-10" in sio.getvalue()))
10532-            d.addCallback(lambda res: n.overwrite("contents 1"))
10533+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10534             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10535             d.addCallback(lambda res: n.download_best_version())
10536             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10537hunk ./src/allmydata/test/test_mutable.py 308
10538             d.addCallback(lambda res: n.get_size_of_best_version())
10539             d.addCallback(lambda size:
10540                           self.failUnlessEqual(size, len("contents 1")))
10541-            d.addCallback(lambda res: n.overwrite("contents 2"))
10542+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10543             d.addCallback(lambda res: n.download_best_version())
10544             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10545             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10546hunk ./src/allmydata/test/test_mutable.py 312
10547-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10548+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10549             d.addCallback(lambda res: n.download_best_version())
10550             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10551             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10552hunk ./src/allmydata/test/test_mutable.py 324
10553             # mapupdate-to-retrieve data caching (i.e. make the shares larger
10554             # than the default readsize, which is 2000 bytes). A 15kB file
10555             # will have 5kB shares.
10556-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
10557+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
10558             d.addCallback(lambda res: n.download_best_version())
10559             d.addCallback(lambda res:
10560                           self.failUnlessEqual(res, "large size file" * 1000))
10561hunk ./src/allmydata/test/test_mutable.py 332
10562         d.addCallback(_created)
10563         return d
10564 
10565+
10566+    def test_upload_and_download_mdmf(self):
10567+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10568+        def _created(n):
10569+            d = defer.succeed(None)
10570+            d.addCallback(lambda ignored:
10571+                n.get_servermap(MODE_READ))
10572+            def _then(servermap):
10573+                dumped = servermap.dump(StringIO())
10574+                self.failUnlessIn("3-of-10", dumped.getvalue())
10575+            d.addCallback(_then)
10576+            # Now overwrite the contents with some new contents. We want
10577+            # to make them big enough to force the file to be uploaded
10578+            # in more than one segment.
10579+            big_contents = "contents1" * 100000 # about 900 KiB
10580+            big_contents_uploadable = MutableData(big_contents)
10581+            d.addCallback(lambda ignored:
10582+                n.overwrite(big_contents_uploadable))
10583+            d.addCallback(lambda ignored:
10584+                n.download_best_version())
10585+            d.addCallback(lambda data:
10586+                self.failUnlessEqual(data, big_contents))
10587+            # Overwrite the contents again with some new contents. As
10588+            # before, they need to be big enough to force multiple
10589+            # segments, so that we make the downloader deal with
10590+            # multiple segments.
10591+            bigger_contents = "contents2" * 1000000 # about 9 MiB
10592+            bigger_contents_uploadable = MutableData(bigger_contents)
10593+            d.addCallback(lambda ignored:
10594+                n.overwrite(bigger_contents_uploadable))
10595+            d.addCallback(lambda ignored:
10596+                n.download_best_version())
10597+            d.addCallback(lambda data:
10598+                self.failUnlessEqual(data, bigger_contents))
10599+            return d
10600+        d.addCallback(_created)
10601+        return d
10602+
10603+
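
The sizes in test_upload_and_download_mdmf above are chosen to force multiple
segments. Assuming the default mutable segment size is 128 KiB (the
DEFAULT_MAX_SEGMENT_SIZE imported above), the arithmetic works out to:

    seg = 128 * 1024                    # assumed DEFAULT_MAX_SEGMENT_SIZE
    small = len("contents1" * 100000)   # 900000 bytes
    large = len("contents2" * 1000000)  # 9000000 bytes
    assert small // seg == 6            # several segments, plus a tail
    assert large // seg == 68           # many more segments
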
10604+    def test_mdmf_write_count(self):
10605+        # Publishing an MDMF file should only cause one write for each
10606+        # share that is to be published. Otherwise, we introduce
10607+        # undesirable semantics that are a regression from SDMF
10608+        upload = MutableData("MDMF" * 100000) # about 400 KiB
10609+        d = self.nodemaker.create_mutable_file(upload,
10610+                                               version=MDMF_VERSION)
10611+        def _check_server_write_counts(ignored):
10612+            sb = self.nodemaker.storage_broker
10613+            peers = sb.test_servers.values()
10614+            for peer in peers:
10615+                self.failUnlessEqual(peer.queries, 1)
10616+        d.addCallback(_check_server_write_counts)
10617+        return d
10618+
10619+
10620     def test_create_with_initial_contents(self):
10621hunk ./src/allmydata/test/test_mutable.py 388
10622-        d = self.nodemaker.create_mutable_file("contents 1")
10623+        upload1 = MutableData("contents 1")
10624+        d = self.nodemaker.create_mutable_file(upload1)
10625         def _created(n):
10626             d = n.download_best_version()
10627             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10628hunk ./src/allmydata/test/test_mutable.py 393
10629-            d.addCallback(lambda res: n.overwrite("contents 2"))
10630+            upload2 = MutableData("contents 2")
10631+            d.addCallback(lambda res: n.overwrite(upload2))
10632             d.addCallback(lambda res: n.download_best_version())
10633             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10634             return d
10635hunk ./src/allmydata/test/test_mutable.py 400
10636         d.addCallback(_created)
10637         return d
10638+    test_create_with_initial_contents.timeout = 15
10639+
10640+
10641+    def test_create_mdmf_with_initial_contents(self):
10642+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
10643+        initial_contents_uploadable = MutableData(initial_contents)
10644+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
10645+                                               version=MDMF_VERSION)
10646+        def _created(n):
10647+            d = n.download_best_version()
10648+            d.addCallback(lambda data:
10649+                self.failUnlessEqual(data, initial_contents))
10650+            uploadable2 = MutableData(initial_contents + "foobarbaz")
10651+            d.addCallback(lambda ignored:
10652+                n.overwrite(uploadable2))
10653+            d.addCallback(lambda ignored:
10654+                n.download_best_version())
10655+            d.addCallback(lambda data:
10656+                self.failUnlessEqual(data, initial_contents +
10657+                                           "foobarbaz"))
10658+            return d
10659+        d.addCallback(_created)
10660+        return d
10661+    test_create_mdmf_with_initial_contents.timeout = 20
10662+
10663 
10664     def test_response_cache_memory_leak(self):
10665         d = self.nodemaker.create_mutable_file("contents")
10666hunk ./src/allmydata/test/test_mutable.py 451
10667             key = n.get_writekey()
10668             self.failUnless(isinstance(key, str), key)
10669             self.failUnlessEqual(len(key), 16) # AES key size
10670-            return data
10671+            return MutableData(data)
10672         d = self.nodemaker.create_mutable_file(_make_contents)
10673         def _created(n):
10674             return n.download_best_version()
10675hunk ./src/allmydata/test/test_mutable.py 459
10676         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
10677         return d
10678 
10679+
10680+    def test_create_mdmf_with_initial_contents_function(self):
10681+        data = "initial contents" * 100000
10682+        def _make_contents(n):
10683+            self.failUnless(isinstance(n, MutableFileNode))
10684+            key = n.get_writekey()
10685+            self.failUnless(isinstance(key, str), key)
10686+            self.failUnlessEqual(len(key), 16)
10687+            return MutableData(data)
10688+        d = self.nodemaker.create_mutable_file(_make_contents,
10689+                                               version=MDMF_VERSION)
10690+        d.addCallback(lambda n:
10691+            n.download_best_version())
10692+        d.addCallback(lambda data2:
10693+            self.failUnlessEqual(data2, data))
10694+        return d
10695+
10696+
10697     def test_create_with_too_large_contents(self):
10698         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10699hunk ./src/allmydata/test/test_mutable.py 479
10700-        d = self.nodemaker.create_mutable_file(BIG)
10701+        BIG_uploadable = MutableData(BIG)
10702+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
10703         def _created(n):
10704hunk ./src/allmydata/test/test_mutable.py 482
10705-            d = n.overwrite(BIG)
10706+            other_BIG_uploadable = MutableData(BIG)
10707+            d = n.overwrite(other_BIG_uploadable)
10708             return d
10709         d.addCallback(_created)
10710         return d
10711hunk ./src/allmydata/test/test_mutable.py 497
10712 
10713     def test_modify(self):
10714         def _modifier(old_contents, servermap, first_time):
10715-            return old_contents + "line2"
10716+            new_contents = old_contents + "line2"
10717+            return new_contents
10718         def _non_modifier(old_contents, servermap, first_time):
10719             return old_contents
10720         def _none_modifier(old_contents, servermap, first_time):
10721hunk ./src/allmydata/test/test_mutable.py 506
10722         def _error_modifier(old_contents, servermap, first_time):
10723             raise ValueError("oops")
10724         def _toobig_modifier(old_contents, servermap, first_time):
10725-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
10726+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10727+            return new_content
10728         calls = []
10729         def _ucw_error_modifier(old_contents, servermap, first_time):
10730             # simulate an UncoordinatedWriteError once
10731hunk ./src/allmydata/test/test_mutable.py 514
10732             calls.append(1)
10733             if len(calls) <= 1:
10734                 raise UncoordinatedWriteError("simulated")
10735-            return old_contents + "line3"
10736+            new_contents = old_contents + "line3"
10737+            return new_contents
10738         def _ucw_error_non_modifier(old_contents, servermap, first_time):
10739             # simulate an UncoordinatedWriteError once, and don't actually
10740             # modify the contents on subsequent invocations
10741hunk ./src/allmydata/test/test_mutable.py 524
10742                 raise UncoordinatedWriteError("simulated")
10743             return old_contents
10744 
10745-        d = self.nodemaker.create_mutable_file("line1")
10746+        initial_contents = "line1"
10747+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
10748         def _created(n):
10749             d = n.modify(_modifier)
10750             d.addCallback(lambda res: n.download_best_version())
10751hunk ./src/allmydata/test/test_mutable.py 582
10752             return d
10753         d.addCallback(_created)
10754         return d
10755+    test_modify.timeout = 15
10756+
10757 
10758     def test_modify_backoffer(self):
10759         def _modifier(old_contents, servermap, first_time):
10760hunk ./src/allmydata/test/test_mutable.py 609
10761         giveuper._delay = 0.1
10762         giveuper.factor = 1
10763 
10764-        d = self.nodemaker.create_mutable_file("line1")
10765+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
10766         def _created(n):
10767             d = n.modify(_modifier)
10768             d.addCallback(lambda res: n.download_best_version())
10769hunk ./src/allmydata/test/test_mutable.py 659
10770             d.addCallback(lambda smap: smap.dump(StringIO()))
10771             d.addCallback(lambda sio:
10772                           self.failUnless("3-of-10" in sio.getvalue()))
10773-            d.addCallback(lambda res: n.overwrite("contents 1"))
10774+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10775             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10776             d.addCallback(lambda res: n.download_best_version())
10777             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10778hunk ./src/allmydata/test/test_mutable.py 663
10779-            d.addCallback(lambda res: n.overwrite("contents 2"))
10780+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10781             d.addCallback(lambda res: n.download_best_version())
10782             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10783             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10784hunk ./src/allmydata/test/test_mutable.py 667
10785-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10786+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10787             d.addCallback(lambda res: n.download_best_version())
10788             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10789             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10790hunk ./src/allmydata/test/test_mutable.py 680
10791         return d
10792 
10793 
10794-class MakeShares(unittest.TestCase):
10795-    def test_encrypt(self):
10796-        nm = make_nodemaker()
10797-        CONTENTS = "some initial contents"
10798-        d = nm.create_mutable_file(CONTENTS)
10799-        def _created(fn):
10800-            p = Publish(fn, nm.storage_broker, None)
10801-            p.salt = "SALT" * 4
10802-            p.readkey = "\x00" * 16
10803-            p.newdata = CONTENTS
10804-            p.required_shares = 3
10805-            p.total_shares = 10
10806-            p.setup_encoding_parameters()
10807-            return p._encrypt_and_encode()
10808+    def test_size_after_servermap_update(self):
10809+        # a mutable file node should have something to say about how big
10810+        # it is after a servermap update is performed, since this tells
10811+        # us how large the best version of that mutable file is.
10812+        d = self.nodemaker.create_mutable_file()
10813+        def _created(n):
10814+            self.n = n
10815+            return n.get_servermap(MODE_READ)
10816+        d.addCallback(_created)
10817+        d.addCallback(lambda ignored:
10818+            self.failUnlessEqual(self.n.get_size(), 0))
10819+        d.addCallback(lambda ignored:
10820+            self.n.overwrite(MutableData("foobarbaz")))
10821+        d.addCallback(lambda ignored:
10822+            self.failUnlessEqual(self.n.get_size(), 9))
10823+        d.addCallback(lambda ignored:
10824+            self.nodemaker.create_mutable_file(MutableData("foobarbaz")))
10825+        d.addCallback(_created)
10826+        d.addCallback(lambda ignored:
10827+            self.failUnlessEqual(self.n.get_size(), 9))
10828+        return d
10829+
10830+
10831+class PublishMixin:
10832+    def publish_one(self):
10833+        # publish a file and create shares, which can then be manipulated
10834+        # later.
10835+        self.CONTENTS = "New contents go here" * 1000
10836+        self.uploadable = MutableData(self.CONTENTS)
10837+        self._storage = FakeStorage()
10838+        self._nodemaker = make_nodemaker(self._storage)
10839+        self._storage_broker = self._nodemaker.storage_broker
10840+        d = self._nodemaker.create_mutable_file(self.uploadable)
10841+        def _created(node):
10842+            self._fn = node
10843+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10844         d.addCallback(_created)
10845hunk ./src/allmydata/test/test_mutable.py 717
10846-        def _done(shares_and_shareids):
10847-            (shares, share_ids) = shares_and_shareids
10848-            self.failUnlessEqual(len(shares), 10)
10849-            for sh in shares:
10850-                self.failUnless(isinstance(sh, str))
10851-                self.failUnlessEqual(len(sh), 7)
10852-            self.failUnlessEqual(len(share_ids), 10)
10853-        d.addCallback(_done)
10854         return d
10855 
10856hunk ./src/allmydata/test/test_mutable.py 719
10857-    def test_generate(self):
10858-        nm = make_nodemaker()
10859-        CONTENTS = "some initial contents"
10860-        d = nm.create_mutable_file(CONTENTS)
10861-        def _created(fn):
10862-            self._fn = fn
10863-            p = Publish(fn, nm.storage_broker, None)
10864-            self._p = p
10865-            p.newdata = CONTENTS
10866-            p.required_shares = 3
10867-            p.total_shares = 10
10868-            p.setup_encoding_parameters()
10869-            p._new_seqnum = 3
10870-            p.salt = "SALT" * 4
10871-            # make some fake shares
10872-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
10873-            p._privkey = fn.get_privkey()
10874-            p._encprivkey = fn.get_encprivkey()
10875-            p._pubkey = fn.get_pubkey()
10876-            return p._generate_shares(shares_and_ids)
10877+    def publish_mdmf(self):
10878+        # like publish_one, except that the result is guaranteed to be
10879+        # an MDMF file.
10880+        # self.CONTENTS should have more than one segment.
10881+        self.CONTENTS = "This is an MDMF file" * 100000
10882+        self.uploadable = MutableData(self.CONTENTS)
10883+        self._storage = FakeStorage()
10884+        self._nodemaker = make_nodemaker(self._storage)
10885+        self._storage_broker = self._nodemaker.storage_broker
10886+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
10887+        def _created(node):
10888+            self._fn = node
10889+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10890         d.addCallback(_created)
10891hunk ./src/allmydata/test/test_mutable.py 733
10892-        def _generated(res):
10893-            p = self._p
10894-            final_shares = p.shares
10895-            root_hash = p.root_hash
10896-            self.failUnlessEqual(len(root_hash), 32)
10897-            self.failUnless(isinstance(final_shares, dict))
10898-            self.failUnlessEqual(len(final_shares), 10)
10899-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
10900-            for i,sh in final_shares.items():
10901-                self.failUnless(isinstance(sh, str))
10902-                # feed the share through the unpacker as a sanity-check
10903-                pieces = unpack_share(sh)
10904-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
10905-                 pubkey, signature, share_hash_chain, block_hash_tree,
10906-                 share_data, enc_privkey) = pieces
10907-                self.failUnlessEqual(u_seqnum, 3)
10908-                self.failUnlessEqual(u_root_hash, root_hash)
10909-                self.failUnlessEqual(k, 3)
10910-                self.failUnlessEqual(N, 10)
10911-                self.failUnlessEqual(segsize, 21)
10912-                self.failUnlessEqual(datalen, len(CONTENTS))
10913-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
10914-                sig_material = struct.pack(">BQ32s16s BBQQ",
10915-                                           0, p._new_seqnum, root_hash, IV,
10916-                                           k, N, segsize, datalen)
10917-                self.failUnless(p._pubkey.verify(sig_material, signature))
10918-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
10919-                self.failUnless(isinstance(share_hash_chain, dict))
10920-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
10921-                for shnum,share_hash in share_hash_chain.items():
10922-                    self.failUnless(isinstance(shnum, int))
10923-                    self.failUnless(isinstance(share_hash, str))
10924-                    self.failUnlessEqual(len(share_hash), 32)
10925-                self.failUnless(isinstance(block_hash_tree, list))
10926-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
10927-                self.failUnlessEqual(IV, "SALT"*4)
10928-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
10929-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
10930-        d.addCallback(_generated)
10931         return d
10932 
10933hunk ./src/allmydata/test/test_mutable.py 735
10934-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
10935-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
10936-    # when we publish to zero peers, we should get a NotEnoughSharesError
10937 
10938hunk ./src/allmydata/test/test_mutable.py 736
10939-class PublishMixin:
10940-    def publish_one(self):
10941-        # publish a file and create shares, which can then be manipulated
10942-        # later.
10943-        self.CONTENTS = "New contents go here" * 1000
10944+    def publish_sdmf(self):
10945+        # like publish_one, except that the result is guaranteed to be
10946+        # an SDMF file
10947+        self.CONTENTS = "This is an SDMF file" * 1000
10948+        self.uploadable = MutableData(self.CONTENTS)
10949         self._storage = FakeStorage()
10950         self._nodemaker = make_nodemaker(self._storage)
10951         self._storage_broker = self._nodemaker.storage_broker
10952hunk ./src/allmydata/test/test_mutable.py 744
10953-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10954+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
10955         def _created(node):
10956             self._fn = node
10957             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10958hunk ./src/allmydata/test/test_mutable.py 751
10959         d.addCallback(_created)
10960         return d
10961 
10962-    def publish_multiple(self):
10963+
10964+    def publish_multiple(self, version=0):
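+        # version=0 is SDMF_VERSION; pass version=MDMF_VERSION to build the
+        # same share history out of MDMF shares instead.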
10965         self.CONTENTS = ["Contents 0",
10966                          "Contents 1",
10967                          "Contents 2",
10968hunk ./src/allmydata/test/test_mutable.py 758
10969                          "Contents 3a",
10970                          "Contents 3b"]
10971+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
10972         self._copied_shares = {}
10973         self._storage = FakeStorage()
10974         self._nodemaker = make_nodemaker(self._storage)
10975hunk ./src/allmydata/test/test_mutable.py 762
10976-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
10977+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
10978         def _created(node):
10979             self._fn = node
10980             # now create multiple versions of the same file, and accumulate
10981hunk ./src/allmydata/test/test_mutable.py 769
10982             # their shares, so we can mix and match them later.
10983             d = defer.succeed(None)
10984             d.addCallback(self._copy_shares, 0)
10985-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
10986+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
10987             d.addCallback(self._copy_shares, 1)
10988hunk ./src/allmydata/test/test_mutable.py 771
10989-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
10990+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
10991             d.addCallback(self._copy_shares, 2)
10992hunk ./src/allmydata/test/test_mutable.py 773
10993-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
10994+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
10995             d.addCallback(self._copy_shares, 3)
10996             # now we replace all the shares with version s3, and upload a new
10997             # version to get s4b.
10998hunk ./src/allmydata/test/test_mutable.py 779
10999             rollback = dict([(i,2) for i in range(10)])
11000             d.addCallback(lambda res: self._set_versions(rollback))
11001-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
11002+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
11003             d.addCallback(self._copy_shares, 4)
11004             # we leave the storage in state 4
11005             return d
11006hunk ./src/allmydata/test/test_mutable.py 786
11007         d.addCallback(_created)
11008         return d
11009 
11010+
11011     def _copy_shares(self, ignored, index):
11012         shares = self._storage._peers
11013         # we need a deep copy
11014hunk ./src/allmydata/test/test_mutable.py 810
11015                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
11016 
11017 
11018+
11019+
11020 class Servermap(unittest.TestCase, PublishMixin):
11021     def setUp(self):
11022         return self.publish_one()
11023hunk ./src/allmydata/test/test_mutable.py 816
11024 
11025-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
11026+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
11027+                       update_range=None):
11028         if fn is None:
11029             fn = self._fn
11030         if sb is None:
11031hunk ./src/allmydata/test/test_mutable.py 823
11032             sb = self._storage_broker
11033         smu = ServermapUpdater(fn, sb, Monitor(),
11034-                               ServerMap(), mode)
11035+                               ServerMap(), mode, update_range=update_range)
11036         d = smu.update()
11037         return d
11038 
11039hunk ./src/allmydata/test/test_mutable.py 889
11040         # create a new file, which is large enough to knock the privkey out
11041         # of the early part of the file
11042         LARGE = "These are Larger contents" * 200 # about 5KB
11043-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
11044+        LARGE_uploadable = MutableData(LARGE)
11045+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
11046         def _created(large_fn):
11047             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
11048             return self.make_servermap(MODE_WRITE, large_fn2)
11049hunk ./src/allmydata/test/test_mutable.py 898
11050         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
11051         return d
11052 
11053+
11054     def test_mark_bad(self):
11055         d = defer.succeed(None)
11056         ms = self.make_servermap
11057hunk ./src/allmydata/test/test_mutable.py 944
11058         self._storage._peers = {} # delete all shares
11059         ms = self.make_servermap
11060         d = defer.succeed(None)
11061-
11062+
11063         d.addCallback(lambda res: ms(mode=MODE_CHECK))
11064         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
11065 
11066hunk ./src/allmydata/test/test_mutable.py 996
11067         return d
11068 
11069 
11070+    def test_servermapupdater_finds_mdmf_files(self):
11071+        # setUp already published an MDMF file for us. We just need to
11072+        # make sure that when we run the ServermapUpdater, the file is
11073+        # reported to have one recoverable version.
11074+        d = defer.succeed(None)
11075+        d.addCallback(lambda ignored:
11076+            self.publish_mdmf())
11077+        d.addCallback(lambda ignored:
11078+            self.make_servermap(mode=MODE_CHECK))
11079+        # Calling make_servermap also updates the servermap in the mode
11080+        # that we specify, so we just need to see what it says.
11081+        def _check_servermap(sm):
11082+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
11083+        d.addCallback(_check_servermap)
11084+        return d
11085+
11086+
11087+    def test_fetch_update(self):
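+        # A MODE_WRITE mapupdate restricted to a segment range should cache,
+        # for each share, the block and hash data covering that range in
+        # servermap.update_data, so that a later partial update can reuse it.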
11088+        d = defer.succeed(None)
11089+        d.addCallback(lambda ignored:
11090+            self.publish_mdmf())
11091+        d.addCallback(lambda ignored:
11092+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
11093+        def _check_servermap(sm):
11094+            # 10 shares
11095+            self.failUnlessEqual(len(sm.update_data), 10)
11096+            # one version
11097+            for data in sm.update_data.itervalues():
11098+                self.failUnlessEqual(len(data), 1)
11099+        d.addCallback(_check_servermap)
11100+        return d
11101+
11102+
11103+    def test_servermapupdater_finds_sdmf_files(self):
11104+        d = defer.succeed(None)
11105+        d.addCallback(lambda ignored:
11106+            self.publish_sdmf())
11107+        d.addCallback(lambda ignored:
11108+            self.make_servermap(mode=MODE_CHECK))
11109+        d.addCallback(lambda servermap:
11110+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
11111+        return d
11112+
11113 
11114 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
11115     def setUp(self):
11116hunk ./src/allmydata/test/test_mutable.py 1079
11117         if version is None:
11118             version = servermap.best_recoverable_version()
11119         r = Retrieve(self._fn, servermap, version)
11120-        return r.download()
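+        # Retrieve.download() now streams the plaintext into an IConsumer
+        # instead of returning one big string; MemoryConsumer (presumably the
+        # helper from allmydata.util.consumer) just accumulates the chunks so
+        # we can reassemble them for comparison.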
11121+        c = consumer.MemoryConsumer()
11122+        d = r.download(consumer=c)
11123+        d.addCallback(lambda mc: "".join(mc.chunks))
11124+        return d
11125+
11126 
11127     def test_basic(self):
11128         d = self.make_servermap()
11129hunk ./src/allmydata/test/test_mutable.py 1160
11130         return d
11131     test_no_servers_download.timeout = 15
11132 
11133+
11134     def _test_corrupt_all(self, offset, substring,
11135hunk ./src/allmydata/test/test_mutable.py 1162
11136-                          should_succeed=False, corrupt_early=True,
11137-                          failure_checker=None):
11138+                          should_succeed=False,
11139+                          corrupt_early=True,
11140+                          failure_checker=None,
11141+                          fetch_privkey=False):
11142         d = defer.succeed(None)
11143         if corrupt_early:
11144             d.addCallback(corrupt, self._storage, offset)
11145hunk ./src/allmydata/test/test_mutable.py 1182
11146                     self.failUnlessIn(substring, "".join(allproblems))
11147                 return servermap
11148             if should_succeed:
11149-                d1 = self._fn.download_version(servermap, ver)
11150+                d1 = self._fn.download_version(servermap, ver,
11151+                                               fetch_privkey)
11152                 d1.addCallback(lambda new_contents:
11153                                self.failUnlessEqual(new_contents, self.CONTENTS))
11154             else:
11155hunk ./src/allmydata/test/test_mutable.py 1190
11156                 d1 = self.shouldFail(NotEnoughSharesError,
11157                                      "_corrupt_all(offset=%s)" % (offset,),
11158                                      substring,
11159-                                     self._fn.download_version, servermap, ver)
11160+                                     self._fn.download_version, servermap,
11161+                                                                ver,
11162+                                                                fetch_privkey)
11163             if failure_checker:
11164                 d1.addCallback(failure_checker)
11165             d1.addCallback(lambda res: servermap)
11166hunk ./src/allmydata/test/test_mutable.py 1201
11167         return d
11168 
11169     def test_corrupt_all_verbyte(self):
11170-        # when the version byte is not 0, we hit an UnknownVersionError error
11171-        # in unpack_share().
11172+        # when the version byte is not 0 or 1, we hit an
11173+        # UnknownVersionError in unpack_share().
11174         d = self._test_corrupt_all(0, "UnknownVersionError")
11175         def _check_servermap(servermap):
11176             # and the dump should mention the problems
11177hunk ./src/allmydata/test/test_mutable.py 1208
11178             s = StringIO()
11179             dump = servermap.dump(s).getvalue()
11180-            self.failUnless("10 PROBLEMS" in dump, dump)
11181+            self.failUnless("30 PROBLEMS" in dump, dump)
11182         d.addCallback(_check_servermap)
11183         return d
11184 
11185hunk ./src/allmydata/test/test_mutable.py 1278
11186         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
11187 
11188 
11189+    def test_corrupt_all_encprivkey_late(self):
11190+        # this should work for the same reason as above, but we corrupt
11191+        # after the servermap update to exercise the error handling
11192+        # code.
11193+        # We need to remove the privkey from the node, or the retrieve
11194+        # process won't know to update it.
11195+        self._fn._privkey = None
11196+        return self._test_corrupt_all("enc_privkey",
11197+                                      None, # this shouldn't fail
11198+                                      should_succeed=True,
11199+                                      corrupt_early=False,
11200+                                      fetch_privkey=True)
11201+
11202+
11203     def test_corrupt_all_seqnum_late(self):
11204         # corrupting the seqnum between mapupdate and retrieve should result
11205         # in NotEnoughSharesError, since each share will look invalid
11206hunk ./src/allmydata/test/test_mutable.py 1298
11207         def _check(res):
11208             f = res[0]
11209             self.failUnless(f.check(NotEnoughSharesError))
11210-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
11211+            self.failUnless("uncoordinated write" in str(f))
11212         return self._test_corrupt_all(1, "ran out of peers",
11213                                       corrupt_early=False,
11214                                       failure_checker=_check)
11215hunk ./src/allmydata/test/test_mutable.py 1342
11216                             in str(servermap.problems[0]))
11217             ver = servermap.best_recoverable_version()
11218             r = Retrieve(self._fn, servermap, ver)
11219-            return r.download()
11220+            c = consumer.MemoryConsumer()
11221+            return r.download(c)
11222         d.addCallback(_do_retrieve)
11223hunk ./src/allmydata/test/test_mutable.py 1345
11224+        d.addCallback(lambda mc: "".join(mc.chunks))
11225         d.addCallback(lambda new_contents:
11226                       self.failUnlessEqual(new_contents, self.CONTENTS))
11227         return d
11228hunk ./src/allmydata/test/test_mutable.py 1350
11229 
11230-    def test_corrupt_some(self):
11231-        # corrupt the data of first five shares (so the servermap thinks
11232-        # they're good but retrieve marks them as bad), so that the
11233-        # MODE_READ set of 6 will be insufficient, forcing node.download to
11234-        # retry with more servers.
11235-        corrupt(None, self._storage, "share_data", range(5))
11236-        d = self.make_servermap()
11237+
11238+    def _test_corrupt_some(self, offset, mdmf=False):
11239+        if mdmf:
11240+            d = self.publish_mdmf()
11241+        else:
11242+            d = defer.succeed(None)
11243+        d.addCallback(lambda ignored:
11244+            corrupt(None, self._storage, offset, range(5)))
11245+        d.addCallback(lambda ignored:
11246+            self.make_servermap())
11247         def _do_retrieve(servermap):
11248             ver = servermap.best_recoverable_version()
11249             self.failUnless(ver)
11250hunk ./src/allmydata/test/test_mutable.py 1366
11251             return self._fn.download_best_version()
11252         d.addCallback(_do_retrieve)
11253         d.addCallback(lambda new_contents:
11254-                      self.failUnlessEqual(new_contents, self.CONTENTS))
11255+            self.failUnlessEqual(new_contents, self.CONTENTS))
11256         return d
11257 
11258hunk ./src/allmydata/test/test_mutable.py 1369
11259+
11260+    def test_corrupt_some(self):
11261+        # corrupt the data of first five shares (so the servermap thinks
11262+        # they're good but retrieve marks them as bad), so that the
11263+        # MODE_READ set of 6 will be insufficient, forcing node.download to
11264+        # retry with more servers.
11265+        return self._test_corrupt_some("share_data")
11266+
11267+
11268     def test_download_fails(self):
11269hunk ./src/allmydata/test/test_mutable.py 1379
11270-        corrupt(None, self._storage, "signature")
11271-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11272+        d = corrupt(None, self._storage, "signature")
11273+        d.addCallback(lambda ignored:
11274+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11275                             "no recoverable versions",
11276hunk ./src/allmydata/test/test_mutable.py 1383
11277-                            self._fn.download_best_version)
11278+                            self._fn.download_best_version))
11279         return d
11280 
11281 
11282hunk ./src/allmydata/test/test_mutable.py 1387
11283+
11284+    def test_corrupt_mdmf_block_hash_tree(self):
11285+        d = self.publish_mdmf()
11286+        d.addCallback(lambda ignored:
11287+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11288+                                   "block hash tree failure",
11289+                                   corrupt_early=True,
11290+                                   should_succeed=False))
11291+        return d
11292+
11293+
11294+    def test_corrupt_mdmf_block_hash_tree_late(self):
11295+        d = self.publish_mdmf()
11296+        d.addCallback(lambda ignored:
11297+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11298+                                   "block hash tree failure",
11299+                                   corrupt_early=False,
11300+                                   should_succeed=False))
11301+        return d
11302+
11303+
11304+    def test_corrupt_mdmf_share_data(self):
11305+        d = self.publish_mdmf()
11306+        d.addCallback(lambda ignored:
11307+            # TODO: Find out what the block size is and corrupt a
11308+            # specific block, rather than just guessing.
11309+            self._test_corrupt_all(("share_data", 12 * 40),
11310+                                    "block hash tree failure",
11311+                                    corrupt_early=True,
11312+                                    should_succeed=False))
11313+        return d
11314+
11315+
11316+    def test_corrupt_some_mdmf(self):
11317+        return self._test_corrupt_some(("share_data", 12 * 40),
11318+                                       mdmf=True)
11319+
11320+
11321 class CheckerMixin:
11322     def check_good(self, r, where):
11323         self.failUnless(r.is_healthy(), where)
11324hunk ./src/allmydata/test/test_mutable.py 1455
11325         d.addCallback(self.check_good, "test_check_good")
11326         return d
11327 
11328+    def test_check_mdmf_good(self):
11329+        d = self.publish_mdmf()
11330+        d.addCallback(lambda ignored:
11331+            self._fn.check(Monitor()))
11332+        d.addCallback(self.check_good, "test_check_mdmf_good")
11333+        return d
11334+
11335     def test_check_no_shares(self):
11336         for shares in self._storage._peers.values():
11337             shares.clear()
11338hunk ./src/allmydata/test/test_mutable.py 1469
11339         d.addCallback(self.check_bad, "test_check_no_shares")
11340         return d
11341 
11342+    def test_check_mdmf_no_shares(self):
11343+        d = self.publish_mdmf()
11344+        def _then(ignored):
11345+            for shares in self._storage._peers.values():
11346+                shares.clear()
11347+        d.addCallback(_then)
11348+        d.addCallback(lambda ignored:
11349+            self._fn.check(Monitor()))
11350+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
11351+        return d
11352+
11353     def test_check_not_enough_shares(self):
11354         for shares in self._storage._peers.values():
11355             for shnum in shares.keys():
11356hunk ./src/allmydata/test/test_mutable.py 1489
11357         d.addCallback(self.check_bad, "test_check_not_enough_shares")
11358         return d
11359 
11360+    def test_check_mdmf_not_enough_shares(self):
11361+        d = self.publish_mdmf()
11362+        def _then(ignored):
11363+            for shares in self._storage._peers.values():
11364+                for shnum in shares.keys():
11365+                    if shnum > 0:
11366+                        del shares[shnum]
11367+        d.addCallback(_then)
11368+        d.addCallback(lambda ignored:
11369+            self._fn.check(Monitor()))
11370+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
11371+        return d
11372+
11373+
11374     def test_check_all_bad_sig(self):
11375hunk ./src/allmydata/test/test_mutable.py 1504
11376-        corrupt(None, self._storage, 1) # bad sig
11377-        d = self._fn.check(Monitor())
11378+        d = corrupt(None, self._storage, 1) # bad sig
11379+        d.addCallback(lambda ignored:
11380+            self._fn.check(Monitor()))
11381         d.addCallback(self.check_bad, "test_check_all_bad_sig")
11382         return d
11383 
11384hunk ./src/allmydata/test/test_mutable.py 1510
11385+    def test_check_mdmf_all_bad_sig(self):
11386+        d = self.publish_mdmf()
11387+        d.addCallback(lambda ignored:
11388+            corrupt(None, self._storage, 1))
11389+        d.addCallback(lambda ignored:
11390+            self._fn.check(Monitor()))
11391+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
11392+        return d
11393+
11394     def test_check_all_bad_blocks(self):
11395hunk ./src/allmydata/test/test_mutable.py 1520
11396-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11397+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11398         # the Checker won't notice this.. it doesn't look at actual data
11399hunk ./src/allmydata/test/test_mutable.py 1522
11400-        d = self._fn.check(Monitor())
11401+        d.addCallback(lambda ignored:
11402+            self._fn.check(Monitor()))
11403         d.addCallback(self.check_good, "test_check_all_bad_blocks")
11404         return d
11405 
11406hunk ./src/allmydata/test/test_mutable.py 1527
11407+
11408+    def test_check_mdmf_all_bad_blocks(self):
11409+        d = self.publish_mdmf()
11410+        d.addCallback(lambda ignored:
11411+            corrupt(None, self._storage, "share_data"))
11412+        d.addCallback(lambda ignored:
11413+            self._fn.check(Monitor()))
11414+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
11415+        return d
11416+
11417     def test_verify_good(self):
11418         d = self._fn.check(Monitor(), verify=True)
11419         d.addCallback(self.check_good, "test_verify_good")
11420hunk ./src/allmydata/test/test_mutable.py 1541
11421         return d
11422+    test_verify_good.timeout = 15
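+    # (trial honors a per-method 'timeout' attribute, in seconds; a full
+    # verify downloads and checks every byte of every share, so it gets
+    # extra headroom here)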
11423 
11424     def test_verify_all_bad_sig(self):
11425hunk ./src/allmydata/test/test_mutable.py 1544
11426-        corrupt(None, self._storage, 1) # bad sig
11427-        d = self._fn.check(Monitor(), verify=True)
11428+        d = corrupt(None, self._storage, 1) # bad sig
11429+        d.addCallback(lambda ignored:
11430+            self._fn.check(Monitor(), verify=True))
11431         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
11432         return d
11433 
11434hunk ./src/allmydata/test/test_mutable.py 1551
11435     def test_verify_one_bad_sig(self):
11436-        corrupt(None, self._storage, 1, [9]) # bad sig
11437-        d = self._fn.check(Monitor(), verify=True)
11438+        d = corrupt(None, self._storage, 1, [9]) # bad sig
11439+        d.addCallback(lambda ignored:
11440+            self._fn.check(Monitor(), verify=True))
11441         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
11442         return d
11443 
11444hunk ./src/allmydata/test/test_mutable.py 1558
11445     def test_verify_one_bad_block(self):
11446-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11447+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11448         # the Verifier *will* notice this, since it examines every byte
11449hunk ./src/allmydata/test/test_mutable.py 1560
11450-        d = self._fn.check(Monitor(), verify=True)
11451+        d.addCallback(lambda ignored:
11452+            self._fn.check(Monitor(), verify=True))
11453         d.addCallback(self.check_bad, "test_verify_one_bad_block")
11454         d.addCallback(self.check_expected_failure,
11455                       CorruptShareError, "block hash tree failure",
11456hunk ./src/allmydata/test/test_mutable.py 1569
11457         return d
11458 
11459     def test_verify_one_bad_sharehash(self):
11460-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
11461-        d = self._fn.check(Monitor(), verify=True)
11462+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
11463+        d.addCallback(lambda ignored:
11464+            self._fn.check(Monitor(), verify=True))
11465         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
11466         d.addCallback(self.check_expected_failure,
11467                       CorruptShareError, "corrupt hashes",
11468hunk ./src/allmydata/test/test_mutable.py 1579
11469         return d
11470 
11471     def test_verify_one_bad_encprivkey(self):
11472-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11473-        d = self._fn.check(Monitor(), verify=True)
11474+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11475+        d.addCallback(lambda ignored:
11476+            self._fn.check(Monitor(), verify=True))
11477         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
11478         d.addCallback(self.check_expected_failure,
11479                       CorruptShareError, "invalid privkey",
11480hunk ./src/allmydata/test/test_mutable.py 1589
11481         return d
11482 
11483     def test_verify_one_bad_encprivkey_uncheckable(self):
11484-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11485+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11486         readonly_fn = self._fn.get_readonly()
11487         # a read-only node has no way to validate the privkey
11488hunk ./src/allmydata/test/test_mutable.py 1592
11489-        d = readonly_fn.check(Monitor(), verify=True)
11490+        d.addCallback(lambda ignored:
11491+            readonly_fn.check(Monitor(), verify=True))
11492         d.addCallback(self.check_good,
11493                       "test_verify_one_bad_encprivkey_uncheckable")
11494         return d
11495hunk ./src/allmydata/test/test_mutable.py 1598
11496 
11497+
11498+    def test_verify_mdmf_good(self):
11499+        d = self.publish_mdmf()
11500+        d.addCallback(lambda ignored:
11501+            self._fn.check(Monitor(), verify=True))
11502+        d.addCallback(self.check_good, "test_verify_mdmf_good")
11503+        return d
11504+
11505+
11506+    def test_verify_mdmf_one_bad_block(self):
11507+        d = self.publish_mdmf()
11508+        d.addCallback(lambda ignored:
11509+            corrupt(None, self._storage, "share_data", [1]))
11510+        d.addCallback(lambda ignored:
11511+            self._fn.check(Monitor(), verify=True))
11512+        # We should find one bad block here
11513+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
11514+        d.addCallback(self.check_expected_failure,
11515+                      CorruptShareError, "block hash tree failure",
11516+                      "test_verify_mdmf_one_bad_block")
11517+        return d
11518+
11519+
11520+    def test_verify_mdmf_bad_encprivkey(self):
11521+        d = self.publish_mdmf()
11522+        d.addCallback(lambda ignored:
11523+            corrupt(None, self._storage, "enc_privkey", [1]))
11524+        d.addCallback(lambda ignored:
11525+            self._fn.check(Monitor(), verify=True))
11526+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
11527+        d.addCallback(self.check_expected_failure,
11528+                      CorruptShareError, "privkey",
11529+                      "test_verify_mdmf_bad_encprivkey")
11530+        return d
11531+
11532+
11533+    def test_verify_mdmf_bad_sig(self):
11534+        d = self.publish_mdmf()
11535+        d.addCallback(lambda ignored:
11536+            corrupt(None, self._storage, 1, [1]))
11537+        d.addCallback(lambda ignored:
11538+            self._fn.check(Monitor(), verify=True))
11539+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
11540+        return d
11541+
11542+
11543+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
11544+        d = self.publish_mdmf()
11545+        d.addCallback(lambda ignored:
11546+            corrupt(None, self._storage, "enc_privkey", [1]))
11547+        d.addCallback(lambda ignored:
11548+            self._fn.get_readonly())
11549+        d.addCallback(lambda fn:
11550+            fn.check(Monitor(), verify=True))
11551+        d.addCallback(self.check_good,
11552+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
11553+        return d
11554+
11555+
11556 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
11557 
11558     def get_shares(self, s):
11559hunk ./src/allmydata/test/test_mutable.py 1722
11560         current_shares = self.old_shares[-1]
11561         self.failUnlessEqual(old_shares, current_shares)
11562 
11563+
11564     def test_unrepairable_0shares(self):
11565         d = self.publish_one()
11566         def _delete_all_shares(ign):
11567hunk ./src/allmydata/test/test_mutable.py 1737
11568         d.addCallback(_check)
11569         return d
11570 
11571+    def test_mdmf_unrepairable_0shares(self):
11572+        d = self.publish_mdmf()
11573+        def _delete_all_shares(ign):
11574+            shares = self._storage._peers
11575+            for peerid in shares:
11576+                shares[peerid] = {}
11577+        d.addCallback(_delete_all_shares)
11578+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11579+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11580+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
11581+        return d
11582+
11583+
11584     def test_unrepairable_1share(self):
11585         d = self.publish_one()
11586         def _delete_all_shares(ign):
11587hunk ./src/allmydata/test/test_mutable.py 1766
11588         d.addCallback(_check)
11589         return d
11590 
11591+    def test_mdmf_unrepairable_1share(self):
11592+        d = self.publish_mdmf()
11593+        def _delete_all_shares(ign):
11594+            shares = self._storage._peers
11595+            for peerid in shares:
11596+                for shnum in list(shares[peerid]):
11597+                    if shnum > 0:
11598+                        del shares[peerid][shnum]
11599+        d.addCallback(_delete_all_shares)
11600+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11601+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11602+        def _check(crr):
11603+            self.failUnlessEqual(crr.get_successful(), False)
11604+        d.addCallback(_check)
11605+        return d
11606+
11607+    def test_repairable_5shares(self):
11608+        d = self.publish_sdmf()
11609+        def _delete_some_shares(ign):
11610+            shares = self._storage._peers
11611+            for peerid in shares:
11612+                for shnum in list(shares[peerid]):
11613+                    if shnum > 4:
11614+                        del shares[peerid][shnum]
11615+        d.addCallback(_delete_some_shares)
11616+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11617+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11618+        def _check(crr):
11619+            self.failUnlessEqual(crr.get_successful(), True)
11620+        d.addCallback(_check)
11621+        return d
11622+
11623+    def test_mdmf_repairable_5shares(self):
11624+        d = self.publish_mdmf()
11625+        def _delete_some_shares(ign):
11626+            shares = self._storage._peers
11627+            for peerid in shares:
11628+                for shnum in list(shares[peerid]):
11629+                    if shnum > 5:
11630+                        del shares[peerid][shnum]
11631+        d.addCallback(_delete_some_shares)
11632+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11633+        def _check(cr):
11634+            self.failIf(cr.is_healthy())
11635+            self.failUnless(cr.is_recoverable())
11636+            return cr
11637+        d.addCallback(_check)
11638+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11639+        def _check1(crr):
11640+            self.failUnlessEqual(crr.get_successful(), True)
11641+        d.addCallback(_check1)
11642+        return d
11643+
11644+
11645     def test_merge(self):
11646         self.old_shares = []
11647         d = self.publish_multiple()
11648hunk ./src/allmydata/test/test_mutable.py 1934
11649 class MultipleEncodings(unittest.TestCase):
11650     def setUp(self):
11651         self.CONTENTS = "New contents go here"
11652+        self.uploadable = MutableData(self.CONTENTS)
11653         self._storage = FakeStorage()
11654         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
11655         self._storage_broker = self._nodemaker.storage_broker
11656hunk ./src/allmydata/test/test_mutable.py 1938
11657-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
11658+        d = self._nodemaker.create_mutable_file(self.uploadable)
11659         def _created(node):
11660             self._fn = node
11661         d.addCallback(_created)
11662hunk ./src/allmydata/test/test_mutable.py 1944
11663         return d
11664 
11665-    def _encode(self, k, n, data):
11666+    def _encode(self, k, n, data, version=SDMF_VERSION):
11667         # encode 'data' into a peerid->shares dict.
11668 
11669         fn = self._fn
11670hunk ./src/allmydata/test/test_mutable.py 1960
11671         # and set the encoding parameters to something completely different
11672         fn2._required_shares = k
11673         fn2._total_shares = n
11674+        # Normally a servermap update would occur before a publish.
11675+        # Here, it doesn't, so we have to do it ourselves.
11676+        fn2.set_version(version)
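+        # (the mapupdate is presumably where the node learns which protocol
+        # version its shares use; Publish consults that to pick the share
+        # format)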
11677 
11678         s = self._storage
11679         s._peers = {} # clear existing storage
11680hunk ./src/allmydata/test/test_mutable.py 1967
11681         p2 = Publish(fn2, self._storage_broker, None)
11682-        d = p2.publish(data)
11683+        uploadable = MutableData(data)
11684+        d = p2.publish(uploadable)
11685         def _published(res):
11686             shares = s._peers
11687             s._peers = {}
11688hunk ./src/allmydata/test/test_mutable.py 2235
11689         self.basedir = "mutable/Problems/test_publish_surprise"
11690         self.set_up_grid()
11691         nm = self.g.clients[0].nodemaker
11692-        d = nm.create_mutable_file("contents 1")
11693+        d = nm.create_mutable_file(MutableData("contents 1"))
11694         def _created(n):
11695             d = defer.succeed(None)
11696             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11697hunk ./src/allmydata/test/test_mutable.py 2245
11698             d.addCallback(_got_smap1)
11699             # then modify the file, leaving the old map untouched
11700             d.addCallback(lambda res: log.msg("starting winning write"))
11701-            d.addCallback(lambda res: n.overwrite("contents 2"))
11702+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11703             # now attempt to modify the file with the old servermap. This
11704             # will look just like an uncoordinated write, in which every
11705             # single share got updated between our mapupdate and our publish
11706hunk ./src/allmydata/test/test_mutable.py 2254
11707                           self.shouldFail(UncoordinatedWriteError,
11708                                           "test_publish_surprise", None,
11709                                           n.upload,
11710-                                          "contents 2a", self.old_map))
11711+                                          MutableData("contents 2a"), self.old_map))
11712             return d
11713         d.addCallback(_created)
11714         return d
11715hunk ./src/allmydata/test/test_mutable.py 2263
11716         self.basedir = "mutable/Problems/test_retrieve_surprise"
11717         self.set_up_grid()
11718         nm = self.g.clients[0].nodemaker
11719-        d = nm.create_mutable_file("contents 1")
11720+        d = nm.create_mutable_file(MutableData("contents 1"))
11721         def _created(n):
11722             d = defer.succeed(None)
11723             d.addCallback(lambda res: n.get_servermap(MODE_READ))
11724hunk ./src/allmydata/test/test_mutable.py 2273
11725             d.addCallback(_got_smap1)
11726             # then modify the file, leaving the old map untouched
11727             d.addCallback(lambda res: log.msg("starting winning write"))
11728-            d.addCallback(lambda res: n.overwrite("contents 2"))
11729+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11730             # now attempt to retrieve the old version with the old servermap.
11731             # This will look like someone has changed the file since we
11732             # updated the servermap.
11733hunk ./src/allmydata/test/test_mutable.py 2282
11734             d.addCallback(lambda res:
11735                           self.shouldFail(NotEnoughSharesError,
11736                                           "test_retrieve_surprise",
11737-                                          "ran out of peers: have 0 shares (k=3)",
11738+                                          "ran out of peers: have 0 of 1",
11739                                           n.download_version,
11740                                           self.old_map,
11741                                           self.old_map.best_recoverable_version(),
11742hunk ./src/allmydata/test/test_mutable.py 2291
11743         d.addCallback(_created)
11744         return d
11745 
11746+
11747     def test_unexpected_shares(self):
11748         # upload the file, take a servermap, shut down one of the servers,
11749         # upload it again (causing shares to appear on a new server), then
11750hunk ./src/allmydata/test/test_mutable.py 2301
11751         self.basedir = "mutable/Problems/test_unexpected_shares"
11752         self.set_up_grid()
11753         nm = self.g.clients[0].nodemaker
11754-        d = nm.create_mutable_file("contents 1")
11755+        d = nm.create_mutable_file(MutableData("contents 1"))
11756         def _created(n):
11757             d = defer.succeed(None)
11758             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11759hunk ./src/allmydata/test/test_mutable.py 2313
11760                 self.g.remove_server(peer0)
11761                 # then modify the file, leaving the old map untouched
11762                 log.msg("starting winning write")
11763-                return n.overwrite("contents 2")
11764+                return n.overwrite(MutableData("contents 2"))
11765             d.addCallback(_got_smap1)
11766             # now attempt to modify the file with the old servermap. This
11767             # will look just like an uncoordinated write, in which every
11768hunk ./src/allmydata/test/test_mutable.py 2323
11769                           self.shouldFail(UncoordinatedWriteError,
11770                                           "test_surprise", None,
11771                                           n.upload,
11772-                                          "contents 2a", self.old_map))
11773+                                          MutableData("contents 2a"), self.old_map))
11774             return d
11775         d.addCallback(_created)
11776         return d
11777hunk ./src/allmydata/test/test_mutable.py 2327
11778+    test_unexpected_shares.timeout = 15
11779 
11780     def test_bad_server(self):
11781         # Break one server, then create the file: the initial publish should
11782hunk ./src/allmydata/test/test_mutable.py 2361
11783         d.addCallback(_break_peer0)
11784         # now "create" the file, using the pre-established key, and let the
11785         # initial publish finally happen
11786-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
11787+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
11788         # that ought to work
11789         def _got_node(n):
11790             d = n.download_best_version()
11791hunk ./src/allmydata/test/test_mutable.py 2370
11792             def _break_peer1(res):
11793                 self.g.break_server(self.server1.get_serverid())
11794             d.addCallback(_break_peer1)
11795-            d.addCallback(lambda res: n.overwrite("contents 2"))
11796+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11797             # that ought to work too
11798             d.addCallback(lambda res: n.download_best_version())
11799             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11800hunk ./src/allmydata/test/test_mutable.py 2402
11801         peerids = [s.get_serverid() for s in sb.get_connected_servers()]
11802         self.g.break_server(peerids[0])
11803 
11804-        d = nm.create_mutable_file("contents 1")
11805+        d = nm.create_mutable_file(MutableData("contents 1"))
11806         def _created(n):
11807             d = n.download_best_version()
11808             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
11809hunk ./src/allmydata/test/test_mutable.py 2410
11810             def _break_second_server(res):
11811                 self.g.break_server(peerids[1])
11812             d.addCallback(_break_second_server)
11813-            d.addCallback(lambda res: n.overwrite("contents 2"))
11814+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11815             # that ought to work too
11816             d.addCallback(lambda res: n.download_best_version())
11817             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11818hunk ./src/allmydata/test/test_mutable.py 2429
11819         d = self.shouldFail(NotEnoughServersError,
11820                             "test_publish_all_servers_bad",
11821                             "Ran out of non-bad servers",
11822-                            nm.create_mutable_file, "contents")
11823+                            nm.create_mutable_file, MutableData("contents"))
11824         return d
11825 
11826     def test_publish_no_servers(self):
11827hunk ./src/allmydata/test/test_mutable.py 2441
11828         d = self.shouldFail(NotEnoughServersError,
11829                             "test_publish_no_servers",
11830                             "Ran out of non-bad servers",
11831-                            nm.create_mutable_file, "contents")
11832+                            nm.create_mutable_file, MutableData("contents"))
11833         return d
11834     test_publish_no_servers.timeout = 30
11835 
11836hunk ./src/allmydata/test/test_mutable.py 2459
11837         # we need some contents that are large enough to push the privkey out
11838         # of the early part of the file
11839         LARGE = "These are Larger contents" * 2000 # about 50KB
11840-        d = nm.create_mutable_file(LARGE)
11841+        LARGE_uploadable = MutableData(LARGE)
11842+        d = nm.create_mutable_file(LARGE_uploadable)
11843         def _created(n):
11844             self.uri = n.get_uri()
11845             self.n2 = nm.create_from_cap(self.uri)
11846hunk ./src/allmydata/test/test_mutable.py 2495
11847         self.basedir = "mutable/Problems/test_privkey_query_missing"
11848         self.set_up_grid(num_servers=20)
11849         nm = self.g.clients[0].nodemaker
11850-        LARGE = "These are Larger contents" * 2000 # about 50KB
11851+        LARGE = "These are Larger contents" * 2000 # about 50KiB
11852+        LARGE_uploadable = MutableData(LARGE)
11853         nm._node_cache = DevNullDictionary() # disable the nodecache
11854 
11855hunk ./src/allmydata/test/test_mutable.py 2499
11856-        d = nm.create_mutable_file(LARGE)
11857+        d = nm.create_mutable_file(LARGE_uploadable)
11858         def _created(n):
11859             self.uri = n.get_uri()
11860             self.n2 = nm.create_from_cap(self.uri)
11861hunk ./src/allmydata/test/test_mutable.py 2509
11862         d.addCallback(_created)
11863         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
11864         return d
11865+
11866+
11867+    def test_block_and_hash_query_error(self):
11868+        # This tests for what happens when a query to a remote server
11869+        # fails in either the hash validation step or the block getting
11870+        # step (because of batching, this is the same actual query).
11871+        # We need to have the storage server persist up until the point
11872+        # that its prefix is validated, then suddenly die. This
11873+        # exercises some exception handling code in Retrieve.
11874+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
11875+        self.set_up_grid(num_servers=20)
11876+        nm = self.g.clients[0].nodemaker
11877+        CONTENTS = "contents" * 2000
11878+        CONTENTS_uploadable = MutableData(CONTENTS)
11879+        d = nm.create_mutable_file(CONTENTS_uploadable)
11880+        def _created(node):
11881+            self._node = node
11882+        d.addCallback(_created)
11883+        d.addCallback(lambda ignored:
11884+            self._node.get_servermap(MODE_READ))
11885+        def _then(servermap):
11886+            # we have our servermap. Now we set up the servers like the
11887+            # tests above -- the first one that gets a read call should
11888+            # start throwing errors, but only after returning its prefix
11889+            # for validation. Since we'll download without fetching the
11890+            # private key, the next query to the remote server will be
11891+            # for either a block and salt or for hashes, either of which
11892+            # will exercise the error handling code.
11893+            killer = FirstServerGetsKilled()
11894+            for (serverid, ss) in nm.storage_broker.get_all_servers():
11895+                ss.post_call_notifier = killer.notify
11896+            ver = servermap.best_recoverable_version()
11897+            assert ver
11898+            return self._node.download_version(servermap, ver)
11899+        d.addCallback(_then)
11900+        d.addCallback(lambda data:
11901+            self.failUnlessEqual(data, CONTENTS))
11902+        return d
11903+
11904+
11905+class FileHandle(unittest.TestCase):
11906+    def setUp(self):
11907+        self.test_data = "Test Data" * 50000
11908+        self.sio = StringIO(self.test_data)
11909+        self.uploadable = MutableFileHandle(self.sio)
11910+
11911+
11912+    def test_filehandle_read(self):
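+        # read() mirrors IUploadable and returns a list of byte-chunks
+        # rather than one string, hence the "".join() below.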
11913+        self.basedir = "mutable/FileHandle/test_filehandle_read"
11914+        chunk_size = 10
11915+        for i in xrange(0, len(self.test_data), chunk_size):
11916+            data = self.uploadable.read(chunk_size)
11917+            data = "".join(data)
11918+            start = i
11919+            end = i + chunk_size
11920+            self.failUnlessEqual(data, self.test_data[start:end])
11921+
11922+
11923+    def test_filehandle_get_size(self):
11924+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
11925+        actual_size = len(self.test_data)
11926+        size = self.uploadable.get_size()
11927+        self.failUnlessEqual(size, actual_size)
11928+
11929+
11930+    def test_filehandle_get_size_out_of_order(self):
11931+        # We should be able to call get_size whenever we want without
11932+        # disturbing the location of the seek pointer.
11933+        chunk_size = 100
11934+        data = self.uploadable.read(chunk_size)
11935+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11936+
11937+        # Now get the size.
11938+        size = self.uploadable.get_size()
11939+        self.failUnlessEqual(size, len(self.test_data))
11940+
11941+        # Now get more data. We should be right where we left off.
11942+        more_data = self.uploadable.read(chunk_size)
11943+        start = chunk_size
11944+        end = chunk_size * 2
11945+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11946+
11947+
11948+    def test_filehandle_file(self):
11949+        # Make sure that the MutableFileHandle works on a file as well
11950+        # as a StringIO object, since in some cases it will be asked to
11951+        # deal with files.
11952+        self.basedir = self.mktemp()
11953+        # mktemp() only picks a name; the directory must be created by hand.
11954+        os.mkdir(self.basedir)
11955+        f_path = os.path.join(self.basedir, "test_file")
11956+        f = open(f_path, "w")
11957+        f.write(self.test_data)
11958+        f.close()
11959+        f = open(f_path, "r")
11960+
11961+        uploadable = MutableFileHandle(f)
11962+
11963+        data = uploadable.read(len(self.test_data))
11964+        self.failUnlessEqual("".join(data), self.test_data)
11965+        size = uploadable.get_size()
11966+        self.failUnlessEqual(size, len(self.test_data))
11967+
11968+
11969+    def test_close(self):
11970+        # Make sure that the MutableFileHandle closes its handle when
11971+        # told to do so.
11972+        self.uploadable.close()
11973+        self.failUnless(self.sio.closed)
11974+
11975+
11976+class DataHandle(unittest.TestCase):
11977+    def setUp(self):
11978+        self.test_data = "Test Data" * 50000
11979+        self.uploadable = MutableData(self.test_data)
11980+
11981+
11982+    def test_datahandle_read(self):
11983+        chunk_size = 10
11984+        for i in xrange(0, len(self.test_data), chunk_size):
11985+            data = self.uploadable.read(chunk_size)
11986+            data = "".join(data)
11987+            start = i
11988+            end = i + chunk_size
11989+            self.failUnlessEqual(data, self.test_data[start:end])
11990+
11991+
11992+    def test_datahandle_get_size(self):
11993+        actual_size = len(self.test_data)
11994+        size = self.uploadable.get_size()
11995+        self.failUnlessEqual(size, actual_size)
11996+
11997+
11998+    def test_datahandle_get_size_out_of_order(self):
11999+        # We should be able to call get_size whenever we want without
12000+        # disturbing the location of the seek pointer.
12001+        chunk_size = 100
12002+        data = self.uploadable.read(chunk_size)
12003+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
12004+
12005+        # Now get the size.
12006+        size = self.uploadable.get_size()
12007+        self.failUnlessEqual(size, len(self.test_data))
12008+
12009+        # Now get more data. We should be right where we left off.
12010+        more_data = self.uploadable.read(chunk_size)
12011+        start = chunk_size
12012+        end = chunk_size * 2
12013+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
12014+
12015+
12016+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
12017+              PublishMixin):
12018+    def setUp(self):
12019+        GridTestMixin.setUp(self)
12020+        self.basedir = self.mktemp()
12021+        self.set_up_grid()
12022+        self.c = self.g.clients[0]
12023+        self.nm = self.c.nodemaker
12024+        self.data = "test data" * 100000 # about 900 KiB; MDMF
12025+        self.small_data = "test data" * 10 # about 90 B; SDMF
12026+        return self.do_upload()
12027+
12028+
12029+    def do_upload(self):
12030+        d1 = self.nm.create_mutable_file(MutableData(self.data),
12031+                                         version=MDMF_VERSION)
12032+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
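+        # (no version= argument here: create_mutable_file presumably
+        # defaults to SDMF, so n2 below really is an SDMF node)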
12033+        dl = gatherResults([d1, d2])
12034+        def _then((n1, n2)):
12035+            assert isinstance(n1, MutableFileNode)
12036+            assert isinstance(n2, MutableFileNode)
12037+
12038+            self.mdmf_node = n1
12039+            self.sdmf_node = n2
12040+        dl.addCallback(_then)
12041+        return dl
12042+
12043+
12044+    def test_get_readonly_mutable_version(self):
12045+        # Attempting to get a mutable version of a mutable file from a
12046+        # filenode initialized with a readcap should return a readonly
12047+        # version of that same node.
12048+        ro = self.mdmf_node.get_readonly()
12049+        d = ro.get_best_mutable_version()
12050+        d.addCallback(lambda version:
12051+            self.failUnless(version.is_readonly()))
12052+        d.addCallback(lambda ignored:
12053+            self.sdmf_node.get_readonly().get_best_mutable_version())
12054+        d.addCallback(lambda version:
12055+            self.failUnless(version.is_readonly()))
12056+        return d
12057+
12058+
12059+    def test_get_sequence_number(self):
12060+        d = self.mdmf_node.get_best_readable_version()
12061+        d.addCallback(lambda bv:
12062+            self.failUnlessEqual(bv.get_sequence_number(), 1))
12063+        d.addCallback(lambda ignored:
12064+            self.sdmf_node.get_best_readable_version())
12065+        d.addCallback(lambda bv:
12066+            self.failUnlessEqual(bv.get_sequence_number(), 1))
12067+        # Now update. The sequence number should then be 2 in both
12068+        # cases, since each successful write increments it.
12069+        def _do_update(ignored):
12070+            new_data = MutableData("foo bar baz" * 100000)
12071+            new_small_data = MutableData("foo bar baz" * 10)
12072+            d1 = self.mdmf_node.overwrite(new_data)
12073+            d2 = self.sdmf_node.overwrite(new_small_data)
12074+            dl = gatherResults([d1, d2])
12075+            return dl
12076+        d.addCallback(_do_update)
12077+        d.addCallback(lambda ignored:
12078+            self.mdmf_node.get_best_readable_version())
12079+        d.addCallback(lambda bv:
12080+            self.failUnlessEqual(bv.get_sequence_number(), 2))
12081+        d.addCallback(lambda ignored:
12082+            self.sdmf_node.get_best_readable_version())
12083+        d.addCallback(lambda bv:
12084+            self.failUnlessEqual(bv.get_sequence_number(), 2))
12085+        return d
12086+
12087+
12088+    def test_get_writekey(self):
12089+        d = self.mdmf_node.get_best_mutable_version()
12090+        d.addCallback(lambda bv:
12091+            self.failUnlessEqual(bv.get_writekey(),
12092+                                 self.mdmf_node.get_writekey()))
12093+        d.addCallback(lambda ignored:
12094+            self.sdmf_node.get_best_mutable_version())
12095+        d.addCallback(lambda bv:
12096+            self.failUnlessEqual(bv.get_writekey(),
12097+                                 self.sdmf_node.get_writekey()))
12098+        return d
12099+
12100+
12101+    def test_get_storage_index(self):
12102+        d = self.mdmf_node.get_best_mutable_version()
12103+        d.addCallback(lambda bv:
12104+            self.failUnlessEqual(bv.get_storage_index(),
12105+                                 self.mdmf_node.get_storage_index()))
12106+        d.addCallback(lambda ignored:
12107+            self.sdmf_node.get_best_mutable_version())
12108+        d.addCallback(lambda bv:
12109+            self.failUnlessEqual(bv.get_storage_index(),
12110+                                 self.sdmf_node.get_storage_index()))
12111+        return d
12112+
12113+
12114+    def test_get_readonly_version(self):
12115+        d = self.mdmf_node.get_best_readable_version()
12116+        d.addCallback(lambda bv:
12117+            self.failUnless(bv.is_readonly()))
12118+        d.addCallback(lambda ignored:
12119+            self.sdmf_node.get_best_readable_version())
12120+        d.addCallback(lambda bv:
12121+            self.failUnless(bv.is_readonly()))
12122+        return d
12123+
12124+
12125+    def test_get_mutable_version(self):
12126+        d = self.mdmf_node.get_best_mutable_version()
12127+        d.addCallback(lambda bv:
12128+            self.failIf(bv.is_readonly()))
12129+        d.addCallback(lambda ignored:
12130+            self.sdmf_node.get_best_mutable_version())
12131+        d.addCallback(lambda bv:
12132+            self.failIf(bv.is_readonly()))
12133+        return d
12134+
12135+
12136+    def test_toplevel_overwrite(self):
12137+        new_data = MutableData("foo bar baz" * 100000)
12138+        new_small_data = MutableData("foo bar baz" * 10)
12139+        d = self.mdmf_node.overwrite(new_data)
12140+        d.addCallback(lambda ignored:
12141+            self.mdmf_node.download_best_version())
12142+        d.addCallback(lambda data:
12143+            self.failUnlessEqual(data, "foo bar baz" * 100000))
12144+        d.addCallback(lambda ignored:
12145+            self.sdmf_node.overwrite(new_small_data))
12146+        d.addCallback(lambda ignored:
12147+            self.sdmf_node.download_best_version())
12148+        d.addCallback(lambda data:
12149+            self.failUnlessEqual(data, "foo bar baz" * 10))
12150+        return d
12151+
12152+
12153+    def test_toplevel_modify(self):
12154+        def modifier(old_contents, servermap, first_time):
12155+            return old_contents + "modified"
12156+        d = self.mdmf_node.modify(modifier)
12157+        d.addCallback(lambda ignored:
12158+            self.mdmf_node.download_best_version())
12159+        d.addCallback(lambda data:
12160+            self.failUnlessIn("modified", data))
12161+        d.addCallback(lambda ignored:
12162+            self.sdmf_node.modify(modifier))
12163+        d.addCallback(lambda ignored:
12164+            self.sdmf_node.download_best_version())
12165+        d.addCallback(lambda data:
12166+            self.failUnlessIn("modified", data))
12167+        return d
12168+
12169+
12170+    def test_version_modify(self):
12171+        # TODO: When we can publish multiple versions, alter this test
12172+        # to modify a version other than the best usable version, then
12173+        # check that the best recoverable version is the modified one.
12174+        def modifier(old_contents, servermap, first_time):
12175+            return old_contents + "modified"
12176+        d = self.mdmf_node.modify(modifier)
12177+        d.addCallback(lambda ignored:
12178+            self.mdmf_node.download_best_version())
12179+        d.addCallback(lambda data:
12180+            self.failUnlessIn("modified", data))
12181+        d.addCallback(lambda ignored:
12182+            self.sdmf_node.modify(modifier))
12183+        d.addCallback(lambda ignored:
12184+            self.sdmf_node.download_best_version())
12185+        d.addCallback(lambda data:
12186+            self.failUnlessIn("modified", data))
12187+        return d
12188+
12189+
12190+    def test_download_version(self):
12191+        d = self.publish_multiple()
12192+        # We want to have two recoverable versions on the grid.
12193+        d.addCallback(lambda res:
12194+                      self._set_versions({0:0,2:0,4:0,6:0,8:0,
12195+                                          1:1,3:1,5:1,7:1,9:1}))
12196+        # Now try to download each version. We should get the plaintext
12197+        # associated with that version.
12198+        d.addCallback(lambda ignored:
12199+            self._fn.get_servermap(mode=MODE_READ))
12200+        def _got_servermap(smap):
12201+            versions = smap.recoverable_versions()
12202+            assert len(versions) == 2
12203+
12204+            self.servermap = smap
12205+            self.version1, self.version2 = versions
12206+            assert self.version1 != self.version2
12207+
12208+            self.version1_seqnum = self.version1[0]
12209+            self.version2_seqnum = self.version2[0]
12210+            self.version1_index = self.version1_seqnum - 1
12211+            self.version2_index = self.version2_seqnum - 1
12212+
12213+        d.addCallback(_got_servermap)
12214+        d.addCallback(lambda ignored:
12215+            self._fn.download_version(self.servermap, self.version1))
12216+        d.addCallback(lambda results:
12217+            self.failUnlessEqual(self.CONTENTS[self.version1_index],
12218+                                 results))
12219+        d.addCallback(lambda ignored:
12220+            self._fn.download_version(self.servermap, self.version2))
12221+        d.addCallback(lambda results:
12222+            self.failUnlessEqual(self.CONTENTS[self.version2_index],
12223+                                 results))
12224+        return d
12225+
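The tuple indexing above leans on two assumptions worth spelling out: a recoverable version reported by the servermap is a verinfo tuple whose first element is its sequence number, and publish_multiple (from PublishMixin) publishes self.CONTENTS[i] under sequence number i + 1. A hedged illustration of that mapping, with the remaining verinfo fields elided:

    CONTENTS = ["version 1 plaintext", "version 2 plaintext"]
    version = (2, "<roothash elided>")   # verinfo: (seqnum, ...)
    seqnum = version[0]
    assert CONTENTS[seqnum - 1] == "version 2 plaintext"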
12226+
12227+    def test_download_nonexistent_version(self):
12228+        d = self.mdmf_node.get_servermap(mode=MODE_WRITE)
12229+        def _set_servermap(servermap):
12230+            self.servermap = servermap
12231+        d.addCallback(_set_servermap)
12232+        d.addCallback(lambda ignored:
12233+           self.shouldFail(UnrecoverableFileError, "nonexistent version",
12234+                           None,
12235+                           self.mdmf_node.download_version, self.servermap,
12236+                           "not a version"))
12237+        return d
12238+
12239+
12240+    def test_partial_read(self):
12241+        # read the file a chunk at a time, and check that the results
12242+        # are what we expect.
12243+        d = self.mdmf_node.get_best_readable_version()
12244+        def _read_data(version):
12245+            c = consumer.MemoryConsumer()
12246+            d2 = defer.succeed(None)
12247+            for i in xrange(0, len(self.data), 10000):
12248+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
12249+            d2.addCallback(lambda ignored:
12250+                self.failUnlessEqual(self.data, "".join(c.chunks)))
12251+            return d2
12252+        d.addCallback(_read_data)
12253+        return d
12254+
12255+
12256+    def test_read(self):
12257+        d = self.mdmf_node.get_best_readable_version()
12258+        def _read_data(version):
12259+            c = consumer.MemoryConsumer()
12260+            d2 = defer.succeed(None)
12261+            d2.addCallback(lambda ignored: version.read(c))
12262+            d2.addCallback(lambda ignored:
12263+                self.failUnlessEqual("".join(c.chunks), self.data))
12264+            return d2
12265+        d.addCallback(_read_data)
12266+        return d
12267+
12268+
12269+    def test_download_best_version(self):
12270+        d = self.mdmf_node.download_best_version()
12271+        d.addCallback(lambda data:
12272+            self.failUnlessEqual(data, self.data))
12273+        d.addCallback(lambda ignored:
12274+            self.sdmf_node.download_best_version())
12275+        d.addCallback(lambda data:
12276+            self.failUnlessEqual(data, self.small_data))
12277+        return d
12278+
12279+
12280+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
12281+    def setUp(self):
12282+        GridTestMixin.setUp(self)
12283+        self.basedir = self.mktemp()
12284+        self.set_up_grid()
12285+        self.c = self.g.clients[0]
12286+        self.nm = self.c.nodemaker
12287+        self.data = "test data" * 100000 # about 900 KiB; MDMF
12288+        self.small_data = "test data" * 10 # about 90 B; SDMF
12289+        return self.do_upload()
12290+
12291+
12292+    def do_upload(self):
12293+        d1 = self.nm.create_mutable_file(MutableData(self.data),
12294+                                         version=MDMF_VERSION)
12295+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
12296+        dl = gatherResults([d1, d2])
12297+        def _then((n1, n2)):
12298+            assert isinstance(n1, MutableFileNode)
12299+            assert isinstance(n2, MutableFileNode)
12300+
12301+            self.mdmf_node = n1
12302+            self.sdmf_node = n2
12303+        dl.addCallback(_then)
12304+        return dl
12305+
12306+
12307+    def test_append(self):
12308+        # We should be able to append data to the end of a mutable
12309+        # file and get what we expect.
12310+        new_data = self.data + "appended"
12311+        d = self.mdmf_node.get_best_mutable_version()
12312+        d.addCallback(lambda mv:
12313+            mv.update(MutableData("appended"), len(self.data)))
12314+        d.addCallback(lambda ignored:
12315+            self.mdmf_node.download_best_version())
12316+        d.addCallback(lambda results:
12317+            self.failUnlessEqual(results, new_data))
12318+        return d
12319+    test_append.timeout = 15
12320+
12321+
12322+    def test_replace(self):
12323+        # We should be able to replace data in the middle of a mutable
12324+        # file and get what we expect back.
12325+        new_data = self.data[:100]
12326+        new_data += "appended"
12327+        new_data += self.data[108:]
12328+        d = self.mdmf_node.get_best_mutable_version()
12329+        d.addCallback(lambda mv:
12330+            mv.update(MutableData("appended"), 100))
12331+        d.addCallback(lambda ignored:
12332+            self.mdmf_node.download_best_version())
12333+        d.addCallback(lambda results:
12334+            self.failUnlessEqual(results, new_data))
12335+        return d
12336+
12337+
12338+    def test_replace_and_extend(self):
12339+        # We should be able to replace data in the middle of a mutable
12340+        # file and extend that mutable file and get what we expect.
12341+        new_data = self.data[:100]
12342+        new_data += "modified " * 100000
12343+        d = self.mdmf_node.get_best_mutable_version()
12344+        d.addCallback(lambda mv:
12345+            mv.update(MutableData("modified " * 100000), 100))
12346+        d.addCallback(lambda ignored:
12347+            self.mdmf_node.download_best_version())
12348+        d.addCallback(lambda results:
12349+            self.failUnlessEqual(results, new_data))
12350+        return d
12351+
12352+
12353+    def test_append_power_of_two(self):
12354+        # If we attempt to extend a mutable file so that its segment
12355+        # count crosses a power-of-two boundary, the update operation
12356+        # should know how to reencode the file.
12357+
12358+        # Note that the data populating self.mdmf_node is about 900 KiB
12359+        # long -- this is 7 segments at the default segment size. So we
12360+        # need to add 2 segments' worth of data to push it over a
12361+        # power-of-two boundary.
12362+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12363+        new_data = self.data + (segment * 2)
12364+        d = self.mdmf_node.get_best_mutable_version()
12365+        d.addCallback(lambda mv:
12366+            mv.update(MutableData(segment * 2), len(self.data)))
12367+        d.addCallback(lambda ignored:
12368+            self.mdmf_node.download_best_version())
12369+        d.addCallback(lambda results:
12370+            self.failUnlessEqual(results, new_data))
12371+        return d
12372+    test_append_power_of_two.timeout = 15
12373+
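The arithmetic behind that power-of-two comment, assuming DEFAULT_MAX_SEGMENT_SIZE is the usual 128 KiB: 900,000 bytes is 7 segments, which fits under the 8 leaves of a depth-3 hash tree, while two more segments make 9 and force a deeper tree, so the update must re-encode. A quick sketch:

    import math

    SEGSIZE = 128 * 1024  # assumed value of DEFAULT_MAX_SEGMENT_SIZE

    def segcount(size):
        return max(1, int(math.ceil(size / float(SEGSIZE))))

    assert segcount(900000) == 7                # fits 2**3 = 8 leaves
    assert segcount(900000 + 2 * SEGSIZE) == 9  # needs 2**4 = 16 leaves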
12374+
12375+    def test_update_sdmf(self):
12376+        # Running update on a single-segment file should still work.
12377+        new_data = self.small_data + "appended"
12378+        d = self.sdmf_node.get_best_mutable_version()
12379+        d.addCallback(lambda mv:
12380+            mv.update(MutableData("appended"), len(self.small_data)))
12381+        d.addCallback(lambda ignored:
12382+            self.sdmf_node.download_best_version())
12383+        d.addCallback(lambda results:
12384+            self.failUnlessEqual(results, new_data))
12385+        return d
12386+
12387+    def test_replace_in_last_segment(self):
12388+        # The wrapper should know how to handle the tail segment
12389+        # appropriately.
12390+        replace_offset = len(self.data) - 100
12391+        new_data = self.data[:replace_offset] + "replaced"
12392+        rest_offset = replace_offset + len("replaced")
12393+        new_data += self.data[rest_offset:]
12394+        d = self.mdmf_node.get_best_mutable_version()
12395+        d.addCallback(lambda mv:
12396+            mv.update(MutableData("replaced"), replace_offset))
12397+        d.addCallback(lambda ignored:
12398+            self.mdmf_node.download_best_version())
12399+        d.addCallback(lambda results:
12400+            self.failUnlessEqual(results, new_data))
12401+        return d
12402+
12403+
12404+    def test_multiple_segment_replace(self):
12405+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
12406+        new_data = self.data[:replace_offset]
12407+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12408+        new_data += 2 * new_segment
12409+        new_data += "replaced"
12410+        rest_offset = len(new_data)
12411+        new_data += self.data[rest_offset:]
12412+        d = self.mdmf_node.get_best_mutable_version()
12413+        d.addCallback(lambda mv:
12414+            mv.update(MutableData((2 * new_segment) + "replaced"),
12415+                      replace_offset))
12416+        d.addCallback(lambda ignored:
12417+            self.mdmf_node.download_best_version())
12418+        d.addCallback(lambda results:
12419+            self.failUnlessEqual(results, new_data))
12420+        return d
12421hunk ./src/allmydata/test/test_sftp.py 32
12422 
12423 from allmydata.util.consumer import download_to_data
12424 from allmydata.immutable import upload
12425+from allmydata.mutable import publish
12426 from allmydata.test.no_network import GridTestMixin
12427 from allmydata.test.common import ShouldFailMixin
12428 from allmydata.test.common_util import ReallyEqualMixin
12429hunk ./src/allmydata/test/test_sftp.py 84
12430         return d
12431 
12432     def _set_up_tree(self):
12433-        d = self.client.create_mutable_file("mutable file contents")
12434+        u = publish.MutableData("mutable file contents")
12435+        d = self.client.create_mutable_file(u)
12436         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
12437         def _created_mutable(n):
12438             self.mutable = n
12439hunk ./src/allmydata/test/test_sftp.py 1334
12440         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
12441         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
12442         return d
12443+    test_makeDirectory.timeout = 15
12444 
12445     def test_execCommand_and_openShell(self):
12446         class FakeProtocol:
12447hunk ./src/allmydata/test/test_storage.py 27
12448                                      LayoutInvalid, MDMFSIGNABLEHEADER, \
12449                                      SIGNED_PREFIX, MDMFHEADER, \
12450                                      MDMFOFFSETS, SDMFSlotWriteProxy
12451-from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
12452-                                 SDMF_VERSION
12453+from allmydata.interfaces import BadWriteEnablerError
12454 from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
12455 from allmydata.test.common_web import WebRenderingMixin
12456 from allmydata.web.storage import StorageStatus, remove_prefix
12457hunk ./src/allmydata/test/test_system.py 26
12458 from allmydata.monitor import Monitor
12459 from allmydata.mutable.common import NotWriteableError
12460 from allmydata.mutable import layout as mutable_layout
12461+from allmydata.mutable.publish import MutableData
12462 from foolscap.api import DeadReferenceError
12463 from twisted.python.failure import Failure
12464 from twisted.web.client import getPage
12465hunk ./src/allmydata/test/test_system.py 467
12466     def test_mutable(self):
12467         self.basedir = "system/SystemTest/test_mutable"
12468         DATA = "initial contents go here."  # 25 bytes % 3 != 0
12469+        DATA_uploadable = MutableData(DATA)
12470         NEWDATA = "new contents yay"
12471hunk ./src/allmydata/test/test_system.py 469
12472+        NEWDATA_uploadable = MutableData(NEWDATA)
12473         NEWERDATA = "this is getting old"
12474hunk ./src/allmydata/test/test_system.py 471
12475+        NEWERDATA_uploadable = MutableData(NEWERDATA)
12476 
12477         d = self.set_up_nodes(use_key_generator=True)
12478 
12479hunk ./src/allmydata/test/test_system.py 478
12480         def _create_mutable(res):
12481             c = self.clients[0]
12482             log.msg("starting create_mutable_file")
12483-            d1 = c.create_mutable_file(DATA)
12484+            d1 = c.create_mutable_file(DATA_uploadable)
12485             def _done(res):
12486                 log.msg("DONE: %s" % (res,))
12487                 self._mutable_node_1 = res
12488hunk ./src/allmydata/test/test_system.py 565
12489             self.failUnlessEqual(res, DATA)
12490             # replace the data
12491             log.msg("starting replace1")
12492-            d1 = newnode.overwrite(NEWDATA)
12493+            d1 = newnode.overwrite(NEWDATA_uploadable)
12494             d1.addCallback(lambda res: newnode.download_best_version())
12495             return d1
12496         d.addCallback(_check_download_3)
12497hunk ./src/allmydata/test/test_system.py 579
12498             newnode2 = self.clients[3].create_node_from_uri(uri)
12499             self._newnode3 = self.clients[3].create_node_from_uri(uri)
12500             log.msg("starting replace2")
12501-            d1 = newnode1.overwrite(NEWERDATA)
12502+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
12503             d1.addCallback(lambda res: newnode2.download_best_version())
12504             return d1
12505         d.addCallback(_check_download_4)
12506hunk ./src/allmydata/test/test_system.py 649
12507         def _check_empty_file(res):
12508             # make sure we can create empty files, this usually screws up the
12509             # segsize math
12510-            d1 = self.clients[2].create_mutable_file("")
12511+            d1 = self.clients[2].create_mutable_file(MutableData(""))
12512             d1.addCallback(lambda newnode: newnode.download_best_version())
12513             d1.addCallback(lambda res: self.failUnlessEqual("", res))
12514             return d1
12515hunk ./src/allmydata/test/test_system.py 680
12516                                  self.key_generator_svc.key_generator.pool_size + size_delta)
12517 
12518         d.addCallback(check_kg_poolsize, 0)
12519-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
12520+        d.addCallback(lambda junk:
12521+            self.clients[3].create_mutable_file(MutableData('hello, world')))
12522         d.addCallback(check_kg_poolsize, -1)
12523         d.addCallback(lambda junk: self.clients[3].create_dirnode())
12524         d.addCallback(check_kg_poolsize, -2)
12525hunk ./src/allmydata/test/test_web.py 28
12526 from allmydata.util.encodingutil import to_str
12527 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
12528      create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
12529-from allmydata.interfaces import IMutableFileNode
12530+from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
12531 from allmydata.mutable import servermap, publish, retrieve
12532 import allmydata.test.common_util as testutil
12533 from allmydata.test.no_network import GridTestMixin
12534hunk ./src/allmydata/test/test_web.py 57
12535         return FakeCHKFileNode(cap)
12536     def _create_mutable(self, cap):
12537         return FakeMutableFileNode(None, None, None, None).init_from_cap(cap)
12538-    def create_mutable_file(self, contents="", keysize=None):
12539+    def create_mutable_file(self, contents="", keysize=None,
12540+                            version=SDMF_VERSION):
12541         n = FakeMutableFileNode(None, None, None, None)
12542hunk ./src/allmydata/test/test_web.py 60
12543+        n.set_version(version)
12544         return n.create(contents)
12545 
12546 class FakeUploader(service.Service):
12547hunk ./src/allmydata/test/test_web.py 157
12548         self.nodemaker = FakeNodeMaker(None, self._secret_holder, None,
12549                                        self.uploader, None,
12550                                        None, None)
12551+        self.mutable_file_default = SDMF_VERSION
12552 
12553     def startService(self):
12554         return service.MultiService.startService(self)
12555hunk ./src/allmydata/test/test_web.py 762
12556                              self.PUT, base + "/@@name=/blah.txt", "")
12557         return d
12558 
12559+
12560     def test_GET_DIRURL_named_bad(self):
12561         base = "/file/%s" % urllib.quote(self._foo_uri)
12562         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
12563hunk ./src/allmydata/test/test_web.py 878
12564                                                       self.NEWFILE_CONTENTS))
12565         return d
12566 
12567+    def test_PUT_NEWFILEURL_unlinked_mdmf(self):
12568+        # this should get us a multi-segment MDMF mutable file,
12569+        # which we can then examine.
12570+        contents = self.NEWFILE_CONTENTS * 300000
12571+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12572+                     contents)
12573+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12574+        d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
12575+        return d
12576+
12577+    def test_PUT_NEWFILEURL_unlinked_sdmf(self):
12578+        contents = self.NEWFILE_CONTENTS * 300000
12579+        d = self.PUT("/uri?mutable=true&mutable-type=sdmf",
12580+                     contents)
12581+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12582+        d.addCallback(lambda json: self.failUnlessIn("sdmf", json))
12583+        return d
12584+
12585     def test_PUT_NEWFILEURL_range_bad(self):
12586         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
12587         target = self.public_url + "/foo/new.txt"
12588hunk ./src/allmydata/test/test_web.py 928
12589         return d
12590 
12591     def test_PUT_NEWFILEURL_mutable_toobig(self):
12592-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
12593-                             "413 Request Entity Too Large",
12594-                             "SDMF is limited to one segment, and 10001 > 10000",
12595-                             self.PUT,
12596-                             self.public_url + "/foo/new.txt?mutable=true",
12597-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
12598+        # Mutable files no longer have a size limit, so an upload of
12599+        # this size should succeed.
12600+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
12601+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
12602         return d
12603 
12604     def test_PUT_NEWFILEURL_replace(self):
12605hunk ./src/allmydata/test/test_web.py 1026
12606         d.addCallback(_check1)
12607         return d
12608 
12609+    def test_GET_FILEURL_json_mutable_type(self):
12610+        # The JSON should include mutable-type, which says whether the
12611+        # file is SDMF or MDMF.
12612+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12613+                     self.NEWFILE_CONTENTS * 300000)
12614+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12615+        def _got_json(json, version):
12616+            data = simplejson.loads(json)
12617+            assert "filenode" == data[0]
12618+            data = data[1]
12619+            assert isinstance(data, dict)
12620+
12621+            self.failUnlessIn("mutable-type", data)
12622+            self.failUnlessEqual(data['mutable-type'], version)
12623+
12624+        d.addCallback(_got_json, "mdmf")
12625+        # Now make an SDMF file and check that it is reported correctly.
12626+        d.addCallback(lambda ignored:
12627+            self.PUT("/uri?mutable=true&mutable-type=sdmf",
12628+                      self.NEWFILE_CONTENTS * 300000))
12629+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12630+        d.addCallback(_got_json, "sdmf")
12631+        return d
12632+
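For reference, the t=json responses parsed above are a two-element list of node type and metadata dict. A representative shape, showing only the fields these assertions touch (other fields elided):

    example = '["filenode", {"mutable": true, "mutable-type": "mdmf"}]'
    data = simplejson.loads(example)
    assert data[0] == "filenode"
    assert data[1]["mutable-type"] in ("sdmf", "mdmf")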
12633     def test_GET_FILEURL_json_missing(self):
12634         d = self.GET(self.public_url + "/foo/missing?json")
12635         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
12636hunk ./src/allmydata/test/test_web.py 1088
12637         d.addBoth(self.should404, "test_GET_FILEURL_uri_missing")
12638         return d
12639 
12640-    def test_GET_DIRECTORY_html_banner(self):
12641+    def test_GET_DIRECTORY_html(self):
12642         d = self.GET(self.public_url + "/foo", followRedirect=True)
12643         def _check(res):
12644             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
12645hunk ./src/allmydata/test/test_web.py 1092
12646+            self.failUnlessIn("mutable-type-mdmf", res)
12647+            self.failUnlessIn("mutable-type-sdmf", res)
12648         d.addCallback(_check)
12649         return d
12650 
12651hunk ./src/allmydata/test/test_web.py 1097
12652+    def test_GET_root_html(self):
12653+        # make sure that we have the option to upload an unlinked
12654+        # mutable file in SDMF and MDMF formats.
12655+        d = self.GET("/")
12656+        def _got_html(html):
12657+            # These are radio buttons that let the user choose
12658+            # whether a newly created mutable file is MDMF or SDMF.
12659+            self.failUnlessIn("mutable-type-mdmf", html)
12660+            self.failUnlessIn("mutable-type-sdmf", html)
12661+        d.addCallback(_got_html)
12662+        return d
12663+
12664+    def test_mutable_type_defaults(self):
12665+        # The checked="checked" attribute of the inputs corresponding to
12666+        # the mutable-type parameter should change as expected with the
12667+        # value configured in tahoe.cfg.
12668+        #
12669+        # By default, the value configured with the client is
12670+        # SDMF_VERSION, so that should be checked.
12671+        assert self.s.mutable_file_default == SDMF_VERSION
12672+
12673+        d = self.GET("/")
12674+        def _got_html(html, value):
12675+            i = 'input checked="checked" type="radio" id="mutable-type-%s"'
12676+            self.failUnlessIn(i % value, html)
12677+        d.addCallback(_got_html, "sdmf")
12678+        d.addCallback(lambda ignored:
12679+            self.GET(self.public_url + "/foo", followRedirect=True))
12680+        d.addCallback(_got_html, "sdmf")
12681+        # Now switch the configuration value to MDMF. The MDMF radio
12682+        # buttons should now be checked on these pages.
12683+        def _swap_values(ignored):
12684+            self.s.mutable_file_default = MDMF_VERSION
12685+        d.addCallback(_swap_values)
12686+        d.addCallback(lambda ignored: self.GET("/"))
12687+        d.addCallback(_got_html, "mdmf")
12688+        d.addCallback(lambda ignored:
12689+            self.GET(self.public_url + "/foo", followRedirect=True))
12690+        d.addCallback(_got_html, "mdmf")
12691+        return d
12692+
12693     def test_GET_DIRURL(self):
12694         # the addSlash means we get a redirect here
12695         # from /uri/$URI/foo/ , we need ../../../ to get back to the root
12696hunk ./src/allmydata/test/test_web.py 1227
12697         d.addCallback(self.failUnlessIsFooJSON)
12698         return d
12699 
12700+    def test_GET_DIRURL_json_mutable_type(self):
12701+        d = self.PUT(self.public_url + \
12702+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12703+                     self.NEWFILE_CONTENTS * 300000)
12704+        d.addCallback(lambda ignored:
12705+            self.PUT(self.public_url + \
12706+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12707+                     self.NEWFILE_CONTENTS * 300000))
12708+        # Now we have an MDMF and SDMF file in the directory. If we GET
12709+        # its JSON, we should see their encodings.
12710+        d.addCallback(lambda ignored:
12711+            self.GET(self.public_url + "/foo?t=json"))
12712+        def _got_json(json):
12713+            data = simplejson.loads(json)
12714+            assert data[0] == "dirnode"
12715+
12716+            data = data[1]
12717+            kids = data['children']
12718+
12719+            mdmf_data = kids['mdmf.txt'][1]
12720+            self.failUnlessIn("mutable-type", mdmf_data)
12721+            self.failUnlessEqual(mdmf_data['mutable-type'], "mdmf")
12722+
12723+            sdmf_data = kids['sdmf.txt'][1]
12724+            self.failUnlessIn("mutable-type", sdmf_data)
12725+            self.failUnlessEqual(sdmf_data['mutable-type'], "sdmf")
12726+        d.addCallback(_got_json)
12727+        return d
12728+
12729 
12730     def test_POST_DIRURL_manifest_no_ophandle(self):
12731         d = self.shouldFail2(error.Error,
12732hunk ./src/allmydata/test/test_web.py 1810
12733         return d
12734 
12735     def test_POST_upload_no_link_mutable_toobig(self):
12736-        d = self.shouldFail2(error.Error,
12737-                             "test_POST_upload_no_link_mutable_toobig",
12738-                             "413 Request Entity Too Large",
12739-                             "SDMF is limited to one segment, and 10001 > 10000",
12740-                             self.POST,
12741-                             "/uri", t="upload", mutable="true",
12742-                             file=("new.txt",
12743-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12744+        # The SDMF size limit is no longer in place, so we should be
12745+        # able to upload mutable files that are as large as we want them
12746+        # to be.
12747+        d = self.POST("/uri", t="upload", mutable="true",
12748+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12749         return d
12750 
12751hunk ./src/allmydata/test/test_web.py 1817
12752+
12753+    def test_POST_upload_mutable_type_unlinked(self):
12754+        d = self.POST("/uri?t=upload&mutable=true&mutable-type=sdmf",
12755+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12756+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12757+        def _got_json(json, version):
12758+            data = simplejson.loads(json)
12759+            data = data[1]
12760+
12761+            self.failUnlessIn("mutable-type", data)
12762+            self.failUnlessEqual(data['mutable-type'], version)
12763+        d.addCallback(_got_json, "sdmf")
12764+        d.addCallback(lambda ignored:
12765+            self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
12766+                      file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
12767+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12768+        d.addCallback(_got_json, "mdmf")
12769+        return d
12770+
12771+    def test_POST_upload_mutable_type(self):
12772+        d = self.POST(self.public_url + \
12773+                      "/foo?t=upload&mutable=true&mutable-type=sdmf",
12774+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12775+        fn = self._foo_node
12776+        def _got_cap(filecap, filename):
12777+            filenameu = unicode(filename)
12778+            self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
12779+            return self.GET(self.public_url + "/foo/%s?t=json" % filename)
12780+        d.addCallback(_got_cap, "sdmf.txt")
12781+        def _got_json(json, version):
12782+            data = simplejson.loads(json)
12783+            data = data[1]
12784+
12785+            self.failUnlessIn("mutable-type", data)
12786+            self.failUnlessEqual(data['mutable-type'], version)
12787+        d.addCallback(_got_json, "sdmf")
12788+        d.addCallback(lambda ignored:
12789+            self.POST(self.public_url + \
12790+                      "/foo?t=upload&mutable=true&mutable-type=mdmf",
12791+                      file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
12792+        d.addCallback(_got_cap, "mdmf.txt")
12793+        d.addCallback(_got_json, "mdmf")
12794+        return d
12795+
12796     def test_POST_upload_mutable(self):
12797         # this creates a mutable file
12798         d = self.POST(self.public_url + "/foo", t="upload", mutable="true",
12799hunk ./src/allmydata/test/test_web.py 1985
12800             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
12801         d.addCallback(_got_headers)
12802 
12803-        # make sure that size errors are displayed correctly for overwrite
12804-        d.addCallback(lambda res:
12805-                      self.shouldFail2(error.Error,
12806-                                       "test_POST_upload_mutable-toobig",
12807-                                       "413 Request Entity Too Large",
12808-                                       "SDMF is limited to one segment, and 10001 > 10000",
12809-                                       self.POST,
12810-                                       self.public_url + "/foo", t="upload",
12811-                                       mutable="true",
12812-                                       file=("new.txt",
12813-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
12814-                                       ))
12815-
12816+        # make sure that outdated size limits aren't enforced anymore.
12817+        d.addCallback(lambda ignored:
12818+            self.POST(self.public_url + "/foo", t="upload",
12819+                      mutable="true",
12820+                      file=("new.txt",
12821+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
12822         d.addErrback(self.dump_error)
12823         return d
12824 
12825hunk ./src/allmydata/test/test_web.py 1995
12826     def test_POST_upload_mutable_toobig(self):
12827-        d = self.shouldFail2(error.Error,
12828-                             "test_POST_upload_mutable_toobig",
12829-                             "413 Request Entity Too Large",
12830-                             "SDMF is limited to one segment, and 10001 > 10000",
12831-                             self.POST,
12832-                             self.public_url + "/foo",
12833-                             t="upload", mutable="true",
12834-                             file=("new.txt",
12835-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12836+        # SDMF had a size limit that was removed a while ago. MDMF has
12837+        # never had a size limit. Test to make sure that we do not
12838+        # encounter errors when trying to upload large mutable files,
12839+        # since there should be no code-level prohibitions on large
12840+        # mutable files.
12841+        d = self.POST(self.public_url + "/foo",
12842+                      t="upload", mutable="true",
12843+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12844         return d
12845 
12846     def dump_error(self, f):
12847hunk ./src/allmydata/test/test_web.py 3005
12848                                                       contents))
12849         return d
12850 
12851+    def test_PUT_NEWFILEURL_mdmf(self):
12852+        new_contents = self.NEWFILE_CONTENTS * 300000
12853+        d = self.PUT(self.public_url + \
12854+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12855+                     new_contents)
12856+        d.addCallback(lambda ignored:
12857+            self.GET(self.public_url + "/foo/mdmf.txt?t=json"))
12858+        def _got_json(json):
12859+            data = simplejson.loads(json)
12860+            data = data[1]
12861+            self.failUnlessIn("mutable-type", data)
12862+            self.failUnlessEqual(data['mutable-type'], "mdmf")
12863+        d.addCallback(_got_json)
12864+        return d
12865+
12866+    def test_PUT_NEWFILEURL_sdmf(self):
12867+        new_contents = self.NEWFILE_CONTENTS * 300000
12868+        d = self.PUT(self.public_url + \
12869+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12870+                     new_contents)
12871+        d.addCallback(lambda ignored:
12872+            self.GET(self.public_url + "/foo/sdmf.txt?t=json"))
12873+        def _got_json(json):
12874+            data = simplejson.loads(json)
12875+            data = data[1]
12876+            self.failUnlessIn("mutable-type", data)
12877+            self.failUnlessEqual(data['mutable-type'], "sdmf")
12878+        d.addCallback(_got_json)
12879+        return d
12880+
12881     def test_PUT_NEWFILEURL_uri_replace(self):
12882         contents, n, new_uri = self.makefile(8)
12883         d = self.PUT(self.public_url + "/foo/bar.txt?t=uri", new_uri)
12884hunk ./src/allmydata/test/test_web.py 3156
12885         d.addCallback(_done)
12886         return d
12887 
12888+
12889+    def test_PUT_update_at_offset(self):
12890+        file_contents = "test file" * 100000 # about 900 KiB
12891+        d = self.PUT("/uri?mutable=true", file_contents)
12892+        def _then(filecap):
12893+            self.filecap = filecap
12894+            new_data = file_contents[:100]
12895+            new = "replaced and so on"
12896+            new_data += new
12897+            new_data += file_contents[len(new_data):]
12898+            assert len(new_data) == len(file_contents)
12899+            self.new_data = new_data
12900+        d.addCallback(_then)
12901+        d.addCallback(lambda ignored:
12902+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
12903+                     "replaced and so on"))
12904+        def _get_data(filecap):
12905+            n = self.s.create_node_from_uri(filecap)
12906+            return n.download_best_version()
12907+        d.addCallback(_get_data)
12908+        d.addCallback(lambda results:
12909+            self.failUnlessEqual(results, self.new_data))
12910+        # Now try appending things to the file
12911+        d.addCallback(lambda ignored:
12912+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
12913+                     "puppies" * 100))
12914+        d.addCallback(_get_data)
12915+        d.addCallback(lambda results:
12916+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
12917+        return d
12918+
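The offset semantics exercised above amount to a splice: PUT /uri/<filecap>?offset=N writes the request body at byte N of the existing contents, growing the file when the write runs past the current end. A pure-Python model of that behavior:

    def splice(old, new, offset):
        # Overwrite in place at `offset`, extending the file if the new
        # data runs past the end -- matching the replace and append
        # cases tested above.
        assert 0 <= offset <= len(old)
        return old[:offset] + new + old[offset + len(new):]

    assert splice("aaaa", "bb", 1) == "abba"    # in-place replace
    assert splice("aaaa", "bb", 4) == "aaaabb"  # append at the end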
12919+
12920+    def test_PUT_update_at_offset_immutable(self):
12921+        file_contents = "Test file" * 100000
12922+        d = self.PUT("/uri", file_contents)
12923+        def _then(filecap):
12924+            self.filecap = filecap
12925+        d.addCallback(_then)
12926+        d.addCallback(lambda ignored:
12927+            self.shouldHTTPError("test immutable update",
12928+                                 400, "Bad Request",
12929+                                 "immutable",
12930+                                 self.PUT,
12931+                                 "/uri/%s?offset=50" % self.filecap,
12932+                                 "foo"))
12933+        return d
12934+
12935+
12936     def test_bad_method(self):
12937         url = self.webish_url + self.public_url + "/foo/bar.txt"
12938         d = self.shouldHTTPError("test_bad_method",
12939hunk ./src/allmydata/test/test_web.py 3473
12940         def _stash_mutable_uri(n, which):
12941             self.uris[which] = n.get_uri()
12942             assert isinstance(self.uris[which], str)
12943-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12944+        d.addCallback(lambda ign:
12945+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12946         d.addCallback(_stash_mutable_uri, "corrupt")
12947         d.addCallback(lambda ign:
12948                       c0.upload(upload.Data("literal", convergence="")))
12949hunk ./src/allmydata/test/test_web.py 3620
12950         def _stash_mutable_uri(n, which):
12951             self.uris[which] = n.get_uri()
12952             assert isinstance(self.uris[which], str)
12953-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12954+        d.addCallback(lambda ign:
12955+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12956         d.addCallback(_stash_mutable_uri, "corrupt")
12957 
12958         def _compute_fileurls(ignored):
12959hunk ./src/allmydata/test/test_web.py 4283
12960         def _stash_mutable_uri(n, which):
12961             self.uris[which] = n.get_uri()
12962             assert isinstance(self.uris[which], str)
12963-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
12964+        d.addCallback(lambda ign:
12965+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
12966         d.addCallback(_stash_mutable_uri, "mutable")
12967 
12968         def _compute_fileurls(ignored):
12969hunk ./src/allmydata/test/test_web.py 4383
12970                                                         convergence="")))
12971         d.addCallback(_stash_uri, "small")
12972 
12973-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
12974+        d.addCallback(lambda ign:
12975+            c0.create_mutable_file(publish.MutableData("mutable")))
12976         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
12977         d.addCallback(_stash_uri, "mutable")
12978 
12979}
12980[resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
12981"Brian Warner <warner@lothar.com>"**20110220230201
12982 Ignore-this: 9bbf5d26c994e8069202331dcb4cdd95
12983] {
12984merger 0.0 (
12985merger 0.0 (
12986merger 0.0 (
12987replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
12988merger 0.0 (
12989hunk ./docs/configuration.rst 384
12990-shares.needed = (int, optional) aka "k", default 3
12991-shares.total = (int, optional) aka "N", N >= k, default 10
12992-shares.happy = (int, optional) 1 <= happy <= N, default 7
12993-
12994- These three values set the default encoding parameters. Each time a new file
12995- is uploaded, erasure-coding is used to break the ciphertext into separate
12996- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
12997- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
12998- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
12999- Setting k to 1 is equivalent to simple replication (uploading N copies of
13000- the file).
13001-
13002- These values control the tradeoff between storage overhead, performance, and
13003- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13004- backend storage space (the actual value will be a bit more, because of other
13005- forms of overhead). Up to N-k shares can be lost before the file becomes
13006- unrecoverable, so assuming there are at least N servers, up to N-k servers
13007- can be offline without losing the file. So large N/k ratios are more
13008- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13009- smaller than N.
13010-
13011- Large values of N will slow down upload operations slightly, since more
13012- servers must be involved, and will slightly increase storage overhead due to
13013- the hash trees that are created. Large values of k will cause downloads to
13014- be marginally slower, because more servers must be involved. N cannot be
13015- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13016- uses.
13017-
13018- shares.happy allows you control over the distribution of your immutable file.
13019- For a successful upload, shares are guaranteed to be initially placed on
13020- at least 'shares.happy' distinct servers, the correct functioning of any
13021- k of which is sufficient to guarantee the availability of the uploaded file.
13022- This value should not be larger than the number of servers on your grid.
13023-
13024- A value of shares.happy <= k is allowed, but does not provide any redundancy
13025- if some servers fail or lose shares.
13026-
13027- (Mutable files use a different share placement algorithm that does not
13028-  consider this parameter.)
13029-
13030-
13031-== Storage Server Configuration ==
13032-
13033-[storage]
13034-enabled = (boolean, optional)
13035-
13036- If this is True, the node will run a storage server, offering space to other
13037- clients. If it is False, the node will not run a storage server, meaning
13038- that no shares will be stored on this node. Use False this for clients who
13039- do not wish to provide storage service. The default value is True.
13040-
13041-readonly = (boolean, optional)
13042-
13043- If True, the node will run a storage server but will not accept any shares,
13044- making it effectively read-only. Use this for storage servers which are
13045- being decommissioned: the storage/ directory could be mounted read-only,
13046- while shares are moved to other servers. Note that this currently only
13047- affects immutable shares. Mutable shares (used for directories) will be
13048- written and modified anyway. See ticket #390 for the current status of this
13049- bug. The default value is False.
13050-
13051-reserved_space = (str, optional)
13052-
13053- If provided, this value defines how much disk space is reserved: the storage
13054- server will not accept any share which causes the amount of free disk space
13055- to drop below this value. (The free space is measured by a call to statvfs(2)
13056- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13057- user account under which the storage server runs.)
13058-
13059- This string contains a number, with an optional case-insensitive scale
13060- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13061- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13062- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13063-
13064-expire.enabled =
13065-expire.mode =
13066-expire.override_lease_duration =
13067-expire.cutoff_date =
13068-expire.immutable =
13069-expire.mutable =
13070-
13071- These settings control garbage-collection, in which the server will delete
13072- shares that no longer have an up-to-date lease on them. Please see the
13073- neighboring "garbage-collection.txt" document for full details.
13074-
13075-
13076-== Running A Helper ==
13077+Running A Helper
13078+================
13079hunk ./docs/configuration.rst 424
13080+mutable.format = sdmf or mdmf
13081+
13082+ This value tells Tahoe-LAFS what the default mutable file format should
13083+ be. If mutable.format=sdmf, then newly created mutable files will be in
13084+ the old SDMF format. This is desirable for clients that operate on
13085+ grids where some peers run older versions of Tahoe-LAFS, as these older
13086+ versions cannot read the new MDMF mutable file format. If
13087+ mutable.format = mdmf, then newly created mutable files will use the
13088+ new MDMF format, which supports efficient in-place modification and
13089+ streaming downloads. You can override this value using a special
13090+ mutable-type parameter in the webapi. If you do not specify a value
13091+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13092+
13093+ Note that this parameter only applies to mutable files. Mutable
13094+ directories, which are stored as mutable files, are not controlled by
13095+ this parameter and will always use SDMF. We may revisit this decision
13096+ in future versions of Tahoe-LAFS.
13097)
13098)
13099hunk ./docs/configuration.rst 324
13100+Frontend Configuration
13101+======================
13102+
13103+The Tahoe client process can run a variety of frontend file-access protocols.
13104+You will use these to create and retrieve files from the virtual filesystem.
13105+Configuration details for each are documented in the following
13106+protocol-specific guides:
13107+
13108+HTTP
13109+
13110+    Tahoe runs a webserver by default on port 3456. This interface provides a
13111+    human-oriented "WUI", with pages to create, modify, and browse
13112+    directories and files, as well as a number of pages to check on the
13113+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13114+    with a REST-ful HTTP interface that can be used by other programs
13115+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13116+    details, and the ``web.port`` and ``web.static`` config variables above.
13117+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13118+    status pages.
13119+
13120+CLI
13121+
13122+    The main "bin/tahoe" executable includes subcommands for manipulating the
13123+    filesystem, uploading/downloading files, and creating/running Tahoe
13124+    nodes. See `<frontends/CLI.rst>`_ for details.
13125+
13126+FTP, SFTP
13127+
13128+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13129+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13130+    for instructions on configuring these services, and the ``[ftpd]`` and
13131+    ``[sftpd]`` sections of ``tahoe.cfg``.
13132+
13133)
13134hunk ./docs/configuration.rst 324
13135+``mutable.format = sdmf or mdmf``
13136+
13137+    This value tells Tahoe what the default mutable file format should
13138+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13139+    in the old SDMF format. This is desirable for clients that operate on
13140+    grids where some peers run older versions of Tahoe, as these older
13141+    versions cannot read the new MDMF mutable file format. If
13142+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13143+    the new MDMF format, which supports efficient in-place modification and
13144+    streaming downloads. You can override this value using a special
13145+    mutable-type parameter in the webapi. If you do not specify a value here,
13146+    Tahoe will use SDMF for all newly-created mutable files.
13147+
13148+    Note that this parameter only applies to mutable files. Mutable
13149+    directories, which are stored as mutable files, are not controlled by
13150+    this parameter and will always use SDMF. We may revisit this decision
13151+    in future versions of Tahoe-LAFS.
13152+
13153)
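A hedged sketch of how a client might map the mutable.format value documented above onto the version constants used throughout this patch. Only the option name and its SDMF default come from the documentation; the [client] section name and the helper itself are illustrative:

    from ConfigParser import SafeConfigParser
    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION

    def mutable_file_default(cfg_path):
        parser = SafeConfigParser()
        parser.read([cfg_path])
        if (parser.has_option("client", "mutable.format") and
            parser.get("client", "mutable.format").lower() == "mdmf"):
            return MDMF_VERSION
        return SDMF_VERSION  # documented default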
13154merger 0.0 (
13155merger 0.0 (
13156hunk ./docs/configuration.rst 324
13157+``mutable.format = sdmf or mdmf``
13158+
13159+    This value tells Tahoe what the default mutable file format should
13160+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13161+    in the old SDMF format. This is desirable for clients that operate on
13162+    grids where some peers run older versions of Tahoe, as these older
13163+    versions cannot read the new MDMF mutable file format. If
13164+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13165+    the new MDMF format, which supports efficient in-place modification and
13166+    streaming downloads. You can override this value using a special
13167+    mutable-type parameter in the webapi. If you do not specify a value here,
13168+    Tahoe will use SDMF for all newly-created mutable files.
13169+
13170+    Note that this parameter only applies to mutable files. Mutable
13171+    directories, which are stored as mutable files, are not controlled by
13172+    this parameter and will always use SDMF. We may revisit this decision
13173+    in future versions of Tahoe-LAFS.
13174+
13175merger 0.0 (
13176merger 0.0 (
13177replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13178merger 0.0 (
13179hunk ./docs/configuration.rst 384
13180-shares.needed = (int, optional) aka "k", default 3
13181-shares.total = (int, optional) aka "N", N >= k, default 10
13182-shares.happy = (int, optional) 1 <= happy <= N, default 7
13183-
13184- These three values set the default encoding parameters. Each time a new file
13185- is uploaded, erasure-coding is used to break the ciphertext into separate
13186- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13187- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13188- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13189- Setting k to 1 is equivalent to simple replication (uploading N copies of
13190- the file).
13191-
13192- These values control the tradeoff between storage overhead, performance, and
13193- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13194- backend storage space (the actual value will be a bit more, because of other
13195- forms of overhead). Up to N-k shares can be lost before the file becomes
13196- unrecoverable, so assuming there are at least N servers, up to N-k servers
13197- can be offline without losing the file. So large N/k ratios are more
13198- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13199- smaller than N.
13200-
13201- Large values of N will slow down upload operations slightly, since more
13202- servers must be involved, and will slightly increase storage overhead due to
13203- the hash trees that are created. Large values of k will cause downloads to
13204- be marginally slower, because more servers must be involved. N cannot be
13205- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13206- uses.
13207-
13208- shares.happy allows you control over the distribution of your immutable file.
13209- For a successful upload, shares are guaranteed to be initially placed on
13210- at least 'shares.happy' distinct servers, the correct functioning of any
13211- k of which is sufficient to guarantee the availability of the uploaded file.
13212- This value should not be larger than the number of servers on your grid.
13213-
13214- A value of shares.happy <= k is allowed, but does not provide any redundancy
13215- if some servers fail or lose shares.
13216-
13217- (Mutable files use a different share placement algorithm that does not
13218-  consider this parameter.)
13219-
13220-
13221-== Storage Server Configuration ==
13222-
13223-[storage]
13224-enabled = (boolean, optional)
13225-
13226- If this is True, the node will run a storage server, offering space to other
13227- clients. If it is False, the node will not run a storage server, meaning
13228- that no shares will be stored on this node. Use False this for clients who
13229- do not wish to provide storage service. The default value is True.
13230-
13231-readonly = (boolean, optional)
13232-
13233- If True, the node will run a storage server but will not accept any shares,
13234- making it effectively read-only. Use this for storage servers which are
13235- being decommissioned: the storage/ directory could be mounted read-only,
13236- while shares are moved to other servers. Note that this currently only
13237- affects immutable shares. Mutable shares (used for directories) will be
13238- written and modified anyway. See ticket #390 for the current status of this
13239- bug. The default value is False.
13240-
13241-reserved_space = (str, optional)
13242-
13243- If provided, this value defines how much disk space is reserved: the storage
13244- server will not accept any share which causes the amount of free disk space
13245- to drop below this value. (The free space is measured by a call to statvfs(2)
13246- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13247- user account under which the storage server runs.)
13248-
13249- This string contains a number, with an optional case-insensitive scale
13250- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13251- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13252- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13253-
13254-expire.enabled =
13255-expire.mode =
13256-expire.override_lease_duration =
13257-expire.cutoff_date =
13258-expire.immutable =
13259-expire.mutable =
13260-
13261- These settings control garbage-collection, in which the server will delete
13262- shares that no longer have an up-to-date lease on them. Please see the
13263- neighboring "garbage-collection.txt" document for full details.
13264-
13265-
13266-== Running A Helper ==
13267+Running A Helper
13268+================
13269hunk ./docs/configuration.rst 424
13270+mutable.format = sdmf or mdmf
13271+
13272+ This value tells Tahoe-LAFS what the default mutable file format should
13273+ be. If mutable.format=sdmf, then newly created mutable files will be in
13274+ the old SDMF format. This is desirable for clients that operate on
13275+ grids where some peers run older versions of Tahoe-LAFS, as these older
13276+ versions cannot read the new MDMF mutable file format. If
13277+ mutable.format = mdmf, then newly created mutable files will use the
13278+ new MDMF format, which supports efficient in-place modification and
13279+ streaming downloads. You can override this value using a special
13280+ mutable-type parameter in the webapi. If you do not specify a value
13281+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13282+
13283+ Note that this parameter only applies to mutable files. Mutable
13284+ directories, which are stored as mutable files, are not controlled by
13285+ this parameter and will always use SDMF. We may revisit this decision
13286+ in future versions of Tahoe-LAFS.
13287)
13288)
13289hunk ./docs/configuration.rst 324
13290+Frontend Configuration
13291+======================
13292+
13293+The Tahoe client process can run a variety of frontend file-access protocols.
13294+You will use these to create and retrieve files from the virtual filesystem.
13295+Configuration details for each are documented in the following
13296+protocol-specific guides:
13297+
13298+HTTP
13299+
13300+    Tahoe runs a webserver by default on port 3456. This interface provides a
13301+    human-oriented "WUI", with pages to create, modify, and browse
13302+    directories and files, as well as a number of pages to check on the
13303+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13304+    with a REST-ful HTTP interface that can be used by other programs
13305+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13306+    details, and the ``web.port`` and ``web.static`` config variables above.
13307+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13308+    status pages.
13309+
13310+CLI
13311+
13312+    The main "bin/tahoe" executable includes subcommands for manipulating the
13313+    filesystem, uploading/downloading files, and creating/running Tahoe
13314+    nodes. See `<frontends/CLI.rst>`_ for details.
13315+
13316+FTP, SFTP
13317+
13318+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13319+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13320+    for instructions on configuring these services, and the ``[ftpd]`` and
13321+    ``[sftpd]`` sections of ``tahoe.cfg``.
13322+
13323)
13324)
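Since the WAPI described above is plain HTTP, the mutable.format default can be overridden per request. A hedged sketch using only details stated in this patch (web port 3456, PUT to /uri with mutable=true, and the "special mutable-type parameter" mentioned earlier; the exact query-string spelling is an assumption):

    # Sketch: create an MDMF mutable file via the webapi; the response
    # body carries the new file's cap.
    import httplib

    conn = httplib.HTTPConnection("127.0.0.1", 3456)
    conn.request("PUT", "/uri?mutable=true&mutable-type=mdmf",
                 "initial contents")
    filecap = conn.getresponse().read()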
13325hunk ./docs/configuration.rst 402
13326-shares.needed = (int, optional) aka "k", default 3
13327-shares.total = (int, optional) aka "N", N >= k, default 10
13328-shares.happy = (int, optional) 1 <= happy <= N, default 7
13329-
13330- These three values set the default encoding parameters. Each time a new file
13331- is uploaded, erasure-coding is used to break the ciphertext into separate
13332- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13333- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13334- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13335- Setting k to 1 is equivalent to simple replication (uploading N copies of
13336- the file).
13337-
13338- These values control the tradeoff between storage overhead, performance, and
13339- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13340- backend storage space (the actual value will be a bit more, because of other
13341- forms of overhead). Up to N-k shares can be lost before the file becomes
13342- unrecoverable, so assuming there are at least N servers, up to N-k servers
13343- can be offline without losing the file. So large N/k ratios are more
13344- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13345- smaller than N.
13346-
13347- Large values of N will slow down upload operations slightly, since more
13348- servers must be involved, and will slightly increase storage overhead due to
13349- the hash trees that are created. Large values of k will cause downloads to
13350- be marginally slower, because more servers must be involved. N cannot be
13351- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13352- uses.
13353-
13354- shares.happy allows you control over the distribution of your immutable file.
13355- For a successful upload, shares are guaranteed to be initially placed on
13356- at least 'shares.happy' distinct servers, the correct functioning of any
13357- k of which is sufficient to guarantee the availability of the uploaded file.
13358- This value should not be larger than the number of servers on your grid.
13359-
13360- A value of shares.happy <= k is allowed, but does not provide any redundancy
13361- if some servers fail or lose shares.
13362-
13363- (Mutable files use a different share placement algorithm that does not
13364-  consider this parameter.)
13365-
13366-
13367-== Storage Server Configuration ==
13368-
13369-[storage]
13370-enabled = (boolean, optional)
13371-
13372- If this is True, the node will run a storage server, offering space to other
13373- clients. If it is False, the node will not run a storage server, meaning
13374- that no shares will be stored on this node. Use False this for clients who
13375- do not wish to provide storage service. The default value is True.
13376-
13377-readonly = (boolean, optional)
13378-
13379- If True, the node will run a storage server but will not accept any shares,
13380- making it effectively read-only. Use this for storage servers which are
13381- being decommissioned: the storage/ directory could be mounted read-only,
13382- while shares are moved to other servers. Note that this currently only
13383- affects immutable shares. Mutable shares (used for directories) will be
13384- written and modified anyway. See ticket #390 for the current status of this
13385- bug. The default value is False.
13386-
13387-reserved_space = (str, optional)
13388-
13389- If provided, this value defines how much disk space is reserved: the storage
13390- server will not accept any share which causes the amount of free disk space
13391- to drop below this value. (The free space is measured by a call to statvfs(2)
13392- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13393- user account under which the storage server runs.)
13394-
13395- This string contains a number, with an optional case-insensitive scale
13396- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13397- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13398- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13399-
13400-expire.enabled =
13401-expire.mode =
13402-expire.override_lease_duration =
13403-expire.cutoff_date =
13404-expire.immutable =
13405-expire.mutable =
13406-
13407- These settings control garbage-collection, in which the server will delete
13408- shares that no longer have an up-to-date lease on them. Please see the
13409- neighboring "garbage-collection.txt" document for full details.
13410-
13411-
13412-== Running A Helper ==
13413+Running A Helper
13414+================
13415)
13416merger 0.0 (
13417merger 0.0 (
13418hunk ./docs/configuration.rst 402
13419-shares.needed = (int, optional) aka "k", default 3
13420-shares.total = (int, optional) aka "N", N >= k, default 10
13421-shares.happy = (int, optional) 1 <= happy <= N, default 7
13422-
13423- These three values set the default encoding parameters. Each time a new file
13424- is uploaded, erasure-coding is used to break the ciphertext into separate
13425- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13426- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13427- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13428- Setting k to 1 is equivalent to simple replication (uploading N copies of
13429- the file).
13430-
13431- These values control the tradeoff between storage overhead, performance, and
13432- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13433- backend storage space (the actual value will be a bit more, because of other
13434- forms of overhead). Up to N-k shares can be lost before the file becomes
13435- unrecoverable, so assuming there are at least N servers, up to N-k servers
13436- can be offline without losing the file. So large N/k ratios are more
13437- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13438- smaller than N.
13439-
13440- Large values of N will slow down upload operations slightly, since more
13441- servers must be involved, and will slightly increase storage overhead due to
13442- the hash trees that are created. Large values of k will cause downloads to
13443- be marginally slower, because more servers must be involved. N cannot be
13444- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13445- uses.
13446-
13447- shares.happy allows you control over the distribution of your immutable file.
13448- For a successful upload, shares are guaranteed to be initially placed on
13449- at least 'shares.happy' distinct servers, the correct functioning of any
13450- k of which is sufficient to guarantee the availability of the uploaded file.
13451- This value should not be larger than the number of servers on your grid.
13452-
13453- A value of shares.happy <= k is allowed, but does not provide any redundancy
13454- if some servers fail or lose shares.
13455-
13456- (Mutable files use a different share placement algorithm that does not
13457-  consider this parameter.)
13458-
13459-
13460-== Storage Server Configuration ==
13461-
13462-[storage]
13463-enabled = (boolean, optional)
13464-
13465- If this is True, the node will run a storage server, offering space to other
13466- clients. If it is False, the node will not run a storage server, meaning
13467- that no shares will be stored on this node. Use False this for clients who
13468- do not wish to provide storage service. The default value is True.
13469-
13470-readonly = (boolean, optional)
13471-
13472- If True, the node will run a storage server but will not accept any shares,
13473- making it effectively read-only. Use this for storage servers which are
13474- being decommissioned: the storage/ directory could be mounted read-only,
13475- while shares are moved to other servers. Note that this currently only
13476- affects immutable shares. Mutable shares (used for directories) will be
13477- written and modified anyway. See ticket #390 for the current status of this
13478- bug. The default value is False.
13479-
13480-reserved_space = (str, optional)
13481-
13482- If provided, this value defines how much disk space is reserved: the storage
13483- server will not accept any share which causes the amount of free disk space
13484- to drop below this value. (The free space is measured by a call to statvfs(2)
13485- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13486- user account under which the storage server runs.)
13487-
13488- This string contains a number, with an optional case-insensitive scale
13489- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13490- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13491- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13492-
13493-expire.enabled =
13494-expire.mode =
13495-expire.override_lease_duration =
13496-expire.cutoff_date =
13497-expire.immutable =
13498-expire.mutable =
13499-
13500- These settings control garbage-collection, in which the server will delete
13501- shares that no longer have an up-to-date lease on them. Please see the
13502- neighboring "garbage-collection.txt" document for full details.
13503-
13504-
13505-== Running A Helper ==
13506+Running A Helper
13507+================
13508merger 0.0 (
13509hunk ./docs/configuration.rst 324
13510+``mutable.format = sdmf or mdmf``
13511+
13512+    This value tells Tahoe what the default mutable file format should
13513+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13514+    in the old SDMF format. This is desirable for clients that operate on
13515+    grids where some peers run older versions of Tahoe, as these older
13516+    versions cannot read the new MDMF mutable file format. If
13517+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13518+    the new MDMF format, which supports efficient in-place modification and
13519+    streaming downloads. You can override this value using a special
13520+    mutable-type parameter in the webapi. If you do not specify a value here,
13521+    Tahoe will use SDMF for all newly-created mutable files.
13522+
13523+    Note that this parameter only applies to mutable files. Mutable
13524+    directories, which are stored as mutable files, are not controlled by
13525+    this parameter and will always use SDMF. We may revisit this decision
13526+    in future versions of Tahoe-LAFS.
13527+
13528merger 0.0 (
13529merger 0.0 (
13530replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13531merger 0.0 (
13532hunk ./docs/configuration.rst 384
13533-shares.needed = (int, optional) aka "k", default 3
13534-shares.total = (int, optional) aka "N", N >= k, default 10
13535-shares.happy = (int, optional) 1 <= happy <= N, default 7
13536-
13537- These three values set the default encoding parameters. Each time a new file
13538- is uploaded, erasure-coding is used to break the ciphertext into separate
13539- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13540- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13541- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13542- Setting k to 1 is equivalent to simple replication (uploading N copies of
13543- the file).
13544-
13545- These values control the tradeoff between storage overhead, performance, and
13546- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13547- backend storage space (the actual value will be a bit more, because of other
13548- forms of overhead). Up to N-k shares can be lost before the file becomes
13549- unrecoverable, so assuming there are at least N servers, up to N-k servers
13550- can be offline without losing the file. So large N/k ratios are more
13551- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13552- smaller than N.
13553-
13554- Large values of N will slow down upload operations slightly, since more
13555- servers must be involved, and will slightly increase storage overhead due to
13556- the hash trees that are created. Large values of k will cause downloads to
13557- be marginally slower, because more servers must be involved. N cannot be
13558- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13559- uses.
13560-
13561- shares.happy allows you control over the distribution of your immutable file.
13562- For a successful upload, shares are guaranteed to be initially placed on
13563- at least 'shares.happy' distinct servers, the correct functioning of any
13564- k of which is sufficient to guarantee the availability of the uploaded file.
13565- This value should not be larger than the number of servers on your grid.
13566-
13567- A value of shares.happy <= k is allowed, but does not provide any redundancy
13568- if some servers fail or lose shares.
13569-
13570- (Mutable files use a different share placement algorithm that does not
13571-  consider this parameter.)
13572-
13573-
13574-== Storage Server Configuration ==
13575-
13576-[storage]
13577-enabled = (boolean, optional)
13578-
13579- If this is True, the node will run a storage server, offering space to other
13580- clients. If it is False, the node will not run a storage server, meaning
13581- that no shares will be stored on this node. Use False this for clients who
13582- do not wish to provide storage service. The default value is True.
13583-
13584-readonly = (boolean, optional)
13585-
13586- If True, the node will run a storage server but will not accept any shares,
13587- making it effectively read-only. Use this for storage servers which are
13588- being decommissioned: the storage/ directory could be mounted read-only,
13589- while shares are moved to other servers. Note that this currently only
13590- affects immutable shares. Mutable shares (used for directories) will be
13591- written and modified anyway. See ticket #390 for the current status of this
13592- bug. The default value is False.
13593-
13594-reserved_space = (str, optional)
13595-
13596- If provided, this value defines how much disk space is reserved: the storage
13597- server will not accept any share which causes the amount of free disk space
13598- to drop below this value. (The free space is measured by a call to statvfs(2)
13599- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13600- user account under which the storage server runs.)
13601-
13602- This string contains a number, with an optional case-insensitive scale
13603- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13604- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13605- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13606-
13607-expire.enabled =
13608-expire.mode =
13609-expire.override_lease_duration =
13610-expire.cutoff_date =
13611-expire.immutable =
13612-expire.mutable =
13613-
13614- These settings control garbage-collection, in which the server will delete
13615- shares that no longer have an up-to-date lease on them. Please see the
13616- neighboring "garbage-collection.txt" document for full details.
13617-
13618-
13619-== Running A Helper ==
13620+Running A Helper
13621+================
13622hunk ./docs/configuration.rst 424
13623+mutable.format = sdmf or mdmf
13624+
13625+ This value tells Tahoe-LAFS what the default mutable file format should
13626+ be. If mutable.format=sdmf, then newly created mutable files will be in
13627+ the old SDMF format. This is desirable for clients that operate on
13628+ grids where some peers run older versions of Tahoe-LAFS, as these older
13629+ versions cannot read the new MDMF mutable file format. If
13630+ mutable.format = mdmf, then newly created mutable files will use the
13631+ new MDMF format, which supports efficient in-place modification and
13632+ streaming downloads. You can override this value using a special
13633+ mutable-type parameter in the webapi. If you do not specify a value
13634+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13635+
13636+ Note that this parameter only applies to mutable files. Mutable
13637+ directories, which are stored as mutable files, are not controlled by
13638+ this parameter and will always use SDMF. We may revisit this decision
13639+ in future versions of Tahoe-LAFS.
13640)
13641)
13642hunk ./docs/configuration.rst 324
13643+Frontend Configuration
13644+======================
13645+
13646+The Tahoe client process can run a variety of frontend file-access protocols.
13647+You will use these to create and retrieve files from the virtual filesystem.
13648+Configuration details for each are documented in the following
13649+protocol-specific guides:
13650+
13651+HTTP
13652+
13653+    Tahoe runs a webserver by default on port 3456. This interface provides a
13654+    human-oriented "WUI", with pages to create, modify, and browse
13655+    directories and files, as well as a number of pages to check on the
13656+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13657+    with a REST-ful HTTP interface that can be used by other programs
13658+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13659+    details, and the ``web.port`` and ``web.static`` config variables above.
13660+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13661+    status pages.
13662+
13663+CLI
13664+
13665+    The main "bin/tahoe" executable includes subcommands for manipulating the
13666+    filesystem, uploading/downloading files, and creating/running Tahoe
13667+    nodes. See `<frontends/CLI.rst>`_ for details.
13668+
13669+FTP, SFTP
13670+
13671+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13672+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13673+    for instructions on configuring these services, and the ``[ftpd]`` and
13674+    ``[sftpd]`` sections of ``tahoe.cfg``.
13675+
13676)
13677)
13678)
13679replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13680)
13681hunk ./src/allmydata/mutable/retrieve.py 7
13682 from zope.interface import implements
13683 from twisted.internet import defer
13684 from twisted.python import failure
13685-from foolscap.api import DeadReferenceError, eventually, fireEventually
13686-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
13687-from allmydata.util import hashutil, idlib, log
13688+from twisted.internet.interfaces import IPushProducer, IConsumer
13689+from foolscap.api import eventually, fireEventually
13690+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
13691+                                 MDMF_VERSION, SDMF_VERSION
13692+from allmydata.util import hashutil, log, mathutil
13693+from allmydata.util.dictutil import DictOfSets
13694 from allmydata import hashtree, codec
13695 from allmydata.storage.server import si_b2a
13696 from pycryptopp.cipher.aes import AES
13697hunk ./src/allmydata/mutable/retrieve.py 239
13698             # KiB, so we ask for that much.
13699             # TODO: Change the cache methods to allow us to fetch all of the
13700             # data that they have, then change this method to do that.
13701-            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
13702-                                                               shnum,
13703-                                                               0,
13704-                                                               1000)
13705+            any_cache = self._node._read_from_cache(self.verinfo, shnum,
13706+                                                    0, 1000)
13707             ss = self.servermap.connections[peerid]
13708             reader = MDMFSlotReadProxy(ss,
13709                                        self._storage_index,
13710hunk ./src/allmydata/mutable/retrieve.py 373
13711                  (k, n, self._num_segments, self._segment_size,
13712                   self._tail_segment_size))
13713 
13714-        # ask the cache first
13715-        got_from_cache = False
13716-        datavs = []
13717-        for (offset, length) in readv:
13718-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
13719-                                                            offset, length)
13720-            if data is not None:
13721-                datavs.append(data)
13722-        if len(datavs) == len(readv):
13723-            self.log("got data from cache")
13724-            got_from_cache = True
13725-            d = fireEventually({shnum: datavs})
13726-            # datavs is a dict mapping shnum to a pair of strings
13727+        for i in xrange(self._total_shares):
13728+            # So we don't have to do this later.
13729+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
13730+
13731+        # Our last task is to tell the downloader where to start and
13732+        # where to stop. We use three parameters for that:
13733+        #   - self._start_segment: the segment that we need to start
13734+        #     downloading from.
13735+        #   - self._current_segment: the next segment that we need to
13736+        #     download.
13737+        #   - self._last_segment: The last segment that we were asked to
13738+        #     download.
13739+        #
13740+        #  We say that the download is complete when
13741+        #  self._current_segment > self._last_segment. We use
13742+        #  self._start_segment and self._last_segment to know when to
13743+        #  strip things off of segments, and how much to strip.
13744+        if self._offset:
13745+            self.log("got offset: %d" % self._offset)
13746+            # our start segment is the first segment containing the
13747+            # offset we were given.
13748+            start = mathutil.div_ceil(self._offset,
13749+                                      self._segment_size)
13750+            # this gets us the first segment after self._offset. Then
13751+            # our start segment is the one before it.
13752+            start -= 1
13753+
13754+            assert start < self._num_segments
13755+            self._start_segment = start
13756+            self.log("got start segment: %d" % self._start_segment)
13757         else:
13758             self._start_segment = 0
13759 
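The offset arithmetic above is easy to check in isolation. A small sketch of the same computation, writing out div_ceil (allmydata.util.mathutil.div_ceil is assumed to be integer ceiling division):

    # Sketch of the downloader's start-segment computation.
    def div_ceil(n, d):
        return (n + d - 1) // d

    def start_segment_for(offset, segment_size):
        if offset == 0:
            return 0
        # first segment boundary at or past the offset, then one step
        # back, landing on the segment that contains the offset
        return div_ceil(offset, segment_size) - 1

    assert start_segment_for(0, 131072) == 0
    assert start_segment_for(1, 131072) == 0
    assert start_segment_for(131073, 131072) == 1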
13760hunk ./src/allmydata/mutable/servermap.py 7
13761 from itertools import count
13762 from twisted.internet import defer
13763 from twisted.python import failure
13764-from foolscap.api import DeadReferenceError, RemoteException, eventually
13765-from allmydata.util import base32, hashutil, idlib, log
13766+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
13767+                         fireEventually
13768+from allmydata.util import base32, hashutil, idlib, log, deferredutil
13769+from allmydata.util.dictutil import DictOfSets
13770 from allmydata.storage.server import si_b2a
13771 from allmydata.interfaces import IServermapUpdaterStatus
13772 from pycryptopp.publickey import rsa
13773hunk ./src/allmydata/mutable/servermap.py 16
13774 
13775 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
13776-     DictOfSets, CorruptShareError, NeedMoreDataError
13777-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
13778-     SIGNED_PREFIX_LENGTH
13779+     CorruptShareError
13780+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
13781 
13782 class UpdateStatus:
13783     implements(IServermapUpdaterStatus)
13784hunk ./src/allmydata/mutable/servermap.py 391
13785         #  * if we need the encrypted private key, we want [-1216ish:]
13786         #   * but we can't read from negative offsets
13787         #   * the offset table tells us the 'ish', also the positive offset
13788-        # A future version of the SMDF slot format should consider using
13789-        # fixed-size slots so we can retrieve less data. For now, we'll just
13790-        # read 2000 bytes, which also happens to read enough actual data to
13791-        # pre-fetch a 9-entry dirnode.
13792+        # MDMF:
13793+        #  * Checkstring? [0:72]
13794+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
13795+        #    the offset table will tell us for sure.
13796+        #  * If we need the verification key, we have to consult the offset
13797+        #    table as well.
13798+        # At this point, we don't know which format this is. Our filenode can
13799+        # tell us, but it might be lying -- in some cases, we're
13800+        # responsible for telling it which kind of file it is.
13801         self._read_size = 4000
13802         if mode == MODE_CHECK:
13803             # we use unpack_prefix_and_signature, so we need 1k
13804hunk ./src/allmydata/mutable/servermap.py 633
13805         updated.
13806         """
13807         if verinfo:
13808-            self._node._add_to_cache(verinfo, shnum, 0, data, now)
13809+            self._node._add_to_cache(verinfo, shnum, 0, data)
13810 
13811 
13812     def _got_results(self, datavs, peerid, readsize, stuff, started):
13813hunk ./src/allmydata/mutable/servermap.py 664
13814 
13815         for shnum,datav in datavs.items():
13816             data = datav[0]
13817-            try:
13818-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
13819-                last_verinfo = verinfo
13820-                last_shnum = shnum
13821-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
13822-            except CorruptShareError, e:
13823-                # log it and give the other shares a chance to be processed
13824-                f = failure.Failure()
13825-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
13826-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
13827-                self.notify_server_corruption(peerid, shnum, str(e))
13828-                self._bad_peers.add(peerid)
13829-                self._last_failure = f
13830-                checkstring = data[:SIGNED_PREFIX_LENGTH]
13831-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
13832-                self._servermap.problems.append(f)
13833-                pass
13834+            reader = MDMFSlotReadProxy(ss,
13835+                                       storage_index,
13836+                                       shnum,
13837+                                       data)
13838+            self._readers.setdefault(peerid, dict())[shnum] = reader
13839+            # our goal, with each response, is to validate the version
13840+            # information and share data as best we can at this point --
13841+            # we do this by validating the signature. To do this, we
13842+            # need to do the following:
13843+            #   - If we don't already have the public key, fetch the
13844+            #     public key. We use this to validate the signature.
13845+            if not self._node.get_pubkey():
13846+                # fetch and set the public key.
13847+                d = reader.get_verification_key(queue=True)
13848+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
13849+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
13850+                # XXX: Make self._pubkey_query_failed?
13851+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
13852+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
13853+            else:
13854+                # we already have the public key.
13855+                d = defer.succeed(None)
13856 
13857             # Neither of these two branches return anything of
13858             # consequence, so the first entry in our deferredlist will
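Note that both branches above hand back a Deferred, so the caller can chain the validation steps uniformly whether or not a network fetch happened. A hedged sketch of that shape (set_pubkey is a hypothetical setter standing in for _try_to_set_pubkey and its logging arguments):

    # Sketch: fetch-if-missing, always returning a Deferred.
    from twisted.internet import defer

    def ensure_pubkey(node, reader):
        if node.get_pubkey():
            return defer.succeed(None)   # already cached, nothing to fetch
        d = reader.get_verification_key(queue=True)
        d.addCallback(node.set_pubkey)   # hypothetical setter
        return d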
13859hunk ./src/allmydata/test/test_storage.py 1
13860-import time, os.path, platform, stat, re, simplejson, struct
13861+import time, os.path, platform, stat, re, simplejson, struct, shutil
13862 
13863hunk ./src/allmydata/test/test_storage.py 3
13864-import time, os.path, stat, re, simplejson, struct
13865+import mock
13866 
13867 from twisted.trial import unittest
13868 
13869}
13870[mutable/filenode.py: fix create_mutable_file('string')
13871"Brian Warner <warner@lothar.com>"**20110221014659
13872 Ignore-this: dc6bdad761089f0199681eeb784f1001
13873] hunk ./src/allmydata/mutable/filenode.py 137
13874         if contents is None:
13875             return MutableData("")
13876 
13877+        if isinstance(contents, str):
13878+            return MutableData(contents)
13879+
13880         if IMutableUploadable.providedBy(contents):
13881             return contents
13882 
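A brief usage sketch of the coercion added above: with the new str branch, a caller can hand create_mutable_file a plain string and get the same wrapping as the explicit form:

    from allmydata.mutable.publish import MutableData

    explicit = MutableData("hello world")  # what the new branch returns
    empty = MutableData("")                # the contents=None default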
13883[resolve more conflicts with current trunk
13884"Brian Warner <warner@lothar.com>"**20110221055600
13885 Ignore-this: 77ad038a478dbf5d9b34f7a68159a3e0
13886] hunk ./src/allmydata/mutable/servermap.py 461
13887         self._queries_completed = 0
13888 
13889         sb = self._storage_broker
13890-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13891+        # All of the peers, permuted by the storage index, as usual.
13892+        full_peerlist = [(s.get_serverid(), s.get_rref())
13893+                         for s in sb.get_servers_for_psi(self._storage_index)]
13894         self.full_peerlist = full_peerlist # for use later, immutable
13895         self.extra_peers = full_peerlist[:] # peers are removed as we use them
13896         self._good_peers = set() # peers who had some shares
13897[update MDMF code with StorageFarmBroker changes
13898"Brian Warner <warner@lothar.com>"**20110221061004
13899 Ignore-this: a693b201d31125b391cebe0412ddd027
13900] {
13901hunk ./src/allmydata/mutable/publish.py 203
13902         self._encprivkey = self._node.get_encprivkey()
13903 
13904         sb = self._storage_broker
13905-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13906+        full_peerlist = [(s.get_serverid(), s.get_rref())
13907+                         for s in sb.get_servers_for_psi(self._storage_index)]
13908         self.full_peerlist = full_peerlist # for use later, immutable
13909         self.bad_peers = set() # peerids who have errbacked/refused requests
13910 
13911hunk ./src/allmydata/test/test_mutable.py 2538
13912             # for either a block and salt or for hashes, either of which
13913             # will exercise the error handling code.
13914             killer = FirstServerGetsKilled()
13915-            for (serverid, ss) in nm.storage_broker.get_all_servers():
13916-                ss.post_call_notifier = killer.notify
13917+            for s in nm.storage_broker.get_connected_servers():
13918+                s.get_rref().post_call_notifier = killer.notify
13919             ver = servermap.best_recoverable_version()
13920             assert ver
13921             return self._node.download_version(servermap, ver)
13922}
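Both call sites in this patch (publish.py here, servermap.py in the earlier patch) expand the new server objects the same way, so the pattern is worth stating once. A sketch, where sb is a StorageFarmBroker:

    # Sketch: permuted peerlist as (serverid, rref) pairs, the shape the
    # MDMF code expects, built from the new get_servers_for_psi API.
    def peerlist_for(sb, storage_index):
        return [(s.get_serverid(), s.get_rref())
                for s in sb.get_servers_for_psi(storage_index)]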
13923[mutable/filenode: Clean up servermap handling in MutableFileVersion
13924Kevan Carstensen <kevan@isnotajoke.com>**20110226010433
13925 Ignore-this: 2257c9f65502098789f5ea355b94f130
13926 
13927 We want to update the servermap before attempting to modify a file,
13928 which we now do. This introduced code duplication, which was addressed
13929 by refactoring the servermap update into its own method, and then
13930 eliminating duplicate servermap updates throughout the
13931 MutableFileVersion.
13932] {
13933hunk ./src/allmydata/mutable/filenode.py 19
13934 from allmydata.mutable.publish import Publish, MutableData,\
13935                                       DEFAULT_MAX_SEGMENT_SIZE, \
13936                                       TransformingUploadable
13937-from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
13938+from allmydata.mutable.common import MODE_READ, MODE_WRITE, MODE_CHECK, UnrecoverableFileError, \
13939      ResponseCache, UncoordinatedWriteError
13940 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
13941 from allmydata.mutable.retrieve import Retrieve
13942hunk ./src/allmydata/mutable/filenode.py 807
13943         a little bit.
13944         """
13945         log.msg("doing modify")
13946-        d = self._modify_once(modifier, first_time)
13947+        if first_time:
13948+            d = self._update_servermap()
13949+        else:
13950+            # We ran into trouble; do MODE_CHECK so we're a little more
13951+            # careful on subsequent tries.
13952+            d = self._update_servermap(mode=MODE_CHECK)
13953+
13954+        d.addCallback(lambda ignored:
13955+            self._modify_once(modifier, first_time))
13956         def _retry(f):
13957             f.trap(UncoordinatedWriteError)
13958hunk ./src/allmydata/mutable/filenode.py 818
13959+            # Uh oh, it broke. We're allowed to trust the servermap for our
13960+            # first try, but after that we need to update it. It's
13961+            # possible that we've failed due to a race with another
13962+            # uploader, and if the race is to converge correctly, we
13963+            # need to know about that upload.
13964             d2 = defer.maybeDeferred(backoffer, self, f)
13965             d2.addCallback(lambda ignored:
13966                            self._modify_and_retry(modifier,
13967hunk ./src/allmydata/mutable/filenode.py 837
13968         I attempt to apply a modifier to the contents of the mutable
13969         file.
13970         """
13971-        # XXX: This is wrong -- we could get more servers if we updated
13972-        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
13973-        # assert that the last update wasn't MODE_READ
13974-        assert self._servermap.last_update_mode == MODE_WRITE
13975+        assert self._servermap.last_update_mode != MODE_READ
13976 
13977         # download_to_data is serialized, so we have to call this to
13978         # avoid deadlock.
13979hunk ./src/allmydata/mutable/filenode.py 1076
13980 
13981         # Now ask for the servermap to be updated in MODE_WRITE with
13982         # this update range.
13983-        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
13984-                             self._servermap,
13985-                             mode=MODE_WRITE,
13986-                             update_range=(start_segment, end_segment))
13987-        return u.update()
13988+        return self._update_servermap(update_range=(start_segment,
13989+                                                    end_segment))
13990 
13991 
13992     def _decode_and_decrypt_segments(self, ignored, data, offset):
13993hunk ./src/allmydata/mutable/filenode.py 1135
13994                                    segments_and_bht[1])
13995         p = Publish(self._node, self._storage_broker, self._servermap)
13996         return p.update(u, offset, segments_and_bht[2], self._version)
13997+
13998+
13999+    def _update_servermap(self, mode=MODE_WRITE, update_range=None):
14000+        """
14001+        I update the servermap. I return a Deferred that fires when the
14002+        servermap update is done.
14003+        """
14004+        if update_range:
14005+            u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
14006+                                 self._servermap,
14007+                                 mode=mode,
14008+                                 update_range=update_range)
14009+        else:
14010+            u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
14011+                                 self._servermap,
14012+                                 mode=mode)
14013+        return u.update()
14014}
14015[web: Use the string "replace" to trigger whole-file replacement when processing an offset parameter.
14016Kevan Carstensen <kevan@isnotajoke.com>**20110227231643
14017 Ignore-this: 5bbf0b90d68efe20d4c531bb98a8321a
14018] {
14019hunk ./docs/frontends/webapi.rst 360
14020  To use the /uri/$FILECAP form, $FILECAP must be a write-cap for a mutable file.
14021 
14022  In the /uri/$DIRCAP/[SUBDIRS../]FILENAME form, if the target file is a
14023- writeable mutable file, that file's contents will be overwritten in-place. If
14024- it is a read-cap for a mutable file, an error will occur. If it is an
14025- immutable file, the old file will be discarded, and a new one will be put in
14026- its place. If the target file is a writable mutable file, you may also
14027- specify an "offset" parameter -- a byte offset that determines where in
14028- the mutable file the data from the HTTP request body is placed. This
14029- operation is relatively efficient for MDMF mutable files, and is
14030- relatively inefficient (but still supported) for SDMF mutable files.
14031+ writeable mutable file, that file's contents will be overwritten
14032+ in-place. If it is a read-cap for a mutable file, an error will occur.
14033+ If it is an immutable file, the old file will be discarded, and a new
14034+ one will be put in its place. If the target file is a writable mutable
14035+ file, you may also specify an "offset" parameter -- a byte offset that
14036+ determines where in the mutable file the data from the HTTP request
14037+ body is placed. This operation is relatively efficient for MDMF mutable
14038+ files, and is relatively inefficient (but still supported) for SDMF
14039+ mutable files. If no offset parameter is specified, then the entire
14040+ file is replaced with the data from the HTTP request body. For an
14041+ immutable file, the "offset" parameter is not valid.
14042 
14043  When creating a new file, if "mutable=true" is in the query arguments, the
14044  operation will create a mutable file instead of an immutable one.
14045hunk ./src/allmydata/test/test_web.py 3187
14046             self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
14047         return d
14048 
14049+    def test_PUT_update_at_invalid_offset(self):
14050+        file_contents = "test file" * 100000 # about 900 KiB
14051+        d = self.PUT("/uri?mutable=true", file_contents)
14052+        def _then(filecap):
14053+            self.filecap = filecap
14054+        d.addCallback(_then)
14055+        # Negative offsets should cause an error.
14056+        d.addCallback(lambda ignored:
14057+            self.shouldHTTPError("test mutable invalid offset negative",
14058+                                 400, "Bad Request",
14059+                                 "Invalid offset",
14060+                                 self.PUT,
14061+                                 "/uri/%s?offset=-1" % self.filecap,
14062+                                 "foo"))
14063+        return d
14064 
14065     def test_PUT_update_at_offset_immutable(self):
14066         file_contents = "Test file" * 100000
14067hunk ./src/allmydata/web/common.py 55
14068     # message? Since this call is going to be used by programmers and
14069     # their tools rather than users (through the wui), it is not
14070     # inconsistent to return that, I guess.
14071-    offset = int(offset)
14072-    return offset
14073+    return int(offset)
14074 
14075 
14076 def get_root(ctx_or_req):
14077hunk ./src/allmydata/web/filenode.py 219
14078         req = IRequest(ctx)
14079         t = get_arg(req, "t", "").strip()
14080         replace = parse_replace_arg(get_arg(req, "replace", "true"))
14081-        offset = parse_offset_arg(get_arg(req, "offset", -1))
14082+        offset = parse_offset_arg(get_arg(req, "offset", False))
14083 
14084         if not t:
14085hunk ./src/allmydata/web/filenode.py 222
14086-            if self.node.is_mutable() and offset >= 0:
14087-                return self.update_my_contents(req, offset)
14088-
14089-            elif self.node.is_mutable():
14090-                return self.replace_my_contents(req)
14091             if not replace:
14092                 # this is the early trap: if someone else modifies the
14093                 # directory while we're uploading, the add_file(overwrite=)
14094hunk ./src/allmydata/web/filenode.py 227
14095                 # call in replace_me_with_a_child will do the late trap.
14096                 raise ExistingChildError()
14097-            if offset >= 0:
14098-                raise WebError("PUT to a file: append operation invoked "
14099-                               "on an immutable cap")
14100 
14101hunk ./src/allmydata/web/filenode.py 228
14102+            if self.node.is_mutable():
14103+                if offset is False: # identity test: offset 0 must not mean "replace"
14104+                    return self.replace_my_contents(req)
14105+
14106+                if offset >= 0:
14107+                    return self.update_my_contents(req, offset)
14108+
14109+                raise WebError("PUT to a mutable file: Invalid offset")
14110+
14111+            else:
14112+                if offset is not False: # identity test, so offset=0 is also rejected
14113+                    raise WebError("PUT to a file: append operation invoked "
14114+                                   "on an immutable cap")
14115+
14116+                assert self.parentnode and self.name
14117+                return self.replace_me_with_a_child(req, self.client, replace)
14118 
14119hunk ./src/allmydata/web/filenode.py 245
14120-            assert self.parentnode and self.name
14121-            return self.replace_me_with_a_child(req, self.client, replace)
14122         if t == "uri":
14123             if not replace:
14124                 raise ExistingChildError()
14125}
14126[docs/configuration.rst: fix more conflicts between #393 and trunk
14127Kevan Carstensen <kevan@isnotajoke.com>**20110228003426
14128 Ignore-this: 7917effdeecab00d634a06f1df8fe2cf
14129] {
14130replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
14131hunk ./docs/configuration.rst 324
14132     (Mutable files use a different share placement algorithm that does not
14133     currently consider this parameter.)
14134 
14135+``mutable.format = sdmf or mdmf``
14136+
14137+    This value tells Tahoe-LAFS what the default mutable file format should
14138+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
14139+    in the old SDMF format. This is desirable for clients that operate on
14140+    grids where some peers run older versions of Tahoe-LAFS, as these older
14141+    versions cannot read the new MDMF mutable file format. If
14142+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
14143+    the new MDMF format, which supports efficient in-place modification and
14144+    streaming downloads. You can override this value using a special
14145+    mutable-type parameter in the webapi. If you do not specify a value here,
14146+    Tahoe-LAFS will use SDMF for all newly-created mutable files.
14147+
14148+    Note that this parameter only applies to mutable files. Mutable
14149+    directories, which are stored as mutable files, are not controlled by
14150+    this parameter and will always use SDMF. We may revisit this decision
14151+    in future versions of Tahoe-LAFS.
14152+
14153+
14154+Frontend Configuration
14155+======================
14156+
14157+The Tahoe client process can run a variety of frontend file-access protocols.
14158+You will use these to create and retrieve files from the virtual filesystem.
14159+Configuration details for each are documented in the following
14160+protocol-specific guides:
14161+
14162+HTTP
14163+
14164+    Tahoe runs a webserver by default on port 3456. This interface provides a
14165+    human-oriented "WUI", with pages to create, modify, and browse
14166+    directories and files, as well as a number of pages to check on the
14167+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
14168+    with a REST-ful HTTP interface that can be used by other programs
14169+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
14170+    details, and the ``web.port`` and ``web.static`` config variables above.
14171+    The `<frontends/download-status.rst>`_ document also describes a few WUI
14172+    status pages.
14173+
14174+CLI
14175+
14176+    The main "bin/tahoe" executable includes subcommands for manipulating the
14177+    filesystem, uploading/downloading files, and creating/running Tahoe
14178+    nodes. See `<frontends/CLI.rst>`_ for details.
14179+
14180+FTP, SFTP
14181+
14182+    Tahoe can also run both FTP and SFTP servers, and map a username/password
14183+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
14184+    for instructions on configuring these services, and the ``[ftpd]`` and
14185+    ``[sftpd]`` sections of ``tahoe.cfg``.
14186+
14187 
14188 Storage Server Configuration
14189 ============================
14190hunk ./docs/configuration.rst 436
14191     `<garbage-collection.rst>`_ for full details.
14192 
14193 
14194-shares.needed = (int, optional) aka "k", default 3
14195-shares.total = (int, optional) aka "N", N >= k, default 10
14196-shares.happy = (int, optional) 1 <= happy <= N, default 7
14197-
14198- These three values set the default encoding parameters. Each time a new file
14199- is uploaded, erasure-coding is used to break the ciphertext into separate
14200- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
14201- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
14202- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
14203- Setting k to 1 is equivalent to simple replication (uploading N copies of
14204- the file).
14205-
14206- These values control the tradeoff between storage overhead, performance, and
14207- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
14208- backend storage space (the actual value will be a bit more, because of other
14209- forms of overhead). Up to N-k shares can be lost before the file becomes
14210- unrecoverable, so assuming there are at least N servers, up to N-k servers
14211- can be offline without losing the file. So large N/k ratios are more
14212- reliable, and small N/k ratios use less disk space. Clearly, k must never be
14213- smaller than N.
14214-
14215- Large values of N will slow down upload operations slightly, since more
14216- servers must be involved, and will slightly increase storage overhead due to
14217- the hash trees that are created. Large values of k will cause downloads to
14218- be marginally slower, because more servers must be involved. N cannot be
14219- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe-LAFS
14220- uses.
14221-
14222- shares.happy allows you control over the distribution of your immutable file.
14223- For a successful upload, shares are guaranteed to be initially placed on
14224- at least 'shares.happy' distinct servers, the correct functioning of any
14225- k of which is sufficient to guarantee the availability of the uploaded file.
14226- This value should not be larger than the number of servers on your grid.
14227-
14228- A value of shares.happy <= k is allowed, but does not provide any redundancy
14229- if some servers fail or lose shares.
14230-
14231- (Mutable files use a different share placement algorithm that does not
14232-  consider this parameter.)
14233-
14234-
14235-== Storage Server Configuration ==
14236-
14237-[storage]
14238-enabled = (boolean, optional)
14239-
14240- If this is True, the node will run a storage server, offering space to other
14241- clients. If it is False, the node will not run a storage server, meaning
14242- that no shares will be stored on this node. Use False this for clients who
14243- do not wish to provide storage service. The default value is True.
14244-
14245-readonly = (boolean, optional)
14246-
14247- If True, the node will run a storage server but will not accept any shares,
14248- making it effectively read-only. Use this for storage servers which are
14249- being decommissioned: the storage/ directory could be mounted read-only,
14250- while shares are moved to other servers. Note that this currently only
14251- affects immutable shares. Mutable shares (used for directories) will be
14252- written and modified anyway. See ticket #390 for the current status of this
14253- bug. The default value is False.
14254-
14255-reserved_space = (str, optional)
14256-
14257- If provided, this value defines how much disk space is reserved: the storage
14258- server will not accept any share which causes the amount of free disk space
14259- to drop below this value. (The free space is measured by a call to statvfs(2)
14260- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
14261- user account under which the storage server runs.)
14262-
14263- This string contains a number, with an optional case-insensitive scale
14264- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
14265- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
14266- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
14267-
14268-expire.enabled =
14269-expire.mode =
14270-expire.override_lease_duration =
14271-expire.cutoff_date =
14272-expire.immutable =
14273-expire.mutable =
14274-
14275- These settings control garbage-collection, in which the server will delete
14276- shares that no longer have an up-to-date lease on them. Please see the
14277- neighboring "garbage-collection.txt" document for full details.
14278-
14279-
14280-== Running A Helper ==
14281+Running A Helper
14282+================
14283 
14284 A "helper" is a regular client node that also offers the "upload helper"
14285 service.
14286}
14287[mutable/layout: remove references to the salt hash tree.
14288Kevan Carstensen <kevan@isnotajoke.com>**20110228010637
14289 Ignore-this: b3b2963ba4d0b42c78b6bba219d4deb5
14290] {
14291hunk ./src/allmydata/mutable/layout.py 577
14292     # 99          8           The offset of the EOF
14293     #
14294     # followed by salts and share data, the encrypted private key, the
14295-    # block hash tree, the salt hash tree, the share hash chain, a
14296-    # signature over the first eight fields, and a verification key.
14297+    # block hash tree, the share hash chain, a signature over the first
14298+    # eight fields, and a verification key.
14299     #
14300     # The checkstring is the first three fields -- the version number,
14301     # sequence number, root hash and root salt hash. This is consistent
14302hunk ./src/allmydata/mutable/layout.py 628
14303     #      calculate the offset for the share hash chain, and fill that
14304     #      into the offsets table.
14305     #
14306-    #   4: At the same time, we're in a position to upload the salt hash
14307-    #      tree. This is a Merkle tree over all of the salts. We use a
14308-    #      Merkle tree so that we can validate each block,salt pair as
14309-    #      we download them later. We do this using
14310-    #
14311-    #        put_salthashes(salt_hash_tree)
14312-    #
14313-    #      When you do this, I automatically put the root of the tree
14314-    #      (the hash at index 0 of the list) in its appropriate slot in
14315-    #      the signed prefix of the share.
14316-    #
14317-    #   5: We're now in a position to upload the share hash chain for
14318+    #   4: We're now in a position to upload the share hash chain for
14319     #      a share. Do that with something like:
14320     #     
14321     #        put_sharehashes(share_hash_chain)
14322hunk ./src/allmydata/mutable/layout.py 639
14323     #      The root of this tree will be put explicitly in the next
14324     #      step.
14325     #
14326-    #      TODO: Why? Why not just include it in the tree here?
14327-    #
14328-    #   6: Before putting the signature, we must first put the
14329+    #   5: Before putting the signature, we must first put the
14330     #      root_hash. Do this with:
14331     #
14332     #        put_root_hash(root_hash).
14333hunk ./src/allmydata/mutable/layout.py 872
14334             raise LayoutInvalid("I was given the wrong size block to write")
14335 
14336         # We want to write at len(MDMFHEADER) + segnum * block_size.
14337-
14338         offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
14339         data = salt + data
14340 
14341hunk ./src/allmydata/mutable/layout.py 889
14342         # tree is written, since that could cause the private key to run
14343         # into the block hash tree. Before it writes the block hash
14344         # tree, the block hash tree writing method writes the offset of
14345-        # the salt hash tree. So that's a good indicator of whether or
14346+        # the share hash chain. So that's a good indicator of whether or
14347         # not the block hash tree has been written.
14348         if "share_hash_chain" in self._offsets:
14349             raise LayoutInvalid("You must write this before the block hash tree")
14350hunk ./src/allmydata/mutable/layout.py 907
14351         The encrypted private key must be queued before the block hash
14352         tree, since we need to know how large it is to know where the
14353         block hash tree should go. The block hash tree must be put
14354-        before the salt hash tree, since its size determines the
14355+        before the share hash chain, since its size determines the
14356         offset of the share hash chain.
14357         """
14358         assert self._offsets
14359hunk ./src/allmydata/mutable/layout.py 932
14360         I queue a write vector to put the share hash chain in my
14361         argument onto the remote server.
14362 
14363-        The salt hash tree must be queued before the share hash chain,
14364-        since we need to know where the salt hash tree ends before we
14365+        The block hash tree must be queued before the share hash chain,
14366+        since we need to know where the block hash tree ends before we
14367         can know where the share hash chain starts. The share hash chain
14368         must be put before the signature, since the length of the packed
14369         share hash chain determines the offset of the signature. Also,
14370hunk ./src/allmydata/mutable/layout.py 937
14371-        semantically, you must know what the root of the salt hash tree
14372+        semantically, you must know what the root of the block hash tree
14373         is before you can generate a valid signature.
14374         """
14375         assert isinstance(sharehashes, dict)
14376hunk ./src/allmydata/mutable/layout.py 942
14377         if "share_hash_chain" not in self._offsets:
14378-            raise LayoutInvalid("You need to put the salt hash tree before "
14379+            raise LayoutInvalid("You need to put the block hash tree before "
14380                                 "you can put the share hash chain")
14381         # The signature comes after the share hash chain. If the
14382         # signature has already been written, we must not write another
14383}
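
After this change, the write ordering the revised comments describe is: blocks and salts, then the encrypted private key, the block hash tree, the share hash chain, the root hash, the signature, and finally the verification key, with each step defining the offset the next one needs. A toy order-checker makes the invariant explicit; OrderedWriter and WRITE_ORDER are illustrative stand-ins, not part of the real MDMFSlotWriteProxy:

    class LayoutInvalid(Exception):
        pass

    # One entry per put_* step; the real writer accepts put_block once
    # per segment, which this sketch collapses into a single step.
    WRITE_ORDER = ["blocks", "encprivkey", "blockhashes", "sharehashes",
                   "root_hash", "signature", "verification_key"]

    class OrderedWriter(object):
        def __init__(self):
            self._done = []

        def put(self, field):
            if len(self._done) == len(WRITE_ORDER):
                raise LayoutInvalid("the share is already complete")
            expected = WRITE_ORDER[len(self._done)]
            if field != expected:
                raise LayoutInvalid("you must put %s before %s"
                                    % (expected, field))
            self._done.append(field)

    w = OrderedWriter()
    w.put("blocks")
    w.put("encprivkey")    # putting "sharehashes" here would raise
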
14384[test_mutable.py: add test to exercise fencepost bug
14385warner@lothar.com**20110228021056
14386 Ignore-this: d2f9cf237ce6db42fb250c8ad71a4fc3
14387] {
14388hunk ./src/allmydata/test/test_mutable.py 2
14389 
14390-import os
14391+import os, re
14392 from cStringIO import StringIO
14393 from twisted.trial import unittest
14394 from twisted.internet import defer, reactor
14395hunk ./src/allmydata/test/test_mutable.py 2931
14396         self.set_up_grid()
14397         self.c = self.g.clients[0]
14398         self.nm = self.c.nodemaker
14399-        self.data = "test data" * 100000 # about 900 KiB; MDMF
14400+        self.data = "testdata " * 100000 # about 900 KiB; MDMF
14401         self.small_data = "test data" * 10 # about 90 B; SDMF
14402         return self.do_upload()
14403 
14404hunk ./src/allmydata/test/test_mutable.py 2981
14405             self.failUnlessEqual(results, new_data))
14406         return d
14407 
14408+    def test_replace_segstart1(self):
14409+        offset = 128*1024+1
14410+        new_data = "NNNN"
14411+        expected = self.data[:offset]+new_data+self.data[offset+4:]
14412+        d = self.mdmf_node.get_best_mutable_version()
14413+        d.addCallback(lambda mv:
14414+            mv.update(MutableData(new_data), offset))
14415+        d.addCallback(lambda ignored:
14416+            self.mdmf_node.download_best_version())
14417+        def _check(results):
14418+            if results != expected:
14419+                print
14420+                print "got: %s ... %s" % (results[:20], results[-20:])
14421+                print "exp: %s ... %s" % (expected[:20], expected[-20:])
14422+                self.fail("results != expected")
14423+        d.addCallback(_check)
14424+        return d
14425+
14426+    def _check_differences(self, got, expected):
14427+        # displaying arbitrary file corruption is tricky for a
14428+        # 1MB file of repeating data,, so look for likely places
14429+        # with problems and display them separately
14430+        gotmods = [mo.span() for mo in re.finditer('([A-Z]+)', got)]
14431+        expmods = [mo.span() for mo in re.finditer('([A-Z]+)', expected)]
14432+        gotspans = ["%d:%d=%s" % (start,end,got[start:end])
14433+                    for (start,end) in gotmods]
14434+        expspans = ["%d:%d=%s" % (start,end,expected[start:end])
14435+                    for (start,end) in expmods]
14436+        #print "expecting: %s" % expspans
14437+
14438+        SEGSIZE = 128*1024
14439+        if got != expected:
14440+            print "differences:"
14441+            for segnum in range(len(expected)//SEGSIZE):
14442+                start = segnum * SEGSIZE
14443+                end = (segnum+1) * SEGSIZE
14444+                got_ends = "%s .. %s" % (got[start:start+20], got[end-20:end])
14445+                exp_ends = "%s .. %s" % (expected[start:start+20], expected[end-20:end])
14446+                if got_ends != exp_ends:
14447+                    print "expected[%d]: %s" % (start, exp_ends)
14448+                    print "got     [%d]: %s" % (start, got_ends)
14449+            if expspans != gotspans:
14450+                print "expected: %s" % expspans
14451+                print "got     : %s" % gotspans
14452+            open("EXPECTED","wb").write(expected)
14453+            open("GOT","wb").write(got)
14454+            print "wrote data to EXPECTED and GOT"
14455+            self.fail("didn't get expected data")
14456+
14457+
14458+    def test_replace_locations(self):
14459+        # exercise fencepost conditions
14460+        expected = self.data
14461+        SEGSIZE = 128*1024
14462+        suspects = range(SEGSIZE-3, SEGSIZE+1)+range(2*SEGSIZE-3, 2*SEGSIZE+1)
14463+        letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
14464+        d = defer.succeed(None)
14465+        for offset in suspects:
14466+            new_data = letters.next()*2 # "AA", then "BB", etc
14467+            expected = expected[:offset]+new_data+expected[offset+2:]
14468+            d.addCallback(lambda ign:
14469+                          self.mdmf_node.get_best_mutable_version())
14470+            def _modify(mv, offset=offset, new_data=new_data):
14471+                # bind the current 'offset' and 'new_data' via default args
14472+                md = MutableData(new_data)
14473+                return mv.update(md, offset)
14474+            d.addCallback(_modify)
14475+            d.addCallback(lambda ignored:
14476+                          self.mdmf_node.download_best_version())
14477+            d.addCallback(self._check_differences, expected)
14478+        return d
14479+
14480 
14481     def test_replace_and_extend(self):
14482         # We should be able to replace data in the middle of a mutable
14483}
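
The suspect offsets probed above sit just before and just on the 128 KiB segment boundaries, which is exactly where an off-by-one in the segment arithmetic shows up: a two-byte write at SEGSIZE-1 straddles two segments, while one at SEGSIZE touches only the second. A small sketch of that arithmetic (segments_touched is a hypothetical helper, written for illustration):

    SEGSIZE = 128 * 1024

    def segments_touched(offset, length, segsize=SEGSIZE):
        # 0-indexed segments covered by writing `length` bytes at
        # `offset`; the last byte written lives at offset + length - 1.
        first = offset // segsize
        last = (offset + length - 1) // segsize
        return range(first, last + 1)

    assert segments_touched(SEGSIZE - 1, 2) == [0, 1]  # straddles the boundary
    assert segments_touched(SEGSIZE, 2) == [1]         # entirely in segment 1
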
14484[mutable/publish: account for offsets on segment boundaries.
14485Kevan Carstensen <kevan@isnotajoke.com>**20110228083327
14486 Ignore-this: c8758a0580fcc15a22c2f8582d758a6b
14487] {
14488hunk ./src/allmydata/mutable/filenode.py 17
14489 from pycryptopp.cipher.aes import AES
14490 
14491 from allmydata.mutable.publish import Publish, MutableData,\
14492-                                      DEFAULT_MAX_SEGMENT_SIZE, \
14493                                       TransformingUploadable
14494 from allmydata.mutable.common import MODE_READ, MODE_WRITE, MODE_CHECK, UnrecoverableFileError, \
14495      ResponseCache, UncoordinatedWriteError
14496hunk ./src/allmydata/mutable/filenode.py 1058
14497         # appending data to the file.
14498         assert offset <= self.get_size()
14499 
14500+        segsize = self._version[3]
14501         # We'll need the segment that the data starts in, regardless of
14502         # what we'll do later.
14503hunk ./src/allmydata/mutable/filenode.py 1061
14504-        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
14505+        start_segment = mathutil.div_ceil(offset, segsize)
14506         start_segment -= 1
14507 
14508         # We only need the end segment if the data we append does not go
14509hunk ./src/allmydata/mutable/filenode.py 1069
14510         end_segment = start_segment
14511         if offset + data.get_size() < self.get_size():
14512             end_data = offset + data.get_size()
14513-            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
14514+            end_segment = mathutil.div_ceil(end_data, segsize)
14515             end_segment -= 1
14516         self._start_segment = start_segment
14517         self._end_segment = end_segment
14518hunk ./src/allmydata/mutable/publish.py 551
14519                                                   segment_size)
14520             self.starting_segment = mathutil.div_ceil(offset,
14521                                                       segment_size)
14522-            self.starting_segment -= 1
14523+            if offset % segment_size != 0:
14524+                self.starting_segment -= 1
14525             if offset == 0:
14526                 self.starting_segment = 0
14527 
14528}
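
The one-line fix above matters only when the offset lands exactly on a segment boundary: div_ceil(offset, segment_size) - 1 is correct inside a segment but one segment too low on the boundary itself. A standalone illustration (div_ceil mirrors allmydata.util.mathutil.div_ceil):

    def div_ceil(n, d):
        return (n + d - 1) // d

    SEGSIZE = 128 * 1024

    def old_start_segment(offset):              # before this patch
        return div_ceil(offset, SEGSIZE) - 1

    def new_start_segment(offset):              # after this patch
        start = div_ceil(offset, SEGSIZE)
        if offset % SEGSIZE != 0:
            start -= 1
        if offset == 0:
            start = 0
        return start

    # They agree inside a segment, but disagree exactly on a boundary:
    assert old_start_segment(SEGSIZE + 1) == new_start_segment(SEGSIZE + 1) == 1
    assert old_start_segment(SEGSIZE) == 0   # wrong: segment 0 is untouched
    assert new_start_segment(SEGSIZE) == 1   # right: write starts in segment 1
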
14529[tahoe-put: raise UsageError when given a nonsensical mutable type, move option validation code to the option parser.
14530Kevan Carstensen <kevan@isnotajoke.com>**20110301030807
14531 Ignore-this: 2dc19d8bd741842eff458ca553d0bf2a
14532] {
14533hunk ./src/allmydata/scripts/cli.py 179
14534         if self.from_file == u"-":
14535             self.from_file = None
14536 
14537+        if self['mutable-type'] and self['mutable-type'] not in ("sdmf", "mdmf"):
14538+            raise usage.UsageError("%s is an invalid format" % self['mutable-type'])
14539+
14540+
14541     def getSynopsis(self):
14542         return "Usage:  %s put LOCAL_FILE REMOTE_FILE" % (os.path.basename(sys.argv[0]),)
14543 
14544hunk ./src/allmydata/scripts/tahoe_put.py 33
14545     stdout = options.stdout
14546     stderr = options.stderr
14547 
14548-    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
14549-        # Don't try to pass unsupported types to the webapi
14550-        print >>stderr, "error: %s is an invalid format" % mutable_type
14551-        return 1
14552-
14553     if nodeurl[-1] != "/":
14554         nodeurl += "/"
14555     if to_file:
14556hunk ./src/allmydata/test/test_cli.py 1008
14557         return d
14558 
14559     def test_mutable_type_invalid_format(self):
14560-        self.basedir = "cli/Put/mutable_type_invalid_format"
14561-        self.set_up_grid()
14562-        data = "data" * 100000
14563-        fn1 = os.path.join(self.basedir, "data")
14564-        fileutil.write(fn1, data)
14565-        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
14566-        def _check_failure((rc, out, err)):
14567-            self.failIfEqual(rc, 0)
14568-            self.failUnlessIn("invalid", err)
14569-        d.addCallback(_check_failure)
14570-        return d
14571+        o = cli.PutOptions()
14572+        self.failUnlessRaises(usage.UsageError,
14573+                              o.parseOptions,
14574+                              ["--mutable", "--mutable-type=ldmf"])
14575 
14576     def test_put_with_nonexistent_alias(self):
14577         # when invoked with an alias that doesn't exist, 'tahoe put'
14578}
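
Moving the check into the options class means every caller of the parser gets the error at parse time, before anything talks to the node, and the CLI machinery turns the UsageError into the usual usage message. A cut-down sketch of the pattern (the real PutOptions defines many more options, and the check may live in parseArgs rather than postOptions):

    from twisted.python import usage

    class PutOptions(usage.Options):
        optFlags = [("mutable", None, "Create a mutable file.")]
        optParameters = [("mutable-type", None, None,
                          "Create a mutable file in the given format "
                          "(sdmf or mdmf).")]

        def postOptions(self):
            # Reject unsupported formats before anything reaches the webapi.
            if self['mutable-type'] and \
               self['mutable-type'] not in ("sdmf", "mdmf"):
                raise usage.UsageError("%s is an invalid format"
                                       % self['mutable-type'])

    # PutOptions().parseOptions(["--mutable", "--mutable-type=ldmf"])
    # now raises usage.UsageError, which is what the new test asserts.
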
14579[web: use None instead of False in the case of no offset, use object identity comparison to check whether or not an offset was specified.
14580Kevan Carstensen <kevan@isnotajoke.com>**20110305010858
14581 Ignore-this: 14b7550ca95ce423c9b0b7f6f14ffd2f
14582] {
14583hunk ./src/allmydata/test/test_mutable.py 2981
14584             self.failUnlessEqual(results, new_data))
14585         return d
14586 
14587+    def test_replace_beginning(self):
14588+        # We should be able to replace data at the beginning of the file
14589+        # without truncating the file
14590+        B = "beginning"
14591+        new_data = B + self.data[len(B):]
14592+        d = self.mdmf_node.get_best_mutable_version()
14593+        d.addCallback(lambda mv: mv.update(MutableData(B), 0))
14594+        d.addCallback(lambda ignored: self.mdmf_node.download_best_version())
14595+        d.addCallback(lambda results: self.failUnlessEqual(results, new_data))
14596+        return d
14597+
14598     def test_replace_segstart1(self):
14599         offset = 128*1024+1
14600         new_data = "NNNN"
14601hunk ./src/allmydata/test/test_web.py 3185
14602         d.addCallback(_get_data)
14603         d.addCallback(lambda results:
14604             self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
14605+        # and try replacing the beginning of the file
14606+        d.addCallback(lambda ignored:
14607+            self.PUT("/uri/%s?offset=0" % self.filecap, "begin"))
14608+        d.addCallback(_get_data)
14609+        d.addCallback(lambda results:
14610+            self.failUnlessEqual(results, "begin"+self.new_data[len("begin"):]+("puppies"*100)))
14611         return d
14612 
14613     def test_PUT_update_at_invalid_offset(self):
14614hunk ./src/allmydata/web/common.py 55
14615     # message? Since this call is going to be used by programmers and
14616     # their tools rather than users (through the wui), it is not
14617     # inconsistent to return that, I guess.
14618-    return int(offset)
14619+    if offset is not None:
14620+        offset = int(offset)
14621+
14622+    return offset
14623 
14624 
14625 def get_root(ctx_or_req):
14626hunk ./src/allmydata/web/filenode.py 219
14627         req = IRequest(ctx)
14628         t = get_arg(req, "t", "").strip()
14629         replace = parse_replace_arg(get_arg(req, "replace", "true"))
14630-        offset = parse_offset_arg(get_arg(req, "offset", False))
14631+        offset = parse_offset_arg(get_arg(req, "offset", None))
14632 
14633         if not t:
14634             if not replace:
14635hunk ./src/allmydata/web/filenode.py 229
14636                 raise ExistingChildError()
14637 
14638             if self.node.is_mutable():
14639-                if offset == False:
14640+                if offset is None:
14641                     return self.replace_my_contents(req)
14642 
14643                 if offset >= 0:
14644hunk ./src/allmydata/web/filenode.py 238
14645                 raise WebError("PUT to a mutable file: Invalid offset")
14646 
14647             else:
14648-                if offset != False:
14649+                if offset is not None:
14650                     raise WebError("PUT to a file: append operation invoked "
14651                                    "on an immutable cap")
14652 
14653}
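
The sentinel change is forced by Python's numeric model: bool is a subclass of int, so 0 == False is true, and an explicit ?offset=0 (a perfectly valid in-place write at byte 0) was indistinguishable from "no offset given". Comparing against None with `is` has no such collision. A short demonstration:

    assert (0 == False) is True      # why `offset == False` was ambiguous
    assert (0 is None) is False      # why `offset is None` is not

    offset = 0                       # an explicit ?offset=0 on the PUT
    assert offset is not None        # new check: an update at byte 0,
                                     # not a full-contents replace
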
14654[mutable/filenode: remove incorrect comments about segment boundaries
14655Kevan Carstensen <kevan@isnotajoke.com>**20110307081713
14656 Ignore-this: 7008644c3d9588815000a86edbf9c568
14657] {
14658hunk ./src/allmydata/mutable/filenode.py 1001
14659         offset. I return a Deferred that fires when this has been
14660         completed.
14661         """
14662-        # We have two cases here:
14663-        # 1. The new data will add few enough segments so that it does
14664-        #    not cross into the next power-of-two boundary.
14665-        # 2. It doesn't.
14666-        #
14667-        # In the former case, we can modify the file in place. In the
14668-        # latter case, we need to re-encode the file.
14669         new_size = data.get_size() + offset
14670         old_size = self.get_size()
14671         segment_size = self._version[3]
14672hunk ./src/allmydata/mutable/filenode.py 1011
14673         log.msg("got %d old segments, %d new segments" % \
14674                         (num_old_segments, num_new_segments))
14675 
14676-        # We also do a whole file re-encode if the file is an SDMF file.
14677+        # We do a whole file re-encode if the file is an SDMF file.
14678         if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
14679             log.msg("doing re-encode instead of in-place update")
14680             return self._do_modify_update(data, offset)
14681hunk ./src/allmydata/mutable/filenode.py 1016
14682 
14683+        # Otherwise, we can replace just the parts that are changing.
14684         log.msg("updating in place")
14685         d = self._do_update_update(data, offset)
14686         d.addCallback(self._decode_and_decrypt_segments, data, offset)
14687}
14688[mutable: use integer division where appropriate
14689Kevan Carstensen <kevan@isnotajoke.com>**20110307082229
14690 Ignore-this: a8767e89d919c9f2a5d5fef3953d53f9
14691] {
14692hunk ./src/allmydata/mutable/filenode.py 1055
14693         segsize = self._version[3]
14694         # We'll need the segment that the data starts in, regardless of
14695         # what we'll do later.
14696-        start_segment = mathutil.div_ceil(offset, segsize)
14697-        start_segment -= 1
14698+        start_segment = offset // segsize
14699 
14700         # We only need the end segment if the data we append does not go
14701         # beyond the current end-of-file.
14702hunk ./src/allmydata/mutable/filenode.py 1062
14703         end_segment = start_segment
14704         if offset + data.get_size() < self.get_size():
14705             end_data = offset + data.get_size()
14706-            end_segment = mathutil.div_ceil(end_data, segsize)
14707-            end_segment -= 1
14708+            end_segment = end_data // segsize
14709+
14710         self._start_segment = start_segment
14711         self._end_segment = end_segment
14712 
14713hunk ./src/allmydata/mutable/publish.py 547
14714 
14715         # Calculate the starting segment for the upload.
14716         if segment_size:
14717+            # We use div_ceil instead of integer division here because
14718+            # it is semantically correct.
14719+            # If datalength isn't an even multiple of segment_size,
14720+            # datalength // segment_size counts only the full segments
14721+            # and ignores the trailing partial segment, which still
14722+            # needs a segment of its own. div_ceil rounds up, giving
14723+            # the right number of segments for the data that we're
14724+            # given.
14726             self.num_segments = mathutil.div_ceil(self.datalength,
14727                                                   segment_size)
14728hunk ./src/allmydata/mutable/publish.py 558
14729-            self.starting_segment = mathutil.div_ceil(offset,
14730-                                                      segment_size)
14731-            if offset % segment_size != 0:
14732-                self.starting_segment -= 1
14733-            if offset == 0:
14734-                self.starting_segment = 0
14735+
14736+            self.starting_segment = offset // segment_size
14737 
14738         else:
14739             self.num_segments = 0
14740hunk ./src/allmydata/mutable/publish.py 604
14741         self.end_segment = self.num_segments - 1
14742         # Now figure out where the last segment should be.
14743         if self.data.get_size() != self.datalength:
14744+            # We're updating a few segments in the middle of a mutable
14745+            # file, so we don't want to republish the whole thing.
14746+            # (we don't have enough data to do that even if we wanted
14747+            # to)
14748             end = self.data.get_size()
14749hunk ./src/allmydata/mutable/publish.py 609
14750-            self.end_segment = mathutil.div_ceil(end,
14751-                                                 segment_size)
14752-            self.end_segment -= 1
14753+            self.end_segment = end // segment_size
14754+            if end % segment_size == 0:
14755+                self.end_segment -= 1
14756+
14757         self.log("got start segment %d" % self.starting_segment)
14758         self.log("got end segment %d" % self.end_segment)
14759 
14760}
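
With 0-indexed segments, floor division gives the starting segment directly: byte `offset` is the first byte written, so it lives in segment offset // segsize. The end needs one adjustment, as the publish.py hunk above shows: `end` is one past the last byte written, so a region ending exactly on a boundary must not claim the following segment. A standalone restatement of that arithmetic:

    SEGSIZE = 128 * 1024

    def start_segment(offset):
        return offset // SEGSIZE

    def end_segment(end):
        # `end` is exclusive: the last byte written is end - 1.
        seg = end // SEGSIZE
        if end % SEGSIZE == 0:
            seg -= 1
        return seg

    assert start_segment(SEGSIZE) == 1   # first byte of segment 1
    assert end_segment(SEGSIZE) == 0     # bytes [0, SEGSIZE) end in segment 0
    assert end_segment(SEGSIZE + 1) == 1
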
14761[mutable/layout.py: reorder on-disk format to put variable-length fields at the end of the share, after a predictably long preamble
14762Kevan Carstensen <kevan@isnotajoke.com>**20110501224125
14763 Ignore-this: 8b2c5d29b8984dfe675c1a2ada5205cf
14764] {
14765hunk ./src/allmydata/mutable/layout.py 539
14766                                      self._readvs)
14767 
14768 
14769-MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
14770+MDMFHEADER = ">BQ32sBBQQ QQQQQQQQ"
14771 MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
14772 MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
14773 MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
14774hunk ./src/allmydata/mutable/layout.py 545
14775 MDMFCHECKSTRING = ">BQ32s"
14776 MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
14777-MDMFOFFSETS = ">QQQQQQ"
14778+MDMFOFFSETS = ">QQQQQQQQ"
14779 MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
14780hunk ./src/allmydata/mutable/layout.py 547
14781+# XXX Fix this.
14782+PRIVATE_KEY_SIZE = 2000
14783+SIGNATURE_SIZE = 10000
14784+VERIFICATION_KEY_SIZE = 2000
14785+# We know we won't ever have more than 256 shares.
14786+# XXX: This, too, can be
14787+SHARE_HASH_CHAIN_SIZE = HASH_SIZE * 256
14788 
14789 class MDMFSlotWriteProxy:
14790     implements(IMutableSlotWriter)
14791hunk ./src/allmydata/mutable/layout.py 577
14792     # 51          8           The data length of the original plaintext
14793     #-- end signed part --
14794     # 59          8           The offset of the encrypted private key
14795-    # 83          8           The offset of the signature
14796-    # 91          8           The offset of the verification key
14797-    # 67          8           The offset of the block hash tree
14798-    # 75          8           The offset of the share hash chain
14799-    # 99          8           The offset of the EOF
14800-    #
14801-    # followed by salts and share data, the encrypted private key, the
14802-    # block hash tree, the share hash chain, a signature over the first
14803-    # eight fields, and a verification key.
14804+    # 67          8           The offset of the share hash chain
14805+    # 75          8           The offset of the signature
14806+    # 83          8           The offset of the verification key
14807+    # 91          8           The offset of the end of the v. key.
14808+    # 99          8           The offset of the share data
14809+    # 107         8           The offset of the block hash tree
14810+    # 115         8           The offset of EOF
14811     #
14812hunk ./src/allmydata/mutable/layout.py 585
14813+    # followed by the encrypted private key, share hash chain, signature,
14814+    # verification key, share data, and block hash tree. We order the
14815+    # fields that way to make smart downloaders -- downloaders which
14816+    # preemptively read a big part of the share -- possible.
14817+    #
14818     # The checkstring is the first three fields -- the version number,
14819     # sequence number, and root hash. This is consistent
14820     # in meaning to what we have with SDMF files, except now instead of
14821hunk ./src/allmydata/mutable/layout.py 792
14822         data_size += self._tail_block_size
14823         data_size += SALT_SIZE
14824         self._offsets['enc_privkey'] = MDMFHEADERSIZE
14825-        self._offsets['enc_privkey'] += data_size
14826-        # We'll wait for the rest. Callers can now call my "put_block" and
14827-        # "set_checkstring" methods.
14828+
14829+        # We don't define offsets for these because we want them to be
14830+        # tightly packed -- this allows us to ignore the responsibility
14831+        # of padding individual values, and of removing that padding
14832+        # later. So nonconstant_start is where we start writing
14833+        # nonconstant data.
14834+        nonconstant_start = self._offsets['enc_privkey']
14835+        nonconstant_start += PRIVATE_KEY_SIZE
14836+        nonconstant_start += SIGNATURE_SIZE
14837+        nonconstant_start += VERIFICATION_KEY_SIZE
14838+        nonconstant_start += SHARE_HASH_CHAIN_SIZE
14839+
14840+        self._offsets['share_data'] = nonconstant_start
14841+
14842+        # Finally, we know how big the share data will be, so we can
14843+        # figure out where the block hash tree needs to go.
14844+        # XXX: But this will go away if Zooko wants to make it so that
14845+        # you don't need to know the size of the file before you start
14846+        # uploading it.
14847+        self._offsets['block_hash_tree'] = self._offsets['share_data'] + \
14848+                    data_size
14849+
14850+        # Done. We can now start writing.
14851 
14852 
14853     def set_checkstring(self,
14854hunk ./src/allmydata/mutable/layout.py 891
14855         anything to be written yet.
14856         """
14857         if segnum >= self._num_segments:
14858-            raise LayoutInvalid("I won't overwrite the private key")
14859+            raise LayoutInvalid("I won't overwrite the block hash tree")
14860         if len(salt) != SALT_SIZE:
14861             raise LayoutInvalid("I was given a salt of size %d, but "
14862                                 "I wanted a salt of size %d")
14863hunk ./src/allmydata/mutable/layout.py 902
14864             raise LayoutInvalid("I was given the wrong size block to write")
14865 
14866         # We want to write at the share data offset + (segnum * block_size).
14867-        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
14868+        offset = self._offsets['share_data'] + \
14869+            (self._actual_block_size * segnum)
14870         data = salt + data
14871 
14872         self._writevs.append(tuple([offset, data]))
14873hunk ./src/allmydata/mutable/layout.py 922
14874         # tree, the block hash tree writing method writes the offset of
14875         # the share hash chain. So that's a good indicator of whether or
14876         # not the block hash tree has been written.
14877-        if "share_hash_chain" in self._offsets:
14878-            raise LayoutInvalid("You must write this before the block hash tree")
14879+        if "signature" in self._offsets:
14880+            raise LayoutInvalid("You can't put the encrypted private key "
14881+                                "after putting the share hash chain")
14882+
14883+        self._offsets['share_hash_chain'] = self._offsets['enc_privkey'] + \
14884+                len(encprivkey)
14885 
14886hunk ./src/allmydata/mutable/layout.py 929
14887-        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
14888-            len(encprivkey)
14889         self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
14890 
14891 
14892hunk ./src/allmydata/mutable/layout.py 944
14893         offset of the share hash chain.
14894         """
14895         assert self._offsets
14896+        assert "block_hash_tree" in self._offsets
14897+
14898         assert isinstance(blockhashes, list)
14899hunk ./src/allmydata/mutable/layout.py 947
14900-        if "block_hash_tree" not in self._offsets:
14901-            raise LayoutInvalid("You must put the encrypted private key "
14902-                                "before you put the block hash tree")
14903-        # If written, the share hash chain causes the signature offset
14904-        # to be defined.
14905-        if "signature" in self._offsets:
14906-            raise LayoutInvalid("You must put the block hash tree before "
14907-                                "you put the share hash chain")
14908+
14909         blockhashes_s = "".join(blockhashes)
14910hunk ./src/allmydata/mutable/layout.py 949
14911-        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
14912+        self._offsets['EOF'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
14913 
14914         self._writevs.append(tuple([self._offsets['block_hash_tree'],
14915                                   blockhashes_s]))
14916hunk ./src/allmydata/mutable/layout.py 969
14917         is before you can generate a valid signature.
14918         """
14919         assert isinstance(sharehashes, dict)
14920+        assert self._offsets
14921         if "share_hash_chain" not in self._offsets:
14922hunk ./src/allmydata/mutable/layout.py 971
14923-            raise LayoutInvalid("You need to put the block hash tree before "
14924-                                "you can put the share hash chain")
14925+            raise LayoutInvalid("You must put the encrypted private key "
14926+                                "before putting the share hash chain")
14927+
14928         # The signature comes after the share hash chain. If the
14929         # signature has already been written, we must not write another
14930         # share hash chain. The signature writes the verification key
14931hunk ./src/allmydata/mutable/layout.py 984
14932                                 "before you write the signature")
14933         sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
14934                                   for i in sorted(sharehashes.keys())])
14935-        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
14936+        self._offsets['signature'] = self._offsets['share_hash_chain'] + \
14937+            len(sharehashes_s)
14938         self._writevs.append(tuple([self._offsets['share_hash_chain'],
14939                             sharehashes_s]))
14940 
14941hunk ./src/allmydata/mutable/layout.py 1002
14942         # Signature is defined by the routine that places the share hash
14943         # chain, so it's a good thing to look for in finding out whether
14944         # or not the share hash chain exists on the remote server.
14945-        if "signature" not in self._offsets:
14946-            raise LayoutInvalid("You need to put the share hash chain "
14947-                                "before you can put the root share hash")
14948         if len(roothash) != HASH_SIZE:
14949             raise LayoutInvalid("hashes and salts must be exactly %d bytes"
14950                                  % HASH_SIZE)
14951hunk ./src/allmydata/mutable/layout.py 1053
14952         # If we put the signature after we put the verification key, we
14953         # could end up running into the verification key, and will
14954         # probably screw up the offsets as well. So we don't allow that.
14955+        if "verification_key_end" in self._offsets:
14956+            raise LayoutInvalid("You can't put the signature after the "
14957+                                "verification key")
14958         # The method that writes the verification key defines the
14959         # verification_key_end offset, so that's what we look for.
14960hunk ./src/allmydata/mutable/layout.py 1058
14961-        if "EOF" in self._offsets:
14962-            raise LayoutInvalid("You must write the signature before the verification key")
14963-
14964-        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
14965+        self._offsets['verification_key'] = self._offsets['signature'] +\
14966+            len(signature)
14967         self._writevs.append(tuple([self._offsets['signature'], signature]))
14968 
14969 
14970hunk ./src/allmydata/mutable/layout.py 1074
14971         if "verification_key" not in self._offsets:
14972             raise LayoutInvalid("You must put the signature before you "
14973                                 "can put the verification key")
14974-        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
14975+
14976+        self._offsets['verification_key_end'] = \
14977+            self._offsets['verification_key'] + len(verification_key)
14978         self._writevs.append(tuple([self._offsets['verification_key'],
14979                             verification_key]))
14980 
14981hunk ./src/allmydata/mutable/layout.py 1102
14982         of the write vectors that I've dealt with so far to be published
14983         to the remote server, ending the write process.
14984         """
14985-        if "EOF" not in self._offsets:
14986+        if "verification_key_end" not in self._offsets:
14987             raise LayoutInvalid("You must put the verification key before "
14988                                 "you can publish the offsets")
14989         offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
14990hunk ./src/allmydata/mutable/layout.py 1108
14991         offsets = struct.pack(MDMFOFFSETS,
14992                               self._offsets['enc_privkey'],
14993-                              self._offsets['block_hash_tree'],
14994                               self._offsets['share_hash_chain'],
14995                               self._offsets['signature'],
14996                               self._offsets['verification_key'],
14997hunk ./src/allmydata/mutable/layout.py 1111
14998+                              self._offsets['verification_key_end'],
14999+                              self._offsets['share_data'],
15000+                              self._offsets['block_hash_tree'],
15001                               self._offsets['EOF'])
15002         self._writevs.append(tuple([offsets_offset, offsets]))
15003         encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
15004hunk ./src/allmydata/mutable/layout.py 1227
15005         # MDMF, though we'll be left with 4 more bytes than we
15006         # need if this ends up being MDMF. This is probably less
15007         # expensive than the cost of a second roundtrip.
15008-        readvs = [(0, 107)]
15009+        readvs = [(0, 123)]
15010         d = self._read(readvs, force_remote)
15011         d.addCallback(self._process_encoding_parameters)
15012         d.addCallback(self._process_offsets)
15013hunk ./src/allmydata/mutable/layout.py 1330
15014             read_length = MDMFOFFSETS_LENGTH
15015             end = read_offset + read_length
15016             (encprivkey,
15017-             blockhashes,
15018              sharehashes,
15019              signature,
15020              verification_key,
15021hunk ./src/allmydata/mutable/layout.py 1333
15022+             verification_key_end,
15023+             sharedata,
15024+             blockhashes,
15025              eof) = struct.unpack(MDMFOFFSETS,
15026                                   offsets[read_offset:end])
15027             self._offsets = {}
15028hunk ./src/allmydata/mutable/layout.py 1344
15029             self._offsets['share_hash_chain'] = sharehashes
15030             self._offsets['signature'] = signature
15031             self._offsets['verification_key'] = verification_key
15032+            self._offsets['verification_key_end']= \
15033+                verification_key_end
15034             self._offsets['EOF'] = eof
15035hunk ./src/allmydata/mutable/layout.py 1347
15036+            self._offsets['share_data'] = sharedata
15037 
15038 
15039     def get_block_and_salt(self, segnum, queue=False):
15040hunk ./src/allmydata/mutable/layout.py 1357
15041         """
15042         d = self._maybe_fetch_offsets_and_header()
15043         def _then(ignored):
15044-            if self._version_number == 1:
15045-                base_share_offset = MDMFHEADERSIZE
15046-            else:
15047-                base_share_offset = self._offsets['share_data']
15048+            base_share_offset = self._offsets['share_data']
15049 
15050             if segnum + 1 > self._num_segments:
15051                 raise LayoutInvalid("Not a valid segment number")
15052hunk ./src/allmydata/mutable/layout.py 1430
15053         def _then(ignored):
15054             blockhashes_offset = self._offsets['block_hash_tree']
15055             if self._version_number == 1:
15056-                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
15057+                blockhashes_length = self._offsets['EOF'] - blockhashes_offset
15058             else:
15059                 blockhashes_length = self._offsets['share_data'] - blockhashes_offset
15060             readvs = [(blockhashes_offset, blockhashes_length)]
15061hunk ./src/allmydata/mutable/layout.py 1501
15062             if self._version_number == 0:
15063                 privkey_length = self._offsets['EOF'] - privkey_offset
15064             else:
15065-                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
15066+                privkey_length = self._offsets['share_hash_chain'] - privkey_offset
15067             readvs = [(privkey_offset, privkey_length)]
15068             return readvs
15069         d.addCallback(_make_readvs)
15070hunk ./src/allmydata/mutable/layout.py 1549
15071         def _make_readvs(ignored):
15072             if self._version_number == 1:
15073                 vk_offset = self._offsets['verification_key']
15074-                vk_length = self._offsets['EOF'] - vk_offset
15075+                vk_length = self._offsets['verification_key_end'] - vk_offset
15076             else:
15077                 vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
15078                 vk_length = self._offsets['signature'] - vk_offset
15079hunk ./src/allmydata/test/test_storage.py 26
15080 from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
15081                                      LayoutInvalid, MDMFSIGNABLEHEADER, \
15082                                      SIGNED_PREFIX, MDMFHEADER, \
15083-                                     MDMFOFFSETS, SDMFSlotWriteProxy
15084+                                     MDMFOFFSETS, SDMFSlotWriteProxy, \
15085+                                     PRIVATE_KEY_SIZE, \
15086+                                     SIGNATURE_SIZE, \
15087+                                     VERIFICATION_KEY_SIZE, \
15088+                                     SHARE_HASH_CHAIN_SIZE
15089 from allmydata.interfaces import BadWriteEnablerError
15090 from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
15091 from allmydata.test.common_web import WebRenderingMixin
15092hunk ./src/allmydata/test/test_storage.py 1408
15093 
15094         # The encrypted private key comes after the shares + salts
15095         offset_size = struct.calcsize(MDMFOFFSETS)
15096-        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
15097-        # The blockhashes come after the private key
15098-        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
15099-        # The sharehashes come after the salt hashes
15100-        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
15101-        # The signature comes after the share hash chain
15102+        encrypted_private_key_offset = len(data) + offset_size
15103+        # The share hash chain comes after the private key
15104+        sharehashes_offset = encrypted_private_key_offset + \
15105+            len(self.encprivkey)
15106+
15107+        # The signature comes after the share hash chain.
15108         signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
15109hunk ./src/allmydata/test/test_storage.py 1415
15110-        # The verification key comes after the signature
15111-        verification_offset = signature_offset + len(self.signature)
15112-        # The EOF comes after the verification key
15113-        eof_offset = verification_offset + len(self.verification_key)
15114+
15115+        verification_key_offset = signature_offset + len(self.signature)
15116+        verification_key_end = verification_key_offset + \
15117+            len(self.verification_key)
15118+
15119+        share_data_offset = offset_size
15120+        share_data_offset += PRIVATE_KEY_SIZE
15121+        share_data_offset += SIGNATURE_SIZE
15122+        share_data_offset += VERIFICATION_KEY_SIZE
15123+        share_data_offset += SHARE_HASH_CHAIN_SIZE
15124+
15125+        blockhashes_offset = share_data_offset + len(sharedata)
15126+        eof_offset = blockhashes_offset + len(self.block_hash_tree_s)
15127+
15128         data += struct.pack(MDMFOFFSETS,
15129                             encrypted_private_key_offset,
15130hunk ./src/allmydata/test/test_storage.py 1431
15131-                            blockhashes_offset,
15132                             sharehashes_offset,
15133                             signature_offset,
15134hunk ./src/allmydata/test/test_storage.py 1433
15135-                            verification_offset,
15136+                            verification_key_offset,
15137+                            verification_key_end,
15138+                            share_data_offset,
15139+                            blockhashes_offset,
15140                             eof_offset)
15141hunk ./src/allmydata/test/test_storage.py 1438
15142+
15143         self.offsets = {}
15144         self.offsets['enc_privkey'] = encrypted_private_key_offset
15145         self.offsets['block_hash_tree'] = blockhashes_offset
15146hunk ./src/allmydata/test/test_storage.py 1444
15147         self.offsets['share_hash_chain'] = sharehashes_offset
15148         self.offsets['signature'] = signature_offset
15149-        self.offsets['verification_key'] = verification_offset
15150+        self.offsets['verification_key'] = verification_key_offset
15151+        self.offsets['share_data'] = share_data_offset
15152+        self.offsets['verification_key_end'] = verification_key_end
15153         self.offsets['EOF'] = eof_offset
15154hunk ./src/allmydata/test/test_storage.py 1448
15155-        # Next, we'll add in the salts and share data,
15156-        data += sharedata
15157+
15158         # the private key,
15159         data += self.encprivkey
15160hunk ./src/allmydata/test/test_storage.py 1451
15161-        # the block hash tree,
15162-        data += self.block_hash_tree_s
15163-        # the share hash chain,
15164+        # the sharehashes
15165         data += self.share_hash_chain_s
15166         # the signature,
15167         data += self.signature
15168hunk ./src/allmydata/test/test_storage.py 1457
15169         # and the verification key
15170         data += self.verification_key
15171+        # Then we'll pad with spaces until we reach the share data offset.
15172+        nulls = " " * (share_data_offset - len(data))
15173+        data += nulls
15174+
15175+        # Then the share data
15176+        data += sharedata
15177+        # the blockhashes
15178+        data += self.block_hash_tree_s
15179         return data
15180 
15181 
15182hunk ./src/allmydata/test/test_storage.py 1729
15183         return d
15184 
15185 
15186-    def test_blockhashes_after_share_hash_chain(self):
15187+    def test_private_key_after_share_hash_chain(self):
15188         mw = self._make_new_mw("si1", 0)
15189         d = defer.succeed(None)
15190hunk ./src/allmydata/test/test_storage.py 1732
15191-        # Put everything up to and including the share hash chain
15192         for i in xrange(6):
15193             d.addCallback(lambda ignored, i=i:
15194                 mw.put_block(self.block, i, self.salt))
15195hunk ./src/allmydata/test/test_storage.py 1738
15196         d.addCallback(lambda ignored:
15197             mw.put_encprivkey(self.encprivkey))
15198         d.addCallback(lambda ignored:
15199-            mw.put_blockhashes(self.block_hash_tree))
15200-        d.addCallback(lambda ignored:
15201             mw.put_sharehashes(self.share_hash_chain))
15202 
15203hunk ./src/allmydata/test/test_storage.py 1740
15204-        # Now try to put the block hash tree again.
15205+        # Now try to put the private key again.
15206         d.addCallback(lambda ignored:
15207hunk ./src/allmydata/test/test_storage.py 1742
15208-            self.shouldFail(LayoutInvalid, "test repeat salthashes",
15209-                            None,
15210-                            mw.put_blockhashes, self.block_hash_tree))
15211-        return d
15212-
15213-
15214-    def test_encprivkey_after_blockhashes(self):
15215-        mw = self._make_new_mw("si1", 0)
15216-        d = defer.succeed(None)
15217-        # Put everything up to and including the block hash tree
15218-        for i in xrange(6):
15219-            d.addCallback(lambda ignored, i=i:
15220-                mw.put_block(self.block, i, self.salt))
15221-        d.addCallback(lambda ignored:
15222-            mw.put_encprivkey(self.encprivkey))
15223-        d.addCallback(lambda ignored:
15224-            mw.put_blockhashes(self.block_hash_tree))
15225-        d.addCallback(lambda ignored:
15226-            self.shouldFail(LayoutInvalid, "out of order private key",
15227+            self.shouldFail(LayoutInvalid, "test repeat private key",
15228                             None,
15229                             mw.put_encprivkey, self.encprivkey))
15230         return d
15231hunk ./src/allmydata/test/test_storage.py 1748
15232 
15233 
15234-    def test_share_hash_chain_after_signature(self):
15235-        mw = self._make_new_mw("si1", 0)
15236-        d = defer.succeed(None)
15237-        # Put everything up to and including the signature
15238-        for i in xrange(6):
15239-            d.addCallback(lambda ignored, i=i:
15240-                mw.put_block(self.block, i, self.salt))
15241-        d.addCallback(lambda ignored:
15242-            mw.put_encprivkey(self.encprivkey))
15243-        d.addCallback(lambda ignored:
15244-            mw.put_blockhashes(self.block_hash_tree))
15245-        d.addCallback(lambda ignored:
15246-            mw.put_sharehashes(self.share_hash_chain))
15247-        d.addCallback(lambda ignored:
15248-            mw.put_root_hash(self.root_hash))
15249-        d.addCallback(lambda ignored:
15250-            mw.put_signature(self.signature))
15251-        # Now try to put the share hash chain again. This should fail
15252-        d.addCallback(lambda ignored:
15253-            self.shouldFail(LayoutInvalid, "out of order share hash chain",
15254-                            None,
15255-                            mw.put_sharehashes, self.share_hash_chain))
15256-        return d
15257-
15258-
15259     def test_signature_after_verification_key(self):
15260         mw = self._make_new_mw("si1", 0)
15261         d = defer.succeed(None)
15262hunk ./src/allmydata/test/test_storage.py 1877
15263         mw = self._make_new_mw("si1", 0)
15264         # Test writing some blocks.
15265         read = self.ss.remote_slot_readv
15266-        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
15267+        expected_private_key_offset = struct.calcsize(MDMFHEADER)
15268+        expected_sharedata_offset = struct.calcsize(MDMFHEADER) + \
15269+                                    PRIVATE_KEY_SIZE + \
15270+                                    SIGNATURE_SIZE + \
15271+                                    VERIFICATION_KEY_SIZE + \
15272+                                    SHARE_HASH_CHAIN_SIZE
15273         written_block_size = 2 + len(self.salt)
15274         written_block = self.block + self.salt
15275         for i in xrange(6):
15276hunk ./src/allmydata/test/test_storage.py 1903
15277                 self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
15278                                 {0: [written_block]})
15279 
15280-            expected_private_key_offset = expected_sharedata_offset + \
15281-                                      len(written_block) * 6
15282             self.failUnlessEqual(len(self.encprivkey), 7)
15283             self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
15284                                  {0: [self.encprivkey]})
15285hunk ./src/allmydata/test/test_storage.py 1907
15286 
15287-            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
15288+            expected_block_hash_offset = expected_sharedata_offset + \
15289+                        (6 * written_block_size)
15290             self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
15291             self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
15292                                  {0: [self.block_hash_tree_s]})
15293hunk ./src/allmydata/test/test_storage.py 1913
15294 
15295-            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
15296+            expected_share_hash_offset = expected_private_key_offset + len(self.encprivkey)
15297             self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
15298                                  {0: [self.share_hash_chain_s]})
15299 
15300hunk ./src/allmydata/test/test_storage.py 1919
15301             self.failUnlessEqual(read("si1", [0], [(9, 32)]),
15302                                  {0: [self.root_hash]})
15303-            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
15304+            expected_signature_offset = expected_share_hash_offset + \
15305+                len(self.share_hash_chain_s)
15306             self.failUnlessEqual(len(self.signature), 9)
15307             self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
15308                                  {0: [self.signature]})
15309hunk ./src/allmydata/test/test_storage.py 1941
15310             self.failUnlessEqual(n, 10)
15311             self.failUnlessEqual(segsize, 6)
15312             self.failUnlessEqual(datalen, 36)
15313-            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
15314+            expected_eof_offset = expected_block_hash_offset + \
15315+                len(self.block_hash_tree_s)
15316 
15317             # Check the version number to make sure that it is correct.
15318             expected_version_number = struct.pack(">B", 1)
15319hunk ./src/allmydata/test/test_storage.py 1969
15320             expected_offset = struct.pack(">Q", expected_private_key_offset)
15321             self.failUnlessEqual(read("si1", [0], [(59, 8)]),
15322                                  {0: [expected_offset]})
15323-            expected_offset = struct.pack(">Q", expected_block_hash_offset)
15324+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
15325             self.failUnlessEqual(read("si1", [0], [(67, 8)]),
15326                                  {0: [expected_offset]})
15327hunk ./src/allmydata/test/test_storage.py 1972
15328-            expected_offset = struct.pack(">Q", expected_share_hash_offset)
15329+            expected_offset = struct.pack(">Q", expected_signature_offset)
15330             self.failUnlessEqual(read("si1", [0], [(75, 8)]),
15331                                  {0: [expected_offset]})
15332hunk ./src/allmydata/test/test_storage.py 1975
15333-            expected_offset = struct.pack(">Q", expected_signature_offset)
15334+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
15335             self.failUnlessEqual(read("si1", [0], [(83, 8)]),
15336                                  {0: [expected_offset]})
15337hunk ./src/allmydata/test/test_storage.py 1978
15338-            expected_offset = struct.pack(">Q", expected_verification_key_offset)
15339+            expected_offset = struct.pack(">Q", expected_verification_key_offset + len(self.verification_key))
15340             self.failUnlessEqual(read("si1", [0], [(91, 8)]),
15341                                  {0: [expected_offset]})
15342hunk ./src/allmydata/test/test_storage.py 1981
15343-            expected_offset = struct.pack(">Q", expected_eof_offset)
15344+            expected_offset = struct.pack(">Q", expected_sharedata_offset)
15345             self.failUnlessEqual(read("si1", [0], [(99, 8)]),
15346                                  {0: [expected_offset]})
15347hunk ./src/allmydata/test/test_storage.py 1984
15348+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
15349+            self.failUnlessEqual(read("si1", [0], [(107, 8)]),
15350+                                 {0: [expected_offset]})
15351+            expected_offset = struct.pack(">Q", expected_eof_offset)
15352+            self.failUnlessEqual(read("si1", [0], [(115, 8)]),
15353+                                 {0: [expected_offset]})
15354         d.addCallback(_check_publish)
15355         return d
15356 
15357hunk ./src/allmydata/test/test_storage.py 2117
15358         for i in xrange(6):
15359             d.addCallback(lambda ignored, i=i:
15360                 mw0.put_block(self.block, i, self.salt))
15361-        # Try to write the block hashes before writing the encrypted
15362-        # private key
15363-        d.addCallback(lambda ignored:
15364-            self.shouldFail(LayoutInvalid, "block hashes before key",
15365-                            None, mw0.put_blockhashes,
15366-                            self.block_hash_tree))
15367-
15368-        # Write the private key.
15369-        d.addCallback(lambda ignored:
15370-            mw0.put_encprivkey(self.encprivkey))
15371-
15372 
15373hunk ./src/allmydata/test/test_storage.py 2118
15374-        # Try to write the share hash chain without writing the block
15375-        # hash tree
15376+        # Try to write the share hash chain without writing the
15377+        # encrypted private key
15378         d.addCallback(lambda ignored:
15379             self.shouldFail(LayoutInvalid, "share hash chain before "
15380hunk ./src/allmydata/test/test_storage.py 2122
15381-                                           "salt hash tree",
15382+                                           "private key",
15383                             None,
15384                             mw0.put_sharehashes, self.share_hash_chain))
15385hunk ./src/allmydata/test/test_storage.py 2125
15386-
15387-        # Try to write the root hash and without writing either the
15388-        # block hashes or the or the share hashes
15389+        # Write the private key.
15390         d.addCallback(lambda ignored:
15391hunk ./src/allmydata/test/test_storage.py 2127
15392-            self.shouldFail(LayoutInvalid, "root hash before share hashes",
15393-                            None,
15394-                            mw0.put_root_hash, self.root_hash))
15395+            mw0.put_encprivkey(self.encprivkey))
15396 
14397         # Now write the block hashes.
15398         d.addCallback(lambda ignored:
15399hunk ./src/allmydata/test/test_storage.py 2133
15400             mw0.put_blockhashes(self.block_hash_tree))
15401 
15402-        d.addCallback(lambda ignored:
15403-            self.shouldFail(LayoutInvalid, "root hash before share hashes",
15404-                            None, mw0.put_root_hash, self.root_hash))
15405-
15406         # We haven't yet put the root hash on the share, so we shouldn't
15407         # be able to sign it.
15408         d.addCallback(lambda ignored:
15409hunk ./src/allmydata/test/test_storage.py 2378
15410         # This should be enough to fill in both the encoding parameters
15411         # and the table of offsets, which will complete the version
15412         # information tuple.
15413-        d.addCallback(_make_mr, 107)
15414+        d.addCallback(_make_mr, 123)
15415         d.addCallback(lambda mr:
15416             mr.get_verinfo())
15417         def _check_verinfo(verinfo):
15418hunk ./src/allmydata/test/test_storage.py 2412
15419         d.addCallback(_check_verinfo)
15420         # This is not enough data to read a block and a share, so the
15421         # wrapper should attempt to read this from the remote server.
15422-        d.addCallback(_make_mr, 107)
15423+        d.addCallback(_make_mr, 123)
15424         d.addCallback(lambda mr:
15425             mr.get_block_and_salt(0))
15426         def _check_block_and_salt((block, salt)):
15427hunk ./src/allmydata/test/test_storage.py 2420
15428             self.failUnlessEqual(salt, self.salt)
15429             self.failUnlessEqual(self.rref.read_count, 1)
15430         # This should be enough data to read one block.
15431-        d.addCallback(_make_mr, 249)
15432+        d.addCallback(_make_mr, 123 + PRIVATE_KEY_SIZE + SIGNATURE_SIZE + VERIFICATION_KEY_SIZE + SHARE_HASH_CHAIN_SIZE + 140)
15433         d.addCallback(lambda mr:
15434             mr.get_block_and_salt(0))
15435         d.addCallback(_check_block_and_salt)
15436hunk ./src/allmydata/test/test_storage.py 2438
15437         # This should be enough to get us the encoding parameters,
15438         # offset table, and everything else we need to build a verinfo
15439         # string.
15440-        d.addCallback(_make_mr, 107)
15441+        d.addCallback(_make_mr, 123)
15442         d.addCallback(lambda mr:
15443             mr.get_verinfo())
15444         def _check_verinfo(verinfo):
15445hunk ./src/allmydata/test/test_storage.py 2473
15446             self.failUnlessEqual(self.rref.read_count, 0)
15447         d.addCallback(_check_verinfo)
15448         # This shouldn't be enough to read any share data.
15449-        d.addCallback(_make_mr, 107)
15450+        d.addCallback(_make_mr, 123)
15451         d.addCallback(lambda mr:
15452             mr.get_block_and_salt(0))
15453         def _check_block_and_salt((block, salt)):
15454}
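
The offset bookkeeping these tests poke at rests on one convention: the offset-table entries exercised above, including the end-of-file offset checked at the top of this hunk, are stored as big-endian 64-bit integers. A minimal sketch of that convention (the offset value is illustrative):

    import struct

    expected_eof_offset = 2 ** 20                 # illustrative value
    packed = struct.pack(">Q", expected_eof_offset)
    assert len(packed) == 8                       # one 8-byte table entry
    assert struct.unpack(">Q", packed)[0] == expected_eof_offset
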
15455[uri.py: Add MDMF cap
15456Kevan Carstensen <kevan@isnotajoke.com>**20110501224249
15457 Ignore-this: a6d1046d33f5cc811c5e8b10af925f33
15458] {
15459hunk ./src/allmydata/interfaces.py 546
15460 
15461 class IMutableFileURI(Interface):
15462     """I am a URI which represents a mutable filenode."""
15463+    def get_extension_params():
15464+        """Return the extension parameters in the URI"""
15465 
15466 class IDirectoryURI(Interface):
15467     pass
15468hunk ./src/allmydata/test/test_uri.py 2
15469 
15470+import re
15471 from twisted.trial import unittest
15472 from allmydata import uri
15473 from allmydata.util import hashutil, base32
15474hunk ./src/allmydata/test/test_uri.py 259
15475         uri.CHKFileURI.init_from_string(fileURI)
15476 
15477 class Mutable(testutil.ReallyEqualMixin, unittest.TestCase):
15478-    def test_pack(self):
15479-        writekey = "\x01" * 16
15480-        fingerprint = "\x02" * 32
15481+    def setUp(self):
15482+        self.writekey = "\x01" * 16
15483+        self.fingerprint = "\x02" * 32
15484+        self.readkey = hashutil.ssk_readkey_hash(self.writekey)
15485+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15486 
15487hunk ./src/allmydata/test/test_uri.py 265
15488-        u = uri.WriteableSSKFileURI(writekey, fingerprint)
15489-        self.failUnlessReallyEqual(u.writekey, writekey)
15490-        self.failUnlessReallyEqual(u.fingerprint, fingerprint)
15491+    def test_pack(self):
15492+        u = uri.WriteableSSKFileURI(self.writekey, self.fingerprint)
15493+        self.failUnlessReallyEqual(u.writekey, self.writekey)
15494+        self.failUnlessReallyEqual(u.fingerprint, self.fingerprint)
15495         self.failIf(u.is_readonly())
15496         self.failUnless(u.is_mutable())
15497         self.failUnless(IURI.providedBy(u))
15498hunk ./src/allmydata/test/test_uri.py 281
15499         self.failUnlessReallyEqual(u, u_h)
15500 
15501         u2 = uri.from_string(u.to_string())
15502-        self.failUnlessReallyEqual(u2.writekey, writekey)
15503-        self.failUnlessReallyEqual(u2.fingerprint, fingerprint)
15504+        self.failUnlessReallyEqual(u2.writekey, self.writekey)
15505+        self.failUnlessReallyEqual(u2.fingerprint, self.fingerprint)
15506         self.failIf(u2.is_readonly())
15507         self.failUnless(u2.is_mutable())
15508         self.failUnless(IURI.providedBy(u2))
15509hunk ./src/allmydata/test/test_uri.py 297
15510         self.failUnless(isinstance(u2imm, uri.UnknownURI), u2imm)
15511 
15512         u3 = u2.get_readonly()
15513-        readkey = hashutil.ssk_readkey_hash(writekey)
15514-        self.failUnlessReallyEqual(u3.fingerprint, fingerprint)
15515+        readkey = hashutil.ssk_readkey_hash(self.writekey)
15516+        self.failUnlessReallyEqual(u3.fingerprint, self.fingerprint)
15517         self.failUnlessReallyEqual(u3.readkey, readkey)
15518         self.failUnless(u3.is_readonly())
15519         self.failUnless(u3.is_mutable())
15520hunk ./src/allmydata/test/test_uri.py 317
15521         u3_h = uri.ReadonlySSKFileURI.init_from_human_encoding(he)
15522         self.failUnlessReallyEqual(u3, u3_h)
15523 
15524-        u4 = uri.ReadonlySSKFileURI(readkey, fingerprint)
15525-        self.failUnlessReallyEqual(u4.fingerprint, fingerprint)
15526+        u4 = uri.ReadonlySSKFileURI(readkey, self.fingerprint)
15527+        self.failUnlessReallyEqual(u4.fingerprint, self.fingerprint)
15528         self.failUnlessReallyEqual(u4.readkey, readkey)
15529         self.failUnless(u4.is_readonly())
15530         self.failUnless(u4.is_mutable())
15531hunk ./src/allmydata/test/test_uri.py 350
15532         self.failUnlessReallyEqual(u5, u5_h)
15533 
15534 
15535+    def test_writable_mdmf_cap(self):
15536+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15537+        cap = u1.to_string()
15538+        u = uri.WritableMDMFFileURI.init_from_string(cap)
15539+
15540+        self.failUnless(IMutableFileURI.providedBy(u))
15541+        self.failUnlessReallyEqual(u.fingerprint, self.fingerprint)
15542+        self.failUnlessReallyEqual(u.writekey, self.writekey)
15543+        self.failUnless(u.is_mutable())
15544+        self.failIf(u.is_readonly())
15545+        self.failUnlessEqual(cap, u.to_string())
15546+
15547+        # Now get a readonly cap from the writable cap, and test that it
15548+        # degrades gracefully.
15549+        ru = u.get_readonly()
15550+        self.failUnlessReallyEqual(self.readkey, ru.readkey)
15551+        self.failUnlessReallyEqual(self.fingerprint, ru.fingerprint)
15552+        self.failUnless(ru.is_mutable())
15553+        self.failUnless(ru.is_readonly())
15554+
15555+        # Now get a verifier cap.
15556+        vu = ru.get_verify_cap()
15557+        self.failUnlessReallyEqual(self.storage_index, vu.storage_index)
15558+        self.failUnlessReallyEqual(self.fingerprint, vu.fingerprint)
15559+        self.failUnless(IVerifierURI.providedBy(vu))
15560+
15561+    def test_readonly_mdmf_cap(self):
15562+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15563+        cap = u1.to_string()
15564+        u2 = uri.ReadonlyMDMFFileURI.init_from_string(cap)
15565+
15566+        self.failUnlessReallyEqual(u2.fingerprint, self.fingerprint)
15567+        self.failUnlessReallyEqual(u2.readkey, self.readkey)
15568+        self.failUnless(u2.is_readonly())
15569+        self.failUnless(u2.is_mutable())
15570+
15571+        vu = u2.get_verify_cap()
15572+        self.failUnlessEqual(vu.storage_index, self.storage_index)
15573+        self.failUnlessEqual(vu.fingerprint, self.fingerprint)
15574+
15575+    def test_create_writable_mdmf_cap_from_readcap(self):
15576+        # we shouldn't be able to create a writable MDMF cap given only a
15577+        # readcap.
15578+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15579+        cap = u1.to_string()
15580+        self.failUnlessRaises(uri.BadURIError,
15581+                              uri.WritableMDMFFileURI.init_from_string,
15582+                              cap)
15583+
15584+    def test_create_writable_mdmf_cap_from_verifycap(self):
15585+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15586+        cap = u1.to_string()
15587+        self.failUnlessRaises(uri.BadURIError,
15588+                              uri.WritableMDMFFileURI.init_from_string,
15589+                              cap)
15590+
15591+    def test_create_readonly_mdmf_cap_from_verifycap(self):
15592+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15593+        cap = u1.to_string()
15594+        self.failUnlessRaises(uri.BadURIError,
15595+                              uri.ReadonlyMDMFFileURI.init_from_string,
15596+                              cap)
15597+
15598+    def test_mdmf_verifier_cap(self):
15599+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15600+        self.failUnless(u1.is_readonly())
15601+        self.failIf(u1.is_mutable())
15602+        self.failUnlessReallyEqual(self.storage_index, u1.storage_index)
15603+        self.failUnlessReallyEqual(self.fingerprint, u1.fingerprint)
15604+
15605+        cap = u1.to_string()
15606+        u2 = uri.MDMFVerifierURI.init_from_string(cap)
15607+        self.failUnless(u2.is_readonly())
15608+        self.failIf(u2.is_mutable())
15609+        self.failUnlessReallyEqual(self.storage_index, u2.storage_index)
15610+        self.failUnlessReallyEqual(self.fingerprint, u2.fingerprint)
15611+
15612+        u3 = u2.get_readonly()
15613+        self.failUnlessReallyEqual(u3, u2)
15614+
15615+        u4 = u2.get_verify_cap()
15616+        self.failUnlessReallyEqual(u4, u2)
15617+
15618+    def test_mdmf_cap_extra_information(self):
15619+        # MDMF caps can be arbitrarily extended after the fingerprint
15620+        # and key/storage index fields.
15621+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15622+        self.failUnlessEqual([], u1.get_extension_params())
15623+
15624+        cap = u1.to_string()
15625+        # Now let's append some fields. Say, 131073 (the segment size)
15626+        # and 3 (the "k" encoding parameter).
15627+        expected_extensions = []
15628+        for e in ('131073', '3'):
15629+            cap += (":%s" % e)
15630+            expected_extensions.append(e)
15631+
15632+            u2 = uri.WritableMDMFFileURI.init_from_string(cap)
15633+            self.failUnlessReallyEqual(self.writekey, u2.writekey)
15634+            self.failUnlessReallyEqual(self.fingerprint, u2.fingerprint)
15635+            self.failIf(u2.is_readonly())
15636+            self.failUnless(u2.is_mutable())
15637+
15638+            c2 = u2.to_string()
15639+            u2n = uri.WritableMDMFFileURI.init_from_string(c2)
15640+            self.failUnlessReallyEqual(u2, u2n)
15641+
15642+            # We should get the extra back when we ask for it.
15643+            self.failUnlessEqual(expected_extensions, u2.get_extension_params())
15644+
15645+            # These should be preserved through cap attenuation, too.
15646+            u3 = u2.get_readonly()
15647+            self.failUnlessReallyEqual(self.readkey, u3.readkey)
15648+            self.failUnlessReallyEqual(self.fingerprint, u3.fingerprint)
15649+            self.failUnless(u3.is_readonly())
15650+            self.failUnless(u3.is_mutable())
15651+            self.failUnlessEqual(expected_extensions, u3.get_extension_params())
15652+
15653+            c3 = u3.to_string()
15654+            u3n = uri.ReadonlyMDMFFileURI.init_from_string(c3)
15655+            self.failUnlessReallyEqual(u3, u3n)
15656+
15657+            u4 = u3.get_verify_cap()
15658+            self.failUnlessReallyEqual(self.storage_index, u4.storage_index)
15659+            self.failUnlessReallyEqual(self.fingerprint, u4.fingerprint)
15660+            self.failUnless(u4.is_readonly())
15661+            self.failIf(u4.is_mutable())
15662+
15663+            c4 = u4.to_string()
15664+            u4n = uri.MDMFVerifierURI.init_from_string(c4)
15665+            self.failUnlessReallyEqual(u4n, u4)
15666+
15667+            self.failUnlessEqual(expected_extensions, u4.get_extension_params())
15668+
15669+
15670+    def test_sdmf_cap_extra_information(self):
15671+        # For interface consistency, we define a method to get
15672+        # extensions for SDMF files as well. This method must always
15673+        # return no extensions, since SDMF files were not created with
15674+        # extensions and cannot be modified to include extensions
15675+        # without breaking older clients.
15676+        u1 = uri.WriteableSSKFileURI(self.writekey, self.fingerprint)
15677+        cap = u1.to_string()
15678+        u2 = uri.WriteableSSKFileURI.init_from_string(cap)
15679+        self.failUnlessEqual([], u2.get_extension_params())
15680+
15681+    def test_extension_character_range(self):
15682+        # As written now, we shouldn't put things other than numbers in
15683+        # the extension fields.
15684+        writecap = uri.WritableMDMFFileURI(self.writekey, self.fingerprint).to_string()
15685+        readcap  = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint).to_string()
15686+        vcap     = uri.MDMFVerifierURI(self.storage_index, self.fingerprint).to_string()
15687+        self.failUnlessRaises(uri.BadURIError,
15688+                              uri.WritableMDMFFileURI.init_from_string,
15689+                              ("%s:invalid" % writecap))
15690+        self.failUnlessRaises(uri.BadURIError,
15691+                              uri.ReadonlyMDMFFileURI.init_from_string,
15692+                              ("%s:invalid" % readcap))
15693+        self.failUnlessRaises(uri.BadURIError,
15694+                              uri.MDMFVerifierURI.init_from_string,
15695+                              ("%s:invalid" % vcap))
15696+
15697+
15698+    def test_mdmf_valid_human_encoding(self):
15699+        # What's a human encoding? Well, it's of the form:
15700+        base = "https://127.0.0.1:3456/uri/"
15701+        # With a cap on the end. For each of the cap types, we need to
15702+        # test that a valid cap (with and without the traditional
15703+        # separators) is recognized and accepted by the classes.
15704+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15705+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15706+                                     ['131073', '3'])
15707+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15708+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15709+                                     ['131073', '3'])
15710+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15711+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15712+                                 ['131073', '3'])
15713+
15714+        # These will yield six different caps.
15715+        for o in (w1, w2, r1, r2, v1, v2):
15716+            url = base + o.to_string()
15717+            o1 = o.__class__.init_from_human_encoding(url)
15718+            self.failUnlessReallyEqual(o1, o)
15719+
15720+            # Note that our cap will, by default, have : as separators.
15721+            # But it's expected that users from, e.g., the WUI, will
15722+            # have %3A as a separator. We need to make sure that the
15723+            # initialization routine handles that, too.
15724+            cap = o.to_string()
15725+            cap = re.sub(":", "%3A", cap)
15726+            url = base + cap
15727+            o2 = o.__class__.init_from_human_encoding(url)
15728+            self.failUnlessReallyEqual(o2, o)
15729+
15730+
15731+    def test_mdmf_human_encoding_invalid_base(self):
15732+        # What's a human encoding? Well, it's of the form:
15733+        base = "https://127.0.0.1:3456/foo/bar/bazuri/"
15734+        # With a cap on the end. For each of the cap types, we need to
15735+        # test that a cap behind an unrecognized base URL is rejected by
15736+        # the classes.
15737+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15738+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15739+                                     ['131073', '3'])
15740+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15741+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15742+                                     ['131073', '3'])
15743+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15744+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15745+                                 ['131073', '3'])
15746+
15747+        # These will yield six different caps.
15748+        for o in (w1, w2, r1, r2, v1, v2):
15749+            url = base + o.to_string()
15750+            self.failUnlessRaises(uri.BadURIError,
15751+                                  o.__class__.init_from_human_encoding,
15752+                                  url)
15753+
15754+    def test_mdmf_human_encoding_invalid_cap(self):
15755+        base = "https://127.0.0.1:3456/uri/"
15756+        # With a cap on the end. For each of the cap types, we need to
15757+        # test that a corrupted or padded cap is rejected by the
15758+        # classes.
15759+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15760+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15761+                                     ['131073', '3'])
15762+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15763+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15764+                                     ['131073', '3'])
15765+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15766+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15767+                                 ['131073', '3'])
15768+
15769+        # These will yield six different caps.
15770+        for o in (w1, w2, r1, r2, v1, v2):
15771+            # not exhaustive, obviously...
15772+            url = base + o.to_string() + "foobarbaz"
15773+            url2 = base + "foobarbaz" + o.to_string()
15774+            url3 = base + o.to_string()[:25] + "foo" + o.to_string()[25:]
15775+            for u in (url, url2, url3):
15776+                self.failUnlessRaises(uri.BadURIError,
15777+                                      o.__class__.init_from_human_encoding,
15778+                                      u)
15779+
15780+    def test_mdmf_from_string(self):
15781+        # Make sure that the from_string utility function works with
15782+        # MDMF caps.
15783+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15784+        cap = u1.to_string()
15785+        self.failUnless(uri.is_uri(cap))
15786+        u2 = uri.from_string(cap)
15787+        self.failUnlessReallyEqual(u1, u2)
15788+        u3 = uri.from_string_mutable_filenode(cap)
15789+        self.failUnlessEqual(u3, u1)
15790+
15791+        # XXX: We should refactor the extension field into setUp
15792+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15793+                                     ['131073', '3'])
15794+        cap = u1.to_string()
15795+        self.failUnless(uri.is_uri(cap))
15796+        u2 = uri.from_string(cap)
15797+        self.failUnlessReallyEqual(u1, u2)
15798+        u3 = uri.from_string_mutable_filenode(cap)
15799+        self.failUnlessEqual(u3, u1)
15800+
15801+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15802+        cap = u1.to_string()
15803+        self.failUnless(uri.is_uri(cap))
15804+        u2 = uri.from_string(cap)
15805+        self.failUnlessReallyEqual(u1, u2)
15806+        u3 = uri.from_string_mutable_filenode(cap)
15807+        self.failUnlessEqual(u3, u1)
15808+
15809+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15810+                                     ['131073', '3'])
15811+        cap = u1.to_string()
15812+        self.failUnless(uri.is_uri(cap))
15813+        u2 = uri.from_string(cap)
15814+        self.failUnlessReallyEqual(u1, u2)
15815+        u3 = uri.from_string_mutable_filenode(cap)
15816+        self.failUnlessEqual(u3, u1)
15817+
15818+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15819+        cap = u1.to_string()
15820+        self.failUnless(uri.is_uri(cap))
15821+        u2 = uri.from_string(cap)
15822+        self.failUnlessReallyEqual(u1, u2)
15823+        u3 = uri.from_string_verifier(cap)
15824+        self.failUnlessEqual(u3, u1)
15825+
15826+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15827+                                 ['131073', '3'])
15828+        cap = u1.to_string()
15829+        self.failUnless(uri.is_uri(cap))
15830+        u2 = uri.from_string(cap)
15831+        self.failUnlessReallyEqual(u1, u2)
15832+        u3 = uri.from_string_verifier(cap)
15833+        self.failUnlessEqual(u3, u1)
15834+
15835+
15836 class Dirnode(testutil.ReallyEqualMixin, unittest.TestCase):
15837     def test_pack(self):
15838         writekey = "\x01" * 16
15839hunk ./src/allmydata/uri.py 31
15840 SEP='(?::|%3A)'
15841 NUMBER='([0-9]+)'
15842 NUMBER_IGNORE='(?:[0-9]+)'
15843+OPTIONAL_EXTENSION_FIELD = '(' + SEP + '[0-9' + SEP + ']+|)'
15844 
15845 # "human-encoded" URIs are allowed to come with a leading
15846 # 'http://127.0.0.1:(8123|3456)/uri/' that will be ignored.
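
A quick sketch of what the new OPTIONAL_EXTENSION_FIELD admits: either nothing at all, or a separator-introduced run drawn from digits plus the separator characters, so caps may carry trailing ':131073:3'-style hints while anything containing other letters is rejected (which is what test_extension_character_range above checks). The pattern below is assembled from the definitions in this hunk:

    import re

    SEP = '(?::|%3A)'
    OPTIONAL_EXTENSION_FIELD = '(' + SEP + '[0-9' + SEP + ']+|)'
    pat = re.compile('^' + OPTIONAL_EXTENSION_FIELD + '$')

    assert pat.match('')                # no extension fields present
    assert pat.match(':131073:3')       # segment size and k, as in the tests
    assert pat.match('%3A131073%3A3')   # WUI-style encoded separators
    assert not pat.match(':invalid')    # non-digit letters are rejected
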
15847hunk ./src/allmydata/uri.py 297
15848     def get_verify_cap(self):
15849         return SSKVerifierURI(self.storage_index, self.fingerprint)
15850 
15851+    def get_extension_params(self):
15852+        return []
15853 
15854 class ReadonlySSKFileURI(_BaseURI):
15855     implements(IURI, IMutableFileURI)
15856hunk ./src/allmydata/uri.py 354
15857     def get_verify_cap(self):
15858         return SSKVerifierURI(self.storage_index, self.fingerprint)
15859 
15860+    def get_extension_params(self):
15861+        return []
15862 
15863 class SSKVerifierURI(_BaseURI):
15864     implements(IVerifierURI)
15865hunk ./src/allmydata/uri.py 401
15866     def get_verify_cap(self):
15867         return self
15868 
15869+    def get_extension_params(self):
15870+        return []
15871+
15872+class WritableMDMFFileURI(_BaseURI):
15873+    implements(IURI, IMutableFileURI)
15874+
15875+    BASE_STRING='URI:MDMF:'
15876+    STRING_RE=re.compile('^'+BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15877+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15878+
15879+    def __init__(self, writekey, fingerprint, params=[]):
15880+        self.writekey = writekey
15881+        self.readkey = hashutil.ssk_readkey_hash(writekey)
15882+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15883+        assert len(self.storage_index) == 16
15884+        self.fingerprint = fingerprint
15885+        self.extension = params
15886+
15887+    @classmethod
15888+    def init_from_human_encoding(cls, uri):
15889+        mo = cls.HUMAN_RE.search(uri)
15890+        if not mo:
15891+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15892+        params = filter(lambda x: x != '', re.split(SEP, mo.group(3)))
15893+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15894+
15895+    @classmethod
15896+    def init_from_string(cls, uri):
15897+        mo = cls.STRING_RE.search(uri)
15898+        if not mo:
15899+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15900+        params = mo.group(3)
15901+        params = filter(lambda x: x != '', params.split(":"))
15902+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15903+
15904+    def to_string(self):
15905+        assert isinstance(self.writekey, str)
15906+        assert isinstance(self.fingerprint, str)
15907+        ret = 'URI:MDMF:%s:%s' % (base32.b2a(self.writekey),
15908+                                  base32.b2a(self.fingerprint))
15909+        if self.extension:
15910+            ret += ":"
15911+            ret += ":".join(self.extension)
15912+
15913+        return ret
15914+
15915+    def __repr__(self):
15916+        return "<%s %s>" % (self.__class__.__name__, self.abbrev())
15917+
15918+    def abbrev(self):
15919+        return base32.b2a(self.writekey[:5])
15920+
15921+    def abbrev_si(self):
15922+        return base32.b2a(self.storage_index)[:5]
15923+
15924+    def is_readonly(self):
15925+        return False
15926+
15927+    def is_mutable(self):
15928+        return True
15929+
15930+    def get_readonly(self):
15931+        return ReadonlyMDMFFileURI(self.readkey, self.fingerprint, self.extension)
15932+
15933+    def get_verify_cap(self):
15934+        return MDMFVerifierURI(self.storage_index, self.fingerprint, self.extension)
15935+
15936+    def get_extension_params(self):
15937+        return self.extension
15938+
15939+class ReadonlyMDMFFileURI(_BaseURI):
15940+    implements(IURI, IMutableFileURI)
15941+
15942+    BASE_STRING='URI:MDMF-RO:'
15943+    STRING_RE=re.compile('^'+BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15944+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF-RO'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15945+
15946+    def __init__(self, readkey, fingerprint, params=[]):
15947+        self.readkey = readkey
15948+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15949+        assert len(self.storage_index) == 16
15950+        self.fingerprint = fingerprint
15951+        self.extension = params
15952+
15953+    @classmethod
15954+    def init_from_human_encoding(cls, uri):
15955+        mo = cls.HUMAN_RE.search(uri)
15956+        if not mo:
15957+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15958+        params = mo.group(3)
15959+        params = filter(lambda x: x != '', re.split(SEP, params))
15960+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15961+
15962+    @classmethod
15963+    def init_from_string(cls, uri):
15964+        mo = cls.STRING_RE.search(uri)
15965+        if not mo:
15966+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15967+
15968+        params = mo.group(3)
15969+        params = filter(lambda x: x != '', params.split(":"))
15970+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15971+
15972+    def to_string(self):
15973+        assert isinstance(self.readkey, str)
15974+        assert isinstance(self.fingerprint, str)
15975+        ret = 'URI:MDMF-RO:%s:%s' % (base32.b2a(self.readkey),
15976+                                     base32.b2a(self.fingerprint))
15977+        if self.extension:
15978+            ret += ":"
15979+            ret += ":".join(self.extension)
15980+
15981+        return ret
15982+
15983+    def __repr__(self):
15984+        return "<%s %s>" % (self.__class__.__name__, self.abbrev())
15985+
15986+    def abbrev(self):
15987+        return base32.b2a(self.readkey[:5])
15988+
15989+    def abbrev_si(self):
15990+        return base32.b2a(self.storage_index)[:5]
15991+
15992+    def is_readonly(self):
15993+        return True
15994+
15995+    def is_mutable(self):
15996+        return True
15997+
15998+    def get_readonly(self):
15999+        return self
16000+
16001+    def get_verify_cap(self):
16002+        return MDMFVerifierURI(self.storage_index, self.fingerprint, self.extension)
16003+
16004+    def get_extension_params(self):
16005+        return self.extension
16006+
16007+class MDMFVerifierURI(_BaseURI):
16008+    implements(IVerifierURI)
16009+
16010+    BASE_STRING='URI:MDMF-Verifier:'
16011+    STRING_RE=re.compile('^'+BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
16012+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF-Verifier'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
16013+
16014+    def __init__(self, storage_index, fingerprint, params=[]):
16015+        assert len(storage_index) == 16
16016+        self.storage_index = storage_index
16017+        self.fingerprint = fingerprint
16018+        self.extension = params
16019+
16020+    @classmethod
16021+    def init_from_human_encoding(cls, uri):
16022+        mo = cls.HUMAN_RE.search(uri)
16023+        if not mo:
16024+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
16025+        params = mo.group(3)
16026+        params = filter(lambda x: x != '', re.split(SEP, params))
16027+        return cls(si_a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
16028+
16029+    @classmethod
16030+    def init_from_string(cls, uri):
16031+        mo = cls.STRING_RE.search(uri)
16032+        if not mo:
16033+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
16034+        params = mo.group(3)
16035+        params = filter(lambda x: x != '', params.split(":"))
16036+        return cls(si_a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
16037+
16038+    def to_string(self):
16039+        assert isinstance(self.storage_index, str)
16040+        assert isinstance(self.fingerprint, str)
16041+        ret = 'URI:MDMF-Verifier:%s:%s' % (si_b2a(self.storage_index),
16042+                                           base32.b2a(self.fingerprint))
16043+        if self.extension:
16044+            ret += ':'
16045+            ret += ":".join(self.extension)
16046+
16047+        return ret
16048+
16049+    def is_readonly(self):
16050+        return True
16051+
16052+    def is_mutable(self):
16053+        return False
16054+
16055+    def get_readonly(self):
16056+        return self
16057+
16058+    def get_verify_cap(self):
16059+        return self
16060+
16061+    def get_extension_params(self):
16062+        return self.extension
16063+
16064 class _DirectoryBaseURI(_BaseURI):
16065     implements(IURI, IDirnodeURI)
16066     def __init__(self, filenode_uri=None):
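
Taken together, the three new classes give MDMF the same write/read/verify attenuation chain that SDMF caps have, with the extension list surviving each step. A short usage sketch (the keys are random placeholders):

    import os
    from allmydata import uri

    w = uri.WritableMDMFFileURI(os.urandom(16), os.urandom(32),
                                ['131073', '3'])
    cap = w.to_string()       # 'URI:MDMF:<writekey>:<fingerprint>:131073:3'
    r = w.get_readonly()      # 'URI:MDMF-RO:...', extensions preserved
    v = r.get_verify_cap()    # 'URI:MDMF-Verifier:...', ditto

    w2 = uri.WritableMDMFFileURI.init_from_string(cap)
    assert w2.get_extension_params() == ['131073', '3']
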
16067hunk ./src/allmydata/uri.py 831
16068             kind = "URI:SSK-RO readcap to a mutable file"
16069         elif s.startswith('URI:SSK-Verifier:'):
16070             return SSKVerifierURI.init_from_string(s)
16071+        elif s.startswith('URI:MDMF:'):
16072+            return WritableMDMFFileURI.init_from_string(s)
16073+        elif s.startswith('URI:MDMF-RO:'):
16074+            return ReadonlyMDMFFileURI.init_from_string(s)
16075+        elif s.startswith('URI:MDMF-Verifier:'):
16076+            return MDMFVerifierURI.init_from_string(s)
16077         elif s.startswith('URI:DIR2:'):
16078             if can_be_writeable:
16079                 return DirectoryURI.init_from_string(s)
16080}
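
With those dispatch arms in place, the generic from_string() entry point routes each MDMF cap string to the right class, which is what lets the rest of the codebase stay cap-format-agnostic. A minimal check of the routing (caps generated rather than spelled out, since the base32 fields are long):

    import os
    from allmydata import uri

    w = uri.WritableMDMFFileURI(os.urandom(16), os.urandom(32))
    assert isinstance(uri.from_string(w.to_string()),
                      uri.WritableMDMFFileURI)
    assert isinstance(uri.from_string(w.get_readonly().to_string()),
                      uri.ReadonlyMDMFFileURI)
    assert isinstance(uri.from_string(w.get_verify_cap().to_string()),
                      uri.MDMFVerifierURI)
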
16081[nodemaker, mutable/filenode: train nodemaker and filenode to handle MDMF caps
16082Kevan Carstensen <kevan@isnotajoke.com>**20110501224523
16083 Ignore-this: 1f3b4581eb583e7bb93d234182bda395
16084] {
16085hunk ./src/allmydata/mutable/filenode.py 12
16086      IMutableFileVersion, IWritable
16087 from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
16088 from allmydata.util.assertutil import precondition
16089-from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
16090+from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI, \
16091+                          WritableMDMFFileURI, ReadonlyMDMFFileURI
16092 from allmydata.monitor import Monitor
16093 from pycryptopp.cipher.aes import AES
16094 
16095hunk ./src/allmydata/mutable/filenode.py 75
16096         # set to this default value in case neither of those things happen,
16097         # or in case the servermap can't find any shares to tell us what
16098         # to publish as.
16099-        # TODO: Set this back to None, and find out why the tests fail
16100-        #       with it set to None.
16101+        # XXX: Version should come in via the constructor.
16102         self._protocol_version = None
16103 
16104         # all users of this MutableFileNode go through the serializer. This
16105hunk ./src/allmydata/mutable/filenode.py 95
16106         # verification key, nor things like 'k' or 'N'. If and when someone
16107         # wants to get our contents, we'll pull from shares and fill those
16108         # in.
16109-        assert isinstance(filecap, (ReadonlySSKFileURI, WriteableSSKFileURI))
16110+        if isinstance(filecap, (WritableMDMFFileURI, ReadonlyMDMFFileURI)):
16111+            self._protocol_version = MDMF_VERSION
16112+        elif isinstance(filecap, (ReadonlySSKFileURI, WriteableSSKFileURI)):
16113+            self._protocol_version = SDMF_VERSION
16114+
16115         self._uri = filecap
16116         self._writekey = None
16117hunk ./src/allmydata/mutable/filenode.py 102
16118-        if isinstance(filecap, WriteableSSKFileURI):
16119+
16120+        if not filecap.is_readonly() and filecap.is_mutable():
16121             self._writekey = self._uri.writekey
16122         self._readkey = self._uri.readkey
16123         self._storage_index = self._uri.storage_index
16124hunk ./src/allmydata/mutable/filenode.py 131
16125         self._writekey = hashutil.ssk_writekey_hash(privkey_s)
16126         self._encprivkey = self._encrypt_privkey(self._writekey, privkey_s)
16127         self._fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
16128-        self._uri = WriteableSSKFileURI(self._writekey, self._fingerprint)
16129+        if self._protocol_version == MDMF_VERSION:
16130+            self._uri = WritableMDMFFileURI(self._writekey, self._fingerprint)
16131+        else:
16132+            self._uri = WriteableSSKFileURI(self._writekey, self._fingerprint)
16133         self._readkey = self._uri.readkey
16134         self._storage_index = self._uri.storage_index
16135         initial_contents = self._get_initial_contents(contents)
16136hunk ./src/allmydata/nodemaker.py 82
16137             return self._create_immutable(cap)
16138         if isinstance(cap, uri.CHKFileVerifierURI):
16139             return self._create_immutable_verifier(cap)
16140-        if isinstance(cap, (uri.ReadonlySSKFileURI, uri.WriteableSSKFileURI)):
16141+        if isinstance(cap, (uri.ReadonlySSKFileURI, uri.WriteableSSKFileURI,
16142+                            uri.WritableMDMFFileURI, uri.ReadonlyMDMFFileURI)):
16143             return self._create_mutable(cap)
16144         if isinstance(cap, (uri.DirectoryURI,
16145                             uri.ReadonlyDirectoryURI,
16146hunk ./src/allmydata/test/test_mutable.py 196
16147                     offset2 = 0
16148                 if offset1 == "pubkey" and IV:
16149                     real_offset = 107
16150-                elif offset1 == "share_data" and not IV:
16151-                    real_offset = 107
16152                 elif offset1 in o:
16153                     real_offset = o[offset1]
16154                 else:
16155hunk ./src/allmydata/test/test_mutable.py 270
16156         return d
16157 
16158 
16159+    def test_mdmf_filenode_cap(self):
16160+        # Test that an MDMF filenode, once created, returns an MDMF URI.
16161+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16162+        def _created(n):
16163+            self.failUnless(isinstance(n, MutableFileNode))
16164+            cap = n.get_cap()
16165+            self.failUnless(isinstance(cap, uri.WritableMDMFFileURI))
16166+            rcap = n.get_readcap()
16167+            self.failUnless(isinstance(rcap, uri.ReadonlyMDMFFileURI))
16168+            vcap = n.get_verify_cap()
16169+            self.failUnless(isinstance(vcap, uri.MDMFVerifierURI))
16170+        d.addCallback(_created)
16171+        return d
16172+
16173+
16174+    def test_create_from_mdmf_writecap(self):
16175+        # Test that the nodemaker is capable of creating an MDMF
16176+        # filenode given an MDMF cap.
16177+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16178+        def _created(n):
16179+            self.failUnless(isinstance(n, MutableFileNode))
16180+            s = n.get_uri()
16181+            self.failUnless(s.startswith("URI:MDMF"))
16182+            n2 = self.nodemaker.create_from_cap(s)
16183+            self.failUnless(isinstance(n2, MutableFileNode))
16184+            self.failUnlessEqual(n.get_storage_index(), n2.get_storage_index())
16185+            self.failUnlessEqual(n.get_uri(), n2.get_uri())
16186+        d.addCallback(_created)
16187+        return d
16188+
16189+
16190+    def test_create_from_mdmf_writecap_with_extensions(self):
16191+        # Test that the nodemaker is capable of creating an MDMF
16192+        # filenode when given a writecap that carries extension
16193+        # parameters.
16194+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16195+        def _created(n):
16196+            self.failUnless(isinstance(n, MutableFileNode))
16197+            s = n.get_uri()
16198+            s2 = "%s:3:131073" % s
16199+            n2 = self.nodemaker.create_from_cap(s2)
16200+
16201+            self.failUnlessEqual(n2.get_storage_index(), n.get_storage_index())
16202+            self.failUnlessEqual(n.get_writekey(), n2.get_writekey())
16203+        d.addCallback(_created)
16204+        return d
16205+
16206+
16207+    def test_create_from_mdmf_readcap(self):
16208+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16209+        def _created(n):
16210+            self.failUnless(isinstance(n, MutableFileNode))
16211+            s = n.get_readonly_uri()
16212+            n2 = self.nodemaker.create_from_cap(s)
16213+            self.failUnless(isinstance(n2, MutableFileNode))
16214+
16215+            # Check that it's a readonly node
16216+            self.failUnless(n2.is_readonly())
16217+        d.addCallback(_created)
16218+        return d
16219+
16220+
16221+    def test_create_from_mdmf_readcap_with_extensions(self):
16222+        # We should be able to create an MDMF filenode with the
16223+        # extension parameters without it breaking.
16224+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16225+        def _created(n):
16226+            self.failUnless(isinstance(n, MutableFileNode))
16227+            s = n.get_readonly_uri()
16228+            s = "%s:3:131073" % s
16229+
16230+            n2 = self.nodemaker.create_from_cap(s)
16231+            self.failUnless(isinstance(n2, MutableFileNode))
16232+            self.failUnless(n2.is_readonly())
16233+            self.failUnlessEqual(n.get_storage_index(), n2.get_storage_index())
16234+        d.addCallback(_created)
16235+        return d
16236+
16237+
16238+    def test_internal_version_from_cap(self):
16239+        # MutableFileNodes and MutableFileVersions have an internal
16240+        # switch that tells them whether they're dealing with an SDMF or
16241+        # MDMF mutable file when they start doing stuff. We want to make
16242+        # sure that this is set appropriately given an MDMF cap.
16243+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16244+        def _created(n):
16245+            self.uri = n.get_uri()
16246+            self.failUnlessEqual(n._protocol_version, MDMF_VERSION)
16247+
16248+            n2 = self.nodemaker.create_from_cap(self.uri)
16249+            self.failUnlessEqual(n2._protocol_version, MDMF_VERSION)
16250+        d.addCallback(_created)
16251+        return d
16252+
16253+
16254     def test_serialize(self):
16255         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
16256         calls = []
16257hunk ./src/allmydata/test/test_mutable.py 464
16258         return d
16259 
16260 
16261+    def test_download_from_mdmf_cap(self):
16262+        # We should be able to download an MDMF file given its cap
16263+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16264+        def _created(node):
16265+            self.uri = node.get_uri()
16266+
16267+            return node.overwrite(MutableData("contents1" * 100000))
16268+        def _then(ignored):
16269+            node = self.nodemaker.create_from_cap(self.uri)
16270+            return node.download_best_version()
16271+        def _downloaded(data):
16272+            self.failUnlessEqual(data, "contents1" * 100000)
16273+        d.addCallback(_created)
16274+        d.addCallback(_then)
16275+        d.addCallback(_downloaded)
16276+        return d
16277+
16278+
16279     def test_mdmf_write_count(self):
16280         # Publishing an MDMF file should only cause one write for each
16281         # share that is to be published. Otherwise, we introduce
16282hunk ./src/allmydata/test/test_mutable.py 1735
16283     def test_verify_mdmf_bad_encprivkey(self):
16284         d = self.publish_mdmf()
16285         d.addCallback(lambda ignored:
16286-            corrupt(None, self._storage, "enc_privkey", [1]))
16287+            corrupt(None, self._storage, "enc_privkey", [0]))
16288         d.addCallback(lambda ignored:
16289             self._fn.check(Monitor(), verify=True))
16290         d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
16291hunk ./src/allmydata/test/test_mutable.py 2843
16292         return d
16293 
16294 
16295+    def test_version_extension_api(self):
16296+        # We need to define an API by which an uploader can set the
16297+        # extension parameters, and by which a downloader can retrieve
16298+        # extensions.
16299+        self.failUnless(False)
16300+
16301+
16302+    def test_extensions_from_cap(self):
16303+        self.failUnless(False)
16304+
16305+
16306+    def test_extensions_from_upload(self):
16307+        self.failUnless(False)
16308+
16309+
16310+    def test_cap_after_upload(self):
16311+        self.failUnless(False)
16312+
16313+
16314     def test_get_writekey(self):
16315         d = self.mdmf_node.get_best_mutable_version()
16316         d.addCallback(lambda bv:
16317}
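
One consequence the extension tests above lean on: the extension fields are plain ':'-separated digits on the end of the cap, so the same file can be reached through a bare cap or through a cap with hints appended, and both must resolve to the same storage index. Roughly (assuming a nodemaker and an MDMF filenode n, as in the tests):

    cap = n.get_uri()                     # 'URI:MDMF:...'
    cap_with_hints = "%s:3:131073" % cap  # append k and segment size
    n2 = nodemaker.create_from_cap(cap_with_hints)
    assert n2.get_storage_index() == n.get_storage_index()
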
16318[mutable/retrieve: fix typo in paused check
16319Kevan Carstensen <kevan@isnotajoke.com>**20110515225946
16320 Ignore-this: a9c7f3bdbab2f8248f8b6a64f574e7c4
16321] hunk ./src/allmydata/mutable/retrieve.py 207
16322         """
16323         if self._paused:
16324             d = defer.Deferred()
16325-            self._pause_defered.addCallback(lambda ignored: d.callback(res))
16326+            self._pause_deferred.addCallback(lambda ignored: d.callback(res))
16327             return d
16328         return defer.succeed(res)
16329 
16330[scripts/tahoe_put.py: teach tahoe put about MDMF caps
16331Kevan Carstensen <kevan@isnotajoke.com>**20110515230008
16332 Ignore-this: 1522f434f651683c924e37251a3c1bfd
16333] hunk ./src/allmydata/scripts/tahoe_put.py 49
16334         #  DIRCAP:./subdir/foo : DIRCAP/subdir/foo
16335         #  MUTABLE-FILE-WRITECAP : filecap
16336 
16337-        # FIXME: this shouldn't rely on a particular prefix.
16338-        if to_file.startswith("URI:SSK:"):
16339+        # FIXME: don't hardcode cap format.
16340+        if to_file.startswith("URI:MDMF:") or to_file.startswith("URI:SSK:"):
16341             url = nodeurl + "uri/%s" % urllib.quote(to_file)
16342         else:
16343             try:
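
In practice this means a command like 'tahoe put newversion.txt URI:MDMF:...' (with a real writecap in place of the elided one) is now accepted and updates the MDMF file in place; the test_put_to_mdmf_cap case added to test_cli below exercises exactly that flow.
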
16344[test/common.py: fix some MDMF-related bugs in common test fixtures
16345Kevan Carstensen <kevan@isnotajoke.com>**20110515230038
16346 Ignore-this: ab5ffe4789bb5e6ed5f54b91b760bac9
16347] {
16348hunk ./src/allmydata/test/common.py 199
16349                  default_encoding_parameters, history):
16350         self.init_from_cap(make_mutable_file_cap())
16351     def create(self, contents, key_generator=None, keysize=None):
16352+        if self.file_types[self.storage_index] == MDMF_VERSION and \
16353+           isinstance(self.my_uri, (uri.ReadonlySSKFileURI,
16354+                                    uri.WriteableSSKFileURI)):
16355+            self.init_from_cap(make_mdmf_mutable_file_cap())
16356         initial_contents = self._get_initial_contents(contents)
16357         data = initial_contents.read(initial_contents.get_size())
16358         data = "".join(data)
16359hunk ./src/allmydata/test/common.py 220
16360         return contents(self)
16361     def init_from_cap(self, filecap):
16362         assert isinstance(filecap, (uri.WriteableSSKFileURI,
16363-                                    uri.ReadonlySSKFileURI))
16364+                                    uri.ReadonlySSKFileURI,
16365+                                    uri.WritableMDMFFileURI,
16366+                                    uri.ReadonlyMDMFFileURI))
16367         self.my_uri = filecap
16368         self.storage_index = self.my_uri.get_storage_index()
16369hunk ./src/allmydata/test/common.py 225
16370+        if isinstance(filecap, (uri.WritableMDMFFileURI,
16371+                                uri.ReadonlyMDMFFileURI)):
16372+            self.file_types[self.storage_index] = MDMF_VERSION
16373+
16374+        else:
16375+            self.file_types[self.storage_index] = SDMF_VERSION
16376+
16377         return self
16378     def get_cap(self):
16379         return self.my_uri
16380hunk ./src/allmydata/test/common.py 249
16381         return self.my_uri.get_readonly().to_string()
16382     def get_verify_cap(self):
16383         return self.my_uri.get_verify_cap()
16384+    def get_repair_cap(self):
16385+        if self.my_uri.is_readonly():
16386+            return None
16387+        return self.my_uri
16388     def is_readonly(self):
16389         return self.my_uri.is_readonly()
16390     def is_mutable(self):
16391hunk ./src/allmydata/test/common.py 406
16392 def make_mutable_file_cap():
16393     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
16394                                    fingerprint=os.urandom(32))
16395-def make_mutable_file_uri():
16396-    return make_mutable_file_cap().to_string()
16397+
16398+def make_mdmf_mutable_file_cap():
16399+    return uri.WritableMDMFFileURI(writekey=os.urandom(16),
16400+                                   fingerprint=os.urandom(32))
16401+
16402+def make_mutable_file_uri(mdmf=False):
16403+    if mdmf:
16404+        cap = make_mdmf_mutable_file_cap()
16405+    else:
16406+        cap = make_mutable_file_cap()
16407+
16408+    return cap.to_string()
16409 
16410 def make_verifier_uri():
16411     return uri.SSKVerifierURI(storage_index=os.urandom(16),
16412hunk ./src/allmydata/test/common.py 423
16413                               fingerprint=os.urandom(32)).to_string()
16414 
16415+def create_mutable_filenode(contents, mdmf=False):
16416+    # XXX: All of these arguments are kind of stupid.
16417+    if mdmf:
16418+        cap = make_mdmf_mutable_file_cap()
16419+    else:
16420+        cap = make_mutable_file_cap()
16421+
16422+    filenode = FakeMutableFileNode(None, None, None, None)
16423+    filenode.init_from_cap(cap)
16424+    FakeMutableFileNode.all_contents[filenode.storage_index] = contents
16425+    return filenode
16426+
16427+
16428 class FakeDirectoryNode(dirnode.DirectoryNode):
16429     """This offers IDirectoryNode, but uses a FakeMutableFileNode for the
16430     backing store, so it doesn't go to the grid. The child data is still
16431}
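
The new helpers compose into a one-liner for tests that need an MDMF-flavored fake filenode; a short sketch of the intended usage (contents value illustrative):

    from allmydata.test.common import (create_mutable_filenode,
                                       make_mutable_file_uri)

    # An in-memory fake MDMF filenode, with its contents registered in
    # the shared all_contents map the fake nodes read from.
    fn = create_mutable_filenode("contents" * 1000, mdmf=True)
    assert fn.get_uri().startswith("URI:MDMF:")

    # Or just a writecap string, when no node object is needed.
    writecap = make_mutable_file_uri(mdmf=True)
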
16432[test/test_cli: Alter existing MDMF tests to test for MDMF caps
16433Kevan Carstensen <kevan@isnotajoke.com>**20110515230054
16434 Ignore-this: a90d089e1afb0f261710083c2be6b2fa
16435] {
16436hunk ./src/allmydata/test/test_cli.py 13
16437 from allmydata.util import fileutil, hashutil, base32
16438 from allmydata import uri
16439 from allmydata.immutable import upload
16440+from allmydata.interfaces import MDMF_VERSION, SDMF_VERSION
16441 from allmydata.mutable.publish import MutableData
16442 from allmydata.dirnode import normalize
16443 
16444hunk ./src/allmydata/test/test_cli.py 33
16445 from allmydata.test.common_util import StallMixin, ReallyEqualMixin
16446 from allmydata.test.no_network import GridTestMixin
16447 from twisted.internet import threads # CLI tests use deferToThread
16448+from twisted.internet import defer # List uses a DeferredList in one place.
16449 from twisted.python import usage
16450 
16451 from allmydata.util.assertutil import precondition
16452hunk ./src/allmydata/test/test_cli.py 969
16453         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
16454         return d
16455 
16456+    def _check_mdmf_json(self, (rc, json, err)):
16457+        self.failUnlessEqual(rc, 0)
16458+        self.failUnlessEqual(err, "")
16459+        self.failUnlessIn('"mutable-type": "mdmf"', json)
16460+        # We also want a valid MDMF cap to be in the json.
16461+        self.failUnlessIn("URI:MDMF", json)
16462+        self.failUnlessIn("URI:MDMF-RO", json)
16463+        self.failUnlessIn("URI:MDMF-Verifier", json)
16464+
16465+    def _check_sdmf_json(self, (rc, json, err)):
16466+        self.failUnlessEqual(rc, 0)
16467+        self.failUnlessEqual(err, "")
16468+        self.failUnlessIn('"mutable-type": "sdmf"', json)
16469+        # We also want to see the appropriate SDMF caps.
16470+        self.failUnlessIn("URI:SSK", json)
16471+        self.failUnlessIn("URI:SSK-RO", json)
16472+        self.failUnlessIn("URI:SSK-Verifier", json)
16473+
16474     def test_mutable_type(self):
16475         self.basedir = "cli/Put/mutable_type"
16476         self.set_up_grid()
16477hunk ./src/allmydata/test/test_cli.py 999
16478                         fn1, "tahoe:uploaded.txt"))
16479         d.addCallback(lambda ignored:
16480             self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
16481-        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
16482+        d.addCallback(self._check_mdmf_json)
16483         d.addCallback(lambda ignored:
16484             self.do_cli("put", "--mutable", "--mutable-type=sdmf",
16485                         fn1, "tahoe:uploaded2.txt"))
16486hunk ./src/allmydata/test/test_cli.py 1005
16487         d.addCallback(lambda ignored:
16488             self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
16489-        d.addCallback(lambda (rc, json, err):
16490-            self.failUnlessIn("sdmf", json))
16491+        d.addCallback(self._check_sdmf_json)
16492         return d
16493 
16494     def test_mutable_type_unlinked(self):
16495hunk ./src/allmydata/test/test_cli.py 1017
16496         d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
16497         d.addCallback(lambda (rc, cap, err):
16498             self.do_cli("ls", "--json", cap))
16499-        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
16500+        d.addCallback(self._check_mdmf_json)
16501         d.addCallback(lambda ignored:
16502             self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
16503         d.addCallback(lambda (rc, cap, err):
16504hunk ./src/allmydata/test/test_cli.py 1022
16505             self.do_cli("ls", "--json", cap))
16506-        d.addCallback(lambda (rc, json, err):
16507-            self.failUnlessIn("sdmf", json))
16508+        d.addCallback(self._check_sdmf_json)
16509         return d
16510 
16511hunk ./src/allmydata/test/test_cli.py 1025
16512+    def test_put_to_mdmf_cap(self):
16513+        self.basedir = "cli/Put/put_to_mdmf_cap"
16514+        self.set_up_grid()
16515+        data = "data" * 100000
16516+        fn1 = os.path.join(self.basedir, "data")
16517+        fileutil.write(fn1, data)
16518+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
16519+        def _got_cap((rc, out, err)):
16520+            self.failUnlessEqual(rc, 0)
16521+            self.cap = out
16522+        d.addCallback(_got_cap)
16523+        # Now try to write something to the cap using put.
16524+        data2 = "data2" * 100000
16525+        fn2 = os.path.join(self.basedir, "data2")
16526+        fileutil.write(fn2, data2)
16527+        d.addCallback(lambda ignored:
16528+            self.do_cli("put", fn2, self.cap))
16529+        def _got_put((rc, out, err)):
16530+            self.failUnlessEqual(rc, 0)
16531+            self.failUnlessIn(self.cap, out)
16532+        d.addCallback(_got_put)
16533+        # Now get the cap. We should see the data we just put there.
16534+        d.addCallback(lambda ignored:
16535+            self.do_cli("get", self.cap))
16536+        def _got_data((rc, out, err)):
16537+            self.failUnlessEqual(rc, 0)
16538+            self.failUnlessEqual(out, data2)
16539+        d.addCallback(_got_data)
16540+        return d
16541+
16542+    def test_put_to_sdmf_cap(self):
16543+        self.basedir = "cli/Put/put_to_sdmf_cap"
16544+        self.set_up_grid()
16545+        data = "data" * 100000
16546+        fn1 = os.path.join(self.basedir, "data")
16547+        fileutil.write(fn1, data)
16548+        d = self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1)
16549+        def _got_cap((rc, out, err)):
16550+            self.failUnlessEqual(rc, 0)
16551+            self.cap = out
16552+        d.addCallback(_got_cap)
16553+        # Now try to write something to the cap using put.
16554+        data2 = "data2" * 100000
16555+        fn2 = os.path.join(self.basedir, "data2")
16556+        fileutil.write(fn2, data2)
16557+        d.addCallback(lambda ignored:
16558+            self.do_cli("put", fn2, self.cap))
16559+        def _got_put((rc, out, err)):
16560+            self.failUnlessEqual(rc, 0)
16561+            self.failUnlessIn(self.cap, out)
16562+        d.addCallback(_got_put)
16563+        # Now get the cap. We should see the data we just put there.
16564+        d.addCallback(lambda ignored:
16565+            self.do_cli("get", self.cap))
16566+        def _got_data((rc, out, err)):
16567+            self.failUnlessEqual(rc, 0)
16568+            self.failUnlessEqual(out, data2)
16569+        d.addCallback(_got_data)
16570+        return d
16571+
16572     def test_mutable_type_invalid_format(self):
16573         o = cli.PutOptions()
16574         self.failUnlessRaises(usage.UsageError,
16575hunk ./src/allmydata/test/test_cli.py 1318
16576         d.addCallback(_check)
16577         return d
16578 
16579+    def _create_directory_structure(self):
16580+        # Create a simple directory structure that we can use for MDMF,
16581+        # SDMF, and immutable testing.
16582+        assert self.g
16583+
16584+        client = self.g.clients[0]
16585+        # Create a dirnode
16586+        d = client.create_dirnode()
16587+        def _got_rootnode(n):
16588+            # Add a few nodes.
16589+            self._dircap = n.get_uri()
16590+            nm = n._nodemaker
16591+            # The uploaders may run at the same time, so we need two
16592+            # MutableData instances or they'll fight over offsets &c and
16593+            # break.
16594+            mutable_data = MutableData("data" * 100000)
16595+            mutable_data2 = MutableData("data" * 100000)
16596+            # Add both kinds of mutable node.
16597+            d1 = nm.create_mutable_file(mutable_data,
16598+                                        version=MDMF_VERSION)
16599+            d2 = nm.create_mutable_file(mutable_data2,
16600+                                        version=SDMF_VERSION)
16601+            # Add an immutable node. We do this through the directory,
16602+            # with add_file.
16603+            immutable_data = upload.Data("immutable data" * 100000,
16604+                                         convergence="")
16605+            d3 = n.add_file(u"immutable", immutable_data)
16606+            ds = [d1, d2, d3]
16607+            dl = defer.DeferredList(ds)
16608+            def _made_files((r1, r2, r3)):
16609+                self.failUnless(r1[0])
16610+                self.failUnless(r2[0])
16611+                self.failUnless(r3[0])
16612+
16613+                # r1, r2, and r3 contain nodes.
16614+                mdmf_node = r1[1]
16615+                sdmf_node = r2[1]
16616+                imm_node = r3[1]
16617+
16618+                self._mdmf_uri = mdmf_node.get_uri()
16619+                self._mdmf_readonly_uri = mdmf_node.get_readonly_uri()
16620+                self._sdmf_uri = mdmf_node.get_uri()
16621+                self._sdmf_readonly_uri = sdmf_node.get_readonly_uri()
16622+                self._imm_uri = imm_node.get_uri()
16623+
16624+                d1 = n.set_node(u"mdmf", mdmf_node)
16625+                d2 = n.set_node(u"sdmf", sdmf_node)
16626+                return defer.DeferredList([d1, d2])
16627+            # We can now list the directory by listing self._dircap.
16628+            dl.addCallback(_made_files)
16629+            return dl
16630+        d.addCallback(_got_rootnode)
16631+        return d
16632+
16633+    def test_list_mdmf(self):
16634+        # 'tahoe ls' should include MDMF files.
16635+        self.basedir = "cli/List/list_mdmf"
16636+        self.set_up_grid()
16637+        d = self._create_directory_structure()
16638+        d.addCallback(lambda ignored:
16639+            self.do_cli("ls", self._dircap))
16640+        def _got_ls((rc, out, err)):
16641+            self.failUnlessEqual(rc, 0)
16642+            self.failUnlessEqual(err, "")
16643+            self.failUnlessIn("immutable", out)
16644+            self.failUnlessIn("mdmf", out)
16645+            self.failUnlessIn("sdmf", out)
16646+        d.addCallback(_got_ls)
16647+        return d
16648+
16649+    def test_list_mdmf_json(self):
16650+        # 'tahoe ls --json' should include MDMF caps in its output
16651+        # when the directory contains MDMF files.
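+        # Each child entry in the JSON output is shaped roughly like
+        # (a sketch; the exact cap strings vary):
+        #   ["filenode", {"rw_uri": "URI:MDMF:...",
+        #                 "ro_uri": "URI:MDMF-RO:...",
+        #                 "mutable": true, "mutable-type": "mdmf", ...}]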
16652+        self.basedir = "cli/List/list_mdmf_json"
16653+        self.set_up_grid()
16654+        d = self._create_directory_structure()
16655+        d.addCallback(lambda ignored:
16656+            self.do_cli("ls", "--json", self._dircap))
16657+        def _got_json((rc, out, err)):
16658+            self.failUnlessEqual(rc, 0)
16659+            self.failUnlessEqual(err, "")
16660+            self.failUnlessIn(self._mdmf_uri, out)
16661+            self.failUnlessIn(self._mdmf_readonly_uri, out)
16662+            self.failUnlessIn(self._sdmf_uri, out)
16663+            self.failUnlessIn(self._sdmf_readonly_uri, out)
16664+            self.failUnlessIn(self._imm_uri, out)
16665+            self.failUnlessIn('"mutable-type": "sdmf"', out)
16666+            self.failUnlessIn('"mutable-type": "mdmf"', out)
16667+        d.addCallback(_got_json)
16668+        return d
16669+
16670 
16671 class Mv(GridTestMixin, CLITestMixin, unittest.TestCase):
16672     def test_mv_behavior(self):
16673}
16674[test/test_mutable.py: write a test for pausing during retrieval, write support structure for that test
16675Kevan Carstensen <kevan@isnotajoke.com>**20110515230207
16676 Ignore-this: 8884ef3ad5be59dbc870ed14002ac45
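 
 The support structure is a PausingConsumer: an IConsumer whose
 write() pauses its producer after the first chunk and schedules a
 resume via the reactor, so the read only completes if the
 retriever honors pauseProducing/resumeProducing.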
16677] {
16678hunk ./src/allmydata/test/test_mutable.py 6
16679 from cStringIO import StringIO
16680 from twisted.trial import unittest
16681 from twisted.internet import defer, reactor
16682+from twisted.internet.interfaces import IConsumer
16683+from zope.interface import implements
16684 from allmydata import uri, client
16685 from allmydata.nodemaker import NodeMaker
16686 from allmydata.util import base32, consumer
16687hunk ./src/allmydata/test/test_mutable.py 466
16688         return d
16689 
16690 
16691+    def test_retrieve_pause(self):
16692+        # We should make sure that the retriever is able to pause
16693+        # correctly.
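+        # PausingConsumer (defined below) pauses its producer after the
+        # first write and resumes it 15 seconds later; if pausing is
+        # broken, the read never completes and the timeout fires.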
16694+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16695+        def _created(node):
16696+            self.node = node
16697+
16698+            return node.overwrite(MutableData("contents1" * 100000))
16699+        d.addCallback(_created)
16700+        # Now we'll retrieve it into a pausing consumer.
16701+        d.addCallback(lambda ignored:
16702+            self.node.get_best_mutable_version())
16703+        def _got_version(version):
16704+            self.c = PausingConsumer()
16705+            return version.read(self.c)
16706+        d.addCallback(_got_version)
16707+        d.addCallback(lambda ignored:
16708+            self.failUnlessEqual(self.c.data, "contents1" * 100000))
16709+        return d
16710+    test_retrieve_pause.timeout = 25
16711+
16712+
16713     def test_download_from_mdmf_cap(self):
16714         # We should be able to download an MDMF file given its cap
16715         d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16716hunk ./src/allmydata/test/test_mutable.py 944
16717                     index = versionmap[shnum]
16718                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
16719 
16720+class PausingConsumer:
16721+    implements(IConsumer)
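+    # Minimal IConsumer that pauses its producer exactly once, after
+    # the first write, and schedules a resume via the reactor.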
16722+    def __init__(self):
16723+        self.data = ""
16724+        self.already_paused = False
16725+
16726+    def registerProducer(self, producer, streaming):
16727+        self.producer = producer
16728+        self.producer.resumeProducing()
16729 
16730hunk ./src/allmydata/test/test_mutable.py 954
16731+    def unregisterProducer(self):
16732+        self.producer = None
16733+
16734+    def _unpause(self, ignored):
16735+        self.producer.resumeProducing()
16736+
16737+    def write(self, data):
16738+        self.data += data
16739+        if not self.already_paused:
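+            # Pause exactly once, then schedule a resume; 15 seconds
+            # stays inside the test's 25-second timeout, so the read
+            # only completes if resumeProducing restarts the download.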
16740+            self.producer.pauseProducing()
16741+            self.already_paused = True
16742+            reactor.callLater(15, self._unpause, None)
16743 
16744 
16745 class Servermap(unittest.TestCase, PublishMixin):
16746}
16747[test/test_mutable.py: implement cap type checking
16748Kevan Carstensen <kevan@isnotajoke.com>**20110515230326
16749 Ignore-this: 64cf51b809605061047c8a1b02f5e212
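 The expectation (cap prefixes inferred from the tests in this
 bundle): a writable MDMF cap is a "URI:MDMF:..." string that
 uri.from_string() parses into uri.WritableMDMFFileURI, and the
 read-only form presumably reads "URI:MDMF-RO:..." and parses into
 uri.ReadonlyMDMFFileURI.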
16750] hunk ./src/allmydata/test/test_mutable.py 2904
16751 
16752 
16753     def test_cap_after_upload(self):
16754-        self.failUnless(False)
16755+        # If we create a new mutable file as MDMF and upload data to
16756+        # it, we should get an MDMF cap back from that file and should
16757+        # be able to use it.
16758+        # That's essentially what self.mdmf_node is, so just check that.
16759+        mdmf_uri = self.mdmf_node.get_uri()
16760+        cap = uri.from_string(mdmf_uri)
16761+        self.failUnless(isinstance(cap, uri.WritableMDMFFileURI))
16762+        readonly_mdmf_uri = self.mdmf_node.get_readonly_uri()
16763+        cap = uri.from_string(readonly_mdmf_uri)
16764+        self.failUnless(isinstance(cap, uri.ReadonlyMDMFFileURI))
16765 
16766 
16767     def test_get_writekey(self):
16768[test/test_web: add MDMF cap tests
16769Kevan Carstensen <kevan@isnotajoke.com>**20110515230358
16770 Ignore-this: ace5af3bdc9b65c3f6964c8fe056816
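 Several of these tests exercise caps with extension parameters
 attached (the "%s:3:131073" pattern); presumably the two fields
 are k=3 and a segment size of 131073 bytes. Either way, the webapi
 should treat an extended cap exactly like the bare one.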
16771] {
16772hunk ./src/allmydata/test/test_web.py 27
16773 from allmydata.util.netstring import split_netstring
16774 from allmydata.util.encodingutil import to_str
16775 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
16776-     create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
16777+     create_chk_filenode, WebErrorMixin, ShouldFailMixin, \
16778+     make_mutable_file_uri, create_mutable_filenode
16779 from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
16780 from allmydata.mutable import servermap, publish, retrieve
16781 import allmydata.test.common_util as testutil
16782hunk ./src/allmydata/test/test_web.py 203
16783             foo.set_uri(u"bar.txt", self._bar_txt_uri, self._bar_txt_uri)
16784             self._bar_txt_verifycap = n.get_verify_cap().to_string()
16785 
16786+            # sdmf
16787+            # (used by the sdmf GET/PUT tests below)
16788+            self.BAZ_CONTENTS, n, self._baz_txt_uri, self._baz_txt_readonly_uri = self.makefile_mutable(0)
16789+
16790+            foo.set_uri(u"baz.txt", self._baz_txt_uri, self._baz_txt_readonly_uri)
16791+
16792+            # mdmf
16793+            self.QUUX_CONTENTS, n, self._quux_txt_uri, self._quux_txt_readonly_uri = self.makefile_mutable(1, mdmf=True) # distinct contents from baz
16794+            assert self._quux_txt_uri.startswith("URI:MDMF")
16795+            foo.set_uri(u"quux.txt", self._quux_txt_uri, self._quux_txt_readonly_uri)
16796+
16797             foo.set_uri(u"empty", res[3][1].get_uri(),
16798                         res[3][1].get_readonly_uri())
16799             sub_uri = res[4][1].get_uri()
16800hunk ./src/allmydata/test/test_web.py 245
16801             # public/
16802             # public/foo/
16803             # public/foo/bar.txt
16804+            # public/foo/baz.txt
16805+            # public/foo/quux.txt
16806             # public/foo/blockingfile
16807             # public/foo/empty/
16808             # public/foo/sub/
16809hunk ./src/allmydata/test/test_web.py 267
16810         n = create_chk_filenode(contents)
16811         return contents, n, n.get_uri()
16812 
16813+    def makefile_mutable(self, number, mdmf=False):
16814+        contents = "contents of mutable file %s\n" % number
16815+        n = create_mutable_filenode(contents, mdmf)
16816+        return contents, n, n.get_uri(), n.get_readonly_uri()
16817+
16818     def tearDown(self):
16819         return self.s.stopService()
16820 
16821hunk ./src/allmydata/test/test_web.py 278
16822     def failUnlessIsBarDotTxt(self, res):
16823         self.failUnlessReallyEqual(res, self.BAR_CONTENTS, res)
16824 
16825+    def failUnlessIsQuuxDotTxt(self, res):
16826+        self.failUnlessReallyEqual(res, self.QUUX_CONTENTS, res)
16827+
16828+    def failUnlessIsBazDotTxt(self, res):
16829+        self.failUnlessReallyEqual(res, self.BAZ_CONTENTS, res)
16830+
16831     def failUnlessIsBarJSON(self, res):
16832         data = simplejson.loads(res)
16833         self.failUnless(isinstance(data, list))
16834hunk ./src/allmydata/test/test_web.py 295
16835         self.failUnlessReallyEqual(to_str(data[1]["verify_uri"]), self._bar_txt_verifycap)
16836         self.failUnlessReallyEqual(data[1]["size"], len(self.BAR_CONTENTS))
16837 
16838+    def failUnlessIsQuuxJSON(self, res):
16839+        data = simplejson.loads(res)
16840+        self.failUnless(isinstance(data, list))
16841+        self.failUnlessEqual(data[0], "filenode")
16842+        self.failUnless(isinstance(data[1], dict))
16843+        metadata = data[1]
16844+        return self.failUnlessIsQuuxDotTxtMetadata(metadata)
16845+
16846+    def failUnlessIsQuuxDotTxtMetadata(self, metadata):
16847+        self.failUnless(metadata['mutable'])
16848+        self.failUnless("rw_uri" in metadata)
16849+        self.failUnlessEqual(metadata['rw_uri'], self._quux_txt_uri)
16850+        self.failUnless("ro_uri" in metadata)
16851+        self.failUnlessEqual(metadata['ro_uri'], self._quux_txt_readonly_uri)
16852+        self.failUnlessReallyEqual(metadata['size'], len(self.QUUX_CONTENTS))
16853+
16854     def failUnlessIsFooJSON(self, res):
16855         data = simplejson.loads(res)
16856         self.failUnless(isinstance(data, list))
16857hunk ./src/allmydata/test/test_web.py 324
16858 
16859         kidnames = sorted([unicode(n) for n in data[1]["children"]])
16860         self.failUnlessEqual(kidnames,
16861-                             [u"bar.txt", u"blockingfile", u"empty",
16862-                              u"n\u00fc.txt", u"sub"])
16863+                             [u"bar.txt", u"baz.txt", u"blockingfile",
16864+                              u"empty", u"n\u00fc.txt", u"quux.txt", u"sub"])
16865         kids = dict( [(unicode(name),value)
16866                       for (name,value)
16867                       in data[1]["children"].iteritems()] )
16868hunk ./src/allmydata/test/test_web.py 346
16869                                    self._bar_txt_metadata["tahoe"]["linkcrtime"])
16870         self.failUnlessReallyEqual(to_str(kids[u"n\u00fc.txt"][1]["ro_uri"]),
16871                                    self._bar_txt_uri)
16872+        self.failUnlessIn("quux.txt", kids)
16873+        self.failUnlessReallyEqual(kids[u"quux.txt"][1]["rw_uri"],
16874+                                   self._quux_txt_uri)
16875+        self.failUnlessReallyEqual(kids[u"quux.txt"][1]["ro_uri"],
16876+                                   self._quux_txt_readonly_uri)
16877 
16878     def GET(self, urlpath, followRedirect=False, return_response=False,
16879             **kwargs):
16880hunk ./src/allmydata/test/test_web.py 851
16881         d.addCallback(self.failUnlessIsBarDotTxt)
16882         return d
16883 
16884+    def test_GET_FILE_URI_mdmf(self):
16885+        base = "/uri/%s" % urllib.quote(self._quux_txt_uri)
16886+        d = self.GET(base)
16887+        d.addCallback(self.failUnlessIsQuuxDotTxt)
16888+        return d
16889+
16890+    def test_GET_FILE_URI_mdmf_extensions(self):
16891+        base = "/uri/%s" % urllib.quote("%s:3:131073" % self._quux_txt_uri)
16892+        d = self.GET(base)
16893+        d.addCallback(self.failUnlessIsQuuxDotTxt)
16894+        return d
16895+
16896+    def test_GET_FILE_URI_mdmf_readonly(self):
16897+        base = "/uri/%s" % urllib.quote(self._quux_txt_readonly_uri)
16898+        d = self.GET(base)
16899+        d.addCallback(self.failUnlessIsQuuxDotTxt)
16900+        return d
16901+
16902     def test_GET_FILE_URI_badchild(self):
16903         base = "/uri/%s/boguschild" % urllib.quote(self._bar_txt_uri)
16904         errmsg = "Files have no children, certainly not named 'boguschild'"
16905hunk ./src/allmydata/test/test_web.py 885
16906                              self.PUT, base, "")
16907         return d
16908 
16909+    def test_PUT_FILE_URI_mdmf(self):
16910+        base = "/uri/%s" % urllib.quote(self._quux_txt_uri)
16911+        self._quux_new_contents = "new_contents"
16912+        d = self.GET(base)
16913+        d.addCallback(lambda res:
16914+            self.failUnlessIsQuuxDotTxt(res))
16915+        d.addCallback(lambda ignored:
16916+            self.PUT(base, self._quux_new_contents))
16917+        d.addCallback(lambda ignored:
16918+            self.GET(base))
16919+        d.addCallback(lambda res:
16920+            self.failUnlessReallyEqual(res, self._quux_new_contents))
16921+        return d
16922+
16923+    def test_PUT_FILE_URI_mdmf_extensions(self):
16924+        base = "/uri/%s" % urllib.quote("%s:3:131073" % self._quux_txt_uri)
16925+        self._quux_new_contents = "new_contents"
16926+        d = self.GET(base)
16927+        d.addCallback(lambda res: self.failUnlessIsQuuxDotTxt(res))
16928+        d.addCallback(lambda ignored: self.PUT(base, self._quux_new_contents))
16929+        d.addCallback(lambda ignored: self.GET(base))
16930+        d.addCallback(lambda res: self.failUnlessEqual(self._quux_new_contents,
16931+                                                       res))
16932+        return d
16933+
16934+    def test_PUT_FILE_URI_mdmf_readonly(self):
16935+        # We're not allowed to PUT things to a readonly cap.
16936+        base = "/uri/%s" % self._quux_txt_readonly_uri
16937+        d = self.GET(base)
16938+        d.addCallback(lambda res:
16939+            self.failUnlessIsQuuxDotTxt(res))
16940+        # This should fail with a 400 Bad Request, not the 500 error we used to get.
16941+        d.addCallback(lambda ignored:
16942+            self.shouldFail2(error.Error, "test_PUT_FILE_URI_mdmf_readonly",
16943+                             "400 Bad Request", "read-only cap",
16944+                             self.PUT, base, "new data"))
16945+        return d
16946+
16947+    def test_PUT_FILE_URI_sdmf_readonly(self):
16948+        # We're not allowed to put things to a readonly cap.
16949+        base = "/uri/%s" % self._baz_txt_readonly_uri
16950+        d = self.GET(base)
16951+        d.addCallback(lambda res:
16952+            self.failUnlessIsBazDotTxt(res))
16953+        d.addCallback(lambda ignored:
16954+            self.shouldFail2(error.Error, "test_PUT_FILE_URI_sdmf_readonly",
16955+                             "400 Bad Request", "read-only cap",
16956+                             self.PUT, base, "new_data"))
16957+        return d
16958+
16959     # TODO: version of this with a Unicode filename
16960     def test_GET_FILEURL_save(self):
16961         d = self.GET(self.public_url + "/foo/bar.txt?filename=bar.txt&save=true",
16962hunk ./src/allmydata/test/test_web.py 951
16963         d.addBoth(self.should404, "test_GET_FILEURL_missing")
16964         return d
16965 
16966+    def test_GET_FILEURL_info_mdmf(self):
16967+        d = self.GET("/uri/%s?t=info" % self._quux_txt_uri)
16968+        def _got(res):
16969+            self.failUnlessIn("mutable file (mdmf)", res)
16970+            self.failUnlessIn(self._quux_txt_uri, res)
16971+            self.failUnlessIn(self._quux_txt_readonly_uri, res)
16972+        d.addCallback(_got)
16973+        return d
16974+
16975+    def test_GET_FILEURL_info_mdmf_readonly(self):
16976+        d = self.GET("/uri/%s?t=info" % self._quux_txt_readonly_uri)
16977+        def _got(res):
16978+            self.failUnlessIn("mutable file (mdmf)", res)
16979+            self.failIfIn(self._quux_txt_uri, res)
16980+            self.failUnlessIn(self._quux_txt_readonly_uri, res)
16981+        d.addCallback(_got)
16982+        return d
16983+
16984+    def test_GET_FILEURL_info_sdmf(self):
16985+        d = self.GET("/uri/%s?t=info" % self._baz_txt_uri)
16986+        def _got(res):
16987+            self.failUnlessIn("mutable file (sdmf)", res)
16988+            self.failUnlessIn(self._baz_txt_uri, res)
16989+        d.addCallback(_got)
16990+        return d
16991+
16992+    def test_GET_FILEURL_info_mdmf_extensions(self):
16993+        d = self.GET("/uri/%s:3:131073?t=info" % self._quux_txt_uri)
16994+        def _got(res):
16995+            self.failUnlessIn("mutable file (mdmf)", res)
16996+            self.failUnlessIn(self._quux_txt_uri, res)
16997+            self.failUnlessIn(self._quux_txt_readonly_uri, res)
16998+        d.addCallback(_got)
16999+        return d
17000+
17001     def test_PUT_overwrite_only_files(self):
17002         # create a directory, put a file in that directory.
17003         contents, n, filecap = self.makefile(8)
17004hunk ./src/allmydata/test/test_web.py 1033
17005         contents = self.NEWFILE_CONTENTS * 300000
17006         d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
17007                      contents)
17008+        def _got_filecap(filecap):
17009+            self.failUnless(filecap.startswith("URI:MDMF"))
17010+            return filecap
17011+        d.addCallback(_got_filecap)
17012         d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
17013         d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
17014         return d
17015hunk ./src/allmydata/test/test_web.py 1203
17016         d.addCallback(_got_json, "sdmf")
17017         return d
17018 
17019+    def test_GET_FILEURL_json_mdmf_extensions(self):
17020+        # A GET invoked against a URL that includes an MDMF cap with
17021+        # extensions should fetch the same JSON information as a GET
17022+        # invoked against a bare cap.
17023+        self._quux_txt_uri = "%s:3:131073" % self._quux_txt_uri
17024+        self._quux_txt_readonly_uri = "%s:3:131073" % self._quux_txt_readonly_uri
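+        # (mutating these attributes also makes failUnlessIsQuuxJSON
+        # compare against the extended caps)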
17025+        d = self.GET("/uri/%s?t=json" % urllib.quote(self._quux_txt_uri))
17026+        d.addCallback(self.failUnlessIsQuuxJSON)
17027+        return d
17028+
17029+    def test_GET_FILEURL_json_mdmf(self):
17030+        d = self.GET("/uri/%s?t=json" % urllib.quote(self._quux_txt_uri))
17031+        d.addCallback(self.failUnlessIsQuuxJSON)
17032+        return d
17033+
17034     def test_GET_FILEURL_json_missing(self):
17035         d = self.GET(self.public_url + "/foo/missing?json")
17036         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
17037hunk ./src/allmydata/test/test_web.py 1262
17038             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
17039             self.failUnlessIn("mutable-type-mdmf", res)
17040             self.failUnlessIn("mutable-type-sdmf", res)
17041+            self.failUnlessIn("quux", res)
17042         d.addCallback(_check)
17043         return d
17044 
17045hunk ./src/allmydata/test/test_web.py 1520
17046         d.addCallback(self.get_operation_results, "127", "json")
17047         def _got_json(stats):
17048             expected = {"count-immutable-files": 3,
17049-                        "count-mutable-files": 0,
17050+                        "count-mutable-files": 2,
17051                         "count-literal-files": 0,
17052hunk ./src/allmydata/test/test_web.py 1522
17053-                        "count-files": 3,
17054+                        "count-files": 5,
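+                        # (baz.txt and quux.txt add two mutable files to
+                        # the three immutable ones)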
17055                         "count-directories": 3,
17056                         "size-immutable-files": 57,
17057                         "size-literal-files": 0,
17058hunk ./src/allmydata/test/test_web.py 1528
17059                         #"size-directories": 1912, # varies
17060                         #"largest-directory": 1590,
17061-                        "largest-directory-children": 5,
17062+                        "largest-directory-children": 7,
17063                         "largest-immutable-file": 19,
17064                         }
17065             for k,v in expected.iteritems():
17066hunk ./src/allmydata/test/test_web.py 1545
17067         def _check(res):
17068             self.failUnless(res.endswith("\n"))
17069             units = [simplejson.loads(t) for t in res[:-1].split("\n")]
17070-            self.failUnlessReallyEqual(len(units), 7)
17071+            self.failUnlessReallyEqual(len(units), 9)
17072             self.failUnlessEqual(units[-1]["type"], "stats")
17073             first = units[0]
17074             self.failUnlessEqual(first["path"], [])
17075hunk ./src/allmydata/test/test_web.py 1556
17076             self.failIfEqual(baz["storage-index"], None)
17077             self.failIfEqual(baz["verifycap"], None)
17078             self.failIfEqual(baz["repaircap"], None)
17079+            # XXX: Add quux and baz to this test.
17080             return
17081         d.addCallback(_check)
17082         return d
17083hunk ./src/allmydata/test/test_web.py 2002
17084         d.addCallback(lambda ignored:
17085             self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
17086                       file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
17087+        def _got_filecap(filecap):
17088+            self.failUnless(filecap.startswith("URI:MDMF"))
17089+            return filecap
17090+        d.addCallback(_got_filecap)
17091         d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
17092         d.addCallback(_got_json, "mdmf")
17093         return d
17094hunk ./src/allmydata/test/test_web.py 2019
17095             filenameu = unicode(filename)
17096             self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
17097             return self.GET(self.public_url + "/foo/%s?t=json" % filename)
17098+        def _got_mdmf_cap(filecap):
17099+            self.failUnless(filecap.startswith("URI:MDMF"))
17100+            return filecap
17101         d.addCallback(_got_cap, "sdmf.txt")
17102         def _got_json(json, version):
17103             data = simplejson.loads(json)
17104hunk ./src/allmydata/test/test_web.py 2034
17105             self.POST(self.public_url + \
17106                       "/foo?t=upload&mutable=true&mutable-type=mdmf",
17107                       file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
17108+        d.addCallback(_got_mdmf_cap)
17109         d.addCallback(_got_cap, "mdmf.txt")
17110         d.addCallback(_got_json, "mdmf")
17111         return d
17112hunk ./src/allmydata/test/test_web.py 2268
17113         # make sure that nothing was added
17114         d.addCallback(lambda res:
17115                       self.failUnlessNodeKeysAre(self._foo_node,
17116-                                                 [u"bar.txt", u"blockingfile",
17117-                                                  u"empty", u"n\u00fc.txt",
17118+                                                 [u"bar.txt", u"baz.txt", u"blockingfile",
17119+                                                  u"empty", u"n\u00fc.txt", u"quux.txt",
17120                                                   u"sub"]))
17121         return d
17122 
17123hunk ./src/allmydata/test/test_web.py 2391
17124         d.addCallback(_check3)
17125         return d
17126 
17127+    def test_POST_FILEURL_mdmf_check(self):
17128+        quux_url = "/uri/%s" % urllib.quote(self._quux_txt_uri)
17129+        d = self.POST(quux_url, t="check")
17130+        def _check(res):
17131+            self.failUnlessIn("Healthy", res)
17132+        d.addCallback(_check)
17133+        quux_extension_url = "/uri/%s" % urllib.quote("%s:3:131073" % self._quux_txt_uri)
17134+        d.addCallback(lambda ignored:
17135+            self.POST(quux_extension_url, t="check"))
17136+        d.addCallback(_check)
17137+        return d
17138+
17139+    def test_POST_FILEURL_mdmf_check_and_repair(self):
17140+        quux_url = "/uri/%s" % urllib.quote(self._quux_txt_uri)
17141+        d = self.POST(quux_url, t="check", repair="true")
17142+        def _check(res):
17143+            self.failUnlessIn("Healthy", res)
17144+        d.addCallback(_check)
17145+        quux_extension_url = "/uri/%s" %\
17146+            urllib.quote("%s:3:131073" % self._quux_txt_uri)
17147+        d.addCallback(lambda ignored:
17148+            self.POST(quux_extension_url, t="check", repair="true"))
17149+        d.addCallback(_check)
17150+        return d
17151+
17152     def wait_for_operation(self, ignored, ophandle):
17153         url = "/operations/" + ophandle
17154         url += "?t=status&output=JSON"
17155hunk ./src/allmydata/test/test_web.py 2461
17156         d.addCallback(self.wait_for_operation, "123")
17157         def _check_json(data):
17158             self.failUnlessReallyEqual(data["finished"], True)
17159-            self.failUnlessReallyEqual(data["count-objects-checked"], 8)
17160-            self.failUnlessReallyEqual(data["count-objects-healthy"], 8)
17161+            self.failUnlessReallyEqual(data["count-objects-checked"], 10)
17162+            self.failUnlessReallyEqual(data["count-objects-healthy"], 10)
17163         d.addCallback(_check_json)
17164         d.addCallback(self.get_operation_results, "123", "html")
17165         def _check_html(res):
17166hunk ./src/allmydata/test/test_web.py 2466
17167-            self.failUnless("Objects Checked: <span>8</span>" in res)
17168-            self.failUnless("Objects Healthy: <span>8</span>" in res)
17169+            self.failUnless("Objects Checked: <span>10</span>" in res)
17170+            self.failUnless("Objects Healthy: <span>10</span>" in res)
17171         d.addCallback(_check_html)
17172 
17173         d.addCallback(lambda res:
17174hunk ./src/allmydata/test/test_web.py 2496
17175         d.addCallback(self.wait_for_operation, "124")
17176         def _check_json(data):
17177             self.failUnlessReallyEqual(data["finished"], True)
17178-            self.failUnlessReallyEqual(data["count-objects-checked"], 8)
17179-            self.failUnlessReallyEqual(data["count-objects-healthy-pre-repair"], 8)
17180+            self.failUnlessReallyEqual(data["count-objects-checked"], 10)
17181+            self.failUnlessReallyEqual(data["count-objects-healthy-pre-repair"], 10)
17182             self.failUnlessReallyEqual(data["count-objects-unhealthy-pre-repair"], 0)
17183             self.failUnlessReallyEqual(data["count-corrupt-shares-pre-repair"], 0)
17184             self.failUnlessReallyEqual(data["count-repairs-attempted"], 0)
17185hunk ./src/allmydata/test/test_web.py 2503
17186             self.failUnlessReallyEqual(data["count-repairs-successful"], 0)
17187             self.failUnlessReallyEqual(data["count-repairs-unsuccessful"], 0)
17188-            self.failUnlessReallyEqual(data["count-objects-healthy-post-repair"], 8)
17189+            self.failUnlessReallyEqual(data["count-objects-healthy-post-repair"], 10)
17190             self.failUnlessReallyEqual(data["count-objects-unhealthy-post-repair"], 0)
17191             self.failUnlessReallyEqual(data["count-corrupt-shares-post-repair"], 0)
17192         d.addCallback(_check_json)
17193hunk ./src/allmydata/test/test_web.py 2509
17194         d.addCallback(self.get_operation_results, "124", "html")
17195         def _check_html(res):
17196-            self.failUnless("Objects Checked: <span>8</span>" in res)
17197+            self.failUnless("Objects Checked: <span>10</span>" in res)
17198 
17199hunk ./src/allmydata/test/test_web.py 2511
17200-            self.failUnless("Objects Healthy (before repair): <span>8</span>" in res)
17201+            self.failUnless("Objects Healthy (before repair): <span>10</span>" in res)
17202             self.failUnless("Objects Unhealthy (before repair): <span>0</span>" in res)
17203             self.failUnless("Corrupt Shares (before repair): <span>0</span>" in res)
17204 
17205hunk ./src/allmydata/test/test_web.py 2519
17206             self.failUnless("Repairs Successful: <span>0</span>" in res)
17207             self.failUnless("Repairs Unsuccessful: <span>0</span>" in res)
17208 
17209-            self.failUnless("Objects Healthy (after repair): <span>8</span>" in res)
17210+            self.failUnless("Objects Healthy (after repair): <span>10</span>" in res)
17211             self.failUnless("Objects Unhealthy (after repair): <span>0</span>" in res)
17212             self.failUnless("Corrupt Shares (after repair): <span>0</span>" in res)
17213         d.addCallback(_check_html)
17214hunk ./src/allmydata/test/test_web.py 2649
17215         filecap3 = node3.get_readonly_uri()
17216         node4 = self.s.create_node_from_uri(make_mutable_file_uri())
17217         dircap = DirectoryNode(node4, None, None).get_uri()
17218+        mdmfcap = make_mutable_file_uri(mdmf=True)
17219         litdircap = "URI:DIR2-LIT:ge3dumj2mewdcotyfqydulbshj5x2lbm"
17220         emptydircap = "URI:DIR2-LIT:"
17221         newkids = {u"child-imm":        ["filenode", {"rw_uri": filecap1,
17222hunk ./src/allmydata/test/test_web.py 2666
17223                                                       "ro_uri": self._make_readonly(dircap)}],
17224                    u"dirchild-lit":     ["dirnode",  {"ro_uri": litdircap}],
17225                    u"dirchild-empty":   ["dirnode",  {"ro_uri": emptydircap}],
17226+                   u"child-mutable-mdmf": ["filenode", {"rw_uri": mdmfcap,
17227+                                                        "ro_uri": self._make_readonly(mdmfcap)}],
17228                    }
17229         return newkids, {'filecap1': filecap1,
17230                          'filecap2': filecap2,
17231hunk ./src/allmydata/test/test_web.py 2677
17232                          'unknown_immcap': unknown_immcap,
17233                          'dircap': dircap,
17234                          'litdircap': litdircap,
17235-                         'emptydircap': emptydircap}
17236+                         'emptydircap': emptydircap,
17237+                         'mdmfcap': mdmfcap}
17238 
17239     def _create_immutable_children(self):
17240         contents, n, filecap1 = self.makefile(12)
17241hunk ./src/allmydata/test/test_web.py 3224
17242             data = data[1]
17243             self.failUnlessIn("mutable-type", data)
17244             self.failUnlessEqual(data['mutable-type'], "mdmf")
17245+            self.failUnless(data['rw_uri'].startswith("URI:MDMF"))
17246+            self.failUnless(data['ro_uri'].startswith("URI:MDMF"))
17247         d.addCallback(_got_json)
17248         return d
17249 
17250}
17251[web/filenode.py: complain if a PUT is requested with a readonly cap
17252Kevan Carstensen <kevan@isnotajoke.com>**20110515230421
17253 Ignore-this: e2f05201f3b008e157062ed187eacbb9
17254] hunk ./src/allmydata/web/filenode.py 229
17255                 raise ExistingChildError()
17256 
17257             if self.node.is_mutable():
17258+                # Are we a readonly filenode? We shouldn't allow callers
17259+                # to try to replace us if we are.
17260+                if self.node.is_readonly():
17261+                    raise WebError("PUT to a mutable file: replace or update"
17262+                                   " requested with read-only cap")
17263                 if offset is None:
17264                     return self.replace_my_contents(req)
17265 
17266[web/info.py: Display mutable type information when describing a mutable file
17267Kevan Carstensen <kevan@isnotajoke.com>**20110515230444
17268 Ignore-this: ce5ad22b494effe6c15e49471fae0d99
17269] {
17270hunk ./src/allmydata/web/info.py 8
17271 from nevow.inevow import IRequest
17272 
17273 from allmydata.util import base32
17274-from allmydata.interfaces import IDirectoryNode, IFileNode
17275+from allmydata.interfaces import IDirectoryNode, IFileNode, MDMF_VERSION, SDMF_VERSION
17276 from allmydata.web.common import getxmlfile
17277 from allmydata.mutable.common import UnrecoverableFileError # TODO: move
17278 
17279hunk ./src/allmydata/web/info.py 31
17280             si = node.get_storage_index()
17281             if si:
17282                 if node.is_mutable():
17283-                    return "mutable file"
17284+                    ret = "mutable file"
17285+                    if node.get_version() == MDMF_VERSION:
17286+                        ret += " (mdmf)"
17287+                    else:
17288+                        ret += " (sdmf)"
17289+                    return ret
17290                 return "immutable file"
17291             return "immutable LIT file"
17292         return "unknown"
17293}
17294
17295Context:
17296
17297[allmydata/__init__.py: Nicer reporting of unparseable version numbers in dependencies. fixes #1388
17298david-sarah@jacaranda.org**20110401202750
17299 Ignore-this: 9c6bd599259d2405e1caadbb3e0d8c7f
17300] 
17301[update FTP-and-SFTP.rst: the necessary patch is included in Twisted-10.1
17302Brian Warner <warner@lothar.com>**20110325232511
17303 Ignore-this: d5307faa6900f143193bfbe14e0f01a
17304] 
17305[control.py: remove all uses of s.get_serverid()
17306warner@lothar.com**20110227011203
17307 Ignore-this: f80a787953bd7fa3d40e828bde00e855
17308] 
17309[web: remove some uses of s.get_serverid(), not all
17310warner@lothar.com**20110227011159
17311 Ignore-this: a9347d9cf6436537a47edc6efde9f8be
17312] 
17313[immutable/downloader/fetcher.py: remove all get_serverid() calls
17314warner@lothar.com**20110227011156
17315 Ignore-this: fb5ef018ade1749348b546ec24f7f09a
17316] 
17317[immutable/downloader/fetcher.py: fix diversity bug in server-response handling
17318warner@lothar.com**20110227011153
17319 Ignore-this: bcd62232c9159371ae8a16ff63d22c1b
17320 
17321 When blocks terminate (either COMPLETE or CORRUPT/DEAD/BADSEGNUM), the
17322 _shares_from_server dict was being popped incorrectly (using shnum as the
17323 index instead of serverid). I'm still thinking through the consequences of
17324 this bug. It was probably benign and really hard to detect. I think it would
17325 cause us to incorrectly believe that we're pulling too many shares from a
17326 server, and thus prefer a different server rather than asking for a second
17327 share from the first server. The diversity code is intended to spread out the
17328 number of shares simultaneously being requested from each server, but with
17329 this bug, it might be spreading out the total number of shares requested at
17330 all, not just simultaneously. (note that SegmentFetcher is scoped to a single
17331 segment, so the effect doesn't last very long).
17332] 
17333[immutable/downloader/share.py: reduce get_serverid(), one left, update ext deps
17334warner@lothar.com**20110227011150
17335 Ignore-this: d8d56dd8e7b280792b40105e13664554
17336 
17337 test_download.py: create+check MyShare instances better, make sure they share
17338 Server objects, now that finder.py cares
17339] 
17340[immutable/downloader/finder.py: reduce use of get_serverid(), one left
17341warner@lothar.com**20110227011146
17342 Ignore-this: 5785be173b491ae8a78faf5142892020
17343] 
17344[immutable/offloaded.py: reduce use of get_serverid() a bit more
17345warner@lothar.com**20110227011142
17346 Ignore-this: b48acc1b2ae1b311da7f3ba4ffba38f
17347] 
17348[immutable/upload.py: reduce use of get_serverid()
17349warner@lothar.com**20110227011138
17350 Ignore-this: ffdd7ff32bca890782119a6e9f1495f6
17351] 
17352[immutable/checker.py: remove some uses of s.get_serverid(), not all
17353warner@lothar.com**20110227011134
17354 Ignore-this: e480a37efa9e94e8016d826c492f626e
17355] 
17356[add remaining get_* methods to storage_client.Server, NoNetworkServer, and
17357warner@lothar.com**20110227011132
17358 Ignore-this: 6078279ddf42b179996a4b53bee8c421
17359 MockIServer stubs
17360] 
17361[upload.py: rearrange _make_trackers a bit, no behavior changes
17362warner@lothar.com**20110227011128
17363 Ignore-this: 296d4819e2af452b107177aef6ebb40f
17364] 
17365[happinessutil.py: finally rename merge_peers to merge_servers
17366warner@lothar.com**20110227011124
17367 Ignore-this: c8cd381fea1dd888899cb71e4f86de6e
17368] 
17369[test_upload.py: factor out FakeServerTracker
17370warner@lothar.com**20110227011120
17371 Ignore-this: 6c182cba90e908221099472cc159325b
17372] 
17373[test_upload.py: server-vs-tracker cleanup
17374warner@lothar.com**20110227011115
17375 Ignore-this: 2915133be1a3ba456e8603885437e03
17376] 
17377[happinessutil.py: server-vs-tracker cleanup
17378warner@lothar.com**20110227011111
17379 Ignore-this: b856c84033562d7d718cae7cb01085a9
17380] 
17381[upload.py: more tracker-vs-server cleanup
17382warner@lothar.com**20110227011107
17383 Ignore-this: bb75ed2afef55e47c085b35def2de315
17384] 
17385[upload.py: fix var names to avoid confusion between 'trackers' and 'servers'
17386warner@lothar.com**20110227011103
17387 Ignore-this: 5d5e3415b7d2732d92f42413c25d205d
17388] 
17389[refactor: s/peer/server/ in immutable/upload, happinessutil.py, test_upload
17390warner@lothar.com**20110227011100
17391 Ignore-this: 7ea858755cbe5896ac212a925840fe68
17392 
17393 No behavioral changes, just updating variable/method names and log messages.
17394 The effects outside these three files should be minimal: some exception
17395 messages changed (to say "server" instead of "peer"), and some internal class
17396 names were changed. A few things still use "peer" to minimize external
17397 changes, like UploadResults.timings["peer_selection"] and
17398 happinessutil.merge_peers, which can be changed later.
17399] 
17400[storage_client.py: clean up test_add_server/test_add_descriptor, remove .test_servers
17401warner@lothar.com**20110227011056
17402 Ignore-this: efad933e78179d3d5fdcd6d1ef2b19cc
17403] 
17404[test_client.py, upload.py:: remove KiB/MiB/etc constants, and other dead code
17405warner@lothar.com**20110227011051
17406 Ignore-this: dc83c5794c2afc4f81e592f689c0dc2d
17407] 
17408[test: increase timeout on a network test because Francois's ARM machine hit that timeout
17409zooko@zooko.com**20110317165909
17410 Ignore-this: 380c345cdcbd196268ca5b65664ac85b
17411 I'm skeptical that the test was proceeding correctly but ran out of time. It seems more likely that it had gotten hung. But if we raise the timeout to an even more extravagant number then we can be even more certain that the test was never going to finish.
17412] 
17413[docs/configuration.rst: add a "Frontend Configuration" section
17414Brian Warner <warner@lothar.com>**20110222014323
17415 Ignore-this: 657018aa501fe4f0efef9851628444ca
17416 
17417 this points to docs/frontends/*.rst, which were previously underlinked
17418] 
17419[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
17420"Brian Warner <warner@lothar.com>"**20110221061544
17421 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
17422] 
17423[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
17424david-sarah@jacaranda.org**20110221015817
17425 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
17426] 
17427[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
17428david-sarah@jacaranda.org**20110221020125
17429 Ignore-this: b0744ed58f161bf188e037bad077fc48
17430] 
17431[Refactor StorageFarmBroker handling of servers
17432Brian Warner <warner@lothar.com>**20110221015804
17433 Ignore-this: 842144ed92f5717699b8f580eab32a51
17434 
17435 Pass around IServer instance instead of (peerid, rref) tuple. Replace
17436 "descriptor" with "server". Other replacements:
17437 
17438  get_all_servers -> get_connected_servers/get_known_servers
17439  get_servers_for_index -> get_servers_for_psi (now returns IServers)
17440 
17441 This change still needs to be pushed further down: lots of code is now
17442 getting the IServer and then distributing (peerid, rref) internally.
17443 Instead, it ought to distribute the IServer internally and delay
17444 extracting a serverid or rref until the last moment.
17445 
17446 no_network.py was updated to retain parallelism.
17447] 
17448[TAG allmydata-tahoe-1.8.2
17449warner@lothar.com**20110131020101] 
17450Patch bundle hash:
17451aa8587a55764e5912ebeb248f6198d1bf08ec13a