Ticket #393: 393status41.dpatch

File 393status41.dpatch, 700.2 KB (added by kevan, at 2011-05-03T03:17:44Z)

begin work on layout changes, add MDMF caps, tweak nodemaker and filenode to use MDMF caps

Line 
1Mon Aug  9 16:32:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * interfaces.py: Add #993 interfaces
3
4Mon Aug  9 16:35:35 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
5  * frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
6
7Mon Aug  9 17:06:19 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
8  * immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
9
10Mon Aug  9 17:06:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * immutable/literal.py: implement the same interfaces as other filenodes
12
13Fri Aug 13 16:49:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * scripts: tell 'tahoe put' about MDMF
15
16Sat Aug 14 01:10:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * web: Alter the webapi to get along with and take advantage of the MDMF changes
18 
19  The main benefit that the webapi gets from MDMF, at least initially, is
20  the ability to do a streaming download of an MDMF mutable file. It also
21  exposes a way (through the PUT verb) to append to or otherwise modify
22  (in-place) an MDMF mutable file.
23
24Sat Aug 14 15:57:11 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
25  * client.py: learn how to create different kinds of mutable files
26
27Wed Aug 18 17:32:16 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
28  * mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
29 
30  The checker and repairer required minimal changes to work with the MDMF
31  modifications made elsewhere. The checker duplicated a lot of the code
32  that was already in the downloader, so I modified the downloader
33  slightly to expose this functionality to the checker and removed the
34  duplicated code. The repairer only required a minor change to deal with
35  data representation.
36
37Wed Aug 18 17:32:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * mutable/filenode.py: add versions and partial-file updates to the mutable file node
39 
40  One of the goals of MDMF as a GSoC project is to lay the groundwork for
41  LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
42  multiple versions of a single cap on the grid. In line with this, there
43  is a now a distinction between an overriding mutable file (which can be
44  thought to correspond to the cap/unique identifier for that mutable
45  file) and versions of the mutable file (which we can download, update,
46  and so on). All download, upload, and modification operations end up
47  happening on a particular version of a mutable file, but there are
48  shortcut methods on the object representing the overriding mutable file
49  that perform these operations on the best version of the mutable file
50  (which is what code should be doing until we have LDMF and better
51  support for other paradigms).
52 
53  Another goal of MDMF was to take advantage of segmentation to give
54  callers more efficient partial file updates or appends. This patch
55  implements methods that do that, too.
56 
57
58Wed Aug 18 17:33:42 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * mutable/publish.py: Modify the publish process to support MDMF
60 
61  The inner workings of the publishing process needed to be reworked to a
62  large extend to cope with segmented mutable files, and to cope with
63  partial-file updates of mutable files. This patch does that. It also
64  introduces wrappers for uploadable data, allowing the use of
65  filehandle-like objects as data sources, in addition to strings. This
66  reduces memory inefficiency when dealing with large files through the
67  webapi, and clarifies update code there.
68
69Wed Aug 18 17:35:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
70  * nodemaker.py: Make nodemaker expose a way to create MDMF files
71
72Sat Aug 14 15:56:44 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
73  * docs: update docs to mention MDMF
74
75Wed Aug 18 17:33:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
76  * mutable/layout.py and interfaces.py: add MDMF writer and reader
77 
78  The MDMF writer is responsible for keeping state as plaintext is
79  gradually processed into share data by the upload process. When the
80  upload finishes, it will write all of its share data to a remote server,
81  reporting its status back to the publisher.
82 
83  The MDMF reader is responsible for abstracting an MDMF file as it sits
84  on the grid from the downloader; specifically, by receiving and
85  responding to requests for arbitrary data within the MDMF file.
86 
87  The interfaces.py file has also been modified to contain an interface
88  for the writer.
89
90Wed Aug 18 17:34:09 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
91  * mutable/retrieve.py: Modify the retrieval process to support MDMF
92 
93  The logic behind a mutable file download had to be adapted to work with
94  segmented mutable files; this patch performs those adaptations. It also
95  exposes some decoding and decrypting functionality to make partial-file
96  updates a little easier, and supports efficient random-access downloads
97  of parts of an MDMF file.
98
99Wed Aug 18 17:34:39 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
100  * mutable/servermap.py: Alter the servermap updater to work with MDMF files
101 
102  These modifications were basically all to the end of having the
103  servermap updater use the unified MDMF + SDMF read interface whenever
104  possible -- this reduces the complexity of the code, making it easier to
105  read and maintain. To do this, I needed to modify the process of
106  updating the servermap a little bit.
107 
108  To support partial-file updates, I also modified the servermap updater
109  to fetch the block hash trees and certain segments of files while it
110  performed a servermap update (this can be done without adding any new
111  roundtrips because of batch-read functionality that the read proxy has).
112 
113
114Wed Aug 18 17:35:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
115  * tests:
116 
117      - A lot of existing tests relied on aspects of the mutable file
118        implementation that were changed. This patch updates those tests
119        to work with the changes.
120      - This patch also adds tests for new features.
121
122Sun Feb 20 15:02:01 PST 2011  "Brian Warner <warner@lothar.com>"
123  * resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
124
125Sun Feb 20 17:46:59 PST 2011  "Brian Warner <warner@lothar.com>"
126  * mutable/filenode.py: fix create_mutable_file('string')
127
128Sun Feb 20 21:56:00 PST 2011  "Brian Warner <warner@lothar.com>"
129  * resolve more conflicts with current trunk
130
131Sun Feb 20 22:10:04 PST 2011  "Brian Warner <warner@lothar.com>"
132  * update MDMF code with StorageFarmBroker changes
133
134Fri Feb 25 17:04:33 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
135  * mutable/filenode: Clean up servermap handling in MutableFileVersion
136 
137  We want to update the servermap before attempting to modify a file,
138  which we now do. This introduced code duplication, which was addressed
139  by refactoring the servermap update into its own method, and then
140  eliminating duplicate servermap updates throughout the
141  MutableFileVersion.
142
143Sun Feb 27 15:16:43 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
144  * web: Use the string "replace" to trigger whole-file replacement when processing an offset parameter.
145
146Sun Feb 27 16:34:26 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
147  * docs/configuration.rst: fix more conflicts between #393 and trunk
148
149Sun Feb 27 17:06:37 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
150  * mutable/layout: remove references to the salt hash tree.
151
152Sun Feb 27 18:10:56 PST 2011  warner@lothar.com
153  * test_mutable.py: add test to exercise fencepost bug
154
155Mon Feb 28 00:33:27 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
156  * mutable/publish: account for offsets on segment boundaries.
157
158Mon Feb 28 19:08:07 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
159  * tahoe-put: raise UsageError when given a nonsensical mutable type, move option validation code to the option parser.
160
161Fri Mar  4 17:08:58 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
162  * web: use None instead of False in the case of no offset, use object identity comparison to check whether or not an offset was specified.
163
164Mon Mar  7 00:17:13 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
165  * mutable/filenode: remove incorrect comments about segment boundaries
166
167Mon Mar  7 00:22:29 PST 2011  Kevan Carstensen <kevan@isnotajoke.com>
168  * mutable: use integer division where appropriate
169
170Sun May  1 15:41:25 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
171  * mutable/layout.py: reorder on-disk format to aput variable-length fields at the end of the share, after a predictably long preamble
172
173Sun May  1 15:42:49 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
174  * uri.py: Add MDMF cap
175
176Sun May  1 15:45:23 PDT 2011  Kevan Carstensen <kevan@isnotajoke.com>
177  * nodemaker, mutable/filenode: train nodemaker and filenode to handle MDMF caps
178
179New patches:
180
181[interfaces.py: Add #993 interfaces
182Kevan Carstensen <kevan@isnotajoke.com>**20100809233244
183 Ignore-this: b58621ac5cc86f1b4b4149f9e6c6a1ce
184] {
185hunk ./src/allmydata/interfaces.py 499
186 class MustNotBeUnknownRWError(CapConstraintError):
187     """Cannot add an unknown child cap specified in a rw_uri field."""
188 
189+
190+class IReadable(Interface):
191+    """I represent a readable object -- either an immutable file, or a
192+    specific version of a mutable file.
193+    """
194+
195+    def is_readonly():
196+        """Return True if this reference provides mutable access to the given
197+        file or directory (i.e. if you can modify it), or False if not. Note
198+        that even if this reference is read-only, someone else may hold a
199+        read-write reference to it.
200+
201+        For an IReadable returned by get_best_readable_version(), this will
202+        always return True, but for instances of subinterfaces such as
203+        IMutableFileVersion, it may return False."""
204+
205+    def is_mutable():
206+        """Return True if this file or directory is mutable (by *somebody*,
207+        not necessarily you), False if it is is immutable. Note that a file
208+        might be mutable overall, but your reference to it might be
209+        read-only. On the other hand, all references to an immutable file
210+        will be read-only; there are no read-write references to an immutable
211+        file."""
212+
213+    def get_storage_index():
214+        """Return the storage index of the file."""
215+
216+    def get_size():
217+        """Return the length (in bytes) of this readable object."""
218+
219+    def download_to_data():
220+        """Download all of the file contents. I return a Deferred that fires
221+        with the contents as a byte string."""
222+
223+    def read(consumer, offset=0, size=None):
224+        """Download a portion (possibly all) of the file's contents, making
225+        them available to the given IConsumer. Return a Deferred that fires
226+        (with the consumer) when the consumer is unregistered (either because
227+        the last byte has been given to it, or because the consumer threw an
228+        exception during write(), possibly because it no longer wants to
229+        receive data). The portion downloaded will start at 'offset' and
230+        contain 'size' bytes (or the remainder of the file if size==None).
231+
232+        The consumer will be used in non-streaming mode: an IPullProducer
233+        will be attached to it.
234+
235+        The consumer will not receive data right away: several network trips
236+        must occur first. The order of events will be::
237+
238+         consumer.registerProducer(p, streaming)
239+          (if streaming == False)::
240+           consumer does p.resumeProducing()
241+            consumer.write(data)
242+           consumer does p.resumeProducing()
243+            consumer.write(data).. (repeat until all data is written)
244+         consumer.unregisterProducer()
245+         deferred.callback(consumer)
246+
247+        If a download error occurs, or an exception is raised by
248+        consumer.registerProducer() or consumer.write(), I will call
249+        consumer.unregisterProducer() and then deliver the exception via
250+        deferred.errback(). To cancel the download, the consumer should call
251+        p.stopProducing(), which will result in an exception being delivered
252+        via deferred.errback().
253+
254+        See src/allmydata/util/consumer.py for an example of a simple
255+        download-to-memory consumer.
256+        """
257+
258+
259+class IWritable(Interface):
260+    """
261+    I define methods that callers can use to update SDMF and MDMF
262+    mutable files on a Tahoe-LAFS grid.
263+    """
264+    # XXX: For the moment, we have only this. It is possible that we
265+    #      want to move overwrite() and modify() in here too.
266+    def update(data, offset):
267+        """
268+        I write the data from my data argument to the MDMF file,
269+        starting at offset. I continue writing data until my data
270+        argument is exhausted, appending data to the file as necessary.
271+        """
272+        # assert IMutableUploadable.providedBy(data)
273+        # to append data: offset=node.get_size_of_best_version()
274+        # do we want to support compacting MDMF?
275+        # for an MDMF file, this can be done with O(data.get_size())
276+        # memory. For an SDMF file, any modification takes
277+        # O(node.get_size_of_best_version()).
278+
279+
280+class IMutableFileVersion(IReadable):
281+    """I provide access to a particular version of a mutable file. The
282+    access is read/write if I was obtained from a filenode derived from
283+    a write cap, or read-only if the filenode was derived from a read cap.
284+    """
285+
286+    def get_sequence_number():
287+        """Return the sequence number of this version."""
288+
289+    def get_servermap():
290+        """Return the IMutableFileServerMap instance that was used to create
291+        this object.
292+        """
293+
294+    def get_writekey():
295+        """Return this filenode's writekey, or None if the node does not have
296+        write-capability. This may be used to assist with data structures
297+        that need to make certain data available only to writers, such as the
298+        read-write child caps in dirnodes. The recommended process is to have
299+        reader-visible data be submitted to the filenode in the clear (where
300+        it will be encrypted by the filenode using the readkey), but encrypt
301+        writer-visible data using this writekey.
302+        """
303+
304+    # TODO: Can this be overwrite instead of replace?
305+    def replace(new_contents):
306+        """Replace the contents of the mutable file, provided that no other
307+        node has published (or is attempting to publish, concurrently) a
308+        newer version of the file than this one.
309+
310+        I will avoid modifying any share that is different than the version
311+        given by get_sequence_number(). However, if another node is writing
312+        to the file at the same time as me, I may manage to update some shares
313+        while they update others. If I see any evidence of this, I will signal
314+        UncoordinatedWriteError, and the file will be left in an inconsistent
315+        state (possibly the version you provided, possibly the old version,
316+        possibly somebody else's version, and possibly a mix of shares from
317+        all of these).
318+
319+        The recommended response to UncoordinatedWriteError is to either
320+        return it to the caller (since they failed to coordinate their
321+        writes), or to attempt some sort of recovery. It may be sufficient to
322+        wait a random interval (with exponential backoff) and repeat your
323+        operation. If I do not signal UncoordinatedWriteError, then I was
324+        able to write the new version without incident.
325+
326+        I return a Deferred that fires (with a PublishStatus object) when the
327+        update has completed.
328+        """
329+
330+    def modify(modifier_cb):
331+        """Modify the contents of the file, by downloading this version,
332+        applying the modifier function (or bound method), then uploading
333+        the new version. This will succeed as long as no other node
334+        publishes a version between the download and the upload.
335+        I return a Deferred that fires (with a PublishStatus object) when
336+        the update is complete.
337+
338+        The modifier callable will be given three arguments: a string (with
339+        the old contents), a 'first_time' boolean, and a servermap. As with
340+        download_to_data(), the old contents will be from this version,
341+        but the modifier can use the servermap to make other decisions
342+        (such as refusing to apply the delta if there are multiple parallel
343+        versions, or if there is evidence of a newer unrecoverable version).
344+        'first_time' will be True the first time the modifier is called,
345+        and False on any subsequent calls.
346+
347+        The callable should return a string with the new contents. The
348+        callable must be prepared to be called multiple times, and must
349+        examine the input string to see if the change that it wants to make
350+        is already present in the old version. If it does not need to make
351+        any changes, it can either return None, or return its input string.
352+
353+        If the modifier raises an exception, it will be returned in the
354+        errback.
355+        """
356+
357+
358 # The hierarchy looks like this:
359 #  IFilesystemNode
360 #   IFileNode
361hunk ./src/allmydata/interfaces.py 758
362     def raise_error():
363         """Raise any error associated with this node."""
364 
365+    # XXX: These may not be appropriate outside the context of an IReadable.
366     def get_size():
367         """Return the length (in bytes) of the data this node represents. For
368         directory nodes, I return the size of the backing store. I return
369hunk ./src/allmydata/interfaces.py 775
370 class IFileNode(IFilesystemNode):
371     """I am a node which represents a file: a sequence of bytes. I am not a
372     container, like IDirectoryNode."""
373+    def get_best_readable_version():
374+        """Return a Deferred that fires with an IReadable for the 'best'
375+        available version of the file. The IReadable provides only read
376+        access, even if this filenode was derived from a write cap.
377 
378hunk ./src/allmydata/interfaces.py 780
379-class IImmutableFileNode(IFileNode):
380-    def read(consumer, offset=0, size=None):
381-        """Download a portion (possibly all) of the file's contents, making
382-        them available to the given IConsumer. Return a Deferred that fires
383-        (with the consumer) when the consumer is unregistered (either because
384-        the last byte has been given to it, or because the consumer threw an
385-        exception during write(), possibly because it no longer wants to
386-        receive data). The portion downloaded will start at 'offset' and
387-        contain 'size' bytes (or the remainder of the file if size==None).
388-
389-        The consumer will be used in non-streaming mode: an IPullProducer
390-        will be attached to it.
391+        For an immutable file, there is only one version. For a mutable
392+        file, the 'best' version is the recoverable version with the
393+        highest sequence number. If no uncoordinated writes have occurred,
394+        and if enough shares are available, then this will be the most
395+        recent version that has been uploaded. If no version is recoverable,
396+        the Deferred will errback with an UnrecoverableFileError.
397+        """
398 
399hunk ./src/allmydata/interfaces.py 788
400-        The consumer will not receive data right away: several network trips
401-        must occur first. The order of events will be::
402+    def download_best_version():
403+        """Download the contents of the version that would be returned
404+        by get_best_readable_version(). This is equivalent to calling
405+        download_to_data() on the IReadable given by that method.
406 
407hunk ./src/allmydata/interfaces.py 793
408-         consumer.registerProducer(p, streaming)
409-          (if streaming == False)::
410-           consumer does p.resumeProducing()
411-            consumer.write(data)
412-           consumer does p.resumeProducing()
413-            consumer.write(data).. (repeat until all data is written)
414-         consumer.unregisterProducer()
415-         deferred.callback(consumer)
416+        I return a Deferred that fires with a byte string when the file
417+        has been fully downloaded. To support streaming download, use
418+        the 'read' method of IReadable. If no version is recoverable,
419+        the Deferred will errback with an UnrecoverableFileError.
420+        """
421 
422hunk ./src/allmydata/interfaces.py 799
423-        If a download error occurs, or an exception is raised by
424-        consumer.registerProducer() or consumer.write(), I will call
425-        consumer.unregisterProducer() and then deliver the exception via
426-        deferred.errback(). To cancel the download, the consumer should call
427-        p.stopProducing(), which will result in an exception being delivered
428-        via deferred.errback().
429+    def get_size_of_best_version():
430+        """Find the size of the version that would be returned by
431+        get_best_readable_version().
432 
433hunk ./src/allmydata/interfaces.py 803
434-        See src/allmydata/util/consumer.py for an example of a simple
435-        download-to-memory consumer.
436+        I return a Deferred that fires with an integer. If no version
437+        is recoverable, the Deferred will errback with an
438+        UnrecoverableFileError.
439         """
440 
441hunk ./src/allmydata/interfaces.py 808
442+
443+class IImmutableFileNode(IFileNode, IReadable):
444+    """I am a node representing an immutable file. Immutable files have
445+    only one version"""
446+
447+
448 class IMutableFileNode(IFileNode):
449     """I provide access to a 'mutable file', which retains its identity
450     regardless of what contents are put in it.
451hunk ./src/allmydata/interfaces.py 873
452     only be retrieved and updated all-at-once, as a single big string. Future
453     versions of our mutable files will remove this restriction.
454     """
455-
456-    def download_best_version():
457-        """Download the 'best' available version of the file, meaning one of
458-        the recoverable versions with the highest sequence number. If no
459+    def get_best_mutable_version():
460+        """Return a Deferred that fires with an IMutableFileVersion for
461+        the 'best' available version of the file. The best version is
462+        the recoverable version with the highest sequence number. If no
463         uncoordinated writes have occurred, and if enough shares are
464hunk ./src/allmydata/interfaces.py 878
465-        available, then this will be the most recent version that has been
466-        uploaded.
467+        available, then this will be the most recent version that has
468+        been uploaded.
469 
470hunk ./src/allmydata/interfaces.py 881
471-        I update an internal servermap with MODE_READ, determine which
472-        version of the file is indicated by
473-        servermap.best_recoverable_version(), and return a Deferred that
474-        fires with its contents. If no version is recoverable, the Deferred
475-        will errback with UnrecoverableFileError.
476-        """
477-
478-    def get_size_of_best_version():
479-        """Find the size of the version that would be downloaded with
480-        download_best_version(), without actually downloading the whole file.
481-
482-        I return a Deferred that fires with an integer.
483+        If no version is recoverable, the Deferred will errback with an
484+        UnrecoverableFileError.
485         """
486 
487     def overwrite(new_contents):
488hunk ./src/allmydata/interfaces.py 921
489         errback.
490         """
491 
492-
493     def get_servermap(mode):
494         """Return a Deferred that fires with an IMutableFileServerMap
495         instance, updated using the given mode.
496hunk ./src/allmydata/interfaces.py 974
497         writer-visible data using this writekey.
498         """
499 
500+    def set_version(version):
501+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
502+        we upload in SDMF for reasons of compatibility. If you want to
503+        change this, set_version will let you do that.
504+
505+        To say that this file should be uploaded in SDMF, pass in a 0. To
506+        say that the file should be uploaded as MDMF, pass in a 1.
507+        """
508+
509+    def get_version():
510+        """Returns the mutable file protocol version."""
511+
512 class NotEnoughSharesError(Exception):
513     """Download was unable to get enough shares"""
514 
515hunk ./src/allmydata/interfaces.py 1822
516         """The upload is finished, and whatever filehandle was in use may be
517         closed."""
518 
519+
520+class IMutableUploadable(Interface):
521+    """
522+    I represent content that is due to be uploaded to a mutable filecap.
523+    """
524+    # This is somewhat simpler than the IUploadable interface above
525+    # because mutable files do not need to be concerned with possibly
526+    # generating a CHK, nor with per-file keys. It is a subset of the
527+    # methods in IUploadable, though, so we could just as well implement
528+    # the mutable uploadables as IUploadables that don't happen to use
529+    # those methods (with the understanding that the unused methods will
530+    # never be called on such objects)
531+    def get_size():
532+        """
533+        Returns a Deferred that fires with the size of the content held
534+        by the uploadable.
535+        """
536+
537+    def read(length):
538+        """
539+        Returns a list of strings which, when concatenated, are the next
540+        length bytes of the file, or fewer if there are fewer bytes
541+        between the current location and the end of the file.
542+        """
543+
544+    def close():
545+        """
546+        The process that used the Uploadable is finished using it, so
547+        the uploadable may be closed.
548+        """
549+
550 class IUploadResults(Interface):
551     """I am returned by upload() methods. I contain a number of public
552     attributes which can be read to determine the results of the upload. Some
553}
554[frontends/sftpd.py: Modify the sftp frontend to work with the MDMF changes
555Kevan Carstensen <kevan@isnotajoke.com>**20100809233535
556 Ignore-this: 2d25e2cfcd0d7bbcbba660c7e1da12f
557] {
558hunk ./src/allmydata/frontends/sftpd.py 33
559 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
560      NoSuchChildError, ChildOfWrongTypeError
561 from allmydata.mutable.common import NotWriteableError
562+from allmydata.mutable.publish import MutableFileHandle
563 from allmydata.immutable.upload import FileHandle
564 from allmydata.dirnode import update_metadata
565 from allmydata.util.fileutil import EncryptedTemporaryFile
566hunk ./src/allmydata/frontends/sftpd.py 667
567         else:
568             assert IFileNode.providedBy(filenode), filenode
569 
570-            if filenode.is_mutable():
571-                self.async.addCallback(lambda ign: filenode.download_best_version())
572-                def _downloaded(data):
573-                    self.consumer = OverwriteableFileConsumer(len(data), tempfile_maker)
574-                    self.consumer.write(data)
575-                    self.consumer.finish()
576-                    return None
577-                self.async.addCallback(_downloaded)
578-            else:
579-                download_size = filenode.get_size()
580-                assert download_size is not None, "download_size is None"
581+            self.async.addCallback(lambda ignored: filenode.get_best_readable_version())
582+
583+            def _read(version):
584+                if noisy: self.log("_read", level=NOISY)
585+                download_size = version.get_size()
586+                assert download_size is not None
587+
588                 self.consumer = OverwriteableFileConsumer(download_size, tempfile_maker)
589hunk ./src/allmydata/frontends/sftpd.py 675
590-                def _read(ign):
591-                    if noisy: self.log("_read immutable", level=NOISY)
592-                    filenode.read(self.consumer, 0, None)
593-                self.async.addCallback(_read)
594+
595+                version.read(self.consumer, 0, None)
596+            self.async.addCallback(_read)
597 
598         eventually(self.async.callback, None)
599 
600hunk ./src/allmydata/frontends/sftpd.py 821
601                     assert parent and childname, (parent, childname, self.metadata)
602                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
603 
604-                d2.addCallback(lambda ign: self.consumer.get_current_size())
605-                d2.addCallback(lambda size: self.consumer.read(0, size))
606-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
607+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
608             else:
609                 def _add_file(ign):
610                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
611}
612[immutable/filenode.py: Make the immutable file node implement the same interfaces as the mutable one
613Kevan Carstensen <kevan@isnotajoke.com>**20100810000619
614 Ignore-this: 93e536c0f8efb705310f13ff64621527
615] {
616hunk ./src/allmydata/immutable/filenode.py 8
617 now = time.time
618 from zope.interface import implements, Interface
619 from twisted.internet import defer
620-from twisted.internet.interfaces import IConsumer
621 
622hunk ./src/allmydata/immutable/filenode.py 9
623-from allmydata.interfaces import IImmutableFileNode, IUploadResults
624 from allmydata import uri
625hunk ./src/allmydata/immutable/filenode.py 10
626+from twisted.internet.interfaces import IConsumer
627+from twisted.protocols import basic
628+from foolscap.api import eventually
629+from allmydata.interfaces import IImmutableFileNode, ICheckable, \
630+     IDownloadTarget, IUploadResults
631+from allmydata.util import dictutil, log, base32, consumer
632+from allmydata.immutable.checker import Checker
633 from allmydata.check_results import CheckResults, CheckAndRepairResults
634 from allmydata.util.dictutil import DictOfSets
635 from pycryptopp.cipher.aes import AES
636hunk ./src/allmydata/immutable/filenode.py 296
637         return self._cnode.check_and_repair(monitor, verify, add_lease)
638     def check(self, monitor, verify=False, add_lease=False):
639         return self._cnode.check(monitor, verify, add_lease)
640+
641+    def get_best_readable_version(self):
642+        """
643+        Return an IReadable of the best version of this file. Since
644+        immutable files can have only one version, we just return the
645+        current filenode.
646+        """
647+        return defer.succeed(self)
648+
649+
650+    def download_best_version(self):
651+        """
652+        Download the best version of this file, returning its contents
653+        as a bytestring. Since there is only one version of an immutable
654+        file, we download and return the contents of this file.
655+        """
656+        d = consumer.download_to_data(self)
657+        return d
658+
659+    # for an immutable file, download_to_data (specified in IReadable)
660+    # is the same as download_best_version (specified in IFileNode). For
661+    # mutable files, the difference is more meaningful, since they can
662+    # have multiple versions.
663+    download_to_data = download_best_version
664+
665+
666+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
667+    # get_size_of_best_version(IFileNode) are all the same for immutable
668+    # files.
669+    get_size_of_best_version = get_current_size
670}
671[immutable/literal.py: implement the same interfaces as other filenodes
672Kevan Carstensen <kevan@isnotajoke.com>**20100810000633
673 Ignore-this: b50dd5df2d34ecd6477b8499a27aef13
674] hunk ./src/allmydata/immutable/literal.py 106
675         d.addCallback(lambda lastSent: consumer)
676         return d
677 
678+    # IReadable, IFileNode, IFilesystemNode
679+    def get_best_readable_version(self):
680+        return defer.succeed(self)
681+
682+
683+    def download_best_version(self):
684+        return defer.succeed(self.u.data)
685+
686+
687+    download_to_data = download_best_version
688+    get_size_of_best_version = get_current_size
689+
690[scripts: tell 'tahoe put' about MDMF
691Kevan Carstensen <kevan@isnotajoke.com>**20100813234957
692 Ignore-this: c106b3384fc676bd3c0fb466d2a52b1b
693] {
694hunk ./src/allmydata/scripts/cli.py 160
695     optFlags = [
696         ("mutable", "m", "Create a mutable file instead of an immutable one."),
697         ]
698+    optParameters = [
699+        ("mutable-type", None, False, "Create a mutable file in the given format. Valid formats are 'sdmf' for SDMF and 'mdmf' for MDMF"),
700+        ]
701 
702     def parseArgs(self, arg1=None, arg2=None):
703         # see Examples below
704hunk ./src/allmydata/scripts/tahoe_put.py 21
705     from_file = options.from_file
706     to_file = options.to_file
707     mutable = options['mutable']
708+    mutable_type = False
709+
710+    if mutable:
711+        mutable_type = options['mutable-type']
712     if options['quiet']:
713         verbosity = 0
714     else:
715hunk ./src/allmydata/scripts/tahoe_put.py 33
716     stdout = options.stdout
717     stderr = options.stderr
718 
719+    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
720+        # Don't try to pass unsupported types to the webapi
721+        print >>stderr, "error: %s is an invalid format" % mutable_type
722+        return 1
723+
724     if nodeurl[-1] != "/":
725         nodeurl += "/"
726     if to_file:
727hunk ./src/allmydata/scripts/tahoe_put.py 76
728         url = nodeurl + "uri"
729     if mutable:
730         url += "?mutable=true"
731+    if mutable_type:
732+        assert mutable
733+        url += "&mutable-type=%s" % mutable_type
734+
735     if from_file:
736         infileobj = open(os.path.expanduser(from_file), "rb")
737     else:
738}
739[web: Alter the webapi to get along with and take advantage of the MDMF changes
740Kevan Carstensen <kevan@isnotajoke.com>**20100814081012
741 Ignore-this: 96c2ed4e4a9f450fb84db5d711d10bd6
742 
743 The main benefit that the webapi gets from MDMF, at least initially, is
744 the ability to do a streaming download of an MDMF mutable file. It also
745 exposes a way (through the PUT verb) to append to or otherwise modify
746 (in-place) an MDMF mutable file.
747] {
748hunk ./src/allmydata/web/common.py 12
749 from allmydata.interfaces import ExistingChildError, NoSuchChildError, \
750      FileTooLargeError, NotEnoughSharesError, NoSharesError, \
751      EmptyPathnameComponentError, MustBeDeepImmutableError, \
752-     MustBeReadonlyError, MustNotBeUnknownRWError
753+     MustBeReadonlyError, MustNotBeUnknownRWError, SDMF_VERSION, MDMF_VERSION
754 from allmydata.mutable.common import UnrecoverableFileError
755 from allmydata.util import abbreviate
756 from allmydata.util.encodingutil import to_str, quote_output
757hunk ./src/allmydata/web/common.py 35
758     else:
759         return boolean_of_arg(replace)
760 
761+
762+def parse_mutable_type_arg(arg):
763+    if not arg:
764+        return None # interpreted by the caller as "let the nodemaker decide"
765+
766+    arg = arg.lower()
767+    assert arg in ("mdmf", "sdmf")
768+
769+    if arg == "mdmf":
770+        return MDMF_VERSION
771+
772+    return SDMF_VERSION
773+
774+
775+def parse_offset_arg(offset):
776+    # XXX: This will raise a ValueError when invoked on something that
777+    # is not an integer. Is that okay? Or do we want a better error
778+    # message? Since this call is going to be used by programmers and
779+    # their tools rather than users (through the wui), it is not
780+    # inconsistent to return that, I guess.
781+    offset = int(offset)
782+    return offset
783+
784+
785 def get_root(ctx_or_req):
786     req = IRequest(ctx_or_req)
787     # the addSlash=True gives us one extra (empty) segment
788hunk ./src/allmydata/web/directory.py 19
789 from allmydata.uri import from_string_dirnode
790 from allmydata.interfaces import IDirectoryNode, IFileNode, IFilesystemNode, \
791      IImmutableFileNode, IMutableFileNode, ExistingChildError, \
792-     NoSuchChildError, EmptyPathnameComponentError
793+     NoSuchChildError, EmptyPathnameComponentError, SDMF_VERSION, MDMF_VERSION
794 from allmydata.monitor import Monitor, OperationCancelledError
795 from allmydata import dirnode
796 from allmydata.web.common import text_plain, WebError, \
797hunk ./src/allmydata/web/directory.py 153
798         if not t:
799             # render the directory as HTML, using the docFactory and Nevow's
800             # whole templating thing.
801-            return DirectoryAsHTML(self.node)
802+            return DirectoryAsHTML(self.node,
803+                                   self.client.mutable_file_default)
804 
805         if t == "json":
806             return DirectoryJSONMetadata(ctx, self.node)
807hunk ./src/allmydata/web/directory.py 556
808     docFactory = getxmlfile("directory.xhtml")
809     addSlash = True
810 
811-    def __init__(self, node):
812+    def __init__(self, node, default_mutable_format):
813         rend.Page.__init__(self)
814         self.node = node
815 
816hunk ./src/allmydata/web/directory.py 560
817+        assert default_mutable_format in (MDMF_VERSION, SDMF_VERSION)
818+        self.default_mutable_format = default_mutable_format
819+
820     def beforeRender(self, ctx):
821         # attempt to get the dirnode's children, stashing them (or the
822         # failure that results) for later use
823hunk ./src/allmydata/web/directory.py 780
824             ]]
825         forms.append(T.div(class_="freeform-form")[mkdir])
826 
827+        # Build input elements for mutable file type. We do this outside
828+        # of the list so we can check the appropriate format, based on
829+        # the default configured in the client (which reflects the
830+        # default configured in tahoe.cfg)
831+        if self.default_mutable_format == MDMF_VERSION:
832+            mdmf_input = T.input(type='radio', name='mutable-type',
833+                                 id='mutable-type-mdmf', value='mdmf',
834+                                 checked='checked')
835+        else:
836+            mdmf_input = T.input(type='radio', name='mutable-type',
837+                                 id='mutable-type-mdmf', value='mdmf')
838+
839+        if self.default_mutable_format == SDMF_VERSION:
840+            sdmf_input = T.input(type='radio', name='mutable-type',
841+                                 id='mutable-type-sdmf', value='sdmf',
842+                                 checked="checked")
843+        else:
844+            sdmf_input = T.input(type='radio', name='mutable-type',
845+                                 id='mutable-type-sdmf', value='sdmf')
846+
847         upload = T.form(action=".", method="post",
848                         enctype="multipart/form-data")[
849             T.fieldset[
850hunk ./src/allmydata/web/directory.py 812
851             T.input(type="submit", value="Upload"),
852             " Mutable?:",
853             T.input(type="checkbox", name="mutable"),
854+            sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
855+            mdmf_input,
856+            T.label(for_="mutable-type-mdmf")["MDMF (experimental)"],
857             ]]
858         forms.append(T.div(class_="freeform-form")[upload])
859 
860hunk ./src/allmydata/web/directory.py 850
861                 kiddata = ("filenode", {'size': childnode.get_size(),
862                                         'mutable': childnode.is_mutable(),
863                                         })
864+                if childnode.is_mutable() and \
865+                    childnode.get_version() is not None:
866+                    mutable_type = childnode.get_version()
867+                    assert mutable_type in (SDMF_VERSION, MDMF_VERSION)
868+
869+                    if mutable_type == MDMF_VERSION:
870+                        mutable_type = "mdmf"
871+                    else:
872+                        mutable_type = "sdmf"
873+                    kiddata[1]['mutable-type'] = mutable_type
874+
875             elif IDirectoryNode.providedBy(childnode):
876                 kiddata = ("dirnode", {'mutable': childnode.is_mutable()})
877             else:
878hunk ./src/allmydata/web/filenode.py 9
879 from nevow import url, rend
880 from nevow.inevow import IRequest
881 
882-from allmydata.interfaces import ExistingChildError
883+from allmydata.interfaces import ExistingChildError, SDMF_VERSION, MDMF_VERSION
884 from allmydata.monitor import Monitor
885 from allmydata.immutable.upload import FileHandle
886hunk ./src/allmydata/web/filenode.py 12
887+from allmydata.mutable.publish import MutableFileHandle
888+from allmydata.mutable.common import MODE_READ
889 from allmydata.util import log, base32
890 
891 from allmydata.web.common import text_plain, WebError, RenderMixin, \
892hunk ./src/allmydata/web/filenode.py 18
893      boolean_of_arg, get_arg, should_create_intermediate_directories, \
894-     MyExceptionHandler, parse_replace_arg
895+     MyExceptionHandler, parse_replace_arg, parse_offset_arg, \
896+     parse_mutable_type_arg
897 from allmydata.web.check_results import CheckResults, \
898      CheckAndRepairResults, LiteralCheckResults
899 from allmydata.web.info import MoreInfo
900hunk ./src/allmydata/web/filenode.py 29
901         # a new file is being uploaded in our place.
902         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
903         if mutable:
904-            req.content.seek(0)
905-            data = req.content.read()
906-            d = client.create_mutable_file(data)
907+            mutable_type = parse_mutable_type_arg(get_arg(req,
908+                                                          "mutable-type",
909+                                                          None))
910+            data = MutableFileHandle(req.content)
911+            d = client.create_mutable_file(data, version=mutable_type)
912             def _uploaded(newnode):
913                 d2 = self.parentnode.set_node(self.name, newnode,
914                                               overwrite=replace)
915hunk ./src/allmydata/web/filenode.py 66
916         d.addCallback(lambda res: childnode.get_uri())
917         return d
918 
919-    def _read_data_from_formpost(self, req):
920-        # SDMF: files are small, and we can only upload data, so we read
921-        # the whole file into memory before uploading.
922-        contents = req.fields["file"]
923-        contents.file.seek(0)
924-        data = contents.file.read()
925-        return data
926 
927     def replace_me_with_a_formpost(self, req, client, replace):
928         # create a new file, maybe mutable, maybe immutable
929hunk ./src/allmydata/web/filenode.py 71
930         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
931 
932+        # create an immutable file
933+        contents = req.fields["file"]
934         if mutable:
935hunk ./src/allmydata/web/filenode.py 74
936-            data = self._read_data_from_formpost(req)
937-            d = client.create_mutable_file(data)
938+            mutable_type = parse_mutable_type_arg(get_arg(req, "mutable-type",
939+                                                          None))
940+            uploadable = MutableFileHandle(contents.file)
941+            d = client.create_mutable_file(uploadable, version=mutable_type)
942             def _uploaded(newnode):
943                 d2 = self.parentnode.set_node(self.name, newnode,
944                                               overwrite=replace)
945hunk ./src/allmydata/web/filenode.py 85
946                 return d2
947             d.addCallback(_uploaded)
948             return d
949-        # create an immutable file
950-        contents = req.fields["file"]
951+
952         uploadable = FileHandle(contents.file, convergence=client.convergence)
953         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
954         d.addCallback(lambda newnode: newnode.get_uri())
955hunk ./src/allmydata/web/filenode.py 91
956         return d
957 
958+
959 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
960     def __init__(self, client, parentnode, name):
961         rend.Page.__init__(self)
962hunk ./src/allmydata/web/filenode.py 174
963             # properly. So we assume that at least the browser will agree
964             # with itself, and echo back the same bytes that we were given.
965             filename = get_arg(req, "filename", self.name) or "unknown"
966-            if self.node.is_mutable():
967-                # some day: d = self.node.get_best_version()
968-                d = makeMutableDownloadable(self.node)
969-            else:
970-                d = defer.succeed(self.node)
971+            d = self.node.get_best_readable_version()
972             d.addCallback(lambda dn: FileDownloader(dn, filename))
973             return d
974         if t == "json":
975hunk ./src/allmydata/web/filenode.py 178
976-            if self.parentnode and self.name:
977-                d = self.parentnode.get_metadata_for(self.name)
978+            # We do this to make sure that fields like size and
979+            # mutable-type (which depend on the file on the grid and not
980+            # just on the cap) are filled in. The latter gets used in
981+            # tests, in particular.
982+            #
983+            # TODO: Make it so that the servermap knows how to update in
984+            # a mode specifically designed to fill in these fields, and
985+            # then update it in that mode.
986+            if self.node.is_mutable():
987+                d = self.node.get_servermap(MODE_READ)
988             else:
989                 d = defer.succeed(None)
990hunk ./src/allmydata/web/filenode.py 190
991+            if self.parentnode and self.name:
992+                d.addCallback(lambda ignored:
993+                    self.parentnode.get_metadata_for(self.name))
994+            else:
995+                d.addCallback(lambda ignored: None)
996             d.addCallback(lambda md: FileJSONMetadata(ctx, self.node, md))
997             return d
998         if t == "info":
999hunk ./src/allmydata/web/filenode.py 211
1000         if t:
1001             raise WebError("GET file: bad t=%s" % t)
1002         filename = get_arg(req, "filename", self.name) or "unknown"
1003-        if self.node.is_mutable():
1004-            # some day: d = self.node.get_best_version()
1005-            d = makeMutableDownloadable(self.node)
1006-        else:
1007-            d = defer.succeed(self.node)
1008+        d = self.node.get_best_readable_version()
1009         d.addCallback(lambda dn: FileDownloader(dn, filename))
1010         return d
1011 
1012hunk ./src/allmydata/web/filenode.py 219
1013         req = IRequest(ctx)
1014         t = get_arg(req, "t", "").strip()
1015         replace = parse_replace_arg(get_arg(req, "replace", "true"))
1016+        offset = parse_offset_arg(get_arg(req, "offset", -1))
1017 
1018         if not t:
1019hunk ./src/allmydata/web/filenode.py 222
1020-            if self.node.is_mutable():
1021+            if self.node.is_mutable() and offset >= 0:
1022+                return self.update_my_contents(req, offset)
1023+
1024+            elif self.node.is_mutable():
1025                 return self.replace_my_contents(req)
1026             if not replace:
1027                 # this is the early trap: if someone else modifies the
1028hunk ./src/allmydata/web/filenode.py 232
1029                 # directory while we're uploading, the add_file(overwrite=)
1030                 # call in replace_me_with_a_child will do the late trap.
1031                 raise ExistingChildError()
1032+            if offset >= 0:
1033+                raise WebError("PUT to a file: append operation invoked "
1034+                               "on an immutable cap")
1035+
1036+
1037             assert self.parentnode and self.name
1038             return self.replace_me_with_a_child(req, self.client, replace)
1039         if t == "uri":
1040hunk ./src/allmydata/web/filenode.py 299
1041 
1042     def replace_my_contents(self, req):
1043         req.content.seek(0)
1044-        new_contents = req.content.read()
1045+        new_contents = MutableFileHandle(req.content)
1046         d = self.node.overwrite(new_contents)
1047         d.addCallback(lambda res: self.node.get_uri())
1048         return d
1049hunk ./src/allmydata/web/filenode.py 304
1050 
1051+
1052+    def update_my_contents(self, req, offset):
1053+        req.content.seek(0)
1054+        added_contents = MutableFileHandle(req.content)
1055+
1056+        d = self.node.get_best_mutable_version()
1057+        d.addCallback(lambda mv:
1058+            mv.update(added_contents, offset))
1059+        d.addCallback(lambda ignored:
1060+            self.node.get_uri())
1061+        return d
1062+
1063+
1064     def replace_my_contents_with_a_formpost(self, req):
1065         # we have a mutable file. Get the data from the formpost, and replace
1066         # the mutable file's contents with it.
1067hunk ./src/allmydata/web/filenode.py 320
1068-        new_contents = self._read_data_from_formpost(req)
1069+        new_contents = req.fields['file']
1070+        new_contents = MutableFileHandle(new_contents.file)
1071+
1072         d = self.node.overwrite(new_contents)
1073         d.addCallback(lambda res: self.node.get_uri())
1074         return d
1075hunk ./src/allmydata/web/filenode.py 327
1076 
1077-class MutableDownloadable:
1078-    #implements(IDownloadable)
1079-    def __init__(self, size, node):
1080-        self.size = size
1081-        self.node = node
1082-    def get_size(self):
1083-        return self.size
1084-    def is_mutable(self):
1085-        return True
1086-    def read(self, consumer, offset=0, size=None):
1087-        d = self.node.download_best_version()
1088-        d.addCallback(self._got_data, consumer, offset, size)
1089-        return d
1090-    def _got_data(self, contents, consumer, offset, size):
1091-        start = offset
1092-        if size is not None:
1093-            end = offset+size
1094-        else:
1095-            end = self.size
1096-        # SDMF: we can write the whole file in one big chunk
1097-        consumer.write(contents[start:end])
1098-        return consumer
1099-
1100-def makeMutableDownloadable(n):
1101-    d = defer.maybeDeferred(n.get_size_of_best_version)
1102-    d.addCallback(MutableDownloadable, n)
1103-    return d
1104 
1105 class FileDownloader(rend.Page):
1106     # since we override the rendering process (to let the tahoe Downloader
1107hunk ./src/allmydata/web/filenode.py 509
1108     data[1]['mutable'] = filenode.is_mutable()
1109     if edge_metadata is not None:
1110         data[1]['metadata'] = edge_metadata
1111+
1112+    if filenode.is_mutable() and filenode.get_version() is not None:
1113+        mutable_type = filenode.get_version()
1114+        assert mutable_type in (MDMF_VERSION, SDMF_VERSION)
1115+        if mutable_type == MDMF_VERSION:
1116+            mutable_type = "mdmf"
1117+        else:
1118+            mutable_type = "sdmf"
1119+        data[1]['mutable-type'] = mutable_type
1120+
1121     return text_plain(simplejson.dumps(data, indent=1) + "\n", ctx)
1122 
1123 def FileURI(ctx, filenode):
1124hunk ./src/allmydata/web/root.py 15
1125 from allmydata import get_package_versions_string
1126 from allmydata import provisioning
1127 from allmydata.util import idlib, log
1128-from allmydata.interfaces import IFileNode
1129+from allmydata.interfaces import IFileNode, MDMF_VERSION, SDMF_VERSION
1130 from allmydata.web import filenode, directory, unlinked, status, operations
1131 from allmydata.web import reliability, storage
1132 from allmydata.web.common import abbreviate_size, getxmlfile, WebError, \
1133hunk ./src/allmydata/web/root.py 19
1134-     get_arg, RenderMixin, boolean_of_arg
1135+     get_arg, RenderMixin, boolean_of_arg, parse_mutable_type_arg
1136 
1137 
1138 class URIHandler(RenderMixin, rend.Page):
1139hunk ./src/allmydata/web/root.py 50
1140         if t == "":
1141             mutable = boolean_of_arg(get_arg(req, "mutable", "false").strip())
1142             if mutable:
1143-                return unlinked.PUTUnlinkedSSK(req, self.client)
1144+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
1145+                                                 None))
1146+                return unlinked.PUTUnlinkedSSK(req, self.client, version)
1147             else:
1148                 return unlinked.PUTUnlinkedCHK(req, self.client)
1149         if t == "mkdir":
1150hunk ./src/allmydata/web/root.py 70
1151         if t in ("", "upload"):
1152             mutable = bool(get_arg(req, "mutable", "").strip())
1153             if mutable:
1154-                return unlinked.POSTUnlinkedSSK(req, self.client)
1155+                version = parse_mutable_type_arg(get_arg(req, "mutable-type",
1156+                                                         None))
1157+                return unlinked.POSTUnlinkedSSK(req, self.client, version)
1158             else:
1159                 return unlinked.POSTUnlinkedCHK(req, self.client)
1160         if t == "mkdir":
1161hunk ./src/allmydata/web/root.py 324
1162 
1163     def render_upload_form(self, ctx, data):
1164         # this is a form where users can upload unlinked files
1165+        #
1166+        # for mutable files, users can choose the format by selecting
1167+        # MDMF or SDMF from a radio button. They can also configure a
1168+        # default format in tahoe.cfg, which they rightly expect us to
1169+        # obey. we convey to them that we are obeying their choice by
1170+        # ensuring that the one that they've chosen is selected in the
1171+        # interface.
1172+        if self.client.mutable_file_default == MDMF_VERSION:
1173+            mdmf_input = T.input(type='radio', name='mutable-type',
1174+                                 value='mdmf', id='mutable-type-mdmf',
1175+                                 checked='checked')
1176+        else:
1177+            mdmf_input = T.input(type='radio', name='mutable-type',
1178+                                 value='mdmf', id='mutable-type-mdmf')
1179+
1180+        if self.client.mutable_file_default == SDMF_VERSION:
1181+            sdmf_input = T.input(type='radio', name='mutable-type',
1182+                                 value='sdmf', id='mutable-type-sdmf',
1183+                                 checked='checked')
1184+        else:
1185+            sdmf_input = T.input(type='radio', name='mutable-type',
1186+                                 value='sdmf', id='mutable-type-sdmf')
1187+
1188+
1189         form = T.form(action="uri", method="post",
1190                       enctype="multipart/form-data")[
1191             T.fieldset[
1192hunk ./src/allmydata/web/root.py 356
1193                   T.input(type="file", name="file", class_="freeform-input-file")],
1194             T.input(type="hidden", name="t", value="upload"),
1195             T.div[T.input(type="checkbox", name="mutable"), T.label(for_="mutable")["Create mutable file"],
1196+                  sdmf_input, T.label(for_="mutable-type-sdmf")["SDMF"],
1197+                  mdmf_input,
1198+                  T.label(for_='mutable-type-mdmf')['MDMF (experimental)'],
1199                   " ", T.input(type="submit", value="Upload!")],
1200             ]]
1201         return T.div[form]
1202hunk ./src/allmydata/web/unlinked.py 7
1203 from twisted.internet import defer
1204 from nevow import rend, url, tags as T
1205 from allmydata.immutable.upload import FileHandle
1206+from allmydata.mutable.publish import MutableFileHandle
1207 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
1208      convert_children_json, WebError
1209 from allmydata.web import status
1210hunk ./src/allmydata/web/unlinked.py 20
1211     # that fires with the URI of the new file
1212     return d
1213 
1214-def PUTUnlinkedSSK(req, client):
1215+def PUTUnlinkedSSK(req, client, version):
1216     # SDMF: files are small, and we can only upload data
1217     req.content.seek(0)
1218hunk ./src/allmydata/web/unlinked.py 23
1219-    data = req.content.read()
1220-    d = client.create_mutable_file(data)
1221+    data = MutableFileHandle(req.content)
1222+    d = client.create_mutable_file(data, version=version)
1223     d.addCallback(lambda n: n.get_uri())
1224     return d
1225 
1226hunk ./src/allmydata/web/unlinked.py 83
1227                       ["/uri/" + res.uri])
1228         return d
1229 
1230-def POSTUnlinkedSSK(req, client):
1231+def POSTUnlinkedSSK(req, client, version):
1232     # "POST /uri", to create an unlinked file.
1233     # SDMF: files are small, and we can only upload data
1234hunk ./src/allmydata/web/unlinked.py 86
1235-    contents = req.fields["file"]
1236-    contents.file.seek(0)
1237-    data = contents.file.read()
1238-    d = client.create_mutable_file(data)
1239+    contents = req.fields["file"].file
1240+    data = MutableFileHandle(contents)
1241+    d = client.create_mutable_file(data, version=version)
1242     d.addCallback(lambda n: n.get_uri())
1243     return d
1244 
1245}
1246[client.py: learn how to create different kinds of mutable files
1247Kevan Carstensen <kevan@isnotajoke.com>**20100814225711
1248 Ignore-this: 61ff665bc050cba5f58bf2ed779d692b
1249] {
1250hunk ./src/allmydata/client.py 25
1251 from allmydata.util.time_format import parse_duration, parse_date
1252 from allmydata.stats import StatsProvider
1253 from allmydata.history import History
1254-from allmydata.interfaces import IStatsProducer, RIStubClient
1255+from allmydata.interfaces import IStatsProducer, RIStubClient, \
1256+                                 SDMF_VERSION, MDMF_VERSION
1257 from allmydata.nodemaker import NodeMaker
1258 
1259 
1260hunk ./src/allmydata/client.py 357
1261                                    self.terminator,
1262                                    self.get_encoding_parameters(),
1263                                    self._key_generator)
1264+        default = self.get_config("client", "mutable.format", default="sdmf")
1265+        if default == "mdmf":
1266+            self.mutable_file_default = MDMF_VERSION
1267+        else:
1268+            self.mutable_file_default = SDMF_VERSION
1269 
1270     def get_history(self):
1271         return self.history
1272hunk ./src/allmydata/client.py 500
1273     def create_immutable_dirnode(self, children, convergence=None):
1274         return self.nodemaker.create_immutable_directory(children, convergence)
1275 
1276-    def create_mutable_file(self, contents=None, keysize=None):
1277-        return self.nodemaker.create_mutable_file(contents, keysize)
1278+    def create_mutable_file(self, contents=None, keysize=None, version=None):
1279+        if not version:
1280+            version = self.mutable_file_default
1281+        return self.nodemaker.create_mutable_file(contents, keysize,
1282+                                                  version=version)
1283 
1284     def upload(self, uploadable):
1285         uploader = self.getServiceNamed("uploader")
1286}
1287[mutable/checker.py and mutable/repair.py: Modify checker and repairer to work with MDMF
1288Kevan Carstensen <kevan@isnotajoke.com>**20100819003216
1289 Ignore-this: d3bd3260742be8964877f0a53543b01b
1290 
1291 The checker and repairer required minimal changes to work with the MDMF
1292 modifications made elsewhere. The checker duplicated a lot of the code
1293 that was already in the downloader, so I modified the downloader
1294 slightly to expose this functionality to the checker and removed the
1295 duplicated code. The repairer only required a minor change to deal with
1296 data representation.
1297] {
1298hunk ./src/allmydata/mutable/checker.py 2
1299 
1300-from twisted.internet import defer
1301-from twisted.python import failure
1302-from allmydata import hashtree
1303 from allmydata.uri import from_string
1304hunk ./src/allmydata/mutable/checker.py 3
1305-from allmydata.util import hashutil, base32, idlib, log
1306+from allmydata.util import base32, idlib, log
1307 from allmydata.check_results import CheckAndRepairResults, CheckResults
1308 
1309 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
1310hunk ./src/allmydata/mutable/checker.py 8
1311 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1312-from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
1313+from allmydata.mutable.retrieve import Retrieve # for verifying
1314 
1315 class MutableChecker:
1316 
1317hunk ./src/allmydata/mutable/checker.py 25
1318 
1319     def check(self, verify=False, add_lease=False):
1320         servermap = ServerMap()
1321+        # Updating the servermap in MODE_CHECK will stand a good chance
1322+        # of finding all of the shares, and getting a good idea of
1323+        # recoverability, etc, without verifying.
1324         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
1325                              servermap, MODE_CHECK, add_lease=add_lease)
1326         if self._history:
1327hunk ./src/allmydata/mutable/checker.py 51
1328         if num_recoverable:
1329             self.best_version = servermap.best_recoverable_version()
1330 
1331+        # The file is unhealthy and needs to be repaired if:
1332+        # - There are unrecoverable versions.
1333         if servermap.unrecoverable_versions():
1334             self.need_repair = True
1335hunk ./src/allmydata/mutable/checker.py 55
1336+        # - There isn't a recoverable version.
1337         if num_recoverable != 1:
1338             self.need_repair = True
1339hunk ./src/allmydata/mutable/checker.py 58
1340+        # - The best recoverable version is missing some shares.
1341         if self.best_version:
1342             available_shares = servermap.shares_available()
1343             (num_distinct_shares, k, N) = available_shares[self.best_version]
1344hunk ./src/allmydata/mutable/checker.py 69
1345 
1346     def _verify_all_shares(self, servermap):
1347         # read every byte of each share
1348+        #
1349+        # This logic is going to be very nearly the same as the
1350+        # downloader. I bet we could pass the downloader a flag that
1351+        # makes it do this, and piggyback onto that instead of
1352+        # duplicating a bunch of code.
1353+        #
1354+        # Like:
1355+        #  r = Retrieve(blah, blah, blah, verify=True)
1356+        #  d = r.download()
1357+        #  (wait, wait, wait, d.callback)
1358+        # 
1359+        #  Then, when it has finished, we can check the servermap (which
1360+        #  we provided to Retrieve) to figure out which shares are bad,
1361+        #  since the Retrieve process will have updated the servermap as
1362+        #  it went along.
1363+        #
1364+        #  By passing the verify=True flag to the constructor, we are
1365+        #  telling the downloader a few things.
1366+        #
1367+        #  1. It needs to download all N shares, not just K shares.
1368+        #  2. It doesn't need to decrypt or decode the shares, only
1369+        #     verify them.
1370         if not self.best_version:
1371             return
1372hunk ./src/allmydata/mutable/checker.py 93
1373-        versionmap = servermap.make_versionmap()
1374-        shares = versionmap[self.best_version]
1375-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1376-         offsets_tuple) = self.best_version
1377-        offsets = dict(offsets_tuple)
1378-        readv = [ (0, offsets["EOF"]) ]
1379-        dl = []
1380-        for (shnum, peerid, timestamp) in shares:
1381-            ss = servermap.connections[peerid]
1382-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1383-            d.addCallback(self._got_answer, peerid, servermap)
1384-            dl.append(d)
1385-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
1386 
1387hunk ./src/allmydata/mutable/checker.py 94
1388-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1389-        # isolate the callRemote to a separate method, so tests can subclass
1390-        # Publish and override it
1391-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1392+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
1393+        d = r.download()
1394+        d.addCallback(self._process_bad_shares)
1395         return d
1396 
1397hunk ./src/allmydata/mutable/checker.py 99
1398-    def _got_answer(self, datavs, peerid, servermap):
1399-        for shnum,datav in datavs.items():
1400-            data = datav[0]
1401-            try:
1402-                self._got_results_one_share(shnum, peerid, data)
1403-            except CorruptShareError:
1404-                f = failure.Failure()
1405-                self.need_repair = True
1406-                self.bad_shares.append( (peerid, shnum, f) )
1407-                prefix = data[:SIGNED_PREFIX_LENGTH]
1408-                servermap.mark_bad_share(peerid, shnum, prefix)
1409-                ss = servermap.connections[peerid]
1410-                self.notify_server_corruption(ss, shnum, str(f.value))
1411-
1412-    def check_prefix(self, peerid, shnum, data):
1413-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1414-         offsets_tuple) = self.best_version
1415-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
1416-        if got_prefix != prefix:
1417-            raise CorruptShareError(peerid, shnum,
1418-                                    "prefix mismatch: share changed while we were reading it")
1419-
1420-    def _got_results_one_share(self, shnum, peerid, data):
1421-        self.check_prefix(peerid, shnum, data)
1422-
1423-        # the [seqnum:signature] pieces are validated by _compare_prefix,
1424-        # which checks their signature against the pubkey known to be
1425-        # associated with this file.
1426 
1427hunk ./src/allmydata/mutable/checker.py 100
1428-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
1429-         share_hash_chain, block_hash_tree, share_data,
1430-         enc_privkey) = unpack_share(data)
1431-
1432-        # validate [share_hash_chain,block_hash_tree,share_data]
1433-
1434-        leaves = [hashutil.block_hash(share_data)]
1435-        t = hashtree.HashTree(leaves)
1436-        if list(t) != block_hash_tree:
1437-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
1438-        share_hash_leaf = t[0]
1439-        t2 = hashtree.IncompleteHashTree(N)
1440-        # root_hash was checked by the signature
1441-        t2.set_hashes({0: root_hash})
1442-        try:
1443-            t2.set_hashes(hashes=share_hash_chain,
1444-                          leaves={shnum: share_hash_leaf})
1445-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
1446-                IndexError), e:
1447-            msg = "corrupt hashes: %s" % (e,)
1448-            raise CorruptShareError(peerid, shnum, msg)
1449-
1450-        # validate enc_privkey: only possible if we have a write-cap
1451-        if not self._node.is_readonly():
1452-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
1453-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
1454-            if alleged_writekey != self._node.get_writekey():
1455-                raise CorruptShareError(peerid, shnum, "invalid privkey")
1456+    def _process_bad_shares(self, bad_shares):
1457+        if bad_shares:
1458+            self.need_repair = True
1459+        self.bad_shares = bad_shares
1460 
1461hunk ./src/allmydata/mutable/checker.py 105
1462-    def notify_server_corruption(self, ss, shnum, reason):
1463-        ss.callRemoteOnly("advise_corrupt_share",
1464-                          "mutable", self._storage_index, shnum, reason)
1465 
1466     def _count_shares(self, smap, version):
1467         available_shares = smap.shares_available()
1468hunk ./src/allmydata/mutable/repairer.py 5
1469 from zope.interface import implements
1470 from twisted.internet import defer
1471 from allmydata.interfaces import IRepairResults, ICheckResults
1472+from allmydata.mutable.publish import MutableData
1473 
1474 class RepairResults:
1475     implements(IRepairResults)
1476hunk ./src/allmydata/mutable/repairer.py 108
1477             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
1478 
1479         d = self.node.download_version(smap, best_version, fetch_privkey=True)
1480+        d.addCallback(lambda data:
1481+            MutableData(data))
1482         d.addCallback(self.node.upload, smap)
1483         d.addCallback(self.get_results, smap)
1484         return d
1485}
1486[mutable/filenode.py: add versions and partial-file updates to the mutable file node
1487Kevan Carstensen <kevan@isnotajoke.com>**20100819003231
1488 Ignore-this: b7b5434201fdb9b48f902d7ab25ef45c
1489 
1490 One of the goals of MDMF as a GSoC project is to lay the groundwork for
1491 LDMF, a format that will allow Tahoe-LAFS to deal with and encourage
1492 multiple versions of a single cap on the grid. In line with this, there
1493 is a now a distinction between an overriding mutable file (which can be
1494 thought to correspond to the cap/unique identifier for that mutable
1495 file) and versions of the mutable file (which we can download, update,
1496 and so on). All download, upload, and modification operations end up
1497 happening on a particular version of a mutable file, but there are
1498 shortcut methods on the object representing the overriding mutable file
1499 that perform these operations on the best version of the mutable file
1500 (which is what code should be doing until we have LDMF and better
1501 support for other paradigms).
1502 
1503 Another goal of MDMF was to take advantage of segmentation to give
1504 callers more efficient partial file updates or appends. This patch
1505 implements methods that do that, too.
1506 
1507] {
1508hunk ./src/allmydata/mutable/filenode.py 7
1509 from zope.interface import implements
1510 from twisted.internet import defer, reactor
1511 from foolscap.api import eventually
1512-from allmydata.interfaces import IMutableFileNode, \
1513-     ICheckable, ICheckResults, NotEnoughSharesError
1514-from allmydata.util import hashutil, log
1515+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
1516+     NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION, IMutableUploadable, \
1517+     IMutableFileVersion, IWritable
1518+from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
1519 from allmydata.util.assertutil import precondition
1520 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
1521 from allmydata.monitor import Monitor
1522hunk ./src/allmydata/mutable/filenode.py 16
1523 from pycryptopp.cipher.aes import AES
1524 
1525-from allmydata.mutable.publish import Publish
1526+from allmydata.mutable.publish import Publish, MutableData,\
1527+                                      DEFAULT_MAX_SEGMENT_SIZE, \
1528+                                      TransformingUploadable
1529 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
1530      ResponseCache, UncoordinatedWriteError
1531 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
1532hunk ./src/allmydata/mutable/filenode.py 70
1533         self._sharemap = {} # known shares, shnum-to-[nodeids]
1534         self._cache = ResponseCache()
1535         self._most_recent_size = None
1536+        # filled in after __init__ if we're being created for the first time;
1537+        # filled in by the servermap updater before publishing, otherwise.
1538+        # set to this default value in case neither of those things happen,
1539+        # or in case the servermap can't find any shares to tell us what
1540+        # to publish as.
1541+        # TODO: Set this back to None, and find out why the tests fail
1542+        #       with it set to None.
1543+        self._protocol_version = None
1544 
1545         # all users of this MutableFileNode go through the serializer. This
1546         # takes advantage of the fact that Deferreds discard the callbacks
1547hunk ./src/allmydata/mutable/filenode.py 134
1548         return self._upload(initial_contents, None)
1549 
1550     def _get_initial_contents(self, contents):
1551-        if isinstance(contents, str):
1552-            return contents
1553         if contents is None:
1554hunk ./src/allmydata/mutable/filenode.py 135
1555-            return ""
1556+            return MutableData("")
1557+
1558+        if IMutableUploadable.providedBy(contents):
1559+            return contents
1560+
1561         assert callable(contents), "%s should be callable, not %s" % \
1562                (contents, type(contents))
1563         return contents(self)
1564hunk ./src/allmydata/mutable/filenode.py 209
1565 
1566     def get_size(self):
1567         return self._most_recent_size
1568+
1569     def get_current_size(self):
1570         d = self.get_size_of_best_version()
1571         d.addCallback(self._stash_size)
1572hunk ./src/allmydata/mutable/filenode.py 214
1573         return d
1574+
1575     def _stash_size(self, size):
1576         self._most_recent_size = size
1577         return size
1578hunk ./src/allmydata/mutable/filenode.py 273
1579             return cmp(self.__class__, them.__class__)
1580         return cmp(self._uri, them._uri)
1581 
1582-    def _do_serialized(self, cb, *args, **kwargs):
1583-        # note: to avoid deadlock, this callable is *not* allowed to invoke
1584-        # other serialized methods within this (or any other)
1585-        # MutableFileNode. The callable should be a bound method of this same
1586-        # MFN instance.
1587-        d = defer.Deferred()
1588-        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
1589-        # we need to put off d.callback until this Deferred is finished being
1590-        # processed. Otherwise the caller's subsequent activities (like,
1591-        # doing other things with this node) can cause reentrancy problems in
1592-        # the Deferred code itself
1593-        self._serializer.addBoth(lambda res: eventually(d.callback, res))
1594-        # add a log.err just in case something really weird happens, because
1595-        # self._serializer stays around forever, therefore we won't see the
1596-        # usual Unhandled Error in Deferred that would give us a hint.
1597-        self._serializer.addErrback(log.err)
1598-        return d
1599 
1600     #################################
1601     # ICheckable
1602hunk ./src/allmydata/mutable/filenode.py 298
1603 
1604 
1605     #################################
1606-    # IMutableFileNode
1607+    # IFileNode
1608+
1609+    def get_best_readable_version(self):
1610+        """
1611+        I return a Deferred that fires with a MutableFileVersion
1612+        representing the best readable version of the file that I
1613+        represent
1614+        """
1615+        return self.get_readable_version()
1616+
1617+
1618+    def get_readable_version(self, servermap=None, version=None):
1619+        """
1620+        I return a Deferred that fires with an MutableFileVersion for my
1621+        version argument, if there is a recoverable file of that version
1622+        on the grid. If there is no recoverable version, I fire with an
1623+        UnrecoverableFileError.
1624+
1625+        If a servermap is provided, I look in there for the requested
1626+        version. If no servermap is provided, I create and update a new
1627+        one.
1628+
1629+        If no version is provided, then I return a MutableFileVersion
1630+        representing the best recoverable version of the file.
1631+        """
1632+        d = self._get_version_from_servermap(MODE_READ, servermap, version)
1633+        def _build_version((servermap, their_version)):
1634+            assert their_version in servermap.recoverable_versions()
1635+            assert their_version in servermap.make_versionmap()
1636+
1637+            mfv = MutableFileVersion(self,
1638+                                     servermap,
1639+                                     their_version,
1640+                                     self._storage_index,
1641+                                     self._storage_broker,
1642+                                     self._readkey,
1643+                                     history=self._history)
1644+            assert mfv.is_readonly()
1645+            # our caller can use this to download the contents of the
1646+            # mutable file.
1647+            return mfv
1648+        return d.addCallback(_build_version)
1649+
1650+
1651+    def _get_version_from_servermap(self,
1652+                                    mode,
1653+                                    servermap=None,
1654+                                    version=None):
1655+        """
1656+        I return a Deferred that fires with (servermap, version).
1657+
1658+        This function performs validation and a servermap update. If it
1659+        returns (servermap, version), the caller can assume that:
1660+            - servermap was last updated in mode.
1661+            - version is recoverable, and corresponds to the servermap.
1662+
1663+        If version and servermap are provided to me, I will validate
1664+        that version exists in the servermap, and that the servermap was
1665+        updated correctly.
1666+
1667+        If version is not provided, but servermap is, I will validate
1668+        the servermap and return the best recoverable version that I can
1669+        find in the servermap.
1670+
1671+        If the version is provided but the servermap isn't, I will
1672+        obtain a servermap that has been updated in the correct mode and
1673+        validate that version is found and recoverable.
1674+
1675+        If neither servermap nor version are provided, I will obtain a
1676+        servermap updated in the correct mode, and return the best
1677+        recoverable version that I can find in there.
1678+        """
1679+        # XXX: wording ^^^^
1680+        if servermap and servermap.last_update_mode == mode:
1681+            d = defer.succeed(servermap)
1682+        else:
1683+            d = self._get_servermap(mode)
1684+
1685+        def _get_version(servermap, v):
1686+            if v and v not in servermap.recoverable_versions():
1687+                v = None
1688+            elif not v:
1689+                v = servermap.best_recoverable_version()
1690+            if not v:
1691+                raise UnrecoverableFileError("no recoverable versions")
1692+
1693+            return (servermap, v)
1694+        return d.addCallback(_get_version, version)
1695+
1696 
1697     def download_best_version(self):
1698hunk ./src/allmydata/mutable/filenode.py 389
1699+        """
1700+        I return a Deferred that fires with the contents of the best
1701+        version of this mutable file.
1702+        """
1703         return self._do_serialized(self._download_best_version)
1704hunk ./src/allmydata/mutable/filenode.py 394
1705+
1706+
1707     def _download_best_version(self):
1708hunk ./src/allmydata/mutable/filenode.py 397
1709-        servermap = ServerMap()
1710-        d = self._try_once_to_download_best_version(servermap, MODE_READ)
1711-        def _maybe_retry(f):
1712-            f.trap(NotEnoughSharesError)
1713-            # the download is worth retrying once. Make sure to use the
1714-            # old servermap, since it is what remembers the bad shares,
1715-            # but use MODE_WRITE to make it look for even more shares.
1716-            # TODO: consider allowing this to retry multiple times.. this
1717-            # approach will let us tolerate about 8 bad shares, I think.
1718-            return self._try_once_to_download_best_version(servermap,
1719-                                                           MODE_WRITE)
1720+        """
1721+        I am the serialized sibling of download_best_version.
1722+        """
1723+        d = self.get_best_readable_version()
1724+        d.addCallback(self._record_size)
1725+        d.addCallback(lambda version: version.download_to_data())
1726+
1727+        # It is possible that the download will fail because there
1728+        # aren't enough shares to be had. If so, we will try again after
1729+        # updating the servermap in MODE_WRITE, which may find more
1730+        # shares than updating in MODE_READ, as we just did. We can do
1731+        # this by getting the best mutable version and downloading from
1732+        # that -- the best mutable version will be a MutableFileVersion
1733+        # with a servermap that was last updated in MODE_WRITE, as we
1734+        # want. If this fails, then we give up.
1735+        def _maybe_retry(failure):
1736+            failure.trap(NotEnoughSharesError)
1737+
1738+            d = self.get_best_mutable_version()
1739+            d.addCallback(self._record_size)
1740+            d.addCallback(lambda version: version.download_to_data())
1741+            return d
1742+
1743         d.addErrback(_maybe_retry)
1744         return d
1745hunk ./src/allmydata/mutable/filenode.py 422
1746-    def _try_once_to_download_best_version(self, servermap, mode):
1747-        d = self._update_servermap(servermap, mode)
1748-        d.addCallback(self._once_updated_download_best_version, servermap)
1749-        return d
1750-    def _once_updated_download_best_version(self, ignored, servermap):
1751-        goal = servermap.best_recoverable_version()
1752-        if not goal:
1753-            raise UnrecoverableFileError("no recoverable versions")
1754-        return self._try_once_to_download_version(servermap, goal)
1755+
1756+
1757+    def _record_size(self, mfv):
1758+        """
1759+        I record the size of a mutable file version.
1760+        """
1761+        self._most_recent_size = mfv.get_size()
1762+        return mfv
1763+
1764 
1765     def get_size_of_best_version(self):
1766hunk ./src/allmydata/mutable/filenode.py 433
1767-        d = self.get_servermap(MODE_READ)
1768-        def _got_servermap(smap):
1769-            ver = smap.best_recoverable_version()
1770-            if not ver:
1771-                raise UnrecoverableFileError("no recoverable version")
1772-            return smap.size_of_version(ver)
1773-        d.addCallback(_got_servermap)
1774-        return d
1775+        """
1776+        I return the size of the best version of this mutable file.
1777 
1778hunk ./src/allmydata/mutable/filenode.py 436
1779+        This is equivalent to calling get_size() on the result of
1780+        get_best_readable_version().
1781+        """
1782+        d = self.get_best_readable_version()
1783+        return d.addCallback(lambda mfv: mfv.get_size())
1784+
1785+
1786+    #################################
1787+    # IMutableFileNode
1788+
1789+    def get_best_mutable_version(self, servermap=None):
1790+        """
1791+        I return a Deferred that fires with a MutableFileVersion
1792+        representing the best readable version of the file that I
1793+        represent. I am like get_best_readable_version, except that I
1794+        will try to make a writable version if I can.
1795+        """
1796+        return self.get_mutable_version(servermap=servermap)
1797+
1798+
1799+    def get_mutable_version(self, servermap=None, version=None):
1800+        """
1801+        I return a version of this mutable file. I return a Deferred
1802+        that fires with a MutableFileVersion
1803+
1804+        If version is provided, the Deferred will fire with a
1805+        MutableFileVersion initailized with that version. Otherwise, it
1806+        will fire with the best version that I can recover.
1807+
1808+        If servermap is provided, I will use that to find versions
1809+        instead of performing my own servermap update.
1810+        """
1811+        if self.is_readonly():
1812+            return self.get_readable_version(servermap=servermap,
1813+                                             version=version)
1814+
1815+        # get_mutable_version => write intent, so we require that the
1816+        # servermap is updated in MODE_WRITE
1817+        d = self._get_version_from_servermap(MODE_WRITE, servermap, version)
1818+        def _build_version((servermap, smap_version)):
1819+            # these should have been set by the servermap update.
1820+            assert self._secret_holder
1821+            assert self._writekey
1822+
1823+            mfv = MutableFileVersion(self,
1824+                                     servermap,
1825+                                     smap_version,
1826+                                     self._storage_index,
1827+                                     self._storage_broker,
1828+                                     self._readkey,
1829+                                     self._writekey,
1830+                                     self._secret_holder,
1831+                                     history=self._history)
1832+            assert not mfv.is_readonly()
1833+            return mfv
1834+
1835+        return d.addCallback(_build_version)
1836+
1837+
1838+    # XXX: I'm uncomfortable with the difference between upload and
1839+    #      overwrite, which, FWICT, is basically that you don't have to
1840+    #      do a servermap update before you overwrite. We split them up
1841+    #      that way anyway, so I guess there's no real difficulty in
1842+    #      offering both ways to callers, but it also makes the
1843+    #      public-facing API cluttery, and makes it hard to discern the
1844+    #      right way of doing things.
1845+
1846+    # In general, we leave it to callers to ensure that they aren't
1847+    # going to cause UncoordinatedWriteErrors when working with
1848+    # MutableFileVersions. We know that the next three operations
1849+    # (upload, overwrite, and modify) will all operate on the same
1850+    # version, so we say that only one of them can be going on at once,
1851+    # and serialize them to ensure that that actually happens, since as
1852+    # the caller in this situation it is our job to do that.
1853     def overwrite(self, new_contents):
1854hunk ./src/allmydata/mutable/filenode.py 511
1855+        """
1856+        I overwrite the contents of the best recoverable version of this
1857+        mutable file with new_contents. This is equivalent to calling
1858+        overwrite on the result of get_best_mutable_version with
1859+        new_contents as an argument. I return a Deferred that eventually
1860+        fires with the results of my replacement process.
1861+        """
1862         return self._do_serialized(self._overwrite, new_contents)
1863hunk ./src/allmydata/mutable/filenode.py 519
1864+
1865+
1866     def _overwrite(self, new_contents):
1867hunk ./src/allmydata/mutable/filenode.py 522
1868+        """
1869+        I am the serialized sibling of overwrite.
1870+        """
1871+        d = self.get_best_mutable_version()
1872+        d.addCallback(lambda mfv: mfv.overwrite(new_contents))
1873+        d.addCallback(self._did_upload, new_contents.get_size())
1874+        return d
1875+
1876+
1877+
1878+    def upload(self, new_contents, servermap):
1879+        """
1880+        I overwrite the contents of the best recoverable version of this
1881+        mutable file with new_contents, using servermap instead of
1882+        creating/updating our own servermap. I return a Deferred that
1883+        fires with the results of my upload.
1884+        """
1885+        return self._do_serialized(self._upload, new_contents, servermap)
1886+
1887+
1888+    def modify(self, modifier, backoffer=None):
1889+        """
1890+        I modify the contents of the best recoverable version of this
1891+        mutable file with the modifier. This is equivalent to calling
1892+        modify on the result of get_best_mutable_version. I return a
1893+        Deferred that eventually fires with an UploadResults instance
1894+        describing this process.
1895+        """
1896+        return self._do_serialized(self._modify, modifier, backoffer)
1897+
1898+
1899+    def _modify(self, modifier, backoffer):
1900+        """
1901+        I am the serialized sibling of modify.
1902+        """
1903+        d = self.get_best_mutable_version()
1904+        d.addCallback(lambda mfv: mfv.modify(modifier, backoffer))
1905+        return d
1906+
1907+
1908+    def download_version(self, servermap, version, fetch_privkey=False):
1909+        """
1910+        Download the specified version of this mutable file. I return a
1911+        Deferred that fires with the contents of the specified version
1912+        as a bytestring, or errbacks if the file is not recoverable.
1913+        """
1914+        d = self.get_readable_version(servermap, version)
1915+        return d.addCallback(lambda mfv: mfv.download_to_data(fetch_privkey))
1916+
1917+
1918+    def get_servermap(self, mode):
1919+        """
1920+        I return a servermap that has been updated in mode.
1921+
1922+        mode should be one of MODE_READ, MODE_WRITE, MODE_CHECK or
1923+        MODE_ANYTHING. See servermap.py for more on what these mean.
1924+        """
1925+        return self._do_serialized(self._get_servermap, mode)
1926+
1927+
1928+    def _get_servermap(self, mode):
1929+        """
1930+        I am a serialized twin to get_servermap.
1931+        """
1932         servermap = ServerMap()
1933hunk ./src/allmydata/mutable/filenode.py 587
1934-        d = self._update_servermap(servermap, mode=MODE_WRITE)
1935-        d.addCallback(lambda ignored: self._upload(new_contents, servermap))
1936+        d = self._update_servermap(servermap, mode)
1937+        # The servermap will tell us about the most recent size of the
1938+        # file, so we may as well set that so that callers might get
1939+        # more data about us.
1940+        if not self._most_recent_size:
1941+            d.addCallback(self._get_size_from_servermap)
1942+        return d
1943+
1944+
1945+    def _get_size_from_servermap(self, servermap):
1946+        """
1947+        I extract the size of the best version of this file and record
1948+        it in self._most_recent_size. I return the servermap that I was
1949+        given.
1950+        """
1951+        if servermap.recoverable_versions():
1952+            v = servermap.best_recoverable_version()
1953+            size = v[4] # verinfo[4] == size
1954+            self._most_recent_size = size
1955+        return servermap
1956+
1957+
1958+    def _update_servermap(self, servermap, mode):
1959+        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
1960+                             mode)
1961+        if self._history:
1962+            self._history.notify_mapupdate(u.get_status())
1963+        return u.update()
1964+
1965+
1966+    def set_version(self, version):
1967+        # I can be set in two ways:
1968+        #  1. When the node is created.
1969+        #  2. (for an existing share) when the Servermap is updated
1970+        #     before I am read.
1971+        assert version in (MDMF_VERSION, SDMF_VERSION)
1972+        self._protocol_version = version
1973+
1974+
1975+    def get_version(self):
1976+        return self._protocol_version
1977+
1978+
1979+    def _do_serialized(self, cb, *args, **kwargs):
1980+        # note: to avoid deadlock, this callable is *not* allowed to invoke
1981+        # other serialized methods within this (or any other)
1982+        # MutableFileNode. The callable should be a bound method of this same
1983+        # MFN instance.
1984+        d = defer.Deferred()
1985+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
1986+        # we need to put off d.callback until this Deferred is finished being
1987+        # processed. Otherwise the caller's subsequent activities (like,
1988+        # doing other things with this node) can cause reentrancy problems in
1989+        # the Deferred code itself
1990+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
1991+        # add a log.err just in case something really weird happens, because
1992+        # self._serializer stays around forever, therefore we won't see the
1993+        # usual Unhandled Error in Deferred that would give us a hint.
1994+        self._serializer.addErrback(log.err)
1995         return d
1996 
1997 
1998hunk ./src/allmydata/mutable/filenode.py 649
1999+    def _upload(self, new_contents, servermap):
2000+        """
2001+        A MutableFileNode still has to have some way of getting
2002+        published initially, which is what I am here for. After that,
2003+        all publishing, updating, modifying and so on happens through
2004+        MutableFileVersions.
2005+        """
2006+        assert self._pubkey, "update_servermap must be called before publish"
2007+
2008+        p = Publish(self, self._storage_broker, servermap)
2009+        if self._history:
2010+            self._history.notify_publish(p.get_status(),
2011+                                         new_contents.get_size())
2012+        d = p.publish(new_contents)
2013+        d.addCallback(self._did_upload, new_contents.get_size())
2014+        return d
2015+
2016+
2017+    def _did_upload(self, res, size):
2018+        self._most_recent_size = size
2019+        return res
2020+
2021+
2022+class MutableFileVersion:
2023+    """
2024+    I represent a specific version (most likely the best version) of a
2025+    mutable file.
2026+
2027+    Since I implement IReadable, instances which hold a
2028+    reference to an instance of me are guaranteed the ability (absent
2029+    connection difficulties or unrecoverable versions) to read the file
2030+    that I represent. Depending on whether I was initialized with a
2031+    write capability or not, I may also provide callers the ability to
2032+    overwrite or modify the contents of the mutable file that I
2033+    reference.
2034+    """
2035+    implements(IMutableFileVersion, IWritable)
2036+
2037+    def __init__(self,
2038+                 node,
2039+                 servermap,
2040+                 version,
2041+                 storage_index,
2042+                 storage_broker,
2043+                 readcap,
2044+                 writekey=None,
2045+                 write_secrets=None,
2046+                 history=None):
2047+
2048+        self._node = node
2049+        self._servermap = servermap
2050+        self._version = version
2051+        self._storage_index = storage_index
2052+        self._write_secrets = write_secrets
2053+        self._history = history
2054+        self._storage_broker = storage_broker
2055+
2056+        #assert isinstance(readcap, IURI)
2057+        self._readcap = readcap
2058+
2059+        self._writekey = writekey
2060+        self._serializer = defer.succeed(None)
2061+
2062+
2063+    def get_sequence_number(self):
2064+        """
2065+        Get the sequence number of the mutable version that I represent.
2066+        """
2067+        return self._version[0] # verinfo[0] == the sequence number
2068+
2069+
2070+    # TODO: Terminology?
2071+    def get_writekey(self):
2072+        """
2073+        I return a writekey or None if I don't have a writekey.
2074+        """
2075+        return self._writekey
2076+
2077+
2078+    def overwrite(self, new_contents):
2079+        """
2080+        I overwrite the contents of this mutable file version with the
2081+        data in new_contents.
2082+        """
2083+        assert not self.is_readonly()
2084+
2085+        return self._do_serialized(self._overwrite, new_contents)
2086+
2087+
2088+    def _overwrite(self, new_contents):
2089+        assert IMutableUploadable.providedBy(new_contents)
2090+        assert self._servermap.last_update_mode == MODE_WRITE
2091+
2092+        return self._upload(new_contents)
2093+
2094+
2095     def modify(self, modifier, backoffer=None):
2096         """I use a modifier callback to apply a change to the mutable file.
2097         I implement the following pseudocode::
2098hunk ./src/allmydata/mutable/filenode.py 785
2099         backoffer should not invoke any methods on this MutableFileNode
2100         instance, and it needs to be highly conscious of deadlock issues.
2101         """
2102+        assert not self.is_readonly()
2103+
2104         return self._do_serialized(self._modify, modifier, backoffer)
2105hunk ./src/allmydata/mutable/filenode.py 788
2106+
2107+
2108     def _modify(self, modifier, backoffer):
2109hunk ./src/allmydata/mutable/filenode.py 791
2110-        servermap = ServerMap()
2111         if backoffer is None:
2112             backoffer = BackoffAgent().delay
2113hunk ./src/allmydata/mutable/filenode.py 793
2114-        return self._modify_and_retry(servermap, modifier, backoffer, True)
2115-    def _modify_and_retry(self, servermap, modifier, backoffer, first_time):
2116-        d = self._modify_once(servermap, modifier, first_time)
2117+        return self._modify_and_retry(modifier, backoffer, True)
2118+
2119+
2120+    def _modify_and_retry(self, modifier, backoffer, first_time):
2121+        """
2122+        I try to apply modifier to the contents of this version of the
2123+        mutable file. If I succeed, I return an UploadResults instance
2124+        describing my success. If I fail, I try again after waiting for
2125+        a little bit.
2126+        """
2127+        log.msg("doing modify")
2128+        d = self._modify_once(modifier, first_time)
2129         def _retry(f):
2130             f.trap(UncoordinatedWriteError)
2131             d2 = defer.maybeDeferred(backoffer, self, f)
2132hunk ./src/allmydata/mutable/filenode.py 809
2133             d2.addCallback(lambda ignored:
2134-                           self._modify_and_retry(servermap, modifier,
2135+                           self._modify_and_retry(modifier,
2136                                                   backoffer, False))
2137             return d2
2138         d.addErrback(_retry)
2139hunk ./src/allmydata/mutable/filenode.py 814
2140         return d
2141-    def _modify_once(self, servermap, modifier, first_time):
2142-        d = self._update_servermap(servermap, MODE_WRITE)
2143-        d.addCallback(self._once_updated_download_best_version, servermap)
2144+
2145+
2146+    def _modify_once(self, modifier, first_time):
2147+        """
2148+        I attempt to apply a modifier to the contents of the mutable
2149+        file.
2150+        """
2151+        # XXX: This is wrong -- we could get more servers if we updated
2152+        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
2153+        # assert that the last update wasn't MODE_READ
2154+        assert self._servermap.last_update_mode == MODE_WRITE
2155+
2156+        # download_to_data is serialized, so we have to call this to
2157+        # avoid deadlock.
2158+        d = self._try_to_download_data()
2159         def _apply(old_contents):
2160hunk ./src/allmydata/mutable/filenode.py 830
2161-            new_contents = modifier(old_contents, servermap, first_time)
2162+            new_contents = modifier(old_contents, self._servermap, first_time)
2163+            precondition((isinstance(new_contents, str) or
2164+                          new_contents is None),
2165+                         "Modifier function must return a string "
2166+                         "or None")
2167+
2168             if new_contents is None or new_contents == old_contents:
2169hunk ./src/allmydata/mutable/filenode.py 837
2170+                log.msg("no changes")
2171                 # no changes need to be made
2172                 if first_time:
2173                     return
2174hunk ./src/allmydata/mutable/filenode.py 845
2175                 # recovery when it observes UCWE, we need to do a second
2176                 # publish. See #551 for details. We'll basically loop until
2177                 # we managed an uncontested publish.
2178-                new_contents = old_contents
2179-            precondition(isinstance(new_contents, str),
2180-                         "Modifier function must return a string or None")
2181-            return self._upload(new_contents, servermap)
2182+                old_uploadable = MutableData(old_contents)
2183+                new_contents = old_uploadable
2184+            else:
2185+                new_contents = MutableData(new_contents)
2186+
2187+            return self._upload(new_contents)
2188         d.addCallback(_apply)
2189         return d
2190 
2191hunk ./src/allmydata/mutable/filenode.py 854
2192-    def get_servermap(self, mode):
2193-        return self._do_serialized(self._get_servermap, mode)
2194-    def _get_servermap(self, mode):
2195-        servermap = ServerMap()
2196-        return self._update_servermap(servermap, mode)
2197-    def _update_servermap(self, servermap, mode):
2198-        u = ServermapUpdater(self, self._storage_broker, Monitor(), servermap,
2199-                             mode)
2200-        if self._history:
2201-            self._history.notify_mapupdate(u.get_status())
2202-        return u.update()
2203 
2204hunk ./src/allmydata/mutable/filenode.py 855
2205-    def download_version(self, servermap, version, fetch_privkey=False):
2206-        return self._do_serialized(self._try_once_to_download_version,
2207-                                   servermap, version, fetch_privkey)
2208-    def _try_once_to_download_version(self, servermap, version,
2209-                                      fetch_privkey=False):
2210-        r = Retrieve(self, servermap, version, fetch_privkey)
2211+    def is_readonly(self):
2212+        """
2213+        I return True if this MutableFileVersion provides no write
2214+        access to the file that it encapsulates, and False if it
2215+        provides the ability to modify the file.
2216+        """
2217+        return self._writekey is None
2218+
2219+
2220+    def is_mutable(self):
2221+        """
2222+        I return True, since mutable files are always mutable by
2223+        somebody.
2224+        """
2225+        return True
2226+
2227+
2228+    def get_storage_index(self):
2229+        """
2230+        I return the storage index of the reference that I encapsulate.
2231+        """
2232+        return self._storage_index
2233+
2234+
2235+    def get_size(self):
2236+        """
2237+        I return the length, in bytes, of this readable object.
2238+        """
2239+        return self._servermap.size_of_version(self._version)
2240+
2241+
2242+    def download_to_data(self, fetch_privkey=False):
2243+        """
2244+        I return a Deferred that fires with the contents of this
2245+        readable object as a byte string.
2246+
2247+        """
2248+        c = consumer.MemoryConsumer()
2249+        d = self.read(c, fetch_privkey=fetch_privkey)
2250+        d.addCallback(lambda mc: "".join(mc.chunks))
2251+        return d
2252+
2253+
2254+    def _try_to_download_data(self):
2255+        """
2256+        I am an unserialized cousin of download_to_data; I am called
2257+        from the children of modify() to download the data associated
2258+        with this mutable version.
2259+        """
2260+        c = consumer.MemoryConsumer()
2261+        # modify will almost certainly write, so we need the privkey.
2262+        d = self._read(c, fetch_privkey=True)
2263+        d.addCallback(lambda mc: "".join(mc.chunks))
2264+        return d
2265+
2266+
2267+    def read(self, consumer, offset=0, size=None, fetch_privkey=False):
2268+        """
2269+        I read a portion (possibly all) of the mutable file that I
2270+        reference into consumer.
2271+        """
2272+        return self._do_serialized(self._read, consumer, offset, size,
2273+                                   fetch_privkey)
2274+
2275+
2276+    def _read(self, consumer, offset=0, size=None, fetch_privkey=False):
2277+        """
2278+        I am the serialized companion of read.
2279+        """
2280+        r = Retrieve(self._node, self._servermap, self._version, fetch_privkey)
2281         if self._history:
2282             self._history.notify_retrieve(r.get_status())
2283hunk ./src/allmydata/mutable/filenode.py 927
2284-        d = r.download()
2285-        d.addCallback(self._downloaded_version)
2286+        d = r.download(consumer, offset, size)
2287         return d
2288hunk ./src/allmydata/mutable/filenode.py 929
2289-    def _downloaded_version(self, data):
2290-        self._most_recent_size = len(data)
2291-        return data
2292 
2293hunk ./src/allmydata/mutable/filenode.py 930
2294-    def upload(self, new_contents, servermap):
2295-        return self._do_serialized(self._upload, new_contents, servermap)
2296-    def _upload(self, new_contents, servermap):
2297-        assert self._pubkey, "update_servermap must be called before publish"
2298-        p = Publish(self, self._storage_broker, servermap)
2299+
2300+    def _do_serialized(self, cb, *args, **kwargs):
2301+        # note: to avoid deadlock, this callable is *not* allowed to invoke
2302+        # other serialized methods within this (or any other)
2303+        # MutableFileNode. The callable should be a bound method of this same
2304+        # MFN instance.
2305+        d = defer.Deferred()
2306+        self._serializer.addCallback(lambda ignore: cb(*args, **kwargs))
2307+        # we need to put off d.callback until this Deferred is finished being
2308+        # processed. Otherwise the caller's subsequent activities (like,
2309+        # doing other things with this node) can cause reentrancy problems in
2310+        # the Deferred code itself
2311+        self._serializer.addBoth(lambda res: eventually(d.callback, res))
2312+        # add a log.err just in case something really weird happens, because
2313+        # self._serializer stays around forever, therefore we won't see the
2314+        # usual Unhandled Error in Deferred that would give us a hint.
2315+        self._serializer.addErrback(log.err)
2316+        return d
2317+
2318+
2319+    def _upload(self, new_contents):
2320+        #assert self._pubkey, "update_servermap must be called before publish"
2321+        p = Publish(self._node, self._storage_broker, self._servermap)
2322         if self._history:
2323hunk ./src/allmydata/mutable/filenode.py 954
2324-            self._history.notify_publish(p.get_status(), len(new_contents))
2325+            self._history.notify_publish(p.get_status(),
2326+                                         new_contents.get_size())
2327         d = p.publish(new_contents)
2328hunk ./src/allmydata/mutable/filenode.py 957
2329-        d.addCallback(self._did_upload, len(new_contents))
2330+        d.addCallback(self._did_upload, new_contents.get_size())
2331         return d
2332hunk ./src/allmydata/mutable/filenode.py 959
2333+
2334+
2335     def _did_upload(self, res, size):
2336         self._most_recent_size = size
2337         return res
2338hunk ./src/allmydata/mutable/filenode.py 964
2339+
2340+    def update(self, data, offset):
2341+        """
2342+        Do an update of this mutable file version by inserting data at
2343+        offset within the file. If offset is the EOF, this is an append
2344+        operation. I return a Deferred that fires with the results of
2345+        the update operation when it has completed.
2346+
2347+        In cases where update does not append any data, or where it does
2348+        not append so many blocks that the block count crosses a
2349+        power-of-two boundary, this operation will use roughly
2350+        O(data.get_size()) memory/bandwidth/CPU to perform the update.
2351+        Otherwise, it must download, re-encode, and upload the entire
2352+        file again, which will use O(filesize) resources.
2353+        """
2354+        return self._do_serialized(self._update, data, offset)
2355+
2356+
2357+    def _update(self, data, offset):
2358+        """
2359+        I update the mutable file version represented by this particular
2360+        IMutableVersion by inserting the data in data at the offset
2361+        offset. I return a Deferred that fires when this has been
2362+        completed.
2363+        """
2364+        # We have two cases here:
2365+        # 1. The new data will add few enough segments so that it does
2366+        #    not cross into the next power-of-two boundary.
2367+        # 2. It doesn't.
2368+        #
2369+        # In the former case, we can modify the file in place. In the
2370+        # latter case, we need to re-encode the file.
2371+        new_size = data.get_size() + offset
2372+        old_size = self.get_size()
2373+        segment_size = self._version[3]
2374+        num_old_segments = mathutil.div_ceil(old_size,
2375+                                             segment_size)
2376+        num_new_segments = mathutil.div_ceil(new_size,
2377+                                             segment_size)
2378+        log.msg("got %d old segments, %d new segments" % \
2379+                        (num_old_segments, num_new_segments))
2380+
2381+        # We also do a whole file re-encode if the file is an SDMF file.
2382+        if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
2383+            log.msg("doing re-encode instead of in-place update")
2384+            return self._do_modify_update(data, offset)
2385+
2386+        log.msg("updating in place")
2387+        d = self._do_update_update(data, offset)
2388+        d.addCallback(self._decode_and_decrypt_segments, data, offset)
2389+        d.addCallback(self._build_uploadable_and_finish, data, offset)
2390+        return d
2391+
2392+
2393+    def _do_modify_update(self, data, offset):
2394+        """
2395+        I perform a file update by modifying the contents of the file
2396+        after downloading it, then reuploading it. I am less efficient
2397+        than _do_update_update, but am necessary for certain updates.
2398+        """
2399+        def m(old, servermap, first_time):
2400+            start = offset
2401+            rest = offset + data.get_size()
2402+            new = old[:start]
2403+            new += "".join(data.read(data.get_size()))
2404+            new += old[rest:]
2405+            return new
2406+        return self._modify(m, None)
2407+
2408+
2409+    def _do_update_update(self, data, offset):
2410+        """
2411+        I start the Servermap update that gets us the data we need to
2412+        continue the update process. I return a Deferred that fires when
2413+        the servermap update is done.
2414+        """
2415+        assert IMutableUploadable.providedBy(data)
2416+        assert self.is_mutable()
2417+        # offset == self.get_size() is valid and means that we are
2418+        # appending data to the file.
2419+        assert offset <= self.get_size()
2420+
2421+        # We'll need the segment that the data starts in, regardless of
2422+        # what we'll do later.
2423+        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
2424+        start_segment -= 1
2425+
2426+        # We only need the end segment if the data we append does not go
2427+        # beyond the current end-of-file.
2428+        end_segment = start_segment
2429+        if offset + data.get_size() < self.get_size():
2430+            end_data = offset + data.get_size()
2431+            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
2432+            end_segment -= 1
2433+        self._start_segment = start_segment
2434+        self._end_segment = end_segment
2435+
2436+        # Now ask for the servermap to be updated in MODE_WRITE with
2437+        # this update range.
2438+        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
2439+                             self._servermap,
2440+                             mode=MODE_WRITE,
2441+                             update_range=(start_segment, end_segment))
2442+        return u.update()
2443+
2444+
2445+    def _decode_and_decrypt_segments(self, ignored, data, offset):
2446+        """
2447+        After the servermap update, I take the encrypted and encoded
2448+        data that the servermap fetched while doing its update and
2449+        transform it into decoded-and-decrypted plaintext that can be
2450+        used by the new uploadable. I return a Deferred that fires with
2451+        the segments.
2452+        """
2453+        r = Retrieve(self._node, self._servermap, self._version)
2454+        # decode: takes in our blocks and salts from the servermap,
2455+        # returns a Deferred that fires with the corresponding plaintext
2456+        # segments. Does not download -- simply takes advantage of
2457+        # existing infrastructure within the Retrieve class to avoid
2458+        # duplicating code.
2459+        sm = self._servermap
2460+        # XXX: If the methods in the servermap don't work as
2461+        # abstractions, you should rewrite them instead of going around
2462+        # them.
2463+        update_data = sm.update_data
2464+        start_segments = {} # shnum -> start segment
2465+        end_segments = {} # shnum -> end segment
2466+        blockhashes = {} # shnum -> blockhash tree
2467+        for (shnum, data) in update_data.iteritems():
2468+            data = [d[1] for d in data if d[0] == self._version]
2469+
2470+            # Every data entry in our list should now be share shnum for
2471+            # a particular version of the mutable file, so all of the
2472+            # entries should be identical.
2473+            datum = data[0]
2474+            assert filter(lambda x: x != datum, data) == []
2475+
2476+            blockhashes[shnum] = datum[0]
2477+            start_segments[shnum] = datum[1]
2478+            end_segments[shnum] = datum[2]
2479+
2480+        d1 = r.decode(start_segments, self._start_segment)
2481+        d2 = r.decode(end_segments, self._end_segment)
2482+        d3 = defer.succeed(blockhashes)
2483+        return deferredutil.gatherResults([d1, d2, d3])
2484+
2485+
2486+    def _build_uploadable_and_finish(self, segments_and_bht, data, offset):
2487+        """
2488+        After the process has the plaintext segments, I build the
2489+        TransformingUploadable that the publisher will eventually
2490+        re-upload to the grid. I then invoke the publisher with that
2491+        uploadable, and return a Deferred when the publish operation has
2492+        completed without issue.
2493+        """
2494+        u = TransformingUploadable(data, offset,
2495+                                   self._version[3],
2496+                                   segments_and_bht[0],
2497+                                   segments_and_bht[1])
2498+        p = Publish(self._node, self._storage_broker, self._servermap)
2499+        return p.update(u, offset, segments_and_bht[2], self._version)
2500}
2501[mutable/publish.py: Modify the publish process to support MDMF
2502Kevan Carstensen <kevan@isnotajoke.com>**20100819003342
2503 Ignore-this: 2bb379974927e2e20cff75bae8302d1d
2504 
2505 The inner workings of the publishing process needed to be reworked to a
2506 large extend to cope with segmented mutable files, and to cope with
2507 partial-file updates of mutable files. This patch does that. It also
2508 introduces wrappers for uploadable data, allowing the use of
2509 filehandle-like objects as data sources, in addition to strings. This
2510 reduces memory inefficiency when dealing with large files through the
2511 webapi, and clarifies update code there.
2512] {
2513hunk ./src/allmydata/mutable/publish.py 3
2514 
2515 
2516-import os, struct, time
2517+import os, time
2518+from StringIO import StringIO
2519 from itertools import count
2520 from zope.interface import implements
2521 from twisted.internet import defer
2522hunk ./src/allmydata/mutable/publish.py 9
2523 from twisted.python import failure
2524-from allmydata.interfaces import IPublishStatus
2525+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
2526+                                 IMutableUploadable
2527 from allmydata.util import base32, hashutil, mathutil, idlib, log
2528 from allmydata.util.dictutil import DictOfSets
2529 from allmydata import hashtree, codec
2530hunk ./src/allmydata/mutable/publish.py 21
2531 from allmydata.mutable.common import MODE_WRITE, MODE_CHECK, \
2532      UncoordinatedWriteError, NotEnoughServersError
2533 from allmydata.mutable.servermap import ServerMap
2534-from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
2535-     unpack_checkstring, SIGNED_PREFIX
2536+from allmydata.mutable.layout import unpack_checkstring, MDMFSlotWriteProxy, \
2537+                                     SDMFSlotWriteProxy
2538+
2539+KiB = 1024
2540+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
2541+PUSHING_BLOCKS_STATE = 0
2542+PUSHING_EVERYTHING_ELSE_STATE = 1
2543+DONE_STATE = 2
2544 
2545 class PublishStatus:
2546     implements(IPublishStatus)
2547hunk ./src/allmydata/mutable/publish.py 118
2548         self._status.set_helper(False)
2549         self._status.set_progress(0.0)
2550         self._status.set_active(True)
2551+        self._version = self._node.get_version()
2552+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
2553+
2554 
2555     def get_status(self):
2556         return self._status
2557hunk ./src/allmydata/mutable/publish.py 132
2558             kwargs["facility"] = "tahoe.mutable.publish"
2559         return log.msg(*args, **kwargs)
2560 
2561+
2562+    def update(self, data, offset, blockhashes, version):
2563+        """
2564+        I replace the contents of this file with the contents of data,
2565+        starting at offset. I return a Deferred that fires with None
2566+        when the replacement has been completed, or with an error if
2567+        something went wrong during the process.
2568+
2569+        Note that this process will not upload new shares. If the file
2570+        being updated is in need of repair, callers will have to repair
2571+        it on their own.
2572+        """
2573+        # How this works:
2574+        # 1: Make peer assignments. We'll assign each share that we know
2575+        # about on the grid to that peer that currently holds that
2576+        # share, and will not place any new shares.
2577+        # 2: Setup encoding parameters. Most of these will stay the same
2578+        # -- datalength will change, as will some of the offsets.
2579+        # 3. Upload the new segments.
2580+        # 4. Be done.
2581+        assert IMutableUploadable.providedBy(data)
2582+
2583+        self.data = data
2584+
2585+        # XXX: Use the MutableFileVersion instead.
2586+        self.datalength = self._node.get_size()
2587+        if data.get_size() > self.datalength:
2588+            self.datalength = data.get_size()
2589+
2590+        self.log("starting update")
2591+        self.log("adding new data of length %d at offset %d" % \
2592+                    (data.get_size(), offset))
2593+        self.log("new data length is %d" % self.datalength)
2594+        self._status.set_size(self.datalength)
2595+        self._status.set_status("Started")
2596+        self._started = time.time()
2597+
2598+        self.done_deferred = defer.Deferred()
2599+
2600+        self._writekey = self._node.get_writekey()
2601+        assert self._writekey, "need write capability to publish"
2602+
2603+        # first, which servers will we publish to? We require that the
2604+        # servermap was updated in MODE_WRITE, so we can depend upon the
2605+        # peerlist computed by that process instead of computing our own.
2606+        assert self._servermap
2607+        assert self._servermap.last_update_mode in (MODE_WRITE, MODE_CHECK)
2608+        # we will push a version that is one larger than anything present
2609+        # in the grid, according to the servermap.
2610+        self._new_seqnum = self._servermap.highest_seqnum() + 1
2611+        self._status.set_servermap(self._servermap)
2612+
2613+        self.log(format="new seqnum will be %(seqnum)d",
2614+                 seqnum=self._new_seqnum, level=log.NOISY)
2615+
2616+        # We're updating an existing file, so all of the following
2617+        # should be available.
2618+        self.readkey = self._node.get_readkey()
2619+        self.required_shares = self._node.get_required_shares()
2620+        assert self.required_shares is not None
2621+        self.total_shares = self._node.get_total_shares()
2622+        assert self.total_shares is not None
2623+        self._status.set_encoding(self.required_shares, self.total_shares)
2624+
2625+        self._pubkey = self._node.get_pubkey()
2626+        assert self._pubkey
2627+        self._privkey = self._node.get_privkey()
2628+        assert self._privkey
2629+        self._encprivkey = self._node.get_encprivkey()
2630+
2631+        sb = self._storage_broker
2632+        full_peerlist = sb.get_servers_for_index(self._storage_index)
2633+        self.full_peerlist = full_peerlist # for use later, immutable
2634+        self.bad_peers = set() # peerids who have errbacked/refused requests
2635+
2636+        # This will set self.segment_size, self.num_segments, and
2637+        # self.fec. TODO: Does it know how to do the offset? Probably
2638+        # not. So do that part next.
2639+        self.setup_encoding_parameters(offset=offset)
2640+
2641+        # if we experience any surprises (writes which were rejected because
2642+        # our test vector did not match, or shares which we didn't expect to
2643+        # see), we set this flag and report an UncoordinatedWriteError at the
2644+        # end of the publish process.
2645+        self.surprised = False
2646+
2647+        # we keep track of three tables. The first is our goal: which share
2648+        # we want to see on which servers. This is initially populated by the
2649+        # existing servermap.
2650+        self.goal = set() # pairs of (peerid, shnum) tuples
2651+
2652+        # the second table is our list of outstanding queries: those which
2653+        # are in flight and may or may not be delivered, accepted, or
2654+        # acknowledged. Items are added to this table when the request is
2655+        # sent, and removed when the response returns (or errbacks).
2656+        self.outstanding = set() # (peerid, shnum) tuples
2657+
2658+        # the third is a table of successes: share which have actually been
2659+        # placed. These are populated when responses come back with success.
2660+        # When self.placed == self.goal, we're done.
2661+        self.placed = set() # (peerid, shnum) tuples
2662+
2663+        # we also keep a mapping from peerid to RemoteReference. Each time we
2664+        # pull a connection out of the full peerlist, we add it to this for
2665+        # use later.
2666+        self.connections = {}
2667+
2668+        self.bad_share_checkstrings = {}
2669+
2670+        # This is set at the last step of the publishing process.
2671+        self.versioninfo = ""
2672+
2673+        # we use the servermap to populate the initial goal: this way we will
2674+        # try to update each existing share in place. Since we're
2675+        # updating, we ignore damaged and missing shares -- callers must
2676+        # do a repair to repair and recreate these.
2677+        for (peerid, shnum) in self._servermap.servermap:
2678+            self.goal.add( (peerid, shnum) )
2679+            self.connections[peerid] = self._servermap.connections[peerid]
2680+        self.writers = {}
2681+
2682+        # SDMF files are updated differently.
2683+        self._version = MDMF_VERSION
2684+        writer_class = MDMFSlotWriteProxy
2685+
2686+        # For each (peerid, shnum) in self.goal, we make a
2687+        # write proxy for that peer. We'll use this to write
2688+        # shares to the peer.
2689+        for key in self.goal:
2690+            peerid, shnum = key
2691+            write_enabler = self._node.get_write_enabler(peerid)
2692+            renew_secret = self._node.get_renewal_secret(peerid)
2693+            cancel_secret = self._node.get_cancel_secret(peerid)
2694+            secrets = (write_enabler, renew_secret, cancel_secret)
2695+
2696+            self.writers[shnum] =  writer_class(shnum,
2697+                                                self.connections[peerid],
2698+                                                self._storage_index,
2699+                                                secrets,
2700+                                                self._new_seqnum,
2701+                                                self.required_shares,
2702+                                                self.total_shares,
2703+                                                self.segment_size,
2704+                                                self.datalength)
2705+            self.writers[shnum].peerid = peerid
2706+            assert (peerid, shnum) in self._servermap.servermap
2707+            old_versionid, old_timestamp = self._servermap.servermap[key]
2708+            (old_seqnum, old_root_hash, old_salt, old_segsize,
2709+             old_datalength, old_k, old_N, old_prefix,
2710+             old_offsets_tuple) = old_versionid
2711+            self.writers[shnum].set_checkstring(old_seqnum,
2712+                                                old_root_hash,
2713+                                                old_salt)
2714+
2715+        # Our remote shares will not have a complete checkstring until
2716+        # after we are done writing share data and have started to write
2717+        # blocks. In the meantime, we need to know what to look for when
2718+        # writing, so that we can detect UncoordinatedWriteErrors.
2719+        self._checkstring = self.writers.values()[0].get_checkstring()
2720+
2721+        # Now, we start pushing shares.
2722+        self._status.timings["setup"] = time.time() - self._started
2723+        # First, we encrypt, encode, and publish the shares that we need
2724+        # to encrypt, encode, and publish.
2725+
2726+        # Our update process fetched these for us. We need to update
2727+        # them in place as publishing happens.
2728+        self.blockhashes = {} # (shnum, [blochashes])
2729+        for (i, bht) in blockhashes.iteritems():
2730+            # We need to extract the leaves from our old hash tree.
2731+            old_segcount = mathutil.div_ceil(version[4],
2732+                                             version[3])
2733+            h = hashtree.IncompleteHashTree(old_segcount)
2734+            bht = dict(enumerate(bht))
2735+            h.set_hashes(bht)
2736+            leaves = h[h.get_leaf_index(0):]
2737+            for j in xrange(self.num_segments - len(leaves)):
2738+                leaves.append(None)
2739+
2740+            assert len(leaves) >= self.num_segments
2741+            self.blockhashes[i] = leaves
2742+            # This list will now be the leaves that were set during the
2743+            # initial upload + enough empty hashes to make it a
2744+            # power-of-two. If we exceed a power of two boundary, we
2745+            # should be encoding the file over again, and should not be
2746+            # here. So, we have
2747+            #assert len(self.blockhashes[i]) == \
2748+            #    hashtree.roundup_pow2(self.num_segments), \
2749+            #        len(self.blockhashes[i])
2750+            # XXX: Except this doesn't work. Figure out why.
2751+
2752+        # These are filled in later, after we've modified the block hash
2753+        # tree suitably.
2754+        self.sharehash_leaves = None # eventually [sharehashes]
2755+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2756+                              # validate the share]
2757+
2758+        self.log("Starting push")
2759+
2760+        self._state = PUSHING_BLOCKS_STATE
2761+        self._push()
2762+
2763+        return self.done_deferred
2764+
2765+
2766     def publish(self, newdata):
2767         """Publish the filenode's current contents.  Returns a Deferred that
2768         fires (with None) when the publish has done as much work as it's ever
2769hunk ./src/allmydata/mutable/publish.py 344
2770         simultaneous write.
2771         """
2772 
2773-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
2774-        # 2: perform peer selection, get candidate servers
2775-        #  2a: send queries to n+epsilon servers, to determine current shares
2776-        #  2b: based upon responses, create target map
2777-        # 3: send slot_testv_and_readv_and_writev messages
2778-        # 4: as responses return, update share-dispatch table
2779-        # 4a: may need to run recovery algorithm
2780-        # 5: when enough responses are back, we're done
2781+        # 0. Setup encoding parameters, encoder, and other such things.
2782+        # 1. Encrypt, encode, and publish segments.
2783+        assert IMutableUploadable.providedBy(newdata)
2784 
2785hunk ./src/allmydata/mutable/publish.py 348
2786-        self.log("starting publish, datalen is %s" % len(newdata))
2787-        self._status.set_size(len(newdata))
2788+        self.data = newdata
2789+        self.datalength = newdata.get_size()
2790+        #if self.datalength >= DEFAULT_MAX_SEGMENT_SIZE:
2791+        #    self._version = MDMF_VERSION
2792+        #else:
2793+        #    self._version = SDMF_VERSION
2794+
2795+        self.log("starting publish, datalen is %s" % self.datalength)
2796+        self._status.set_size(self.datalength)
2797         self._status.set_status("Started")
2798         self._started = time.time()
2799 
2800hunk ./src/allmydata/mutable/publish.py 405
2801         self.full_peerlist = full_peerlist # for use later, immutable
2802         self.bad_peers = set() # peerids who have errbacked/refused requests
2803 
2804-        self.newdata = newdata
2805-        self.salt = os.urandom(16)
2806-
2807+        # This will set self.segment_size, self.num_segments, and
2808+        # self.fec.
2809         self.setup_encoding_parameters()
2810 
2811         # if we experience any surprises (writes which were rejected because
2812hunk ./src/allmydata/mutable/publish.py 415
2813         # end of the publish process.
2814         self.surprised = False
2815 
2816-        # as a failsafe, refuse to iterate through self.loop more than a
2817-        # thousand times.
2818-        self.looplimit = 1000
2819-
2820         # we keep track of three tables. The first is our goal: which share
2821         # we want to see on which servers. This is initially populated by the
2822         # existing servermap.
2823hunk ./src/allmydata/mutable/publish.py 438
2824 
2825         self.bad_share_checkstrings = {}
2826 
2827+        # This is set at the last step of the publishing process.
2828+        self.versioninfo = ""
2829+
2830         # we use the servermap to populate the initial goal: this way we will
2831         # try to update each existing share in place.
2832         for (peerid, shnum) in self._servermap.servermap:
2833hunk ./src/allmydata/mutable/publish.py 454
2834             self.bad_share_checkstrings[key] = old_checkstring
2835             self.connections[peerid] = self._servermap.connections[peerid]
2836 
2837-        # create the shares. We'll discard these as they are delivered. SDMF:
2838-        # we're allowed to hold everything in memory.
2839+        # TODO: Make this part do peer selection.
2840+        self.update_goal()
2841+        self.writers = {}
2842+        if self._version == MDMF_VERSION:
2843+            writer_class = MDMFSlotWriteProxy
2844+        else:
2845+            writer_class = SDMFSlotWriteProxy
2846 
2847hunk ./src/allmydata/mutable/publish.py 462
2848+        # For each (peerid, shnum) in self.goal, we make a
2849+        # write proxy for that peer. We'll use this to write
2850+        # shares to the peer.
2851+        for key in self.goal:
2852+            peerid, shnum = key
2853+            write_enabler = self._node.get_write_enabler(peerid)
2854+            renew_secret = self._node.get_renewal_secret(peerid)
2855+            cancel_secret = self._node.get_cancel_secret(peerid)
2856+            secrets = (write_enabler, renew_secret, cancel_secret)
2857+
2858+            self.writers[shnum] =  writer_class(shnum,
2859+                                                self.connections[peerid],
2860+                                                self._storage_index,
2861+                                                secrets,
2862+                                                self._new_seqnum,
2863+                                                self.required_shares,
2864+                                                self.total_shares,
2865+                                                self.segment_size,
2866+                                                self.datalength)
2867+            self.writers[shnum].peerid = peerid
2868+            if (peerid, shnum) in self._servermap.servermap:
2869+                old_versionid, old_timestamp = self._servermap.servermap[key]
2870+                (old_seqnum, old_root_hash, old_salt, old_segsize,
2871+                 old_datalength, old_k, old_N, old_prefix,
2872+                 old_offsets_tuple) = old_versionid
2873+                self.writers[shnum].set_checkstring(old_seqnum,
2874+                                                    old_root_hash,
2875+                                                    old_salt)
2876+            elif (peerid, shnum) in self.bad_share_checkstrings:
2877+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
2878+                self.writers[shnum].set_checkstring(old_checkstring)
2879+
2880+        # Our remote shares will not have a complete checkstring until
2881+        # after we are done writing share data and have started to write
2882+        # blocks. In the meantime, we need to know what to look for when
2883+        # writing, so that we can detect UncoordinatedWriteErrors.
2884+        self._checkstring = self.writers.values()[0].get_checkstring()
2885+
2886+        # Now, we start pushing shares.
2887         self._status.timings["setup"] = time.time() - self._started
2888hunk ./src/allmydata/mutable/publish.py 502
2889-        d = self._encrypt_and_encode()
2890-        d.addCallback(self._generate_shares)
2891-        def _start_pushing(res):
2892-            self._started_pushing = time.time()
2893-            return res
2894-        d.addCallback(_start_pushing)
2895-        d.addCallback(self.loop) # trigger delivery
2896-        d.addErrback(self._fatal_error)
2897+        # First, we encrypt, encode, and publish the shares that we need
2898+        # to encrypt, encode, and publish.
2899+
2900+        # This will eventually hold the block hash chain for each share
2901+        # that we publish. We define it this way so that empty publishes
2902+        # will still have something to write to the remote slot.
2903+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
2904+        for i in xrange(self.total_shares):
2905+            blocks = self.blockhashes[i]
2906+            for j in xrange(self.num_segments):
2907+                blocks.append(None)
2908+        self.sharehash_leaves = None # eventually [sharehashes]
2909+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
2910+                              # validate the share]
2911+
2912+        self.log("Starting push")
2913+
2914+        self._state = PUSHING_BLOCKS_STATE
2915+        self._push()
2916 
2917         return self.done_deferred
2918 
2919hunk ./src/allmydata/mutable/publish.py 524
2920-    def setup_encoding_parameters(self):
2921-        segment_size = len(self.newdata)
2922+
2923+    def _update_status(self):
2924+        self._status.set_status("Sending Shares: %d placed out of %d, "
2925+                                "%d messages outstanding" %
2926+                                (len(self.placed),
2927+                                 len(self.goal),
2928+                                 len(self.outstanding)))
2929+        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
2930+
2931+
2932+    def setup_encoding_parameters(self, offset=0):
2933+        if self._version == MDMF_VERSION:
2934+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
2935+        else:
2936+            segment_size = self.datalength # SDMF is only one segment
2937         # this must be a multiple of self.required_shares
2938         segment_size = mathutil.next_multiple(segment_size,
2939                                               self.required_shares)
2940hunk ./src/allmydata/mutable/publish.py 543
2941         self.segment_size = segment_size
2942+
2943+        # Calculate the starting segment for the upload.
2944         if segment_size:
2945hunk ./src/allmydata/mutable/publish.py 546
2946-            self.num_segments = mathutil.div_ceil(len(self.newdata),
2947+            self.num_segments = mathutil.div_ceil(self.datalength,
2948                                                   segment_size)
2949hunk ./src/allmydata/mutable/publish.py 548
2950+            self.starting_segment = mathutil.div_ceil(offset,
2951+                                                      segment_size)
2952+            self.starting_segment -= 1
2953+            if offset == 0:
2954+                self.starting_segment = 0
2955+
2956         else:
2957             self.num_segments = 0
2958hunk ./src/allmydata/mutable/publish.py 556
2959-        assert self.num_segments in [0, 1,] # SDMF restrictions
2960+            self.starting_segment = 0
2961+
2962+
2963+        self.log("building encoding parameters for file")
2964+        self.log("got segsize %d" % self.segment_size)
2965+        self.log("got %d segments" % self.num_segments)
2966+
2967+        if self._version == SDMF_VERSION:
2968+            assert self.num_segments in (0, 1) # SDMF
2969+        # calculate the tail segment size.
2970+
2971+        if segment_size and self.datalength:
2972+            self.tail_segment_size = self.datalength % segment_size
2973+            self.log("got tail segment size %d" % self.tail_segment_size)
2974+        else:
2975+            self.tail_segment_size = 0
2976+
2977+        if self.tail_segment_size == 0 and segment_size:
2978+            # The tail segment is the same size as the other segments.
2979+            self.tail_segment_size = segment_size
2980+
2981+        # Make FEC encoders
2982+        fec = codec.CRSEncoder()
2983+        fec.set_params(self.segment_size,
2984+                       self.required_shares, self.total_shares)
2985+        self.piece_size = fec.get_block_size()
2986+        self.fec = fec
2987+
2988+        if self.tail_segment_size == self.segment_size:
2989+            self.tail_fec = self.fec
2990+        else:
2991+            tail_fec = codec.CRSEncoder()
2992+            tail_fec.set_params(self.tail_segment_size,
2993+                                self.required_shares,
2994+                                self.total_shares)
2995+            self.tail_fec = tail_fec
2996+
2997+        self._current_segment = self.starting_segment
2998+        self.end_segment = self.num_segments - 1
2999+        # Now figure out where the last segment should be.
3000+        if self.data.get_size() != self.datalength:
3001+            end = self.data.get_size()
3002+            self.end_segment = mathutil.div_ceil(end,
3003+                                                 segment_size)
3004+            self.end_segment -= 1
3005+        self.log("got start segment %d" % self.starting_segment)
3006+        self.log("got end segment %d" % self.end_segment)
3007+
3008+
3009+    def _push(self, ignored=None):
3010+        """
3011+        I manage state transitions. In particular, I see that we still
3012+        have a good enough number of writers to complete the upload
3013+        successfully.
3014+        """
3015+        # Can we still successfully publish this file?
3016+        # TODO: Keep track of outstanding queries before aborting the
3017+        #       process.
3018+        if len(self.writers) <= self.required_shares or self.surprised:
3019+            return self._failure()
3020+
3021+        # Figure out what we need to do next. Each of these needs to
3022+        # return a deferred so that we don't block execution when this
3023+        # is first called in the upload method.
3024+        if self._state == PUSHING_BLOCKS_STATE:
3025+            return self.push_segment(self._current_segment)
3026+
3027+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
3028+            return self.push_everything_else()
3029+
3030+        # If we make it to this point, we were successful in placing the
3031+        # file.
3032+        return self._done(None)
3033+
3034+
3035+    def push_segment(self, segnum):
3036+        if self.num_segments == 0 and self._version == SDMF_VERSION:
3037+            self._add_dummy_salts()
3038 
3039hunk ./src/allmydata/mutable/publish.py 635
3040-    def _fatal_error(self, f):
3041-        self.log("error during loop", failure=f, level=log.UNUSUAL)
3042-        self._done(f)
3043+        if segnum > self.end_segment:
3044+            # We don't have any more segments to push.
3045+            self._state = PUSHING_EVERYTHING_ELSE_STATE
3046+            return self._push()
3047+
3048+        d = self._encode_segment(segnum)
3049+        d.addCallback(self._push_segment, segnum)
3050+        def _increment_segnum(ign):
3051+            self._current_segment += 1
3052+        # XXX: I don't think we need to do addBoth here -- any errBacks
3053+        # should be handled within push_segment.
3054+        d.addBoth(_increment_segnum)
3055+        d.addBoth(self._turn_barrier)
3056+        d.addBoth(self._push)
3057+
3058+
3059+    def _turn_barrier(self, result):
3060+        """
3061+        I help the publish process avoid the recursion limit issues
3062+        described in #237.
3063+        """
3064+        return fireEventually(result)
3065+
3066+
3067+    def _add_dummy_salts(self):
3068+        """
3069+        SDMF files need a salt even if they're empty, or the signature
3070+        won't make sense. This method adds a dummy salt to each of our
3071+        SDMF writers so that they can write the signature later.
3072+        """
3073+        salt = os.urandom(16)
3074+        assert self._version == SDMF_VERSION
3075+
3076+        for writer in self.writers.itervalues():
3077+            writer.put_salt(salt)
3078+
3079+
3080+    def _encode_segment(self, segnum):
3081+        """
3082+        I encrypt and encode the segment segnum.
3083+        """
3084+        started = time.time()
3085+
3086+        if segnum + 1 == self.num_segments:
3087+            segsize = self.tail_segment_size
3088+        else:
3089+            segsize = self.segment_size
3090+
3091+
3092+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
3093+        data = self.data.read(segsize)
3094+        # XXX: This is dumb. Why return a list?
3095+        data = "".join(data)
3096+
3097+        assert len(data) == segsize, len(data)
3098+
3099+        salt = os.urandom(16)
3100+
3101+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
3102+        self._status.set_status("Encrypting")
3103+        enc = AES(key)
3104+        crypttext = enc.process(data)
3105+        assert len(crypttext) == len(data)
3106+
3107+        now = time.time()
3108+        self._status.timings["encrypt"] = now - started
3109+        started = now
3110+
3111+        # now apply FEC
3112+        if segnum + 1 == self.num_segments:
3113+            fec = self.tail_fec
3114+        else:
3115+            fec = self.fec
3116+
3117+        self._status.set_status("Encoding")
3118+        crypttext_pieces = [None] * self.required_shares
3119+        piece_size = fec.get_block_size()
3120+        for i in range(len(crypttext_pieces)):
3121+            offset = i * piece_size
3122+            piece = crypttext[offset:offset+piece_size]
3123+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3124+            crypttext_pieces[i] = piece
3125+            assert len(piece) == piece_size
3126+        d = fec.encode(crypttext_pieces)
3127+        def _done_encoding(res):
3128+            elapsed = time.time() - started
3129+            self._status.timings["encode"] = elapsed
3130+            return (res, salt)
3131+        d.addCallback(_done_encoding)
3132+        return d
3133+
3134+
3135+    def _push_segment(self, encoded_and_salt, segnum):
3136+        """
3137+        I push (data, salt) as segment number segnum.
3138+        """
3139+        results, salt = encoded_and_salt
3140+        shares, shareids = results
3141+        self._status.set_status("Pushing segment")
3142+        for i in xrange(len(shares)):
3143+            sharedata = shares[i]
3144+            shareid = shareids[i]
3145+            if self._version == MDMF_VERSION:
3146+                hashed = salt + sharedata
3147+            else:
3148+                hashed = sharedata
3149+            block_hash = hashutil.block_hash(hashed)
3150+            self.blockhashes[shareid][segnum] = block_hash
3151+            # find the writer for this share
3152+            writer = self.writers[shareid]
3153+            writer.put_block(sharedata, segnum, salt)
3154+
3155+
3156+    def push_everything_else(self):
3157+        """
3158+        I put everything else associated with a share.
3159+        """
3160+        self._pack_started = time.time()
3161+        self.push_encprivkey()
3162+        self.push_blockhashes()
3163+        self.push_sharehashes()
3164+        self.push_toplevel_hashes_and_signature()
3165+        d = self.finish_publishing()
3166+        def _change_state(ignored):
3167+            self._state = DONE_STATE
3168+        d.addCallback(_change_state)
3169+        d.addCallback(self._push)
3170+        return d
3171+
3172+
3173+    def push_encprivkey(self):
3174+        encprivkey = self._encprivkey
3175+        self._status.set_status("Pushing encrypted private key")
3176+        for writer in self.writers.itervalues():
3177+            writer.put_encprivkey(encprivkey)
3178+
3179+
3180+    def push_blockhashes(self):
3181+        self.sharehash_leaves = [None] * len(self.blockhashes)
3182+        self._status.set_status("Building and pushing block hash tree")
3183+        for shnum, blockhashes in self.blockhashes.iteritems():
3184+            t = hashtree.HashTree(blockhashes)
3185+            self.blockhashes[shnum] = list(t)
3186+            # set the leaf for future use.
3187+            self.sharehash_leaves[shnum] = t[0]
3188+
3189+            writer = self.writers[shnum]
3190+            writer.put_blockhashes(self.blockhashes[shnum])
3191+
3192+
3193+    def push_sharehashes(self):
3194+        self._status.set_status("Building and pushing share hash chain")
3195+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
3196+        for shnum in xrange(len(self.sharehash_leaves)):
3197+            needed_indices = share_hash_tree.needed_hashes(shnum)
3198+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
3199+                                             for i in needed_indices] )
3200+            writer = self.writers[shnum]
3201+            writer.put_sharehashes(self.sharehashes[shnum])
3202+        self.root_hash = share_hash_tree[0]
3203+
3204+
3205+    def push_toplevel_hashes_and_signature(self):
3206+        # We need to to three things here:
3207+        #   - Push the root hash and salt hash
3208+        #   - Get the checkstring of the resulting layout; sign that.
3209+        #   - Push the signature
3210+        self._status.set_status("Pushing root hashes and signature")
3211+        for shnum in xrange(self.total_shares):
3212+            writer = self.writers[shnum]
3213+            writer.put_root_hash(self.root_hash)
3214+        self._update_checkstring()
3215+        self._make_and_place_signature()
3216+
3217+
3218+    def _update_checkstring(self):
3219+        """
3220+        After putting the root hash, MDMF files will have the
3221+        checkstring written to the storage server. This means that we
3222+        can update our copy of the checkstring so we can detect
3223+        uncoordinated writes. SDMF files will have the same checkstring,
3224+        so we need not do anything.
3225+        """
3226+        self._checkstring = self.writers.values()[0].get_checkstring()
3227+
3228+
3229+    def _make_and_place_signature(self):
3230+        """
3231+        I create and place the signature.
3232+        """
3233+        started = time.time()
3234+        self._status.set_status("Signing prefix")
3235+        signable = self.writers[0].get_signable()
3236+        self.signature = self._privkey.sign(signable)
3237+
3238+        for (shnum, writer) in self.writers.iteritems():
3239+            writer.put_signature(self.signature)
3240+        self._status.timings['sign'] = time.time() - started
3241+
3242+
3243+    def finish_publishing(self):
3244+        # We're almost done -- we just need to put the verification key
3245+        # and the offsets
3246+        started = time.time()
3247+        self._status.set_status("Pushing shares")
3248+        self._started_pushing = started
3249+        ds = []
3250+        verification_key = self._pubkey.serialize()
3251+
3252+
3253+        # TODO: Bad, since we remove from this same dict. We need to
3254+        # make a copy, or just use a non-iterated value.
3255+        for (shnum, writer) in self.writers.iteritems():
3256+            writer.put_verification_key(verification_key)
3257+            d = writer.finish_publishing()
3258+            # Add the (peerid, shnum) tuple to our list of outstanding
3259+            # queries. This gets used by _loop if some of our queries
3260+            # fail to place shares.
3261+            self.outstanding.add((writer.peerid, writer.shnum))
3262+            d.addCallback(self._got_write_answer, writer, started)
3263+            d.addErrback(self._connection_problem, writer)
3264+            ds.append(d)
3265+        self._record_verinfo()
3266+        self._status.timings['pack'] = time.time() - started
3267+        return defer.DeferredList(ds)
3268+
3269+
3270+    def _record_verinfo(self):
3271+        self.versioninfo = self.writers.values()[0].get_verinfo()
3272+
3273+
3274+    def _connection_problem(self, f, writer):
3275+        """
3276+        We ran into a connection problem while working with writer, and
3277+        need to deal with that.
3278+        """
3279+        self.log("found problem: %s" % str(f))
3280+        self._last_failure = f
3281+        del(self.writers[writer.shnum])
3282 
3283hunk ./src/allmydata/mutable/publish.py 875
3284-    def _update_status(self):
3285-        self._status.set_status("Sending Shares: %d placed out of %d, "
3286-                                "%d messages outstanding" %
3287-                                (len(self.placed),
3288-                                 len(self.goal),
3289-                                 len(self.outstanding)))
3290-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
3291 
3292hunk ./src/allmydata/mutable/publish.py 876
3293-    def loop(self, ignored=None):
3294-        self.log("entering loop", level=log.NOISY)
3295-        if not self._running:
3296-            return
3297-
3298-        self.looplimit -= 1
3299-        if self.looplimit <= 0:
3300-            raise LoopLimitExceededError("loop limit exceeded")
3301-
3302-        if self.surprised:
3303-            # don't send out any new shares, just wait for the outstanding
3304-            # ones to be retired.
3305-            self.log("currently surprised, so don't send any new shares",
3306-                     level=log.NOISY)
3307-        else:
3308-            self.update_goal()
3309-            # how far are we from our goal?
3310-            needed = self.goal - self.placed - self.outstanding
3311-            self._update_status()
3312-
3313-            if needed:
3314-                # we need to send out new shares
3315-                self.log(format="need to send %(needed)d new shares",
3316-                         needed=len(needed), level=log.NOISY)
3317-                self._send_shares(needed)
3318-                return
3319-
3320-        if self.outstanding:
3321-            # queries are still pending, keep waiting
3322-            self.log(format="%(outstanding)d queries still outstanding",
3323-                     outstanding=len(self.outstanding),
3324-                     level=log.NOISY)
3325-            return
3326-
3327-        # no queries outstanding, no placements needed: we're done
3328-        self.log("no queries outstanding, no placements needed: done",
3329-                 level=log.OPERATIONAL)
3330-        now = time.time()
3331-        elapsed = now - self._started_pushing
3332-        self._status.timings["push"] = elapsed
3333-        return self._done(None)
3334-
3335     def log_goal(self, goal, message=""):
3336         logmsg = [message]
3337         for (shnum, peerid) in sorted([(s,p) for (p,s) in goal]):
3338hunk ./src/allmydata/mutable/publish.py 957
3339             self.log_goal(self.goal, "after update: ")
3340 
3341 
3342+    def _got_write_answer(self, answer, writer, started):
3343+        if not answer:
3344+            # SDMF writers only pretend to write when readers set their
3345+            # blocks, salts, and so on -- they actually just write once,
3346+            # at the end of the upload process. In fake writes, they
3347+            # return defer.succeed(None). If we see that, we shouldn't
3348+            # bother checking it.
3349+            return
3350 
3351hunk ./src/allmydata/mutable/publish.py 966
3352-    def _encrypt_and_encode(self):
3353-        # this returns a Deferred that fires with a list of (sharedata,
3354-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
3355-        # shares that we care about.
3356-        self.log("_encrypt_and_encode")
3357-
3358-        self._status.set_status("Encrypting")
3359-        started = time.time()
3360-
3361-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
3362-        enc = AES(key)
3363-        crypttext = enc.process(self.newdata)
3364-        assert len(crypttext) == len(self.newdata)
3365+        peerid = writer.peerid
3366+        lp = self.log("_got_write_answer from %s, share %d" %
3367+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
3368 
3369         now = time.time()
3370hunk ./src/allmydata/mutable/publish.py 971
3371-        self._status.timings["encrypt"] = now - started
3372-        started = now
3373-
3374-        # now apply FEC
3375-
3376-        self._status.set_status("Encoding")
3377-        fec = codec.CRSEncoder()
3378-        fec.set_params(self.segment_size,
3379-                       self.required_shares, self.total_shares)
3380-        piece_size = fec.get_block_size()
3381-        crypttext_pieces = [None] * self.required_shares
3382-        for i in range(len(crypttext_pieces)):
3383-            offset = i * piece_size
3384-            piece = crypttext[offset:offset+piece_size]
3385-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
3386-            crypttext_pieces[i] = piece
3387-            assert len(piece) == piece_size
3388-
3389-        d = fec.encode(crypttext_pieces)
3390-        def _done_encoding(res):
3391-            elapsed = time.time() - started
3392-            self._status.timings["encode"] = elapsed
3393-            return res
3394-        d.addCallback(_done_encoding)
3395-        return d
3396-
3397-    def _generate_shares(self, shares_and_shareids):
3398-        # this sets self.shares and self.root_hash
3399-        self.log("_generate_shares")
3400-        self._status.set_status("Generating Shares")
3401-        started = time.time()
3402-
3403-        # we should know these by now
3404-        privkey = self._privkey
3405-        encprivkey = self._encprivkey
3406-        pubkey = self._pubkey
3407-
3408-        (shares, share_ids) = shares_and_shareids
3409-
3410-        assert len(shares) == len(share_ids)
3411-        assert len(shares) == self.total_shares
3412-        all_shares = {}
3413-        block_hash_trees = {}
3414-        share_hash_leaves = [None] * len(shares)
3415-        for i in range(len(shares)):
3416-            share_data = shares[i]
3417-            shnum = share_ids[i]
3418-            all_shares[shnum] = share_data
3419-
3420-            # build the block hash tree. SDMF has only one leaf.
3421-            leaves = [hashutil.block_hash(share_data)]
3422-            t = hashtree.HashTree(leaves)
3423-            block_hash_trees[shnum] = list(t)
3424-            share_hash_leaves[shnum] = t[0]
3425-        for leaf in share_hash_leaves:
3426-            assert leaf is not None
3427-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
3428-        share_hash_chain = {}
3429-        for shnum in range(self.total_shares):
3430-            needed_hashes = share_hash_tree.needed_hashes(shnum)
3431-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
3432-                                              for i in needed_hashes ] )
3433-        root_hash = share_hash_tree[0]
3434-        assert len(root_hash) == 32
3435-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
3436-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
3437-
3438-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
3439-                             self.required_shares, self.total_shares,
3440-                             self.segment_size, len(self.newdata))
3441-
3442-        # now pack the beginning of the share. All shares are the same up
3443-        # to the signature, then they have divergent share hash chains,
3444-        # then completely different block hash trees + salt + share data,
3445-        # then they all share the same encprivkey at the end. The sizes
3446-        # of everything are the same for all shares.
3447-
3448-        sign_started = time.time()
3449-        signature = privkey.sign(prefix)
3450-        self._status.timings["sign"] = time.time() - sign_started
3451-
3452-        verification_key = pubkey.serialize()
3453-
3454-        final_shares = {}
3455-        for shnum in range(self.total_shares):
3456-            final_share = pack_share(prefix,
3457-                                     verification_key,
3458-                                     signature,
3459-                                     share_hash_chain[shnum],
3460-                                     block_hash_trees[shnum],
3461-                                     all_shares[shnum],
3462-                                     encprivkey)
3463-            final_shares[shnum] = final_share
3464-        elapsed = time.time() - started
3465-        self._status.timings["pack"] = elapsed
3466-        self.shares = final_shares
3467-        self.root_hash = root_hash
3468-
3469-        # we also need to build up the version identifier for what we're
3470-        # pushing. Extract the offsets from one of our shares.
3471-        assert final_shares
3472-        offsets = unpack_header(final_shares.values()[0])[-1]
3473-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
3474-        verinfo = (self._new_seqnum, root_hash, self.salt,
3475-                   self.segment_size, len(self.newdata),
3476-                   self.required_shares, self.total_shares,
3477-                   prefix, offsets_tuple)
3478-        self.versioninfo = verinfo
3479-
3480-
3481-
3482-    def _send_shares(self, needed):
3483-        self.log("_send_shares")
3484-
3485-        # we're finally ready to send out our shares. If we encounter any
3486-        # surprises here, it's because somebody else is writing at the same
3487-        # time. (Note: in the future, when we remove the _query_peers() step
3488-        # and instead speculate about [or remember] which shares are where,
3489-        # surprises here are *not* indications of UncoordinatedWriteError,
3490-        # and we'll need to respond to them more gracefully.)
3491-
3492-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
3493-        # organize it by peerid.
3494-
3495-        peermap = DictOfSets()
3496-        for (peerid, shnum) in needed:
3497-            peermap.add(peerid, shnum)
3498-
3499-        # the next thing is to build up a bunch of test vectors. The
3500-        # semantics of Publish are that we perform the operation if the world
3501-        # hasn't changed since the ServerMap was constructed (more or less).
3502-        # For every share we're trying to place, we create a test vector that
3503-        # tests to see if the server*share still corresponds to the
3504-        # map.
3505-
3506-        all_tw_vectors = {} # maps peerid to tw_vectors
3507-        sm = self._servermap.servermap
3508-
3509-        for key in needed:
3510-            (peerid, shnum) = key
3511-
3512-            if key in sm:
3513-                # an old version of that share already exists on the
3514-                # server, according to our servermap. We will create a
3515-                # request that attempts to replace it.
3516-                old_versionid, old_timestamp = sm[key]
3517-                (old_seqnum, old_root_hash, old_salt, old_segsize,
3518-                 old_datalength, old_k, old_N, old_prefix,
3519-                 old_offsets_tuple) = old_versionid
3520-                old_checkstring = pack_checkstring(old_seqnum,
3521-                                                   old_root_hash,
3522-                                                   old_salt)
3523-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3524-
3525-            elif key in self.bad_share_checkstrings:
3526-                old_checkstring = self.bad_share_checkstrings[key]
3527-                testv = (0, len(old_checkstring), "eq", old_checkstring)
3528-
3529-            else:
3530-                # add a testv that requires the share not exist
3531-
3532-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
3533-                # constraints are handled. If the same object is referenced
3534-                # multiple times inside the arguments, foolscap emits a
3535-                # 'reference' token instead of a distinct copy of the
3536-                # argument. The bug is that these 'reference' tokens are not
3537-                # accepted by the inbound constraint code. To work around
3538-                # this, we need to prevent python from interning the
3539-                # (constant) tuple, by creating a new copy of this vector
3540-                # each time.
3541-
3542-                # This bug is fixed in foolscap-0.2.6, and even though this
3543-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
3544-                # supposed to be able to interoperate with older versions of
3545-                # Tahoe which are allowed to use older versions of foolscap,
3546-                # including foolscap-0.2.5 . In addition, I've seen other
3547-                # foolscap problems triggered by 'reference' tokens (see #541
3548-                # for details). So we must keep this workaround in place.
3549-
3550-                #testv = (0, 1, 'eq', "")
3551-                testv = tuple([0, 1, 'eq', ""])
3552-
3553-            testvs = [testv]
3554-            # the write vector is simply the share
3555-            writev = [(0, self.shares[shnum])]
3556-
3557-            if peerid not in all_tw_vectors:
3558-                all_tw_vectors[peerid] = {}
3559-                # maps shnum to (testvs, writevs, new_length)
3560-            assert shnum not in all_tw_vectors[peerid]
3561-
3562-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
3563-
3564-        # we read the checkstring back from each share, however we only use
3565-        # it to detect whether there was a new share that we didn't know
3566-        # about. The success or failure of the write will tell us whether
3567-        # there was a collision or not. If there is a collision, the first
3568-        # thing we'll do is update the servermap, which will find out what
3569-        # happened. We could conceivably reduce a roundtrip by using the
3570-        # readv checkstring to populate the servermap, but really we'd have
3571-        # to read enough data to validate the signatures too, so it wouldn't
3572-        # be an overall win.
3573-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
3574-
3575-        # ok, send the messages!
3576-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
3577-        started = time.time()
3578-        for (peerid, tw_vectors) in all_tw_vectors.items():
3579-
3580-            write_enabler = self._node.get_write_enabler(peerid)
3581-            renew_secret = self._node.get_renewal_secret(peerid)
3582-            cancel_secret = self._node.get_cancel_secret(peerid)
3583-            secrets = (write_enabler, renew_secret, cancel_secret)
3584-            shnums = tw_vectors.keys()
3585-
3586-            for shnum in shnums:
3587-                self.outstanding.add( (peerid, shnum) )
3588+        elapsed = now - started
3589 
3590hunk ./src/allmydata/mutable/publish.py 973
3591-            d = self._do_testreadwrite(peerid, secrets,
3592-                                       tw_vectors, read_vector)
3593-            d.addCallbacks(self._got_write_answer, self._got_write_error,
3594-                           callbackArgs=(peerid, shnums, started),
3595-                           errbackArgs=(peerid, shnums, started))
3596-            # tolerate immediate errback, like with DeadReferenceError
3597-            d.addBoth(fireEventually)
3598-            d.addCallback(self.loop)
3599-            d.addErrback(self._fatal_error)
3600+        self._status.add_per_server_time(peerid, elapsed)
3601 
3602hunk ./src/allmydata/mutable/publish.py 975
3603-        self._update_status()
3604-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
3605+        wrote, read_data = answer
3606 
3607hunk ./src/allmydata/mutable/publish.py 977
3608-    def _do_testreadwrite(self, peerid, secrets,
3609-                          tw_vectors, read_vector):
3610-        storage_index = self._storage_index
3611-        ss = self.connections[peerid]
3612+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
3613 
3614hunk ./src/allmydata/mutable/publish.py 979
3615-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
3616-        d = ss.callRemote("slot_testv_and_readv_and_writev",
3617-                          storage_index,
3618-                          secrets,
3619-                          tw_vectors,
3620-                          read_vector)
3621-        return d
3622+        # We need to remove from surprise_shares any shares that we are
3623+        # knowingly also writing to that peer from other writers.
3624 
3625hunk ./src/allmydata/mutable/publish.py 982
3626-    def _got_write_answer(self, answer, peerid, shnums, started):
3627-        lp = self.log("_got_write_answer from %s" %
3628-                      idlib.shortnodeid_b2a(peerid))
3629-        for shnum in shnums:
3630-            self.outstanding.discard( (peerid, shnum) )
3631+        # TODO: Precompute this.
3632+        known_shnums = [x.shnum for x in self.writers.values()
3633+                        if x.peerid == peerid]
3634+        surprise_shares -= set(known_shnums)
3635+        self.log("found the following surprise shares: %s" %
3636+                 str(surprise_shares))
3637 
3638hunk ./src/allmydata/mutable/publish.py 989
3639-        now = time.time()
3640-        elapsed = now - started
3641-        self._status.add_per_server_time(peerid, elapsed)
3642-
3643-        wrote, read_data = answer
3644-
3645-        surprise_shares = set(read_data.keys()) - set(shnums)
3646+        # Now surprise shares contains all of the shares that we did not
3647+        # expect to be there.
3648 
3649         surprised = False
3650         for shnum in surprise_shares:
3651hunk ./src/allmydata/mutable/publish.py 996
3652             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
3653             checkstring = read_data[shnum][0]
3654-            their_version_info = unpack_checkstring(checkstring)
3655-            if their_version_info == self._new_version_info:
3656+            # What we want to do here is to see if their (seqnum,
3657+            # roothash, salt) is the same as our (seqnum, roothash,
3658+            # salt), or the equivalent for MDMF. The best way to do this
3659+            # is to store a packed representation of our checkstring
3660+            # somewhere, then not bother unpacking the other
3661+            # checkstring.
3662+            if checkstring == self._checkstring:
3663                 # they have the right share, somehow
3664 
3665                 if (peerid,shnum) in self.goal:
3666hunk ./src/allmydata/mutable/publish.py 1081
3667             self.log("our testv failed, so the write did not happen",
3668                      parent=lp, level=log.WEIRD, umid="8sc26g")
3669             self.surprised = True
3670-            self.bad_peers.add(peerid) # don't ask them again
3671+            self.bad_peers.add(writer) # don't ask them again
3672             # use the checkstring to add information to the log message
3673             for (shnum,readv) in read_data.items():
3674                 checkstring = readv[0]
3675hunk ./src/allmydata/mutable/publish.py 1103
3676                 # if expected_version==None, then we didn't expect to see a
3677                 # share on that peer, and the 'surprise_shares' clause above
3678                 # will have logged it.
3679-            # self.loop() will take care of finding new homes
3680             return
3681 
3682hunk ./src/allmydata/mutable/publish.py 1105
3683-        for shnum in shnums:
3684-            self.placed.add( (peerid, shnum) )
3685-            # and update the servermap
3686-            self._servermap.add_new_share(peerid, shnum,
3687+        # and update the servermap
3688+        # self.versioninfo is set during the last phase of publishing.
3689+        # If we get there, we know that responses correspond to placed
3690+        # shares, and can safely execute these statements.
3691+        if self.versioninfo:
3692+            self.log("wrote successfully: adding new share to servermap")
3693+            self._servermap.add_new_share(peerid, writer.shnum,
3694                                           self.versioninfo, started)
3695hunk ./src/allmydata/mutable/publish.py 1113
3696-
3697-        # self.loop() will take care of checking to see if we're done
3698+            self.placed.add( (peerid, writer.shnum) )
3699+        self._update_status()
3700+        # the next method in the deferred chain will check to see if
3701+        # we're done and successful.
3702         return
3703 
3704hunk ./src/allmydata/mutable/publish.py 1119
3705-    def _got_write_error(self, f, peerid, shnums, started):
3706-        for shnum in shnums:
3707-            self.outstanding.discard( (peerid, shnum) )
3708-        self.bad_peers.add(peerid)
3709-        if self._first_write_error is None:
3710-            self._first_write_error = f
3711-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
3712-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
3713-                 failure=f,
3714-                 level=log.UNUSUAL)
3715-        # self.loop() will take care of checking to see if we're done
3716-        return
3717-
3718 
3719     def _done(self, res):
3720         if not self._running:
3721hunk ./src/allmydata/mutable/publish.py 1126
3722         self._running = False
3723         now = time.time()
3724         self._status.timings["total"] = now - self._started
3725+
3726+        elapsed = now - self._started_pushing
3727+        self._status.timings['push'] = elapsed
3728+
3729         self._status.set_active(False)
3730hunk ./src/allmydata/mutable/publish.py 1131
3731-        if isinstance(res, failure.Failure):
3732-            self.log("Publish done, with failure", failure=res,
3733-                     level=log.WEIRD, umid="nRsR9Q")
3734-            self._status.set_status("Failed")
3735-        elif self.surprised:
3736-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
3737-            self._status.set_status("UncoordinatedWriteError")
3738-            # deliver a failure
3739-            res = failure.Failure(UncoordinatedWriteError())
3740-            # TODO: recovery
3741-        else:
3742-            self.log("Publish done, success")
3743-            self._status.set_status("Finished")
3744-            self._status.set_progress(1.0)
3745+        self.log("Publish done, success")
3746+        self._status.set_status("Finished")
3747+        self._status.set_progress(1.0)
3748         eventually(self.done_deferred.callback, res)
3749 
3750hunk ./src/allmydata/mutable/publish.py 1136
3751+    def _failure(self):
3752+
3753+        if not self.surprised:
3754+            # We ran out of servers
3755+            self.log("Publish ran out of good servers, "
3756+                     "last failure was: %s" % str(self._last_failure))
3757+            e = NotEnoughServersError("Ran out of non-bad servers, "
3758+                                      "last failure was %s" %
3759+                                      str(self._last_failure))
3760+        else:
3761+            # We ran into shares that we didn't recognize, which means
3762+            # that we need to return an UncoordinatedWriteError.
3763+            self.log("Publish failed with UncoordinatedWriteError")
3764+            e = UncoordinatedWriteError()
3765+        f = failure.Failure(e)
3766+        eventually(self.done_deferred.callback, f)
3767+
3768+
3769+class MutableFileHandle:
3770+    """
3771+    I am a mutable uploadable built around a filehandle-like object,
3772+    usually either a StringIO instance or a handle to an actual file.
3773+    """
3774+    implements(IMutableUploadable)
3775+
3776+    def __init__(self, filehandle):
3777+        # The filehandle is defined as a generally file-like object that
3778+        # has these two methods. We don't care beyond that.
3779+        assert hasattr(filehandle, "read")
3780+        assert hasattr(filehandle, "close")
3781+
3782+        self._filehandle = filehandle
3783+        # We must start reading at the beginning of the file, or we risk
3784+        # encountering errors when the data read does not match the size
3785+        # reported to the uploader.
3786+        self._filehandle.seek(0)
3787+
3788+        # We have not yet read anything, so our position is 0.
3789+        self._marker = 0
3790+
3791+
3792+    def get_size(self):
3793+        """
3794+        I return the amount of data in my filehandle.
3795+        """
3796+        if not hasattr(self, "_size"):
3797+            old_position = self._filehandle.tell()
3798+            # Seek to the end of the file by seeking 0 bytes from the
3799+            # file's end
3800+            self._filehandle.seek(0, 2) # 2 == os.SEEK_END in 2.5+
3801+            self._size = self._filehandle.tell()
3802+            # Restore the previous position, in case this was called
3803+            # after a read.
3804+            self._filehandle.seek(old_position)
3805+            assert self._filehandle.tell() == old_position
3806+
3807+        assert hasattr(self, "_size")
3808+        return self._size
3809+
3810+
3811+    def pos(self):
3812+        """
3813+        I return the position of my read marker -- i.e., how much data I
3814+        have already read and returned to callers.
3815+        """
3816+        return self._marker
3817+
3818+
3819+    def read(self, length):
3820+        """
3821+        I return some data (up to length bytes) from my filehandle.
3822+
3823+        In most cases, I return length bytes, but sometimes I won't --
3824+        for example, if I am asked to read beyond the end of a file, or
3825+        an error occurs.
3826+        """
3827+        results = self._filehandle.read(length)
3828+        self._marker += len(results)
3829+        return [results]
3830+
3831+
3832+    def close(self):
3833+        """
3834+        I close the underlying filehandle. Any further operations on the
3835+        filehandle fail at this point.
3836+        """
3837+        self._filehandle.close()
3838+
3839+
3840+class MutableData(MutableFileHandle):
3841+    """
3842+    I am a mutable uploadable built around a string, which I then cast
3843+    into a StringIO and treat as a filehandle.
3844+    """
3845+
3846+    def __init__(self, s):
3847+        # Take a string and return a file-like uploadable.
3848+        assert isinstance(s, str)
3849+
3850+        MutableFileHandle.__init__(self, StringIO(s))
3851+
3852+
3853+class TransformingUploadable:
3854+    """
3855+    I am an IMutableUploadable that wraps another IMutableUploadable,
3856+    and some segments that are already on the grid. When I am called to
3857+    read, I handle merging of boundary segments.
3858+    """
3859+    implements(IMutableUploadable)
3860+
3861+
3862+    def __init__(self, data, offset, segment_size, start, end):
3863+        assert IMutableUploadable.providedBy(data)
3864+
3865+        self._newdata = data
3866+        self._offset = offset
3867+        self._segment_size = segment_size
3868+        self._start = start
3869+        self._end = end
3870+
3871+        self._read_marker = 0
3872+
3873+        self._first_segment_offset = offset % segment_size
3874+
3875+        num = self.log("TransformingUploadable: starting", parent=None)
3876+        self._log_number = num
3877+        self.log("got fso: %d" % self._first_segment_offset)
3878+        self.log("got offset: %d" % self._offset)
3879+
3880+
3881+    def log(self, *args, **kwargs):
3882+        if 'parent' not in kwargs:
3883+            kwargs['parent'] = self._log_number
3884+        if "facility" not in kwargs:
3885+            kwargs["facility"] = "tahoe.mutable.transforminguploadable"
3886+        return log.msg(*args, **kwargs)
3887+
3888+
3889+    def get_size(self):
3890+        return self._offset + self._newdata.get_size()
3891+
3892+
3893+    def read(self, length):
3894+        # We can get data from 3 sources here.
3895+        #   1. The first of the segments provided to us.
3896+        #   2. The data that we're replacing things with.
3897+        #   3. The last of the segments provided to us.
3898+
3899+        # are we in state 0?
3900+        self.log("reading %d bytes" % length)
3901+
3902+        old_start_data = ""
3903+        old_data_length = self._first_segment_offset - self._read_marker
3904+        if old_data_length > 0:
3905+            if old_data_length > length:
3906+                old_data_length = length
3907+            self.log("returning %d bytes of old start data" % old_data_length)
3908+
3909+            old_data_end = old_data_length + self._read_marker
3910+            old_start_data = self._start[self._read_marker:old_data_end]
3911+            length -= old_data_length
3912+        else:
3913+            # otherwise calculations later get screwed up.
3914+            old_data_length = 0
3915+
3916+        # Is there enough new data to satisfy this read? If not, we need
3917+        # to pad the end of the data with data from our last segment.
3918+        old_end_length = length - \
3919+            (self._newdata.get_size() - self._newdata.pos())
3920+        old_end_data = ""
3921+        if old_end_length > 0:
3922+            self.log("reading %d bytes of old end data" % old_end_length)
3923+
3924+            # TODO: We're not explicitly checking for tail segment size
3925+            # here. Is that a problem?
3926+            old_data_offset = (length - old_end_length + \
3927+                               old_data_length) % self._segment_size
3928+            self.log("reading at offset %d" % old_data_offset)
3929+            old_end = old_data_offset + old_end_length
3930+            old_end_data = self._end[old_data_offset:old_end]
3931+            length -= old_end_length
3932+            assert length == self._newdata.get_size() - self._newdata.pos()
3933+
3934+        self.log("reading %d bytes of new data" % length)
3935+        new_data = self._newdata.read(length)
3936+        new_data = "".join(new_data)
3937+
3938+        self._read_marker += len(old_start_data + new_data + old_end_data)
3939+
3940+        return old_start_data + new_data + old_end_data
3941 
3942hunk ./src/allmydata/mutable/publish.py 1327
3943+    def close(self):
3944+        pass
3945}
3946[nodemaker.py: Make nodemaker expose a way to create MDMF files
3947Kevan Carstensen <kevan@isnotajoke.com>**20100819003509
3948 Ignore-this: a6701746d6b992fc07bc0556a2b4a61d
3949] {
3950hunk ./src/allmydata/nodemaker.py 3
3951 import weakref
3952 from zope.interface import implements
3953-from allmydata.interfaces import INodeMaker
3954+from allmydata.util.assertutil import precondition
3955+from allmydata.interfaces import INodeMaker, SDMF_VERSION
3956 from allmydata.immutable.literal import LiteralFileNode
3957 from allmydata.immutable.filenode import ImmutableFileNode, CiphertextFileNode
3958 from allmydata.immutable.upload import Data
3959hunk ./src/allmydata/nodemaker.py 9
3960 from allmydata.mutable.filenode import MutableFileNode
3961+from allmydata.mutable.publish import MutableData
3962 from allmydata.dirnode import DirectoryNode, pack_children
3963 from allmydata.unknown import UnknownNode
3964 from allmydata import uri
3965hunk ./src/allmydata/nodemaker.py 92
3966             return self._create_dirnode(filenode)
3967         return None
3968 
3969-    def create_mutable_file(self, contents=None, keysize=None):
3970+    def create_mutable_file(self, contents=None, keysize=None,
3971+                            version=SDMF_VERSION):
3972         n = MutableFileNode(self.storage_broker, self.secret_holder,
3973                             self.default_encoding_parameters, self.history)
3974hunk ./src/allmydata/nodemaker.py 96
3975+        n.set_version(version)
3976         d = self.key_generator.generate(keysize)
3977         d.addCallback(n.create_with_keys, contents)
3978         d.addCallback(lambda res: n)
3979hunk ./src/allmydata/nodemaker.py 103
3980         return d
3981 
3982     def create_new_mutable_directory(self, initial_children={}):
3983+        # mutable directories will always be SDMF for now, to help
3984+        # compatibility with older clients.
3985+        version = SDMF_VERSION
3986+        # initial_children must have metadata (i.e. {} instead of None)
3987+        for (name, (node, metadata)) in initial_children.iteritems():
3988+            precondition(isinstance(metadata, dict),
3989+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
3990+            node.raise_error()
3991         d = self.create_mutable_file(lambda n:
3992hunk ./src/allmydata/nodemaker.py 112
3993-                                     pack_children(initial_children, n.get_writekey()))
3994+                                     MutableData(pack_children(initial_children,
3995+                                                    n.get_writekey())),
3996+                                     version=version)
3997         d.addCallback(self._create_dirnode)
3998         return d
3999 
4000}
4001[docs: update docs to mention MDMF
4002Kevan Carstensen <kevan@isnotajoke.com>**20100814225644
4003 Ignore-this: 1c3caa3cd44831007dcfbef297814308
4004] {
4005merger 0.0 (
4006hunk ./docs/configuration.rst 324
4007+Frontend Configuration
4008+======================
4009+
4010+The Tahoe client process can run a variety of frontend file-access protocols.
4011+You will use these to create and retrieve files from the virtual filesystem.
4012+Configuration details for each are documented in the following
4013+protocol-specific guides:
4014+
4015+HTTP
4016+
4017+    Tahoe runs a webserver by default on port 3456. This interface provides a
4018+    human-oriented "WUI", with pages to create, modify, and browse
4019+    directories and files, as well as a number of pages to check on the
4020+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
4021+    with a REST-ful HTTP interface that can be used by other programs
4022+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
4023+    details, and the ``web.port`` and ``web.static`` config variables above.
4024+    The `<frontends/download-status.rst>`_ document also describes a few WUI
4025+    status pages.
4026+
4027+CLI
4028+
4029+    The main "bin/tahoe" executable includes subcommands for manipulating the
4030+    filesystem, uploading/downloading files, and creating/running Tahoe
4031+    nodes. See `<frontends/CLI.rst>`_ for details.
4032+
4033+FTP, SFTP
4034+
4035+    Tahoe can also run both FTP and SFTP servers, and map a username/password
4036+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
4037+    for instructions on configuring these services, and the ``[ftpd]`` and
4038+    ``[sftpd]`` sections of ``tahoe.cfg``.
4039+
4040merger 0.0 (
4041replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
4042merger 0.0 (
4043hunk ./docs/configuration.rst 384
4044-shares.needed = (int, optional) aka "k", default 3
4045-shares.total = (int, optional) aka "N", N >= k, default 10
4046-shares.happy = (int, optional) 1 <= happy <= N, default 7
4047-
4048- These three values set the default encoding parameters. Each time a new file
4049- is uploaded, erasure-coding is used to break the ciphertext into separate
4050- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
4051- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
4052- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
4053- Setting k to 1 is equivalent to simple replication (uploading N copies of
4054- the file).
4055-
4056- These values control the tradeoff between storage overhead, performance, and
4057- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
4058- backend storage space (the actual value will be a bit more, because of other
4059- forms of overhead). Up to N-k shares can be lost before the file becomes
4060- unrecoverable, so assuming there are at least N servers, up to N-k servers
4061- can be offline without losing the file. So large N/k ratios are more
4062- reliable, and small N/k ratios use less disk space. Clearly, k must never be
4063- smaller than N.
4064-
4065- Large values of N will slow down upload operations slightly, since more
4066- servers must be involved, and will slightly increase storage overhead due to
4067- the hash trees that are created. Large values of k will cause downloads to
4068- be marginally slower, because more servers must be involved. N cannot be
4069- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
4070- uses.
4071-
4072- shares.happy allows you control over the distribution of your immutable file.
4073- For a successful upload, shares are guaranteed to be initially placed on
4074- at least 'shares.happy' distinct servers, the correct functioning of any
4075- k of which is sufficient to guarantee the availability of the uploaded file.
4076- This value should not be larger than the number of servers on your grid.
4077-
4078- A value of shares.happy <= k is allowed, but does not provide any redundancy
4079- if some servers fail or lose shares.
4080-
4081- (Mutable files use a different share placement algorithm that does not
4082-  consider this parameter.)
4083-
4084-
4085-== Storage Server Configuration ==
4086-
4087-[storage]
4088-enabled = (boolean, optional)
4089-
4090- If this is True, the node will run a storage server, offering space to other
4091- clients. If it is False, the node will not run a storage server, meaning
4092- that no shares will be stored on this node. Use False this for clients who
4093- do not wish to provide storage service. The default value is True.
4094-
4095-readonly = (boolean, optional)
4096-
4097- If True, the node will run a storage server but will not accept any shares,
4098- making it effectively read-only. Use this for storage servers which are
4099- being decommissioned: the storage/ directory could be mounted read-only,
4100- while shares are moved to other servers. Note that this currently only
4101- affects immutable shares. Mutable shares (used for directories) will be
4102- written and modified anyway. See ticket #390 for the current status of this
4103- bug. The default value is False.
4104-
4105-reserved_space = (str, optional)
4106-
4107- If provided, this value defines how much disk space is reserved: the storage
4108- server will not accept any share which causes the amount of free disk space
4109- to drop below this value. (The free space is measured by a call to statvfs(2)
4110- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
4111- user account under which the storage server runs.)
4112-
4113- This string contains a number, with an optional case-insensitive scale
4114- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
4115- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
4116- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
4117-
4118-expire.enabled =
4119-expire.mode =
4120-expire.override_lease_duration =
4121-expire.cutoff_date =
4122-expire.immutable =
4123-expire.mutable =
4124-
4125- These settings control garbage-collection, in which the server will delete
4126- shares that no longer have an up-to-date lease on them. Please see the
4127- neighboring "garbage-collection.txt" document for full details.
4128-
4129-
4130-== Running A Helper ==
4131+Running A Helper
4132+================
4133hunk ./docs/configuration.rst 424
4134+mutable.format = sdmf or mdmf
4135+
4136+ This value tells Tahoe-LAFS what the default mutable file format should
4137+ be. If mutable.format=sdmf, then newly created mutable files will be in
4138+ the old SDMF format. This is desirable for clients that operate on
4139+ grids where some peers run older versions of Tahoe-LAFS, as these older
4140+ versions cannot read the new MDMF mutable file format. If
4141+ mutable.format = mdmf, then newly created mutable files will use the
4142+ new MDMF format, which supports efficient in-place modification and
4143+ streaming downloads. You can overwrite this value using a special
4144+ mutable-type parameter in the webapi. If you do not specify a value
4145+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
4146+
4147+ Note that this parameter only applies to mutable files. Mutable
4148+ directories, which are stored as mutable files, are not controlled by
4149+ this parameter and will always use SDMF. We may revisit this decision
4150+ in future versions of Tahoe-LAFS.
4151)
4152)
4153)
4154hunk ./docs/frontends/webapi.rst 363
4155  writeable mutable file, that file's contents will be overwritten in-place. If
4156  it is a read-cap for a mutable file, an error will occur. If it is an
4157  immutable file, the old file will be discarded, and a new one will be put in
4158- its place.
4159+ its place. If the target file is a writable mutable file, you may also
4160+ specify an "offset" parameter -- a byte offset that determines where in
4161+ the mutable file the data from the HTTP request body is placed. This
4162+ operation is relatively efficient for MDMF mutable files, and is
4163+ relatively inefficient (but still supported) for SDMF mutable files.
4164 
4165  When creating a new file, if "mutable=true" is in the query arguments, the
4166  operation will create a mutable file instead of an immutable one.
4167hunk ./docs/frontends/webapi.rst 388
4168 
4169  If "mutable=true" is in the query arguments, the operation will create a
4170  mutable file, and return its write-cap in the HTTP respose. The default is
4171- to create an immutable file, returning the read-cap as a response.
4172+ to create an immutable file, returning the read-cap as a response. If
4173+ you create a mutable file, you can also use the "mutable-type" query
4174+ parameter. If "mutable-type=sdmf", then the mutable file will be created
4175+ in the old SDMF mutable file format. This is desirable for files that
4176+ need to be read by old clients. If "mutable-type=mdmf", then the file
4177+ will be created in the new MDMF mutable file format. MDMF mutable files
4178+ can be downloaded more efficiently, and modified in-place efficiently,
4179+ but are not compatible with older versions of Tahoe-LAFS. If no
4180+ "mutable-type" argument is given, the file is created in whatever
4181+ format was configured in tahoe.cfg.
4182 
4183 Creating A New Directory
4184 ------------------------
4185hunk ./docs/frontends/webapi.rst 1082
4186  If a "mutable=true" argument is provided, the operation will create a
4187  mutable file, and the response body will contain the write-cap instead of
4188  the upload results page. The default is to create an immutable file,
4189- returning the upload results page as a response.
4190+ returning the upload results page as a response. If you create a
4191+ mutable file, you may choose to specify the format of that mutable file
4192+ with the "mutable-type" parameter. If "mutable-type=mdmf", then the
4193+ file will be created as an MDMF mutable file. If "mutable-type=sdmf",
4194+ then the file will be created as an SDMF mutable file. If no value is
4195+ specified, the file will be created in whatever format is specified in
4196+ tahoe.cfg.
4197 
4198 
4199 ``POST /uri/$DIRCAP/[SUBDIRS../]?t=upload``
4200}
4201[mutable/layout.py and interfaces.py: add MDMF writer and reader
4202Kevan Carstensen <kevan@isnotajoke.com>**20100819003304
4203 Ignore-this: 44400fec923987b62830da2ed5075fb4
4204 
4205 The MDMF writer is responsible for keeping state as plaintext is
4206 gradually processed into share data by the upload process. When the
4207 upload finishes, it will write all of its share data to a remote server,
4208 reporting its status back to the publisher.
4209 
4210 The MDMF reader is responsible for abstracting an MDMF file as it sits
4211 on the grid from the downloader; specifically, by receiving and
4212 responding to requests for arbitrary data within the MDMF file.
4213 
4214 The interfaces.py file has also been modified to contain an interface
4215 for the writer.
4216] {
4217hunk ./src/allmydata/interfaces.py 7
4218      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
4219 
4220 HASH_SIZE=32
4221+SALT_SIZE=16
4222+
4223+SDMF_VERSION=0
4224+MDMF_VERSION=1
4225 
4226 Hash = StringConstraint(maxLength=HASH_SIZE,
4227                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
4228hunk ./src/allmydata/interfaces.py 424
4229         """
4230 
4231 
4232+class IMutableSlotWriter(Interface):
4233+    """
4234+    The interface for a writer around a mutable slot on a remote server.
4235+    """
4236+    def set_checkstring(checkstring, *args):
4237+        """
4238+        Set the checkstring that I will pass to the remote server when
4239+        writing.
4240+
4241+            @param checkstring A packed checkstring to use.
4242+
4243+        Note that implementations can differ in which semantics they
4244+        wish to support for set_checkstring -- they can, for example,
4245+        build the checkstring themselves from its constituents, or
4246+        some other thing.
4247+        """
4248+
4249+    def get_checkstring():
4250+        """
4251+        Get the checkstring that I think currently exists on the remote
4252+        server.
4253+        """
4254+
4255+    def put_block(data, segnum, salt):
4256+        """
4257+        Add a block and salt to the share.
4258+        """
4259+
4260+    def put_encprivey(encprivkey):
4261+        """
4262+        Add the encrypted private key to the share.
4263+        """
4264+
4265+    def put_blockhashes(blockhashes=list):
4266+        """
4267+        Add the block hash tree to the share.
4268+        """
4269+
4270+    def put_sharehashes(sharehashes=dict):
4271+        """
4272+        Add the share hash chain to the share.
4273+        """
4274+
4275+    def get_signable():
4276+        """
4277+        Return the part of the share that needs to be signed.
4278+        """
4279+
4280+    def put_signature(signature):
4281+        """
4282+        Add the signature to the share.
4283+        """
4284+
4285+    def put_verification_key(verification_key):
4286+        """
4287+        Add the verification key to the share.
4288+        """
4289+
4290+    def finish_publishing():
4291+        """
4292+        Do anything necessary to finish writing the share to a remote
4293+        server. I require that no further publishing needs to take place
4294+        after this method has been called.
4295+        """
4296+
4297+
4298 class IURI(Interface):
4299     def init_from_string(uri):
4300         """Accept a string (as created by my to_string() method) and populate
4301hunk ./src/allmydata/mutable/layout.py 4
4302 
4303 import struct
4304 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
4305+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
4306+                                 MDMF_VERSION, IMutableSlotWriter
4307+from allmydata.util import mathutil, observer
4308+from twisted.python import failure
4309+from twisted.internet import defer
4310+from zope.interface import implements
4311+
4312+
4313+# These strings describe the format of the packed structs they help process
4314+# Here's what they mean:
4315+#
4316+#  PREFIX:
4317+#    >: Big-endian byte order; the most significant byte is first (leftmost).
4318+#    B: The version information; an 8 bit version identifier. Stored as
4319+#       an unsigned char. This is currently 00 00 00 00; our modifications
4320+#       will turn it into 00 00 00 01.
4321+#    Q: The sequence number; this is sort of like a revision history for
4322+#       mutable files; they start at 1 and increase as they are changed after
4323+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
4324+#       length.
4325+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4326+#       characters = 32 bytes to store the value.
4327+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4328+#       16 characters.
4329+#
4330+#  SIGNED_PREFIX additions, things that are covered by the signature:
4331+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4332+#       which is convenient because our erasure coding scheme cannot
4333+#       encode if you ask for more than 255 pieces.
4334+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4335+#       same reasons as above.
4336+#    Q: The segment size of the uploaded file. This will essentially be the
4337+#       length of the file in SDMF. An unsigned long long, so we can store
4338+#       files of quite large size.
4339+#    Q: The data length of the uploaded file. Modulo padding, this will be
4340+#       the same of the data length field. Like the data length field, it is
4341+#       an unsigned long long and can be quite large.
4342+#
4343+#   HEADER additions:
4344+#     L: The offset of the signature of this. An unsigned long.
4345+#     L: The offset of the share hash chain. An unsigned long.
4346+#     L: The offset of the block hash tree. An unsigned long.
4347+#     L: The offset of the share data. An unsigned long.
4348+#     Q: The offset of the encrypted private key. An unsigned long long, to
4349+#        account for the possibility of a lot of share data.
4350+#     Q: The offset of the EOF. An unsigned long long, to account for the
4351+#        possibility of a lot of share data.
4352+#
4353+#  After all of these, we have the following:
4354+#    - The verification key: Occupies the space between the end of the header
4355+#      and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']].
4356+#    - The signature, which goes from the signature offset to the share hash
4357+#      chain offset.
4358+#    - The share hash chain, which goes from the share hash chain offset to
4359+#      the block hash tree offset.
4360+#    - The share data, which goes from the share data offset to the encrypted
4361+#      private key offset.
4362+#    - The encrypted private key offset, which goes until the end of the file.
4363+#
4364+#  The block hash tree in this encoding has only one share, so the offset of
4365+#  the share data will be 32 bits more than the offset of the block hash tree.
4366+#  Given this, we may need to check to see how many bytes a reasonably sized
4367+#  block hash tree will take up.
4368 
4369 PREFIX = ">BQ32s16s" # each version has a different prefix
4370 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
4371hunk ./src/allmydata/mutable/layout.py 73
4372 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4373 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4374 HEADER_LENGTH = struct.calcsize(HEADER)
4375+OFFSETS = ">LLLLQQ"
4376+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4377 
4378hunk ./src/allmydata/mutable/layout.py 76
4379+# These are still used for some tests.
4380 def unpack_header(data):
4381     o = {}
4382     (version,
4383hunk ./src/allmydata/mutable/layout.py 92
4384      o['EOF']) = struct.unpack(HEADER, data[:HEADER_LENGTH])
4385     return (version, seqnum, root_hash, IV, k, N, segsize, datalen, o)
4386 
4387-def unpack_prefix_and_signature(data):
4388-    assert len(data) >= HEADER_LENGTH, len(data)
4389-    prefix = data[:SIGNED_PREFIX_LENGTH]
4390-
4391-    (version,
4392-     seqnum,
4393-     root_hash,
4394-     IV,
4395-     k, N, segsize, datalen,
4396-     o) = unpack_header(data)
4397-
4398-    if version != 0:
4399-        raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4400-
4401-    if len(data) < o['share_hash_chain']:
4402-        raise NeedMoreDataError(o['share_hash_chain'],
4403-                                o['enc_privkey'], o['EOF']-o['enc_privkey'])
4404-
4405-    pubkey_s = data[HEADER_LENGTH:o['signature']]
4406-    signature = data[o['signature']:o['share_hash_chain']]
4407-
4408-    return (seqnum, root_hash, IV, k, N, segsize, datalen,
4409-            pubkey_s, signature, prefix)
4410-
4411 def unpack_share(data):
4412     assert len(data) >= HEADER_LENGTH
4413     o = {}
4414hunk ./src/allmydata/mutable/layout.py 139
4415             pubkey, signature, share_hash_chain, block_hash_tree,
4416             share_data, enc_privkey)
4417 
4418-def unpack_share_data(verinfo, hash_and_data):
4419-    (seqnum, root_hash, IV, segsize, datalength, k, N, prefix, o_t) = verinfo
4420-
4421-    # hash_and_data starts with the share_hash_chain, so figure out what the
4422-    # offsets really are
4423-    o = dict(o_t)
4424-    o_share_hash_chain = 0
4425-    o_block_hash_tree = o['block_hash_tree'] - o['share_hash_chain']
4426-    o_share_data = o['share_data'] - o['share_hash_chain']
4427-    o_enc_privkey = o['enc_privkey'] - o['share_hash_chain']
4428-
4429-    share_hash_chain_s = hash_and_data[o_share_hash_chain:o_block_hash_tree]
4430-    share_hash_format = ">H32s"
4431-    hsize = struct.calcsize(share_hash_format)
4432-    assert len(share_hash_chain_s) % hsize == 0, len(share_hash_chain_s)
4433-    share_hash_chain = []
4434-    for i in range(0, len(share_hash_chain_s), hsize):
4435-        chunk = share_hash_chain_s[i:i+hsize]
4436-        (hid, h) = struct.unpack(share_hash_format, chunk)
4437-        share_hash_chain.append( (hid, h) )
4438-    share_hash_chain = dict(share_hash_chain)
4439-    block_hash_tree_s = hash_and_data[o_block_hash_tree:o_share_data]
4440-    assert len(block_hash_tree_s) % 32 == 0, len(block_hash_tree_s)
4441-    block_hash_tree = []
4442-    for i in range(0, len(block_hash_tree_s), 32):
4443-        block_hash_tree.append(block_hash_tree_s[i:i+32])
4444-
4445-    share_data = hash_and_data[o_share_data:o_enc_privkey]
4446-
4447-    return (share_hash_chain, block_hash_tree, share_data)
4448-
4449-
4450-def pack_checkstring(seqnum, root_hash, IV):
4451-    return struct.pack(PREFIX,
4452-                       0, # version,
4453-                       seqnum,
4454-                       root_hash,
4455-                       IV)
4456-
4457 def unpack_checkstring(checkstring):
4458     cs_len = struct.calcsize(PREFIX)
4459     version, seqnum, root_hash, IV = struct.unpack(PREFIX, checkstring[:cs_len])
4460hunk ./src/allmydata/mutable/layout.py 146
4461         raise UnknownVersionError("got mutable share version %d, but I only understand version 0" % version)
4462     return (seqnum, root_hash, IV)
4463 
4464-def pack_prefix(seqnum, root_hash, IV,
4465-                required_shares, total_shares,
4466-                segment_size, data_length):
4467-    prefix = struct.pack(SIGNED_PREFIX,
4468-                         0, # version,
4469-                         seqnum,
4470-                         root_hash,
4471-                         IV,
4472-
4473-                         required_shares,
4474-                         total_shares,
4475-                         segment_size,
4476-                         data_length,
4477-                         )
4478-    return prefix
4479 
4480 def pack_offsets(verification_key_length, signature_length,
4481                  share_hash_chain_length, block_hash_tree_length,
4482hunk ./src/allmydata/mutable/layout.py 192
4483                            encprivkey])
4484     return final_share
4485 
4486+def pack_prefix(seqnum, root_hash, IV,
4487+                required_shares, total_shares,
4488+                segment_size, data_length):
4489+    prefix = struct.pack(SIGNED_PREFIX,
4490+                         0, # version,
4491+                         seqnum,
4492+                         root_hash,
4493+                         IV,
4494+                         required_shares,
4495+                         total_shares,
4496+                         segment_size,
4497+                         data_length,
4498+                         )
4499+    return prefix
4500+
4501+
4502+class SDMFSlotWriteProxy:
4503+    implements(IMutableSlotWriter)
4504+    """
4505+    I represent a remote write slot for an SDMF mutable file. I build a
4506+    share in memory, and then write it in one piece to the remote
4507+    server. This mimics how SDMF shares were built before MDMF (and the
4508+    new MDMF uploader), but provides that functionality in a way that
4509+    allows the MDMF uploader to be built without much special-casing for
4510+    file format, which makes the uploader code more readable.
4511+    """
4512+    def __init__(self,
4513+                 shnum,
4514+                 rref, # a remote reference to a storage server
4515+                 storage_index,
4516+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4517+                 seqnum, # the sequence number of the mutable file
4518+                 required_shares,
4519+                 total_shares,
4520+                 segment_size,
4521+                 data_length): # the length of the original file
4522+        self.shnum = shnum
4523+        self._rref = rref
4524+        self._storage_index = storage_index
4525+        self._secrets = secrets
4526+        self._seqnum = seqnum
4527+        self._required_shares = required_shares
4528+        self._total_shares = total_shares
4529+        self._segment_size = segment_size
4530+        self._data_length = data_length
4531+
4532+        # This is an SDMF file, so it should have only one segment, so,
4533+        # modulo padding of the data length, the segment size and the
4534+        # data length should be the same.
4535+        expected_segment_size = mathutil.next_multiple(data_length,
4536+                                                       self._required_shares)
4537+        assert expected_segment_size == segment_size
4538+
4539+        self._block_size = self._segment_size / self._required_shares
4540+
4541+        # This is meant to mimic how SDMF files were built before MDMF
4542+        # entered the picture: we generate each share in its entirety,
4543+        # then push it off to the storage server in one write. When
4544+        # callers call set_*, they are just populating this dict.
4545+        # finish_publishing will stitch these pieces together into a
4546+        # coherent share, and then write the coherent share to the
4547+        # storage server.
4548+        self._share_pieces = {}
4549+
4550+        # This tells the write logic what checkstring to use when
4551+        # writing remote shares.
4552+        self._testvs = []
4553+
4554+        self._readvs = [(0, struct.calcsize(PREFIX))]
4555+
4556+
4557+    def set_checkstring(self, checkstring_or_seqnum,
4558+                              root_hash=None,
4559+                              salt=None):
4560+        """
4561+        Set the checkstring that I will pass to the remote server when
4562+        writing.
4563+
4564+            @param checkstring_or_seqnum: A packed checkstring to use,
4565+                   or a sequence number. I will treat this as a checkstr
4566+
4567+        Note that implementations can differ in which semantics they
4568+        wish to support for set_checkstring -- they can, for example,
4569+        build the checkstring themselves from its constituents, or
4570+        some other thing.
4571+        """
4572+        if root_hash and salt:
4573+            checkstring = struct.pack(PREFIX,
4574+                                      0,
4575+                                      checkstring_or_seqnum,
4576+                                      root_hash,
4577+                                      salt)
4578+        else:
4579+            checkstring = checkstring_or_seqnum
4580+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4581+
4582+
4583+    def get_checkstring(self):
4584+        """
4585+        Get the checkstring that I think currently exists on the remote
4586+        server.
4587+        """
4588+        if self._testvs:
4589+            return self._testvs[0][3]
4590+        return ""
4591+
4592+
4593+    def put_block(self, data, segnum, salt):
4594+        """
4595+        Add a block and salt to the share.
4596+        """
4597+        # SDMF files have only one segment
4598+        assert segnum == 0
4599+        assert len(data) == self._block_size
4600+        assert len(salt) == SALT_SIZE
4601+
4602+        self._share_pieces['sharedata'] = data
4603+        self._share_pieces['salt'] = salt
4604+
4605+        # TODO: Figure out something intelligent to return.
4606+        return defer.succeed(None)
4607+
4608+
4609+    def put_encprivkey(self, encprivkey):
4610+        """
4611+        Add the encrypted private key to the share.
4612+        """
4613+        self._share_pieces['encprivkey'] = encprivkey
4614+
4615+        return defer.succeed(None)
4616+
4617+
4618+    def put_blockhashes(self, blockhashes):
4619+        """
4620+        Add the block hash tree to the share.
4621+        """
4622+        assert isinstance(blockhashes, list)
4623+        for h in blockhashes:
4624+            assert len(h) == HASH_SIZE
4625+
4626+        # serialize the blockhashes, then set them.
4627+        blockhashes_s = "".join(blockhashes)
4628+        self._share_pieces['block_hash_tree'] = blockhashes_s
4629+
4630+        return defer.succeed(None)
4631+
4632+
4633+    def put_sharehashes(self, sharehashes):
4634+        """
4635+        Add the share hash chain to the share.
4636+        """
4637+        assert isinstance(sharehashes, dict)
4638+        for h in sharehashes.itervalues():
4639+            assert len(h) == HASH_SIZE
4640+
4641+        # serialize the sharehashes, then set them.
4642+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4643+                                 for i in sorted(sharehashes.keys())])
4644+        self._share_pieces['share_hash_chain'] = sharehashes_s
4645+
4646+        return defer.succeed(None)
4647+
4648+
4649+    def put_root_hash(self, root_hash):
4650+        """
4651+        Add the root hash to the share.
4652+        """
4653+        assert len(root_hash) == HASH_SIZE
4654+
4655+        self._share_pieces['root_hash'] = root_hash
4656+
4657+        return defer.succeed(None)
4658+
4659+
4660+    def put_salt(self, salt):
4661+        """
4662+        Add a salt to an empty SDMF file.
4663+        """
4664+        assert len(salt) == SALT_SIZE
4665+
4666+        self._share_pieces['salt'] = salt
4667+        self._share_pieces['sharedata'] = ""
4668+
4669+
4670+    def get_signable(self):
4671+        """
4672+        Return the part of the share that needs to be signed.
4673+
4674+        SDMF writers need to sign the packed representation of the
4675+        first eight fields of the remote share, that is:
4676+            - version number (0)
4677+            - sequence number
4678+            - root of the share hash tree
4679+            - salt
4680+            - k
4681+            - n
4682+            - segsize
4683+            - datalen
4684+
4685+        This method is responsible for returning that to callers.
4686+        """
4687+        return struct.pack(SIGNED_PREFIX,
4688+                           0,
4689+                           self._seqnum,
4690+                           self._share_pieces['root_hash'],
4691+                           self._share_pieces['salt'],
4692+                           self._required_shares,
4693+                           self._total_shares,
4694+                           self._segment_size,
4695+                           self._data_length)
4696+
4697+
4698+    def put_signature(self, signature):
4699+        """
4700+        Add the signature to the share.
4701+        """
4702+        self._share_pieces['signature'] = signature
4703+
4704+        return defer.succeed(None)
4705+
4706+
4707+    def put_verification_key(self, verification_key):
4708+        """
4709+        Add the verification key to the share.
4710+        """
4711+        self._share_pieces['verification_key'] = verification_key
4712+
4713+        return defer.succeed(None)
4714+
4715+
4716+    def get_verinfo(self):
4717+        """
4718+        I return my verinfo tuple. This is used by the ServermapUpdater
4719+        to keep track of versions of mutable files.
4720+
4721+        The verinfo tuple for MDMF files contains:
4722+            - seqnum
4723+            - root hash
4724+            - a blank (nothing)
4725+            - segsize
4726+            - datalen
4727+            - k
4728+            - n
4729+            - prefix (the thing that you sign)
4730+            - a tuple of offsets
4731+
4732+        We include the nonce in MDMF to simplify processing of version
4733+        information tuples.
4734+
4735+        The verinfo tuple for SDMF files is the same, but contains a
4736+        16-byte IV instead of a hash of salts.
4737+        """
4738+        return (self._seqnum,
4739+                self._share_pieces['root_hash'],
4740+                self._share_pieces['salt'],
4741+                self._segment_size,
4742+                self._data_length,
4743+                self._required_shares,
4744+                self._total_shares,
4745+                self.get_signable(),
4746+                self._get_offsets_tuple())
4747+
4748+    def _get_offsets_dict(self):
4749+        post_offset = HEADER_LENGTH
4750+        offsets = {}
4751+
4752+        verification_key_length = len(self._share_pieces['verification_key'])
4753+        o1 = offsets['signature'] = post_offset + verification_key_length
4754+
4755+        signature_length = len(self._share_pieces['signature'])
4756+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4757+
4758+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4759+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4760+
4761+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4762+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4763+
4764+        share_data_length = len(self._share_pieces['sharedata'])
4765+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4766+
4767+        encprivkey_length = len(self._share_pieces['encprivkey'])
4768+        offsets['EOF'] = o5 + encprivkey_length
4769+        return offsets
4770+
4771+
4772+    def _get_offsets_tuple(self):
4773+        offsets = self._get_offsets_dict()
4774+        return tuple([(key, value) for key, value in offsets.items()])
4775+
4776+
4777+    def _pack_offsets(self):
4778+        offsets = self._get_offsets_dict()
4779+        return struct.pack(">LLLLQQ",
4780+                           offsets['signature'],
4781+                           offsets['share_hash_chain'],
4782+                           offsets['block_hash_tree'],
4783+                           offsets['share_data'],
4784+                           offsets['enc_privkey'],
4785+                           offsets['EOF'])
4786+
4787+
4788+    def finish_publishing(self):
4789+        """
4790+        Do anything necessary to finish writing the share to a remote
4791+        server. I require that no further publishing needs to take place
4792+        after this method has been called.
4793+        """
4794+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4795+                  "share_hash_chain", "block_hash_tree"]:
4796+            assert k in self._share_pieces
4797+        # This is the only method that actually writes something to the
4798+        # remote server.
4799+        # First, we need to pack the share into data that we can write
4800+        # to the remote server in one write.
4801+        offsets = self._pack_offsets()
4802+        prefix = self.get_signable()
4803+        final_share = "".join([prefix,
4804+                               offsets,
4805+                               self._share_pieces['verification_key'],
4806+                               self._share_pieces['signature'],
4807+                               self._share_pieces['share_hash_chain'],
4808+                               self._share_pieces['block_hash_tree'],
4809+                               self._share_pieces['sharedata'],
4810+                               self._share_pieces['encprivkey']])
4811+
4812+        # Our only data vector is going to be writing the final share,
4813+        # in its entirely.
4814+        datavs = [(0, final_share)]
4815+
4816+        if not self._testvs:
4817+            # Our caller has not provided us with another checkstring
4818+            # yet, so we assume that we are writing a new share, and set
4819+            # a test vector that will allow a new share to be written.
4820+            self._testvs = []
4821+            self._testvs.append(tuple([0, 1, "eq", ""]))
4822+
4823+        tw_vectors = {}
4824+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4825+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4826+                                     self._storage_index,
4827+                                     self._secrets,
4828+                                     tw_vectors,
4829+                                     # TODO is it useful to read something?
4830+                                     self._readvs)
4831+
4832+
4833+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4834+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4835+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4836+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4837+MDMFCHECKSTRING = ">BQ32s"
4838+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4839+MDMFOFFSETS = ">QQQQQQ"
4840+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4841+
4842+class MDMFSlotWriteProxy:
4843+    implements(IMutableSlotWriter)
4844+
4845+    """
4846+    I represent a remote write slot for an MDMF mutable file.
4847+
4848+    I abstract away from my caller the details of block and salt
4849+    management, and the implementation of the on-disk format for MDMF
4850+    shares.
4851+    """
4852+    # Expected layout, MDMF:
4853+    # offset:     size:       name:
4854+    #-- signed part --
4855+    # 0           1           version number (01)
4856+    # 1           8           sequence number
4857+    # 9           32          share tree root hash
4858+    # 41          1           The "k" encoding parameter
4859+    # 42          1           The "N" encoding parameter
4860+    # 43          8           The segment size of the uploaded file
4861+    # 51          8           The data length of the original plaintext
4862+    #-- end signed part --
4863+    # 59          8           The offset of the encrypted private key
4864+    # 83          8           The offset of the signature
4865+    # 91          8           The offset of the verification key
4866+    # 67          8           The offset of the block hash tree
4867+    # 75          8           The offset of the share hash chain
4868+    # 99          8           The offset of the EOF
4869+    #
4870+    # followed by salts and share data, the encrypted private key, the
4871+    # block hash tree, the salt hash tree, the share hash chain, a
4872+    # signature over the first eight fields, and a verification key.
4873+    #
4874+    # The checkstring is the first three fields -- the version number,
4875+    # sequence number, root hash and root salt hash. This is consistent
4876+    # in meaning to what we have with SDMF files, except now instead of
4877+    # using the literal salt, we use a value derived from all of the
4878+    # salts -- the share hash root.
4879+    #
4880+    # The salt is stored before the block for each segment. The block
4881+    # hash tree is computed over the combination of block and salt for
4882+    # each segment. In this way, we get integrity checking for both
4883+    # block and salt with the current block hash tree arrangement.
4884+    #
4885+    # The ordering of the offsets is different to reflect the dependencies
4886+    # that we'll run into with an MDMF file. The expected write flow is
4887+    # something like this:
4888+    #
4889+    #   0: Initialize with the sequence number, encoding parameters and
4890+    #      data length. From this, we can deduce the number of segments,
4891+    #      and where they should go.. We can also figure out where the
4892+    #      encrypted private key should go, because we can figure out how
4893+    #      big the share data will be.
4894+    #
4895+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4896+    #      like
4897+    #
4898+    #       put_block(data, segnum, salt)
4899+    #
4900+    #      to write a block and a salt to the disk. We can do both of
4901+    #      these operations now because we have enough of the offsets to
4902+    #      know where to put them.
4903+    #
4904+    #   2: Put the encrypted private key. Use:
4905+    #
4906+    #        put_encprivkey(encprivkey)
4907+    #
4908+    #      Now that we know the length of the private key, we can fill
4909+    #      in the offset for the block hash tree.
4910+    #
4911+    #   3: We're now in a position to upload the block hash tree for
4912+    #      a share. Put that using something like:
4913+    #       
4914+    #        put_blockhashes(block_hash_tree)
4915+    #
4916+    #      Note that block_hash_tree is a list of hashes -- we'll take
4917+    #      care of the details of serializing that appropriately. When
4918+    #      we get the block hash tree, we are also in a position to
4919+    #      calculate the offset for the share hash chain, and fill that
4920+    #      into the offsets table.
4921+    #
4922+    #   4: At the same time, we're in a position to upload the salt hash
4923+    #      tree. This is a Merkle tree over all of the salts. We use a
4924+    #      Merkle tree so that we can validate each block,salt pair as
4925+    #      we download them later. We do this using
4926+    #
4927+    #        put_salthashes(salt_hash_tree)
4928+    #
4929+    #      When you do this, I automatically put the root of the tree
4930+    #      (the hash at index 0 of the list) in its appropriate slot in
4931+    #      the signed prefix of the share.
4932+    #
4933+    #   5: We're now in a position to upload the share hash chain for
4934+    #      a share. Do that with something like:
4935+    #     
4936+    #        put_sharehashes(share_hash_chain)
4937+    #
4938+    #      share_hash_chain should be a dictionary mapping shnums to
4939+    #      32-byte hashes -- the wrapper handles serialization.
4940+    #      We'll know where to put the signature at this point, also.
4941+    #      The root of this tree will be put explicitly in the next
4942+    #      step.
4943+    #
4944+    #      TODO: Why? Why not just include it in the tree here?
4945+    #
4946+    #   6: Before putting the signature, we must first put the
4947+    #      root_hash. Do this with:
4948+    #
4949+    #        put_root_hash(root_hash).
4950+    #     
4951+    #      In terms of knowing where to put this value, it was always
4952+    #      possible to place it, but it makes sense semantically to
4953+    #      place it after the share hash tree, so that's why you do it
4954+    #      in this order.
4955+    #
4956+    #   6: With the root hash put, we can now sign the header. Use:
4957+    #
4958+    #        get_signable()
4959+    #
4960+    #      to get the part of the header that you want to sign, and use:
4961+    #       
4962+    #        put_signature(signature)
4963+    #
4964+    #      to write your signature to the remote server.
4965+    #
4966+    #   6: Add the verification key, and finish. Do:
4967+    #
4968+    #        put_verification_key(key)
4969+    #
4970+    #      and
4971+    #
4972+    #        finish_publish()
4973+    #
4974+    # Checkstring management:
4975+    #
4976+    # To write to a mutable slot, we have to provide test vectors to ensure
4977+    # that we are writing to the same data that we think we are. These
4978+    # vectors allow us to detect uncoordinated writes; that is, writes
4979+    # where both we and some other shareholder are writing to the
4980+    # mutable slot, and to report those back to the parts of the program
4981+    # doing the writing.
4982+    #
4983+    # With SDMF, this was easy -- all of the share data was written in
4984+    # one go, so it was easy to detect uncoordinated writes, and we only
4985+    # had to do it once. With MDMF, not all of the file is written at
4986+    # once.
4987+    #
4988+    # If a share is new, we write out as much of the header as we can
4989+    # before writing out anything else. This gives other writers a
4990+    # canary that they can use to detect uncoordinated writes, and, if
4991+    # they do the same thing, gives us the same canary. We them update
4992+    # the share. We won't be able to write out two fields of the header
4993+    # -- the share tree hash and the salt hash -- until we finish
4994+    # writing out the share. We only require the writer to provide the
4995+    # initial checkstring, and keep track of what it should be after
4996+    # updates ourselves.
4997+    #
4998+    # If we haven't written anything yet, then on the first write (which
4999+    # will probably be a block + salt of a share), we'll also write out
5000+    # the header. On subsequent passes, we'll expect to see the header.
5001+    # This changes in two places:
5002+    #
5003+    #   - When we write out the salt hash
5004+    #   - When we write out the root of the share hash tree
5005+    #
5006+    # since these values will change the header. It is possible that we
5007+    # can just make those be written in one operation to minimize
5008+    # disruption.
5009+    def __init__(self,
5010+                 shnum,
5011+                 rref, # a remote reference to a storage server
5012+                 storage_index,
5013+                 secrets, # (write_enabler, renew_secret, cancel_secret)
5014+                 seqnum, # the sequence number of the mutable file
5015+                 required_shares,
5016+                 total_shares,
5017+                 segment_size,
5018+                 data_length): # the length of the original file
5019+        self.shnum = shnum
5020+        self._rref = rref
5021+        self._storage_index = storage_index
5022+        self._seqnum = seqnum
5023+        self._required_shares = required_shares
5024+        assert self.shnum >= 0 and self.shnum < total_shares
5025+        self._total_shares = total_shares
5026+        # We build up the offset table as we write things. It is the
5027+        # last thing we write to the remote server.
5028+        self._offsets = {}
5029+        self._testvs = []
5030+        # This is a list of write vectors that will be sent to our
5031+        # remote server once we are directed to write things there.
5032+        self._writevs = []
5033+        self._secrets = secrets
5034+        # The segment size needs to be a multiple of the k parameter --
5035+        # any padding should have been carried out by the publisher
5036+        # already.
5037+        assert segment_size % required_shares == 0
5038+        self._segment_size = segment_size
5039+        self._data_length = data_length
5040+
5041+        # These are set later -- we define them here so that we can
5042+        # check for their existence easily
5043+
5044+        # This is the root of the share hash tree -- the Merkle tree
5045+        # over the roots of the block hash trees computed for shares in
5046+        # this upload.
5047+        self._root_hash = None
5048+
5049+        # We haven't yet written anything to the remote bucket. By
5050+        # setting this, we tell the _write method as much. The write
5051+        # method will then know that it also needs to add a write vector
5052+        # for the checkstring (or what we have of it) to the first write
5053+        # request. We'll then record that value for future use.  If
5054+        # we're expecting something to be there already, we need to call
5055+        # set_checkstring before we write anything to tell the first
5056+        # write about that.
5057+        self._written = False
5058+
5059+        # When writing data to the storage servers, we get a read vector
5060+        # for free. We'll read the checkstring, which will help us
5061+        # figure out what's gone wrong if a write fails.
5062+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
5063+
5064+        # We calculate the number of segments because it tells us
5065+        # where the salt part of the file ends/share segment begins,
5066+        # and also because it provides a useful amount of bounds checking.
5067+        self._num_segments = mathutil.div_ceil(self._data_length,
5068+                                               self._segment_size)
5069+        self._block_size = self._segment_size / self._required_shares
5070+        # We also calculate the share size, to help us with block
5071+        # constraints later.
5072+        tail_size = self._data_length % self._segment_size
5073+        if not tail_size:
5074+            self._tail_block_size = self._block_size
5075+        else:
5076+            self._tail_block_size = mathutil.next_multiple(tail_size,
5077+                                                           self._required_shares)
5078+            self._tail_block_size /= self._required_shares
5079+
5080+        # We already know where the sharedata starts; right after the end
5081+        # of the header (which is defined as the signable part + the offsets)
5082+        # We can also calculate where the encrypted private key begins
5083+        # from what we know know.
5084+        self._actual_block_size = self._block_size + SALT_SIZE
5085+        data_size = self._actual_block_size * (self._num_segments - 1)
5086+        data_size += self._tail_block_size
5087+        data_size += SALT_SIZE
5088+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
5089+        self._offsets['enc_privkey'] += data_size
5090+        # We'll wait for the rest. Callers can now call my "put_block" and
5091+        # "set_checkstring" methods.
5092+
5093+
5094+    def set_checkstring(self,
5095+                        seqnum_or_checkstring,
5096+                        root_hash=None,
5097+                        salt=None):
5098+        """
5099+        Set checkstring checkstring for the given shnum.
5100+
5101+        This can be invoked in one of two ways.
5102+
5103+        With one argument, I assume that you are giving me a literal
5104+        checkstring -- e.g., the output of get_checkstring. I will then
5105+        set that checkstring as it is. This form is used by unit tests.
5106+
5107+        With two arguments, I assume that you are giving me a sequence
5108+        number and root hash to make a checkstring from. In that case, I
5109+        will build a checkstring and set it for you. This form is used
5110+        by the publisher.
5111+
5112+        By default, I assume that I am writing new shares to the grid.
5113+        If you don't explcitly set your own checkstring, I will use
5114+        one that requires that the remote share not exist. You will want
5115+        to use this method if you are updating a share in-place;
5116+        otherwise, writes will fail.
5117+        """
5118+        # You're allowed to overwrite checkstrings with this method;
5119+        # I assume that users know what they are doing when they call
5120+        # it.
5121+        if root_hash:
5122+            checkstring = struct.pack(MDMFCHECKSTRING,
5123+                                      1,
5124+                                      seqnum_or_checkstring,
5125+                                      root_hash)
5126+        else:
5127+            checkstring = seqnum_or_checkstring
5128+
5129+        if checkstring == "":
5130+            # We special-case this, since len("") = 0, but we need
5131+            # length of 1 for the case of an empty share to work on the
5132+            # storage server, which is what a checkstring that is the
5133+            # empty string means.
5134+            self._testvs = []
5135+        else:
5136+            self._testvs = []
5137+            self._testvs.append((0, len(checkstring), "eq", checkstring))
5138+
5139+
5140+    def __repr__(self):
5141+        return "MDMFSlotWriteProxy for share %d" % self.shnum
5142+
5143+
5144+    def get_checkstring(self):
5145+        """
5146+        Given a share number, I return a representation of what the
5147+        checkstring for that share on the server will look like.
5148+
5149+        I am mostly used for tests.
5150+        """
5151+        if self._root_hash:
5152+            roothash = self._root_hash
5153+        else:
5154+            roothash = "\x00" * 32
5155+        return struct.pack(MDMFCHECKSTRING,
5156+                           1,
5157+                           self._seqnum,
5158+                           roothash)
5159+
5160+
5161+    def put_block(self, data, segnum, salt):
5162+        """
5163+        I queue a write vector for the data, salt, and segment number
5164+        provided to me. I return None, as I do not actually cause
5165+        anything to be written yet.
5166+        """
5167+        if segnum >= self._num_segments:
5168+            raise LayoutInvalid("I won't overwrite the private key")
5169+        if len(salt) != SALT_SIZE:
5170+            raise LayoutInvalid("I was given a salt of size %d, but "
5171+                                "I wanted a salt of size %d")
5172+        if segnum + 1 == self._num_segments:
5173+            if len(data) != self._tail_block_size:
5174+                raise LayoutInvalid("I was given the wrong size block to write")
5175+        elif len(data) != self._block_size:
5176+            raise LayoutInvalid("I was given the wrong size block to write")
5177+
5178+        # We want to write at len(MDMFHEADER) + segnum * block_size.
5179+
5180+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
5181+        data = salt + data
5182+
5183+        self._writevs.append(tuple([offset, data]))
5184+
5185+
5186+    def put_encprivkey(self, encprivkey):
5187+        """
5188+        I queue a write vector for the encrypted private key provided to
5189+        me.
5190+        """
5191+        assert self._offsets
5192+        assert self._offsets['enc_privkey']
5193+        # You shouldn't re-write the encprivkey after the block hash
5194+        # tree is written, since that could cause the private key to run
5195+        # into the block hash tree. Before it writes the block hash
5196+        # tree, the block hash tree writing method writes the offset of
5197+        # the salt hash tree. So that's a good indicator of whether or
5198+        # not the block hash tree has been written.
5199+        if "share_hash_chain" in self._offsets:
5200+            raise LayoutInvalid("You must write this before the block hash tree")
5201+
5202+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
5203+            len(encprivkey)
5204+        self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
5205+
5206+
5207+    def put_blockhashes(self, blockhashes):
5208+        """
5209+        I queue a write vector to put the block hash tree in blockhashes
5210+        onto the remote server.
5211+
5212+        The encrypted private key must be queued before the block hash
5213+        tree, since we need to know how large it is to know where the
5214+        block hash tree should go. The block hash tree must be put
5215+        before the salt hash tree, since its size determines the
5216+        offset of the share hash chain.
5217+        """
5218+        assert self._offsets
5219+        assert isinstance(blockhashes, list)
5220+        if "block_hash_tree" not in self._offsets:
5221+            raise LayoutInvalid("You must put the encrypted private key "
5222+                                "before you put the block hash tree")
5223+        # If written, the share hash chain causes the signature offset
5224+        # to be defined.
5225+        if "signature" in self._offsets:
5226+            raise LayoutInvalid("You must put the block hash tree before "
5227+                                "you put the share hash chain")
5228+        blockhashes_s = "".join(blockhashes)
5229+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
5230+
5231+        self._writevs.append(tuple([self._offsets['block_hash_tree'],
5232+                                  blockhashes_s]))
5233+
5234+
5235+    def put_sharehashes(self, sharehashes):
5236+        """
5237+        I queue a write vector to put the share hash chain in my
5238+        argument onto the remote server.
5239+
5240+        The salt hash tree must be queued before the share hash chain,
5241+        since we need to know where the salt hash tree ends before we
5242+        can know where the share hash chain starts. The share hash chain
5243+        must be put before the signature, since the length of the packed
5244+        share hash chain determines the offset of the signature. Also,
5245+        semantically, you must know what the root of the salt hash tree
5246+        is before you can generate a valid signature.
5247+        """
5248+        assert isinstance(sharehashes, dict)
5249+        if "share_hash_chain" not in self._offsets:
5250+            raise LayoutInvalid("You need to put the salt hash tree before "
5251+                                "you can put the share hash chain")
5252+        # The signature comes after the share hash chain. If the
5253+        # signature has already been written, we must not write another
5254+        # share hash chain. The signature writes the verification key
5255+        # offset when it gets sent to the remote server, so we look for
5256+        # that.
5257+        if "verification_key" in self._offsets:
5258+            raise LayoutInvalid("You must write the share hash chain "
5259+                                "before you write the signature")
5260+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5261+                                  for i in sorted(sharehashes.keys())])
5262+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
5263+        self._writevs.append(tuple([self._offsets['share_hash_chain'],
5264+                            sharehashes_s]))
5265+
5266+
5267+    def put_root_hash(self, roothash):
5268+        """
5269+        Put the root hash (the root of the share hash tree) in the
5270+        remote slot.
5271+        """
5272+        # It does not make sense to be able to put the root
5273+        # hash without first putting the share hashes, since you need
5274+        # the share hashes to generate the root hash.
5275+        #
5276+        # Signature is defined by the routine that places the share hash
5277+        # chain, so it's a good thing to look for in finding out whether
5278+        # or not the share hash chain exists on the remote server.
5279+        if "signature" not in self._offsets:
5280+            raise LayoutInvalid("You need to put the share hash chain "
5281+                                "before you can put the root share hash")
5282+        if len(roothash) != HASH_SIZE:
5283+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
5284+                                 % HASH_SIZE)
5285+        self._root_hash = roothash
5286+        # To write both of these values, we update the checkstring on
5287+        # the remote server, which includes them
5288+        checkstring = self.get_checkstring()
5289+        self._writevs.append(tuple([0, checkstring]))
5290+        # This write, if successful, changes the checkstring, so we need
5291+        # to update our internal checkstring to be consistent with the
5292+        # one on the server.
5293+
5294+
5295+    def get_signable(self):
5296+        """
5297+        Get the first seven fields of the mutable file; the parts that
5298+        are signed.
5299+        """
5300+        if not self._root_hash:
5301+            raise LayoutInvalid("You need to set the root hash "
5302+                                "before getting something to "
5303+                                "sign")
5304+        return struct.pack(MDMFSIGNABLEHEADER,
5305+                           1,
5306+                           self._seqnum,
5307+                           self._root_hash,
5308+                           self._required_shares,
5309+                           self._total_shares,
5310+                           self._segment_size,
5311+                           self._data_length)
5312+
5313+
5314+    def put_signature(self, signature):
5315+        """
5316+        I queue a write vector for the signature of the MDMF share.
5317+
5318+        I require that the root hash and share hash chain have been put
5319+        to the grid before I will write the signature to the grid.
5320+        """
5321+        if "signature" not in self._offsets:
5322+            raise LayoutInvalid("You must put the share hash chain "
5323+        # It does not make sense to put a signature without first
5324+        # putting the root hash and the salt hash (since otherwise
5325+        # the signature would be incomplete), so we don't allow that.
5326+                       "before putting the signature")
5327+        if not self._root_hash:
5328+            raise LayoutInvalid("You must complete the signed prefix "
5329+                                "before computing a signature")
5330+        # If we put the signature after we put the verification key, we
5331+        # could end up running into the verification key, and will
5332+        # probably screw up the offsets as well. So we don't allow that.
5333+        # The method that writes the verification key defines the EOF
5334+        # offset before writing the verification key, so look for that.
5335+        if "EOF" in self._offsets:
5336+            raise LayoutInvalid("You must write the signature before the verification key")
5337+
5338+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
5339+        self._writevs.append(tuple([self._offsets['signature'], signature]))
5340+
5341+
5342+    def put_verification_key(self, verification_key):
5343+        """
5344+        I queue a write vector for the verification key.
5345+
5346+        I require that the signature have been written to the storage
5347+        server before I allow the verification key to be written to the
5348+        remote server.
5349+        """
5350+        if "verification_key" not in self._offsets:
5351+            raise LayoutInvalid("You must put the signature before you "
5352+                                "can put the verification key")
5353+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
5354+        self._writevs.append(tuple([self._offsets['verification_key'],
5355+                            verification_key]))
5356+
5357+
5358+    def _get_offsets_tuple(self):
5359+        return tuple([(key, value) for key, value in self._offsets.items()])
5360+
5361+
5362+    def get_verinfo(self):
5363+        return (self._seqnum,
5364+                self._root_hash,
5365+                self._required_shares,
5366+                self._total_shares,
5367+                self._segment_size,
5368+                self._data_length,
5369+                self.get_signable(),
5370+                self._get_offsets_tuple())
5371+
5372+
5373+    def finish_publishing(self):
5374+        """
5375+        I add a write vector for the offsets table, and then cause all
5376+        of the write vectors that I've dealt with so far to be published
5377+        to the remote server, ending the write process.
5378+        """
5379+        if "EOF" not in self._offsets:
5380+            raise LayoutInvalid("You must put the verification key before "
5381+                                "you can publish the offsets")
5382+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
5383+        offsets = struct.pack(MDMFOFFSETS,
5384+                              self._offsets['enc_privkey'],
5385+                              self._offsets['block_hash_tree'],
5386+                              self._offsets['share_hash_chain'],
5387+                              self._offsets['signature'],
5388+                              self._offsets['verification_key'],
5389+                              self._offsets['EOF'])
5390+        self._writevs.append(tuple([offsets_offset, offsets]))
5391+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5392+        params = struct.pack(">BBQQ",
5393+                             self._required_shares,
5394+                             self._total_shares,
5395+                             self._segment_size,
5396+                             self._data_length)
5397+        self._writevs.append(tuple([encoding_parameters_offset, params]))
5398+        return self._write(self._writevs)
5399+
5400+
5401+    def _write(self, datavs, on_failure=None, on_success=None):
5402+        """I write the data vectors in datavs to the remote slot."""
5403+        tw_vectors = {}
5404+        if not self._testvs:
5405+            self._testvs = []
5406+            self._testvs.append(tuple([0, 1, "eq", ""]))
5407+        if not self._written:
5408+            # Write a new checkstring to the share when we write it, so
5409+            # that we have something to check later.
5410+            new_checkstring = self.get_checkstring()
5411+            datavs.append((0, new_checkstring))
5412+            def _first_write():
5413+                self._written = True
5414+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5415+            on_success = _first_write
5416+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5417+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5418+                                  self._storage_index,
5419+                                  self._secrets,
5420+                                  tw_vectors,
5421+                                  self._readv)
5422+        def _result(results):
5423+            if isinstance(results, failure.Failure) or not results[0]:
5424+                # Do nothing; the write was unsuccessful.
5425+                if on_failure: on_failure()
5426+            else:
5427+                if on_success: on_success()
5428+            return results
5429+        d.addCallback(_result)
5430+        return d
5431+
5432+
5433+class MDMFSlotReadProxy:
5434+    """
5435+    I read from a mutable slot filled with data written in the MDMF data
5436+    format (which is described above).
5437+
5438+    I can be initialized with some amount of data, which I will use (if
5439+    it is valid) to eliminate some of the need to fetch it from servers.
5440+    """
5441+    def __init__(self,
5442+                 rref,
5443+                 storage_index,
5444+                 shnum,
5445+                 data=""):
5446+        # Start the initialization process.
5447+        self._rref = rref
5448+        self._storage_index = storage_index
5449+        self.shnum = shnum
5450+
5451+        # Before doing anything, the reader is probably going to want to
5452+        # verify that the signature is correct. To do that, they'll need
5453+        # the verification key, and the signature. To get those, we'll
5454+        # need the offset table. So fetch the offset table on the
5455+        # assumption that that will be the first thing that a reader is
5456+        # going to do.
5457+
5458+        # The fact that these encoding parameters are None tells us
5459+        # that we haven't yet fetched them from the remote share, so we
5460+        # should. We could just not set them, but the checks will be
5461+        # easier to read if we don't have to use hasattr.
5462+        self._version_number = None
5463+        self._sequence_number = None
5464+        self._root_hash = None
5465+        # Filled in if we're dealing with an SDMF file. Unused
5466+        # otherwise.
5467+        self._salt = None
5468+        self._required_shares = None
5469+        self._total_shares = None
5470+        self._segment_size = None
5471+        self._data_length = None
5472+        self._offsets = None
5473+
5474+        # If the user has chosen to initialize us with some data, we'll
5475+        # try to satisfy subsequent data requests with that data before
5476+        # asking the storage server for it. If
5477+        self._data = data
5478+        # The way callers interact with cache in the filenode returns
5479+        # None if there isn't any cached data, but the way we index the
5480+        # cached data requires a string, so convert None to "".
5481+        if self._data == None:
5482+            self._data = ""
5483+
5484+        self._queue_observers = observer.ObserverList()
5485+        self._queue_errbacks = observer.ObserverList()
5486+        self._readvs = []
5487+
5488+
5489+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5490+        """
5491+        I fetch the offset table and the header from the remote slot if
5492+        I don't already have them. If I do have them, I do nothing and
5493+        return an empty Deferred.
5494+        """
5495+        if self._offsets:
5496+            return defer.succeed(None)
5497+        # At this point, we may be either SDMF or MDMF. Fetching 107
5498+        # bytes will be enough to get header and offsets for both SDMF and
5499+        # MDMF, though we'll be left with 4 more bytes than we
5500+        # need if this ends up being MDMF. This is probably less
5501+        # expensive than the cost of a second roundtrip.
5502+        readvs = [(0, 107)]
5503+        d = self._read(readvs, force_remote)
5504+        d.addCallback(self._process_encoding_parameters)
5505+        d.addCallback(self._process_offsets)
5506+        return d
5507+
5508+
5509+    def _process_encoding_parameters(self, encoding_parameters):
5510+        assert self.shnum in encoding_parameters
5511+        encoding_parameters = encoding_parameters[self.shnum][0]
5512+        # The first byte is the version number. It will tell us what
5513+        # to do next.
5514+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5515+        if verno == MDMF_VERSION:
5516+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5517+            (verno,
5518+             seqnum,
5519+             root_hash,
5520+             k,
5521+             n,
5522+             segsize,
5523+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5524+                                      encoding_parameters[:read_size])
5525+            if segsize == 0 and datalen == 0:
5526+                # Empty file, no segments.
5527+                self._num_segments = 0
5528+            else:
5529+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5530+
5531+        elif verno == SDMF_VERSION:
5532+            read_size = SIGNED_PREFIX_LENGTH
5533+            (verno,
5534+             seqnum,
5535+             root_hash,
5536+             salt,
5537+             k,
5538+             n,
5539+             segsize,
5540+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5541+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5542+            self._salt = salt
5543+            if segsize == 0 and datalen == 0:
5544+                # empty file
5545+                self._num_segments = 0
5546+            else:
5547+                # non-empty SDMF files have one segment.
5548+                self._num_segments = 1
5549+        else:
5550+            raise UnknownVersionError("You asked me to read mutable file "
5551+                                      "version %d, but I only understand "
5552+                                      "%d and %d" % (verno, SDMF_VERSION,
5553+                                                     MDMF_VERSION))
5554+
5555+        self._version_number = verno
5556+        self._sequence_number = seqnum
5557+        self._root_hash = root_hash
5558+        self._required_shares = k
5559+        self._total_shares = n
5560+        self._segment_size = segsize
5561+        self._data_length = datalen
5562+
5563+        self._block_size = self._segment_size / self._required_shares
5564+        # We can upload empty files, and need to account for this fact
5565+        # so as to avoid zero-division and zero-modulo errors.
5566+        if datalen > 0:
5567+            tail_size = self._data_length % self._segment_size
5568+        else:
5569+            tail_size = 0
5570+        if not tail_size:
5571+            self._tail_block_size = self._block_size
5572+        else:
5573+            self._tail_block_size = mathutil.next_multiple(tail_size,
5574+                                                    self._required_shares)
5575+            self._tail_block_size /= self._required_shares
5576+
5577+        return encoding_parameters
5578+
5579+
5580+    def _process_offsets(self, offsets):
5581+        if self._version_number == 0:
5582+            read_size = OFFSETS_LENGTH
5583+            read_offset = SIGNED_PREFIX_LENGTH
5584+            end = read_size + read_offset
5585+            (signature,
5586+             share_hash_chain,
5587+             block_hash_tree,
5588+             share_data,
5589+             enc_privkey,
5590+             EOF) = struct.unpack(">LLLLQQ",
5591+                                  offsets[read_offset:end])
5592+            self._offsets = {}
5593+            self._offsets['signature'] = signature
5594+            self._offsets['share_data'] = share_data
5595+            self._offsets['block_hash_tree'] = block_hash_tree
5596+            self._offsets['share_hash_chain'] = share_hash_chain
5597+            self._offsets['enc_privkey'] = enc_privkey
5598+            self._offsets['EOF'] = EOF
5599+
5600+        elif self._version_number == 1:
5601+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5602+            read_length = MDMFOFFSETS_LENGTH
5603+            end = read_offset + read_length
5604+            (encprivkey,
5605+             blockhashes,
5606+             sharehashes,
5607+             signature,
5608+             verification_key,
5609+             eof) = struct.unpack(MDMFOFFSETS,
5610+                                  offsets[read_offset:end])
5611+            self._offsets = {}
5612+            self._offsets['enc_privkey'] = encprivkey
5613+            self._offsets['block_hash_tree'] = blockhashes
5614+            self._offsets['share_hash_chain'] = sharehashes
5615+            self._offsets['signature'] = signature
5616+            self._offsets['verification_key'] = verification_key
5617+            self._offsets['EOF'] = eof
5618+
5619+
5620+    def get_block_and_salt(self, segnum, queue=False):
5621+        """
5622+        I return (block, salt), where block is the block data and
5623+        salt is the salt used to encrypt that segment.
5624+        """
5625+        d = self._maybe_fetch_offsets_and_header()
5626+        def _then(ignored):
5627+            if self._version_number == 1:
5628+                base_share_offset = MDMFHEADERSIZE
5629+            else:
5630+                base_share_offset = self._offsets['share_data']
5631+
5632+            if segnum + 1 > self._num_segments:
5633+                raise LayoutInvalid("Not a valid segment number")
5634+
5635+            if self._version_number == 0:
5636+                share_offset = base_share_offset + self._block_size * segnum
5637+            else:
5638+                share_offset = base_share_offset + (self._block_size + \
5639+                                                    SALT_SIZE) * segnum
5640+            if segnum + 1 == self._num_segments:
5641+                data = self._tail_block_size
5642+            else:
5643+                data = self._block_size
5644+
5645+            if self._version_number == 1:
5646+                data += SALT_SIZE
5647+
5648+            readvs = [(share_offset, data)]
5649+            return readvs
5650+        d.addCallback(_then)
5651+        d.addCallback(lambda readvs:
5652+            self._read(readvs, queue=queue))
5653+        def _process_results(results):
5654+            assert self.shnum in results
5655+            if self._version_number == 0:
5656+                # We only read the share data, but we know the salt from
5657+                # when we fetched the header
5658+                data = results[self.shnum]
5659+                if not data:
5660+                    data = ""
5661+                else:
5662+                    assert len(data) == 1
5663+                    data = data[0]
5664+                salt = self._salt
5665+            else:
5666+                data = results[self.shnum]
5667+                if not data:
5668+                    salt = data = ""
5669+                else:
5670+                    salt_and_data = results[self.shnum][0]
5671+                    salt = salt_and_data[:SALT_SIZE]
5672+                    data = salt_and_data[SALT_SIZE:]
5673+            return data, salt
5674+        d.addCallback(_process_results)
5675+        return d
5676+
5677+
5678+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5679+        """
5680+        I return the block hash tree
5681+
5682+        I take an optional argument, needed, which is a set of indices
5683+        correspond to hashes that I should fetch. If this argument is
5684+        missing, I will fetch the entire block hash tree; otherwise, I
5685+        may attempt to fetch fewer hashes, based on what needed says
5686+        that I should do. Note that I may fetch as many hashes as I
5687+        want, so long as the set of hashes that I do fetch is a superset
5688+        of the ones that I am asked for, so callers should be prepared
5689+        to tolerate additional hashes.
5690+        """
5691+        # TODO: Return only the parts of the block hash tree necessary
5692+        # to validate the blocknum provided?
5693+        # This is a good idea, but it is hard to implement correctly. It
5694+        # is bad to fetch any one block hash more than once, so we
5695+        # probably just want to fetch the whole thing at once and then
5696+        # serve it.
5697+        if needed == set([]):
5698+            return defer.succeed([])
5699+        d = self._maybe_fetch_offsets_and_header()
5700+        def _then(ignored):
5701+            blockhashes_offset = self._offsets['block_hash_tree']
5702+            if self._version_number == 1:
5703+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5704+            else:
5705+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5706+            readvs = [(blockhashes_offset, blockhashes_length)]
5707+            return readvs
5708+        d.addCallback(_then)
5709+        d.addCallback(lambda readvs:
5710+            self._read(readvs, queue=queue, force_remote=force_remote))
5711+        def _build_block_hash_tree(results):
5712+            assert self.shnum in results
5713+
5714+            rawhashes = results[self.shnum][0]
5715+            results = [rawhashes[i:i+HASH_SIZE]
5716+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5717+            return results
5718+        d.addCallback(_build_block_hash_tree)
5719+        return d
5720+
5721+
5722+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5723+        """
5724+        I return the part of the share hash chain placed to validate
5725+        this share.
5726+
5727+        I take an optional argument, needed. Needed is a set of indices
5728+        that correspond to the hashes that I should fetch. If needed is
5729+        not present, I will fetch and return the entire share hash
5730+        chain. Otherwise, I may fetch and return any part of the share
5731+        hash chain that is a superset of the part that I am asked to
5732+        fetch. Callers should be prepared to deal with more hashes than
5733+        they've asked for.
5734+        """
5735+        if needed == set([]):
5736+            return defer.succeed([])
5737+        d = self._maybe_fetch_offsets_and_header()
5738+
5739+        def _make_readvs(ignored):
5740+            sharehashes_offset = self._offsets['share_hash_chain']
5741+            if self._version_number == 0:
5742+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5743+            else:
5744+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5745+            readvs = [(sharehashes_offset, sharehashes_length)]
5746+            return readvs
5747+        d.addCallback(_make_readvs)
5748+        d.addCallback(lambda readvs:
5749+            self._read(readvs, queue=queue, force_remote=force_remote))
5750+        def _build_share_hash_chain(results):
5751+            assert self.shnum in results
5752+
5753+            sharehashes = results[self.shnum][0]
5754+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5755+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5756+            results = dict([struct.unpack(">H32s", data)
5757+                            for data in results])
5758+            return results
5759+        d.addCallback(_build_share_hash_chain)
5760+        return d
5761+
5762+
5763+    def get_encprivkey(self, queue=False):
5764+        """
5765+        I return the encrypted private key.
5766+        """
5767+        d = self._maybe_fetch_offsets_and_header()
5768+
5769+        def _make_readvs(ignored):
5770+            privkey_offset = self._offsets['enc_privkey']
5771+            if self._version_number == 0:
5772+                privkey_length = self._offsets['EOF'] - privkey_offset
5773+            else:
5774+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5775+            readvs = [(privkey_offset, privkey_length)]
5776+            return readvs
5777+        d.addCallback(_make_readvs)
5778+        d.addCallback(lambda readvs:
5779+            self._read(readvs, queue=queue))
5780+        def _process_results(results):
5781+            assert self.shnum in results
5782+            privkey = results[self.shnum][0]
5783+            return privkey
5784+        d.addCallback(_process_results)
5785+        return d
5786+
5787+
5788+    def get_signature(self, queue=False):
5789+        """
5790+        I return the signature of my share.
5791+        """
5792+        d = self._maybe_fetch_offsets_and_header()
5793+
5794+        def _make_readvs(ignored):
5795+            signature_offset = self._offsets['signature']
5796+            if self._version_number == 1:
5797+                signature_length = self._offsets['verification_key'] - signature_offset
5798+            else:
5799+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5800+            readvs = [(signature_offset, signature_length)]
5801+            return readvs
5802+        d.addCallback(_make_readvs)
5803+        d.addCallback(lambda readvs:
5804+            self._read(readvs, queue=queue))
5805+        def _process_results(results):
5806+            assert self.shnum in results
5807+            signature = results[self.shnum][0]
5808+            return signature
5809+        d.addCallback(_process_results)
5810+        return d
5811+
5812+
5813+    def get_verification_key(self, queue=False):
5814+        """
5815+        I return the verification key.
5816+        """
5817+        d = self._maybe_fetch_offsets_and_header()
5818+
5819+        def _make_readvs(ignored):
5820+            if self._version_number == 1:
5821+                vk_offset = self._offsets['verification_key']
5822+                vk_length = self._offsets['EOF'] - vk_offset
5823+            else:
5824+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5825+                vk_length = self._offsets['signature'] - vk_offset
5826+            readvs = [(vk_offset, vk_length)]
5827+            return readvs
5828+        d.addCallback(_make_readvs)
5829+        d.addCallback(lambda readvs:
5830+            self._read(readvs, queue=queue))
5831+        def _process_results(results):
5832+            assert self.shnum in results
5833+            verification_key = results[self.shnum][0]
5834+            return verification_key
5835+        d.addCallback(_process_results)
5836+        return d
5837+
5838+
5839+    def get_encoding_parameters(self):
5840+        """
5841+        I return (k, n, segsize, datalen)
5842+        """
5843+        d = self._maybe_fetch_offsets_and_header()
5844+        d.addCallback(lambda ignored:
5845+            (self._required_shares,
5846+             self._total_shares,
5847+             self._segment_size,
5848+             self._data_length))
5849+        return d
5850+
5851+
5852+    def get_seqnum(self):
5853+        """
5854+        I return the sequence number for this share.
5855+        """
5856+        d = self._maybe_fetch_offsets_and_header()
5857+        d.addCallback(lambda ignored:
5858+            self._sequence_number)
5859+        return d
5860+
5861+
5862+    def get_root_hash(self):
5863+        """
5864+        I return the root of the block hash tree
5865+        """
5866+        d = self._maybe_fetch_offsets_and_header()
5867+        d.addCallback(lambda ignored: self._root_hash)
5868+        return d
5869+
5870+
5871+    def get_checkstring(self):
5872+        """
5873+        I return the packed representation of the following:
5874+
5875+            - version number
5876+            - sequence number
5877+            - root hash
5878+            - salt hash
5879+
5880+        which my users use as a checkstring to detect other writers.
5881+        """
5882+        d = self._maybe_fetch_offsets_and_header()
5883+        def _build_checkstring(ignored):
5884+            if self._salt:
5885+                checkstring = struct.pack(PREFIX,
5886+                                          self._version_number,
5887+                                          self._sequence_number,
5888+                                          self._root_hash,
5889+                                          self._salt)
5890+            else:
5891+                checkstring = struct.pack(MDMFCHECKSTRING,
5892+                                          self._version_number,
5893+                                          self._sequence_number,
5894+                                          self._root_hash)
5895+
5896+            return checkstring
5897+        d.addCallback(_build_checkstring)
5898+        return d
5899+
5900+
5901+    def get_prefix(self, force_remote):
5902+        d = self._maybe_fetch_offsets_and_header(force_remote)
5903+        d.addCallback(lambda ignored:
5904+            self._build_prefix())
5905+        return d
5906+
5907+
5908+    def _build_prefix(self):
5909+        # The prefix is another name for the part of the remote share
5910+        # that gets signed. It consists of everything up to and
5911+        # including the datalength, packed by struct.
5912+        if self._version_number == SDMF_VERSION:
5913+            return struct.pack(SIGNED_PREFIX,
5914+                           self._version_number,
5915+                           self._sequence_number,
5916+                           self._root_hash,
5917+                           self._salt,
5918+                           self._required_shares,
5919+                           self._total_shares,
5920+                           self._segment_size,
5921+                           self._data_length)
5922+
5923+        else:
5924+            return struct.pack(MDMFSIGNABLEHEADER,
5925+                           self._version_number,
5926+                           self._sequence_number,
5927+                           self._root_hash,
5928+                           self._required_shares,
5929+                           self._total_shares,
5930+                           self._segment_size,
5931+                           self._data_length)
5932+
5933+
5934+    def _get_offsets_tuple(self):
5935+        # The offsets tuple is another component of the version
5936+        # information tuple. It is basically our offsets dictionary,
5937+        # itemized and in a tuple.
5938+        return self._offsets.copy()
5939+
5940+
5941+    def get_verinfo(self):
5942+        """
5943+        I return my verinfo tuple. This is used by the ServermapUpdater
5944+        to keep track of versions of mutable files.
5945+
5946+        The verinfo tuple for MDMF files contains:
5947+            - seqnum
5948+            - root hash
5949+            - a blank (nothing)
5950+            - segsize
5951+            - datalen
5952+            - k
5953+            - n
5954+            - prefix (the thing that you sign)
5955+            - a tuple of offsets
5956+
5957+        We include the nonce in MDMF to simplify processing of version
5958+        information tuples.
5959+
5960+        The verinfo tuple for SDMF files is the same, but contains a
5961+        16-byte IV instead of a hash of salts.
5962+        """
5963+        d = self._maybe_fetch_offsets_and_header()
5964+        def _build_verinfo(ignored):
5965+            if self._version_number == SDMF_VERSION:
5966+                salt_to_use = self._salt
5967+            else:
5968+                salt_to_use = None
5969+            return (self._sequence_number,
5970+                    self._root_hash,
5971+                    salt_to_use,
5972+                    self._segment_size,
5973+                    self._data_length,
5974+                    self._required_shares,
5975+                    self._total_shares,
5976+                    self._build_prefix(),
5977+                    self._get_offsets_tuple())
5978+        d.addCallback(_build_verinfo)
5979+        return d
5980+
5981+
5982+    def flush(self):
5983+        """
5984+        I flush my queue of read vectors.
5985+        """
5986+        d = self._read(self._readvs)
5987+        def _then(results):
5988+            self._readvs = []
5989+            if isinstance(results, failure.Failure):
5990+                self._queue_errbacks.notify(results)
5991+            else:
5992+                self._queue_observers.notify(results)
5993+            self._queue_observers = observer.ObserverList()
5994+            self._queue_errbacks = observer.ObserverList()
5995+        d.addBoth(_then)
5996+
5997+
5998+    def _read(self, readvs, force_remote=False, queue=False):
5999+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
6000+        # TODO: It's entirely possible to tweak this so that it just
6001+        # fulfills the requests that it can, and not demand that all
6002+        # requests are satisfiable before running it.
6003+        if not unsatisfiable and not force_remote:
6004+            results = [self._data[offset:offset+length]
6005+                       for (offset, length) in readvs]
6006+            results = {self.shnum: results}
6007+            return defer.succeed(results)
6008+        else:
6009+            if queue:
6010+                start = len(self._readvs)
6011+                self._readvs += readvs
6012+                end = len(self._readvs)
6013+                def _get_results(results, start, end):
6014+                    if not self.shnum in results:
6015+                        return {self._shnum: [""]}
6016+                    return {self.shnum: results[self.shnum][start:end]}
6017+                d = defer.Deferred()
6018+                d.addCallback(_get_results, start, end)
6019+                self._queue_observers.subscribe(d.callback)
6020+                self._queue_errbacks.subscribe(d.errback)
6021+                return d
6022+            return self._rref.callRemote("slot_readv",
6023+                                         self._storage_index,
6024+                                         [self.shnum],
6025+                                         readvs)
6026+
6027+
6028+    def is_sdmf(self):
6029+        """I tell my caller whether or not my remote file is SDMF or MDMF
6030+        """
6031+        d = self._maybe_fetch_offsets_and_header()
6032+        d.addCallback(lambda ignored:
6033+            self._version_number == 0)
6034+        return d
6035+
6036+
6037+class LayoutInvalid(Exception):
6038+    """
6039+    This isn't a valid MDMF mutable file
6040+    """
6041merger 0.0 (
6042hunk ./src/allmydata/test/test_storage.py 3
6043-from allmydata.util import log
6044-
6045merger 0.0 (
6046hunk ./src/allmydata/test/test_storage.py 3
6047-import time, os.path, stat, re, simplejson, struct
6048+from allmydata.util import log
6049+
6050+import mock
6051hunk ./src/allmydata/test/test_storage.py 3
6052-import time, os.path, stat, re, simplejson, struct
6053+import time, os.path, stat, re, simplejson, struct, shutil
6054)
6055)
6056hunk ./src/allmydata/test/test_storage.py 23
6057 from allmydata.storage.expirer import LeaseCheckingCrawler
6058 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
6059      ReadBucketProxy
6060-from allmydata.interfaces import BadWriteEnablerError
6061-from allmydata.test.common import LoggingServiceParent
6062+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
6063+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
6064+                                     SIGNED_PREFIX, MDMFHEADER, \
6065+                                     MDMFOFFSETS, SDMFSlotWriteProxy
6066+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
6067+                                 SDMF_VERSION
6068+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
6069 from allmydata.test.common_web import WebRenderingMixin
6070 from allmydata.web.storage import StorageStatus, remove_prefix
6071 
6072hunk ./src/allmydata/test/test_storage.py 107
6073 
6074 class RemoteBucket:
6075 
6076+    def __init__(self):
6077+        self.read_count = 0
6078+        self.write_count = 0
6079+
6080     def callRemote(self, methname, *args, **kwargs):
6081         def _call():
6082             meth = getattr(self.target, "remote_" + methname)
6083hunk ./src/allmydata/test/test_storage.py 115
6084             return meth(*args, **kwargs)
6085+
6086+        if methname == "slot_readv":
6087+            self.read_count += 1
6088+        if "writev" in methname:
6089+            self.write_count += 1
6090+
6091         return defer.maybeDeferred(_call)
6092 
6093hunk ./src/allmydata/test/test_storage.py 123
6094+
6095 class BucketProxy(unittest.TestCase):
6096     def make_bucket(self, name, size):
6097         basedir = os.path.join("storage", "BucketProxy", name)
6098hunk ./src/allmydata/test/test_storage.py 1306
6099         self.failUnless(os.path.exists(prefixdir), prefixdir)
6100         self.failIf(os.path.exists(bucketdir), bucketdir)
6101 
6102+
6103+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
6104+    def setUp(self):
6105+        self.sparent = LoggingServiceParent()
6106+        self._lease_secret = itertools.count()
6107+        self.ss = self.create("MDMFProxies storage test server")
6108+        self.rref = RemoteBucket()
6109+        self.rref.target = self.ss
6110+        self.secrets = (self.write_enabler("we_secret"),
6111+                        self.renew_secret("renew_secret"),
6112+                        self.cancel_secret("cancel_secret"))
6113+        self.segment = "aaaaaa"
6114+        self.block = "aa"
6115+        self.salt = "a" * 16
6116+        self.block_hash = "a" * 32
6117+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
6118+        self.share_hash = self.block_hash
6119+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
6120+        self.signature = "foobarbaz"
6121+        self.verification_key = "vvvvvv"
6122+        self.encprivkey = "private"
6123+        self.root_hash = self.block_hash
6124+        self.salt_hash = self.root_hash
6125+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
6126+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
6127+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
6128+        # blockhashes and salt hashes are serialized in the same way,
6129+        # only we lop off the first element and store that in the
6130+        # header.
6131+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
6132+
6133+
6134+    def tearDown(self):
6135+        self.sparent.stopService()
6136+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
6137+
6138+
6139+    def write_enabler(self, we_tag):
6140+        return hashutil.tagged_hash("we_blah", we_tag)
6141+
6142+
6143+    def renew_secret(self, tag):
6144+        return hashutil.tagged_hash("renew_blah", str(tag))
6145+
6146+
6147+    def cancel_secret(self, tag):
6148+        return hashutil.tagged_hash("cancel_blah", str(tag))
6149+
6150+
6151+    def workdir(self, name):
6152+        basedir = os.path.join("storage", "MutableServer", name)
6153+        return basedir
6154+
6155+
6156+    def create(self, name):
6157+        workdir = self.workdir(name)
6158+        ss = StorageServer(workdir, "\x00" * 20)
6159+        ss.setServiceParent(self.sparent)
6160+        return ss
6161+
6162+
6163+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
6164+        # Start with the checkstring
6165+        data = struct.pack(">BQ32s",
6166+                           1,
6167+                           0,
6168+                           self.root_hash)
6169+        self.checkstring = data
6170+        # Next, the encoding parameters
6171+        if tail_segment:
6172+            data += struct.pack(">BBQQ",
6173+                                3,
6174+                                10,
6175+                                6,
6176+                                33)
6177+        elif empty:
6178+            data += struct.pack(">BBQQ",
6179+                                3,
6180+                                10,
6181+                                0,
6182+                                0)
6183+        else:
6184+            data += struct.pack(">BBQQ",
6185+                                3,
6186+                                10,
6187+                                6,
6188+                                36)
6189+        # Now we'll build the offsets.
6190+        sharedata = ""
6191+        if not tail_segment and not empty:
6192+            for i in xrange(6):
6193+                sharedata += self.salt + self.block
6194+        elif tail_segment:
6195+            for i in xrange(5):
6196+                sharedata += self.salt + self.block
6197+            sharedata += self.salt + "a"
6198+
6199+        # The encrypted private key comes after the shares + salts
6200+        offset_size = struct.calcsize(MDMFOFFSETS)
6201+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
6202+        # The blockhashes come after the private key
6203+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
6204+        # The sharehashes come after the salt hashes
6205+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
6206+        # The signature comes after the share hash chain
6207+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
6208+        # The verification key comes after the signature
6209+        verification_offset = signature_offset + len(self.signature)
6210+        # The EOF comes after the verification key
6211+        eof_offset = verification_offset + len(self.verification_key)
6212+        data += struct.pack(MDMFOFFSETS,
6213+                            encrypted_private_key_offset,
6214+                            blockhashes_offset,
6215+                            sharehashes_offset,
6216+                            signature_offset,
6217+                            verification_offset,
6218+                            eof_offset)
6219+        self.offsets = {}
6220+        self.offsets['enc_privkey'] = encrypted_private_key_offset
6221+        self.offsets['block_hash_tree'] = blockhashes_offset
6222+        self.offsets['share_hash_chain'] = sharehashes_offset
6223+        self.offsets['signature'] = signature_offset
6224+        self.offsets['verification_key'] = verification_offset
6225+        self.offsets['EOF'] = eof_offset
6226+        # Next, we'll add in the salts and share data,
6227+        data += sharedata
6228+        # the private key,
6229+        data += self.encprivkey
6230+        # the block hash tree,
6231+        data += self.block_hash_tree_s
6232+        # the share hash chain,
6233+        data += self.share_hash_chain_s
6234+        # the signature,
6235+        data += self.signature
6236+        # and the verification key
6237+        data += self.verification_key
6238+        return data
6239+
6240+
6241+    def write_test_share_to_server(self,
6242+                                   storage_index,
6243+                                   tail_segment=False,
6244+                                   empty=False):
6245+        """
6246+        I write some data for the read tests to read to self.ss
6247+
6248+        If tail_segment=True, then I will write a share that has a
6249+        smaller tail segment than other segments.
6250+        """
6251+        write = self.ss.remote_slot_testv_and_readv_and_writev
6252+        data = self.build_test_mdmf_share(tail_segment, empty)
6253+        # Finally, we write the whole thing to the storage server in one
6254+        # pass.
6255+        testvs = [(0, 1, "eq", "")]
6256+        tws = {}
6257+        tws[0] = (testvs, [(0, data)], None)
6258+        readv = [(0, 1)]
6259+        results = write(storage_index, self.secrets, tws, readv)
6260+        self.failUnless(results[0])
6261+
6262+
6263+    def build_test_sdmf_share(self, empty=False):
6264+        if empty:
6265+            sharedata = ""
6266+        else:
6267+            sharedata = self.segment * 6
6268+        self.sharedata = sharedata
6269+        blocksize = len(sharedata) / 3
6270+        block = sharedata[:blocksize]
6271+        self.blockdata = block
6272+        prefix = struct.pack(">BQ32s16s BBQQ",
6273+                             0, # version,
6274+                             0,
6275+                             self.root_hash,
6276+                             self.salt,
6277+                             3,
6278+                             10,
6279+                             len(sharedata),
6280+                             len(sharedata),
6281+                            )
6282+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
6283+        signature_offset = post_offset + len(self.verification_key)
6284+        sharehashes_offset = signature_offset + len(self.signature)
6285+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
6286+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
6287+        encprivkey_offset = sharedata_offset + len(block)
6288+        eof_offset = encprivkey_offset + len(self.encprivkey)
6289+        offsets = struct.pack(">LLLLQQ",
6290+                              signature_offset,
6291+                              sharehashes_offset,
6292+                              blockhashes_offset,
6293+                              sharedata_offset,
6294+                              encprivkey_offset,
6295+                              eof_offset)
6296+        final_share = "".join([prefix,
6297+                           offsets,
6298+                           self.verification_key,
6299+                           self.signature,
6300+                           self.share_hash_chain_s,
6301+                           self.block_hash_tree_s,
6302+                           block,
6303+                           self.encprivkey])
6304+        self.offsets = {}
6305+        self.offsets['signature'] = signature_offset
6306+        self.offsets['share_hash_chain'] = sharehashes_offset
6307+        self.offsets['block_hash_tree'] = blockhashes_offset
6308+        self.offsets['share_data'] = sharedata_offset
6309+        self.offsets['enc_privkey'] = encprivkey_offset
6310+        self.offsets['EOF'] = eof_offset
6311+        return final_share
6312+
6313+
6314+    def write_sdmf_share_to_server(self,
6315+                                   storage_index,
6316+                                   empty=False):
6317+        # Some tests need SDMF shares to verify that we can still
6318+        # read them. This method writes one, which resembles but is not
6319+        assert self.rref
6320+        write = self.ss.remote_slot_testv_and_readv_and_writev
6321+        share = self.build_test_sdmf_share(empty)
6322+        testvs = [(0, 1, "eq", "")]
6323+        tws = {}
6324+        tws[0] = (testvs, [(0, share)], None)
6325+        readv = []
6326+        results = write(storage_index, self.secrets, tws, readv)
6327+        self.failUnless(results[0])
6328+
6329+
6330+    def test_read(self):
6331+        self.write_test_share_to_server("si1")
6332+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6333+        # Check that every method equals what we expect it to.
6334+        d = defer.succeed(None)
6335+        def _check_block_and_salt((block, salt)):
6336+            self.failUnlessEqual(block, self.block)
6337+            self.failUnlessEqual(salt, self.salt)
6338+
6339+        for i in xrange(6):
6340+            d.addCallback(lambda ignored, i=i:
6341+                mr.get_block_and_salt(i))
6342+            d.addCallback(_check_block_and_salt)
6343+
6344+        d.addCallback(lambda ignored:
6345+            mr.get_encprivkey())
6346+        d.addCallback(lambda encprivkey:
6347+            self.failUnlessEqual(self.encprivkey, encprivkey))
6348+
6349+        d.addCallback(lambda ignored:
6350+            mr.get_blockhashes())
6351+        d.addCallback(lambda blockhashes:
6352+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6353+
6354+        d.addCallback(lambda ignored:
6355+            mr.get_sharehashes())
6356+        d.addCallback(lambda sharehashes:
6357+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6358+
6359+        d.addCallback(lambda ignored:
6360+            mr.get_signature())
6361+        d.addCallback(lambda signature:
6362+            self.failUnlessEqual(signature, self.signature))
6363+
6364+        d.addCallback(lambda ignored:
6365+            mr.get_verification_key())
6366+        d.addCallback(lambda verification_key:
6367+            self.failUnlessEqual(verification_key, self.verification_key))
6368+
6369+        d.addCallback(lambda ignored:
6370+            mr.get_seqnum())
6371+        d.addCallback(lambda seqnum:
6372+            self.failUnlessEqual(seqnum, 0))
6373+
6374+        d.addCallback(lambda ignored:
6375+            mr.get_root_hash())
6376+        d.addCallback(lambda root_hash:
6377+            self.failUnlessEqual(self.root_hash, root_hash))
6378+
6379+        d.addCallback(lambda ignored:
6380+            mr.get_seqnum())
6381+        d.addCallback(lambda seqnum:
6382+            self.failUnlessEqual(0, seqnum))
6383+
6384+        d.addCallback(lambda ignored:
6385+            mr.get_encoding_parameters())
6386+        def _check_encoding_parameters((k, n, segsize, datalen)):
6387+            self.failUnlessEqual(k, 3)
6388+            self.failUnlessEqual(n, 10)
6389+            self.failUnlessEqual(segsize, 6)
6390+            self.failUnlessEqual(datalen, 36)
6391+        d.addCallback(_check_encoding_parameters)
6392+
6393+        d.addCallback(lambda ignored:
6394+            mr.get_checkstring())
6395+        d.addCallback(lambda checkstring:
6396+            self.failUnlessEqual(checkstring, checkstring))
6397+        return d
6398+
6399+
6400+    def test_read_with_different_tail_segment_size(self):
6401+        self.write_test_share_to_server("si1", tail_segment=True)
6402+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6403+        d = mr.get_block_and_salt(5)
6404+        def _check_tail_segment(results):
6405+            block, salt = results
6406+            self.failUnlessEqual(len(block), 1)
6407+            self.failUnlessEqual(block, "a")
6408+        d.addCallback(_check_tail_segment)
6409+        return d
6410+
6411+
6412+    def test_get_block_with_invalid_segnum(self):
6413+        self.write_test_share_to_server("si1")
6414+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6415+        d = defer.succeed(None)
6416+        d.addCallback(lambda ignored:
6417+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6418+                            None,
6419+                            mr.get_block_and_salt, 7))
6420+        return d
6421+
6422+
6423+    def test_get_encoding_parameters_first(self):
6424+        self.write_test_share_to_server("si1")
6425+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6426+        d = mr.get_encoding_parameters()
6427+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6428+            self.failUnlessEqual(k, 3)
6429+            self.failUnlessEqual(n, 10)
6430+            self.failUnlessEqual(segment_size, 6)
6431+            self.failUnlessEqual(datalen, 36)
6432+        d.addCallback(_check_encoding_parameters)
6433+        return d
6434+
6435+
6436+    def test_get_seqnum_first(self):
6437+        self.write_test_share_to_server("si1")
6438+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6439+        d = mr.get_seqnum()
6440+        d.addCallback(lambda seqnum:
6441+            self.failUnlessEqual(seqnum, 0))
6442+        return d
6443+
6444+
6445+    def test_get_root_hash_first(self):
6446+        self.write_test_share_to_server("si1")
6447+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6448+        d = mr.get_root_hash()
6449+        d.addCallback(lambda root_hash:
6450+            self.failUnlessEqual(root_hash, self.root_hash))
6451+        return d
6452+
6453+
6454+    def test_get_checkstring_first(self):
6455+        self.write_test_share_to_server("si1")
6456+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6457+        d = mr.get_checkstring()
6458+        d.addCallback(lambda checkstring:
6459+            self.failUnlessEqual(checkstring, self.checkstring))
6460+        return d
6461+
6462+
6463+    def test_write_read_vectors(self):
6464+        # When writing for us, the storage server will return to us a
6465+        # read vector, along with its result. If a write fails because
6466+        # the test vectors failed, this read vector can help us to
6467+        # diagnose the problem. This test ensures that the read vector
6468+        # is working appropriately.
6469+        mw = self._make_new_mw("si1", 0)
6470+
6471+        for i in xrange(6):
6472+            mw.put_block(self.block, i, self.salt)
6473+        mw.put_encprivkey(self.encprivkey)
6474+        mw.put_blockhashes(self.block_hash_tree)
6475+        mw.put_sharehashes(self.share_hash_chain)
6476+        mw.put_root_hash(self.root_hash)
6477+        mw.put_signature(self.signature)
6478+        mw.put_verification_key(self.verification_key)
6479+        d = mw.finish_publishing()
6480+        def _then(results):
6481+            self.failUnless(len(results), 2)
6482+            result, readv = results
6483+            self.failUnless(result)
6484+            self.failIf(readv)
6485+            self.old_checkstring = mw.get_checkstring()
6486+            mw.set_checkstring("")
6487+        d.addCallback(_then)
6488+        d.addCallback(lambda ignored:
6489+            mw.finish_publishing())
6490+        def _then_again(results):
6491+            self.failUnlessEqual(len(results), 2)
6492+            result, readvs = results
6493+            self.failIf(result)
6494+            self.failUnlessIn(0, readvs)
6495+            readv = readvs[0][0]
6496+            self.failUnlessEqual(readv, self.old_checkstring)
6497+        d.addCallback(_then_again)
6498+        # The checkstring remains the same for the rest of the process.
6499+        return d
6500+
6501+
6502+    def test_blockhashes_after_share_hash_chain(self):
6503+        mw = self._make_new_mw("si1", 0)
6504+        d = defer.succeed(None)
6505+        # Put everything up to and including the share hash chain
6506+        for i in xrange(6):
6507+            d.addCallback(lambda ignored, i=i:
6508+                mw.put_block(self.block, i, self.salt))
6509+        d.addCallback(lambda ignored:
6510+            mw.put_encprivkey(self.encprivkey))
6511+        d.addCallback(lambda ignored:
6512+            mw.put_blockhashes(self.block_hash_tree))
6513+        d.addCallback(lambda ignored:
6514+            mw.put_sharehashes(self.share_hash_chain))
6515+
6516+        # Now try to put the block hash tree again.
6517+        d.addCallback(lambda ignored:
6518+            self.shouldFail(LayoutInvalid, "test repeat salthashes",
6519+                            None,
6520+                            mw.put_blockhashes, self.block_hash_tree))
6521+        return d
6522+
6523+
6524+    def test_encprivkey_after_blockhashes(self):
6525+        mw = self._make_new_mw("si1", 0)
6526+        d = defer.succeed(None)
6527+        # Put everything up to and including the block hash tree
6528+        for i in xrange(6):
6529+            d.addCallback(lambda ignored, i=i:
6530+                mw.put_block(self.block, i, self.salt))
6531+        d.addCallback(lambda ignored:
6532+            mw.put_encprivkey(self.encprivkey))
6533+        d.addCallback(lambda ignored:
6534+            mw.put_blockhashes(self.block_hash_tree))
6535+        d.addCallback(lambda ignored:
6536+            self.shouldFail(LayoutInvalid, "out of order private key",
6537+                            None,
6538+                            mw.put_encprivkey, self.encprivkey))
6539+        return d
6540+
6541+
6542+    def test_share_hash_chain_after_signature(self):
6543+        mw = self._make_new_mw("si1", 0)
6544+        d = defer.succeed(None)
6545+        # Put everything up to and including the signature
6546+        for i in xrange(6):
6547+            d.addCallback(lambda ignored, i=i:
6548+                mw.put_block(self.block, i, self.salt))
6549+        d.addCallback(lambda ignored:
6550+            mw.put_encprivkey(self.encprivkey))
6551+        d.addCallback(lambda ignored:
6552+            mw.put_blockhashes(self.block_hash_tree))
6553+        d.addCallback(lambda ignored:
6554+            mw.put_sharehashes(self.share_hash_chain))
6555+        d.addCallback(lambda ignored:
6556+            mw.put_root_hash(self.root_hash))
6557+        d.addCallback(lambda ignored:
6558+            mw.put_signature(self.signature))
6559+        # Now try to put the share hash chain again. This should fail
6560+        d.addCallback(lambda ignored:
6561+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6562+                            None,
6563+                            mw.put_sharehashes, self.share_hash_chain))
6564+        return d
6565+
6566+
6567+    def test_signature_after_verification_key(self):
6568+        mw = self._make_new_mw("si1", 0)
6569+        d = defer.succeed(None)
6570+        # Put everything up to and including the verification key.
6571+        for i in xrange(6):
6572+            d.addCallback(lambda ignored, i=i:
6573+                mw.put_block(self.block, i, self.salt))
6574+        d.addCallback(lambda ignored:
6575+            mw.put_encprivkey(self.encprivkey))
6576+        d.addCallback(lambda ignored:
6577+            mw.put_blockhashes(self.block_hash_tree))
6578+        d.addCallback(lambda ignored:
6579+            mw.put_sharehashes(self.share_hash_chain))
6580+        d.addCallback(lambda ignored:
6581+            mw.put_root_hash(self.root_hash))
6582+        d.addCallback(lambda ignored:
6583+            mw.put_signature(self.signature))
6584+        d.addCallback(lambda ignored:
6585+            mw.put_verification_key(self.verification_key))
6586+        # Now try to put the signature again. This should fail
6587+        d.addCallback(lambda ignored:
6588+            self.shouldFail(LayoutInvalid, "signature after verification",
6589+                            None,
6590+                            mw.put_signature, self.signature))
6591+        return d
6592+
6593+
6594+    def test_uncoordinated_write(self):
6595+        # Make two mutable writers, both pointing to the same storage
6596+        # server, both at the same storage index, and try writing to the
6597+        # same share.
6598+        mw1 = self._make_new_mw("si1", 0)
6599+        mw2 = self._make_new_mw("si1", 0)
6600+
6601+        def _check_success(results):
6602+            result, readvs = results
6603+            self.failUnless(result)
6604+
6605+        def _check_failure(results):
6606+            result, readvs = results
6607+            self.failIf(result)
6608+
6609+        def _write_share(mw):
6610+            for i in xrange(6):
6611+                mw.put_block(self.block, i, self.salt)
6612+            mw.put_encprivkey(self.encprivkey)
6613+            mw.put_blockhashes(self.block_hash_tree)
6614+            mw.put_sharehashes(self.share_hash_chain)
6615+            mw.put_root_hash(self.root_hash)
6616+            mw.put_signature(self.signature)
6617+            mw.put_verification_key(self.verification_key)
6618+            return mw.finish_publishing()
6619+        d = _write_share(mw1)
6620+        d.addCallback(_check_success)
6621+        d.addCallback(lambda ignored:
6622+            _write_share(mw2))
6623+        d.addCallback(_check_failure)
6624+        return d
6625+
6626+
6627+    def test_invalid_salt_size(self):
6628+        # Salts need to be 16 bytes in size. Writes that attempt to
6629+        # write more or less than this should be rejected.
6630+        mw = self._make_new_mw("si1", 0)
6631+        invalid_salt = "a" * 17 # 17 bytes
6632+        another_invalid_salt = "b" * 15 # 15 bytes
6633+        d = defer.succeed(None)
6634+        d.addCallback(lambda ignored:
6635+            self.shouldFail(LayoutInvalid, "salt too big",
6636+                            None,
6637+                            mw.put_block, self.block, 0, invalid_salt))
6638+        d.addCallback(lambda ignored:
6639+            self.shouldFail(LayoutInvalid, "salt too small",
6640+                            None,
6641+                            mw.put_block, self.block, 0,
6642+                            another_invalid_salt))
6643+        return d
6644+
6645+
6646+    def test_write_test_vectors(self):
6647+        # If we give the write proxy a bogus test vector at
6648+        # any point during the process, it should fail to write when we
6649+        # tell it to write.
6650+        def _check_failure(results):
6651+            self.failUnlessEqual(len(results), 2)
6652+            res, d = results
6653+            self.failIf(res)
6654+
6655+        def _check_success(results):
6656+            self.failUnlessEqual(len(results), 2)
6657+            res, d = results
6658+            self.failUnless(results)
6659+
6660+        mw = self._make_new_mw("si1", 0)
6661+        mw.set_checkstring("this is a lie")
6662+        for i in xrange(6):
6663+            mw.put_block(self.block, i, self.salt)
6664+        mw.put_encprivkey(self.encprivkey)
6665+        mw.put_blockhashes(self.block_hash_tree)
6666+        mw.put_sharehashes(self.share_hash_chain)
6667+        mw.put_root_hash(self.root_hash)
6668+        mw.put_signature(self.signature)
6669+        mw.put_verification_key(self.verification_key)
6670+        d = mw.finish_publishing()
6671+        d.addCallback(_check_failure)
6672+        d.addCallback(lambda ignored:
6673+            mw.set_checkstring(""))
6674+        d.addCallback(lambda ignored:
6675+            mw.finish_publishing())
6676+        d.addCallback(_check_success)
6677+        return d
6678+
6679+
6680+    def serialize_blockhashes(self, blockhashes):
6681+        return "".join(blockhashes)
6682+
6683+
6684+    def serialize_sharehashes(self, sharehashes):
6685+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6686+                        for i in sorted(sharehashes.keys())])
6687+        return ret
6688+
6689+
6690+    def test_write(self):
6691+        # This translates to a file with 6 6-byte segments, and with 2-byte
6692+        # blocks.
6693+        mw = self._make_new_mw("si1", 0)
6694+        # Test writing some blocks.
6695+        read = self.ss.remote_slot_readv
6696+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6697+        written_block_size = 2 + len(self.salt)
6698+        written_block = self.block + self.salt
6699+        for i in xrange(6):
6700+            mw.put_block(self.block, i, self.salt)
6701+
6702+        mw.put_encprivkey(self.encprivkey)
6703+        mw.put_blockhashes(self.block_hash_tree)
6704+        mw.put_sharehashes(self.share_hash_chain)
6705+        mw.put_root_hash(self.root_hash)
6706+        mw.put_signature(self.signature)
6707+        mw.put_verification_key(self.verification_key)
6708+        d = mw.finish_publishing()
6709+        def _check_publish(results):
6710+            self.failUnlessEqual(len(results), 2)
6711+            result, ign = results
6712+            self.failUnless(result, "publish failed")
6713+            for i in xrange(6):
6714+                self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6715+                                {0: [written_block]})
6716+
6717+            expected_private_key_offset = expected_sharedata_offset + \
6718+                                      len(written_block) * 6
6719+            self.failUnlessEqual(len(self.encprivkey), 7)
6720+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6721+                                 {0: [self.encprivkey]})
6722+
6723+            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6724+            self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6725+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6726+                                 {0: [self.block_hash_tree_s]})
6727+
6728+            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6729+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6730+                                 {0: [self.share_hash_chain_s]})
6731+
6732+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6733+                                 {0: [self.root_hash]})
6734+            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6735+            self.failUnlessEqual(len(self.signature), 9)
6736+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6737+                                 {0: [self.signature]})
6738+
6739+            expected_verification_key_offset = expected_signature_offset + len(self.signature)
6740+            self.failUnlessEqual(len(self.verification_key), 6)
6741+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6742+                                 {0: [self.verification_key]})
6743+
6744+            signable = mw.get_signable()
6745+            verno, seq, roothash, k, n, segsize, datalen = \
6746+                                            struct.unpack(">BQ32sBBQQ",
6747+                                                          signable)
6748+            self.failUnlessEqual(verno, 1)
6749+            self.failUnlessEqual(seq, 0)
6750+            self.failUnlessEqual(roothash, self.root_hash)
6751+            self.failUnlessEqual(k, 3)
6752+            self.failUnlessEqual(n, 10)
6753+            self.failUnlessEqual(segsize, 6)
6754+            self.failUnlessEqual(datalen, 36)
6755+            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6756+
6757+            # Check the version number to make sure that it is correct.
6758+            expected_version_number = struct.pack(">B", 1)
6759+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6760+                                 {0: [expected_version_number]})
6761+            # Check the sequence number to make sure that it is correct
6762+            expected_sequence_number = struct.pack(">Q", 0)
6763+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6764+                                 {0: [expected_sequence_number]})
6765+            # Check that the encoding parameters (k, N, segement size, data
6766+            # length) are what they should be. These are  3, 10, 6, 36
6767+            expected_k = struct.pack(">B", 3)
6768+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6769+                                 {0: [expected_k]})
6770+            expected_n = struct.pack(">B", 10)
6771+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6772+                                 {0: [expected_n]})
6773+            expected_segment_size = struct.pack(">Q", 6)
6774+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6775+                                 {0: [expected_segment_size]})
6776+            expected_data_length = struct.pack(">Q", 36)
6777+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6778+                                 {0: [expected_data_length]})
6779+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6780+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6781+                                 {0: [expected_offset]})
6782+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6783+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6784+                                 {0: [expected_offset]})
6785+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6786+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6787+                                 {0: [expected_offset]})
6788+            expected_offset = struct.pack(">Q", expected_signature_offset)
6789+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6790+                                 {0: [expected_offset]})
6791+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6792+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6793+                                 {0: [expected_offset]})
6794+            expected_offset = struct.pack(">Q", expected_eof_offset)
6795+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6796+                                 {0: [expected_offset]})
6797+        d.addCallback(_check_publish)
6798+        return d
6799+
6800+    def _make_new_mw(self, si, share, datalength=36):
6801+        # This is a file of size 36 bytes. Since it has a segment
6802+        # size of 6, we know that it has 6 byte segments, which will
6803+        # be split into blocks of 2 bytes because our FEC k
6804+        # parameter is 3.
6805+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6806+                                6, datalength)
6807+        return mw
6808+
6809+
6810+    def test_write_rejected_with_too_many_blocks(self):
6811+        mw = self._make_new_mw("si0", 0)
6812+
6813+        # Try writing too many blocks. We should not be able to write
6814+        # more than 6
6815+        # blocks into each share.
6816+        d = defer.succeed(None)
6817+        for i in xrange(6):
6818+            d.addCallback(lambda ignored, i=i:
6819+                mw.put_block(self.block, i, self.salt))
6820+        d.addCallback(lambda ignored:
6821+            self.shouldFail(LayoutInvalid, "too many blocks",
6822+                            None,
6823+                            mw.put_block, self.block, 7, self.salt))
6824+        return d
6825+
6826+
6827+    def test_write_rejected_with_invalid_salt(self):
6828+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6829+        # less should cause an error.
6830+        mw = self._make_new_mw("si1", 0)
6831+        bad_salt = "a" * 17 # 17 bytes
6832+        d = defer.succeed(None)
6833+        d.addCallback(lambda ignored:
6834+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6835+                            None, mw.put_block, self.block, 7, bad_salt))
6836+        return d
6837+
6838+
6839+    def test_write_rejected_with_invalid_root_hash(self):
6840+        # Try writing an invalid root hash. This should be SHA256d, and
6841+        # 32 bytes long as a result.
6842+        mw = self._make_new_mw("si2", 0)
6843+        # 17 bytes != 32 bytes
6844+        invalid_root_hash = "a" * 17
6845+        d = defer.succeed(None)
6846+        # Before this test can work, we need to put some blocks + salts,
6847+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6848+        # failures that match what we are looking for, but are caused by
6849+        # the constraints imposed on operation ordering.
6850+        for i in xrange(6):
6851+            d.addCallback(lambda ignored, i=i:
6852+                mw.put_block(self.block, i, self.salt))
6853+        d.addCallback(lambda ignored:
6854+            mw.put_encprivkey(self.encprivkey))
6855+        d.addCallback(lambda ignored:
6856+            mw.put_blockhashes(self.block_hash_tree))
6857+        d.addCallback(lambda ignored:
6858+            mw.put_sharehashes(self.share_hash_chain))
6859+        d.addCallback(lambda ignored:
6860+            self.shouldFail(LayoutInvalid, "invalid root hash",
6861+                            None, mw.put_root_hash, invalid_root_hash))
6862+        return d
6863+
6864+
6865+    def test_write_rejected_with_invalid_blocksize(self):
6866+        # The blocksize implied by the writer that we get from
6867+        # _make_new_mw is 2bytes -- any more or any less than this
6868+        # should be cause for failure, unless it is the tail segment, in
6869+        # which case it may not be failure.
6870+        invalid_block = "a"
6871+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6872+                                             # one byte blocks
6873+        # 1 bytes != 2 bytes
6874+        d = defer.succeed(None)
6875+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6876+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6877+                            None, mw.put_block, invalid_block, 0,
6878+                            self.salt))
6879+        invalid_block = invalid_block * 3
6880+        # 3 bytes != 2 bytes
6881+        d.addCallback(lambda ignored:
6882+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6883+                            None,
6884+                            mw.put_block, invalid_block, 0, self.salt))
6885+        for i in xrange(5):
6886+            d.addCallback(lambda ignored, i=i:
6887+                mw.put_block(self.block, i, self.salt))
6888+        # Try to put an invalid tail segment
6889+        d.addCallback(lambda ignored:
6890+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6891+                            None,
6892+                            mw.put_block, self.block, 5, self.salt))
6893+        valid_block = "a"
6894+        d.addCallback(lambda ignored:
6895+            mw.put_block(valid_block, 5, self.salt))
6896+        return d
6897+
6898+
6899+    def test_write_enforces_order_constraints(self):
6900+        # We require that the MDMFSlotWriteProxy be interacted with in a
6901+        # specific way.
6902+        # That way is:
6903+        # 0: __init__
6904+        # 1: write blocks and salts
6905+        # 2: Write the encrypted private key
6906+        # 3: Write the block hashes
6907+        # 4: Write the share hashes
6908+        # 5: Write the root hash and salt hash
6909+        # 6: Write the signature and verification key
6910+        # 7: Write the file.
6911+        #
6912+        # Some of these can be performed out-of-order, and some can't.
6913+        # The dependencies that I want to test here are:
6914+        #  - Private key before block hashes
6915+        #  - share hashes and block hashes before root hash
6916+        #  - root hash before signature
6917+        #  - signature before verification key
6918+        mw0 = self._make_new_mw("si0", 0)
6919+        # Write some shares
6920+        d = defer.succeed(None)
6921+        for i in xrange(6):
6922+            d.addCallback(lambda ignored, i=i:
6923+                mw0.put_block(self.block, i, self.salt))
6924+        # Try to write the block hashes before writing the encrypted
6925+        # private key
6926+        d.addCallback(lambda ignored:
6927+            self.shouldFail(LayoutInvalid, "block hashes before key",
6928+                            None, mw0.put_blockhashes,
6929+                            self.block_hash_tree))
6930+
6931+        # Write the private key.
6932+        d.addCallback(lambda ignored:
6933+            mw0.put_encprivkey(self.encprivkey))
6934+
6935+
6936+        # Try to write the share hash chain without writing the block
6937+        # hash tree
6938+        d.addCallback(lambda ignored:
6939+            self.shouldFail(LayoutInvalid, "share hash chain before "
6940+                                           "salt hash tree",
6941+                            None,
6942+                            mw0.put_sharehashes, self.share_hash_chain))
6943+
6944+        # Try to write the root hash and without writing either the
6945+        # block hashes or the or the share hashes
6946+        d.addCallback(lambda ignored:
6947+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6948+                            None,
6949+                            mw0.put_root_hash, self.root_hash))
6950+
6951+        # Now write the block hashes and try again
6952+        d.addCallback(lambda ignored:
6953+            mw0.put_blockhashes(self.block_hash_tree))
6954+
6955+        d.addCallback(lambda ignored:
6956+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6957+                            None, mw0.put_root_hash, self.root_hash))
6958+
6959+        # We haven't yet put the root hash on the share, so we shouldn't
6960+        # be able to sign it.
6961+        d.addCallback(lambda ignored:
6962+            self.shouldFail(LayoutInvalid, "signature before root hash",
6963+                            None, mw0.put_signature, self.signature))
6964+
6965+        d.addCallback(lambda ignored:
6966+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6967+
6968+        # ..and, since that fails, we also shouldn't be able to put the
6969+        # verification key.
6970+        d.addCallback(lambda ignored:
6971+            self.shouldFail(LayoutInvalid, "key before signature",
6972+                            None, mw0.put_verification_key,
6973+                            self.verification_key))
6974+
6975+        # Now write the share hashes.
6976+        d.addCallback(lambda ignored:
6977+            mw0.put_sharehashes(self.share_hash_chain))
6978+        # We should be able to write the root hash now too
6979+        d.addCallback(lambda ignored:
6980+            mw0.put_root_hash(self.root_hash))
6981+
6982+        # We should still be unable to put the verification key
6983+        d.addCallback(lambda ignored:
6984+            self.shouldFail(LayoutInvalid, "key before signature",
6985+                            None, mw0.put_verification_key,
6986+                            self.verification_key))
6987+
6988+        d.addCallback(lambda ignored:
6989+            mw0.put_signature(self.signature))
6990+
6991+        # We shouldn't be able to write the offsets to the remote server
6992+        # until the offset table is finished; IOW, until we have written
6993+        # the verification key.
6994+        d.addCallback(lambda ignored:
6995+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6996+                            None,
6997+                            mw0.finish_publishing))
6998+
6999+        d.addCallback(lambda ignored:
7000+            mw0.put_verification_key(self.verification_key))
7001+        return d
7002+
7003+
7004+    def test_end_to_end(self):
7005+        mw = self._make_new_mw("si1", 0)
7006+        # Write a share using the mutable writer, and make sure that the
7007+        # reader knows how to read everything back to us.
7008+        d = defer.succeed(None)
7009+        for i in xrange(6):
7010+            d.addCallback(lambda ignored, i=i:
7011+                mw.put_block(self.block, i, self.salt))
7012+        d.addCallback(lambda ignored:
7013+            mw.put_encprivkey(self.encprivkey))
7014+        d.addCallback(lambda ignored:
7015+            mw.put_blockhashes(self.block_hash_tree))
7016+        d.addCallback(lambda ignored:
7017+            mw.put_sharehashes(self.share_hash_chain))
7018+        d.addCallback(lambda ignored:
7019+            mw.put_root_hash(self.root_hash))
7020+        d.addCallback(lambda ignored:
7021+            mw.put_signature(self.signature))
7022+        d.addCallback(lambda ignored:
7023+            mw.put_verification_key(self.verification_key))
7024+        d.addCallback(lambda ignored:
7025+            mw.finish_publishing())
7026+
7027+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7028+        def _check_block_and_salt((block, salt)):
7029+            self.failUnlessEqual(block, self.block)
7030+            self.failUnlessEqual(salt, self.salt)
7031+
7032+        for i in xrange(6):
7033+            d.addCallback(lambda ignored, i=i:
7034+                mr.get_block_and_salt(i))
7035+            d.addCallback(_check_block_and_salt)
7036+
7037+        d.addCallback(lambda ignored:
7038+            mr.get_encprivkey())
7039+        d.addCallback(lambda encprivkey:
7040+            self.failUnlessEqual(self.encprivkey, encprivkey))
7041+
7042+        d.addCallback(lambda ignored:
7043+            mr.get_blockhashes())
7044+        d.addCallback(lambda blockhashes:
7045+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
7046+
7047+        d.addCallback(lambda ignored:
7048+            mr.get_sharehashes())
7049+        d.addCallback(lambda sharehashes:
7050+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
7051+
7052+        d.addCallback(lambda ignored:
7053+            mr.get_signature())
7054+        d.addCallback(lambda signature:
7055+            self.failUnlessEqual(signature, self.signature))
7056+
7057+        d.addCallback(lambda ignored:
7058+            mr.get_verification_key())
7059+        d.addCallback(lambda verification_key:
7060+            self.failUnlessEqual(verification_key, self.verification_key))
7061+
7062+        d.addCallback(lambda ignored:
7063+            mr.get_seqnum())
7064+        d.addCallback(lambda seqnum:
7065+            self.failUnlessEqual(seqnum, 0))
7066+
7067+        d.addCallback(lambda ignored:
7068+            mr.get_root_hash())
7069+        d.addCallback(lambda root_hash:
7070+            self.failUnlessEqual(self.root_hash, root_hash))
7071+
7072+        d.addCallback(lambda ignored:
7073+            mr.get_encoding_parameters())
7074+        def _check_encoding_parameters((k, n, segsize, datalen)):
7075+            self.failUnlessEqual(k, 3)
7076+            self.failUnlessEqual(n, 10)
7077+            self.failUnlessEqual(segsize, 6)
7078+            self.failUnlessEqual(datalen, 36)
7079+        d.addCallback(_check_encoding_parameters)
7080+
7081+        d.addCallback(lambda ignored:
7082+            mr.get_checkstring())
7083+        d.addCallback(lambda checkstring:
7084+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
7085+        return d
7086+
7087+
7088+    def test_is_sdmf(self):
7089+        # The MDMFSlotReadProxy should also know how to read SDMF files,
7090+        # since it will encounter them on the grid. Callers use the
7091+        # is_sdmf method to test this.
7092+        self.write_sdmf_share_to_server("si1")
7093+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7094+        d = mr.is_sdmf()
7095+        d.addCallback(lambda issdmf:
7096+            self.failUnless(issdmf))
7097+        return d
7098+
7099+
7100+    def test_reads_sdmf(self):
7101+        # The slot read proxy should, naturally, know how to tell us
7102+        # about data in the SDMF format
7103+        self.write_sdmf_share_to_server("si1")
7104+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7105+        d = defer.succeed(None)
7106+        d.addCallback(lambda ignored:
7107+            mr.is_sdmf())
7108+        d.addCallback(lambda issdmf:
7109+            self.failUnless(issdmf))
7110+
7111+        # What do we need to read?
7112+        #  - The sharedata
7113+        #  - The salt
7114+        d.addCallback(lambda ignored:
7115+            mr.get_block_and_salt(0))
7116+        def _check_block_and_salt(results):
7117+            block, salt = results
7118+            # Our original file is 36 bytes long. Then each share is 12
7119+            # bytes in size. The share is composed entirely of the
7120+            # letter a. self.block contains 2 as, so 6 * self.block is
7121+            # what we are looking for.
7122+            self.failUnlessEqual(block, self.block * 6)
7123+            self.failUnlessEqual(salt, self.salt)
7124+        d.addCallback(_check_block_and_salt)
7125+
7126+        #  - The blockhashes
7127+        d.addCallback(lambda ignored:
7128+            mr.get_blockhashes())
7129+        d.addCallback(lambda blockhashes:
7130+            self.failUnlessEqual(self.block_hash_tree,
7131+                                 blockhashes,
7132+                                 blockhashes))
7133+        #  - The sharehashes
7134+        d.addCallback(lambda ignored:
7135+            mr.get_sharehashes())
7136+        d.addCallback(lambda sharehashes:
7137+            self.failUnlessEqual(self.share_hash_chain,
7138+                                 sharehashes))
7139+        #  - The keys
7140+        d.addCallback(lambda ignored:
7141+            mr.get_encprivkey())
7142+        d.addCallback(lambda encprivkey:
7143+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
7144+        d.addCallback(lambda ignored:
7145+            mr.get_verification_key())
7146+        d.addCallback(lambda verification_key:
7147+            self.failUnlessEqual(verification_key,
7148+                                 self.verification_key,
7149+                                 verification_key))
7150+        #  - The signature
7151+        d.addCallback(lambda ignored:
7152+            mr.get_signature())
7153+        d.addCallback(lambda signature:
7154+            self.failUnlessEqual(signature, self.signature, signature))
7155+
7156+        #  - The sequence number
7157+        d.addCallback(lambda ignored:
7158+            mr.get_seqnum())
7159+        d.addCallback(lambda seqnum:
7160+            self.failUnlessEqual(seqnum, 0, seqnum))
7161+
7162+        #  - The root hash
7163+        d.addCallback(lambda ignored:
7164+            mr.get_root_hash())
7165+        d.addCallback(lambda root_hash:
7166+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
7167+        return d
7168+
7169+
7170+    def test_only_reads_one_segment_sdmf(self):
7171+        # SDMF shares have only one segment, so it doesn't make sense to
7172+        # read more segments than that. The reader should know this and
7173+        # complain if we try to do that.
7174+        self.write_sdmf_share_to_server("si1")
7175+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7176+        d = defer.succeed(None)
7177+        d.addCallback(lambda ignored:
7178+            mr.is_sdmf())
7179+        d.addCallback(lambda issdmf:
7180+            self.failUnless(issdmf))
7181+        d.addCallback(lambda ignored:
7182+            self.shouldFail(LayoutInvalid, "test bad segment",
7183+                            None,
7184+                            mr.get_block_and_salt, 1))
7185+        return d
7186+
7187+
7188+    def test_read_with_prefetched_mdmf_data(self):
7189+        # The MDMFSlotReadProxy will prefill certain fields if you pass
7190+        # it data that you have already fetched. This is useful for
7191+        # cases like the Servermap, which prefetches ~2kb of data while
7192+        # finding out which shares are on the remote peer so that it
7193+        # doesn't waste round trips.
7194+        mdmf_data = self.build_test_mdmf_share()
7195+        self.write_test_share_to_server("si1")
7196+        def _make_mr(ignored, length):
7197+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
7198+            return mr
7199+
7200+        d = defer.succeed(None)
7201+        # This should be enough to fill in both the encoding parameters
7202+        # and the table of offsets, which will complete the version
7203+        # information tuple.
7204+        d.addCallback(_make_mr, 107)
7205+        d.addCallback(lambda mr:
7206+            mr.get_verinfo())
7207+        def _check_verinfo(verinfo):
7208+            self.failUnless(verinfo)
7209+            self.failUnlessEqual(len(verinfo), 9)
7210+            (seqnum,
7211+             root_hash,
7212+             salt_hash,
7213+             segsize,
7214+             datalen,
7215+             k,
7216+             n,
7217+             prefix,
7218+             offsets) = verinfo
7219+            self.failUnlessEqual(seqnum, 0)
7220+            self.failUnlessEqual(root_hash, self.root_hash)
7221+            self.failUnlessEqual(segsize, 6)
7222+            self.failUnlessEqual(datalen, 36)
7223+            self.failUnlessEqual(k, 3)
7224+            self.failUnlessEqual(n, 10)
7225+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7226+                                          1,
7227+                                          seqnum,
7228+                                          root_hash,
7229+                                          k,
7230+                                          n,
7231+                                          segsize,
7232+                                          datalen)
7233+            self.failUnlessEqual(expected_prefix, prefix)
7234+            self.failUnlessEqual(self.rref.read_count, 0)
7235+        d.addCallback(_check_verinfo)
7236+        # This is not enough data to read a block and a share, so the
7237+        # wrapper should attempt to read this from the remote server.
7238+        d.addCallback(_make_mr, 107)
7239+        d.addCallback(lambda mr:
7240+            mr.get_block_and_salt(0))
7241+        def _check_block_and_salt((block, salt)):
7242+            self.failUnlessEqual(block, self.block)
7243+            self.failUnlessEqual(salt, self.salt)
7244+            self.failUnlessEqual(self.rref.read_count, 1)
7245+        # This should be enough data to read one block.
7246+        d.addCallback(_make_mr, 249)
7247+        d.addCallback(lambda mr:
7248+            mr.get_block_and_salt(0))
7249+        d.addCallback(_check_block_and_salt)
7250+        return d
7251+
7252+
7253+    def test_read_with_prefetched_sdmf_data(self):
7254+        sdmf_data = self.build_test_sdmf_share()
7255+        self.write_sdmf_share_to_server("si1")
7256+        def _make_mr(ignored, length):
7257+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7258+            return mr
7259+
7260+        d = defer.succeed(None)
7261+        # This should be enough to get us the encoding parameters,
7262+        # offset table, and everything else we need to build a verinfo
7263+        # string.
7264+        d.addCallback(_make_mr, 107)
7265+        d.addCallback(lambda mr:
7266+            mr.get_verinfo())
7267+        def _check_verinfo(verinfo):
7268+            self.failUnless(verinfo)
7269+            self.failUnlessEqual(len(verinfo), 9)
7270+            (seqnum,
7271+             root_hash,
7272+             salt,
7273+             segsize,
7274+             datalen,
7275+             k,
7276+             n,
7277+             prefix,
7278+             offsets) = verinfo
7279+            self.failUnlessEqual(seqnum, 0)
7280+            self.failUnlessEqual(root_hash, self.root_hash)
7281+            self.failUnlessEqual(salt, self.salt)
7282+            self.failUnlessEqual(segsize, 36)
7283+            self.failUnlessEqual(datalen, 36)
7284+            self.failUnlessEqual(k, 3)
7285+            self.failUnlessEqual(n, 10)
7286+            expected_prefix = struct.pack(SIGNED_PREFIX,
7287+                                          0,
7288+                                          seqnum,
7289+                                          root_hash,
7290+                                          salt,
7291+                                          k,
7292+                                          n,
7293+                                          segsize,
7294+                                          datalen)
7295+            self.failUnlessEqual(expected_prefix, prefix)
7296+            self.failUnlessEqual(self.rref.read_count, 0)
7297+        d.addCallback(_check_verinfo)
7298+        # This shouldn't be enough to read any share data.
7299+        d.addCallback(_make_mr, 107)
7300+        d.addCallback(lambda mr:
7301+            mr.get_block_and_salt(0))
7302+        def _check_block_and_salt((block, salt)):
7303+            self.failUnlessEqual(block, self.block * 6)
7304+            self.failUnlessEqual(salt, self.salt)
7305+            # TODO: Fix the read routine so that it reads only the data
7306+            #       that it has cached if it can't read all of it.
7307+            self.failUnlessEqual(self.rref.read_count, 2)
7308+
7309+        # This should be enough to read share data.
7310+        d.addCallback(_make_mr, self.offsets['share_data'])
7311+        d.addCallback(lambda mr:
7312+            mr.get_block_and_salt(0))
7313+        d.addCallback(_check_block_and_salt)
7314+        return d
7315+
7316+
7317+    def test_read_with_empty_mdmf_file(self):
7318+        # Some tests upload a file with no contents to test things
7319+        # unrelated to the actual handling of the content of the file.
7320+        # The reader should behave intelligently in these cases.
7321+        self.write_test_share_to_server("si1", empty=True)
7322+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7323+        # We should be able to get the encoding parameters, and they
7324+        # should be correct.
7325+        d = defer.succeed(None)
7326+        d.addCallback(lambda ignored:
7327+            mr.get_encoding_parameters())
7328+        def _check_encoding_parameters(params):
7329+            self.failUnlessEqual(len(params), 4)
7330+            k, n, segsize, datalen = params
7331+            self.failUnlessEqual(k, 3)
7332+            self.failUnlessEqual(n, 10)
7333+            self.failUnlessEqual(segsize, 0)
7334+            self.failUnlessEqual(datalen, 0)
7335+        d.addCallback(_check_encoding_parameters)
7336+
7337+        # We should not be able to fetch a block, since there are no
7338+        # blocks to fetch
7339+        d.addCallback(lambda ignored:
7340+            self.shouldFail(LayoutInvalid, "get block on empty file",
7341+                            None,
7342+                            mr.get_block_and_salt, 0))
7343+        return d
7344+
7345+
7346+    def test_read_with_empty_sdmf_file(self):
7347+        self.write_sdmf_share_to_server("si1", empty=True)
7348+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7349+        # We should be able to get the encoding parameters, and they
7350+        # should be correct
7351+        d = defer.succeed(None)
7352+        d.addCallback(lambda ignored:
7353+            mr.get_encoding_parameters())
7354+        def _check_encoding_parameters(params):
7355+            self.failUnlessEqual(len(params), 4)
7356+            k, n, segsize, datalen = params
7357+            self.failUnlessEqual(k, 3)
7358+            self.failUnlessEqual(n, 10)
7359+            self.failUnlessEqual(segsize, 0)
7360+            self.failUnlessEqual(datalen, 0)
7361+        d.addCallback(_check_encoding_parameters)
7362+
7363+        # It does not make sense to get a block in this format, so we
7364+        # should not be able to.
7365+        d.addCallback(lambda ignored:
7366+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7367+                            None,
7368+                            mr.get_block_and_salt, 0))
7369+        return d
7370+
7371+
7372+    def test_verinfo_with_sdmf_file(self):
7373+        self.write_sdmf_share_to_server("si1")
7374+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7375+        # We should be able to get the version information.
7376+        d = defer.succeed(None)
7377+        d.addCallback(lambda ignored:
7378+            mr.get_verinfo())
7379+        def _check_verinfo(verinfo):
7380+            self.failUnless(verinfo)
7381+            self.failUnlessEqual(len(verinfo), 9)
7382+            (seqnum,
7383+             root_hash,
7384+             salt,
7385+             segsize,
7386+             datalen,
7387+             k,
7388+             n,
7389+             prefix,
7390+             offsets) = verinfo
7391+            self.failUnlessEqual(seqnum, 0)
7392+            self.failUnlessEqual(root_hash, self.root_hash)
7393+            self.failUnlessEqual(salt, self.salt)
7394+            self.failUnlessEqual(segsize, 36)
7395+            self.failUnlessEqual(datalen, 36)
7396+            self.failUnlessEqual(k, 3)
7397+            self.failUnlessEqual(n, 10)
7398+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7399+                                          0,
7400+                                          seqnum,
7401+                                          root_hash,
7402+                                          salt,
7403+                                          k,
7404+                                          n,
7405+                                          segsize,
7406+                                          datalen)
7407+            self.failUnlessEqual(prefix, expected_prefix)
7408+            self.failUnlessEqual(offsets, self.offsets)
7409+        d.addCallback(_check_verinfo)
7410+        return d
7411+
7412+
7413+    def test_verinfo_with_mdmf_file(self):
7414+        self.write_test_share_to_server("si1")
7415+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7416+        d = defer.succeed(None)
7417+        d.addCallback(lambda ignored:
7418+            mr.get_verinfo())
7419+        def _check_verinfo(verinfo):
7420+            self.failUnless(verinfo)
7421+            self.failUnlessEqual(len(verinfo), 9)
7422+            (seqnum,
7423+             root_hash,
7424+             IV,
7425+             segsize,
7426+             datalen,
7427+             k,
7428+             n,
7429+             prefix,
7430+             offsets) = verinfo
7431+            self.failUnlessEqual(seqnum, 0)
7432+            self.failUnlessEqual(root_hash, self.root_hash)
7433+            self.failIf(IV)
7434+            self.failUnlessEqual(segsize, 6)
7435+            self.failUnlessEqual(datalen, 36)
7436+            self.failUnlessEqual(k, 3)
7437+            self.failUnlessEqual(n, 10)
7438+            expected_prefix = struct.pack(">BQ32s BBQQ",
7439+                                          1,
7440+                                          seqnum,
7441+                                          root_hash,
7442+                                          k,
7443+                                          n,
7444+                                          segsize,
7445+                                          datalen)
7446+            self.failUnlessEqual(prefix, expected_prefix)
7447+            self.failUnlessEqual(offsets, self.offsets)
7448+        d.addCallback(_check_verinfo)
7449+        return d
7450+
7451+
7452+    def test_reader_queue(self):
7453+        self.write_test_share_to_server('si1')
7454+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7455+        d1 = mr.get_block_and_salt(0, queue=True)
7456+        d2 = mr.get_blockhashes(queue=True)
7457+        d3 = mr.get_sharehashes(queue=True)
7458+        d4 = mr.get_signature(queue=True)
7459+        d5 = mr.get_verification_key(queue=True)
7460+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7461+        mr.flush()
7462+        def _print(results):
7463+            self.failUnlessEqual(len(results), 5)
7464+            # We have one read for version information and offsets, and
7465+            # one for everything else.
7466+            self.failUnlessEqual(self.rref.read_count, 2)
7467+            block, salt = results[0][1] # results[0] is a boolean that says
7468+                                           # whether or not the operation
7469+                                           # worked.
7470+            self.failUnlessEqual(self.block, block)
7471+            self.failUnlessEqual(self.salt, salt)
7472+
7473+            blockhashes = results[1][1]
7474+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7475+
7476+            sharehashes = results[2][1]
7477+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7478+
7479+            signature = results[3][1]
7480+            self.failUnlessEqual(self.signature, signature)
7481+
7482+            verification_key = results[4][1]
7483+            self.failUnlessEqual(self.verification_key, verification_key)
7484+        dl.addCallback(_print)
7485+        return dl
7486+
7487+
7488+    def test_sdmf_writer(self):
7489+        # Go through the motions of writing an SDMF share to the storage
7490+        # server. Then read the storage server to see that the share got
7491+        # written in the way that we think it should have.
7492+
7493+        # We do this first so that the necessary instance variables get
7494+        # set the way we want them for the tests below.
7495+        data = self.build_test_sdmf_share()
7496+        sdmfr = SDMFSlotWriteProxy(0,
7497+                                   self.rref,
7498+                                   "si1",
7499+                                   self.secrets,
7500+                                   0, 3, 10, 36, 36)
7501+        # Put the block and salt.
7502+        sdmfr.put_block(self.blockdata, 0, self.salt)
7503+
7504+        # Put the encprivkey
7505+        sdmfr.put_encprivkey(self.encprivkey)
7506+
7507+        # Put the block and share hash chains
7508+        sdmfr.put_blockhashes(self.block_hash_tree)
7509+        sdmfr.put_sharehashes(self.share_hash_chain)
7510+        sdmfr.put_root_hash(self.root_hash)
7511+
7512+        # Put the signature
7513+        sdmfr.put_signature(self.signature)
7514+
7515+        # Put the verification key
7516+        sdmfr.put_verification_key(self.verification_key)
7517+
7518+        # Now check to make sure that nothing has been written yet.
7519+        self.failUnlessEqual(self.rref.write_count, 0)
7520+
7521+        # Now finish publishing
7522+        d = sdmfr.finish_publishing()
7523+        def _then(ignored):
7524+            self.failUnlessEqual(self.rref.write_count, 1)
7525+            read = self.ss.remote_slot_readv
7526+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7527+                                 {0: [data]})
7528+        d.addCallback(_then)
7529+        return d
7530+
7531+
7532+    def test_sdmf_writer_preexisting_share(self):
7533+        data = self.build_test_sdmf_share()
7534+        self.write_sdmf_share_to_server("si1")
7535+
7536+        # Now there is a share on the storage server. To successfully
7537+        # write, we need to set the checkstring correctly. When we
7538+        # don't, no write should occur.
7539+        sdmfw = SDMFSlotWriteProxy(0,
7540+                                   self.rref,
7541+                                   "si1",
7542+                                   self.secrets,
7543+                                   1, 3, 10, 36, 36)
7544+        sdmfw.put_block(self.blockdata, 0, self.salt)
7545+
7546+        # Put the encprivkey
7547+        sdmfw.put_encprivkey(self.encprivkey)
7548+
7549+        # Put the block and share hash chains
7550+        sdmfw.put_blockhashes(self.block_hash_tree)
7551+        sdmfw.put_sharehashes(self.share_hash_chain)
7552+
7553+        # Put the root hash
7554+        sdmfw.put_root_hash(self.root_hash)
7555+
7556+        # Put the signature
7557+        sdmfw.put_signature(self.signature)
7558+
7559+        # Put the verification key
7560+        sdmfw.put_verification_key(self.verification_key)
7561+
7562+        # We shouldn't have a checkstring yet
7563+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7564+
7565+        d = sdmfw.finish_publishing()
7566+        def _then(results):
7567+            self.failIf(results[0])
7568+            # this is the correct checkstring
7569+            self._expected_checkstring = results[1][0][0]
7570+            return self._expected_checkstring
7571+
7572+        d.addCallback(_then)
7573+        d.addCallback(sdmfw.set_checkstring)
7574+        d.addCallback(lambda ignored:
7575+            sdmfw.get_checkstring())
7576+        d.addCallback(lambda checkstring:
7577+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7578+        d.addCallback(lambda ignored:
7579+            sdmfw.finish_publishing())
7580+        def _then_again(results):
7581+            self.failUnless(results[0])
7582+            read = self.ss.remote_slot_readv
7583+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7584+                                 {0: [struct.pack(">Q", 1)]})
7585+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7586+                                 {0: [data[9:]]})
7587+        d.addCallback(_then_again)
7588+        return d
7589+
7590+
7591 class Stats(unittest.TestCase):
7592 
7593     def setUp(self):
7594}
7595[mutable/retrieve.py: Modify the retrieval process to support MDMF
7596Kevan Carstensen <kevan@isnotajoke.com>**20100819003409
7597 Ignore-this: c03f4e41aaa0366a9bf44847f2caf9db
7598 
7599 The logic behind a mutable file download had to be adapted to work with
7600 segmented mutable files; this patch performs those adaptations. It also
7601 exposes some decoding and decrypting functionality to make partial-file
7602 updates a little easier, and supports efficient random-access downloads
7603 of parts of an MDMF file.
7604] {
7605hunk ./src/allmydata/mutable/retrieve.py 2
7606 
7607-import struct, time
7608+import time
7609 from itertools import count
7610 from zope.interface import implements
7611 from twisted.internet import defer
7612merger 0.0 (
7613hunk ./src/allmydata/mutable/retrieve.py 10
7614+from allmydata.util.dictutil import DictOfSets
7615hunk ./src/allmydata/mutable/retrieve.py 7
7616-from foolscap.api import DeadReferenceError, eventually, fireEventually
7617-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
7618-from allmydata.util import hashutil, idlib, log
7619+from twisted.internet.interfaces import IPushProducer, IConsumer
7620+from foolscap.api import eventually, fireEventually
7621+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
7622+                                 MDMF_VERSION, SDMF_VERSION
7623+from allmydata.util import hashutil, log, mathutil
7624)
7625hunk ./src/allmydata/mutable/retrieve.py 16
7626 from pycryptopp.publickey import rsa
7627 
7628 from allmydata.mutable.common import CorruptShareError, UncoordinatedWriteError
7629-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
7630+from allmydata.mutable.layout import MDMFSlotReadProxy
7631 
7632 class RetrieveStatus:
7633     implements(IRetrieveStatus)
7634hunk ./src/allmydata/mutable/retrieve.py 83
7635     # times, and each will have a separate response chain. However the
7636     # Retrieve object will remain tied to a specific version of the file, and
7637     # will use a single ServerMap instance.
7638+    implements(IPushProducer)
7639 
7640hunk ./src/allmydata/mutable/retrieve.py 85
7641-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
7642+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
7643+                 verify=False):
7644         self._node = filenode
7645         assert self._node.get_pubkey()
7646         self._storage_index = filenode.get_storage_index()
7647hunk ./src/allmydata/mutable/retrieve.py 104
7648         self.verinfo = verinfo
7649         # during repair, we may be called upon to grab the private key, since
7650         # it wasn't picked up during a verify=False checker run, and we'll
7651-        # need it for repair to generate the a new version.
7652-        self._need_privkey = fetch_privkey
7653-        if self._node.get_privkey():
7654+        # need it for repair to generate a new version.
7655+        self._need_privkey = fetch_privkey or verify
7656+        if self._node.get_privkey() and not verify:
7657             self._need_privkey = False
7658 
7659hunk ./src/allmydata/mutable/retrieve.py 109
7660+        if self._need_privkey:
7661+            # TODO: Evaluate the need for this. We'll use it if we want
7662+            # to limit how many queries are on the wire for the privkey
7663+            # at once.
7664+            self._privkey_query_markers = [] # one Marker for each time we've
7665+                                             # tried to get the privkey.
7666+
7667+        # verify means that we are using the downloader logic to verify all
7668+        # of our shares. This tells the downloader a few things.
7669+        #
7670+        # 1. We need to download all of the shares.
7671+        # 2. We don't need to decode or decrypt the shares, since our
7672+        #    caller doesn't care about the plaintext, only the
7673+        #    information about which shares are or are not valid.
7674+        # 3. When we are validating readers, we need to validate the
7675+        #    signature on the prefix. Do we? We already do this in the
7676+        #    servermap update?
7677+        self._verify = False
7678+        if verify:
7679+            self._verify = True
7680+
7681         self._status = RetrieveStatus()
7682         self._status.set_storage_index(self._storage_index)
7683         self._status.set_helper(False)
7684hunk ./src/allmydata/mutable/retrieve.py 139
7685          offsets_tuple) = self.verinfo
7686         self._status.set_size(datalength)
7687         self._status.set_encoding(k, N)
7688+        self.readers = {}
7689+        self._paused = False
7690+        self._paused_deferred = None
7691+        self._offset = None
7692+        self._read_length = None
7693+        self.log("got seqnum %d" % self.verinfo[0])
7694+
7695 
7696     def get_status(self):
7697         return self._status
7698hunk ./src/allmydata/mutable/retrieve.py 157
7699             kwargs["facility"] = "tahoe.mutable.retrieve"
7700         return log.msg(*args, **kwargs)
7701 
7702-    def download(self):
7703+
7704+    ###################
7705+    # IPushProducer
7706+
7707+    def pauseProducing(self):
7708+        """
7709+        I am called by my download target if we have produced too much
7710+        data for it to handle. I make the downloader stop producing new
7711+        data until my resumeProducing method is called.
7712+        """
7713+        if self._paused:
7714+            return
7715+
7716+        # fired when the download is unpaused.
7717+        self._old_status = self._status.get_status()
7718+        self._status.set_status("Paused")
7719+
7720+        self._pause_deferred = defer.Deferred()
7721+        self._paused = True
7722+
7723+
7724+    def resumeProducing(self):
7725+        """
7726+        I am called by my download target once it is ready to begin
7727+        receiving data again.
7728+        """
7729+        if not self._paused:
7730+            return
7731+
7732+        self._paused = False
7733+        p = self._pause_deferred
7734+        self._pause_deferred = None
7735+        self._status.set_status(self._old_status)
7736+
7737+        eventually(p.callback, None)
7738+
7739+
7740+    def _check_for_paused(self, res):
7741+        """
7742+        I am called just before a write to the consumer. I return a
7743+        Deferred that eventually fires with the data that is to be
7744+        written to the consumer. If the download has not been paused,
7745+        the Deferred fires immediately. Otherwise, the Deferred fires
7746+        when the downloader is unpaused.
7747+        """
7748+        if self._paused:
7749+            d = defer.Deferred()
7750+            self._pause_defered.addCallback(lambda ignored: d.callback(res))
7751+            return d
7752+        return defer.succeed(res)
7753+
7754+
7755+    def download(self, consumer=None, offset=0, size=None):
7756+        assert IConsumer.providedBy(consumer) or self._verify
7757+
7758+        if consumer:
7759+            self._consumer = consumer
7760+            # we provide IPushProducer, so streaming=True, per
7761+            # IConsumer.
7762+            self._consumer.registerProducer(self, streaming=True)
7763+
7764         self._done_deferred = defer.Deferred()
7765         self._started = time.time()
7766         self._status.set_status("Retrieving Shares")
7767hunk ./src/allmydata/mutable/retrieve.py 222
7768 
7769+        self._offset = offset
7770+        self._read_length = size
7771+
7772         # first, which servers can we use?
7773         versionmap = self.servermap.make_versionmap()
7774         shares = versionmap[self.verinfo]
7775hunk ./src/allmydata/mutable/retrieve.py 232
7776         self.remaining_sharemap = DictOfSets()
7777         for (shnum, peerid, timestamp) in shares:
7778             self.remaining_sharemap.add(shnum, peerid)
7779+            # If the servermap update fetched anything, it fetched at least 1
7780+            # KiB, so we ask for that much.
7781+            # TODO: Change the cache methods to allow us to fetch all of the
7782+            # data that they have, then change this method to do that.
7783+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
7784+                                                               shnum,
7785+                                                               0,
7786+                                                               1000)
7787+            ss = self.servermap.connections[peerid]
7788+            reader = MDMFSlotReadProxy(ss,
7789+                                       self._storage_index,
7790+                                       shnum,
7791+                                       any_cache)
7792+            reader.peerid = peerid
7793+            self.readers[shnum] = reader
7794+
7795 
7796         self.shares = {} # maps shnum to validated blocks
7797hunk ./src/allmydata/mutable/retrieve.py 250
7798+        self._active_readers = [] # list of active readers for this dl.
7799+        self._validated_readers = set() # set of readers that we have
7800+                                        # validated the prefix of
7801+        self._block_hash_trees = {} # shnum => hashtree
7802 
7803         # how many shares do we need?
7804hunk ./src/allmydata/mutable/retrieve.py 256
7805-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7806+        (seqnum,
7807+         root_hash,
7808+         IV,
7809+         segsize,
7810+         datalength,
7811+         k,
7812+         N,
7813+         prefix,
7814          offsets_tuple) = self.verinfo
7815hunk ./src/allmydata/mutable/retrieve.py 265
7816-        assert len(self.remaining_sharemap) >= k
7817-        # we start with the lowest shnums we have available, since FEC is
7818-        # faster if we're using "primary shares"
7819-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
7820-        for shnum in self.active_shnums:
7821-            # we use an arbitrary peer who has the share. If shares are
7822-            # doubled up (more than one share per peer), we could make this
7823-            # run faster by spreading the load among multiple peers. But the
7824-            # algorithm to do that is more complicated than I want to write
7825-            # right now, and a well-provisioned grid shouldn't have multiple
7826-            # shares per peer.
7827-            peerid = list(self.remaining_sharemap[shnum])[0]
7828-            self.get_data(shnum, peerid)
7829 
7830hunk ./src/allmydata/mutable/retrieve.py 266
7831-        # control flow beyond this point: state machine. Receiving responses
7832-        # from queries is the input. We might send out more queries, or we
7833-        # might produce a result.
7834 
7835hunk ./src/allmydata/mutable/retrieve.py 267
7836+        # We need one share hash tree for the entire file; its leaves
7837+        # are the roots of the block hash trees for the shares that
7838+        # comprise it, and its root is in the verinfo.
7839+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
7840+        self.share_hash_tree.set_hashes({0: root_hash})
7841+
7842+        # This will set up both the segment decoder and the tail segment
7843+        # decoder, as well as a variety of other instance variables that
7844+        # the download process will use.
7845+        self._setup_encoding_parameters()
7846+        assert len(self.remaining_sharemap) >= k
7847+
7848+        self.log("starting download")
7849+        self._paused = False
7850+        self._started_fetching = time.time()
7851+
7852+        self._add_active_peers()
7853+        # The download process beyond this is a state machine.
7854+        # _add_active_peers will select the peers that we want to use
7855+        # for the download, and then attempt to start downloading. After
7856+        # each segment, it will check for doneness, reacting to broken
7857+        # peers and corrupt shares as necessary. If it runs out of good
7858+        # peers before downloading all of the segments, _done_deferred
7859+        # will errback.  Otherwise, it will eventually callback with the
7860+        # contents of the mutable file.
7861         return self._done_deferred
7862 
7863hunk ./src/allmydata/mutable/retrieve.py 294
7864-    def get_data(self, shnum, peerid):
7865-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
7866-                 shnum=shnum,
7867-                 peerid=idlib.shortnodeid_b2a(peerid),
7868-                 level=log.NOISY)
7869-        ss = self.servermap.connections[peerid]
7870-        started = time.time()
7871-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
7872+
7873+    def decode(self, blocks_and_salts, segnum):
7874+        """
7875+        I am a helper method that the mutable file update process uses
7876+        as a shortcut to decode and decrypt the segments that it needs
7877+        to fetch in order to perform a file update. I take in a
7878+        collection of blocks and salts, and pick some of those to make a
7879+        segment with. I return the plaintext associated with that
7880+        segment.
7881+        """
7882+        # shnum => block hash tree. Unusued, but setup_encoding_parameters will
7883+        # want to set this.
7884+        # XXX: Make it so that it won't set this if we're just decoding.
7885+        self._block_hash_trees = {}
7886+        self._setup_encoding_parameters()
7887+        # This is the form expected by decode.
7888+        blocks_and_salts = blocks_and_salts.items()
7889+        blocks_and_salts = [(True, [d]) for d in blocks_and_salts]
7890+
7891+        d = self._decode_blocks(blocks_and_salts, segnum)
7892+        d.addCallback(self._decrypt_segment)
7893+        return d
7894+
7895+
7896+    def _setup_encoding_parameters(self):
7897+        """
7898+        I set up the encoding parameters, including k, n, the number
7899+        of segments associated with this file, and the segment decoder.
7900+        """
7901+        (seqnum,
7902+         root_hash,
7903+         IV,
7904+         segsize,
7905+         datalength,
7906+         k,
7907+         n,
7908+         known_prefix,
7909          offsets_tuple) = self.verinfo
7910hunk ./src/allmydata/mutable/retrieve.py 332
7911-        offsets = dict(offsets_tuple)
7912+        self._required_shares = k
7913+        self._total_shares = n
7914+        self._segment_size = segsize
7915+        self._data_length = datalength
7916 
7917hunk ./src/allmydata/mutable/retrieve.py 337
7918-        # we read the checkstring, to make sure that the data we grab is from
7919-        # the right version.
7920-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
7921+        if not IV:
7922+            self._version = MDMF_VERSION
7923+        else:
7924+            self._version = SDMF_VERSION
7925 
7926hunk ./src/allmydata/mutable/retrieve.py 342
7927-        # We also read the data, and the hashes necessary to validate them
7928-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
7929-        # signature or the pubkey, since that was handled during the
7930-        # servermap phase, and we'll be comparing the share hash chain
7931-        # against the roothash that was validated back then.
7932+        if datalength and segsize:
7933+            self._num_segments = mathutil.div_ceil(datalength, segsize)
7934+            self._tail_data_size = datalength % segsize
7935+        else:
7936+            self._num_segments = 0
7937+            self._tail_data_size = 0
7938 
7939hunk ./src/allmydata/mutable/retrieve.py 349
7940-        readv.append( (offsets['share_hash_chain'],
7941-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
7942+        self._segment_decoder = codec.CRSDecoder()
7943+        self._segment_decoder.set_params(segsize, k, n)
7944 
7945hunk ./src/allmydata/mutable/retrieve.py 352
7946-        # if we need the private key (for repair), we also fetch that
7947-        if self._need_privkey:
7948-            readv.append( (offsets['enc_privkey'],
7949-                           offsets['EOF'] - offsets['enc_privkey']) )
7950+        if  not self._tail_data_size:
7951+            self._tail_data_size = segsize
7952+
7953+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
7954+                                                         self._required_shares)
7955+        if self._tail_segment_size == self._segment_size:
7956+            self._tail_decoder = self._segment_decoder
7957+        else:
7958+            self._tail_decoder = codec.CRSDecoder()
7959+            self._tail_decoder.set_params(self._tail_segment_size,
7960+                                          self._required_shares,
7961+                                          self._total_shares)
7962 
7963hunk ./src/allmydata/mutable/retrieve.py 365
7964-        m = Marker()
7965-        self._outstanding_queries[m] = (peerid, shnum, started)
7966+        self.log("got encoding parameters: "
7967+                 "k: %d "
7968+                 "n: %d "
7969+                 "%d segments of %d bytes each (%d byte tail segment)" % \
7970+                 (k, n, self._num_segments, self._segment_size,
7971+                  self._tail_segment_size))
7972 
7973         # ask the cache first
7974         got_from_cache = False
7975merger 0.0 (
7976hunk ./src/allmydata/mutable/retrieve.py 376
7977-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
7978-                                                            offset, length)
7979+            data = self._node._read_from_cache(self.verinfo, shnum, offset, length)
7980hunk ./src/allmydata/mutable/retrieve.py 372
7981-        # ask the cache first
7982-        got_from_cache = False
7983-        datavs = []
7984-        for (offset, length) in readv:
7985-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
7986-                                                            offset, length)
7987-            if data is not None:
7988-                datavs.append(data)
7989-        if len(datavs) == len(readv):
7990-            self.log("got data from cache")
7991-            got_from_cache = True
7992-            d = fireEventually({shnum: datavs})
7993-            # datavs is a dict mapping shnum to a pair of strings
7994+        for i in xrange(self._total_shares):
7995+            # So we don't have to do this later.
7996+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
7997+
7998+        # Our last task is to tell the downloader where to start and
7999+        # where to stop. We use three parameters for that:
8000+        #   - self._start_segment: the segment that we need to start
8001+        #     downloading from.
8002+        #   - self._current_segment: the next segment that we need to
8003+        #     download.
8004+        #   - self._last_segment: The last segment that we were asked to
8005+        #     download.
8006+        #
8007+        #  We say that the download is complete when
8008+        #  self._current_segment > self._last_segment. We use
8009+        #  self._start_segment and self._last_segment to know when to
8010+        #  strip things off of segments, and how much to strip.
8011+        if self._offset:
8012+            self.log("got offset: %d" % self._offset)
8013+            # our start segment is the first segment containing the
8014+            # offset we were given.
8015+            start = mathutil.div_ceil(self._offset,
8016+                                      self._segment_size)
8017+            # this gets us the first segment after self._offset. Then
8018+            # our start segment is the one before it.
8019+            start -= 1
8020+
8021+            assert start < self._num_segments
8022+            self._start_segment = start
8023+            self.log("got start segment: %d" % self._start_segment)
8024)
8025hunk ./src/allmydata/mutable/retrieve.py 386
8026             d = fireEventually({shnum: datavs})
8027             # datavs is a dict mapping shnum to a pair of strings
8028         else:
8029-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
8030-        self.remaining_sharemap.discard(shnum, peerid)
8031+            self._start_segment = 0
8032 
8033hunk ./src/allmydata/mutable/retrieve.py 388
8034-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
8035-        d.addErrback(self._query_failed, m, peerid)
8036-        # errors that aren't handled by _query_failed (and errors caused by
8037-        # _query_failed) get logged, but we still want to check for doneness.
8038-        def _oops(f):
8039-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
8040-                     shnum=shnum,
8041-                     peerid=idlib.shortnodeid_b2a(peerid),
8042-                     failure=f,
8043-                     level=log.WEIRD, umid="W0xnQA")
8044-        d.addErrback(_oops)
8045-        d.addBoth(self._check_for_done)
8046-        # any error during _check_for_done means the download fails. If the
8047-        # download is successful, _check_for_done will fire _done by itself.
8048-        d.addErrback(self._done)
8049-        d.addErrback(log.err)
8050-        return d # purely for testing convenience
8051 
8052hunk ./src/allmydata/mutable/retrieve.py 389
8053-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
8054-        # isolate the callRemote to a separate method, so tests can subclass
8055-        # Publish and override it
8056-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
8057-        return d
8058+        if self._read_length:
8059+            # our end segment is the last segment containing part of the
8060+            # segment that we were asked to read.
8061+            self.log("got read length %d" % self._read_length)
8062+            end_data = self._offset + self._read_length
8063+            end = mathutil.div_ceil(end_data,
8064+                                    self._segment_size)
8065+            end -= 1
8066+            assert end < self._num_segments
8067+            self._last_segment = end
8068+            self.log("got end segment: %d" % self._last_segment)
8069+        else:
8070+            self._last_segment = self._num_segments - 1
8071 
8072hunk ./src/allmydata/mutable/retrieve.py 403
8073-    def remove_peer(self, peerid):
8074-        for shnum in list(self.remaining_sharemap.keys()):
8075-            self.remaining_sharemap.discard(shnum, peerid)
8076+        self._current_segment = self._start_segment
8077 
8078hunk ./src/allmydata/mutable/retrieve.py 405
8079-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
8080-        now = time.time()
8081-        elapsed = now - started
8082-        if not got_from_cache:
8083-            self._status.add_fetch_timing(peerid, elapsed)
8084-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
8085-                 shares=len(datavs),
8086-                 peerid=idlib.shortnodeid_b2a(peerid),
8087-                 level=log.NOISY)
8088-        self._outstanding_queries.pop(marker, None)
8089-        if not self._running:
8090-            return
8091+    def _add_active_peers(self):
8092+        """
8093+        I populate self._active_readers with enough active readers to
8094+        retrieve the contents of this mutable file. I am called before
8095+        downloading starts, and (eventually) after each validation
8096+        error, connection error, or other problem in the download.
8097+        """
8098+        # TODO: It would be cool to investigate other heuristics for
8099+        # reader selection. For instance, the cost (in time the user
8100+        # spends waiting for their file) of selecting a really slow peer
8101+        # that happens to have a primary share is probably more than
8102+        # selecting a really fast peer that doesn't have a primary
8103+        # share. Maybe the servermap could be extended to provide this
8104+        # information; it could keep track of latency information while
8105+        # it gathers more important data, and then this routine could
8106+        # use that to select active readers.
8107+        #
8108+        # (these and other questions would be easier to answer with a
8109+        #  robust, configurable tahoe-lafs simulator, which modeled node
8110+        #  failures, differences in node speed, and other characteristics
8111+        #  that we expect storage servers to have.  You could have
8112+        #  presets for really stable grids (like allmydata.com),
8113+        #  friendnets, make it easy to configure your own settings, and
8114+        #  then simulate the effect of big changes on these use cases
8115+        #  instead of just reasoning about what the effect might be. Out
8116+        #  of scope for MDMF, though.)
8117 
8118hunk ./src/allmydata/mutable/retrieve.py 432
8119-        # note that we only ask for a single share per query, so we only
8120-        # expect a single share back. On the other hand, we use the extra
8121-        # shares if we get them.. seems better than an assert().
8122+        # We need at least self._required_shares readers to download a
8123+        # segment.
8124+        if self._verify:
8125+            needed = self._total_shares
8126+        else:
8127+            needed = self._required_shares - len(self._active_readers)
8128+        # XXX: Why don't format= log messages work here?
8129+        self.log("adding %d peers to the active peers list" % needed)
8130 
8131hunk ./src/allmydata/mutable/retrieve.py 441
8132-        for shnum,datav in datavs.items():
8133-            (prefix, hash_and_data) = datav[:2]
8134-            try:
8135-                self._got_results_one_share(shnum, peerid,
8136-                                            prefix, hash_and_data)
8137-            except CorruptShareError, e:
8138-                # log it and give the other shares a chance to be processed
8139-                f = failure.Failure()
8140-                self.log(format="bad share: %(f_value)s",
8141-                         f_value=str(f.value), failure=f,
8142-                         level=log.WEIRD, umid="7fzWZw")
8143-                self.notify_server_corruption(peerid, shnum, str(e))
8144-                self.remove_peer(peerid)
8145-                self.servermap.mark_bad_share(peerid, shnum, prefix)
8146-                self._bad_shares.add( (peerid, shnum) )
8147-                self._status.problems[peerid] = f
8148-                self._last_failure = f
8149-                pass
8150-            if self._need_privkey and len(datav) > 2:
8151-                lp = None
8152-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
8153-        # all done!
8154+        # We favor lower numbered shares, since FEC is faster with
8155+        # primary shares than with other shares, and lower-numbered
8156+        # shares are more likely to be primary than higher numbered
8157+        # shares.
8158+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
8159+        # We shouldn't consider adding shares that we already have; this
8160+        # will cause problems later.
8161+        active_shnums -= set([reader.shnum for reader in self._active_readers])
8162+        active_shnums = list(active_shnums)[:needed]
8163+        if len(active_shnums) < needed and not self._verify:
8164+            # We don't have enough readers to retrieve the file; fail.
8165+            return self._failed()
8166 
8167hunk ./src/allmydata/mutable/retrieve.py 454
8168-    def notify_server_corruption(self, peerid, shnum, reason):
8169-        ss = self.servermap.connections[peerid]
8170-        ss.callRemoteOnly("advise_corrupt_share",
8171-                          "mutable", self._storage_index, shnum, reason)
8172+        for shnum in active_shnums:
8173+            self._active_readers.append(self.readers[shnum])
8174+            self.log("added reader for share %d" % shnum)
8175+        assert len(self._active_readers) >= self._required_shares
8176+        # Conceptually, this is part of the _add_active_peers step. It
8177+        # validates the prefixes of newly added readers to make sure
8178+        # that they match what we are expecting for self.verinfo. If
8179+        # validation is successful, _validate_active_prefixes will call
8180+        # _download_current_segment for us. If validation is
8181+        # unsuccessful, then _validate_prefixes will remove the peer and
8182+        # call _add_active_peers again, where we will attempt to rectify
8183+        # the problem by choosing another peer.
8184+        return self._validate_active_prefixes()
8185 
8186hunk ./src/allmydata/mutable/retrieve.py 468
8187-    def _got_results_one_share(self, shnum, peerid,
8188-                               got_prefix, got_hash_and_data):
8189-        self.log("_got_results: got shnum #%d from peerid %s"
8190-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
8191-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8192-         offsets_tuple) = self.verinfo
8193-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
8194-        if got_prefix != prefix:
8195-            msg = "someone wrote to the data since we read the servermap: prefix changed"
8196-            raise UncoordinatedWriteError(msg)
8197-        (share_hash_chain, block_hash_tree,
8198-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
8199 
8200hunk ./src/allmydata/mutable/retrieve.py 469
8201-        assert isinstance(share_data, str)
8202-        # build the block hash tree. SDMF has only one leaf.
8203-        leaves = [hashutil.block_hash(share_data)]
8204-        t = hashtree.HashTree(leaves)
8205-        if list(t) != block_hash_tree:
8206-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
8207-        share_hash_leaf = t[0]
8208-        t2 = hashtree.IncompleteHashTree(N)
8209-        # root_hash was checked by the signature
8210-        t2.set_hashes({0: root_hash})
8211-        try:
8212-            t2.set_hashes(hashes=share_hash_chain,
8213-                          leaves={shnum: share_hash_leaf})
8214-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
8215-                IndexError), e:
8216-            msg = "corrupt hashes: %s" % (e,)
8217-            raise CorruptShareError(peerid, shnum, msg)
8218-        self.log(" data valid! len=%d" % len(share_data))
8219-        # each query comes down to this: placing validated share data into
8220-        # self.shares
8221-        self.shares[shnum] = share_data
8222+    def _validate_active_prefixes(self):
8223+        """
8224+        I check to make sure that the prefixes on the peers that I am
8225+        currently reading from match the prefix that we want to see, as
8226+        said in self.verinfo.
8227 
8228hunk ./src/allmydata/mutable/retrieve.py 475
8229-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
8230+        If I find that all of the active peers have acceptable prefixes,
8231+        I pass control to _download_current_segment, which will use
8232+        those peers to do cool things. If I find that some of the active
8233+        peers have unacceptable prefixes, I will remove them from active
8234+        peers (and from further consideration) and call
8235+        _add_active_peers to attempt to rectify the situation. I keep
8236+        track of which peers I have already validated so that I don't
8237+        need to do so again.
8238+        """
8239+        assert self._active_readers, "No more active readers"
8240 
8241hunk ./src/allmydata/mutable/retrieve.py 486
8242-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8243-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8244-        if alleged_writekey != self._node.get_writekey():
8245-            self.log("invalid privkey from %s shnum %d" %
8246-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
8247-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
8248-            return
8249+        ds = []
8250+        new_readers = set(self._active_readers) - self._validated_readers
8251+        self.log('validating %d newly-added active readers' % len(new_readers))
8252 
8253hunk ./src/allmydata/mutable/retrieve.py 490
8254-        # it's good
8255-        self.log("got valid privkey from shnum %d on peerid %s" %
8256-                 (shnum, idlib.shortnodeid_b2a(peerid)),
8257-                 parent=lp)
8258-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8259-        self._node._populate_encprivkey(enc_privkey)
8260-        self._node._populate_privkey(privkey)
8261-        self._need_privkey = False
8262+        for reader in new_readers:
8263+            # We force a remote read here -- otherwise, we are relying
8264+            # on cached data that we already verified as valid, and we
8265+            # won't detect an uncoordinated write that has occurred
8266+            # since the last servermap update.
8267+            d = reader.get_prefix(force_remote=True)
8268+            d.addCallback(self._try_to_validate_prefix, reader)
8269+            ds.append(d)
8270+        dl = defer.DeferredList(ds, consumeErrors=True)
8271+        def _check_results(results):
8272+            # Each result in results will be of the form (success, msg).
8273+            # We don't care about msg, but success will tell us whether
8274+            # or not the checkstring validated. If it didn't, we need to
8275+            # remove the offending (peer,share) from our active readers,
8276+            # and ensure that active readers is again populated.
8277+            bad_readers = []
8278+            for i, result in enumerate(results):
8279+                if not result[0]:
8280+                    reader = self._active_readers[i]
8281+                    f = result[1]
8282+                    assert isinstance(f, failure.Failure)
8283 
8284hunk ./src/allmydata/mutable/retrieve.py 512
8285-    def _query_failed(self, f, marker, peerid):
8286-        self.log(format="query to [%(peerid)s] failed",
8287-                 peerid=idlib.shortnodeid_b2a(peerid),
8288-                 level=log.NOISY)
8289-        self._status.problems[peerid] = f
8290-        self._outstanding_queries.pop(marker, None)
8291-        if not self._running:
8292-            return
8293-        self._last_failure = f
8294-        self.remove_peer(peerid)
8295-        level = log.WEIRD
8296-        if f.check(DeadReferenceError):
8297-            level = log.UNUSUAL
8298-        self.log(format="error during query: %(f_value)s",
8299-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
8300+                    self.log("The reader %s failed to "
8301+                             "properly validate: %s" % \
8302+                             (reader, str(f.value)))
8303+                    bad_readers.append((reader, f))
8304+                else:
8305+                    reader = self._active_readers[i]
8306+                    self.log("the reader %s checks out, so we'll use it" % \
8307+                             reader)
8308+                    self._validated_readers.add(reader)
8309+                    # Each time we validate a reader, we check to see if
8310+                    # we need the private key. If we do, we politely ask
8311+                    # for it and then continue computing. If we find
8312+                    # that we haven't gotten it at the end of
8313+                    # segment decoding, then we'll take more drastic
8314+                    # measures.
8315+                    if self._need_privkey and not self._node.is_readonly():
8316+                        d = reader.get_encprivkey()
8317+                        d.addCallback(self._try_to_validate_privkey, reader)
8318+            if bad_readers:
8319+                # We do them all at once, or else we screw up list indexing.
8320+                for (reader, f) in bad_readers:
8321+                    self._mark_bad_share(reader, f)
8322+                if self._verify:
8323+                    if len(self._active_readers) >= self._required_shares:
8324+                        return self._download_current_segment()
8325+                    else:
8326+                        return self._failed()
8327+                else:
8328+                    return self._add_active_peers()
8329+            else:
8330+                return self._download_current_segment()
8331+            # The next step will assert that it has enough active
8332+            # readers to fetch shares; we just need to remove it.
8333+        dl.addCallback(_check_results)
8334+        return dl
8335 
8336hunk ./src/allmydata/mutable/retrieve.py 548
8337-    def _check_for_done(self, res):
8338-        # exit paths:
8339-        #  return : keep waiting, no new queries
8340-        #  return self._send_more_queries(outstanding) : send some more queries
8341-        #  fire self._done(plaintext) : download successful
8342-        #  raise exception : download fails
8343 
8344hunk ./src/allmydata/mutable/retrieve.py 549
8345-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
8346-                 running=self._running, decoding=self._decoding,
8347-                 level=log.NOISY)
8348-        if not self._running:
8349-            return
8350-        if self._decoding:
8351-            return
8352-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8353+    def _try_to_validate_prefix(self, prefix, reader):
8354+        """
8355+        I check that the prefix returned by a candidate server for
8356+        retrieval matches the prefix that the servermap knows about
8357+        (and, hence, the prefix that was validated earlier). If it does,
8358+        I return True, which means that I approve of the use of the
8359+        candidate server for segment retrieval. If it doesn't, I return
8360+        False, which means that another server must be chosen.
8361+        """
8362+        (seqnum,
8363+         root_hash,
8364+         IV,
8365+         segsize,
8366+         datalength,
8367+         k,
8368+         N,
8369+         known_prefix,
8370          offsets_tuple) = self.verinfo
8371hunk ./src/allmydata/mutable/retrieve.py 567
8372+        if known_prefix != prefix:
8373+            self.log("prefix from share %d doesn't match" % reader.shnum)
8374+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
8375+                                          "indicate an uncoordinated write")
8376+        # Otherwise, we're okay -- no issues.
8377 
8378hunk ./src/allmydata/mutable/retrieve.py 573
8379-        if len(self.shares) < k:
8380-            # we don't have enough shares yet
8381-            return self._maybe_send_more_queries(k)
8382-        if self._need_privkey:
8383-            # we got k shares, but none of them had a valid privkey. TODO:
8384-            # look further. Adding code to do this is a bit complicated, and
8385-            # I want to avoid that complication, and this should be pretty
8386-            # rare (k shares with bitflips in the enc_privkey but not in the
8387-            # data blocks). If we actually do get here, the subsequent repair
8388-            # will fail for lack of a privkey.
8389-            self.log("got k shares but still need_privkey, bummer",
8390-                     level=log.WEIRD, umid="MdRHPA")
8391 
8392hunk ./src/allmydata/mutable/retrieve.py 574
8393-        # we have enough to finish. All the shares have had their hashes
8394-        # checked, so if something fails at this point, we don't know how
8395-        # to fix it, so the download will fail.
8396+    def _remove_reader(self, reader):
8397+        """
8398+        At various points, we will wish to remove a peer from
8399+        consideration and/or use. These include, but are not necessarily
8400+        limited to:
8401 
8402hunk ./src/allmydata/mutable/retrieve.py 580
8403-        self._decoding = True # avoid reentrancy
8404-        self._status.set_status("decoding")
8405-        now = time.time()
8406-        elapsed = now - self._started
8407-        self._status.timings["fetch"] = elapsed
8408+            - A connection error.
8409+            - A mismatched prefix (that is, a prefix that does not match
8410+              our conception of the version information string).
8411+            - A failing block hash, salt hash, or share hash, which can
8412+              indicate disk failure/bit flips, or network trouble.
8413 
8414hunk ./src/allmydata/mutable/retrieve.py 586
8415-        d = defer.maybeDeferred(self._decode)
8416-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
8417-        d.addBoth(self._done)
8418-        return d # purely for test convenience
8419+        This method will do that. I will make sure that the
8420+        (shnum,reader) combination represented by my reader argument is
8421+        not used for anything else during this download. I will not
8422+        advise the reader of any corruption, something that my callers
8423+        may wish to do on their own.
8424+        """
8425+        # TODO: When you're done writing this, see if this is ever
8426+        # actually used for something that _mark_bad_share isn't. I have
8427+        # a feeling that they will be used for very similar things, and
8428+        # that having them both here is just going to be an epic amount
8429+        # of code duplication.
8430+        #
8431+        # (well, okay, not epic, but meaningful)
8432+        self.log("removing reader %s" % reader)
8433+        # Remove the reader from _active_readers
8434+        self._active_readers.remove(reader)
8435+        # TODO: self.readers.remove(reader)?
8436+        for shnum in list(self.remaining_sharemap.keys()):
8437+            self.remaining_sharemap.discard(shnum, reader.peerid)
8438 
8439hunk ./src/allmydata/mutable/retrieve.py 606
8440-    def _maybe_send_more_queries(self, k):
8441-        # we don't have enough shares yet. Should we send out more queries?
8442-        # There are some number of queries outstanding, each for a single
8443-        # share. If we can generate 'needed_shares' additional queries, we do
8444-        # so. If we can't, then we know this file is a goner, and we raise
8445-        # NotEnoughSharesError.
8446-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
8447-                         "outstanding=%(outstanding)d"),
8448-                 have=len(self.shares), k=k,
8449-                 outstanding=len(self._outstanding_queries),
8450-                 level=log.NOISY)
8451 
8452hunk ./src/allmydata/mutable/retrieve.py 607
8453-        remaining_shares = k - len(self.shares)
8454-        needed = remaining_shares - len(self._outstanding_queries)
8455-        if not needed:
8456-            # we have enough queries in flight already
8457+    def _mark_bad_share(self, reader, f):
8458+        """
8459+        I mark the (peerid, shnum) encapsulated by my reader argument as
8460+        a bad share, which means that it will not be used anywhere else.
8461 
8462hunk ./src/allmydata/mutable/retrieve.py 612
8463-            # TODO: but if they've been in flight for a long time, and we
8464-            # have reason to believe that new queries might respond faster
8465-            # (i.e. we've seen other queries come back faster, then consider
8466-            # sending out new queries. This could help with peers which have
8467-            # silently gone away since the servermap was updated, for which
8468-            # we're still waiting for the 15-minute TCP disconnect to happen.
8469-            self.log("enough queries are in flight, no more are needed",
8470-                     level=log.NOISY)
8471-            return
8472+        There are several reasons to want to mark something as a bad
8473+        share. These include:
8474+
8475+            - A connection error to the peer.
8476+            - A mismatched prefix (that is, a prefix that does not match
8477+              our local conception of the version information string).
8478+            - A failing block hash, salt hash, share hash, or other
8479+              integrity check.
8480 
8481hunk ./src/allmydata/mutable/retrieve.py 621
8482-        outstanding_shnums = set([shnum
8483-                                  for (peerid, shnum, started)
8484-                                  in self._outstanding_queries.values()])
8485-        # prefer low-numbered shares, they are more likely to be primary
8486-        available_shnums = sorted(self.remaining_sharemap.keys())
8487-        for shnum in available_shnums:
8488-            if shnum in outstanding_shnums:
8489-                # skip ones that are already in transit
8490-                continue
8491-            if shnum not in self.remaining_sharemap:
8492-                # no servers for that shnum. note that DictOfSets removes
8493-                # empty sets from the dict for us.
8494-                continue
8495-            peerid = list(self.remaining_sharemap[shnum])[0]
8496-            # get_data will remove that peerid from the sharemap, and add the
8497-            # query to self._outstanding_queries
8498-            self._status.set_status("Retrieving More Shares")
8499-            self.get_data(shnum, peerid)
8500-            needed -= 1
8501-            if not needed:
8502+        This method will ensure that readers that we wish to mark bad
8503+        (for these reasons or other reasons) are not used for the rest
8504+        of the download. Additionally, it will attempt to tell the
8505+        remote peer (with no guarantee of success) that its share is
8506+        corrupt.
8507+        """
8508+        self.log("marking share %d on server %s as bad" % \
8509+                 (reader.shnum, reader))
8510+        prefix = self.verinfo[-2]
8511+        self.servermap.mark_bad_share(reader.peerid,
8512+                                      reader.shnum,
8513+                                      prefix)
8514+        self._remove_reader(reader)
8515+        self._bad_shares.add((reader.peerid, reader.shnum, f))
8516+        self._status.problems[reader.peerid] = f
8517+        self._last_failure = f
8518+        self.notify_server_corruption(reader.peerid, reader.shnum,
8519+                                      str(f.value))
8520+
8521+
8522+    def _download_current_segment(self):
8523+        """
8524+        I download, validate, decode, decrypt, and assemble the segment
8525+        that this Retrieve is currently responsible for downloading.
8526+        """
8527+        assert len(self._active_readers) >= self._required_shares
8528+        if self._current_segment <= self._last_segment:
8529+            d = self._process_segment(self._current_segment)
8530+        else:
8531+            d = defer.succeed(None)
8532+        d.addBoth(self._turn_barrier)
8533+        d.addCallback(self._check_for_done)
8534+        return d
8535+
8536+
8537+    def _turn_barrier(self, result):
8538+        """
8539+        I help the download process avoid the recursion limit issues
8540+        discussed in #237.
8541+        """
8542+        return fireEventually(result)
8543+
8544+
8545+    def _process_segment(self, segnum):
8546+        """
8547+        I download, validate, decode, and decrypt one segment of the
8548+        file that this Retrieve is retrieving. This means coordinating
8549+        the process of getting k blocks of that file, validating them,
8550+        assembling them into one segment with the decoder, and then
8551+        decrypting them.
8552+        """
8553+        self.log("processing segment %d" % segnum)
8554+
8555+        # TODO: The old code uses a marker. Should this code do that
8556+        # too? What did the Marker do?
8557+        assert len(self._active_readers) >= self._required_shares
8558+
8559+        # We need to ask each of our active readers for its block and
8560+        # salt. We will then validate those. If validation is
8561+        # successful, we will assemble the results into plaintext.
8562+        ds = []
8563+        for reader in self._active_readers:
8564+            started = time.time()
8565+            d = reader.get_block_and_salt(segnum, queue=True)
8566+            d2 = self._get_needed_hashes(reader, segnum)
8567+            dl = defer.DeferredList([d, d2], consumeErrors=True)
8568+            dl.addCallback(self._validate_block, segnum, reader, started)
8569+            dl.addErrback(self._validation_or_decoding_failed, [reader])
8570+            ds.append(dl)
8571+            reader.flush()
8572+        dl = defer.DeferredList(ds)
8573+        if self._verify:
8574+            dl.addCallback(lambda ignored: "")
8575+            dl.addCallback(self._set_segment)
8576+        else:
8577+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
8578+        return dl
8579+
8580+
8581+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
8582+        """
8583+        I take the results of fetching and validating the blocks from a
8584+        callback chain in another method. If the results are such that
8585+        they tell me that validation and fetching succeeded without
8586+        incident, I will proceed with decoding and decryption.
8587+        Otherwise, I will do nothing.
8588+        """
8589+        self.log("trying to decode and decrypt segment %d" % segnum)
8590+        failures = False
8591+        for block_and_salt in blocks_and_salts:
8592+            if not block_and_salt[0] or block_and_salt[1] == None:
8593+                self.log("some validation operations failed; not proceeding")
8594+                failures = True
8595                 break
8596hunk ./src/allmydata/mutable/retrieve.py 715
8597+        if not failures:
8598+            self.log("everything looks ok, building segment %d" % segnum)
8599+            d = self._decode_blocks(blocks_and_salts, segnum)
8600+            d.addCallback(self._decrypt_segment)
8601+            d.addErrback(self._validation_or_decoding_failed,
8602+                         self._active_readers)
8603+            # check to see whether we've been paused before writing
8604+            # anything.
8605+            d.addCallback(self._check_for_paused)
8606+            d.addCallback(self._set_segment)
8607+            return d
8608+        else:
8609+            return defer.succeed(None)
8610+
8611+
8612+    def _set_segment(self, segment):
8613+        """
8614+        Given a plaintext segment, I register that segment with the
8615+        target that is handling the file download.
8616+        """
8617+        self.log("got plaintext for segment %d" % self._current_segment)
8618+        if self._current_segment == self._start_segment:
8619+            # We're on the first segment. It's possible that we want
8620+            # only some part of the end of this segment, and that we
8621+            # just downloaded the whole thing to get that part. If so,
8622+            # we need to account for that and give the reader just the
8623+            # data that they want.
8624+            n = self._offset % self._segment_size
8625+            self.log("stripping %d bytes off of the first segment" % n)
8626+            self.log("original segment length: %d" % len(segment))
8627+            segment = segment[n:]
8628+            self.log("new segment length: %d" % len(segment))
8629+
8630+        if self._current_segment == self._last_segment and self._read_length is not None:
8631+            # We're on the last segment. It's possible that we only want
8632+            # part of the beginning of this segment, and that we
8633+            # downloaded the whole thing anyway. Make sure to give the
8634+            # caller only the portion of the segment that they want to
8635+            # receive.
8636+            extra = self._read_length
8637+            if self._start_segment != self._last_segment:
8638+                extra -= self._segment_size - \
8639+                            (self._offset % self._segment_size)
8640+            extra %= self._segment_size
8641+            self.log("original segment length: %d" % len(segment))
8642+            segment = segment[:extra]
8643+            self.log("new segment length: %d" % len(segment))
8644+            self.log("only taking %d bytes of the last segment" % extra)
8645+
8646+        if not self._verify:
8647+            self._consumer.write(segment)
8648+        else:
8649+            # we don't care about the plaintext if we are doing a verify.
8650+            segment = None
8651+        self._current_segment += 1
8652 
8653hunk ./src/allmydata/mutable/retrieve.py 771
8654-        # at this point, we have as many outstanding queries as we can. If
8655-        # needed!=0 then we might not have enough to recover the file.
8656-        if needed:
8657-            format = ("ran out of peers: "
8658-                      "have %(have)d shares (k=%(k)d), "
8659-                      "%(outstanding)d queries in flight, "
8660-                      "need %(need)d more, "
8661-                      "found %(bad)d bad shares")
8662-            args = {"have": len(self.shares),
8663-                    "k": k,
8664-                    "outstanding": len(self._outstanding_queries),
8665-                    "need": needed,
8666-                    "bad": len(self._bad_shares),
8667-                    }
8668-            self.log(format=format,
8669-                     level=log.WEIRD, umid="ezTfjw", **args)
8670-            err = NotEnoughSharesError("%s, last failure: %s" %
8671-                                      (format % args, self._last_failure))
8672-            if self._bad_shares:
8673-                self.log("We found some bad shares this pass. You should "
8674-                         "update the servermap and try again to check "
8675-                         "more peers",
8676-                         level=log.WEIRD, umid="EFkOlA")
8677-                err.servermap = self.servermap
8678-            raise err
8679 
8680hunk ./src/allmydata/mutable/retrieve.py 772
8681+    def _validation_or_decoding_failed(self, f, readers):
8682+        """
8683+        I am called when a block or a salt fails to correctly validate, or when
8684+        the decryption or decoding operation fails for some reason.  I react to
8685+        this failure by notifying the remote server of corruption, and then
8686+        removing the remote peer from further activity.
8687+        """
8688+        assert isinstance(readers, list)
8689+        bad_shnums = [reader.shnum for reader in readers]
8690+
8691+        self.log("validation or decoding failed on share(s) %s, peer(s) %s "
8692+                 ", segment %d: %s" % \
8693+                 (bad_shnums, readers, self._current_segment, str(f)))
8694+        for reader in readers:
8695+            self._mark_bad_share(reader, f)
8696         return
8697 
8698hunk ./src/allmydata/mutable/retrieve.py 789
8699-    def _decode(self):
8700-        started = time.time()
8701-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
8702-         offsets_tuple) = self.verinfo
8703 
8704hunk ./src/allmydata/mutable/retrieve.py 790
8705-        # shares_dict is a dict mapping shnum to share data, but the codec
8706-        # wants two lists.
8707-        shareids = []; shares = []
8708-        for shareid, share in self.shares.items():
8709+    def _validate_block(self, results, segnum, reader, started):
8710+        """
8711+        I validate a block from one share on a remote server.
8712+        """
8713+        # Grab the part of the block hash tree that is necessary to
8714+        # validate this block, then generate the block hash root.
8715+        self.log("validating share %d for segment %d" % (reader.shnum,
8716+                                                             segnum))
8717+        self._status.add_fetch_timing(reader.peerid, started)
8718+        self._status.set_status("Valdiating blocks for segment %d" % segnum)
8719+        # Did we fail to fetch either of the things that we were
8720+        # supposed to? Fail if so.
8721+        if not results[0][0] and results[1][0]:
8722+            # handled by the errback handler.
8723+
8724+            # These all get batched into one query, so the resulting
8725+            # failure should be the same for all of them, so we can just
8726+            # use the first one.
8727+            assert isinstance(results[0][1], failure.Failure)
8728+
8729+            f = results[0][1]
8730+            raise CorruptShareError(reader.peerid,
8731+                                    reader.shnum,
8732+                                    "Connection error: %s" % str(f))
8733+
8734+        block_and_salt, block_and_sharehashes = results
8735+        block, salt = block_and_salt[1]
8736+        blockhashes, sharehashes = block_and_sharehashes[1]
8737+
8738+        blockhashes = dict(enumerate(blockhashes[1]))
8739+        self.log("the reader gave me the following blockhashes: %s" % \
8740+                 blockhashes.keys())
8741+        self.log("the reader gave me the following sharehashes: %s" % \
8742+                 sharehashes[1].keys())
8743+        bht = self._block_hash_trees[reader.shnum]
8744+
8745+        if bht.needed_hashes(segnum, include_leaf=True):
8746+            try:
8747+                bht.set_hashes(blockhashes)
8748+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8749+                    IndexError), e:
8750+                raise CorruptShareError(reader.peerid,
8751+                                        reader.shnum,
8752+                                        "block hash tree failure: %s" % e)
8753+
8754+        if self._version == MDMF_VERSION:
8755+            blockhash = hashutil.block_hash(salt + block)
8756+        else:
8757+            blockhash = hashutil.block_hash(block)
8758+        # If this works without an error, then validation is
8759+        # successful.
8760+        try:
8761+           bht.set_hashes(leaves={segnum: blockhash})
8762+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8763+                IndexError), e:
8764+            raise CorruptShareError(reader.peerid,
8765+                                    reader.shnum,
8766+                                    "block hash tree failure: %s" % e)
8767+
8768+        # Reaching this point means that we know that this segment
8769+        # is correct. Now we need to check to see whether the share
8770+        # hash chain is also correct.
8771+        # SDMF wrote share hash chains that didn't contain the
8772+        # leaves, which would be produced from the block hash tree.
8773+        # So we need to validate the block hash tree first. If
8774+        # successful, then bht[0] will contain the root for the
8775+        # shnum, which will be a leaf in the share hash tree, which
8776+        # will allow us to validate the rest of the tree.
8777+        if self.share_hash_tree.needed_hashes(reader.shnum,
8778+                                              include_leaf=True) or \
8779+                                              self._verify:
8780+            try:
8781+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
8782+                                            leaves={reader.shnum: bht[0]})
8783+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
8784+                    IndexError), e:
8785+                raise CorruptShareError(reader.peerid,
8786+                                        reader.shnum,
8787+                                        "corrupt hashes: %s" % e)
8788+
8789+        self.log('share %d is valid for segment %d' % (reader.shnum,
8790+                                                       segnum))
8791+        return {reader.shnum: (block, salt)}
8792+
8793+
8794+    def _get_needed_hashes(self, reader, segnum):
8795+        """
8796+        I get the hashes needed to validate segnum from the reader, then return
8797+        to my caller when this is done.
8798+        """
8799+        bht = self._block_hash_trees[reader.shnum]
8800+        needed = bht.needed_hashes(segnum, include_leaf=True)
8801+        # The root of the block hash tree is also a leaf in the share
8802+        # hash tree. So we don't need to fetch it from the remote
8803+        # server. In the case of files with one segment, this means that
8804+        # we won't fetch any block hash tree from the remote server,
8805+        # since the hash of each share of the file is the entire block
8806+        # hash tree, and is a leaf in the share hash tree. This is fine,
8807+        # since any share corruption will be detected in the share hash
8808+        # tree.
8809+        #needed.discard(0)
8810+        self.log("getting blockhashes for segment %d, share %d: %s" % \
8811+                 (segnum, reader.shnum, str(needed)))
8812+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
8813+        if self.share_hash_tree.needed_hashes(reader.shnum):
8814+            need = self.share_hash_tree.needed_hashes(reader.shnum)
8815+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
8816+                                                                 str(need)))
8817+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
8818+        else:
8819+            d2 = defer.succeed({}) # the logic in the next method
8820+                                   # expects a dict
8821+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
8822+        return dl
8823+
8824+
8825+    def _decode_blocks(self, blocks_and_salts, segnum):
8826+        """
8827+        I take a list of k blocks and salts, and decode that into a
8828+        single encrypted segment.
8829+        """
8830+        d = {}
8831+        # We want to merge our dictionaries to the form
8832+        # {shnum: blocks_and_salts}
8833+        #
8834+        # The dictionaries come from validate block that way, so we just
8835+        # need to merge them.
8836+        for block_and_salt in blocks_and_salts:
8837+            d.update(block_and_salt[1])
8838+
8839+        # All of these blocks should have the same salt; in SDMF, it is
8840+        # the file-wide IV, while in MDMF it is the per-segment salt. In
8841+        # either case, we just need to get one of them and use it.
8842+        #
8843+        # d.items()[0] is like (shnum, (block, salt))
8844+        # d.items()[0][1] is like (block, salt)
8845+        # d.items()[0][1][1] is the salt.
8846+        salt = d.items()[0][1][1]
8847+        # Next, extract just the blocks from the dict. We'll use the
8848+        # salt in the next step.
8849+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
8850+        d2 = dict(share_and_shareids)
8851+        shareids = []
8852+        shares = []
8853+        for shareid, share in d2.items():
8854             shareids.append(shareid)
8855             shares.append(share)
8856 
8857hunk ./src/allmydata/mutable/retrieve.py 938
8858-        assert len(shareids) >= k, len(shareids)
8859+        self._status.set_status("Decoding")
8860+        started = time.time()
8861+        assert len(shareids) >= self._required_shares, len(shareids)
8862         # zfec really doesn't want extra shares
8863hunk ./src/allmydata/mutable/retrieve.py 942
8864-        shareids = shareids[:k]
8865-        shares = shares[:k]
8866-
8867-        fec = codec.CRSDecoder()
8868-        fec.set_params(segsize, k, N)
8869-
8870-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
8871-        self.log("about to decode, shareids=%s" % (shareids,))
8872-        d = defer.maybeDeferred(fec.decode, shares, shareids)
8873-        def _done(buffers):
8874-            self._status.timings["decode"] = time.time() - started
8875-            self.log(" decode done, %d buffers" % len(buffers))
8876+        shareids = shareids[:self._required_shares]
8877+        shares = shares[:self._required_shares]
8878+        self.log("decoding segment %d" % segnum)
8879+        if segnum == self._num_segments - 1:
8880+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
8881+        else:
8882+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
8883+        def _process(buffers):
8884             segment = "".join(buffers)
8885hunk ./src/allmydata/mutable/retrieve.py 951
8886+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
8887+                     segnum=segnum,
8888+                     numsegs=self._num_segments,
8889+                     level=log.NOISY)
8890             self.log(" joined length %d, datalength %d" %
8891hunk ./src/allmydata/mutable/retrieve.py 956
8892-                     (len(segment), datalength))
8893-            segment = segment[:datalength]
8894+                     (len(segment), self._data_length))
8895+            if segnum == self._num_segments - 1:
8896+                size_to_use = self._tail_data_size
8897+            else:
8898+                size_to_use = self._segment_size
8899+            segment = segment[:size_to_use]
8900             self.log(" segment len=%d" % len(segment))
8901hunk ./src/allmydata/mutable/retrieve.py 963
8902-            return segment
8903-        def _err(f):
8904-            self.log(" decode failed: %s" % f)
8905-            return f
8906-        d.addCallback(_done)
8907-        d.addErrback(_err)
8908+            self._status.timings.setdefault("decode", 0)
8909+            self._status.timings['decode'] = time.time() - started
8910+            return segment, salt
8911+        d.addCallback(_process)
8912         return d
8913 
8914hunk ./src/allmydata/mutable/retrieve.py 969
8915-    def _decrypt(self, crypttext, IV, readkey):
8916+
8917+    def _decrypt_segment(self, segment_and_salt):
8918+        """
8919+        I take a single segment and its salt, and decrypt it. I return
8920+        the plaintext of the segment that is in my argument.
8921+        """
8922+        segment, salt = segment_and_salt
8923         self._status.set_status("decrypting")
8924hunk ./src/allmydata/mutable/retrieve.py 977
8925+        self.log("decrypting segment %d" % self._current_segment)
8926         started = time.time()
8927hunk ./src/allmydata/mutable/retrieve.py 979
8928-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
8929+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
8930         decryptor = AES(key)
8931hunk ./src/allmydata/mutable/retrieve.py 981
8932-        plaintext = decryptor.process(crypttext)
8933-        self._status.timings["decrypt"] = time.time() - started
8934+        plaintext = decryptor.process(segment)
8935+        self._status.timings.setdefault("decrypt", 0)
8936+        self._status.timings['decrypt'] = time.time() - started
8937         return plaintext
8938 
8939hunk ./src/allmydata/mutable/retrieve.py 986
8940-    def _done(self, res):
8941-        if not self._running:
8942+
8943+    def notify_server_corruption(self, peerid, shnum, reason):
8944+        ss = self.servermap.connections[peerid]
8945+        ss.callRemoteOnly("advise_corrupt_share",
8946+                          "mutable", self._storage_index, shnum, reason)
8947+
8948+
8949+    def _try_to_validate_privkey(self, enc_privkey, reader):
8950+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
8951+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
8952+        if alleged_writekey != self._node.get_writekey():
8953+            self.log("invalid privkey from %s shnum %d" %
8954+                     (reader, reader.shnum),
8955+                     level=log.WEIRD, umid="YIw4tA")
8956+            if self._verify:
8957+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
8958+                                              self.verinfo[-2])
8959+                e = CorruptShareError(reader.peerid,
8960+                                      reader.shnum,
8961+                                      "invalid privkey")
8962+                f = failure.Failure(e)
8963+                self._bad_shares.add((reader.peerid, reader.shnum, f))
8964             return
8965hunk ./src/allmydata/mutable/retrieve.py 1009
8966+
8967+        # it's good
8968+        self.log("got valid privkey from shnum %d on reader %s" %
8969+                 (reader.shnum, reader))
8970+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
8971+        self._node._populate_encprivkey(enc_privkey)
8972+        self._node._populate_privkey(privkey)
8973+        self._need_privkey = False
8974+
8975+
8976+    def _check_for_done(self, res):
8977+        """
8978+        I check to see if this Retrieve object has successfully finished
8979+        its work.
8980+
8981+        I can exit in the following ways:
8982+            - If there are no more segments to download, then I exit by
8983+              causing self._done_deferred to fire with the plaintext
8984+              content requested by the caller.
8985+            - If there are still segments to be downloaded, and there
8986+              are enough active readers (readers which have not broken
8987+              and have not given us corrupt data) to continue
8988+              downloading, I send control back to
8989+              _download_current_segment.
8990+            - If there are still segments to be downloaded but there are
8991+              not enough active peers to download them, I ask
8992+              _add_active_peers to add more peers. If it is successful,
8993+              it will call _download_current_segment. If there are not
8994+              enough peers to retrieve the file, then that will cause
8995+              _done_deferred to errback.
8996+        """
8997+        self.log("checking for doneness")
8998+        if self._current_segment > self._last_segment:
8999+            # No more segments to download, we're done.
9000+            self.log("got plaintext, done")
9001+            return self._done()
9002+
9003+        if len(self._active_readers) >= self._required_shares:
9004+            # More segments to download, but we have enough good peers
9005+            # in self._active_readers that we can do that without issue,
9006+            # so go nab the next segment.
9007+            self.log("not done yet: on segment %d of %d" % \
9008+                     (self._current_segment + 1, self._num_segments))
9009+            return self._download_current_segment()
9010+
9011+        self.log("not done yet: on segment %d of %d, need to add peers" % \
9012+                 (self._current_segment + 1, self._num_segments))
9013+        return self._add_active_peers()
9014+
9015+
9016+    def _done(self):
9017+        """
9018+        I am called by _check_for_done when the download process has
9019+        finished successfully. After making some useful logging
9020+        statements, I return the decrypted contents to the owner of this
9021+        Retrieve object through self._done_deferred.
9022+        """
9023         self._running = False
9024         self._status.set_active(False)
9025hunk ./src/allmydata/mutable/retrieve.py 1068
9026-        self._status.timings["total"] = time.time() - self._started
9027-        # res is either the new contents, or a Failure
9028-        if isinstance(res, failure.Failure):
9029-            self.log("Retrieve done, with failure", failure=res,
9030-                     level=log.UNUSUAL)
9031-            self._status.set_status("Failed")
9032+        now = time.time()
9033+        self._status.timings['total'] = now - self._started
9034+        self._status.timings['fetch'] = now - self._started_fetching
9035+
9036+        if self._verify:
9037+            ret = list(self._bad_shares)
9038+            self.log("done verifying, found %d bad shares" % len(ret))
9039         else:
9040hunk ./src/allmydata/mutable/retrieve.py 1076
9041-            self.log("Retrieve done, success!")
9042-            self._status.set_status("Finished")
9043-            self._status.set_progress(1.0)
9044-            # remember the encoding parameters, use them again next time
9045-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9046-             offsets_tuple) = self.verinfo
9047-            self._node._populate_required_shares(k)
9048-            self._node._populate_total_shares(N)
9049-        eventually(self._done_deferred.callback, res)
9050+            # TODO: upload status here?
9051+            ret = self._consumer
9052+            self._consumer.unregisterProducer()
9053+        eventually(self._done_deferred.callback, ret)
9054+
9055 
9056hunk ./src/allmydata/mutable/retrieve.py 1082
9057+    def _failed(self):
9058+        """
9059+        I am called by _add_active_peers when there are not enough
9060+        active peers left to complete the download. After making some
9061+        useful logging statements, I return an exception to that effect
9062+        to the caller of this Retrieve object through
9063+        self._done_deferred.
9064+        """
9065+        self._running = False
9066+        self._status.set_active(False)
9067+        now = time.time()
9068+        self._status.timings['total'] = now - self._started
9069+        self._status.timings['fetch'] = now - self._started_fetching
9070+
9071+        if self._verify:
9072+            ret = list(self._bad_shares)
9073+        else:
9074+            format = ("ran out of peers: "
9075+                      "have %(have)d of %(total)d segments "
9076+                      "found %(bad)d bad shares "
9077+                      "encoding %(k)d-of-%(n)d")
9078+            args = {"have": self._current_segment,
9079+                    "total": self._num_segments,
9080+                    "need": self._last_segment,
9081+                    "k": self._required_shares,
9082+                    "n": self._total_shares,
9083+                    "bad": len(self._bad_shares)}
9084+            e = NotEnoughSharesError("%s, last failure: %s" % \
9085+                                     (format % args, str(self._last_failure)))
9086+            f = failure.Failure(e)
9087+            ret = f
9088+        eventually(self._done_deferred.callback, ret)
9089}
9090[mutable/servermap.py: Alter the servermap updater to work with MDMF files
9091Kevan Carstensen <kevan@isnotajoke.com>**20100819003439
9092 Ignore-this: 7e408303194834bd59a2f27efab3bdb
9093 
9094 These modifications were basically all to the end of having the
9095 servermap updater use the unified MDMF + SDMF read interface whenever
9096 possible -- this reduces the complexity of the code, making it easier to
9097 read and maintain. To do this, I needed to modify the process of
9098 updating the servermap a little bit.
9099 
9100 To support partial-file updates, I also modified the servermap updater
9101 to fetch the block hash trees and certain segments of files while it
9102 performed a servermap update (this can be done without adding any new
9103 roundtrips because of batch-read functionality that the read proxy has).
9104 
9105] {
9106hunk ./src/allmydata/mutable/servermap.py 2
9107 
9108-import sys, time
9109+import sys, time, struct
9110 from zope.interface import implements
9111 from itertools import count
9112 from twisted.internet import defer
9113merger 0.0 (
9114hunk ./src/allmydata/mutable/servermap.py 9
9115+from allmydata.util.dictutil import DictOfSets
9116hunk ./src/allmydata/mutable/servermap.py 7
9117-from foolscap.api import DeadReferenceError, RemoteException, eventually
9118-from allmydata.util import base32, hashutil, idlib, log
9119+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
9120+                         fireEventually
9121+from allmydata.util import base32, hashutil, idlib, log, deferredutil
9122)
9123merger 0.0 (
9124hunk ./src/allmydata/mutable/servermap.py 14
9125-     DictOfSets, CorruptShareError, NeedMoreDataError
9126+     CorruptShareError, NeedMoreDataError
9127hunk ./src/allmydata/mutable/servermap.py 14
9128-     DictOfSets, CorruptShareError, NeedMoreDataError
9129-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
9130-     SIGNED_PREFIX_LENGTH
9131+     DictOfSets, CorruptShareError
9132+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
9133)
9134hunk ./src/allmydata/mutable/servermap.py 123
9135         self.bad_shares = {} # maps (peerid,shnum) to old checkstring
9136         self.last_update_mode = None
9137         self.last_update_time = 0
9138+        self.update_data = {} # (verinfo,shnum) => data
9139 
9140     def copy(self):
9141         s = ServerMap()
9142hunk ./src/allmydata/mutable/servermap.py 254
9143         """Return a set of versionids, one for each version that is currently
9144         recoverable."""
9145         versionmap = self.make_versionmap()
9146-
9147         recoverable_versions = set()
9148         for (verinfo, shares) in versionmap.items():
9149             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9150hunk ./src/allmydata/mutable/servermap.py 339
9151         return False
9152 
9153 
9154+    def get_update_data_for_share_and_verinfo(self, shnum, verinfo):
9155+        """
9156+        I return the update data for the given shnum
9157+        """
9158+        update_data = self.update_data[shnum]
9159+        update_datum = [i[1] for i in update_data if i[0] == verinfo][0]
9160+        return update_datum
9161+
9162+
9163+    def set_update_data_for_share_and_verinfo(self, shnum, verinfo, data):
9164+        """
9165+        I record the block hash tree for the given shnum.
9166+        """
9167+        self.update_data.setdefault(shnum , []).append((verinfo, data))
9168+
9169+
9170 class ServermapUpdater:
9171     def __init__(self, filenode, storage_broker, monitor, servermap,
9172hunk ./src/allmydata/mutable/servermap.py 357
9173-                 mode=MODE_READ, add_lease=False):
9174+                 mode=MODE_READ, add_lease=False, update_range=None):
9175         """I update a servermap, locating a sufficient number of useful
9176         shares and remembering where they are located.
9177 
9178hunk ./src/allmydata/mutable/servermap.py 382
9179         self._servers_responded = set()
9180 
9181         # how much data should we read?
9182+        # SDMF:
9183         #  * if we only need the checkstring, then [0:75]
9184         #  * if we need to validate the checkstring sig, then [543ish:799ish]
9185         #  * if we need the verification key, then [107:436ish]
9186merger 0.0 (
9187hunk ./src/allmydata/mutable/servermap.py 392
9188-        # read 2000 bytes, which also happens to read enough actual data to
9189-        # pre-fetch a 9-entry dirnode.
9190+        # read 4000 bytes, which also happens to read enough actual data to
9191+        # pre-fetch an 18-entry dirnode.
9192hunk ./src/allmydata/mutable/servermap.py 390
9193-        # A future version of the SMDF slot format should consider using
9194-        # fixed-size slots so we can retrieve less data. For now, we'll just
9195-        # read 2000 bytes, which also happens to read enough actual data to
9196-        # pre-fetch a 9-entry dirnode.
9197+        # MDMF:
9198+        #  * Checkstring? [0:72]
9199+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
9200+        #    the offset table will tell us for sure.
9201+        #  * If we need the verification key, we have to consult the offset
9202+        #    table as well.
9203+        # At this point, we don't know which we are. Our filenode can
9204+        # tell us, but it might be lying -- in some cases, we're
9205+        # responsible for telling it which kind of file it is.
9206)
9207hunk ./src/allmydata/mutable/servermap.py 399
9208             # we use unpack_prefix_and_signature, so we need 1k
9209             self._read_size = 1000
9210         self._need_privkey = False
9211+
9212         if mode == MODE_WRITE and not self._node.get_privkey():
9213             self._need_privkey = True
9214         # check+repair: repair requires the privkey, so if we didn't happen
9215hunk ./src/allmydata/mutable/servermap.py 406
9216         # to ask for it during the check, we'll have problems doing the
9217         # publish.
9218 
9219+        self.fetch_update_data = False
9220+        if mode == MODE_WRITE and update_range:
9221+            # We're updating the servermap in preparation for an
9222+            # in-place file update, so we need to fetch some additional
9223+            # data from each share that we find.
9224+            assert len(update_range) == 2
9225+
9226+            self.start_segment = update_range[0]
9227+            self.end_segment = update_range[1]
9228+            self.fetch_update_data = True
9229+
9230         prefix = si_b2a(self._storage_index)[:5]
9231         self._log_number = log.msg(format="SharemapUpdater(%(si)s): starting (%(mode)s)",
9232                                    si=prefix, mode=mode)
9233merger 0.0 (
9234hunk ./src/allmydata/mutable/servermap.py 455
9235-        full_peerlist = sb.get_servers_for_index(self._storage_index)
9236+        full_peerlist = [(s.get_serverid(), s.get_rref())
9237+                         for s in sb.get_servers_for_psi(self._storage_index)]
9238hunk ./src/allmydata/mutable/servermap.py 455
9239+        # All of the peers, permuted by the storage index, as usual.
9240)
9241hunk ./src/allmydata/mutable/servermap.py 461
9242         self._good_peers = set() # peers who had some shares
9243         self._empty_peers = set() # peers who don't have any shares
9244         self._bad_peers = set() # peers to whom our queries failed
9245+        self._readers = {} # peerid -> dict(sharewriters), filled in
9246+                           # after responses come in.
9247 
9248         k = self._node.get_required_shares()
9249hunk ./src/allmydata/mutable/servermap.py 465
9250+        # For what cases can these conditions work?
9251         if k is None:
9252             # make a guess
9253             k = 3
9254hunk ./src/allmydata/mutable/servermap.py 478
9255         self.num_peers_to_query = k + self.EPSILON
9256 
9257         if self.mode == MODE_CHECK:
9258+            # We want to query all of the peers.
9259             initial_peers_to_query = dict(full_peerlist)
9260             must_query = set(initial_peers_to_query.keys())
9261             self.extra_peers = []
9262hunk ./src/allmydata/mutable/servermap.py 486
9263             # we're planning to replace all the shares, so we want a good
9264             # chance of finding them all. We will keep searching until we've
9265             # seen epsilon that don't have a share.
9266+            # We don't query all of the peers because that could take a while.
9267             self.num_peers_to_query = N + self.EPSILON
9268             initial_peers_to_query, must_query = self._build_initial_querylist()
9269             self.required_num_empty_peers = self.EPSILON
9270hunk ./src/allmydata/mutable/servermap.py 496
9271             # might also avoid the round trip required to read the encrypted
9272             # private key.
9273 
9274-        else:
9275+        else: # MODE_READ, MODE_ANYTHING
9276+            # 2k peers is good enough.
9277             initial_peers_to_query, must_query = self._build_initial_querylist()
9278 
9279         # this is a set of peers that we are required to get responses from:
9280hunk ./src/allmydata/mutable/servermap.py 512
9281         # before we can consider ourselves finished, and self.extra_peers
9282         # contains the overflow (peers that we should tap if we don't get
9283         # enough responses)
9284+        # I guess that self._must_query is a subset of
9285+        # initial_peers_to_query?
9286+        assert set(must_query).issubset(set(initial_peers_to_query))
9287 
9288         self._send_initial_requests(initial_peers_to_query)
9289         self._status.timings["initial_queries"] = time.time() - self._started
9290hunk ./src/allmydata/mutable/servermap.py 571
9291         # errors that aren't handled by _query_failed (and errors caused by
9292         # _query_failed) get logged, but we still want to check for doneness.
9293         d.addErrback(log.err)
9294-        d.addBoth(self._check_for_done)
9295         d.addErrback(self._fatal_error)
9296hunk ./src/allmydata/mutable/servermap.py 572
9297+        d.addCallback(self._check_for_done)
9298         return d
9299 
9300     def _do_read(self, ss, peerid, storage_index, shnums, readv):
9301hunk ./src/allmydata/mutable/servermap.py 591
9302         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
9303         return d
9304 
9305+
9306+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
9307+        """
9308+        I am called when a remote server returns a corrupt share in
9309+        response to one of our queries. By corrupt, I mean a share
9310+        without a valid signature. I then record the failure, notify the
9311+        server of the corruption, and record the share as bad.
9312+        """
9313+        f = failure.Failure(e)
9314+        self.log(format="bad share: %(f_value)s", f_value=str(f),
9315+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9316+        # Notify the server that its share is corrupt.
9317+        self.notify_server_corruption(peerid, shnum, str(e))
9318+        # By flagging this as a bad peer, we won't count any of
9319+        # the other shares on that peer as valid, though if we
9320+        # happen to find a valid version string amongst those
9321+        # shares, we'll keep track of it so that we don't need
9322+        # to validate the signature on those again.
9323+        self._bad_peers.add(peerid)
9324+        self._last_failure = f
9325+        # XXX: Use the reader for this?
9326+        checkstring = data[:SIGNED_PREFIX_LENGTH]
9327+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
9328+        self._servermap.problems.append(f)
9329+
9330+
9331+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
9332+        """
9333+        If one of my queries returns successfully (which means that we
9334+        were able to and successfully did validate the signature), I
9335+        cache the data that we initially fetched from the storage
9336+        server. This will help reduce the number of roundtrips that need
9337+        to occur when the file is downloaded, or when the file is
9338+        updated.
9339+        """
9340+        if verinfo:
9341+            self._node._add_to_cache(verinfo, shnum, 0, data, now)
9342+
9343+
9344     def _got_results(self, datavs, peerid, readsize, stuff, started):
9345         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
9346                       peerid=idlib.shortnodeid_b2a(peerid),
9347hunk ./src/allmydata/mutable/servermap.py 633
9348-                      numshares=len(datavs),
9349-                      level=log.NOISY)
9350+                      numshares=len(datavs))
9351         now = time.time()
9352         elapsed = now - started
9353hunk ./src/allmydata/mutable/servermap.py 636
9354-        self._queries_outstanding.discard(peerid)
9355-        self._servermap.reachable_peers.add(peerid)
9356-        self._must_query.discard(peerid)
9357-        self._queries_completed += 1
9358+        def _done_processing(ignored=None):
9359+            self._queries_outstanding.discard(peerid)
9360+            self._servermap.reachable_peers.add(peerid)
9361+            self._must_query.discard(peerid)
9362+            self._queries_completed += 1
9363         if not self._running:
9364hunk ./src/allmydata/mutable/servermap.py 642
9365-            self.log("but we're not running, so we'll ignore it", parent=lp,
9366-                     level=log.NOISY)
9367+            self.log("but we're not running, so we'll ignore it", parent=lp)
9368+            _done_processing()
9369             self._status.add_per_server_time(peerid, "late", started, elapsed)
9370             return
9371         self._status.add_per_server_time(peerid, "query", started, elapsed)
9372hunk ./src/allmydata/mutable/servermap.py 653
9373         else:
9374             self._empty_peers.add(peerid)
9375 
9376-        last_verinfo = None
9377-        last_shnum = None
9378+        ss, storage_index = stuff
9379+        ds = []
9380+
9381         for shnum,datav in datavs.items():
9382             data = datav[0]
9383             try:
9384merger 0.0 (
9385hunk ./src/allmydata/mutable/servermap.py 662
9386-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9387+                self._node._add_to_cache(verinfo, shnum, 0, data)
9388hunk ./src/allmydata/mutable/servermap.py 658
9389-            try:
9390-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
9391-                last_verinfo = verinfo
9392-                last_shnum = shnum
9393-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
9394-            except CorruptShareError, e:
9395-                # log it and give the other shares a chance to be processed
9396-                f = failure.Failure()
9397-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
9398-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
9399-                self.notify_server_corruption(peerid, shnum, str(e))
9400-                self._bad_peers.add(peerid)
9401-                self._last_failure = f
9402-                checkstring = data[:SIGNED_PREFIX_LENGTH]
9403-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
9404-                self._servermap.problems.append(f)
9405-                pass
9406+            reader = MDMFSlotReadProxy(ss,
9407+                                       storage_index,
9408+                                       shnum,
9409+                                       data)
9410+            self._readers.setdefault(peerid, dict())[shnum] = reader
9411+            # our goal, with each response, is to validate the version
9412+            # information and share data as best we can at this point --
9413+            # we do this by validating the signature. To do this, we
9414+            # need to do the following:
9415+            #   - If we don't already have the public key, fetch the
9416+            #     public key. We use this to validate the signature.
9417+            if not self._node.get_pubkey():
9418+                # fetch and set the public key.
9419+                d = reader.get_verification_key(queue=True)
9420+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
9421+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
9422+                # XXX: Make self._pubkey_query_failed?
9423+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
9424+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
9425+            else:
9426+                # we already have the public key.
9427+                d = defer.succeed(None)
9428)
9429hunk ./src/allmydata/mutable/servermap.py 676
9430                 self._servermap.problems.append(f)
9431                 pass
9432 
9433-        self._status.timings["cumulative_verify"] += (time.time() - now)
9434+            # Neither of these two branches return anything of
9435+            # consequence, so the first entry in our deferredlist will
9436+            # be None.
9437 
9438hunk ./src/allmydata/mutable/servermap.py 680
9439-        if self._need_privkey and last_verinfo:
9440-            # send them a request for the privkey. We send one request per
9441-            # server.
9442-            lp2 = self.log("sending privkey request",
9443-                           parent=lp, level=log.NOISY)
9444-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9445-             offsets_tuple) = last_verinfo
9446-            o = dict(offsets_tuple)
9447+            # - Next, we need the version information. We almost
9448+            #   certainly got this by reading the first thousand or so
9449+            #   bytes of the share on the storage server, so we
9450+            #   shouldn't need to fetch anything at this step.
9451+            d2 = reader.get_verinfo()
9452+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
9453+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9454+            # - Next, we need the signature. For an SDMF share, it is
9455+            #   likely that we fetched this when doing our initial fetch
9456+            #   to get the version information. In MDMF, this lives at
9457+            #   the end of the share, so unless the file is quite small,
9458+            #   we'll need to do a remote fetch to get it.
9459+            d3 = reader.get_signature(queue=True)
9460+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
9461+                self._got_corrupt_share(error, shnum, peerid, data, lp))
9462+            #  Once we have all three of these responses, we can move on
9463+            #  to validating the signature
9464 
9465hunk ./src/allmydata/mutable/servermap.py 698
9466-            self._queries_outstanding.add(peerid)
9467-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
9468-            ss = self._servermap.connections[peerid]
9469-            privkey_started = time.time()
9470-            d = self._do_read(ss, peerid, self._storage_index,
9471-                              [last_shnum], readv)
9472-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
9473-                          privkey_started, lp2)
9474-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
9475-            d.addErrback(log.err)
9476-            d.addCallback(self._check_for_done)
9477-            d.addErrback(self._fatal_error)
9478+            # Does the node already have a privkey? If not, we'll try to
9479+            # fetch it here.
9480+            if self._need_privkey:
9481+                d4 = reader.get_encprivkey(queue=True)
9482+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
9483+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
9484+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
9485+                    self._privkey_query_failed(error, shnum, data, lp))
9486+            else:
9487+                d4 = defer.succeed(None)
9488+
9489+
9490+            if self.fetch_update_data:
9491+                # fetch the block hash tree and first + last segment, as
9492+                # configured earlier.
9493+                # Then set them in wherever we happen to want to set
9494+                # them.
9495+                ds = []
9496+                # XXX: We do this above, too. Is there a good way to
9497+                # make the two routines share the value without
9498+                # introducing more roundtrips?
9499+                ds.append(reader.get_verinfo())
9500+                ds.append(reader.get_blockhashes(queue=True))
9501+                ds.append(reader.get_block_and_salt(self.start_segment,
9502+                                                    queue=True))
9503+                ds.append(reader.get_block_and_salt(self.end_segment,
9504+                                                    queue=True))
9505+                d5 = deferredutil.gatherResults(ds)
9506+                d5.addCallback(self._got_update_results_one_share, shnum)
9507+            else:
9508+                d5 = defer.succeed(None)
9509 
9510hunk ./src/allmydata/mutable/servermap.py 730
9511+            dl = defer.DeferredList([d, d2, d3, d4, d5])
9512+            dl.addBoth(self._turn_barrier)
9513+            reader.flush()
9514+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
9515+                self._got_signature_one_share(results, shnum, peerid, lp))
9516+            dl.addErrback(lambda error, shnum=shnum, data=data:
9517+               self._got_corrupt_share(error, shnum, peerid, data, lp))
9518+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
9519+                self._cache_good_sharedata(verinfo, shnum, now, data))
9520+            ds.append(dl)
9521+        # dl is a deferred list that will fire when all of the shares
9522+        # that we found on this peer are done processing. When dl fires,
9523+        # we know that processing is done, so we can decrement the
9524+        # semaphore-like thing that we incremented earlier.
9525+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
9526+        # Are we done? Done means that there are no more queries to
9527+        # send, that there are no outstanding queries, and that we
9528+        # haven't received any queries that are still processing. If we
9529+        # are done, self._check_for_done will cause the done deferred
9530+        # that we returned to our caller to fire, which tells them that
9531+        # they have a complete servermap, and that we won't be touching
9532+        # the servermap anymore.
9533+        dl.addCallback(_done_processing)
9534+        dl.addCallback(self._check_for_done)
9535+        dl.addErrback(self._fatal_error)
9536         # all done!
9537         self.log("_got_results done", parent=lp, level=log.NOISY)
9538hunk ./src/allmydata/mutable/servermap.py 757
9539+        return dl
9540+
9541+
9542+    def _turn_barrier(self, result):
9543+        """
9544+        I help the servermap updater avoid the recursion limit issues
9545+        discussed in #237.
9546+        """
9547+        return fireEventually(result)
9548+
9549+
9550+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
9551+        if self._node.get_pubkey():
9552+            return # don't go through this again if we don't have to
9553+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9554+        assert len(fingerprint) == 32
9555+        if fingerprint != self._node.get_fingerprint():
9556+            raise CorruptShareError(peerid, shnum,
9557+                                "pubkey doesn't match fingerprint")
9558+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9559+        assert self._node.get_pubkey()
9560+
9561 
9562     def notify_server_corruption(self, peerid, shnum, reason):
9563         ss = self._servermap.connections[peerid]
9564hunk ./src/allmydata/mutable/servermap.py 785
9565         ss.callRemoteOnly("advise_corrupt_share",
9566                           "mutable", self._storage_index, shnum, reason)
9567 
9568-    def _got_results_one_share(self, shnum, data, peerid, lp):
9569+
9570+    def _got_signature_one_share(self, results, shnum, peerid, lp):
9571+        # It is our job to give versioninfo to our caller. We need to
9572+        # raise CorruptShareError if the share is corrupt for any
9573+        # reason, something that our caller will handle.
9574         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
9575                  shnum=shnum,
9576                  peerid=idlib.shortnodeid_b2a(peerid),
9577hunk ./src/allmydata/mutable/servermap.py 795
9578                  level=log.NOISY,
9579                  parent=lp)
9580+        if not self._running:
9581+            # We can't process the results, since we can't touch the
9582+            # servermap anymore.
9583+            self.log("but we're not running anymore.")
9584+            return None
9585 
9586hunk ./src/allmydata/mutable/servermap.py 801
9587-        # this might raise NeedMoreDataError, if the pubkey and signature
9588-        # live at some weird offset. That shouldn't happen, so I'm going to
9589-        # treat it as a bad share.
9590-        (seqnum, root_hash, IV, k, N, segsize, datalength,
9591-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
9592-
9593-        if not self._node.get_pubkey():
9594-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
9595-            assert len(fingerprint) == 32
9596-            if fingerprint != self._node.get_fingerprint():
9597-                raise CorruptShareError(peerid, shnum,
9598-                                        "pubkey doesn't match fingerprint")
9599-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
9600-
9601-        if self._need_privkey:
9602-            self._try_to_extract_privkey(data, peerid, shnum, lp)
9603-
9604-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
9605-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
9606+        _, verinfo, signature, __, ___ = results
9607+        (seqnum,
9608+         root_hash,
9609+         saltish,
9610+         segsize,
9611+         datalen,
9612+         k,
9613+         n,
9614+         prefix,
9615+         offsets) = verinfo[1]
9616         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9617 
9618hunk ./src/allmydata/mutable/servermap.py 813
9619-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
9620+        # XXX: This should be done for us in the method, so
9621+        # presumably you can go in there and fix it.
9622+        verinfo = (seqnum,
9623+                   root_hash,
9624+                   saltish,
9625+                   segsize,
9626+                   datalen,
9627+                   k,
9628+                   n,
9629+                   prefix,
9630                    offsets_tuple)
9631hunk ./src/allmydata/mutable/servermap.py 824
9632+        # This tuple uniquely identifies a share on the grid; we use it
9633+        # to keep track of the ones that we've already seen.
9634 
9635         if verinfo not in self._valid_versions:
9636hunk ./src/allmydata/mutable/servermap.py 828
9637-            # it's a new pair. Verify the signature.
9638-            valid = self._node.get_pubkey().verify(prefix, signature)
9639+            # This is a new version tuple, and we need to validate it
9640+            # against the public key before keeping track of it.
9641+            assert self._node.get_pubkey()
9642+            valid = self._node.get_pubkey().verify(prefix, signature[1])
9643             if not valid:
9644hunk ./src/allmydata/mutable/servermap.py 833
9645-                raise CorruptShareError(peerid, shnum, "signature is invalid")
9646+                raise CorruptShareError(peerid, shnum,
9647+                                        "signature is invalid")
9648 
9649hunk ./src/allmydata/mutable/servermap.py 836
9650-            # ok, it's a valid verinfo. Add it to the list of validated
9651-            # versions.
9652-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9653-                     % (seqnum, base32.b2a(root_hash)[:4],
9654-                        idlib.shortnodeid_b2a(peerid), shnum,
9655-                        k, N, segsize, datalength),
9656-                     parent=lp)
9657-            self._valid_versions.add(verinfo)
9658-        # We now know that this is a valid candidate verinfo.
9659+        # ok, it's a valid verinfo. Add it to the list of validated
9660+        # versions.
9661+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
9662+                 % (seqnum, base32.b2a(root_hash)[:4],
9663+                    idlib.shortnodeid_b2a(peerid), shnum,
9664+                    k, n, segsize, datalen),
9665+                    parent=lp)
9666+        self._valid_versions.add(verinfo)
9667+        # We now know that this is a valid candidate verinfo. Whether or
9668+        # not this instance of it is valid is a matter for the next
9669+        # statement; at this point, we just know that if we see this
9670+        # version info again, that its signature checks out and that
9671+        # we're okay to skip the signature-checking step.
9672 
9673hunk ./src/allmydata/mutable/servermap.py 850
9674+        # (peerid, shnum) are bound in the method invocation.
9675         if (peerid, shnum) in self._servermap.bad_shares:
9676             # we've been told that the rest of the data in this share is
9677             # unusable, so don't add it to the servermap.
9678hunk ./src/allmydata/mutable/servermap.py 863
9679         self._servermap.add_new_share(peerid, shnum, verinfo, timestamp)
9680         # and the versionmap
9681         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
9682+
9683+        # It's our job to set the protocol version of our parent
9684+        # filenode if it isn't already set.
9685+        if not self._node.get_version():
9686+            # The first byte of the prefix is the version.
9687+            v = struct.unpack(">B", prefix[:1])[0]
9688+            self.log("got version %d" % v)
9689+            self._node.set_version(v)
9690+
9691         return verinfo
9692 
9693hunk ./src/allmydata/mutable/servermap.py 874
9694-    def _deserialize_pubkey(self, pubkey_s):
9695-        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9696-        return verifier
9697 
9698hunk ./src/allmydata/mutable/servermap.py 875
9699-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
9700-        try:
9701-            r = unpack_share(data)
9702-        except NeedMoreDataError, e:
9703-            # this share won't help us. oh well.
9704-            offset = e.encprivkey_offset
9705-            length = e.encprivkey_length
9706-            self.log("shnum %d on peerid %s: share was too short (%dB) "
9707-                     "to get the encprivkey; [%d:%d] ought to hold it" %
9708-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
9709-                      offset, offset+length),
9710-                     parent=lp)
9711-            # NOTE: if uncoordinated writes are taking place, someone might
9712-            # change the share (and most probably move the encprivkey) before
9713-            # we get a chance to do one of these reads and fetch it. This
9714-            # will cause us to see a NotEnoughSharesError(unable to fetch
9715-            # privkey) instead of an UncoordinatedWriteError . This is a
9716-            # nuisance, but it will go away when we move to DSA-based mutable
9717-            # files (since the privkey will be small enough to fit in the
9718-            # write cap).
9719+    def _got_update_results_one_share(self, results, share):
9720+        """
9721+        I record the update results in results.
9722+        """
9723+        assert len(results) == 4
9724+        verinfo, blockhashes, start, end = results
9725+        (seqnum,
9726+         root_hash,
9727+         saltish,
9728+         segsize,
9729+         datalen,
9730+         k,
9731+         n,
9732+         prefix,
9733+         offsets) = verinfo
9734+        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
9735 
9736hunk ./src/allmydata/mutable/servermap.py 892
9737-            return
9738+        # XXX: This should be done for us in the method, so
9739+        # presumably you can go in there and fix it.
9740+        verinfo = (seqnum,
9741+                   root_hash,
9742+                   saltish,
9743+                   segsize,
9744+                   datalen,
9745+                   k,
9746+                   n,
9747+                   prefix,
9748+                   offsets_tuple)
9749 
9750hunk ./src/allmydata/mutable/servermap.py 904
9751-        (seqnum, root_hash, IV, k, N, segsize, datalen,
9752-         pubkey, signature, share_hash_chain, block_hash_tree,
9753-         share_data, enc_privkey) = r
9754+        update_data = (blockhashes, start, end)
9755+        self._servermap.set_update_data_for_share_and_verinfo(share,
9756+                                                              verinfo,
9757+                                                              update_data)
9758 
9759hunk ./src/allmydata/mutable/servermap.py 909
9760-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9761+
9762+    def _deserialize_pubkey(self, pubkey_s):
9763+        verifier = rsa.create_verifying_key_from_string(pubkey_s)
9764+        return verifier
9765 
9766hunk ./src/allmydata/mutable/servermap.py 914
9767-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9768 
9769hunk ./src/allmydata/mutable/servermap.py 915
9770+    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
9771+        """
9772+        Given a writekey from a remote server, I validate it against the
9773+        writekey stored in my node. If it is valid, then I set the
9774+        privkey and encprivkey properties of the node.
9775+        """
9776         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
9777         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
9778         if alleged_writekey != self._node.get_writekey():
9779hunk ./src/allmydata/mutable/servermap.py 993
9780         self._queries_completed += 1
9781         self._last_failure = f
9782 
9783-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
9784-        now = time.time()
9785-        elapsed = now - started
9786-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
9787-        self._queries_outstanding.discard(peerid)
9788-        if not self._need_privkey:
9789-            return
9790-        if shnum not in datavs:
9791-            self.log("privkey wasn't there when we asked it",
9792-                     level=log.WEIRD, umid="VA9uDQ")
9793-            return
9794-        datav = datavs[shnum]
9795-        enc_privkey = datav[0]
9796-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
9797 
9798     def _privkey_query_failed(self, f, peerid, shnum, lp):
9799         self._queries_outstanding.discard(peerid)
9800hunk ./src/allmydata/mutable/servermap.py 1007
9801         self._servermap.problems.append(f)
9802         self._last_failure = f
9803 
9804+
9805     def _check_for_done(self, res):
9806         # exit paths:
9807         #  return self._send_more_queries(outstanding) : send some more queries
9808hunk ./src/allmydata/mutable/servermap.py 1013
9809         #  return self._done() : all done
9810         #  return : keep waiting, no new queries
9811-
9812         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
9813                               "%(outstanding)d queries outstanding, "
9814                               "%(extra)d extra peers available, "
9815hunk ./src/allmydata/mutable/servermap.py 1204
9816 
9817     def _done(self):
9818         if not self._running:
9819+            self.log("not running; we're already done")
9820             return
9821         self._running = False
9822         now = time.time()
9823hunk ./src/allmydata/mutable/servermap.py 1219
9824         self._servermap.last_update_time = self._started
9825         # the servermap will not be touched after this
9826         self.log("servermap: %s" % self._servermap.summarize_versions())
9827+
9828         eventually(self._done_deferred.callback, self._servermap)
9829 
9830     def _fatal_error(self, f):
9831}
9832[tests:
9833Kevan Carstensen <kevan@isnotajoke.com>**20100819003531
9834 Ignore-this: 314e8bbcce532ea4d5d2cecc9f31cca0
9835 
9836     - A lot of existing tests relied on aspects of the mutable file
9837       implementation that were changed. This patch updates those tests
9838       to work with the changes.
9839     - This patch also adds tests for new features.
9840] {
9841hunk ./src/allmydata/test/common.py 11
9842 from foolscap.api import flushEventualQueue, fireEventually
9843 from allmydata import uri, dirnode, client
9844 from allmydata.introducer.server import IntroducerNode
9845-from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
9846-     FileTooLargeError, NotEnoughSharesError, ICheckable
9847+from allmydata.interfaces import IMutableFileNode, IImmutableFileNode,\
9848+                                 NotEnoughSharesError, ICheckable, \
9849+                                 IMutableUploadable, SDMF_VERSION, \
9850+                                 MDMF_VERSION
9851 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
9852      DeepCheckResults, DeepCheckAndRepairResults
9853 from allmydata.mutable.common import CorruptShareError
9854hunk ./src/allmydata/test/common.py 19
9855 from allmydata.mutable.layout import unpack_header
9856+from allmydata.mutable.publish import MutableData
9857 from allmydata.storage.server import storage_index_to_dir
9858 from allmydata.storage.mutable import MutableShareFile
9859 from allmydata.util import hashutil, log, fileutil, pollmixin
9860hunk ./src/allmydata/test/common.py 153
9861         consumer.write(data[start:end])
9862         return consumer
9863 
9864+
9865+    def get_best_readable_version(self):
9866+        return defer.succeed(self)
9867+
9868+
9869+    download_best_version = download_to_data
9870+
9871+
9872+    def download_to_data(self):
9873+        return download_to_data(self)
9874+
9875+
9876+    def get_size_of_best_version(self):
9877+        return defer.succeed(self.get_size)
9878+
9879+
9880 def make_chk_file_cap(size):
9881     return uri.CHKFileURI(key=os.urandom(16),
9882                           uri_extension_hash=os.urandom(32),
9883hunk ./src/allmydata/test/common.py 193
9884     MUTABLE_SIZELIMIT = 10000
9885     all_contents = {}
9886     bad_shares = {}
9887+    file_types = {} # storage index => MDMF_VERSION or SDMF_VERSION
9888 
9889     def __init__(self, storage_broker, secret_holder,
9890                  default_encoding_parameters, history):
9891hunk ./src/allmydata/test/common.py 200
9892         self.init_from_cap(make_mutable_file_cap())
9893     def create(self, contents, key_generator=None, keysize=None):
9894         initial_contents = self._get_initial_contents(contents)
9895-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
9896-            raise FileTooLargeError("SDMF is limited to one segment, and "
9897-                                    "%d > %d" % (len(initial_contents),
9898-                                                 self.MUTABLE_SIZELIMIT))
9899-        self.all_contents[self.storage_index] = initial_contents
9900+        data = initial_contents.read(initial_contents.get_size())
9901+        data = "".join(data)
9902+        self.all_contents[self.storage_index] = data
9903         return defer.succeed(self)
9904     def _get_initial_contents(self, contents):
9905hunk ./src/allmydata/test/common.py 205
9906-        if isinstance(contents, str):
9907-            return contents
9908         if contents is None:
9909hunk ./src/allmydata/test/common.py 206
9910-            return ""
9911+            return MutableData("")
9912+
9913+        if IMutableUploadable.providedBy(contents):
9914+            return contents
9915+
9916         assert callable(contents), "%s should be callable, not %s" % \
9917                (contents, type(contents))
9918         return contents(self)
9919hunk ./src/allmydata/test/common.py 258
9920     def get_storage_index(self):
9921         return self.storage_index
9922 
9923+    def get_servermap(self, mode):
9924+        return defer.succeed(None)
9925+
9926+    def set_version(self, version):
9927+        assert version in (SDMF_VERSION, MDMF_VERSION)
9928+        self.file_types[self.storage_index] = version
9929+
9930+    def get_version(self):
9931+        assert self.storage_index in self.file_types
9932+        return self.file_types[self.storage_index]
9933+
9934     def check(self, monitor, verify=False, add_lease=False):
9935         r = CheckResults(self.my_uri, self.storage_index)
9936         is_bad = self.bad_shares.get(self.storage_index, None)
9937hunk ./src/allmydata/test/common.py 327
9938         return d
9939 
9940     def download_best_version(self):
9941+        return defer.succeed(self._download_best_version())
9942+
9943+
9944+    def _download_best_version(self, ignored=None):
9945         if isinstance(self.my_uri, uri.LiteralFileURI):
9946hunk ./src/allmydata/test/common.py 332
9947-            return defer.succeed(self.my_uri.data)
9948+            return self.my_uri.data
9949         if self.storage_index not in self.all_contents:
9950hunk ./src/allmydata/test/common.py 334
9951-            return defer.fail(NotEnoughSharesError(None, 0, 3))
9952-        return defer.succeed(self.all_contents[self.storage_index])
9953+            raise NotEnoughSharesError(None, 0, 3)
9954+        return self.all_contents[self.storage_index]
9955+
9956 
9957     def overwrite(self, new_contents):
9958hunk ./src/allmydata/test/common.py 339
9959-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
9960-            raise FileTooLargeError("SDMF is limited to one segment, and "
9961-                                    "%d > %d" % (len(new_contents),
9962-                                                 self.MUTABLE_SIZELIMIT))
9963         assert not self.is_readonly()
9964hunk ./src/allmydata/test/common.py 340
9965-        self.all_contents[self.storage_index] = new_contents
9966+        new_data = new_contents.read(new_contents.get_size())
9967+        new_data = "".join(new_data)
9968+        self.all_contents[self.storage_index] = new_data
9969         return defer.succeed(None)
9970     def modify(self, modifier):
9971         # this does not implement FileTooLargeError, but the real one does
9972hunk ./src/allmydata/test/common.py 350
9973     def _modify(self, modifier):
9974         assert not self.is_readonly()
9975         old_contents = self.all_contents[self.storage_index]
9976-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9977+        new_data = modifier(old_contents, None, True)
9978+        self.all_contents[self.storage_index] = new_data
9979         return None
9980 
9981hunk ./src/allmydata/test/common.py 354
9982+    # As actually implemented, MutableFilenode and MutableFileVersion
9983+    # are distinct. However, nothing in the webapi uses (yet) that
9984+    # distinction -- it just uses the unified download interface
9985+    # provided by get_best_readable_version and read. When we start
9986+    # doing cooler things like LDMF, we will want to revise this code to
9987+    # be less simplistic.
9988+    def get_best_readable_version(self):
9989+        return defer.succeed(self)
9990+
9991+
9992+    def get_best_mutable_version(self):
9993+        return defer.succeed(self)
9994+
9995+    # Ditto for this, which is an implementation of IWritable.
9996+    # XXX: Declare that the same is implemented.
9997+    def update(self, data, offset):
9998+        assert not self.is_readonly()
9999+        def modifier(old, servermap, first_time):
10000+            new = old[:offset] + "".join(data.read(data.get_size()))
10001+            new += old[len(new):]
10002+            return new
10003+        return self.modify(modifier)
10004+
10005+
10006+    def read(self, consumer, offset=0, size=None):
10007+        data = self._download_best_version()
10008+        if size:
10009+            data = data[offset:offset+size]
10010+        consumer.write(data)
10011+        return defer.succeed(consumer)
10012+
10013+
10014 def make_mutable_file_cap():
10015     return uri.WriteableSSKFileURI(writekey=os.urandom(16),
10016                                    fingerprint=os.urandom(32))
10017hunk ./src/allmydata/test/test_checker.py 11
10018 from allmydata.test.no_network import GridTestMixin
10019 from allmydata.immutable.upload import Data
10020 from allmydata.test.common_web import WebRenderingMixin
10021+from allmydata.mutable.publish import MutableData
10022 
10023 class FakeClient:
10024     def get_storage_broker(self):
10025hunk ./src/allmydata/test/test_checker.py 291
10026         def _stash_immutable(ur):
10027             self.imm = c0.create_node_from_uri(ur.uri)
10028         d.addCallback(_stash_immutable)
10029-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
10030+        d.addCallback(lambda ign:
10031+            c0.create_mutable_file(MutableData("contents")))
10032         def _stash_mutable(node):
10033             self.mut = node
10034         d.addCallback(_stash_mutable)
10035hunk ./src/allmydata/test/test_cli.py 13
10036 from allmydata.util import fileutil, hashutil, base32
10037 from allmydata import uri
10038 from allmydata.immutable import upload
10039+from allmydata.mutable.publish import MutableData
10040 from allmydata.dirnode import normalize
10041 
10042 # Test that the scripts can be imported.
10043hunk ./src/allmydata/test/test_cli.py 662
10044 
10045         d = self.do_cli("create-alias", etudes_arg)
10046         def _check_create_unicode((rc, out, err)):
10047-            self.failUnlessReallyEqual(rc, 0)
10048+            #self.failUnlessReallyEqual(rc, 0)
10049             self.failUnlessReallyEqual(err, "")
10050             self.failUnlessIn("Alias %s created" % quote_output(u"\u00E9tudes"), out)
10051 
10052hunk ./src/allmydata/test/test_cli.py 967
10053         d.addCallback(lambda (rc,out,err): self.failUnlessReallyEqual(out, DATA2))
10054         return d
10055 
10056+    def test_mutable_type(self):
10057+        self.basedir = "cli/Put/mutable_type"
10058+        self.set_up_grid()
10059+        data = "data" * 100000
10060+        fn1 = os.path.join(self.basedir, "data")
10061+        fileutil.write(fn1, data)
10062+        d = self.do_cli("create-alias", "tahoe")
10063+        d.addCallback(lambda ignored:
10064+            self.do_cli("put", "--mutable", "--mutable-type=mdmf",
10065+                        fn1, "tahoe:uploaded.txt"))
10066+        d.addCallback(lambda ignored:
10067+            self.do_cli("ls", "--json", "tahoe:uploaded.txt"))
10068+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
10069+        d.addCallback(lambda ignored:
10070+            self.do_cli("put", "--mutable", "--mutable-type=sdmf",
10071+                        fn1, "tahoe:uploaded2.txt"))
10072+        d.addCallback(lambda ignored:
10073+            self.do_cli("ls", "--json", "tahoe:uploaded2.txt"))
10074+        d.addCallback(lambda (rc, json, err):
10075+            self.failUnlessIn("sdmf", json))
10076+        return d
10077+
10078+    def test_mutable_type_unlinked(self):
10079+        self.basedir = "cli/Put/mutable_type_unlinked"
10080+        self.set_up_grid()
10081+        data = "data" * 100000
10082+        fn1 = os.path.join(self.basedir, "data")
10083+        fileutil.write(fn1, data)
10084+        d = self.do_cli("put", "--mutable", "--mutable-type=mdmf", fn1)
10085+        d.addCallback(lambda (rc, cap, err):
10086+            self.do_cli("ls", "--json", cap))
10087+        d.addCallback(lambda (rc, json, err): self.failUnlessIn("mdmf", json))
10088+        d.addCallback(lambda ignored:
10089+            self.do_cli("put", "--mutable", "--mutable-type=sdmf", fn1))
10090+        d.addCallback(lambda (rc, cap, err):
10091+            self.do_cli("ls", "--json", cap))
10092+        d.addCallback(lambda (rc, json, err):
10093+            self.failUnlessIn("sdmf", json))
10094+        return d
10095+
10096+    def test_mutable_type_invalid_format(self):
10097+        self.basedir = "cli/Put/mutable_type_invalid_format"
10098+        self.set_up_grid()
10099+        data = "data" * 100000
10100+        fn1 = os.path.join(self.basedir, "data")
10101+        fileutil.write(fn1, data)
10102+        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
10103+        def _check_failure((rc, out, err)):
10104+            self.failIfEqual(rc, 0)
10105+            self.failUnlessIn("invalid", err)
10106+        d.addCallback(_check_failure)
10107+        return d
10108+
10109     def test_put_with_nonexistent_alias(self):
10110         # when invoked with an alias that doesn't exist, 'tahoe put'
10111         # should output a useful error message, not a stack trace
10112hunk ./src/allmydata/test/test_cli.py 2136
10113         self.set_up_grid()
10114         c0 = self.g.clients[0]
10115         DATA = "data" * 100
10116-        d = c0.create_mutable_file(DATA)
10117+        DATA_uploadable = MutableData(DATA)
10118+        d = c0.create_mutable_file(DATA_uploadable)
10119         def _stash_uri(n):
10120             self.uri = n.get_uri()
10121         d.addCallback(_stash_uri)
10122hunk ./src/allmydata/test/test_cli.py 2238
10123                                            upload.Data("literal",
10124                                                         convergence="")))
10125         d.addCallback(_stash_uri, "small")
10126-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
10127+        d.addCallback(lambda ign:
10128+            c0.create_mutable_file(MutableData(DATA+"1")))
10129         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
10130         d.addCallback(_stash_uri, "mutable")
10131 
10132hunk ./src/allmydata/test/test_cli.py 2257
10133         # root/small
10134         # root/mutable
10135 
10136+        # We haven't broken anything yet, so this should all be healthy.
10137         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
10138                                               self.rooturi))
10139         def _check2((rc, out, err)):
10140hunk ./src/allmydata/test/test_cli.py 2272
10141                             in lines, out)
10142         d.addCallback(_check2)
10143 
10144+        # Similarly, all of these results should be as we expect them to
10145+        # be for a healthy file layout.
10146         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
10147         def _check_stats((rc, out, err)):
10148             self.failUnlessReallyEqual(err, "")
10149hunk ./src/allmydata/test/test_cli.py 2289
10150             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
10151         d.addCallback(_check_stats)
10152 
10153+        # Now we break things.
10154         def _clobber_shares(ignored):
10155             shares = self.find_uri_shares(self.uris[u"g\u00F6\u00F6d"])
10156             self.failUnlessReallyEqual(len(shares), 10)
10157hunk ./src/allmydata/test/test_cli.py 2314
10158 
10159         d.addCallback(lambda ign:
10160                       self.do_cli("deep-check", "--verbose", self.rooturi))
10161+        # This should reveal the missing share, but not the corrupt
10162+        # share, since we didn't tell the deep check operation to also
10163+        # verify.
10164         def _check3((rc, out, err)):
10165             self.failUnlessReallyEqual(err, "")
10166             self.failUnlessReallyEqual(rc, 0)
10167hunk ./src/allmydata/test/test_cli.py 2365
10168                                   "--verbose", "--verify", "--repair",
10169                                   self.rooturi))
10170         def _check6((rc, out, err)):
10171+            # We've just repaired the directory. There is no reason for
10172+            # that repair to be unsuccessful.
10173             self.failUnlessReallyEqual(err, "")
10174             self.failUnlessReallyEqual(rc, 0)
10175             lines = out.splitlines()
10176hunk ./src/allmydata/test/test_deepcheck.py 9
10177 from twisted.internet import threads # CLI tests use deferToThread
10178 from allmydata.immutable import upload
10179 from allmydata.mutable.common import UnrecoverableFileError
10180+from allmydata.mutable.publish import MutableData
10181 from allmydata.util import idlib
10182 from allmydata.util import base32
10183 from allmydata.scripts import runner
10184hunk ./src/allmydata/test/test_deepcheck.py 38
10185         self.basedir = "deepcheck/MutableChecker/good"
10186         self.set_up_grid()
10187         CONTENTS = "a little bit of data"
10188-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10189+        CONTENTS_uploadable = MutableData(CONTENTS)
10190+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10191         def _created(node):
10192             self.node = node
10193             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10194hunk ./src/allmydata/test/test_deepcheck.py 61
10195         self.basedir = "deepcheck/MutableChecker/corrupt"
10196         self.set_up_grid()
10197         CONTENTS = "a little bit of data"
10198-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10199+        CONTENTS_uploadable = MutableData(CONTENTS)
10200+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10201         def _stash_and_corrupt(node):
10202             self.node = node
10203             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10204hunk ./src/allmydata/test/test_deepcheck.py 99
10205         self.basedir = "deepcheck/MutableChecker/delete_share"
10206         self.set_up_grid()
10207         CONTENTS = "a little bit of data"
10208-        d = self.g.clients[0].create_mutable_file(CONTENTS)
10209+        CONTENTS_uploadable = MutableData(CONTENTS)
10210+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
10211         def _stash_and_delete(node):
10212             self.node = node
10213             self.fileurl = "uri/" + urllib.quote(node.get_uri())
10214hunk ./src/allmydata/test/test_deepcheck.py 223
10215             self.root = n
10216             self.root_uri = n.get_uri()
10217         d.addCallback(_created_root)
10218-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
10219+        d.addCallback(lambda ign:
10220+            c0.create_mutable_file(MutableData("mutable file contents")))
10221         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
10222         def _created_mutable(n):
10223             self.mutable = n
10224hunk ./src/allmydata/test/test_deepcheck.py 965
10225     def create_mangled(self, ignored, name):
10226         nodetype, mangletype = name.split("-", 1)
10227         if nodetype == "mutable":
10228-            d = self.g.clients[0].create_mutable_file("mutable file contents")
10229+            mutable_uploadable = MutableData("mutable file contents")
10230+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
10231             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
10232         elif nodetype == "large":
10233             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
10234hunk ./src/allmydata/test/test_dirnode.py 1304
10235     implements(IMutableFileNode)
10236     counter = 0
10237     def __init__(self, initial_contents=""):
10238-        self.data = self._get_initial_contents(initial_contents)
10239+        data = self._get_initial_contents(initial_contents)
10240+        self.data = data.read(data.get_size())
10241+        self.data = "".join(self.data)
10242+
10243         counter = FakeMutableFile.counter
10244         FakeMutableFile.counter += 1
10245         writekey = hashutil.ssk_writekey_hash(str(counter))
10246hunk ./src/allmydata/test/test_dirnode.py 1354
10247         pass
10248 
10249     def modify(self, modifier):
10250-        self.data = modifier(self.data, None, True)
10251+        data = modifier(self.data, None, True)
10252+        self.data = data
10253         return defer.succeed(None)
10254 
10255 class FakeNodeMaker(NodeMaker):
10256hunk ./src/allmydata/test/test_dirnode.py 1359
10257-    def create_mutable_file(self, contents="", keysize=None):
10258+    def create_mutable_file(self, contents="", keysize=None, version=None):
10259         return defer.succeed(FakeMutableFile(contents))
10260 
10261 class FakeClient2(Client):
10262hunk ./src/allmydata/test/test_filenode.py 98
10263         def _check_segment(res):
10264             self.failUnlessEqual(res, DATA[1:1+5])
10265         d.addCallback(_check_segment)
10266+        d.addCallback(lambda ignored: fn1.get_best_readable_version())
10267+        d.addCallback(lambda fn2: self.failUnlessEqual(fn1, fn2))
10268+        d.addCallback(lambda ignored:
10269+            fn1.get_size_of_best_version())
10270+        d.addCallback(lambda size:
10271+            self.failUnlessEqual(size, len(DATA)))
10272+        d.addCallback(lambda ignored:
10273+            fn1.download_to_data())
10274+        d.addCallback(lambda data:
10275+            self.failUnlessEqual(data, DATA))
10276+        d.addCallback(lambda ignored:
10277+            fn1.download_best_version())
10278+        d.addCallback(lambda data:
10279+            self.failUnlessEqual(data, DATA))
10280 
10281         return d
10282 
10283hunk ./src/allmydata/test/test_hung_server.py 10
10284 from allmydata.util.consumer import download_to_data
10285 from allmydata.immutable import upload
10286 from allmydata.mutable.common import UnrecoverableFileError
10287+from allmydata.mutable.publish import MutableData
10288 from allmydata.storage.common import storage_index_to_dir
10289 from allmydata.test.no_network import GridTestMixin
10290 from allmydata.test.common import ShouldFailMixin
10291hunk ./src/allmydata/test/test_hung_server.py 110
10292         self.servers = self.servers[5:] + self.servers[:5]
10293 
10294         if mutable:
10295-            d = nm.create_mutable_file(mutable_plaintext)
10296+            uploadable = MutableData(mutable_plaintext)
10297+            d = nm.create_mutable_file(uploadable)
10298             def _uploaded_mutable(node):
10299                 self.uri = node.get_uri()
10300                 self.shares = self.find_uri_shares(self.uri)
10301hunk ./src/allmydata/test/test_immutable.py 267
10302         d.addCallback(_after_attempt)
10303         return d
10304 
10305+    def test_download_to_data(self):
10306+        d = self.n.download_to_data()
10307+        d.addCallback(lambda data:
10308+            self.failUnlessEqual(data, common.TEST_DATA))
10309+        return d
10310 
10311hunk ./src/allmydata/test/test_immutable.py 273
10312+
10313+    def test_download_best_version(self):
10314+        d = self.n.download_best_version()
10315+        d.addCallback(lambda data:
10316+            self.failUnlessEqual(data, common.TEST_DATA))
10317+        return d
10318+
10319+
10320+    def test_get_best_readable_version(self):
10321+        d = self.n.get_best_readable_version()
10322+        d.addCallback(lambda n2:
10323+            self.failUnlessEqual(n2, self.n))
10324+        return d
10325+
10326+    def test_get_size_of_best_version(self):
10327+        d = self.n.get_size_of_best_version()
10328+        d.addCallback(lambda size:
10329+            self.failUnlessEqual(size, len(common.TEST_DATA)))
10330+        return d
10331+
10332+
10333 # XXX extend these tests to show bad behavior of various kinds from servers:
10334 # raising exception from each remove_foo() method, for example
10335 
10336hunk ./src/allmydata/test/test_mutable.py 2
10337 
10338-import struct
10339+import os
10340 from cStringIO import StringIO
10341 from twisted.trial import unittest
10342 from twisted.internet import defer, reactor
10343hunk ./src/allmydata/test/test_mutable.py 8
10344 from allmydata import uri, client
10345 from allmydata.nodemaker import NodeMaker
10346-from allmydata.util import base32
10347+from allmydata.util import base32, consumer
10348 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
10349      ssk_pubkey_fingerprint_hash
10350hunk ./src/allmydata/test/test_mutable.py 11
10351+from allmydata.util.deferredutil import gatherResults
10352 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
10353hunk ./src/allmydata/test/test_mutable.py 13
10354-     NotEnoughSharesError
10355+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
10356 from allmydata.monitor import Monitor
10357 from allmydata.test.common import ShouldFailMixin
10358 from allmydata.test.no_network import GridTestMixin
10359hunk ./src/allmydata/test/test_mutable.py 27
10360      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
10361      NotEnoughServersError, CorruptShareError
10362 from allmydata.mutable.retrieve import Retrieve
10363-from allmydata.mutable.publish import Publish
10364+from allmydata.mutable.publish import Publish, MutableFileHandle, \
10365+                                      MutableData, \
10366+                                      DEFAULT_MAX_SEGMENT_SIZE
10367 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
10368hunk ./src/allmydata/test/test_mutable.py 31
10369-from allmydata.mutable.layout import unpack_header, unpack_share
10370+from allmydata.mutable.layout import unpack_header, MDMFSlotReadProxy
10371 from allmydata.mutable.repairer import MustForceRepairError
10372 
10373 import allmydata.test.common_util as testutil
10374hunk ./src/allmydata/test/test_mutable.py 100
10375         self.storage = storage
10376         self.queries = 0
10377     def callRemote(self, methname, *args, **kwargs):
10378+        self.queries += 1
10379         def _call():
10380             meth = getattr(self, methname)
10381             return meth(*args, **kwargs)
10382hunk ./src/allmydata/test/test_mutable.py 107
10383         d = fireEventually()
10384         d.addCallback(lambda res: _call())
10385         return d
10386+
10387     def callRemoteOnly(self, methname, *args, **kwargs):
10388hunk ./src/allmydata/test/test_mutable.py 109
10389+        self.queries += 1
10390         d = self.callRemote(methname, *args, **kwargs)
10391         d.addBoth(lambda ignore: None)
10392         pass
10393hunk ./src/allmydata/test/test_mutable.py 157
10394             chr(ord(original[byte_offset]) ^ 0x01) +
10395             original[byte_offset+1:])
10396 
10397+def add_two(original, byte_offset):
10398+    # It isn't enough to simply flip the bit for the version number,
10399+    # because 1 is a valid version number. So we add two instead.
10400+    return (original[:byte_offset] +
10401+            chr(ord(original[byte_offset]) ^ 0x02) +
10402+            original[byte_offset+1:])
10403+
10404 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
10405     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
10406     # list of shnums to corrupt.
10407hunk ./src/allmydata/test/test_mutable.py 167
10408+    ds = []
10409     for peerid in s._peers:
10410         shares = s._peers[peerid]
10411         for shnum in shares:
10412hunk ./src/allmydata/test/test_mutable.py 175
10413                 and shnum not in shnums_to_corrupt):
10414                 continue
10415             data = shares[shnum]
10416-            (version,
10417-             seqnum,
10418-             root_hash,
10419-             IV,
10420-             k, N, segsize, datalen,
10421-             o) = unpack_header(data)
10422-            if isinstance(offset, tuple):
10423-                offset1, offset2 = offset
10424-            else:
10425-                offset1 = offset
10426-                offset2 = 0
10427-            if offset1 == "pubkey":
10428-                real_offset = 107
10429-            elif offset1 in o:
10430-                real_offset = o[offset1]
10431-            else:
10432-                real_offset = offset1
10433-            real_offset = int(real_offset) + offset2 + offset_offset
10434-            assert isinstance(real_offset, int), offset
10435-            shares[shnum] = flip_bit(data, real_offset)
10436-    return res
10437+            # We're feeding the reader all of the share data, so it
10438+            # won't need to use the rref that we didn't provide, nor the
10439+            # storage index that we didn't provide. We do this because
10440+            # the reader will work for both MDMF and SDMF.
10441+            reader = MDMFSlotReadProxy(None, None, shnum, data)
10442+            # We need to get the offsets for the next part.
10443+            d = reader.get_verinfo()
10444+            def _do_corruption(verinfo, data, shnum):
10445+                (seqnum,
10446+                 root_hash,
10447+                 IV,
10448+                 segsize,
10449+                 datalen,
10450+                 k, n, prefix, o) = verinfo
10451+                if isinstance(offset, tuple):
10452+                    offset1, offset2 = offset
10453+                else:
10454+                    offset1 = offset
10455+                    offset2 = 0
10456+                if offset1 == "pubkey" and IV:
10457+                    real_offset = 107
10458+                elif offset1 == "share_data" and not IV:
10459+                    real_offset = 107
10460+                elif offset1 in o:
10461+                    real_offset = o[offset1]
10462+                else:
10463+                    real_offset = offset1
10464+                real_offset = int(real_offset) + offset2 + offset_offset
10465+                assert isinstance(real_offset, int), offset
10466+                if offset1 == 0: # verbyte
10467+                    f = add_two
10468+                else:
10469+                    f = flip_bit
10470+                shares[shnum] = f(data, real_offset)
10471+            d.addCallback(_do_corruption, data, shnum)
10472+            ds.append(d)
10473+    dl = defer.DeferredList(ds)
10474+    dl.addCallback(lambda ignored: res)
10475+    return dl
10476 
10477 def make_storagebroker(s=None, num_peers=10):
10478     if not s:
10479hunk ./src/allmydata/test/test_mutable.py 256
10480             self.failUnlessEqual(len(shnums), 1)
10481         d.addCallback(_created)
10482         return d
10483+    test_create.timeout = 15
10484+
10485+
10486+    def test_create_mdmf(self):
10487+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10488+        def _created(n):
10489+            self.failUnless(isinstance(n, MutableFileNode))
10490+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
10491+            sb = self.nodemaker.storage_broker
10492+            peer0 = sorted(sb.get_all_serverids())[0]
10493+            shnums = self._storage._peers[peer0].keys()
10494+            self.failUnlessEqual(len(shnums), 1)
10495+        d.addCallback(_created)
10496+        return d
10497+
10498 
10499     def test_serialize(self):
10500         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
10501hunk ./src/allmydata/test/test_mutable.py 301
10502             d.addCallback(lambda smap: smap.dump(StringIO()))
10503             d.addCallback(lambda sio:
10504                           self.failUnless("3-of-10" in sio.getvalue()))
10505-            d.addCallback(lambda res: n.overwrite("contents 1"))
10506+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10507             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10508             d.addCallback(lambda res: n.download_best_version())
10509             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10510hunk ./src/allmydata/test/test_mutable.py 308
10511             d.addCallback(lambda res: n.get_size_of_best_version())
10512             d.addCallback(lambda size:
10513                           self.failUnlessEqual(size, len("contents 1")))
10514-            d.addCallback(lambda res: n.overwrite("contents 2"))
10515+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10516             d.addCallback(lambda res: n.download_best_version())
10517             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10518             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10519hunk ./src/allmydata/test/test_mutable.py 312
10520-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10521+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10522             d.addCallback(lambda res: n.download_best_version())
10523             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10524             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10525hunk ./src/allmydata/test/test_mutable.py 324
10526             # mapupdate-to-retrieve data caching (i.e. make the shares larger
10527             # than the default readsize, which is 2000 bytes). A 15kB file
10528             # will have 5kB shares.
10529-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
10530+            d.addCallback(lambda res: n.overwrite(MutableData("large size file" * 1000)))
10531             d.addCallback(lambda res: n.download_best_version())
10532             d.addCallback(lambda res:
10533                           self.failUnlessEqual(res, "large size file" * 1000))
10534hunk ./src/allmydata/test/test_mutable.py 332
10535         d.addCallback(_created)
10536         return d
10537 
10538+
10539+    def test_upload_and_download_mdmf(self):
10540+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
10541+        def _created(n):
10542+            d = defer.succeed(None)
10543+            d.addCallback(lambda ignored:
10544+                n.get_servermap(MODE_READ))
10545+            def _then(servermap):
10546+                dumped = servermap.dump(StringIO())
10547+                self.failUnlessIn("3-of-10", dumped.getvalue())
10548+            d.addCallback(_then)
10549+            # Now overwrite the contents with some new contents. We want
10550+            # to make them big enough to force the file to be uploaded
10551+            # in more than one segment.
10552+            big_contents = "contents1" * 100000 # about 900 KiB
10553+            big_contents_uploadable = MutableData(big_contents)
10554+            d.addCallback(lambda ignored:
10555+                n.overwrite(big_contents_uploadable))
10556+            d.addCallback(lambda ignored:
10557+                n.download_best_version())
10558+            d.addCallback(lambda data:
10559+                self.failUnlessEqual(data, big_contents))
10560+            # Overwrite the contents again with some new contents. As
10561+            # before, they need to be big enough to force multiple
10562+            # segments, so that we make the downloader deal with
10563+            # multiple segments.
10564+            bigger_contents = "contents2" * 1000000 # about 9MiB
10565+            bigger_contents_uploadable = MutableData(bigger_contents)
10566+            d.addCallback(lambda ignored:
10567+                n.overwrite(bigger_contents_uploadable))
10568+            d.addCallback(lambda ignored:
10569+                n.download_best_version())
10570+            d.addCallback(lambda data:
10571+                self.failUnlessEqual(data, bigger_contents))
10572+            return d
10573+        d.addCallback(_created)
10574+        return d
10575+
10576+
10577+    def test_mdmf_write_count(self):
10578+        # Publishing an MDMF file should only cause one write for each
10579+        # share that is to be published. Otherwise, we introduce
10580+        # undesirable semantics that are a regression from SDMF
10581+        upload = MutableData("MDMF" * 100000) # about 400 KiB
10582+        d = self.nodemaker.create_mutable_file(upload,
10583+                                               version=MDMF_VERSION)
10584+        def _check_server_write_counts(ignored):
10585+            sb = self.nodemaker.storage_broker
10586+            peers = sb.test_servers.values()
10587+            for peer in peers:
10588+                self.failUnlessEqual(peer.queries, 1)
10589+        d.addCallback(_check_server_write_counts)
10590+        return d
10591+
10592+
10593     def test_create_with_initial_contents(self):
10594hunk ./src/allmydata/test/test_mutable.py 388
10595-        d = self.nodemaker.create_mutable_file("contents 1")
10596+        upload1 = MutableData("contents 1")
10597+        d = self.nodemaker.create_mutable_file(upload1)
10598         def _created(n):
10599             d = n.download_best_version()
10600             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10601hunk ./src/allmydata/test/test_mutable.py 393
10602-            d.addCallback(lambda res: n.overwrite("contents 2"))
10603+            upload2 = MutableData("contents 2")
10604+            d.addCallback(lambda res: n.overwrite(upload2))
10605             d.addCallback(lambda res: n.download_best_version())
10606             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10607             return d
10608hunk ./src/allmydata/test/test_mutable.py 400
10609         d.addCallback(_created)
10610         return d
10611+    test_create_with_initial_contents.timeout = 15
10612+
10613+
10614+    def test_create_mdmf_with_initial_contents(self):
10615+        initial_contents = "foobarbaz" * 131072 # 900KiB
10616+        initial_contents_uploadable = MutableData(initial_contents)
10617+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
10618+                                               version=MDMF_VERSION)
10619+        def _created(n):
10620+            d = n.download_best_version()
10621+            d.addCallback(lambda data:
10622+                self.failUnlessEqual(data, initial_contents))
10623+            uploadable2 = MutableData(initial_contents + "foobarbaz")
10624+            d.addCallback(lambda ignored:
10625+                n.overwrite(uploadable2))
10626+            d.addCallback(lambda ignored:
10627+                n.download_best_version())
10628+            d.addCallback(lambda data:
10629+                self.failUnlessEqual(data, initial_contents +
10630+                                           "foobarbaz"))
10631+            return d
10632+        d.addCallback(_created)
10633+        return d
10634+    test_create_mdmf_with_initial_contents.timeout = 20
10635+
10636 
10637     def test_response_cache_memory_leak(self):
10638         d = self.nodemaker.create_mutable_file("contents")
10639hunk ./src/allmydata/test/test_mutable.py 451
10640             key = n.get_writekey()
10641             self.failUnless(isinstance(key, str), key)
10642             self.failUnlessEqual(len(key), 16) # AES key size
10643-            return data
10644+            return MutableData(data)
10645         d = self.nodemaker.create_mutable_file(_make_contents)
10646         def _created(n):
10647             return n.download_best_version()
10648hunk ./src/allmydata/test/test_mutable.py 459
10649         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
10650         return d
10651 
10652+
10653+    def test_create_mdmf_with_initial_contents_function(self):
10654+        data = "initial contents" * 100000
10655+        def _make_contents(n):
10656+            self.failUnless(isinstance(n, MutableFileNode))
10657+            key = n.get_writekey()
10658+            self.failUnless(isinstance(key, str), key)
10659+            self.failUnlessEqual(len(key), 16)
10660+            return MutableData(data)
10661+        d = self.nodemaker.create_mutable_file(_make_contents,
10662+                                               version=MDMF_VERSION)
10663+        d.addCallback(lambda n:
10664+            n.download_best_version())
10665+        d.addCallback(lambda data2:
10666+            self.failUnlessEqual(data2, data))
10667+        return d
10668+
10669+
10670     def test_create_with_too_large_contents(self):
10671         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10672hunk ./src/allmydata/test/test_mutable.py 479
10673-        d = self.nodemaker.create_mutable_file(BIG)
10674+        BIG_uploadable = MutableData(BIG)
10675+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
10676         def _created(n):
10677hunk ./src/allmydata/test/test_mutable.py 482
10678-            d = n.overwrite(BIG)
10679+            other_BIG_uploadable = MutableData(BIG)
10680+            d = n.overwrite(other_BIG_uploadable)
10681             return d
10682         d.addCallback(_created)
10683         return d
10684hunk ./src/allmydata/test/test_mutable.py 497
10685 
10686     def test_modify(self):
10687         def _modifier(old_contents, servermap, first_time):
10688-            return old_contents + "line2"
10689+            new_contents = old_contents + "line2"
10690+            return new_contents
10691         def _non_modifier(old_contents, servermap, first_time):
10692             return old_contents
10693         def _none_modifier(old_contents, servermap, first_time):
10694hunk ./src/allmydata/test/test_mutable.py 506
10695         def _error_modifier(old_contents, servermap, first_time):
10696             raise ValueError("oops")
10697         def _toobig_modifier(old_contents, servermap, first_time):
10698-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
10699+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
10700+            return new_content
10701         calls = []
10702         def _ucw_error_modifier(old_contents, servermap, first_time):
10703             # simulate an UncoordinatedWriteError once
10704hunk ./src/allmydata/test/test_mutable.py 514
10705             calls.append(1)
10706             if len(calls) <= 1:
10707                 raise UncoordinatedWriteError("simulated")
10708-            return old_contents + "line3"
10709+            new_contents = old_contents + "line3"
10710+            return new_contents
10711         def _ucw_error_non_modifier(old_contents, servermap, first_time):
10712             # simulate an UncoordinatedWriteError once, and don't actually
10713             # modify the contents on subsequent invocations
10714hunk ./src/allmydata/test/test_mutable.py 524
10715                 raise UncoordinatedWriteError("simulated")
10716             return old_contents
10717 
10718-        d = self.nodemaker.create_mutable_file("line1")
10719+        initial_contents = "line1"
10720+        d = self.nodemaker.create_mutable_file(MutableData(initial_contents))
10721         def _created(n):
10722             d = n.modify(_modifier)
10723             d.addCallback(lambda res: n.download_best_version())
10724hunk ./src/allmydata/test/test_mutable.py 582
10725             return d
10726         d.addCallback(_created)
10727         return d
10728+    test_modify.timeout = 15
10729+
10730 
10731     def test_modify_backoffer(self):
10732         def _modifier(old_contents, servermap, first_time):
10733hunk ./src/allmydata/test/test_mutable.py 609
10734         giveuper._delay = 0.1
10735         giveuper.factor = 1
10736 
10737-        d = self.nodemaker.create_mutable_file("line1")
10738+        d = self.nodemaker.create_mutable_file(MutableData("line1"))
10739         def _created(n):
10740             d = n.modify(_modifier)
10741             d.addCallback(lambda res: n.download_best_version())
10742hunk ./src/allmydata/test/test_mutable.py 659
10743             d.addCallback(lambda smap: smap.dump(StringIO()))
10744             d.addCallback(lambda sio:
10745                           self.failUnless("3-of-10" in sio.getvalue()))
10746-            d.addCallback(lambda res: n.overwrite("contents 1"))
10747+            d.addCallback(lambda res: n.overwrite(MutableData("contents 1")))
10748             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
10749             d.addCallback(lambda res: n.download_best_version())
10750             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
10751hunk ./src/allmydata/test/test_mutable.py 663
10752-            d.addCallback(lambda res: n.overwrite("contents 2"))
10753+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
10754             d.addCallback(lambda res: n.download_best_version())
10755             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
10756             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
10757hunk ./src/allmydata/test/test_mutable.py 667
10758-            d.addCallback(lambda smap: n.upload("contents 3", smap))
10759+            d.addCallback(lambda smap: n.upload(MutableData("contents 3"), smap))
10760             d.addCallback(lambda res: n.download_best_version())
10761             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
10762             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
10763hunk ./src/allmydata/test/test_mutable.py 680
10764         return d
10765 
10766 
10767-class MakeShares(unittest.TestCase):
10768-    def test_encrypt(self):
10769-        nm = make_nodemaker()
10770-        CONTENTS = "some initial contents"
10771-        d = nm.create_mutable_file(CONTENTS)
10772-        def _created(fn):
10773-            p = Publish(fn, nm.storage_broker, None)
10774-            p.salt = "SALT" * 4
10775-            p.readkey = "\x00" * 16
10776-            p.newdata = CONTENTS
10777-            p.required_shares = 3
10778-            p.total_shares = 10
10779-            p.setup_encoding_parameters()
10780-            return p._encrypt_and_encode()
10781+    def test_size_after_servermap_update(self):
10782+        # a mutable file node should have something to say about how big
10783+        # it is after a servermap update is performed, since this tells
10784+        # us how large the best version of that mutable file is.
10785+        d = self.nodemaker.create_mutable_file()
10786+        def _created(n):
10787+            self.n = n
10788+            return n.get_servermap(MODE_READ)
10789+        d.addCallback(_created)
10790+        d.addCallback(lambda ignored:
10791+            self.failUnlessEqual(self.n.get_size(), 0))
10792+        d.addCallback(lambda ignored:
10793+            self.n.overwrite(MutableData("foobarbaz")))
10794+        d.addCallback(lambda ignored:
10795+            self.failUnlessEqual(self.n.get_size(), 9))
10796+        d.addCallback(lambda ignored:
10797+            self.nodemaker.create_mutable_file(MutableData("foobarbaz")))
10798+        d.addCallback(_created)
10799+        d.addCallback(lambda ignored:
10800+            self.failUnlessEqual(self.n.get_size(), 9))
10801+        return d
10802+
10803+
10804+class PublishMixin:
10805+    def publish_one(self):
10806+        # publish a file and create shares, which can then be manipulated
10807+        # later.
10808+        self.CONTENTS = "New contents go here" * 1000
10809+        self.uploadable = MutableData(self.CONTENTS)
10810+        self._storage = FakeStorage()
10811+        self._nodemaker = make_nodemaker(self._storage)
10812+        self._storage_broker = self._nodemaker.storage_broker
10813+        d = self._nodemaker.create_mutable_file(self.uploadable)
10814+        def _created(node):
10815+            self._fn = node
10816+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10817         d.addCallback(_created)
10818hunk ./src/allmydata/test/test_mutable.py 717
10819-        def _done(shares_and_shareids):
10820-            (shares, share_ids) = shares_and_shareids
10821-            self.failUnlessEqual(len(shares), 10)
10822-            for sh in shares:
10823-                self.failUnless(isinstance(sh, str))
10824-                self.failUnlessEqual(len(sh), 7)
10825-            self.failUnlessEqual(len(share_ids), 10)
10826-        d.addCallback(_done)
10827         return d
10828 
10829hunk ./src/allmydata/test/test_mutable.py 719
10830-    def test_generate(self):
10831-        nm = make_nodemaker()
10832-        CONTENTS = "some initial contents"
10833-        d = nm.create_mutable_file(CONTENTS)
10834-        def _created(fn):
10835-            self._fn = fn
10836-            p = Publish(fn, nm.storage_broker, None)
10837-            self._p = p
10838-            p.newdata = CONTENTS
10839-            p.required_shares = 3
10840-            p.total_shares = 10
10841-            p.setup_encoding_parameters()
10842-            p._new_seqnum = 3
10843-            p.salt = "SALT" * 4
10844-            # make some fake shares
10845-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
10846-            p._privkey = fn.get_privkey()
10847-            p._encprivkey = fn.get_encprivkey()
10848-            p._pubkey = fn.get_pubkey()
10849-            return p._generate_shares(shares_and_ids)
10850+    def publish_mdmf(self):
10851+        # like publish_one, except that the result is guaranteed to be
10852+        # an MDMF file.
10853+        # self.CONTENTS should have more than one segment.
10854+        self.CONTENTS = "This is an MDMF file" * 100000
10855+        self.uploadable = MutableData(self.CONTENTS)
10856+        self._storage = FakeStorage()
10857+        self._nodemaker = make_nodemaker(self._storage)
10858+        self._storage_broker = self._nodemaker.storage_broker
10859+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
10860+        def _created(node):
10861+            self._fn = node
10862+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10863         d.addCallback(_created)
10864hunk ./src/allmydata/test/test_mutable.py 733
10865-        def _generated(res):
10866-            p = self._p
10867-            final_shares = p.shares
10868-            root_hash = p.root_hash
10869-            self.failUnlessEqual(len(root_hash), 32)
10870-            self.failUnless(isinstance(final_shares, dict))
10871-            self.failUnlessEqual(len(final_shares), 10)
10872-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
10873-            for i,sh in final_shares.items():
10874-                self.failUnless(isinstance(sh, str))
10875-                # feed the share through the unpacker as a sanity-check
10876-                pieces = unpack_share(sh)
10877-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
10878-                 pubkey, signature, share_hash_chain, block_hash_tree,
10879-                 share_data, enc_privkey) = pieces
10880-                self.failUnlessEqual(u_seqnum, 3)
10881-                self.failUnlessEqual(u_root_hash, root_hash)
10882-                self.failUnlessEqual(k, 3)
10883-                self.failUnlessEqual(N, 10)
10884-                self.failUnlessEqual(segsize, 21)
10885-                self.failUnlessEqual(datalen, len(CONTENTS))
10886-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
10887-                sig_material = struct.pack(">BQ32s16s BBQQ",
10888-                                           0, p._new_seqnum, root_hash, IV,
10889-                                           k, N, segsize, datalen)
10890-                self.failUnless(p._pubkey.verify(sig_material, signature))
10891-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
10892-                self.failUnless(isinstance(share_hash_chain, dict))
10893-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
10894-                for shnum,share_hash in share_hash_chain.items():
10895-                    self.failUnless(isinstance(shnum, int))
10896-                    self.failUnless(isinstance(share_hash, str))
10897-                    self.failUnlessEqual(len(share_hash), 32)
10898-                self.failUnless(isinstance(block_hash_tree, list))
10899-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
10900-                self.failUnlessEqual(IV, "SALT"*4)
10901-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
10902-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
10903-        d.addCallback(_generated)
10904         return d
10905 
10906hunk ./src/allmydata/test/test_mutable.py 735
10907-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
10908-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
10909-    # when we publish to zero peers, we should get a NotEnoughSharesError
10910 
10911hunk ./src/allmydata/test/test_mutable.py 736
10912-class PublishMixin:
10913-    def publish_one(self):
10914-        # publish a file and create shares, which can then be manipulated
10915-        # later.
10916-        self.CONTENTS = "New contents go here" * 1000
10917+    def publish_sdmf(self):
10918+        # like publish_one, except that the result is guaranteed to be
10919+        # an SDMF file
10920+        self.CONTENTS = "This is an SDMF file" * 1000
10921+        self.uploadable = MutableData(self.CONTENTS)
10922         self._storage = FakeStorage()
10923         self._nodemaker = make_nodemaker(self._storage)
10924         self._storage_broker = self._nodemaker.storage_broker
10925hunk ./src/allmydata/test/test_mutable.py 744
10926-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
10927+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
10928         def _created(node):
10929             self._fn = node
10930             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
10931hunk ./src/allmydata/test/test_mutable.py 751
10932         d.addCallback(_created)
10933         return d
10934 
10935-    def publish_multiple(self):
10936+
10937+    def publish_multiple(self, version=0):
10938         self.CONTENTS = ["Contents 0",
10939                          "Contents 1",
10940                          "Contents 2",
10941hunk ./src/allmydata/test/test_mutable.py 758
10942                          "Contents 3a",
10943                          "Contents 3b"]
10944+        self.uploadables = [MutableData(d) for d in self.CONTENTS]
10945         self._copied_shares = {}
10946         self._storage = FakeStorage()
10947         self._nodemaker = make_nodemaker(self._storage)
10948hunk ./src/allmydata/test/test_mutable.py 762
10949-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
10950+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
10951         def _created(node):
10952             self._fn = node
10953             # now create multiple versions of the same file, and accumulate
10954hunk ./src/allmydata/test/test_mutable.py 769
10955             # their shares, so we can mix and match them later.
10956             d = defer.succeed(None)
10957             d.addCallback(self._copy_shares, 0)
10958-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
10959+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
10960             d.addCallback(self._copy_shares, 1)
10961hunk ./src/allmydata/test/test_mutable.py 771
10962-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
10963+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
10964             d.addCallback(self._copy_shares, 2)
10965hunk ./src/allmydata/test/test_mutable.py 773
10966-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
10967+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
10968             d.addCallback(self._copy_shares, 3)
10969             # now we replace all the shares with version s3, and upload a new
10970             # version to get s4b.
10971hunk ./src/allmydata/test/test_mutable.py 779
10972             rollback = dict([(i,2) for i in range(10)])
10973             d.addCallback(lambda res: self._set_versions(rollback))
10974-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
10975+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
10976             d.addCallback(self._copy_shares, 4)
10977             # we leave the storage in state 4
10978             return d
10979hunk ./src/allmydata/test/test_mutable.py 786
10980         d.addCallback(_created)
10981         return d
10982 
10983+
10984     def _copy_shares(self, ignored, index):
10985         shares = self._storage._peers
10986         # we need a deep copy
10987hunk ./src/allmydata/test/test_mutable.py 810
10988                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
10989 
10990 
10991+
10992+
10993 class Servermap(unittest.TestCase, PublishMixin):
10994     def setUp(self):
10995         return self.publish_one()
10996hunk ./src/allmydata/test/test_mutable.py 816
10997 
10998-    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None):
10999+    def make_servermap(self, mode=MODE_CHECK, fn=None, sb=None,
11000+                       update_range=None):
11001         if fn is None:
11002             fn = self._fn
11003         if sb is None:
11004hunk ./src/allmydata/test/test_mutable.py 823
11005             sb = self._storage_broker
11006         smu = ServermapUpdater(fn, sb, Monitor(),
11007-                               ServerMap(), mode)
11008+                               ServerMap(), mode, update_range=update_range)
11009         d = smu.update()
11010         return d
11011 
11012hunk ./src/allmydata/test/test_mutable.py 889
11013         # create a new file, which is large enough to knock the privkey out
11014         # of the early part of the file
11015         LARGE = "These are Larger contents" * 200 # about 5KB
11016-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
11017+        LARGE_uploadable = MutableData(LARGE)
11018+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
11019         def _created(large_fn):
11020             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
11021             return self.make_servermap(MODE_WRITE, large_fn2)
11022hunk ./src/allmydata/test/test_mutable.py 898
11023         d.addCallback(lambda sm: self.failUnlessOneRecoverable(sm, 10))
11024         return d
11025 
11026+
11027     def test_mark_bad(self):
11028         d = defer.succeed(None)
11029         ms = self.make_servermap
11030hunk ./src/allmydata/test/test_mutable.py 944
11031         self._storage._peers = {} # delete all shares
11032         ms = self.make_servermap
11033         d = defer.succeed(None)
11034-
11035+#
11036         d.addCallback(lambda res: ms(mode=MODE_CHECK))
11037         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
11038 
11039hunk ./src/allmydata/test/test_mutable.py 996
11040         return d
11041 
11042 
11043+    def test_servermapupdater_finds_mdmf_files(self):
11044+        # setUp already published an MDMF file for us. We just need to
11045+        # make sure that when we run the ServermapUpdater, the file is
11046+        # reported to have one recoverable version.
11047+        d = defer.succeed(None)
11048+        d.addCallback(lambda ignored:
11049+            self.publish_mdmf())
11050+        d.addCallback(lambda ignored:
11051+            self.make_servermap(mode=MODE_CHECK))
11052+        # Calling make_servermap also updates the servermap in the mode
11053+        # that we specify, so we just need to see what it says.
11054+        def _check_servermap(sm):
11055+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
11056+        d.addCallback(_check_servermap)
11057+        return d
11058+
11059+
11060+    def test_fetch_update(self):
11061+        d = defer.succeed(None)
11062+        d.addCallback(lambda ignored:
11063+            self.publish_mdmf())
11064+        d.addCallback(lambda ignored:
11065+            self.make_servermap(mode=MODE_WRITE, update_range=(1, 2)))
11066+        def _check_servermap(sm):
11067+            # 10 shares
11068+            self.failUnlessEqual(len(sm.update_data), 10)
11069+            # one version
11070+            for data in sm.update_data.itervalues():
11071+                self.failUnlessEqual(len(data), 1)
11072+        d.addCallback(_check_servermap)
11073+        return d
11074+
11075+
11076+    def test_servermapupdater_finds_sdmf_files(self):
11077+        d = defer.succeed(None)
11078+        d.addCallback(lambda ignored:
11079+            self.publish_sdmf())
11080+        d.addCallback(lambda ignored:
11081+            self.make_servermap(mode=MODE_CHECK))
11082+        d.addCallback(lambda servermap:
11083+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
11084+        return d
11085+
11086 
11087 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
11088     def setUp(self):
11089hunk ./src/allmydata/test/test_mutable.py 1079
11090         if version is None:
11091             version = servermap.best_recoverable_version()
11092         r = Retrieve(self._fn, servermap, version)
11093-        return r.download()
11094+        c = consumer.MemoryConsumer()
11095+        d = r.download(consumer=c)
11096+        d.addCallback(lambda mc: "".join(mc.chunks))
11097+        return d
11098+
11099 
11100     def test_basic(self):
11101         d = self.make_servermap()
11102hunk ./src/allmydata/test/test_mutable.py 1160
11103         return d
11104     test_no_servers_download.timeout = 15
11105 
11106+
11107     def _test_corrupt_all(self, offset, substring,
11108hunk ./src/allmydata/test/test_mutable.py 1162
11109-                          should_succeed=False, corrupt_early=True,
11110-                          failure_checker=None):
11111+                          should_succeed=False,
11112+                          corrupt_early=True,
11113+                          failure_checker=None,
11114+                          fetch_privkey=False):
11115         d = defer.succeed(None)
11116         if corrupt_early:
11117             d.addCallback(corrupt, self._storage, offset)
11118hunk ./src/allmydata/test/test_mutable.py 1182
11119                     self.failUnlessIn(substring, "".join(allproblems))
11120                 return servermap
11121             if should_succeed:
11122-                d1 = self._fn.download_version(servermap, ver)
11123+                d1 = self._fn.download_version(servermap, ver,
11124+                                               fetch_privkey)
11125                 d1.addCallback(lambda new_contents:
11126                                self.failUnlessEqual(new_contents, self.CONTENTS))
11127             else:
11128hunk ./src/allmydata/test/test_mutable.py 1190
11129                 d1 = self.shouldFail(NotEnoughSharesError,
11130                                      "_corrupt_all(offset=%s)" % (offset,),
11131                                      substring,
11132-                                     self._fn.download_version, servermap, ver)
11133+                                     self._fn.download_version, servermap,
11134+                                                                ver,
11135+                                                                fetch_privkey)
11136             if failure_checker:
11137                 d1.addCallback(failure_checker)
11138             d1.addCallback(lambda res: servermap)
11139hunk ./src/allmydata/test/test_mutable.py 1201
11140         return d
11141 
11142     def test_corrupt_all_verbyte(self):
11143-        # when the version byte is not 0, we hit an UnknownVersionError error
11144-        # in unpack_share().
11145+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
11146+        # error in unpack_share().
11147         d = self._test_corrupt_all(0, "UnknownVersionError")
11148         def _check_servermap(servermap):
11149             # and the dump should mention the problems
11150hunk ./src/allmydata/test/test_mutable.py 1208
11151             s = StringIO()
11152             dump = servermap.dump(s).getvalue()
11153-            self.failUnless("10 PROBLEMS" in dump, dump)
11154+            self.failUnless("30 PROBLEMS" in dump, dump)
11155         d.addCallback(_check_servermap)
11156         return d
11157 
11158hunk ./src/allmydata/test/test_mutable.py 1278
11159         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
11160 
11161 
11162+    def test_corrupt_all_encprivkey_late(self):
11163+        # this should work for the same reason as above, but we corrupt
11164+        # after the servermap update to exercise the error handling
11165+        # code.
11166+        # We need to remove the privkey from the node, or the retrieve
11167+        # process won't know to update it.
11168+        self._fn._privkey = None
11169+        return self._test_corrupt_all("enc_privkey",
11170+                                      None, # this shouldn't fail
11171+                                      should_succeed=True,
11172+                                      corrupt_early=False,
11173+                                      fetch_privkey=True)
11174+
11175+
11176     def test_corrupt_all_seqnum_late(self):
11177         # corrupting the seqnum between mapupdate and retrieve should result
11178         # in NotEnoughSharesError, since each share will look invalid
11179hunk ./src/allmydata/test/test_mutable.py 1298
11180         def _check(res):
11181             f = res[0]
11182             self.failUnless(f.check(NotEnoughSharesError))
11183-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
11184+            self.failUnless("uncoordinated write" in str(f))
11185         return self._test_corrupt_all(1, "ran out of peers",
11186                                       corrupt_early=False,
11187                                       failure_checker=_check)
11188hunk ./src/allmydata/test/test_mutable.py 1342
11189                             in str(servermap.problems[0]))
11190             ver = servermap.best_recoverable_version()
11191             r = Retrieve(self._fn, servermap, ver)
11192-            return r.download()
11193+            c = consumer.MemoryConsumer()
11194+            return r.download(c)
11195         d.addCallback(_do_retrieve)
11196hunk ./src/allmydata/test/test_mutable.py 1345
11197+        d.addCallback(lambda mc: "".join(mc.chunks))
11198         d.addCallback(lambda new_contents:
11199                       self.failUnlessEqual(new_contents, self.CONTENTS))
11200         return d
11201hunk ./src/allmydata/test/test_mutable.py 1350
11202 
11203-    def test_corrupt_some(self):
11204-        # corrupt the data of first five shares (so the servermap thinks
11205-        # they're good but retrieve marks them as bad), so that the
11206-        # MODE_READ set of 6 will be insufficient, forcing node.download to
11207-        # retry with more servers.
11208-        corrupt(None, self._storage, "share_data", range(5))
11209-        d = self.make_servermap()
11210+
11211+    def _test_corrupt_some(self, offset, mdmf=False):
11212+        if mdmf:
11213+            d = self.publish_mdmf()
11214+        else:
11215+            d = defer.succeed(None)
11216+        d.addCallback(lambda ignored:
11217+            corrupt(None, self._storage, offset, range(5)))
11218+        d.addCallback(lambda ignored:
11219+            self.make_servermap())
11220         def _do_retrieve(servermap):
11221             ver = servermap.best_recoverable_version()
11222             self.failUnless(ver)
11223hunk ./src/allmydata/test/test_mutable.py 1366
11224             return self._fn.download_best_version()
11225         d.addCallback(_do_retrieve)
11226         d.addCallback(lambda new_contents:
11227-                      self.failUnlessEqual(new_contents, self.CONTENTS))
11228+            self.failUnlessEqual(new_contents, self.CONTENTS))
11229         return d
11230 
11231hunk ./src/allmydata/test/test_mutable.py 1369
11232+
11233+    def test_corrupt_some(self):
11234+        # corrupt the data of first five shares (so the servermap thinks
11235+        # they're good but retrieve marks them as bad), so that the
11236+        # MODE_READ set of 6 will be insufficient, forcing node.download to
11237+        # retry with more servers.
11238+        return self._test_corrupt_some("share_data")
11239+
11240+
11241     def test_download_fails(self):
11242hunk ./src/allmydata/test/test_mutable.py 1379
11243-        corrupt(None, self._storage, "signature")
11244-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11245+        d = corrupt(None, self._storage, "signature")
11246+        d.addCallback(lambda ignored:
11247+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
11248                             "no recoverable versions",
11249hunk ./src/allmydata/test/test_mutable.py 1383
11250-                            self._fn.download_best_version)
11251+                            self._fn.download_best_version))
11252         return d
11253 
11254 
11255hunk ./src/allmydata/test/test_mutable.py 1387
11256+
11257+    def test_corrupt_mdmf_block_hash_tree(self):
11258+        d = self.publish_mdmf()
11259+        d.addCallback(lambda ignored:
11260+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11261+                                   "block hash tree failure",
11262+                                   corrupt_early=False,
11263+                                   should_succeed=False))
11264+        return d
11265+
11266+
11267+    def test_corrupt_mdmf_block_hash_tree_late(self):
11268+        d = self.publish_mdmf()
11269+        d.addCallback(lambda ignored:
11270+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
11271+                                   "block hash tree failure",
11272+                                   corrupt_early=True,
11273+                                   should_succeed=False))
11274+        return d
11275+
11276+
11277+    def test_corrupt_mdmf_share_data(self):
11278+        d = self.publish_mdmf()
11279+        d.addCallback(lambda ignored:
11280+            # TODO: Find out what the block size is and corrupt a
11281+            # specific block, rather than just guessing.
11282+            self._test_corrupt_all(("share_data", 12 * 40),
11283+                                    "block hash tree failure",
11284+                                    corrupt_early=True,
11285+                                    should_succeed=False))
11286+        return d
11287+
11288+
11289+    def test_corrupt_some_mdmf(self):
11290+        return self._test_corrupt_some(("share_data", 12 * 40),
11291+                                       mdmf=True)
11292+
11293+
11294 class CheckerMixin:
11295     def check_good(self, r, where):
11296         self.failUnless(r.is_healthy(), where)
11297hunk ./src/allmydata/test/test_mutable.py 1455
11298         d.addCallback(self.check_good, "test_check_good")
11299         return d
11300 
11301+    def test_check_mdmf_good(self):
11302+        d = self.publish_mdmf()
11303+        d.addCallback(lambda ignored:
11304+            self._fn.check(Monitor()))
11305+        d.addCallback(self.check_good, "test_check_mdmf_good")
11306+        return d
11307+
11308     def test_check_no_shares(self):
11309         for shares in self._storage._peers.values():
11310             shares.clear()
11311hunk ./src/allmydata/test/test_mutable.py 1469
11312         d.addCallback(self.check_bad, "test_check_no_shares")
11313         return d
11314 
11315+    def test_check_mdmf_no_shares(self):
11316+        d = self.publish_mdmf()
11317+        def _then(ignored):
11318+            for share in self._storage._peers.values():
11319+                share.clear()
11320+        d.addCallback(_then)
11321+        d.addCallback(lambda ignored:
11322+            self._fn.check(Monitor()))
11323+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
11324+        return d
11325+
11326     def test_check_not_enough_shares(self):
11327         for shares in self._storage._peers.values():
11328             for shnum in shares.keys():
11329hunk ./src/allmydata/test/test_mutable.py 1489
11330         d.addCallback(self.check_bad, "test_check_not_enough_shares")
11331         return d
11332 
11333+    def test_check_mdmf_not_enough_shares(self):
11334+        d = self.publish_mdmf()
11335+        def _then(ignored):
11336+            for shares in self._storage._peers.values():
11337+                for shnum in shares.keys():
11338+                    if shnum > 0:
11339+                        del shares[shnum]
11340+        d.addCallback(_then)
11341+        d.addCallback(lambda ignored:
11342+            self._fn.check(Monitor()))
11343+        d.addCallback(self.check_bad, "test_check_mdmf_not_enougH_shares")
11344+        return d
11345+
11346+
11347     def test_check_all_bad_sig(self):
11348hunk ./src/allmydata/test/test_mutable.py 1504
11349-        corrupt(None, self._storage, 1) # bad sig
11350-        d = self._fn.check(Monitor())
11351+        d = corrupt(None, self._storage, 1) # bad sig
11352+        d.addCallback(lambda ignored:
11353+            self._fn.check(Monitor()))
11354         d.addCallback(self.check_bad, "test_check_all_bad_sig")
11355         return d
11356 
11357hunk ./src/allmydata/test/test_mutable.py 1510
11358+    def test_check_mdmf_all_bad_sig(self):
11359+        d = self.publish_mdmf()
11360+        d.addCallback(lambda ignored:
11361+            corrupt(None, self._storage, 1))
11362+        d.addCallback(lambda ignored:
11363+            self._fn.check(Monitor()))
11364+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
11365+        return d
11366+
11367     def test_check_all_bad_blocks(self):
11368hunk ./src/allmydata/test/test_mutable.py 1520
11369-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11370+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11371         # the Checker won't notice this.. it doesn't look at actual data
11372hunk ./src/allmydata/test/test_mutable.py 1522
11373-        d = self._fn.check(Monitor())
11374+        d.addCallback(lambda ignored:
11375+            self._fn.check(Monitor()))
11376         d.addCallback(self.check_good, "test_check_all_bad_blocks")
11377         return d
11378 
11379hunk ./src/allmydata/test/test_mutable.py 1527
11380+
11381+    def test_check_mdmf_all_bad_blocks(self):
11382+        d = self.publish_mdmf()
11383+        d.addCallback(lambda ignored:
11384+            corrupt(None, self._storage, "share_data"))
11385+        d.addCallback(lambda ignored:
11386+            self._fn.check(Monitor()))
11387+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
11388+        return d
11389+
11390     def test_verify_good(self):
11391         d = self._fn.check(Monitor(), verify=True)
11392         d.addCallback(self.check_good, "test_verify_good")
11393hunk ./src/allmydata/test/test_mutable.py 1541
11394         return d
11395+    test_verify_good.timeout = 15
11396 
11397     def test_verify_all_bad_sig(self):
11398hunk ./src/allmydata/test/test_mutable.py 1544
11399-        corrupt(None, self._storage, 1) # bad sig
11400-        d = self._fn.check(Monitor(), verify=True)
11401+        d = corrupt(None, self._storage, 1) # bad sig
11402+        d.addCallback(lambda ignored:
11403+            self._fn.check(Monitor(), verify=True))
11404         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
11405         return d
11406 
11407hunk ./src/allmydata/test/test_mutable.py 1551
11408     def test_verify_one_bad_sig(self):
11409-        corrupt(None, self._storage, 1, [9]) # bad sig
11410-        d = self._fn.check(Monitor(), verify=True)
11411+        d = corrupt(None, self._storage, 1, [9]) # bad sig
11412+        d.addCallback(lambda ignored:
11413+            self._fn.check(Monitor(), verify=True))
11414         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
11415         return d
11416 
11417hunk ./src/allmydata/test/test_mutable.py 1558
11418     def test_verify_one_bad_block(self):
11419-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
11420+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
11421         # the Verifier *will* notice this, since it examines every byte
11422hunk ./src/allmydata/test/test_mutable.py 1560
11423-        d = self._fn.check(Monitor(), verify=True)
11424+        d.addCallback(lambda ignored:
11425+            self._fn.check(Monitor(), verify=True))
11426         d.addCallback(self.check_bad, "test_verify_one_bad_block")
11427         d.addCallback(self.check_expected_failure,
11428                       CorruptShareError, "block hash tree failure",
11429hunk ./src/allmydata/test/test_mutable.py 1569
11430         return d
11431 
11432     def test_verify_one_bad_sharehash(self):
11433-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
11434-        d = self._fn.check(Monitor(), verify=True)
11435+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
11436+        d.addCallback(lambda ignored:
11437+            self._fn.check(Monitor(), verify=True))
11438         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
11439         d.addCallback(self.check_expected_failure,
11440                       CorruptShareError, "corrupt hashes",
11441hunk ./src/allmydata/test/test_mutable.py 1579
11442         return d
11443 
11444     def test_verify_one_bad_encprivkey(self):
11445-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11446-        d = self._fn.check(Monitor(), verify=True)
11447+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11448+        d.addCallback(lambda ignored:
11449+            self._fn.check(Monitor(), verify=True))
11450         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
11451         d.addCallback(self.check_expected_failure,
11452                       CorruptShareError, "invalid privkey",
11453hunk ./src/allmydata/test/test_mutable.py 1589
11454         return d
11455 
11456     def test_verify_one_bad_encprivkey_uncheckable(self):
11457-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11458+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
11459         readonly_fn = self._fn.get_readonly()
11460         # a read-only node has no way to validate the privkey
11461hunk ./src/allmydata/test/test_mutable.py 1592
11462-        d = readonly_fn.check(Monitor(), verify=True)
11463+        d.addCallback(lambda ignored:
11464+            readonly_fn.check(Monitor(), verify=True))
11465         d.addCallback(self.check_good,
11466                       "test_verify_one_bad_encprivkey_uncheckable")
11467         return d
11468hunk ./src/allmydata/test/test_mutable.py 1598
11469 
11470+
11471+    def test_verify_mdmf_good(self):
11472+        d = self.publish_mdmf()
11473+        d.addCallback(lambda ignored:
11474+            self._fn.check(Monitor(), verify=True))
11475+        d.addCallback(self.check_good, "test_verify_mdmf_good")
11476+        return d
11477+
11478+
11479+    def test_verify_mdmf_one_bad_block(self):
11480+        d = self.publish_mdmf()
11481+        d.addCallback(lambda ignored:
11482+            corrupt(None, self._storage, "share_data", [1]))
11483+        d.addCallback(lambda ignored:
11484+            self._fn.check(Monitor(), verify=True))
11485+        # We should find one bad block here
11486+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
11487+        d.addCallback(self.check_expected_failure,
11488+                      CorruptShareError, "block hash tree failure",
11489+                      "test_verify_mdmf_one_bad_block")
11490+        return d
11491+
11492+
11493+    def test_verify_mdmf_bad_encprivkey(self):
11494+        d = self.publish_mdmf()
11495+        d.addCallback(lambda ignored:
11496+            corrupt(None, self._storage, "enc_privkey", [1]))
11497+        d.addCallback(lambda ignored:
11498+            self._fn.check(Monitor(), verify=True))
11499+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
11500+        d.addCallback(self.check_expected_failure,
11501+                      CorruptShareError, "privkey",
11502+                      "test_verify_mdmf_bad_encprivkey")
11503+        return d
11504+
11505+
11506+    def test_verify_mdmf_bad_sig(self):
11507+        d = self.publish_mdmf()
11508+        d.addCallback(lambda ignored:
11509+            corrupt(None, self._storage, 1, [1]))
11510+        d.addCallback(lambda ignored:
11511+            self._fn.check(Monitor(), verify=True))
11512+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
11513+        return d
11514+
11515+
11516+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
11517+        d = self.publish_mdmf()
11518+        d.addCallback(lambda ignored:
11519+            corrupt(None, self._storage, "enc_privkey", [1]))
11520+        d.addCallback(lambda ignored:
11521+            self._fn.get_readonly())
11522+        d.addCallback(lambda fn:
11523+            fn.check(Monitor(), verify=True))
11524+        d.addCallback(self.check_good,
11525+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
11526+        return d
11527+
11528+
11529 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
11530 
11531     def get_shares(self, s):
11532hunk ./src/allmydata/test/test_mutable.py 1722
11533         current_shares = self.old_shares[-1]
11534         self.failUnlessEqual(old_shares, current_shares)
11535 
11536+
11537     def test_unrepairable_0shares(self):
11538         d = self.publish_one()
11539         def _delete_all_shares(ign):
11540hunk ./src/allmydata/test/test_mutable.py 1737
11541         d.addCallback(_check)
11542         return d
11543 
11544+    def test_mdmf_unrepairable_0shares(self):
11545+        d = self.publish_mdmf()
11546+        def _delete_all_shares(ign):
11547+            shares = self._storage._peers
11548+            for peerid in shares:
11549+                shares[peerid] = {}
11550+        d.addCallback(_delete_all_shares)
11551+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11552+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11553+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
11554+        return d
11555+
11556+
11557     def test_unrepairable_1share(self):
11558         d = self.publish_one()
11559         def _delete_all_shares(ign):
11560hunk ./src/allmydata/test/test_mutable.py 1766
11561         d.addCallback(_check)
11562         return d
11563 
11564+    def test_mdmf_unrepairable_1share(self):
11565+        d = self.publish_mdmf()
11566+        def _delete_all_shares(ign):
11567+            shares = self._storage._peers
11568+            for peerid in shares:
11569+                for shnum in list(shares[peerid]):
11570+                    if shnum > 0:
11571+                        del shares[peerid][shnum]
11572+        d.addCallback(_delete_all_shares)
11573+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11574+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11575+        def _check(crr):
11576+            self.failUnlessEqual(crr.get_successful(), False)
11577+        d.addCallback(_check)
11578+        return d
11579+
11580+    def test_repairable_5shares(self):
11581+        d = self.publish_mdmf()
11582+        def _delete_all_shares(ign):
11583+            shares = self._storage._peers
11584+            for peerid in shares:
11585+                for shnum in list(shares[peerid]):
11586+                    if shnum > 4:
11587+                        del shares[peerid][shnum]
11588+        d.addCallback(_delete_all_shares)
11589+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11590+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11591+        def _check(crr):
11592+            self.failUnlessEqual(crr.get_successful(), True)
11593+        d.addCallback(_check)
11594+        return d
11595+
11596+    def test_mdmf_repairable_5shares(self):
11597+        d = self.publish_mdmf()
11598+        def _delete_some_shares(ign):
11599+            shares = self._storage._peers
11600+            for peerid in shares:
11601+                for shnum in list(shares[peerid]):
11602+                    if shnum > 5:
11603+                        del shares[peerid][shnum]
11604+        d.addCallback(_delete_some_shares)
11605+        d.addCallback(lambda ign: self._fn.check(Monitor()))
11606+        def _check(cr):
11607+            self.failIf(cr.is_healthy())
11608+            self.failUnless(cr.is_recoverable())
11609+            return cr
11610+        d.addCallback(_check)
11611+        d.addCallback(lambda check_results: self._fn.repair(check_results))
11612+        def _check1(crr):
11613+            self.failUnlessEqual(crr.get_successful(), True)
11614+        d.addCallback(_check1)
11615+        return d
11616+
11617+
11618     def test_merge(self):
11619         self.old_shares = []
11620         d = self.publish_multiple()
11621hunk ./src/allmydata/test/test_mutable.py 1934
11622 class MultipleEncodings(unittest.TestCase):
11623     def setUp(self):
11624         self.CONTENTS = "New contents go here"
11625+        self.uploadable = MutableData(self.CONTENTS)
11626         self._storage = FakeStorage()
11627         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
11628         self._storage_broker = self._nodemaker.storage_broker
11629hunk ./src/allmydata/test/test_mutable.py 1938
11630-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
11631+        d = self._nodemaker.create_mutable_file(self.uploadable)
11632         def _created(node):
11633             self._fn = node
11634         d.addCallback(_created)
11635hunk ./src/allmydata/test/test_mutable.py 1944
11636         return d
11637 
11638-    def _encode(self, k, n, data):
11639+    def _encode(self, k, n, data, version=SDMF_VERSION):
11640         # encode 'data' into a peerid->shares dict.
11641 
11642         fn = self._fn
11643hunk ./src/allmydata/test/test_mutable.py 1960
11644         # and set the encoding parameters to something completely different
11645         fn2._required_shares = k
11646         fn2._total_shares = n
11647+        # Normally a servermap update would occur before a publish.
11648+        # Here, it doesn't, so we have to do it ourselves.
11649+        fn2.set_version(version)
11650 
11651         s = self._storage
11652         s._peers = {} # clear existing storage
11653hunk ./src/allmydata/test/test_mutable.py 1967
11654         p2 = Publish(fn2, self._storage_broker, None)
11655-        d = p2.publish(data)
11656+        uploadable = MutableData(data)
11657+        d = p2.publish(uploadable)
11658         def _published(res):
11659             shares = s._peers
11660             s._peers = {}
11661hunk ./src/allmydata/test/test_mutable.py 2235
11662         self.basedir = "mutable/Problems/test_publish_surprise"
11663         self.set_up_grid()
11664         nm = self.g.clients[0].nodemaker
11665-        d = nm.create_mutable_file("contents 1")
11666+        d = nm.create_mutable_file(MutableData("contents 1"))
11667         def _created(n):
11668             d = defer.succeed(None)
11669             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11670hunk ./src/allmydata/test/test_mutable.py 2245
11671             d.addCallback(_got_smap1)
11672             # then modify the file, leaving the old map untouched
11673             d.addCallback(lambda res: log.msg("starting winning write"))
11674-            d.addCallback(lambda res: n.overwrite("contents 2"))
11675+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11676             # now attempt to modify the file with the old servermap. This
11677             # will look just like an uncoordinated write, in which every
11678             # single share got updated between our mapupdate and our publish
11679hunk ./src/allmydata/test/test_mutable.py 2254
11680                           self.shouldFail(UncoordinatedWriteError,
11681                                           "test_publish_surprise", None,
11682                                           n.upload,
11683-                                          "contents 2a", self.old_map))
11684+                                          MutableData("contents 2a"), self.old_map))
11685             return d
11686         d.addCallback(_created)
11687         return d
11688hunk ./src/allmydata/test/test_mutable.py 2263
11689         self.basedir = "mutable/Problems/test_retrieve_surprise"
11690         self.set_up_grid()
11691         nm = self.g.clients[0].nodemaker
11692-        d = nm.create_mutable_file("contents 1")
11693+        d = nm.create_mutable_file(MutableData("contents 1"))
11694         def _created(n):
11695             d = defer.succeed(None)
11696             d.addCallback(lambda res: n.get_servermap(MODE_READ))
11697hunk ./src/allmydata/test/test_mutable.py 2273
11698             d.addCallback(_got_smap1)
11699             # then modify the file, leaving the old map untouched
11700             d.addCallback(lambda res: log.msg("starting winning write"))
11701-            d.addCallback(lambda res: n.overwrite("contents 2"))
11702+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11703             # now attempt to retrieve the old version with the old servermap.
11704             # This will look like someone has changed the file since we
11705             # updated the servermap.
11706hunk ./src/allmydata/test/test_mutable.py 2282
11707             d.addCallback(lambda res:
11708                           self.shouldFail(NotEnoughSharesError,
11709                                           "test_retrieve_surprise",
11710-                                          "ran out of peers: have 0 shares (k=3)",
11711+                                          "ran out of peers: have 0 of 1",
11712                                           n.download_version,
11713                                           self.old_map,
11714                                           self.old_map.best_recoverable_version(),
11715hunk ./src/allmydata/test/test_mutable.py 2291
11716         d.addCallback(_created)
11717         return d
11718 
11719+
11720     def test_unexpected_shares(self):
11721         # upload the file, take a servermap, shut down one of the servers,
11722         # upload it again (causing shares to appear on a new server), then
11723hunk ./src/allmydata/test/test_mutable.py 2301
11724         self.basedir = "mutable/Problems/test_unexpected_shares"
11725         self.set_up_grid()
11726         nm = self.g.clients[0].nodemaker
11727-        d = nm.create_mutable_file("contents 1")
11728+        d = nm.create_mutable_file(MutableData("contents 1"))
11729         def _created(n):
11730             d = defer.succeed(None)
11731             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
11732hunk ./src/allmydata/test/test_mutable.py 2313
11733                 self.g.remove_server(peer0)
11734                 # then modify the file, leaving the old map untouched
11735                 log.msg("starting winning write")
11736-                return n.overwrite("contents 2")
11737+                return n.overwrite(MutableData("contents 2"))
11738             d.addCallback(_got_smap1)
11739             # now attempt to modify the file with the old servermap. This
11740             # will look just like an uncoordinated write, in which every
11741hunk ./src/allmydata/test/test_mutable.py 2323
11742                           self.shouldFail(UncoordinatedWriteError,
11743                                           "test_surprise", None,
11744                                           n.upload,
11745-                                          "contents 2a", self.old_map))
11746+                                          MutableData("contents 2a"), self.old_map))
11747             return d
11748         d.addCallback(_created)
11749         return d
11750hunk ./src/allmydata/test/test_mutable.py 2327
11751+    test_unexpected_shares.timeout = 15
11752 
11753     def test_bad_server(self):
11754         # Break one server, then create the file: the initial publish should
11755hunk ./src/allmydata/test/test_mutable.py 2361
11756         d.addCallback(_break_peer0)
11757         # now "create" the file, using the pre-established key, and let the
11758         # initial publish finally happen
11759-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
11760+        d.addCallback(lambda res: nm.create_mutable_file(MutableData("contents 1")))
11761         # that ought to work
11762         def _got_node(n):
11763             d = n.download_best_version()
11764hunk ./src/allmydata/test/test_mutable.py 2370
11765             def _break_peer1(res):
11766                 self.g.break_server(self.server1.get_serverid())
11767             d.addCallback(_break_peer1)
11768-            d.addCallback(lambda res: n.overwrite("contents 2"))
11769+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11770             # that ought to work too
11771             d.addCallback(lambda res: n.download_best_version())
11772             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11773hunk ./src/allmydata/test/test_mutable.py 2402
11774         peerids = [s.get_serverid() for s in sb.get_connected_servers()]
11775         self.g.break_server(peerids[0])
11776 
11777-        d = nm.create_mutable_file("contents 1")
11778+        d = nm.create_mutable_file(MutableData("contents 1"))
11779         def _created(n):
11780             d = n.download_best_version()
11781             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
11782hunk ./src/allmydata/test/test_mutable.py 2410
11783             def _break_second_server(res):
11784                 self.g.break_server(peerids[1])
11785             d.addCallback(_break_second_server)
11786-            d.addCallback(lambda res: n.overwrite("contents 2"))
11787+            d.addCallback(lambda res: n.overwrite(MutableData("contents 2")))
11788             # that ought to work too
11789             d.addCallback(lambda res: n.download_best_version())
11790             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
11791hunk ./src/allmydata/test/test_mutable.py 2429
11792         d = self.shouldFail(NotEnoughServersError,
11793                             "test_publish_all_servers_bad",
11794                             "Ran out of non-bad servers",
11795-                            nm.create_mutable_file, "contents")
11796+                            nm.create_mutable_file, MutableData("contents"))
11797         return d
11798 
11799     def test_publish_no_servers(self):
11800hunk ./src/allmydata/test/test_mutable.py 2441
11801         d = self.shouldFail(NotEnoughServersError,
11802                             "test_publish_no_servers",
11803                             "Ran out of non-bad servers",
11804-                            nm.create_mutable_file, "contents")
11805+                            nm.create_mutable_file, MutableData("contents"))
11806         return d
11807     test_publish_no_servers.timeout = 30
11808 
11809hunk ./src/allmydata/test/test_mutable.py 2459
11810         # we need some contents that are large enough to push the privkey out
11811         # of the early part of the file
11812         LARGE = "These are Larger contents" * 2000 # about 50KB
11813-        d = nm.create_mutable_file(LARGE)
11814+        LARGE_uploadable = MutableData(LARGE)
11815+        d = nm.create_mutable_file(LARGE_uploadable)
11816         def _created(n):
11817             self.uri = n.get_uri()
11818             self.n2 = nm.create_from_cap(self.uri)
11819hunk ./src/allmydata/test/test_mutable.py 2495
11820         self.basedir = "mutable/Problems/test_privkey_query_missing"
11821         self.set_up_grid(num_servers=20)
11822         nm = self.g.clients[0].nodemaker
11823-        LARGE = "These are Larger contents" * 2000 # about 50KB
11824+        LARGE = "These are Larger contents" * 2000 # about 50KiB
11825+        LARGE_uploadable = MutableData(LARGE)
11826         nm._node_cache = DevNullDictionary() # disable the nodecache
11827 
11828hunk ./src/allmydata/test/test_mutable.py 2499
11829-        d = nm.create_mutable_file(LARGE)
11830+        d = nm.create_mutable_file(LARGE_uploadable)
11831         def _created(n):
11832             self.uri = n.get_uri()
11833             self.n2 = nm.create_from_cap(self.uri)
11834hunk ./src/allmydata/test/test_mutable.py 2509
11835         d.addCallback(_created)
11836         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
11837         return d
11838+
11839+
11840+    def test_block_and_hash_query_error(self):
11841+        # This tests for what happens when a query to a remote server
11842+        # fails in either the hash validation step or the block getting
11843+        # step (because of batching, this is the same actual query).
11844+        # We need to have the storage server persist up until the point
11845+        # that its prefix is validated, then suddenly die. This
11846+        # exercises some exception handling code in Retrieve.
11847+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
11848+        self.set_up_grid(num_servers=20)
11849+        nm = self.g.clients[0].nodemaker
11850+        CONTENTS = "contents" * 2000
11851+        CONTENTS_uploadable = MutableData(CONTENTS)
11852+        d = nm.create_mutable_file(CONTENTS_uploadable)
11853+        def _created(node):
11854+            self._node = node
11855+        d.addCallback(_created)
11856+        d.addCallback(lambda ignored:
11857+            self._node.get_servermap(MODE_READ))
11858+        def _then(servermap):
11859+            # we have our servermap. Now we set up the servers like the
11860+            # tests above -- the first one that gets a read call should
11861+            # start throwing errors, but only after returning its prefix
11862+            # for validation. Since we'll download without fetching the
11863+            # private key, the next query to the remote server will be
11864+            # for either a block and salt or for hashes, either of which
11865+            # will exercise the error handling code.
11866+            killer = FirstServerGetsKilled()
11867+            for (serverid, ss) in nm.storage_broker.get_all_servers():
11868+                ss.post_call_notifier = killer.notify
11869+            ver = servermap.best_recoverable_version()
11870+            assert ver
11871+            return self._node.download_version(servermap, ver)
11872+        d.addCallback(_then)
11873+        d.addCallback(lambda data:
11874+            self.failUnlessEqual(data, CONTENTS))
11875+        return d
11876+
11877+
11878+class FileHandle(unittest.TestCase):
11879+    def setUp(self):
11880+        self.test_data = "Test Data" * 50000
11881+        self.sio = StringIO(self.test_data)
11882+        self.uploadable = MutableFileHandle(self.sio)
11883+
11884+
11885+    def test_filehandle_read(self):
11886+        self.basedir = "mutable/FileHandle/test_filehandle_read"
11887+        chunk_size = 10
11888+        for i in xrange(0, len(self.test_data), chunk_size):
11889+            data = self.uploadable.read(chunk_size)
11890+            data = "".join(data)
11891+            start = i
11892+            end = i + chunk_size
11893+            self.failUnlessEqual(data, self.test_data[start:end])
11894+
11895+
11896+    def test_filehandle_get_size(self):
11897+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
11898+        actual_size = len(self.test_data)
11899+        size = self.uploadable.get_size()
11900+        self.failUnlessEqual(size, actual_size)
11901+
11902+
11903+    def test_filehandle_get_size_out_of_order(self):
11904+        # We should be able to call get_size whenever we want without
11905+        # disturbing the location of the seek pointer.
11906+        chunk_size = 100
11907+        data = self.uploadable.read(chunk_size)
11908+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11909+
11910+        # Now get the size.
11911+        size = self.uploadable.get_size()
11912+        self.failUnlessEqual(size, len(self.test_data))
11913+
11914+        # Now get more data. We should be right where we left off.
11915+        more_data = self.uploadable.read(chunk_size)
11916+        start = chunk_size
11917+        end = chunk_size * 2
11918+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11919+
11920+
11921+    def test_filehandle_file(self):
11922+        # Make sure that the MutableFileHandle works on a file as well
11923+        # as a StringIO object, since in some cases it will be asked to
11924+        # deal with files.
11925+        self.basedir = self.mktemp()
11926+        # necessary? What am I doing wrong here?
11927+        os.mkdir(self.basedir)
11928+        f_path = os.path.join(self.basedir, "test_file")
11929+        f = open(f_path, "w")
11930+        f.write(self.test_data)
11931+        f.close()
11932+        f = open(f_path, "r")
11933+
11934+        uploadable = MutableFileHandle(f)
11935+
11936+        data = uploadable.read(len(self.test_data))
11937+        self.failUnlessEqual("".join(data), self.test_data)
11938+        size = uploadable.get_size()
11939+        self.failUnlessEqual(size, len(self.test_data))
11940+
11941+
11942+    def test_close(self):
11943+        # Make sure that the MutableFileHandle closes its handle when
11944+        # told to do so.
11945+        self.uploadable.close()
11946+        self.failUnless(self.sio.closed)
11947+
11948+
11949+class DataHandle(unittest.TestCase):
11950+    def setUp(self):
11951+        self.test_data = "Test Data" * 50000
11952+        self.uploadable = MutableData(self.test_data)
11953+
11954+
11955+    def test_datahandle_read(self):
11956+        chunk_size = 10
11957+        for i in xrange(0, len(self.test_data), chunk_size):
11958+            data = self.uploadable.read(chunk_size)
11959+            data = "".join(data)
11960+            start = i
11961+            end = i + chunk_size
11962+            self.failUnlessEqual(data, self.test_data[start:end])
11963+
11964+
11965+    def test_datahandle_get_size(self):
11966+        actual_size = len(self.test_data)
11967+        size = self.uploadable.get_size()
11968+        self.failUnlessEqual(size, actual_size)
11969+
11970+
11971+    def test_datahandle_get_size_out_of_order(self):
11972+        # We should be able to call get_size whenever we want without
11973+        # disturbing the location of the seek pointer.
11974+        chunk_size = 100
11975+        data = self.uploadable.read(chunk_size)
11976+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
11977+
11978+        # Now get the size.
11979+        size = self.uploadable.get_size()
11980+        self.failUnlessEqual(size, len(self.test_data))
11981+
11982+        # Now get more data. We should be right where we left off.
11983+        more_data = self.uploadable.read(chunk_size)
11984+        start = chunk_size
11985+        end = chunk_size * 2
11986+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
11987+
11988+
11989+class Version(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin, \
11990+              PublishMixin):
11991+    def setUp(self):
11992+        GridTestMixin.setUp(self)
11993+        self.basedir = self.mktemp()
11994+        self.set_up_grid()
11995+        self.c = self.g.clients[0]
11996+        self.nm = self.c.nodemaker
11997+        self.data = "test data" * 100000 # about 900 KiB; MDMF
11998+        self.small_data = "test data" * 10 # about 90 B; SDMF
11999+        return self.do_upload()
12000+
12001+
12002+    def do_upload(self):
12003+        d1 = self.nm.create_mutable_file(MutableData(self.data),
12004+                                         version=MDMF_VERSION)
12005+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
12006+        dl = gatherResults([d1, d2])
12007+        def _then((n1, n2)):
12008+            assert isinstance(n1, MutableFileNode)
12009+            assert isinstance(n2, MutableFileNode)
12010+
12011+            self.mdmf_node = n1
12012+            self.sdmf_node = n2
12013+        dl.addCallback(_then)
12014+        return dl
12015+
12016+
12017+    def test_get_readonly_mutable_version(self):
12018+        # Attempting to get a mutable version of a mutable file from a
12019+        # filenode initialized with a readcap should return a readonly
12020+        # version of that same node.
12021+        ro = self.mdmf_node.get_readonly()
12022+        d = ro.get_best_mutable_version()
12023+        d.addCallback(lambda version:
12024+            self.failUnless(version.is_readonly()))
12025+        d.addCallback(lambda ignored:
12026+            self.sdmf_node.get_readonly())
12027+        d.addCallback(lambda version:
12028+            self.failUnless(version.is_readonly()))
12029+        return d
12030+
12031+
12032+    def test_get_sequence_number(self):
12033+        d = self.mdmf_node.get_best_readable_version()
12034+        d.addCallback(lambda bv:
12035+            self.failUnlessEqual(bv.get_sequence_number(), 1))
12036+        d.addCallback(lambda ignored:
12037+            self.sdmf_node.get_best_readable_version())
12038+        d.addCallback(lambda bv:
12039+            self.failUnlessEqual(bv.get_sequence_number(), 1))
12040+        # Now update. The sequence number in both cases should be 1 in
12041+        # both cases.
12042+        def _do_update(ignored):
12043+            new_data = MutableData("foo bar baz" * 100000)
12044+            new_small_data = MutableData("foo bar baz" * 10)
12045+            d1 = self.mdmf_node.overwrite(new_data)
12046+            d2 = self.sdmf_node.overwrite(new_small_data)
12047+            dl = gatherResults([d1, d2])
12048+            return dl
12049+        d.addCallback(_do_update)
12050+        d.addCallback(lambda ignored:
12051+            self.mdmf_node.get_best_readable_version())
12052+        d.addCallback(lambda bv:
12053+            self.failUnlessEqual(bv.get_sequence_number(), 2))
12054+        d.addCallback(lambda ignored:
12055+            self.sdmf_node.get_best_readable_version())
12056+        d.addCallback(lambda bv:
12057+            self.failUnlessEqual(bv.get_sequence_number(), 2))
12058+        return d
12059+
12060+
12061+    def test_get_writekey(self):
12062+        d = self.mdmf_node.get_best_mutable_version()
12063+        d.addCallback(lambda bv:
12064+            self.failUnlessEqual(bv.get_writekey(),
12065+                                 self.mdmf_node.get_writekey()))
12066+        d.addCallback(lambda ignored:
12067+            self.sdmf_node.get_best_mutable_version())
12068+        d.addCallback(lambda bv:
12069+            self.failUnlessEqual(bv.get_writekey(),
12070+                                 self.sdmf_node.get_writekey()))
12071+        return d
12072+
12073+
12074+    def test_get_storage_index(self):
12075+        d = self.mdmf_node.get_best_mutable_version()
12076+        d.addCallback(lambda bv:
12077+            self.failUnlessEqual(bv.get_storage_index(),
12078+                                 self.mdmf_node.get_storage_index()))
12079+        d.addCallback(lambda ignored:
12080+            self.sdmf_node.get_best_mutable_version())
12081+        d.addCallback(lambda bv:
12082+            self.failUnlessEqual(bv.get_storage_index(),
12083+                                 self.sdmf_node.get_storage_index()))
12084+        return d
12085+
12086+
12087+    def test_get_readonly_version(self):
12088+        d = self.mdmf_node.get_best_readable_version()
12089+        d.addCallback(lambda bv:
12090+            self.failUnless(bv.is_readonly()))
12091+        d.addCallback(lambda ignored:
12092+            self.sdmf_node.get_best_readable_version())
12093+        d.addCallback(lambda bv:
12094+            self.failUnless(bv.is_readonly()))
12095+        return d
12096+
12097+
12098+    def test_get_mutable_version(self):
12099+        d = self.mdmf_node.get_best_mutable_version()
12100+        d.addCallback(lambda bv:
12101+            self.failIf(bv.is_readonly()))
12102+        d.addCallback(lambda ignored:
12103+            self.sdmf_node.get_best_mutable_version())
12104+        d.addCallback(lambda bv:
12105+            self.failIf(bv.is_readonly()))
12106+        return d
12107+
12108+
12109+    def test_toplevel_overwrite(self):
12110+        new_data = MutableData("foo bar baz" * 100000)
12111+        new_small_data = MutableData("foo bar baz" * 10)
12112+        d = self.mdmf_node.overwrite(new_data)
12113+        d.addCallback(lambda ignored:
12114+            self.mdmf_node.download_best_version())
12115+        d.addCallback(lambda data:
12116+            self.failUnlessEqual(data, "foo bar baz" * 100000))
12117+        d.addCallback(lambda ignored:
12118+            self.sdmf_node.overwrite(new_small_data))
12119+        d.addCallback(lambda ignored:
12120+            self.sdmf_node.download_best_version())
12121+        d.addCallback(lambda data:
12122+            self.failUnlessEqual(data, "foo bar baz" * 10))
12123+        return d
12124+
12125+
12126+    def test_toplevel_modify(self):
12127+        def modifier(old_contents, servermap, first_time):
12128+            return old_contents + "modified"
12129+        d = self.mdmf_node.modify(modifier)
12130+        d.addCallback(lambda ignored:
12131+            self.mdmf_node.download_best_version())
12132+        d.addCallback(lambda data:
12133+            self.failUnlessIn("modified", data))
12134+        d.addCallback(lambda ignored:
12135+            self.sdmf_node.modify(modifier))
12136+        d.addCallback(lambda ignored:
12137+            self.sdmf_node.download_best_version())
12138+        d.addCallback(lambda data:
12139+            self.failUnlessIn("modified", data))
12140+        return d
12141+
12142+
12143+    def test_version_modify(self):
12144+        # TODO: When we can publish multiple versions, alter this test
12145+        # to modify a version other than the best usable version, then
12146+        # test to see that the best recoverable version is that.
12147+        def modifier(old_contents, servermap, first_time):
12148+            return old_contents + "modified"
12149+        d = self.mdmf_node.modify(modifier)
12150+        d.addCallback(lambda ignored:
12151+            self.mdmf_node.download_best_version())
12152+        d.addCallback(lambda data:
12153+            self.failUnlessIn("modified", data))
12154+        d.addCallback(lambda ignored:
12155+            self.sdmf_node.modify(modifier))
12156+        d.addCallback(lambda ignored:
12157+            self.sdmf_node.download_best_version())
12158+        d.addCallback(lambda data:
12159+            self.failUnlessIn("modified", data))
12160+        return d
12161+
12162+
12163+    def test_download_version(self):
12164+        d = self.publish_multiple()
12165+        # We want to have two recoverable versions on the grid.
12166+        d.addCallback(lambda res:
12167+                      self._set_versions({0:0,2:0,4:0,6:0,8:0,
12168+                                          1:1,3:1,5:1,7:1,9:1}))
12169+        # Now try to download each version. We should get the plaintext
12170+        # associated with that version.
12171+        d.addCallback(lambda ignored:
12172+            self._fn.get_servermap(mode=MODE_READ))
12173+        def _got_servermap(smap):
12174+            versions = smap.recoverable_versions()
12175+            assert len(versions) == 2
12176+
12177+            self.servermap = smap
12178+            self.version1, self.version2 = versions
12179+            assert self.version1 != self.version2
12180+
12181+            self.version1_seqnum = self.version1[0]
12182+            self.version2_seqnum = self.version2[0]
12183+            self.version1_index = self.version1_seqnum - 1
12184+            self.version2_index = self.version2_seqnum - 1
12185+
12186+        d.addCallback(_got_servermap)
12187+        d.addCallback(lambda ignored:
12188+            self._fn.download_version(self.servermap, self.version1))
12189+        d.addCallback(lambda results:
12190+            self.failUnlessEqual(self.CONTENTS[self.version1_index],
12191+                                 results))
12192+        d.addCallback(lambda ignored:
12193+            self._fn.download_version(self.servermap, self.version2))
12194+        d.addCallback(lambda results:
12195+            self.failUnlessEqual(self.CONTENTS[self.version2_index],
12196+                                 results))
12197+        return d
12198+
12199+
12200+    def test_download_nonexistent_version(self):
12201+        d = self.mdmf_node.get_servermap(mode=MODE_WRITE)
12202+        def _set_servermap(servermap):
12203+            self.servermap = servermap
12204+        d.addCallback(_set_servermap)
12205+        d.addCallback(lambda ignored:
12206+           self.shouldFail(UnrecoverableFileError, "nonexistent version",
12207+                           None,
12208+                           self.mdmf_node.download_version, self.servermap,
12209+                           "not a version"))
12210+        return d
12211+
12212+
12213+    def test_partial_read(self):
12214+        # read only a few bytes at a time, and see that the results are
12215+        # what we expect.
12216+        d = self.mdmf_node.get_best_readable_version()
12217+        def _read_data(version):
12218+            c = consumer.MemoryConsumer()
12219+            d2 = defer.succeed(None)
12220+            for i in xrange(0, len(self.data), 10000):
12221+                d2.addCallback(lambda ignored, i=i: version.read(c, i, 10000))
12222+            d2.addCallback(lambda ignored:
12223+                self.failUnlessEqual(self.data, "".join(c.chunks)))
12224+            return d2
12225+        d.addCallback(_read_data)
12226+        return d
12227+
12228+
12229+    def test_read(self):
12230+        d = self.mdmf_node.get_best_readable_version()
12231+        def _read_data(version):
12232+            c = consumer.MemoryConsumer()
12233+            d2 = defer.succeed(None)
12234+            d2.addCallback(lambda ignored: version.read(c))
12235+            d2.addCallback(lambda ignored:
12236+                self.failUnlessEqual("".join(c.chunks), self.data))
12237+            return d2
12238+        d.addCallback(_read_data)
12239+        return d
12240+
12241+
12242+    def test_download_best_version(self):
12243+        d = self.mdmf_node.download_best_version()
12244+        d.addCallback(lambda data:
12245+            self.failUnlessEqual(data, self.data))
12246+        d.addCallback(lambda ignored:
12247+            self.sdmf_node.download_best_version())
12248+        d.addCallback(lambda data:
12249+            self.failUnlessEqual(data, self.small_data))
12250+        return d
12251+
12252+
12253+class Update(GridTestMixin, unittest.TestCase, testutil.ShouldFailMixin):
12254+    def setUp(self):
12255+        GridTestMixin.setUp(self)
12256+        self.basedir = self.mktemp()
12257+        self.set_up_grid()
12258+        self.c = self.g.clients[0]
12259+        self.nm = self.c.nodemaker
12260+        self.data = "test data" * 100000 # about 900 KiB; MDMF
12261+        self.small_data = "test data" * 10 # about 90 B; SDMF
12262+        return self.do_upload()
12263+
12264+
12265+    def do_upload(self):
12266+        d1 = self.nm.create_mutable_file(MutableData(self.data),
12267+                                         version=MDMF_VERSION)
12268+        d2 = self.nm.create_mutable_file(MutableData(self.small_data))
12269+        dl = gatherResults([d1, d2])
12270+        def _then((n1, n2)):
12271+            assert isinstance(n1, MutableFileNode)
12272+            assert isinstance(n2, MutableFileNode)
12273+
12274+            self.mdmf_node = n1
12275+            self.sdmf_node = n2
12276+        dl.addCallback(_then)
12277+        return dl
12278+
12279+
12280+    def test_append(self):
12281+        # We should be able to append data to the middle of a mutable
12282+        # file and get what we expect.
12283+        new_data = self.data + "appended"
12284+        d = self.mdmf_node.get_best_mutable_version()
12285+        d.addCallback(lambda mv:
12286+            mv.update(MutableData("appended"), len(self.data)))
12287+        d.addCallback(lambda ignored:
12288+            self.mdmf_node.download_best_version())
12289+        d.addCallback(lambda results:
12290+            self.failUnlessEqual(results, new_data))
12291+        return d
12292+    test_append.timeout = 15
12293+
12294+
12295+    def test_replace(self):
12296+        # We should be able to replace data in the middle of a mutable
12297+        # file and get what we expect back.
12298+        new_data = self.data[:100]
12299+        new_data += "appended"
12300+        new_data += self.data[108:]
12301+        d = self.mdmf_node.get_best_mutable_version()
12302+        d.addCallback(lambda mv:
12303+            mv.update(MutableData("appended"), 100))
12304+        d.addCallback(lambda ignored:
12305+            self.mdmf_node.download_best_version())
12306+        d.addCallback(lambda results:
12307+            self.failUnlessEqual(results, new_data))
12308+        return d
12309+
12310+
12311+    def test_replace_and_extend(self):
12312+        # We should be able to replace data in the middle of a mutable
12313+        # file and extend that mutable file and get what we expect.
12314+        new_data = self.data[:100]
12315+        new_data += "modified " * 100000
12316+        d = self.mdmf_node.get_best_mutable_version()
12317+        d.addCallback(lambda mv:
12318+            mv.update(MutableData("modified " * 100000), 100))
12319+        d.addCallback(lambda ignored:
12320+            self.mdmf_node.download_best_version())
12321+        d.addCallback(lambda results:
12322+            self.failUnlessEqual(results, new_data))
12323+        return d
12324+
12325+
12326+    def test_append_power_of_two(self):
12327+        # If we attempt to extend a mutable file so that its segment
12328+        # count crosses a power-of-two boundary, the update operation
12329+        # should know how to reencode the file.
12330+
12331+        # Note that the data populating self.mdmf_node is about 900 KiB
12332+        # long -- this is 7 segments in the default segment size. So we
12333+        # need to add 2 segments worth of data to push it over a
12334+        # power-of-two boundary.
12335+        segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12336+        new_data = self.data + (segment * 2)
12337+        d = self.mdmf_node.get_best_mutable_version()
12338+        d.addCallback(lambda mv:
12339+            mv.update(MutableData(segment * 2), len(self.data)))
12340+        d.addCallback(lambda ignored:
12341+            self.mdmf_node.download_best_version())
12342+        d.addCallback(lambda results:
12343+            self.failUnlessEqual(results, new_data))
12344+        return d
12345+    test_append_power_of_two.timeout = 15
12346+
12347+
12348+    def test_update_sdmf(self):
12349+        # Running update on a single-segment file should still work.
12350+        new_data = self.small_data + "appended"
12351+        d = self.sdmf_node.get_best_mutable_version()
12352+        d.addCallback(lambda mv:
12353+            mv.update(MutableData("appended"), len(self.small_data)))
12354+        d.addCallback(lambda ignored:
12355+            self.sdmf_node.download_best_version())
12356+        d.addCallback(lambda results:
12357+            self.failUnlessEqual(results, new_data))
12358+        return d
12359+
12360+    def test_replace_in_last_segment(self):
12361+        # The wrapper should know how to handle the tail segment
12362+        # appropriately.
12363+        replace_offset = len(self.data) - 100
12364+        new_data = self.data[:replace_offset] + "replaced"
12365+        rest_offset = replace_offset + len("replaced")
12366+        new_data += self.data[rest_offset:]
12367+        d = self.mdmf_node.get_best_mutable_version()
12368+        d.addCallback(lambda mv:
12369+            mv.update(MutableData("replaced"), replace_offset))
12370+        d.addCallback(lambda ignored:
12371+            self.mdmf_node.download_best_version())
12372+        d.addCallback(lambda results:
12373+            self.failUnlessEqual(results, new_data))
12374+        return d
12375+
12376+
12377+    def test_multiple_segment_replace(self):
12378+        replace_offset = 2 * DEFAULT_MAX_SEGMENT_SIZE
12379+        new_data = self.data[:replace_offset]
12380+        new_segment = "a" * DEFAULT_MAX_SEGMENT_SIZE
12381+        new_data += 2 * new_segment
12382+        new_data += "replaced"
12383+        rest_offset = len(new_data)
12384+        new_data += self.data[rest_offset:]
12385+        d = self.mdmf_node.get_best_mutable_version()
12386+        d.addCallback(lambda mv:
12387+            mv.update(MutableData((2 * new_segment) + "replaced"),
12388+                      replace_offset))
12389+        d.addCallback(lambda ignored:
12390+            self.mdmf_node.download_best_version())
12391+        d.addCallback(lambda results:
12392+            self.failUnlessEqual(results, new_data))
12393+        return d
12394hunk ./src/allmydata/test/test_sftp.py 32
12395 
12396 from allmydata.util.consumer import download_to_data
12397 from allmydata.immutable import upload
12398+from allmydata.mutable import publish
12399 from allmydata.test.no_network import GridTestMixin
12400 from allmydata.test.common import ShouldFailMixin
12401 from allmydata.test.common_util import ReallyEqualMixin
12402hunk ./src/allmydata/test/test_sftp.py 84
12403         return d
12404 
12405     def _set_up_tree(self):
12406-        d = self.client.create_mutable_file("mutable file contents")
12407+        u = publish.MutableData("mutable file contents")
12408+        d = self.client.create_mutable_file(u)
12409         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
12410         def _created_mutable(n):
12411             self.mutable = n
12412hunk ./src/allmydata/test/test_sftp.py 1334
12413         d.addCallback(lambda ign: self.failUnlessEqual(sftpd.all_heisenfiles, {}))
12414         d.addCallback(lambda ign: self.failUnlessEqual(self.handler._heisenfiles, {}))
12415         return d
12416+    test_makeDirectory.timeout = 15
12417 
12418     def test_execCommand_and_openShell(self):
12419         class FakeProtocol:
12420hunk ./src/allmydata/test/test_storage.py 27
12421                                      LayoutInvalid, MDMFSIGNABLEHEADER, \
12422                                      SIGNED_PREFIX, MDMFHEADER, \
12423                                      MDMFOFFSETS, SDMFSlotWriteProxy
12424-from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
12425-                                 SDMF_VERSION
12426+from allmydata.interfaces import BadWriteEnablerError
12427 from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
12428 from allmydata.test.common_web import WebRenderingMixin
12429 from allmydata.web.storage import StorageStatus, remove_prefix
12430hunk ./src/allmydata/test/test_system.py 26
12431 from allmydata.monitor import Monitor
12432 from allmydata.mutable.common import NotWriteableError
12433 from allmydata.mutable import layout as mutable_layout
12434+from allmydata.mutable.publish import MutableData
12435 from foolscap.api import DeadReferenceError
12436 from twisted.python.failure import Failure
12437 from twisted.web.client import getPage
12438hunk ./src/allmydata/test/test_system.py 467
12439     def test_mutable(self):
12440         self.basedir = "system/SystemTest/test_mutable"
12441         DATA = "initial contents go here."  # 25 bytes % 3 != 0
12442+        DATA_uploadable = MutableData(DATA)
12443         NEWDATA = "new contents yay"
12444hunk ./src/allmydata/test/test_system.py 469
12445+        NEWDATA_uploadable = MutableData(NEWDATA)
12446         NEWERDATA = "this is getting old"
12447hunk ./src/allmydata/test/test_system.py 471
12448+        NEWERDATA_uploadable = MutableData(NEWERDATA)
12449 
12450         d = self.set_up_nodes(use_key_generator=True)
12451 
12452hunk ./src/allmydata/test/test_system.py 478
12453         def _create_mutable(res):
12454             c = self.clients[0]
12455             log.msg("starting create_mutable_file")
12456-            d1 = c.create_mutable_file(DATA)
12457+            d1 = c.create_mutable_file(DATA_uploadable)
12458             def _done(res):
12459                 log.msg("DONE: %s" % (res,))
12460                 self._mutable_node_1 = res
12461hunk ./src/allmydata/test/test_system.py 565
12462             self.failUnlessEqual(res, DATA)
12463             # replace the data
12464             log.msg("starting replace1")
12465-            d1 = newnode.overwrite(NEWDATA)
12466+            d1 = newnode.overwrite(NEWDATA_uploadable)
12467             d1.addCallback(lambda res: newnode.download_best_version())
12468             return d1
12469         d.addCallback(_check_download_3)
12470hunk ./src/allmydata/test/test_system.py 579
12471             newnode2 = self.clients[3].create_node_from_uri(uri)
12472             self._newnode3 = self.clients[3].create_node_from_uri(uri)
12473             log.msg("starting replace2")
12474-            d1 = newnode1.overwrite(NEWERDATA)
12475+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
12476             d1.addCallback(lambda res: newnode2.download_best_version())
12477             return d1
12478         d.addCallback(_check_download_4)
12479hunk ./src/allmydata/test/test_system.py 649
12480         def _check_empty_file(res):
12481             # make sure we can create empty files, this usually screws up the
12482             # segsize math
12483-            d1 = self.clients[2].create_mutable_file("")
12484+            d1 = self.clients[2].create_mutable_file(MutableData(""))
12485             d1.addCallback(lambda newnode: newnode.download_best_version())
12486             d1.addCallback(lambda res: self.failUnlessEqual("", res))
12487             return d1
12488hunk ./src/allmydata/test/test_system.py 680
12489                                  self.key_generator_svc.key_generator.pool_size + size_delta)
12490 
12491         d.addCallback(check_kg_poolsize, 0)
12492-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
12493+        d.addCallback(lambda junk:
12494+            self.clients[3].create_mutable_file(MutableData('hello, world')))
12495         d.addCallback(check_kg_poolsize, -1)
12496         d.addCallback(lambda junk: self.clients[3].create_dirnode())
12497         d.addCallback(check_kg_poolsize, -2)
12498hunk ./src/allmydata/test/test_web.py 28
12499 from allmydata.util.encodingutil import to_str
12500 from allmydata.test.common import FakeCHKFileNode, FakeMutableFileNode, \
12501      create_chk_filenode, WebErrorMixin, ShouldFailMixin, make_mutable_file_uri
12502-from allmydata.interfaces import IMutableFileNode
12503+from allmydata.interfaces import IMutableFileNode, SDMF_VERSION, MDMF_VERSION
12504 from allmydata.mutable import servermap, publish, retrieve
12505 import allmydata.test.common_util as testutil
12506 from allmydata.test.no_network import GridTestMixin
12507hunk ./src/allmydata/test/test_web.py 57
12508         return FakeCHKFileNode(cap)
12509     def _create_mutable(self, cap):
12510         return FakeMutableFileNode(None, None, None, None).init_from_cap(cap)
12511-    def create_mutable_file(self, contents="", keysize=None):
12512+    def create_mutable_file(self, contents="", keysize=None,
12513+                            version=SDMF_VERSION):
12514         n = FakeMutableFileNode(None, None, None, None)
12515hunk ./src/allmydata/test/test_web.py 60
12516+        n.set_version(version)
12517         return n.create(contents)
12518 
12519 class FakeUploader(service.Service):
12520hunk ./src/allmydata/test/test_web.py 157
12521         self.nodemaker = FakeNodeMaker(None, self._secret_holder, None,
12522                                        self.uploader, None,
12523                                        None, None)
12524+        self.mutable_file_default = SDMF_VERSION
12525 
12526     def startService(self):
12527         return service.MultiService.startService(self)
12528hunk ./src/allmydata/test/test_web.py 762
12529                              self.PUT, base + "/@@name=/blah.txt", "")
12530         return d
12531 
12532+
12533     def test_GET_DIRURL_named_bad(self):
12534         base = "/file/%s" % urllib.quote(self._foo_uri)
12535         d = self.shouldFail2(error.Error, "test_PUT_DIRURL_named_bad",
12536hunk ./src/allmydata/test/test_web.py 878
12537                                                       self.NEWFILE_CONTENTS))
12538         return d
12539 
12540+    def test_PUT_NEWFILEURL_unlinked_mdmf(self):
12541+        # this should get us a few segments of an MDMF mutable file,
12542+        # which we can then test for.
12543+        contents = self.NEWFILE_CONTENTS * 300000
12544+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12545+                     contents)
12546+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12547+        d.addCallback(lambda json: self.failUnlessIn("mdmf", json))
12548+        return d
12549+
12550+    def test_PUT_NEWFILEURL_unlinked_sdmf(self):
12551+        contents = self.NEWFILE_CONTENTS * 300000
12552+        d = self.PUT("/uri?mutable=true&mutable-type=sdmf",
12553+                     contents)
12554+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12555+        d.addCallback(lambda json: self.failUnlessIn("sdmf", json))
12556+        return d
12557+
12558     def test_PUT_NEWFILEURL_range_bad(self):
12559         headers = {"content-range": "bytes 1-10/%d" % len(self.NEWFILE_CONTENTS)}
12560         target = self.public_url + "/foo/new.txt"
12561hunk ./src/allmydata/test/test_web.py 928
12562         return d
12563 
12564     def test_PUT_NEWFILEURL_mutable_toobig(self):
12565-        d = self.shouldFail2(error.Error, "test_PUT_NEWFILEURL_mutable_toobig",
12566-                             "413 Request Entity Too Large",
12567-                             "SDMF is limited to one segment, and 10001 > 10000",
12568-                             self.PUT,
12569-                             self.public_url + "/foo/new.txt?mutable=true",
12570-                             "b" * (self.s.MUTABLE_SIZELIMIT+1))
12571+        # It is okay to upload large mutable files, so we should be able
12572+        # to do that.
12573+        d = self.PUT(self.public_url + "/foo/new.txt?mutable=true",
12574+                     "b" * (self.s.MUTABLE_SIZELIMIT + 1))
12575         return d
12576 
12577     def test_PUT_NEWFILEURL_replace(self):
12578hunk ./src/allmydata/test/test_web.py 1026
12579         d.addCallback(_check1)
12580         return d
12581 
12582+    def test_GET_FILEURL_json_mutable_type(self):
12583+        # The JSON should include mutable-type, which says whether the
12584+        # file is SDMF or MDMF
12585+        d = self.PUT("/uri?mutable=true&mutable-type=mdmf",
12586+                     self.NEWFILE_CONTENTS * 300000)
12587+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12588+        def _got_json(json, version):
12589+            data = simplejson.loads(json)
12590+            assert "filenode" == data[0]
12591+            data = data[1]
12592+            assert isinstance(data, dict)
12593+
12594+            self.failUnlessIn("mutable-type", data)
12595+            self.failUnlessEqual(data['mutable-type'], version)
12596+
12597+        d.addCallback(_got_json, "mdmf")
12598+        # Now make an SDMF file and check that it is reported correctly.
12599+        d.addCallback(lambda ignored:
12600+            self.PUT("/uri?mutable=true&mutable-type=sdmf",
12601+                      self.NEWFILE_CONTENTS * 300000))
12602+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12603+        d.addCallback(_got_json, "sdmf")
12604+        return d
12605+
12606     def test_GET_FILEURL_json_missing(self):
12607         d = self.GET(self.public_url + "/foo/missing?json")
12608         d.addBoth(self.should404, "test_GET_FILEURL_json_missing")
12609hunk ./src/allmydata/test/test_web.py 1088
12610         d.addBoth(self.should404, "test_GET_FILEURL_uri_missing")
12611         return d
12612 
12613-    def test_GET_DIRECTORY_html_banner(self):
12614+    def test_GET_DIRECTORY_html(self):
12615         d = self.GET(self.public_url + "/foo", followRedirect=True)
12616         def _check(res):
12617             self.failUnlessIn('<div class="toolbar-item"><a href="../../..">Return to Welcome page</a></div>',res)
12618hunk ./src/allmydata/test/test_web.py 1092
12619+            self.failUnlessIn("mutable-type-mdmf", res)
12620+            self.failUnlessIn("mutable-type-sdmf", res)
12621         d.addCallback(_check)
12622         return d
12623 
12624hunk ./src/allmydata/test/test_web.py 1097
12625+    def test_GET_root_html(self):
12626+        # make sure that we have the option to upload an unlinked
12627+        # mutable file in SDMF and MDMF formats.
12628+        d = self.GET("/")
12629+        def _got_html(html):
12630+            # These are radio buttons that allow the user to toggle
12631+            # whether a particular mutable file is MDMF or SDMF.
12632+            self.failUnlessIn("mutable-type-mdmf", html)
12633+            self.failUnlessIn("mutable-type-sdmf", html)
12634+        d.addCallback(_got_html)
12635+        return d
12636+
12637+    def test_mutable_type_defaults(self):
12638+        # The checked="checked" attribute of the inputs corresponding to
12639+        # the mutable-type parameter should change as expected with the
12640+        # value configured in tahoe.cfg.
12641+        #
12642+        # By default, the value configured with the client is
12643+        # SDMF_VERSION, so that should be checked.
12644+        assert self.s.mutable_file_default == SDMF_VERSION
12645+
12646+        d = self.GET("/")
12647+        def _got_html(html, value):
12648+            i = 'input checked="checked" type="radio" id="mutable-type-%s"'
12649+            self.failUnlessIn(i % value, html)
12650+        d.addCallback(_got_html, "sdmf")
12651+        d.addCallback(lambda ignored:
12652+            self.GET(self.public_url + "/foo", followRedirect=True))
12653+        d.addCallback(_got_html, "sdmf")
12654+        # Now switch the configuration value to MDMF. The MDMF radio
12655+        # buttons should now be checked on these pages.
12656+        def _swap_values(ignored):
12657+            self.s.mutable_file_default = MDMF_VERSION
12658+        d.addCallback(_swap_values)
12659+        d.addCallback(lambda ignored: self.GET("/"))
12660+        d.addCallback(_got_html, "mdmf")
12661+        d.addCallback(lambda ignored:
12662+            self.GET(self.public_url + "/foo", followRedirect=True))
12663+        d.addCallback(_got_html, "mdmf")
12664+        return d
12665+
12666     def test_GET_DIRURL(self):
12667         # the addSlash means we get a redirect here
12668         # from /uri/$URI/foo/ , we need ../../../ to get back to the root
12669hunk ./src/allmydata/test/test_web.py 1227
12670         d.addCallback(self.failUnlessIsFooJSON)
12671         return d
12672 
12673+    def test_GET_DIRURL_json_mutable_type(self):
12674+        d = self.PUT(self.public_url + \
12675+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12676+                     self.NEWFILE_CONTENTS * 300000)
12677+        d.addCallback(lambda ignored:
12678+            self.PUT(self.public_url + \
12679+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12680+                     self.NEWFILE_CONTENTS * 300000))
12681+        # Now we have an MDMF and SDMF file in the directory. If we GET
12682+        # its JSON, we should see their encodings.
12683+        d.addCallback(lambda ignored:
12684+            self.GET(self.public_url + "/foo?t=json"))
12685+        def _got_json(json):
12686+            data = simplejson.loads(json)
12687+            assert data[0] == "dirnode"
12688+
12689+            data = data[1]
12690+            kids = data['children']
12691+
12692+            mdmf_data = kids['mdmf.txt'][1]
12693+            self.failUnlessIn("mutable-type", mdmf_data)
12694+            self.failUnlessEqual(mdmf_data['mutable-type'], "mdmf")
12695+
12696+            sdmf_data = kids['sdmf.txt'][1]
12697+            self.failUnlessIn("mutable-type", sdmf_data)
12698+            self.failUnlessEqual(sdmf_data['mutable-type'], "sdmf")
12699+        d.addCallback(_got_json)
12700+        return d
12701+
12702 
12703     def test_POST_DIRURL_manifest_no_ophandle(self):
12704         d = self.shouldFail2(error.Error,
12705hunk ./src/allmydata/test/test_web.py 1810
12706         return d
12707 
12708     def test_POST_upload_no_link_mutable_toobig(self):
12709-        d = self.shouldFail2(error.Error,
12710-                             "test_POST_upload_no_link_mutable_toobig",
12711-                             "413 Request Entity Too Large",
12712-                             "SDMF is limited to one segment, and 10001 > 10000",
12713-                             self.POST,
12714-                             "/uri", t="upload", mutable="true",
12715-                             file=("new.txt",
12716-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12717+        # The SDMF size limit is no longer in place, so we should be
12718+        # able to upload mutable files that are as large as we want them
12719+        # to be.
12720+        d = self.POST("/uri", t="upload", mutable="true",
12721+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12722         return d
12723 
12724hunk ./src/allmydata/test/test_web.py 1817
12725+
12726+    def test_POST_upload_mutable_type_unlinked(self):
12727+        d = self.POST("/uri?t=upload&mutable=true&mutable-type=sdmf",
12728+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12729+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12730+        def _got_json(json, version):
12731+            data = simplejson.loads(json)
12732+            data = data[1]
12733+
12734+            self.failUnlessIn("mutable-type", data)
12735+            self.failUnlessEqual(data['mutable-type'], version)
12736+        d.addCallback(_got_json, "sdmf")
12737+        d.addCallback(lambda ignored:
12738+            self.POST("/uri?t=upload&mutable=true&mutable-type=mdmf",
12739+                      file=('mdmf.txt', self.NEWFILE_CONTENTS * 300000)))
12740+        d.addCallback(lambda filecap: self.GET("/uri/%s?t=json" % filecap))
12741+        d.addCallback(_got_json, "mdmf")
12742+        return d
12743+
12744+    def test_POST_upload_mutable_type(self):
12745+        d = self.POST(self.public_url + \
12746+                      "/foo?t=upload&mutable=true&mutable-type=sdmf",
12747+                      file=("sdmf.txt", self.NEWFILE_CONTENTS * 300000))
12748+        fn = self._foo_node
12749+        def _got_cap(filecap, filename):
12750+            filenameu = unicode(filename)
12751+            self.failUnlessURIMatchesRWChild(filecap, fn, filenameu)
12752+            return self.GET(self.public_url + "/foo/%s?t=json" % filename)
12753+        d.addCallback(_got_cap, "sdmf.txt")
12754+        def _got_json(json, version):
12755+            data = simplejson.loads(json)
12756+            data = data[1]
12757+
12758+            self.failUnlessIn("mutable-type", data)
12759+            self.failUnlessEqual(data['mutable-type'], version)
12760+        d.addCallback(_got_json, "sdmf")
12761+        d.addCallback(lambda ignored:
12762+            self.POST(self.public_url + \
12763+                      "/foo?t=upload&mutable=true&mutable-type=mdmf",
12764+                      file=("mdmf.txt", self.NEWFILE_CONTENTS * 300000)))
12765+        d.addCallback(_got_cap, "mdmf.txt")
12766+        d.addCallback(_got_json, "mdmf")
12767+        return d
12768+
12769     def test_POST_upload_mutable(self):
12770         # this creates a mutable file
12771         d = self.POST(self.public_url + "/foo", t="upload", mutable="true",
12772hunk ./src/allmydata/test/test_web.py 1985
12773             self.failUnlessReallyEqual(headers["content-type"], ["text/plain"])
12774         d.addCallback(_got_headers)
12775 
12776-        # make sure that size errors are displayed correctly for overwrite
12777-        d.addCallback(lambda res:
12778-                      self.shouldFail2(error.Error,
12779-                                       "test_POST_upload_mutable-toobig",
12780-                                       "413 Request Entity Too Large",
12781-                                       "SDMF is limited to one segment, and 10001 > 10000",
12782-                                       self.POST,
12783-                                       self.public_url + "/foo", t="upload",
12784-                                       mutable="true",
12785-                                       file=("new.txt",
12786-                                             "b" * (self.s.MUTABLE_SIZELIMIT+1)),
12787-                                       ))
12788-
12789+        # make sure that outdated size limits aren't enforced anymore.
12790+        d.addCallback(lambda ignored:
12791+            self.POST(self.public_url + "/foo", t="upload",
12792+                      mutable="true",
12793+                      file=("new.txt",
12794+                            "b" * (self.s.MUTABLE_SIZELIMIT+1))))
12795         d.addErrback(self.dump_error)
12796         return d
12797 
12798hunk ./src/allmydata/test/test_web.py 1995
12799     def test_POST_upload_mutable_toobig(self):
12800-        d = self.shouldFail2(error.Error,
12801-                             "test_POST_upload_mutable_toobig",
12802-                             "413 Request Entity Too Large",
12803-                             "SDMF is limited to one segment, and 10001 > 10000",
12804-                             self.POST,
12805-                             self.public_url + "/foo",
12806-                             t="upload", mutable="true",
12807-                             file=("new.txt",
12808-                                   "b" * (self.s.MUTABLE_SIZELIMIT+1)) )
12809+        # SDMF had a size limti that was removed a while ago. MDMF has
12810+        # never had a size limit. Test to make sure that we do not
12811+        # encounter errors when trying to upload large mutable files,
12812+        # since there should be no coded prohibitions regarding large
12813+        # mutable files.
12814+        d = self.POST(self.public_url + "/foo",
12815+                      t="upload", mutable="true",
12816+                      file=("new.txt", "b" * (self.s.MUTABLE_SIZELIMIT + 1)))
12817         return d
12818 
12819     def dump_error(self, f):
12820hunk ./src/allmydata/test/test_web.py 3005
12821                                                       contents))
12822         return d
12823 
12824+    def test_PUT_NEWFILEURL_mdmf(self):
12825+        new_contents = self.NEWFILE_CONTENTS * 300000
12826+        d = self.PUT(self.public_url + \
12827+                     "/foo/mdmf.txt?mutable=true&mutable-type=mdmf",
12828+                     new_contents)
12829+        d.addCallback(lambda ignored:
12830+            self.GET(self.public_url + "/foo/mdmf.txt?t=json"))
12831+        def _got_json(json):
12832+            data = simplejson.loads(json)
12833+            data = data[1]
12834+            self.failUnlessIn("mutable-type", data)
12835+            self.failUnlessEqual(data['mutable-type'], "mdmf")
12836+        d.addCallback(_got_json)
12837+        return d
12838+
12839+    def test_PUT_NEWFILEURL_sdmf(self):
12840+        new_contents = self.NEWFILE_CONTENTS * 300000
12841+        d = self.PUT(self.public_url + \
12842+                     "/foo/sdmf.txt?mutable=true&mutable-type=sdmf",
12843+                     new_contents)
12844+        d.addCallback(lambda ignored:
12845+            self.GET(self.public_url + "/foo/sdmf.txt?t=json"))
12846+        def _got_json(json):
12847+            data = simplejson.loads(json)
12848+            data = data[1]
12849+            self.failUnlessIn("mutable-type", data)
12850+            self.failUnlessEqual(data['mutable-type'], "sdmf")
12851+        d.addCallback(_got_json)
12852+        return d
12853+
12854     def test_PUT_NEWFILEURL_uri_replace(self):
12855         contents, n, new_uri = self.makefile(8)
12856         d = self.PUT(self.public_url + "/foo/bar.txt?t=uri", new_uri)
12857hunk ./src/allmydata/test/test_web.py 3156
12858         d.addCallback(_done)
12859         return d
12860 
12861+
12862+    def test_PUT_update_at_offset(self):
12863+        file_contents = "test file" * 100000 # about 900 KiB
12864+        d = self.PUT("/uri?mutable=true", file_contents)
12865+        def _then(filecap):
12866+            self.filecap = filecap
12867+            new_data = file_contents[:100]
12868+            new = "replaced and so on"
12869+            new_data += new
12870+            new_data += file_contents[len(new_data):]
12871+            assert len(new_data) == len(file_contents)
12872+            self.new_data = new_data
12873+        d.addCallback(_then)
12874+        d.addCallback(lambda ignored:
12875+            self.PUT("/uri/%s?replace=True&offset=100" % self.filecap,
12876+                     "replaced and so on"))
12877+        def _get_data(filecap):
12878+            n = self.s.create_node_from_uri(filecap)
12879+            return n.download_best_version()
12880+        d.addCallback(_get_data)
12881+        d.addCallback(lambda results:
12882+            self.failUnlessEqual(results, self.new_data))
12883+        # Now try appending things to the file
12884+        d.addCallback(lambda ignored:
12885+            self.PUT("/uri/%s?offset=%d" % (self.filecap, len(self.new_data)),
12886+                     "puppies" * 100))
12887+        d.addCallback(_get_data)
12888+        d.addCallback(lambda results:
12889+            self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
12890+        return d
12891+
12892+
12893+    def test_PUT_update_at_offset_immutable(self):
12894+        file_contents = "Test file" * 100000
12895+        d = self.PUT("/uri", file_contents)
12896+        def _then(filecap):
12897+            self.filecap = filecap
12898+        d.addCallback(_then)
12899+        d.addCallback(lambda ignored:
12900+            self.shouldHTTPError("test immutable update",
12901+                                 400, "Bad Request",
12902+                                 "immutable",
12903+                                 self.PUT,
12904+                                 "/uri/%s?offset=50" % self.filecap,
12905+                                 "foo"))
12906+        return d
12907+
12908+
12909     def test_bad_method(self):
12910         url = self.webish_url + self.public_url + "/foo/bar.txt"
12911         d = self.shouldHTTPError("test_bad_method",
12912hunk ./src/allmydata/test/test_web.py 3473
12913         def _stash_mutable_uri(n, which):
12914             self.uris[which] = n.get_uri()
12915             assert isinstance(self.uris[which], str)
12916-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12917+        d.addCallback(lambda ign:
12918+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12919         d.addCallback(_stash_mutable_uri, "corrupt")
12920         d.addCallback(lambda ign:
12921                       c0.upload(upload.Data("literal", convergence="")))
12922hunk ./src/allmydata/test/test_web.py 3620
12923         def _stash_mutable_uri(n, which):
12924             self.uris[which] = n.get_uri()
12925             assert isinstance(self.uris[which], str)
12926-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
12927+        d.addCallback(lambda ign:
12928+            c0.create_mutable_file(publish.MutableData(DATA+"3")))
12929         d.addCallback(_stash_mutable_uri, "corrupt")
12930 
12931         def _compute_fileurls(ignored):
12932hunk ./src/allmydata/test/test_web.py 4283
12933         def _stash_mutable_uri(n, which):
12934             self.uris[which] = n.get_uri()
12935             assert isinstance(self.uris[which], str)
12936-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
12937+        d.addCallback(lambda ign:
12938+            c0.create_mutable_file(publish.MutableData(DATA+"2")))
12939         d.addCallback(_stash_mutable_uri, "mutable")
12940 
12941         def _compute_fileurls(ignored):
12942hunk ./src/allmydata/test/test_web.py 4383
12943                                                         convergence="")))
12944         d.addCallback(_stash_uri, "small")
12945 
12946-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
12947+        d.addCallback(lambda ign:
12948+            c0.create_mutable_file(publish.MutableData("mutable")))
12949         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
12950         d.addCallback(_stash_uri, "mutable")
12951 
12952}
12953[resolve conflicts between 393-MDMF patches and trunk as of 1.8.2
12954"Brian Warner <warner@lothar.com>"**20110220230201
12955 Ignore-this: 9bbf5d26c994e8069202331dcb4cdd95
12956] {
12957merger 0.0 (
12958merger 0.0 (
12959merger 0.0 (
12960replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
12961merger 0.0 (
12962hunk ./docs/configuration.rst 384
12963-shares.needed = (int, optional) aka "k", default 3
12964-shares.total = (int, optional) aka "N", N >= k, default 10
12965-shares.happy = (int, optional) 1 <= happy <= N, default 7
12966-
12967- These three values set the default encoding parameters. Each time a new file
12968- is uploaded, erasure-coding is used to break the ciphertext into separate
12969- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
12970- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
12971- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
12972- Setting k to 1 is equivalent to simple replication (uploading N copies of
12973- the file).
12974-
12975- These values control the tradeoff between storage overhead, performance, and
12976- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
12977- backend storage space (the actual value will be a bit more, because of other
12978- forms of overhead). Up to N-k shares can be lost before the file becomes
12979- unrecoverable, so assuming there are at least N servers, up to N-k servers
12980- can be offline without losing the file. So large N/k ratios are more
12981- reliable, and small N/k ratios use less disk space. Clearly, k must never be
12982- smaller than N.
12983-
12984- Large values of N will slow down upload operations slightly, since more
12985- servers must be involved, and will slightly increase storage overhead due to
12986- the hash trees that are created. Large values of k will cause downloads to
12987- be marginally slower, because more servers must be involved. N cannot be
12988- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
12989- uses.
12990-
12991- shares.happy allows you control over the distribution of your immutable file.
12992- For a successful upload, shares are guaranteed to be initially placed on
12993- at least 'shares.happy' distinct servers, the correct functioning of any
12994- k of which is sufficient to guarantee the availability of the uploaded file.
12995- This value should not be larger than the number of servers on your grid.
12996-
12997- A value of shares.happy <= k is allowed, but does not provide any redundancy
12998- if some servers fail or lose shares.
12999-
13000- (Mutable files use a different share placement algorithm that does not
13001-  consider this parameter.)
13002-
13003-
13004-== Storage Server Configuration ==
13005-
13006-[storage]
13007-enabled = (boolean, optional)
13008-
13009- If this is True, the node will run a storage server, offering space to other
13010- clients. If it is False, the node will not run a storage server, meaning
13011- that no shares will be stored on this node. Use False this for clients who
13012- do not wish to provide storage service. The default value is True.
13013-
13014-readonly = (boolean, optional)
13015-
13016- If True, the node will run a storage server but will not accept any shares,
13017- making it effectively read-only. Use this for storage servers which are
13018- being decommissioned: the storage/ directory could be mounted read-only,
13019- while shares are moved to other servers. Note that this currently only
13020- affects immutable shares. Mutable shares (used for directories) will be
13021- written and modified anyway. See ticket #390 for the current status of this
13022- bug. The default value is False.
13023-
13024-reserved_space = (str, optional)
13025-
13026- If provided, this value defines how much disk space is reserved: the storage
13027- server will not accept any share which causes the amount of free disk space
13028- to drop below this value. (The free space is measured by a call to statvfs(2)
13029- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13030- user account under which the storage server runs.)
13031-
13032- This string contains a number, with an optional case-insensitive scale
13033- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13034- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13035- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13036-
13037-expire.enabled =
13038-expire.mode =
13039-expire.override_lease_duration =
13040-expire.cutoff_date =
13041-expire.immutable =
13042-expire.mutable =
13043-
13044- These settings control garbage-collection, in which the server will delete
13045- shares that no longer have an up-to-date lease on them. Please see the
13046- neighboring "garbage-collection.txt" document for full details.
13047-
13048-
13049-== Running A Helper ==
13050+Running A Helper
13051+================
13052hunk ./docs/configuration.rst 424
13053+mutable.format = sdmf or mdmf
13054+
13055+ This value tells Tahoe-LAFS what the default mutable file format should
13056+ be. If mutable.format=sdmf, then newly created mutable files will be in
13057+ the old SDMF format. This is desirable for clients that operate on
13058+ grids where some peers run older versions of Tahoe-LAFS, as these older
13059+ versions cannot read the new MDMF mutable file format. If
13060+ mutable.format = mdmf, then newly created mutable files will use the
13061+ new MDMF format, which supports efficient in-place modification and
13062+ streaming downloads. You can overwrite this value using a special
13063+ mutable-type parameter in the webapi. If you do not specify a value
13064+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13065+
13066+ Note that this parameter only applies to mutable files. Mutable
13067+ directories, which are stored as mutable files, are not controlled by
13068+ this parameter and will always use SDMF. We may revisit this decision
13069+ in future versions of Tahoe-LAFS.
13070)
13071)
13072hunk ./docs/configuration.rst 324
13073+Frontend Configuration
13074+======================
13075+
13076+The Tahoe client process can run a variety of frontend file-access protocols.
13077+You will use these to create and retrieve files from the virtual filesystem.
13078+Configuration details for each are documented in the following
13079+protocol-specific guides:
13080+
13081+HTTP
13082+
13083+    Tahoe runs a webserver by default on port 3456. This interface provides a
13084+    human-oriented "WUI", with pages to create, modify, and browse
13085+    directories and files, as well as a number of pages to check on the
13086+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13087+    with a REST-ful HTTP interface that can be used by other programs
13088+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13089+    details, and the ``web.port`` and ``web.static`` config variables above.
13090+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13091+    status pages.
13092+
13093+CLI
13094+
13095+    The main "bin/tahoe" executable includes subcommands for manipulating the
13096+    filesystem, uploading/downloading files, and creating/running Tahoe
13097+    nodes. See `<frontends/CLI.rst>`_ for details.
13098+
13099+FTP, SFTP
13100+
13101+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13102+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13103+    for instructions on configuring these services, and the ``[ftpd]`` and
13104+    ``[sftpd]`` sections of ``tahoe.cfg``.
13105+
13106)
13107hunk ./docs/configuration.rst 324
13108+``mutable.format = sdmf or mdmf``
13109+
13110+    This value tells Tahoe what the default mutable file format should
13111+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13112+    in the old SDMF format. This is desirable for clients that operate on
13113+    grids where some peers run older versions of Tahoe, as these older
13114+    versions cannot read the new MDMF mutable file format. If
13115+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13116+    the new MDMF format, which supports efficient in-place modification and
13117+    streaming downloads. You can overwrite this value using a special
13118+    mutable-type parameter in the webapi. If you do not specify a value here,
13119+    Tahoe will use SDMF for all newly-created mutable files.
13120+
13121+    Note that this parameter only applies to mutable files. Mutable
13122+    directories, which are stored as mutable files, are not controlled by
13123+    this parameter and will always use SDMF. We may revisit this decision
13124+    in future versions of Tahoe-LAFS.
13125+
13126)
13127merger 0.0 (
13128merger 0.0 (
13129hunk ./docs/configuration.rst 324
13130+``mutable.format = sdmf or mdmf``
13131+
13132+    This value tells Tahoe what the default mutable file format should
13133+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13134+    in the old SDMF format. This is desirable for clients that operate on
13135+    grids where some peers run older versions of Tahoe, as these older
13136+    versions cannot read the new MDMF mutable file format. If
13137+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13138+    the new MDMF format, which supports efficient in-place modification and
13139+    streaming downloads. You can overwrite this value using a special
13140+    mutable-type parameter in the webapi. If you do not specify a value here,
13141+    Tahoe will use SDMF for all newly-created mutable files.
13142+
13143+    Note that this parameter only applies to mutable files. Mutable
13144+    directories, which are stored as mutable files, are not controlled by
13145+    this parameter and will always use SDMF. We may revisit this decision
13146+    in future versions of Tahoe-LAFS.
13147+
13148merger 0.0 (
13149merger 0.0 (
13150replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13151merger 0.0 (
13152hunk ./docs/configuration.rst 384
13153-shares.needed = (int, optional) aka "k", default 3
13154-shares.total = (int, optional) aka "N", N >= k, default 10
13155-shares.happy = (int, optional) 1 <= happy <= N, default 7
13156-
13157- These three values set the default encoding parameters. Each time a new file
13158- is uploaded, erasure-coding is used to break the ciphertext into separate
13159- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13160- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13161- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13162- Setting k to 1 is equivalent to simple replication (uploading N copies of
13163- the file).
13164-
13165- These values control the tradeoff between storage overhead, performance, and
13166- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13167- backend storage space (the actual value will be a bit more, because of other
13168- forms of overhead). Up to N-k shares can be lost before the file becomes
13169- unrecoverable, so assuming there are at least N servers, up to N-k servers
13170- can be offline without losing the file. So large N/k ratios are more
13171- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13172- smaller than N.
13173-
13174- Large values of N will slow down upload operations slightly, since more
13175- servers must be involved, and will slightly increase storage overhead due to
13176- the hash trees that are created. Large values of k will cause downloads to
13177- be marginally slower, because more servers must be involved. N cannot be
13178- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13179- uses.
13180-
13181- shares.happy allows you control over the distribution of your immutable file.
13182- For a successful upload, shares are guaranteed to be initially placed on
13183- at least 'shares.happy' distinct servers, the correct functioning of any
13184- k of which is sufficient to guarantee the availability of the uploaded file.
13185- This value should not be larger than the number of servers on your grid.
13186-
13187- A value of shares.happy <= k is allowed, but does not provide any redundancy
13188- if some servers fail or lose shares.
13189-
13190- (Mutable files use a different share placement algorithm that does not
13191-  consider this parameter.)
13192-
13193-
13194-== Storage Server Configuration ==
13195-
13196-[storage]
13197-enabled = (boolean, optional)
13198-
13199- If this is True, the node will run a storage server, offering space to other
13200- clients. If it is False, the node will not run a storage server, meaning
13201- that no shares will be stored on this node. Use False this for clients who
13202- do not wish to provide storage service. The default value is True.
13203-
13204-readonly = (boolean, optional)
13205-
13206- If True, the node will run a storage server but will not accept any shares,
13207- making it effectively read-only. Use this for storage servers which are
13208- being decommissioned: the storage/ directory could be mounted read-only,
13209- while shares are moved to other servers. Note that this currently only
13210- affects immutable shares. Mutable shares (used for directories) will be
13211- written and modified anyway. See ticket #390 for the current status of this
13212- bug. The default value is False.
13213-
13214-reserved_space = (str, optional)
13215-
13216- If provided, this value defines how much disk space is reserved: the storage
13217- server will not accept any share which causes the amount of free disk space
13218- to drop below this value. (The free space is measured by a call to statvfs(2)
13219- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13220- user account under which the storage server runs.)
13221-
13222- This string contains a number, with an optional case-insensitive scale
13223- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13224- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13225- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13226-
13227-expire.enabled =
13228-expire.mode =
13229-expire.override_lease_duration =
13230-expire.cutoff_date =
13231-expire.immutable =
13232-expire.mutable =
13233-
13234- These settings control garbage-collection, in which the server will delete
13235- shares that no longer have an up-to-date lease on them. Please see the
13236- neighboring "garbage-collection.txt" document for full details.
13237-
13238-
13239-== Running A Helper ==
13240+Running A Helper
13241+================
13242hunk ./docs/configuration.rst 424
13243+mutable.format = sdmf or mdmf
13244+
13245+ This value tells Tahoe-LAFS what the default mutable file format should
13246+ be. If mutable.format=sdmf, then newly created mutable files will be in
13247+ the old SDMF format. This is desirable for clients that operate on
13248+ grids where some peers run older versions of Tahoe-LAFS, as these older
13249+ versions cannot read the new MDMF mutable file format. If
13250+ mutable.format = mdmf, then newly created mutable files will use the
13251+ new MDMF format, which supports efficient in-place modification and
13252+ streaming downloads. You can overwrite this value using a special
13253+ mutable-type parameter in the webapi. If you do not specify a value
13254+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13255+
13256+ Note that this parameter only applies to mutable files. Mutable
13257+ directories, which are stored as mutable files, are not controlled by
13258+ this parameter and will always use SDMF. We may revisit this decision
13259+ in future versions of Tahoe-LAFS.
13260)
13261)
13262hunk ./docs/configuration.rst 324
13263+Frontend Configuration
13264+======================
13265+
13266+The Tahoe client process can run a variety of frontend file-access protocols.
13267+You will use these to create and retrieve files from the virtual filesystem.
13268+Configuration details for each are documented in the following
13269+protocol-specific guides:
13270+
13271+HTTP
13272+
13273+    Tahoe runs a webserver by default on port 3456. This interface provides a
13274+    human-oriented "WUI", with pages to create, modify, and browse
13275+    directories and files, as well as a number of pages to check on the
13276+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13277+    with a REST-ful HTTP interface that can be used by other programs
13278+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13279+    details, and the ``web.port`` and ``web.static`` config variables above.
13280+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13281+    status pages.
13282+
13283+CLI
13284+
13285+    The main "bin/tahoe" executable includes subcommands for manipulating the
13286+    filesystem, uploading/downloading files, and creating/running Tahoe
13287+    nodes. See `<frontends/CLI.rst>`_ for details.
13288+
13289+FTP, SFTP
13290+
13291+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13292+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13293+    for instructions on configuring these services, and the ``[ftpd]`` and
13294+    ``[sftpd]`` sections of ``tahoe.cfg``.
13295+
13296)
13297)
13298hunk ./docs/configuration.rst 402
13299-shares.needed = (int, optional) aka "k", default 3
13300-shares.total = (int, optional) aka "N", N >= k, default 10
13301-shares.happy = (int, optional) 1 <= happy <= N, default 7
13302-
13303- These three values set the default encoding parameters. Each time a new file
13304- is uploaded, erasure-coding is used to break the ciphertext into separate
13305- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13306- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13307- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13308- Setting k to 1 is equivalent to simple replication (uploading N copies of
13309- the file).
13310-
13311- These values control the tradeoff between storage overhead, performance, and
13312- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13313- backend storage space (the actual value will be a bit more, because of other
13314- forms of overhead). Up to N-k shares can be lost before the file becomes
13315- unrecoverable, so assuming there are at least N servers, up to N-k servers
13316- can be offline without losing the file. So large N/k ratios are more
13317- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13318- smaller than N.
13319-
13320- Large values of N will slow down upload operations slightly, since more
13321- servers must be involved, and will slightly increase storage overhead due to
13322- the hash trees that are created. Large values of k will cause downloads to
13323- be marginally slower, because more servers must be involved. N cannot be
13324- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13325- uses.
13326-
13327- shares.happy allows you control over the distribution of your immutable file.
13328- For a successful upload, shares are guaranteed to be initially placed on
13329- at least 'shares.happy' distinct servers, the correct functioning of any
13330- k of which is sufficient to guarantee the availability of the uploaded file.
13331- This value should not be larger than the number of servers on your grid.
13332-
13333- A value of shares.happy <= k is allowed, but does not provide any redundancy
13334- if some servers fail or lose shares.
13335-
13336- (Mutable files use a different share placement algorithm that does not
13337-  consider this parameter.)
13338-
13339-
13340-== Storage Server Configuration ==
13341-
13342-[storage]
13343-enabled = (boolean, optional)
13344-
13345- If this is True, the node will run a storage server, offering space to other
13346- clients. If it is False, the node will not run a storage server, meaning
13347- that no shares will be stored on this node. Use False this for clients who
13348- do not wish to provide storage service. The default value is True.
13349-
13350-readonly = (boolean, optional)
13351-
13352- If True, the node will run a storage server but will not accept any shares,
13353- making it effectively read-only. Use this for storage servers which are
13354- being decommissioned: the storage/ directory could be mounted read-only,
13355- while shares are moved to other servers. Note that this currently only
13356- affects immutable shares. Mutable shares (used for directories) will be
13357- written and modified anyway. See ticket #390 for the current status of this
13358- bug. The default value is False.
13359-
13360-reserved_space = (str, optional)
13361-
13362- If provided, this value defines how much disk space is reserved: the storage
13363- server will not accept any share which causes the amount of free disk space
13364- to drop below this value. (The free space is measured by a call to statvfs(2)
13365- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13366- user account under which the storage server runs.)
13367-
13368- This string contains a number, with an optional case-insensitive scale
13369- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13370- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13371- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13372-
13373-expire.enabled =
13374-expire.mode =
13375-expire.override_lease_duration =
13376-expire.cutoff_date =
13377-expire.immutable =
13378-expire.mutable =
13379-
13380- These settings control garbage-collection, in which the server will delete
13381- shares that no longer have an up-to-date lease on them. Please see the
13382- neighboring "garbage-collection.txt" document for full details.
13383-
13384-
13385-== Running A Helper ==
13386+Running A Helper
13387+================
13388)
13389merger 0.0 (
13390merger 0.0 (
13391hunk ./docs/configuration.rst 402
13392-shares.needed = (int, optional) aka "k", default 3
13393-shares.total = (int, optional) aka "N", N >= k, default 10
13394-shares.happy = (int, optional) 1 <= happy <= N, default 7
13395-
13396- These three values set the default encoding parameters. Each time a new file
13397- is uploaded, erasure-coding is used to break the ciphertext into separate
13398- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13399- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13400- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13401- Setting k to 1 is equivalent to simple replication (uploading N copies of
13402- the file).
13403-
13404- These values control the tradeoff between storage overhead, performance, and
13405- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13406- backend storage space (the actual value will be a bit more, because of other
13407- forms of overhead). Up to N-k shares can be lost before the file becomes
13408- unrecoverable, so assuming there are at least N servers, up to N-k servers
13409- can be offline without losing the file. So large N/k ratios are more
13410- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13411- smaller than N.
13412-
13413- Large values of N will slow down upload operations slightly, since more
13414- servers must be involved, and will slightly increase storage overhead due to
13415- the hash trees that are created. Large values of k will cause downloads to
13416- be marginally slower, because more servers must be involved. N cannot be
13417- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13418- uses.
13419-
13420- shares.happy allows you control over the distribution of your immutable file.
13421- For a successful upload, shares are guaranteed to be initially placed on
13422- at least 'shares.happy' distinct servers, the correct functioning of any
13423- k of which is sufficient to guarantee the availability of the uploaded file.
13424- This value should not be larger than the number of servers on your grid.
13425-
13426- A value of shares.happy <= k is allowed, but does not provide any redundancy
13427- if some servers fail or lose shares.
13428-
13429- (Mutable files use a different share placement algorithm that does not
13430-  consider this parameter.)
13431-
13432-
13433-== Storage Server Configuration ==
13434-
13435-[storage]
13436-enabled = (boolean, optional)
13437-
13438- If this is True, the node will run a storage server, offering space to other
13439- clients. If it is False, the node will not run a storage server, meaning
13440- that no shares will be stored on this node. Use False this for clients who
13441- do not wish to provide storage service. The default value is True.
13442-
13443-readonly = (boolean, optional)
13444-
13445- If True, the node will run a storage server but will not accept any shares,
13446- making it effectively read-only. Use this for storage servers which are
13447- being decommissioned: the storage/ directory could be mounted read-only,
13448- while shares are moved to other servers. Note that this currently only
13449- affects immutable shares. Mutable shares (used for directories) will be
13450- written and modified anyway. See ticket #390 for the current status of this
13451- bug. The default value is False.
13452-
13453-reserved_space = (str, optional)
13454-
13455- If provided, this value defines how much disk space is reserved: the storage
13456- server will not accept any share which causes the amount of free disk space
13457- to drop below this value. (The free space is measured by a call to statvfs(2)
13458- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13459- user account under which the storage server runs.)
13460-
13461- This string contains a number, with an optional case-insensitive scale
13462- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13463- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13464- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13465-
13466-expire.enabled =
13467-expire.mode =
13468-expire.override_lease_duration =
13469-expire.cutoff_date =
13470-expire.immutable =
13471-expire.mutable =
13472-
13473- These settings control garbage-collection, in which the server will delete
13474- shares that no longer have an up-to-date lease on them. Please see the
13475- neighboring "garbage-collection.txt" document for full details.
13476-
13477-
13478-== Running A Helper ==
13479+Running A Helper
13480+================
13481merger 0.0 (
13482hunk ./docs/configuration.rst 324
13483+``mutable.format = sdmf or mdmf``
13484+
13485+    This value tells Tahoe what the default mutable file format should
13486+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
13487+    in the old SDMF format. This is desirable for clients that operate on
13488+    grids where some peers run older versions of Tahoe, as these older
13489+    versions cannot read the new MDMF mutable file format. If
13490+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
13491+    the new MDMF format, which supports efficient in-place modification and
13492+    streaming downloads. You can overwrite this value using a special
13493+    mutable-type parameter in the webapi. If you do not specify a value here,
13494+    Tahoe will use SDMF for all newly-created mutable files.
13495+
13496+    Note that this parameter only applies to mutable files. Mutable
13497+    directories, which are stored as mutable files, are not controlled by
13498+    this parameter and will always use SDMF. We may revisit this decision
13499+    in future versions of Tahoe-LAFS.
13500+
13501merger 0.0 (
13502merger 0.0 (
13503replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13504merger 0.0 (
13505hunk ./docs/configuration.rst 384
13506-shares.needed = (int, optional) aka "k", default 3
13507-shares.total = (int, optional) aka "N", N >= k, default 10
13508-shares.happy = (int, optional) 1 <= happy <= N, default 7
13509-
13510- These three values set the default encoding parameters. Each time a new file
13511- is uploaded, erasure-coding is used to break the ciphertext into separate
13512- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
13513- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
13514- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
13515- Setting k to 1 is equivalent to simple replication (uploading N copies of
13516- the file).
13517-
13518- These values control the tradeoff between storage overhead, performance, and
13519- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
13520- backend storage space (the actual value will be a bit more, because of other
13521- forms of overhead). Up to N-k shares can be lost before the file becomes
13522- unrecoverable, so assuming there are at least N servers, up to N-k servers
13523- can be offline without losing the file. So large N/k ratios are more
13524- reliable, and small N/k ratios use less disk space. Clearly, k must never be
13525- smaller than N.
13526-
13527- Large values of N will slow down upload operations slightly, since more
13528- servers must be involved, and will slightly increase storage overhead due to
13529- the hash trees that are created. Large values of k will cause downloads to
13530- be marginally slower, because more servers must be involved. N cannot be
13531- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe
13532- uses.
13533-
13534- shares.happy allows you control over the distribution of your immutable file.
13535- For a successful upload, shares are guaranteed to be initially placed on
13536- at least 'shares.happy' distinct servers, the correct functioning of any
13537- k of which is sufficient to guarantee the availability of the uploaded file.
13538- This value should not be larger than the number of servers on your grid.
13539-
13540- A value of shares.happy <= k is allowed, but does not provide any redundancy
13541- if some servers fail or lose shares.
13542-
13543- (Mutable files use a different share placement algorithm that does not
13544-  consider this parameter.)
13545-
13546-
13547-== Storage Server Configuration ==
13548-
13549-[storage]
13550-enabled = (boolean, optional)
13551-
13552- If this is True, the node will run a storage server, offering space to other
13553- clients. If it is False, the node will not run a storage server, meaning
13554- that no shares will be stored on this node. Use False this for clients who
13555- do not wish to provide storage service. The default value is True.
13556-
13557-readonly = (boolean, optional)
13558-
13559- If True, the node will run a storage server but will not accept any shares,
13560- making it effectively read-only. Use this for storage servers which are
13561- being decommissioned: the storage/ directory could be mounted read-only,
13562- while shares are moved to other servers. Note that this currently only
13563- affects immutable shares. Mutable shares (used for directories) will be
13564- written and modified anyway. See ticket #390 for the current status of this
13565- bug. The default value is False.
13566-
13567-reserved_space = (str, optional)
13568-
13569- If provided, this value defines how much disk space is reserved: the storage
13570- server will not accept any share which causes the amount of free disk space
13571- to drop below this value. (The free space is measured by a call to statvfs(2)
13572- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
13573- user account under which the storage server runs.)
13574-
13575- This string contains a number, with an optional case-insensitive scale
13576- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
13577- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
13578- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
13579-
13580-expire.enabled =
13581-expire.mode =
13582-expire.override_lease_duration =
13583-expire.cutoff_date =
13584-expire.immutable =
13585-expire.mutable =
13586-
13587- These settings control garbage-collection, in which the server will delete
13588- shares that no longer have an up-to-date lease on them. Please see the
13589- neighboring "garbage-collection.txt" document for full details.
13590-
13591-
13592-== Running A Helper ==
13593+Running A Helper
13594+================
13595hunk ./docs/configuration.rst 424
13596+mutable.format = sdmf or mdmf
13597+
13598+ This value tells Tahoe-LAFS what the default mutable file format should
13599+ be. If mutable.format=sdmf, then newly created mutable files will be in
13600+ the old SDMF format. This is desirable for clients that operate on
13601+ grids where some peers run older versions of Tahoe-LAFS, as these older
13602+ versions cannot read the new MDMF mutable file format. If
13603+ mutable.format = mdmf, then newly created mutable files will use the
13604+ new MDMF format, which supports efficient in-place modification and
13605+ streaming downloads. You can overwrite this value using a special
13606+ mutable-type parameter in the webapi. If you do not specify a value
13607+ here, Tahoe-LAFS will use SDMF for all newly-created mutable files.
13608+
13609+ Note that this parameter only applies to mutable files. Mutable
13610+ directories, which are stored as mutable files, are not controlled by
13611+ this parameter and will always use SDMF. We may revisit this decision
13612+ in future versions of Tahoe-LAFS.
13613)
13614)
13615hunk ./docs/configuration.rst 324
13616+Frontend Configuration
13617+======================
13618+
13619+The Tahoe client process can run a variety of frontend file-access protocols.
13620+You will use these to create and retrieve files from the virtual filesystem.
13621+Configuration details for each are documented in the following
13622+protocol-specific guides:
13623+
13624+HTTP
13625+
13626+    Tahoe runs a webserver by default on port 3456. This interface provides a
13627+    human-oriented "WUI", with pages to create, modify, and browse
13628+    directories and files, as well as a number of pages to check on the
13629+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
13630+    with a REST-ful HTTP interface that can be used by other programs
13631+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
13632+    details, and the ``web.port`` and ``web.static`` config variables above.
13633+    The `<frontends/download-status.rst>`_ document also describes a few WUI
13634+    status pages.
13635+
13636+CLI
13637+
13638+    The main "bin/tahoe" executable includes subcommands for manipulating the
13639+    filesystem, uploading/downloading files, and creating/running Tahoe
13640+    nodes. See `<frontends/CLI.rst>`_ for details.
13641+
13642+FTP, SFTP
13643+
13644+    Tahoe can also run both FTP and SFTP servers, and map a username/password
13645+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
13646+    for instructions on configuring these services, and the ``[ftpd]`` and
13647+    ``[sftpd]`` sections of ``tahoe.cfg``.
13648+
13649)
13650)
13651)
13652replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
13653)
13654hunk ./src/allmydata/mutable/retrieve.py 7
13655 from zope.interface import implements
13656 from twisted.internet import defer
13657 from twisted.python import failure
13658-from foolscap.api import DeadReferenceError, eventually, fireEventually
13659-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
13660-from allmydata.util import hashutil, idlib, log
13661+from twisted.internet.interfaces import IPushProducer, IConsumer
13662+from foolscap.api import eventually, fireEventually
13663+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
13664+                                 MDMF_VERSION, SDMF_VERSION
13665+from allmydata.util import hashutil, log, mathutil
13666+from allmydata.util.dictutil import DictOfSets
13667 from allmydata import hashtree, codec
13668 from allmydata.storage.server import si_b2a
13669 from pycryptopp.cipher.aes import AES
13670hunk ./src/allmydata/mutable/retrieve.py 239
13671             # KiB, so we ask for that much.
13672             # TODO: Change the cache methods to allow us to fetch all of the
13673             # data that they have, then change this method to do that.
13674-            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
13675-                                                               shnum,
13676-                                                               0,
13677-                                                               1000)
13678+            any_cache = self._node._read_from_cache(self.verinfo, shnum,
13679+                                                    0, 1000)
13680             ss = self.servermap.connections[peerid]
13681             reader = MDMFSlotReadProxy(ss,
13682                                        self._storage_index,
13683hunk ./src/allmydata/mutable/retrieve.py 373
13684                  (k, n, self._num_segments, self._segment_size,
13685                   self._tail_segment_size))
13686 
13687-        # ask the cache first
13688-        got_from_cache = False
13689-        datavs = []
13690-        for (offset, length) in readv:
13691-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
13692-                                                            offset, length)
13693-            if data is not None:
13694-                datavs.append(data)
13695-        if len(datavs) == len(readv):
13696-            self.log("got data from cache")
13697-            got_from_cache = True
13698-            d = fireEventually({shnum: datavs})
13699-            # datavs is a dict mapping shnum to a pair of strings
13700+        for i in xrange(self._total_shares):
13701+            # So we don't have to do this later.
13702+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
13703+
13704+        # Our last task is to tell the downloader where to start and
13705+        # where to stop. We use three parameters for that:
13706+        #   - self._start_segment: the segment that we need to start
13707+        #     downloading from.
13708+        #   - self._current_segment: the next segment that we need to
13709+        #     download.
13710+        #   - self._last_segment: The last segment that we were asked to
13711+        #     download.
13712+        #
13713+        #  We say that the download is complete when
13714+        #  self._current_segment > self._last_segment. We use
13715+        #  self._start_segment and self._last_segment to know when to
13716+        #  strip things off of segments, and how much to strip.
13717+        if self._offset:
13718+            self.log("got offset: %d" % self._offset)
13719+            # our start segment is the first segment containing the
13720+            # offset we were given.
13721+            start = mathutil.div_ceil(self._offset,
13722+                                      self._segment_size)
13723+            # this gets us the first segment after self._offset. Then
13724+            # our start segment is the one before it.
13725+            start -= 1
13726+
13727+            assert start < self._num_segments
13728+            self._start_segment = start
13729+            self.log("got start segment: %d" % self._start_segment)
13730         else:
13731             self._start_segment = 0
13732 
13733hunk ./src/allmydata/mutable/servermap.py 7
13734 from itertools import count
13735 from twisted.internet import defer
13736 from twisted.python import failure
13737-from foolscap.api import DeadReferenceError, RemoteException, eventually
13738-from allmydata.util import base32, hashutil, idlib, log
13739+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
13740+                         fireEventually
13741+from allmydata.util import base32, hashutil, idlib, log, deferredutil
13742+from allmydata.util.dictutil import DictOfSets
13743 from allmydata.storage.server import si_b2a
13744 from allmydata.interfaces import IServermapUpdaterStatus
13745 from pycryptopp.publickey import rsa
13746hunk ./src/allmydata/mutable/servermap.py 16
13747 
13748 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
13749-     DictOfSets, CorruptShareError, NeedMoreDataError
13750-from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
13751-     SIGNED_PREFIX_LENGTH
13752+     CorruptShareError
13753+from allmydata.mutable.layout import SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
13754 
13755 class UpdateStatus:
13756     implements(IServermapUpdaterStatus)
13757hunk ./src/allmydata/mutable/servermap.py 391
13758         #  * if we need the encrypted private key, we want [-1216ish:]
13759         #   * but we can't read from negative offsets
13760         #   * the offset table tells us the 'ish', also the positive offset
13761-        # A future version of the SMDF slot format should consider using
13762-        # fixed-size slots so we can retrieve less data. For now, we'll just
13763-        # read 2000 bytes, which also happens to read enough actual data to
13764-        # pre-fetch a 9-entry dirnode.
13765+        # MDMF:
13766+        #  * Checkstring? [0:72]
13767+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
13768+        #    the offset table will tell us for sure.
13769+        #  * If we need the verification key, we have to consult the offset
13770+        #    table as well.
13771+        # At this point, we don't know which we are. Our filenode can
13772+        # tell us, but it might be lying -- in some cases, we're
13773+        # responsible for telling it which kind of file it is.
13774         self._read_size = 4000
13775         if mode == MODE_CHECK:
13776             # we use unpack_prefix_and_signature, so we need 1k
13777hunk ./src/allmydata/mutable/servermap.py 633
13778         updated.
13779         """
13780         if verinfo:
13781-            self._node._add_to_cache(verinfo, shnum, 0, data, now)
13782+            self._node._add_to_cache(verinfo, shnum, 0, data)
13783 
13784 
13785     def _got_results(self, datavs, peerid, readsize, stuff, started):
13786hunk ./src/allmydata/mutable/servermap.py 664
13787 
13788         for shnum,datav in datavs.items():
13789             data = datav[0]
13790-            try:
13791-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
13792-                last_verinfo = verinfo
13793-                last_shnum = shnum
13794-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
13795-            except CorruptShareError, e:
13796-                # log it and give the other shares a chance to be processed
13797-                f = failure.Failure()
13798-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
13799-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
13800-                self.notify_server_corruption(peerid, shnum, str(e))
13801-                self._bad_peers.add(peerid)
13802-                self._last_failure = f
13803-                checkstring = data[:SIGNED_PREFIX_LENGTH]
13804-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
13805-                self._servermap.problems.append(f)
13806-                pass
13807+            reader = MDMFSlotReadProxy(ss,
13808+                                       storage_index,
13809+                                       shnum,
13810+                                       data)
13811+            self._readers.setdefault(peerid, dict())[shnum] = reader
13812+            # our goal, with each response, is to validate the version
13813+            # information and share data as best we can at this point --
13814+            # we do this by validating the signature. To do this, we
13815+            # need to do the following:
13816+            #   - If we don't already have the public key, fetch the
13817+            #     public key. We use this to validate the signature.
13818+            if not self._node.get_pubkey():
13819+                # fetch and set the public key.
13820+                d = reader.get_verification_key(queue=True)
13821+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
13822+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
13823+                # XXX: Make self._pubkey_query_failed?
13824+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
13825+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
13826+            else:
13827+                # we already have the public key.
13828+                d = defer.succeed(None)
13829 
13830             # Neither of these two branches return anything of
13831             # consequence, so the first entry in our deferredlist will
13832hunk ./src/allmydata/test/test_storage.py 1
13833-import time, os.path, platform, stat, re, simplejson, struct
13834+import time, os.path, platform, stat, re, simplejson, struct, shutil
13835 
13836hunk ./src/allmydata/test/test_storage.py 3
13837-import time, os.path, stat, re, simplejson, struct
13838+import mock
13839 
13840 from twisted.trial import unittest
13841 
13842}
13843[mutable/filenode.py: fix create_mutable_file('string')
13844"Brian Warner <warner@lothar.com>"**20110221014659
13845 Ignore-this: dc6bdad761089f0199681eeb784f1001
13846] hunk ./src/allmydata/mutable/filenode.py 137
13847         if contents is None:
13848             return MutableData("")
13849 
13850+        if isinstance(contents, str):
13851+            return MutableData(contents)
13852+
13853         if IMutableUploadable.providedBy(contents):
13854             return contents
13855 
13856[resolve more conflicts with current trunk
13857"Brian Warner <warner@lothar.com>"**20110221055600
13858 Ignore-this: 77ad038a478dbf5d9b34f7a68159a3e0
13859] hunk ./src/allmydata/mutable/servermap.py 461
13860         self._queries_completed = 0
13861 
13862         sb = self._storage_broker
13863-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13864+        # All of the peers, permuted by the storage index, as usual.
13865+        full_peerlist = [(s.get_serverid(), s.get_rref())
13866+                         for s in sb.get_servers_for_psi(self._storage_index)]
13867         self.full_peerlist = full_peerlist # for use later, immutable
13868         self.extra_peers = full_peerlist[:] # peers are removed as we use them
13869         self._good_peers = set() # peers who had some shares
13870[update MDMF code with StorageFarmBroker changes
13871"Brian Warner <warner@lothar.com>"**20110221061004
13872 Ignore-this: a693b201d31125b391cebe0412ddd027
13873] {
13874hunk ./src/allmydata/mutable/publish.py 203
13875         self._encprivkey = self._node.get_encprivkey()
13876 
13877         sb = self._storage_broker
13878-        full_peerlist = sb.get_servers_for_index(self._storage_index)
13879+        full_peerlist = [(s.get_serverid(), s.get_rref())
13880+                         for s in sb.get_servers_for_psi(self._storage_index)]
13881         self.full_peerlist = full_peerlist # for use later, immutable
13882         self.bad_peers = set() # peerids who have errbacked/refused requests
13883 
13884hunk ./src/allmydata/test/test_mutable.py 2538
13885             # for either a block and salt or for hashes, either of which
13886             # will exercise the error handling code.
13887             killer = FirstServerGetsKilled()
13888-            for (serverid, ss) in nm.storage_broker.get_all_servers():
13889-                ss.post_call_notifier = killer.notify
13890+            for s in nm.storage_broker.get_connected_servers():
13891+                s.get_rref().post_call_notifier = killer.notify
13892             ver = servermap.best_recoverable_version()
13893             assert ver
13894             return self._node.download_version(servermap, ver)
13895}
13896[mutable/filenode: Clean up servermap handling in MutableFileVersion
13897Kevan Carstensen <kevan@isnotajoke.com>**20110226010433
13898 Ignore-this: 2257c9f65502098789f5ea355b94f130
13899 
13900 We want to update the servermap before attempting to modify a file,
13901 which we now do. This introduced code duplication, which was addressed
13902 by refactoring the servermap update into its own method, and then
13903 eliminating duplicate servermap updates throughout the
13904 MutableFileVersion.
13905] {
13906hunk ./src/allmydata/mutable/filenode.py 19
13907 from allmydata.mutable.publish import Publish, MutableData,\
13908                                       DEFAULT_MAX_SEGMENT_SIZE, \
13909                                       TransformingUploadable
13910-from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
13911+from allmydata.mutable.common import MODE_READ, MODE_WRITE, MODE_CHECK, UnrecoverableFileError, \
13912      ResponseCache, UncoordinatedWriteError
13913 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
13914 from allmydata.mutable.retrieve import Retrieve
13915hunk ./src/allmydata/mutable/filenode.py 807
13916         a little bit.
13917         """
13918         log.msg("doing modify")
13919-        d = self._modify_once(modifier, first_time)
13920+        if first_time:
13921+            d = self._update_servermap()
13922+        else:
13923+            # We ran into trouble; do MODE_CHECK so we're a little more
13924+            # careful on subsequent tries.
13925+            d = self._update_servermap(mode=MODE_CHECK)
13926+
13927+        d.addCallback(lambda ignored:
13928+            self._modify_once(modifier, first_time))
13929         def _retry(f):
13930             f.trap(UncoordinatedWriteError)
13931hunk ./src/allmydata/mutable/filenode.py 818
13932+            # Uh oh, it broke. We're allowed to trust the servermap for our
13933+            # first try, but after that we need to update it. It's
13934+            # possible that we've failed due to a race with another
13935+            # uploader, and if the race is to converge correctly, we
13936+            # need to know about that upload.
13937             d2 = defer.maybeDeferred(backoffer, self, f)
13938             d2.addCallback(lambda ignored:
13939                            self._modify_and_retry(modifier,
13940hunk ./src/allmydata/mutable/filenode.py 837
13941         I attempt to apply a modifier to the contents of the mutable
13942         file.
13943         """
13944-        # XXX: This is wrong -- we could get more servers if we updated
13945-        # in MODE_ANYTHING and possibly MODE_CHECK. Probably we want to
13946-        # assert that the last update wasn't MODE_READ
13947-        assert self._servermap.last_update_mode == MODE_WRITE
13948+        assert self._servermap.last_update_mode != MODE_READ
13949 
13950         # download_to_data is serialized, so we have to call this to
13951         # avoid deadlock.
13952hunk ./src/allmydata/mutable/filenode.py 1076
13953 
13954         # Now ask for the servermap to be updated in MODE_WRITE with
13955         # this update range.
13956-        u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
13957-                             self._servermap,
13958-                             mode=MODE_WRITE,
13959-                             update_range=(start_segment, end_segment))
13960-        return u.update()
13961+        return self._update_servermap(update_range=(start_segment,
13962+                                                    end_segment))
13963 
13964 
13965     def _decode_and_decrypt_segments(self, ignored, data, offset):
13966hunk ./src/allmydata/mutable/filenode.py 1135
13967                                    segments_and_bht[1])
13968         p = Publish(self._node, self._storage_broker, self._servermap)
13969         return p.update(u, offset, segments_and_bht[2], self._version)
13970+
13971+
13972+    def _update_servermap(self, mode=MODE_WRITE, update_range=None):
13973+        """
13974+        I update the servermap. I return a Deferred that fires when the
13975+        servermap update is done.
13976+        """
13977+        if update_range:
13978+            u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
13979+                                 self._servermap,
13980+                                 mode=mode,
13981+                                 update_range=update_range)
13982+        else:
13983+            u = ServermapUpdater(self._node, self._storage_broker, Monitor(),
13984+                                 self._servermap,
13985+                                 mode=mode)
13986+        return u.update()
13987}
13988[web: Use the string "replace" to trigger whole-file replacement when processing an offset parameter.
13989Kevan Carstensen <kevan@isnotajoke.com>**20110227231643
13990 Ignore-this: 5bbf0b90d68efe20d4c531bb98a8321a
13991] {
13992hunk ./docs/frontends/webapi.rst 360
13993  To use the /uri/$FILECAP form, $FILECAP must be a write-cap for a mutable file.
13994 
13995  In the /uri/$DIRCAP/[SUBDIRS../]FILENAME form, if the target file is a
13996- writeable mutable file, that file's contents will be overwritten in-place. If
13997- it is a read-cap for a mutable file, an error will occur. If it is an
13998- immutable file, the old file will be discarded, and a new one will be put in
13999- its place. If the target file is a writable mutable file, you may also
14000- specify an "offset" parameter -- a byte offset that determines where in
14001- the mutable file the data from the HTTP request body is placed. This
14002- operation is relatively efficient for MDMF mutable files, and is
14003- relatively inefficient (but still supported) for SDMF mutable files.
14004+ writeable mutable file, that file's contents will be overwritten
14005+ in-place. If it is a read-cap for a mutable file, an error will occur.
14006+ If it is an immutable file, the old file will be discarded, and a new
14007+ one will be put in its place. If the target file is a writable mutable
14008+ file, you may also specify an "offset" parameter -- a byte offset that
14009+ determines where in the mutable file the data from the HTTP request
14010+ body is placed. This operation is relatively efficient for MDMF mutable
14011+ files, and is relatively inefficient (but still supported) for SDMF
14012+ mutable files. If no offset parameter is specified, then the entire
14013+ file is replaced with the data from the HTTP request body. For an
14014+ immutable file, the "offset" parameter is not valid.
14015 
14016  When creating a new file, if "mutable=true" is in the query arguments, the
14017  operation will create a mutable file instead of an immutable one.
14018hunk ./src/allmydata/test/test_web.py 3187
14019             self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
14020         return d
14021 
14022+    def test_PUT_update_at_invalid_offset(self):
14023+        file_contents = "test file" * 100000 # about 900 KiB
14024+        d = self.PUT("/uri?mutable=true", file_contents)
14025+        def _then(filecap):
14026+            self.filecap = filecap
14027+        d.addCallback(_then)
14028+        # Negative offsets should cause an error.
14029+        d.addCallback(lambda ignored:
14030+            self.shouldHTTPError("test mutable invalid offset negative",
14031+                                 400, "Bad Request",
14032+                                 "Invalid offset",
14033+                                 self.PUT,
14034+                                 "/uri/%s?offset=-1" % self.filecap,
14035+                                 "foo"))
14036+        return d
14037 
14038     def test_PUT_update_at_offset_immutable(self):
14039         file_contents = "Test file" * 100000
14040hunk ./src/allmydata/web/common.py 55
14041     # message? Since this call is going to be used by programmers and
14042     # their tools rather than users (through the wui), it is not
14043     # inconsistent to return that, I guess.
14044-    offset = int(offset)
14045-    return offset
14046+    return int(offset)
14047 
14048 
14049 def get_root(ctx_or_req):
14050hunk ./src/allmydata/web/filenode.py 219
14051         req = IRequest(ctx)
14052         t = get_arg(req, "t", "").strip()
14053         replace = parse_replace_arg(get_arg(req, "replace", "true"))
14054-        offset = parse_offset_arg(get_arg(req, "offset", -1))
14055+        offset = parse_offset_arg(get_arg(req, "offset", False))
14056 
14057         if not t:
14058hunk ./src/allmydata/web/filenode.py 222
14059-            if self.node.is_mutable() and offset >= 0:
14060-                return self.update_my_contents(req, offset)
14061-
14062-            elif self.node.is_mutable():
14063-                return self.replace_my_contents(req)
14064             if not replace:
14065                 # this is the early trap: if someone else modifies the
14066                 # directory while we're uploading, the add_file(overwrite=)
14067hunk ./src/allmydata/web/filenode.py 227
14068                 # call in replace_me_with_a_child will do the late trap.
14069                 raise ExistingChildError()
14070-            if offset >= 0:
14071-                raise WebError("PUT to a file: append operation invoked "
14072-                               "on an immutable cap")
14073 
14074hunk ./src/allmydata/web/filenode.py 228
14075+            if self.node.is_mutable():
14076+                if offset == False:
14077+                    return self.replace_my_contents(req)
14078+
14079+                if offset >= 0:
14080+                    return self.update_my_contents(req, offset)
14081+
14082+                raise WebError("PUT to a mutable file: Invalid offset")
14083+
14084+            else:
14085+                if offset != False:
14086+                    raise WebError("PUT to a file: append operation invoked "
14087+                                   "on an immutable cap")
14088+
14089+                assert self.parentnode and self.name
14090+                return self.replace_me_with_a_child(req, self.client, replace)
14091 
14092hunk ./src/allmydata/web/filenode.py 245
14093-            assert self.parentnode and self.name
14094-            return self.replace_me_with_a_child(req, self.client, replace)
14095         if t == "uri":
14096             if not replace:
14097                 raise ExistingChildError()
14098}
14099[docs/configuration.rst: fix more conflicts between #393 and trunk
14100Kevan Carstensen <kevan@isnotajoke.com>**20110228003426
14101 Ignore-this: 7917effdeecab00d634a06f1df8fe2cf
14102] {
14103replace ./docs/configuration.rst [A-Za-z_0-9\-\.] Tahoe Tahoe-LAFS
14104hunk ./docs/configuration.rst 324
14105     (Mutable files use a different share placement algorithm that does not
14106     currently consider this parameter.)
14107 
14108+``mutable.format = sdmf or mdmf``
14109+
14110+    This value tells Tahoe-LAFS what the default mutable file format should
14111+    be. If ``mutable.format=sdmf``, then newly created mutable files will be
14112+    in the old SDMF format. This is desirable for clients that operate on
14113+    grids where some peers run older versions of Tahoe-LAFS, as these older
14114+    versions cannot read the new MDMF mutable file format. If
14115+    ``mutable.format`` is ``mdmf``, then newly created mutable files will use
14116+    the new MDMF format, which supports efficient in-place modification and
14117+    streaming downloads. You can overwrite this value using a special
14118+    mutable-type parameter in the webapi. If you do not specify a value here,
14119+    Tahoe-LAFS will use SDMF for all newly-created mutable files.
14120+
14121+    Note that this parameter only applies to mutable files. Mutable
14122+    directories, which are stored as mutable files, are not controlled by
14123+    this parameter and will always use SDMF. We may revisit this decision
14124+    in future versions of Tahoe-LAFS.
14125+
14126+
14127+Frontend Configuration
14128+======================
14129+
14130+The Tahoe client process can run a variety of frontend file-access protocols.
14131+You will use these to create and retrieve files from the virtual filesystem.
14132+Configuration details for each are documented in the following
14133+protocol-specific guides:
14134+
14135+HTTP
14136+
14137+    Tahoe runs a webserver by default on port 3456. This interface provides a
14138+    human-oriented "WUI", with pages to create, modify, and browse
14139+    directories and files, as well as a number of pages to check on the
14140+    status of your Tahoe node. It also provides a machine-oriented "WAPI",
14141+    with a REST-ful HTTP interface that can be used by other programs
14142+    (including the CLI tools). Please see `<frontends/webapi.rst>`_ for full
14143+    details, and the ``web.port`` and ``web.static`` config variables above.
14144+    The `<frontends/download-status.rst>`_ document also describes a few WUI
14145+    status pages.
14146+
14147+CLI
14148+
14149+    The main "bin/tahoe" executable includes subcommands for manipulating the
14150+    filesystem, uploading/downloading files, and creating/running Tahoe
14151+    nodes. See `<frontends/CLI.rst>`_ for details.
14152+
14153+FTP, SFTP
14154+
14155+    Tahoe can also run both FTP and SFTP servers, and map a username/password
14156+    pair to a top-level Tahoe directory. See `<frontends/FTP-and-SFTP.rst>`_
14157+    for instructions on configuring these services, and the ``[ftpd]`` and
14158+    ``[sftpd]`` sections of ``tahoe.cfg``.
14159+
14160 
14161 Storage Server Configuration
14162 ============================
14163hunk ./docs/configuration.rst 436
14164     `<garbage-collection.rst>`_ for full details.
14165 
14166 
14167-shares.needed = (int, optional) aka "k", default 3
14168-shares.total = (int, optional) aka "N", N >= k, default 10
14169-shares.happy = (int, optional) 1 <= happy <= N, default 7
14170-
14171- These three values set the default encoding parameters. Each time a new file
14172- is uploaded, erasure-coding is used to break the ciphertext into separate
14173- pieces. There will be "N" (i.e. shares.total) pieces created, and the file
14174- will be recoverable if any "k" (i.e. shares.needed) pieces are retrieved.
14175- The default values are 3-of-10 (i.e. shares.needed = 3, shares.total = 10).
14176- Setting k to 1 is equivalent to simple replication (uploading N copies of
14177- the file).
14178-
14179- These values control the tradeoff between storage overhead, performance, and
14180- reliability. To a first approximation, a 1MB file will use (1MB*N/k) of
14181- backend storage space (the actual value will be a bit more, because of other
14182- forms of overhead). Up to N-k shares can be lost before the file becomes
14183- unrecoverable, so assuming there are at least N servers, up to N-k servers
14184- can be offline without losing the file. So large N/k ratios are more
14185- reliable, and small N/k ratios use less disk space. Clearly, k must never be
14186- smaller than N.
14187-
14188- Large values of N will slow down upload operations slightly, since more
14189- servers must be involved, and will slightly increase storage overhead due to
14190- the hash trees that are created. Large values of k will cause downloads to
14191- be marginally slower, because more servers must be involved. N cannot be
14192- larger than 256, because of the 8-bit erasure-coding algorithm that Tahoe-LAFS
14193- uses.
14194-
14195- shares.happy allows you control over the distribution of your immutable file.
14196- For a successful upload, shares are guaranteed to be initially placed on
14197- at least 'shares.happy' distinct servers, the correct functioning of any
14198- k of which is sufficient to guarantee the availability of the uploaded file.
14199- This value should not be larger than the number of servers on your grid.
14200-
14201- A value of shares.happy <= k is allowed, but does not provide any redundancy
14202- if some servers fail or lose shares.
14203-
14204- (Mutable files use a different share placement algorithm that does not
14205-  consider this parameter.)
14206-
14207-
14208-== Storage Server Configuration ==
14209-
14210-[storage]
14211-enabled = (boolean, optional)
14212-
14213- If this is True, the node will run a storage server, offering space to other
14214- clients. If it is False, the node will not run a storage server, meaning
14215- that no shares will be stored on this node. Use False this for clients who
14216- do not wish to provide storage service. The default value is True.
14217-
14218-readonly = (boolean, optional)
14219-
14220- If True, the node will run a storage server but will not accept any shares,
14221- making it effectively read-only. Use this for storage servers which are
14222- being decommissioned: the storage/ directory could be mounted read-only,
14223- while shares are moved to other servers. Note that this currently only
14224- affects immutable shares. Mutable shares (used for directories) will be
14225- written and modified anyway. See ticket #390 for the current status of this
14226- bug. The default value is False.
14227-
14228-reserved_space = (str, optional)
14229-
14230- If provided, this value defines how much disk space is reserved: the storage
14231- server will not accept any share which causes the amount of free disk space
14232- to drop below this value. (The free space is measured by a call to statvfs(2)
14233- on Unix, or GetDiskFreeSpaceEx on Windows, and is the space available to the
14234- user account under which the storage server runs.)
14235-
14236- This string contains a number, with an optional case-insensitive scale
14237- suffix like "K" or "M" or "G", and an optional "B" or "iB" suffix. So
14238- "100MB", "100M", "100000000B", "100000000", and "100000kb" all mean the same
14239- thing. Likewise, "1MiB", "1024KiB", and "1048576B" all mean the same thing.
14240-
14241-expire.enabled =
14242-expire.mode =
14243-expire.override_lease_duration =
14244-expire.cutoff_date =
14245-expire.immutable =
14246-expire.mutable =
14247-
14248- These settings control garbage-collection, in which the server will delete
14249- shares that no longer have an up-to-date lease on them. Please see the
14250- neighboring "garbage-collection.txt" document for full details.
14251-
14252-
14253-== Running A Helper ==
14254+Running A Helper
14255+================
14256 
14257 A "helper" is a regular client node that also offers the "upload helper"
14258 service.
14259}
14260[mutable/layout: remove references to the salt hash tree.
14261Kevan Carstensen <kevan@isnotajoke.com>**20110228010637
14262 Ignore-this: b3b2963ba4d0b42c78b6bba219d4deb5
14263] {
14264hunk ./src/allmydata/mutable/layout.py 577
14265     # 99          8           The offset of the EOF
14266     #
14267     # followed by salts and share data, the encrypted private key, the
14268-    # block hash tree, the salt hash tree, the share hash chain, a
14269-    # signature over the first eight fields, and a verification key.
14270+    # block hash tree, the share hash chain, a signature over the first
14271+    # eight fields, and a verification key.
14272     #
14273     # The checkstring is the first three fields -- the version number,
14274     # sequence number, root hash and root salt hash. This is consistent
14275hunk ./src/allmydata/mutable/layout.py 628
14276     #      calculate the offset for the share hash chain, and fill that
14277     #      into the offsets table.
14278     #
14279-    #   4: At the same time, we're in a position to upload the salt hash
14280-    #      tree. This is a Merkle tree over all of the salts. We use a
14281-    #      Merkle tree so that we can validate each block,salt pair as
14282-    #      we download them later. We do this using
14283-    #
14284-    #        put_salthashes(salt_hash_tree)
14285-    #
14286-    #      When you do this, I automatically put the root of the tree
14287-    #      (the hash at index 0 of the list) in its appropriate slot in
14288-    #      the signed prefix of the share.
14289-    #
14290-    #   5: We're now in a position to upload the share hash chain for
14291+    #   4: We're now in a position to upload the share hash chain for
14292     #      a share. Do that with something like:
14293     #     
14294     #        put_sharehashes(share_hash_chain)
14295hunk ./src/allmydata/mutable/layout.py 639
14296     #      The root of this tree will be put explicitly in the next
14297     #      step.
14298     #
14299-    #      TODO: Why? Why not just include it in the tree here?
14300-    #
14301-    #   6: Before putting the signature, we must first put the
14302+    #   5: Before putting the signature, we must first put the
14303     #      root_hash. Do this with:
14304     #
14305     #        put_root_hash(root_hash).
14306hunk ./src/allmydata/mutable/layout.py 872
14307             raise LayoutInvalid("I was given the wrong size block to write")
14308 
14309         # We want to write at len(MDMFHEADER) + segnum * block_size.
14310-
14311         offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
14312         data = salt + data
14313 
14314hunk ./src/allmydata/mutable/layout.py 889
14315         # tree is written, since that could cause the private key to run
14316         # into the block hash tree. Before it writes the block hash
14317         # tree, the block hash tree writing method writes the offset of
14318-        # the salt hash tree. So that's a good indicator of whether or
14319+        # the share hash chain. So that's a good indicator of whether or
14320         # not the block hash tree has been written.
14321         if "share_hash_chain" in self._offsets:
14322             raise LayoutInvalid("You must write this before the block hash tree")
14323hunk ./src/allmydata/mutable/layout.py 907
14324         The encrypted private key must be queued before the block hash
14325         tree, since we need to know how large it is to know where the
14326         block hash tree should go. The block hash tree must be put
14327-        before the salt hash tree, since its size determines the
14328+        before the share hash chain, since its size determines the
14329         offset of the share hash chain.
14330         """
14331         assert self._offsets
14332hunk ./src/allmydata/mutable/layout.py 932
14333         I queue a write vector to put the share hash chain in my
14334         argument onto the remote server.
14335 
14336-        The salt hash tree must be queued before the share hash chain,
14337-        since we need to know where the salt hash tree ends before we
14338+        The block hash tree must be queued before the share hash chain,
14339+        since we need to know where the block hash tree ends before we
14340         can know where the share hash chain starts. The share hash chain
14341         must be put before the signature, since the length of the packed
14342         share hash chain determines the offset of the signature. Also,
14343hunk ./src/allmydata/mutable/layout.py 937
14344-        semantically, you must know what the root of the salt hash tree
14345+        semantically, you must know what the root of the block hash tree
14346         is before you can generate a valid signature.
14347         """
14348         assert isinstance(sharehashes, dict)
14349hunk ./src/allmydata/mutable/layout.py 942
14350         if "share_hash_chain" not in self._offsets:
14351-            raise LayoutInvalid("You need to put the salt hash tree before "
14352+            raise LayoutInvalid("You need to put the block hash tree before "
14353                                 "you can put the share hash chain")
14354         # The signature comes after the share hash chain. If the
14355         # signature has already been written, we must not write another
14356}
14357[test_mutable.py: add test to exercise fencepost bug
14358warner@lothar.com**20110228021056
14359 Ignore-this: d2f9cf237ce6db42fb250c8ad71a4fc3
14360] {
14361hunk ./src/allmydata/test/test_mutable.py 2
14362 
14363-import os
14364+import os, re
14365 from cStringIO import StringIO
14366 from twisted.trial import unittest
14367 from twisted.internet import defer, reactor
14368hunk ./src/allmydata/test/test_mutable.py 2931
14369         self.set_up_grid()
14370         self.c = self.g.clients[0]
14371         self.nm = self.c.nodemaker
14372-        self.data = "test data" * 100000 # about 900 KiB; MDMF
14373+        self.data = "testdata " * 100000 # about 900 KiB; MDMF
14374         self.small_data = "test data" * 10 # about 90 B; SDMF
14375         return self.do_upload()
14376 
14377hunk ./src/allmydata/test/test_mutable.py 2981
14378             self.failUnlessEqual(results, new_data))
14379         return d
14380 
14381+    def test_replace_segstart1(self):
14382+        offset = 128*1024+1
14383+        new_data = "NNNN"
14384+        expected = self.data[:offset]+new_data+self.data[offset+4:]
14385+        d = self.mdmf_node.get_best_mutable_version()
14386+        d.addCallback(lambda mv:
14387+            mv.update(MutableData(new_data), offset))
14388+        d.addCallback(lambda ignored:
14389+            self.mdmf_node.download_best_version())
14390+        def _check(results):
14391+            if results != expected:
14392+                print
14393+                print "got: %s ... %s" % (results[:20], results[-20:])
14394+                print "exp: %s ... %s" % (expected[:20], expected[-20:])
14395+                self.fail("results != expected")
14396+        d.addCallback(_check)
14397+        return d
14398+
14399+    def _check_differences(self, got, expected):
14400+        # displaying arbitrary file corruption is tricky for a
14401+        # 1MB file of repeating data,, so look for likely places
14402+        # with problems and display them separately
14403+        gotmods = [mo.span() for mo in re.finditer('([A-Z]+)', got)]
14404+        expmods = [mo.span() for mo in re.finditer('([A-Z]+)', expected)]
14405+        gotspans = ["%d:%d=%s" % (start,end,got[start:end])
14406+                    for (start,end) in gotmods]
14407+        expspans = ["%d:%d=%s" % (start,end,expected[start:end])
14408+                    for (start,end) in expmods]
14409+        #print "expecting: %s" % expspans
14410+
14411+        SEGSIZE = 128*1024
14412+        if got != expected:
14413+            print "differences:"
14414+            for segnum in range(len(expected)//SEGSIZE):
14415+                start = segnum * SEGSIZE
14416+                end = (segnum+1) * SEGSIZE
14417+                got_ends = "%s .. %s" % (got[start:start+20], got[end-20:end])
14418+                exp_ends = "%s .. %s" % (expected[start:start+20], expected[end-20:end])
14419+                if got_ends != exp_ends:
14420+                    print "expected[%d]: %s" % (start, exp_ends)
14421+                    print "got     [%d]: %s" % (start, got_ends)
14422+            if expspans != gotspans:
14423+                print "expected: %s" % expspans
14424+                print "got     : %s" % gotspans
14425+            open("EXPECTED","wb").write(expected)
14426+            open("GOT","wb").write(got)
14427+            print "wrote data to EXPECTED and GOT"
14428+            self.fail("didn't get expected data")
14429+
14430+
14431+    def test_replace_locations(self):
14432+        # exercise fencepost conditions
14433+        expected = self.data
14434+        SEGSIZE = 128*1024
14435+        suspects = range(SEGSIZE-3, SEGSIZE+1)+range(2*SEGSIZE-3, 2*SEGSIZE+1)
14436+        letters = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
14437+        d = defer.succeed(None)
14438+        for offset in suspects:
14439+            new_data = letters.next()*2 # "AA", then "BB", etc
14440+            expected = expected[:offset]+new_data+expected[offset+2:]
14441+            d.addCallback(lambda ign:
14442+                          self.mdmf_node.get_best_mutable_version())
14443+            def _modify(mv, offset=offset, new_data=new_data):
14444+                # close over 'offset','new_data'
14445+                md = MutableData(new_data)
14446+                return mv.update(md, offset)
14447+            d.addCallback(_modify)
14448+            d.addCallback(lambda ignored:
14449+                          self.mdmf_node.download_best_version())
14450+            d.addCallback(self._check_differences, expected)
14451+        return d
14452+
14453 
14454     def test_replace_and_extend(self):
14455         # We should be able to replace data in the middle of a mutable
14456}
14457[mutable/publish: account for offsets on segment boundaries.
14458Kevan Carstensen <kevan@isnotajoke.com>**20110228083327
14459 Ignore-this: c8758a0580fcc15a22c2f8582d758a6b
14460] {
14461hunk ./src/allmydata/mutable/filenode.py 17
14462 from pycryptopp.cipher.aes import AES
14463 
14464 from allmydata.mutable.publish import Publish, MutableData,\
14465-                                      DEFAULT_MAX_SEGMENT_SIZE, \
14466                                       TransformingUploadable
14467 from allmydata.mutable.common import MODE_READ, MODE_WRITE, MODE_CHECK, UnrecoverableFileError, \
14468      ResponseCache, UncoordinatedWriteError
14469hunk ./src/allmydata/mutable/filenode.py 1058
14470         # appending data to the file.
14471         assert offset <= self.get_size()
14472 
14473+        segsize = self._version[3]
14474         # We'll need the segment that the data starts in, regardless of
14475         # what we'll do later.
14476hunk ./src/allmydata/mutable/filenode.py 1061
14477-        start_segment = mathutil.div_ceil(offset, DEFAULT_MAX_SEGMENT_SIZE)
14478+        start_segment = mathutil.div_ceil(offset, segsize)
14479         start_segment -= 1
14480 
14481         # We only need the end segment if the data we append does not go
14482hunk ./src/allmydata/mutable/filenode.py 1069
14483         end_segment = start_segment
14484         if offset + data.get_size() < self.get_size():
14485             end_data = offset + data.get_size()
14486-            end_segment = mathutil.div_ceil(end_data, DEFAULT_MAX_SEGMENT_SIZE)
14487+            end_segment = mathutil.div_ceil(end_data, segsize)
14488             end_segment -= 1
14489         self._start_segment = start_segment
14490         self._end_segment = end_segment
14491hunk ./src/allmydata/mutable/publish.py 551
14492                                                   segment_size)
14493             self.starting_segment = mathutil.div_ceil(offset,
14494                                                       segment_size)
14495-            self.starting_segment -= 1
14496+            if offset % segment_size != 0:
14497+                self.starting_segment -= 1
14498             if offset == 0:
14499                 self.starting_segment = 0
14500 
14501}
14502[tahoe-put: raise UsageError when given a nonsensical mutable type, move option validation code to the option parser.
14503Kevan Carstensen <kevan@isnotajoke.com>**20110301030807
14504 Ignore-this: 2dc19d8bd741842eff458ca553d0bf2a
14505] {
14506hunk ./src/allmydata/scripts/cli.py 179
14507         if self.from_file == u"-":
14508             self.from_file = None
14509 
14510+        if self['mutable-type'] and self['mutable-type'] not in ("sdmf", "mdmf"):
14511+            raise usage.UsageError("%s is an invalid format" % self['mutable-type'])
14512+
14513+
14514     def getSynopsis(self):
14515         return "Usage:  %s put LOCAL_FILE REMOTE_FILE" % (os.path.basename(sys.argv[0]),)
14516 
14517hunk ./src/allmydata/scripts/tahoe_put.py 33
14518     stdout = options.stdout
14519     stderr = options.stderr
14520 
14521-    if mutable_type and mutable_type not in ('sdmf', 'mdmf'):
14522-        # Don't try to pass unsupported types to the webapi
14523-        print >>stderr, "error: %s is an invalid format" % mutable_type
14524-        return 1
14525-
14526     if nodeurl[-1] != "/":
14527         nodeurl += "/"
14528     if to_file:
14529hunk ./src/allmydata/test/test_cli.py 1008
14530         return d
14531 
14532     def test_mutable_type_invalid_format(self):
14533-        self.basedir = "cli/Put/mutable_type_invalid_format"
14534-        self.set_up_grid()
14535-        data = "data" * 100000
14536-        fn1 = os.path.join(self.basedir, "data")
14537-        fileutil.write(fn1, data)
14538-        d = self.do_cli("put", "--mutable", "--mutable-type=ldmf", fn1)
14539-        def _check_failure((rc, out, err)):
14540-            self.failIfEqual(rc, 0)
14541-            self.failUnlessIn("invalid", err)
14542-        d.addCallback(_check_failure)
14543-        return d
14544+        o = cli.PutOptions()
14545+        self.failUnlessRaises(usage.UsageError,
14546+                              o.parseOptions,
14547+                              ["--mutable", "--mutable-type=ldmf"])
14548 
14549     def test_put_with_nonexistent_alias(self):
14550         # when invoked with an alias that doesn't exist, 'tahoe put'
14551}
14552[web: use None instead of False in the case of no offset, use object identity comparison to check whether or not an offset was specified.
14553Kevan Carstensen <kevan@isnotajoke.com>**20110305010858
14554 Ignore-this: 14b7550ca95ce423c9b0b7f6f14ffd2f
14555] {
14556hunk ./src/allmydata/test/test_mutable.py 2981
14557             self.failUnlessEqual(results, new_data))
14558         return d
14559 
14560+    def test_replace_beginning(self):
14561+        # We should be able to replace data at the beginning of the file
14562+        # without truncating the file
14563+        B = "beginning"
14564+        new_data = B + self.data[len(B):]
14565+        d = self.mdmf_node.get_best_mutable_version()
14566+        d.addCallback(lambda mv: mv.update(MutableData(B), 0))
14567+        d.addCallback(lambda ignored: self.mdmf_node.download_best_version())
14568+        d.addCallback(lambda results: self.failUnlessEqual(results, new_data))
14569+        return d
14570+
14571     def test_replace_segstart1(self):
14572         offset = 128*1024+1
14573         new_data = "NNNN"
14574hunk ./src/allmydata/test/test_web.py 3185
14575         d.addCallback(_get_data)
14576         d.addCallback(lambda results:
14577             self.failUnlessEqual(results, self.new_data + ("puppies" * 100)))
14578+        # and try replacing the beginning of the file
14579+        d.addCallback(lambda ignored:
14580+            self.PUT("/uri/%s?offset=0" % self.filecap, "begin"))
14581+        d.addCallback(_get_data)
14582+        d.addCallback(lambda results:
14583+            self.failUnlessEqual(results, "begin"+self.new_data[len("begin"):]+("puppies"*100)))
14584         return d
14585 
14586     def test_PUT_update_at_invalid_offset(self):
14587hunk ./src/allmydata/web/common.py 55
14588     # message? Since this call is going to be used by programmers and
14589     # their tools rather than users (through the wui), it is not
14590     # inconsistent to return that, I guess.
14591-    return int(offset)
14592+    if offset is not None:
14593+        offset = int(offset)
14594+
14595+    return offset
14596 
14597 
14598 def get_root(ctx_or_req):
14599hunk ./src/allmydata/web/filenode.py 219
14600         req = IRequest(ctx)
14601         t = get_arg(req, "t", "").strip()
14602         replace = parse_replace_arg(get_arg(req, "replace", "true"))
14603-        offset = parse_offset_arg(get_arg(req, "offset", False))
14604+        offset = parse_offset_arg(get_arg(req, "offset", None))
14605 
14606         if not t:
14607             if not replace:
14608hunk ./src/allmydata/web/filenode.py 229
14609                 raise ExistingChildError()
14610 
14611             if self.node.is_mutable():
14612-                if offset == False:
14613+                if offset is None:
14614                     return self.replace_my_contents(req)
14615 
14616                 if offset >= 0:
14617hunk ./src/allmydata/web/filenode.py 238
14618                 raise WebError("PUT to a mutable file: Invalid offset")
14619 
14620             else:
14621-                if offset != False:
14622+                if offset is not None:
14623                     raise WebError("PUT to a file: append operation invoked "
14624                                    "on an immutable cap")
14625 
14626}
14627[mutable/filenode: remove incorrect comments about segment boundaries
14628Kevan Carstensen <kevan@isnotajoke.com>**20110307081713
14629 Ignore-this: 7008644c3d9588815000a86edbf9c568
14630] {
14631hunk ./src/allmydata/mutable/filenode.py 1001
14632         offset. I return a Deferred that fires when this has been
14633         completed.
14634         """
14635-        # We have two cases here:
14636-        # 1. The new data will add few enough segments so that it does
14637-        #    not cross into the next power-of-two boundary.
14638-        # 2. It doesn't.
14639-        #
14640-        # In the former case, we can modify the file in place. In the
14641-        # latter case, we need to re-encode the file.
14642         new_size = data.get_size() + offset
14643         old_size = self.get_size()
14644         segment_size = self._version[3]
14645hunk ./src/allmydata/mutable/filenode.py 1011
14646         log.msg("got %d old segments, %d new segments" % \
14647                         (num_old_segments, num_new_segments))
14648 
14649-        # We also do a whole file re-encode if the file is an SDMF file.
14650+        # We do a whole file re-encode if the file is an SDMF file.
14651         if self._version[2]: # version[2] == SDMF salt, which MDMF lacks
14652             log.msg("doing re-encode instead of in-place update")
14653             return self._do_modify_update(data, offset)
14654hunk ./src/allmydata/mutable/filenode.py 1016
14655 
14656+        # Otherwise, we can replace just the parts that are changing.
14657         log.msg("updating in place")
14658         d = self._do_update_update(data, offset)
14659         d.addCallback(self._decode_and_decrypt_segments, data, offset)
14660}
14661[mutable: use integer division where appropriate
14662Kevan Carstensen <kevan@isnotajoke.com>**20110307082229
14663 Ignore-this: a8767e89d919c9f2a5d5fef3953d53f9
14664] {
14665hunk ./src/allmydata/mutable/filenode.py 1055
14666         segsize = self._version[3]
14667         # We'll need the segment that the data starts in, regardless of
14668         # what we'll do later.
14669-        start_segment = mathutil.div_ceil(offset, segsize)
14670-        start_segment -= 1
14671+        start_segment = offset // segsize
14672 
14673         # We only need the end segment if the data we append does not go
14674         # beyond the current end-of-file.
14675hunk ./src/allmydata/mutable/filenode.py 1062
14676         end_segment = start_segment
14677         if offset + data.get_size() < self.get_size():
14678             end_data = offset + data.get_size()
14679-            end_segment = mathutil.div_ceil(end_data, segsize)
14680-            end_segment -= 1
14681+            end_segment = end_data // segsize
14682+
14683         self._start_segment = start_segment
14684         self._end_segment = end_segment
14685 
14686hunk ./src/allmydata/mutable/publish.py 547
14687 
14688         # Calculate the starting segment for the upload.
14689         if segment_size:
14690+            # We use div_ceil instead of integer division here because
14691+            # it is semantically correct.
14692+            # If datalength isn't an even multiple of segment_size, but
14693+            # is larger than segment_size, datalength // segment_size
14694+            # will be the largest number such that num <= datalength and
14695+            # num % segment_size == 0. But that's not what we want,
14696+            # because it ignores the extra data. div_ceil will give us
14697+            # the right number of segments for the data that we're
14698+            # given.
14699             self.num_segments = mathutil.div_ceil(self.datalength,
14700                                                   segment_size)
14701hunk ./src/allmydata/mutable/publish.py 558
14702-            self.starting_segment = mathutil.div_ceil(offset,
14703-                                                      segment_size)
14704-            if offset % segment_size != 0:
14705-                self.starting_segment -= 1
14706-            if offset == 0:
14707-                self.starting_segment = 0
14708+
14709+            self.starting_segment = offset // segment_size
14710 
14711         else:
14712             self.num_segments = 0
14713hunk ./src/allmydata/mutable/publish.py 604
14714         self.end_segment = self.num_segments - 1
14715         # Now figure out where the last segment should be.
14716         if self.data.get_size() != self.datalength:
14717+            # We're updating a few segments in the middle of a mutable
14718+            # file, so we don't want to republish the whole thing.
14719+            # (we don't have enough data to do that even if we wanted
14720+            # to)
14721             end = self.data.get_size()
14722hunk ./src/allmydata/mutable/publish.py 609
14723-            self.end_segment = mathutil.div_ceil(end,
14724-                                                 segment_size)
14725-            self.end_segment -= 1
14726+            self.end_segment = end // segment_size
14727+            if end % segment_size == 0:
14728+                self.end_segment -= 1
14729+
14730         self.log("got start segment %d" % self.starting_segment)
14731         self.log("got end segment %d" % self.end_segment)
14732 
14733}
14734[mutable/layout.py: reorder on-disk format to aput variable-length fields at the end of the share, after a predictably long preamble
14735Kevan Carstensen <kevan@isnotajoke.com>**20110501224125
14736 Ignore-this: 8b2c5d29b8984dfe675c1a2ada5205cf
14737] {
14738hunk ./src/allmydata/mutable/layout.py 539
14739                                      self._readvs)
14740 
14741 
14742-MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
14743+MDMFHEADER = ">BQ32sBBQQ QQQQQQQQ"
14744 MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
14745 MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
14746 MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
14747hunk ./src/allmydata/mutable/layout.py 545
14748 MDMFCHECKSTRING = ">BQ32s"
14749 MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
14750-MDMFOFFSETS = ">QQQQQQ"
14751+MDMFOFFSETS = ">QQQQQQQQ"
14752 MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
14753hunk ./src/allmydata/mutable/layout.py 547
14754+# XXX Fix this.
14755+PRIVATE_KEY_SIZE = 2000
14756+SIGNATURE_SIZE = 10000
14757+VERIFICATION_KEY_SIZE = 2000
14758+# We know we won't ever have more than 256 shares.
14759+# XXX: This, too, can be
14760+SHARE_HASH_CHAIN_SIZE = HASH_SIZE * 256
14761 
14762 class MDMFSlotWriteProxy:
14763     implements(IMutableSlotWriter)
14764hunk ./src/allmydata/mutable/layout.py 577
14765     # 51          8           The data length of the original plaintext
14766     #-- end signed part --
14767     # 59          8           The offset of the encrypted private key
14768-    # 83          8           The offset of the signature
14769-    # 91          8           The offset of the verification key
14770-    # 67          8           The offset of the block hash tree
14771-    # 75          8           The offset of the share hash chain
14772-    # 99          8           The offset of the EOF
14773-    #
14774-    # followed by salts and share data, the encrypted private key, the
14775-    # block hash tree, the share hash chain, a signature over the first
14776-    # eight fields, and a verification key.
14777+    # 67          8           The offset of the signature
14778+    # 75          8           The offset of the verification key
14779+    # 83          8           The offset of the end of the v. key.
14780+    # 92          8           The offset of the share data
14781+    # 100         8           The offset of the block hash tree
14782+    # 108         8           The offset of the share hash chain
14783+    # 116         8           The offset of EOF
14784     #
14785hunk ./src/allmydata/mutable/layout.py 585
14786+    # followed by the encrypted private key, signature, verification
14787+    # key, share hash chain, data, and block hash tree. We order the
14788+    # fields that way to make smart downloaders -- downloaders which
14789+    # prempetively read a big part of the share -- possible.
14790+    #
14791     # The checkstring is the first three fields -- the version number,
14792     # sequence number, root hash and root salt hash. This is consistent
14793     # in meaning to what we have with SDMF files, except now instead of
14794hunk ./src/allmydata/mutable/layout.py 792
14795         data_size += self._tail_block_size
14796         data_size += SALT_SIZE
14797         self._offsets['enc_privkey'] = MDMFHEADERSIZE
14798-        self._offsets['enc_privkey'] += data_size
14799-        # We'll wait for the rest. Callers can now call my "put_block" and
14800-        # "set_checkstring" methods.
14801+
14802+        # We don't define offsets for these because we want them to be
14803+        # tightly packed -- this allows us to ignore the responsibility
14804+        # of padding individual values, and of removing that padding
14805+        # later. So nonconstant_start is where we start writing
14806+        # nonconstant data.
14807+        nonconstant_start = self._offsets['enc_privkey']
14808+        nonconstant_start += PRIVATE_KEY_SIZE
14809+        nonconstant_start += SIGNATURE_SIZE
14810+        nonconstant_start += VERIFICATION_KEY_SIZE
14811+        nonconstant_start += SHARE_HASH_CHAIN_SIZE
14812+
14813+        self._offsets['share_data'] = nonconstant_start
14814+
14815+        # Finally, we know how big the share data will be, so we can
14816+        # figure out where the block hash tree needs to go.
14817+        # XXX: But this will go away if Zooko wants to make it so that
14818+        # you don't need to know the size of the file before you start
14819+        # uploading it.
14820+        self._offsets['block_hash_tree'] = self._offsets['share_data'] + \
14821+                    data_size
14822+
14823+        # Done. We can snow start writing.
14824 
14825 
14826     def set_checkstring(self,
14827hunk ./src/allmydata/mutable/layout.py 891
14828         anything to be written yet.
14829         """
14830         if segnum >= self._num_segments:
14831-            raise LayoutInvalid("I won't overwrite the private key")
14832+            raise LayoutInvalid("I won't overwrite the block hash tree")
14833         if len(salt) != SALT_SIZE:
14834             raise LayoutInvalid("I was given a salt of size %d, but "
14835                                 "I wanted a salt of size %d")
14836hunk ./src/allmydata/mutable/layout.py 902
14837             raise LayoutInvalid("I was given the wrong size block to write")
14838 
14839         # We want to write at len(MDMFHEADER) + segnum * block_size.
14840-        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
14841+        offset = self._offsets['share_data'] + \
14842+            (self._actual_block_size * segnum)
14843         data = salt + data
14844 
14845         self._writevs.append(tuple([offset, data]))
14846hunk ./src/allmydata/mutable/layout.py 922
14847         # tree, the block hash tree writing method writes the offset of
14848         # the share hash chain. So that's a good indicator of whether or
14849         # not the block hash tree has been written.
14850-        if "share_hash_chain" in self._offsets:
14851-            raise LayoutInvalid("You must write this before the block hash tree")
14852+        if "signature" in self._offsets:
14853+            raise LayoutInvalid("You can't put the encrypted private key "
14854+                                "after putting the share hash chain")
14855+
14856+        self._offsets['share_hash_chain'] = self._offsets['enc_privkey'] + \
14857+                len(encprivkey)
14858 
14859hunk ./src/allmydata/mutable/layout.py 929
14860-        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + \
14861-            len(encprivkey)
14862         self._writevs.append(tuple([self._offsets['enc_privkey'], encprivkey]))
14863 
14864 
14865hunk ./src/allmydata/mutable/layout.py 944
14866         offset of the share hash chain.
14867         """
14868         assert self._offsets
14869+        assert "block_hash_tree" in self._offsets
14870+
14871         assert isinstance(blockhashes, list)
14872hunk ./src/allmydata/mutable/layout.py 947
14873-        if "block_hash_tree" not in self._offsets:
14874-            raise LayoutInvalid("You must put the encrypted private key "
14875-                                "before you put the block hash tree")
14876-        # If written, the share hash chain causes the signature offset
14877-        # to be defined.
14878-        if "signature" in self._offsets:
14879-            raise LayoutInvalid("You must put the block hash tree before "
14880-                                "you put the share hash chain")
14881+
14882         blockhashes_s = "".join(blockhashes)
14883hunk ./src/allmydata/mutable/layout.py 949
14884-        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
14885+        self._offsets['EOF'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
14886 
14887         self._writevs.append(tuple([self._offsets['block_hash_tree'],
14888                                   blockhashes_s]))
14889hunk ./src/allmydata/mutable/layout.py 969
14890         is before you can generate a valid signature.
14891         """
14892         assert isinstance(sharehashes, dict)
14893+        assert self._offsets
14894         if "share_hash_chain" not in self._offsets:
14895hunk ./src/allmydata/mutable/layout.py 971
14896-            raise LayoutInvalid("You need to put the block hash tree before "
14897-                                "you can put the share hash chain")
14898+            raise LayoutInvalid("You must put the block hash tree before "
14899+                                "putting the share hash chain")
14900+
14901         # The signature comes after the share hash chain. If the
14902         # signature has already been written, we must not write another
14903         # share hash chain. The signature writes the verification key
14904hunk ./src/allmydata/mutable/layout.py 984
14905                                 "before you write the signature")
14906         sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
14907                                   for i in sorted(sharehashes.keys())])
14908-        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
14909+        self._offsets['signature'] = self._offsets['share_hash_chain'] + \
14910+            len(sharehashes_s)
14911         self._writevs.append(tuple([self._offsets['share_hash_chain'],
14912                             sharehashes_s]))
14913 
14914hunk ./src/allmydata/mutable/layout.py 1002
14915         # Signature is defined by the routine that places the share hash
14916         # chain, so it's a good thing to look for in finding out whether
14917         # or not the share hash chain exists on the remote server.
14918-        if "signature" not in self._offsets:
14919-            raise LayoutInvalid("You need to put the share hash chain "
14920-                                "before you can put the root share hash")
14921         if len(roothash) != HASH_SIZE:
14922             raise LayoutInvalid("hashes and salts must be exactly %d bytes"
14923                                  % HASH_SIZE)
14924hunk ./src/allmydata/mutable/layout.py 1053
14925         # If we put the signature after we put the verification key, we
14926         # could end up running into the verification key, and will
14927         # probably screw up the offsets as well. So we don't allow that.
14928+        if "verification_key_end" in self._offsets:
14929+            raise LayoutInvalid("You can't put the signature after the "
14930+                                "verification key")
14931         # The method that writes the verification key defines the EOF
14932         # offset before writing the verification key, so look for that.
14933hunk ./src/allmydata/mutable/layout.py 1058
14934-        if "EOF" in self._offsets:
14935-            raise LayoutInvalid("You must write the signature before the verification key")
14936-
14937-        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
14938+        self._offsets['verification_key'] = self._offsets['signature'] +\
14939+            len(signature)
14940         self._writevs.append(tuple([self._offsets['signature'], signature]))
14941 
14942 
14943hunk ./src/allmydata/mutable/layout.py 1074
14944         if "verification_key" not in self._offsets:
14945             raise LayoutInvalid("You must put the signature before you "
14946                                 "can put the verification key")
14947-        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
14948+
14949+        self._offsets['verification_key_end'] = \
14950+            self._offsets['verification_key'] + len(verification_key)
14951         self._writevs.append(tuple([self._offsets['verification_key'],
14952                             verification_key]))
14953 
14954hunk ./src/allmydata/mutable/layout.py 1102
14955         of the write vectors that I've dealt with so far to be published
14956         to the remote server, ending the write process.
14957         """
14958-        if "EOF" not in self._offsets:
14959+        if "verification_key_end" not in self._offsets:
14960             raise LayoutInvalid("You must put the verification key before "
14961                                 "you can publish the offsets")
14962         offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
14963hunk ./src/allmydata/mutable/layout.py 1108
14964         offsets = struct.pack(MDMFOFFSETS,
14965                               self._offsets['enc_privkey'],
14966-                              self._offsets['block_hash_tree'],
14967                               self._offsets['share_hash_chain'],
14968                               self._offsets['signature'],
14969                               self._offsets['verification_key'],
14970hunk ./src/allmydata/mutable/layout.py 1111
14971+                              self._offsets['verification_key_end'],
14972+                              self._offsets['share_data'],
14973+                              self._offsets['block_hash_tree'],
14974                               self._offsets['EOF'])
14975         self._writevs.append(tuple([offsets_offset, offsets]))
14976         encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
14977hunk ./src/allmydata/mutable/layout.py 1227
14978         # MDMF, though we'll be left with 4 more bytes than we
14979         # need if this ends up being MDMF. This is probably less
14980         # expensive than the cost of a second roundtrip.
14981-        readvs = [(0, 107)]
14982+        readvs = [(0, 123)]
14983         d = self._read(readvs, force_remote)
14984         d.addCallback(self._process_encoding_parameters)
14985         d.addCallback(self._process_offsets)
14986hunk ./src/allmydata/mutable/layout.py 1330
14987             read_length = MDMFOFFSETS_LENGTH
14988             end = read_offset + read_length
14989             (encprivkey,
14990-             blockhashes,
14991              sharehashes,
14992              signature,
14993              verification_key,
14994hunk ./src/allmydata/mutable/layout.py 1333
14995+             verification_key_end,
14996+             sharedata,
14997+             blockhashes,
14998              eof) = struct.unpack(MDMFOFFSETS,
14999                                   offsets[read_offset:end])
15000             self._offsets = {}
15001hunk ./src/allmydata/mutable/layout.py 1344
15002             self._offsets['share_hash_chain'] = sharehashes
15003             self._offsets['signature'] = signature
15004             self._offsets['verification_key'] = verification_key
15005+            self._offsets['verification_key_end']= \
15006+                verification_key_end
15007             self._offsets['EOF'] = eof
15008hunk ./src/allmydata/mutable/layout.py 1347
15009+            self._offsets['share_data'] = sharedata
15010 
15011 
15012     def get_block_and_salt(self, segnum, queue=False):
15013hunk ./src/allmydata/mutable/layout.py 1357
15014         """
15015         d = self._maybe_fetch_offsets_and_header()
15016         def _then(ignored):
15017-            if self._version_number == 1:
15018-                base_share_offset = MDMFHEADERSIZE
15019-            else:
15020-                base_share_offset = self._offsets['share_data']
15021+            base_share_offset = self._offsets['share_data']
15022 
15023             if segnum + 1 > self._num_segments:
15024                 raise LayoutInvalid("Not a valid segment number")
15025hunk ./src/allmydata/mutable/layout.py 1430
15026         def _then(ignored):
15027             blockhashes_offset = self._offsets['block_hash_tree']
15028             if self._version_number == 1:
15029-                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
15030+                blockhashes_length = self._offsets['EOF'] - blockhashes_offset
15031             else:
15032                 blockhashes_length = self._offsets['share_data'] - blockhashes_offset
15033             readvs = [(blockhashes_offset, blockhashes_length)]
15034hunk ./src/allmydata/mutable/layout.py 1501
15035             if self._version_number == 0:
15036                 privkey_length = self._offsets['EOF'] - privkey_offset
15037             else:
15038-                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
15039+                privkey_length = self._offsets['share_hash_chain'] - privkey_offset
15040             readvs = [(privkey_offset, privkey_length)]
15041             return readvs
15042         d.addCallback(_make_readvs)
15043hunk ./src/allmydata/mutable/layout.py 1549
15044         def _make_readvs(ignored):
15045             if self._version_number == 1:
15046                 vk_offset = self._offsets['verification_key']
15047-                vk_length = self._offsets['EOF'] - vk_offset
15048+                vk_length = self._offsets['verification_key_end'] - vk_offset
15049             else:
15050                 vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
15051                 vk_length = self._offsets['signature'] - vk_offset
15052hunk ./src/allmydata/test/test_storage.py 26
15053 from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
15054                                      LayoutInvalid, MDMFSIGNABLEHEADER, \
15055                                      SIGNED_PREFIX, MDMFHEADER, \
15056-                                     MDMFOFFSETS, SDMFSlotWriteProxy
15057+                                     MDMFOFFSETS, SDMFSlotWriteProxy, \
15058+                                     PRIVATE_KEY_SIZE, \
15059+                                     SIGNATURE_SIZE, \
15060+                                     VERIFICATION_KEY_SIZE, \
15061+                                     SHARE_HASH_CHAIN_SIZE
15062 from allmydata.interfaces import BadWriteEnablerError
15063 from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
15064 from allmydata.test.common_web import WebRenderingMixin
15065hunk ./src/allmydata/test/test_storage.py 1408
15066 
15067         # The encrypted private key comes after the shares + salts
15068         offset_size = struct.calcsize(MDMFOFFSETS)
15069-        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
15070-        # The blockhashes come after the private key
15071-        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
15072-        # The sharehashes come after the salt hashes
15073-        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
15074-        # The signature comes after the share hash chain
15075+        encrypted_private_key_offset = len(data) + offset_size
15076+        # The share has chain comes after the private key
15077+        sharehashes_offset = encrypted_private_key_offset + \
15078+            len(self.encprivkey)
15079+
15080+        # The signature comes after the share hash chain.
15081         signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
15082hunk ./src/allmydata/test/test_storage.py 1415
15083-        # The verification key comes after the signature
15084-        verification_offset = signature_offset + len(self.signature)
15085-        # The EOF comes after the verification key
15086-        eof_offset = verification_offset + len(self.verification_key)
15087+
15088+        verification_key_offset = signature_offset + len(self.signature)
15089+        verification_key_end = verification_key_offset + \
15090+            len(self.verification_key)
15091+
15092+        share_data_offset = offset_size
15093+        share_data_offset += PRIVATE_KEY_SIZE
15094+        share_data_offset += SIGNATURE_SIZE
15095+        share_data_offset += VERIFICATION_KEY_SIZE
15096+        share_data_offset += SHARE_HASH_CHAIN_SIZE
15097+
15098+        blockhashes_offset = share_data_offset + len(sharedata)
15099+        eof_offset = blockhashes_offset + len(self.block_hash_tree_s)
15100+
15101         data += struct.pack(MDMFOFFSETS,
15102                             encrypted_private_key_offset,
15103hunk ./src/allmydata/test/test_storage.py 1431
15104-                            blockhashes_offset,
15105                             sharehashes_offset,
15106                             signature_offset,
15107hunk ./src/allmydata/test/test_storage.py 1433
15108-                            verification_offset,
15109+                            verification_key_offset,
15110+                            verification_key_end,
15111+                            share_data_offset,
15112+                            blockhashes_offset,
15113                             eof_offset)
15114hunk ./src/allmydata/test/test_storage.py 1438
15115+
15116         self.offsets = {}
15117         self.offsets['enc_privkey'] = encrypted_private_key_offset
15118         self.offsets['block_hash_tree'] = blockhashes_offset
15119hunk ./src/allmydata/test/test_storage.py 1444
15120         self.offsets['share_hash_chain'] = sharehashes_offset
15121         self.offsets['signature'] = signature_offset
15122-        self.offsets['verification_key'] = verification_offset
15123+        self.offsets['verification_key'] = verification_key_offset
15124+        self.offsets['share_data'] = share_data_offset
15125+        self.offsets['verification_key_end'] = verification_key_end
15126         self.offsets['EOF'] = eof_offset
15127hunk ./src/allmydata/test/test_storage.py 1448
15128-        # Next, we'll add in the salts and share data,
15129-        data += sharedata
15130+
15131         # the private key,
15132         data += self.encprivkey
15133hunk ./src/allmydata/test/test_storage.py 1451
15134-        # the block hash tree,
15135-        data += self.block_hash_tree_s
15136-        # the share hash chain,
15137+        # the sharehashes
15138         data += self.share_hash_chain_s
15139         # the signature,
15140         data += self.signature
15141hunk ./src/allmydata/test/test_storage.py 1457
15142         # and the verification key
15143         data += self.verification_key
15144+        # Then we'll add in gibberish until we get to the right point.
15145+        nulls = "".join([" " for i in xrange(len(data), share_data_offset)])
15146+        data += nulls
15147+
15148+        # Then the share data
15149+        data += sharedata
15150+        # the blockhashes
15151+        data += self.block_hash_tree_s
15152         return data
15153 
15154 
15155hunk ./src/allmydata/test/test_storage.py 1729
15156         return d
15157 
15158 
15159-    def test_blockhashes_after_share_hash_chain(self):
15160+    def test_private_key_after_share_hash_chain(self):
15161         mw = self._make_new_mw("si1", 0)
15162         d = defer.succeed(None)
15163hunk ./src/allmydata/test/test_storage.py 1732
15164-        # Put everything up to and including the share hash chain
15165         for i in xrange(6):
15166             d.addCallback(lambda ignored, i=i:
15167                 mw.put_block(self.block, i, self.salt))
15168hunk ./src/allmydata/test/test_storage.py 1738
15169         d.addCallback(lambda ignored:
15170             mw.put_encprivkey(self.encprivkey))
15171         d.addCallback(lambda ignored:
15172-            mw.put_blockhashes(self.block_hash_tree))
15173-        d.addCallback(lambda ignored:
15174             mw.put_sharehashes(self.share_hash_chain))
15175 
15176hunk ./src/allmydata/test/test_storage.py 1740
15177-        # Now try to put the block hash tree again.
15178+        # Now try to put the private key again.
15179         d.addCallback(lambda ignored:
15180hunk ./src/allmydata/test/test_storage.py 1742
15181-            self.shouldFail(LayoutInvalid, "test repeat salthashes",
15182-                            None,
15183-                            mw.put_blockhashes, self.block_hash_tree))
15184-        return d
15185-
15186-
15187-    def test_encprivkey_after_blockhashes(self):
15188-        mw = self._make_new_mw("si1", 0)
15189-        d = defer.succeed(None)
15190-        # Put everything up to and including the block hash tree
15191-        for i in xrange(6):
15192-            d.addCallback(lambda ignored, i=i:
15193-                mw.put_block(self.block, i, self.salt))
15194-        d.addCallback(lambda ignored:
15195-            mw.put_encprivkey(self.encprivkey))
15196-        d.addCallback(lambda ignored:
15197-            mw.put_blockhashes(self.block_hash_tree))
15198-        d.addCallback(lambda ignored:
15199-            self.shouldFail(LayoutInvalid, "out of order private key",
15200+            self.shouldFail(LayoutInvalid, "test repeat private key",
15201                             None,
15202                             mw.put_encprivkey, self.encprivkey))
15203         return d
15204hunk ./src/allmydata/test/test_storage.py 1748
15205 
15206 
15207-    def test_share_hash_chain_after_signature(self):
15208-        mw = self._make_new_mw("si1", 0)
15209-        d = defer.succeed(None)
15210-        # Put everything up to and including the signature
15211-        for i in xrange(6):
15212-            d.addCallback(lambda ignored, i=i:
15213-                mw.put_block(self.block, i, self.salt))
15214-        d.addCallback(lambda ignored:
15215-            mw.put_encprivkey(self.encprivkey))
15216-        d.addCallback(lambda ignored:
15217-            mw.put_blockhashes(self.block_hash_tree))
15218-        d.addCallback(lambda ignored:
15219-            mw.put_sharehashes(self.share_hash_chain))
15220-        d.addCallback(lambda ignored:
15221-            mw.put_root_hash(self.root_hash))
15222-        d.addCallback(lambda ignored:
15223-            mw.put_signature(self.signature))
15224-        # Now try to put the share hash chain again. This should fail
15225-        d.addCallback(lambda ignored:
15226-            self.shouldFail(LayoutInvalid, "out of order share hash chain",
15227-                            None,
15228-                            mw.put_sharehashes, self.share_hash_chain))
15229-        return d
15230-
15231-
15232     def test_signature_after_verification_key(self):
15233         mw = self._make_new_mw("si1", 0)
15234         d = defer.succeed(None)
15235hunk ./src/allmydata/test/test_storage.py 1877
15236         mw = self._make_new_mw("si1", 0)
15237         # Test writing some blocks.
15238         read = self.ss.remote_slot_readv
15239-        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
15240+        expected_private_key_offset = struct.calcsize(MDMFHEADER)
15241+        expected_sharedata_offset = struct.calcsize(MDMFHEADER) + \
15242+                                    PRIVATE_KEY_SIZE + \
15243+                                    SIGNATURE_SIZE + \
15244+                                    VERIFICATION_KEY_SIZE + \
15245+                                    SHARE_HASH_CHAIN_SIZE
15246         written_block_size = 2 + len(self.salt)
15247         written_block = self.block + self.salt
15248         for i in xrange(6):
15249hunk ./src/allmydata/test/test_storage.py 1903
15250                 self.failUnlessEqual(read("si1", [0], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
15251                                 {0: [written_block]})
15252 
15253-            expected_private_key_offset = expected_sharedata_offset + \
15254-                                      len(written_block) * 6
15255             self.failUnlessEqual(len(self.encprivkey), 7)
15256             self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
15257                                  {0: [self.encprivkey]})
15258hunk ./src/allmydata/test/test_storage.py 1907
15259 
15260-            expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
15261+            expected_block_hash_offset = expected_sharedata_offset + \
15262+                        (6 * written_block_size)
15263             self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
15264             self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
15265                                  {0: [self.block_hash_tree_s]})
15266hunk ./src/allmydata/test/test_storage.py 1913
15267 
15268-            expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
15269+            expected_share_hash_offset = expected_private_key_offset + len(self.encprivkey)
15270             self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
15271                                  {0: [self.share_hash_chain_s]})
15272 
15273hunk ./src/allmydata/test/test_storage.py 1919
15274             self.failUnlessEqual(read("si1", [0], [(9, 32)]),
15275                                  {0: [self.root_hash]})
15276-            expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
15277+            expected_signature_offset = expected_share_hash_offset + \
15278+                len(self.share_hash_chain_s)
15279             self.failUnlessEqual(len(self.signature), 9)
15280             self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
15281                                  {0: [self.signature]})
15282hunk ./src/allmydata/test/test_storage.py 1941
15283             self.failUnlessEqual(n, 10)
15284             self.failUnlessEqual(segsize, 6)
15285             self.failUnlessEqual(datalen, 36)
15286-            expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
15287+            expected_eof_offset = expected_block_hash_offset + \
15288+                len(self.block_hash_tree_s)
15289 
15290             # Check the version number to make sure that it is correct.
15291             expected_version_number = struct.pack(">B", 1)
15292hunk ./src/allmydata/test/test_storage.py 1969
15293             expected_offset = struct.pack(">Q", expected_private_key_offset)
15294             self.failUnlessEqual(read("si1", [0], [(59, 8)]),
15295                                  {0: [expected_offset]})
15296-            expected_offset = struct.pack(">Q", expected_block_hash_offset)
15297+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
15298             self.failUnlessEqual(read("si1", [0], [(67, 8)]),
15299                                  {0: [expected_offset]})
15300hunk ./src/allmydata/test/test_storage.py 1972
15301-            expected_offset = struct.pack(">Q", expected_share_hash_offset)
15302+            expected_offset = struct.pack(">Q", expected_signature_offset)
15303             self.failUnlessEqual(read("si1", [0], [(75, 8)]),
15304                                  {0: [expected_offset]})
15305hunk ./src/allmydata/test/test_storage.py 1975
15306-            expected_offset = struct.pack(">Q", expected_signature_offset)
15307+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
15308             self.failUnlessEqual(read("si1", [0], [(83, 8)]),
15309                                  {0: [expected_offset]})
15310hunk ./src/allmydata/test/test_storage.py 1978
15311-            expected_offset = struct.pack(">Q", expected_verification_key_offset)
15312+            expected_offset = struct.pack(">Q", expected_verification_key_offset + len(self.verification_key))
15313             self.failUnlessEqual(read("si1", [0], [(91, 8)]),
15314                                  {0: [expected_offset]})
15315hunk ./src/allmydata/test/test_storage.py 1981
15316-            expected_offset = struct.pack(">Q", expected_eof_offset)
15317+            expected_offset = struct.pack(">Q", expected_sharedata_offset)
15318             self.failUnlessEqual(read("si1", [0], [(99, 8)]),
15319                                  {0: [expected_offset]})
15320hunk ./src/allmydata/test/test_storage.py 1984
15321+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
15322+            self.failUnlessEqual(read("si1", [0], [(107, 8)]),
15323+                                 {0: [expected_offset]})
15324+            expected_offset = struct.pack(">Q", expected_eof_offset)
15325+            self.failUnlessEqual(read("si1", [0], [(115, 8)]),
15326+                                 {0: [expected_offset]})
15327         d.addCallback(_check_publish)
15328         return d
15329 
15330hunk ./src/allmydata/test/test_storage.py 2117
15331         for i in xrange(6):
15332             d.addCallback(lambda ignored, i=i:
15333                 mw0.put_block(self.block, i, self.salt))
15334-        # Try to write the block hashes before writing the encrypted
15335-        # private key
15336-        d.addCallback(lambda ignored:
15337-            self.shouldFail(LayoutInvalid, "block hashes before key",
15338-                            None, mw0.put_blockhashes,
15339-                            self.block_hash_tree))
15340-
15341-        # Write the private key.
15342-        d.addCallback(lambda ignored:
15343-            mw0.put_encprivkey(self.encprivkey))
15344-
15345 
15346hunk ./src/allmydata/test/test_storage.py 2118
15347-        # Try to write the share hash chain without writing the block
15348-        # hash tree
15349+        # Try to write the share hash chain without writing the
15350+        # encrypted private key
15351         d.addCallback(lambda ignored:
15352             self.shouldFail(LayoutInvalid, "share hash chain before "
15353hunk ./src/allmydata/test/test_storage.py 2122
15354-                                           "salt hash tree",
15355+                                           "private key",
15356                             None,
15357                             mw0.put_sharehashes, self.share_hash_chain))
15358hunk ./src/allmydata/test/test_storage.py 2125
15359-
15360-        # Try to write the root hash and without writing either the
15361-        # block hashes or the or the share hashes
15362+        # Write the private key.
15363         d.addCallback(lambda ignored:
15364hunk ./src/allmydata/test/test_storage.py 2127
15365-            self.shouldFail(LayoutInvalid, "root hash before share hashes",
15366-                            None,
15367-                            mw0.put_root_hash, self.root_hash))
15368+            mw0.put_encprivkey(self.encprivkey))
15369 
15370         # Now write the block hashes and try again
15371         d.addCallback(lambda ignored:
15372hunk ./src/allmydata/test/test_storage.py 2133
15373             mw0.put_blockhashes(self.block_hash_tree))
15374 
15375-        d.addCallback(lambda ignored:
15376-            self.shouldFail(LayoutInvalid, "root hash before share hashes",
15377-                            None, mw0.put_root_hash, self.root_hash))
15378-
15379         # We haven't yet put the root hash on the share, so we shouldn't
15380         # be able to sign it.
15381         d.addCallback(lambda ignored:
15382hunk ./src/allmydata/test/test_storage.py 2378
15383         # This should be enough to fill in both the encoding parameters
15384         # and the table of offsets, which will complete the version
15385         # information tuple.
15386-        d.addCallback(_make_mr, 107)
15387+        d.addCallback(_make_mr, 123)
15388         d.addCallback(lambda mr:
15389             mr.get_verinfo())
15390         def _check_verinfo(verinfo):
15391hunk ./src/allmydata/test/test_storage.py 2412
15392         d.addCallback(_check_verinfo)
15393         # This is not enough data to read a block and a share, so the
15394         # wrapper should attempt to read this from the remote server.
15395-        d.addCallback(_make_mr, 107)
15396+        d.addCallback(_make_mr, 123)
15397         d.addCallback(lambda mr:
15398             mr.get_block_and_salt(0))
15399         def _check_block_and_salt((block, salt)):
15400hunk ./src/allmydata/test/test_storage.py 2420
15401             self.failUnlessEqual(salt, self.salt)
15402             self.failUnlessEqual(self.rref.read_count, 1)
15403         # This should be enough data to read one block.
15404-        d.addCallback(_make_mr, 249)
15405+        d.addCallback(_make_mr, 123 + PRIVATE_KEY_SIZE + SIGNATURE_SIZE + VERIFICATION_KEY_SIZE + SHARE_HASH_CHAIN_SIZE + 140)
15406         d.addCallback(lambda mr:
15407             mr.get_block_and_salt(0))
15408         d.addCallback(_check_block_and_salt)
15409hunk ./src/allmydata/test/test_storage.py 2438
15410         # This should be enough to get us the encoding parameters,
15411         # offset table, and everything else we need to build a verinfo
15412         # string.
15413-        d.addCallback(_make_mr, 107)
15414+        d.addCallback(_make_mr, 123)
15415         d.addCallback(lambda mr:
15416             mr.get_verinfo())
15417         def _check_verinfo(verinfo):
15418hunk ./src/allmydata/test/test_storage.py 2473
15419             self.failUnlessEqual(self.rref.read_count, 0)
15420         d.addCallback(_check_verinfo)
15421         # This shouldn't be enough to read any share data.
15422-        d.addCallback(_make_mr, 107)
15423+        d.addCallback(_make_mr, 123)
15424         d.addCallback(lambda mr:
15425             mr.get_block_and_salt(0))
15426         def _check_block_and_salt((block, salt)):
15427}
15428[uri.py: Add MDMF cap
15429Kevan Carstensen <kevan@isnotajoke.com>**20110501224249
15430 Ignore-this: a6d1046d33f5cc811c5e8b10af925f33
15431] {
15432hunk ./src/allmydata/interfaces.py 546
15433 
15434 class IMutableFileURI(Interface):
15435     """I am a URI which represents a mutable filenode."""
15436+    def get_extension_params():
15437+        """Return the extension parameters in the URI"""
15438 
15439 class IDirectoryURI(Interface):
15440     pass
15441hunk ./src/allmydata/test/test_uri.py 2
15442 
15443+import re
15444 from twisted.trial import unittest
15445 from allmydata import uri
15446 from allmydata.util import hashutil, base32
15447hunk ./src/allmydata/test/test_uri.py 259
15448         uri.CHKFileURI.init_from_string(fileURI)
15449 
15450 class Mutable(testutil.ReallyEqualMixin, unittest.TestCase):
15451-    def test_pack(self):
15452-        writekey = "\x01" * 16
15453-        fingerprint = "\x02" * 32
15454+    def setUp(self):
15455+        self.writekey = "\x01" * 16
15456+        self.fingerprint = "\x02" * 32
15457+        self.readkey = hashutil.ssk_readkey_hash(self.writekey)
15458+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15459 
15460hunk ./src/allmydata/test/test_uri.py 265
15461-        u = uri.WriteableSSKFileURI(writekey, fingerprint)
15462-        self.failUnlessReallyEqual(u.writekey, writekey)
15463-        self.failUnlessReallyEqual(u.fingerprint, fingerprint)
15464+    def test_pack(self):
15465+        u = uri.WriteableSSKFileURI(self.writekey, self.fingerprint)
15466+        self.failUnlessReallyEqual(u.writekey, self.writekey)
15467+        self.failUnlessReallyEqual(u.fingerprint, self.fingerprint)
15468         self.failIf(u.is_readonly())
15469         self.failUnless(u.is_mutable())
15470         self.failUnless(IURI.providedBy(u))
15471hunk ./src/allmydata/test/test_uri.py 281
15472         self.failUnlessReallyEqual(u, u_h)
15473 
15474         u2 = uri.from_string(u.to_string())
15475-        self.failUnlessReallyEqual(u2.writekey, writekey)
15476-        self.failUnlessReallyEqual(u2.fingerprint, fingerprint)
15477+        self.failUnlessReallyEqual(u2.writekey, self.writekey)
15478+        self.failUnlessReallyEqual(u2.fingerprint, self.fingerprint)
15479         self.failIf(u2.is_readonly())
15480         self.failUnless(u2.is_mutable())
15481         self.failUnless(IURI.providedBy(u2))
15482hunk ./src/allmydata/test/test_uri.py 297
15483         self.failUnless(isinstance(u2imm, uri.UnknownURI), u2imm)
15484 
15485         u3 = u2.get_readonly()
15486-        readkey = hashutil.ssk_readkey_hash(writekey)
15487-        self.failUnlessReallyEqual(u3.fingerprint, fingerprint)
15488+        readkey = hashutil.ssk_readkey_hash(self.writekey)
15489+        self.failUnlessReallyEqual(u3.fingerprint, self.fingerprint)
15490         self.failUnlessReallyEqual(u3.readkey, readkey)
15491         self.failUnless(u3.is_readonly())
15492         self.failUnless(u3.is_mutable())
15493hunk ./src/allmydata/test/test_uri.py 317
15494         u3_h = uri.ReadonlySSKFileURI.init_from_human_encoding(he)
15495         self.failUnlessReallyEqual(u3, u3_h)
15496 
15497-        u4 = uri.ReadonlySSKFileURI(readkey, fingerprint)
15498-        self.failUnlessReallyEqual(u4.fingerprint, fingerprint)
15499+        u4 = uri.ReadonlySSKFileURI(readkey, self.fingerprint)
15500+        self.failUnlessReallyEqual(u4.fingerprint, self.fingerprint)
15501         self.failUnlessReallyEqual(u4.readkey, readkey)
15502         self.failUnless(u4.is_readonly())
15503         self.failUnless(u4.is_mutable())
15504hunk ./src/allmydata/test/test_uri.py 350
15505         self.failUnlessReallyEqual(u5, u5_h)
15506 
15507 
15508+    def test_writable_mdmf_cap(self):
15509+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15510+        cap = u1.to_string()
15511+        u = uri.WritableMDMFFileURI.init_from_string(cap)
15512+
15513+        self.failUnless(IMutableFileURI.providedBy(u))
15514+        self.failUnlessReallyEqual(u.fingerprint, self.fingerprint)
15515+        self.failUnlessReallyEqual(u.writekey, self.writekey)
15516+        self.failUnless(u.is_mutable())
15517+        self.failIf(u.is_readonly())
15518+        self.failUnlessEqual(cap, u.to_string())
15519+
15520+        # Now get a readonly cap from the writable cap, and test that it
15521+        # degrades gracefully.
15522+        ru = u.get_readonly()
15523+        self.failUnlessReallyEqual(self.readkey, ru.readkey)
15524+        self.failUnlessReallyEqual(self.fingerprint, ru.fingerprint)
15525+        self.failUnless(ru.is_mutable())
15526+        self.failUnless(ru.is_readonly())
15527+
15528+        # Now get a verifier cap.
15529+        vu = ru.get_verify_cap()
15530+        self.failUnlessReallyEqual(self.storage_index, vu.storage_index)
15531+        self.failUnlessReallyEqual(self.fingerprint, vu.fingerprint)
15532+        self.failUnless(IVerifierURI.providedBy(vu))
15533+
15534+    def test_readonly_mdmf_cap(self):
15535+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15536+        cap = u1.to_string()
15537+        u2 = uri.ReadonlyMDMFFileURI.init_from_string(cap)
15538+
15539+        self.failUnlessReallyEqual(u2.fingerprint, self.fingerprint)
15540+        self.failUnlessReallyEqual(u2.readkey, self.readkey)
15541+        self.failUnless(u2.is_readonly())
15542+        self.failUnless(u2.is_mutable())
15543+
15544+        vu = u2.get_verify_cap()
15545+        self.failUnlessEqual(u2.storage_index, self.storage_index)
15546+        self.failUnlessEqual(u2.fingerprint, self.fingerprint)
15547+
15548+    def test_create_writable_mdmf_cap_from_readcap(self):
15549+        # we shouldn't be able to create a writable MDMF cap given only a
15550+        # readcap.
15551+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15552+        cap = u1.to_string()
15553+        self.failUnlessRaises(uri.BadURIError,
15554+                              uri.WritableMDMFFileURI.init_from_string,
15555+                              cap)
15556+
15557+    def test_create_writable_mdmf_cap_from_verifycap(self):
15558+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15559+        cap = u1.to_string()
15560+        self.failUnlessRaises(uri.BadURIError,
15561+                              uri.WritableMDMFFileURI.init_from_string,
15562+                              cap)
15563+
15564+    def test_create_readonly_mdmf_cap_from_verifycap(self):
15565+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15566+        cap = u1.to_string()
15567+        self.failUnlessRaises(uri.BadURIError,
15568+                              uri.ReadonlyMDMFFileURI.init_from_string,
15569+                              cap)
15570+
15571+    def test_mdmf_verifier_cap(self):
15572+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15573+        self.failUnless(u1.is_readonly())
15574+        self.failIf(u1.is_mutable())
15575+        self.failUnlessReallyEqual(self.storage_index, u1.storage_index)
15576+        self.failUnlessReallyEqual(self.fingerprint, u1.fingerprint)
15577+
15578+        cap = u1.to_string()
15579+        u2 = uri.MDMFVerifierURI.init_from_string(cap)
15580+        self.failUnless(u2.is_readonly())
15581+        self.failIf(u2.is_mutable())
15582+        self.failUnlessReallyEqual(self.storage_index, u2.storage_index)
15583+        self.failUnlessReallyEqual(self.fingerprint, u2.fingerprint)
15584+
15585+        u3 = u2.get_readonly()
15586+        self.failUnlessReallyEqual(u3, u2)
15587+
15588+        u4 = u2.get_verify_cap()
15589+        self.failUnlessReallyEqual(u4, u2)
15590+
15591+    def test_mdmf_cap_extra_information(self):
15592+        # MDMF caps can be arbitrarily extended after the fingerprint
15593+        # and key/storage index fields.
15594+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15595+        self.failUnlessEqual([], u1.get_extension_params())
15596+
15597+        cap = u1.to_string()
15598+        # Now let's append some fields. Say, 131073 (the segment size)
15599+        # and 3 (the "k" encoding parameter).
15600+        expected_extensions = []
15601+        for e in ('131073', '3'):
15602+            cap += (":%s" % e)
15603+            expected_extensions.append(e)
15604+
15605+            u2 = uri.WritableMDMFFileURI.init_from_string(cap)
15606+            self.failUnlessReallyEqual(self.writekey, u2.writekey)
15607+            self.failUnlessReallyEqual(self.fingerprint, u2.fingerprint)
15608+            self.failIf(u2.is_readonly())
15609+            self.failUnless(u2.is_mutable())
15610+
15611+            c2 = u2.to_string()
15612+            u2n = uri.WritableMDMFFileURI.init_from_string(c2)
15613+            self.failUnlessReallyEqual(u2, u2n)
15614+
15615+            # We should get the extra back when we ask for it.
15616+            self.failUnlessEqual(expected_extensions, u2.get_extension_params())
15617+
15618+            # These should be preserved through cap attenuation, too.
15619+            u3 = u2.get_readonly()
15620+            self.failUnlessReallyEqual(self.readkey, u3.readkey)
15621+            self.failUnlessReallyEqual(self.fingerprint, u3.fingerprint)
15622+            self.failUnless(u3.is_readonly())
15623+            self.failUnless(u3.is_mutable())
15624+            self.failUnlessEqual(expected_extensions, u3.get_extension_params())
15625+
15626+            c3 = u3.to_string()
15627+            u3n = uri.ReadonlyMDMFFileURI.init_from_string(c3)
15628+            self.failUnlessReallyEqual(u3, u3n)
15629+
15630+            u4 = u3.get_verify_cap()
15631+            self.failUnlessReallyEqual(self.storage_index, u4.storage_index)
15632+            self.failUnlessReallyEqual(self.fingerprint, u4.fingerprint)
15633+            self.failUnless(u4.is_readonly())
15634+            self.failIf(u4.is_mutable())
15635+
15636+            c4 = u4.to_string()
15637+            u4n = uri.MDMFVerifierURI.init_from_string(c4)
15638+            self.failUnlessReallyEqual(u4n, u4)
15639+
15640+            self.failUnlessEqual(expected_extensions, u4.get_extension_params())
15641+
15642+
15643+    def test_sdmf_cap_extra_information(self):
15644+        # For interface consistency, we define a method to get
15645+        # extensions for SDMF files as well. This method must always
15646+        # return no extensions, since SDMF files were not created with
15647+        # extensions and cannot be modified to include extensions
15648+        # without breaking older clients.
15649+        u1 = uri.WriteableSSKFileURI(self.writekey, self.fingerprint)
15650+        cap = u1.to_string()
15651+        u2 = uri.WriteableSSKFileURI.init_from_string(cap)
15652+        self.failUnlessEqual([], u2.get_extension_params())
15653+
15654+    def test_extension_character_range(self):
15655+        # As written now, we shouldn't put things other than numbers in
15656+        # the extension fields.
15657+        writecap = uri.WritableMDMFFileURI(self.writekey, self.fingerprint).to_string()
15658+        readcap  = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint).to_string()
15659+        vcap     = uri.MDMFVerifierURI(self.storage_index, self.fingerprint).to_string()
15660+        self.failUnlessRaises(uri.BadURIError,
15661+                              uri.WritableMDMFFileURI.init_from_string,
15662+                              ("%s:invalid" % writecap))
15663+        self.failUnlessRaises(uri.BadURIError,
15664+                              uri.ReadonlyMDMFFileURI.init_from_string,
15665+                              ("%s:invalid" % readcap))
15666+        self.failUnlessRaises(uri.BadURIError,
15667+                              uri.MDMFVerifierURI.init_from_string,
15668+                              ("%s:invalid" % vcap))
15669+
15670+
15671+    def test_mdmf_valid_human_encoding(self):
15672+        # What's a human encoding? Well, it's of the form:
15673+        base = "https://127.0.0.1:3456/uri/"
15674+        # With a cap on the end. For each of the cap types, we need to
15675+        # test that a valid cap (with and without the traditional
15676+        # separators) is recognized and accepted by the classes.
15677+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15678+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15679+                                     ['131073', '3'])
15680+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15681+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15682+                                     ['131073', '3'])
15683+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15684+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15685+                                 ['131073', '3'])
15686+
15687+        # These will yield six different caps.
15688+        for o in (w1, w2, r1 , r2, v1, v2):
15689+            url = base + o.to_string()
15690+            o1 = o.__class__.init_from_human_encoding(url)
15691+            self.failUnlessReallyEqual(o1, o)
15692+
15693+            # Note that our cap will, by default, have : as separators.
15694+            # But it's expected that users from, e.g., the WUI, will
15695+            # have %3A as a separator. We need to make sure that the
15696+            # initialization routine handles that, too.
15697+            cap = o.to_string()
15698+            cap = re.sub(":", "%3A", cap)
15699+            url = base + cap
15700+            o2 = o.__class__.init_from_human_encoding(url)
15701+            self.failUnlessReallyEqual(o2, o)
15702+
15703+
15704+    def test_mdmf_human_encoding_invalid_base(self):
15705+        # What's a human encoding? Well, it's of the form:
15706+        base = "https://127.0.0.1:3456/foo/bar/bazuri/"
15707+        # With a cap on the end. For each of the cap types, we need to
15708+        # test that a valid cap (with and without the traditional
15709+        # separators) is recognized and accepted by the classes.
15710+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15711+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15712+                                     ['131073', '3'])
15713+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15714+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15715+                                     ['131073', '3'])
15716+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15717+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15718+                                 ['131073', '3'])
15719+
15720+        # These will yield six different caps.
15721+        for o in (w1, w2, r1 , r2, v1, v2):
15722+            url = base + o.to_string()
15723+            self.failUnlessRaises(uri.BadURIError,
15724+                                  o.__class__.init_from_human_encoding,
15725+                                  url)
15726+
15727+    def test_mdmf_human_encoding_invalid_cap(self):
15728+        base = "https://127.0.0.1:3456/uri/"
15729+        # With a cap on the end. For each of the cap types, we need to
15730+        # test that a valid cap (with and without the traditional
15731+        # separators) is recognized and accepted by the classes.
15732+        w1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15733+        w2 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15734+                                     ['131073', '3'])
15735+        r1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15736+        r2 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15737+                                     ['131073', '3'])
15738+        v1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15739+        v2 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15740+                                 ['131073', '3'])
15741+
15742+        # These will yield six different caps.
15743+        for o in (w1, w2, r1 , r2, v1, v2):
15744+            # not exhaustive, obviously...
15745+            url = base + o.to_string() + "foobarbaz"
15746+            url2 = base + "foobarbaz" + o.to_string()
15747+            url3 = base + o.to_string()[:25] + "foo" + o.to_string()[:25]
15748+            for u in (url, url2, url3):
15749+                self.failUnlessRaises(uri.BadURIError,
15750+                                      o.__class__.init_from_human_encoding,
15751+                                      u)
15752+
15753+    def test_mdmf_from_string(self):
15754+        # Make sure that the from_string utility function works with
15755+        # MDMF caps.
15756+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint)
15757+        cap = u1.to_string()
15758+        self.failUnless(uri.is_uri(cap))
15759+        u2 = uri.from_string(cap)
15760+        self.failUnlessReallyEqual(u1, u2)
15761+        u3 = uri.from_string_mutable_filenode(cap)
15762+        self.failUnlessEqual(u3, u1)
15763+
15764+        # XXX: We should refactor the extension field into setUp
15765+        u1 = uri.WritableMDMFFileURI(self.writekey, self.fingerprint,
15766+                                     ['131073', '3'])
15767+        cap = u1.to_string()
15768+        self.failUnless(uri.is_uri(cap))
15769+        u2 = uri.from_string(cap)
15770+        self.failUnlessReallyEqual(u1, u2)
15771+        u3 = uri.from_string_mutable_filenode(cap)
15772+        self.failUnlessEqual(u3, u1)
15773+
15774+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint)
15775+        cap = u1.to_string()
15776+        self.failUnless(uri.is_uri(cap))
15777+        u2 = uri.from_string(cap)
15778+        self.failUnlessReallyEqual(u1, u2)
15779+        u3 = uri.from_string_mutable_filenode(cap)
15780+        self.failUnlessEqual(u3, u1)
15781+
15782+        u1 = uri.ReadonlyMDMFFileURI(self.readkey, self.fingerprint,
15783+                                     ['131073', '3'])
15784+        cap = u1.to_string()
15785+        self.failUnless(uri.is_uri(cap))
15786+        u2 = uri.from_string(cap)
15787+        self.failUnlessReallyEqual(u1, u2)
15788+        u3 = uri.from_string_mutable_filenode(cap)
15789+        self.failUnlessEqual(u3, u1)
15790+
15791+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint)
15792+        cap = u1.to_string()
15793+        self.failUnless(uri.is_uri(cap))
15794+        u2 = uri.from_string(cap)
15795+        self.failUnlessReallyEqual(u1, u2)
15796+        u3 = uri.from_string_verifier(cap)
15797+        self.failUnlessEqual(u3, u1)
15798+
15799+        u1 = uri.MDMFVerifierURI(self.storage_index, self.fingerprint,
15800+                                 ['131073', '3'])
15801+        cap = u1.to_string()
15802+        self.failUnless(uri.is_uri(cap))
15803+        u2 = uri.from_string(cap)
15804+        self.failUnlessReallyEqual(u1, u2)
15805+        u3 = uri.from_string_verifier(cap)
15806+        self.failUnlessEqual(u3, u1)
15807+
15808+
15809 class Dirnode(testutil.ReallyEqualMixin, unittest.TestCase):
15810     def test_pack(self):
15811         writekey = "\x01" * 16
15812hunk ./src/allmydata/uri.py 31
15813 SEP='(?::|%3A)'
15814 NUMBER='([0-9]+)'
15815 NUMBER_IGNORE='(?:[0-9]+)'
15816+OPTIONAL_EXTENSION_FIELD = '(' + SEP + '[0-9' + SEP + ']+|)'
15817 
15818 # "human-encoded" URIs are allowed to come with a leading
15819 # 'http://127.0.0.1:(8123|3456)/uri/' that will be ignored.
15820hunk ./src/allmydata/uri.py 297
15821     def get_verify_cap(self):
15822         return SSKVerifierURI(self.storage_index, self.fingerprint)
15823 
15824+    def get_extension_params(self):
15825+        return []
15826 
15827 class ReadonlySSKFileURI(_BaseURI):
15828     implements(IURI, IMutableFileURI)
15829hunk ./src/allmydata/uri.py 354
15830     def get_verify_cap(self):
15831         return SSKVerifierURI(self.storage_index, self.fingerprint)
15832 
15833+    def get_extension_params(self):
15834+        return []
15835 
15836 class SSKVerifierURI(_BaseURI):
15837     implements(IVerifierURI)
15838hunk ./src/allmydata/uri.py 401
15839     def get_verify_cap(self):
15840         return self
15841 
15842+    def get_extension_params(self):
15843+        return []
15844+
15845+class WritableMDMFFileURI(_BaseURI):
15846+    implements(IURI, IMutableFileURI)
15847+
15848+    BASE_STRING='URI:MDMF:'
15849+    STRING_RE=re.compile('^'+BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15850+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15851+
15852+    def __init__(self, writekey, fingerprint, params=[]):
15853+        self.writekey = writekey
15854+        self.readkey = hashutil.ssk_readkey_hash(writekey)
15855+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15856+        assert len(self.storage_index) == 16
15857+        self.fingerprint = fingerprint
15858+        self.extension = params
15859+
15860+    @classmethod
15861+    def init_from_human_encoding(cls, uri):
15862+        mo = cls.HUMAN_RE.search(uri)
15863+        if not mo:
15864+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15865+        params = filter(lambda x: x != '', re.split(SEP, mo.group(3)))
15866+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15867+
15868+    @classmethod
15869+    def init_from_string(cls, uri):
15870+        mo = cls.STRING_RE.search(uri)
15871+        if not mo:
15872+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15873+        params = mo.group(3)
15874+        params = filter(lambda x: x != '', params.split(":"))
15875+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15876+
15877+    def to_string(self):
15878+        assert isinstance(self.writekey, str)
15879+        assert isinstance(self.fingerprint, str)
15880+        ret = 'URI:MDMF:%s:%s' % (base32.b2a(self.writekey),
15881+                                  base32.b2a(self.fingerprint))
15882+        if self.extension:
15883+            ret += ":"
15884+            ret += ":".join(self.extension)
15885+
15886+        return ret
15887+
15888+    def __repr__(self):
15889+        return "<%s %s>" % (self.__class__.__name__, self.abbrev())
15890+
15891+    def abbrev(self):
15892+        return base32.b2a(self.writekey[:5])
15893+
15894+    def abbrev_si(self):
15895+        return base32.b2a(self.storage_index)[:5]
15896+
15897+    def is_readonly(self):
15898+        return False
15899+
15900+    def is_mutable(self):
15901+        return True
15902+
15903+    def get_readonly(self):
15904+        return ReadonlyMDMFFileURI(self.readkey, self.fingerprint, self.extension)
15905+
15906+    def get_verify_cap(self):
15907+        return MDMFVerifierURI(self.storage_index, self.fingerprint, self.extension)
15908+
15909+    def get_extension_params(self):
15910+        return self.extension
15911+
15912+class ReadonlyMDMFFileURI(_BaseURI):
15913+    implements(IURI, IMutableFileURI)
15914+
15915+    BASE_STRING='URI:MDMF-RO:'
15916+    STRING_RE=re.compile('^' +BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15917+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF-RO'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15918+
15919+    def __init__(self, readkey, fingerprint, params=[]):
15920+        self.readkey = readkey
15921+        self.storage_index = hashutil.ssk_storage_index_hash(self.readkey)
15922+        assert len(self.storage_index) == 16
15923+        self.fingerprint = fingerprint
15924+        self.extension = params
15925+
15926+    @classmethod
15927+    def init_from_human_encoding(cls, uri):
15928+        mo = cls.HUMAN_RE.search(uri)
15929+        if not mo:
15930+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15931+        params = mo.group(3)
15932+        params = filter(lambda x: x!= '', re.split(SEP, params))
15933+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15934+
15935+    @classmethod
15936+    def init_from_string(cls, uri):
15937+        mo = cls.STRING_RE.search(uri)
15938+        if not mo:
15939+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15940+
15941+        params = mo.group(3)
15942+        params = filter(lambda x: x != '', params.split(":"))
15943+        return cls(base32.a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
15944+
15945+    def to_string(self):
15946+        assert isinstance(self.readkey, str)
15947+        assert isinstance(self.fingerprint, str)
15948+        ret = 'URI:MDMF-RO:%s:%s' % (base32.b2a(self.readkey),
15949+                                     base32.b2a(self.fingerprint))
15950+        if self.extension:
15951+            ret += ":"
15952+            ret += ":".join(self.extension)
15953+
15954+        return ret
15955+
15956+    def __repr__(self):
15957+        return "<%s %s>" % (self.__class__.__name__, self.abbrev())
15958+
15959+    def abbrev(self):
15960+        return base32.b2a(self.readkey[:5])
15961+
15962+    def abbrev_si(self):
15963+        return base32.b2a(self.storage_index)[:5]
15964+
15965+    def is_readonly(self):
15966+        return True
15967+
15968+    def is_mutable(self):
15969+        return True
15970+
15971+    def get_readonly(self):
15972+        return self
15973+
15974+    def get_verify_cap(self):
15975+        return MDMFVerifierURI(self.storage_index, self.fingerprint, self.extension)
15976+
15977+    def get_extension_params(self):
15978+        return self.extension
15979+
15980+class MDMFVerifierURI(_BaseURI):
15981+    implements(IVerifierURI)
15982+
15983+    BASE_STRING='URI:MDMF-Verifier:'
15984+    STRING_RE=re.compile('^'+BASE_STRING+BASE32STR_128bits+':'+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15985+    HUMAN_RE=re.compile('^'+OPTIONALHTTPLEAD+'URI'+SEP+'MDMF-Verifier'+SEP+BASE32STR_128bits+SEP+BASE32STR_256bits+OPTIONAL_EXTENSION_FIELD+'$')
15986+
15987+    def __init__(self, storage_index, fingerprint, params=[]):
15988+        assert len(storage_index) == 16
15989+        self.storage_index = storage_index
15990+        self.fingerprint = fingerprint
15991+        self.extension = params
15992+
15993+    @classmethod
15994+    def init_from_human_encoding(cls, uri):
15995+        mo = cls.HUMAN_RE.search(uri)
15996+        if not mo:
15997+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
15998+        params = mo.group(3)
15999+        params = filter(lambda x: x != '', re.split(SEP, params))
16000+        return cls(si_a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
16001+
16002+    @classmethod
16003+    def init_from_string(cls, uri):
16004+        mo = cls.STRING_RE.search(uri)
16005+        if not mo:
16006+            raise BadURIError("'%s' doesn't look like a %s cap" % (uri, cls))
16007+        params = mo.group(3)
16008+        params = filter(lambda x: x != '', params.split(":"))
16009+        return cls(si_a2b(mo.group(1)), base32.a2b(mo.group(2)), params)
16010+
16011+    def to_string(self):
16012+        assert isinstance(self.storage_index, str)
16013+        assert isinstance(self.fingerprint, str)
16014+        ret = 'URI:MDMF-Verifier:%s:%s' % (si_b2a(self.storage_index),
16015+                                           base32.b2a(self.fingerprint))
16016+        if self.extension:
16017+            ret += ':'
16018+            ret += ":".join(self.extension)
16019+
16020+        return ret
16021+
16022+    def is_readonly(self):
16023+        return True
16024+
16025+    def is_mutable(self):
16026+        return False
16027+
16028+    def get_readonly(self):
16029+        return self
16030+
16031+    def get_verify_cap(self):
16032+        return self
16033+
16034+    def get_extension_params(self):
16035+        return self.extension
16036+
16037 class _DirectoryBaseURI(_BaseURI):
16038     implements(IURI, IDirnodeURI)
16039     def __init__(self, filenode_uri=None):
16040hunk ./src/allmydata/uri.py 831
16041             kind = "URI:SSK-RO readcap to a mutable file"
16042         elif s.startswith('URI:SSK-Verifier:'):
16043             return SSKVerifierURI.init_from_string(s)
16044+        elif s.startswith('URI:MDMF:'):
16045+            return WritableMDMFFileURI.init_from_string(s)
16046+        elif s.startswith('URI:MDMF-RO:'):
16047+            return ReadonlyMDMFFileURI.init_from_string(s)
16048+        elif s.startswith('URI:MDMF-Verifier:'):
16049+            return MDMFVerifierURI.init_from_string(s)
16050         elif s.startswith('URI:DIR2:'):
16051             if can_be_writeable:
16052                 return DirectoryURI.init_from_string(s)
16053}
16054[nodemaker, mutable/filenode: train nodemaker and filenode to handle MDMF caps
16055Kevan Carstensen <kevan@isnotajoke.com>**20110501224523
16056 Ignore-this: 1f3b4581eb583e7bb93d234182bda395
16057] {
16058hunk ./src/allmydata/mutable/filenode.py 12
16059      IMutableFileVersion, IWritable
16060 from allmydata.util import hashutil, log, consumer, deferredutil, mathutil
16061 from allmydata.util.assertutil import precondition
16062-from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
16063+from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI, \
16064+                          WritableMDMFFileURI, ReadonlyMDMFFileURI
16065 from allmydata.monitor import Monitor
16066 from pycryptopp.cipher.aes import AES
16067 
16068hunk ./src/allmydata/mutable/filenode.py 75
16069         # set to this default value in case neither of those things happen,
16070         # or in case the servermap can't find any shares to tell us what
16071         # to publish as.
16072-        # TODO: Set this back to None, and find out why the tests fail
16073-        #       with it set to None.
16074+        # XXX: Version should come in via the constructor.
16075         self._protocol_version = None
16076 
16077         # all users of this MutableFileNode go through the serializer. This
16078hunk ./src/allmydata/mutable/filenode.py 95
16079         # verification key, nor things like 'k' or 'N'. If and when someone
16080         # wants to get our contents, we'll pull from shares and fill those
16081         # in.
16082-        assert isinstance(filecap, (ReadonlySSKFileURI, WriteableSSKFileURI))
16083+        if isinstance(filecap, (WritableMDMFFileURI, ReadonlyMDMFFileURI)):
16084+            self._protocol_version = MDMF_VERSION
16085+        elif isinstance(filecap, (ReadonlySSKFileURI, WriteableSSKFileURI)):
16086+            self._protocol_version = SDMF_VERSION
16087+
16088         self._uri = filecap
16089         self._writekey = None
16090hunk ./src/allmydata/mutable/filenode.py 102
16091-        if isinstance(filecap, WriteableSSKFileURI):
16092+
16093+        if not filecap.is_readonly() and filecap.is_mutable():
16094             self._writekey = self._uri.writekey
16095         self._readkey = self._uri.readkey
16096         self._storage_index = self._uri.storage_index
16097hunk ./src/allmydata/mutable/filenode.py 131
16098         self._writekey = hashutil.ssk_writekey_hash(privkey_s)
16099         self._encprivkey = self._encrypt_privkey(self._writekey, privkey_s)
16100         self._fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
16101-        self._uri = WriteableSSKFileURI(self._writekey, self._fingerprint)
16102+        if self._protocol_version == MDMF_VERSION:
16103+            self._uri = WritableMDMFFileURI(self._writekey, self._fingerprint)
16104+        else:
16105+            self._uri = WriteableSSKFileURI(self._writekey, self._fingerprint)
16106         self._readkey = self._uri.readkey
16107         self._storage_index = self._uri.storage_index
16108         initial_contents = self._get_initial_contents(contents)
16109hunk ./src/allmydata/nodemaker.py 82
16110             return self._create_immutable(cap)
16111         if isinstance(cap, uri.CHKFileVerifierURI):
16112             return self._create_immutable_verifier(cap)
16113-        if isinstance(cap, (uri.ReadonlySSKFileURI, uri.WriteableSSKFileURI)):
16114+        if isinstance(cap, (uri.ReadonlySSKFileURI, uri.WriteableSSKFileURI,
16115+                            uri.WritableMDMFFileURI, uri.ReadonlyMDMFFileURI)):
16116             return self._create_mutable(cap)
16117         if isinstance(cap, (uri.DirectoryURI,
16118                             uri.ReadonlyDirectoryURI,
16119hunk ./src/allmydata/test/test_mutable.py 196
16120                     offset2 = 0
16121                 if offset1 == "pubkey" and IV:
16122                     real_offset = 107
16123-                elif offset1 == "share_data" and not IV:
16124-                    real_offset = 107
16125                 elif offset1 in o:
16126                     real_offset = o[offset1]
16127                 else:
16128hunk ./src/allmydata/test/test_mutable.py 270
16129         return d
16130 
16131 
16132+    def test_mdmf_filenode_cap(self):
16133+        # Test that an MDMF filenode, once created, returns an MDMF URI.
16134+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16135+        def _created(n):
16136+            self.failUnless(isinstance(n, MutableFileNode))
16137+            cap = n.get_cap()
16138+            self.failUnless(isinstance(cap, uri.WritableMDMFFileURI))
16139+            rcap = n.get_readcap()
16140+            self.failUnless(isinstance(rcap, uri.ReadonlyMDMFFileURI))
16141+            vcap = n.get_verify_cap()
16142+            self.failUnless(isinstance(vcap, uri.MDMFVerifierURI))
16143+        d.addCallback(_created)
16144+        return d
16145+
16146+
16147+    def test_create_from_mdmf_writecap(self):
16148+        # Test that the nodemaker is capable of creating an MDMF
16149+        # filenode given an MDMF cap.
16150+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16151+        def _created(n):
16152+            self.failUnless(isinstance(n, MutableFileNode))
16153+            s = n.get_uri()
16154+            self.failUnless(s.startswith("URI:MDMF"))
16155+            n2 = self.nodemaker.create_from_cap(s)
16156+            self.failUnless(isinstance(n2, MutableFileNode))
16157+            self.failUnlessEqual(n.get_storage_index(), n2.get_storage_index())
16158+            self.failUnlessEqual(n.get_uri(), n2.get_uri())
16159+        d.addCallback(_created)
16160+        return d
16161+
16162+
16163+    def test_create_from_mdmf_writecap_with_extensions(self):
16164+        # Test that the nodemaker is capable of creating an MDMF
16165+        # filenode when given a writecap with extension parameters in
16166+        # them.
16167+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16168+        def _created(n):
16169+            self.failUnless(isinstance(n, MutableFileNode))
16170+            s = n.get_uri()
16171+            s2 = "%s:3:131073" % s
16172+            n2 = self.nodemaker.create_from_cap(s2)
16173+
16174+            self.failUnlessEqual(n2.get_storage_index(), n.get_storage_index())
16175+            self.failUnlessEqual(n.get_writekey(), n2.get_writekey())
16176+        d.addCallback(_created)
16177+        return d
16178+
16179+
16180+    def test_create_from_mdmf_readcap(self):
16181+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16182+        def _created(n):
16183+            self.failUnless(isinstance(n, MutableFileNode))
16184+            s = n.get_readonly_uri()
16185+            n2 = self.nodemaker.create_from_cap(s)
16186+            self.failUnless(isinstance(n2, MutableFileNode))
16187+
16188+            # Check that it's a readonly node
16189+            self.failUnless(n2.is_readonly())
16190+        d.addCallback(_created)
16191+        return d
16192+
16193+
16194+    def test_create_from_mdmf_readcap_with_extensions(self):
16195+        # We should be able to create an MDMF filenode with the
16196+        # extension parameters without it breaking.
16197+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16198+        def _created(n):
16199+            self.failUnless(isinstance(n, MutableFileNode))
16200+            s = n.get_readonly_uri()
16201+            s = "%s:3:131073" % s
16202+
16203+            n2 = self.nodemaker.create_from_cap(s)
16204+            self.failUnless(isinstance(n2, MutableFileNode))
16205+            self.failUnless(n2.is_readonly())
16206+            self.failUnlessEqual(n.get_storage_index(), n2.get_storage_index())
16207+        d.addCallback(_created)
16208+        return d
16209+
16210+
16211+    def test_internal_version_from_cap(self):
16212+        # MutableFileNodes and MutableFileVersions have an internal
16213+        # switch that tells them whether they're dealing with an SDMF or
16214+        # MDMF mutable file when they start doing stuff. We want to make
16215+        # sure that this is set appropriately given an MDMF cap.
16216+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16217+        def _created(n):
16218+            self.uri = n.get_uri()
16219+            self.failUnlessEqual(n._protocol_version, MDMF_VERSION)
16220+
16221+            n2 = self.nodemaker.create_from_cap(self.uri)
16222+            self.failUnlessEqual(n2._protocol_version, MDMF_VERSION)
16223+        d.addCallback(_created)
16224+        return d
16225+
16226+
16227     def test_serialize(self):
16228         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
16229         calls = []
16230hunk ./src/allmydata/test/test_mutable.py 464
16231         return d
16232 
16233 
16234+    def test_download_from_mdmf_cap(self):
16235+        # We should be able to download an MDMF file given its cap
16236+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
16237+        def _created(node):
16238+            self.uri = node.get_uri()
16239+
16240+            return node.overwrite(MutableData("contents1" * 100000))
16241+        def _then(ignored):
16242+            node = self.nodemaker.create_from_cap(self.uri)
16243+            return node.download_best_version()
16244+        def _downloaded(data):
16245+            self.failUnlessEqual(data, "contents1" * 100000)
16246+        d.addCallback(_created)
16247+        d.addCallback(_then)
16248+        d.addCallback(_downloaded)
16249+        return d
16250+
16251+
16252     def test_mdmf_write_count(self):
16253         # Publishing an MDMF file should only cause one write for each
16254         # share that is to be published. Otherwise, we introduce
16255hunk ./src/allmydata/test/test_mutable.py 1735
16256     def test_verify_mdmf_bad_encprivkey(self):
16257         d = self.publish_mdmf()
16258         d.addCallback(lambda ignored:
16259-            corrupt(None, self._storage, "enc_privkey", [1]))
16260+            corrupt(None, self._storage, "enc_privkey", [0]))
16261         d.addCallback(lambda ignored:
16262             self._fn.check(Monitor(), verify=True))
16263         d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
16264hunk ./src/allmydata/test/test_mutable.py 2843
16265         return d
16266 
16267 
16268+    def test_version_extension_api(self):
16269+        # We need to define an API by which an uploader can set the
16270+        # extension parameters, and by which a downloader can retrieve
16271+        # extensions.
16272+        self.failUnless(False)
16273+
16274+
16275+    def test_extensions_from_cap(self):
16276+        self.failUnless(False)
16277+
16278+
16279+    def test_extensions_from_upload(self):
16280+        self.failUnless(False)
16281+
16282+
16283+    def test_cap_after_upload(self):
16284+        self.failUnless(False)
16285+
16286+
16287     def test_get_writekey(self):
16288         d = self.mdmf_node.get_best_mutable_version()
16289         d.addCallback(lambda bv:
16290}
16291
16292Context:
16293
16294[allmydata/__init__.py: Nicer reporting of unparseable version numbers in dependencies. fixes #1388
16295david-sarah@jacaranda.org**20110401202750
16296 Ignore-this: 9c6bd599259d2405e1caadbb3e0d8c7f
16297] 
16298[update FTP-and-SFTP.rst: the necessary patch is included in Twisted-10.1
16299Brian Warner <warner@lothar.com>**20110325232511
16300 Ignore-this: d5307faa6900f143193bfbe14e0f01a
16301] 
16302[control.py: remove all uses of s.get_serverid()
16303warner@lothar.com**20110227011203
16304 Ignore-this: f80a787953bd7fa3d40e828bde00e855
16305] 
16306[web: remove some uses of s.get_serverid(), not all
16307warner@lothar.com**20110227011159
16308 Ignore-this: a9347d9cf6436537a47edc6efde9f8be
16309] 
16310[immutable/downloader/fetcher.py: remove all get_serverid() calls
16311warner@lothar.com**20110227011156
16312 Ignore-this: fb5ef018ade1749348b546ec24f7f09a
16313] 
16314[immutable/downloader/fetcher.py: fix diversity bug in server-response handling
16315warner@lothar.com**20110227011153
16316 Ignore-this: bcd62232c9159371ae8a16ff63d22c1b
16317 
16318 When blocks terminate (either COMPLETE or CORRUPT/DEAD/BADSEGNUM), the
16319 _shares_from_server dict was being popped incorrectly (using shnum as the
16320 index instead of serverid). I'm still thinking through the consequences of
16321 this bug. It was probably benign and really hard to detect. I think it would
16322 cause us to incorrectly believe that we're pulling too many shares from a
16323 server, and thus prefer a different server rather than asking for a second
16324 share from the first server. The diversity code is intended to spread out the
16325 number of shares simultaneously being requested from each server, but with
16326 this bug, it might be spreading out the total number of shares requested at
16327 all, not just simultaneously. (note that SegmentFetcher is scoped to a single
16328 segment, so the effect doesn't last very long).
16329] 
16330[immutable/downloader/share.py: reduce get_serverid(), one left, update ext deps
16331warner@lothar.com**20110227011150
16332 Ignore-this: d8d56dd8e7b280792b40105e13664554
16333 
16334 test_download.py: create+check MyShare instances better, make sure they share
16335 Server objects, now that finder.py cares
16336] 
16337[immutable/downloader/finder.py: reduce use of get_serverid(), one left
16338warner@lothar.com**20110227011146
16339 Ignore-this: 5785be173b491ae8a78faf5142892020
16340] 
16341[immutable/offloaded.py: reduce use of get_serverid() a bit more
16342warner@lothar.com**20110227011142
16343 Ignore-this: b48acc1b2ae1b311da7f3ba4ffba38f
16344] 
16345[immutable/upload.py: reduce use of get_serverid()
16346warner@lothar.com**20110227011138
16347 Ignore-this: ffdd7ff32bca890782119a6e9f1495f6
16348] 
16349[immutable/checker.py: remove some uses of s.get_serverid(), not all
16350warner@lothar.com**20110227011134
16351 Ignore-this: e480a37efa9e94e8016d826c492f626e
16352] 
16353[add remaining get_* methods to storage_client.Server, NoNetworkServer, and
16354warner@lothar.com**20110227011132
16355 Ignore-this: 6078279ddf42b179996a4b53bee8c421
16356 MockIServer stubs
16357] 
16358[upload.py: rearrange _make_trackers a bit, no behavior changes
16359warner@lothar.com**20110227011128
16360 Ignore-this: 296d4819e2af452b107177aef6ebb40f
16361] 
16362[happinessutil.py: finally rename merge_peers to merge_servers
16363warner@lothar.com**20110227011124
16364 Ignore-this: c8cd381fea1dd888899cb71e4f86de6e
16365] 
16366[test_upload.py: factor out FakeServerTracker
16367warner@lothar.com**20110227011120
16368 Ignore-this: 6c182cba90e908221099472cc159325b
16369] 
16370[test_upload.py: server-vs-tracker cleanup
16371warner@lothar.com**20110227011115
16372 Ignore-this: 2915133be1a3ba456e8603885437e03
16373] 
16374[happinessutil.py: server-vs-tracker cleanup
16375warner@lothar.com**20110227011111
16376 Ignore-this: b856c84033562d7d718cae7cb01085a9
16377] 
16378[upload.py: more tracker-vs-server cleanup
16379warner@lothar.com**20110227011107
16380 Ignore-this: bb75ed2afef55e47c085b35def2de315
16381] 
16382[upload.py: fix var names to avoid confusion between 'trackers' and 'servers'
16383warner@lothar.com**20110227011103
16384 Ignore-this: 5d5e3415b7d2732d92f42413c25d205d
16385] 
16386[refactor: s/peer/server/ in immutable/upload, happinessutil.py, test_upload
16387warner@lothar.com**20110227011100
16388 Ignore-this: 7ea858755cbe5896ac212a925840fe68
16389 
16390 No behavioral changes, just updating variable/method names and log messages.
16391 The effects outside these three files should be minimal: some exception
16392 messages changed (to say "server" instead of "peer"), and some internal class
16393 names were changed. A few things still use "peer" to minimize external
16394 changes, like UploadResults.timings["peer_selection"] and
16395 happinessutil.merge_peers, which can be changed later.
16396] 
16397[storage_client.py: clean up test_add_server/test_add_descriptor, remove .test_servers
16398warner@lothar.com**20110227011056
16399 Ignore-this: efad933e78179d3d5fdcd6d1ef2b19cc
16400] 
16401[test_client.py, upload.py:: remove KiB/MiB/etc constants, and other dead code
16402warner@lothar.com**20110227011051
16403 Ignore-this: dc83c5794c2afc4f81e592f689c0dc2d
16404] 
16405[test: increase timeout on a network test because Francois's ARM machine hit that timeout
16406zooko@zooko.com**20110317165909
16407 Ignore-this: 380c345cdcbd196268ca5b65664ac85b
16408 I'm skeptical that the test was proceeding correctly but ran out of time. It seems more likely that it had gotten hung. But if we raise the timeout to an even more extravagant number then we can be even more certain that the test was never going to finish.
16409] 
16410[docs/configuration.rst: add a "Frontend Configuration" section
16411Brian Warner <warner@lothar.com>**20110222014323
16412 Ignore-this: 657018aa501fe4f0efef9851628444ca
16413 
16414 this points to docs/frontends/*.rst, which were previously underlinked
16415] 
16416[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
16417"Brian Warner <warner@lothar.com>"**20110221061544
16418 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
16419] 
16420[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
16421david-sarah@jacaranda.org**20110221015817
16422 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
16423] 
16424[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
16425david-sarah@jacaranda.org**20110221020125
16426 Ignore-this: b0744ed58f161bf188e037bad077fc48
16427] 
16428[Refactor StorageFarmBroker handling of servers
16429Brian Warner <warner@lothar.com>**20110221015804
16430 Ignore-this: 842144ed92f5717699b8f580eab32a51
16431 
16432 Pass around IServer instance instead of (peerid, rref) tuple. Replace
16433 "descriptor" with "server". Other replacements:
16434 
16435  get_all_servers -> get_connected_servers/get_known_servers
16436  get_servers_for_index -> get_servers_for_psi (now returns IServers)
16437 
16438 This change still needs to be pushed further down: lots of code is now
16439 getting the IServer and then distributing (peerid, rref) internally.
16440 Instead, it ought to distribute the IServer internally and delay
16441 extracting a serverid or rref until the last moment.
16442 
16443 no_network.py was updated to retain parallelism.
16444] 
16445[TAG allmydata-tahoe-1.8.2
16446warner@lothar.com**20110131020101] 
16447Patch bundle hash:
1644849bc083e307a6234ae0b7a143c3cc0b77d59880a