Ticket #393: 393status17.dpatch

File 393status17.dpatch, 434.2 KB (added by kevan, at 2010-07-08T21:23:27Z)
Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Misc. changes to support the work I'm doing

      - Add a notion of file version number to interfaces.py
      - Alter mutable file node interfaces to have a notion of version,
        though this may be changed later.
      - Alter mutable/filenode.py to conform to these changes.
      - Add a salt hasher to util/hashutil.py

Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * nodemaker.py: create MDMF files when asked to

Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * storage/server.py: minor code cleanup

Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.

Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition

Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Alter the ServermapUpdater to find MDMF files

  The servermapupdater should find MDMF files on a grid in the same way
  that it finds SDMF files. This patch makes it do that.

Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Make a segmented mutable uploader

  The mutable file uploader should be able to publish files with one
  segment and files with multiple segments. This patch makes it do that.
  This is still incomplete, and rather ugly -- I need to flesh out error
  handling, I need to write tests, and I need to remove some of the uglier
  kludges in the process before I can call this done.

Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Write a segmented mutable downloader

  The segmented mutable downloader can deal with MDMF files (files with
  one or more segments in MDMF format) and SDMF files (files with one
  segment in SDMF format). It is backwards compatible with the old
  file format.

  This patch also contains tests for the segmented mutable downloader.

Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/checker.py: check MDMF files

  This patch adapts the mutable file checker and verifier to check and
  verify MDMF files. It does this by using the new segmented downloader,
  which is trained to perform verification operations on request. This
  removes some code duplication.

Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/retrieve.py: learn how to verify mutable files

Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * interfaces.py: add IMutableSlotWriter

Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: temporarily disable two tests that are now irrelevant

Fri Jul  2 15:55:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Add MDMF reader and writer, and SDMF writer

  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
  object proxies that exist for immutable files. They abstract away
  details of connection, state, and caching from their callers (in this
  case, the download, servermap updater, and uploader), and expose methods
  to get and set information on the remote server.

  MDMFSlotReadProxy reads a mutable file from the server, doing the right
  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
  allows callers to tell it how to batch and flush reads.

  MDMFSlotWriteProxy writes an MDMF mutable file to a server.

  SDMFSlotWriteProxy writes an SDMF mutable file to a server.

  This patch also includes tests for MDMFSlotReadProxy,
  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.

Fri Jul  2 15:55:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: cleanup + simplification

Fri Jul  2 15:57:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: remove tests that are no longer relevant

Tue Jul  6 14:52:17 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * interfaces.py: create IMutableUploadable

Tue Jul  6 14:52:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: add MutableDataHandle and MutableFileHandle

Tue Jul  6 14:55:41 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: reorganize in preparation of file-like uploadables

Tue Jul  6 14:56:49 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle

Wed Jul  7 17:00:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Alter tests to work with the new APIs

Wed Jul  7 17:07:32 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * Alter mutable files to use file-like objects for publishing instead of strings.

Thu Jul  8 12:34:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * frontends/sftpd.py: alter a mutable file overwrite to work with the new API

Thu Jul  8 12:35:22 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * test/test_sftp.py: alter a setup routine to work with new mutable file APIs.

Thu Jul  8 12:36:00 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
  * mutable/publish.py: make MutableFileHandle seek to the beginning of its file handle before reading.

New patches:
118
119[Misc. changes to support the work I'm doing
120Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
121 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
122 
123     - Add a notion of file version number to interfaces.py
124     - Alter mutable file node interfaces to have a notion of version,
125       though this may be changed later.
126     - Alter mutable/filenode.py to conform to these changes.
127     - Add a salt hasher to util/hashutil.py
128] {
129hunk ./src/allmydata/interfaces.py 7
130      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
131 
132 HASH_SIZE=32
133+SALT_SIZE=16
134+
135+SDMF_VERSION=0
136+MDMF_VERSION=1
137 
138 Hash = StringConstraint(maxLength=HASH_SIZE,
139                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
140hunk ./src/allmydata/interfaces.py 811
141         writer-visible data using this writekey.
142         """
143 
144+    def set_version(version):
145+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
146+        we upload in SDMF for reasons of compatibility. If you want to
147+        change this, set_version will let you do that.
148+
149+        To say that this file should be uploaded in SDMF, pass in a 0. To
150+        say that the file should be uploaded as MDMF, pass in a 1.
151+        """
152+
153+    def get_version():
154+        """Returns the mutable file protocol version."""
155+
156 class NotEnoughSharesError(Exception):
157     """Download was unable to get enough shares"""
158 
159hunk ./src/allmydata/mutable/filenode.py 8
160 from twisted.internet import defer, reactor
161 from foolscap.api import eventually
162 from allmydata.interfaces import IMutableFileNode, \
163-     ICheckable, ICheckResults, NotEnoughSharesError
164+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
165 from allmydata.util import hashutil, log
166 from allmydata.util.assertutil import precondition
167 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
168hunk ./src/allmydata/mutable/filenode.py 67
169         self._sharemap = {} # known shares, shnum-to-[nodeids]
170         self._cache = ResponseCache()
171         self._most_recent_size = None
172+        # filled in after __init__ if we're being created for the first time;
173+        # filled in by the servermap updater before publishing, otherwise.
174+        # set to this default value in case neither of those things happen,
175+        # or in case the servermap can't find any shares to tell us what
176+        # to publish as.
177+        # TODO: Set this back to None, and find out why the tests fail
178+        #       with it set to None.
179+        self._protocol_version = SDMF_VERSION
180 
181         # all users of this MutableFileNode go through the serializer. This
182         # takes advantage of the fact that Deferreds discard the callbacks
183hunk ./src/allmydata/mutable/filenode.py 472
184     def _did_upload(self, res, size):
185         self._most_recent_size = size
186         return res
187+
188+
189+    def set_version(self, version):
190+        # I can be set in two ways:
191+        #  1. When the node is created.
192+        #  2. (for an existing share) when the Servermap is updated
193+        #     before I am read.
194+        assert version in (MDMF_VERSION, SDMF_VERSION)
195+        self._protocol_version = version
196+
197+
198+    def get_version(self):
199+        return self._protocol_version
200hunk ./src/allmydata/util/hashutil.py 90
201 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
202 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
203 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
204+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
205 
206 # dirnodes
207 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
208hunk ./src/allmydata/util/hashutil.py 134
209 def plaintext_segment_hasher():
210     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
211 
212+def mutable_salt_hash(data):
213+    return tagged_hash(MUTABLE_SALT_TAG, data)
214+def mutable_salt_hasher():
215+    return tagged_hasher(MUTABLE_SALT_TAG)
216+
217 KEYLEN = 16
218 IVLEN = 16
219 
220}
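
For readers of this bundle, an illustrative sketch (not part of the darcs patch itself) of how the interface changes above might be exercised. The `node` variable is an assumed, pre-existing MutableFileNode.

    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION
    from allmydata.util import hashutil

    # Pick the mutable-file format before publishing; SDMF (0) remains the
    # default, MDMF (1) opts in to the new multi-segment layout.
    node.set_version(MDMF_VERSION)
    assert node.get_version() == MDMF_VERSION

    # The new tagged hasher derives a salt from some per-segment input.
    salt = hashutil.mutable_salt_hash("per-segment input bytes")
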
221[nodemaker.py: create MDMF files when asked to
222Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
223 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
224] {
225hunk ./src/allmydata/nodemaker.py 3
226 import weakref
227 from zope.interface import implements
228-from allmydata.interfaces import INodeMaker
229+from allmydata.util.assertutil import precondition
230+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
231+                                 SDMF_VERSION, MDMF_VERSION
232 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
233 from allmydata.immutable.upload import Data
234 from allmydata.mutable.filenode import MutableFileNode
235hunk ./src/allmydata/nodemaker.py 92
236             return self._create_dirnode(filenode)
237         return None
238 
239-    def create_mutable_file(self, contents=None, keysize=None):
240+    def create_mutable_file(self, contents=None, keysize=None,
241+                            version=SDMF_VERSION):
242         n = MutableFileNode(self.storage_broker, self.secret_holder,
243                             self.default_encoding_parameters, self.history)
244hunk ./src/allmydata/nodemaker.py 96
245+        n.set_version(version)
246         d = self.key_generator.generate(keysize)
247         d.addCallback(n.create_with_keys, contents)
248         d.addCallback(lambda res: n)
249hunk ./src/allmydata/nodemaker.py 102
250         return d
251 
252-    def create_new_mutable_directory(self, initial_children={}):
253+    def create_new_mutable_directory(self, initial_children={},
254+                                     version=SDMF_VERSION):
255+        # initial_children must have metadata (i.e. {} instead of None)
256+        for (name, (node, metadata)) in initial_children.iteritems():
257+            precondition(isinstance(metadata, dict),
258+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
259+            node.raise_error()
260         d = self.create_mutable_file(lambda n:
261hunk ./src/allmydata/nodemaker.py 110
262-                                     pack_children(n, initial_children))
263+                                     pack_children(n, initial_children),
264+                                     version)
265         d.addCallback(self._create_dirnode)
266         return d
267 
268}
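
In sketch form (illustrative only, not part of the patch), the new keyword argument is used the same way the tests later in this bundle use it; `nodemaker` is an assumed NodeMaker instance:

    from allmydata.interfaces import MDMF_VERSION

    d = nodemaker.create_mutable_file("initial contents", version=MDMF_VERSION)
    def _created(node):
        # the node remembers the requested protocol version
        assert node.get_version() == MDMF_VERSION
        return node
    d.addCallback(_created)
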
269[storage/server.py: minor code cleanup
270Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
271 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
272] {
273hunk ./src/allmydata/storage/server.py 569
274                                          self)
275         return share
276 
277-    def remote_slot_readv(self, storage_index, shares, readv):
278+    def remote_slot_readv(self, storage_index, shares, readvs):
279         start = time.time()
280         self.count("readv")
281         si_s = si_b2a(storage_index)
282hunk ./src/allmydata/storage/server.py 590
283             if sharenum in shares or not shares:
284                 filename = os.path.join(bucketdir, sharenum_s)
285                 msf = MutableShareFile(filename, self)
286-                datavs[sharenum] = msf.readv(readv)
287+                datavs[sharenum] = msf.readv(readvs)
288         log.msg("returning shares %s" % (datavs.keys(),),
289                 facility="tahoe.storage", level=log.NOISY, parent=lp)
290         self.add_latency("readv", time.time() - start)
291}
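
For context (an illustrative sketch, not part of the patch): the renamed argument is the list of read vectors that clients pass to slot_readv. Assuming `ss` is a storage server RemoteReference and `storage_index` a valid storage index, a call looks roughly like:

    # An empty share list means "all shares"; each read vector is an
    # (offset, length) pair, and the result maps shnum -> [data].
    d = ss.callRemote("slot_readv", storage_index, [], [(0, 2000)])
    d.addCallback(lambda datavs: sorted(datavs.keys()))
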
292[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
293Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
294 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
295] {
296hunk ./src/allmydata/test/test_mutable.py 151
297             chr(ord(original[byte_offset]) ^ 0x01) +
298             original[byte_offset+1:])
299 
300+def add_two(original, byte_offset):
301+    # It isn't enough to simply flip the bit for the version number,
302+    # because 1 is a valid version number. So we add two instead.
303+    return (original[:byte_offset] +
304+            chr(ord(original[byte_offset]) ^ 0x02) +
305+            original[byte_offset+1:])
306+
307 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
308     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
309     # list of shnums to corrupt.
310hunk ./src/allmydata/test/test_mutable.py 187
311                 real_offset = offset1
312             real_offset = int(real_offset) + offset2 + offset_offset
313             assert isinstance(real_offset, int), offset
314-            shares[shnum] = flip_bit(data, real_offset)
315+            if offset1 == 0: # verbyte
316+                f = add_two
317+            else:
318+                f = flip_bit
319+            shares[shnum] = f(data, real_offset)
320     return res
321 
322 def make_storagebroker(s=None, num_peers=10):
323hunk ./src/allmydata/test/test_mutable.py 423
324         d.addCallback(_created)
325         return d
326 
327+
328     def test_modify_backoffer(self):
329         def _modifier(old_contents, servermap, first_time):
330             return old_contents + "line2"
331hunk ./src/allmydata/test/test_mutable.py 658
332         d.addCallback(_created)
333         return d
334 
335+
336     def _copy_shares(self, ignored, index):
337         shares = self._storage._peers
338         # we need a deep copy
339}
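
To make the reasoning behind add_two concrete (illustrative only): now that both 0 (SDMF) and 1 (MDMF) are valid version bytes, flipping the low bit of a zero verbyte still yields a valid version, while XORing with 0x02 always lands on an invalid one:

    data = "\x00" + "rest of the share..."
    assert ord(flip_bit(data, 0)[0]) == 1   # still valid: looks like MDMF
    assert ord(add_two(data, 0)[0]) == 2    # invalid version, as intended
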
340[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
341Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
342 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
343] {
344hunk ./src/allmydata/test/test_mutable.py 168
345                 and shnum not in shnums_to_corrupt):
346                 continue
347             data = shares[shnum]
348-            (version,
349-             seqnum,
350-             root_hash,
351-             IV,
352-             k, N, segsize, datalen,
353-             o) = unpack_header(data)
354-            if isinstance(offset, tuple):
355-                offset1, offset2 = offset
356-            else:
357-                offset1 = offset
358-                offset2 = 0
359-            if offset1 == "pubkey":
360-                real_offset = 107
361-            elif offset1 in o:
362-                real_offset = o[offset1]
363-            else:
364-                real_offset = offset1
365-            real_offset = int(real_offset) + offset2 + offset_offset
366-            assert isinstance(real_offset, int), offset
367-            if offset1 == 0: # verbyte
368-                f = add_two
369-            else:
370-                f = flip_bit
371-            shares[shnum] = f(data, real_offset)
372-    return res
373+            # We're feeding the reader all of the share data, so it
374+            # won't need to use the rref that we didn't provide, nor the
375+            # storage index that we didn't provide. We do this because
376+            # the reader will work for both MDMF and SDMF.
377+            reader = MDMFSlotReadProxy(None, None, shnum, data)
378+            # We need to get the offsets for the next part.
379+            d = reader.get_verinfo()
380+            def _do_corruption(verinfo, data, shnum):
381+                (seqnum,
382+                 root_hash,
383+                 IV,
384+                 segsize,
385+                 datalen,
386+                 k, n, prefix, o) = verinfo
387+                if isinstance(offset, tuple):
388+                    offset1, offset2 = offset
389+                else:
390+                    offset1 = offset
391+                    offset2 = 0
392+                if offset1 == "pubkey":
393+                    real_offset = 107
394+                elif offset1 in o:
395+                    real_offset = o[offset1]
396+                else:
397+                    real_offset = offset1
398+                real_offset = int(real_offset) + offset2 + offset_offset
399+                assert isinstance(real_offset, int), offset
400+                if offset1 == 0: # verbyte
401+                    f = add_two
402+                else:
403+                    f = flip_bit
404+                shares[shnum] = f(data, real_offset)
405+            d.addCallback(_do_corruption, data, shnum)
406+            ds.append(d)
407+    dl = defer.DeferredList(ds)
408+    dl.addCallback(lambda ignored: res)
409+    return dl
410 
411 def make_storagebroker(s=None, num_peers=10):
412     if not s:
413hunk ./src/allmydata/test/test_mutable.py 1177
414         return d
415 
416     def test_download_fails(self):
417-        corrupt(None, self._storage, "signature")
418-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
419+        d = corrupt(None, self._storage, "signature")
420+        d.addCallback(lambda ignored:
421+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
422                             "no recoverable versions",
423                             self._fn.download_best_version)
424         return d
425hunk ./src/allmydata/test/test_mutable.py 1232
426         return d
427 
428     def test_check_all_bad_sig(self):
429-        corrupt(None, self._storage, 1) # bad sig
430-        d = self._fn.check(Monitor())
431+        d = corrupt(None, self._storage, 1) # bad sig
432+        d.addCallback(lambda ignored:
433+            self._fn.check(Monitor()))
434         d.addCallback(self.check_bad, "test_check_all_bad_sig")
435         return d
436 
437hunk ./src/allmydata/test/test_mutable.py 1239
438     def test_check_all_bad_blocks(self):
439-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
440+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
441         # the Checker won't notice this.. it doesn't look at actual data
442hunk ./src/allmydata/test/test_mutable.py 1241
443-        d = self._fn.check(Monitor())
444+        d.addCallback(lambda ignored:
445+            self._fn.check(Monitor()))
446         d.addCallback(self.check_good, "test_check_all_bad_blocks")
447         return d
448 
449hunk ./src/allmydata/test/test_mutable.py 1252
450         return d
451 
452     def test_verify_all_bad_sig(self):
453-        corrupt(None, self._storage, 1) # bad sig
454-        d = self._fn.check(Monitor(), verify=True)
455+        d = corrupt(None, self._storage, 1) # bad sig
456+        d.addCallback(lambda ignored:
457+            self._fn.check(Monitor(), verify=True))
458         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
459         return d
460 
461hunk ./src/allmydata/test/test_mutable.py 1259
462     def test_verify_one_bad_sig(self):
463-        corrupt(None, self._storage, 1, [9]) # bad sig
464-        d = self._fn.check(Monitor(), verify=True)
465+        d = corrupt(None, self._storage, 1, [9]) # bad sig
466+        d.addCallback(lambda ignored:
467+            self._fn.check(Monitor(), verify=True))
468         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
469         return d
470 
471hunk ./src/allmydata/test/test_mutable.py 1266
472     def test_verify_one_bad_block(self):
473-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
474+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
475         # the Verifier *will* notice this, since it examines every byte
476hunk ./src/allmydata/test/test_mutable.py 1268
477-        d = self._fn.check(Monitor(), verify=True)
478+        d.addCallback(lambda ignored:
479+            self._fn.check(Monitor(), verify=True))
480         d.addCallback(self.check_bad, "test_verify_one_bad_block")
481         d.addCallback(self.check_expected_failure,
482                       CorruptShareError, "block hash tree failure",
483hunk ./src/allmydata/test/test_mutable.py 1277
484         return d
485 
486     def test_verify_one_bad_sharehash(self):
487-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
488-        d = self._fn.check(Monitor(), verify=True)
489+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
490+        d.addCallback(lambda ignored:
491+            self._fn.check(Monitor(), verify=True))
492         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
493         d.addCallback(self.check_expected_failure,
494                       CorruptShareError, "corrupt hashes",
495hunk ./src/allmydata/test/test_mutable.py 1287
496         return d
497 
498     def test_verify_one_bad_encprivkey(self):
499-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
500-        d = self._fn.check(Monitor(), verify=True)
501+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
502+        d.addCallback(lambda ignored:
503+            self._fn.check(Monitor(), verify=True))
504         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
505         d.addCallback(self.check_expected_failure,
506                       CorruptShareError, "invalid privkey",
507hunk ./src/allmydata/test/test_mutable.py 1297
508         return d
509 
510     def test_verify_one_bad_encprivkey_uncheckable(self):
511-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
512+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
513         readonly_fn = self._fn.get_readonly()
514         # a read-only node has no way to validate the privkey
515hunk ./src/allmydata/test/test_mutable.py 1300
516-        d = readonly_fn.check(Monitor(), verify=True)
517+        d.addCallback(lambda ignored:
518+            readonly_fn.check(Monitor(), verify=True))
519         d.addCallback(self.check_good,
520                       "test_verify_one_bad_encprivkey_uncheckable")
521         return d
522}
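
The net effect on callers is visible in the test changes above; in sketch form (assuming the usual attributes of these test cases), corruption becomes an asynchronous step in the Deferred chain rather than a synchronous call:

    d = corrupt(None, self._storage, "signature")   # now returns a Deferred
    d.addCallback(lambda ignored: self._fn.check(Monitor(), verify=True))
    d.addCallback(self.check_bad, "corrupt-signature-is-noticed")
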
523[Alter the ServermapUpdater to find MDMF files
524Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
525 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
526 
527 The servermapupdater should find MDMF files on a grid in the same way
528 that it finds SDMF files. This patch makes it do that.
529] {
530hunk ./src/allmydata/mutable/servermap.py 7
531 from itertools import count
532 from twisted.internet import defer
533 from twisted.python import failure
534-from foolscap.api import DeadReferenceError, RemoteException, eventually
535+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
536+                         fireEventually
537 from allmydata.util import base32, hashutil, idlib, log
538 from allmydata.storage.server import si_b2a
539 from allmydata.interfaces import IServermapUpdaterStatus
540hunk ./src/allmydata/mutable/servermap.py 17
541 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
542      DictOfSets, CorruptShareError, NeedMoreDataError
543 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
544-     SIGNED_PREFIX_LENGTH
545+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
546 
547 class UpdateStatus:
548     implements(IServermapUpdaterStatus)
549hunk ./src/allmydata/mutable/servermap.py 254
550         """Return a set of versionids, one for each version that is currently
551         recoverable."""
552         versionmap = self.make_versionmap()
553-
554         recoverable_versions = set()
555         for (verinfo, shares) in versionmap.items():
556             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
557hunk ./src/allmydata/mutable/servermap.py 366
558         self._servers_responded = set()
559 
560         # how much data should we read?
561+        # SDMF:
562         #  * if we only need the checkstring, then [0:75]
563         #  * if we need to validate the checkstring sig, then [543ish:799ish]
564         #  * if we need the verification key, then [107:436ish]
565hunk ./src/allmydata/mutable/servermap.py 374
566         #  * if we need the encrypted private key, we want [-1216ish:]
567         #   * but we can't read from negative offsets
568         #   * the offset table tells us the 'ish', also the positive offset
569-        # A future version of the SMDF slot format should consider using
570-        # fixed-size slots so we can retrieve less data. For now, we'll just
571-        # read 2000 bytes, which also happens to read enough actual data to
572-        # pre-fetch a 9-entry dirnode.
573+        # MDMF:
574+        #  * Checkstring? [0:72]
575+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
576+        #    the offset table will tell us for sure.
577+        #  * If we need the verification key, we have to consult the offset
578+        #    table as well.
579+        # At this point, we don't know which we are. Our filenode can
580+        # tell us, but it might be lying -- in some cases, we're
581+        # responsible for telling it which kind of file it is.
582         self._read_size = 4000
583         if mode == MODE_CHECK:
584             # we use unpack_prefix_and_signature, so we need 1k
585hunk ./src/allmydata/mutable/servermap.py 432
586         self._queries_completed = 0
587 
588         sb = self._storage_broker
589+        # All of the peers, permuted by the storage index, as usual.
590         full_peerlist = sb.get_servers_for_index(self._storage_index)
591         self.full_peerlist = full_peerlist # for use later, immutable
592         self.extra_peers = full_peerlist[:] # peers are removed as we use them
593hunk ./src/allmydata/mutable/servermap.py 439
594         self._good_peers = set() # peers who had some shares
595         self._empty_peers = set() # peers who don't have any shares
596         self._bad_peers = set() # peers to whom our queries failed
597+        self._readers = {} # peerid -> dict(sharewriters), filled in
598+                           # after responses come in.
599 
600         k = self._node.get_required_shares()
601hunk ./src/allmydata/mutable/servermap.py 443
602+        # For what cases can these conditions work?
603         if k is None:
604             # make a guess
605             k = 3
606hunk ./src/allmydata/mutable/servermap.py 456
607         self.num_peers_to_query = k + self.EPSILON
608 
609         if self.mode == MODE_CHECK:
610+            # We want to query all of the peers.
611             initial_peers_to_query = dict(full_peerlist)
612             must_query = set(initial_peers_to_query.keys())
613             self.extra_peers = []
614hunk ./src/allmydata/mutable/servermap.py 464
615             # we're planning to replace all the shares, so we want a good
616             # chance of finding them all. We will keep searching until we've
617             # seen epsilon that don't have a share.
618+            # We don't query all of the peers because that could take a while.
619             self.num_peers_to_query = N + self.EPSILON
620             initial_peers_to_query, must_query = self._build_initial_querylist()
621             self.required_num_empty_peers = self.EPSILON
622hunk ./src/allmydata/mutable/servermap.py 474
623             # might also avoid the round trip required to read the encrypted
624             # private key.
625 
626-        else:
627+        else: # MODE_READ, MODE_ANYTHING
628+            # 2k peers is good enough.
629             initial_peers_to_query, must_query = self._build_initial_querylist()
630 
631         # this is a set of peers that we are required to get responses from:
632hunk ./src/allmydata/mutable/servermap.py 490
633         # before we can consider ourselves finished, and self.extra_peers
634         # contains the overflow (peers that we should tap if we don't get
635         # enough responses)
636+        # I guess that self._must_query is a subset of
637+        # initial_peers_to_query?
638+        assert set(must_query).issubset(set(initial_peers_to_query))
639 
640         self._send_initial_requests(initial_peers_to_query)
641         self._status.timings["initial_queries"] = time.time() - self._started
642hunk ./src/allmydata/mutable/servermap.py 549
643         # errors that aren't handled by _query_failed (and errors caused by
644         # _query_failed) get logged, but we still want to check for doneness.
645         d.addErrback(log.err)
646-        d.addBoth(self._check_for_done)
647         d.addErrback(self._fatal_error)
648hunk ./src/allmydata/mutable/servermap.py 550
649+        d.addCallback(self._check_for_done)
650         return d
651 
652     def _do_read(self, ss, peerid, storage_index, shnums, readv):
653hunk ./src/allmydata/mutable/servermap.py 569
654         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
655         return d
656 
657+
658+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
659+        """
660+        I am called when a remote server returns a corrupt share in
661+        response to one of our queries. By corrupt, I mean a share
662+        without a valid signature. I then record the failure, notify the
663+        server of the corruption, and record the share as bad.
664+        """
665+        f = failure.Failure(e)
666+        self.log(format="bad share: %(f_value)s", f_value=str(f),
667+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
668+        # Notify the server that its share is corrupt.
669+        self.notify_server_corruption(peerid, shnum, str(e))
670+        # By flagging this as a bad peer, we won't count any of
671+        # the other shares on that peer as valid, though if we
672+        # happen to find a valid version string amongst those
673+        # shares, we'll keep track of it so that we don't need
674+        # to validate the signature on those again.
675+        self._bad_peers.add(peerid)
676+        self._last_failure = f
677+        # XXX: Use the reader for this?
678+        checkstring = data[:SIGNED_PREFIX_LENGTH]
679+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
680+        self._servermap.problems.append(f)
681+
682+
683+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
684+        """
685+        If one of my queries returns successfully (which means that we
686+        were able to and successfully did validate the signature), I
687+        cache the data that we initially fetched from the storage
688+        server. This will help reduce the number of roundtrips that need
689+        to occur when the file is downloaded, or when the file is
690+        updated.
691+        """
692+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
693+
694+
695     def _got_results(self, datavs, peerid, readsize, stuff, started):
696         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
697                       peerid=idlib.shortnodeid_b2a(peerid),
698hunk ./src/allmydata/mutable/servermap.py 630
699         else:
700             self._empty_peers.add(peerid)
701 
702-        last_verinfo = None
703-        last_shnum = None
704+        ss, storage_index = stuff
705+        ds = []
706+
707         for shnum,datav in datavs.items():
708             data = datav[0]
709hunk ./src/allmydata/mutable/servermap.py 635
710-            try:
711-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
712-                last_verinfo = verinfo
713-                last_shnum = shnum
714-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
715-            except CorruptShareError, e:
716-                # log it and give the other shares a chance to be processed
717-                f = failure.Failure()
718-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
719-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
720-                self.notify_server_corruption(peerid, shnum, str(e))
721-                self._bad_peers.add(peerid)
722-                self._last_failure = f
723-                checkstring = data[:SIGNED_PREFIX_LENGTH]
724-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
725-                self._servermap.problems.append(f)
726-                pass
727-
728-        self._status.timings["cumulative_verify"] += (time.time() - now)
729+            reader = MDMFSlotReadProxy(ss,
730+                                       storage_index,
731+                                       shnum,
732+                                       data)
733+            self._readers.setdefault(peerid, dict())[shnum] = reader
734+            # our goal, with each response, is to validate the version
735+            # information and share data as best we can at this point --
736+            # we do this by validating the signature. To do this, we
737+            # need to do the following:
738+            #   - If we don't already have the public key, fetch the
739+            #     public key. We use this to validate the signature.
740+            if not self._node.get_pubkey():
741+                # fetch and set the public key.
742+                d = reader.get_verification_key()
743+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
744+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
745+                # XXX: Make self._pubkey_query_failed?
746+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
747+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
748+            else:
749+                # we already have the public key.
750+                d = defer.succeed(None)
751+            # Neither of these two branches return anything of
752+            # consequence, so the first entry in our deferredlist will
753+            # be None.
754 
755hunk ./src/allmydata/mutable/servermap.py 661
756-        if self._need_privkey and last_verinfo:
757-            # send them a request for the privkey. We send one request per
758-            # server.
759-            lp2 = self.log("sending privkey request",
760-                           parent=lp, level=log.NOISY)
761-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
762-             offsets_tuple) = last_verinfo
763-            o = dict(offsets_tuple)
764+            # - Next, we need the version information. We almost
765+            #   certainly got this by reading the first thousand or so
766+            #   bytes of the share on the storage server, so we
767+            #   shouldn't need to fetch anything at this step.
768+            d2 = reader.get_verinfo()
769+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
770+                self._got_corrupt_share(error, shnum, peerid, data, lp))
771+            # - Next, we need the signature. For an SDMF share, it is
772+            #   likely that we fetched this when doing our initial fetch
773+            #   to get the version information. In MDMF, this lives at
774+            #   the end of the share, so unless the file is quite small,
775+            #   we'll need to do a remote fetch to get it.
776+            d3 = reader.get_signature()
777+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
778+                self._got_corrupt_share(error, shnum, peerid, data, lp))
779+            #  Once we have all three of these responses, we can move on
780+            #  to validating the signature
781 
782hunk ./src/allmydata/mutable/servermap.py 679
783-            self._queries_outstanding.add(peerid)
784-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
785-            ss = self._servermap.connections[peerid]
786-            privkey_started = time.time()
787-            d = self._do_read(ss, peerid, self._storage_index,
788-                              [last_shnum], readv)
789-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
790-                          privkey_started, lp2)
791-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
792-            d.addErrback(log.err)
793-            d.addCallback(self._check_for_done)
794-            d.addErrback(self._fatal_error)
795+            # Does the node already have a privkey? If not, we'll try to
796+            # fetch it here.
797+            if self._need_privkey:
798+                d4 = reader.get_encprivkey()
799+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
800+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
801+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
802+                    self._privkey_query_failed(error, shnum, data, lp))
803+            else:
804+                d4 = defer.succeed(None)
805 
806hunk ./src/allmydata/mutable/servermap.py 690
807+            dl = defer.DeferredList([d, d2, d3, d4])
808+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
809+                self._got_signature_one_share(results, shnum, peerid, lp))
810+            dl.addErrback(lambda error, shnum=shnum, data=data:
811+               self._got_corrupt_share(error, shnum, peerid, data, lp))
812+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
813+                self._cache_good_sharedata(verinfo, shnum, now, data))
814+            ds.append(dl)
815+        # dl is a deferred list that will fire when all of the shares
816+        # that we found on this peer are done processing. When dl fires,
817+        # we know that processing is done, so we can decrement the
818+        # semaphore-like thing that we incremented earlier.
819+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
820+        # Are we done? Done means that there are no more queries to
821+        # send, that there are no outstanding queries, and that we
822+        # haven't received any queries that are still processing. If we
823+        # are done, self._check_for_done will cause the done deferred
824+        # that we returned to our caller to fire, which tells them that
825+        # they have a complete servermap, and that we won't be touching
826+        # the servermap anymore.
827+        dl.addCallback(self._check_for_done)
828+        dl.addErrback(self._fatal_error)
829         # all done!
830         self.log("_got_results done", parent=lp, level=log.NOISY)
831hunk ./src/allmydata/mutable/servermap.py 714
832+        return dl
833+
834+
835+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
836+        if self._node.get_pubkey():
837+            return # don't go through this again if we don't have to
838+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
839+        assert len(fingerprint) == 32
840+        if fingerprint != self._node.get_fingerprint():
841+            raise CorruptShareError(peerid, shnum,
842+                                "pubkey doesn't match fingerprint")
843+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
844+        assert self._node.get_pubkey()
845+
846 
847     def notify_server_corruption(self, peerid, shnum, reason):
848         ss = self._servermap.connections[peerid]
849hunk ./src/allmydata/mutable/servermap.py 734
850         ss.callRemoteOnly("advise_corrupt_share",
851                           "mutable", self._storage_index, shnum, reason)
852 
853-    def _got_results_one_share(self, shnum, data, peerid, lp):
854+
855+    def _got_signature_one_share(self, results, shnum, peerid, lp):
856+        # It is our job to give versioninfo to our caller. We need to
857+        # raise CorruptShareError if the share is corrupt for any
858+        # reason, something that our caller will handle.
859         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
860                  shnum=shnum,
861                  peerid=idlib.shortnodeid_b2a(peerid),
862hunk ./src/allmydata/mutable/servermap.py 744
863                  level=log.NOISY,
864                  parent=lp)
865-
866-        # this might raise NeedMoreDataError, if the pubkey and signature
867-        # live at some weird offset. That shouldn't happen, so I'm going to
868-        # treat it as a bad share.
869-        (seqnum, root_hash, IV, k, N, segsize, datalength,
870-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
871-
872-        if not self._node.get_pubkey():
873-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
874-            assert len(fingerprint) == 32
875-            if fingerprint != self._node.get_fingerprint():
876-                raise CorruptShareError(peerid, shnum,
877-                                        "pubkey doesn't match fingerprint")
878-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
879-
880-        if self._need_privkey:
881-            self._try_to_extract_privkey(data, peerid, shnum, lp)
882-
883-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
884-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
885+        _, verinfo, signature, __ = results
886+        (seqnum,
887+         root_hash,
888+         saltish,
889+         segsize,
890+         datalen,
891+         k,
892+         n,
893+         prefix,
894+         offsets) = verinfo[1]
895         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
896 
897hunk ./src/allmydata/mutable/servermap.py 756
898-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
899+        # XXX: This should be done for us in the method, so
900+        # presumably you can go in there and fix it.
901+        verinfo = (seqnum,
902+                   root_hash,
903+                   saltish,
904+                   segsize,
905+                   datalen,
906+                   k,
907+                   n,
908+                   prefix,
909                    offsets_tuple)
910hunk ./src/allmydata/mutable/servermap.py 767
911+        # This tuple uniquely identifies a share on the grid; we use it
912+        # to keep track of the ones that we've already seen.
913 
914         if verinfo not in self._valid_versions:
915hunk ./src/allmydata/mutable/servermap.py 771
916-            # it's a new pair. Verify the signature.
917-            valid = self._node.get_pubkey().verify(prefix, signature)
918+            # This is a new version tuple, and we need to validate it
919+            # against the public key before keeping track of it.
920+            assert self._node.get_pubkey()
921+            valid = self._node.get_pubkey().verify(prefix, signature[1])
922             if not valid:
923hunk ./src/allmydata/mutable/servermap.py 776
924-                raise CorruptShareError(peerid, shnum, "signature is invalid")
925+                raise CorruptShareError(peerid, shnum,
926+                                        "signature is invalid")
927 
928hunk ./src/allmydata/mutable/servermap.py 779
929-            # ok, it's a valid verinfo. Add it to the list of validated
930-            # versions.
931-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
932-                     % (seqnum, base32.b2a(root_hash)[:4],
933-                        idlib.shortnodeid_b2a(peerid), shnum,
934-                        k, N, segsize, datalength),
935-                     parent=lp)
936-            self._valid_versions.add(verinfo)
937-        # We now know that this is a valid candidate verinfo.
938+        # ok, it's a valid verinfo. Add it to the list of validated
939+        # versions.
940+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
941+                 % (seqnum, base32.b2a(root_hash)[:4],
942+                    idlib.shortnodeid_b2a(peerid), shnum,
943+                    k, n, segsize, datalen),
944+                    parent=lp)
945+        self._valid_versions.add(verinfo)
946+        # We now know that this is a valid candidate verinfo. Whether or
947+        # not this instance of it is valid is a matter for the next
948+        # statement; at this point, we just know that if we see this
949+        # version info again, that its signature checks out and that
950+        # we're okay to skip the signature-checking step.
951 
952hunk ./src/allmydata/mutable/servermap.py 793
953+        # (peerid, shnum) are bound in the method invocation.
954         if (peerid, shnum) in self._servermap.bad_shares:
955             # we've been told that the rest of the data in this share is
956             # unusable, so don't add it to the servermap.
957hunk ./src/allmydata/mutable/servermap.py 808
958         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
959         return verinfo
960 
961+
962     def _deserialize_pubkey(self, pubkey_s):
963         verifier = rsa.create_verifying_key_from_string(pubkey_s)
964         return verifier
965hunk ./src/allmydata/mutable/servermap.py 813
966 
967-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
968-        try:
969-            r = unpack_share(data)
970-        except NeedMoreDataError, e:
971-            # this share won't help us. oh well.
972-            offset = e.encprivkey_offset
973-            length = e.encprivkey_length
974-            self.log("shnum %d on peerid %s: share was too short (%dB) "
975-                     "to get the encprivkey; [%d:%d] ought to hold it" %
976-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
977-                      offset, offset+length),
978-                     parent=lp)
979-            # NOTE: if uncoordinated writes are taking place, someone might
980-            # change the share (and most probably move the encprivkey) before
981-            # we get a chance to do one of these reads and fetch it. This
982-            # will cause us to see a NotEnoughSharesError(unable to fetch
983-            # privkey) instead of an UncoordinatedWriteError . This is a
984-            # nuisance, but it will go away when we move to DSA-based mutable
985-            # files (since the privkey will be small enough to fit in the
986-            # write cap).
987-
988-            return
989-
990-        (seqnum, root_hash, IV, k, N, segsize, datalen,
991-         pubkey, signature, share_hash_chain, block_hash_tree,
992-         share_data, enc_privkey) = r
993-
994-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
995 
996     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
997hunk ./src/allmydata/mutable/servermap.py 815
998-
999+        """
1000+        Given a writekey from a remote server, I validate it against the
1001+        writekey stored in my node. If it is valid, then I set the
1002+        privkey and encprivkey properties of the node.
1003+        """
1004         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
1005         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
1006         if alleged_writekey != self._node.get_writekey():
1007hunk ./src/allmydata/mutable/servermap.py 892
1008         self._queries_completed += 1
1009         self._last_failure = f
1010 
1011-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
1012-        now = time.time()
1013-        elapsed = now - started
1014-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
1015-        self._queries_outstanding.discard(peerid)
1016-        if not self._need_privkey:
1017-            return
1018-        if shnum not in datavs:
1019-            self.log("privkey wasn't there when we asked it",
1020-                     level=log.WEIRD, umid="VA9uDQ")
1021-            return
1022-        datav = datavs[shnum]
1023-        enc_privkey = datav[0]
1024-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
1025 
1026     def _privkey_query_failed(self, f, peerid, shnum, lp):
1027         self._queries_outstanding.discard(peerid)
1028hunk ./src/allmydata/mutable/servermap.py 906
1029         self._servermap.problems.append(f)
1030         self._last_failure = f
1031 
1032+
1033     def _check_for_done(self, res):
1034         # exit paths:
1035         #  return self._send_more_queries(outstanding) : send some more queries
1036hunk ./src/allmydata/mutable/servermap.py 912
1037         #  return self._done() : all done
1038         #  return : keep waiting, no new queries
1039-
1040         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1041                               "%(outstanding)d queries outstanding, "
1042                               "%(extra)d extra peers available, "
1043hunk ./src/allmydata/mutable/servermap.py 1117
1044         self._servermap.last_update_time = self._started
1045         # the servermap will not be touched after this
1046         self.log("servermap: %s" % self._servermap.summarize_versions())
1047+
1048         eventually(self._done_deferred.callback, self._servermap)
1049 
1050     def _fatal_error(self, f):
1051hunk ./src/allmydata/test/test_mutable.py 637
1052         d.addCallback(_created)
1053         return d
1054 
1055-    def publish_multiple(self):
1056+    def publish_mdmf(self):
1057+        # like publish_one, except that the result is guaranteed to be
1058+        # an MDMF file.
1059+        # self.CONTENTS should have more than one segment.
1060+        self.CONTENTS = "This is an MDMF file" * 100000
1061+        self._storage = FakeStorage()
1062+        self._nodemaker = make_nodemaker(self._storage)
1063+        self._storage_broker = self._nodemaker.storage_broker
1064+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1065+        def _created(node):
1066+            self._fn = node
1067+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1068+        d.addCallback(_created)
1069+        return d
1070+
1071+
1072+    def publish_sdmf(self):
1073+        # like publish_one, except that the result is guaranteed to be
1074+        # an SDMF file
1075+        self.CONTENTS = "This is an SDMF file" * 1000
1076+        self._storage = FakeStorage()
1077+        self._nodemaker = make_nodemaker(self._storage)
1078+        self._storage_broker = self._nodemaker.storage_broker
1079+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1080+        def _created(node):
1081+            self._fn = node
1082+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1083+        d.addCallback(_created)
1084+        return d
1085+
1086+
1087+    def publish_multiple(self, version=0):
1088         self.CONTENTS = ["Contents 0",
1089                          "Contents 1",
1090                          "Contents 2",
1091hunk ./src/allmydata/test/test_mutable.py 677
1092         self._copied_shares = {}
1093         self._storage = FakeStorage()
1094         self._nodemaker = make_nodemaker(self._storage)
1095-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1096+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1097         def _created(node):
1098             self._fn = node
1099             # now create multiple versions of the same file, and accumulate
1100hunk ./src/allmydata/test/test_mutable.py 906
1101         return d
1102 
1103 
1104+    def test_servermapupdater_finds_mdmf_files(self):
1105+        # setUp already published an MDMF file for us. We just need to
1106+        # make sure that when we run the ServermapUpdater, the file is
1107+        # reported to have one recoverable version.
1108+        d = defer.succeed(None)
1109+        d.addCallback(lambda ignored:
1110+            self.publish_mdmf())
1111+        d.addCallback(lambda ignored:
1112+            self.make_servermap(mode=MODE_CHECK))
1113+        # Calling make_servermap also updates the servermap in the mode
1114+        # that we specify, so we just need to see what it says.
1115+        def _check_servermap(sm):
1116+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1117+        d.addCallback(_check_servermap)
1118+        return d
1119+
1120+
1121+    def test_servermapupdater_finds_sdmf_files(self):
1122+        d = defer.succeed(None)
1123+        d.addCallback(lambda ignored:
1124+            self.publish_sdmf())
1125+        d.addCallback(lambda ignored:
1126+            self.make_servermap(mode=MODE_CHECK))
1127+        d.addCallback(lambda servermap:
1128+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1129+        return d
1130+
1131 
1132 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1133     def setUp(self):
1134hunk ./src/allmydata/test/test_mutable.py 1050
1135         return d
1136     test_no_servers_download.timeout = 15
1137 
1138+
1139     def _test_corrupt_all(self, offset, substring,
1140                           should_succeed=False, corrupt_early=True,
1141                           failure_checker=None):
1142}
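
The per-share processing this patch introduces can be summarized as follows (an illustrative sketch, not part of the patch). `ss`, `storage_index`, `shnum`, and the prefetched `data` are assumed to come from a query response:

    from twisted.internet import defer
    from allmydata.mutable.layout import MDMFSlotReadProxy

    reader = MDMFSlotReadProxy(ss, storage_index, shnum, data)
    d1 = reader.get_verification_key()  # only needed if the node lacks a pubkey
    d2 = reader.get_verinfo()           # usually satisfied from the prefetched data
    d3 = reader.get_signature()         # may need another round trip for MDMF
    dl = defer.DeferredList([d1, d2, d3])
    # Once all three fire, the signature can be checked against the signed
    # prefix and the share either cached or marked bad.
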
1143[Make a segmented mutable uploader
1144Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1145 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1146 
1147 The mutable file uploader should be able to publish files with one
1148 segment and files with multiple segments. This patch makes it do that.
1149 This is still incomplete, and rather ugly -- I need to flesh out error
1150 handling, I need to write tests, and I need to remove some of the uglier
1151 kludges in the process before I can call this done.
1152] {
1153hunk ./src/allmydata/mutable/publish.py 8
1154 from zope.interface import implements
1155 from twisted.internet import defer
1156 from twisted.python import failure
1157-from allmydata.interfaces import IPublishStatus
1158+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1159 from allmydata.util import base32, hashutil, mathutil, idlib, log
1160 from allmydata import hashtree, codec
1161 from allmydata.storage.server import si_b2a
1162hunk ./src/allmydata/mutable/publish.py 19
1163      UncoordinatedWriteError, NotEnoughServersError
1164 from allmydata.mutable.servermap import ServerMap
1165 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1166-     unpack_checkstring, SIGNED_PREFIX
1167+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1168+
1169+KiB = 1024
1170+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1171 
1172 class PublishStatus:
1173     implements(IPublishStatus)
1174hunk ./src/allmydata/mutable/publish.py 112
1175         self._status.set_helper(False)
1176         self._status.set_progress(0.0)
1177         self._status.set_active(True)
1178+        # We use this to control how the file is written.
1179+        version = self._node.get_version()
1180+        assert version in (SDMF_VERSION, MDMF_VERSION)
1181+        self._version = version
1182 
1183     def get_status(self):
1184         return self._status
1185hunk ./src/allmydata/mutable/publish.py 134
1186         simultaneous write.
1187         """
1188 
1189-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1190-        # 2: perform peer selection, get candidate servers
1191-        #  2a: send queries to n+epsilon servers, to determine current shares
1192-        #  2b: based upon responses, create target map
1193-        # 3: send slot_testv_and_readv_and_writev messages
1194-        # 4: as responses return, update share-dispatch table
1195-        # 4a: may need to run recovery algorithm
1196-        # 5: when enough responses are back, we're done
1197+        # 0. Setup encoding parameters, encoder, and other such things.
1198+        # 1. Encrypt, encode, and publish segments.
1199 
1200         self.log("starting publish, datalen is %s" % len(newdata))
1201         self._status.set_size(len(newdata))
1202hunk ./src/allmydata/mutable/publish.py 187
1203         self.bad_peers = set() # peerids who have errbacked/refused requests
1204 
1205         self.newdata = newdata
1206-        self.salt = os.urandom(16)
1207 
1208hunk ./src/allmydata/mutable/publish.py 188
1209+        # This will set self.segment_size, self.num_segments, and
1210+        # self.fec.
1211         self.setup_encoding_parameters()
1212 
1213         # if we experience any surprises (writes which were rejected because
1214hunk ./src/allmydata/mutable/publish.py 238
1215             self.bad_share_checkstrings[key] = old_checkstring
1216             self.connections[peerid] = self._servermap.connections[peerid]
1217 
1218-        # create the shares. We'll discard these as they are delivered. SDMF:
1219-        # we're allowed to hold everything in memory.
1220+        # At this point the process forks -- if this is an SDMF file, we
1221+        # need to write an SDMF file. Otherwise, we need to write an MDMF
1222+        # file.
1223+        if self._version == MDMF_VERSION:
1224+            return self._publish_mdmf()
1225+        else:
1226+            return self._publish_sdmf()
1227+        #return self.done_deferred
1228+
1229+    def _publish_mdmf(self):
1230+        # Next, we find homes for all of the shares that we don't have
1231+        # homes for yet.
1232+        # TODO: Make this part do peer selection.
1233+        self.update_goal()
1234+        self.writers = {}
1235+        # For each (peerid, shnum) in self.goal, we make an
1236+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1237+        # shares to the peer.
1238+        for key in self.goal:
1239+            peerid, shnum = key
1240+            write_enabler = self._node.get_write_enabler(peerid)
1241+            renew_secret = self._node.get_renewal_secret(peerid)
1242+            cancel_secret = self._node.get_cancel_secret(peerid)
1243+            secrets = (write_enabler, renew_secret, cancel_secret)
1244+
1245+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1246+                                                      self.connections[peerid],
1247+                                                      self._storage_index,
1248+                                                      secrets,
1249+                                                      self._new_seqnum,
1250+                                                      self.required_shares,
1251+                                                      self.total_shares,
1252+                                                      self.segment_size,
1253+                                                      len(self.newdata))
1254+            if (peerid, shnum) in self._servermap.servermap:
1255+                old_versionid, old_timestamp = self._servermap.servermap[key]
1256+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1257+                 old_datalength, old_k, old_N, old_prefix,
1258+                 old_offsets_tuple) = old_versionid
1259+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1260+
1261+        # Now, we start pushing shares.
1262+        self._status.timings["setup"] = time.time() - self._started
1263+        def _start_pushing(res):
1264+            self._started_pushing = time.time()
1265+            return res
1266+
1267+        # First, we encrypt, encode, and publish the segments that make up
1268+        # the new contents.
1269+
1270+        # This will eventually hold the block hash chain for each share
1271+        # that we publish. We define it this way so that empty publishes
1272+        # will still have something to write to the remote slot.
1273+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1274+        self.sharehash_leaves = None # eventually [sharehashes]
1275+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1276+                              # validate the share]
1277 
1278hunk ./src/allmydata/mutable/publish.py 296
1279+        d = defer.succeed(None)
1280+        self.log("Starting push")
1281+        for i in xrange(self.num_segments - 1):
1282+            d.addCallback(lambda ignored, i=i:
1283+                self.push_segment(i))
1284+            d.addCallback(self._turn_barrier)
1285+        # If we have at least one segment, we will have a tail segment
1286+        if self.num_segments > 0:
1287+            d.addCallback(lambda ignored:
1288+                self.push_tail_segment())
1289+
1290+        d.addCallback(lambda ignored:
1291+            self.push_encprivkey())
1292+        d.addCallback(lambda ignored:
1293+            self.push_blockhashes())
1294+        d.addCallback(lambda ignored:
1295+            self.push_sharehashes())
1296+        d.addCallback(lambda ignored:
1297+            self.push_toplevel_hashes_and_signature())
1298+        d.addCallback(lambda ignored:
1299+            self.finish_publishing())
1300+        return d
1301+
1302+
1303+    def _publish_sdmf(self):
1304         self._status.timings["setup"] = time.time() - self._started
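The loop in the hunk above leans on two Twisted idioms that are easy to misread: the i=i default argument freezes the loop index for each callback (a bare "lambda ignored: self.push_segment(i)" would see only the final value of i), and _turn_barrier forces a reactor turn between segments to limit memory consumption. Here is a minimal standalone sketch of the same chaining pattern, assuming hypothetical push_segment and push_tail_segment callables that each return a Deferred:

    from twisted.internet import defer
    from foolscap.api import fireEventually

    def push_all_segments(push_segment, push_tail_segment, num_segments):
        d = defer.succeed(None)
        for i in xrange(num_segments - 1):
            # i=i captures the current loop index in each callback
            d.addCallback(lambda ignored, i=i: push_segment(i))
            # fireEventually imposes a reactor turn between segments
            d.addCallback(fireEventually)
        if num_segments > 0:
            d.addCallback(lambda ignored: push_tail_segment())
        return d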
1305hunk ./src/allmydata/mutable/publish.py 322
1306+        self.salt = os.urandom(16)
1307+
1308         d = self._encrypt_and_encode()
1309         d.addCallback(self._generate_shares)
1310         def _start_pushing(res):
1311hunk ./src/allmydata/mutable/publish.py 335
1312 
1313         return self.done_deferred
1314 
1315+
1316     def setup_encoding_parameters(self):
1317hunk ./src/allmydata/mutable/publish.py 337
1318-        segment_size = len(self.newdata)
1319+        if self._version == MDMF_VERSION:
1320+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1321+        else:
1322+            segment_size = len(self.newdata) # SDMF is only one segment
1323         # this must be a multiple of self.required_shares
1324         segment_size = mathutil.next_multiple(segment_size,
1325                                               self.required_shares)
1326hunk ./src/allmydata/mutable/publish.py 350
1327                                                   segment_size)
1328         else:
1329             self.num_segments = 0
1330-        assert self.num_segments in [0, 1,] # SDMF restrictions
1331+        if self._version == SDMF_VERSION:
1332+            assert self.num_segments in (0, 1) # SDMF
1333+            return
1334+        # calculate the tail segment size.
1335+        self.tail_segment_size = len(self.newdata) % segment_size
1336+
1337+        if self.tail_segment_size == 0:
1338+            # The tail segment is the same size as the other segments.
1339+            self.tail_segment_size = segment_size
1340+
1341+        # We'll make an encoder ahead-of-time for the normal-sized
1342+        # segments (defined as any segment of exactly segment_size bytes).
1343+        # (The part of the code that pushes the tail segment will make its
1344+        #  own encoder for that segment.)
1345+        fec = codec.CRSEncoder()
1346+        fec.set_params(self.segment_size,
1347+                       self.required_shares, self.total_shares)
1348+        self.piece_size = fec.get_block_size()
1349+        self.fec = fec
1350+
1351+
1352+    def push_segment(self, segnum):
1353+        started = time.time()
1354+        segsize = self.segment_size
1355+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1356+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1357+        assert len(data) == segsize
1358+
1359+        salt = os.urandom(16)
1360+
1361+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1362+        enc = AES(key)
1363+        crypttext = enc.process(data)
1364+        assert len(crypttext) == len(data)
1365+
1366+        now = time.time()
1367+        self._status.timings["encrypt"] = now - started
1368+        started = now
1369+
1370+        # now apply FEC
1371+
1372+        self._status.set_status("Encoding")
1373+        crypttext_pieces = [None] * self.required_shares
1374+        piece_size = self.piece_size
1375+        for i in range(len(crypttext_pieces)):
1376+            offset = i * piece_size
1377+            piece = crypttext[offset:offset+piece_size]
1378+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1379+            crypttext_pieces[i] = piece
1380+            assert len(piece) == piece_size
1381+        d = self.fec.encode(crypttext_pieces)
1382+        def _done_encoding(res):
1383+            elapsed = time.time() - started
1384+            self._status.timings["encode"] = elapsed
1385+            return res
1386+        d.addCallback(_done_encoding)
1387+
1388+        def _push_shares_and_salt(results):
1389+            shares, shareids = results
1390+            dl = []
1391+            for i in xrange(len(shares)):
1392+                sharedata = shares[i]
1393+                shareid = shareids[i]
1394+                block_hash = hashutil.block_hash(salt + sharedata)
1395+                self.blockhashes[shareid].append(block_hash)
1396+
1397+                # find the writer for this share
1398+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1399+                dl.append(d)
1400+            # TODO: Naturally, we need to check on the results of these.
1401+            return defer.DeferredList(dl)
1402+        d.addCallback(_push_shares_and_salt)
1403+        return d
1404+
1405+
1406+    def push_tail_segment(self):
1407+        # This is essentially the same as push_segment, except that we
1408+        # don't use the cached encoder that we use elsewhere.
1409+        self.log("Pushing tail segment")
1410+        started = time.time()
1411+        segsize = self.segment_size
1412+        data = self.newdata[segsize * (self.num_segments-1):]
1413+        assert len(data) == self.tail_segment_size
1414+        salt = os.urandom(16)
1415+
1416+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1417+        enc = AES(key)
1418+        crypttext = enc.process(data)
1419+        assert len(crypttext) == len(data)
1420+
1421+        now = time.time()
1422+        self._status.timings['encrypt'] = now - started
1423+        started = now
1424+
1425+        self._status.set_status("Encoding")
1426+        tail_fec = codec.CRSEncoder()
1427+        tail_fec.set_params(self.tail_segment_size,
1428+                            self.required_shares,
1429+                            self.total_shares)
1430+
1431+        crypttext_pieces = [None] * self.required_shares
1432+        piece_size = tail_fec.get_block_size()
1433+        for i in range(len(crypttext_pieces)):
1434+            offset = i * piece_size
1435+            piece = crypttext[offset:offset+piece_size]
1436+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1437+            crypttext_pieces[i] = piece
1438+            assert len(piece) == piece_size
1439+        d = tail_fec.encode(crypttext_pieces)
1440+        def _push_shares_and_salt(results):
1441+            shares, shareids = results
1442+            dl = []
1443+            for i in xrange(len(shares)):
1444+                sharedata = shares[i]
1445+                shareid = shareids[i]
1446+                block_hash = hashutil.block_hash(salt + sharedata)
1447+                self.blockhashes[shareid].append(block_hash)
1448+                # find the writer for this share
1449+                d = self.writers[shareid].put_block(sharedata,
1450+                                                    self.num_segments - 1,
1451+                                                    salt)
1452+                dl.append(d)
1453+            # TODO: Naturally, we need to check on the results of these.
1454+            return defer.DeferredList(dl)
1455+        d.addCallback(_push_shares_and_salt)
1456+        return d
1457+
1458+
1459+    def push_encprivkey(self):
1460+        started = time.time()
1461+        encprivkey = self._encprivkey
1462+        dl = []
1463+        def _spy_on_writer(results):
1464+            print results
1465+            return results
1466+        for shnum, writer in self.writers.iteritems():
1467+            d = writer.put_encprivkey(encprivkey)
1468+            dl.append(d)
1469+        d = defer.DeferredList(dl)
1470+        return d
1471+
1472+
1473+    def push_blockhashes(self):
1474+        started = time.time()
1475+        dl = []
1476+        def _spy_on_results(results):
1477+            print results
1478+            return results
1479+        self.sharehash_leaves = [None] * len(self.blockhashes)
1480+        for shnum, blockhashes in self.blockhashes.iteritems():
1481+            t = hashtree.HashTree(blockhashes)
1482+            self.blockhashes[shnum] = list(t)
1483+            # set the leaf for future use.
1484+            self.sharehash_leaves[shnum] = t[0]
1485+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1486+            dl.append(d)
1487+        d = defer.DeferredList(dl)
1488+        return d
1489+
1490+
1491+    def push_sharehashes(self):
1492+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1493+        share_hash_chain = {}
1494+        ds = []
1495+        def _spy_on_results(results):
1496+            print results
1497+            return results
1498+        for shnum in xrange(len(self.sharehash_leaves)):
1499+            needed_indices = share_hash_tree.needed_hashes(shnum)
1500+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1501+                                             for i in needed_indices] )
1502+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1503+            ds.append(d)
1504+        self.root_hash = share_hash_tree[0]
1505+        d = defer.DeferredList(ds)
1506+        return d
1507+
1508+
1509+    def push_toplevel_hashes_and_signature(self):
1510+        # We need to do three things here:
1511+        #   - Push the root hash and salt hash
1512+        #   - Get the checkstring of the resulting layout; sign that.
1513+        #   - Push the signature
1514+        ds = []
1515+        def _spy_on_results(results):
1516+            print results
1517+            return results
1518+        for shnum in xrange(self.total_shares):
1519+            d = self.writers[shnum].put_root_hash(self.root_hash)
1520+            ds.append(d)
1521+        d = defer.DeferredList(ds)
1522+        def _make_and_place_signature(ignored):
1523+            signable = self.writers[0].get_signable()
1524+            self.signature = self._privkey.sign(signable)
1525+
1526+            ds = []
1527+            for (shnum, writer) in self.writers.iteritems():
1528+                d = writer.put_signature(self.signature)
1529+                ds.append(d)
1530+            return defer.DeferredList(ds)
1531+        d.addCallback(_make_and_place_signature)
1532+        return d
1533+
1534+
1535+    def finish_publishing(self):
1536+        # We're almost done -- we just need to put the verification key
1537+        # and the offsets
1538+        ds = []
1539+        verification_key = self._pubkey.serialize()
1540+
1541+        def _spy_on_results(results):
1542+            print results
1543+            return results
1544+        for (shnum, writer) in self.writers.iteritems():
1545+            d = writer.put_verification_key(verification_key)
1546+            d.addCallback(lambda ignored, writer=writer:
1547+                writer.finish_publishing())
1548+            ds.append(d)
1549+        return defer.DeferredList(ds)
1550+
1551+
1552+    def _turn_barrier(self, res):
1553+        # putting this method in a Deferred chain imposes a guaranteed
1554+        # reactor turn between the pre- and post- portions of that chain.
1555+        # This can be useful to limit memory consumption: since Deferreds do
1556+        # not do tail recursion, code which uses defer.succeed(result) for
1557+        # consistency will cause objects to live for longer than you might
1558+        # normally expect.
1559+        return fireEventually(res)
1560+
1561 
1562     def _fatal_error(self, f):
1563         self.log("error during loop", failure=f, level=log.UNUSUAL)
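To make the arithmetic in setup_encoding_parameters() concrete, here is a worked example (illustrative only, not part of the patch) for 900 KiB of new data with k = 3 required shares and the 128 KiB MDMF default segment size. The two helpers mirror mathutil.next_multiple and mathutil.div_ceil as used above:

    def div_ceil(n, d):
        # same result as allmydata.util.mathutil.div_ceil
        return (n + d - 1) // d

    def next_multiple(n, k):
        # same result as allmydata.util.mathutil.next_multiple
        return div_ceil(n, k) * k

    datalen = 900 * 1024                            # 921600 bytes of new data
    segment_size = next_multiple(128 * 1024, 3)     # 131073, a multiple of k=3
    num_segments = div_ceil(datalen, segment_size)  # 8 segments
    tail_segment_size = datalen % segment_size      # 4089-byte short tail
    if tail_segment_size == 0:
        tail_segment_size = segment_size            # data fills segments exactly

Each full segment is then split into k pieces of fec.get_block_size() bytes (padded with NULs where needed) before CRS encoding, much as push_segment() does above.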
1564hunk ./src/allmydata/mutable/publish.py 716
1565             self.log_goal(self.goal, "after update: ")
1566 
1567 
1568-
1569     def _encrypt_and_encode(self):
1570         # this returns a Deferred that fires with a list of (sharedata,
1571         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1572hunk ./src/allmydata/mutable/publish.py 757
1573         d.addCallback(_done_encoding)
1574         return d
1575 
1576+
1577     def _generate_shares(self, shares_and_shareids):
1578         # this sets self.shares and self.root_hash
1579         self.log("_generate_shares")
1580hunk ./src/allmydata/mutable/publish.py 1145
1581             self._status.set_progress(1.0)
1582         eventually(self.done_deferred.callback, res)
1583 
1584-
1585hunk ./src/allmydata/test/test_mutable.py 248
1586         d.addCallback(_created)
1587         return d
1588 
1589+
1590+    def test_create_mdmf(self):
1591+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1592+        def _created(n):
1593+            self.failUnless(isinstance(n, MutableFileNode))
1594+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1595+            sb = self.nodemaker.storage_broker
1596+            peer0 = sorted(sb.get_all_serverids())[0]
1597+            shnums = self._storage._peers[peer0].keys()
1598+            self.failUnlessEqual(len(shnums), 1)
1599+        d.addCallback(_created)
1600+        return d
1601+
1602+
1603     def test_serialize(self):
1604         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1605         calls = []
1606hunk ./src/allmydata/test/test_mutable.py 334
1607         d.addCallback(_created)
1608         return d
1609 
1610+
1611+    def test_create_mdmf_with_initial_contents(self):
1612+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
1613+        d = self.nodemaker.create_mutable_file(initial_contents,
1614+                                               version=MDMF_VERSION)
1615+        def _created(n):
1616+            d = n.download_best_version()
1617+            d.addCallback(lambda data:
1618+                self.failUnlessEqual(data, initial_contents))
1619+            d.addCallback(lambda ignored:
1620+                n.overwrite(initial_contents + "foobarbaz"))
1621+            d.addCallback(lambda ignored:
1622+                n.download_best_version())
1623+            d.addCallback(lambda data:
1624+                self.failUnlessEqual(data, initial_contents +
1625+                                           "foobarbaz"))
1626+            return d
1627+        d.addCallback(_created)
1628+        return d
1629+
1630+
1631     def test_create_with_initial_contents_function(self):
1632         data = "initial contents"
1633         def _make_contents(n):
1634hunk ./src/allmydata/test/test_mutable.py 370
1635         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1636         return d
1637 
1638+
1639+    def test_create_mdmf_with_initial_contents_function(self):
1640+        data = "initial contents" * 100000
1641+        def _make_contents(n):
1642+            self.failUnless(isinstance(n, MutableFileNode))
1643+            key = n.get_writekey()
1644+            self.failUnless(isinstance(key, str), key)
1645+            self.failUnlessEqual(len(key), 16)
1646+            return data
1647+        d = self.nodemaker.create_mutable_file(_make_contents,
1648+                                               version=MDMF_VERSION)
1649+        d.addCallback(lambda n:
1650+            n.download_best_version())
1651+        d.addCallback(lambda data2:
1652+            self.failUnlessEqual(data2, data))
1653+        return d
1654+
1655+
1656     def test_create_with_too_large_contents(self):
1657         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1658         d = self.nodemaker.create_mutable_file(BIG)
1659}
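The push_blockhashes()/push_sharehashes() pair above builds the two-level Merkle structure that the downloader will later verify: each share gets a block hash tree over its per-segment block hashes, the roots of those trees become the leaves of a single file-wide share hash tree, and each share is given only the chain of hashes it needs to reach the signed root. A compact sketch of that relationship, using allmydata.hashtree as the patch does (the leaf values here are placeholders):

    from allmydata import hashtree

    # blockhashes: shnum -> [one block hash per segment], as accumulated by
    # push_segment()/push_tail_segment(); placeholder 32-byte values here.
    blockhashes = dict([(shnum, ["\x00" * 32, "\x11" * 32])
                        for shnum in xrange(10)])

    sharehash_leaves = [None] * len(blockhashes)
    for shnum, hashes in blockhashes.items():
        t = hashtree.HashTree(hashes)        # per-share block hash tree
        sharehash_leaves[shnum] = t[0]       # its root is this share's leaf

    share_hash_tree = hashtree.HashTree(sharehash_leaves)
    root_hash = share_hash_tree[0]           # ends up in the signed prefix
    # each share only needs the hashes on its path to the root:
    chain_for_share_0 = dict([(i, share_hash_tree[i])
                              for i in share_hash_tree.needed_hashes(0)])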
1660[Write a segmented mutable downloader
1661Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1662 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1663 
1664 The segmented mutable downloader can deal with MDMF files (files with
1665 one or more segments in MDMF format) and SDMF files (files with one
1666 segment in SDMF format). It is backwards compatible with the old
1667 file format.
1668 
1669 This patch also contains tests for the segmented mutable downloader.
1670] {
1671hunk ./src/allmydata/mutable/retrieve.py 8
1672 from twisted.internet import defer
1673 from twisted.python import failure
1674 from foolscap.api import DeadReferenceError, eventually, fireEventually
1675-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1676-from allmydata.util import hashutil, idlib, log
1677+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1678+                                 MDMF_VERSION, SDMF_VERSION
1679+from allmydata.util import hashutil, idlib, log, mathutil
1680 from allmydata import hashtree, codec
1681 from allmydata.storage.server import si_b2a
1682 from pycryptopp.cipher.aes import AES
1683hunk ./src/allmydata/mutable/retrieve.py 17
1684 from pycryptopp.publickey import rsa
1685 
1686 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1687-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1688+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1689+                                     MDMFSlotReadProxy
1690 
1691 class RetrieveStatus:
1692     implements(IRetrieveStatus)
1693hunk ./src/allmydata/mutable/retrieve.py 104
1694         self.verinfo = verinfo
1695         # during repair, we may be called upon to grab the private key, since
1696         # it wasn't picked up during a verify=False checker run, and we'll
1697-        # need it for repair to generate the a new version.
1698+        # need it for repair to generate a new version.
1699         self._need_privkey = fetch_privkey
1700         if self._node.get_privkey():
1701             self._need_privkey = False
1702hunk ./src/allmydata/mutable/retrieve.py 109
1703 
1704+        if self._need_privkey:
1705+            # TODO: Evaluate the need for this. We'll use it if we want
1706+            # to limit how many queries are on the wire for the privkey
1707+            # at once.
1708+            self._privkey_query_markers = [] # one Marker for each time we've
1709+                                             # tried to get the privkey.
1710+
1711         self._status = RetrieveStatus()
1712         self._status.set_storage_index(self._storage_index)
1713         self._status.set_helper(False)
1714hunk ./src/allmydata/mutable/retrieve.py 125
1715          offsets_tuple) = self.verinfo
1716         self._status.set_size(datalength)
1717         self._status.set_encoding(k, N)
1718+        self.readers = {}
1719 
1720     def get_status(self):
1721         return self._status
1722hunk ./src/allmydata/mutable/retrieve.py 149
1723         self.remaining_sharemap = DictOfSets()
1724         for (shnum, peerid, timestamp) in shares:
1725             self.remaining_sharemap.add(shnum, peerid)
1726+            # If the servermap update fetched anything, it fetched at least 1
1727+            # KiB, so we ask for that much.
1728+            # TODO: Change the cache methods to allow us to fetch all of the
1729+            # data that they have, then change this method to do that.
1730+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1731+                                                               shnum,
1732+                                                               0,
1733+                                                               1000)
1734+            ss = self.servermap.connections[peerid]
1735+            reader = MDMFSlotReadProxy(ss,
1736+                                       self._storage_index,
1737+                                       shnum,
1738+                                       any_cache)
1739+            reader.peerid = peerid
1740+            self.readers[shnum] = reader
1741+
1742 
1743         self.shares = {} # maps shnum to validated blocks
1744hunk ./src/allmydata/mutable/retrieve.py 167
1745+        self._active_readers = [] # list of active readers for this dl.
1746+        self._validated_readers = set() # set of readers that we have
1747+                                        # validated the prefix of
1748+        self._block_hash_trees = {} # shnum => hashtree
1749+        # TODO: Make this into a file-backed consumer or something to
1750+        # conserve memory.
1751+        self._plaintext = ""
1752 
1753         # how many shares do we need?
1754hunk ./src/allmydata/mutable/retrieve.py 176
1755-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1756+        (seqnum,
1757+         root_hash,
1758+         IV,
1759+         segsize,
1760+         datalength,
1761+         k,
1762+         N,
1763+         prefix,
1764          offsets_tuple) = self.verinfo
1765hunk ./src/allmydata/mutable/retrieve.py 185
1766-        assert len(self.remaining_sharemap) >= k
1767-        # we start with the lowest shnums we have available, since FEC is
1768-        # faster if we're using "primary shares"
1769-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1770-        for shnum in self.active_shnums:
1771-            # we use an arbitrary peer who has the share. If shares are
1772-            # doubled up (more than one share per peer), we could make this
1773-            # run faster by spreading the load among multiple peers. But the
1774-            # algorithm to do that is more complicated than I want to write
1775-            # right now, and a well-provisioned grid shouldn't have multiple
1776-            # shares per peer.
1777-            peerid = list(self.remaining_sharemap[shnum])[0]
1778-            self.get_data(shnum, peerid)
1779 
1780hunk ./src/allmydata/mutable/retrieve.py 186
1781-        # control flow beyond this point: state machine. Receiving responses
1782-        # from queries is the input. We might send out more queries, or we
1783-        # might produce a result.
1784 
1785hunk ./src/allmydata/mutable/retrieve.py 187
1786+        # We need one share hash tree for the entire file; its leaves
1787+        # are the roots of the block hash trees for the shares that
1788+        # comprise it, and its root is in the verinfo.
1789+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1790+        self.share_hash_tree.set_hashes({0: root_hash})
1791+
1792+        # This will set up both the segment decoder and the tail segment
1793+        # decoder, as well as a variety of other instance variables that
1794+        # the download process will use.
1795+        self._setup_encoding_parameters()
1796+        assert len(self.remaining_sharemap) >= k
1797+
1798+        self.log("starting download")
1799+        self._add_active_peers()
1800+        # The download process beyond this is a state machine.
1801+        # _add_active_peers will select the peers that we want to use
1802+        # for the download, and then attempt to start downloading. After
1803+        # each segment, it will check for doneness, reacting to broken
1804+        # peers and corrupt shares as necessary. If it runs out of good
1805+        # peers before downloading all of the segments, _done_deferred
1806+        # will errback.  Otherwise, it will eventually callback with the
1807+        # contents of the mutable file.
1808         return self._done_deferred
1809 
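On the download side, the share hash tree starts out knowing only its root (from the validated verinfo) and is filled in as shares arrive; a share hash chain plus a leaf either verifies cleanly up to that root or raises BadHashError. A self-contained sketch of that pattern (not the patch code; it builds a complete tree first purely so the incomplete one has something consistent to check against):

    from allmydata import hashtree
    from allmydata.util import hashutil

    N = 4   # total shares
    leaves = [hashutil.block_hash("share %d root" % i) for i in xrange(N)]
    publisher_tree = hashtree.HashTree(leaves)
    root_hash = publisher_tree[0]

    # The downloader only knows the root at first...
    share_hash_tree = hashtree.IncompleteHashTree(N)
    share_hash_tree.set_hashes({0: root_hash})
    # ...and later verifies share 2's leaf against it, given the chain.
    chain = dict([(i, publisher_tree[i])
                  for i in publisher_tree.needed_hashes(2)])
    share_hash_tree.set_hashes(chain, leaves={2: leaves[2]})  # BadHashError if corrupt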
1810hunk ./src/allmydata/mutable/retrieve.py 211
1811-    def get_data(self, shnum, peerid):
1812-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1813-                 shnum=shnum,
1814-                 peerid=idlib.shortnodeid_b2a(peerid),
1815-                 level=log.NOISY)
1816-        ss = self.servermap.connections[peerid]
1817-        started = time.time()
1818-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1819+
1820+    def _setup_encoding_parameters(self):
1821+        """
1822+        I set up the encoding parameters, including k, n, the number
1823+        of segments associated with this file, and the segment decoder.
1824+        """
1825+        (seqnum,
1826+         root_hash,
1827+         IV,
1828+         segsize,
1829+         datalength,
1830+         k,
1831+         n,
1832+         known_prefix,
1833          offsets_tuple) = self.verinfo
1834hunk ./src/allmydata/mutable/retrieve.py 226
1835-        offsets = dict(offsets_tuple)
1836+        self._required_shares = k
1837+        self._total_shares = n
1838+        self._segment_size = segsize
1839+        self._data_length = datalength
1840+
1841+        if not IV:
1842+            self._version = MDMF_VERSION
1843+        else:
1844+            self._version = SDMF_VERSION
1845+
1846+        if datalength and segsize:
1847+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1848+            self._tail_data_size = datalength % segsize
1849+        else:
1850+            self._num_segments = 0
1851+            self._tail_data_size = 0
1852 
1853hunk ./src/allmydata/mutable/retrieve.py 243
1854-        # we read the checkstring, to make sure that the data we grab is from
1855-        # the right version.
1856-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1857+        self._segment_decoder = codec.CRSDecoder()
1858+        self._segment_decoder.set_params(segsize, k, n)
1859+        self._current_segment = 0
1860 
1861hunk ./src/allmydata/mutable/retrieve.py 247
1862-        # We also read the data, and the hashes necessary to validate them
1863-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1864-        # signature or the pubkey, since that was handled during the
1865-        # servermap phase, and we'll be comparing the share hash chain
1866-        # against the roothash that was validated back then.
1867+        if not self._tail_data_size:
1868+            self._tail_data_size = segsize
1869 
1870hunk ./src/allmydata/mutable/retrieve.py 250
1871-        readv.append( (offsets['share_hash_chain'],
1872-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1873+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1874+                                                         self._required_shares)
1875+        if self._tail_segment_size == self._segment_size:
1876+            self._tail_decoder = self._segment_decoder
1877+        else:
1878+            self._tail_decoder = codec.CRSDecoder()
1879+            self._tail_decoder.set_params(self._tail_segment_size,
1880+                                          self._required_shares,
1881+                                          self._total_shares)
1882 
1883hunk ./src/allmydata/mutable/retrieve.py 260
1884-        # if we need the private key (for repair), we also fetch that
1885-        if self._need_privkey:
1886-            readv.append( (offsets['enc_privkey'],
1887-                           offsets['EOF'] - offsets['enc_privkey']) )
1888+        self.log("got encoding parameters: "
1889+                 "k: %d "
1890+                 "n: %d "
1891+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1892+                 (k, n, self._num_segments, self._segment_size,
1893+                  self._tail_segment_size))
1894 
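Carrying the publish-side numbers through to the download: for the same 921600-byte file with k = 3 and a 131073-byte segment size, _setup_encoding_parameters() above arrives at 8 segments and a short tail, and therefore builds a separate tail decoder. (The version check itself just looks at the IV field of the verinfo: SDMF carries a 16-byte whole-file IV, while MDMF leaves it empty.) Illustrative arithmetic only:

    from allmydata.util import mathutil

    segsize, datalength, k, n = 131073, 921600, 3, 10
    num_segments = mathutil.div_ceil(datalength, segsize)     # 8
    tail_data = datalength % segsize or segsize               # 4089
    tail_segment_size = mathutil.next_multiple(tail_data, k)  # 4089, already a multiple of 3
    needs_own_tail_decoder = (tail_segment_size != segsize)   # True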
1895hunk ./src/allmydata/mutable/retrieve.py 267
1896-        m = Marker()
1897-        self._outstanding_queries[m] = (peerid, shnum, started)
1898+        for i in xrange(self._total_shares):
1899+            # So we don't have to do this later.
1900+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1901 
1902hunk ./src/allmydata/mutable/retrieve.py 271
1903-        # ask the cache first
1904-        got_from_cache = False
1905-        datavs = []
1906-        for (offset, length) in readv:
1907-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1908-                                                            offset, length)
1909-            if data is not None:
1910-                datavs.append(data)
1911-        if len(datavs) == len(readv):
1912-            self.log("got data from cache")
1913-            got_from_cache = True
1914-            d = fireEventually({shnum: datavs})
1915-            # datavs is a dict mapping shnum to a pair of strings
1916-        else:
1917-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1918-        self.remaining_sharemap.discard(shnum, peerid)
1919+        # If we have more than one segment, we are an MDMF file, which
1920+        # means that we need to validate the salts as we receive them.
1921+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1922+        self._salt_hash_tree[0] = IV # from the prefix.
1923 
1924hunk ./src/allmydata/mutable/retrieve.py 276
1925-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1926-        d.addErrback(self._query_failed, m, peerid)
1927-        # errors that aren't handled by _query_failed (and errors caused by
1928-        # _query_failed) get logged, but we still want to check for doneness.
1929-        def _oops(f):
1930-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1931-                     shnum=shnum,
1932-                     peerid=idlib.shortnodeid_b2a(peerid),
1933-                     failure=f,
1934-                     level=log.WEIRD, umid="W0xnQA")
1935-        d.addErrback(_oops)
1936-        d.addBoth(self._check_for_done)
1937-        # any error during _check_for_done means the download fails. If the
1938-        # download is successful, _check_for_done will fire _done by itself.
1939-        d.addErrback(self._done)
1940-        d.addErrback(log.err)
1941-        return d # purely for testing convenience
1942 
1943hunk ./src/allmydata/mutable/retrieve.py 277
1944-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1945-        # isolate the callRemote to a separate method, so tests can subclass
1946-        # Publish and override it
1947-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1948-        return d
1949+    def _add_active_peers(self):
1950+        """
1951+        I populate self._active_readers with enough active readers to
1952+        retrieve the contents of this mutable file. I am called before
1953+        downloading starts, and (eventually) after each validation
1954+        error, connection error, or other problem in the download.
1955+        """
1956+        # TODO: It would be cool to investigate other heuristics for
1957+        # reader selection. For instance, the cost (in time the user
1958+        # spends waiting for their file) of selecting a really slow peer
1959+        # that happens to have a primary share is probably more than
1960+        # selecting a really fast peer that doesn't have a primary
1961+        # share. Maybe the servermap could be extended to provide this
1962+        # information; it could keep track of latency information while
1963+        # it gathers more important data, and then this routine could
1964+        # use that to select active readers.
1965+        #
1966+        # (these and other questions would be easier to answer with a
1967+        #  robust, configurable tahoe-lafs simulator, which modeled node
1968+        #  failures, differences in node speed, and other characteristics
1969+        #  that we expect storage servers to have.  You could have
1970+        #  presets for really stable grids (like allmydata.com),
1971+        #  friendnets, make it easy to configure your own settings, and
1972+        #  then simulate the effect of big changes on these use cases
1973+        #  instead of just reasoning about what the effect might be. Out
1974+        #  of scope for MDMF, though.)
1975 
1976hunk ./src/allmydata/mutable/retrieve.py 304
1977-    def remove_peer(self, peerid):
1978-        for shnum in list(self.remaining_sharemap.keys()):
1979-            self.remaining_sharemap.discard(shnum, peerid)
1980+        # We need at least self._required_shares readers to download a
1981+        # segment.
1982+        needed = self._required_shares - len(self._active_readers)
1983+        # XXX: Why don't format= log messages work here?
1984+        self.log("adding %d peers to the active peers list" % needed)
1985 
1986hunk ./src/allmydata/mutable/retrieve.py 310
1987-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1988-        now = time.time()
1989-        elapsed = now - started
1990-        if not got_from_cache:
1991-            self._status.add_fetch_timing(peerid, elapsed)
1992-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1993-                 shares=len(datavs),
1994-                 peerid=idlib.shortnodeid_b2a(peerid),
1995-                 level=log.NOISY)
1996-        self._outstanding_queries.pop(marker, None)
1997-        if not self._running:
1998-            return
1999+        # We favor lower numbered shares, since FEC is faster with
2000+        # primary shares than with other shares, and lower-numbered
2001+        # shares are more likely to be primary than higher numbered
2002+        # shares.
2003+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
2004+        # We shouldn't consider adding shares that we already have; this
2005+        # will cause problems later.
2006+        active_shnums -= set([reader.shnum for reader in self._active_readers])
2007+        active_shnums = list(active_shnums)[:needed]
2008+        if len(active_shnums) < needed:
2009+            # We don't have enough readers to retrieve the file; fail.
2010+            return self._failed()
2011 
2012hunk ./src/allmydata/mutable/retrieve.py 323
2013-        # note that we only ask for a single share per query, so we only
2014-        # expect a single share back. On the other hand, we use the extra
2015-        # shares if we get them.. seems better than an assert().
2016+        for shnum in active_shnums:
2017+            self._active_readers.append(self.readers[shnum])
2018+            self.log("added reader for share %d" % shnum)
2019+        assert len(self._active_readers) == self._required_shares
2020+        # Conceptually, this is part of the _add_active_peers step. It
2021+        # validates the prefixes of newly added readers to make sure
2022+        # that they match what we are expecting for self.verinfo. If
2023+        # validation is successful, _validate_active_prefixes will call
2024+        # _download_current_segment for us. If validation is
2025+        # unsuccessful, then _validate_active_prefixes will remove the peer
2026+        # and call _add_active_peers again, where we will attempt to rectify
2027+        # the problem by choosing another peer.
2028+        return self._validate_active_prefixes()
2029 
2030hunk ./src/allmydata/mutable/retrieve.py 337
2031-        for shnum,datav in datavs.items():
2032-            (prefix, hash_and_data) = datav[:2]
2033-            try:
2034-                self._got_results_one_share(shnum, peerid,
2035-                                            prefix, hash_and_data)
2036-            except CorruptShareError, e:
2037-                # log it and give the other shares a chance to be processed
2038-                f = failure.Failure()
2039-                self.log(format="bad share: %(f_value)s",
2040-                         f_value=str(f.value), failure=f,
2041-                         level=log.WEIRD, umid="7fzWZw")
2042-                self.notify_server_corruption(peerid, shnum, str(e))
2043-                self.remove_peer(peerid)
2044-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2045-                self._bad_shares.add( (peerid, shnum) )
2046-                self._status.problems[peerid] = f
2047-                self._last_failure = f
2048-                pass
2049-            if self._need_privkey and len(datav) > 2:
2050-                lp = None
2051-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2052-        # all done!
2053 
2054hunk ./src/allmydata/mutable/retrieve.py 338
2055-    def notify_server_corruption(self, peerid, shnum, reason):
2056-        ss = self.servermap.connections[peerid]
2057-        ss.callRemoteOnly("advise_corrupt_share",
2058-                          "mutable", self._storage_index, shnum, reason)
2059+    def _validate_active_prefixes(self):
2060+        """
2061+        I check to make sure that the prefixes on the peers that I am
2062+        currently reading from match the prefix that we want to see, as
2063+        said in self.verinfo.
2064 
2065hunk ./src/allmydata/mutable/retrieve.py 344
2066-    def _got_results_one_share(self, shnum, peerid,
2067-                               got_prefix, got_hash_and_data):
2068-        self.log("_got_results: got shnum #%d from peerid %s"
2069-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2070-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2071+        If I find that all of the active peers have acceptable prefixes,
2072+        I pass control to _download_current_segment, which will use
2073+        those peers to do cool things. If I find that some of the active
2074+        peers have unacceptable prefixes, I will remove them from active
2075+        peers (and from further consideration) and call
2076+        _add_active_peers to attempt to rectify the situation. I keep
2077+        track of which peers I have already validated so that I don't
2078+        need to do so again.
2079+        """
2080+        assert self._active_readers, "No more active readers"
2081+
2082+        ds = []
2083+        new_readers = set(self._active_readers) - self._validated_readers
2084+        self.log('validating %d newly-added active readers' % len(new_readers))
2085+
2086+        for reader in new_readers:
2087+            # We force a remote read here -- otherwise, we are relying
2088+            # on cached data that we already verified as valid, and we
2089+            # won't detect an uncoordinated write that has occurred
2090+            # since the last servermap update.
2091+            d = reader.get_prefix(force_remote=True)
2092+            d.addCallback(self._try_to_validate_prefix, reader)
2093+            ds.append(d)
2094+        dl = defer.DeferredList(ds, consumeErrors=True)
2095+        def _check_results(results):
2096+            # Each result in results will be of the form (success, msg).
2097+            # We don't care about msg, but success will tell us whether
2098+            # or not the checkstring validated. If it didn't, we need to
2099+            # remove the offending (peer,share) from our active readers,
2100+            # and ensure that active readers is again populated.
2101+            bad_readers = []
2102+            for i, result in enumerate(results):
2103+                if not result[0]:
2104+                    reader = self._active_readers[i]
2105+                    f = result[1]
2106+                    assert isinstance(f, failure.Failure)
2107+
2108+                    self.log("The reader %s failed to "
2109+                             "properly validate: %s" % \
2110+                             (reader, str(f.value)))
2111+                    bad_readers.append((reader, f))
2112+                else:
2113+                    reader = self._active_readers[i]
2114+                    self.log("the reader %s checks out, so we'll use it" % \
2115+                             reader)
2116+                    self._validated_readers.add(reader)
2117+                    # Each time we validate a reader, we check to see if
2118+                    # we need the private key. If we do, we politely ask
2119+                    # for it and then continue computing. If we find
2120+                    # that we haven't gotten it at the end of
2121+                    # segment decoding, then we'll take more drastic
2122+                    # measures.
2123+                    if self._need_privkey:
2124+                        d = reader.get_encprivkey()
2125+                        d.addCallback(self._try_to_validate_privkey, reader)
2126+            if bad_readers:
2127+                # We do them all at once, or else we screw up list indexing.
2128+                for (reader, f) in bad_readers:
2129+                    self._mark_bad_share(reader, f)
2130+                return self._add_active_peers()
2131+            else:
2132+                return self._download_current_segment()
2133+            # (The next step asserts that it has enough active readers to
2134+            # fetch shares, so bad readers must be removed before we get there.)
2135+        dl.addCallback(_check_results)
2136+        return dl
2137+
2138+
2139+    def _try_to_validate_prefix(self, prefix, reader):
2140+        """
2141+        I check that the prefix returned by a candidate server for
2142+        retrieval matches the prefix that the servermap knows about
2143+        (and, hence, the prefix that was validated earlier). If it does,
2144+        I return without incident, which means that I approve of the use
2145+        of the candidate server for segment retrieval. If it doesn't, I
2146+        raise UncoordinatedWriteError, and another server must be chosen.
2147+        """
2148+        (seqnum,
2149+         root_hash,
2150+         IV,
2151+         segsize,
2152+         datalength,
2153+         k,
2154+         N,
2155+         known_prefix,
2156          offsets_tuple) = self.verinfo
2157hunk ./src/allmydata/mutable/retrieve.py 430
2158-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2159-        if got_prefix != prefix:
2160-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2161-            raise UncoordinatedWriteError(msg)
2162-        (share_hash_chain, block_hash_tree,
2163-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2164+        if known_prefix != prefix:
2165+            self.log("prefix from share %d doesn't match" % reader.shnum)
2166+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2167+                                          "indicate an uncoordinated write")
2168+        # Otherwise, we're okay -- no issues.
2169 
2170hunk ./src/allmydata/mutable/retrieve.py 436
2171-        assert isinstance(share_data, str)
2172-        # build the block hash tree. SDMF has only one leaf.
2173-        leaves = [hashutil.block_hash(share_data)]
2174-        t = hashtree.HashTree(leaves)
2175-        if list(t) != block_hash_tree:
2176-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2177-        share_hash_leaf = t[0]
2178-        t2 = hashtree.IncompleteHashTree(N)
2179-        # root_hash was checked by the signature
2180-        t2.set_hashes({0: root_hash})
2181-        try:
2182-            t2.set_hashes(hashes=share_hash_chain,
2183-                          leaves={shnum: share_hash_leaf})
2184-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2185-                IndexError), e:
2186-            msg = "corrupt hashes: %s" % (e,)
2187-            raise CorruptShareError(peerid, shnum, msg)
2188-        self.log(" data valid! len=%d" % len(share_data))
2189-        # each query comes down to this: placing validated share data into
2190-        # self.shares
2191-        self.shares[shnum] = share_data
2192 
2193hunk ./src/allmydata/mutable/retrieve.py 437
2194-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2195+    def _remove_reader(self, reader):
2196+        """
2197+        At various points, we will wish to remove a peer from
2198+        consideration and/or use. These include, but are not necessarily
2199+        limited to:
2200 
2201hunk ./src/allmydata/mutable/retrieve.py 443
2202-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2203-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2204-        if alleged_writekey != self._node.get_writekey():
2205-            self.log("invalid privkey from %s shnum %d" %
2206-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2207-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2208-            return
2209+            - A connection error.
2210+            - A mismatched prefix (that is, a prefix that does not match
2211+              our conception of the version information string).
2212+            - A failing block hash, salt hash, or share hash, which can
2213+              indicate disk failure/bit flips, or network trouble.
2214 
2215hunk ./src/allmydata/mutable/retrieve.py 449
2216-        # it's good
2217-        self.log("got valid privkey from shnum %d on peerid %s" %
2218-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2219-                 parent=lp)
2220-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2221-        self._node._populate_encprivkey(enc_privkey)
2222-        self._node._populate_privkey(privkey)
2223-        self._need_privkey = False
2224+        This method will do that. I will make sure that the
2225+        (shnum,reader) combination represented by my reader argument is
2226+        not used for anything else during this download. I will not
2227+        advise the reader of any corruption, something that my callers
2228+        may wish to do on their own.
2229+        """
2230+        # TODO: When you're done writing this, see if this is ever
2231+        # actually used for something that _mark_bad_share isn't. I have
2232+        # a feeling that they will be used for very similar things, and
2233+        # that having them both here is just going to be an epic amount
2234+        # of code duplication.
2235+        #
2236+        # (well, okay, not epic, but meaningful)
2237+        self.log("removing reader %s" % reader)
2238+        # Remove the reader from _active_readers
2239+        self._active_readers.remove(reader)
2240+        # TODO: self.readers.remove(reader)?
2241+        for shnum in list(self.remaining_sharemap.keys()):
2242+            self.remaining_sharemap.discard(shnum, reader.peerid)
2243 
2244hunk ./src/allmydata/mutable/retrieve.py 469
2245-    def _query_failed(self, f, marker, peerid):
2246-        self.log(format="query to [%(peerid)s] failed",
2247-                 peerid=idlib.shortnodeid_b2a(peerid),
2248-                 level=log.NOISY)
2249-        self._status.problems[peerid] = f
2250-        self._outstanding_queries.pop(marker, None)
2251-        if not self._running:
2252-            return
2253-        self._last_failure = f
2254-        self.remove_peer(peerid)
2255-        level = log.WEIRD
2256-        if f.check(DeadReferenceError):
2257-            level = log.UNUSUAL
2258-        self.log(format="error during query: %(f_value)s",
2259-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2260 
2261hunk ./src/allmydata/mutable/retrieve.py 470
2262-    def _check_for_done(self, res):
2263-        # exit paths:
2264-        #  return : keep waiting, no new queries
2265-        #  return self._send_more_queries(outstanding) : send some more queries
2266-        #  fire self._done(plaintext) : download successful
2267-        #  raise exception : download fails
2268+    def _mark_bad_share(self, reader, f):
2269+        """
2270+        I mark the (peerid, shnum) encapsulated by my reader argument as
2271+        a bad share, which means that it will not be used anywhere else.
2272 
2273hunk ./src/allmydata/mutable/retrieve.py 475
2274-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2275-                 running=self._running, decoding=self._decoding,
2276-                 level=log.NOISY)
2277-        if not self._running:
2278-            return
2279-        if self._decoding:
2280-            return
2281-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2282-         offsets_tuple) = self.verinfo
2283+        There are several reasons to want to mark something as a bad
2284+        share. These include:
2285 
2286hunk ./src/allmydata/mutable/retrieve.py 478
2287-        if len(self.shares) < k:
2288-            # we don't have enough shares yet
2289-            return self._maybe_send_more_queries(k)
2290-        if self._need_privkey:
2291-            # we got k shares, but none of them had a valid privkey. TODO:
2292-            # look further. Adding code to do this is a bit complicated, and
2293-            # I want to avoid that complication, and this should be pretty
2294-            # rare (k shares with bitflips in the enc_privkey but not in the
2295-            # data blocks). If we actually do get here, the subsequent repair
2296-            # will fail for lack of a privkey.
2297-            self.log("got k shares but still need_privkey, bummer",
2298-                     level=log.WEIRD, umid="MdRHPA")
2299+            - A connection error to the peer.
2300+            - A mismatched prefix (that is, a prefix that does not match
2301+              our local conception of the version information string).
2302+            - A failing block hash, salt hash, share hash, or other
2303+              integrity check.
2304 
2305hunk ./src/allmydata/mutable/retrieve.py 484
2306-        # we have enough to finish. All the shares have had their hashes
2307-        # checked, so if something fails at this point, we don't know how
2308-        # to fix it, so the download will fail.
2309+        This method will ensure that readers that we wish to mark bad
2310+        (for these reasons or other reasons) are not used for the rest
2311+        of the download. Additionally, it will attempt to tell the
2312+        remote peer (with no guarantee of success) that its share is
2313+        corrupt.
2314+        """
2315+        self.log("marking share %d on server %s as bad" % \
2316+                 (reader.shnum, reader))
2317+        self._remove_reader(reader)
2318+        self._bad_shares.add((reader.peerid, reader.shnum))
2319+        self._status.problems[reader.peerid] = f
2320+        self._last_failure = f
2321+        self.notify_server_corruption(reader.peerid, reader.shnum,
2322+                                      str(f.value))
2323 
2324hunk ./src/allmydata/mutable/retrieve.py 499
2325-        self._decoding = True # avoid reentrancy
2326-        self._status.set_status("decoding")
2327-        now = time.time()
2328-        elapsed = now - self._started
2329-        self._status.timings["fetch"] = elapsed
2330 
2331hunk ./src/allmydata/mutable/retrieve.py 500
2332-        d = defer.maybeDeferred(self._decode)
2333-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2334-        d.addBoth(self._done)
2335-        return d # purely for test convenience
2336+    def _download_current_segment(self):
2337+        """
2338+        I download, validate, decode, decrypt, and assemble the segment
2339+        that this Retrieve is currently responsible for downloading.
2340+        """
2341+        assert len(self._active_readers) >= self._required_shares
2342+        if self._current_segment < self._num_segments:
2343+            d = self._process_segment(self._current_segment)
2344+        else:
2345+            d = defer.succeed(None)
2346+        d.addCallback(self._check_for_done)
2347+        return d
2348 
2349hunk ./src/allmydata/mutable/retrieve.py 513
2350-    def _maybe_send_more_queries(self, k):
2351-        # we don't have enough shares yet. Should we send out more queries?
2352-        # There are some number of queries outstanding, each for a single
2353-        # share. If we can generate 'needed_shares' additional queries, we do
2354-        # so. If we can't, then we know this file is a goner, and we raise
2355-        # NotEnoughSharesError.
2356-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2357-                         "outstanding=%(outstanding)d"),
2358-                 have=len(self.shares), k=k,
2359-                 outstanding=len(self._outstanding_queries),
2360-                 level=log.NOISY)
2361 
2362hunk ./src/allmydata/mutable/retrieve.py 514
2363-        remaining_shares = k - len(self.shares)
2364-        needed = remaining_shares - len(self._outstanding_queries)
2365-        if not needed:
2366-            # we have enough queries in flight already
2367+    def _process_segment(self, segnum):
2368+        """
2369+        I download, validate, decode, and decrypt one segment of the
2370+        file that this Retrieve is retrieving. This means coordinating
2371+        the process of getting k blocks of that file, validating them,
2372+        assembling them into one segment with the decoder, and then
2373+        decrypting them.
2374+        """
2375+        self.log("processing segment %d" % segnum)
2376 
2377hunk ./src/allmydata/mutable/retrieve.py 524
2378-            # TODO: but if they've been in flight for a long time, and we
2379-            # have reason to believe that new queries might respond faster
2380-            # (i.e. we've seen other queries come back faster, then consider
2381-            # sending out new queries. This could help with peers which have
2382-            # silently gone away since the servermap was updated, for which
2383-            # we're still waiting for the 15-minute TCP disconnect to happen.
2384-            self.log("enough queries are in flight, no more are needed",
2385-                     level=log.NOISY)
2386-            return
2387+        # TODO: The old code uses a marker. Should this code do that
2388+        # too? What did the Marker do?
2389+        assert len(self._active_readers) >= self._required_shares
2390+
2391+        # We need to ask each of our active readers for its block and
2392+        # salt. We will then validate those. If validation is
2393+        # successful, we will assemble the results into plaintext.
2394+        ds = []
2395+        for reader in self._active_readers:
2396+            d = reader.get_block_and_salt(segnum, queue=True)
2397+            d2 = self._get_needed_hashes(reader, segnum)
2398+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2399+            dl.addCallback(self._validate_block, segnum, reader)
2400+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2401+            ds.append(dl)
2402+            reader.flush()
2403+        dl = defer.DeferredList(ds)
2404+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2405+        return dl
2406 
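A rough sketch of the per-segment fan-out that _process_segment sets up, using plain Twisted primitives and stand-in fetch/validate functions. The names below are illustrative, not the reader proxy's API:

    from twisted.internet import defer

    def fetch_block_and_salt(reader, segnum):
        # stand-in for reader.get_block_and_salt(segnum, queue=True)
        return defer.succeed(("block for %s/%d" % (reader, segnum), "salt"))

    def fetch_needed_hashes(reader, segnum):
        # stand-in for _get_needed_hashes(reader, segnum)
        return defer.succeed(({}, {}))

    def validate_block(results, segnum, reader):
        # stand-in for _validate_block(): results is a list of
        # (success, value) pairs from the inner DeferredList
        (ok1, block_and_salt), (ok2, hashes) = results
        assert ok1 and ok2
        return {reader: block_and_salt}

    def process_segment(readers, segnum):
        ds = []
        for reader in readers:
            d = fetch_block_and_salt(reader, segnum)
            d2 = fetch_needed_hashes(reader, segnum)
            dl = defer.DeferredList([d, d2], consumeErrors=True)
            dl.addCallback(validate_block, segnum, reader)
            ds.append(dl)
        # fires with [(success, {reader: (block, salt)}), ...], one per reader
        return defer.DeferredList(ds)

    d = process_segment(["reader-A", "reader-B", "reader-C"], segnum=0)
    d.addCallback(lambda results: len(results) == 3)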
2407hunk ./src/allmydata/mutable/retrieve.py 544
2408-        outstanding_shnums = set([shnum
2409-                                  for (peerid, shnum, started)
2410-                                  in self._outstanding_queries.values()])
2411-        # prefer low-numbered shares, they are more likely to be primary
2412-        available_shnums = sorted(self.remaining_sharemap.keys())
2413-        for shnum in available_shnums:
2414-            if shnum in outstanding_shnums:
2415-                # skip ones that are already in transit
2416-                continue
2417-            if shnum not in self.remaining_sharemap:
2418-                # no servers for that shnum. note that DictOfSets removes
2419-                # empty sets from the dict for us.
2420-                continue
2421-            peerid = list(self.remaining_sharemap[shnum])[0]
2422-            # get_data will remove that peerid from the sharemap, and add the
2423-            # query to self._outstanding_queries
2424-            self._status.set_status("Retrieving More Shares")
2425-            self.get_data(shnum, peerid)
2426-            needed -= 1
2427-            if not needed:
2428+
2429+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2430+        """
2431+        I take the results of fetching and validating the blocks from a
2432+        callback chain in another method. If the results are such that
2433+        they tell me that validation and fetching succeeded without
2434+        incident, I will proceed with decoding and decryption.
2435+        Otherwise, I will do nothing.
2436+        """
2437+        self.log("trying to decode and decrypt segment %d" % segnum)
2438+        failures = False
2439+        for block_and_salt in blocks_and_salts:
2440+            if not block_and_salt[0] or block_and_salt[1] is None:
2441+                self.log("some validation operations failed; not proceeding")
2442+                failures = True
2443                 break
2444hunk ./src/allmydata/mutable/retrieve.py 560
2445+        if not failures:
2446+            self.log("everything looks ok, building segment %d" % segnum)
2447+            d = self._decode_blocks(blocks_and_salts, segnum)
2448+            d.addCallback(self._decrypt_segment)
2449+            d.addErrback(self._validation_or_decoding_failed,
2450+                         self._active_readers)
2451+            d.addCallback(self._set_segment)
2452+            return d
2453+        else:
2454+            return defer.succeed(None)
2455+
2456+
2457+    def _set_segment(self, segment):
2458+        """
2459+        Given a plaintext segment, I register that segment with the
2460+        target that is handling the file download.
2461+        """
2462+        self.log("got plaintext for segment %d" % self._current_segment)
2463+        self._plaintext += segment
2464+        self._current_segment += 1
2465 
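The blocks_and_salts argument that _maybe_decode_and_decrypt_segment receives is an ordinary DeferredList result, i.e. a list of (success, value) pairs. A tiny illustration of the "only proceed if everything succeeded" check, with stand-in values rather than real block/salt payloads:

    from twisted.internet import defer

    def maybe_proceed(results):
        for (success, value) in results:
            if not success or value is None:
                return None        # at least one fetch/validation failed
        return [value for (_, value) in results]

    dl = defer.DeferredList([defer.succeed({0: ("block-0", "salt")}),
                             defer.succeed({5: ("block-5", "salt")})])
    dl.addCallback(maybe_proceed)
    dl.addCallback(lambda blocks: blocks is not None and len(blocks) == 2)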
2466hunk ./src/allmydata/mutable/retrieve.py 581
2467-        # at this point, we have as many outstanding queries as we can. If
2468-        # needed!=0 then we might not have enough to recover the file.
2469-        if needed:
2470-            format = ("ran out of peers: "
2471-                      "have %(have)d shares (k=%(k)d), "
2472-                      "%(outstanding)d queries in flight, "
2473-                      "need %(need)d more, "
2474-                      "found %(bad)d bad shares")
2475-            args = {"have": len(self.shares),
2476-                    "k": k,
2477-                    "outstanding": len(self._outstanding_queries),
2478-                    "need": needed,
2479-                    "bad": len(self._bad_shares),
2480-                    }
2481-            self.log(format=format,
2482-                     level=log.WEIRD, umid="ezTfjw", **args)
2483-            err = NotEnoughSharesError("%s, last failure: %s" %
2484-                                      (format % args, self._last_failure))
2485-            if self._bad_shares:
2486-                self.log("We found some bad shares this pass. You should "
2487-                         "update the servermap and try again to check "
2488-                         "more peers",
2489-                         level=log.WEIRD, umid="EFkOlA")
2490-                err.servermap = self.servermap
2491-            raise err
2492 
2493hunk ./src/allmydata/mutable/retrieve.py 582
2494+    def _validation_or_decoding_failed(self, f, readers):
2495+        """
2496+        I am called when a block or a salt fails to correctly validate, or when
2497+        the decryption or decoding operation fails for some reason.  I react to
2498+        this failure by notifying the remote server of corruption, and then
2499+        removing the remote peer from further activity.
2500+        """
2501+        assert isinstance(readers, list)
2502+        bad_shnums = [reader.shnum for reader in readers]
2503+
2504+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
2505+                 "segment %d: %s" % \
2506+                 (bad_shnums, readers, self._current_segment, str(f)))
2507+        for reader in readers:
2508+            self._mark_bad_share(reader, f)
2509         return
2510 
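The failures that land here mostly come from Merkle tree checks. A rough standalone illustration of the block hash tree check described in _validate_block just below, using allmydata.hashtree and allmydata.util.hashutil as this patch does; the writer_bht/reader_bht names and values are only illustrative:

    from allmydata import hashtree
    from allmydata.util import hashutil

    blocks = ["segment %d of one share" % i for i in range(4)]
    leaves = [hashutil.block_hash(b) for b in blocks]

    # The writer publishes the full block hash tree; index 0 is the root,
    # which ends up protected by the share hash tree and the signature.
    writer_bht = hashtree.HashTree(leaves)

    # The reader starts with an empty tree of the same size and only the
    # (already authenticated) root hash.
    reader_bht = hashtree.IncompleteHashTree(len(blocks))
    reader_bht.set_hashes({0: writer_bht[0]})

    segnum = 2
    needed = reader_bht.needed_hashes(segnum)       # uncle hashes to fetch
    fetched = dict([(i, writer_bht[i]) for i in needed])
    # This raises hashtree.BadHashError if the block's hash does not chain
    # up to the known root, which is what triggers _mark_bad_share().
    reader_bht.set_hashes(fetched,
                          leaves={segnum: hashutil.block_hash(blocks[segnum])})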
2511hunk ./src/allmydata/mutable/retrieve.py 599
2512-    def _decode(self):
2513-        started = time.time()
2514-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2515-         offsets_tuple) = self.verinfo
2516 
2517hunk ./src/allmydata/mutable/retrieve.py 600
2518-        # shares_dict is a dict mapping shnum to share data, but the codec
2519-        # wants two lists.
2520-        shareids = []; shares = []
2521-        for shareid, share in self.shares.items():
2522+    def _validate_block(self, results, segnum, reader):
2523+        """
2524+        I validate a block from one share on a remote server.
2525+        """
2526+        # Grab the part of the block hash tree that is necessary to
2527+        # validate this block, then generate the block hash root.
2528+        self.log("validating share %d for segment %d" % (reader.shnum,
2529+                                                             segnum))
2530+        # Did we fail to fetch either of the things that we were
2531+        # supposed to? Fail if so.
2532+        if not results[0][0] or not results[1][0]:
2533+            # One or both fetches failed; the CorruptShareError raised
2534+            # here is handled by the errback added in _process_segment.
2535+            # Both fetches are batched into one query, so their failures
2536+            # should be the same; use whichever one we find.
2537+            if not results[0][0]:
2538+                f = results[0][1]
2539+            else:
2540+                f = results[1][1]
2541+            raise CorruptShareError(reader.peerid,
2542+                                    reader.shnum,
2543+                                    "Connection error: %s" % str(f))
2544+
2545+        block_and_salt, block_and_sharehashes = results
2546+        block, salt = block_and_salt[1]
2547+        blockhashes, sharehashes = block_and_sharehashes[1]
2548+
2549+        blockhashes = dict(enumerate(blockhashes[1]))
2550+        self.log("the reader gave me the following blockhashes: %s" % \
2551+                 blockhashes.keys())
2552+        self.log("the reader gave me the following sharehashes: %s" % \
2553+                 sharehashes[1].keys())
2554+        bht = self._block_hash_trees[reader.shnum]
2555+
2556+        if bht.needed_hashes(segnum, include_leaf=True):
2557+            try:
2558+                bht.set_hashes(blockhashes)
2559+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2560+                    IndexError), e:
2561+                raise CorruptShareError(reader.peerid,
2562+                                        reader.shnum,
2563+                                        "block hash tree failure: %s" % e)
2564+
2565+        if self._version == MDMF_VERSION:
2566+            blockhash = hashutil.block_hash(salt + block)
2567+        else:
2568+            blockhash = hashutil.block_hash(block)
2569+        # If this works without an error, then validation is
2570+        # successful.
2571+        try:
2572+            bht.set_hashes(leaves={segnum: blockhash})
2573+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2574+                IndexError), e:
2575+            raise CorruptShareError(reader.peerid,
2576+                                    reader.shnum,
2577+                                    "block hash tree failure: %s" % e)
2578+
2579+        # Reaching this point means that we know that this segment
2580+        # is correct. Now we need to check to see whether the share
2581+        # hash chain is also correct.
2582+        # SDMF wrote share hash chains that didn't contain the
2583+        # leaves, which would be produced from the block hash tree.
2584+        # So we need to validate the block hash tree first. If
2585+        # successful, then bht[0] will contain the root for the
2586+        # shnum, which will be a leaf in the share hash tree, which
2587+        # will allow us to validate the rest of the tree.
2588+        if self.share_hash_tree.needed_hashes(reader.shnum,
2589+                                               include_leaf=True):
2590+            try:
2591+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2592+                                            leaves={reader.shnum: bht[0]})
2593+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2594+                    IndexError), e:
2595+                raise CorruptShareError(reader.peerid,
2596+                                        reader.shnum,
2597+                                        "corrupt hashes: %s" % e)
2598+
2599+        # TODO: Validate the salt, too.
2600+        self.log('share %d is valid for segment %d' % (reader.shnum,
2601+                                                       segnum))
2602+        return {reader.shnum: (block, salt)}
2603+
2604+
2605+    def _get_needed_hashes(self, reader, segnum):
2606+        """
2607+        I get the hashes needed to validate segnum from the reader, then return
2608+        to my caller when this is done.
2609+        """
2610+        bht = self._block_hash_trees[reader.shnum]
2611+        needed = bht.needed_hashes(segnum, include_leaf=True)
2612+        # The root of the block hash tree is also a leaf in the share
2613+        # hash tree. So we don't need to fetch it from the remote
2614+        # server. In the case of files with one segment, this means that
2615+        # we won't fetch any block hash tree from the remote server,
2616+        # since the hash of each share of the file is the entire block
2617+        # hash tree, and is a leaf in the share hash tree. This is fine,
2618+        # since any share corruption will be detected in the share hash
2619+        # tree.
2620+        #needed.discard(0)
2621+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2622+                 (segnum, reader.shnum, str(needed)))
2623+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2624+        if self.share_hash_tree.needed_hashes(reader.shnum):
2625+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2626+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2627+                                                                 str(need)))
2628+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2629+        else:
2630+            d2 = defer.succeed({}) # the logic in the next method
2631+                                   # expects a dict
2632+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2633+        return dl
2634+
2635+
2636+    def _decode_blocks(self, blocks_and_salts, segnum):
2637+        """
2638+        I take a list of k blocks and salts, and decode that into a
2639+        single encrypted segment.
2640+        """
2641+        d = {}
2642+        # We want to merge our dictionaries to the form
2643+        # {shnum: blocks_and_salts}
2644+        #
2645+        # The dictionaries come from validate block that way, so we just
2646+        # need to merge them.
2647+        for block_and_salt in blocks_and_salts:
2648+            d.update(block_and_salt[1])
2649+
2650+        # All of these blocks should have the same salt; in SDMF, it is
2651+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2652+        # either case, we just need to get one of them and use it.
2653+        #
2654+        # d.items()[0] is like (shnum, (block, salt))
2655+        # d.items()[0][1] is like (block, salt)
2656+        # d.items()[0][1][1] is the salt.
2657+        salt = d.items()[0][1][1]
2658+        # Next, extract just the blocks from the dict. We'll use the
2659+        # salt in the next step.
2660+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2661+        d2 = dict(share_and_shareids)
2662+        shareids = []
2663+        shares = []
2664+        for shareid, share in d2.items():
2665             shareids.append(shareid)
2666             shares.append(share)
2667 
2668hunk ./src/allmydata/mutable/retrieve.py 746
2669-        assert len(shareids) >= k, len(shareids)
2670+        assert len(shareids) >= self._required_shares, len(shareids)
2671         # zfec really doesn't want extra shares
2672hunk ./src/allmydata/mutable/retrieve.py 748
2673-        shareids = shareids[:k]
2674-        shares = shares[:k]
2675-
2676-        fec = codec.CRSDecoder()
2677-        fec.set_params(segsize, k, N)
2678-
2679-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2680-        self.log("about to decode, shareids=%s" % (shareids,))
2681-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2682-        def _done(buffers):
2683-            self._status.timings["decode"] = time.time() - started
2684-            self.log(" decode done, %d buffers" % len(buffers))
2685+        shareids = shareids[:self._required_shares]
2686+        shares = shares[:self._required_shares]
2687+        self.log("decoding segment %d" % segnum)
2688+        if segnum == self._num_segments - 1:
2689+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2690+        else:
2691+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2692+        def _process(buffers):
2693             segment = "".join(buffers)
2694hunk ./src/allmydata/mutable/retrieve.py 757
2695+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
2696+                     segnum=segnum,
2697+                     numsegs=self._num_segments,
2698+                     level=log.NOISY)
2699             self.log(" joined length %d, datalength %d" %
2700hunk ./src/allmydata/mutable/retrieve.py 762
2701-                     (len(segment), datalength))
2702-            segment = segment[:datalength]
2703+                     (len(segment), self._data_length))
2704+            if segnum == self._num_segments - 1:
2705+                size_to_use = self._tail_data_size
2706+            else:
2707+                size_to_use = self._segment_size
2708+            segment = segment[:size_to_use]
2709             self.log(" segment len=%d" % len(segment))
2710hunk ./src/allmydata/mutable/retrieve.py 769
2711-            return segment
2712-        def _err(f):
2713-            self.log(" decode failed: %s" % f)
2714-            return f
2715-        d.addCallback(_done)
2716-        d.addErrback(_err)
2717+            return segment, salt
2718+        d.addCallback(_process)
2719         return d
2720 
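The FEC step itself is the same one the removed SDMF code used. A small, self-contained round trip through allmydata.codec; the k, n and segsize values below are illustrative:

    from twisted.internet import defer
    from allmydata import codec

    k, n = 3, 10
    segsize = 30
    segment = "A" * 10 + "B" * 10 + "C" * 10   # one already-padded segment

    # Split the segment into k equal pieces and erasure-code them into n shares.
    piece_size = segsize / k
    pieces = [segment[i*piece_size:(i+1)*piece_size] for i in range(k)]
    enc = codec.CRSEncoder()
    enc.set_params(segsize, k, n)
    d = enc.encode(pieces)

    def _decode(res):
        (shares, shareids) = res
        # Any k of the n shares are enough; zfec wants exactly k, no extras.
        dec = codec.CRSDecoder()
        dec.set_params(segsize, k, n)
        return dec.decode(shares[:k], shareids[:k])
    d.addCallback(_decode)
    d.addCallback(lambda buffers: "".join(buffers))
    d.addCallback(lambda data: data == segment)   # True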
2721hunk ./src/allmydata/mutable/retrieve.py 773
2722-    def _decrypt(self, crypttext, IV, readkey):
2723+
2724+    def _decrypt_segment(self, segment_and_salt):
2725+        """
2726+        I take a single segment and its salt, and decrypt it. I return
2727+        the plaintext of the segment that is in my argument.
2728+        """
2729+        segment, salt = segment_and_salt
2730         self._status.set_status("decrypting")
2731hunk ./src/allmydata/mutable/retrieve.py 781
2732+        self.log("decrypting segment %d" % self._current_segment)
2733         started = time.time()
2734hunk ./src/allmydata/mutable/retrieve.py 783
2735-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2736+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2737         decryptor = AES(key)
2738hunk ./src/allmydata/mutable/retrieve.py 785
2739-        plaintext = decryptor.process(crypttext)
2740+        plaintext = decryptor.process(segment)
2741         self._status.timings["decrypt"] = time.time() - started
2742         return plaintext
2743 
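A short illustration of the per-segment decryption: the data key is derived from the salt and the readkey, and since AES is used in counter mode the same operation both encrypts and decrypts. The readkey and salt values below are dummies:

    from pycryptopp.cipher.aes import AES
    from allmydata.util import hashutil

    readkey = "\x00" * 16    # dummy readkey
    salt = "\x01" * 16       # per-file IV (SDMF) or per-segment salt (MDMF)
    segment = "plaintext of one segment"

    key = hashutil.ssk_readkey_data_hash(salt, readkey)
    crypttext = AES(key).process(segment)           # encrypt
    assert AES(key).process(crypttext) == segment   # decrypting is the same op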
2744hunk ./src/allmydata/mutable/retrieve.py 789
2745-    def _done(self, res):
2746-        if not self._running:
2747+
2748+    def notify_server_corruption(self, peerid, shnum, reason):
2749+        ss = self.servermap.connections[peerid]
2750+        ss.callRemoteOnly("advise_corrupt_share",
2751+                          "mutable", self._storage_index, shnum, reason)
2752+
2753+
2754+    def _try_to_validate_privkey(self, enc_privkey, reader):
2755+
2756+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2757+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2758+        if alleged_writekey != self._node.get_writekey():
2759+            self.log("invalid privkey from %s shnum %d" %
2760+                     (reader, reader.shnum),
2761+                     level=log.WEIRD, umid="YIw4tA")
2762             return
2763hunk ./src/allmydata/mutable/retrieve.py 805
2764-        self._running = False
2765-        self._status.set_active(False)
2766-        self._status.timings["total"] = time.time() - self._started
2767-        # res is either the new contents, or a Failure
2768-        if isinstance(res, failure.Failure):
2769-            self.log("Retrieve done, with failure", failure=res,
2770-                     level=log.UNUSUAL)
2771-            self._status.set_status("Failed")
2772-        else:
2773-            self.log("Retrieve done, success!")
2774-            self._status.set_status("Finished")
2775-            self._status.set_progress(1.0)
2776-            # remember the encoding parameters, use them again next time
2777-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2778-             offsets_tuple) = self.verinfo
2779-            self._node._populate_required_shares(k)
2780-            self._node._populate_total_shares(N)
2781-        eventually(self._done_deferred.callback, res)
2782 
2783hunk ./src/allmydata/mutable/retrieve.py 806
2784+        # it's good
2785+        self.log("got valid privkey from shnum %d on reader %s" %
2786+                 (reader.shnum, reader))
2787+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2788+        self._node._populate_encprivkey(enc_privkey)
2789+        self._node._populate_privkey(privkey)
2790+        self._need_privkey = False
2791+
2792+
2793+    def _check_for_done(self, res):
2794+        """
2795+        I check to see if this Retrieve object has successfully finished
2796+        its work.
2797+
2798+        I can exit in the following ways:
2799+            - If there are no more segments to download, then I exit by
2800+              causing self._done_deferred to fire with the plaintext
2801+              content requested by the caller.
2802+            - If there are still segments to be downloaded, and there
2803+              are enough active readers (readers which have not broken
2804+              and have not given us corrupt data) to continue
2805+              downloading, I send control back to
2806+              _download_current_segment.
2807+            - If there are still segments to be downloaded but there are
2808+              not enough active peers to download them, I ask
2809+              _add_active_peers to add more peers. If it is successful,
2810+              it will call _download_current_segment. If there are not
2811+              enough peers to retrieve the file, then that will cause
2812+              _done_deferred to errback.
2813+        """
2814+        self.log("checking for doneness")
2815+        if self._current_segment == self._num_segments:
2816+            # No more segments to download, we're done.
2817+            self.log("got plaintext, done")
2818+            return self._done()
2819+
2820+        if len(self._active_readers) >= self._required_shares:
2821+            # More segments to download, but we have enough good peers
2822+            # in self._active_readers that we can do that without issue,
2823+            # so go nab the next segment.
2824+            self.log("not done yet: on segment %d of %d" % \
2825+                     (self._current_segment + 1, self._num_segments))
2826+            return self._download_current_segment()
2827+
2828+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2829+                 (self._current_segment + 1, self._num_segments))
2830+        return self._add_active_peers()
2831+
2832+
2833+    def _done(self):
2834+        """
2835+        I am called by _check_for_done when the download process has
2836+        finished successfully. I deliver the decrypted plaintext to
2837+        the owner of this Retrieve object through
2838+        self._done_deferred.
2839+        """
2840+        eventually(self._done_deferred.callback, self._plaintext)
2841+
2842+
2843+    def _failed(self):
2844+        """
2845+        I am called by _add_active_peers when there are not enough
2846+        active peers left to complete the download. I construct a
2847+        NotEnoughSharesError describing the situation and deliver it
2848+        (wrapped in a Failure) to the caller of this Retrieve object
2849+        through self._done_deferred.
2850+        """
2851+        format = ("ran out of peers: "
2852+                  "have %(have)d of %(total)d segments, "
2853+                  "found %(bad)d bad shares, "
2854+                  "encoding %(k)d-of-%(n)d")
2855+        args = {"have": self._current_segment,
2856+                "total": self._num_segments,
2857+                "k": self._required_shares,
2858+                "n": self._total_shares,
2859+                "bad": len(self._bad_shares)}
2860+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2861+                                                        str(self._last_failure)))
2862+        f = failure.Failure(e)
2863+        eventually(self._done_deferred.callback, f)
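Putting the control flow together, _download_current_segment, _check_for_done, _done and _failed form a small loop. A toy synchronous-Deferred version of that loop; segment fetching is faked, and the class only mirrors the method names above for illustration:

    from twisted.internet import defer

    class ToyRetrieve(object):
        def __init__(self, num_segments, required_shares, num_readers):
            self._num_segments = num_segments
            self._required_shares = required_shares
            self._active_readers = range(num_readers)
            self._current_segment = 0
            self._plaintext = ""
            self._done_deferred = defer.Deferred()

        def _download_current_segment(self):
            d = defer.succeed("segment %d;" % self._current_segment)  # fake fetch
            d.addCallback(self._set_segment)
            d.addCallback(self._check_for_done)
            return d

        def _set_segment(self, segment):
            self._plaintext += segment
            self._current_segment += 1

        def _check_for_done(self, res):
            if self._current_segment == self._num_segments:
                return self._done()
            if len(self._active_readers) >= self._required_shares:
                return self._download_current_segment()
            return self._failed()

        def _done(self):
            self._done_deferred.callback(self._plaintext)

        def _failed(self):
            self._done_deferred.errback(
                RuntimeError("ran out of peers: have %d of %d segments" %
                             (self._current_segment, self._num_segments)))

    r = ToyRetrieve(num_segments=3, required_shares=3, num_readers=3)
    r._download_current_segment()
    r._done_deferred.addCallback(lambda text: text.count("segment") == 3)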
2864hunk ./src/allmydata/test/test_mutable.py 12
2865 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2866      ssk_pubkey_fingerprint_hash
2867 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2868-     NotEnoughSharesError
2869+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2870 from allmydata.monitor import Monitor
2871 from allmydata.test.common import ShouldFailMixin
2872 from allmydata.test.no_network import GridTestMixin
2873hunk ./src/allmydata/test/test_mutable.py 28
2874 from allmydata.mutable.retrieve import Retrieve
2875 from allmydata.mutable.publish import Publish
2876 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2877-from allmydata.mutable.layout import unpack_header, unpack_share
2878+from allmydata.mutable.layout import unpack_header, unpack_share, \
2879+                                     MDMFSlotReadProxy
2880 from allmydata.mutable.repairer import MustForceRepairError
2881 
2882 import allmydata.test.common_util as testutil
2883hunk ./src/allmydata/test/test_mutable.py 104
2884         d = fireEventually()
2885         d.addCallback(lambda res: _call())
2886         return d
2887+
2888     def callRemoteOnly(self, methname, *args, **kwargs):
2889         d = self.callRemote(methname, *args, **kwargs)
2890         d.addBoth(lambda ignore: None)
2891hunk ./src/allmydata/test/test_mutable.py 163
2892 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2893     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2894     # list of shnums to corrupt.
2895+    ds = []
2896     for peerid in s._peers:
2897         shares = s._peers[peerid]
2898         for shnum in shares:
2899hunk ./src/allmydata/test/test_mutable.py 190
2900                 else:
2901                     offset1 = offset
2902                     offset2 = 0
2903-                if offset1 == "pubkey":
2904+                if offset1 == "pubkey" and IV:
2905                     real_offset = 107
2906hunk ./src/allmydata/test/test_mutable.py 192
2907+                elif offset1 == "share_data" and not IV:
2908+                    real_offset = 104
2909                 elif offset1 in o:
2910                     real_offset = o[offset1]
2911                 else:
2912hunk ./src/allmydata/test/test_mutable.py 327
2913         d.addCallback(_created)
2914         return d
2915 
2916+
2917+    def test_upload_and_download_mdmf(self):
2918+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2919+        def _created(n):
2920+            d = defer.succeed(None)
2921+            d.addCallback(lambda ignored:
2922+                n.get_servermap(MODE_READ))
2923+            def _then(servermap):
2924+                dumped = servermap.dump(StringIO())
2925+                self.failUnlessIn("3-of-10", dumped.getvalue())
2926+            d.addCallback(_then)
2927+            # Now overwrite the contents with some new contents. We want
2928+            # to make them big enough to force the file to be uploaded
2929+            # in more than one segment.
2930+            big_contents = "contents1" * 100000 # about 900 KiB
2931+            d.addCallback(lambda ignored:
2932+                n.overwrite(big_contents))
2933+            d.addCallback(lambda ignored:
2934+                n.download_best_version())
2935+            d.addCallback(lambda data:
2936+                self.failUnlessEqual(data, big_contents))
2937+            # Overwrite the contents again with some new contents. As
2938+            # before, they need to be big enough to force multiple
2939+            # segments, so that we make the downloader deal with
2940+            # multiple segments.
2941+            bigger_contents = "contents2" * 1000000 # about 9MiB
2942+            d.addCallback(lambda ignored:
2943+                n.overwrite(bigger_contents))
2944+            d.addCallback(lambda ignored:
2945+                n.download_best_version())
2946+            d.addCallback(lambda data:
2947+                self.failUnlessEqual(data, bigger_contents))
2948+            return d
2949+        d.addCallback(_created)
2950+        return d
2951+
2952+
2953     def test_create_with_initial_contents(self):
2954         d = self.nodemaker.create_mutable_file("contents 1")
2955         def _created(n):
2956hunk ./src/allmydata/test/test_mutable.py 1147
2957 
2958 
2959     def _test_corrupt_all(self, offset, substring,
2960-                          should_succeed=False, corrupt_early=True,
2961-                          failure_checker=None):
2962+                          should_succeed=False,
2963+                          corrupt_early=True,
2964+                          failure_checker=None,
2965+                          fetch_privkey=False):
2966         d = defer.succeed(None)
2967         if corrupt_early:
2968             d.addCallback(corrupt, self._storage, offset)
2969hunk ./src/allmydata/test/test_mutable.py 1167
2970                     self.failUnlessIn(substring, "".join(allproblems))
2971                 return servermap
2972             if should_succeed:
2973-                d1 = self._fn.download_version(servermap, ver)
2974+                d1 = self._fn.download_version(servermap, ver,
2975+                                               fetch_privkey)
2976                 d1.addCallback(lambda new_contents:
2977                                self.failUnlessEqual(new_contents, self.CONTENTS))
2978             else:
2979hunk ./src/allmydata/test/test_mutable.py 1175
2980                 d1 = self.shouldFail(NotEnoughSharesError,
2981                                      "_corrupt_all(offset=%s)" % (offset,),
2982                                      substring,
2983-                                     self._fn.download_version, servermap, ver)
2984+                                     self._fn.download_version, servermap,
2985+                                                                ver,
2986+                                                                fetch_privkey)
2987             if failure_checker:
2988                 d1.addCallback(failure_checker)
2989             d1.addCallback(lambda res: servermap)
2990hunk ./src/allmydata/test/test_mutable.py 1186
2991         return d
2992 
2993     def test_corrupt_all_verbyte(self):
2994-        # when the version byte is not 0, we hit an UnknownVersionError error
2995-        # in unpack_share().
2996+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
2997+        # error in unpack_share().
2998         d = self._test_corrupt_all(0, "UnknownVersionError")
2999         def _check_servermap(servermap):
3000             # and the dump should mention the problems
3001hunk ./src/allmydata/test/test_mutable.py 1193
3002             s = StringIO()
3003             dump = servermap.dump(s).getvalue()
3004-            self.failUnless("10 PROBLEMS" in dump, dump)
3005+            self.failUnless("30 PROBLEMS" in dump, dump)
3006         d.addCallback(_check_servermap)
3007         return d
3008 
3009hunk ./src/allmydata/test/test_mutable.py 1263
3010         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
3011 
3012 
3013+    def test_corrupt_all_encprivkey_late(self):
3014+        # this should work for the same reason as above, but we corrupt
3015+        # after the servermap update to exercise the error handling
3016+        # code.
3017+        # We need to remove the privkey from the node, or the retrieve
3018+        # process won't know to update it.
3019+        self._fn._privkey = None
3020+        return self._test_corrupt_all("enc_privkey",
3021+                                      None, # this shouldn't fail
3022+                                      should_succeed=True,
3023+                                      corrupt_early=False,
3024+                                      fetch_privkey=True)
3025+
3026+
3027     def test_corrupt_all_seqnum_late(self):
3028         # corrupting the seqnum between mapupdate and retrieve should result
3029         # in NotEnoughSharesError, since each share will look invalid
3030hunk ./src/allmydata/test/test_mutable.py 1283
3031         def _check(res):
3032             f = res[0]
3033             self.failUnless(f.check(NotEnoughSharesError))
3034-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3035+            self.failUnless("uncoordinated write" in str(f))
3036         return self._test_corrupt_all(1, "ran out of peers",
3037                                       corrupt_early=False,
3038                                       failure_checker=_check)
3039hunk ./src/allmydata/test/test_mutable.py 1333
3040                       self.failUnlessEqual(new_contents, self.CONTENTS))
3041         return d
3042 
3043-    def test_corrupt_some(self):
3044-        # corrupt the data of first five shares (so the servermap thinks
3045-        # they're good but retrieve marks them as bad), so that the
3046-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3047-        # retry with more servers.
3048-        corrupt(None, self._storage, "share_data", range(5))
3049-        d = self.make_servermap()
3050+
3051+    def _test_corrupt_some(self, offset, mdmf=False):
3052+        if mdmf:
3053+            d = self.publish_mdmf()
3054+        else:
3055+            d = defer.succeed(None)
3056+        d.addCallback(lambda ignored:
3057+            corrupt(None, self._storage, offset, range(5)))
3058+        d.addCallback(lambda ignored:
3059+            self.make_servermap())
3060         def _do_retrieve(servermap):
3061             ver = servermap.best_recoverable_version()
3062             self.failUnless(ver)
3063hunk ./src/allmydata/test/test_mutable.py 1349
3064             return self._fn.download_best_version()
3065         d.addCallback(_do_retrieve)
3066         d.addCallback(lambda new_contents:
3067-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3068+            self.failUnlessEqual(new_contents, self.CONTENTS))
3069         return d
3070 
3071hunk ./src/allmydata/test/test_mutable.py 1352
3072+
3073+    def test_corrupt_some(self):
3074+        # corrupt the data of first five shares (so the servermap thinks
3075+        # they're good but retrieve marks them as bad), so that the
3076+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3077+        # retry with more servers.
3078+        return self._test_corrupt_some("share_data")
3079+
3080+
3081     def test_download_fails(self):
3082         d = corrupt(None, self._storage, "signature")
3083         d.addCallback(lambda ignored:
3084hunk ./src/allmydata/test/test_mutable.py 1366
3085             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3086                             "no recoverable versions",
3087-                            self._fn.download_best_version)
3088+                            self._fn.download_best_version))
3089         return d
3090 
3091 
3092hunk ./src/allmydata/test/test_mutable.py 1370
3093+
3094+    def test_corrupt_mdmf_block_hash_tree(self):
3095+        d = self.publish_mdmf()
3096+        d.addCallback(lambda ignored:
3097+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3098+                                   "block hash tree failure",
3099+                                   corrupt_early=True,
3100+                                   should_succeed=False))
3101+        return d
3102+
3103+
3104+    def test_corrupt_mdmf_block_hash_tree_late(self):
3105+        d = self.publish_mdmf()
3106+        d.addCallback(lambda ignored:
3107+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3108+                                   "block hash tree failure",
3109+                                   corrupt_early=False,
3110+                                   should_succeed=False))
3111+        return d
3112+
3113+
3114+    def test_corrupt_mdmf_share_data(self):
3115+        d = self.publish_mdmf()
3116+        d.addCallback(lambda ignored:
3117+            # TODO: Find out what the block size is and corrupt a
3118+            # specific block, rather than just guessing.
3119+            self._test_corrupt_all(("share_data", 12 * 40),
3120+                                    "block hash tree failure",
3121+                                    corrupt_early=True,
3122+                                    should_succeed=False))
3123+        return d
3124+
3125+
3126+    def test_corrupt_some_mdmf(self):
3127+        return self._test_corrupt_some(("share_data", 12 * 40),
3128+                                       mdmf=True)
3129+
3130+
3131 class CheckerMixin:
3132     def check_good(self, r, where):
3133         self.failUnless(r.is_healthy(), where)
3134hunk ./src/allmydata/test/test_mutable.py 2116
3135             d.addCallback(lambda res:
3136                           self.shouldFail(NotEnoughSharesError,
3137                                           "test_retrieve_surprise",
3138-                                          "ran out of peers: have 0 shares (k=3)",
3139+                                          "ran out of peers: have 0 of 1",
3140                                           n.download_version,
3141                                           self.old_map,
3142                                           self.old_map.best_recoverable_version(),
3143hunk ./src/allmydata/test/test_mutable.py 2125
3144         d.addCallback(_created)
3145         return d
3146 
3147+
3148     def test_unexpected_shares(self):
3149         # upload the file, take a servermap, shut down one of the servers,
3150         # upload it again (causing shares to appear on a new server), then
3151hunk ./src/allmydata/test/test_mutable.py 2329
3152         self.basedir = "mutable/Problems/test_privkey_query_missing"
3153         self.set_up_grid(num_servers=20)
3154         nm = self.g.clients[0].nodemaker
3155-        LARGE = "These are Larger contents" * 2000 # about 50KB
3156+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3157         nm._node_cache = DevNullDictionary() # disable the nodecache
3158 
3159         d = nm.create_mutable_file(LARGE)
3160hunk ./src/allmydata/test/test_mutable.py 2342
3161         d.addCallback(_created)
3162         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3163         return d
3164+
3165+
3166+    def test_block_and_hash_query_error(self):
3167+        # This tests for what happens when a query to a remote server
3168+        # fails in either the hash validation step or the block getting
3169+        # step (because of batching, this is the same actual query).
3170+        # We need to have the storage server persist up until the point
3171+        # that its prefix is validated, then suddenly die. This
3172+        # exercises some exception handling code in Retrieve.
3173+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3174+        self.set_up_grid(num_servers=20)
3175+        nm = self.g.clients[0].nodemaker
3176+        CONTENTS = "contents" * 2000
3177+        d = nm.create_mutable_file(CONTENTS)
3178+        def _created(node):
3179+            self._node = node
3180+        d.addCallback(_created)
3181+        d.addCallback(lambda ignored:
3182+            self._node.get_servermap(MODE_READ))
3183+        def _then(servermap):
3184+            # we have our servermap. Now we set up the servers like the
3185+            # tests above -- the first one that gets a read call should
3186+            # start throwing errors, but only after returning its prefix
3187+            # for validation. Since we'll download without fetching the
3188+            # private key, the next query to the remote server will be
3189+            # for either a block and salt or for hashes, either of which
3190+            # will exercise the error handling code.
3191+            killer = FirstServerGetsKilled()
3192+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3193+                ss.post_call_notifier = killer.notify
3194+            ver = servermap.best_recoverable_version()
3195+            assert ver
3196+            return self._node.download_version(servermap, ver)
3197+        d.addCallback(_then)
3198+        d.addCallback(lambda data:
3199+            self.failUnlessEqual(data, CONTENTS))
3200+        return d
3201}
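The test above relies on the existing FirstServerGetsKilled helper to make exactly one server fail after its first answer. The idea, reduced to a self-contained stand-in (this is not the helper's actual code, just the pattern):

    class FlakyServer(object):
        """Answer the first query normally, then start failing, which is
        roughly what FirstServerGetsKilled arranges for the first server
        that receives a read call."""
        def __init__(self):
            self.broken = False

        def slot_readv(self, storage_index, shnums, readv):
            if self.broken:
                raise IOError("connection lost")
            self.broken = True        # every later call will fail
            return dict([(shnum, ["fake share data"]) for shnum in shnums])

    s = FlakyServer()
    s.slot_readv("si", [0], [(0, 100)])     # first call succeeds
    try:
        s.slot_readv("si", [0], [(0, 100)])
    except IOError:
        pass                                # subsequent calls fail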
3202[mutable/checker.py: check MDMF files
3203Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3204 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3205 
3206 This patch adapts the mutable file checker and verifier to check and
3207 verify MDMF files. It does this by using the new segmented downloader,
3208 which is trained to perform verification operations on request. This
3209 removes some code duplication.
3210] {
3211hunk ./src/allmydata/mutable/checker.py 12
3212 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3213 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3214 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3215+from allmydata.mutable.retrieve import Retrieve # for verifying
3216 
3217 class MutableChecker:
3218 
3219hunk ./src/allmydata/mutable/checker.py 29
3220 
3221     def check(self, verify=False, add_lease=False):
3222         servermap = ServerMap()
3223+        # Updating the servermap in MODE_CHECK will stand a good chance
3224+        # of finding all of the shares, and getting a good idea of
3225+        # recoverability, etc, without verifying.
3226         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3227                              servermap, MODE_CHECK, add_lease=add_lease)
3228         if self._history:
3229hunk ./src/allmydata/mutable/checker.py 55
3230         if num_recoverable:
3231             self.best_version = servermap.best_recoverable_version()
3232 
3233+        # The file is unhealthy and needs to be repaired if:
3234+        # - There are unrecoverable versions.
3235         if servermap.unrecoverable_versions():
3236             self.need_repair = True
3237hunk ./src/allmydata/mutable/checker.py 59
3238+        # - There isn't a recoverable version.
3239         if num_recoverable != 1:
3240             self.need_repair = True
3241hunk ./src/allmydata/mutable/checker.py 62
3242+        # - The best recoverable version is missing some shares.
3243         if self.best_version:
3244             available_shares = servermap.shares_available()
3245             (num_distinct_shares, k, N) = available_shares[self.best_version]
3246hunk ./src/allmydata/mutable/checker.py 73
3247 
3248     def _verify_all_shares(self, servermap):
3249         # read every byte of each share
3250+        #
3251+        # This logic is going to be very nearly the same as the
3252+        # downloader. I bet we could pass the downloader a flag that
3253+        # makes it do this, and piggyback onto that instead of
3254+        # duplicating a bunch of code.
3255+        #
3256+        # Like:
3257+        #  r = Retrieve(blah, blah, blah, verify=True)
3258+        #  d = r.download()
3259+        #  (wait, wait, wait, d.callback)
3260+        # 
3261+        #  Then, when it has finished, we can check the servermap (which
3262+        #  we provided to Retrieve) to figure out which shares are bad,
3263+        #  since the Retrieve process will have updated the servermap as
3264+        #  it went along.
3265+        #
3266+        #  By passing the verify=True flag to the constructor, we are
3267+        #  telling the downloader a few things.
3268+        #
3269+        #  1. It needs to download all N shares, not just K shares.
3270+        #  2. It doesn't need to decrypt or decode the shares, only
3271+        #     verify them.
3272         if not self.best_version:
3273             return
3274hunk ./src/allmydata/mutable/checker.py 97
3275-        versionmap = servermap.make_versionmap()
3276-        shares = versionmap[self.best_version]
3277-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3278-         offsets_tuple) = self.best_version
3279-        offsets = dict(offsets_tuple)
3280-        readv = [ (0, offsets["EOF"]) ]
3281-        dl = []
3282-        for (shnum, peerid, timestamp) in shares:
3283-            ss = servermap.connections[peerid]
3284-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3285-            d.addCallback(self._got_answer, peerid, servermap)
3286-            dl.append(d)
3287-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3288 
3289hunk ./src/allmydata/mutable/checker.py 98
3290-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3291-        # isolate the callRemote to a separate method, so tests can subclass
3292-        # Publish and override it
3293-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3294+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3295+        d = r.download()
3296+        d.addCallback(self._process_bad_shares)
3297         return d
3298 
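What verify=True asks of the new downloader, reduced to a toy: look at every share rather than stopping at k, and hand back the list of bad shares instead of plaintext. Everything below is an illustrative stand-in, not the Retrieve API:

    class CorruptShare(Exception):
        pass

    def fetch_and_check(share):
        # stand-in for the per-share hash checks in Retrieve
        if share.get("corrupt"):
            raise CorruptShare(share["shnum"])
        return share["data"]

    def run(shares, k, verify=False):
        good, bad = [], []
        for share in shares:
            try:
                good.append(fetch_and_check(share))
            except CorruptShare:
                bad.append(share["shnum"])
                continue
            if not verify and len(good) >= k:
                break                  # a plain download stops at k shares
        if verify:
            return bad                 # the checker wants the bad ones
        return "".join(good[:k])

    shares = [{"shnum": i, "data": "d%d" % i, "corrupt": (i == 4)}
              for i in range(10)]
    assert run(shares, k=3) == "d0d1d2"
    assert run(shares, k=3, verify=True) == [4]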
3299hunk ./src/allmydata/mutable/checker.py 103
3300-    def _got_answer(self, datavs, peerid, servermap):
3301-        for shnum,datav in datavs.items():
3302-            data = datav[0]
3303-            try:
3304-                self._got_results_one_share(shnum, peerid, data)
3305-            except CorruptShareError:
3306-                f = failure.Failure()
3307-                self.need_repair = True
3308-                self.bad_shares.append( (peerid, shnum, f) )
3309-                prefix = data[:SIGNED_PREFIX_LENGTH]
3310-                servermap.mark_bad_share(peerid, shnum, prefix)
3311-                ss = servermap.connections[peerid]
3312-                self.notify_server_corruption(ss, shnum, str(f.value))
3313-
3314-    def check_prefix(self, peerid, shnum, data):
3315-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3316-         offsets_tuple) = self.best_version
3317-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3318-        if got_prefix != prefix:
3319-            raise CorruptShareError(peerid, shnum,
3320-                                    "prefix mismatch: share changed while we were reading it")
3321-
3322-    def _got_results_one_share(self, shnum, peerid, data):
3323-        self.check_prefix(peerid, shnum, data)
3324-
3325-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3326-        # which checks their signature against the pubkey known to be
3327-        # associated with this file.
3328 
3329hunk ./src/allmydata/mutable/checker.py 104
3330-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3331-         share_hash_chain, block_hash_tree, share_data,
3332-         enc_privkey) = unpack_share(data)
3333-
3334-        # validate [share_hash_chain,block_hash_tree,share_data]
3335-
3336-        leaves = [hashutil.block_hash(share_data)]
3337-        t = hashtree.HashTree(leaves)
3338-        if list(t) != block_hash_tree:
3339-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3340-        share_hash_leaf = t[0]
3341-        t2 = hashtree.IncompleteHashTree(N)
3342-        # root_hash was checked by the signature
3343-        t2.set_hashes({0: root_hash})
3344-        try:
3345-            t2.set_hashes(hashes=share_hash_chain,
3346-                          leaves={shnum: share_hash_leaf})
3347-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3348-                IndexError), e:
3349-            msg = "corrupt hashes: %s" % (e,)
3350-            raise CorruptShareError(peerid, shnum, msg)
3351-
3352-        # validate enc_privkey: only possible if we have a write-cap
3353-        if not self._node.is_readonly():
3354-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3355-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3356-            if alleged_writekey != self._node.get_writekey():
3357-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3358+    def _process_bad_shares(self, bad_shares):
3359+        if bad_shares:
3360+            self.need_repair = True
3361+        self.bad_shares = bad_shares
3362 
3363hunk ./src/allmydata/mutable/checker.py 109
3364-    def notify_server_corruption(self, ss, shnum, reason):
3365-        ss.callRemoteOnly("advise_corrupt_share",
3366-                          "mutable", self._storage_index, shnum, reason)
3367 
3368     def _count_shares(self, smap, version):
3369         available_shares = smap.shares_available()
3370hunk ./src/allmydata/test/test_mutable.py 193
3371                 if offset1 == "pubkey" and IV:
3372                     real_offset = 107
3373                 elif offset1 == "share_data" and not IV:
3374-                    real_offset = 104
3375+                    real_offset = 107
3376                 elif offset1 in o:
3377                     real_offset = o[offset1]
3378                 else:
3379hunk ./src/allmydata/test/test_mutable.py 395
3380             return d
3381         d.addCallback(_created)
3382         return d
3383+    test_create_mdmf_with_initial_contents.timeout = 20
3384 
3385 
3386     def test_create_with_initial_contents_function(self):
3387hunk ./src/allmydata/test/test_mutable.py 700
3388                                            k, N, segsize, datalen)
3389                 self.failUnless(p._pubkey.verify(sig_material, signature))
3390                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3391-                self.failUnless(isinstance(share_hash_chain, dict))
3392-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3393+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3394                 for shnum,share_hash in share_hash_chain.items():
3395                     self.failUnless(isinstance(shnum, int))
3396                     self.failUnless(isinstance(share_hash, str))
3397hunk ./src/allmydata/test/test_mutable.py 820
3398                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3399 
3400 
3401+
3402+
3403 class Servermap(unittest.TestCase, PublishMixin):
3404     def setUp(self):
3405         return self.publish_one()
3406hunk ./src/allmydata/test/test_mutable.py 951
3407         self._storage._peers = {} # delete all shares
3408         ms = self.make_servermap
3409         d = defer.succeed(None)
3410-
3411+#
3412         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3413         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3414 
3415hunk ./src/allmydata/test/test_mutable.py 1440
3416         d.addCallback(self.check_good, "test_check_good")
3417         return d
3418 
3419+    def test_check_mdmf_good(self):
3420+        d = self.publish_mdmf()
3421+        d.addCallback(lambda ignored:
3422+            self._fn.check(Monitor()))
3423+        d.addCallback(self.check_good, "test_check_mdmf_good")
3424+        return d
3425+
3426     def test_check_no_shares(self):
3427         for shares in self._storage._peers.values():
3428             shares.clear()
3429hunk ./src/allmydata/test/test_mutable.py 1454
3430         d.addCallback(self.check_bad, "test_check_no_shares")
3431         return d
3432 
3433+    def test_check_mdmf_no_shares(self):
3434+        d = self.publish_mdmf()
3435+        def _then(ignored):
3436+            for share in self._storage._peers.values():
3437+                share.clear()
3438+        d.addCallback(_then)
3439+        d.addCallback(lambda ignored:
3440+            self._fn.check(Monitor()))
3441+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3442+        return d
3443+
3444     def test_check_not_enough_shares(self):
3445         for shares in self._storage._peers.values():
3446             for shnum in shares.keys():
3447hunk ./src/allmydata/test/test_mutable.py 1474
3448         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3449         return d
3450 
3451+    def test_check_mdmf_not_enough_shares(self):
3452+        d = self.publish_mdmf()
3453+        def _then(ignored):
3454+            for shares in self._storage._peers.values():
3455+                for shnum in shares.keys():
3456+                    if shnum > 0:
3457+                        del shares[shnum]
3458+        d.addCallback(_then)
3459+        d.addCallback(lambda ignored:
3460+            self._fn.check(Monitor()))
3461+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
3462+        return d
3463+
3464+
3465     def test_check_all_bad_sig(self):
3466         d = corrupt(None, self._storage, 1) # bad sig
3467         d.addCallback(lambda ignored:
3468hunk ./src/allmydata/test/test_mutable.py 1495
3469         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3470         return d
3471 
3472+    def test_check_mdmf_all_bad_sig(self):
3473+        d = self.publish_mdmf()
3474+        d.addCallback(lambda ignored:
3475+            corrupt(None, self._storage, 1))
3476+        d.addCallback(lambda ignored:
3477+            self._fn.check(Monitor()))
3478+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3479+        return d
3480+
3481     def test_check_all_bad_blocks(self):
3482         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3483         # the Checker won't notice this.. it doesn't look at actual data
3484hunk ./src/allmydata/test/test_mutable.py 1512
3485         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3486         return d
3487 
3488+
3489+    def test_check_mdmf_all_bad_blocks(self):
3490+        d = self.publish_mdmf()
3491+        d.addCallback(lambda ignored:
3492+            corrupt(None, self._storage, "share_data"))
3493+        d.addCallback(lambda ignored:
3494+            self._fn.check(Monitor()))
3495+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3496+        return d
3497+
3498     def test_verify_good(self):
3499         d = self._fn.check(Monitor(), verify=True)
3500         d.addCallback(self.check_good, "test_verify_good")
3501hunk ./src/allmydata/test/test_mutable.py 1582
3502                       "test_verify_one_bad_encprivkey_uncheckable")
3503         return d
3504 
3505+
3506+    def test_verify_mdmf_good(self):
3507+        d = self.publish_mdmf()
3508+        d.addCallback(lambda ignored:
3509+            self._fn.check(Monitor(), verify=True))
3510+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3511+        return d
3512+
3513+
3514+    def test_verify_mdmf_one_bad_block(self):
3515+        d = self.publish_mdmf()
3516+        d.addCallback(lambda ignored:
3517+            corrupt(None, self._storage, "share_data", [1]))
3518+        d.addCallback(lambda ignored:
3519+            self._fn.check(Monitor(), verify=True))
3520+        # We should find one bad block here
3521+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3522+        d.addCallback(self.check_expected_failure,
3523+                      CorruptShareError, "block hash tree failure",
3524+                      "test_verify_mdmf_one_bad_block")
3525+        return d
3526+
3527+
3528+    def test_verify_mdmf_bad_encprivkey(self):
3529+        d = self.publish_mdmf()
3530+        d.addCallback(lambda ignored:
3531+            corrupt(None, self._storage, "enc_privkey", [1]))
3532+        d.addCallback(lambda ignored:
3533+            self._fn.check(Monitor(), verify=True))
3534+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3535+        d.addCallback(self.check_expected_failure,
3536+                      CorruptShareError, "privkey",
3537+                      "test_verify_mdmf_bad_encprivkey")
3538+        return d
3539+
3540+
3541+    def test_verify_mdmf_bad_sig(self):
3542+        d = self.publish_mdmf()
3543+        d.addCallback(lambda ignored:
3544+            corrupt(None, self._storage, 1, [1]))
3545+        d.addCallback(lambda ignored:
3546+            self._fn.check(Monitor(), verify=True))
3547+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3548+        return d
3549+
3550+
3551+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3552+        d = self.publish_mdmf()
3553+        d.addCallback(lambda ignored:
3554+            corrupt(None, self._storage, "enc_privkey", [1]))
3555+        d.addCallback(lambda ignored:
3556+            self._fn.get_readonly())
3557+        d.addCallback(lambda fn:
3558+            fn.check(Monitor(), verify=True))
3559+        d.addCallback(self.check_good,
3560+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3561+        return d
3562+
3563+
3564 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3565 
3566     def get_shares(self, s):
3567hunk ./src/allmydata/test/test_mutable.py 1706
3568         current_shares = self.old_shares[-1]
3569         self.failUnlessEqual(old_shares, current_shares)
3570 
3571+
3572     def test_unrepairable_0shares(self):
3573         d = self.publish_one()
3574         def _delete_all_shares(ign):
3575hunk ./src/allmydata/test/test_mutable.py 1721
3576         d.addCallback(_check)
3577         return d
3578 
3579+    def test_mdmf_unrepairable_0shares(self):
3580+        d = self.publish_mdmf()
3581+        def _delete_all_shares(ign):
3582+            shares = self._storage._peers
3583+            for peerid in shares:
3584+                shares[peerid] = {}
3585+        d.addCallback(_delete_all_shares)
3586+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3587+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3588+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3589+        return d
3590+
3591+
3592     def test_unrepairable_1share(self):
3593         d = self.publish_one()
3594         def _delete_all_shares(ign):
3595hunk ./src/allmydata/test/test_mutable.py 1750
3596         d.addCallback(_check)
3597         return d
3598 
3599+    def test_mdmf_unrepairable_1share(self):
3600+        d = self.publish_mdmf()
3601+        def _delete_all_shares(ign):
3602+            shares = self._storage._peers
3603+            for peerid in shares:
3604+                for shnum in list(shares[peerid]):
3605+                    if shnum > 0:
3606+                        del shares[peerid][shnum]
3607+        d.addCallback(_delete_all_shares)
3608+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3609+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3610+        def _check(crr):
3611+            self.failUnlessEqual(crr.get_successful(), False)
3612+        d.addCallback(_check)
3613+        return d
3614+
3615+    def test_repairable_5shares(self):
3616+        d = self.publish_mdmf()
3617+        def _delete_all_shares(ign):
3618+            shares = self._storage._peers
3619+            for peerid in shares:
3620+                for shnum in list(shares[peerid]):
3621+                    if shnum > 4:
3622+                        del shares[peerid][shnum]
3623+        d.addCallback(_delete_all_shares)
3624+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3625+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3626+        def _check(crr):
3627+            self.failUnlessEqual(crr.get_successful(), True)
3628+        d.addCallback(_check)
3629+        return d
3630+
3631+    def test_mdmf_repairable_5shares(self):
3632+        d = self.publish_mdmf()
3633+        def _delete_all_shares(ign):
3634+            shares = self._storage._peers
3635+            for peerid in shares:
3636+                for shnum in list(shares[peerid]):
3637+                    if shnum > 5:
3638+                        del shares[peerid][shnum]
3639+        d.addCallback(_delete_all_shares)
3640+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3641+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3642+        def _check(crr):
3643+            self.failUnlessEqual(crr.get_successful(), True)
3644+        d.addCallback(_check)
3645+        return d
3646+
3647+
3648     def test_merge(self):
3649         self.old_shares = []
3650         d = self.publish_multiple()
3651}
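
(Editorial note, not part of the patch: the new MDMF checker and repair tests
above all follow the same shape -- publish an MDMF file, damage the shares held
by the fake storage, then check and possibly repair, asserting on the result.
A minimal sketch of that pattern using the fixtures the tests already rely on
(publish_mdmf, _storage, _fn, check_bad, Monitor); the helper name is
hypothetical.)

    def _damage_then_check(self, damage, name):
        # `damage` is a callable that mangles self._storage._peers in place,
        # e.g. clearing share dicts or deleting individual share numbers.
        d = self.publish_mdmf()
        d.addCallback(damage)
        d.addCallback(lambda ign: self._fn.check(Monitor()))
        d.addCallback(self.check_bad, name)
        return d
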
3652[mutable/retrieve.py: learn how to verify mutable files
3653Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3654 Ignore-this: 989af7800c47589620918461ec989483
3655] {
3656hunk ./src/allmydata/mutable/retrieve.py 86
3657     # Retrieve object will remain tied to a specific version of the file, and
3658     # will use a single ServerMap instance.
3659 
3660-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3661+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3662+                 verify=False):
3663         self._node = filenode
3664         assert self._node.get_pubkey()
3665         self._storage_index = filenode.get_storage_index()
3666hunk ./src/allmydata/mutable/retrieve.py 106
3667         # during repair, we may be called upon to grab the private key, since
3668         # it wasn't picked up during a verify=False checker run, and we'll
3669         # need it for repair to generate a new version.
3670-        self._need_privkey = fetch_privkey
3671-        if self._node.get_privkey():
3672+        self._need_privkey = fetch_privkey or verify
3673+        if self._node.get_privkey() and not verify:
3674             self._need_privkey = False
3675 
3676         if self._need_privkey:
3677hunk ./src/allmydata/mutable/retrieve.py 117
3678             self._privkey_query_markers = [] # one Marker for each time we've
3679                                              # tried to get the privkey.
3680 
3681+        # verify means that we are using the downloader logic to verify all
3682+        # of our shares. This tells the downloader a few things.
3683+        #
3684+        # 1. We need to download all of the shares.
3685+        # 2. We don't need to decode or decrypt the shares, since our
3686+        #    caller doesn't care about the plaintext, only the
3687+        #    information about which shares are or are not valid.
3688+        # 3. When we are validating readers, we need to validate the
3689+        #    signature on the prefix. Do we? We already do this in the
3690+        #    servermap update?
3691+        #
3692+        # (just work on 1 and 2 for now, I guess)
3693+        self._verify = False
3694+        if verify:
3695+            self._verify = True
3696+
3697         self._status = RetrieveStatus()
3698         self._status.set_storage_index(self._storage_index)
3699         self._status.set_helper(False)
3700hunk ./src/allmydata/mutable/retrieve.py 323
3701 
3702         # We need at least self._required_shares readers to download a
3703         # segment.
3704-        needed = self._required_shares - len(self._active_readers)
3705+        if self._verify:
3706+            needed = self._total_shares
3707+        else:
3708+            needed = self._required_shares - len(self._active_readers)
3709         # XXX: Why don't format= log messages work here?
3710         self.log("adding %d peers to the active peers list" % needed)
3711 
3712hunk ./src/allmydata/mutable/retrieve.py 339
3713         # will cause problems later.
3714         active_shnums -= set([reader.shnum for reader in self._active_readers])
3715         active_shnums = list(active_shnums)[:needed]
3716-        if len(active_shnums) < needed:
3717+        if len(active_shnums) < needed and not self._verify:
3718             # We don't have enough readers to retrieve the file; fail.
3719             return self._failed()
3720 
3721hunk ./src/allmydata/mutable/retrieve.py 346
3722         for shnum in active_shnums:
3723             self._active_readers.append(self.readers[shnum])
3724             self.log("added reader for share %d" % shnum)
3725-        assert len(self._active_readers) == self._required_shares
3726+        assert len(self._active_readers) >= self._required_shares
3727         # Conceptually, this is part of the _add_active_peers step. It
3728         # validates the prefixes of newly added readers to make sure
3729         # that they match what we are expecting for self.verinfo. If
3730hunk ./src/allmydata/mutable/retrieve.py 416
3731                     # that we haven't gotten it at the end of
3732                     # segment decoding, then we'll take more drastic
3733                     # measures.
3734-                    if self._need_privkey:
3735+                    if self._need_privkey and not self._node.is_readonly():
3736                         d = reader.get_encprivkey()
3737                         d.addCallback(self._try_to_validate_privkey, reader)
3738             if bad_readers:
3739hunk ./src/allmydata/mutable/retrieve.py 423
3740                 # We do them all at once, or else we screw up list indexing.
3741                 for (reader, f) in bad_readers:
3742                     self._mark_bad_share(reader, f)
3743-                return self._add_active_peers()
3744+                if self._verify:
3745+                    if len(self._active_readers) >= self._required_shares:
3746+                        return self._download_current_segment()
3747+                    else:
3748+                        return self._failed()
3749+                else:
3750+                    return self._add_active_peers()
3751             else:
3752                 return self._download_current_segment()
3753             # The next step will assert that it has enough active
3754hunk ./src/allmydata/mutable/retrieve.py 518
3755         """
3756         self.log("marking share %d on server %s as bad" % \
3757                  (reader.shnum, reader))
3758+        prefix = self.verinfo[-2]
3759+        self.servermap.mark_bad_share(reader.peerid,
3760+                                      reader.shnum,
3761+                                      prefix)
3762         self._remove_reader(reader)
3763hunk ./src/allmydata/mutable/retrieve.py 523
3764-        self._bad_shares.add((reader.peerid, reader.shnum))
3765+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3766         self._status.problems[reader.peerid] = f
3767         self._last_failure = f
3768         self.notify_server_corruption(reader.peerid, reader.shnum,
3769hunk ./src/allmydata/mutable/retrieve.py 571
3770             ds.append(dl)
3771             reader.flush()
3772         dl = defer.DeferredList(ds)
3773-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3774+        if self._verify:
3775+            dl.addCallback(lambda ignored: "")
3776+            dl.addCallback(self._set_segment)
3777+        else:
3778+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3779         return dl
3780 
3781 
3782hunk ./src/allmydata/mutable/retrieve.py 701
3783         # shnum, which will be a leaf in the share hash tree, which
3784         # will allow us to validate the rest of the tree.
3785         if self.share_hash_tree.needed_hashes(reader.shnum,
3786-                                               include_leaf=True):
3787+                                              include_leaf=True) or \
3788+                                              self._verify:
3789             try:
3790                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3791                                             leaves={reader.shnum: bht[0]})
3792hunk ./src/allmydata/mutable/retrieve.py 832
3793 
3794 
3795     def _try_to_validate_privkey(self, enc_privkey, reader):
3796-
3797         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3798         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3799         if alleged_writekey != self._node.get_writekey():
3800hunk ./src/allmydata/mutable/retrieve.py 838
3801             self.log("invalid privkey from %s shnum %d" %
3802                      (reader, reader.shnum),
3803                      level=log.WEIRD, umid="YIw4tA")
3804+            if self._verify:
3805+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3806+                                              self.verinfo[-2])
3807+                e = CorruptShareError(reader.peerid,
3808+                                      reader.shnum,
3809+                                      "invalid privkey")
3810+                f = failure.Failure(e)
3811+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3812             return
3813 
3814         # it's good
3815hunk ./src/allmydata/mutable/retrieve.py 904
3816         statements, I return the decrypted contents to the owner of this
3817         Retrieve object through self._done_deferred.
3818         """
3819-        eventually(self._done_deferred.callback, self._plaintext)
3820+        if self._verify:
3821+            ret = list(self._bad_shares)
3822+            self.log("done verifying, found %d bad shares" % len(ret))
3823+        else:
3824+            ret = self._plaintext
3825+        eventually(self._done_deferred.callback, ret)
3826 
3827 
3828     def _failed(self):
3829hunk ./src/allmydata/mutable/retrieve.py 920
3830         to the caller of this Retrieve object through
3831         self._done_deferred.
3832         """
3833-        format = ("ran out of peers: "
3834-                  "have %(have)d of %(total)d segments "
3835-                  "found %(bad)d bad shares "
3836-                  "encoding %(k)d-of-%(n)d")
3837-        args = {"have": self._current_segment,
3838-                "total": self._num_segments,
3839-                "k": self._required_shares,
3840-                "n": self._total_shares,
3841-                "bad": len(self._bad_shares)}
3842-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3843-                                                        str(self._last_failure)))
3844-        f = failure.Failure(e)
3845-        eventually(self._done_deferred.callback, f)
3846+        if self._verify:
3847+            ret = list(self._bad_shares)
3848+        else:
3849+            format = ("ran out of peers: "
3850+                      "have %(have)d of %(total)d segments "
3851+                      "found %(bad)d bad shares "
3852+                      "encoding %(k)d-of-%(n)d")
3853+            args = {"have": self._current_segment,
3854+                    "total": self._num_segments,
3855+                    "k": self._required_shares,
3856+                    "n": self._total_shares,
3857+                    "bad": len(self._bad_shares)}
3858+            e = NotEnoughSharesError("%s, last failure: %s" % \
3859+                                     (format % args, str(self._last_failure)))
3860+            f = failure.Failure(e)
3861+            ret = f
3862+        eventually(self._done_deferred.callback, ret)
3863}
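
(Editorial note, not part of the patch: with verify=True, the Retrieve changes
above fetch every share, skip decoding and decryption, and fire the Deferred
with a list of (peerid, shnum, failure) tuples for the shares that failed
validation, rather than with plaintext. A minimal sketch of how a caller such
as the mutable checker might drive it; the download() call and the helper name
are assumptions made for illustration.)

    from allmydata.mutable.retrieve import Retrieve

    def verify_version(node, servermap, verinfo):
        r = Retrieve(node, servermap, verinfo, verify=True)
        d = r.download()
        def _done(bad_shares):
            # each entry is (peerid, shnum, failure); an empty list means
            # every share of this version validated.
            return [(peerid, shnum) for (peerid, shnum, f) in bad_shares]
        d.addCallback(_done)
        return d
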
3864[interfaces.py: add IMutableSlotWriter
3865Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3866 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3867] hunk ./src/allmydata/interfaces.py 418
3868         """
3869 
3870 
3871+class IMutableSlotWriter(Interface):
3872+    """
3873+    The interface for a writer around a mutable slot on a remote server.
3874+    """
3875+    def set_checkstring(checkstring, *args):
3876+        """
3877+        Set the checkstring that I will pass to the remote server when
3878+        writing.
3879+
3880+            @param checkstring A packed checkstring to use.
3881+
3882+        Note that implementations can differ in which semantics they
3883+        wish to support for set_checkstring -- they can, for example,
3884+        build the checkstring themselves from its constituents, or
3885+        some other thing.
3886+        """
3887+
3888+    def get_checkstring():
3889+        """
3890+        Get the checkstring that I think currently exists on the remote
3891+        server.
3892+        """
3893+
3894+    def put_block(data, segnum, salt):
3895+        """
3896+        Add a block and salt to the share.
3897+        """
3898+
3899+    def put_encprivkey(encprivkey):
3900+        """
3901+        Add the encrypted private key to the share.
3902+        """
3903+
3904+    def put_blockhashes(blockhashes=list):
3905+        """
3906+        Add the block hash tree to the share.
3907+        """
3908+
3909+    def put_sharehashes(sharehashes=dict):
3910+        """
3911+        Add the share hash chain to the share.
3912+        """
3913+
3914+    def get_signable():
3915+        """
3916+        Return the part of the share that needs to be signed.
3917+        """
3918+
3919+    def put_signature(signature):
3920+        """
3921+        Add the signature to the share.
3922+        """
3923+
3924+    def put_verification_key(verification_key):
3925+        """
3926+        Add the verification key to the share.
3927+        """
3928+
3929+    def finish_publishing():
3930+        """
3931+        Do anything necessary to finish writing the share to a remote
3932+        server. I require that no further publishing needs to take place
3933+        after this method has been called.
3934+        """
3935+
3936+
3937 class IURI(Interface):
3938     def init_from_string(uri):
3939         """Accept a string (as created by my to_string() method) and populate
3940[test/test_mutable.py: temporarily disable two tests that are now irrelevant
3941Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
3942 Ignore-this: 701e143567f3954812ca6960af1d6ac7
3943] {
3944hunk ./src/allmydata/test/test_mutable.py 651
3945             self.failUnlessEqual(len(share_ids), 10)
3946         d.addCallback(_done)
3947         return d
3948+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
3949 
3950     def test_generate(self):
3951         nm = make_nodemaker()
3952hunk ./src/allmydata/test/test_mutable.py 713
3953                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
3954         d.addCallback(_generated)
3955         return d
3956+    test_generate.todo = "Write an equivalent of this for the new uploader"
3957 
3958     # TODO: when we publish to 20 peers, we should get one share per peer on 10
3959     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
3960}
3961[Add MDMF reader and writer, and SDMF writer
3962Kevan Carstensen <kevan@isnotajoke.com>**20100702225531
3963 Ignore-this: bf6276a91d27dcb4e779b0eb82ea1843
3964 
3965 The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
3966 object proxies that exist for immutable files. They abstract away
3967 details of connection, state, and caching from their callers (in this
3968 case, the download, servermap updater, and uploader), and expose methods
3969 to get and set information on the remote server.
3970 
3971 MDMFSlotReadProxy reads a mutable file from the server, doing the right
3972 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
3973 allows callers to tell it how to batch and flush reads.
3974 
3975 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
3976 
3977 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
3978 
3979 This patch also includes tests for MDMFSlotReadProxy,
3980 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
3981] {
3982hunk ./src/allmydata/mutable/layout.py 4
3983 
3984 import struct
3985 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
3986+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
3987+                                 MDMF_VERSION, IMutableSlotWriter
3988+from allmydata.util import mathutil, observer
3989+from twisted.python import failure
3990+from twisted.internet import defer
3991+from zope.interface import implements
3992+
3993+
3994+# These strings describe the format of the packed structs they help process
3995+# Here's what they mean:
3996+#
3997+#  PREFIX:
3998+#    >: Big-endian byte order; the most significant byte is first (leftmost).
3999+#    B: The version information; an 8 bit version identifier. Stored as
4000+#       an unsigned char. This is currently 0; our modifications
4001+#       will turn it into 1 for MDMF shares.
4002+#    Q: The sequence number; this is sort of like a revision history for
4003+#       mutable files; they start at 1 and increase as they are changed after
4004+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
4005+#       length.
4006+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4007+#       characters = 32 bytes to store the value.
4008+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4009+#       16 characters.
4010+#
4011+#  SIGNED_PREFIX additions, things that are covered by the signature:
4012+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4013+#       which is convenient because our erasure coding scheme cannot
4014+#       encode if you ask for more than 255 pieces.
4015+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4016+#       same reasons as above.
4017+#    Q: The segment size of the uploaded file. This will essentially be the
4018+#       length of the file in SDMF. An unsigned long long, so we can store
4019+#       files of quite large size.
4020+#    Q: The data length of the uploaded file. Modulo padding, this will be
4021+#       the same as the segment size field. Like the segment size field, it is
4022+#       an unsigned long long and can be quite large.
4023+#
4024+#   HEADER additions:
4025+#     L: The offset of the signature. An unsigned long.
4026+#     L: The offset of the share hash chain. An unsigned long.
4027+#     L: The offset of the block hash tree. An unsigned long.
4028+#     L: The offset of the share data. An unsigned long.
4029+#     Q: The offset of the encrypted private key. An unsigned long long, to
4030+#        account for the possibility of a lot of share data.
4031+#     Q: The offset of the EOF. An unsigned long long, to account for the
4032+#        possibility of a lot of share data.
4033+#
4034+#  After all of these, we have the following:
4035+#    - The verification key: Occupies the space between the end of the header
4036+#      and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']]).
4037+#    - The signature, which goes from the signature offset to the share hash
4038+#      chain offset.
4039+#    - The share hash chain, which goes from the share hash chain offset to
4040+#      the block hash tree offset.
4041+#    - The share data, which goes from the share data offset to the encrypted
4042+#      private key offset.
4043+#    - The encrypted private key, which goes from its offset until the end of the file.
4044+#
4045+#  The block hash tree in this encoding has only one leaf, so the offset of
4046+#  the share data will be 32 bytes more than the offset of the block hash tree.
4047+#  Given this, we may need to check to see how many bytes a reasonably sized
4048+#  block hash tree will take up.
4049 
4050 PREFIX = ">BQ32s16s" # each version has a different prefix
4051 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
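
(Editorial note, not part of the patch: the sizes implied by the format strings
documented above can be checked with struct.calcsize -- the signed prefix is 75
bytes, the six offsets add 32, so the full SDMF header is 107 bytes.)

    import struct

    PREFIX        = ">BQ32s16s"          # version, seqnum, root hash, IV
    SIGNED_PREFIX = ">BQ32s16s BBQQ"     # + k, N, segment size, data length
    OFFSETS       = ">LLLLQQ"            # signature .. EOF offsets
    HEADER        = ">BQ32s16s BBQQ LLLLQQ"

    assert struct.calcsize(PREFIX) == 57
    assert struct.calcsize(SIGNED_PREFIX) == 75
    assert struct.calcsize(OFFSETS) == 32
    assert struct.calcsize(HEADER) == 75 + 32   # the 107-byte SDMF header
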
4052hunk ./src/allmydata/mutable/layout.py 73
4053 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4054 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4055 HEADER_LENGTH = struct.calcsize(HEADER)
4056+OFFSETS = ">LLLLQQ"
4057+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4058 
4059 def unpack_header(data):
4060     o = {}
4061hunk ./src/allmydata/mutable/layout.py 194
4062     return (share_hash_chain, block_hash_tree, share_data)
4063 
4064 
4065-def pack_checkstring(seqnum, root_hash, IV):
4066+def pack_checkstring(seqnum, root_hash, IV, version=0):
4067     return struct.pack(PREFIX,
4068hunk ./src/allmydata/mutable/layout.py 196
4069-                       0, # version,
4070+                       version,
4071                        seqnum,
4072                        root_hash,
4073                        IV)
4074hunk ./src/allmydata/mutable/layout.py 269
4075                            encprivkey])
4076     return final_share
4077 
4078+def pack_prefix(seqnum, root_hash, IV,
4079+                required_shares, total_shares,
4080+                segment_size, data_length):
4081+    prefix = struct.pack(SIGNED_PREFIX,
4082+                         0, # version,
4083+                         seqnum,
4084+                         root_hash,
4085+                         IV,
4086+                         required_shares,
4087+                         total_shares,
4088+                         segment_size,
4089+                         data_length,
4090+                         )
4091+    return prefix
4092+
4093+
4094+class SDMFSlotWriteProxy:
4095+    implements(IMutableSlotWriter)
4096+    """
4097+    I represent a remote write slot for an SDMF mutable file. I build a
4098+    share in memory, and then write it in one piece to the remote
4099+    server. This mimics how SDMF shares were built before MDMF (and the
4100+    new MDMF uploader), but provides that functionality in a way that
4101+    allows the MDMF uploader to be built without much special-casing for
4102+    file format, which makes the uploader code more readable.
4103+    """
4104+    def __init__(self,
4105+                 shnum,
4106+                 rref, # a remote reference to a storage server
4107+                 storage_index,
4108+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4109+                 seqnum, # the sequence number of the mutable file
4110+                 required_shares,
4111+                 total_shares,
4112+                 segment_size,
4113+                 data_length): # the length of the original file
4114+        self.shnum = shnum
4115+        self._rref = rref
4116+        self._storage_index = storage_index
4117+        self._secrets = secrets
4118+        self._seqnum = seqnum
4119+        self._required_shares = required_shares
4120+        self._total_shares = total_shares
4121+        self._segment_size = segment_size
4122+        self._data_length = data_length
4123+
4124+        # This is an SDMF file, so it should have only one segment, so,
4125+        # modulo padding of the data length, the segment size and the
4126+        # data length should be the same.
4127+        expected_segment_size = mathutil.next_multiple(data_length,
4128+                                                       self._required_shares)
4129+        assert expected_segment_size == segment_size
4130+
4131+        self._block_size = self._segment_size / self._required_shares
4132+
4133+        # This is meant to mimic how SDMF files were built before MDMF
4134+        # entered the picture: we generate each share in its entirety,
4135+        # then push it off to the storage server in one write. When
4136+        # callers call set_*, they are just populating this dict.
4137+        # finish_publishing will stitch these pieces together into a
4138+        # coherent share, and then write the coherent share to the
4139+        # storage server.
4140+        self._share_pieces = {}
4141+
4142+        # This tells the write logic what checkstring to use when
4143+        # writing remote shares.
4144+        self._testvs = []
4145+
4146+        self._readvs = [(0, struct.calcsize(PREFIX))]
4147+
4148+
4149+    def set_checkstring(self, checkstring_or_seqnum,
4150+                              root_hash=None,
4151+                              salt=None):
4152+        """
4153+        Set the checkstring that I will pass to the remote server when
4154+        writing.
4155+
4156+            @param checkstring_or_seqnum: A packed checkstring to use,
4157+                   or a sequence number. A single argument is treated as a literal checkstring.
4158+
4159+        Note that implementations can differ in which semantics they
4160+        wish to support for set_checkstring -- they can, for example,
4161+        build the checkstring themselves from its constituents, or
4162+        some other thing.
4163+        """
4164+        if root_hash and salt:
4165+            checkstring = struct.pack(PREFIX,
4166+                                      0,
4167+                                      checkstring_or_seqnum,
4168+                                      root_hash,
4169+                                      salt)
4170+        else:
4171+            checkstring = checkstring_or_seqnum
4172+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4173+
4174+
4175+    def get_checkstring(self):
4176+        """
4177+        Get the checkstring that I think currently exists on the remote
4178+        server.
4179+        """
4180+        if self._testvs:
4181+            return self._testvs[0][3]
4182+        return ""
4183+
4184+
4185+    def put_block(self, data, segnum, salt):
4186+        """
4187+        Add a block and salt to the share.
4188+        """
4189+        # SDMF files have only one segment
4190+        assert segnum == 0
4191+        assert len(data) == self._block_size
4192+        assert len(salt) == SALT_SIZE
4193+
4194+        self._share_pieces['sharedata'] = data
4195+        self._share_pieces['salt'] = salt
4196+
4197+        # TODO: Figure out something intelligent to return.
4198+        return defer.succeed(None)
4199+
4200+
4201+    def put_encprivkey(self, encprivkey):
4202+        """
4203+        Add the encrypted private key to the share.
4204+        """
4205+        self._share_pieces['encprivkey'] = encprivkey
4206+
4207+        return defer.succeed(None)
4208+
4209+
4210+    def put_blockhashes(self, blockhashes):
4211+        """
4212+        Add the block hash tree to the share.
4213+        """
4214+        assert isinstance(blockhashes, list)
4215+        for h in blockhashes:
4216+            assert len(h) == HASH_SIZE
4217+
4218+        # serialize the blockhashes, then set them.
4219+        blockhashes_s = "".join(blockhashes)
4220+        self._share_pieces['block_hash_tree'] = blockhashes_s
4221+
4222+        return defer.succeed(None)
4223+
4224+
4225+    def put_sharehashes(self, sharehashes):
4226+        """
4227+        Add the share hash chain to the share.
4228+        """
4229+        assert isinstance(sharehashes, dict)
4230+        for h in sharehashes.itervalues():
4231+            assert len(h) == HASH_SIZE
4232+
4233+        # serialize the sharehashes, then set them.
4234+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4235+                                 for i in sorted(sharehashes.keys())])
4236+        self._share_pieces['share_hash_chain'] = sharehashes_s
4237+
4238+        return defer.succeed(None)
4239+
4240+
4241+    def put_root_hash(self, root_hash):
4242+        """
4243+        Add the root hash to the share.
4244+        """
4245+        assert len(root_hash) == HASH_SIZE
4246+
4247+        self._share_pieces['root_hash'] = root_hash
4248+
4249+        return defer.succeed(None)
4250+
4251+
4252+    def put_salt(self, salt):
4253+        """
4254+        Add a salt to an empty SDMF file.
4255+        """
4256+        assert len(salt) == SALT_SIZE
4257+
4258+        self._share_pieces['salt'] = salt
4259+        self._share_pieces['sharedata'] = ""
4260+
4261+
4262+    def get_signable(self):
4263+        """
4264+        Return the part of the share that needs to be signed.
4265+
4266+        SDMF writers need to sign the packed representation of the
4267+        first eight fields of the remote share, that is:
4268+            - version number (0)
4269+            - sequence number
4270+            - root of the share hash tree
4271+            - salt
4272+            - k
4273+            - n
4274+            - segsize
4275+            - datalen
4276+
4277+        This method is responsible for returning that to callers.
4278+        """
4279+        return struct.pack(SIGNED_PREFIX,
4280+                           0,
4281+                           self._seqnum,
4282+                           self._share_pieces['root_hash'],
4283+                           self._share_pieces['salt'],
4284+                           self._required_shares,
4285+                           self._total_shares,
4286+                           self._segment_size,
4287+                           self._data_length)
4288+
4289+
4290+    def put_signature(self, signature):
4291+        """
4292+        Add the signature to the share.
4293+        """
4294+        self._share_pieces['signature'] = signature
4295+
4296+        return defer.succeed(None)
4297+
4298+
4299+    def put_verification_key(self, verification_key):
4300+        """
4301+        Add the verification key to the share.
4302+        """
4303+        self._share_pieces['verification_key'] = verification_key
4304+
4305+        return defer.succeed(None)
4306+
4307+
4308+    def get_verinfo(self):
4309+        """
4310+        I return my verinfo tuple. This is used by the ServermapUpdater
4311+        to keep track of versions of mutable files.
4312+
4313+        The verinfo tuple for MDMF files contains:
4314+            - seqnum
4315+            - root hash
4316+            - a blank (nothing)
4317+            - segsize
4318+            - datalen
4319+            - k
4320+            - n
4321+            - prefix (the thing that you sign)
4322+            - a tuple of offsets
4323+
4324+        We include the nonce in MDMF to simplify processing of version
4325+        information tuples.
4326+
4327+        The verinfo tuple for SDMF files is the same, but contains a
4328+        16-byte IV instead of a hash of salts.
4329+        """
4330+        return (self._seqnum,
4331+                self._share_pieces['root_hash'],
4332+                self._share_pieces['salt'],
4333+                self._segment_size,
4334+                self._data_length,
4335+                self._required_shares,
4336+                self._total_shares,
4337+                self.get_signable(),
4338+                self._get_offsets_tuple())
4339+
4340+    def _get_offsets_dict(self):
4341+        post_offset = HEADER_LENGTH
4342+        offsets = {}
4343+
4344+        verification_key_length = len(self._share_pieces['verification_key'])
4345+        o1 = offsets['signature'] = post_offset + verification_key_length
4346+
4347+        signature_length = len(self._share_pieces['signature'])
4348+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4349+
4350+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4351+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4352+
4353+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4354+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4355+
4356+        share_data_length = len(self._share_pieces['sharedata'])
4357+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4358+
4359+        encprivkey_length = len(self._share_pieces['encprivkey'])
4360+        offsets['EOF'] = o5 + encprivkey_length
4361+        return offsets
4362+
4363+
4364+    def _get_offsets_tuple(self):
4365+        offsets = self._get_offsets_dict()
4366+        return tuple([(key, value) for key, value in offsets.items()])
4367+
4368+
4369+    def _pack_offsets(self):
4370+        offsets = self._get_offsets_dict()
4371+        return struct.pack(">LLLLQQ",
4372+                           offsets['signature'],
4373+                           offsets['share_hash_chain'],
4374+                           offsets['block_hash_tree'],
4375+                           offsets['share_data'],
4376+                           offsets['enc_privkey'],
4377+                           offsets['EOF'])
4378+
4379+
4380+    def finish_publishing(self):
4381+        """
4382+        Do anything necessary to finish writing the share to a remote
4383+        server. I require that no further publishing needs to take place
4384+        after this method has been called.
4385+        """
4386+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4387+                  "share_hash_chain", "block_hash_tree"]:
4388+            assert k in self._share_pieces
4389+        # This is the only method that actually writes something to the
4390+        # remote server.
4391+        # First, we need to pack the share into data that we can write
4392+        # to the remote server in one write.
4393+        offsets = self._pack_offsets()
4394+        prefix = self.get_signable()
4395+        final_share = "".join([prefix,
4396+                               offsets,
4397+                               self._share_pieces['verification_key'],
4398+                               self._share_pieces['signature'],
4399+                               self._share_pieces['share_hash_chain'],
4400+                               self._share_pieces['block_hash_tree'],
4401+                               self._share_pieces['sharedata'],
4402+                               self._share_pieces['encprivkey']])
4403+
4404+        # Our only data vector is going to be writing the final share,
4405+        # in its entirety.
4406+        datavs = [(0, final_share)]
4407+
4408+        if not self._testvs:
4409+            # Our caller has not provided us with another checkstring
4410+            # yet, so we assume that we are writing a new share, and set
4411+            # a test vector that will allow a new share to be written.
4412+            self._testvs = []
4413+            self._testvs.append(tuple([0, 1, "eq", ""]))
4414+            new_share = True
4415+
4416+        tw_vectors = {}
4417+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4418+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4419+                                     self._storage_index,
4420+                                     self._secrets,
4421+                                     tw_vectors,
4422+                                     # TODO is it useful to read something?
4423+                                     self._readvs)
4424+
4425+
4426+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4427+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4428+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4429+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4430+MDMFCHECKSTRING = ">BQ32s"
4431+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4432+MDMFOFFSETS = ">QQQQQQ"
4433+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4434+
4435+class MDMFSlotWriteProxy:
4436+    implements(IMutableSlotWriter)
4437+
4438+    """
4439+    I represent a remote write slot for an MDMF mutable file.
4440+
4441+    I abstract away from my caller the details of block and salt
4442+    management, and the implementation of the on-disk format for MDMF
4443+    shares.
4444+    """
4445+    # Expected layout, MDMF:
4446+    # offset:     size:       name:
4447+    #-- signed part --
4448+    # 0           1           version number (01)
4449+    # 1           8           sequence number
4450+    # 9           32          share tree root hash
4451+    # 41          1           The "k" encoding parameter
4452+    # 42          1           The "N" encoding parameter
4453+    # 43          8           The segment size of the uploaded file
4454+    # 51          8           The data length of the original plaintext
4455+    #-- end signed part --
4456+    # 59          8           The offset of the encrypted private key
4457+    # 67          8           The offset of the block hash tree
4458+    # 75          8           The offset of the share hash chain
4459+    # 83          8           The offset of the signature
4460+    # 91          8           The offset of the verification key
4461+    # 99          8           The offset of the EOF
4462+    #
4463+    # followed by salts and share data, the encrypted private key, the
4464+    # block hash tree, the salt hash tree, the share hash chain, a
4465+    # signature over the first eight fields, and a verification key.
4466+    #
4467+    # The checkstring is the first three fields -- the version number,
4468+    # sequence number, root hash and root salt hash. This is consistent
4469+    # in meaning to what we have with SDMF files, except now instead of
4470+    # using the literal salt, we use a value derived from all of the
4471+    # salts -- the share hash root.
4472+    #
4473+    # The salt is stored before the block for each segment. The block
4474+    # hash tree is computed over the combination of block and salt for
4475+    # each segment. In this way, we get integrity checking for both
4476+    # block and salt with the current block hash tree arrangement.
4477+    #
4478+    # The ordering of the offsets is different to reflect the dependencies
4479+    # that we'll run into with an MDMF file. The expected write flow is
4480+    # something like this:
4481+    #
4482+    #   0: Initialize with the sequence number, encoding parameters and
4483+    #      data length. From this, we can deduce the number of segments,
4484+    #      and where they should go. We can also figure out where the
4485+    #      encrypted private key should go, because we can figure out how
4486+    #      big the share data will be.
4487+    #
4488+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4489+    #      like
4490+    #
4491+    #       put_block(data, segnum, salt)
4492+    #
4493+    #      to write a block and a salt to the disk. We can do both of
4494+    #      these operations now because we have enough of the offsets to
4495+    #      know where to put them.
4496+    #
4497+    #   2: Put the encrypted private key. Use:
4498+    #
4499+    #        put_encprivkey(encprivkey)
4500+    #
4501+    #      Now that we know the length of the private key, we can fill
4502+    #      in the offset for the block hash tree.
4503+    #
4504+    #   3: We're now in a position to upload the block hash tree for
4505+    #      a share. Put that using something like:
4506+    #       
4507+    #        put_blockhashes(block_hash_tree)
4508+    #
4509+    #      Note that block_hash_tree is a list of hashes -- we'll take
4510+    #      care of the details of serializing that appropriately. When
4511+    #      we get the block hash tree, we are also in a position to
4512+    #      calculate the offset for the share hash chain, and fill that
4513+    #      into the offsets table.
4514+    #
4515+    #   4: At the same time, we're in a position to upload the salt hash
4516+    #      tree. This is a Merkle tree over all of the salts. We use a
4517+    #      Merkle tree so that we can validate each block,salt pair as
4518+    #      we download them later. We do this using
4519+    #
4520+    #        put_salthashes(salt_hash_tree)
4521+    #
4522+    #      When you do this, I automatically put the root of the tree
4523+    #      (the hash at index 0 of the list) in its appropriate slot in
4524+    #      the signed prefix of the share.
4525+    #
4526+    #   5: We're now in a position to upload the share hash chain for
4527+    #      a share. Do that with something like:
4528+    #     
4529+    #        put_sharehashes(share_hash_chain)
4530+    #
4531+    #      share_hash_chain should be a dictionary mapping shnums to
4532+    #      32-byte hashes -- the wrapper handles serialization.
4533+    #      We'll know where to put the signature at this point, also.
4534+    #      The root of this tree will be put explicitly in the next
4535+    #      step.
4536+    #
4537+    #      TODO: Why? Why not just include it in the tree here?
4538+    #
4539+    #   6: Before putting the signature, we must first put the
4540+    #      root_hash. Do this with:
4541+    #
4542+    #        put_root_hash(root_hash).
4543+    #     
4544+    #      In terms of knowing where to put this value, it was always
4545+    #      possible to place it, but it makes sense semantically to
4546+    #      place it after the share hash tree, so that's why you do it
4547+    #      in this order.
4548+    #
4549+    #   7: With the root hash put, we can now sign the header. Use:
4550+    #
4551+    #        get_signable()
4552+    #
4553+    #      to get the part of the header that you want to sign, and use:
4554+    #       
4555+    #        put_signature(signature)
4556+    #
4557+    #      to write your signature to the remote server.
4558+    #
4559+    #   8: Add the verification key, and finish. Do:
4560+    #
4561+    #        put_verification_key(key)
4562+    #
4563+    #      and
4564+    #
4565+    #        finish_publishing()
4566+    #
4567+    # Checkstring management:
4568+    #
4569+    # To write to a mutable slot, we have to provide test vectors to ensure
4570+    # that we are writing to the same data that we think we are. These
4571+    # vectors allow us to detect uncoordinated writes; that is, writes
4572+    # where both we and some other shareholder are writing to the
4573+    # mutable slot, and to report those back to the parts of the program
4574+    # doing the writing.
4575+    #
4576+    # With SDMF, this was easy -- all of the share data was written in
4577+    # one go, so it was easy to detect uncoordinated writes, and we only
4578+    # had to do it once. With MDMF, not all of the file is written at
4579+    # once.
4580+    #
4581+    # If a share is new, we write out as much of the header as we can
4582+    # before writing out anything else. This gives other writers a
4583+    # canary that they can use to detect uncoordinated writes, and, if
4584+    # they do the same thing, gives us the same canary. We then update
4585+    # the share. We won't be able to write out two fields of the header
4586+    # -- the share tree hash and the salt hash -- until we finish
4587+    # writing out the share. We only require the writer to provide the
4588+    # initial checkstring, and keep track of what it should be after
4589+    # updates ourselves.
4590+    #
4591+    # If we haven't written anything yet, then on the first write (which
4592+    # will probably be a block + salt of a share), we'll also write out
4593+    # the header. On subsequent passes, we'll expect to see the header.
4594+    # This changes in two places:
4595+    #
4596+    #   - When we write out the salt hash
4597+    #   - When we write out the root of the share hash tree
4598+    #
4599+    # since these values will change the header. It is possible that we
4600+    # can just make those be written in one operation to minimize
4601+    # disruption.
4602+    def __init__(self,
4603+                 shnum,
4604+                 rref, # a remote reference to a storage server
4605+                 storage_index,
4606+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4607+                 seqnum, # the sequence number of the mutable file
4608+                 required_shares,
4609+                 total_shares,
4610+                 segment_size,
4611+                 data_length): # the length of the original file
4612+        self.shnum = shnum
4613+        self._rref = rref
4614+        self._storage_index = storage_index
4615+        self._seqnum = seqnum
4616+        self._required_shares = required_shares
4617+        assert self.shnum >= 0 and self.shnum < total_shares
4618+        self._total_shares = total_shares
4619+        # We build up the offset table as we write things. It is the
4620+        # last thing we write to the remote server.
4621+        self._offsets = {}
4622+        self._testvs = []
4623+        self._secrets = secrets
4624+        # The segment size needs to be a multiple of the k parameter --
4625+        # any padding should have been carried out by the publisher
4626+        # already.
4627+        assert segment_size % required_shares == 0
4628+        self._segment_size = segment_size
4629+        self._data_length = data_length
4630+
4631+        # These are set later -- we define them here so that we can
4632+        # check for their existence easily
4633+
4634+        # This is the root of the share hash tree -- the Merkle tree
4635+        # over the roots of the block hash trees computed for shares in
4636+        # this upload.
4637+        self._root_hash = None
4638+
4639+        # We haven't yet written anything to the remote bucket. By
4640+        # setting this, we tell the _write method as much. The write
4641+        # method will then know that it also needs to add a write vector
4642+        # for the checkstring (or what we have of it) to the first write
4643+        # request. We'll then record that value for future use.  If
4644+        # we're expecting something to be there already, we need to call
4645+        # set_checkstring before we write anything to tell the first
4646+        # write about that.
4647+        self._written = False
4648+
4649+        # When writing data to the storage servers, we get a read vector
4650+        # for free. We'll read the checkstring, which will help us
4651+        # figure out what's gone wrong if a write fails.
4652+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4653+
4654+        # We calculate the number of segments because it tells us
4655+        # where the salt part of the file ends/share segment begins,
4656+        # and also because it provides a useful amount of bounds checking.
4657+        self._num_segments = mathutil.div_ceil(self._data_length,
4658+                                               self._segment_size)
4659+        self._block_size = self._segment_size / self._required_shares
4660+        # We also calculate the share size, to help us with block
4661+        # constraints later.
4662+        tail_size = self._data_length % self._segment_size
4663+        if not tail_size:
4664+            self._tail_block_size = self._block_size
4665+        else:
4666+            self._tail_block_size = mathutil.next_multiple(tail_size,
4667+                                                           self._required_shares)
4668+            self._tail_block_size /= self._required_shares
4669+
4670+        # We already know where the sharedata starts; right after the end
4671+        # of the header (which is defined as the signable part + the offsets)
4672+        # We can also calculate where the encrypted private key begins
4673+        # from what we now know.
4674+        self._actual_block_size = self._block_size + SALT_SIZE
4675+        data_size = self._actual_block_size * (self._num_segments - 1)
4676+        data_size += self._tail_block_size
4677+        data_size += SALT_SIZE
4678+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
4679+        self._offsets['enc_privkey'] += data_size
4680+        # We'll wait for the rest. Callers can now call my "put_block" and
4681+        # "set_checkstring" methods.
4682+
4683+
4684+    def set_checkstring(self,
4685+                        seqnum_or_checkstring,
4686+                        root_hash=None,
4687+                        salt=None):
4688+        """
4689+        Set the checkstring for the given shnum.
4690+
4691+        This can be invoked in one of two ways.
4692+
4693+        With one argument, I assume that you are giving me a literal
4694+        checkstring -- e.g., the output of get_checkstring. I will then
4695+        set that checkstring as it is. This form is used by unit tests.
4696+
4697+        With two arguments, I assume that you are giving me a sequence
4698+        number and root hash to make a checkstring from. In that case, I
4699+        will build a checkstring and set it for you. This form is used
4700+        by the publisher.
4701+
4702+        By default, I assume that I am writing new shares to the grid.
4703+        If you don't explicitly set your own checkstring, I will use
4704+        one that requires that the remote share not exist. You will want
4705+        to use this method if you are updating a share in-place;
4706+        otherwise, writes will fail.
4707+        """
4708+        # You're allowed to overwrite checkstrings with this method;
4709+        # I assume that users know what they are doing when they call
4710+        # it.
4711+        if root_hash:
4712+            checkstring = struct.pack(MDMFCHECKSTRING,
4713+                                      1,
4714+                                      seqnum_or_checkstring,
4715+                                      root_hash)
4716+        else:
4717+            checkstring = seqnum_or_checkstring
4718+
4719+        if checkstring == "":
4720+            # We special-case this, since len("") = 0, but we need
4721+            # length of 1 for the case of an empty share to work on the
4722+            # storage server, which is what a checkstring that is the
4723+            # empty string means.
4724+            self._testvs = []
4725+        else:
4726+            self._testvs = []
4727+            self._testvs.append((0, len(checkstring), "eq", checkstring))
4728+
4729+
4730+    def __repr__(self):
4731+        return "MDMFSlotWriteProxy for share %d" % self.shnum
4732+
4733+
4734+    def get_checkstring(self):
4735+        """
4736+        Given a share number, I return a representation of what the
4737+        checkstring for that share on the server will look like.
4738+
4739+        I am mostly used for tests.
4740+        """
4741+        if self._root_hash:
4742+            roothash = self._root_hash
4743+        else:
4744+            roothash = "\x00" * 32
4745+        return struct.pack(MDMFCHECKSTRING,
4746+                           1,
4747+                           self._seqnum,
4748+                           roothash)
4749+
4750+
4751+    def put_block(self, data, segnum, salt):
4752+        """
4753+        Put the encrypted-and-encoded data segment in the slot, along
4754+        with the salt.
4755+        """
4756+        if segnum >= self._num_segments:
4757+            raise LayoutInvalid("I won't overwrite the private key")
4758+        if len(salt) != SALT_SIZE:
4759+            raise LayoutInvalid("I was given a salt of size %d, but "
4760+                                "I wanted a salt of size %d")
4761+        if segnum + 1 == self._num_segments:
4762+            if len(data) != self._tail_block_size:
4763+                raise LayoutInvalid("I was given the wrong size block to write")
4764+        elif len(data) != self._block_size:
4765+            raise LayoutInvalid("I was given the wrong size block to write")
4766+
4767+        # We want to write at MDMFHEADERSIZE + segnum * (SALT_SIZE + block_size).
4768+
4769+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
4770+        data = salt + data
4771+
4772+        datavs = [tuple([offset, data])]
4773+        return self._write(datavs)
4774+
4775+
4776+    def put_encprivkey(self, encprivkey):
4777+        """
4778+        Put the encrypted private key in the remote slot.
4779+        """
4780+        assert self._offsets
4781+        assert self._offsets['enc_privkey']
4782+        # You shouldn't re-write the encprivkey after the block hash
4783+        # tree is written, since that could cause the private key to run
4784+        # into the block hash tree. Before it writes the block hash
4785+        # tree, the block hash tree writing method writes the offset of
4786+        # the salt hash tree. So that's a good indicator of whether or
4787+        # not the block hash tree has been written.
4788+        if "share_hash_chain" in self._offsets:
4789+            raise LayoutInvalid("You must write this before the block hash tree")
4790+
4791+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
4792+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
4793+        def _on_failure():
4794+            del(self._offsets['block_hash_tree'])
4795+        return self._write(datavs, on_failure=_on_failure)
4796+
4797+
4798+    def put_blockhashes(self, blockhashes):
4799+        """
4800+        Put the block hash tree in the remote slot.
4801+
4802+        The encrypted private key must be put before the block hash
4803+        tree, since we need to know how large it is to know where the
4804+        block hash tree should go. The block hash tree must be put
4805+        before the share hash chain, since its size determines the
4806+        offset of the share hash chain.
4807+        """
4808+        assert self._offsets
4809+        assert isinstance(blockhashes, list)
4810+        if "block_hash_tree" not in self._offsets:
4811+            raise LayoutInvalid("You must put the encrypted private key "
4812+                                "before you put the block hash tree")
4813+        # If written, the share hash chain causes the signature offset
4814+        # to be defined.
4815+        if "signature" in self._offsets:
4816+            raise LayoutInvalid("You must put the block hash tree before "
4817+                                "you put the share hash chain")
4818+        blockhashes_s = "".join(blockhashes)
4819+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
4820+        datavs = []
4821+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
4822+        def _on_failure():
4823+            del(self._offsets['share_hash_chain'])
4824+        return self._write(datavs, on_failure=_on_failure)
4825+
4826+
4827+    def put_sharehashes(self, sharehashes):
4828+        """
4829+        Put the share hash chain in the remote slot.
4830+
4831+        The block hash tree must be put before the share hash chain,
4832+        since we need to know where the block hash tree ends before we
4833+        can know where the share hash chain starts. The share hash chain
4834+        must be put before the signature, since the length of the packed
4835+        share hash chain determines the offset of the signature. Also,
4836+        semantically, you must know what the root of the share hash tree
4837+        is before you can generate a valid signature.
4838+        """
4839+        assert isinstance(sharehashes, dict)
4840+        if "share_hash_chain" not in self._offsets:
4841+            raise LayoutInvalid("You need to put the block hash tree before "
4842+                                "you can put the share hash chain")
4843+        # The signature comes after the share hash chain. If the
4844+        # signature has already been written, we must not write another
4845+        # share hash chain. The signature writes the verification key
4846+        # offset when it gets sent to the remote server, so we look for
4847+        # that.
4848+        if "verification_key" in self._offsets:
4849+            raise LayoutInvalid("You must write the share hash chain "
4850+                                "before you write the signature")
4851+        datavs = []
4852+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4853+                                  for i in sorted(sharehashes.keys())])
4854+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
4855+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
4856+        def _on_failure():
4857+            del(self._offsets['signature'])
4858+        return self._write(datavs, on_failure=_on_failure)
4859+
4860+
4861+    def put_root_hash(self, roothash):
4862+        """
4863+        Put the root hash (the root of the share hash tree) in the
4864+        remote slot.
4865+        """
4866+        # It does not make sense to be able to put the root
4867+        # hash without first putting the share hashes, since you need
4868+        # the share hashes to generate the root hash.
4869+        #
4870+        # The signature offset is defined by the routine that places the
4871+        # share hash chain, so its presence is a good indicator of whether
4872+        # or not the share hash chain exists on the remote server.
4873+        if "signature" not in self._offsets:
4874+            raise LayoutInvalid("You need to put the share hash chain "
4875+                                "before you can put the root share hash")
4876+        if len(roothash) != HASH_SIZE:
4877+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
4878+                                 % HASH_SIZE)
4879+        datavs = []
4880+        self._root_hash = roothash
4881+        # To write both of these values, we update the checkstring on
4882+        # the remote server, which includes them
4883+        checkstring = self.get_checkstring()
4884+        datavs.append(tuple([0, checkstring]))
4885+        # This write, if successful, changes the checkstring, so we need
4886+        # to update our internal checkstring to be consistent with the
4887+        # one on the server.
4888+        def _on_success():
4889+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
4890+        def _on_failure():
4891+            self._root_hash = None
4892+        return self._write(datavs,
4893+                           on_success=_on_success,
4894+                           on_failure=_on_failure)
4895+
4896+
4897+    def get_signable(self):
4898+        """
4899+        Get the first seven fields of the mutable file; the parts that
4900+        are signed.
4901+        """
4902+        if not self._root_hash:
4903+            raise LayoutInvalid("You need to set the root hash "
4904+                                "before getting something to "
4905+                                "sign")
4906+        return struct.pack(MDMFSIGNABLEHEADER,
4907+                           1,
4908+                           self._seqnum,
4909+                           self._root_hash,
4910+                           self._required_shares,
4911+                           self._total_shares,
4912+                           self._segment_size,
4913+                           self._data_length)
4914+
4915+
4916+    def put_signature(self, signature):
4917+        """
4918+        Put the signature field to the remote slot.
4919+
4920+        I require that the root hash and share hash chain have been put
4921+        to the grid before I will write the signature to the grid.
4922+        """
4923+        if "signature" not in self._offsets:
4924+            raise LayoutInvalid("You must put the share hash chain "
4925+        # It does not make sense to put a signature without first
4926+        # putting the root hash and the salt hash (since otherwise
4927+        # the signature would be incomplete), so we don't allow that.
4928+                       "before putting the signature")
4929+        if not self._root_hash:
4930+            raise LayoutInvalid("You must complete the signed prefix "
4931+                                "before computing a signature")
4932+        # If we put the signature after we put the verification key, we
4933+        # could end up running into the verification key, and will
4934+        # probably screw up the offsets as well. So we don't allow that.
4935+        # The method that writes the verification key defines the EOF
4936+        # offset before writing the verification key, so look for that.
4937+        if "EOF" in self._offsets:
4938+            raise LayoutInvalid("You must write the signature before the verification key")
4939+
4940+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
4941+        datavs = []
4942+        datavs.append(tuple([self._offsets['signature'], signature]))
4943+        def _on_failure():
4944+            del(self._offsets['verification_key'])
4945+        return self._write(datavs, on_failure=_on_failure)
4946+
4947+
4948+    def put_verification_key(self, verification_key):
4949+        """
4950+        Put the verification key into the remote slot.
4951+
4952+        I require that the signature have been written to the storage
4953+        server before I allow the verification key to be written to the
4954+        remote server.
4955+        """
4956+        if "verification_key" not in self._offsets:
4957+            raise LayoutInvalid("You must put the signature before you "
4958+                                "can put the verification key")
4959+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
4960+        datavs = []
4961+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
4962+        def _on_failure():
4963+            del(self._offsets['EOF'])
4964+        return self._write(datavs, on_failure=_on_failure)
4965+
4966+    def _get_offsets_tuple(self):
4967+        return tuple([(key, value) for key, value in self._offsets.items()])
4968+
4969+    def get_verinfo(self):
4970+        return (self._seqnum,
4971+                self._root_hash,
4972+                self._required_shares,
4973+                self._total_shares,
4974+                self._segment_size,
4975+                self._data_length,
4976+                self.get_signable(),
4977+                self._get_offsets_tuple())
4978+
4979+
4980+    def finish_publishing(self):
4981+        """
4982+        Write the offset table and encoding parameters to the remote
4983+        slot, since that's the only thing we have yet to publish at this
4984+        point.
4985+        """
4986+        if "EOF" not in self._offsets:
4987+            raise LayoutInvalid("You must put the verification key before "
4988+                                "you can publish the offsets")
4989+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4990+        offsets = struct.pack(MDMFOFFSETS,
4991+                              self._offsets['enc_privkey'],
4992+                              self._offsets['block_hash_tree'],
4993+                              self._offsets['share_hash_chain'],
4994+                              self._offsets['signature'],
4995+                              self._offsets['verification_key'],
4996+                              self._offsets['EOF'])
4997+        datavs = []
4998+        datavs.append(tuple([offsets_offset, offsets]))
4999+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5000+        params = struct.pack(">BBQQ",
5001+                             self._required_shares,
5002+                             self._total_shares,
5003+                             self._segment_size,
5004+                             self._data_length)
5005+        datavs.append(tuple([encoding_parameters_offset, params]))
5006+        return self._write(datavs)
5007+
5008+
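+    # Note that the offset table can only be written at the very end:
+    # each put_* method above records the offset of the section that
+    # follows it, so the final entry (EOF) is only known once the
+    # verification key has been placed.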
5009+    def _write(self, datavs, on_failure=None, on_success=None):
5010+        """I write the data vectors in datavs to the remote slot."""
5011+        tw_vectors = {}
5012+        new_share = False
5013+        if not self._testvs:
5014+            self._testvs = []
5015+            self._testvs.append(tuple([0, 1, "eq", ""]))
5016+            new_share = True
5017+        if not self._written:
5018+            # Write a new checkstring to the share when we write it, so
5019+            # that we have something to check later.
5020+            new_checkstring = self.get_checkstring()
5021+            datavs.append((0, new_checkstring))
5022+            def _first_write():
5023+                self._written = True
5024+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5025+            on_success = _first_write
5026+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5027+        datalength = sum([len(x[1]) for x in datavs])
5028+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5029+                                  self._storage_index,
5030+                                  self._secrets,
5031+                                  tw_vectors,
5032+                                  self._readv)
5033+        def _result(results):
5034+            if isinstance(results, failure.Failure) or not results[0]:
5035+                # Do nothing; the write was unsuccessful.
5036+                if on_failure: on_failure()
5037+            else:
5038+                if on_success: on_success()
5039+            return results
5040+        d.addCallback(_result)
5041+        return d
5042+
5043+
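+# A rough sketch of the order in which a caller is expected to drive
+# MDMFSlotWriteProxy, as enforced by the LayoutInvalid checks above. The
+# names (mw, blocks, salts, and so on) are hypothetical placeholders:
+#
+#   for segnum, (block, salt) in enumerate(zip(blocks, salts)):
+#       mw.put_block(block, segnum, salt)
+#   mw.put_encprivkey(encprivkey)
+#   mw.put_blockhashes(block_hash_tree)     # a list of hashes
+#   mw.put_sharehashes(share_hash_chain)    # a dict of {index: hash}
+#   mw.put_root_hash(root_hash)
+#   mw.put_signature(signature)
+#   mw.put_verification_key(verification_key)
+#   d = mw.finish_publishing()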
5044+class MDMFSlotReadProxy:
5045+    """
5046+    I read from a mutable slot filled with data written in the MDMF data
5047+    format (which is described above).
5048+
5049+    I can be initialized with some amount of data, which I will use (if
5050+    it is valid) to eliminate some of the need to fetch it from servers.
5051+    """
5052+    def __init__(self,
5053+                 rref,
5054+                 storage_index,
5055+                 shnum,
5056+                 data=""):
5057+        # Start the initialization process.
5058+        self._rref = rref
5059+        self._storage_index = storage_index
5060+        self.shnum = shnum
5061+
5062+        # Before doing anything, the reader is probably going to want to
5063+        # verify that the signature is correct. To do that, they'll need
5064+        # the verification key, and the signature. To get those, we'll
5065+        # need the offset table. So fetch the offset table on the
5066+        # assumption that that will be the first thing that a reader is
5067+        # going to do.
5068+
5069+        # The fact that these encoding parameters are None tells us
5070+        # that we haven't yet fetched them from the remote share, so we
5071+        # should. We could just not set them, but the checks will be
5072+        # easier to read if we don't have to use hasattr.
5073+        self._version_number = None
5074+        self._sequence_number = None
5075+        self._root_hash = None
5076+        # Filled in if we're dealing with an SDMF file. Unused
5077+        # otherwise.
5078+        self._salt = None
5079+        self._required_shares = None
5080+        self._total_shares = None
5081+        self._segment_size = None
5082+        self._data_length = None
5083+        self._offsets = None
5084+
5085+        # If the user has chosen to initialize us with some data, we'll
5086+        # try to satisfy subsequent data requests with that data before
5087+        # asking the storage server for it.
5088+        self._data = data
5089+        # The filenode cache hands us None when it has no cached data,
5090+        # but we index into the cached data as a string, so convert
5091+        # None to "".
5092+        if self._data is None:
5093+            self._data = ""
5094+
5095+        self._queue_observers = observer.ObserverList()
5096+        self._queue_errbacks = observer.ObserverList()
5097+        self._readvs = []
5098+
5099+
5100+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5101+        """
5102+        I fetch the offset table and the header from the remote slot if
5103+        I don't already have them. If I do have them, I do nothing and
5104+        return a Deferred that has already fired.
5105+        """
5106+        if self._offsets:
5107+            return defer.succeed(None)
5108+        # At this point, we may be either SDMF or MDMF. Fetching 107
5109+        # bytes will be enough to get header and offsets for both SDMF and
5110+        # MDMF, though we'll be left with 4 more bytes than we
5111+        # need if this ends up being MDMF. This is probably less
5112+        # expensive than the cost of a second roundtrip.
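+        # As a sanity check on that number: the SDMF signed prefix
+        # (">BQ32s16s BBQQ") is 1+8+32+16+1+1+8+8 = 75 bytes, and the
+        # SDMF offset table (">LLLLQQ") is 4*4 + 2*8 = 32 bytes, which
+        # together account for all 107 bytes.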
5113+        readvs = [(0, 107)]
5114+        d = self._read(readvs, force_remote)
5115+        d.addCallback(self._process_encoding_parameters)
5116+        d.addCallback(self._process_offsets)
5117+        return d
5118+
5119+
5120+    def _process_encoding_parameters(self, encoding_parameters):
5121+        assert self.shnum in encoding_parameters
5122+        encoding_parameters = encoding_parameters[self.shnum][0]
5123+        # The first byte is the version number. It will tell us what
5124+        # to do next.
5125+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5126+        if verno == MDMF_VERSION:
5127+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5128+            (verno,
5129+             seqnum,
5130+             root_hash,
5131+             k,
5132+             n,
5133+             segsize,
5134+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5135+                                      encoding_parameters[:read_size])
5136+            if segsize == 0 and datalen == 0:
5137+                # Empty file, no segments.
5138+                self._num_segments = 0
5139+            else:
5140+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5141+
5142+        elif verno == SDMF_VERSION:
5143+            read_size = SIGNED_PREFIX_LENGTH
5144+            (verno,
5145+             seqnum,
5146+             root_hash,
5147+             salt,
5148+             k,
5149+             n,
5150+             segsize,
5151+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5152+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5153+            self._salt = salt
5154+            if segsize == 0 and datalen == 0:
5155+                # empty file
5156+                self._num_segments = 0
5157+            else:
5158+                # non-empty SDMF files have one segment.
5159+                self._num_segments = 1
5160+        else:
5161+            raise UnknownVersionError("You asked me to read mutable file "
5162+                                      "version %d, but I only understand "
5163+                                      "%d and %d" % (verno, SDMF_VERSION,
5164+                                                     MDMF_VERSION))
5165+
5166+        self._version_number = verno
5167+        self._sequence_number = seqnum
5168+        self._root_hash = root_hash
5169+        self._required_shares = k
5170+        self._total_shares = n
5171+        self._segment_size = segsize
5172+        self._data_length = datalen
5173+
5174+        self._block_size = self._segment_size / self._required_shares
5175+        # We can upload empty files, and need to account for this fact
5176+        # so as to avoid zero-division and zero-modulo errors.
5177+        if datalen > 0:
5178+            tail_size = self._data_length % self._segment_size
5179+        else:
5180+            tail_size = 0
5181+        if not tail_size:
5182+            self._tail_block_size = self._block_size
5183+        else:
5184+            self._tail_block_size = mathutil.next_multiple(tail_size,
5185+                                                    self._required_shares)
5186+            self._tail_block_size /= self._required_shares
5187+
5188+        return encoding_parameters
5189+
5190+
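+    # A worked example of the tail-block computation above, matching the
+    # tail_segment share that the tests construct: datalen=33, segsize=6
+    # and k=3 give a full block size of 6 / 3 = 2 bytes, a tail size of
+    # 33 % 6 = 3, and a tail block size of next_multiple(3, 3) / 3 = 1
+    # byte.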
5191+    def _process_offsets(self, offsets):
5192+        if self._version_number == 0:
5193+            read_size = OFFSETS_LENGTH
5194+            read_offset = SIGNED_PREFIX_LENGTH
5195+            end = read_size + read_offset
5196+            (signature,
5197+             share_hash_chain,
5198+             block_hash_tree,
5199+             share_data,
5200+             enc_privkey,
5201+             EOF) = struct.unpack(">LLLLQQ",
5202+                                  offsets[read_offset:end])
5203+            self._offsets = {}
5204+            self._offsets['signature'] = signature
5205+            self._offsets['share_data'] = share_data
5206+            self._offsets['block_hash_tree'] = block_hash_tree
5207+            self._offsets['share_hash_chain'] = share_hash_chain
5208+            self._offsets['enc_privkey'] = enc_privkey
5209+            self._offsets['EOF'] = EOF
5210+
5211+        elif self._version_number == 1:
5212+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5213+            read_length = MDMFOFFSETS_LENGTH
5214+            end = read_offset + read_length
5215+            (encprivkey,
5216+             blockhashes,
5217+             sharehashes,
5218+             signature,
5219+             verification_key,
5220+             eof) = struct.unpack(MDMFOFFSETS,
5221+                                  offsets[read_offset:end])
5222+            self._offsets = {}
5223+            self._offsets['enc_privkey'] = encprivkey
5224+            self._offsets['block_hash_tree'] = blockhashes
5225+            self._offsets['share_hash_chain'] = sharehashes
5226+            self._offsets['signature'] = signature
5227+            self._offsets['verification_key'] = verification_key
5228+            self._offsets['EOF'] = eof
5229+
5230+
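+    # To summarize the two layouts decoded above: the SDMF offset table
+    # (">LLLLQQ") records offsets for the signature, share hash chain,
+    # block hash tree, share data, encrypted private key, and EOF, while
+    # the MDMF offset table (MDMFOFFSETS) records offsets for the
+    # encrypted private key, block hash tree, share hash chain,
+    # signature, verification key, and EOF. In both cases the table sits
+    # right after the header fields read in _process_encoding_parameters.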
5231+    def get_block_and_salt(self, segnum, queue=False):
5232+        """
5233+        I return (block, salt), where block is the block data and
5234+        salt is the salt used to encrypt that segment.
5235+        """
5236+        d = self._maybe_fetch_offsets_and_header()
5237+        def _then(ignored):
5238+            if self._version_number == 1:
5239+                base_share_offset = MDMFHEADERSIZE
5240+            else:
5241+                base_share_offset = self._offsets['share_data']
5242+
5243+            if segnum + 1 > self._num_segments:
5244+                raise LayoutInvalid("Not a valid segment number")
5245+
5246+            if self._version_number == 0:
5247+                share_offset = base_share_offset + self._block_size * segnum
5248+            else:
5249+                share_offset = base_share_offset + (self._block_size + \
5250+                                                    SALT_SIZE) * segnum
5251+            if segnum + 1 == self._num_segments:
5252+                data = self._tail_block_size
5253+            else:
5254+                data = self._block_size
5255+
5256+            if self._version_number == 1:
5257+                data += SALT_SIZE
5258+
5259+            readvs = [(share_offset, data)]
5260+            return readvs
5261+        d.addCallback(_then)
5262+        d.addCallback(lambda readvs:
5263+            self._read(readvs, queue=queue))
5264+        def _process_results(results):
5265+            assert self.shnum in results
5266+            if self._version_number == 0:
5267+                # We only read the share data, but we know the salt from
5268+                # when we fetched the header
5269+                data = results[self.shnum]
5270+                if not data:
5271+                    data = ""
5272+                else:
5273+                    assert len(data) == 1
5274+                    data = data[0]
5275+                salt = self._salt
5276+            else:
5277+                data = results[self.shnum]
5278+                if not data:
5279+                    salt = data = ""
5280+                else:
5281+                    salt_and_data = results[self.shnum][0]
5282+                    salt = salt_and_data[:SALT_SIZE]
5283+                    data = salt_and_data[SALT_SIZE:]
5284+            return data, salt
5285+        d.addCallback(_process_results)
5286+        return d
5287+
5288+
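+    # The offset arithmetic above reflects the two on-disk layouts: MDMF
+    # (version 1) stores each segment as SALT_SIZE bytes of salt followed
+    # by the block, starting at MDMFHEADERSIZE, so segment i lives at
+    # MDMFHEADERSIZE + i * (block_size + SALT_SIZE); SDMF (version 0)
+    # keeps its single salt in the header and stores bare blocks starting
+    # at offsets['share_data'], so segment i lives at
+    # offsets['share_data'] + i * block_size.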
5289+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5290+        """
5291+        I return the block hash tree
5292+
5293+        I take an optional argument, needed, which is a set of indices
5294+        corresponding to hashes that I should fetch. If this argument is
5295+        missing, I will fetch the entire block hash tree; otherwise, I
5296+        may attempt to fetch fewer hashes, based on what needed says
5297+        that I should do. Note that I may fetch as many hashes as I
5298+        want, so long as the set of hashes that I do fetch is a superset
5299+        of the ones that I am asked for, so callers should be prepared
5300+        to tolerate additional hashes.
5301+        """
5302+        # TODO: Return only the parts of the block hash tree necessary
5303+        # to validate the blocknum provided?
5304+        # This is a good idea, but it is hard to implement correctly. It
5305+        # is bad to fetch any one block hash more than once, so we
5306+        # probably just want to fetch the whole thing at once and then
5307+        # serve it.
5308+        if needed == set([]):
5309+            return defer.succeed([])
5310+        d = self._maybe_fetch_offsets_and_header()
5311+        def _then(ignored):
5312+            blockhashes_offset = self._offsets['block_hash_tree']
5313+            if self._version_number == 1:
5314+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5315+            else:
5316+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5317+            readvs = [(blockhashes_offset, blockhashes_length)]
5318+            return readvs
5319+        d.addCallback(_then)
5320+        d.addCallback(lambda readvs:
5321+            self._read(readvs, queue=queue, force_remote=force_remote))
5322+        def _build_block_hash_tree(results):
5323+            assert self.shnum in results
5324+
5325+            rawhashes = results[self.shnum][0]
5326+            results = [rawhashes[i:i+HASH_SIZE]
5327+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5328+            return results
5329+        d.addCallback(_build_block_hash_tree)
5330+        return d
5331+
5332+
5333+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5334+        """
5335+        I return the part of the share hash chain that is stored in
5336+        this share and used to validate it.
5337+
5338+        I take an optional argument, needed. Needed is a set of indices
5339+        that correspond to the hashes that I should fetch. If needed is
5340+        not present, I will fetch and return the entire share hash
5341+        chain. Otherwise, I may fetch and return any part of the share
5342+        hash chain that is a superset of the part that I am asked to
5343+        fetch. Callers should be prepared to deal with more hashes than
5344+        they've asked for.
5345+        """
5346+        if needed == set([]):
5347+            return defer.succeed([])
5348+        d = self._maybe_fetch_offsets_and_header()
5349+
5350+        def _make_readvs(ignored):
5351+            sharehashes_offset = self._offsets['share_hash_chain']
5352+            if self._version_number == 0:
5353+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5354+            else:
5355+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5356+            readvs = [(sharehashes_offset, sharehashes_length)]
5357+            return readvs
5358+        d.addCallback(_make_readvs)
5359+        d.addCallback(lambda readvs:
5360+            self._read(readvs, queue=queue, force_remote=force_remote))
5361+        def _build_share_hash_chain(results):
5362+            assert self.shnum in results
5363+
5364+            sharehashes = results[self.shnum][0]
5365+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5366+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5367+            results = dict([struct.unpack(">H32s", data)
5368+                            for data in results])
5369+            return results
5370+        d.addCallback(_build_share_hash_chain)
5371+        return d
5372+
5373+
5374+    def get_encprivkey(self, queue=False):
5375+        """
5376+        I return the encrypted private key.
5377+        """
5378+        d = self._maybe_fetch_offsets_and_header()
5379+
5380+        def _make_readvs(ignored):
5381+            privkey_offset = self._offsets['enc_privkey']
5382+            if self._version_number == 0:
5383+                privkey_length = self._offsets['EOF'] - privkey_offset
5384+            else:
5385+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5386+            readvs = [(privkey_offset, privkey_length)]
5387+            return readvs
5388+        d.addCallback(_make_readvs)
5389+        d.addCallback(lambda readvs:
5390+            self._read(readvs, queue=queue))
5391+        def _process_results(results):
5392+            assert self.shnum in results
5393+            privkey = results[self.shnum][0]
5394+            return privkey
5395+        d.addCallback(_process_results)
5396+        return d
5397+
5398+
5399+    def get_signature(self, queue=False):
5400+        """
5401+        I return the signature of my share.
5402+        """
5403+        d = self._maybe_fetch_offsets_and_header()
5404+
5405+        def _make_readvs(ignored):
5406+            signature_offset = self._offsets['signature']
5407+            if self._version_number == 1:
5408+                signature_length = self._offsets['verification_key'] - signature_offset
5409+            else:
5410+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5411+            readvs = [(signature_offset, signature_length)]
5412+            return readvs
5413+        d.addCallback(_make_readvs)
5414+        d.addCallback(lambda readvs:
5415+            self._read(readvs, queue=queue))
5416+        def _process_results(results):
5417+            assert self.shnum in results
5418+            signature = results[self.shnum][0]
5419+            return signature
5420+        d.addCallback(_process_results)
5421+        return d
5422+
5423+
5424+    def get_verification_key(self, queue=False):
5425+        """
5426+        I return the verification key.
5427+        """
5428+        d = self._maybe_fetch_offsets_and_header()
5429+
5430+        def _make_readvs(ignored):
5431+            if self._version_number == 1:
5432+                vk_offset = self._offsets['verification_key']
5433+                vk_length = self._offsets['EOF'] - vk_offset
5434+            else:
5435+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5436+                vk_length = self._offsets['signature'] - vk_offset
5437+            readvs = [(vk_offset, vk_length)]
5438+            return readvs
5439+        d.addCallback(_make_readvs)
5440+        d.addCallback(lambda readvs:
5441+            self._read(readvs, queue=queue))
5442+        def _process_results(results):
5443+            assert self.shnum in results
5444+            verification_key = results[self.shnum][0]
5445+            return verification_key
5446+        d.addCallback(_process_results)
5447+        return d
5448+
5449+
5450+    def get_encoding_parameters(self):
5451+        """
5452+        I return (k, n, segsize, datalen)
5453+        """
5454+        d = self._maybe_fetch_offsets_and_header()
5455+        d.addCallback(lambda ignored:
5456+            (self._required_shares,
5457+             self._total_shares,
5458+             self._segment_size,
5459+             self._data_length))
5460+        return d
5461+
5462+
5463+    def get_seqnum(self):
5464+        """
5465+        I return the sequence number for this share.
5466+        """
5467+        d = self._maybe_fetch_offsets_and_header()
5468+        d.addCallback(lambda ignored:
5469+            self._sequence_number)
5470+        return d
5471+
5472+
5473+    def get_root_hash(self):
5474+        """
5475+        I return the root of the block hash tree
5476+        """
5477+        d = self._maybe_fetch_offsets_and_header()
5478+        d.addCallback(lambda ignored: self._root_hash)
5479+        return d
5480+
5481+
5482+    def get_checkstring(self):
5483+        """
5484+        I return the packed representation of the following:
5485+
5486+            - version number
5487+            - sequence number
5488+            - root hash
5489+            - salt (SDMF only)
5490+
5491+        which my users use as a checkstring to detect other writers.
5492+        """
5493+        d = self._maybe_fetch_offsets_and_header()
5494+        def _build_checkstring(ignored):
5495+            if self._salt:
5496+                checkstring = struct.pack(PREFIX,
5497+                                         self._version_number,
5498+                                         self._sequence_number,
5499+                                         self._root_hash,
5500+                                         self._salt)
5501+            else:
5502+                checkstring = struct.pack(MDMFCHECKSTRING,
5503+                                          self._version_number,
5504+                                          self._sequence_number,
5505+                                          self._root_hash)
5506+
5507+            return checkstring
5508+        d.addCallback(_build_checkstring)
5509+        return d
5510+
5511+
5512+    def get_prefix(self, force_remote):
5513+        d = self._maybe_fetch_offsets_and_header(force_remote)
5514+        d.addCallback(lambda ignored:
5515+            self._build_prefix())
5516+        return d
5517+
5518+
5519+    def _build_prefix(self):
5520+        # The prefix is another name for the part of the remote share
5521+        # that gets signed. It consists of everything up to and
5522+        # including the datalength, packed by struct.
5523+        if self._version_number == SDMF_VERSION:
5524+            return struct.pack(SIGNED_PREFIX,
5525+                           self._version_number,
5526+                           self._sequence_number,
5527+                           self._root_hash,
5528+                           self._salt,
5529+                           self._required_shares,
5530+                           self._total_shares,
5531+                           self._segment_size,
5532+                           self._data_length)
5533+
5534+        else:
5535+            return struct.pack(MDMFSIGNABLEHEADER,
5536+                           self._version_number,
5537+                           self._sequence_number,
5538+                           self._root_hash,
5539+                           self._required_shares,
5540+                           self._total_shares,
5541+                           self._segment_size,
5542+                           self._data_length)
5543+
5544+
5545+    def _get_offsets_tuple(self):
5546+        # The offsets tuple is another component of the version
5547+        # information tuple. It is basically our offsets dictionary,
5548+        # itemized and in a tuple.
5549+        return tuple([(key, value) for key, value in self._offsets.items()])
5550+
5551+
5552+    def get_verinfo(self):
5553+        """
5554+        I return my verinfo tuple. This is used by the ServermapUpdater
5555+        to keep track of versions of mutable files.
5556+
5557+        The verinfo tuple for MDMF files contains:
5558+            - seqnum
5559+            - root hash
5560+            - a blank (nothing)
5561+            - segsize
5562+            - datalen
5563+            - k
5564+            - n
5565+            - prefix (the thing that you sign)
5566+            - a tuple of offsets
5567+
5568+        We include the blank entry in MDMF so that the tuple has the
5569+        same shape as the SDMF verinfo tuple.
5570+
5571+        The verinfo tuple for SDMF files is the same, but contains a
5572+        16-byte IV in place of the blank.
5573+        """
5574+        d = self._maybe_fetch_offsets_and_header()
5575+        def _build_verinfo(ignored):
5576+            if self._version_number == SDMF_VERSION:
5577+                salt_to_use = self._salt
5578+            else:
5579+                salt_to_use = None
5580+            return (self._sequence_number,
5581+                    self._root_hash,
5582+                    salt_to_use,
5583+                    self._segment_size,
5584+                    self._data_length,
5585+                    self._required_shares,
5586+                    self._total_shares,
5587+                    self._build_prefix(),
5588+                    self._get_offsets_tuple())
5589+        d.addCallback(_build_verinfo)
5590+        return d
5591+
5592+
5593+    def flush(self):
5594+        """
5595+        I flush my queue of read vectors.
5596+        """
5597+        d = self._read(self._readvs)
5598+        def _then(results):
5599+            self._readvs = []
5600+            if isinstance(results, failure.Failure):
5601+                self._queue_errbacks.notify(results)
5602+            else:
5603+                self._queue_observers.notify(results)
5604+            self._queue_observers = observer.ObserverList()
5605+            self._queue_errbacks = observer.ObserverList()
5606+        d.addBoth(_then)
5607+
5608+
5609+    def _read(self, readvs, force_remote=False, queue=False):
5610+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5611+        # TODO: It's entirely possible to tweak this so that it just
5612+        # fulfills the requests that it can, and not demand that all
5613+        # requests are satisfiable before running it.
5614+        if not unsatisfiable and not force_remote:
5615+            results = [self._data[offset:offset+length]
5616+                       for (offset, length) in readvs]
5617+            results = {self.shnum: results}
5618+            return defer.succeed(results)
5619+        else:
5620+            if queue:
5621+                start = len(self._readvs)
5622+                self._readvs += readvs
5623+                end = len(self._readvs)
5624+                def _get_results(results, start, end):
5625+                    if self.shnum not in results:
5626+                        return {self.shnum: [""]}
5627+                    return {self.shnum: results[self.shnum][start:end]}
5628+                d = defer.Deferred()
5629+                d.addCallback(_get_results, start, end)
5630+                self._queue_observers.subscribe(d.callback)
5631+                self._queue_errbacks.subscribe(d.errback)
5632+                return d
5633+            return self._rref.callRemote("slot_readv",
5634+                                         self._storage_index,
5635+                                         [self.shnum],
5636+                                         readvs)
5637+
5638+
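+    # A sketch of how the queue=True path in _read is meant to be used,
+    # with a hypothetical reader mr: several getters queue their read
+    # vectors in self._readvs, and one flush() call satisfies all of
+    # them with a single slot_readv round trip:
+    #
+    #   d1 = mr.get_signature(queue=True)
+    #   d2 = mr.get_verification_key(queue=True)
+    #   mr.flush()
+    #   # d1 and d2 fire once the batched remote read returns.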
5639+    def is_sdmf(self):
5640+        """I tell my caller whether or not my remote file is SDMF or MDMF
5641+        """
5642+        d = self._maybe_fetch_offsets_and_header()
5643+        d.addCallback(lambda ignored:
5644+            self._version_number == 0)
5645+        return d
5646+
5647+
5648+class LayoutInvalid(Exception):
5649+    """
5650+    This isn't a valid MDMF mutable file
5651+    """
5652hunk ./src/allmydata/test/test_storage.py 2
5653 
5654-import time, os.path, stat, re, simplejson, struct
5655+import time, os.path, stat, re, simplejson, struct, shutil
5656 
5657 from twisted.trial import unittest
5658 
5659hunk ./src/allmydata/test/test_storage.py 22
5660 from allmydata.storage.expirer import LeaseCheckingCrawler
5661 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5662      ReadBucketProxy
5663-from allmydata.interfaces import BadWriteEnablerError
5664-from allmydata.test.common import LoggingServiceParent
5665+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5666+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5667+                                     SIGNED_PREFIX, MDMFHEADER, \
5668+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5669+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5670+                                 SDMF_VERSION
5671+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5672 from allmydata.test.common_web import WebRenderingMixin
5673 from allmydata.web.storage import StorageStatus, remove_prefix
5674 
5675hunk ./src/allmydata/test/test_storage.py 106
5676 
5677 class RemoteBucket:
5678 
5679+    def __init__(self):
5680+        self.read_count = 0
5681+        self.write_count = 0
5682+
5683     def callRemote(self, methname, *args, **kwargs):
5684         def _call():
5685             meth = getattr(self.target, "remote_" + methname)
5686hunk ./src/allmydata/test/test_storage.py 114
5687             return meth(*args, **kwargs)
5688+
5689+        if methname == "slot_readv":
5690+            self.read_count += 1
5691+        if "writev" in methname:
5692+            self.write_count += 1
5693+
5694         return defer.maybeDeferred(_call)
5695 
5696hunk ./src/allmydata/test/test_storage.py 122
5697+
5698 class BucketProxy(unittest.TestCase):
5699     def make_bucket(self, name, size):
5700         basedir = os.path.join("storage", "BucketProxy", name)
5701hunk ./src/allmydata/test/test_storage.py 1299
5702         self.failUnless(os.path.exists(prefixdir), prefixdir)
5703         self.failIf(os.path.exists(bucketdir), bucketdir)
5704 
5705+
5706+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
5707+    def setUp(self):
5708+        self.sparent = LoggingServiceParent()
5709+        self._lease_secret = itertools.count()
5710+        self.ss = self.create("MDMFProxies storage test server")
5711+        self.rref = RemoteBucket()
5712+        self.rref.target = self.ss
5713+        self.secrets = (self.write_enabler("we_secret"),
5714+                        self.renew_secret("renew_secret"),
5715+                        self.cancel_secret("cancel_secret"))
5716+        self.segment = "aaaaaa"
5717+        self.block = "aa"
5718+        self.salt = "a" * 16
5719+        self.block_hash = "a" * 32
5720+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
5721+        self.share_hash = self.block_hash
5722+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
5723+        self.signature = "foobarbaz"
5724+        self.verification_key = "vvvvvv"
5725+        self.encprivkey = "private"
5726+        self.root_hash = self.block_hash
5727+        self.salt_hash = self.root_hash
5728+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
5729+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
5730+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
5731+        # blockhashes and salt hashes are serialized in the same way,
5732+        # only we lop off the first element and store that in the
5733+        # header.
5734+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
5735+
5736+
5737+    def tearDown(self):
5738+        self.sparent.stopService()
5739+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
5740+
5741+
5742+    def write_enabler(self, we_tag):
5743+        return hashutil.tagged_hash("we_blah", we_tag)
5744+
5745+
5746+    def renew_secret(self, tag):
5747+        return hashutil.tagged_hash("renew_blah", str(tag))
5748+
5749+
5750+    def cancel_secret(self, tag):
5751+        return hashutil.tagged_hash("cancel_blah", str(tag))
5752+
5753+
5754+    def workdir(self, name):
5755+        basedir = os.path.join("storage", "MutableServer", name)
5756+        return basedir
5757+
5758+
5759+    def create(self, name):
5760+        workdir = self.workdir(name)
5761+        ss = StorageServer(workdir, "\x00" * 20)
5762+        ss.setServiceParent(self.sparent)
5763+        return ss
5764+
5765+
5766+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
5767+        # Start with the checkstring
5768+        data = struct.pack(">BQ32s",
5769+                           1,
5770+                           0,
5771+                           self.root_hash)
5772+        self.checkstring = data
5773+        # Next, the encoding parameters
5774+        if tail_segment:
5775+            data += struct.pack(">BBQQ",
5776+                                3,
5777+                                10,
5778+                                6,
5779+                                33)
5780+        elif empty:
5781+            data += struct.pack(">BBQQ",
5782+                                3,
5783+                                10,
5784+                                0,
5785+                                0)
5786+        else:
5787+            data += struct.pack(">BBQQ",
5788+                                3,
5789+                                10,
5790+                                6,
5791+                                36)
5792+        # Now we'll build the offsets.
5793+        sharedata = ""
5794+        if not tail_segment and not empty:
5795+            for i in xrange(6):
5796+                sharedata += self.salt + self.block
5797+        elif tail_segment:
5798+            for i in xrange(5):
5799+                sharedata += self.salt + self.block
5800+            sharedata += self.salt + "a"
5801+
5802+        # The encrypted private key comes after the shares + salts
5803+        offset_size = struct.calcsize(MDMFOFFSETS)
5804+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
5805+        # The blockhashes come after the private key
5806+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
5807+        # The sharehashes come after the block hashes
5808+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
5809+        # The signature comes after the share hash chain
5810+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
5811+        # The verification key comes after the signature
5812+        verification_offset = signature_offset + len(self.signature)
5813+        # The EOF comes after the verification key
5814+        eof_offset = verification_offset + len(self.verification_key)
5815+        data += struct.pack(MDMFOFFSETS,
5816+                            encrypted_private_key_offset,
5817+                            blockhashes_offset,
5818+                            sharehashes_offset,
5819+                            signature_offset,
5820+                            verification_offset,
5821+                            eof_offset)
5822+        self.offsets = {}
5823+        self.offsets['enc_privkey'] = encrypted_private_key_offset
5824+        self.offsets['block_hash_tree'] = blockhashes_offset
5825+        self.offsets['share_hash_chain'] = sharehashes_offset
5826+        self.offsets['signature'] = signature_offset
5827+        self.offsets['verification_key'] = verification_offset
5828+        self.offsets['EOF'] = eof_offset
5829+        # Next, we'll add in the salts and share data,
5830+        data += sharedata
5831+        # the private key,
5832+        data += self.encprivkey
5833+        # the block hash tree,
5834+        data += self.block_hash_tree_s
5835+        # the share hash chain,
5836+        data += self.share_hash_chain_s
5837+        # the signature,
5838+        data += self.signature
5839+        # and the verification key
5840+        data += self.verification_key
5841+        return data
5842+
5843+
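+    # The synthetic MDMF share built above is laid out as: the
+    # checkstring (version, seqnum, root hash), the encoding parameters
+    # (k, N, segsize, datalen), the offset table, the salted share
+    # blocks, the encrypted private key, the block hash tree, the share
+    # hash chain, the signature, and finally the verification key.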
5844+    def write_test_share_to_server(self,
5845+                                   storage_index,
5846+                                   tail_segment=False,
5847+                                   empty=False):
5848+        """
5849+        I write some data for the read tests to read to self.ss
5850+
5851+        If tail_segment=True, then I will write a share that has a
5852+        smaller tail segment than other segments.
5853+        """
5854+        write = self.ss.remote_slot_testv_and_readv_and_writev
5855+        data = self.build_test_mdmf_share(tail_segment, empty)
5856+        # Finally, we write the whole thing to the storage server in one
5857+        # pass.
5858+        testvs = [(0, 1, "eq", "")]
5859+        tws = {}
5860+        tws[0] = (testvs, [(0, data)], None)
5861+        readv = [(0, 1)]
5862+        results = write(storage_index, self.secrets, tws, readv)
5863+        self.failUnless(results[0])
5864+
5865+
5866+    def build_test_sdmf_share(self, empty=False):
5867+        if empty:
5868+            sharedata = ""
5869+        else:
5870+            sharedata = self.segment * 6
5871+        self.sharedata = sharedata
5872+        blocksize = len(sharedata) / 3
5873+        block = sharedata[:blocksize]
5874+        self.blockdata = block
5875+        prefix = struct.pack(">BQ32s16s BBQQ",
5876+                             0, # version,
5877+                             0,
5878+                             self.root_hash,
5879+                             self.salt,
5880+                             3,
5881+                             10,
5882+                             len(sharedata),
5883+                             len(sharedata),
5884+                            )
5885+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5886+        signature_offset = post_offset + len(self.verification_key)
5887+        sharehashes_offset = signature_offset + len(self.signature)
5888+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
5889+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
5890+        encprivkey_offset = sharedata_offset + len(block)
5891+        eof_offset = encprivkey_offset + len(self.encprivkey)
5892+        offsets = struct.pack(">LLLLQQ",
5893+                              signature_offset,
5894+                              sharehashes_offset,
5895+                              blockhashes_offset,
5896+                              sharedata_offset,
5897+                              encprivkey_offset,
5898+                              eof_offset)
5899+        final_share = "".join([prefix,
5900+                           offsets,
5901+                           self.verification_key,
5902+                           self.signature,
5903+                           self.share_hash_chain_s,
5904+                           self.block_hash_tree_s,
5905+                           block,
5906+                           self.encprivkey])
5907+        self.offsets = {}
5908+        self.offsets['signature'] = signature_offset
5909+        self.offsets['share_hash_chain'] = sharehashes_offset
5910+        self.offsets['block_hash_tree'] = blockhashes_offset
5911+        self.offsets['share_data'] = sharedata_offset
5912+        self.offsets['enc_privkey'] = encprivkey_offset
5913+        self.offsets['EOF'] = eof_offset
5914+        return final_share
5915+
5916+
5917+    def write_sdmf_share_to_server(self,
5918+                                   storage_index,
5919+                                   empty=False):
5920+        # Some tests need SDMF shares to verify that we can still
5921+        # read them. This method writes one, which resembles but is not
5922+        assert self.rref
5923+        write = self.ss.remote_slot_testv_and_readv_and_writev
5924+        share = self.build_test_sdmf_share(empty)
5925+        testvs = [(0, 1, "eq", "")]
5926+        tws = {}
5927+        tws[0] = (testvs, [(0, share)], None)
5928+        readv = []
5929+        results = write(storage_index, self.secrets, tws, readv)
5930+        self.failUnless(results[0])
5931+
5932+
5933+    def test_read(self):
5934+        self.write_test_share_to_server("si1")
5935+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5936+        # Check that every method equals what we expect it to.
5937+        d = defer.succeed(None)
5938+        def _check_block_and_salt((block, salt)):
5939+            self.failUnlessEqual(block, self.block)
5940+            self.failUnlessEqual(salt, self.salt)
5941+
5942+        for i in xrange(6):
5943+            d.addCallback(lambda ignored, i=i:
5944+                mr.get_block_and_salt(i))
5945+            d.addCallback(_check_block_and_salt)
5946+
5947+        d.addCallback(lambda ignored:
5948+            mr.get_encprivkey())
5949+        d.addCallback(lambda encprivkey:
5950+            self.failUnlessEqual(self.encprivkey, encprivkey))
5951+
5952+        d.addCallback(lambda ignored:
5953+            mr.get_blockhashes())
5954+        d.addCallback(lambda blockhashes:
5955+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
5956+
5957+        d.addCallback(lambda ignored:
5958+            mr.get_sharehashes())
5959+        d.addCallback(lambda sharehashes:
5960+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
5961+
5962+        d.addCallback(lambda ignored:
5963+            mr.get_signature())
5964+        d.addCallback(lambda signature:
5965+            self.failUnlessEqual(signature, self.signature))
5966+
5967+        d.addCallback(lambda ignored:
5968+            mr.get_verification_key())
5969+        d.addCallback(lambda verification_key:
5970+            self.failUnlessEqual(verification_key, self.verification_key))
5971+
5972+        d.addCallback(lambda ignored:
5973+            mr.get_seqnum())
5974+        d.addCallback(lambda seqnum:
5975+            self.failUnlessEqual(seqnum, 0))
5976+
5977+        d.addCallback(lambda ignored:
5978+            mr.get_root_hash())
5979+        d.addCallback(lambda root_hash:
5980+            self.failUnlessEqual(self.root_hash, root_hash))
5981+
5982+        d.addCallback(lambda ignored:
5983+            mr.get_seqnum())
5984+        d.addCallback(lambda seqnum:
5985+            self.failUnlessEqual(0, seqnum))
5986+
5987+        d.addCallback(lambda ignored:
5988+            mr.get_encoding_parameters())
5989+        def _check_encoding_parameters((k, n, segsize, datalen)):
5990+            self.failUnlessEqual(k, 3)
5991+            self.failUnlessEqual(n, 10)
5992+            self.failUnlessEqual(segsize, 6)
5993+            self.failUnlessEqual(datalen, 36)
5994+        d.addCallback(_check_encoding_parameters)
5995+
5996+        d.addCallback(lambda ignored:
5997+            mr.get_checkstring())
5998+        d.addCallback(lambda checkstring:
5999+            self.failUnlessEqual(checkstring, self.checkstring))
6000+        return d
6001+
6002+
6003+    def test_read_with_different_tail_segment_size(self):
6004+        self.write_test_share_to_server("si1", tail_segment=True)
6005+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6006+        d = mr.get_block_and_salt(5)
6007+        def _check_tail_segment(results):
6008+            block, salt = results
6009+            self.failUnlessEqual(len(block), 1)
6010+            self.failUnlessEqual(block, "a")
6011+        d.addCallback(_check_tail_segment)
6012+        return d
6013+
6014+
6015+    def test_get_block_with_invalid_segnum(self):
6016+        self.write_test_share_to_server("si1")
6017+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6018+        d = defer.succeed(None)
6019+        d.addCallback(lambda ignored:
6020+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6021+                            None,
6022+                            mr.get_block_and_salt, 7))
6023+        return d
6024+
6025+
6026+    def test_get_encoding_parameters_first(self):
6027+        self.write_test_share_to_server("si1")
6028+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6029+        d = mr.get_encoding_parameters()
6030+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6031+            self.failUnlessEqual(k, 3)
6032+            self.failUnlessEqual(n, 10)
6033+            self.failUnlessEqual(segment_size, 6)
6034+            self.failUnlessEqual(datalen, 36)
6035+        d.addCallback(_check_encoding_parameters)
6036+        return d
6037+
6038+
6039+    def test_get_seqnum_first(self):
6040+        self.write_test_share_to_server("si1")
6041+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6042+        d = mr.get_seqnum()
6043+        d.addCallback(lambda seqnum:
6044+            self.failUnlessEqual(seqnum, 0))
6045+        return d
6046+
6047+
6048+    def test_get_root_hash_first(self):
6049+        self.write_test_share_to_server("si1")
6050+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6051+        d = mr.get_root_hash()
6052+        d.addCallback(lambda root_hash:
6053+            self.failUnlessEqual(root_hash, self.root_hash))
6054+        return d
6055+
6056+
6057+    def test_get_checkstring_first(self):
6058+        self.write_test_share_to_server("si1")
6059+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6060+        d = mr.get_checkstring()
6061+        d.addCallback(lambda checkstring:
6062+            self.failUnlessEqual(checkstring, self.checkstring))
6063+        return d
6064+
6065+
6066+    def test_write_read_vectors(self):
6067+        # When we write to the storage server, it returns the result of
6068+        # the requested read vector along with the write result. If a
6069+        # write fails because the test vectors failed, this read vector
6070+        # can help us to diagnose the problem. This test ensures that
6071+        # the read vector is working appropriately.
6072+        mw = self._make_new_mw("si1", 0)
6073+        d = defer.succeed(None)
6074+
6075+        # Write one block. This should return an empty checkstring,
6076+        # since there is no share data there yet.
6077+        d.addCallback(lambda ignored:
6078+            mw.put_block(self.block, 0, self.salt))
6079+        def _check_first_write(results):
6080+            result, readvs = results
6081+            self.failUnless(result)
6082+            self.failIf(readvs)
6083+        d.addCallback(_check_first_write)
6084+        # Now, a different checkstring should be returned when
6085+        # we write subsequent blocks.
6086+        d.addCallback(lambda ignored:
6087+            mw.put_block(self.block, 1, self.salt))
6088+        def _check_next_write(results):
6089+            result, readvs = results
6090+            self.failUnless(result)
6091+            self.expected_checkstring = mw.get_checkstring()
6092+            self.failUnlessIn(0, readvs)
6093+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
6094+        d.addCallback(_check_next_write)
6095+        # Add the other four shares
6096+        for i in xrange(2, 6):
6097+            d.addCallback(lambda ignored, i=i:
6098+                mw.put_block(self.block, i, self.salt))
6099+            d.addCallback(_check_next_write)
6100+        # Add the encrypted private key
6101+        d.addCallback(lambda ignored:
6102+            mw.put_encprivkey(self.encprivkey))
6103+        d.addCallback(_check_next_write)
6104+        # Add the block hash tree and share hash tree
6105+        d.addCallback(lambda ignored:
6106+            mw.put_blockhashes(self.block_hash_tree))
6107+        d.addCallback(_check_next_write)
6108+        d.addCallback(lambda ignored:
6109+            mw.put_sharehashes(self.share_hash_chain))
6110+        d.addCallback(_check_next_write)
6111+        # Add the root hash and the salt hash. This should change the
6112+        # checkstring, but not in a way that we'll be able to see right
6113+        # now, since the read vectors are applied before the write
6114+        # vectors.
6115+        d.addCallback(lambda ignored:
6116+            mw.put_root_hash(self.root_hash))
6117+        def _check_old_testv_after_new_one_is_written(results):
6118+            result, readvs = results
6119+            self.failUnless(result)
6120+            self.failUnlessIn(0, readvs)
6121+            self.failUnlessEqual(self.expected_checkstring,
6122+                                 readvs[0][0])
6123+            new_checkstring = mw.get_checkstring()
6124+            self.failIfEqual(new_checkstring,
6125+                             readvs[0][0])
6126+        d.addCallback(_check_old_testv_after_new_one_is_written)
6127+        # Now add the signature. This should succeed, meaning that the
6128+        # data gets written and the read vector matches what the writer
6129+        # thinks should be there.
6130+        d.addCallback(lambda ignored:
6131+            mw.put_signature(self.signature))
6132+        d.addCallback(_check_next_write)
6133+        # The checkstring remains the same for the rest of the process.
6134+        return d
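
To make the check-and-set behaviour exercised above concrete, here is a minimal standalone sketch (illustrative only -- the dict-backed share and function name are not Tahoe's real storage-server API) of a write that is gated on an expected checkstring and always returns the requested read vector:

    def checked_write(share, expected_checkstring, new_checkstring, data):
        # Return (success, readv); readv always reports the current checkstring.
        current = share.get("checkstring", "")
        readv = {0: [current]}
        if current != expected_checkstring:
            return (False, readv)          # test vector failed; nothing written
        share["checkstring"] = new_checkstring
        share["data"] = data
        return (True, readv)

    share = {}
    ok, readv = checked_write(share, "", "v1", "first version")
    assert ok and readv[0] == [""]         # empty slot, so empty checkstring
    ok, readv = checked_write(share, "stale", "v2", "second version")
    assert not ok and readv[0] == ["v1"]   # rejected; readv shows what is there
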
6135+
6136+
6137+    def test_blockhashes_after_share_hash_chain(self):
6138+        mw = self._make_new_mw("si1", 0)
6139+        d = defer.succeed(None)
6140+        # Put everything up to and including the share hash chain
6141+        for i in xrange(6):
6142+            d.addCallback(lambda ignored, i=i:
6143+                mw.put_block(self.block, i, self.salt))
6144+        d.addCallback(lambda ignored:
6145+            mw.put_encprivkey(self.encprivkey))
6146+        d.addCallback(lambda ignored:
6147+            mw.put_blockhashes(self.block_hash_tree))
6148+        d.addCallback(lambda ignored:
6149+            mw.put_sharehashes(self.share_hash_chain))
6150+
6151+        # Now try to put the block hash tree again.
6152+        d.addCallback(lambda ignored:
6153+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
6154+                            None,
6155+                            mw.put_blockhashes, self.block_hash_tree))
6156+        return d
6157+
6158+
6159+    def test_encprivkey_after_blockhashes(self):
6160+        mw = self._make_new_mw("si1", 0)
6161+        d = defer.succeed(None)
6162+        # Put everything up to and including the block hash tree
6163+        for i in xrange(6):
6164+            d.addCallback(lambda ignored, i=i:
6165+                mw.put_block(self.block, i, self.salt))
6166+        d.addCallback(lambda ignored:
6167+            mw.put_encprivkey(self.encprivkey))
6168+        d.addCallback(lambda ignored:
6169+            mw.put_blockhashes(self.block_hash_tree))
6170+        d.addCallback(lambda ignored:
6171+            self.shouldFail(LayoutInvalid, "out of order private key",
6172+                            None,
6173+                            mw.put_encprivkey, self.encprivkey))
6174+        return d
6175+
6176+
6177+    def test_share_hash_chain_after_signature(self):
6178+        mw = self._make_new_mw("si1", 0)
6179+        d = defer.succeed(None)
6180+        # Put everything up to and including the signature
6181+        for i in xrange(6):
6182+            d.addCallback(lambda ignored, i=i:
6183+                mw.put_block(self.block, i, self.salt))
6184+        d.addCallback(lambda ignored:
6185+            mw.put_encprivkey(self.encprivkey))
6186+        d.addCallback(lambda ignored:
6187+            mw.put_blockhashes(self.block_hash_tree))
6188+        d.addCallback(lambda ignored:
6189+            mw.put_sharehashes(self.share_hash_chain))
6190+        d.addCallback(lambda ignored:
6191+            mw.put_root_hash(self.root_hash))
6192+        d.addCallback(lambda ignored:
6193+            mw.put_signature(self.signature))
6194+        # Now try to put the share hash chain again. This should fail
6195+        d.addCallback(lambda ignored:
6196+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6197+                            None,
6198+                            mw.put_sharehashes, self.share_hash_chain))
6199+        return d
6200+
6201+
6202+    def test_signature_after_verification_key(self):
6203+        mw = self._make_new_mw("si1", 0)
6204+        d = defer.succeed(None)
6205+        # Put everything up to and including the verification key.
6206+        for i in xrange(6):
6207+            d.addCallback(lambda ignored, i=i:
6208+                mw.put_block(self.block, i, self.salt))
6209+        d.addCallback(lambda ignored:
6210+            mw.put_encprivkey(self.encprivkey))
6211+        d.addCallback(lambda ignored:
6212+            mw.put_blockhashes(self.block_hash_tree))
6213+        d.addCallback(lambda ignored:
6214+            mw.put_sharehashes(self.share_hash_chain))
6215+        d.addCallback(lambda ignored:
6216+            mw.put_root_hash(self.root_hash))
6217+        d.addCallback(lambda ignored:
6218+            mw.put_signature(self.signature))
6219+        d.addCallback(lambda ignored:
6220+            mw.put_verification_key(self.verification_key))
6221+        # Now try to put the signature again. This should fail
6222+        d.addCallback(lambda ignored:
6223+            self.shouldFail(LayoutInvalid, "signature after verification",
6224+                            None,
6225+                            mw.put_signature, self.signature))
6226+        return d
6227+
6228+
6229+    def test_uncoordinated_write(self):
6230+        # Make two mutable writers, both pointing to the same storage
6231+        # server, both at the same storage index, and try writing to the
6232+        # same share.
6233+        mw1 = self._make_new_mw("si1", 0)
6234+        mw2 = self._make_new_mw("si1", 0)
6235+        d = defer.succeed(None)
6236+        def _check_success(results):
6237+            result, readvs = results
6238+            self.failUnless(result)
6239+
6240+        def _check_failure(results):
6241+            result, readvs = results
6242+            self.failIf(result)
6243+
6244+        d.addCallback(lambda ignored:
6245+            mw1.put_block(self.block, 0, self.salt))
6246+        d.addCallback(_check_success)
6247+        d.addCallback(lambda ignored:
6248+            mw2.put_block(self.block, 0, self.salt))
6249+        d.addCallback(_check_failure)
6250+        return d
6251+
6252+
6253+    def test_invalid_salt_size(self):
6254+        # Salts need to be 16 bytes in size. Writes that attempt to
6255+        # write more or less than this should be rejected.
6256+        mw = self._make_new_mw("si1", 0)
6257+        invalid_salt = "a" * 17 # 17 bytes
6258+        another_invalid_salt = "b" * 15 # 15 bytes
6259+        d = defer.succeed(None)
6260+        d.addCallback(lambda ignored:
6261+            self.shouldFail(LayoutInvalid, "salt too big",
6262+                            None,
6263+                            mw.put_block, self.block, 0, invalid_salt))
6264+        d.addCallback(lambda ignored:
6265+            self.shouldFail(LayoutInvalid, "salt too small",
6266+                            None,
6267+                            mw.put_block, self.block, 0,
6268+                            another_invalid_salt))
6269+        return d
6270+
6271+
6272+    def test_write_test_vectors(self):
6273+        # If we give the write proxy a bogus test vector at
6274+        # any point during the process, it should fail to write.
6275+        mw = self._make_new_mw("si1", 0)
6276+        mw.set_checkstring("this is a lie")
6277+        # The initial write should be expecting to find the improbable
6278+        # checkstring above in place; finding nothing, it should fail.
6279+        d = defer.succeed(None)
6280+        d.addCallback(lambda ignored:
6281+            mw.put_block(self.block, 0, self.salt))
6282+        def _check_failure(results):
6283+            result, readv = results
6284+            self.failIf(result)
6285+        d.addCallback(_check_failure)
6286+        # Now set the checkstring to the empty string, which
6287+        # indicates that no share is there.
6288+        d.addCallback(lambda ignored:
6289+            mw.set_checkstring(""))
6290+        d.addCallback(lambda ignored:
6291+            mw.put_block(self.block, 0, self.salt))
6292+        def _check_success(results):
6293+            result, readv = results
6294+            self.failUnless(result)
6295+        d.addCallback(_check_success)
6296+        # Now set the checkstring to something wrong
6297+        d.addCallback(lambda ignored:
6298+            mw.set_checkstring("something wrong"))
6299+        # This should fail to do anything
6300+        d.addCallback(lambda ignored:
6301+            mw.put_block(self.block, 1, self.salt))
6302+        d.addCallback(_check_failure)
6303+        # Now set it back to what it should be.
6304+        d.addCallback(lambda ignored:
6305+            mw.set_checkstring(mw.get_checkstring()))
6306+        for i in xrange(1, 6):
6307+            d.addCallback(lambda ignored, i=i:
6308+                mw.put_block(self.block, i, self.salt))
6309+            d.addCallback(_check_success)
6310+        d.addCallback(lambda ignored:
6311+            mw.put_encprivkey(self.encprivkey))
6312+        d.addCallback(_check_success)
6313+        d.addCallback(lambda ignored:
6314+            mw.put_blockhashes(self.block_hash_tree))
6315+        d.addCallback(_check_success)
6316+        d.addCallback(lambda ignored:
6317+            mw.put_sharehashes(self.share_hash_chain))
6318+        d.addCallback(_check_success)
6319+        def _keep_old_checkstring(ignored):
6320+            self.old_checkstring = mw.get_checkstring()
6321+            mw.set_checkstring("foobarbaz")
6322+        d.addCallback(_keep_old_checkstring)
6323+        d.addCallback(lambda ignored:
6324+            mw.put_root_hash(self.root_hash))
6325+        d.addCallback(_check_failure)
6326+        d.addCallback(lambda ignored:
6327+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
6328+        def _restore_old_checkstring(ignored):
6329+            mw.set_checkstring(self.old_checkstring)
6330+        d.addCallback(_restore_old_checkstring)
6331+        d.addCallback(lambda ignored:
6332+            mw.put_root_hash(self.root_hash))
6333+        d.addCallback(_check_success)
6334+        # The checkstring should have been set appropriately for us on
6335+        # the last write; if we try to change it to something else,
6336+        # that change should cause the signature step to fail.
6337+        d.addCallback(lambda ignored:
6338+            mw.set_checkstring("something else"))
6339+        d.addCallback(lambda ignored:
6340+            mw.put_signature(self.signature))
6341+        d.addCallback(_check_failure)
6342+        d.addCallback(lambda ignored:
6343+            mw.set_checkstring(mw.get_checkstring()))
6344+        d.addCallback(lambda ignored:
6345+            mw.put_signature(self.signature))
6346+        d.addCallback(_check_success)
6347+        d.addCallback(lambda ignored:
6348+            mw.put_verification_key(self.verification_key))
6349+        d.addCallback(_check_success)
6350+        return d
6351+
6352+
6353+    def test_offset_only_set_on_success(self):
6354+        # The write proxy should be smart enough to detect when a write
6355+        # has failed, and should not advance its notion of progress
6356+        # when that happens.
6357+        mw = self._make_new_mw("si1", 0)
6358+        d = defer.succeed(None)
6359+        for i in xrange(1, 6):
6360+            d.addCallback(lambda ignored, i=i:
6361+                mw.put_block(self.block, i, self.salt))
6362+        def _break_checkstring(ignored):
6363+            self._old_checkstring = mw.get_checkstring()
6364+            mw.set_checkstring("foobarbaz")
6365+
6366+        def _fix_checkstring(ignored):
6367+            mw.set_checkstring(self._old_checkstring)
6368+
6369+        d.addCallback(_break_checkstring)
6370+
6371+        # Setting the encrypted private key shouldn't work now, which is
6372+        # to be expected and is tested elsewhere. We also want to make
6373+        # sure that we can't add the block hash tree after a failed
6374+        # write of this sort.
6375+        d.addCallback(lambda ignored:
6376+            mw.put_encprivkey(self.encprivkey))
6377+        d.addCallback(lambda ignored:
6378+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
6379+                            None,
6380+                            mw.put_blockhashes, self.block_hash_tree))
6381+        d.addCallback(_fix_checkstring)
6382+        d.addCallback(lambda ignored:
6383+            mw.put_encprivkey(self.encprivkey))
6384+        d.addCallback(_break_checkstring)
6385+        d.addCallback(lambda ignored:
6386+            mw.put_blockhashes(self.block_hash_tree))
6387+        d.addCallback(lambda ignored:
6388+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
6389+                            None,
6390+                            mw.put_sharehashes, self.share_hash_chain))
6391+        d.addCallback(_fix_checkstring)
6392+        d.addCallback(lambda ignored:
6393+            mw.put_blockhashes(self.block_hash_tree))
6394+        d.addCallback(_break_checkstring)
6395+        d.addCallback(lambda ignored:
6396+            mw.put_sharehashes(self.share_hash_chain))
6397+        d.addCallback(lambda ignored:
6398+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
6399+                            None,
6400+                            mw.put_root_hash, self.root_hash))
6401+        d.addCallback(_fix_checkstring)
6402+        d.addCallback(lambda ignored:
6403+            mw.put_sharehashes(self.share_hash_chain))
6404+        d.addCallback(_break_checkstring)
6405+        d.addCallback(lambda ignored:
6406+            mw.put_root_hash(self.root_hash))
6407+        d.addCallback(lambda ignored:
6408+            self.shouldFail(LayoutInvalid, "out-of-order signature",
6409+                            None,
6410+                            mw.put_signature, self.signature))
6411+        d.addCallback(_fix_checkstring)
6412+        d.addCallback(lambda ignored:
6413+            mw.put_root_hash(self.root_hash))
6414+        d.addCallback(_break_checkstring)
6415+        d.addCallback(lambda ignored:
6416+            mw.put_signature(self.signature))
6417+        d.addCallback(lambda ignored:
6418+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
6419+                            None,
6420+                            mw.put_verification_key,
6421+                            self.verification_key))
6422+        d.addCallback(_fix_checkstring)
6423+        d.addCallback(lambda ignored:
6424+            mw.put_signature(self.signature))
6425+        d.addCallback(_break_checkstring)
6426+        d.addCallback(lambda ignored:
6427+            mw.put_verification_key(self.verification_key))
6428+        d.addCallback(lambda ignored:
6429+            self.shouldFail(LayoutInvalid, "out-of-order finish",
6430+                            None,
6431+                            mw.finish_publishing))
6432+        return d
6433+
6434+
6435+    def serialize_blockhashes(self, blockhashes):
6436+        return "".join(blockhashes)
6437+
6438+
6439+    def serialize_sharehashes(self, sharehashes):
6440+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6441+                        for i in sorted(sharehashes.keys())])
6442+        return ret
6443+
6444+
6445+    def test_write(self):
6446+        # This translates to a file with 6 6-byte segments, and with 2-byte
6447+        # blocks.
6448+        mw = self._make_new_mw("si1", 0)
6449+        mw2 = self._make_new_mw("si1", 1)
6450+        # Test writing some blocks.
6451+        read = self.ss.remote_slot_readv
6452+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6453+        written_block_size = 2 + len(self.salt)
6454+        written_block = self.block + self.salt
6455+        def _check_block_write(i, share):
6456+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6457+                                {share: [written_block]})
6458+        d = defer.succeed(None)
6459+        for i in xrange(6):
6460+            d.addCallback(lambda ignored, i=i:
6461+                mw.put_block(self.block, i, self.salt))
6462+            d.addCallback(lambda ignored, i=i:
6463+                _check_block_write(i, 0))
6464+        # Now try the same thing, but with share 1 instead of share 0.
6465+        for i in xrange(6):
6466+            d.addCallback(lambda ignored, i=i:
6467+                mw2.put_block(self.block, i, self.salt))
6468+            d.addCallback(lambda ignored, i=i:
6469+                _check_block_write(i, 1))
6470+
6471+        # Next, we make a fake encrypted private key, and put it onto the
6472+        # storage server.
6473+        d.addCallback(lambda ignored:
6474+            mw.put_encprivkey(self.encprivkey))
6475+        expected_private_key_offset = expected_sharedata_offset + \
6476+                                      len(written_block) * 6
6477+        self.failUnlessEqual(len(self.encprivkey), 7)
6478+        d.addCallback(lambda ignored:
6479+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6480+                                 {0: [self.encprivkey]}))
6481+
6482+        # Next, we put a fake block hash tree.
6483+        d.addCallback(lambda ignored:
6484+            mw.put_blockhashes(self.block_hash_tree))
6485+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6486+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6487+        d.addCallback(lambda ignored:
6488+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6489+                                 {0: [self.block_hash_tree_s]}))
6490+
6491+        # Next, put a fake share hash chain
6492+        d.addCallback(lambda ignored:
6493+            mw.put_sharehashes(self.share_hash_chain))
6494+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6495+        d.addCallback(lambda ignored:
6496+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6497+                                 {0: [self.share_hash_chain_s]}))
6498+
6499+        # Next, we put what is supposed to be the root hash of
6500+        # our share hash tree (but is really just test data).
6501+        d.addCallback(lambda ignored:
6502+            mw.put_root_hash(self.root_hash))
6503+        # The root hash gets inserted at byte 9; its position in the
6504+        # header is fixed.
6505+        def _check(ignored):
6506+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6507+                                 {0: [self.root_hash]})
6508+        d.addCallback(_check)
6509+
6510+        # Next, we put a signature of the header block.
6511+        d.addCallback(lambda ignored:
6512+            mw.put_signature(self.signature))
6513+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6514+        self.failUnlessEqual(len(self.signature), 9)
6515+        d.addCallback(lambda ignored:
6516+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6517+                                 {0: [self.signature]}))
6518+
6519+        # Next, we put the verification key
6520+        d.addCallback(lambda ignored:
6521+            mw.put_verification_key(self.verification_key))
6522+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
6523+        self.failUnlessEqual(len(self.verification_key), 6)
6524+        d.addCallback(lambda ignored:
6525+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6526+                                 {0: [self.verification_key]}))
6527+
6528+        def _check_signable(ignored):
6529+            # Make sure that the signable is what we think it should be.
6530+            signable = mw.get_signable()
6531+            verno, seq, roothash, k, n, segsize, datalen = \
6532+                                            struct.unpack(">BQ32sBBQQ",
6533+                                                          signable)
6534+            self.failUnlessEqual(verno, 1)
6535+            self.failUnlessEqual(seq, 0)
6536+            self.failUnlessEqual(roothash, self.root_hash)
6537+            self.failUnlessEqual(k, 3)
6538+            self.failUnlessEqual(n, 10)
6539+            self.failUnlessEqual(segsize, 6)
6540+            self.failUnlessEqual(datalen, 36)
6541+        d.addCallback(_check_signable)
6542+        # Next, we cause the offset table to be published.
6543+        d.addCallback(lambda ignored:
6544+            mw.finish_publishing())
6545+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6546+
6547+        def _check_offsets(ignored):
6548+            # Check the version number to make sure that it is correct.
6549+            expected_version_number = struct.pack(">B", 1)
6550+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6551+                                 {0: [expected_version_number]})
6552+            # Check the sequence number to make sure that it is correct
6553+            expected_sequence_number = struct.pack(">Q", 0)
6554+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6555+                                 {0: [expected_sequence_number]})
6556+            # Check that the encoding parameters (k, N, segment size, data
6557+            # length) are what they should be: 3, 10, 6, and 36, respectively.
6558+            expected_k = struct.pack(">B", 3)
6559+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6560+                                 {0: [expected_k]})
6561+            expected_n = struct.pack(">B", 10)
6562+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6563+                                 {0: [expected_n]})
6564+            expected_segment_size = struct.pack(">Q", 6)
6565+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6566+                                 {0: [expected_segment_size]})
6567+            expected_data_length = struct.pack(">Q", 36)
6568+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6569+                                 {0: [expected_data_length]})
6570+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6571+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6572+                                 {0: [expected_offset]})
6573+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6574+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6575+                                 {0: [expected_offset]})
6576+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6577+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6578+                                 {0: [expected_offset]})
6579+            expected_offset = struct.pack(">Q", expected_signature_offset)
6580+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6581+                                 {0: [expected_offset]})
6582+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6583+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6584+                                 {0: [expected_offset]})
6585+            expected_offset = struct.pack(">Q", expected_eof_offset)
6586+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6587+                                 {0: [expected_offset]})
6588+        d.addCallback(_check_offsets)
6589+        return d
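
The absolute byte positions read back in _check_offsets follow from the header layout these tests assume: the signable prefix ">BQ32sBBQQ" used in _check_signable, followed by six 8-byte offsets. A short struct-based sanity check (illustrative, not part of the test suite) reproduces the constants 9, 41, 42, 43, 51 and 59..99 above, and shows the header size working out to the 107-byte prefetch length used in the prefetched-data tests below:

    import struct

    fields = [("version", ">B"), ("seqnum", ">Q"), ("root_hash", ">32s"),
              ("k", ">B"), ("n", ">B"), ("segsize", ">Q"), ("datalen", ">Q")]
    pos, positions = 0, {}
    for name, fmt in fields:
        positions[name] = pos
        pos += struct.calcsize(fmt)
    assert positions == {"version": 0, "seqnum": 1, "root_hash": 9,
                         "k": 41, "n": 42, "segsize": 43, "datalen": 51}
    assert pos == struct.calcsize(">BQ32sBBQQ") == 59
    offset_table = [59 + 8 * i for i in range(6)]    # 59, 67, 75, 83, 91, 99
    assert struct.calcsize(">BQ32sBBQQ" + "QQQQQQ") == 107
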
6590+
6591+    def _make_new_mw(self, si, share, datalength=36):
6592+        # This is a file of size 36 bytes. Since it has a segment
6593+        # size of 6, we know that it has 6 byte segments, which will
6594+        # be split into blocks of 2 bytes because our FEC k
6595+        # parameter is 3.
6596+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6597+                                6, datalength)
6598+        return mw
6599+
6600+
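The constants baked into _make_new_mw are mutually consistent; a two-line check of the arithmetic (purely illustrative):

    datalength, segsize, k = 36, 6, 3
    assert datalength // segsize == 6      # six segments per file
    assert segsize // k == 2               # two-byte blocks per segment
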
6601+    def test_write_rejected_with_too_many_blocks(self):
6602+        mw = self._make_new_mw("si0", 0)
6603+
6604+        # Try writing too many blocks. We should not be able to write
6605+        # more than 6 blocks into each share, since the file has only
6606+        # 6 segments.
6607+        d = defer.succeed(None)
6608+        for i in xrange(6):
6609+            d.addCallback(lambda ignored, i=i:
6610+                mw.put_block(self.block, i, self.salt))
6611+        d.addCallback(lambda ignored:
6612+            self.shouldFail(LayoutInvalid, "too many blocks",
6613+                            None,
6614+                            mw.put_block, self.block, 7, self.salt))
6615+        return d
6616+
6617+
6618+    def test_write_rejected_with_invalid_salt(self):
6619+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6620+        # less should cause an error.
6621+        mw = self._make_new_mw("si1", 0)
6622+        bad_salt = "a" * 17 # 17 bytes
6623+        d = defer.succeed(None)
6624+        d.addCallback(lambda ignored:
6625+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6626+                            None, mw.put_block, self.block, 7, bad_salt))
6627+        return d
6628+
6629+
6630+    def test_write_rejected_with_invalid_root_hash(self):
6631+        # Try writing an invalid root hash. This should be SHA256d, and
6632+        # 32 bytes long as a result.
6633+        mw = self._make_new_mw("si2", 0)
6634+        # 17 bytes != 32 bytes
6635+        invalid_root_hash = "a" * 17
6636+        d = defer.succeed(None)
6637+        # Before this test can work, we need to put some blocks + salts,
6638+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6639+        # failures that match what we are looking for, but are caused by
6640+        # the constraints imposed on operation ordering.
6641+        for i in xrange(6):
6642+            d.addCallback(lambda ignored, i=i:
6643+                mw.put_block(self.block, i, self.salt))
6644+        d.addCallback(lambda ignored:
6645+            mw.put_encprivkey(self.encprivkey))
6646+        d.addCallback(lambda ignored:
6647+            mw.put_blockhashes(self.block_hash_tree))
6648+        d.addCallback(lambda ignored:
6649+            mw.put_sharehashes(self.share_hash_chain))
6650+        d.addCallback(lambda ignored:
6651+            self.shouldFail(LayoutInvalid, "invalid root hash",
6652+                            None, mw.put_root_hash, invalid_root_hash))
6653+        return d
6654+
6655+
6656+    def test_write_rejected_with_invalid_blocksize(self):
6657+        # The block size implied by the writer that we get from
6658+        # _make_new_mw is 2 bytes -- any more or any less than this
6659+        # should cause a failure, unless it is the tail segment, in
6660+        # which case a different size may be valid.
6661+        invalid_block = "a"
6662+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6663+                                             # one byte blocks
6664+        # 1 byte != 2 bytes
6665+        d = defer.succeed(None)
6666+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6667+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6668+                            None, mw.put_block, invalid_block, 0,
6669+                            self.salt))
6670+        invalid_block = invalid_block * 3
6671+        # 3 bytes != 2 bytes
6672+        d.addCallback(lambda ignored:
6673+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6674+                            None,
6675+                            mw.put_block, invalid_block, 0, self.salt))
6676+        for i in xrange(5):
6677+            d.addCallback(lambda ignored, i=i:
6678+                mw.put_block(self.block, i, self.salt))
6679+        # Try to put an invalid tail segment
6680+        d.addCallback(lambda ignored:
6681+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6682+                            None,
6683+                            mw.put_block, self.block, 5, self.salt))
6684+        valid_block = "a"
6685+        d.addCallback(lambda ignored:
6686+            mw.put_block(valid_block, 5, self.salt))
6687+        return d
6688+
6689+
6690+    def test_write_enforces_order_constraints(self):
6691+        # We require that the MDMFSlotWriteProxy be interacted with in a
6692+        # specific way.
6693+        # That way is:
6694+        # 0: __init__
6695+        # 1: write blocks and salts
6696+        # 2: Write the encrypted private key
6697+        # 3: Write the block hashes
6698+        # 4: Write the share hashes
6699+        # 5: Write the root hash and salt hash
6700+        # 6: Write the signature and verification key
6701+        # 7: Finish publishing (write the offset table).
6702+        #
6703+        # Some of these can be performed out-of-order, and some can't.
6704+        # The dependencies that I want to test here are:
6705+        #  - Private key before block hashes
6706+        #  - share hashes and block hashes before root hash
6707+        #  - root hash before signature
6708+        #  - signature before verification key
6709+        mw0 = self._make_new_mw("si0", 0)
6710+        # Write some shares
6711+        d = defer.succeed(None)
6712+        for i in xrange(6):
6713+            d.addCallback(lambda ignored, i=i:
6714+                mw0.put_block(self.block, i, self.salt))
6715+        # Try to write the block hashes before writing the encrypted
6716+        # private key
6717+        d.addCallback(lambda ignored:
6718+            self.shouldFail(LayoutInvalid, "block hashes before key",
6719+                            None, mw0.put_blockhashes,
6720+                            self.block_hash_tree))
6721+
6722+        # Write the private key.
6723+        d.addCallback(lambda ignored:
6724+            mw0.put_encprivkey(self.encprivkey))
6725+
6726+
6727+        # Try to write the share hash chain without writing the block
6728+        # hash tree
6729+        d.addCallback(lambda ignored:
6730+            self.shouldFail(LayoutInvalid, "share hash chain before "
6731+                                           "block hash tree",
6732+                            None,
6733+                            mw0.put_sharehashes, self.share_hash_chain))
6734+
6735+        # Try to write the root hash without writing either the
6736+        # block hashes or the share hashes
6737+        d.addCallback(lambda ignored:
6738+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6739+                            None,
6740+                            mw0.put_root_hash, self.root_hash))
6741+
6742+        # Now write the block hashes and try again
6743+        d.addCallback(lambda ignored:
6744+            mw0.put_blockhashes(self.block_hash_tree))
6745+
6746+        d.addCallback(lambda ignored:
6747+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6748+                            None, mw0.put_root_hash, self.root_hash))
6749+
6750+        # We haven't yet put the root hash on the share, so we shouldn't
6751+        # be able to sign it.
6752+        d.addCallback(lambda ignored:
6753+            self.shouldFail(LayoutInvalid, "signature before root hash",
6754+                            None, mw0.put_signature, self.signature))
6755+
6756+        d.addCallback(lambda ignored:
6757+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6758+
6759+        # ...and, since that fails, we also shouldn't be able to put the
6760+        # verification key.
6761+        d.addCallback(lambda ignored:
6762+            self.shouldFail(LayoutInvalid, "key before signature",
6763+                            None, mw0.put_verification_key,
6764+                            self.verification_key))
6765+
6766+        # Now write the share hashes.
6767+        d.addCallback(lambda ignored:
6768+            mw0.put_sharehashes(self.share_hash_chain))
6769+        # We should be able to write the root hash now too
6770+        d.addCallback(lambda ignored:
6771+            mw0.put_root_hash(self.root_hash))
6772+
6773+        # We should still be unable to put the verification key
6774+        d.addCallback(lambda ignored:
6775+            self.shouldFail(LayoutInvalid, "key before signature",
6776+                            None, mw0.put_verification_key,
6777+                            self.verification_key))
6778+
6779+        d.addCallback(lambda ignored:
6780+            mw0.put_signature(self.signature))
6781+
6782+        # We shouldn't be able to write the offsets to the remote server
6783+        # until the offset table is finished; IOW, until we have written
6784+        # the verification key.
6785+        d.addCallback(lambda ignored:
6786+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6787+                            None,
6788+                            mw0.finish_publishing))
6789+
6790+        d.addCallback(lambda ignored:
6791+            mw0.put_verification_key(self.verification_key))
6792+        return d
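
A strictly linear toy model of the ordering discipline verified above (the real proxy is more permissive about some steps, as the comments note; the class and step names here are illustrative only):

    ORDER = ["blocks", "encprivkey", "blockhashes", "sharehashes",
             "root_hash", "signature", "verification_key", "finish"]

    class ToyOrderedWriter(object):
        def __init__(self):
            self.done = set()
        def put(self, step):
            # every earlier step must already have been written
            if not all(p in self.done for p in ORDER[:ORDER.index(step)]):
                raise ValueError("out-of-order write: %s" % step)
            self.done.add(step)

    w = ToyOrderedWriter()
    w.put("blocks")
    try:
        w.put("blockhashes")       # rejected: private key not written yet
    except ValueError:
        pass
    w.put("encprivkey")
    w.put("blockhashes")           # now allowed
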
6793+
6794+
6795+    def test_end_to_end(self):
6796+        mw = self._make_new_mw("si1", 0)
6797+        # Write a share using the mutable writer, and make sure that the
6798+        # reader knows how to read everything back to us.
6799+        d = defer.succeed(None)
6800+        for i in xrange(6):
6801+            d.addCallback(lambda ignored, i=i:
6802+                mw.put_block(self.block, i, self.salt))
6803+        d.addCallback(lambda ignored:
6804+            mw.put_encprivkey(self.encprivkey))
6805+        d.addCallback(lambda ignored:
6806+            mw.put_blockhashes(self.block_hash_tree))
6807+        d.addCallback(lambda ignored:
6808+            mw.put_sharehashes(self.share_hash_chain))
6809+        d.addCallback(lambda ignored:
6810+            mw.put_root_hash(self.root_hash))
6811+        d.addCallback(lambda ignored:
6812+            mw.put_signature(self.signature))
6813+        d.addCallback(lambda ignored:
6814+            mw.put_verification_key(self.verification_key))
6815+        d.addCallback(lambda ignored:
6816+            mw.finish_publishing())
6817+
6818+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6819+        def _check_block_and_salt((block, salt)):
6820+            self.failUnlessEqual(block, self.block)
6821+            self.failUnlessEqual(salt, self.salt)
6822+
6823+        for i in xrange(6):
6824+            d.addCallback(lambda ignored, i=i:
6825+                mr.get_block_and_salt(i))
6826+            d.addCallback(_check_block_and_salt)
6827+
6828+        d.addCallback(lambda ignored:
6829+            mr.get_encprivkey())
6830+        d.addCallback(lambda encprivkey:
6831+            self.failUnlessEqual(self.encprivkey, encprivkey))
6832+
6833+        d.addCallback(lambda ignored:
6834+            mr.get_blockhashes())
6835+        d.addCallback(lambda blockhashes:
6836+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6837+
6838+        d.addCallback(lambda ignored:
6839+            mr.get_sharehashes())
6840+        d.addCallback(lambda sharehashes:
6841+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6842+
6843+        d.addCallback(lambda ignored:
6844+            mr.get_signature())
6845+        d.addCallback(lambda signature:
6846+            self.failUnlessEqual(signature, self.signature))
6847+
6848+        d.addCallback(lambda ignored:
6849+            mr.get_verification_key())
6850+        d.addCallback(lambda verification_key:
6851+            self.failUnlessEqual(verification_key, self.verification_key))
6852+
6853+        d.addCallback(lambda ignored:
6854+            mr.get_seqnum())
6855+        d.addCallback(lambda seqnum:
6856+            self.failUnlessEqual(seqnum, 0))
6857+
6858+        d.addCallback(lambda ignored:
6859+            mr.get_root_hash())
6860+        d.addCallback(lambda root_hash:
6861+            self.failUnlessEqual(self.root_hash, root_hash))
6862+
6863+        d.addCallback(lambda ignored:
6864+            mr.get_encoding_parameters())
6865+        def _check_encoding_parameters((k, n, segsize, datalen)):
6866+            self.failUnlessEqual(k, 3)
6867+            self.failUnlessEqual(n, 10)
6868+            self.failUnlessEqual(segsize, 6)
6869+            self.failUnlessEqual(datalen, 36)
6870+        d.addCallback(_check_encoding_parameters)
6871+
6872+        d.addCallback(lambda ignored:
6873+            mr.get_checkstring())
6874+        d.addCallback(lambda checkstring:
6875+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
6876+        return d
6877+
6878+
6879+    def test_is_sdmf(self):
6880+        # The MDMFSlotReadProxy should also know how to read SDMF files,
6881+        # since it will encounter them on the grid. Callers use the
6882+        # is_sdmf method to test this.
6883+        self.write_sdmf_share_to_server("si1")
6884+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6885+        d = mr.is_sdmf()
6886+        d.addCallback(lambda issdmf:
6887+            self.failUnless(issdmf))
6888+        return d
6889+
6890+
6891+    def test_reads_sdmf(self):
6892+        # The slot read proxy should, naturally, know how to tell us
6893+        # about data in the SDMF format
6894+        self.write_sdmf_share_to_server("si1")
6895+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6896+        d = defer.succeed(None)
6897+        d.addCallback(lambda ignored:
6898+            mr.is_sdmf())
6899+        d.addCallback(lambda issdmf:
6900+            self.failUnless(issdmf))
6901+
6902+        # What do we need to read?
6903+        #  - The sharedata
6904+        #  - The salt
6905+        d.addCallback(lambda ignored:
6906+            mr.get_block_and_salt(0))
6907+        def _check_block_and_salt(results):
6908+            block, salt = results
6909+            # Our original file is 36 bytes long, so each share is 12
6910+            # bytes in size. The share is composed entirely of the
6911+            # letter a. self.block contains two 'a's, so 6 * self.block
6912+            # is what we are looking for.
6913+            self.failUnlessEqual(block, self.block * 6)
6914+            self.failUnlessEqual(salt, self.salt)
6915+        d.addCallback(_check_block_and_salt)
6916+
6917+        #  - The blockhashes
6918+        d.addCallback(lambda ignored:
6919+            mr.get_blockhashes())
6920+        d.addCallback(lambda blockhashes:
6921+            self.failUnlessEqual(self.block_hash_tree,
6922+                                 blockhashes,
6923+                                 blockhashes))
6924+        #  - The sharehashes
6925+        d.addCallback(lambda ignored:
6926+            mr.get_sharehashes())
6927+        d.addCallback(lambda sharehashes:
6928+            self.failUnlessEqual(self.share_hash_chain,
6929+                                 sharehashes))
6930+        #  - The keys
6931+        d.addCallback(lambda ignored:
6932+            mr.get_encprivkey())
6933+        d.addCallback(lambda encprivkey:
6934+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
6935+        d.addCallback(lambda ignored:
6936+            mr.get_verification_key())
6937+        d.addCallback(lambda verification_key:
6938+            self.failUnlessEqual(verification_key,
6939+                                 self.verification_key,
6940+                                 verification_key))
6941+        #  - The signature
6942+        d.addCallback(lambda ignored:
6943+            mr.get_signature())
6944+        d.addCallback(lambda signature:
6945+            self.failUnlessEqual(signature, self.signature, signature))
6946+
6947+        #  - The sequence number
6948+        d.addCallback(lambda ignored:
6949+            mr.get_seqnum())
6950+        d.addCallback(lambda seqnum:
6951+            self.failUnlessEqual(seqnum, 0, seqnum))
6952+
6953+        #  - The root hash
6954+        d.addCallback(lambda ignored:
6955+            mr.get_root_hash())
6956+        d.addCallback(lambda root_hash:
6957+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
6958+        return d
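
The arithmetic behind the SDMF expectations above, spelled out (illustrative only):

    datalen, k = 36, 3
    share_size = datalen // k            # 12 bytes of share data per share
    block = "aa"                         # self.block is two letter 'a's
    assert block * 6 == "a" * share_size
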
6959+
6960+
6961+    def test_only_reads_one_segment_sdmf(self):
6962+        # SDMF shares have only one segment, so it doesn't make sense to
6963+        # read more segments than that. The reader should know this and
6964+        # complain if we try to do that.
6965+        self.write_sdmf_share_to_server("si1")
6966+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6967+        d = defer.succeed(None)
6968+        d.addCallback(lambda ignored:
6969+            mr.is_sdmf())
6970+        d.addCallback(lambda issdmf:
6971+            self.failUnless(issdmf))
6972+        d.addCallback(lambda ignored:
6973+            self.shouldFail(LayoutInvalid, "test bad segment",
6974+                            None,
6975+                            mr.get_block_and_salt, 1))
6976+        return d
6977+
6978+
6979+    def test_read_with_prefetched_mdmf_data(self):
6980+        # The MDMFSlotReadProxy will prefill certain fields if you pass
6981+        # it data that you have already fetched. This is useful for
6982+        # cases like the Servermap, which prefetches ~2kb of data while
6983+        # finding out which shares are on the remote peer so that it
6984+        # doesn't waste round trips.
6985+        mdmf_data = self.build_test_mdmf_share()
6986+        self.write_test_share_to_server("si1")
6987+        def _make_mr(ignored, length):
6988+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
6989+            return mr
6990+
6991+        d = defer.succeed(None)
6992+        # This should be enough to fill in both the encoding parameters
6993+        # and the table of offsets, which will complete the version
6994+        # information tuple.
6995+        d.addCallback(_make_mr, 107)
6996+        d.addCallback(lambda mr:
6997+            mr.get_verinfo())
6998+        def _check_verinfo(verinfo):
6999+            self.failUnless(verinfo)
7000+            self.failUnlessEqual(len(verinfo), 9)
7001+            (seqnum,
7002+             root_hash,
7003+             salt_hash,
7004+             segsize,
7005+             datalen,
7006+             k,
7007+             n,
7008+             prefix,
7009+             offsets) = verinfo
7010+            self.failUnlessEqual(seqnum, 0)
7011+            self.failUnlessEqual(root_hash, self.root_hash)
7012+            self.failUnlessEqual(segsize, 6)
7013+            self.failUnlessEqual(datalen, 36)
7014+            self.failUnlessEqual(k, 3)
7015+            self.failUnlessEqual(n, 10)
7016+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7017+                                          1,
7018+                                          seqnum,
7019+                                          root_hash,
7020+                                          k,
7021+                                          n,
7022+                                          segsize,
7023+                                          datalen)
7024+            self.failUnlessEqual(expected_prefix, prefix)
7025+            self.failUnlessEqual(self.rref.read_count, 0)
7026+        d.addCallback(_check_verinfo)
7027+        # This is not enough data to read a block and its salt, so the
7028+        # wrapper should attempt to read them from the remote server.
7029+        d.addCallback(_make_mr, 107)
7030+        d.addCallback(lambda mr:
7031+            mr.get_block_and_salt(0))
7032+        def _check_block_and_salt((block, salt)):
7033+            self.failUnlessEqual(block, self.block)
7034+            self.failUnlessEqual(salt, self.salt)
7035+            self.failUnlessEqual(self.rref.read_count, 1)
7036+        # This should be enough data to read one block.
7037+        d.addCallback(_make_mr, 249)
7038+        d.addCallback(lambda mr:
7039+            mr.get_block_and_salt(0))
7040+        d.addCallback(_check_block_and_salt)
7041+        return d
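
The read_count assertions above depend on the proxy serving reads out of the prefetched data when it covers the requested range, and falling back to the remote server otherwise. A minimal cache-or-fetch sketch (names hypothetical, not the proxy's real code):

    class ToyCachedReader(object):
        def __init__(self, cached_data, remote_read):
            self._cache = cached_data
            self._remote_read = remote_read
            self.read_count = 0
        def read(self, offset, length):
            if offset + length <= len(self._cache):
                return self._cache[offset:offset + length]   # no round trip
            self.read_count += 1
            return self._remote_read(offset, length)

    full_share = "h" * 107 + "d" * 400                # header + share data
    r = ToyCachedReader(full_share[:107], lambda o, l: full_share[o:o + l])
    r.read(0, 107)                 # served from the prefetched header
    assert r.read_count == 0
    r.read(107, 42)                # needs the remote server
    assert r.read_count == 1
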
7042+
7043+
7044+    def test_read_with_prefetched_sdmf_data(self):
7045+        sdmf_data = self.build_test_sdmf_share()
7046+        self.write_sdmf_share_to_server("si1")
7047+        def _make_mr(ignored, length):
7048+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7049+            return mr
7050+
7051+        d = defer.succeed(None)
7052+        # This should be enough to get us the encoding parameters,
7053+        # offset table, and everything else we need to build a verinfo
7054+        # string.
7055+        d.addCallback(_make_mr, 107)
7056+        d.addCallback(lambda mr:
7057+            mr.get_verinfo())
7058+        def _check_verinfo(verinfo):
7059+            self.failUnless(verinfo)
7060+            self.failUnlessEqual(len(verinfo), 9)
7061+            (seqnum,
7062+             root_hash,
7063+             salt,
7064+             segsize,
7065+             datalen,
7066+             k,
7067+             n,
7068+             prefix,
7069+             offsets) = verinfo
7070+            self.failUnlessEqual(seqnum, 0)
7071+            self.failUnlessEqual(root_hash, self.root_hash)
7072+            self.failUnlessEqual(salt, self.salt)
7073+            self.failUnlessEqual(segsize, 36)
7074+            self.failUnlessEqual(datalen, 36)
7075+            self.failUnlessEqual(k, 3)
7076+            self.failUnlessEqual(n, 10)
7077+            expected_prefix = struct.pack(SIGNED_PREFIX,
7078+                                          0,
7079+                                          seqnum,
7080+                                          root_hash,
7081+                                          salt,
7082+                                          k,
7083+                                          n,
7084+                                          segsize,
7085+                                          datalen)
7086+            self.failUnlessEqual(expected_prefix, prefix)
7087+            self.failUnlessEqual(self.rref.read_count, 0)
7088+        d.addCallback(_check_verinfo)
7089+        # This shouldn't be enough to read any share data.
7090+        d.addCallback(_make_mr, 107)
7091+        d.addCallback(lambda mr:
7092+            mr.get_block_and_salt(0))
7093+        def _check_block_and_salt((block, salt)):
7094+            self.failUnlessEqual(block, self.block * 6)
7095+            self.failUnlessEqual(salt, self.salt)
7096+            # TODO: Fix the read routine so that it reads only the data
7097+            #       that it has cached if it can't read all of it.
7098+            self.failUnlessEqual(self.rref.read_count, 2)
7099+
7100+        # This should be enough to read share data.
7101+        d.addCallback(_make_mr, self.offsets['share_data'])
7102+        d.addCallback(lambda mr:
7103+            mr.get_block_and_salt(0))
7104+        d.addCallback(_check_block_and_salt)
7105+        return d
7106+
7107+
7108+    def test_read_with_empty_mdmf_file(self):
7109+        # Some tests upload a file with no contents to test things
7110+        # unrelated to the actual handling of the content of the file.
7111+        # The reader should behave intelligently in these cases.
7112+        self.write_test_share_to_server("si1", empty=True)
7113+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7114+        # We should be able to get the encoding parameters, and they
7115+        # should be correct.
7116+        d = defer.succeed(None)
7117+        d.addCallback(lambda ignored:
7118+            mr.get_encoding_parameters())
7119+        def _check_encoding_parameters(params):
7120+            self.failUnlessEqual(len(params), 4)
7121+            k, n, segsize, datalen = params
7122+            self.failUnlessEqual(k, 3)
7123+            self.failUnlessEqual(n, 10)
7124+            self.failUnlessEqual(segsize, 0)
7125+            self.failUnlessEqual(datalen, 0)
7126+        d.addCallback(_check_encoding_parameters)
7127+
7128+        # We should not be able to fetch a block, since there are no
7129+        # blocks to fetch
7130+        d.addCallback(lambda ignored:
7131+            self.shouldFail(LayoutInvalid, "get block on empty file",
7132+                            None,
7133+                            mr.get_block_and_salt, 0))
7134+        return d
7135+
7136+
7137+    def test_read_with_empty_sdmf_file(self):
7138+        self.write_sdmf_share_to_server("si1", empty=True)
7139+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7140+        # We should be able to get the encoding parameters, and they
7141+        # should be correct
7142+        d = defer.succeed(None)
7143+        d.addCallback(lambda ignored:
7144+            mr.get_encoding_parameters())
7145+        def _check_encoding_parameters(params):
7146+            self.failUnlessEqual(len(params), 4)
7147+            k, n, segsize, datalen = params
7148+            self.failUnlessEqual(k, 3)
7149+            self.failUnlessEqual(n, 10)
7150+            self.failUnlessEqual(segsize, 0)
7151+            self.failUnlessEqual(datalen, 0)
7152+        d.addCallback(_check_encoding_parameters)
7153+
7154+        # It does not make sense to get a block in this format, so we
7155+        # should not be able to.
7156+        d.addCallback(lambda ignored:
7157+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7158+                            None,
7159+                            mr.get_block_and_salt, 0))
7160+        return d
7161+
7162+
7163+    def test_verinfo_with_sdmf_file(self):
7164+        self.write_sdmf_share_to_server("si1")
7165+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7166+        # We should be able to get the version information.
7167+        d = defer.succeed(None)
7168+        d.addCallback(lambda ignored:
7169+            mr.get_verinfo())
7170+        def _check_verinfo(verinfo):
7171+            self.failUnless(verinfo)
7172+            self.failUnlessEqual(len(verinfo), 9)
7173+            (seqnum,
7174+             root_hash,
7175+             salt,
7176+             segsize,
7177+             datalen,
7178+             k,
7179+             n,
7180+             prefix,
7181+             offsets) = verinfo
7182+            self.failUnlessEqual(seqnum, 0)
7183+            self.failUnlessEqual(root_hash, self.root_hash)
7184+            self.failUnlessEqual(salt, self.salt)
7185+            self.failUnlessEqual(segsize, 36)
7186+            self.failUnlessEqual(datalen, 36)
7187+            self.failUnlessEqual(k, 3)
7188+            self.failUnlessEqual(n, 10)
7189+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7190+                                          0,
7191+                                          seqnum,
7192+                                          root_hash,
7193+                                          salt,
7194+                                          k,
7195+                                          n,
7196+                                          segsize,
7197+                                          datalen)
7198+            self.failUnlessEqual(prefix, expected_prefix)
7199+            self.failUnlessEqual(offsets, self.offsets)
7200+        d.addCallback(_check_verinfo)
7201+        return d
7202+
7203+
7204+    def test_verinfo_with_mdmf_file(self):
7205+        self.write_test_share_to_server("si1")
7206+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7207+        d = defer.succeed(None)
7208+        d.addCallback(lambda ignored:
7209+            mr.get_verinfo())
7210+        def _check_verinfo(verinfo):
7211+            self.failUnless(verinfo)
7212+            self.failUnlessEqual(len(verinfo), 9)
7213+            (seqnum,
7214+             root_hash,
7215+             IV,
7216+             segsize,
7217+             datalen,
7218+             k,
7219+             n,
7220+             prefix,
7221+             offsets) = verinfo
7222+            self.failUnlessEqual(seqnum, 0)
7223+            self.failUnlessEqual(root_hash, self.root_hash)
7224+            self.failIf(IV)
7225+            self.failUnlessEqual(segsize, 6)
7226+            self.failUnlessEqual(datalen, 36)
7227+            self.failUnlessEqual(k, 3)
7228+            self.failUnlessEqual(n, 10)
7229+            expected_prefix = struct.pack(">BQ32s BBQQ",
7230+                                          1,
7231+                                          seqnum,
7232+                                          root_hash,
7233+                                          k,
7234+                                          n,
7235+                                          segsize,
7236+                                          datalen)
7237+            self.failUnlessEqual(prefix, expected_prefix)
7238+            self.failUnlessEqual(offsets, self.offsets)
7239+        d.addCallback(_check_verinfo)
7240+        return d
7241+
7242+
7243+    def test_reader_queue(self):
7244+        self.write_test_share_to_server('si1')
7245+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7246+        d1 = mr.get_block_and_salt(0, queue=True)
7247+        d2 = mr.get_blockhashes(queue=True)
7248+        d3 = mr.get_sharehashes(queue=True)
7249+        d4 = mr.get_signature(queue=True)
7250+        d5 = mr.get_verification_key(queue=True)
7251+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7252+        mr.flush()
7253+        def _check_results(results):
7254+            self.failUnlessEqual(len(results), 5)
7255+            # We have one read for version information and offsets, and
7256+            # one for everything else.
7257+            self.failUnlessEqual(self.rref.read_count, 2)
7258+            block, salt = results[0][1] # results[0][0] is a boolean that
7259+                                        # says whether or not the operation
7260+                                        # worked.
7261+            self.failUnlessEqual(self.block, block)
7262+            self.failUnlessEqual(self.salt, salt)
7263+
7264+            blockhashes = results[1][1]
7265+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7266+
7267+            sharehashes = results[2][1]
7268+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7269+
7270+            signature = results[3][1]
7271+            self.failUnlessEqual(self.signature, signature)
7272+
7273+            verification_key = results[4][1]
7274+            self.failUnlessEqual(self.verification_key, verification_key)
7275+        dl.addCallback(_print)
7276+        return dl
7277+
7278+
7279+    def test_sdmf_writer(self):
7280+        # Go through the motions of writing an SDMF share to the storage
7281+        # server. Then read the storage server to see that the share got
7282+        # written in the way that we think it should have.
7283+
7284+        # We do this first so that the necessary instance variables get
7285+        # set the way we want them for the tests below.
7286+        data = self.build_test_sdmf_share()
7287+        sdmfr = SDMFSlotWriteProxy(0,
7288+                                   self.rref,
7289+                                   "si1",
7290+                                   self.secrets,
7291+                                   0, 3, 10, 36, 36)
7292+        # Put the block and salt.
7293+        sdmfr.put_block(self.blockdata, 0, self.salt)
7294+
7295+        # Put the encprivkey
7296+        sdmfr.put_encprivkey(self.encprivkey)
7297+
7298+        # Put the block and share hash chains
7299+        sdmfr.put_blockhashes(self.block_hash_tree)
7300+        sdmfr.put_sharehashes(self.share_hash_chain)
7301+        sdmfr.put_root_hash(self.root_hash)
7302+
7303+        # Put the signature
7304+        sdmfr.put_signature(self.signature)
7305+
7306+        # Put the verification key
7307+        sdmfr.put_verification_key(self.verification_key)
7308+
7309+        # Now check to make sure that nothing has been written yet.
7310+        self.failUnlessEqual(self.rref.write_count, 0)
7311+
7312+        # Now finish publishing
7313+        d = sdmfr.finish_publishing()
7314+        def _then(ignored):
7315+            self.failUnlessEqual(self.rref.write_count, 1)
7316+            read = self.ss.remote_slot_readv
7317+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7318+                                 {0: [data]})
7319+        d.addCallback(_then)
7320+        return d
7321+
7322+
7323+    def test_sdmf_writer_preexisting_share(self):
7324+        data = self.build_test_sdmf_share()
7325+        self.write_sdmf_share_to_server("si1")
7326+
7327+        # Now there is a share on the storage server. To successfully
7328+        # write, we need to set the checkstring correctly. When we
7329+        # don't, no write should occur.
7330+        sdmfw = SDMFSlotWriteProxy(0,
7331+                                   self.rref,
7332+                                   "si1",
7333+                                   self.secrets,
7334+                                   1, 3, 10, 36, 36)
7335+        sdmfw.put_block(self.blockdata, 0, self.salt)
7336+
7337+        # Put the encprivkey
7338+        sdmfw.put_encprivkey(self.encprivkey)
7339+
7340+        # Put the block and share hash chains
7341+        sdmfw.put_blockhashes(self.block_hash_tree)
7342+        sdmfw.put_sharehashes(self.share_hash_chain)
7343+
7344+        # Put the root hash
7345+        sdmfw.put_root_hash(self.root_hash)
7346+
7347+        # Put the signature
7348+        sdmfw.put_signature(self.signature)
7349+
7350+        # Put the verification key
7351+        sdmfw.put_verification_key(self.verification_key)
7352+
7353+        # We shouldn't have a checkstring yet
7354+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7355+
7356+        d = sdmfw.finish_publishing()
7357+        def _then(results):
7358+            self.failIf(results[0])
7359+            # this is the correct checkstring
7360+            self._expected_checkstring = results[1][0][0]
7361+            return self._expected_checkstring
7362+
7363+        d.addCallback(_then)
7364+        d.addCallback(sdmfw.set_checkstring)
7365+        d.addCallback(lambda ignored:
7366+            sdmfw.get_checkstring())
7367+        d.addCallback(lambda checkstring:
7368+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7369+        d.addCallback(lambda ignored:
7370+            sdmfw.finish_publishing())
7371+        def _then_again(results):
7372+            self.failUnless(results[0])
7373+            read = self.ss.remote_slot_readv
7374+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7375+                                 {0: [struct.pack(">Q", 1)]})
7376+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7377+                                 {0: [data[9:]]})
7378+        d.addCallback(_then_again)
7379+        return d
7380+
7381+
7382 class Stats(unittest.TestCase):
7383 
7384     def setUp(self):
7385}
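
The test_sdmf_writer_preexisting_share test above exercises the checkstring-based test-and-set that guards mutable writes: the first finish_publishing() call is expected to fail because the proxy's still-empty checkstring does not match the share already on the server, the readv results returned with that failure carry the checkstring currently stored in the slot, and the caller adopts it with set_checkstring() before retrying. A minimal sketch of that retry pattern, assuming a writer object with the same finish_publishing/set_checkstring methods the test uses (publish_with_retry is an illustrative helper, not part of this patch):

    def publish_with_retry(writer):
        # First attempt: succeeds only if our test vector (checkstring)
        # matches whatever is currently stored in the slot.
        d = writer.finish_publishing()
        def _maybe_retry(results):
            wrote, read_data = results
            if wrote:
                return results
            # The readv results tell us the checkstring that is actually
            # on the server for share 0; adopt it and try again.
            writer.set_checkstring(read_data[0][0])
            return writer.finish_publishing()
        d.addCallback(_maybe_retry)
        return d
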
7386[mutable/publish.py: cleanup + simplification
7387Kevan Carstensen <kevan@isnotajoke.com>**20100702225554
7388 Ignore-this: 36a58424ceceffb1ddc55cc5934399e2
7389] {
7390hunk ./src/allmydata/mutable/publish.py 19
7391      UncoordinatedWriteError, NotEnoughServersError
7392 from allmydata.mutable.servermap import ServerMap
7393 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
7394-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
7395+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
7396+     SDMFSlotWriteProxy
7397 
7398 KiB = 1024
7399 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
7400hunk ./src/allmydata/mutable/publish.py 24
7401+PUSHING_BLOCKS_STATE = 0
7402+PUSHING_EVERYTHING_ELSE_STATE = 1
7403+DONE_STATE = 2
7404 
7405 class PublishStatus:
7406     implements(IPublishStatus)
7407hunk ./src/allmydata/mutable/publish.py 229
7408 
7409         self.bad_share_checkstrings = {}
7410 
7411+        # This is set at the last step of the publishing process.
7412+        self.versioninfo = ""
7413+
7414         # we use the servermap to populate the initial goal: this way we will
7415         # try to update each existing share in place.
7416         for (peerid, shnum) in self._servermap.servermap:
7417hunk ./src/allmydata/mutable/publish.py 245
7418             self.bad_share_checkstrings[key] = old_checkstring
7419             self.connections[peerid] = self._servermap.connections[peerid]
7420 
7421-        # Now, the process dovetails -- if this is an SDMF file, we need
7422-        # to write an SDMF file. Otherwise, we need to write an MDMF
7423-        # file.
7424-        if self._version == MDMF_VERSION:
7425-            return self._publish_mdmf()
7426-        else:
7427-            return self._publish_sdmf()
7428-        #return self.done_deferred
7429-
7430-    def _publish_mdmf(self):
7431-        # Next, we find homes for all of the shares that we don't have
7432-        # homes for yet.
7433         # TODO: Make this part do peer selection.
7434         self.update_goal()
7435         self.writers = {}
7436hunk ./src/allmydata/mutable/publish.py 248
7437-        # For each (peerid, shnum) in self.goal, we make an
7438-        # MDMFSlotWriteProxy for that peer. We'll use this to write
7439+        if self._version == MDMF_VERSION:
7440+            writer_class = MDMFSlotWriteProxy
7441+        else:
7442+            writer_class = SDMFSlotWriteProxy
7443+
7444+        # For each (peerid, shnum) in self.goal, we make a
7445+        # write proxy for that peer. We'll use this to write
7446         # shares to the peer.
7447         for key in self.goal:
7448             peerid, shnum = key
7449hunk ./src/allmydata/mutable/publish.py 263
7450             cancel_secret = self._node.get_cancel_secret(peerid)
7451             secrets = (write_enabler, renew_secret, cancel_secret)
7452 
7453-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
7454-                                                      self.connections[peerid],
7455-                                                      self._storage_index,
7456-                                                      secrets,
7457-                                                      self._new_seqnum,
7458-                                                      self.required_shares,
7459-                                                      self.total_shares,
7460-                                                      self.segment_size,
7461-                                                      len(self.newdata))
7462+            self.writers[shnum] =  writer_class(shnum,
7463+                                                self.connections[peerid],
7464+                                                self._storage_index,
7465+                                                secrets,
7466+                                                self._new_seqnum,
7467+                                                self.required_shares,
7468+                                                self.total_shares,
7469+                                                self.segment_size,
7470+                                                len(self.newdata))
7471+            self.writers[shnum].peerid = peerid
7472             if (peerid, shnum) in self._servermap.servermap:
7473                 old_versionid, old_timestamp = self._servermap.servermap[key]
7474                 (old_seqnum, old_root_hash, old_salt, old_segsize,
7475hunk ./src/allmydata/mutable/publish.py 278
7476                  old_datalength, old_k, old_N, old_prefix,
7477                  old_offsets_tuple) = old_versionid
7478-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
7479+                self.writers[shnum].set_checkstring(old_seqnum,
7480+                                                    old_root_hash,
7481+                                                    old_salt)
7482+            elif (peerid, shnum) in self.bad_share_checkstrings:
7483+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
7484+                self.writers[shnum].set_checkstring(old_checkstring)
7485+
7486+        # Our remote shares will not have a complete checkstring until
7487+        # after we are done writing share data and have started to write
7488+        # blocks. In the meantime, we need to know what to look for when
7489+        # writing, so that we can detect UncoordinatedWriteErrors.
7490+        self._checkstring = self.writers.values()[0].get_checkstring()
7491 
7492         # Now, we start pushing shares.
7493         self._status.timings["setup"] = time.time() - self._started
7494hunk ./src/allmydata/mutable/publish.py 293
7495-        def _start_pushing(res):
7496-            self._started_pushing = time.time()
7497-            return res
7498-
7499         # First, we encrypt, encode, and publish the shares that we need
7500         # to encrypt, encode, and publish.
7501 
7502hunk ./src/allmydata/mutable/publish.py 306
7503 
7504         d = defer.succeed(None)
7505         self.log("Starting push")
7506-        for i in xrange(self.num_segments - 1):
7507-            d.addCallback(lambda ignored, i=i:
7508-                self.push_segment(i))
7509-            d.addCallback(self._turn_barrier)
7510-        # We have at least one segment, so we will have a tail segment
7511-        if self.num_segments > 0:
7512-            d.addCallback(lambda ignored:
7513-                self.push_tail_segment())
7514-
7515-        d.addCallback(lambda ignored:
7516-            self.push_encprivkey())
7517-        d.addCallback(lambda ignored:
7518-            self.push_blockhashes())
7519-        d.addCallback(lambda ignored:
7520-            self.push_sharehashes())
7521-        d.addCallback(lambda ignored:
7522-            self.push_toplevel_hashes_and_signature())
7523-        d.addCallback(lambda ignored:
7524-            self.finish_publishing())
7525-        return d
7526-
7527-
7528-    def _publish_sdmf(self):
7529-        self._status.timings["setup"] = time.time() - self._started
7530-        self.salt = os.urandom(16)
7531 
7532hunk ./src/allmydata/mutable/publish.py 307
7533-        d = self._encrypt_and_encode()
7534-        d.addCallback(self._generate_shares)
7535-        def _start_pushing(res):
7536-            self._started_pushing = time.time()
7537-            return res
7538-        d.addCallback(_start_pushing)
7539-        d.addCallback(self.loop) # trigger delivery
7540-        d.addErrback(self._fatal_error)
7541+        self._state = PUSHING_BLOCKS_STATE
7542+        self._push()
7543 
7544         return self.done_deferred
7545 
7546hunk ./src/allmydata/mutable/publish.py 327
7547                                                   segment_size)
7548         else:
7549             self.num_segments = 0
7550+
7551+        self.log("building encoding parameters for file")
7552+        self.log("got segsize %d" % self.segment_size)
7553+        self.log("got %d segments" % self.num_segments)
7554+
7555         if self._version == SDMF_VERSION:
7556             assert self.num_segments in (0, 1) # SDMF
7557hunk ./src/allmydata/mutable/publish.py 334
7558-            return
7559         # calculate the tail segment size.
7560hunk ./src/allmydata/mutable/publish.py 335
7561-        self.tail_segment_size = len(self.newdata) % segment_size
7562 
7563hunk ./src/allmydata/mutable/publish.py 336
7564-        if self.tail_segment_size == 0:
7565+        if segment_size and self.newdata:
7566+            self.tail_segment_size = len(self.newdata) % segment_size
7567+        else:
7568+            self.tail_segment_size = 0
7569+
7570+        if self.tail_segment_size == 0 and segment_size:
7571             # The tail segment is the same size as the other segments.
7572             self.tail_segment_size = segment_size
7573 
7574hunk ./src/allmydata/mutable/publish.py 345
7575-        # We'll make an encoder ahead-of-time for the normal-sized
7576-        # segments (defined as any segment of segment_size size.
7577-        # (the part of the code that puts the tail segment will make its
7578-        #  own encoder for that part)
7579+        # Make FEC encoders
7580         fec = codec.CRSEncoder()
7581         fec.set_params(self.segment_size,
7582                        self.required_shares, self.total_shares)
7583hunk ./src/allmydata/mutable/publish.py 352
7584         self.piece_size = fec.get_block_size()
7585         self.fec = fec
7586 
7587+        if self.tail_segment_size == self.segment_size:
7588+            self.tail_fec = self.fec
7589+        else:
7590+            tail_fec = codec.CRSEncoder()
7591+            tail_fec.set_params(self.tail_segment_size,
7592+                                self.required_shares,
7593+                                self.total_shares)
7594+            self.tail_fec = tail_fec
7595+
7596+        self._current_segment = 0
7597+
7598+
7599+    def _push(self, ignored=None):
7600+        """
7601+        I manage state transitions. In particular, I see that we still
7602+        have a good enough number of writers to complete the upload
7603+        successfully.
7604+        """
7605+        # Can we still successfully publish this file?
7606+        # TODO: Keep track of outstanding queries before aborting the
7607+        #       process.
7608+        if len(self.writers) <= self.required_shares or self.surprised:
7609+            return self._failure()
7610+
7611+        # Figure out what we need to do next. Each of these needs to
7612+        # return a deferred so that we don't block execution when this
7613+        # is first called in the upload method.
7614+        if self._state == PUSHING_BLOCKS_STATE:
7615+            return self.push_segment(self._current_segment)
7616+
7617+        # XXX: Do we want more granularity in states? Is that useful at
7618+        #      all?
7619+        #      Yes -- quicker reaction to UCW.
7620+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
7621+            return self.push_everything_else()
7622+
7623+        # If we make it to this point, we were successful in placing the
7624+        # file.
7625+        return self._done(None)
7626+
7627 
7628     def push_segment(self, segnum):
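
The _push() method above is the hub of a small state machine: blocks go out one segment at a time while in PUSHING_BLOCKS_STATE, then everything else (encrypted private key, hash trees, signature, verification key) in PUSHING_EVERYTHING_ELSE_STATE, and finally the publish either completes or fails. A condensed, Tahoe-free sketch of that control flow (all names below are illustrative stand-ins for the real methods):

    PUSHING_BLOCKS, PUSHING_EVERYTHING_ELSE, DONE = range(3)

    class PushLoop:
        # Toy model of the publisher's state machine; no networking, no FEC.
        def __init__(self, num_segments, required_shares, writers):
            self.num_segments = num_segments
            self.required_shares = required_shares
            self.writers = writers        # shnum -> writer; shrinks on errors
            self.surprised = False        # set when an uncoordinated write is seen
            self.state = PUSHING_BLOCKS
            self.current_segment = 0

        def push(self):
            if len(self.writers) <= self.required_shares or self.surprised:
                return self.fail()
            if self.state == PUSHING_BLOCKS:
                return self.push_segment(self.current_segment)
            if self.state == PUSHING_EVERYTHING_ELSE:
                return self.push_everything_else()
            return "done"

        def push_segment(self, segnum):
            if segnum == self.num_segments:
                self.state = PUSHING_EVERYTHING_ELSE
                return self.push()
            # ... encrypt, FEC-encode, and put block `segnum` on every writer ...
            self.current_segment += 1
            return self.push()

        def push_everything_else(self):
            # ... encprivkey, block/share hash trees, signature, verification key ...
            self.state = DONE
            return self.push()

        def fail(self):
            if self.surprised:
                return "UncoordinatedWriteError"
            return "NotEnoughServersError"

The real implementation returns a Deferred at every step so the reactor can interleave network traffic; this sketch just recurses synchronously.
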
7629hunk ./src/allmydata/mutable/publish.py 394
7630+        if self.num_segments == 0 and self._version == SDMF_VERSION:
7631+            self._add_dummy_salts()
7632+
7633+        if segnum == self.num_segments:
7634+            # We don't have any more segments to push.
7635+            self._state = PUSHING_EVERYTHING_ELSE_STATE
7636+            return self._push()
7637+
7638+        d = self._encode_segment(segnum)
7639+        d.addCallback(self._push_segment, segnum)
7640+        def _increment_segnum(ign):
7641+            self._current_segment += 1
7642+        # XXX: I don't think we need to do addBoth here -- any errBacks
7643+        # should be handled within push_segment.
7644+        d.addBoth(_increment_segnum)
7645+        d.addBoth(self._push)
7646+
7647+
7648+    def _add_dummy_salts(self):
7649+        """
7650+        SDMF files need a salt even if they're empty, or the signature
7651+        won't make sense. This method adds a dummy salt to each of our
7652+        SDMF writers so that they can write the signature later.
7653+        """
7654+        salt = os.urandom(16)
7655+        assert self._version == SDMF_VERSION
7656+
7657+        for writer in self.writers.itervalues():
7658+            writer.put_salt(salt)
7659+
7660+
7661+    def _encode_segment(self, segnum):
7662+        """
7663+        I encrypt and encode the segment segnum.
7664+        """
7665         started = time.time()
7666hunk ./src/allmydata/mutable/publish.py 430
7667-        segsize = self.segment_size
7668+
7669+        if segnum + 1 == self.num_segments:
7670+            segsize = self.tail_segment_size
7671+        else:
7672+            segsize = self.segment_size
7673+
7674+
7675+        offset = self.segment_size * segnum
7676+        length = segsize + offset
7677         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
7678hunk ./src/allmydata/mutable/publish.py 440
7679-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
7680+        data = self.newdata[offset:length]
7681         assert len(data) == segsize
7682 
7683         salt = os.urandom(16)
7684hunk ./src/allmydata/mutable/publish.py 455
7685         started = now
7686 
7687         # now apply FEC
7688+        if segnum + 1 == self.num_segments:
7689+            fec = self.tail_fec
7690+        else:
7691+            fec = self.fec
7692 
7693         self._status.set_status("Encoding")
7694         crypttext_pieces = [None] * self.required_shares
7695hunk ./src/allmydata/mutable/publish.py 462
7696-        piece_size = self.piece_size
7697+        piece_size = fec.get_block_size()
7698         for i in range(len(crypttext_pieces)):
7699             offset = i * piece_size
7700             piece = crypttext[offset:offset+piece_size]
7701hunk ./src/allmydata/mutable/publish.py 469
7702             piece = piece + "\x00"*(piece_size - len(piece)) # padding
7703             crypttext_pieces[i] = piece
7704             assert len(piece) == piece_size
7705-        d = self.fec.encode(crypttext_pieces)
7706+        d = fec.encode(crypttext_pieces)
7707         def _done_encoding(res):
7708             elapsed = time.time() - started
7709             self._status.timings["encode"] = elapsed
7710hunk ./src/allmydata/mutable/publish.py 473
7711-            return res
7712+            return (res, salt)
7713         d.addCallback(_done_encoding)
7714hunk ./src/allmydata/mutable/publish.py 475
7715-
7716-        def _push_shares_and_salt(results):
7717-            shares, shareids = results
7718-            dl = []
7719-            for i in xrange(len(shares)):
7720-                sharedata = shares[i]
7721-                shareid = shareids[i]
7722-                block_hash = hashutil.block_hash(salt + sharedata)
7723-                self.blockhashes[shareid].append(block_hash)
7724-
7725-                # find the writer for this share
7726-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
7727-                dl.append(d)
7728-            # TODO: Naturally, we need to check on the results of these.
7729-            return defer.DeferredList(dl)
7730-        d.addCallback(_push_shares_and_salt)
7731         return d
7732 
7733 
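
_encode_segment() above picks a fresh 16-byte salt for the segment, derives an AES key from it and the file's read key, encrypts the segment, and then splits the ciphertext into required_shares pieces padded to the FEC block size before handing them to the right encoder (the tail segment gets its own CRSEncoder when its size differs). The padding-and-encoding step, roughly (a sketch only; fec stands in for a configured CRSEncoder exposing get_block_size() and encode()):

    def encode_segment(crypttext, required_shares, fec):
        piece_size = fec.get_block_size()
        pieces = []
        for i in range(required_shares):
            piece = crypttext[i * piece_size:(i + 1) * piece_size]
            piece = piece + "\x00" * (piece_size - len(piece))  # zero-pad the last piece
            pieces.append(piece)
        # fec.encode() returns a Deferred firing with (shares, share_ids)
        return fec.encode(pieces)
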
7734hunk ./src/allmydata/mutable/publish.py 478
7735-    def push_tail_segment(self):
7736-        # This is essentially the same as push_segment, except that we
7737-        # don't use the cached encoder that we use elsewhere.
7738-        self.log("Pushing tail segment")
7739+    def _push_segment(self, encoded_and_salt, segnum):
7740+        """
7741+        I push (data, salt) as segment number segnum.
7742+        """
7743+        results, salt = encoded_and_salt
7744+        shares, shareids = results
7745         started = time.time()
7746hunk ./src/allmydata/mutable/publish.py 485
7747-        segsize = self.segment_size
7748-        data = self.newdata[segsize * (self.num_segments-1):]
7749-        assert len(data) == self.tail_segment_size
7750-        salt = os.urandom(16)
7751-
7752-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
7753-        enc = AES(key)
7754-        crypttext = enc.process(data)
7755-        assert len(crypttext) == len(data)
7756+        dl = []
7757+        for i in xrange(len(shares)):
7758+            sharedata = shares[i]
7759+            shareid = shareids[i]
7760+            if self._version == MDMF_VERSION:
7761+                hashed = salt + sharedata
7762+            else:
7763+                hashed = sharedata
7764+            block_hash = hashutil.block_hash(hashed)
7765+            self.blockhashes[shareid].append(block_hash)
7766 
7767hunk ./src/allmydata/mutable/publish.py 496
7768-        now = time.time()
7769-        self._status.timings['encrypt'] = now - started
7770-        started = now
7771+            # find the writer for this share
7772+            writer = self.writers[shareid]
7773+            d = writer.put_block(sharedata, segnum, salt)
7774+            d.addCallback(self._got_write_answer, writer, started)
7775+            d.addErrback(self._connection_problem, writer)
7776+            dl.append(d)
7777+            # TODO: Naturally, we need to check on the results of these.
7778+        return defer.DeferredList(dl)
7779 
7780hunk ./src/allmydata/mutable/publish.py 505
7781-        self._status.set_status("Encoding")
7782-        tail_fec = codec.CRSEncoder()
7783-        tail_fec.set_params(self.tail_segment_size,
7784-                            self.required_shares,
7785-                            self.total_shares)
7786 
7787hunk ./src/allmydata/mutable/publish.py 506
7788-        crypttext_pieces = [None] * self.required_shares
7789-        piece_size = tail_fec.get_block_size()
7790-        for i in range(len(crypttext_pieces)):
7791-            offset = i * piece_size
7792-            piece = crypttext[offset:offset+piece_size]
7793-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
7794-            crypttext_pieces[i] = piece
7795-            assert len(piece) == piece_size
7796-        d = tail_fec.encode(crypttext_pieces)
7797-        def _push_shares_and_salt(results):
7798-            shares, shareids = results
7799-            dl = []
7800-            for i in xrange(len(shares)):
7801-                sharedata = shares[i]
7802-                shareid = shareids[i]
7803-                block_hash = hashutil.block_hash(salt + sharedata)
7804-                self.blockhashes[shareid].append(block_hash)
7805-                # find the writer for this share
7806-                d = self.writers[shareid].put_block(sharedata,
7807-                                                    self.num_segments - 1,
7808-                                                    salt)
7809-                dl.append(d)
7810-            # TODO: Naturally, we need to check on the results of these.
7811-            return defer.DeferredList(dl)
7812-        d.addCallback(_push_shares_and_salt)
7813+    def push_everything_else(self):
7814+        """
7815+        I put everything else associated with a share.
7816+        """
7817+        encprivkey = self._encprivkey
7818+        d = self.push_encprivkey()
7819+        d.addCallback(self.push_blockhashes)
7820+        d.addCallback(self.push_sharehashes)
7821+        d.addCallback(self.push_toplevel_hashes_and_signature)
7822+        d.addCallback(self.finish_publishing)
7823+        def _change_state(ignored):
7824+            self._state = DONE_STATE
7825+        d.addCallback(_change_state)
7826+        d.addCallback(self._push)
7827         return d
7828 
7829 
7830hunk ./src/allmydata/mutable/publish.py 527
7831         started = time.time()
7832         encprivkey = self._encprivkey
7833         dl = []
7834-        def _spy_on_writer(results):
7835-            print results
7836-            return results
7837-        for shnum, writer in self.writers.iteritems():
7838+        for writer in self.writers.itervalues():
7839             d = writer.put_encprivkey(encprivkey)
7840hunk ./src/allmydata/mutable/publish.py 529
7841+            d.addCallback(self._got_write_answer, writer, started)
7842+            d.addErrback(self._connection_problem, writer)
7843             dl.append(d)
7844         d = defer.DeferredList(dl)
7845         return d
7846hunk ./src/allmydata/mutable/publish.py 536
7847 
7848 
7849-    def push_blockhashes(self):
7850+    def push_blockhashes(self, ignored):
7851         started = time.time()
7852         dl = []
7853hunk ./src/allmydata/mutable/publish.py 539
7854-        def _spy_on_results(results):
7855-            print results
7856-            return results
7857         self.sharehash_leaves = [None] * len(self.blockhashes)
7858         for shnum, blockhashes in self.blockhashes.iteritems():
7859             t = hashtree.HashTree(blockhashes)
7860hunk ./src/allmydata/mutable/publish.py 545
7861             self.blockhashes[shnum] = list(t)
7862             # set the leaf for future use.
7863             self.sharehash_leaves[shnum] = t[0]
7864-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
7865+            writer = self.writers[shnum]
7866+            d = writer.put_blockhashes(self.blockhashes[shnum])
7867+            d.addCallback(self._got_write_answer, writer, started)
7868+            d.addErrback(self._connection_problem, self.writers[shnum])
7869             dl.append(d)
7870         d = defer.DeferredList(dl)
7871         return d
7872hunk ./src/allmydata/mutable/publish.py 554
7873 
7874 
7875-    def push_sharehashes(self):
7876+    def push_sharehashes(self, ignored):
7877+        started = time.time()
7878         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
7879         share_hash_chain = {}
7880         ds = []
7881hunk ./src/allmydata/mutable/publish.py 559
7882-        def _spy_on_results(results):
7883-            print results
7884-            return results
7885         for shnum in xrange(len(self.sharehash_leaves)):
7886             needed_indices = share_hash_tree.needed_hashes(shnum)
7887             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
7888hunk ./src/allmydata/mutable/publish.py 563
7889                                              for i in needed_indices] )
7890-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
7891+            writer = self.writers[shnum]
7892+            d = writer.put_sharehashes(self.sharehashes[shnum])
7893+            d.addCallback(self._got_write_answer, writer, started)
7894+            d.addErrback(self._connection_problem, writer)
7895             ds.append(d)
7896         self.root_hash = share_hash_tree[0]
7897         d = defer.DeferredList(ds)
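
push_blockhashes() and push_sharehashes() build the two-level Merkle structure behind root_hash: one hash tree over each share's block hashes, whose root becomes that share's leaf in a second tree over all shares, from which each writer receives just the chain of nodes it needs. Condensed into one helper (using the same allmydata.hashtree.HashTree calls the patch uses; the helper name is illustrative):

    from allmydata import hashtree

    def build_share_hash_chains(blockhashes_by_shnum, total_shares):
        # One block-hash tree per share; its root is that share's leaf.
        sharehash_leaves = [None] * total_shares
        for shnum, blockhashes in blockhashes_by_shnum.items():
            sharehash_leaves[shnum] = hashtree.HashTree(blockhashes)[0]
        # A second tree over the per-share roots.
        share_hash_tree = hashtree.HashTree(sharehash_leaves)
        chains = {}
        for shnum in range(total_shares):
            needed = share_hash_tree.needed_hashes(shnum)
            chains[shnum] = dict((i, share_hash_tree[i]) for i in needed)
        return share_hash_tree[0], chains   # (root hash, per-share hash chains)
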
7898hunk ./src/allmydata/mutable/publish.py 573
7899         return d
7900 
7901 
7902-    def push_toplevel_hashes_and_signature(self):
7903+    def push_toplevel_hashes_and_signature(self, ignored):
7904         # We need to do three things here:
7905         #   - Push the root hash and salt hash
7906         #   - Get the checkstring of the resulting layout; sign that.
7907hunk ./src/allmydata/mutable/publish.py 578
7908         #   - Push the signature
7909+        started = time.time()
7910         ds = []
7911hunk ./src/allmydata/mutable/publish.py 580
7912-        def _spy_on_results(results):
7913-            print results
7914-            return results
7915         for shnum in xrange(self.total_shares):
7916hunk ./src/allmydata/mutable/publish.py 581
7917-            d = self.writers[shnum].put_root_hash(self.root_hash)
7918+            writer = self.writers[shnum]
7919+            d = writer.put_root_hash(self.root_hash)
7920+            d.addCallback(self._got_write_answer, writer, started)
7921             ds.append(d)
7922         d = defer.DeferredList(ds)
7923hunk ./src/allmydata/mutable/publish.py 586
7924-        def _make_and_place_signature(ignored):
7925-            signable = self.writers[0].get_signable()
7926-            self.signature = self._privkey.sign(signable)
7927-
7928-            ds = []
7929-            for (shnum, writer) in self.writers.iteritems():
7930-                d = writer.put_signature(self.signature)
7931-                ds.append(d)
7932-            return defer.DeferredList(ds)
7933-        d.addCallback(_make_and_place_signature)
7934+        d.addCallback(self._update_checkstring)
7935+        d.addCallback(self._make_and_place_signature)
7936         return d
7937 
7938 
7939hunk ./src/allmydata/mutable/publish.py 591
7940-    def finish_publishing(self):
7941+    def _update_checkstring(self, ignored):
7942+        """
7943+        After putting the root hash, MDMF files will have the
7944+        checkstring written to the storage server. This means that we
7945+        can update our copy of the checkstring so we can detect
7946+        uncoordinated writes. SDMF files will have the same checkstring,
7947+        so we need not do anything.
7948+        """
7949+        self._checkstring = self.writers.values()[0].get_checkstring()
7950+
7951+
7952+    def _make_and_place_signature(self, ignored):
7953+        """
7954+        I create and place the signature.
7955+        """
7956+        started = time.time()
7957+        signable = self.writers[0].get_signable()
7958+        self.signature = self._privkey.sign(signable)
7959+
7960+        ds = []
7961+        for (shnum, writer) in self.writers.iteritems():
7962+            d = writer.put_signature(self.signature)
7963+            d.addCallback(self._got_write_answer, writer, started)
7964+            d.addErrback(self._connection_problem, writer)
7965+            ds.append(d)
7966+        return defer.DeferredList(ds)
7967+
7968+
7969+    def finish_publishing(self, ignored):
7970         # We're almost done -- we just need to put the verification key
7971         # and the offsets
7972hunk ./src/allmydata/mutable/publish.py 622
7973+        started = time.time()
7974         ds = []
7975         verification_key = self._pubkey.serialize()
7976 
7977hunk ./src/allmydata/mutable/publish.py 626
7978-        def _spy_on_results(results):
7979-            print results
7980-            return results
7981+
7982+        # TODO: Bad, since we remove from this same dict. We need to
7983+        # make a copy, or just use a non-iterated value.
7984         for (shnum, writer) in self.writers.iteritems():
7985             d = writer.put_verification_key(verification_key)
7986hunk ./src/allmydata/mutable/publish.py 631
7987+            d.addCallback(self._got_write_answer, writer, started)
7988+            d.addCallback(self._record_verinfo)
7989             d.addCallback(lambda ignored, writer=writer:
7990                 writer.finish_publishing())
7991hunk ./src/allmydata/mutable/publish.py 635
7992+            d.addCallback(self._got_write_answer, writer, started)
7993+            d.addErrback(self._connection_problem, writer)
7994             ds.append(d)
7995         return defer.DeferredList(ds)
7996 
7997hunk ./src/allmydata/mutable/publish.py 641
7998 
7999-    def _turn_barrier(self, res):
8000-        # putting this method in a Deferred chain imposes a guaranteed
8001-        # reactor turn between the pre- and post- portions of that chain.
8002-        # This can be useful to limit memory consumption: since Deferreds do
8003-        # not do tail recursion, code which uses defer.succeed(result) for
8004-        # consistency will cause objects to live for longer than you might
8005-        # normally expect.
8006-        return fireEventually(res)
8007+    def _record_verinfo(self, ignored):
8008+        self.versioninfo = self.writers.values()[0].get_verinfo()
8009 
8010 
8011hunk ./src/allmydata/mutable/publish.py 645
8012-    def _fatal_error(self, f):
8013-        self.log("error during loop", failure=f, level=log.UNUSUAL)
8014-        self._done(f)
8015+    def _connection_problem(self, f, writer):
8016+        """
8017+        We ran into a connection problem while working with writer, and
8018+        need to deal with that.
8019+        """
8020+        self.log("found problem: %s" % str(f))
8021+        self._last_failure = f
8022+        del(self.writers[writer.shnum])
8023 
8024hunk ./src/allmydata/mutable/publish.py 654
8025-    def _update_status(self):
8026-        self._status.set_status("Sending Shares: %d placed out of %d, "
8027-                                "%d messages outstanding" %
8028-                                (len(self.placed),
8029-                                 len(self.goal),
8030-                                 len(self.outstanding)))
8031-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
8032 
8033     def loop(self, ignored=None):
8034         self.log("entering loop", level=log.NOISY)
8035hunk ./src/allmydata/mutable/publish.py 778
8036             self.log_goal(self.goal, "after update: ")
8037 
8038 
8039-    def _encrypt_and_encode(self):
8040-        # this returns a Deferred that fires with a list of (sharedata,
8041-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
8042-        # shares that we care about.
8043-        self.log("_encrypt_and_encode")
8044-
8045-        self._status.set_status("Encrypting")
8046-        started = time.time()
8047+    def _got_write_answer(self, answer, writer, started):
8048+        if not answer:
8049+            # SDMF writers only pretend to write when callers set their
8050+            # blocks, salts, and so on -- they actually just write once,
8051+            # at the end of the upload process. In fake writes, they
8052+            # return defer.succeed(None). If we see that, we shouldn't
8053+            # bother checking it.
8054+            return
8055 
8056hunk ./src/allmydata/mutable/publish.py 787
8057-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
8058-        enc = AES(key)
8059-        crypttext = enc.process(self.newdata)
8060-        assert len(crypttext) == len(self.newdata)
8061+        peerid = writer.peerid
8062+        lp = self.log("_got_write_answer from %s, share %d" %
8063+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
8064 
8065         now = time.time()
8066hunk ./src/allmydata/mutable/publish.py 792
8067-        self._status.timings["encrypt"] = now - started
8068-        started = now
8069-
8070-        # now apply FEC
8071-
8072-        self._status.set_status("Encoding")
8073-        fec = codec.CRSEncoder()
8074-        fec.set_params(self.segment_size,
8075-                       self.required_shares, self.total_shares)
8076-        piece_size = fec.get_block_size()
8077-        crypttext_pieces = [None] * self.required_shares
8078-        for i in range(len(crypttext_pieces)):
8079-            offset = i * piece_size
8080-            piece = crypttext[offset:offset+piece_size]
8081-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
8082-            crypttext_pieces[i] = piece
8083-            assert len(piece) == piece_size
8084-
8085-        d = fec.encode(crypttext_pieces)
8086-        def _done_encoding(res):
8087-            elapsed = time.time() - started
8088-            self._status.timings["encode"] = elapsed
8089-            return res
8090-        d.addCallback(_done_encoding)
8091-        return d
8092-
8093-
8094-    def _generate_shares(self, shares_and_shareids):
8095-        # this sets self.shares and self.root_hash
8096-        self.log("_generate_shares")
8097-        self._status.set_status("Generating Shares")
8098-        started = time.time()
8099-
8100-        # we should know these by now
8101-        privkey = self._privkey
8102-        encprivkey = self._encprivkey
8103-        pubkey = self._pubkey
8104-
8105-        (shares, share_ids) = shares_and_shareids
8106-
8107-        assert len(shares) == len(share_ids)
8108-        assert len(shares) == self.total_shares
8109-        all_shares = {}
8110-        block_hash_trees = {}
8111-        share_hash_leaves = [None] * len(shares)
8112-        for i in range(len(shares)):
8113-            share_data = shares[i]
8114-            shnum = share_ids[i]
8115-            all_shares[shnum] = share_data
8116-
8117-            # build the block hash tree. SDMF has only one leaf.
8118-            leaves = [hashutil.block_hash(share_data)]
8119-            t = hashtree.HashTree(leaves)
8120-            block_hash_trees[shnum] = list(t)
8121-            share_hash_leaves[shnum] = t[0]
8122-        for leaf in share_hash_leaves:
8123-            assert leaf is not None
8124-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
8125-        share_hash_chain = {}
8126-        for shnum in range(self.total_shares):
8127-            needed_hashes = share_hash_tree.needed_hashes(shnum)
8128-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
8129-                                              for i in needed_hashes ] )
8130-        root_hash = share_hash_tree[0]
8131-        assert len(root_hash) == 32
8132-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
8133-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
8134-
8135-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
8136-                             self.required_shares, self.total_shares,
8137-                             self.segment_size, len(self.newdata))
8138-
8139-        # now pack the beginning of the share. All shares are the same up
8140-        # to the signature, then they have divergent share hash chains,
8141-        # then completely different block hash trees + salt + share data,
8142-        # then they all share the same encprivkey at the end. The sizes
8143-        # of everything are the same for all shares.
8144-
8145-        sign_started = time.time()
8146-        signature = privkey.sign(prefix)
8147-        self._status.timings["sign"] = time.time() - sign_started
8148-
8149-        verification_key = pubkey.serialize()
8150-
8151-        final_shares = {}
8152-        for shnum in range(self.total_shares):
8153-            final_share = pack_share(prefix,
8154-                                     verification_key,
8155-                                     signature,
8156-                                     share_hash_chain[shnum],
8157-                                     block_hash_trees[shnum],
8158-                                     all_shares[shnum],
8159-                                     encprivkey)
8160-            final_shares[shnum] = final_share
8161-        elapsed = time.time() - started
8162-        self._status.timings["pack"] = elapsed
8163-        self.shares = final_shares
8164-        self.root_hash = root_hash
8165-
8166-        # we also need to build up the version identifier for what we're
8167-        # pushing. Extract the offsets from one of our shares.
8168-        assert final_shares
8169-        offsets = unpack_header(final_shares.values()[0])[-1]
8170-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8171-        verinfo = (self._new_seqnum, root_hash, self.salt,
8172-                   self.segment_size, len(self.newdata),
8173-                   self.required_shares, self.total_shares,
8174-                   prefix, offsets_tuple)
8175-        self.versioninfo = verinfo
8176-
8177-
8178-
8179-    def _send_shares(self, needed):
8180-        self.log("_send_shares")
8181-
8182-        # we're finally ready to send out our shares. If we encounter any
8183-        # surprises here, it's because somebody else is writing at the same
8184-        # time. (Note: in the future, when we remove the _query_peers() step
8185-        # and instead speculate about [or remember] which shares are where,
8186-        # surprises here are *not* indications of UncoordinatedWriteError,
8187-        # and we'll need to respond to them more gracefully.)
8188-
8189-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
8190-        # organize it by peerid.
8191-
8192-        peermap = DictOfSets()
8193-        for (peerid, shnum) in needed:
8194-            peermap.add(peerid, shnum)
8195-
8196-        # the next thing is to build up a bunch of test vectors. The
8197-        # semantics of Publish are that we perform the operation if the world
8198-        # hasn't changed since the ServerMap was constructed (more or less).
8199-        # For every share we're trying to place, we create a test vector that
8200-        # tests to see if the server*share still corresponds to the
8201-        # map.
8202-
8203-        all_tw_vectors = {} # maps peerid to tw_vectors
8204-        sm = self._servermap.servermap
8205-
8206-        for key in needed:
8207-            (peerid, shnum) = key
8208-
8209-            if key in sm:
8210-                # an old version of that share already exists on the
8211-                # server, according to our servermap. We will create a
8212-                # request that attempts to replace it.
8213-                old_versionid, old_timestamp = sm[key]
8214-                (old_seqnum, old_root_hash, old_salt, old_segsize,
8215-                 old_datalength, old_k, old_N, old_prefix,
8216-                 old_offsets_tuple) = old_versionid
8217-                old_checkstring = pack_checkstring(old_seqnum,
8218-                                                   old_root_hash,
8219-                                                   old_salt)
8220-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8221-
8222-            elif key in self.bad_share_checkstrings:
8223-                old_checkstring = self.bad_share_checkstrings[key]
8224-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8225-
8226-            else:
8227-                # add a testv that requires the share not exist
8228-
8229-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
8230-                # constraints are handled. If the same object is referenced
8231-                # multiple times inside the arguments, foolscap emits a
8232-                # 'reference' token instead of a distinct copy of the
8233-                # argument. The bug is that these 'reference' tokens are not
8234-                # accepted by the inbound constraint code. To work around
8235-                # this, we need to prevent python from interning the
8236-                # (constant) tuple, by creating a new copy of this vector
8237-                # each time.
8238-
8239-                # This bug is fixed in foolscap-0.2.6, and even though this
8240-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
8241-                # supposed to be able to interoperate with older versions of
8242-                # Tahoe which are allowed to use older versions of foolscap,
8243-                # including foolscap-0.2.5 . In addition, I've seen other
8244-                # foolscap problems triggered by 'reference' tokens (see #541
8245-                # for details). So we must keep this workaround in place.
8246-
8247-                #testv = (0, 1, 'eq', "")
8248-                testv = tuple([0, 1, 'eq', ""])
8249-
8250-            testvs = [testv]
8251-            # the write vector is simply the share
8252-            writev = [(0, self.shares[shnum])]
8253-
8254-            if peerid not in all_tw_vectors:
8255-                all_tw_vectors[peerid] = {}
8256-                # maps shnum to (testvs, writevs, new_length)
8257-            assert shnum not in all_tw_vectors[peerid]
8258-
8259-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
8260-
8261-        # we read the checkstring back from each share, however we only use
8262-        # it to detect whether there was a new share that we didn't know
8263-        # about. The success or failure of the write will tell us whether
8264-        # there was a collision or not. If there is a collision, the first
8265-        # thing we'll do is update the servermap, which will find out what
8266-        # happened. We could conceivably reduce a roundtrip by using the
8267-        # readv checkstring to populate the servermap, but really we'd have
8268-        # to read enough data to validate the signatures too, so it wouldn't
8269-        # be an overall win.
8270-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
8271-
8272-        # ok, send the messages!
8273-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
8274-        started = time.time()
8275-        for (peerid, tw_vectors) in all_tw_vectors.items():
8276-
8277-            write_enabler = self._node.get_write_enabler(peerid)
8278-            renew_secret = self._node.get_renewal_secret(peerid)
8279-            cancel_secret = self._node.get_cancel_secret(peerid)
8280-            secrets = (write_enabler, renew_secret, cancel_secret)
8281-            shnums = tw_vectors.keys()
8282-
8283-            for shnum in shnums:
8284-                self.outstanding.add( (peerid, shnum) )
8285-
8286-            d = self._do_testreadwrite(peerid, secrets,
8287-                                       tw_vectors, read_vector)
8288-            d.addCallbacks(self._got_write_answer, self._got_write_error,
8289-                           callbackArgs=(peerid, shnums, started),
8290-                           errbackArgs=(peerid, shnums, started))
8291-            # tolerate immediate errback, like with DeadReferenceError
8292-            d.addBoth(fireEventually)
8293-            d.addCallback(self.loop)
8294-            d.addErrback(self._fatal_error)
8295-
8296-        self._update_status()
8297-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
8298+        elapsed = now - started
8299 
8300hunk ./src/allmydata/mutable/publish.py 794
8301-    def _do_testreadwrite(self, peerid, secrets,
8302-                          tw_vectors, read_vector):
8303-        storage_index = self._storage_index
8304-        ss = self.connections[peerid]
8305+        self._status.add_per_server_time(peerid, elapsed)
8306 
8307hunk ./src/allmydata/mutable/publish.py 796
8308-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
8309-        d = ss.callRemote("slot_testv_and_readv_and_writev",
8310-                          storage_index,
8311-                          secrets,
8312-                          tw_vectors,
8313-                          read_vector)
8314-        return d
8315+        wrote, read_data = answer
8316 
8317hunk ./src/allmydata/mutable/publish.py 798
8318-    def _got_write_answer(self, answer, peerid, shnums, started):
8319-        lp = self.log("_got_write_answer from %s" %
8320-                      idlib.shortnodeid_b2a(peerid))
8321-        for shnum in shnums:
8322-            self.outstanding.discard( (peerid, shnum) )
8323+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
8324 
8325hunk ./src/allmydata/mutable/publish.py 800
8326-        now = time.time()
8327-        elapsed = now - started
8328-        self._status.add_per_server_time(peerid, elapsed)
8329+        # We need to remove from surprise_shares any shares that we are
8330+        # knowingly also writing to that peer from other writers.
8331 
8332hunk ./src/allmydata/mutable/publish.py 803
8333-        wrote, read_data = answer
8334+        # TODO: Precompute this.
8335+        known_shnums = [x.shnum for x in self.writers.values()
8336+                        if x.peerid == peerid]
8337+        surprise_shares -= set(known_shnums)
8338+        self.log("found the following surprise shares: %s" %
8339+                 str(surprise_shares))
8340 
8341hunk ./src/allmydata/mutable/publish.py 810
8342-        surprise_shares = set(read_data.keys()) - set(shnums)
8343+        # Now surprise shares contains all of the shares that we did not
8344+        # expect to be there.
8345 
8346         surprised = False
8347         for shnum in surprise_shares:
8348hunk ./src/allmydata/mutable/publish.py 817
8349             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
8350             checkstring = read_data[shnum][0]
8351-            their_version_info = unpack_checkstring(checkstring)
8352-            if their_version_info == self._new_version_info:
8353+            # What we want to do here is to see if their (seqnum,
8354+            # roothash, salt) is the same as our (seqnum, roothash,
8355+            # salt), or the equivalent for MDMF. The best way to do this
8356+            # is to store a packed representation of our checkstring
8357+            # somewhere, then not bother unpacking the other
8358+            # checkstring.
8359+            if checkstring == self._checkstring:
8360                 # they have the right share, somehow
8361 
8362                 if (peerid,shnum) in self.goal:
8363hunk ./src/allmydata/mutable/publish.py 902
8364             self.log("our testv failed, so the write did not happen",
8365                      parent=lp, level=log.WEIRD, umid="8sc26g")
8366             self.surprised = True
8367-            self.bad_peers.add(peerid) # don't ask them again
8368+            # TODO: This needs to
8369+            self.bad_peers.add(writer) # don't ask them again
8370             # use the checkstring to add information to the log message
8371             for (shnum,readv) in read_data.items():
8372                 checkstring = readv[0]
8373hunk ./src/allmydata/mutable/publish.py 928
8374             # self.loop() will take care of finding new homes
8375             return
8376 
8377-        for shnum in shnums:
8378-            self.placed.add( (peerid, shnum) )
8379-            # and update the servermap
8380-            self._servermap.add_new_share(peerid, shnum,
8381+        # and update the servermap
8382+        # self.versioninfo is set during the last phase of publishing.
8383+        # If we get there, we know that responses correspond to placed
8384+        # shares, and can safely execute these statements.
8385+        if self.versioninfo:
8386+            self.log("wrote successfully: adding new share to servermap")
8387+            self._servermap.add_new_share(peerid, writer.shnum,
8388                                           self.versioninfo, started)
8389hunk ./src/allmydata/mutable/publish.py 936
8390-
8391-        # self.loop() will take care of checking to see if we're done
8392-        return
8393+            self.placed.add( (peerid, writer.shnum) )
8394 
8395hunk ./src/allmydata/mutable/publish.py 938
8396-    def _got_write_error(self, f, peerid, shnums, started):
8397-        for shnum in shnums:
8398-            self.outstanding.discard( (peerid, shnum) )
8399-        self.bad_peers.add(peerid)
8400-        if self._first_write_error is None:
8401-            self._first_write_error = f
8402-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
8403-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
8404-                 failure=f,
8405-                 level=log.UNUSUAL)
8406         # self.loop() will take care of checking to see if we're done
8407         return
8408 
8409hunk ./src/allmydata/mutable/publish.py 949
8410         now = time.time()
8411         self._status.timings["total"] = now - self._started
8412         self._status.set_active(False)
8413-        if isinstance(res, failure.Failure):
8414-            self.log("Publish done, with failure", failure=res,
8415-                     level=log.WEIRD, umid="nRsR9Q")
8416-            self._status.set_status("Failed")
8417-        elif self.surprised:
8418-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
8419-            self._status.set_status("UncoordinatedWriteError")
8420-            # deliver a failure
8421-            res = failure.Failure(UncoordinatedWriteError())
8422-            # TODO: recovery
8423-        else:
8424-            self.log("Publish done, success")
8425-            self._status.set_status("Finished")
8426-            self._status.set_progress(1.0)
8427+        self.log("Publish done, success")
8428+        self._status.set_status("Finished")
8429+        self._status.set_progress(1.0)
8430         eventually(self.done_deferred.callback, res)
8431 
8432hunk ./src/allmydata/mutable/publish.py 954
8433+    def _failure(self):
8434+
8435+        if not self.surprised:
8436+            # We ran out of servers
8437+            self.log("Publish ran out of good servers, "
8438+                     "last failure was: %s" % str(self._last_failure))
8439+            e = NotEnoughServersError("Ran out of non-bad servers, "
8440+                                      "last failure was %s" %
8441+                                      str(self._last_failure))
8442+        else:
8443+            # We ran into shares that we didn't recognize, which means
8444+            # that we need to return an UncoordinatedWriteError.
8445+            self.log("Publish failed with UncoordinatedWriteError")
8446+            e = UncoordinatedWriteError()
8447+        f = failure.Failure(e)
8448+        eventually(self.done_deferred.callback, f)
8449}
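
The surprise-share handling in _got_write_answer() is how this rewrite detects uncoordinated writes without unpacking the other writer's checkstring: any share number reported by the readv that we are not ourselves writing to that peer is a surprise, and if its checkstring differs from our own packed checkstring, someone else is modifying the slot, which ultimately surfaces as an UncoordinatedWriteError. A compact sketch of that comparison (an illustrative helper, not part of the patch):

    def find_uncoordinated_shares(read_data, our_shnums, our_checkstring):
        # read_data maps shnum -> [checkstring], as returned by the readv.
        surprise_shares = set(read_data.keys()) - set(our_shnums)
        return set(shnum for shnum in surprise_shares
                   if read_data[shnum][0] != our_checkstring)
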
8450[test/test_mutable.py: remove tests that are no longer relevant
8451Kevan Carstensen <kevan@isnotajoke.com>**20100702225710
8452 Ignore-this: 90a26b4cc4b2e190a635474ba7097e21
8453] hunk ./src/allmydata/test/test_mutable.py 627
8454         return d
8455 
8456 
8457-class MakeShares(unittest.TestCase):
8458-    def test_encrypt(self):
8459-        nm = make_nodemaker()
8460-        CONTENTS = "some initial contents"
8461-        d = nm.create_mutable_file(CONTENTS)
8462-        def _created(fn):
8463-            p = Publish(fn, nm.storage_broker, None)
8464-            p.salt = "SALT" * 4
8465-            p.readkey = "\x00" * 16
8466-            p.newdata = CONTENTS
8467-            p.required_shares = 3
8468-            p.total_shares = 10
8469-            p.setup_encoding_parameters()
8470-            return p._encrypt_and_encode()
8471-        d.addCallback(_created)
8472-        def _done(shares_and_shareids):
8473-            (shares, share_ids) = shares_and_shareids
8474-            self.failUnlessEqual(len(shares), 10)
8475-            for sh in shares:
8476-                self.failUnless(isinstance(sh, str))
8477-                self.failUnlessEqual(len(sh), 7)
8478-            self.failUnlessEqual(len(share_ids), 10)
8479-        d.addCallback(_done)
8480-        return d
8481-    test_encrypt.todo = "Write an equivalent of this for the new uploader"
8482-
8483-    def test_generate(self):
8484-        nm = make_nodemaker()
8485-        CONTENTS = "some initial contents"
8486-        d = nm.create_mutable_file(CONTENTS)
8487-        def _created(fn):
8488-            self._fn = fn
8489-            p = Publish(fn, nm.storage_broker, None)
8490-            self._p = p
8491-            p.newdata = CONTENTS
8492-            p.required_shares = 3
8493-            p.total_shares = 10
8494-            p.setup_encoding_parameters()
8495-            p._new_seqnum = 3
8496-            p.salt = "SALT" * 4
8497-            # make some fake shares
8498-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
8499-            p._privkey = fn.get_privkey()
8500-            p._encprivkey = fn.get_encprivkey()
8501-            p._pubkey = fn.get_pubkey()
8502-            return p._generate_shares(shares_and_ids)
8503-        d.addCallback(_created)
8504-        def _generated(res):
8505-            p = self._p
8506-            final_shares = p.shares
8507-            root_hash = p.root_hash
8508-            self.failUnlessEqual(len(root_hash), 32)
8509-            self.failUnless(isinstance(final_shares, dict))
8510-            self.failUnlessEqual(len(final_shares), 10)
8511-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
8512-            for i,sh in final_shares.items():
8513-                self.failUnless(isinstance(sh, str))
8514-                # feed the share through the unpacker as a sanity-check
8515-                pieces = unpack_share(sh)
8516-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
8517-                 pubkey, signature, share_hash_chain, block_hash_tree,
8518-                 share_data, enc_privkey) = pieces
8519-                self.failUnlessEqual(u_seqnum, 3)
8520-                self.failUnlessEqual(u_root_hash, root_hash)
8521-                self.failUnlessEqual(k, 3)
8522-                self.failUnlessEqual(N, 10)
8523-                self.failUnlessEqual(segsize, 21)
8524-                self.failUnlessEqual(datalen, len(CONTENTS))
8525-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
8526-                sig_material = struct.pack(">BQ32s16s BBQQ",
8527-                                           0, p._new_seqnum, root_hash, IV,
8528-                                           k, N, segsize, datalen)
8529-                self.failUnless(p._pubkey.verify(sig_material, signature))
8530-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
8531-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
8532-                for shnum,share_hash in share_hash_chain.items():
8533-                    self.failUnless(isinstance(shnum, int))
8534-                    self.failUnless(isinstance(share_hash, str))
8535-                    self.failUnlessEqual(len(share_hash), 32)
8536-                self.failUnless(isinstance(block_hash_tree, list))
8537-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
8538-                self.failUnlessEqual(IV, "SALT"*4)
8539-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
8540-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
8541-        d.addCallback(_generated)
8542-        return d
8543-    test_generate.todo = "Write an equivalent of this for the new uploader"
8544-
8545-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
8546-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
8547-    # when we publish to zero peers, we should get a NotEnoughSharesError
8548-
8549 class PublishMixin:
8550     def publish_one(self):
8551         # publish a file and create shares, which can then be manipulated
8552[interfaces.py: create IMutableUploadable
8553Kevan Carstensen <kevan@isnotajoke.com>**20100706215217
8554 Ignore-this: bee202ec2bfbd8e41f2d4019cce176c7
8555] hunk ./src/allmydata/interfaces.py 1693
8556         """The upload is finished, and whatever filehandle was in use may be
8557         closed."""
8558 
8559+
8560+class IMutableUploadable(Interface):
8561+    """
8562+    I represent content that is due to be uploaded to a mutable filecap.
8563+    """
8564+    # This is somewhat simpler than the IUploadable interface above
8565+    # because mutable files do not need to be concerned with possibly
8566+    # generating a CHK, nor with per-file keys. It is a subset of the
8567+    # methods in IUploadable, though, so we could just as well implement
8568+    # the mutable uploadables as IUploadables that don't happen to use
8569+    # those methods (with the understanding that the unused methods will
8570+    # never be called on such objects)
8571+    def get_size():
8572+        """
8573+        Returns a Deferred that fires with the size of the content held
8574+        by the uploadable.
8575+        """
8576+
8577+    def read(length):
8578+        """
8579+        Returns a list of strings which, when concatenated, are the next
8580+        length bytes of the file, or fewer if there are fewer bytes
8581+        between the current location and the end of the file.
8582+        """
8583+
8584+    def close():
8585+        """
8586+        The process that used the Uploadable is finished using it, so
8587+        the uploadable may be closed.
8588+        """
8589+
8590 class IUploadResults(Interface):
8591     """I am returned by upload() methods. I contain a number of public
8592     attributes which can be read to determine the results of the upload. Some
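
To make the new interface concrete, here is a minimal, hypothetical in-memory provider of IMutableUploadable; the class name is illustrative only, and the real implementations (MutableFileHandle and MutableDataHandle) are added to mutable/publish.py by the patches that follow.

    from zope.interface import implements
    from allmydata.interfaces import IMutableUploadable

    class InMemoryUploadable:
        implements(IMutableUploadable)

        def __init__(self, data):
            self._data = data
            self._offset = 0

        def get_size(self):
            # total number of bytes this uploadable holds
            return len(self._data)

        def read(self, length):
            # return a list of strings totalling at most 'length' bytes,
            # advancing the read position past what was returned
            chunk = self._data[self._offset:self._offset + length]
            self._offset += len(chunk)
            return [chunk]

        def close(self):
            # nothing to release for an in-memory buffer
            pass
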
8593[mutable/publish.py: add MutableDataHandle and MutableFileHandle
8594Kevan Carstensen <kevan@isnotajoke.com>**20100706215257
8595 Ignore-this: 295ea3bc2a962fd14fb7877fc76c011c
8596] {
8597hunk ./src/allmydata/mutable/publish.py 8
8598 from zope.interface import implements
8599 from twisted.internet import defer
8600 from twisted.python import failure
8601-from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
8602+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
8603+                                 IMutableUploadable
8604 from allmydata.util import base32, hashutil, mathutil, idlib, log
8605 from allmydata import hashtree, codec
8606 from allmydata.storage.server import si_b2a
8607hunk ./src/allmydata/mutable/publish.py 971
8608             e = UncoordinatedWriteError()
8609         f = failure.Failure(e)
8610         eventually(self.done_deferred.callback, f)
8611+
8612+
8613+class MutableFileHandle:
8614+    """
8615+    I am a mutable uploadable built around a filehandle-like object,
8616+    usually either a StringIO instance or a handle to an actual file.
8617+    """
8618+    implements(IMutableUploadable)
8619+
8620+    def __init__(self, filehandle):
8621+        # The filehandle must be a file-like object providing at least
8622+        # read() and close(); get_size() below also relies on seek() and tell().
8623+        assert hasattr(filehandle, "read")
8624+        assert hasattr(filehandle, "close")
8625+
8626+        self._filehandle = filehandle
8627+
8628+
8629+    def get_size(self):
8630+        """
8631+        I return the amount of data in my filehandle.
8632+        """
8633+        if not hasattr(self, "_size"):
8634+            old_position = self._filehandle.tell()
8635+            # Seek to the end of the file by seeking 0 bytes from the
8636+            # file's end
8637+            self._filehandle.seek(0, os.SEEK_END)
8638+            self._size = self._filehandle.tell()
8639+            # Restore the previous position, in case this was called
8640+            # after a read.
8641+            self._filehandle.seek(old_position)
8642+            assert self._filehandle.tell() == old_position
8643+
8644+        assert hasattr(self, "_size")
8645+        return self._size
8646+
8647+
8648+    def read(self, length):
8649+        """
8650+        I return some data (up to length bytes) from my filehandle.
8651+
8652+        In most cases, I return length bytes. If I return fewer, it is
8653+        because length exceeds the number of bytes remaining between my
8654+        current position in the file that I represent and its end; in
8655+        that case, I return everything up to EOF.
8656+        """
8657+        return [self._filehandle.read(length)]
8658+
8659+
8660+    def close(self):
8661+        """
8662+        I close the underlying filehandle. Any further operations on the
8663+        filehandle fail at this point.
8664+        """
8665+        self._filehandle.close()
8666+
8667+
8668+class MutableDataHandle(MutableFileHandle):
8669+    """
8670+    I am a mutable uploadable built around a string, which I wrap in a
8671+    StringIO instance and treat as a filehandle.
8672+    """
8673+
8674+    def __init__(self, s):
8675+        # Take a string and return a file-like uploadable.
8676+        assert isinstance(s, str)
8677+
8678+        MutableFileHandle.__init__(self, StringIO(s))
8679}
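
A short sketch of how the two uploadables added above behave, assuming they are imported from allmydata.mutable.publish as the later test patches do; the literal strings are only example data.

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle, MutableDataHandle

    # Wrap a plain string:
    u1 = MutableDataHandle("example mutable contents")
    assert u1.get_size() == len("example mutable contents")
    first = "".join(u1.read(7))             # read() returns a list of strings
    rest = "".join(u1.read(u1.get_size()))  # shorter reads are allowed at EOF

    # Wrap anything with read()/close() -- here a StringIO standing in
    # for an open file:
    u2 = MutableFileHandle(StringIO("example file contents"))
    size = u2.get_size()               # does not disturb the read position
    data = "".join(u2.read(size))
    u2.close()                         # closes the underlying filehandle
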
8680[mutable/publish.py: reorganize in preparation of file-like uploadables
8681Kevan Carstensen <kevan@isnotajoke.com>**20100706215541
8682 Ignore-this: 5346c9f919ee5b73807c8f287c64e8ce
8683] {
8684hunk ./src/allmydata/mutable/publish.py 4
8685 
8686 
8687 import os, struct, time
8688+from StringIO import StringIO
8689 from itertools import count
8690 from zope.interface import implements
8691 from twisted.internet import defer
8692hunk ./src/allmydata/mutable/publish.py 118
8693         self._status.set_helper(False)
8694         self._status.set_progress(0.0)
8695         self._status.set_active(True)
8696-        # We use this to control how the file is written.
8697-        version = self._node.get_version()
8698-        assert version in (SDMF_VERSION, MDMF_VERSION)
8699-        self._version = version
8700+        self._version = self._node.get_version()
8701+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
8702+
8703 
8704     def get_status(self):
8705         return self._status
8706hunk ./src/allmydata/mutable/publish.py 141
8707 
8708         # 0. Setup encoding parameters, encoder, and other such things.
8709         # 1. Encrypt, encode, and publish segments.
8710+        self.data = StringIO(newdata)
8711+        self.datalength = len(newdata)
8712 
8713hunk ./src/allmydata/mutable/publish.py 144
8714-        self.log("starting publish, datalen is %s" % len(newdata))
8715-        self._status.set_size(len(newdata))
8716+        self.log("starting publish, datalen is %s" % self.datalength)
8717+        self._status.set_size(self.datalength)
8718         self._status.set_status("Started")
8719         self._started = time.time()
8720 
8721hunk ./src/allmydata/mutable/publish.py 193
8722         self.full_peerlist = full_peerlist # for use later, immutable
8723         self.bad_peers = set() # peerids who have errbacked/refused requests
8724 
8725-        self.newdata = newdata
8726-
8727         # This will set self.segment_size, self.num_segments, and
8728         # self.fec.
8729         self.setup_encoding_parameters()
8730hunk ./src/allmydata/mutable/publish.py 272
8731                                                 self.required_shares,
8732                                                 self.total_shares,
8733                                                 self.segment_size,
8734-                                                len(self.newdata))
8735+                                                self.datalength)
8736             self.writers[shnum].peerid = peerid
8737             if (peerid, shnum) in self._servermap.servermap:
8738                 old_versionid, old_timestamp = self._servermap.servermap[key]
8739hunk ./src/allmydata/mutable/publish.py 318
8740         if self._version == MDMF_VERSION:
8741             segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
8742         else:
8743-            segment_size = len(self.newdata) # SDMF is only one segment
8744+            segment_size = self.datalength # SDMF is only one segment
8745         # this must be a multiple of self.required_shares
8746         segment_size = mathutil.next_multiple(segment_size,
8747                                               self.required_shares)
8748hunk ./src/allmydata/mutable/publish.py 324
8749         self.segment_size = segment_size
8750         if segment_size:
8751-            self.num_segments = mathutil.div_ceil(len(self.newdata),
8752+            self.num_segments = mathutil.div_ceil(self.datalength,
8753                                                   segment_size)
8754         else:
8755             self.num_segments = 0
8756hunk ./src/allmydata/mutable/publish.py 337
8757             assert self.num_segments in (0, 1) # SDMF
8758         # calculate the tail segment size.
8759 
8760-        if segment_size and self.newdata:
8761-            self.tail_segment_size = len(self.newdata) % segment_size
8762+        if segment_size and self.datalength:
8763+            self.tail_segment_size = self.datalength % segment_size
8764         else:
8765             self.tail_segment_size = 0
8766 
8767hunk ./src/allmydata/mutable/publish.py 438
8768             segsize = self.segment_size
8769 
8770 
8771-        offset = self.segment_size * segnum
8772-        length = segsize + offset
8773         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
8774hunk ./src/allmydata/mutable/publish.py 439
8775-        data = self.newdata[offset:length]
8776+        data = self.data.read(segsize)
8777+
8778         assert len(data) == segsize
8779 
8780         salt = os.urandom(16)
8781hunk ./src/allmydata/mutable/publish.py 502
8782             d.addCallback(self._got_write_answer, writer, started)
8783             d.addErrback(self._connection_problem, writer)
8784             dl.append(d)
8785-            # TODO: Naturally, we need to check on the results of these.
8786         return defer.DeferredList(dl)
8787 
8788 
8789}
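
With this reorganization, setup_encoding_parameters() derives the segmentation from self.datalength rather than len(self.newdata). A worked sketch of that arithmetic using the same allmydata.util.mathutil helpers the code calls; the 300000-byte length and k=3 are example figures, not values from the patch.

    from allmydata.util import mathutil

    datalength = 300000        # example: bytes of plaintext being published
    required_shares = 3        # example: k

    # MDMF: cap segments at 128 KiB, rounded up to a multiple of k so the
    # FEC input blocks divide evenly.
    segment_size = mathutil.next_multiple(131072, required_shares)  # 131073
    num_segments = mathutil.div_ceil(datalength, segment_size)      # 3
    tail_segment_size = datalength % segment_size                   # 37854

    # SDMF: the whole file is a single segment.
    sdmf_segment_size = mathutil.next_multiple(datalength, required_shares)  # 300000
    sdmf_num_segments = mathutil.div_ceil(datalength, sdmf_segment_size)     # 1
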
8790[test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
8791Kevan Carstensen <kevan@isnotajoke.com>**20100706215649
8792 Ignore-this: df719a0c52b4bbe9be4fae206c7ab3e7
8793] {
8794hunk ./src/allmydata/test/test_mutable.py 2
8795 
8796-import struct
8797+import struct, os
8798 from cStringIO import StringIO
8799 from twisted.trial import unittest
8800 from twisted.internet import defer, reactor
8801hunk ./src/allmydata/test/test_mutable.py 26
8802      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
8803      NotEnoughServersError, CorruptShareError
8804 from allmydata.mutable.retrieve import Retrieve
8805-from allmydata.mutable.publish import Publish
8806+from allmydata.mutable.publish import Publish, MutableFileHandle, \
8807+                                      MutableDataHandle
8808 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
8809 from allmydata.mutable.layout import unpack_header, unpack_share, \
8810                                      MDMFSlotReadProxy
8811hunk ./src/allmydata/test/test_mutable.py 2465
8812         d.addCallback(lambda data:
8813             self.failUnlessEqual(data, CONTENTS))
8814         return d
8815+
8816+
8817+class FileHandle(unittest.TestCase):
8818+    def setUp(self):
8819+        self.test_data = "Test Data" * 50000
8820+        self.sio = StringIO(self.test_data)
8821+        self.uploadable = MutableFileHandle(self.sio)
8822+
8823+
8824+    def test_filehandle_read(self):
8825+        self.basedir = "mutable/FileHandle/test_filehandle_read"
8826+        chunk_size = 10
8827+        for i in xrange(0, len(self.test_data), chunk_size):
8828+            data = self.uploadable.read(chunk_size)
8829+            data = "".join(data)
8830+            start = i
8831+            end = i + chunk_size
8832+            self.failUnlessEqual(data, self.test_data[start:end])
8833+
8834+
8835+    def test_filehandle_get_size(self):
8836+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
8837+        actual_size = len(self.test_data)
8838+        size = self.uploadable.get_size()
8839+        self.failUnlessEqual(size, actual_size)
8840+
8841+
8842+    def test_filehandle_get_size_out_of_order(self):
8843+        # We should be able to call get_size whenever we want without
8844+        # disturbing the location of the seek pointer.
8845+        chunk_size = 100
8846+        data = self.uploadable.read(chunk_size)
8847+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8848+
8849+        # Now get the size.
8850+        size = self.uploadable.get_size()
8851+        self.failUnlessEqual(size, len(self.test_data))
8852+
8853+        # Now get more data. We should be right where we left off.
8854+        more_data = self.uploadable.read(chunk_size)
8855+        start = chunk_size
8856+        end = chunk_size * 2
8857+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8858+
8859+
8860+    def test_filehandle_file(self):
8861+        # Make sure that the MutableFileHandle works on a file as well
8862+        # as a StringIO object, since in some cases it will be asked to
8863+        # deal with files.
8864+        self.basedir = self.mktemp()
8865+        # mktemp() returns a pathname but does not create it, so mkdir here.
8866+        os.mkdir(self.basedir)
8867+        f_path = os.path.join(self.basedir, "test_file")
8868+        f = open(f_path, "w")
8869+        f.write(self.test_data)
8870+        f.close()
8871+        f = open(f_path, "r")
8872+
8873+        uploadable = MutableFileHandle(f)
8874+
8875+        data = uploadable.read(len(self.test_data))
8876+        self.failUnlessEqual("".join(data), self.test_data)
8877+        size = uploadable.get_size()
8878+        self.failUnlessEqual(size, len(self.test_data))
8879+
8880+
8881+    def test_close(self):
8882+        # Make sure that the MutableFileHandle closes its handle when
8883+        # told to do so.
8884+        self.uploadable.close()
8885+        self.failUnless(self.sio.closed)
8886+
8887+
8888+class DataHandle(unittest.TestCase):
8889+    def setUp(self):
8890+        self.test_data = "Test Data" * 50000
8891+        self.uploadable = MutableDataHandle(self.test_data)
8892+
8893+
8894+    def test_datahandle_read(self):
8895+        chunk_size = 10
8896+        for i in xrange(0, len(self.test_data), chunk_size):
8897+            data = self.uploadable.read(chunk_size)
8898+            data = "".join(data)
8899+            start = i
8900+            end = i + chunk_size
8901+            self.failUnlessEqual(data, self.test_data[start:end])
8902+
8903+
8904+    def test_datahandle_get_size(self):
8905+        actual_size = len(self.test_data)
8906+        size = self.uploadable.get_size()
8907+        self.failUnlessEqual(size, actual_size)
8908+
8909+
8910+    def test_datahandle_get_size_out_of_order(self):
8911+        # We should be able to call get_size whenever we want without
8912+        # disturbing the location of the seek pointer.
8913+        chunk_size = 100
8914+        data = self.uploadable.read(chunk_size)
8915+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8916+
8917+        # Now get the size.
8918+        size = self.uploadable.get_size()
8919+        self.failUnlessEqual(size, len(self.test_data))
8920+
8921+        # Now get more data. We should be right where we left off.
8922+        more_data = self.uploadable.read(chunk_size)
8923+        start = chunk_size
8924+        end = chunk_size * 2
8925+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8926}
8927[Alter tests to work with the new APIs
8928Kevan Carstensen <kevan@isnotajoke.com>**20100708000031
8929 Ignore-this: 1f377904ac61ce40e9a04716fbd2ad95
8930] {
8931hunk ./src/allmydata/test/common.py 12
8932 from allmydata import uri, dirnode, client
8933 from allmydata.introducer.server import IntroducerNode
8934 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
8935-     FileTooLargeError, NotEnoughSharesError, ICheckable
8936+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
8937+     IMutableUploadable
8938 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
8939      DeepCheckResults, DeepCheckAndRepairResults
8940 from allmydata.mutable.common import CorruptShareError
8941hunk ./src/allmydata/test/common.py 18
8942 from allmydata.mutable.layout import unpack_header
8943+from allmydata.mutable.publish import MutableDataHandle
8944 from allmydata.storage.server import storage_index_to_dir
8945 from allmydata.storage.mutable import MutableShareFile
8946 from allmydata.util import hashutil, log, fileutil, pollmixin
8947hunk ./src/allmydata/test/common.py 182
8948         self.init_from_cap(make_mutable_file_cap())
8949     def create(self, contents, key_generator=None, keysize=None):
8950         initial_contents = self._get_initial_contents(contents)
8951-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
8952+        if initial_contents.get_size() > self.MUTABLE_SIZELIMIT:
8953             raise FileTooLargeError("SDMF is limited to one segment, and "
8954hunk ./src/allmydata/test/common.py 184
8955-                                    "%d > %d" % (len(initial_contents),
8956+                                    "%d > %d" % (initial_contents.get_size(),
8957                                                  self.MUTABLE_SIZELIMIT))
8958hunk ./src/allmydata/test/common.py 186
8959-        self.all_contents[self.storage_index] = initial_contents
8960+        data = initial_contents.read(initial_contents.get_size())
8961+        data = "".join(data)
8962+        self.all_contents[self.storage_index] = data
8963         return defer.succeed(self)
8964     def _get_initial_contents(self, contents):
8965hunk ./src/allmydata/test/common.py 191
8966-        if isinstance(contents, str):
8967-            return contents
8968         if contents is None:
8969hunk ./src/allmydata/test/common.py 192
8970-            return ""
8971+            return MutableDataHandle("")
8972+
8973+        if IMutableUploadable.providedBy(contents):
8974+            return contents
8975+
8976         assert callable(contents), "%s should be callable, not %s" % \
8977                (contents, type(contents))
8978         return contents(self)
8979hunk ./src/allmydata/test/common.py 309
8980         return defer.succeed(self.all_contents[self.storage_index])
8981 
8982     def overwrite(self, new_contents):
8983-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
8984+        if new_contents.get_size() > self.MUTABLE_SIZELIMIT:
8985             raise FileTooLargeError("SDMF is limited to one segment, and "
8986hunk ./src/allmydata/test/common.py 311
8987-                                    "%d > %d" % (len(new_contents),
8988+                                    "%d > %d" % (new_contents.get_size(),
8989                                                  self.MUTABLE_SIZELIMIT))
8990         assert not self.is_readonly()
8991hunk ./src/allmydata/test/common.py 314
8992-        self.all_contents[self.storage_index] = new_contents
8993+        new_data = new_contents.read(new_contents.get_size())
8994+        new_data = "".join(new_data)
8995+        self.all_contents[self.storage_index] = new_data
8996         return defer.succeed(None)
8997     def modify(self, modifier):
8998         # this does not implement FileTooLargeError, but the real one does
8999hunk ./src/allmydata/test/common.py 324
9000     def _modify(self, modifier):
9001         assert not self.is_readonly()
9002         old_contents = self.all_contents[self.storage_index]
9003-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9004+        new_data = modifier(old_contents, None, True)
9005+        if new_data is not None:
9006+            new_data = new_data.read(new_data.get_size())
9007+            new_data = "".join(new_data)
9008+        self.all_contents[self.storage_index] = new_data
9009         return None
9010 
9011 def make_mutable_file_cap():
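
The read(get_size()) plus "".join() idiom above recurs throughout this patch wherever a test needs the uploadable's contents back as a plain string. A tiny, hypothetical helper expressing the same thing:

    def uploadable_to_string(uploadable):
        # Drain an IMutableUploadable into a single string; read() may
        # return its data as several chunks, so join them.
        chunks = uploadable.read(uploadable.get_size())
        return "".join(chunks)
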
9012hunk ./src/allmydata/test/test_checker.py 11
9013 from allmydata.test.no_network import GridTestMixin
9014 from allmydata.immutable.upload import Data
9015 from allmydata.test.common_web import WebRenderingMixin
9016+from allmydata.mutable.publish import MutableDataHandle
9017 
9018 class FakeClient:
9019     def get_storage_broker(self):
9020hunk ./src/allmydata/test/test_checker.py 291
9021         def _stash_immutable(ur):
9022             self.imm = c0.create_node_from_uri(ur.uri)
9023         d.addCallback(_stash_immutable)
9024-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9025+        d.addCallback(lambda ign:
9026+            c0.create_mutable_file(MutableDataHandle("contents")))
9027         def _stash_mutable(node):
9028             self.mut = node
9029         d.addCallback(_stash_mutable)
9030hunk ./src/allmydata/test/test_cli.py 12
9031 from allmydata.util import fileutil, hashutil, base32
9032 from allmydata import uri
9033 from allmydata.immutable import upload
9034+from allmydata.mutable.publish import MutableDataHandle
9035 from allmydata.dirnode import normalize
9036 
9037 # Test that the scripts can be imported -- although the actual tests of their
9038hunk ./src/allmydata/test/test_cli.py 1983
9039         self.set_up_grid()
9040         c0 = self.g.clients[0]
9041         DATA = "data" * 100
9042-        d = c0.create_mutable_file(DATA)
9043+        DATA_uploadable = MutableDataHandle(DATA)
9044+        d = c0.create_mutable_file(DATA_uploadable)
9045         def _stash_uri(n):
9046             self.uri = n.get_uri()
9047         d.addCallback(_stash_uri)
9048hunk ./src/allmydata/test/test_cli.py 2085
9049                                            upload.Data("literal",
9050                                                         convergence="")))
9051         d.addCallback(_stash_uri, "small")
9052-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9053+        d.addCallback(lambda ign:
9054+            c0.create_mutable_file(MutableDataHandle(DATA+"1")))
9055         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9056         d.addCallback(_stash_uri, "mutable")
9057 
9058hunk ./src/allmydata/test/test_cli.py 2104
9059         # root/small
9060         # root/mutable
9061 
9062+        # We haven't broken anything yet, so this should all be healthy.
9063         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9064                                               self.rooturi))
9065         def _check2((rc, out, err)):
9066hunk ./src/allmydata/test/test_cli.py 2119
9067                             in lines, out)
9068         d.addCallback(_check2)
9069 
9070+        # Similarly, all of these results should be as we expect them to
9071+        # be for a healthy file layout.
9072         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9073         def _check_stats((rc, out, err)):
9074             self.failUnlessReallyEqual(err, "")
9075hunk ./src/allmydata/test/test_cli.py 2136
9076             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9077         d.addCallback(_check_stats)
9078 
9079+        # Now we break things.
9080         def _clobber_shares(ignored):
9081             shares = self.find_shares(self.uris[u"gööd"])
9082             self.failUnlessReallyEqual(len(shares), 10)
9083hunk ./src/allmydata/test/test_cli.py 2155
9084         d.addCallback(_clobber_shares)
9085 
9086         # root
9087-        # root/gööd  [9 shares]
9088+        # root/gööd  [1 missing share]
9089         # root/small
9090         # root/mutable [1 corrupt share]
9091 
9092hunk ./src/allmydata/test/test_cli.py 2161
9093         d.addCallback(lambda ign:
9094                       self.do_cli("deep-check", "--verbose", self.rooturi))
9095+        # This should reveal the missing share, but not the corrupt
9096+        # share, since we didn't tell the deep check operation to also
9097+        # verify.
9098         def _check3((rc, out, err)):
9099             self.failUnlessReallyEqual(err, "")
9100             self.failUnlessReallyEqual(rc, 0)
9101hunk ./src/allmydata/test/test_cli.py 2212
9102                                   "--verbose", "--verify", "--repair",
9103                                   self.rooturi))
9104         def _check6((rc, out, err)):
9105+            # We've just repaired the directory. There is no reason for
9106+            # that repair to be unsuccessful.
9107             self.failUnlessReallyEqual(err, "")
9108             self.failUnlessReallyEqual(rc, 0)
9109             lines = out.splitlines()
9110hunk ./src/allmydata/test/test_deepcheck.py 9
9111 from twisted.internet import threads # CLI tests use deferToThread
9112 from allmydata.immutable import upload
9113 from allmydata.mutable.common import UnrecoverableFileError
9114+from allmydata.mutable.publish import MutableDataHandle
9115 from allmydata.util import idlib
9116 from allmydata.util import base32
9117 from allmydata.scripts import runner
9118hunk ./src/allmydata/test/test_deepcheck.py 38
9119         self.basedir = "deepcheck/MutableChecker/good"
9120         self.set_up_grid()
9121         CONTENTS = "a little bit of data"
9122-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9123+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9124+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9125         def _created(node):
9126             self.node = node
9127             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9128hunk ./src/allmydata/test/test_deepcheck.py 61
9129         self.basedir = "deepcheck/MutableChecker/corrupt"
9130         self.set_up_grid()
9131         CONTENTS = "a little bit of data"
9132-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9133+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9134+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9135         def _stash_and_corrupt(node):
9136             self.node = node
9137             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9138hunk ./src/allmydata/test/test_deepcheck.py 99
9139         self.basedir = "deepcheck/MutableChecker/delete_share"
9140         self.set_up_grid()
9141         CONTENTS = "a little bit of data"
9142-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9143+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9144+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9145         def _stash_and_delete(node):
9146             self.node = node
9147             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9148hunk ./src/allmydata/test/test_deepcheck.py 223
9149             self.root = n
9150             self.root_uri = n.get_uri()
9151         d.addCallback(_created_root)
9152-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9153+        d.addCallback(lambda ign:
9154+            c0.create_mutable_file(MutableDataHandle("mutable file contents")))
9155         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9156         def _created_mutable(n):
9157             self.mutable = n
9158hunk ./src/allmydata/test/test_deepcheck.py 965
9159     def create_mangled(self, ignored, name):
9160         nodetype, mangletype = name.split("-", 1)
9161         if nodetype == "mutable":
9162-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9163+            mutable_uploadable = MutableDataHandle("mutable file contents")
9164+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9165             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9166         elif nodetype == "large":
9167             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9168hunk ./src/allmydata/test/test_dirnode.py 1281
9169     implements(IMutableFileNode)
9170     counter = 0
9171     def __init__(self, initial_contents=""):
9172-        self.data = self._get_initial_contents(initial_contents)
9173+        data = self._get_initial_contents(initial_contents)
9174+        self.data = data.read(data.get_size())
9175+        self.data = "".join(self.data)
9176+
9177         counter = FakeMutableFile.counter
9178         FakeMutableFile.counter += 1
9179         writekey = hashutil.ssk_writekey_hash(str(counter))
9180hunk ./src/allmydata/test/test_dirnode.py 1331
9181         pass
9182 
9183     def modify(self, modifier):
9184-        self.data = modifier(self.data, None, True)
9185+        data = modifier(self.data, None, True)
9186+        self.data = data.read(data.get_size())
9187+        self.data = "".join(self.data)
9188         return defer.succeed(None)
9189 
9190 class FakeNodeMaker(NodeMaker):
9191hunk ./src/allmydata/test/test_hung_server.py 10
9192 from allmydata.util.consumer import download_to_data
9193 from allmydata.immutable import upload
9194 from allmydata.mutable.common import UnrecoverableFileError
9195+from allmydata.mutable.publish import MutableDataHandle
9196 from allmydata.storage.common import storage_index_to_dir
9197 from allmydata.test.no_network import GridTestMixin
9198 from allmydata.test.common import ShouldFailMixin, _corrupt_share_data
9199hunk ./src/allmydata/test/test_hung_server.py 96
9200         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9201 
9202         if mutable:
9203-            d = nm.create_mutable_file(mutable_plaintext)
9204+            uploadable = MutableDataHandle(mutable_plaintext)
9205+            d = nm.create_mutable_file(uploadable)
9206             def _uploaded_mutable(node):
9207                 self.uri = node.get_uri()
9208                 self.shares = self.find_shares(self.uri)
9209hunk ./src/allmydata/test/test_mutable.py 297
9210             d.addCallback(lambda smap: smap.dump(StringIO()))
9211             d.addCallback(lambda sio:
9212                           self.failUnless("3-of-10" in sio.getvalue()))
9213-            d.addCallback(lambda res: n.overwrite("contents 1"))
9214+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9215             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9216             d.addCallback(lambda res: n.download_best_version())
9217             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9218hunk ./src/allmydata/test/test_mutable.py 304
9219             d.addCallback(lambda res: n.get_size_of_best_version())
9220             d.addCallback(lambda size:
9221                           self.failUnlessEqual(size, len("contents 1")))
9222-            d.addCallback(lambda res: n.overwrite("contents 2"))
9223+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9224             d.addCallback(lambda res: n.download_best_version())
9225             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9226             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9227hunk ./src/allmydata/test/test_mutable.py 308
9228-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9229+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9230             d.addCallback(lambda res: n.download_best_version())
9231             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9232             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9233hunk ./src/allmydata/test/test_mutable.py 320
9234             # mapupdate-to-retrieve data caching (i.e. make the shares larger
9235             # than the default readsize, which is 2000 bytes). A 15kB file
9236             # will have 5kB shares.
9237-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
9238+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("large size file" * 1000)))
9239             d.addCallback(lambda res: n.download_best_version())
9240             d.addCallback(lambda res:
9241                           self.failUnlessEqual(res, "large size file" * 1000))
9242hunk ./src/allmydata/test/test_mutable.py 343
9243             # to make them big enough to force the file to be uploaded
9244             # in more than one segment.
9245             big_contents = "contents1" * 100000 # about 900 KiB
9246+            big_contents_uploadable = MutableDataHandle(big_contents)
9247             d.addCallback(lambda ignored:
9248hunk ./src/allmydata/test/test_mutable.py 345
9249-                n.overwrite(big_contents))
9250+                n.overwrite(big_contents_uploadable))
9251             d.addCallback(lambda ignored:
9252                 n.download_best_version())
9253             d.addCallback(lambda data:
9254hunk ./src/allmydata/test/test_mutable.py 355
9255             # segments, so that we make the downloader deal with
9256             # multiple segments.
9257             bigger_contents = "contents2" * 1000000 # about 9MiB
9258+            bigger_contents_uploadable = MutableDataHandle(bigger_contents)
9259             d.addCallback(lambda ignored:
9260hunk ./src/allmydata/test/test_mutable.py 357
9261-                n.overwrite(bigger_contents))
9262+                n.overwrite(bigger_contents_uploadable))
9263             d.addCallback(lambda ignored:
9264                 n.download_best_version())
9265             d.addCallback(lambda data:
9266hunk ./src/allmydata/test/test_mutable.py 368
9267 
9268 
9269     def test_create_with_initial_contents(self):
9270-        d = self.nodemaker.create_mutable_file("contents 1")
9271+        upload1 = MutableDataHandle("contents 1")
9272+        d = self.nodemaker.create_mutable_file(upload1)
9273         def _created(n):
9274             d = n.download_best_version()
9275             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9276hunk ./src/allmydata/test/test_mutable.py 373
9277-            d.addCallback(lambda res: n.overwrite("contents 2"))
9278+            upload2 = MutableDataHandle("contents 2")
9279+            d.addCallback(lambda res: n.overwrite(upload2))
9280             d.addCallback(lambda res: n.download_best_version())
9281             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9282             return d
9283hunk ./src/allmydata/test/test_mutable.py 380
9284         d.addCallback(_created)
9285         return d
9286+    test_create_with_initial_contents.timeout = 15
9287 
9288 
9289     def test_create_mdmf_with_initial_contents(self):
9290hunk ./src/allmydata/test/test_mutable.py 385
9291         initial_contents = "foobarbaz" * 131072 # 900KiB
9292-        d = self.nodemaker.create_mutable_file(initial_contents,
9293+        initial_contents_uploadable = MutableDataHandle(initial_contents)
9294+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
9295                                                version=MDMF_VERSION)
9296         def _created(n):
9297             d = n.download_best_version()
9298hunk ./src/allmydata/test/test_mutable.py 392
9299             d.addCallback(lambda data:
9300                 self.failUnlessEqual(data, initial_contents))
9301+            uploadable2 = MutableDataHandle(initial_contents + "foobarbaz")
9302             d.addCallback(lambda ignored:
9303hunk ./src/allmydata/test/test_mutable.py 394
9304-                n.overwrite(initial_contents + "foobarbaz"))
9305+                n.overwrite(uploadable2))
9306             d.addCallback(lambda ignored:
9307                 n.download_best_version())
9308             d.addCallback(lambda data:
9309hunk ./src/allmydata/test/test_mutable.py 413
9310             key = n.get_writekey()
9311             self.failUnless(isinstance(key, str), key)
9312             self.failUnlessEqual(len(key), 16) # AES key size
9313-            return data
9314+            return MutableDataHandle(data)
9315         d = self.nodemaker.create_mutable_file(_make_contents)
9316         def _created(n):
9317             return n.download_best_version()
9318hunk ./src/allmydata/test/test_mutable.py 429
9319             key = n.get_writekey()
9320             self.failUnless(isinstance(key, str), key)
9321             self.failUnlessEqual(len(key), 16)
9322-            return data
9323+            return MutableDataHandle(data)
9324         d = self.nodemaker.create_mutable_file(_make_contents,
9325                                                version=MDMF_VERSION)
9326         d.addCallback(lambda n:
9327hunk ./src/allmydata/test/test_mutable.py 441
9328 
9329     def test_create_with_too_large_contents(self):
9330         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9331-        d = self.nodemaker.create_mutable_file(BIG)
9332+        BIG_uploadable = MutableDataHandle(BIG)
9333+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
9334         def _created(n):
9335hunk ./src/allmydata/test/test_mutable.py 444
9336-            d = n.overwrite(BIG)
9337+            other_BIG_uploadable = MutableDataHandle(BIG)
9338+            d = n.overwrite(other_BIG_uploadable)
9339             return d
9340         d.addCallback(_created)
9341         return d
9342hunk ./src/allmydata/test/test_mutable.py 459
9343 
9344     def test_modify(self):
9345         def _modifier(old_contents, servermap, first_time):
9346-            return old_contents + "line2"
9347+            new_contents = old_contents + "line2"
9348+            return MutableDataHandle(new_contents)
9349         def _non_modifier(old_contents, servermap, first_time):
9350hunk ./src/allmydata/test/test_mutable.py 462
9351-            return old_contents
9352+            return MutableDataHandle(old_contents)
9353         def _none_modifier(old_contents, servermap, first_time):
9354             return None
9355         def _error_modifier(old_contents, servermap, first_time):
9356hunk ./src/allmydata/test/test_mutable.py 468
9357             raise ValueError("oops")
9358         def _toobig_modifier(old_contents, servermap, first_time):
9359-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
9360+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9361+            return MutableDataHandle(new_content)
9362         calls = []
9363         def _ucw_error_modifier(old_contents, servermap, first_time):
9364             # simulate an UncoordinatedWriteError once
9365hunk ./src/allmydata/test/test_mutable.py 476
9366             calls.append(1)
9367             if len(calls) <= 1:
9368                 raise UncoordinatedWriteError("simulated")
9369-            return old_contents + "line3"
9370+            new_contents = old_contents + "line3"
9371+            return MutableDataHandle(new_contents)
9372         def _ucw_error_non_modifier(old_contents, servermap, first_time):
9373             # simulate an UncoordinatedWriteError once, and don't actually
9374             # modify the contents on subsequent invocations
9375hunk ./src/allmydata/test/test_mutable.py 484
9376             calls.append(1)
9377             if len(calls) <= 1:
9378                 raise UncoordinatedWriteError("simulated")
9379-            return old_contents
9380+            return MutableDataHandle(old_contents)
9381 
9382hunk ./src/allmydata/test/test_mutable.py 486
9383-        d = self.nodemaker.create_mutable_file("line1")
9384+        initial_contents = "line1"
9385+        d = self.nodemaker.create_mutable_file(MutableDataHandle(initial_contents))
9386         def _created(n):
9387             d = n.modify(_modifier)
9388             d.addCallback(lambda res: n.download_best_version())
9389hunk ./src/allmydata/test/test_mutable.py 548
9390 
9391     def test_modify_backoffer(self):
9392         def _modifier(old_contents, servermap, first_time):
9393-            return old_contents + "line2"
9394+            return MutableDataHandle(old_contents + "line2")
9395         calls = []
9396         def _ucw_error_modifier(old_contents, servermap, first_time):
9397             # simulate an UncoordinatedWriteError once
9398hunk ./src/allmydata/test/test_mutable.py 555
9399             calls.append(1)
9400             if len(calls) <= 1:
9401                 raise UncoordinatedWriteError("simulated")
9402-            return old_contents + "line3"
9403+            return MutableDataHandle(old_contents + "line3")
9404         def _always_ucw_error_modifier(old_contents, servermap, first_time):
9405             raise UncoordinatedWriteError("simulated")
9406         def _backoff_stopper(node, f):
9407hunk ./src/allmydata/test/test_mutable.py 570
9408         giveuper._delay = 0.1
9409         giveuper.factor = 1
9410 
9411-        d = self.nodemaker.create_mutable_file("line1")
9412+        d = self.nodemaker.create_mutable_file(MutableDataHandle("line1"))
9413         def _created(n):
9414             d = n.modify(_modifier)
9415             d.addCallback(lambda res: n.download_best_version())
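
As the test changes above show, a modifier passed to modify() still receives the old contents as a string but must now return its replacement wrapped in an uploadable (or None to leave the file untouched). A small sketch, assuming MutableDataHandle is importable from allmydata.mutable.publish as in these tests:

    from allmydata.mutable.publish import MutableDataHandle

    def append_line2(old_contents, servermap, first_time):
        # returning None tells modify() not to write anything
        if old_contents.endswith("line2"):
            return None
        return MutableDataHandle(old_contents + "line2")

    # d = node.modify(append_line2)
    # d.addCallback(lambda ign: node.download_best_version())
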
9416hunk ./src/allmydata/test/test_mutable.py 620
9417             d.addCallback(lambda smap: smap.dump(StringIO()))
9418             d.addCallback(lambda sio:
9419                           self.failUnless("3-of-10" in sio.getvalue()))
9420-            d.addCallback(lambda res: n.overwrite("contents 1"))
9421+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9422             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9423             d.addCallback(lambda res: n.download_best_version())
9424             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9425hunk ./src/allmydata/test/test_mutable.py 624
9426-            d.addCallback(lambda res: n.overwrite("contents 2"))
9427+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9428             d.addCallback(lambda res: n.download_best_version())
9429             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9430             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9431hunk ./src/allmydata/test/test_mutable.py 628
9432-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9433+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9434             d.addCallback(lambda res: n.download_best_version())
9435             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9436             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9437hunk ./src/allmydata/test/test_mutable.py 646
9438         # publish a file and create shares, which can then be manipulated
9439         # later.
9440         self.CONTENTS = "New contents go here" * 1000
9441+        self.uploadable = MutableDataHandle(self.CONTENTS)
9442         self._storage = FakeStorage()
9443         self._nodemaker = make_nodemaker(self._storage)
9444         self._storage_broker = self._nodemaker.storage_broker
9445hunk ./src/allmydata/test/test_mutable.py 650
9446-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9447+        d = self._nodemaker.create_mutable_file(self.uploadable)
9448         def _created(node):
9449             self._fn = node
9450             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9451hunk ./src/allmydata/test/test_mutable.py 662
9452         # an MDMF file.
9453         # self.CONTENTS should have more than one segment.
9454         self.CONTENTS = "This is an MDMF file" * 100000
9455+        self.uploadable = MutableDataHandle(self.CONTENTS)
9456         self._storage = FakeStorage()
9457         self._nodemaker = make_nodemaker(self._storage)
9458         self._storage_broker = self._nodemaker.storage_broker
9459hunk ./src/allmydata/test/test_mutable.py 666
9460-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
9461+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
9462         def _created(node):
9463             self._fn = node
9464             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9465hunk ./src/allmydata/test/test_mutable.py 678
9466         # like publish_one, except that the result is guaranteed to be
9467         # an SDMF file
9468         self.CONTENTS = "This is an SDMF file" * 1000
9469+        self.uploadable = MutableDataHandle(self.CONTENTS)
9470         self._storage = FakeStorage()
9471         self._nodemaker = make_nodemaker(self._storage)
9472         self._storage_broker = self._nodemaker.storage_broker
9473hunk ./src/allmydata/test/test_mutable.py 682
9474-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
9475+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
9476         def _created(node):
9477             self._fn = node
9478             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9479hunk ./src/allmydata/test/test_mutable.py 696
9480                          "Contents 2",
9481                          "Contents 3a",
9482                          "Contents 3b"]
9483+        self.uploadables = [MutableDataHandle(d) for d in self.CONTENTS]
9484         self._copied_shares = {}
9485         self._storage = FakeStorage()
9486         self._nodemaker = make_nodemaker(self._storage)
9487hunk ./src/allmydata/test/test_mutable.py 700
9488-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
9489+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
9490         def _created(node):
9491             self._fn = node
9492             # now create multiple versions of the same file, and accumulate
9493hunk ./src/allmydata/test/test_mutable.py 707
9494             # their shares, so we can mix and match them later.
9495             d = defer.succeed(None)
9496             d.addCallback(self._copy_shares, 0)
9497-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
9498+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
9499             d.addCallback(self._copy_shares, 1)
9500hunk ./src/allmydata/test/test_mutable.py 709
9501-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
9502+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
9503             d.addCallback(self._copy_shares, 2)
9504hunk ./src/allmydata/test/test_mutable.py 711
9505-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
9506+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
9507             d.addCallback(self._copy_shares, 3)
9508             # now we replace all the shares with version s3, and upload a new
9509             # version to get s4b.
9510hunk ./src/allmydata/test/test_mutable.py 717
9511             rollback = dict([(i,2) for i in range(10)])
9512             d.addCallback(lambda res: self._set_versions(rollback))
9513-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
9514+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
9515             d.addCallback(self._copy_shares, 4)
9516             # we leave the storage in state 4
9517             return d
9518hunk ./src/allmydata/test/test_mutable.py 826
9519         # create a new file, which is large enough to knock the privkey out
9520         # of the early part of the file
9521         LARGE = "These are Larger contents" * 200 # about 5KB
9522-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
9523+        LARGE_uploadable = MutableDataHandle(LARGE)
9524+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
9525         def _created(large_fn):
9526             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
9527             return self.make_servermap(MODE_WRITE, large_fn2)
9528hunk ./src/allmydata/test/test_mutable.py 1842
9529 class MultipleEncodings(unittest.TestCase):
9530     def setUp(self):
9531         self.CONTENTS = "New contents go here"
9532+        self.uploadable = MutableDataHandle(self.CONTENTS)
9533         self._storage = FakeStorage()
9534         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
9535         self._storage_broker = self._nodemaker.storage_broker
9536hunk ./src/allmydata/test/test_mutable.py 1846
9537-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9538+        d = self._nodemaker.create_mutable_file(self.uploadable)
9539         def _created(node):
9540             self._fn = node
9541         d.addCallback(_created)
9542hunk ./src/allmydata/test/test_mutable.py 1872
9543         s = self._storage
9544         s._peers = {} # clear existing storage
9545         p2 = Publish(fn2, self._storage_broker, None)
9546-        d = p2.publish(data)
9547+        uploadable = MutableDataHandle(data)
9548+        d = p2.publish(uploadable)
9549         def _published(res):
9550             shares = s._peers
9551             s._peers = {}
9552hunk ./src/allmydata/test/test_mutable.py 2049
9553         self._set_versions(target)
9554 
9555         def _modify(oldversion, servermap, first_time):
9556-            return oldversion + " modified"
9557+            return MutableDataHandle(oldversion + " modified")
9558         d = self._fn.modify(_modify)
9559         d.addCallback(lambda res: self._fn.download_best_version())
9560         expected = self.CONTENTS[2] + " modified"
9561hunk ./src/allmydata/test/test_mutable.py 2175
9562         self.basedir = "mutable/Problems/test_publish_surprise"
9563         self.set_up_grid()
9564         nm = self.g.clients[0].nodemaker
9565-        d = nm.create_mutable_file("contents 1")
9566+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9567         def _created(n):
9568             d = defer.succeed(None)
9569             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9570hunk ./src/allmydata/test/test_mutable.py 2185
9571             d.addCallback(_got_smap1)
9572             # then modify the file, leaving the old map untouched
9573             d.addCallback(lambda res: log.msg("starting winning write"))
9574-            d.addCallback(lambda res: n.overwrite("contents 2"))
9575+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9576             # now attempt to modify the file with the old servermap. This
9577             # will look just like an uncoordinated write, in which every
9578             # single share got updated between our mapupdate and our publish
9579hunk ./src/allmydata/test/test_mutable.py 2194
9580                           self.shouldFail(UncoordinatedWriteError,
9581                                           "test_publish_surprise", None,
9582                                           n.upload,
9583-                                          "contents 2a", self.old_map))
9584+                                          MutableDataHandle("contents 2a"), self.old_map))
9585             return d
9586         d.addCallback(_created)
9587         return d
9588hunk ./src/allmydata/test/test_mutable.py 2203
9589         self.basedir = "mutable/Problems/test_retrieve_surprise"
9590         self.set_up_grid()
9591         nm = self.g.clients[0].nodemaker
9592-        d = nm.create_mutable_file("contents 1")
9593+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9594         def _created(n):
9595             d = defer.succeed(None)
9596             d.addCallback(lambda res: n.get_servermap(MODE_READ))
9597hunk ./src/allmydata/test/test_mutable.py 2213
9598             d.addCallback(_got_smap1)
9599             # then modify the file, leaving the old map untouched
9600             d.addCallback(lambda res: log.msg("starting winning write"))
9601-            d.addCallback(lambda res: n.overwrite("contents 2"))
9602+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9603             # now attempt to retrieve the old version with the old servermap.
9604             # This will look like someone has changed the file since we
9605             # updated the servermap.
9606hunk ./src/allmydata/test/test_mutable.py 2241
9607         self.basedir = "mutable/Problems/test_unexpected_shares"
9608         self.set_up_grid()
9609         nm = self.g.clients[0].nodemaker
9610-        d = nm.create_mutable_file("contents 1")
9611+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9612         def _created(n):
9613             d = defer.succeed(None)
9614             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9615hunk ./src/allmydata/test/test_mutable.py 2253
9616                 self.g.remove_server(peer0)
9617                 # then modify the file, leaving the old map untouched
9618                 log.msg("starting winning write")
9619-                return n.overwrite("contents 2")
9620+                return n.overwrite(MutableDataHandle("contents 2"))
9621             d.addCallback(_got_smap1)
9622             # now attempt to modify the file with the old servermap. This
9623             # will look just like an uncoordinated write, in which every
9624hunk ./src/allmydata/test/test_mutable.py 2263
9625                           self.shouldFail(UncoordinatedWriteError,
9626                                           "test_surprise", None,
9627                                           n.upload,
9628-                                          "contents 2a", self.old_map))
9629+                                          MutableDataHandle("contents 2a"), self.old_map))
9630             return d
9631         d.addCallback(_created)
9632         return d
9633hunk ./src/allmydata/test/test_mutable.py 2267
9634+    test_unexpected_shares.timeout = 15
9635 
9636     def test_bad_server(self):
9637         # Break one server, then create the file: the initial publish should
9638hunk ./src/allmydata/test/test_mutable.py 2303
9639         d.addCallback(_break_peer0)
9640         # now "create" the file, using the pre-established key, and let the
9641         # initial publish finally happen
9642-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
9643+        d.addCallback(lambda res: nm.create_mutable_file(MutableDataHandle("contents 1")))
9644         # that ought to work
9645         def _got_node(n):
9646             d = n.download_best_version()
9647hunk ./src/allmydata/test/test_mutable.py 2312
9648             def _break_peer1(res):
9649                 self.connection1.broken = True
9650             d.addCallback(_break_peer1)
9651-            d.addCallback(lambda res: n.overwrite("contents 2"))
9652+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9653             # that ought to work too
9654             d.addCallback(lambda res: n.download_best_version())
9655             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9656hunk ./src/allmydata/test/test_mutable.py 2344
9657         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
9658         self.g.break_server(peerids[0])
9659 
9660-        d = nm.create_mutable_file("contents 1")
9661+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9662         def _created(n):
9663             d = n.download_best_version()
9664             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9665hunk ./src/allmydata/test/test_mutable.py 2352
9666             def _break_second_server(res):
9667                 self.g.break_server(peerids[1])
9668             d.addCallback(_break_second_server)
9669-            d.addCallback(lambda res: n.overwrite("contents 2"))
9670+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9671             # that ought to work too
9672             d.addCallback(lambda res: n.download_best_version())
9673             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9674hunk ./src/allmydata/test/test_mutable.py 2371
9675         d = self.shouldFail(NotEnoughServersError,
9676                             "test_publish_all_servers_bad",
9677                             "Ran out of non-bad servers",
9678-                            nm.create_mutable_file, "contents")
9679+                            nm.create_mutable_file, MutableDataHandle("contents"))
9680         return d
9681 
9682     def test_publish_no_servers(self):
9683hunk ./src/allmydata/test/test_mutable.py 2383
9684         d = self.shouldFail(NotEnoughServersError,
9685                             "test_publish_no_servers",
9686                             "Ran out of non-bad servers",
9687-                            nm.create_mutable_file, "contents")
9688+                            nm.create_mutable_file, MutableDataHandle("contents"))
9689         return d
9690     test_publish_no_servers.timeout = 30
9691 
9692hunk ./src/allmydata/test/test_mutable.py 2401
9693         # we need some contents that are large enough to push the privkey out
9694         # of the early part of the file
9695         LARGE = "These are Larger contents" * 2000 # about 50KB
9696-        d = nm.create_mutable_file(LARGE)
9697+        LARGE_uploadable = MutableDataHandle(LARGE)
9698+        d = nm.create_mutable_file(LARGE_uploadable)
9699         def _created(n):
9700             self.uri = n.get_uri()
9701             self.n2 = nm.create_from_cap(self.uri)
9702hunk ./src/allmydata/test/test_mutable.py 2438
9703         self.set_up_grid(num_servers=20)
9704         nm = self.g.clients[0].nodemaker
9705         LARGE = "These are Larger contents" * 2000 # about 50KiB
9706+        LARGE_uploadable = MutableDataHandle(LARGE)
9707         nm._node_cache = DevNullDictionary() # disable the nodecache
9708 
9709hunk ./src/allmydata/test/test_mutable.py 2441
9710-        d = nm.create_mutable_file(LARGE)
9711+        d = nm.create_mutable_file(LARGE_uploadable)
9712         def _created(n):
9713             self.uri = n.get_uri()
9714             self.n2 = nm.create_from_cap(self.uri)
9715hunk ./src/allmydata/test/test_mutable.py 2464
9716         self.set_up_grid(num_servers=20)
9717         nm = self.g.clients[0].nodemaker
9718         CONTENTS = "contents" * 2000
9719-        d = nm.create_mutable_file(CONTENTS)
9720+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9721+        d = nm.create_mutable_file(CONTENTS_uploadable)
9722         def _created(node):
9723             self._node = node
9724         d.addCallback(_created)
9725hunk ./src/allmydata/test/test_system.py 22
9726 from allmydata.monitor import Monitor
9727 from allmydata.mutable.common import NotWriteableError
9728 from allmydata.mutable import layout as mutable_layout
9729+from allmydata.mutable.publish import MutableDataHandle
9730 from foolscap.api import DeadReferenceError
9731 from twisted.python.failure import Failure
9732 from twisted.web.client import getPage
9733hunk ./src/allmydata/test/test_system.py 460
9734     def test_mutable(self):
9735         self.basedir = "system/SystemTest/test_mutable"
9736         DATA = "initial contents go here."  # 25 bytes % 3 != 0
9737+        DATA_uploadable = MutableDataHandle(DATA)
9738         NEWDATA = "new contents yay"
9739hunk ./src/allmydata/test/test_system.py 462
9740+        NEWDATA_uploadable = MutableDataHandle(NEWDATA)
9741         NEWERDATA = "this is getting old"
9742hunk ./src/allmydata/test/test_system.py 464
9743+        NEWERDATA_uploadable = MutableDataHandle(NEWERDATA)
9744 
9745         d = self.set_up_nodes(use_key_generator=True)
9746 
9747hunk ./src/allmydata/test/test_system.py 471
9748         def _create_mutable(res):
9749             c = self.clients[0]
9750             log.msg("starting create_mutable_file")
9751-            d1 = c.create_mutable_file(DATA)
9752+            d1 = c.create_mutable_file(DATA_uploadable)
9753             def _done(res):
9754                 log.msg("DONE: %s" % (res,))
9755                 self._mutable_node_1 = res
9756hunk ./src/allmydata/test/test_system.py 558
9757             self.failUnlessEqual(res, DATA)
9758             # replace the data
9759             log.msg("starting replace1")
9760-            d1 = newnode.overwrite(NEWDATA)
9761+            d1 = newnode.overwrite(NEWDATA_uploadable)
9762             d1.addCallback(lambda res: newnode.download_best_version())
9763             return d1
9764         d.addCallback(_check_download_3)
9765hunk ./src/allmydata/test/test_system.py 572
9766             newnode2 = self.clients[3].create_node_from_uri(uri)
9767             self._newnode3 = self.clients[3].create_node_from_uri(uri)
9768             log.msg("starting replace2")
9769-            d1 = newnode1.overwrite(NEWERDATA)
9770+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
9771             d1.addCallback(lambda res: newnode2.download_best_version())
9772             return d1
9773         d.addCallback(_check_download_4)
9774hunk ./src/allmydata/test/test_system.py 642
9775         def _check_empty_file(res):
9776             # make sure we can create empty files, this usually screws up the
9777             # segsize math
9778-            d1 = self.clients[2].create_mutable_file("")
9779+            d1 = self.clients[2].create_mutable_file(MutableDataHandle(""))
9780             d1.addCallback(lambda newnode: newnode.download_best_version())
9781             d1.addCallback(lambda res: self.failUnlessEqual("", res))
9782             return d1
9783hunk ./src/allmydata/test/test_system.py 673
9784                                  self.key_generator_svc.key_generator.pool_size + size_delta)
9785 
9786         d.addCallback(check_kg_poolsize, 0)
9787-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
9788+        d.addCallback(lambda junk:
9789+            self.clients[3].create_mutable_file(MutableDataHandle('hello, world')))
9790         d.addCallback(check_kg_poolsize, -1)
9791         d.addCallback(lambda junk: self.clients[3].create_dirnode())
9792         d.addCallback(check_kg_poolsize, -2)
9793hunk ./src/allmydata/test/test_web.py 3166
9794         def _stash_mutable_uri(n, which):
9795             self.uris[which] = n.get_uri()
9796             assert isinstance(self.uris[which], str)
9797-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9798+        d.addCallback(lambda ign:
9799+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9800         d.addCallback(_stash_mutable_uri, "corrupt")
9801         d.addCallback(lambda ign:
9802                       c0.upload(upload.Data("literal", convergence="")))
9803hunk ./src/allmydata/test/test_web.py 3313
9804         def _stash_mutable_uri(n, which):
9805             self.uris[which] = n.get_uri()
9806             assert isinstance(self.uris[which], str)
9807-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9808+        d.addCallback(lambda ign:
9809+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9810         d.addCallback(_stash_mutable_uri, "corrupt")
9811 
9812         def _compute_fileurls(ignored):
9813hunk ./src/allmydata/test/test_web.py 3976
9814         def _stash_mutable_uri(n, which):
9815             self.uris[which] = n.get_uri()
9816             assert isinstance(self.uris[which], str)
9817-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
9818+        d.addCallback(lambda ign:
9819+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"2")))
9820         d.addCallback(_stash_mutable_uri, "mutable")
9821 
9822         def _compute_fileurls(ignored):
9823hunk ./src/allmydata/test/test_web.py 4076
9824                                                         convergence="")))
9825         d.addCallback(_stash_uri, "small")
9826 
9827-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
9828+        d.addCallback(lambda ign:
9829+            c0.create_mutable_file(publish.MutableDataHandle("mutable")))
9830         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9831         d.addCallback(_stash_uri, "mutable")
9832 
9833}
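
The test hunks above all make the same mechanical change: anywhere a test used to hand a bare string to create_mutable_file() or overwrite(), it now wraps that string in a MutableDataHandle first. A minimal sketch of the new calling convention, using the same names the hunks use (nm is the nodemaker fixture from the tests; the callback chain here is illustrative rather than copied from any one test):

    from allmydata.mutable.publish import MutableDataHandle

    d = nm.create_mutable_file(MutableDataHandle("contents 1"))
    def _created(n):
        # overwrite() now also takes an uploadable rather than a str
        return n.overwrite(MutableDataHandle("contents 2"))
    d.addCallback(_created)
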
9834[Alter mutable files to use file-like objects for publishing instead of strings.
9835Kevan Carstensen <kevan@isnotajoke.com>**20100708000732
9836 Ignore-this: 8dd07d95386b6d540bc21289f981ebd0
9837] {
9838hunk ./src/allmydata/dirnode.py 11
9839 from allmydata.mutable.common import NotWriteableError
9840 from allmydata.mutable.filenode import MutableFileNode
9841 from allmydata.unknown import UnknownNode, strip_prefix_for_ro
9842+from allmydata.mutable.publish import MutableDataHandle
9843 from allmydata.interfaces import IFilesystemNode, IDirectoryNode, IFileNode, \
9844      IImmutableFileNode, IMutableFileNode, \
9845      ExistingChildError, NoSuchChildError, ICheckable, IDeepCheckable, \
9846hunk ./src/allmydata/dirnode.py 104
9847 
9848         del children[self.name]
9849         new_contents = self.node._pack_contents(children)
9850-        return new_contents
9851+        uploadable = MutableDataHandle(new_contents)
9852+        return uploadable
9853 
9854 
9855 class MetadataSetter:
9856hunk ./src/allmydata/dirnode.py 130
9857 
9858         children[name] = (child, metadata)
9859         new_contents = self.node._pack_contents(children)
9860-        return new_contents
9861+        uploadable = MutableDataHandle(new_contents)
9862+        return uploadable
9863 
9864 
9865 class Adder:
9866hunk ./src/allmydata/dirnode.py 175
9867 
9868             children[name] = (child, metadata)
9869         new_contents = self.node._pack_contents(children)
9870-        return new_contents
9871+        uploadable = MutableDataHandle(new_contents)
9872+        return uploadable
9873 
9874 
9875 def _encrypt_rw_uri(filenode, rw_uri):
9876hunk ./src/allmydata/mutable/filenode.py 7
9877 from zope.interface import implements
9878 from twisted.internet import defer, reactor
9879 from foolscap.api import eventually
9880-from allmydata.interfaces import IMutableFileNode, \
9881-     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
9882+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
9883+                                 NotEnoughSharesError, \
9884+                                 MDMF_VERSION, SDMF_VERSION, IMutableUploadable
9885 from allmydata.util import hashutil, log
9886 from allmydata.util.assertutil import precondition
9887 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
9888hunk ./src/allmydata/mutable/filenode.py 16
9889 from allmydata.monitor import Monitor
9890 from pycryptopp.cipher.aes import AES
9891 
9892-from allmydata.mutable.publish import Publish
9893+from allmydata.mutable.publish import Publish, MutableFileHandle, \
9894+                                      MutableDataHandle
9895 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
9896      ResponseCache, UncoordinatedWriteError
9897 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
9898hunk ./src/allmydata/mutable/filenode.py 133
9899         return self._upload(initial_contents, None)
9900 
9901     def _get_initial_contents(self, contents):
9902-        if isinstance(contents, str):
9903-            return contents
9904         if contents is None:
9905hunk ./src/allmydata/mutable/filenode.py 134
9906-            return ""
9907+            return MutableDataHandle("")
9908+
9909+        if IMutableUploadable.providedBy(contents):
9910+            return contents
9911+
9912         assert callable(contents), "%s should be callable, not %s" % \
9913                (contents, type(contents))
9914         return contents(self)
9915hunk ./src/allmydata/mutable/filenode.py 353
9916     def overwrite(self, new_contents):
9917         return self._do_serialized(self._overwrite, new_contents)
9918     def _overwrite(self, new_contents):
9919+        assert IMutableUploadable.providedBy(new_contents)
9920+
9921         servermap = ServerMap()
9922         d = self._update_servermap(servermap, mode=MODE_WRITE)
9923         d.addCallback(lambda ignored: self._upload(new_contents, servermap))
9924hunk ./src/allmydata/mutable/filenode.py 431
9925                 # recovery when it observes UCWE, we need to do a second
9926                 # publish. See #551 for details. We'll basically loop until
9927                 # we managed an uncontested publish.
9928-                new_contents = old_contents
9929-            precondition(isinstance(new_contents, str),
9930-                         "Modifier function must return a string or None")
9931+                old_uploadable = MutableDataHandle(old_contents)
9932+                new_contents = old_uploadable
9933+            precondition((IMutableUploadable.providedBy(new_contents) or
9934+                          new_contents is None),
9935+                         "Modifier function must return an IMutableUploadable "
9936+                         "or None")
9937             return self._upload(new_contents, servermap)
9938         d.addCallback(_apply)
9939         return d
9940hunk ./src/allmydata/mutable/filenode.py 472
9941         return self._do_serialized(self._upload, new_contents, servermap)
9942     def _upload(self, new_contents, servermap):
9943         assert self._pubkey, "update_servermap must be called before publish"
9944+        assert IMutableUploadable.providedBy(new_contents)
9945+
9946         p = Publish(self, self._storage_broker, servermap)
9947         if self._history:
9948hunk ./src/allmydata/mutable/filenode.py 476
9949-            self._history.notify_publish(p.get_status(), len(new_contents))
9950+            self._history.notify_publish(p.get_status(), new_contents.get_size())
9951         d = p.publish(new_contents)
9952hunk ./src/allmydata/mutable/filenode.py 478
9953-        d.addCallback(self._did_upload, len(new_contents))
9954+        d.addCallback(self._did_upload, new_contents.get_size())
9955         return d
9956     def _did_upload(self, res, size):
9957         self._most_recent_size = size
9958hunk ./src/allmydata/mutable/publish.py 141
9959 
9960         # 0. Setup encoding parameters, encoder, and other such things.
9961         # 1. Encrypt, encode, and publish segments.
9962-        self.data = StringIO(newdata)
9963-        self.datalength = len(newdata)
9964+        assert IMutableUploadable.providedBy(newdata)
9965+
9966+        self.data = newdata
9967+        self.datalength = newdata.get_size()
9968 
9969         self.log("starting publish, datalen is %s" % self.datalength)
9970         self._status.set_size(self.datalength)
9971hunk ./src/allmydata/mutable/publish.py 442
9972 
9973         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
9974         data = self.data.read(segsize)
9975+        # XXX: This is dumb. Why return a list?
9976+        data = "".join(data)
9977 
9978         assert len(data) == segsize
9979 
9980hunk ./src/allmydata/mutable/repairer.py 5
9981 from zope.interface import implements
9982 from twisted.internet import defer
9983 from allmydata.interfaces import IRepairResults, ICheckResults
9984+from allmydata.mutable.publish import MutableDataHandle
9985 
9986 class RepairResults:
9987     implements(IRepairResults)
9988hunk ./src/allmydata/mutable/repairer.py 108
9989             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
9990 
9991         d = self.node.download_version(smap, best_version, fetch_privkey=True)
9992+        d.addCallback(lambda data:
9993+            MutableDataHandle(data))
9994         d.addCallback(self.node.upload, smap)
9995         d.addCallback(self.get_results, smap)
9996         return d
9997hunk ./src/allmydata/nodemaker.py 9
9998 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
9999 from allmydata.immutable.upload import Data
10000 from allmydata.mutable.filenode import MutableFileNode
10001+from allmydata.mutable.publish import MutableDataHandle
10002 from allmydata.dirnode import DirectoryNode, pack_children
10003 from allmydata.unknown import UnknownNode
10004 from allmydata import uri
10005hunk ./src/allmydata/nodemaker.py 111
10006                          "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
10007             node.raise_error()
10008         d = self.create_mutable_file(lambda n:
10009-                                     pack_children(n, initial_children),
10010+                                     MutableDataHandle(
10011+                                        pack_children(n, initial_children)),
10012                                      version)
10013         d.addCallback(self._create_dirnode)
10014         return d
10015hunk ./src/allmydata/web/filenode.py 12
10016 from allmydata.interfaces import ExistingChildError
10017 from allmydata.monitor import Monitor
10018 from allmydata.immutable.upload import FileHandle
10019+from allmydata.mutable.publish import MutableFileHandle
10020 from allmydata.util import log, base32
10021 
10022 from allmydata.web.common import text_plain, WebError, RenderMixin, \
10023hunk ./src/allmydata/web/filenode.py 27
10024         # a new file is being uploaded in our place.
10025         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10026         if mutable:
10027-            req.content.seek(0)
10028-            data = req.content.read()
10029+            data = MutableFileHandle(req.content)
10030             d = client.create_mutable_file(data)
10031             def _uploaded(newnode):
10032                 d2 = self.parentnode.set_node(self.name, newnode,
10033hunk ./src/allmydata/web/filenode.py 61
10034         d.addCallback(lambda res: childnode.get_uri())
10035         return d
10036 
10037-    def _read_data_from_formpost(self, req):
10038-        # SDMF: files are small, and we can only upload data, so we read
10039-        # the whole file into memory before uploading.
10040-        contents = req.fields["file"]
10041-        contents.file.seek(0)
10042-        data = contents.file.read()
10043-        return data
10044 
10045     def replace_me_with_a_formpost(self, req, client, replace):
10046         # create a new file, maybe mutable, maybe immutable
10047hunk ./src/allmydata/web/filenode.py 66
10048         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10049 
10050+        # create an immutable file
10051+        contents = req.fields["file"]
10052         if mutable:
10053hunk ./src/allmydata/web/filenode.py 69
10054-            data = self._read_data_from_formpost(req)
10055-            d = client.create_mutable_file(data)
10056+            uploadable = MutableFileHandle(contents.file)
10057+            d = client.create_mutable_file(uploadable)
10058             def _uploaded(newnode):
10059                 d2 = self.parentnode.set_node(self.name, newnode,
10060                                               overwrite=replace)
10061hunk ./src/allmydata/web/filenode.py 78
10062                 return d2
10063             d.addCallback(_uploaded)
10064             return d
10065-        # create an immutable file
10066-        contents = req.fields["file"]
10067+
10068         uploadable = FileHandle(contents.file, convergence=client.convergence)
10069         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
10070         d.addCallback(lambda newnode: newnode.get_uri())
10071hunk ./src/allmydata/web/filenode.py 84
10072         return d
10073 
10074+
10075 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
10076     def __init__(self, client, parentnode, name):
10077         rend.Page.__init__(self)
10078hunk ./src/allmydata/web/filenode.py 278
10079 
10080     def replace_my_contents(self, req):
10081         req.content.seek(0)
10082-        new_contents = req.content.read()
10083+        new_contents = MutableFileHandle(req.content)
10084         d = self.node.overwrite(new_contents)
10085         d.addCallback(lambda res: self.node.get_uri())
10086         return d
10087hunk ./src/allmydata/web/filenode.py 286
10088     def replace_my_contents_with_a_formpost(self, req):
10089         # we have a mutable file. Get the data from the formpost, and replace
10090         # the mutable file's contents with it.
10091-        new_contents = self._read_data_from_formpost(req)
10092+        new_contents = req.fields['file']
10093+        new_contents = MutableFileHandle(new_contents.file)
10094+
10095         d = self.node.overwrite(new_contents)
10096         d.addCallback(lambda res: self.node.get_uri())
10097         return d
10098hunk ./src/allmydata/web/unlinked.py 7
10099 from twisted.internet import defer
10100 from nevow import rend, url, tags as T
10101 from allmydata.immutable.upload import FileHandle
10102+from allmydata.mutable.publish import MutableFileHandle
10103 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
10104      convert_children_json, WebError
10105 from allmydata.web import status
10106hunk ./src/allmydata/web/unlinked.py 23
10107 def PUTUnlinkedSSK(req, client):
10108     # SDMF: files are small, and we can only upload data
10109     req.content.seek(0)
10110-    data = req.content.read()
10111+    data = MutableFileHandle(req.content)
10112     d = client.create_mutable_file(data)
10113     d.addCallback(lambda n: n.get_uri())
10114     return d
10115hunk ./src/allmydata/web/unlinked.py 87
10116     # "POST /uri", to create an unlinked file.
10117     # SDMF: files are small, and we can only upload data
10118     contents = req.fields["file"]
10119-    contents.file.seek(0)
10120-    data = contents.file.read()
10121+    data = MutableFileHandle(contents.file)
10122     d = client.create_mutable_file(data)
10123     d.addCallback(lambda n: n.get_uri())
10124     return d
10125}
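
Taken together, the hunks above change the write-side contract of mutable files: Publish no longer wraps a string in a StringIO, it expects an object providing IMutableUploadable and asks it for get_size() and read(length) (and, per the XXX comment, read() currently returns a list of strings that the publisher joins). A rough sketch of the shape such an uploadable has, assuming only what the hunks show; the real implementations are MutableDataHandle and MutableFileHandle in allmydata/mutable/publish.py, and this stand-in class is purely illustrative:

    from zope.interface import implements
    from allmydata.interfaces import IMutableUploadable

    class StringUploadable:
        # hypothetical stand-in for MutableDataHandle, to show the interface
        implements(IMutableUploadable)

        def __init__(self, data):
            self._data = data
            self._pos = 0

        def get_size(self):
            # Publish uses this for datalength and for status reporting
            return len(self._data)

        def read(self, length):
            # Publish does data = "".join(self.data.read(segsize)),
            # so read() returns a list of strings
            chunk = self._data[self._pos:self._pos + length]
            self._pos += length
            return [chunk]
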
10126[frontends/sftpd.py: alter a mutable file overwrite to work with the new API
10127Kevan Carstensen <kevan@isnotajoke.com>**20100708193454
10128 Ignore-this: 2aa1867ad2e6ebb61e2d8c6cb95c9c3b
10129] {
10130hunk ./src/allmydata/frontends/sftpd.py 33
10131 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
10132      NoSuchChildError, ChildOfWrongTypeError
10133 from allmydata.mutable.common import NotWriteableError
10134+from allmydata.mutable.publish import MutableFileHandle
10135 from allmydata.immutable.upload import FileHandle
10136 from allmydata.dirnode import update_metadata
10137 
10138hunk ./src/allmydata/frontends/sftpd.py 867
10139                     assert parent and childname, (parent, childname, self.metadata)
10140                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
10141 
10142-                d2.addCallback(lambda ign: self.consumer.get_current_size())
10143-                d2.addCallback(lambda size: self.consumer.read(0, size))
10144-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
10145+                u = MutableFileHandle(self.consumer.get_file())
10146+                d2.addCallback(lambda ign: self.filenode.overwrite(u))
10147             else:
10148                 def _add_file(ign):
10149                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
10150}
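
The SFTP frontend previously slurped the consumer's whole contents into a string (get_current_size(), then read(0, size)) before calling overwrite(); with the new API it simply wraps the consumer's backing file object in a MutableFileHandle. The same caller-side pattern applies to any seekable file object; a small hypothetical example (the path and the filenode variable are placeholders, not taken from the patch):

    from allmydata.mutable.publish import MutableFileHandle

    f = open("/tmp/new-contents", "rb")            # hypothetical source file
    d = filenode.overwrite(MutableFileHandle(f))   # filenode: an existing writable mutable file node
    d.addCallback(lambda ign: filenode.download_best_version())
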
10151[test/test_sftp.py: alter a setup routine to work with new mutable file APIs.
10152Kevan Carstensen <kevan@isnotajoke.com>**20100708193522
10153 Ignore-this: 434bbe1347072076c0836d26fca8ac8a
10154] {
10155hunk ./src/allmydata/test/test_sftp.py 32
10156 
10157 from allmydata.util.consumer import download_to_data
10158 from allmydata.immutable import upload
10159+from allmydata.mutable import publish
10160 from allmydata.test.no_network import GridTestMixin
10161 from allmydata.test.common import ShouldFailMixin
10162 from allmydata.test.common_util import ReallyEqualMixin
10163hunk ./src/allmydata/test/test_sftp.py 84
10164         return d
10165 
10166     def _set_up_tree(self):
10167-        d = self.client.create_mutable_file("mutable file contents")
10168+        u = publish.MutableDataHandle("mutable file contents")
10169+        d = self.client.create_mutable_file(u)
10170         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
10171         def _created_mutable(n):
10172             self.mutable = n
10173}
10174[mutable/publish.py: make MutableFileHandle seek to the beginning of its file handle before reading.
10175Kevan Carstensen <kevan@isnotajoke.com>**20100708193600
10176 Ignore-this: 453a737dc62a79c77b3d360fed9000ab
10177] hunk ./src/allmydata/mutable/publish.py 989
10178         assert hasattr(filehandle, "close")
10179 
10180         self._filehandle = filehandle
10181+        # We must start reading at the beginning of the file, or we risk
10182+        # encountering errors when the data read does not match the size
10183+        # reported to the uploader.
10184+        self._filehandle.seek(0)
10185 
10186 
10187     def get_size(self):
10188
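The seek(0) added here matters because the file handle given to MutableFileHandle may have just been written (for example the SFTP consumer's temp file), leaving its position at end-of-file: the size reported to the uploader would then disagree with what read() actually returns, tripping checks such as the publisher's assert len(data) == segsize. A tiny illustration of the mismatch, in plain Python 2 with hypothetical data:

    from StringIO import StringIO

    f = StringIO()
    f.write("mutable file contents")   # position now sits at EOF
    size = len(f.getvalue())           # 21 bytes would be reported as the size
    assert f.read(size) == ""          # but reading from EOF yields nothing
    f.seek(0)                          # what MutableFileHandle now does on construction
    assert f.read(size) == "mutable file contents"  # size and data agree again
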
10189Context:
10190
10191[SFTP: don't call .stopProducing on the producer registered with OverwriteableFileConsumer (which breaks with warner's new downloader).
10192david-sarah@jacaranda.org**20100628231926
10193 Ignore-this: 131b7a5787bc85a9a356b5740d9d996f
10194] 
10195[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
10196david-sarah@jacaranda.org**20100625223929
10197 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
10198] 
10199[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
10200zooko@zooko.com**20100619034928
10201 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
10202] 
10203[trivial: tiny update to in-line comment
10204zooko@zooko.com**20100614045715
10205 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
10206 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
10207] 
10208[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
10209zooko@zooko.com**20100619065318
10210 Ignore-this: dc6db03f696e5b6d2848699e754d8053
10211] 
10212[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
10213zooko@zooko.com**20100619065124
10214 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
10215] 
10216[TAG allmydata-tahoe-1.7.0
10217zooko@zooko.com**20100619052631
10218 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
10219] 
10220Patch bundle hash:
10221ca9dc3740872ec6eb5506b722b637ba7c442211d