Ticket #393: 393status13.dpatch

File 393status13.dpatch, 350.3 KB (added by kevan_ at 2010-07-01T23:47:33Z)
1Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * Misc. changes to support the work I'm doing
3 
4      - Add a notion of file version number to interfaces.py
5      - Alter mutable file node interfaces to have a notion of version,
6        though this may be changed later.
7      - Alter mutable/filenode.py to conform to these changes.
8      - Add a salt hasher to util/hashutil.py
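  As a rough sketch of how these pieces are meant to fit together (drawing only
  on the hunks below; the nodemaker object is assumed to be an existing
  NodeMaker instance):

      from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION

      # create_mutable_file() keeps SDMF as its default; the version
      # keyword added in the nodemaker patch selects the format.
      d = nodemaker.create_mutable_file("new contents", version=MDMF_VERSION)
      def _created(node):
          # get_version()/set_version() are the new accessors on the node.
          assert node.get_version() == MDMF_VERSION
          return node
      d.addCallback(_created)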
9
10Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * nodemaker.py: create MDMF files when asked to
12
13Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * storage/server.py: minor code cleanup
15
16Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
18
19Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
20  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, and change users of corrupt() to use the new definition
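  Note that with this change corrupt() drives an MDMFSlotReadProxy and so
  returns a Deferred rather than corrupting shares synchronously; callers chain
  off of it, as in this sketch taken from the updated tests (self._storage,
  self._fn, and Monitor come from the test fixtures):

      d = corrupt(None, self._storage, "signature")
      d.addCallback(lambda ignored:
          self._fn.check(Monitor(), verify=True))
      d.addCallback(self.check_bad, "detects-bad-signature")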
21
22Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
23  * Alter the ServermapUpdater to find MDMF files
24 
25  The servermapupdater should find MDMF files on a grid in the same way
26  that it finds SDMF files. This patch makes it do that.
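  The accompanying test exercises this roughly as follows (publish_mdmf and
  make_servermap are helpers in test_mutable.py; MODE_CHECK forces a full query
  of all peers):

      d = self.publish_mdmf()
      d.addCallback(lambda ignored:
          self.make_servermap(mode=MODE_CHECK))
      d.addCallback(lambda servermap:
          self.failUnlessEqual(len(servermap.recoverable_versions()), 1))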
27
28Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
29  * Make a segmented mutable uploader
30 
31  The mutable file uploader should be able to publish files with one
32  segment and files with multiple segments. This patch makes it do that.
33  This is still incomplete, and rather ugly -- I need to flesh out error
34  handling, I need to write tests, and I need to remove some of the uglier
35  kludges in the process before I can call this done.
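  For orientation, the segmentation bookkeeping the publisher needs is roughly
  the following. This is a hedged sketch, not the patch's
  setup_encoding_parameters() itself; newdata stands for the plaintext being
  published, and mathutil.div_ceil from allmydata.util is assumed to behave as
  it does elsewhere in the tree:

      from allmydata.util import mathutil

      KiB = 1024
      DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB   # matches publish.py below

      datalength = len(newdata)
      segment_size = min(DEFAULT_MAX_SEGMENT_SIZE, datalength)
      if segment_size:
          num_segments = mathutil.div_ceil(datalength, segment_size)
      else:
          num_segments = 0
      # every segment except (possibly) the last is segment_size bytes, so
      # the push loop handles num_segments - 1 full segments and then the tail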
36
37Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * Write a segmented mutable downloader
39 
40  The segmented mutable downloader can deal with MDMF files (files with
41  one or more segments in MDMF format) and SDMF files (files with one
42  segment in SDMF format). It is backwards compatible with the old
43  file format.
44 
45  This patch also contains tests for the segmented mutable downloader.
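  From the caller's point of view nothing changes; the downloader inspects the
  shares and handles either format, so the usual pattern still applies (node is
  assumed to be a MutableFileNode for either an SDMF or an MDMF cap):

      d = node.download_best_version()
      def _got_contents(contents):
          # same call whether the shares on the grid are SDMF or MDMF
          return contents
      d.addCallback(_got_contents)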
46
47Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
48  * mutable/checker.py: check MDMF files
49 
50  This patch adapts the mutable file checker and verifier to check and
51  verify MDMF files. It does this by using the new segmented downloader,
52  which is trained to perform verification operations on request. This
53  removes some code duplication.
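  Verification is still requested through the ordinary checker interface; a
  minimal sketch, assuming node is a mutable filenode and Monitor comes from
  allmydata.monitor:

      from allmydata.monitor import Monitor

      d = node.check(Monitor(), verify=True)
      def _done(check_results):
          # with verify=True, the segmented downloader has re-fetched and
          # validated every block, so block-level corruption shows up here
          return check_results.is_healthy()
      d.addCallback(_done)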
54
55Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
56  * mutable/retrieve.py: learn how to verify mutable files
57
58Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * interfaces.py: add IMutableSlotWriter
60
61Thu Jul  1 16:26:56 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
62  * mutable/publish.py: cleanup + simplification
63
64Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
65  * test/test_mutable.py: temporarily disable two tests that are now irrelevant
66
67Thu Jul  1 16:28:34 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
68  * Add MDMF reader and writer, and SDMF writer
69 
70  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
71  object proxies that exist for immutable files. They abstract away
72  details of connection, state, and caching from their callers (in this
73  case, the downloader, servermap updater, and uploader), and expose methods
74  to get and set information on the remote server.
75 
76  MDMFSlotReadProxy reads a mutable file from the server, doing the right
77  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
78  allows callers to tell it how to batch and flush reads.
79 
80  MDMFSlotWriteProxy writes an MDMF mutable file to a server.
81 
82  SDMFSlotWriteProxy writes an SDMF mutable file to a server.
83 
84  This patch also includes tests for MDMFSlotReadProxy,
85  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
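  As a rough illustration of the reader's calling convention (the argument
  order and method names follow the servermap patch further down; ss,
  storage_index, shnum, and data are assumed to come from an earlier
  slot_readv response):

      from twisted.internet import defer
      from allmydata.mutable.layout import MDMFSlotReadProxy

      # Passing the already-fetched share data lets the proxy satisfy
      # small reads without another round trip to the server.
      reader = MDMFSlotReadProxy(ss, storage_index, shnum, data)
      d1 = reader.get_verinfo()          # seqnum, root hash, offsets, ...
      d2 = reader.get_signature()
      d3 = reader.get_verification_key()
      dl = defer.DeferredList([d1, d2, d3])
      def _got(results):
          # each entry is a (success, value) pair from the DeferredList
          return [value for (success, value) in results if success]
      dl.addCallback(_got)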
86
87New patches:
88
89[Misc. changes to support the work I'm doing
90Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
91 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
92 
93     - Add a notion of file version number to interfaces.py
94     - Alter mutable file node interfaces to have a notion of version,
95       though this may be changed later.
96     - Alter mutable/filenode.py to conform to these changes.
97     - Add a salt hasher to util/hashutil.py
98] {
99hunk ./src/allmydata/interfaces.py 7
100      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
101 
102 HASH_SIZE=32
103+SALT_SIZE=16
104+
105+SDMF_VERSION=0
106+MDMF_VERSION=1
107 
108 Hash = StringConstraint(maxLength=HASH_SIZE,
109                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
110hunk ./src/allmydata/interfaces.py 811
111         writer-visible data using this writekey.
112         """
113 
114+    def set_version(version):
115+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
116+        we upload in SDMF for reasons of compatibility. If you want to
117+        change this, set_version will let you do that.
118+
119+        To say that this file should be uploaded in SDMF, pass in a 0. To
120+        say that the file should be uploaded as MDMF, pass in a 1.
121+        """
122+
123+    def get_version():
124+        """Returns the mutable file protocol version."""
125+
126 class NotEnoughSharesError(Exception):
127     """Download was unable to get enough shares"""
128 
129hunk ./src/allmydata/mutable/filenode.py 8
130 from twisted.internet import defer, reactor
131 from foolscap.api import eventually
132 from allmydata.interfaces import IMutableFileNode, \
133-     ICheckable, ICheckResults, NotEnoughSharesError
134+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
135 from allmydata.util import hashutil, log
136 from allmydata.util.assertutil import precondition
137 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
138hunk ./src/allmydata/mutable/filenode.py 67
139         self._sharemap = {} # known shares, shnum-to-[nodeids]
140         self._cache = ResponseCache()
141         self._most_recent_size = None
142+        # filled in after __init__ if we're being created for the first time;
143+        # filled in by the servermap updater before publishing, otherwise.
144+        # set to this default value in case neither of those things happen,
145+        # or in case the servermap can't find any shares to tell us what
146+        # to publish as.
147+        # TODO: Set this back to None, and find out why the tests fail
148+        #       with it set to None.
149+        self._protocol_version = SDMF_VERSION
150 
151         # all users of this MutableFileNode go through the serializer. This
152         # takes advantage of the fact that Deferreds discard the callbacks
153hunk ./src/allmydata/mutable/filenode.py 472
154     def _did_upload(self, res, size):
155         self._most_recent_size = size
156         return res
157+
158+
159+    def set_version(self, version):
160+        # I can be set in two ways:
161+        #  1. When the node is created.
162+        #  2. (for an existing share) when the Servermap is updated
163+        #     before I am read.
164+        assert version in (MDMF_VERSION, SDMF_VERSION)
165+        self._protocol_version = version
166+
167+
168+    def get_version(self):
169+        return self._protocol_version
170hunk ./src/allmydata/util/hashutil.py 90
171 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
172 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
173 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
174+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
175 
176 # dirnodes
177 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
178hunk ./src/allmydata/util/hashutil.py 134
179 def plaintext_segment_hasher():
180     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
181 
182+def mutable_salt_hash(data):
183+    return tagged_hash(MUTABLE_SALT_TAG, data)
184+def mutable_salt_hasher():
185+    return tagged_hasher(MUTABLE_SALT_TAG)
186+
187 KEYLEN = 16
188 IVLEN = 16
189 
190}
191[nodemaker.py: create MDMF files when asked to
192Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
193 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
194] {
195hunk ./src/allmydata/nodemaker.py 3
196 import weakref
197 from zope.interface import implements
198-from allmydata.interfaces import INodeMaker
199+from allmydata.util.assertutil import precondition
200+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
201+                                 SDMF_VERSION, MDMF_VERSION
202 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
203 from allmydata.immutable.upload import Data
204 from allmydata.mutable.filenode import MutableFileNode
205hunk ./src/allmydata/nodemaker.py 92
206             return self._create_dirnode(filenode)
207         return None
208 
209-    def create_mutable_file(self, contents=None, keysize=None):
210+    def create_mutable_file(self, contents=None, keysize=None,
211+                            version=SDMF_VERSION):
212         n = MutableFileNode(self.storage_broker, self.secret_holder,
213                             self.default_encoding_parameters, self.history)
214hunk ./src/allmydata/nodemaker.py 96
215+        n.set_version(version)
216         d = self.key_generator.generate(keysize)
217         d.addCallback(n.create_with_keys, contents)
218         d.addCallback(lambda res: n)
219hunk ./src/allmydata/nodemaker.py 102
220         return d
221 
222-    def create_new_mutable_directory(self, initial_children={}):
223+    def create_new_mutable_directory(self, initial_children={},
224+                                     version=SDMF_VERSION):
225+        # initial_children must have metadata (i.e. {} instead of None)
226+        for (name, (node, metadata)) in initial_children.iteritems():
227+            precondition(isinstance(metadata, dict),
228+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
229+            node.raise_error()
230         d = self.create_mutable_file(lambda n:
231hunk ./src/allmydata/nodemaker.py 110
232-                                     pack_children(n, initial_children))
233+                                     pack_children(n, initial_children),
234+                                     version)
235         d.addCallback(self._create_dirnode)
236         return d
237 
238}
239[storage/server.py: minor code cleanup
240Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
241 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
242] {
243hunk ./src/allmydata/storage/server.py 569
244                                          self)
245         return share
246 
247-    def remote_slot_readv(self, storage_index, shares, readv):
248+    def remote_slot_readv(self, storage_index, shares, readvs):
249         start = time.time()
250         self.count("readv")
251         si_s = si_b2a(storage_index)
252hunk ./src/allmydata/storage/server.py 590
253             if sharenum in shares or not shares:
254                 filename = os.path.join(bucketdir, sharenum_s)
255                 msf = MutableShareFile(filename, self)
256-                datavs[sharenum] = msf.readv(readv)
257+                datavs[sharenum] = msf.readv(readvs)
258         log.msg("returning shares %s" % (datavs.keys(),),
259                 facility="tahoe.storage", level=log.NOISY, parent=lp)
260         self.add_latency("readv", time.time() - start)
261}
262[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
263Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
264 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
265] {
266hunk ./src/allmydata/test/test_mutable.py 151
267             chr(ord(original[byte_offset]) ^ 0x01) +
268             original[byte_offset+1:])
269 
270+def add_two(original, byte_offset):
271+    # It isn't enough to simply flip the bit for the version number,
272+    # because 1 is a valid version number. So we add two instead.
273+    return (original[:byte_offset] +
274+            chr(ord(original[byte_offset]) ^ 0x02) +
275+            original[byte_offset+1:])
276+
277 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
278     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
279     # list of shnums to corrupt.
280hunk ./src/allmydata/test/test_mutable.py 187
281                 real_offset = offset1
282             real_offset = int(real_offset) + offset2 + offset_offset
283             assert isinstance(real_offset, int), offset
284-            shares[shnum] = flip_bit(data, real_offset)
285+            if offset1 == 0: # verbyte
286+                f = add_two
287+            else:
288+                f = flip_bit
289+            shares[shnum] = f(data, real_offset)
290     return res
291 
292 def make_storagebroker(s=None, num_peers=10):
293hunk ./src/allmydata/test/test_mutable.py 423
294         d.addCallback(_created)
295         return d
296 
297+
298     def test_modify_backoffer(self):
299         def _modifier(old_contents, servermap, first_time):
300             return old_contents + "line2"
301hunk ./src/allmydata/test/test_mutable.py 658
302         d.addCallback(_created)
303         return d
304 
305+
306     def _copy_shares(self, ignored, index):
307         shares = self._storage._peers
308         # we need a deep copy
309}
310[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, and change users of corrupt() to use the new definition
311Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
312 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
313] {
314hunk ./src/allmydata/test/test_mutable.py 168
315                 and shnum not in shnums_to_corrupt):
316                 continue
317             data = shares[shnum]
318-            (version,
319-             seqnum,
320-             root_hash,
321-             IV,
322-             k, N, segsize, datalen,
323-             o) = unpack_header(data)
324-            if isinstance(offset, tuple):
325-                offset1, offset2 = offset
326-            else:
327-                offset1 = offset
328-                offset2 = 0
329-            if offset1 == "pubkey":
330-                real_offset = 107
331-            elif offset1 in o:
332-                real_offset = o[offset1]
333-            else:
334-                real_offset = offset1
335-            real_offset = int(real_offset) + offset2 + offset_offset
336-            assert isinstance(real_offset, int), offset
337-            if offset1 == 0: # verbyte
338-                f = add_two
339-            else:
340-                f = flip_bit
341-            shares[shnum] = f(data, real_offset)
342-    return res
343+            # We're feeding the reader all of the share data, so it
344+            # won't need to use the rref that we didn't provide, nor the
345+            # storage index that we didn't provide. We do this because
346+            # the reader will work for both MDMF and SDMF.
347+            reader = MDMFSlotReadProxy(None, None, shnum, data)
348+            # We need to get the offsets for the next part.
349+            d = reader.get_verinfo()
350+            def _do_corruption(verinfo, data, shnum):
351+                (seqnum,
352+                 root_hash,
353+                 IV,
354+                 segsize,
355+                 datalen,
356+                 k, n, prefix, o) = verinfo
357+                if isinstance(offset, tuple):
358+                    offset1, offset2 = offset
359+                else:
360+                    offset1 = offset
361+                    offset2 = 0
362+                if offset1 == "pubkey":
363+                    real_offset = 107
364+                elif offset1 in o:
365+                    real_offset = o[offset1]
366+                else:
367+                    real_offset = offset1
368+                real_offset = int(real_offset) + offset2 + offset_offset
369+                assert isinstance(real_offset, int), offset
370+                if offset1 == 0: # verbyte
371+                    f = add_two
372+                else:
373+                    f = flip_bit
374+                shares[shnum] = f(data, real_offset)
375+            d.addCallback(_do_corruption, data, shnum)
376+            ds.append(d)
377+    dl = defer.DeferredList(ds)
378+    dl.addCallback(lambda ignored: res)
379+    return dl
380 
381 def make_storagebroker(s=None, num_peers=10):
382     if not s:
383hunk ./src/allmydata/test/test_mutable.py 1177
384         return d
385 
386     def test_download_fails(self):
387-        corrupt(None, self._storage, "signature")
388-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
389+        d = corrupt(None, self._storage, "signature")
390+        d.addCallback(lambda ignored:
391+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
392                             "no recoverable versions",
393                             self._fn.download_best_version)
394         return d
395hunk ./src/allmydata/test/test_mutable.py 1232
396         return d
397 
398     def test_check_all_bad_sig(self):
399-        corrupt(None, self._storage, 1) # bad sig
400-        d = self._fn.check(Monitor())
401+        d = corrupt(None, self._storage, 1) # bad sig
402+        d.addCallback(lambda ignored:
403+            self._fn.check(Monitor()))
404         d.addCallback(self.check_bad, "test_check_all_bad_sig")
405         return d
406 
407hunk ./src/allmydata/test/test_mutable.py 1239
408     def test_check_all_bad_blocks(self):
409-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
410+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
411         # the Checker won't notice this.. it doesn't look at actual data
412hunk ./src/allmydata/test/test_mutable.py 1241
413-        d = self._fn.check(Monitor())
414+        d.addCallback(lambda ignored:
415+            self._fn.check(Monitor()))
416         d.addCallback(self.check_good, "test_check_all_bad_blocks")
417         return d
418 
419hunk ./src/allmydata/test/test_mutable.py 1252
420         return d
421 
422     def test_verify_all_bad_sig(self):
423-        corrupt(None, self._storage, 1) # bad sig
424-        d = self._fn.check(Monitor(), verify=True)
425+        d = corrupt(None, self._storage, 1) # bad sig
426+        d.addCallback(lambda ignored:
427+            self._fn.check(Monitor(), verify=True))
428         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
429         return d
430 
431hunk ./src/allmydata/test/test_mutable.py 1259
432     def test_verify_one_bad_sig(self):
433-        corrupt(None, self._storage, 1, [9]) # bad sig
434-        d = self._fn.check(Monitor(), verify=True)
435+        d = corrupt(None, self._storage, 1, [9]) # bad sig
436+        d.addCallback(lambda ignored:
437+            self._fn.check(Monitor(), verify=True))
438         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
439         return d
440 
441hunk ./src/allmydata/test/test_mutable.py 1266
442     def test_verify_one_bad_block(self):
443-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
444+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
445         # the Verifier *will* notice this, since it examines every byte
446hunk ./src/allmydata/test/test_mutable.py 1268
447-        d = self._fn.check(Monitor(), verify=True)
448+        d.addCallback(lambda ignored:
449+            self._fn.check(Monitor(), verify=True))
450         d.addCallback(self.check_bad, "test_verify_one_bad_block")
451         d.addCallback(self.check_expected_failure,
452                       CorruptShareError, "block hash tree failure",
453hunk ./src/allmydata/test/test_mutable.py 1277
454         return d
455 
456     def test_verify_one_bad_sharehash(self):
457-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
458-        d = self._fn.check(Monitor(), verify=True)
459+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
460+        d.addCallback(lambda ignored:
461+            self._fn.check(Monitor(), verify=True))
462         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
463         d.addCallback(self.check_expected_failure,
464                       CorruptShareError, "corrupt hashes",
465hunk ./src/allmydata/test/test_mutable.py 1287
466         return d
467 
468     def test_verify_one_bad_encprivkey(self):
469-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
470-        d = self._fn.check(Monitor(), verify=True)
471+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
472+        d.addCallback(lambda ignored:
473+            self._fn.check(Monitor(), verify=True))
474         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
475         d.addCallback(self.check_expected_failure,
476                       CorruptShareError, "invalid privkey",
477hunk ./src/allmydata/test/test_mutable.py 1297
478         return d
479 
480     def test_verify_one_bad_encprivkey_uncheckable(self):
481-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
482+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
483         readonly_fn = self._fn.get_readonly()
484         # a read-only node has no way to validate the privkey
485hunk ./src/allmydata/test/test_mutable.py 1300
486-        d = readonly_fn.check(Monitor(), verify=True)
487+        d.addCallback(lambda ignored:
488+            readonly_fn.check(Monitor(), verify=True))
489         d.addCallback(self.check_good,
490                       "test_verify_one_bad_encprivkey_uncheckable")
491         return d
492}
493[Alter the ServermapUpdater to find MDMF files
494Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
495 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
496 
497 The servermapupdater should find MDMF files on a grid in the same way
498 that it finds SDMF files. This patch makes it do that.
499] {
500hunk ./src/allmydata/mutable/servermap.py 7
501 from itertools import count
502 from twisted.internet import defer
503 from twisted.python import failure
504-from foolscap.api import DeadReferenceError, RemoteException, eventually
505+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
506+                         fireEventually
507 from allmydata.util import base32, hashutil, idlib, log
508 from allmydata.storage.server import si_b2a
509 from allmydata.interfaces import IServermapUpdaterStatus
510hunk ./src/allmydata/mutable/servermap.py 17
511 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
512      DictOfSets, CorruptShareError, NeedMoreDataError
513 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
514-     SIGNED_PREFIX_LENGTH
515+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
516 
517 class UpdateStatus:
518     implements(IServermapUpdaterStatus)
519hunk ./src/allmydata/mutable/servermap.py 254
520         """Return a set of versionids, one for each version that is currently
521         recoverable."""
522         versionmap = self.make_versionmap()
523-
524         recoverable_versions = set()
525         for (verinfo, shares) in versionmap.items():
526             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
527hunk ./src/allmydata/mutable/servermap.py 366
528         self._servers_responded = set()
529 
530         # how much data should we read?
531+        # SDMF:
532         #  * if we only need the checkstring, then [0:75]
533         #  * if we need to validate the checkstring sig, then [543ish:799ish]
534         #  * if we need the verification key, then [107:436ish]
535hunk ./src/allmydata/mutable/servermap.py 374
536         #  * if we need the encrypted private key, we want [-1216ish:]
537         #   * but we can't read from negative offsets
538         #   * the offset table tells us the 'ish', also the positive offset
539-        # A future version of the SMDF slot format should consider using
540-        # fixed-size slots so we can retrieve less data. For now, we'll just
541-        # read 2000 bytes, which also happens to read enough actual data to
542-        # pre-fetch a 9-entry dirnode.
543+        # MDMF:
544+        #  * Checkstring? [0:72]
545+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
546+        #    the offset table will tell us for sure.
547+        #  * If we need the verification key, we have to consult the offset
548+        #    table as well.
549+        # At this point, we don't know which we are. Our filenode can
550+        # tell us, but it might be lying -- in some cases, we're
551+        # responsible for telling it which kind of file it is.
552         self._read_size = 4000
553         if mode == MODE_CHECK:
554             # we use unpack_prefix_and_signature, so we need 1k
555hunk ./src/allmydata/mutable/servermap.py 432
556         self._queries_completed = 0
557 
558         sb = self._storage_broker
559+        # All of the peers, permuted by the storage index, as usual.
560         full_peerlist = sb.get_servers_for_index(self._storage_index)
561         self.full_peerlist = full_peerlist # for use later, immutable
562         self.extra_peers = full_peerlist[:] # peers are removed as we use them
563hunk ./src/allmydata/mutable/servermap.py 439
564         self._good_peers = set() # peers who had some shares
565         self._empty_peers = set() # peers who don't have any shares
566         self._bad_peers = set() # peers to whom our queries failed
567+        self._readers = {} # peerid -> dict(sharewriters), filled in
568+                           # after responses come in.
569 
570         k = self._node.get_required_shares()
571hunk ./src/allmydata/mutable/servermap.py 443
572+        # For what cases can these conditions work?
573         if k is None:
574             # make a guess
575             k = 3
576hunk ./src/allmydata/mutable/servermap.py 456
577         self.num_peers_to_query = k + self.EPSILON
578 
579         if self.mode == MODE_CHECK:
580+            # We want to query all of the peers.
581             initial_peers_to_query = dict(full_peerlist)
582             must_query = set(initial_peers_to_query.keys())
583             self.extra_peers = []
584hunk ./src/allmydata/mutable/servermap.py 464
585             # we're planning to replace all the shares, so we want a good
586             # chance of finding them all. We will keep searching until we've
587             # seen epsilon that don't have a share.
588+            # We don't query all of the peers because that could take a while.
589             self.num_peers_to_query = N + self.EPSILON
590             initial_peers_to_query, must_query = self._build_initial_querylist()
591             self.required_num_empty_peers = self.EPSILON
592hunk ./src/allmydata/mutable/servermap.py 474
593             # might also avoid the round trip required to read the encrypted
594             # private key.
595 
596-        else:
597+        else: # MODE_READ, MODE_ANYTHING
598+            # 2k peers is good enough.
599             initial_peers_to_query, must_query = self._build_initial_querylist()
600 
601         # this is a set of peers that we are required to get responses from:
602hunk ./src/allmydata/mutable/servermap.py 490
603         # before we can consider ourselves finished, and self.extra_peers
604         # contains the overflow (peers that we should tap if we don't get
605         # enough responses)
606+        # I guess that self._must_query is a subset of
607+        # initial_peers_to_query?
608+        assert set(must_query).issubset(set(initial_peers_to_query))
609 
610         self._send_initial_requests(initial_peers_to_query)
611         self._status.timings["initial_queries"] = time.time() - self._started
612hunk ./src/allmydata/mutable/servermap.py 549
613         # errors that aren't handled by _query_failed (and errors caused by
614         # _query_failed) get logged, but we still want to check for doneness.
615         d.addErrback(log.err)
616-        d.addBoth(self._check_for_done)
617         d.addErrback(self._fatal_error)
618hunk ./src/allmydata/mutable/servermap.py 550
619+        d.addCallback(self._check_for_done)
620         return d
621 
622     def _do_read(self, ss, peerid, storage_index, shnums, readv):
623hunk ./src/allmydata/mutable/servermap.py 569
624         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
625         return d
626 
627+
628+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
629+        """
630+        I am called when a remote server returns a corrupt share in
631+        response to one of our queries. By corrupt, I mean a share
632+        without a valid signature. I then record the failure, notify the
633+        server of the corruption, and record the share as bad.
634+        """
635+        f = failure.Failure(e)
636+        self.log(format="bad share: %(f_value)s", f_value=str(f),
637+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
638+        # Notify the server that its share is corrupt.
639+        self.notify_server_corruption(peerid, shnum, str(e))
640+        # By flagging this as a bad peer, we won't count any of
641+        # the other shares on that peer as valid, though if we
642+        # happen to find a valid version string amongst those
643+        # shares, we'll keep track of it so that we don't need
644+        # to validate the signature on those again.
645+        self._bad_peers.add(peerid)
646+        self._last_failure = f
647+        # XXX: Use the reader for this?
648+        checkstring = data[:SIGNED_PREFIX_LENGTH]
649+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
650+        self._servermap.problems.append(f)
651+
652+
653+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
654+        """
655+        If one of my queries returns successfully (which means that we
656+        were able to and successfully did validate the signature), I
657+        cache the data that we initially fetched from the storage
658+        server. This will help reduce the number of roundtrips that need
659+        to occur when the file is downloaded, or when the file is
660+        updated.
661+        """
662+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
663+
664+
665     def _got_results(self, datavs, peerid, readsize, stuff, started):
666         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
667                       peerid=idlib.shortnodeid_b2a(peerid),
668hunk ./src/allmydata/mutable/servermap.py 630
669         else:
670             self._empty_peers.add(peerid)
671 
672-        last_verinfo = None
673-        last_shnum = None
674+        ss, storage_index = stuff
675+        ds = []
676+
677         for shnum,datav in datavs.items():
678             data = datav[0]
679hunk ./src/allmydata/mutable/servermap.py 635
680-            try:
681-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
682-                last_verinfo = verinfo
683-                last_shnum = shnum
684-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
685-            except CorruptShareError, e:
686-                # log it and give the other shares a chance to be processed
687-                f = failure.Failure()
688-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
689-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
690-                self.notify_server_corruption(peerid, shnum, str(e))
691-                self._bad_peers.add(peerid)
692-                self._last_failure = f
693-                checkstring = data[:SIGNED_PREFIX_LENGTH]
694-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
695-                self._servermap.problems.append(f)
696-                pass
697-
698-        self._status.timings["cumulative_verify"] += (time.time() - now)
699+            reader = MDMFSlotReadProxy(ss,
700+                                       storage_index,
701+                                       shnum,
702+                                       data)
703+            self._readers.setdefault(peerid, dict())[shnum] = reader
704+            # our goal, with each response, is to validate the version
705+            # information and share data as best we can at this point --
706+            # we do this by validating the signature. To do this, we
707+            # need to do the following:
708+            #   - If we don't already have the public key, fetch the
709+            #     public key. We use this to validate the signature.
710+            if not self._node.get_pubkey():
711+                # fetch and set the public key.
712+                d = reader.get_verification_key()
713+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
714+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
715+                # XXX: Make self._pubkey_query_failed?
716+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
717+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
718+            else:
719+                # we already have the public key.
720+                d = defer.succeed(None)
721+            # Neither of these two branches return anything of
722+            # consequence, so the first entry in our deferredlist will
723+            # be None.
724 
725hunk ./src/allmydata/mutable/servermap.py 661
726-        if self._need_privkey and last_verinfo:
727-            # send them a request for the privkey. We send one request per
728-            # server.
729-            lp2 = self.log("sending privkey request",
730-                           parent=lp, level=log.NOISY)
731-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
732-             offsets_tuple) = last_verinfo
733-            o = dict(offsets_tuple)
734+            # - Next, we need the version information. We almost
735+            #   certainly got this by reading the first thousand or so
736+            #   bytes of the share on the storage server, so we
737+            #   shouldn't need to fetch anything at this step.
738+            d2 = reader.get_verinfo()
739+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
740+                self._got_corrupt_share(error, shnum, peerid, data, lp))
741+            # - Next, we need the signature. For an SDMF share, it is
742+            #   likely that we fetched this when doing our initial fetch
743+            #   to get the version information. In MDMF, this lives at
744+            #   the end of the share, so unless the file is quite small,
745+            #   we'll need to do a remote fetch to get it.
746+            d3 = reader.get_signature()
747+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
748+                self._got_corrupt_share(error, shnum, peerid, data, lp))
749+            #  Once we have all three of these responses, we can move on
750+            #  to validating the signature
751 
752hunk ./src/allmydata/mutable/servermap.py 679
753-            self._queries_outstanding.add(peerid)
754-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
755-            ss = self._servermap.connections[peerid]
756-            privkey_started = time.time()
757-            d = self._do_read(ss, peerid, self._storage_index,
758-                              [last_shnum], readv)
759-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
760-                          privkey_started, lp2)
761-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
762-            d.addErrback(log.err)
763-            d.addCallback(self._check_for_done)
764-            d.addErrback(self._fatal_error)
765+            # Does the node already have a privkey? If not, we'll try to
766+            # fetch it here.
767+            if self._need_privkey:
768+                d4 = reader.get_encprivkey()
769+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
770+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
771+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
772+                    self._privkey_query_failed(error, shnum, data, lp))
773+            else:
774+                d4 = defer.succeed(None)
775 
776hunk ./src/allmydata/mutable/servermap.py 690
777+            dl = defer.DeferredList([d, d2, d3, d4])
778+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
779+                self._got_signature_one_share(results, shnum, peerid, lp))
780+            dl.addErrback(lambda error, shnum=shnum, data=data:
781+               self._got_corrupt_share(error, shnum, peerid, data, lp))
782+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
783+                self._cache_good_sharedata(verinfo, shnum, now, data))
784+            ds.append(dl)
785+        # dl is a deferred list that will fire when all of the shares
786+        # that we found on this peer are done processing. When dl fires,
787+        # we know that processing is done, so we can decrement the
788+        # semaphore-like thing that we incremented earlier.
789+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
790+        # Are we done? Done means that there are no more queries to
791+        # send, that there are no outstanding queries, and that we
792+        # haven't received any queries that are still processing. If we
793+        # are done, self._check_for_done will cause the done deferred
794+        # that we returned to our caller to fire, which tells them that
795+        # they have a complete servermap, and that we won't be touching
796+        # the servermap anymore.
797+        dl.addCallback(self._check_for_done)
798+        dl.addErrback(self._fatal_error)
799         # all done!
800         self.log("_got_results done", parent=lp, level=log.NOISY)
801hunk ./src/allmydata/mutable/servermap.py 714
802+        return dl
803+
804+
805+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
806+        if self._node.get_pubkey():
807+            return # don't go through this again if we don't have to
808+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
809+        assert len(fingerprint) == 32
810+        if fingerprint != self._node.get_fingerprint():
811+            raise CorruptShareError(peerid, shnum,
812+                                "pubkey doesn't match fingerprint")
813+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
814+        assert self._node.get_pubkey()
815+
816 
817     def notify_server_corruption(self, peerid, shnum, reason):
818         ss = self._servermap.connections[peerid]
819hunk ./src/allmydata/mutable/servermap.py 734
820         ss.callRemoteOnly("advise_corrupt_share",
821                           "mutable", self._storage_index, shnum, reason)
822 
823-    def _got_results_one_share(self, shnum, data, peerid, lp):
824+
825+    def _got_signature_one_share(self, results, shnum, peerid, lp):
826+        # It is our job to give versioninfo to our caller. We need to
827+        # raise CorruptShareError if the share is corrupt for any
828+        # reason, something that our caller will handle.
829         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
830                  shnum=shnum,
831                  peerid=idlib.shortnodeid_b2a(peerid),
832hunk ./src/allmydata/mutable/servermap.py 744
833                  level=log.NOISY,
834                  parent=lp)
835-
836-        # this might raise NeedMoreDataError, if the pubkey and signature
837-        # live at some weird offset. That shouldn't happen, so I'm going to
838-        # treat it as a bad share.
839-        (seqnum, root_hash, IV, k, N, segsize, datalength,
840-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
841-
842-        if not self._node.get_pubkey():
843-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
844-            assert len(fingerprint) == 32
845-            if fingerprint != self._node.get_fingerprint():
846-                raise CorruptShareError(peerid, shnum,
847-                                        "pubkey doesn't match fingerprint")
848-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
849-
850-        if self._need_privkey:
851-            self._try_to_extract_privkey(data, peerid, shnum, lp)
852-
853-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
854-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
855+        _, verinfo, signature, __ = results
856+        (seqnum,
857+         root_hash,
858+         saltish,
859+         segsize,
860+         datalen,
861+         k,
862+         n,
863+         prefix,
864+         offsets) = verinfo[1]
865         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
866 
867hunk ./src/allmydata/mutable/servermap.py 756
868-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
869+        # XXX: This should be done for us in the method, so
870+        # presumably you can go in there and fix it.
871+        verinfo = (seqnum,
872+                   root_hash,
873+                   saltish,
874+                   segsize,
875+                   datalen,
876+                   k,
877+                   n,
878+                   prefix,
879                    offsets_tuple)
880hunk ./src/allmydata/mutable/servermap.py 767
881+        # This tuple uniquely identifies a share on the grid; we use it
882+        # to keep track of the ones that we've already seen.
883 
884         if verinfo not in self._valid_versions:
885hunk ./src/allmydata/mutable/servermap.py 771
886-            # it's a new pair. Verify the signature.
887-            valid = self._node.get_pubkey().verify(prefix, signature)
888+            # This is a new version tuple, and we need to validate it
889+            # against the public key before keeping track of it.
890+            assert self._node.get_pubkey()
891+            valid = self._node.get_pubkey().verify(prefix, signature[1])
892             if not valid:
893hunk ./src/allmydata/mutable/servermap.py 776
894-                raise CorruptShareError(peerid, shnum, "signature is invalid")
895+                raise CorruptShareError(peerid, shnum,
896+                                        "signature is invalid")
897 
898hunk ./src/allmydata/mutable/servermap.py 779
899-            # ok, it's a valid verinfo. Add it to the list of validated
900-            # versions.
901-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
902-                     % (seqnum, base32.b2a(root_hash)[:4],
903-                        idlib.shortnodeid_b2a(peerid), shnum,
904-                        k, N, segsize, datalength),
905-                     parent=lp)
906-            self._valid_versions.add(verinfo)
907-        # We now know that this is a valid candidate verinfo.
908+        # ok, it's a valid verinfo. Add it to the list of validated
909+        # versions.
910+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
911+                 % (seqnum, base32.b2a(root_hash)[:4],
912+                    idlib.shortnodeid_b2a(peerid), shnum,
913+                    k, n, segsize, datalen),
914+                    parent=lp)
915+        self._valid_versions.add(verinfo)
916+        # We now know that this is a valid candidate verinfo. Whether or
917+        # not this instance of it is valid is a matter for the next
918+        # statement; at this point, we just know that if we see this
919+        # version info again, that its signature checks out and that
920+        # we're okay to skip the signature-checking step.
921 
922hunk ./src/allmydata/mutable/servermap.py 793
923+        # (peerid, shnum) are bound in the method invocation.
924         if (peerid, shnum) in self._servermap.bad_shares:
925             # we've been told that the rest of the data in this share is
926             # unusable, so don't add it to the servermap.
927hunk ./src/allmydata/mutable/servermap.py 808
928         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
929         return verinfo
930 
931+
932     def _deserialize_pubkey(self, pubkey_s):
933         verifier = rsa.create_verifying_key_from_string(pubkey_s)
934         return verifier
935hunk ./src/allmydata/mutable/servermap.py 813
936 
937-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
938-        try:
939-            r = unpack_share(data)
940-        except NeedMoreDataError, e:
941-            # this share won't help us. oh well.
942-            offset = e.encprivkey_offset
943-            length = e.encprivkey_length
944-            self.log("shnum %d on peerid %s: share was too short (%dB) "
945-                     "to get the encprivkey; [%d:%d] ought to hold it" %
946-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
947-                      offset, offset+length),
948-                     parent=lp)
949-            # NOTE: if uncoordinated writes are taking place, someone might
950-            # change the share (and most probably move the encprivkey) before
951-            # we get a chance to do one of these reads and fetch it. This
952-            # will cause us to see a NotEnoughSharesError(unable to fetch
953-            # privkey) instead of an UncoordinatedWriteError . This is a
954-            # nuisance, but it will go away when we move to DSA-based mutable
955-            # files (since the privkey will be small enough to fit in the
956-            # write cap).
957-
958-            return
959-
960-        (seqnum, root_hash, IV, k, N, segsize, datalen,
961-         pubkey, signature, share_hash_chain, block_hash_tree,
962-         share_data, enc_privkey) = r
963-
964-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
965 
966     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
967hunk ./src/allmydata/mutable/servermap.py 815
968-
969+        """
970+        Given a writekey from a remote server, I validate it against the
971+        writekey stored in my node. If it is valid, then I set the
972+        privkey and encprivkey properties of the node.
973+        """
974         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
975         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
976         if alleged_writekey != self._node.get_writekey():
977hunk ./src/allmydata/mutable/servermap.py 892
978         self._queries_completed += 1
979         self._last_failure = f
980 
981-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
982-        now = time.time()
983-        elapsed = now - started
984-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
985-        self._queries_outstanding.discard(peerid)
986-        if not self._need_privkey:
987-            return
988-        if shnum not in datavs:
989-            self.log("privkey wasn't there when we asked it",
990-                     level=log.WEIRD, umid="VA9uDQ")
991-            return
992-        datav = datavs[shnum]
993-        enc_privkey = datav[0]
994-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
995 
996     def _privkey_query_failed(self, f, peerid, shnum, lp):
997         self._queries_outstanding.discard(peerid)
998hunk ./src/allmydata/mutable/servermap.py 906
999         self._servermap.problems.append(f)
1000         self._last_failure = f
1001 
1002+
1003     def _check_for_done(self, res):
1004         # exit paths:
1005         #  return self._send_more_queries(outstanding) : send some more queries
1006hunk ./src/allmydata/mutable/servermap.py 912
1007         #  return self._done() : all done
1008         #  return : keep waiting, no new queries
1009-
1010         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1011                               "%(outstanding)d queries outstanding, "
1012                               "%(extra)d extra peers available, "
1013hunk ./src/allmydata/mutable/servermap.py 1117
1014         self._servermap.last_update_time = self._started
1015         # the servermap will not be touched after this
1016         self.log("servermap: %s" % self._servermap.summarize_versions())
1017+
1018         eventually(self._done_deferred.callback, self._servermap)
1019 
1020     def _fatal_error(self, f):
1021hunk ./src/allmydata/test/test_mutable.py 637
1022         d.addCallback(_created)
1023         return d
1024 
1025-    def publish_multiple(self):
1026+    def publish_mdmf(self):
1027+        # like publish_one, except that the result is guaranteed to be
1028+        # an MDMF file.
1029+        # self.CONTENTS should have more than one segment.
1030+        self.CONTENTS = "This is an MDMF file" * 100000
1031+        self._storage = FakeStorage()
1032+        self._nodemaker = make_nodemaker(self._storage)
1033+        self._storage_broker = self._nodemaker.storage_broker
1034+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1035+        def _created(node):
1036+            self._fn = node
1037+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1038+        d.addCallback(_created)
1039+        return d
1040+
1041+
1042+    def publish_sdmf(self):
1043+        # like publish_one, except that the result is guaranteed to be
1044+        # an SDMF file
1045+        self.CONTENTS = "This is an SDMF file" * 1000
1046+        self._storage = FakeStorage()
1047+        self._nodemaker = make_nodemaker(self._storage)
1048+        self._storage_broker = self._nodemaker.storage_broker
1049+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1050+        def _created(node):
1051+            self._fn = node
1052+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1053+        d.addCallback(_created)
1054+        return d
1055+
1056+
1057+    def publish_multiple(self, version=0):
1058         self.CONTENTS = ["Contents 0",
1059                          "Contents 1",
1060                          "Contents 2",
1061hunk ./src/allmydata/test/test_mutable.py 677
1062         self._copied_shares = {}
1063         self._storage = FakeStorage()
1064         self._nodemaker = make_nodemaker(self._storage)
1065-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1066+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1067         def _created(node):
1068             self._fn = node
1069             # now create multiple versions of the same file, and accumulate
1070hunk ./src/allmydata/test/test_mutable.py 906
1071         return d
1072 
1073 
1074+    def test_servermapupdater_finds_mdmf_files(self):
1075+        # setUp already published an MDMF file for us. We just need to
1076+        # make sure that when we run the ServermapUpdater, the file is
1077+        # reported to have one recoverable version.
1078+        d = defer.succeed(None)
1079+        d.addCallback(lambda ignored:
1080+            self.publish_mdmf())
1081+        d.addCallback(lambda ignored:
1082+            self.make_servermap(mode=MODE_CHECK))
1083+        # Calling make_servermap also updates the servermap in the mode
1084+        # that we specify, so we just need to see what it says.
1085+        def _check_servermap(sm):
1086+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1087+        d.addCallback(_check_servermap)
1088+        return d
1089+
1090+
1091+    def test_servermapupdater_finds_sdmf_files(self):
1092+        d = defer.succeed(None)
1093+        d.addCallback(lambda ignored:
1094+            self.publish_sdmf())
1095+        d.addCallback(lambda ignored:
1096+            self.make_servermap(mode=MODE_CHECK))
1097+        d.addCallback(lambda servermap:
1098+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1099+        return d
1100+
1101 
1102 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1103     def setUp(self):
1104hunk ./src/allmydata/test/test_mutable.py 1050
1105         return d
1106     test_no_servers_download.timeout = 15
1107 
1108+
1109     def _test_corrupt_all(self, offset, substring,
1110                           should_succeed=False, corrupt_early=True,
1111                           failure_checker=None):
1112}
1113[Make a segmented mutable uploader
1114Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1115 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1116 
1117 The mutable file uploader should be able to publish files with one
1118 segment and files with multiple segments. This patch makes it do that.
1119 This is still incomplete, and rather ugly -- I need to flesh out error
1120 handling, I need to write tests, and I need to remove some of the uglier
1121 kludges in the process before I can call this done.
1122] {
1123hunk ./src/allmydata/mutable/publish.py 8
1124 from zope.interface import implements
1125 from twisted.internet import defer
1126 from twisted.python import failure
1127-from allmydata.interfaces import IPublishStatus
1128+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1129 from allmydata.util import base32, hashutil, mathutil, idlib, log
1130 from allmydata import hashtree, codec
1131 from allmydata.storage.server import si_b2a
1132hunk ./src/allmydata/mutable/publish.py 19
1133      UncoordinatedWriteError, NotEnoughServersError
1134 from allmydata.mutable.servermap import ServerMap
1135 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1136-     unpack_checkstring, SIGNED_PREFIX
1137+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1138+
1139+KiB = 1024
1140+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1141 
1142 class PublishStatus:
1143     implements(IPublishStatus)
1144hunk ./src/allmydata/mutable/publish.py 112
1145         self._status.set_helper(False)
1146         self._status.set_progress(0.0)
1147         self._status.set_active(True)
1148+        # We use this to control how the file is written.
1149+        version = self._node.get_version()
1150+        assert version in (SDMF_VERSION, MDMF_VERSION)
1151+        self._version = version
1152 
1153     def get_status(self):
1154         return self._status
1155hunk ./src/allmydata/mutable/publish.py 134
1156         simultaneous write.
1157         """
1158 
1159-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1160-        # 2: perform peer selection, get candidate servers
1161-        #  2a: send queries to n+epsilon servers, to determine current shares
1162-        #  2b: based upon responses, create target map
1163-        # 3: send slot_testv_and_readv_and_writev messages
1164-        # 4: as responses return, update share-dispatch table
1165-        # 4a: may need to run recovery algorithm
1166-        # 5: when enough responses are back, we're done
1167+        # 0. Setup encoding parameters, encoder, and other such things.
1168+        # 1. Encrypt, encode, and publish segments.
1169 
1170         self.log("starting publish, datalen is %s" % len(newdata))
1171         self._status.set_size(len(newdata))
1172hunk ./src/allmydata/mutable/publish.py 187
1173         self.bad_peers = set() # peerids who have errbacked/refused requests
1174 
1175         self.newdata = newdata
1176-        self.salt = os.urandom(16)
1177 
1178hunk ./src/allmydata/mutable/publish.py 188
1179+        # This will set self.segment_size, self.num_segments, and
1180+        # self.fec.
1181         self.setup_encoding_parameters()
1182 
1183         # if we experience any surprises (writes which were rejected because
1184hunk ./src/allmydata/mutable/publish.py 238
1185             self.bad_share_checkstrings[key] = old_checkstring
1186             self.connections[peerid] = self._servermap.connections[peerid]
1187 
1188-        # create the shares. We'll discard these as they are delivered. SDMF:
1189-        # we're allowed to hold everything in memory.
1190+        # Now the process diverges -- if this is an SDMF file, we need
1191+        # to write an SDMF file. Otherwise, we need to write an MDMF
1192+        # file.
1193+        if self._version == MDMF_VERSION:
1194+            return self._publish_mdmf()
1195+        else:
1196+            return self._publish_sdmf()
1197+        #return self.done_deferred
1198+
1199+    def _publish_mdmf(self):
1200+        # Next, we find homes for all of the shares that we don't have
1201+        # homes for yet.
1202+        # TODO: Make this part do peer selection.
1203+        self.update_goal()
1204+        self.writers = {}
1205+        # For each (peerid, shnum) in self.goal, we make an
1206+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1207+        # shares to the peer.
1208+        for key in self.goal:
1209+            peerid, shnum = key
1210+            write_enabler = self._node.get_write_enabler(peerid)
1211+            renew_secret = self._node.get_renewal_secret(peerid)
1212+            cancel_secret = self._node.get_cancel_secret(peerid)
1213+            secrets = (write_enabler, renew_secret, cancel_secret)
1214+
1215+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1216+                                                      self.connections[peerid],
1217+                                                      self._storage_index,
1218+                                                      secrets,
1219+                                                      self._new_seqnum,
1220+                                                      self.required_shares,
1221+                                                      self.total_shares,
1222+                                                      self.segment_size,
1223+                                                      len(self.newdata))
1224+            if (peerid, shnum) in self._servermap.servermap:
1225+                old_versionid, old_timestamp = self._servermap.servermap[key]
1226+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1227+                 old_datalength, old_k, old_N, old_prefix,
1228+                 old_offsets_tuple) = old_versionid
1229+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1230+
1231+        # Now, we start pushing shares.
1232+        self._status.timings["setup"] = time.time() - self._started
1233+        def _start_pushing(res):
1234+            self._started_pushing = time.time()
1235+            return res
1236+
1237+        # First, we encrypt, encode, and publish the shares that we need
1238+        # to encrypt, encode, and publish.
1239+
1240+        # This will eventually hold the block hash chain for each share
1241+        # that we publish. We define it this way so that empty publishes
1242+        # will still have something to write to the remote slot.
1243+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1244+        self.sharehash_leaves = None # eventually [sharehashes]
1245+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1246+                              # validate the share]
1247 
1248hunk ./src/allmydata/mutable/publish.py 296
1249+        d = defer.succeed(None)
1250+        self.log("Starting push")
1251+        for i in xrange(self.num_segments - 1):
1252+            d.addCallback(lambda ignored, i=i:
1253+                self.push_segment(i))
1254+            d.addCallback(self._turn_barrier)
1255+        # We have at least one segment, so we will have a tail segment
1256+        if self.num_segments > 0:
1257+            d.addCallback(lambda ignored:
1258+                self.push_tail_segment())
1259+
1260+        d.addCallback(lambda ignored:
1261+            self.push_encprivkey())
1262+        d.addCallback(lambda ignored:
1263+            self.push_blockhashes())
1264+        d.addCallback(lambda ignored:
1265+            self.push_sharehashes())
1266+        d.addCallback(lambda ignored:
1267+            self.push_toplevel_hashes_and_signature())
1268+        d.addCallback(lambda ignored:
1269+            self.finish_publishing())
1270+        return d
1271+
1272+
1273+    def _publish_sdmf(self):
1274         self._status.timings["setup"] = time.time() - self._started
1275hunk ./src/allmydata/mutable/publish.py 322
1276+        self.salt = os.urandom(16)
1277+
1278         d = self._encrypt_and_encode()
1279         d.addCallback(self._generate_shares)
1280         def _start_pushing(res):
1281hunk ./src/allmydata/mutable/publish.py 335
1282 
1283         return self.done_deferred
1284 
1285+
1286     def setup_encoding_parameters(self):
1287hunk ./src/allmydata/mutable/publish.py 337
1288-        segment_size = len(self.newdata)
1289+        if self._version == MDMF_VERSION:
1290+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1291+        else:
1292+            segment_size = len(self.newdata) # SDMF is only one segment
1293         # this must be a multiple of self.required_shares
1294         segment_size = mathutil.next_multiple(segment_size,
1295                                               self.required_shares)
1296hunk ./src/allmydata/mutable/publish.py 350
1297                                                   segment_size)
1298         else:
1299             self.num_segments = 0
1300-        assert self.num_segments in [0, 1,] # SDMF restrictions
1301+        if self._version == SDMF_VERSION:
1302+            assert self.num_segments in (0, 1) # SDMF
1303+            return
1304+        # calculate the tail segment size.
1305+        self.tail_segment_size = len(self.newdata) % segment_size
1306+
1307+        if self.tail_segment_size == 0:
1308+            # The tail segment is the same size as the other segments.
1309+            self.tail_segment_size = segment_size
1310+
1311+        # We'll make an encoder ahead-of-time for the normal-sized
1312+        # segments (defined as any segment of segment_size bytes).
1313+        # (The part of the code that pushes the tail segment will make
1314+        #  its own encoder for that.)
1315+        fec = codec.CRSEncoder()
1316+        fec.set_params(self.segment_size,
1317+                       self.required_shares, self.total_shares)
1318+        self.piece_size = fec.get_block_size()
1319+        self.fec = fec
1320+
1321+
1322+    def push_segment(self, segnum):
1323+        started = time.time()
1324+        segsize = self.segment_size
1325+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1326+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1327+        assert len(data) == segsize
1328+
1329+        salt = os.urandom(16)
1330+
1331+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1332+        enc = AES(key)
1333+        crypttext = enc.process(data)
1334+        assert len(crypttext) == len(data)
1335+
1336+        now = time.time()
1337+        self._status.timings["encrypt"] = now - started
1338+        started = now
1339+
1340+        # now apply FEC
1341+
1342+        self._status.set_status("Encoding")
1343+        crypttext_pieces = [None] * self.required_shares
1344+        piece_size = self.piece_size
1345+        for i in range(len(crypttext_pieces)):
1346+            offset = i * piece_size
1347+            piece = crypttext[offset:offset+piece_size]
1348+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1349+            crypttext_pieces[i] = piece
1350+            assert len(piece) == piece_size
1351+        d = self.fec.encode(crypttext_pieces)
1352+        def _done_encoding(res):
1353+            elapsed = time.time() - started
1354+            self._status.timings["encode"] = elapsed
1355+            return res
1356+        d.addCallback(_done_encoding)
1357+
1358+        def _push_shares_and_salt(results):
1359+            shares, shareids = results
1360+            dl = []
1361+            for i in xrange(len(shares)):
1362+                sharedata = shares[i]
1363+                shareid = shareids[i]
1364+                block_hash = hashutil.block_hash(salt + sharedata)
1365+                self.blockhashes[shareid].append(block_hash)
1366+
1367+                # find the writer for this share
1368+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1369+                dl.append(d)
1370+            # TODO: Naturally, we need to check on the results of these.
1371+            return defer.DeferredList(dl)
1372+        d.addCallback(_push_shares_and_salt)
1373+        return d
1374+
1375+
1376+    def push_tail_segment(self):
1377+        # This is essentially the same as push_segment, except that we
1378+        # don't use the cached encoder that we use elsewhere.
1379+        self.log("Pushing tail segment")
1380+        started = time.time()
1381+        segsize = self.segment_size
1382+        data = self.newdata[segsize * (self.num_segments-1):]
1383+        assert len(data) == self.tail_segment_size
1384+        salt = os.urandom(16)
1385+
1386+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1387+        enc = AES(key)
1388+        crypttext = enc.process(data)
1389+        assert len(crypttext) == len(data)
1390+
1391+        now = time.time()
1392+        self._status.timings['encrypt'] = now - started
1393+        started = now
1394+
1395+        self._status.set_status("Encoding")
1396+        tail_fec = codec.CRSEncoder()
1397+        tail_fec.set_params(self.tail_segment_size,
1398+                            self.required_shares,
1399+                            self.total_shares)
1400+
1401+        crypttext_pieces = [None] * self.required_shares
1402+        piece_size = tail_fec.get_block_size()
1403+        for i in range(len(crypttext_pieces)):
1404+            offset = i * piece_size
1405+            piece = crypttext[offset:offset+piece_size]
1406+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1407+            crypttext_pieces[i] = piece
1408+            assert len(piece) == piece_size
1409+        d = tail_fec.encode(crypttext_pieces)
1410+        def _push_shares_and_salt(results):
1411+            shares, shareids = results
1412+            dl = []
1413+            for i in xrange(len(shares)):
1414+                sharedata = shares[i]
1415+                shareid = shareids[i]
1416+                block_hash = hashutil.block_hash(salt + sharedata)
1417+                self.blockhashes[shareid].append(block_hash)
1418+                # find the writer for this share
1419+                d = self.writers[shareid].put_block(sharedata,
1420+                                                    self.num_segments - 1,
1421+                                                    salt)
1422+                dl.append(d)
1423+            # TODO: Naturally, we need to check on the results of these.
1424+            return defer.DeferredList(dl)
1425+        d.addCallback(_push_shares_and_salt)
1426+        return d
1427+
1428+
1429+    def push_encprivkey(self):
1430+        started = time.time()
1431+        encprivkey = self._encprivkey
1432+        dl = []
1433+        def _spy_on_writer(results):
1434+            print results
1435+            return results
1436+        for shnum, writer in self.writers.iteritems():
1437+            d = writer.put_encprivkey(encprivkey)
1438+            dl.append(d)
1439+        d = defer.DeferredList(dl)
1440+        return d
1441+
1442+
1443+    def push_blockhashes(self):
1444+        started = time.time()
1445+        dl = []
1446+        def _spy_on_results(results):
1447+            print results
1448+            return results
1449+        self.sharehash_leaves = [None] * len(self.blockhashes)
1450+        for shnum, blockhashes in self.blockhashes.iteritems():
1451+            t = hashtree.HashTree(blockhashes)
1452+            self.blockhashes[shnum] = list(t)
1453+            # set the leaf for future use.
1454+            self.sharehash_leaves[shnum] = t[0]
1455+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1456+            dl.append(d)
1457+        d = defer.DeferredList(dl)
1458+        return d
1459+
1460+
1461+    def push_sharehashes(self):
1462+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1463+        share_hash_chain = {}
1464+        ds = []
1465+        def _spy_on_results(results):
1466+            print results
1467+            return results
1468+        for shnum in xrange(len(self.sharehash_leaves)):
1469+            needed_indices = share_hash_tree.needed_hashes(shnum)
1470+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1471+                                             for i in needed_indices] )
1472+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1473+            ds.append(d)
1474+        self.root_hash = share_hash_tree[0]
1475+        d = defer.DeferredList(ds)
1476+        return d
1477+
1478+
1479+    def push_toplevel_hashes_and_signature(self):
1480+        # We need to do three things here:
1481+        #   - Push the root hash and salt hash
1482+        #   - Get the checkstring of the resulting layout; sign that.
1483+        #   - Push the signature
1484+        ds = []
1485+        def _spy_on_results(results):
1486+            print results
1487+            return results
1488+        for shnum in xrange(self.total_shares):
1489+            d = self.writers[shnum].put_root_hash(self.root_hash)
1490+            ds.append(d)
1491+        d = defer.DeferredList(ds)
1492+        def _make_and_place_signature(ignored):
1493+            signable = self.writers[0].get_signable()
1494+            self.signature = self._privkey.sign(signable)
1495+
1496+            ds = []
1497+            for (shnum, writer) in self.writers.iteritems():
1498+                d = writer.put_signature(self.signature)
1499+                ds.append(d)
1500+            return defer.DeferredList(ds)
1501+        d.addCallback(_make_and_place_signature)
1502+        return d
1503+
1504+
1505+    def finish_publishing(self):
1506+        # We're almost done -- we just need to put the verification key
1507+        # and the offsets
1508+        ds = []
1509+        verification_key = self._pubkey.serialize()
1510+
1511+        def _spy_on_results(results):
1512+            print results
1513+            return results
1514+        for (shnum, writer) in self.writers.iteritems():
1515+            d = writer.put_verification_key(verification_key)
1516+            d.addCallback(lambda ignored, writer=writer:
1517+                writer.finish_publishing())
1518+            ds.append(d)
1519+        return defer.DeferredList(ds)
1520+
1521+
1522+    def _turn_barrier(self, res):
1523+        # putting this method in a Deferred chain imposes a guaranteed
1524+        # reactor turn between the pre- and post- portions of that chain.
1525+        # This can be useful to limit memory consumption: since Deferreds do
1526+        # not do tail recursion, code which uses defer.succeed(result) for
1527+        # consistency will cause objects to live for longer than you might
1528+        # normally expect.
1529+        return fireEventually(res)
1530+
1531 
1532     def _fatal_error(self, f):
1533         self.log("error during loop", failure=f, level=log.UNUSUAL)
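For orientation, each call to push_segment or push_tail_segment above draws a fresh 16-byte salt and derives that segment's AES key from the salt plus the file's readkey (hashutil.ssk_readkey_data_hash in the patch), so every segment is encrypted under its own key. The snippet below is a stand-alone sketch of just that keying step; the SHA-256 truncation and the toy_segment_key name are stand-ins for illustration, not the patch's actual derivation:

    # Sketch only: per-segment keying as in push_segment above. The real code
    # uses hashutil.ssk_readkey_data_hash(salt, readkey) and pycryptopp's AES;
    # the SHA-256 truncation here is a stand-in, not the on-disk format.
    import hashlib, os

    def toy_segment_key(readkey):
        salt = os.urandom(16)                    # fresh salt per segment
        key = hashlib.sha256(salt + readkey).digest()[:16]
        return salt, key

    salt, key = toy_segment_key(os.urandom(16))  # e.g. with a random readkey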
1534hunk ./src/allmydata/mutable/publish.py 716
1535             self.log_goal(self.goal, "after update: ")
1536 
1537 
1538-
1539     def _encrypt_and_encode(self):
1540         # this returns a Deferred that fires with a list of (sharedata,
1541         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1542hunk ./src/allmydata/mutable/publish.py 757
1543         d.addCallback(_done_encoding)
1544         return d
1545 
1546+
1547     def _generate_shares(self, shares_and_shareids):
1548         # this sets self.shares and self.root_hash
1549         self.log("_generate_shares")
1550hunk ./src/allmydata/mutable/publish.py 1145
1551             self._status.set_progress(1.0)
1552         eventually(self.done_deferred.callback, res)
1553 
1554-
1555hunk ./src/allmydata/test/test_mutable.py 248
1556         d.addCallback(_created)
1557         return d
1558 
1559+
1560+    def test_create_mdmf(self):
1561+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1562+        def _created(n):
1563+            self.failUnless(isinstance(n, MutableFileNode))
1564+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1565+            sb = self.nodemaker.storage_broker
1566+            peer0 = sorted(sb.get_all_serverids())[0]
1567+            shnums = self._storage._peers[peer0].keys()
1568+            self.failUnlessEqual(len(shnums), 1)
1569+        d.addCallback(_created)
1570+        return d
1571+
1572+
1573     def test_serialize(self):
1574         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1575         calls = []
1576hunk ./src/allmydata/test/test_mutable.py 334
1577         d.addCallback(_created)
1578         return d
1579 
1580+
1581+    def test_create_mdmf_with_initial_contents(self):
1582+        initial_contents = "foobarbaz" * 131072 # 1152 KiB
1583+        d = self.nodemaker.create_mutable_file(initial_contents,
1584+                                               version=MDMF_VERSION)
1585+        def _created(n):
1586+            d = n.download_best_version()
1587+            d.addCallback(lambda data:
1588+                self.failUnlessEqual(data, initial_contents))
1589+            d.addCallback(lambda ignored:
1590+                n.overwrite(initial_contents + "foobarbaz"))
1591+            d.addCallback(lambda ignored:
1592+                n.download_best_version())
1593+            d.addCallback(lambda data:
1594+                self.failUnlessEqual(data, initial_contents +
1595+                                           "foobarbaz"))
1596+            return d
1597+        d.addCallback(_created)
1598+        return d
1599+
1600+
1601     def test_create_with_initial_contents_function(self):
1602         data = "initial contents"
1603         def _make_contents(n):
1604hunk ./src/allmydata/test/test_mutable.py 370
1605         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1606         return d
1607 
1608+
1609+    def test_create_mdmf_with_initial_contents_function(self):
1610+        data = "initial contents" * 100000
1611+        def _make_contents(n):
1612+            self.failUnless(isinstance(n, MutableFileNode))
1613+            key = n.get_writekey()
1614+            self.failUnless(isinstance(key, str), key)
1615+            self.failUnlessEqual(len(key), 16)
1616+            return data
1617+        d = self.nodemaker.create_mutable_file(_make_contents,
1618+                                               version=MDMF_VERSION)
1619+        d.addCallback(lambda n:
1620+            n.download_best_version())
1621+        d.addCallback(lambda data2:
1622+            self.failUnlessEqual(data2, data))
1623+        return d
1624+
1625+
1626     def test_create_with_too_large_contents(self):
1627         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1628         d = self.nodemaker.create_mutable_file(BIG)
1629}
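The segment bookkeeping in setup_encoding_parameters above amounts to a little arithmetic: MDMF caps segments at 128 KiB (SDMF keeps the whole file in one segment), the segment size is rounded up to a multiple of k so each segment splits evenly into FEC blocks, and any remainder becomes the tail segment. A minimal sketch of that MDMF calculation, using the standard library in place of allmydata.util.mathutil and a made-up helper name:

    # Sketch only: mirrors the MDMF arithmetic in setup_encoding_parameters.
    import math

    DEFAULT_MAX_SEGMENT_SIZE = 128 * 1024   # 128 KiB, as in the patch

    def segment_layout(datalength, k):
        segment_size = DEFAULT_MAX_SEGMENT_SIZE
        # round up to a multiple of k so FEC blocks divide evenly
        segment_size = int(math.ceil(segment_size / float(k))) * k
        num_segments = int(math.ceil(datalength / float(segment_size)))
        tail = datalength % segment_size
        if tail == 0:
            tail = segment_size              # the tail segment is full-sized
        return segment_size, num_segments, tail

    # A 900 KiB file with k=3: eight segments, seven full-sized plus a short tail.
    print(segment_layout(900 * 1024, 3))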
1630[Write a segmented mutable downloader
1631Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1632 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1633 
1634 The segmented mutable downloader can deal with MDMF files (files with
1635 one or more segments in MDMF format) and SDMF files (files with one
1636 segment in SDMF format). It is backwards compatible with the old
1637 file format.
1638 
1639 This patch also contains tests for the segmented mutable downloader.
1640] {
1641hunk ./src/allmydata/mutable/retrieve.py 8
1642 from twisted.internet import defer
1643 from twisted.python import failure
1644 from foolscap.api import DeadReferenceError, eventually, fireEventually
1645-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1646-from allmydata.util import hashutil, idlib, log
1647+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1648+                                 MDMF_VERSION, SDMF_VERSION
1649+from allmydata.util import hashutil, idlib, log, mathutil
1650 from allmydata import hashtree, codec
1651 from allmydata.storage.server import si_b2a
1652 from pycryptopp.cipher.aes import AES
1653hunk ./src/allmydata/mutable/retrieve.py 17
1654 from pycryptopp.publickey import rsa
1655 
1656 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1657-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1658+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1659+                                     MDMFSlotReadProxy
1660 
1661 class RetrieveStatus:
1662     implements(IRetrieveStatus)
1663hunk ./src/allmydata/mutable/retrieve.py 104
1664         self.verinfo = verinfo
1665         # during repair, we may be called upon to grab the private key, since
1666         # it wasn't picked up during a verify=False checker run, and we'll
1667-        # need it for repair to generate the a new version.
1668+        # need it for repair to generate a new version.
1669         self._need_privkey = fetch_privkey
1670         if self._node.get_privkey():
1671             self._need_privkey = False
1672hunk ./src/allmydata/mutable/retrieve.py 109
1673 
1674+        if self._need_privkey:
1675+            # TODO: Evaluate the need for this. We'll use it if we want
1676+            # to limit how many queries are on the wire for the privkey
1677+            # at once.
1678+            self._privkey_query_markers = [] # one Marker for each time we've
1679+                                             # tried to get the privkey.
1680+
1681         self._status = RetrieveStatus()
1682         self._status.set_storage_index(self._storage_index)
1683         self._status.set_helper(False)
1684hunk ./src/allmydata/mutable/retrieve.py 125
1685          offsets_tuple) = self.verinfo
1686         self._status.set_size(datalength)
1687         self._status.set_encoding(k, N)
1688+        self.readers = {}
1689 
1690     def get_status(self):
1691         return self._status
1692hunk ./src/allmydata/mutable/retrieve.py 149
1693         self.remaining_sharemap = DictOfSets()
1694         for (shnum, peerid, timestamp) in shares:
1695             self.remaining_sharemap.add(shnum, peerid)
1696+            # If the servermap update fetched anything, it fetched at least 1
1697+            # KiB, so we ask for that much.
1698+            # TODO: Change the cache methods to allow us to fetch all of the
1699+            # data that they have, then change this method to do that.
1700+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1701+                                                               shnum,
1702+                                                               0,
1703+                                                               1000)
1704+            ss = self.servermap.connections[peerid]
1705+            reader = MDMFSlotReadProxy(ss,
1706+                                       self._storage_index,
1707+                                       shnum,
1708+                                       any_cache)
1709+            reader.peerid = peerid
1710+            self.readers[shnum] = reader
1711+
1712 
1713         self.shares = {} # maps shnum to validated blocks
1714hunk ./src/allmydata/mutable/retrieve.py 167
1715+        self._active_readers = [] # list of active readers for this dl.
1716+        self._validated_readers = set() # set of readers that we have
1717+                                        # validated the prefix of
1718+        self._block_hash_trees = {} # shnum => hashtree
1719+        # TODO: Make this into a file-backed consumer or something to
1720+        # conserve memory.
1721+        self._plaintext = ""
1722 
1723         # how many shares do we need?
1724hunk ./src/allmydata/mutable/retrieve.py 176
1725-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1726+        (seqnum,
1727+         root_hash,
1728+         IV,
1729+         segsize,
1730+         datalength,
1731+         k,
1732+         N,
1733+         prefix,
1734          offsets_tuple) = self.verinfo
1735hunk ./src/allmydata/mutable/retrieve.py 185
1736-        assert len(self.remaining_sharemap) >= k
1737-        # we start with the lowest shnums we have available, since FEC is
1738-        # faster if we're using "primary shares"
1739-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1740-        for shnum in self.active_shnums:
1741-            # we use an arbitrary peer who has the share. If shares are
1742-            # doubled up (more than one share per peer), we could make this
1743-            # run faster by spreading the load among multiple peers. But the
1744-            # algorithm to do that is more complicated than I want to write
1745-            # right now, and a well-provisioned grid shouldn't have multiple
1746-            # shares per peer.
1747-            peerid = list(self.remaining_sharemap[shnum])[0]
1748-            self.get_data(shnum, peerid)
1749 
1750hunk ./src/allmydata/mutable/retrieve.py 186
1751-        # control flow beyond this point: state machine. Receiving responses
1752-        # from queries is the input. We might send out more queries, or we
1753-        # might produce a result.
1754 
1755hunk ./src/allmydata/mutable/retrieve.py 187
1756+        # We need one share hash tree for the entire file; its leaves
1757+        # are the roots of the block hash trees for the shares that
1758+        # comprise it, and its root is in the verinfo.
1759+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1760+        self.share_hash_tree.set_hashes({0: root_hash})
1761+
1762+        # This will set up both the segment decoder and the tail segment
1763+        # decoder, as well as a variety of other instance variables that
1764+        # the download process will use.
1765+        self._setup_encoding_parameters()
1766+        assert len(self.remaining_sharemap) >= k
1767+
1768+        self.log("starting download")
1769+        self._add_active_peers()
1770+        # The download process beyond this is a state machine.
1771+        # _add_active_peers will select the peers that we want to use
1772+        # for the download, and then attempt to start downloading. After
1773+        # each segment, it will check for doneness, reacting to broken
1774+        # peers and corrupt shares as necessary. If it runs out of good
1775+        # peers before downloading all of the segments, _done_deferred
1776+        # will errback.  Otherwise, it will eventually callback with the
1777+        # contents of the mutable file.
1778         return self._done_deferred
1779 
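The comment above sketches the downloader's control loop: keep k readers with validated prefixes active, fetch and check one segment at a time, drop readers that misbehave, and fail only when no replacements remain. A much-simplified synchronous paraphrase of that loop (illustration only; ToyReader and simple_download are placeholders, and the real code is Deferred-based and FEC-decodes each segment):

    # Sketch only: synchronous paraphrase of the download state machine.
    class ToyReader(object):
        def __init__(self, shnum, blocks, good=True):
            self.shnum, self._blocks, self._good = shnum, blocks, good
        def prefix_ok(self):
            return self._good        # real code re-fetches and compares prefixes
        def get_block(self, segnum):
            return self._blocks[segnum]

    def simple_download(readers, k, num_segments):
        active, segments = [], []
        for segnum in range(num_segments):
            while len(active) < k:   # keep k validated readers active
                if not readers:
                    raise RuntimeError("ran out of good peers")
                candidate = readers.pop()
                if candidate.prefix_ok():
                    active.append(candidate)
            blocks = [r.get_block(segnum) for r in active]
            # real code: FEC-decode the k blocks, then decrypt with the readkey
            segments.append(b"".join(blocks))
        return b"".join(segments)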
1780hunk ./src/allmydata/mutable/retrieve.py 211
1781-    def get_data(self, shnum, peerid):
1782-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1783-                 shnum=shnum,
1784-                 peerid=idlib.shortnodeid_b2a(peerid),
1785-                 level=log.NOISY)
1786-        ss = self.servermap.connections[peerid]
1787-        started = time.time()
1788-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1789+
1790+    def _setup_encoding_parameters(self):
1791+        """
1792+        I set up the encoding parameters, including k, n, the number
1793+        of segments associated with this file, and the segment decoder.
1794+        """
1795+        (seqnum,
1796+         root_hash,
1797+         IV,
1798+         segsize,
1799+         datalength,
1800+         k,
1801+         n,
1802+         known_prefix,
1803          offsets_tuple) = self.verinfo
1804hunk ./src/allmydata/mutable/retrieve.py 226
1805-        offsets = dict(offsets_tuple)
1806+        self._required_shares = k
1807+        self._total_shares = n
1808+        self._segment_size = segsize
1809+        self._data_length = datalength
1810+
1811+        if not IV:
1812+            self._version = MDMF_VERSION
1813+        else:
1814+            self._version = SDMF_VERSION
1815+
1816+        if datalength and segsize:
1817+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1818+            self._tail_data_size = datalength % segsize
1819+        else:
1820+            self._num_segments = 0
1821+            self._tail_data_size = 0
1822 
1823hunk ./src/allmydata/mutable/retrieve.py 243
1824-        # we read the checkstring, to make sure that the data we grab is from
1825-        # the right version.
1826-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1827+        self._segment_decoder = codec.CRSDecoder()
1828+        self._segment_decoder.set_params(segsize, k, n)
1829+        self._current_segment = 0
1830 
1831hunk ./src/allmydata/mutable/retrieve.py 247
1832-        # We also read the data, and the hashes necessary to validate them
1833-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1834-        # signature or the pubkey, since that was handled during the
1835-        # servermap phase, and we'll be comparing the share hash chain
1836-        # against the roothash that was validated back then.
1837+        if not self._tail_data_size:
1838+            self._tail_data_size = segsize
1839 
1840hunk ./src/allmydata/mutable/retrieve.py 250
1841-        readv.append( (offsets['share_hash_chain'],
1842-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1843+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1844+                                                         self._required_shares)
1845+        if self._tail_segment_size == self._segment_size:
1846+            self._tail_decoder = self._segment_decoder
1847+        else:
1848+            self._tail_decoder = codec.CRSDecoder()
1849+            self._tail_decoder.set_params(self._tail_segment_size,
1850+                                          self._required_shares,
1851+                                          self._total_shares)
1852 
1853hunk ./src/allmydata/mutable/retrieve.py 260
1854-        # if we need the private key (for repair), we also fetch that
1855-        if self._need_privkey:
1856-            readv.append( (offsets['enc_privkey'],
1857-                           offsets['EOF'] - offsets['enc_privkey']) )
1858+        self.log("got encoding parameters: "
1859+                 "k: %d "
1860+                 "n: %d "
1861+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1862+                 (k, n, self._num_segments, self._segment_size,
1863+                  self._tail_segment_size))
1864 
1865hunk ./src/allmydata/mutable/retrieve.py 267
1866-        m = Marker()
1867-        self._outstanding_queries[m] = (peerid, shnum, started)
1868+        for i in xrange(self._total_shares):
1869+            # So we don't have to do this later.
1870+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1871 
1872hunk ./src/allmydata/mutable/retrieve.py 271
1873-        # ask the cache first
1874-        got_from_cache = False
1875-        datavs = []
1876-        for (offset, length) in readv:
1877-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1878-                                                            offset, length)
1879-            if data is not None:
1880-                datavs.append(data)
1881-        if len(datavs) == len(readv):
1882-            self.log("got data from cache")
1883-            got_from_cache = True
1884-            d = fireEventually({shnum: datavs})
1885-            # datavs is a dict mapping shnum to a pair of strings
1886-        else:
1887-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1888-        self.remaining_sharemap.discard(shnum, peerid)
1889+        # If we have more than one segment, we are an MDMF file, which
1890+        # means that we need to validate the salts as we receive them.
1891+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1892+        self._salt_hash_tree[0] = IV # from the prefix.
1893 
1894hunk ./src/allmydata/mutable/retrieve.py 276
1895-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1896-        d.addErrback(self._query_failed, m, peerid)
1897-        # errors that aren't handled by _query_failed (and errors caused by
1898-        # _query_failed) get logged, but we still want to check for doneness.
1899-        def _oops(f):
1900-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1901-                     shnum=shnum,
1902-                     peerid=idlib.shortnodeid_b2a(peerid),
1903-                     failure=f,
1904-                     level=log.WEIRD, umid="W0xnQA")
1905-        d.addErrback(_oops)
1906-        d.addBoth(self._check_for_done)
1907-        # any error during _check_for_done means the download fails. If the
1908-        # download is successful, _check_for_done will fire _done by itself.
1909-        d.addErrback(self._done)
1910-        d.addErrback(log.err)
1911-        return d # purely for testing convenience
1912 
1913hunk ./src/allmydata/mutable/retrieve.py 277
1914-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1915-        # isolate the callRemote to a separate method, so tests can subclass
1916-        # Publish and override it
1917-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1918-        return d
1919+    def _add_active_peers(self):
1920+        """
1921+        I populate self._active_readers with enough active readers to
1922+        retrieve the contents of this mutable file. I am called before
1923+        downloading starts, and (eventually) after each validation
1924+        error, connection error, or other problem in the download.
1925+        """
1926+        # TODO: It would be cool to investigate other heuristics for
1927+        # reader selection. For instance, the cost (in time the user
1928+        # spends waiting for their file) of selecting a really slow peer
1929+        # that happens to have a primary share is probably more than
1930+        # selecting a really fast peer that doesn't have a primary
1931+        # share. Maybe the servermap could be extended to provide this
1932+        # information; it could keep track of latency information while
1933+        # it gathers more important data, and then this routine could
1934+        # use that to select active readers.
1935+        #
1936+        # (these and other questions would be easier to answer with a
1937+        #  robust, configurable tahoe-lafs simulator, which modeled node
1938+        #  failures, differences in node speed, and other characteristics
1939+        #  that we expect storage servers to have.  You could have
1940+        #  presets for really stable grids (like allmydata.com),
1941+        #  friendnets, make it easy to configure your own settings, and
1942+        #  then simulate the effect of big changes on these use cases
1943+        #  instead of just reasoning about what the effect might be. Out
1944+        #  of scope for MDMF, though.)
1945 
1946hunk ./src/allmydata/mutable/retrieve.py 304
1947-    def remove_peer(self, peerid):
1948-        for shnum in list(self.remaining_sharemap.keys()):
1949-            self.remaining_sharemap.discard(shnum, peerid)
1950+        # We need at least self._required_shares readers to download a
1951+        # segment.
1952+        needed = self._required_shares - len(self._active_readers)
1953+        # XXX: Why don't format= log messages work here?
1954+        self.log("adding %d peers to the active peers list" % needed)
1955 
1956hunk ./src/allmydata/mutable/retrieve.py 310
1957-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1958-        now = time.time()
1959-        elapsed = now - started
1960-        if not got_from_cache:
1961-            self._status.add_fetch_timing(peerid, elapsed)
1962-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1963-                 shares=len(datavs),
1964-                 peerid=idlib.shortnodeid_b2a(peerid),
1965-                 level=log.NOISY)
1966-        self._outstanding_queries.pop(marker, None)
1967-        if not self._running:
1968-            return
1969+        # We favor lower numbered shares, since FEC is faster with
1970+        # primary shares than with other shares, and lower-numbered
1971+        # shares are more likely to be primary than higher numbered
1972+        # shares.
1973+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
1974+        # We shouldn't consider adding shares that we already have; this
1975+        # will cause problems later.
1976+        active_shnums -= set([reader.shnum for reader in self._active_readers])
1977+        active_shnums = list(active_shnums)[:needed]
1978+        if len(active_shnums) < needed:
1979+            # We don't have enough readers to retrieve the file; fail.
1980+            return self._failed()
1981 
1982hunk ./src/allmydata/mutable/retrieve.py 323
1983-        # note that we only ask for a single share per query, so we only
1984-        # expect a single share back. On the other hand, we use the extra
1985-        # shares if we get them.. seems better than an assert().
1986+        for shnum in active_shnums:
1987+            self._active_readers.append(self.readers[shnum])
1988+            self.log("added reader for share %d" % shnum)
1989+        assert len(self._active_readers) == self._required_shares
1990+        # Conceptually, this is part of the _add_active_peers step. It
1991+        # validates the prefixes of newly added readers to make sure
1992+        # that they match what we are expecting for self.verinfo. If
1993+        # validation is successful, _validate_active_prefixes will call
1994+        # _download_current_segment for us. If validation is
1995+        # unsuccessful, then _validate_prefixes will remove the peer and
1996+        # call _add_active_peers again, where we will attempt to rectify
1997+        # the problem by choosing another peer.
1998+        return self._validate_active_prefixes()
1999 
2000hunk ./src/allmydata/mutable/retrieve.py 337
2001-        for shnum,datav in datavs.items():
2002-            (prefix, hash_and_data) = datav[:2]
2003-            try:
2004-                self._got_results_one_share(shnum, peerid,
2005-                                            prefix, hash_and_data)
2006-            except CorruptShareError, e:
2007-                # log it and give the other shares a chance to be processed
2008-                f = failure.Failure()
2009-                self.log(format="bad share: %(f_value)s",
2010-                         f_value=str(f.value), failure=f,
2011-                         level=log.WEIRD, umid="7fzWZw")
2012-                self.notify_server_corruption(peerid, shnum, str(e))
2013-                self.remove_peer(peerid)
2014-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2015-                self._bad_shares.add( (peerid, shnum) )
2016-                self._status.problems[peerid] = f
2017-                self._last_failure = f
2018-                pass
2019-            if self._need_privkey and len(datav) > 2:
2020-                lp = None
2021-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2022-        # all done!
2023 
2024hunk ./src/allmydata/mutable/retrieve.py 338
2025-    def notify_server_corruption(self, peerid, shnum, reason):
2026-        ss = self.servermap.connections[peerid]
2027-        ss.callRemoteOnly("advise_corrupt_share",
2028-                          "mutable", self._storage_index, shnum, reason)
2029+    def _validate_active_prefixes(self):
2030+        """
2031+        I check to make sure that the prefixes on the peers that I am
2032+        currently reading from match the prefix that we want to see, as
2033+        said in self.verinfo.
2034 
2035hunk ./src/allmydata/mutable/retrieve.py 344
2036-    def _got_results_one_share(self, shnum, peerid,
2037-                               got_prefix, got_hash_and_data):
2038-        self.log("_got_results: got shnum #%d from peerid %s"
2039-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2040-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2041+        If I find that all of the active peers have acceptable prefixes,
2042+        I pass control to _download_current_segment, which will use
2043+        those peers to do cool things. If I find that some of the active
2044+        peers have unacceptable prefixes, I will remove them from active
2045+        peers (and from further consideration) and call
2046+        _add_active_peers to attempt to rectify the situation. I keep
2047+        track of which peers I have already validated so that I don't
2048+        need to do so again.
2049+        """
2050+        assert self._active_readers, "No more active readers"
2051+
2052+        ds = []
2053+        new_readers = set(self._active_readers) - self._validated_readers
2054+        self.log('validating %d newly-added active readers' % len(new_readers))
2055+
2056+        for reader in new_readers:
2057+            # We force a remote read here -- otherwise, we are relying
2058+            # on cached data that we already verified as valid, and we
2059+            # won't detect an uncoordinated write that has occurred
2060+            # since the last servermap update.
2061+            d = reader.get_prefix(force_remote=True)
2062+            d.addCallback(self._try_to_validate_prefix, reader)
2063+            ds.append(d)
2064+        dl = defer.DeferredList(ds, consumeErrors=True)
2065+        def _check_results(results):
2066+            # Each result in results will be of the form (success, msg).
2067+            # We don't care about msg, but success will tell us whether
2068+            # or not the checkstring validated. If it didn't, we need to
2069+            # remove the offending (peer,share) from our active readers,
2070+            # and ensure that active readers is again populated.
2071+            bad_readers = []
2072+            for i, result in enumerate(results):
2073+                if not result[0]:
2074+                    reader = self._active_readers[i]
2075+                    f = result[1]
2076+                    assert isinstance(f, failure.Failure)
2077+
2078+                    self.log("The reader %s failed to "
2079+                             "properly validate: %s" % \
2080+                             (reader, str(f.value)))
2081+                    bad_readers.append((reader, f))
2082+                else:
2083+                    reader = self._active_readers[i]
2084+                    self.log("the reader %s checks out, so we'll use it" % \
2085+                             reader)
2086+                    self._validated_readers.add(reader)
2087+                    # Each time we validate a reader, we check to see if
2088+                    # we need the private key. If we do, we politely ask
2089+                    # for it and then continue computing. If we find
2090+                    # that we haven't gotten it at the end of
2091+                    # segment decoding, then we'll take more drastic
2092+                    # measures.
2093+                    if self._need_privkey:
2094+                        d = reader.get_encprivkey()
2095+                        d.addCallback(self._try_to_validate_privkey, reader)
2096+            if bad_readers:
2097+                # We do them all at once, or else we screw up list indexing.
2098+                for (reader, f) in bad_readers:
2099+                    self._mark_bad_share(reader, f)
2100+                return self._add_active_peers()
2101+            else:
2102+                return self._download_current_segment()
2103+            # The next step will assert that it has enough active
2104+            # readers to fetch shares; we just need to remove it.
2105+        dl.addCallback(_check_results)
2106+        return dl
2107+
2108+
2109+    def _try_to_validate_prefix(self, prefix, reader):
2110+        """
2111+        I check that the prefix returned by a candidate server for
2112+        retrieval matches the prefix that the servermap knows about
2113+        (and, hence, the prefix that was validated earlier). If it does,
2114+        I return without incident, which means that the candidate server
2115+        may be used for segment retrieval. If it doesn't, I raise
2116+        UncoordinatedWriteError, and another server must be chosen.
2117+        """
2118+        (seqnum,
2119+         root_hash,
2120+         IV,
2121+         segsize,
2122+         datalength,
2123+         k,
2124+         N,
2125+         known_prefix,
2126          offsets_tuple) = self.verinfo
2127hunk ./src/allmydata/mutable/retrieve.py 430
2128-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2129-        if got_prefix != prefix:
2130-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2131-            raise UncoordinatedWriteError(msg)
2132-        (share_hash_chain, block_hash_tree,
2133-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2134+        if known_prefix != prefix:
2135+            self.log("prefix from share %d doesn't match" % reader.shnum)
2136+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2137+                                          "indicate an uncoordinated write")
2138+        # Otherwise, we're okay -- no issues.
2139 
2140hunk ./src/allmydata/mutable/retrieve.py 436
2141-        assert isinstance(share_data, str)
2142-        # build the block hash tree. SDMF has only one leaf.
2143-        leaves = [hashutil.block_hash(share_data)]
2144-        t = hashtree.HashTree(leaves)
2145-        if list(t) != block_hash_tree:
2146-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2147-        share_hash_leaf = t[0]
2148-        t2 = hashtree.IncompleteHashTree(N)
2149-        # root_hash was checked by the signature
2150-        t2.set_hashes({0: root_hash})
2151-        try:
2152-            t2.set_hashes(hashes=share_hash_chain,
2153-                          leaves={shnum: share_hash_leaf})
2154-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2155-                IndexError), e:
2156-            msg = "corrupt hashes: %s" % (e,)
2157-            raise CorruptShareError(peerid, shnum, msg)
2158-        self.log(" data valid! len=%d" % len(share_data))
2159-        # each query comes down to this: placing validated share data into
2160-        # self.shares
2161-        self.shares[shnum] = share_data
2162 
2163hunk ./src/allmydata/mutable/retrieve.py 437
2164-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2165+    def _remove_reader(self, reader):
2166+        """
2167+        At various points, we will wish to remove a peer from
2168+        consideration and/or use. These include, but are not necessarily
2169+        limited to:
2170 
2171hunk ./src/allmydata/mutable/retrieve.py 443
2172-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2173-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2174-        if alleged_writekey != self._node.get_writekey():
2175-            self.log("invalid privkey from %s shnum %d" %
2176-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2177-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2178-            return
2179+            - A connection error.
2180+            - A mismatched prefix (that is, a prefix that does not match
2181+              our conception of the version information string).
2182+            - A failing block hash, salt hash, or share hash, which can
2183+              indicate disk failure/bit flips, or network trouble.
2184 
2185hunk ./src/allmydata/mutable/retrieve.py 449
2186-        # it's good
2187-        self.log("got valid privkey from shnum %d on peerid %s" %
2188-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2189-                 parent=lp)
2190-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2191-        self._node._populate_encprivkey(enc_privkey)
2192-        self._node._populate_privkey(privkey)
2193-        self._need_privkey = False
2194+        This method will do that. I will make sure that the
2195+        (shnum,reader) combination represented by my reader argument is
2196+        not used for anything else during this download. I will not
2197+        advise the reader of any corruption, something that my callers
2198+        may wish to do on their own.
2199+        """
2200+        # TODO: When you're done writing this, see if this is ever
2201+        # actually used for something that _mark_bad_share isn't. I have
2202+        # a feeling that they will be used for very similar things, and
2203+        # that having them both here is just going to be an epic amount
2204+        # of code duplication.
2205+        #
2206+        # (well, okay, not epic, but meaningful)
2207+        self.log("removing reader %s" % reader)
2208+        # Remove the reader from _active_readers
2209+        self._active_readers.remove(reader)
2210+        # TODO: self.readers.remove(reader)?
2211+        for shnum in list(self.remaining_sharemap.keys()):
2212+            self.remaining_sharemap.discard(shnum, reader.peerid)
2213 
2214hunk ./src/allmydata/mutable/retrieve.py 469
2215-    def _query_failed(self, f, marker, peerid):
2216-        self.log(format="query to [%(peerid)s] failed",
2217-                 peerid=idlib.shortnodeid_b2a(peerid),
2218-                 level=log.NOISY)
2219-        self._status.problems[peerid] = f
2220-        self._outstanding_queries.pop(marker, None)
2221-        if not self._running:
2222-            return
2223-        self._last_failure = f
2224-        self.remove_peer(peerid)
2225-        level = log.WEIRD
2226-        if f.check(DeadReferenceError):
2227-            level = log.UNUSUAL
2228-        self.log(format="error during query: %(f_value)s",
2229-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2230 
2231hunk ./src/allmydata/mutable/retrieve.py 470
2232-    def _check_for_done(self, res):
2233-        # exit paths:
2234-        #  return : keep waiting, no new queries
2235-        #  return self._send_more_queries(outstanding) : send some more queries
2236-        #  fire self._done(plaintext) : download successful
2237-        #  raise exception : download fails
2238+    def _mark_bad_share(self, reader, f):
2239+        """
2240+        I mark the (peerid, shnum) encapsulated by my reader argument as
2241+        a bad share, which means that it will not be used anywhere else.
2242 
2243hunk ./src/allmydata/mutable/retrieve.py 475
2244-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2245-                 running=self._running, decoding=self._decoding,
2246-                 level=log.NOISY)
2247-        if not self._running:
2248-            return
2249-        if self._decoding:
2250-            return
2251-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2252-         offsets_tuple) = self.verinfo
2253+        There are several reasons to want to mark something as a bad
2254+        share. These include:
2255 
2256hunk ./src/allmydata/mutable/retrieve.py 478
2257-        if len(self.shares) < k:
2258-            # we don't have enough shares yet
2259-            return self._maybe_send_more_queries(k)
2260-        if self._need_privkey:
2261-            # we got k shares, but none of them had a valid privkey. TODO:
2262-            # look further. Adding code to do this is a bit complicated, and
2263-            # I want to avoid that complication, and this should be pretty
2264-            # rare (k shares with bitflips in the enc_privkey but not in the
2265-            # data blocks). If we actually do get here, the subsequent repair
2266-            # will fail for lack of a privkey.
2267-            self.log("got k shares but still need_privkey, bummer",
2268-                     level=log.WEIRD, umid="MdRHPA")
2269+            - A connection error to the peer.
2270+            - A mismatched prefix (that is, a prefix that does not match
2271+              our local conception of the version information string).
2272+            - A failing block hash, salt hash, share hash, or other
2273+              integrity check.
2274 
2275hunk ./src/allmydata/mutable/retrieve.py 484
2276-        # we have enough to finish. All the shares have had their hashes
2277-        # checked, so if something fails at this point, we don't know how
2278-        # to fix it, so the download will fail.
2279+        This method will ensure that readers that we wish to mark bad
2280+        (for these reasons or other reasons) are not used for the rest
2281+        of the download. Additionally, it will attempt to tell the
2282+        remote peer (with no guarantee of success) that its share is
2283+        corrupt.
2284+        """
2285+        self.log("marking share %d on server %s as bad" % \
2286+                 (reader.shnum, reader))
2287+        self._remove_reader(reader)
2288+        self._bad_shares.add((reader.peerid, reader.shnum))
2289+        self._status.problems[reader.peerid] = f
2290+        self._last_failure = f
2291+        self.notify_server_corruption(reader.peerid, reader.shnum,
2292+                                      str(f.value))
2293 
2294hunk ./src/allmydata/mutable/retrieve.py 499
2295-        self._decoding = True # avoid reentrancy
2296-        self._status.set_status("decoding")
2297-        now = time.time()
2298-        elapsed = now - self._started
2299-        self._status.timings["fetch"] = elapsed
2300 
2301hunk ./src/allmydata/mutable/retrieve.py 500
2302-        d = defer.maybeDeferred(self._decode)
2303-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2304-        d.addBoth(self._done)
2305-        return d # purely for test convenience
2306+    def _download_current_segment(self):
2307+        """
2308+        I download, validate, decode, decrypt, and assemble the segment
2309+        that this Retrieve is currently responsible for downloading.
2310+        """
2311+        assert len(self._active_readers) >= self._required_shares
2312+        if self._current_segment < self._num_segments:
2313+            d = self._process_segment(self._current_segment)
2314+        else:
2315+            d = defer.succeed(None)
2316+        d.addCallback(self._check_for_done)
2317+        return d
2318 
2319hunk ./src/allmydata/mutable/retrieve.py 513
2320-    def _maybe_send_more_queries(self, k):
2321-        # we don't have enough shares yet. Should we send out more queries?
2322-        # There are some number of queries outstanding, each for a single
2323-        # share. If we can generate 'needed_shares' additional queries, we do
2324-        # so. If we can't, then we know this file is a goner, and we raise
2325-        # NotEnoughSharesError.
2326-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2327-                         "outstanding=%(outstanding)d"),
2328-                 have=len(self.shares), k=k,
2329-                 outstanding=len(self._outstanding_queries),
2330-                 level=log.NOISY)
2331 
2332hunk ./src/allmydata/mutable/retrieve.py 514
2333-        remaining_shares = k - len(self.shares)
2334-        needed = remaining_shares - len(self._outstanding_queries)
2335-        if not needed:
2336-            # we have enough queries in flight already
2337+    def _process_segment(self, segnum):
2338+        """
2339+        I download, validate, decode, and decrypt one segment of the
2340+        file that this Retrieve is retrieving. This means coordinating
2341+        the process of getting k blocks of that file, validating them,
2342+        assembling them into one segment with the decoder, and then
2343+        decrypting them.
2344+        """
2345+        self.log("processing segment %d" % segnum)
2346 
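
Reduced to its Twisted essentials, the coordination described above pairs two fetches per reader in a DeferredList with consumeErrors=True, so one broken reader fails only its own pair instead of aborting the others. A sketch with placeholder function names (not the real reader API):

    from twisted.internet import defer

    def process_segment_sketch(readers, fetch_block, fetch_hashes,
                               validate, mark_bad):
        per_reader = []
        for reader in readers:
            d1 = defer.maybeDeferred(fetch_block, reader)
            d2 = defer.maybeDeferred(fetch_hashes, reader)
            dl = defer.DeferredList([d1, d2], consumeErrors=True)
            dl.addCallback(validate, reader)   # raises on bad data
            dl.addErrback(mark_bad, reader)    # demote just this reader
            per_reader.append(dl)
        # Fires once every reader has either validated or been demoted.
        return defer.DeferredList(per_reader)
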
2347hunk ./src/allmydata/mutable/retrieve.py 524
2348-            # TODO: but if they've been in flight for a long time, and we
2349-            # have reason to believe that new queries might respond faster
2350-            # (i.e. we've seen other queries come back faster, then consider
2351-            # sending out new queries. This could help with peers which have
2352-            # silently gone away since the servermap was updated, for which
2353-            # we're still waiting for the 15-minute TCP disconnect to happen.
2354-            self.log("enough queries are in flight, no more are needed",
2355-                     level=log.NOISY)
2356-            return
2357+        # TODO: The old code uses a marker. Should this code do that
2358+        # too? What did the Marker do?
2359+        assert len(self._active_readers) >= self._required_shares
2360+
2361+        # We need to ask each of our active readers for its block and
2362+        # salt. We will then validate those. If validation is
2363+        # successful, we will assemble the results into plaintext.
2364+        ds = []
2365+        for reader in self._active_readers:
2366+            d = reader.get_block_and_salt(segnum, queue=True)
2367+            d2 = self._get_needed_hashes(reader, segnum)
2368+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2369+            dl.addCallback(self._validate_block, segnum, reader)
2370+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2371+            ds.append(dl)
2372+            reader.flush()
2373+        dl = defer.DeferredList(ds)
2374+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2375+        return dl
2376 
2377hunk ./src/allmydata/mutable/retrieve.py 544
2378-        outstanding_shnums = set([shnum
2379-                                  for (peerid, shnum, started)
2380-                                  in self._outstanding_queries.values()])
2381-        # prefer low-numbered shares, they are more likely to be primary
2382-        available_shnums = sorted(self.remaining_sharemap.keys())
2383-        for shnum in available_shnums:
2384-            if shnum in outstanding_shnums:
2385-                # skip ones that are already in transit
2386-                continue
2387-            if shnum not in self.remaining_sharemap:
2388-                # no servers for that shnum. note that DictOfSets removes
2389-                # empty sets from the dict for us.
2390-                continue
2391-            peerid = list(self.remaining_sharemap[shnum])[0]
2392-            # get_data will remove that peerid from the sharemap, and add the
2393-            # query to self._outstanding_queries
2394-            self._status.set_status("Retrieving More Shares")
2395-            self.get_data(shnum, peerid)
2396-            needed -= 1
2397-            if not needed:
2398+
2399+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2400+        """
2401+        I take the results of fetching and validating the blocks from a
2402+        callback chain in another method. If the results are such that
2403+        they tell me that validation and fetching succeeded without
2404+        incident, I will proceed with decoding and decryption.
2405+        Otherwise, I will do nothing.
2406+        """
2407+        self.log("trying to decode and decrypt segment %d" % segnum)
2408+        failures = False
2409+        for block_and_salt in blocks_and_salts:
2410+            if not block_and_salt[0] or block_and_salt[1] is None:
2411+                self.log("some validation operations failed; not proceeding")
2412+                failures = True
2413                 break
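
For reference, DeferredList hands its callback a list of (success, value) pairs, which is what the loop above walks; a stand-alone version of the same check might look like this (pure illustration, not an actual helper of this class):

    from twisted.python import failure

    def all_blocks_usable(results):
        # Each entry is (success, value); value is a Failure when
        # success is False, or None if an errback already logged the
        # problem and swallowed it.
        for success, value in results:
            if not success or value is None or isinstance(value, failure.Failure):
                return False
        return True
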
2414hunk ./src/allmydata/mutable/retrieve.py 560
2415+        if not failures:
2416+            self.log("everything looks ok, building segment %d" % segnum)
2417+            d = self._decode_blocks(blocks_and_salts, segnum)
2418+            d.addCallback(self._decrypt_segment)
2419+            d.addErrback(self._validation_or_decoding_failed,
2420+                         self._active_readers)
2421+            d.addCallback(self._set_segment)
2422+            return d
2423+        else:
2424+            return defer.succeed(None)
2425+
2426+
2427+    def _set_segment(self, segment):
2428+        """
2429+        Given a plaintext segment, I register that segment with the
2430+        target that is handling the file download.
2431+        """
2432+        self.log("got plaintext for segment %d" % self._current_segment)
2433+        self._plaintext += segment
2434+        self._current_segment += 1
2435 
2436hunk ./src/allmydata/mutable/retrieve.py 581
2437-        # at this point, we have as many outstanding queries as we can. If
2438-        # needed!=0 then we might not have enough to recover the file.
2439-        if needed:
2440-            format = ("ran out of peers: "
2441-                      "have %(have)d shares (k=%(k)d), "
2442-                      "%(outstanding)d queries in flight, "
2443-                      "need %(need)d more, "
2444-                      "found %(bad)d bad shares")
2445-            args = {"have": len(self.shares),
2446-                    "k": k,
2447-                    "outstanding": len(self._outstanding_queries),
2448-                    "need": needed,
2449-                    "bad": len(self._bad_shares),
2450-                    }
2451-            self.log(format=format,
2452-                     level=log.WEIRD, umid="ezTfjw", **args)
2453-            err = NotEnoughSharesError("%s, last failure: %s" %
2454-                                      (format % args, self._last_failure))
2455-            if self._bad_shares:
2456-                self.log("We found some bad shares this pass. You should "
2457-                         "update the servermap and try again to check "
2458-                         "more peers",
2459-                         level=log.WEIRD, umid="EFkOlA")
2460-                err.servermap = self.servermap
2461-            raise err
2462 
2463hunk ./src/allmydata/mutable/retrieve.py 582
2464+    def _validation_or_decoding_failed(self, f, readers):
2465+        """
2466+        I am called when a block or a salt fails to correctly validate, or when
2467+        the decryption or decoding operation fails for some reason.  I react to
2468+        this failure by notifying the remote server of corruption, and then
2469+        removing the remote peer from further activity.
2470+        """
2471+        assert isinstance(readers, list)
2472+        bad_shnums = [reader.shnum for reader in readers]
2473+
2474+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
2475+                 "segment %d: %s" % \
2476+                 (bad_shnums, readers, self._current_segment, str(f)))
2477+        for reader in readers:
2478+            self._mark_bad_share(reader, f)
2479         return
2480 
2481hunk ./src/allmydata/mutable/retrieve.py 599
2482-    def _decode(self):
2483-        started = time.time()
2484-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2485-         offsets_tuple) = self.verinfo
2486 
2487hunk ./src/allmydata/mutable/retrieve.py 600
2488-        # shares_dict is a dict mapping shnum to share data, but the codec
2489-        # wants two lists.
2490-        shareids = []; shares = []
2491-        for shareid, share in self.shares.items():
2492+    def _validate_block(self, results, segnum, reader):
2493+        """
2494+        I validate a block from one share on a remote server.
2495+        """
2496+        # Grab the part of the block hash tree that is necessary to
2497+        # validate this block, then generate the block hash root.
2498+        self.log("validating share %d for segment %d" % (reader.shnum,
2499+                                                             segnum))
2500+        # Did we fail to fetch either of the things that we were
2501+        # supposed to? Fail if so.
2502+        if not results[0][0] or not results[1][0]:
2503+            # handled by the errback handler.
2504+
2505+            # These all get batched into one query, so the resulting
2506+            # failure should be the same for all of them, so we can just
2507+            # use the first one.
2508+            assert isinstance(results[0][1], failure.Failure)
2509+
2510+            f = results[0][1]
2511+            raise CorruptShareError(reader.peerid,
2512+                                    reader.shnum,
2513+                                    "Connection error: %s" % str(f))
2514+
2515+        block_and_salt, block_and_sharehashes = results
2516+        block, salt = block_and_salt[1]
2517+        blockhashes, sharehashes = block_and_sharehashes[1]
2518+
2519+        blockhashes = dict(enumerate(blockhashes[1]))
2520+        self.log("the reader gave me the following blockhashes: %s" % \
2521+                 blockhashes.keys())
2522+        self.log("the reader gave me the following sharehashes: %s" % \
2523+                 sharehashes[1].keys())
2524+        bht = self._block_hash_trees[reader.shnum]
2525+
2526+        if bht.needed_hashes(segnum, include_leaf=True):
2527+            try:
2528+                bht.set_hashes(blockhashes)
2529+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2530+                    IndexError), e:
2531+                raise CorruptShareError(reader.peerid,
2532+                                        reader.shnum,
2533+                                        "block hash tree failure: %s" % e)
2534+
2535+        if self._version == MDMF_VERSION:
2536+            blockhash = hashutil.block_hash(salt + block)
2537+        else:
2538+            blockhash = hashutil.block_hash(block)
2539+        # If this works without an error, then validation is
2540+        # successful.
2541+        try:
2542+            bht.set_hashes(leaves={segnum: blockhash})
2543+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2544+                IndexError), e:
2545+            raise CorruptShareError(reader.peerid,
2546+                                    reader.shnum,
2547+                                    "block hash tree failure: %s" % e)
2548+
2549+        # Reaching this point means that we know that this segment
2550+        # is correct. Now we need to check to see whether the share
2551+        # hash chain is also correct.
2552+        # SDMF wrote share hash chains that didn't contain the
2553+        # leaves, which would be produced from the block hash tree.
2554+        # So we need to validate the block hash tree first. If
2555+        # successful, then bht[0] will contain the root for the
2556+        # shnum, which will be a leaf in the share hash tree, which
2557+        # will allow us to validate the rest of the tree.
2558+        if self.share_hash_tree.needed_hashes(reader.shnum,
2559+                                               include_leaf=True):
2560+            try:
2561+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2562+                                            leaves={reader.shnum: bht[0]})
2563+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2564+                    IndexError), e:
2565+                raise CorruptShareError(reader.peerid,
2566+                                        reader.shnum,
2567+                                        "corrupt hashes: %s" % e)
2568+
2569+        # TODO: Validate the salt, too.
2570+        self.log('share %d is valid for segment %d' % (reader.shnum,
2571+                                                       segnum))
2572+        return {reader.shnum: (block, salt)}
2573+
2574+
2575+    def _get_needed_hashes(self, reader, segnum):
2576+        """
2577+        I get the hashes needed to validate segnum from the reader, then return
2578+        to my caller when this is done.
2579+        """
2580+        bht = self._block_hash_trees[reader.shnum]
2581+        needed = bht.needed_hashes(segnum, include_leaf=True)
2582+        # The root of the block hash tree is also a leaf in the share
2583+        # hash tree. So we don't need to fetch it from the remote
2584+        # server. In the case of files with one segment, this means that
2585+        # we won't fetch any block hash tree from the remote server,
2586+        # since each share's single block hash is the entire block hash
2587+        # tree, and its root is a leaf in the share hash tree. This is fine,
2588+        # since any share corruption will be detected in the share hash
2589+        # tree.
2590+        #needed.discard(0)
2591+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2592+                 (segnum, reader.shnum, str(needed)))
2593+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2594+        if self.share_hash_tree.needed_hashes(reader.shnum):
2595+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2596+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2597+                                                                 str(need)))
2598+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2599+        else:
2600+            d2 = defer.succeed({}) # the logic in the next method
2601+                                   # expects a dict
2602+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2603+        return dl
2604+
2605+
2606+    def _decode_blocks(self, blocks_and_salts, segnum):
2607+        """
2608+        I take a list of k blocks and salts, and decode that into a
2609+        single encrypted segment.
2610+        """
2611+        d = {}
2612+        # We want to merge our dictionaries to the form
2613+        # {shnum: blocks_and_salts}
2614+        #
2615+        # The dictionaries come out of _validate_block in that form, so we
2616+        # just need to merge them.
2617+        for block_and_salt in blocks_and_salts:
2618+            d.update(block_and_salt[1])
2619+
2620+        # All of these blocks should have the same salt; in SDMF, it is
2621+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2622+        # either case, we just need to get one of them and use it.
2623+        #
2624+        # d.items()[0] is like (shnum, (block, salt))
2625+        # d.items()[0][1] is like (block, salt)
2626+        # d.items()[0][1][1] is the salt.
2627+        salt = d.items()[0][1][1]
2628+        # Next, extract just the blocks from the dict. We'll use the
2629+        # salt in the next step.
2630+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2631+        d2 = dict(share_and_shareids)
2632+        shareids = []
2633+        shares = []
2634+        for shareid, share in d2.items():
2635             shareids.append(shareid)
2636             shares.append(share)
2637 
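
Concretely, the merge-and-trim step that feeds the decoder can be pictured like this (illustrative names; zfec wants exactly k blocks plus their share ids, and every block of a given segment carries the same salt):

    def prepare_blocks_for_decode(validated_results, k):
        # validated_results is a list of {shnum: (block, salt)} dicts,
        # one per reader; merge them into a single mapping.
        merged = {}
        for result in validated_results:
            merged.update(result)
        # Any entry's salt will do: the file-wide IV in SDMF, the
        # per-segment salt in MDMF.
        salt = list(merged.values())[0][1]
        shareids = sorted(merged.keys())[:k]
        blocks = [merged[shnum][0] for shnum in shareids]
        return shareids, blocks, salt
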
2638hunk ./src/allmydata/mutable/retrieve.py 746
2639-        assert len(shareids) >= k, len(shareids)
2640+        assert len(shareids) >= self._required_shares, len(shareids)
2641         # zfec really doesn't want extra shares
2642hunk ./src/allmydata/mutable/retrieve.py 748
2643-        shareids = shareids[:k]
2644-        shares = shares[:k]
2645-
2646-        fec = codec.CRSDecoder()
2647-        fec.set_params(segsize, k, N)
2648-
2649-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2650-        self.log("about to decode, shareids=%s" % (shareids,))
2651-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2652-        def _done(buffers):
2653-            self._status.timings["decode"] = time.time() - started
2654-            self.log(" decode done, %d buffers" % len(buffers))
2655+        shareids = shareids[:self._required_shares]
2656+        shares = shares[:self._required_shares]
2657+        self.log("decoding segment %d" % segnum)
2658+        if segnum == self._num_segments - 1:
2659+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2660+        else:
2661+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2662+        def _process(buffers):
2663             segment = "".join(buffers)
2664hunk ./src/allmydata/mutable/retrieve.py 757
2665+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
2666+                     segnum=segnum,
2667+                     numsegs=self._num_segments,
2668+                     level=log.NOISY)
2669             self.log(" joined length %d, datalength %d" %
2670hunk ./src/allmydata/mutable/retrieve.py 762
2671-                     (len(segment), datalength))
2672-            segment = segment[:datalength]
2673+                     (len(segment), self._data_length))
2674+            if segnum == self._num_segments - 1:
2675+                size_to_use = self._tail_data_size
2676+            else:
2677+                size_to_use = self._segment_size
2678+            segment = segment[:size_to_use]
2679             self.log(" segment len=%d" % len(segment))
2680hunk ./src/allmydata/mutable/retrieve.py 769
2681-            return segment
2682-        def _err(f):
2683-            self.log(" decode failed: %s" % f)
2684-            return f
2685-        d.addCallback(_done)
2686-        d.addErrback(_err)
2687+            return segment, salt
2688+        d.addCallback(_process)
2689         return d
2690 
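
The tail trimming above is plain arithmetic: every segment but the last holds a full segment's worth of plaintext, and the decoder hands back whole blocks, so the decoded tail has to be cut down to the bytes that are really there. A small illustration with generic names (not the Retrieve attributes):

    def segment_plaintext_size(segnum, num_segments, segment_size, data_length):
        # Full-sized for every segment but the last; the tail holds
        # whatever remains of the file.
        if segnum == num_segments - 1:
            return data_length - (num_segments - 1) * segment_size
        return segment_size

    # e.g. segment_plaintext_size(8, 9, 100000, 890000) == 90000
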
2691hunk ./src/allmydata/mutable/retrieve.py 773
2692-    def _decrypt(self, crypttext, IV, readkey):
2693+
2694+    def _decrypt_segment(self, segment_and_salt):
2695+        """
2696+        I take a single segment and its salt, and decrypt it. I return
2697+        the plaintext of the segment that is in my argument.
2698+        """
2699+        segment, salt = segment_and_salt
2700         self._status.set_status("decrypting")
2701hunk ./src/allmydata/mutable/retrieve.py 781
2702+        self.log("decrypting segment %d" % self._current_segment)
2703         started = time.time()
2704hunk ./src/allmydata/mutable/retrieve.py 783
2705-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2706+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2707         decryptor = AES(key)
2708hunk ./src/allmydata/mutable/retrieve.py 785
2709-        plaintext = decryptor.process(crypttext)
2710+        plaintext = decryptor.process(segment)
2711         self._status.timings["decrypt"] = time.time() - started
2712         return plaintext
2713 
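
The key derivation that makes per-segment decryption work: the readkey is hashed together with the segment's salt, so SDMF (one file-wide IV) gets a single data key while MDMF gets a fresh key for each segment. A stand-in sketch using hashlib in place of hashutil.ssk_readkey_data_hash (the tag string and 16-byte truncation are illustrative, not Tahoe's real derivation):

    import hashlib

    def derive_segment_key(readkey, salt):
        # Stand-in for hashutil.ssk_readkey_data_hash(salt, readkey):
        # mix the file's readkey with this segment's salt, truncated to
        # an AES-128 key for the sketch.
        return hashlib.sha256(b"readkey-data:" + readkey + salt).digest()[:16]

    # different salts yield different keys:
    # derive_segment_key(readkey, salt_0) != derive_segment_key(readkey, salt_1)
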
2714hunk ./src/allmydata/mutable/retrieve.py 789
2715-    def _done(self, res):
2716-        if not self._running:
2717+
2718+    def notify_server_corruption(self, peerid, shnum, reason):
2719+        ss = self.servermap.connections[peerid]
2720+        ss.callRemoteOnly("advise_corrupt_share",
2721+                          "mutable", self._storage_index, shnum, reason)
2722+
2723+
2724+    def _try_to_validate_privkey(self, enc_privkey, reader):
2725+
2726+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2727+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2728+        if alleged_writekey != self._node.get_writekey():
2729+            self.log("invalid privkey from %s shnum %d" %
2730+                     (reader, reader.shnum),
2731+                     level=log.WEIRD, umid="YIw4tA")
2732             return
2733hunk ./src/allmydata/mutable/retrieve.py 805
2734-        self._running = False
2735-        self._status.set_active(False)
2736-        self._status.timings["total"] = time.time() - self._started
2737-        # res is either the new contents, or a Failure
2738-        if isinstance(res, failure.Failure):
2739-            self.log("Retrieve done, with failure", failure=res,
2740-                     level=log.UNUSUAL)
2741-            self._status.set_status("Failed")
2742-        else:
2743-            self.log("Retrieve done, success!")
2744-            self._status.set_status("Finished")
2745-            self._status.set_progress(1.0)
2746-            # remember the encoding parameters, use them again next time
2747-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2748-             offsets_tuple) = self.verinfo
2749-            self._node._populate_required_shares(k)
2750-            self._node._populate_total_shares(N)
2751-        eventually(self._done_deferred.callback, res)
2752 
2753hunk ./src/allmydata/mutable/retrieve.py 806
2754+        # it's good
2755+        self.log("got valid privkey from shnum %d on reader %s" %
2756+                 (reader.shnum, reader))
2757+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2758+        self._node._populate_encprivkey(enc_privkey)
2759+        self._node._populate_privkey(privkey)
2760+        self._need_privkey = False
2761+
2762+
2763+    def _check_for_done(self, res):
2764+        """
2765+        I check to see if this Retrieve object has successfully finished
2766+        its work.
2767+
2768+        I can exit in the following ways:
2769+            - If there are no more segments to download, then I exit by
2770+              causing self._done_deferred to fire with the plaintext
2771+              content requested by the caller.
2772+            - If there are still segments to be downloaded, and there
2773+              are enough active readers (readers which have not broken
2774+              and have not given us corrupt data) to continue
2775+              downloading, I send control back to
2776+              _download_current_segment.
2777+            - If there are still segments to be downloaded but there are
2778+              not enough active peers to download them, I ask
2779+              _add_active_peers to add more peers. If it is successful,
2780+              it will call _download_current_segment. If there are not
2781+              enough peers to retrieve the file, then that will cause
2782+              _done_deferred to errback.
2783+        """
2784+        self.log("checking for doneness")
2785+        if self._current_segment == self._num_segments:
2786+            # No more segments to download, we're done.
2787+            self.log("got plaintext, done")
2788+            return self._done()
2789+
2790+        if len(self._active_readers) >= self._required_shares:
2791+            # More segments to download, but we have enough good peers
2792+            # in self._active_readers that we can do that without issue,
2793+            # so go nab the next segment.
2794+            self.log("not done yet: on segment %d of %d" % \
2795+                     (self._current_segment + 1, self._num_segments))
2796+            return self._download_current_segment()
2797+
2798+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2799+                 (self._current_segment + 1, self._num_segments))
2800+        return self._add_active_peers()
2801+
2802+
2803+    def _done(self):
2804+        """
2805+        I am called by _check_for_done when the download process has
2806+        finished successfully. After making some useful logging
2807+        statements, I return the decrypted contents to the owner of this
2808+        Retrieve object through self._done_deferred.
2809+        """
2810+        eventually(self._done_deferred.callback, self._plaintext)
2811+
2812+
2813+    def _failed(self):
2814+        """
2815+        I am called by _add_active_peers when there are not enough
2816+        active peers left to complete the download. After making some
2817+        useful logging statements, I return an exception to that effect
2818+        to the caller of this Retrieve object through
2819+        self._done_deferred.
2820+        """
2821+        format = ("ran out of peers: "
2822+                  "have %(have)d of %(total)d segments, "
2823+                  "found %(bad)d bad shares, "
2824+                  "encoding %(k)d-of-%(n)d")
2825+        args = {"have": self._current_segment,
2826+                "total": self._num_segments,
2827+                "k": self._required_shares,
2828+                "n": self._total_shares,
2829+                "bad": len(self._bad_shares)}
2830+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2831+                                                        str(self._last_failure)))
2832+        f = failure.Failure(e)
2833+        eventually(self._done_deferred.callback, f)
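
Stripped of the peer management, the control flow that _check_for_done drives is a simple chain: download segment i, append its plaintext, repeat until the segment count runs out, then hand the concatenation (or a Failure, if peers run out first) to the waiting Deferred. A minimal sketch, with download_one standing in for the per-segment machinery above:

    from twisted.internet import defer

    def download_all_segments(download_one, num_segments):
        pieces = []
        d = defer.succeed(None)
        for segnum in range(num_segments):
            # Each iteration waits for the previous segment to finish.
            d.addCallback(lambda ign, segnum=segnum: download_one(segnum))
            d.addCallback(pieces.append)
        d.addCallback(lambda ign: "".join(pieces))
        return d
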
2834hunk ./src/allmydata/test/test_mutable.py 12
2835 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2836      ssk_pubkey_fingerprint_hash
2837 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2838-     NotEnoughSharesError
2839+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2840 from allmydata.monitor import Monitor
2841 from allmydata.test.common import ShouldFailMixin
2842 from allmydata.test.no_network import GridTestMixin
2843hunk ./src/allmydata/test/test_mutable.py 28
2844 from allmydata.mutable.retrieve import Retrieve
2845 from allmydata.mutable.publish import Publish
2846 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2847-from allmydata.mutable.layout import unpack_header, unpack_share
2848+from allmydata.mutable.layout import unpack_header, unpack_share, \
2849+                                     MDMFSlotReadProxy
2850 from allmydata.mutable.repairer import MustForceRepairError
2851 
2852 import allmydata.test.common_util as testutil
2853hunk ./src/allmydata/test/test_mutable.py 104
2854         d = fireEventually()
2855         d.addCallback(lambda res: _call())
2856         return d
2857+
2858     def callRemoteOnly(self, methname, *args, **kwargs):
2859         d = self.callRemote(methname, *args, **kwargs)
2860         d.addBoth(lambda ignore: None)
2861hunk ./src/allmydata/test/test_mutable.py 163
2862 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2863     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2864     # list of shnums to corrupt.
2865+    ds = []
2866     for peerid in s._peers:
2867         shares = s._peers[peerid]
2868         for shnum in shares:
2869hunk ./src/allmydata/test/test_mutable.py 190
2870                 else:
2871                     offset1 = offset
2872                     offset2 = 0
2873-                if offset1 == "pubkey":
2874+                if offset1 == "pubkey" and IV:
2875                     real_offset = 107
2876hunk ./src/allmydata/test/test_mutable.py 192
2877+                elif offset1 == "share_data" and not IV:
2878+                    real_offset = 104
2879                 elif offset1 in o:
2880                     real_offset = o[offset1]
2881                 else:
2882hunk ./src/allmydata/test/test_mutable.py 327
2883         d.addCallback(_created)
2884         return d
2885 
2886+
2887+    def test_upload_and_download_mdmf(self):
2888+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2889+        def _created(n):
2890+            d = defer.succeed(None)
2891+            d.addCallback(lambda ignored:
2892+                n.get_servermap(MODE_READ))
2893+            def _then(servermap):
2894+                dumped = servermap.dump(StringIO())
2895+                self.failUnlessIn("3-of-10", dumped.getvalue())
2896+            d.addCallback(_then)
2897+            # Now overwrite the contents with some new contents. We want
2898+            # to make them big enough to force the file to be uploaded
2899+            # in more than one segment.
2900+            big_contents = "contents1" * 100000 # about 900 KiB
2901+            d.addCallback(lambda ignored:
2902+                n.overwrite(big_contents))
2903+            d.addCallback(lambda ignored:
2904+                n.download_best_version())
2905+            d.addCallback(lambda data:
2906+                self.failUnlessEqual(data, big_contents))
2907+            # Overwrite the contents again with some new contents. As
2908+            # before, they need to be big enough to force multiple
2909+            # segments, so that we make the downloader deal with
2910+            # multiple segments.
2911+            bigger_contents = "contents2" * 1000000 # about 9MiB
2912+            d.addCallback(lambda ignored:
2913+                n.overwrite(bigger_contents))
2914+            d.addCallback(lambda ignored:
2915+                n.download_best_version())
2916+            d.addCallback(lambda data:
2917+                self.failUnlessEqual(data, bigger_contents))
2918+            return d
2919+        d.addCallback(_created)
2920+        return d
2921+
2922+
2923     def test_create_with_initial_contents(self):
2924         d = self.nodemaker.create_mutable_file("contents 1")
2925         def _created(n):
2926hunk ./src/allmydata/test/test_mutable.py 1147
2927 
2928 
2929     def _test_corrupt_all(self, offset, substring,
2930-                          should_succeed=False, corrupt_early=True,
2931-                          failure_checker=None):
2932+                          should_succeed=False,
2933+                          corrupt_early=True,
2934+                          failure_checker=None,
2935+                          fetch_privkey=False):
2936         d = defer.succeed(None)
2937         if corrupt_early:
2938             d.addCallback(corrupt, self._storage, offset)
2939hunk ./src/allmydata/test/test_mutable.py 1167
2940                     self.failUnlessIn(substring, "".join(allproblems))
2941                 return servermap
2942             if should_succeed:
2943-                d1 = self._fn.download_version(servermap, ver)
2944+                d1 = self._fn.download_version(servermap, ver,
2945+                                               fetch_privkey)
2946                 d1.addCallback(lambda new_contents:
2947                                self.failUnlessEqual(new_contents, self.CONTENTS))
2948             else:
2949hunk ./src/allmydata/test/test_mutable.py 1175
2950                 d1 = self.shouldFail(NotEnoughSharesError,
2951                                      "_corrupt_all(offset=%s)" % (offset,),
2952                                      substring,
2953-                                     self._fn.download_version, servermap, ver)
2954+                                     self._fn.download_version, servermap,
2955+                                                                ver,
2956+                                                                fetch_privkey)
2957             if failure_checker:
2958                 d1.addCallback(failure_checker)
2959             d1.addCallback(lambda res: servermap)
2960hunk ./src/allmydata/test/test_mutable.py 1186
2961         return d
2962 
2963     def test_corrupt_all_verbyte(self):
2964-        # when the version byte is not 0, we hit an UnknownVersionError error
2965-        # in unpack_share().
2966+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
2967+        # error in unpack_share().
2968         d = self._test_corrupt_all(0, "UnknownVersionError")
2969         def _check_servermap(servermap):
2970             # and the dump should mention the problems
2971hunk ./src/allmydata/test/test_mutable.py 1193
2972             s = StringIO()
2973             dump = servermap.dump(s).getvalue()
2974-            self.failUnless("10 PROBLEMS" in dump, dump)
2975+            self.failUnless("30 PROBLEMS" in dump, dump)
2976         d.addCallback(_check_servermap)
2977         return d
2978 
2979hunk ./src/allmydata/test/test_mutable.py 1263
2980         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
2981 
2982 
2983+    def test_corrupt_all_encprivkey_late(self):
2984+        # this should work for the same reason as above, but we corrupt
2985+        # after the servermap update to exercise the error handling
2986+        # code.
2987+        # We need to remove the privkey from the node, or the retrieve
2988+        # process won't know to update it.
2989+        self._fn._privkey = None
2990+        return self._test_corrupt_all("enc_privkey",
2991+                                      None, # this shouldn't fail
2992+                                      should_succeed=True,
2993+                                      corrupt_early=False,
2994+                                      fetch_privkey=True)
2995+
2996+
2997     def test_corrupt_all_seqnum_late(self):
2998         # corrupting the seqnum between mapupdate and retrieve should result
2999         # in NotEnoughSharesError, since each share will look invalid
3000hunk ./src/allmydata/test/test_mutable.py 1283
3001         def _check(res):
3002             f = res[0]
3003             self.failUnless(f.check(NotEnoughSharesError))
3004-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3005+            self.failUnless("uncoordinated write" in str(f))
3006         return self._test_corrupt_all(1, "ran out of peers",
3007                                       corrupt_early=False,
3008                                       failure_checker=_check)
3009hunk ./src/allmydata/test/test_mutable.py 1333
3010                       self.failUnlessEqual(new_contents, self.CONTENTS))
3011         return d
3012 
3013-    def test_corrupt_some(self):
3014-        # corrupt the data of first five shares (so the servermap thinks
3015-        # they're good but retrieve marks them as bad), so that the
3016-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3017-        # retry with more servers.
3018-        corrupt(None, self._storage, "share_data", range(5))
3019-        d = self.make_servermap()
3020+
3021+    def _test_corrupt_some(self, offset, mdmf=False):
3022+        if mdmf:
3023+            d = self.publish_mdmf()
3024+        else:
3025+            d = defer.succeed(None)
3026+        d.addCallback(lambda ignored:
3027+            corrupt(None, self._storage, offset, range(5)))
3028+        d.addCallback(lambda ignored:
3029+            self.make_servermap())
3030         def _do_retrieve(servermap):
3031             ver = servermap.best_recoverable_version()
3032             self.failUnless(ver)
3033hunk ./src/allmydata/test/test_mutable.py 1349
3034             return self._fn.download_best_version()
3035         d.addCallback(_do_retrieve)
3036         d.addCallback(lambda new_contents:
3037-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3038+            self.failUnlessEqual(new_contents, self.CONTENTS))
3039         return d
3040 
3041hunk ./src/allmydata/test/test_mutable.py 1352
3042+
3043+    def test_corrupt_some(self):
3044+        # corrupt the data of first five shares (so the servermap thinks
3045+        # they're good but retrieve marks them as bad), so that the
3046+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3047+        # retry with more servers.
3048+        return self._test_corrupt_some("share_data")
3049+
3050+
3051     def test_download_fails(self):
3052         d = corrupt(None, self._storage, "signature")
3053         d.addCallback(lambda ignored:
3054hunk ./src/allmydata/test/test_mutable.py 1366
3055             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3056                             "no recoverable versions",
3057-                            self._fn.download_best_version)
3058+                            self._fn.download_best_version))
3059         return d
3060 
3061 
3062hunk ./src/allmydata/test/test_mutable.py 1370
3063+
3064+    def test_corrupt_mdmf_block_hash_tree(self):
3065+        d = self.publish_mdmf()
3066+        d.addCallback(lambda ignored:
3067+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3068+                                   "block hash tree failure",
3069+                                   corrupt_early=True,
3070+                                   should_succeed=False))
3071+        return d
3072+
3073+
3074+    def test_corrupt_mdmf_block_hash_tree_late(self):
3075+        d = self.publish_mdmf()
3076+        d.addCallback(lambda ignored:
3077+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3078+                                   "block hash tree failure",
3079+                                   corrupt_early=False,
3080+                                   should_succeed=False))
3081+        return d
3082+
3083+
3084+    def test_corrupt_mdmf_share_data(self):
3085+        d = self.publish_mdmf()
3086+        d.addCallback(lambda ignored:
3087+            # TODO: Find out what the block size is and corrupt a
3088+            # specific block, rather than just guessing.
3089+            self._test_corrupt_all(("share_data", 12 * 40),
3090+                                    "block hash tree failure",
3091+                                    corrupt_early=True,
3092+                                    should_succeed=False))
3093+        return d
3094+
3095+
3096+    def test_corrupt_some_mdmf(self):
3097+        return self._test_corrupt_some(("share_data", 12 * 40),
3098+                                       mdmf=True)
3099+
3100+
3101 class CheckerMixin:
3102     def check_good(self, r, where):
3103         self.failUnless(r.is_healthy(), where)
3104hunk ./src/allmydata/test/test_mutable.py 2116
3105             d.addCallback(lambda res:
3106                           self.shouldFail(NotEnoughSharesError,
3107                                           "test_retrieve_surprise",
3108-                                          "ran out of peers: have 0 shares (k=3)",
3109+                                          "ran out of peers: have 0 of 1",
3110                                           n.download_version,
3111                                           self.old_map,
3112                                           self.old_map.best_recoverable_version(),
3113hunk ./src/allmydata/test/test_mutable.py 2125
3114         d.addCallback(_created)
3115         return d
3116 
3117+
3118     def test_unexpected_shares(self):
3119         # upload the file, take a servermap, shut down one of the servers,
3120         # upload it again (causing shares to appear on a new server), then
3121hunk ./src/allmydata/test/test_mutable.py 2329
3122         self.basedir = "mutable/Problems/test_privkey_query_missing"
3123         self.set_up_grid(num_servers=20)
3124         nm = self.g.clients[0].nodemaker
3125-        LARGE = "These are Larger contents" * 2000 # about 50KB
3126+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3127         nm._node_cache = DevNullDictionary() # disable the nodecache
3128 
3129         d = nm.create_mutable_file(LARGE)
3130hunk ./src/allmydata/test/test_mutable.py 2342
3131         d.addCallback(_created)
3132         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3133         return d
3134+
3135+
3136+    def test_block_and_hash_query_error(self):
3137+        # This tests for what happens when a query to a remote server
3138+        # fails in either the hash validation step or the block getting
3139+        # step (because of batching, this is the same actual query).
3140+        # We need to have the storage server persist up until the point
3141+        # that its prefix is validated, then suddenly die. This
3142+        # exercises some exception handling code in Retrieve.
3143+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3144+        self.set_up_grid(num_servers=20)
3145+        nm = self.g.clients[0].nodemaker
3146+        CONTENTS = "contents" * 2000
3147+        d = nm.create_mutable_file(CONTENTS)
3148+        def _created(node):
3149+            self._node = node
3150+        d.addCallback(_created)
3151+        d.addCallback(lambda ignored:
3152+            self._node.get_servermap(MODE_READ))
3153+        def _then(servermap):
3154+            # we have our servermap. Now we set up the servers like the
3155+            # tests above -- the first one that gets a read call should
3156+            # start throwing errors, but only after returning its prefix
3157+            # for validation. Since we'll download without fetching the
3158+            # private key, the next query to the remote server will be
3159+            # for either a block and salt or for hashes, either of which
3160+            # will exercise the error handling code.
3161+            killer = FirstServerGetsKilled()
3162+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3163+                ss.post_call_notifier = killer.notify
3164+            ver = servermap.best_recoverable_version()
3165+            assert ver
3166+            return self._node.download_version(servermap, ver)
3167+        d.addCallback(_then)
3168+        d.addCallback(lambda data:
3169+            self.failUnlessEqual(data, CONTENTS))
3170+        return d
3171}
3172[mutable/checker.py: check MDMF files
3173Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3174 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3175 
3176 This patch adapts the mutable file checker and verifier to check and
3177 verify MDMF files. It does this by using the new segmented downloader,
3178 which is trained to perform verification operations on request. This
3179 removes some code duplication.
3180] {
3181hunk ./src/allmydata/mutable/checker.py 12
3182 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3183 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3184 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3185+from allmydata.mutable.retrieve import Retrieve # for verifying
3186 
3187 class MutableChecker:
3188 
3189hunk ./src/allmydata/mutable/checker.py 29
3190 
3191     def check(self, verify=False, add_lease=False):
3192         servermap = ServerMap()
3193+        # Updating the servermap in MODE_CHECK will stand a good chance
3194+        # of finding all of the shares, and getting a good idea of
3195+        # recoverability, etc, without verifying.
3196         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3197                              servermap, MODE_CHECK, add_lease=add_lease)
3198         if self._history:
3199hunk ./src/allmydata/mutable/checker.py 55
3200         if num_recoverable:
3201             self.best_version = servermap.best_recoverable_version()
3202 
3203+        # The file is unhealthy and needs to be repaired if:
3204+        # - There are unrecoverable versions.
3205         if servermap.unrecoverable_versions():
3206             self.need_repair = True
3207hunk ./src/allmydata/mutable/checker.py 59
3208+        # - There isn't a recoverable version.
3209         if num_recoverable != 1:
3210             self.need_repair = True
3211hunk ./src/allmydata/mutable/checker.py 62
3212+        # - The best recoverable version is missing some shares.
3213         if self.best_version:
3214             available_shares = servermap.shares_available()
3215             (num_distinct_shares, k, N) = available_shares[self.best_version]
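
Folded into a single predicate, the three conditions annotated above look roughly like this (attribute names are illustrative, and the last test assumes "missing some shares" means fewer distinct shares than N):

    def needs_repair(unrecoverable_versions, num_recoverable,
                     num_distinct_shares, N):
        if unrecoverable_versions:    # something on the grid we cannot read
            return True
        if num_recoverable != 1:      # zero recoverable versions, or several
            return True
        if num_distinct_shares < N:   # best version is short of full health
            return True
        return False
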
3216hunk ./src/allmydata/mutable/checker.py 73
3217 
3218     def _verify_all_shares(self, servermap):
3219         # read every byte of each share
3220+        #
3221+        # This logic is going to be very nearly the same as the
3222+        # downloader. I bet we could pass the downloader a flag that
3223+        # makes it do this, and piggyback onto that instead of
3224+        # duplicating a bunch of code.
3225+        #
3226+        # Like:
3227+        #  r = Retrieve(blah, blah, blah, verify=True)
3228+        #  d = r.download()
3229+        #  (wait, wait, wait, d.callback)
3230+        # 
3231+        #  Then, when it has finished, we can check the servermap (which
3232+        #  we provided to Retrieve) to figure out which shares are bad,
3233+        #  since the Retrieve process will have updated the servermap as
3234+        #  it went along.
3235+        #
3236+        #  By passing the verify=True flag to the constructor, we are
3237+        #  telling the downloader a few things.
3238+        #
3239+        #  1. It needs to download all N shares, not just K shares.
3240+        #  2. It doesn't need to decrypt or decode the shares, only
3241+        #     verify them.
3242         if not self.best_version:
3243             return
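
Written out as the call it becomes in the hunk below, the plan in that comment looks like this; verify=True asks the downloader for all N shares and for checking only, with no decrypt or decode step:

    from allmydata.mutable.retrieve import Retrieve

    def verify_best_version(node, servermap, best_version, process_bad_shares):
        # Reuse the segmented downloader in verify mode and collect the
        # bad shares it reports when it finishes.
        r = Retrieve(node, servermap, best_version, verify=True)
        d = r.download()
        d.addCallback(process_bad_shares)
        return d
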
3244hunk ./src/allmydata/mutable/checker.py 97
3245-        versionmap = servermap.make_versionmap()
3246-        shares = versionmap[self.best_version]
3247-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3248-         offsets_tuple) = self.best_version
3249-        offsets = dict(offsets_tuple)
3250-        readv = [ (0, offsets["EOF"]) ]
3251-        dl = []
3252-        for (shnum, peerid, timestamp) in shares:
3253-            ss = servermap.connections[peerid]
3254-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3255-            d.addCallback(self._got_answer, peerid, servermap)
3256-            dl.append(d)
3257-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3258 
3259hunk ./src/allmydata/mutable/checker.py 98
3260-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3261-        # isolate the callRemote to a separate method, so tests can subclass
3262-        # Publish and override it
3263-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3264+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3265+        d = r.download()
3266+        d.addCallback(self._process_bad_shares)
3267         return d
3268 
3269hunk ./src/allmydata/mutable/checker.py 103
3270-    def _got_answer(self, datavs, peerid, servermap):
3271-        for shnum,datav in datavs.items():
3272-            data = datav[0]
3273-            try:
3274-                self._got_results_one_share(shnum, peerid, data)
3275-            except CorruptShareError:
3276-                f = failure.Failure()
3277-                self.need_repair = True
3278-                self.bad_shares.append( (peerid, shnum, f) )
3279-                prefix = data[:SIGNED_PREFIX_LENGTH]
3280-                servermap.mark_bad_share(peerid, shnum, prefix)
3281-                ss = servermap.connections[peerid]
3282-                self.notify_server_corruption(ss, shnum, str(f.value))
3283-
3284-    def check_prefix(self, peerid, shnum, data):
3285-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3286-         offsets_tuple) = self.best_version
3287-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3288-        if got_prefix != prefix:
3289-            raise CorruptShareError(peerid, shnum,
3290-                                    "prefix mismatch: share changed while we were reading it")
3291-
3292-    def _got_results_one_share(self, shnum, peerid, data):
3293-        self.check_prefix(peerid, shnum, data)
3294-
3295-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3296-        # which checks their signature against the pubkey known to be
3297-        # associated with this file.
3298 
3299hunk ./src/allmydata/mutable/checker.py 104
3300-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3301-         share_hash_chain, block_hash_tree, share_data,
3302-         enc_privkey) = unpack_share(data)
3303-
3304-        # validate [share_hash_chain,block_hash_tree,share_data]
3305-
3306-        leaves = [hashutil.block_hash(share_data)]
3307-        t = hashtree.HashTree(leaves)
3308-        if list(t) != block_hash_tree:
3309-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3310-        share_hash_leaf = t[0]
3311-        t2 = hashtree.IncompleteHashTree(N)
3312-        # root_hash was checked by the signature
3313-        t2.set_hashes({0: root_hash})
3314-        try:
3315-            t2.set_hashes(hashes=share_hash_chain,
3316-                          leaves={shnum: share_hash_leaf})
3317-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3318-                IndexError), e:
3319-            msg = "corrupt hashes: %s" % (e,)
3320-            raise CorruptShareError(peerid, shnum, msg)
3321-
3322-        # validate enc_privkey: only possible if we have a write-cap
3323-        if not self._node.is_readonly():
3324-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3325-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3326-            if alleged_writekey != self._node.get_writekey():
3327-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3328+    def _process_bad_shares(self, bad_shares):
3329+        if bad_shares:
3330+            self.need_repair = True
3331+        self.bad_shares = bad_shares
3332 
3333hunk ./src/allmydata/mutable/checker.py 109
3334-    def notify_server_corruption(self, ss, shnum, reason):
3335-        ss.callRemoteOnly("advise_corrupt_share",
3336-                          "mutable", self._storage_index, shnum, reason)
3337 
3338     def _count_shares(self, smap, version):
3339         available_shares = smap.shares_available()
3340hunk ./src/allmydata/test/test_mutable.py 193
3341                 if offset1 == "pubkey" and IV:
3342                     real_offset = 107
3343                 elif offset1 == "share_data" and not IV:
3344-                    real_offset = 104
3345+                    real_offset = 107
3346                 elif offset1 in o:
3347                     real_offset = o[offset1]
3348                 else:
3349hunk ./src/allmydata/test/test_mutable.py 395
3350             return d
3351         d.addCallback(_created)
3352         return d
3353+    test_create_mdmf_with_initial_contents.timeout = 20
3354 
3355 
3356     def test_create_with_initial_contents_function(self):
3357hunk ./src/allmydata/test/test_mutable.py 700
3358                                            k, N, segsize, datalen)
3359                 self.failUnless(p._pubkey.verify(sig_material, signature))
3360                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3361-                self.failUnless(isinstance(share_hash_chain, dict))
3362-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3363+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3364                 for shnum,share_hash in share_hash_chain.items():
3365                     self.failUnless(isinstance(shnum, int))
3366                     self.failUnless(isinstance(share_hash, str))
3367hunk ./src/allmydata/test/test_mutable.py 820
3368                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3369 
3370 
3371+
3372+
3373 class Servermap(unittest.TestCase, PublishMixin):
3374     def setUp(self):
3375         return self.publish_one()
3376hunk ./src/allmydata/test/test_mutable.py 951
3377         self._storage._peers = {} # delete all shares
3378         ms = self.make_servermap
3379         d = defer.succeed(None)
3380-
3381+#
3382         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3383         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3384 
3385hunk ./src/allmydata/test/test_mutable.py 1440
3386         d.addCallback(self.check_good, "test_check_good")
3387         return d
3388 
3389+    def test_check_mdmf_good(self):
3390+        d = self.publish_mdmf()
3391+        d.addCallback(lambda ignored:
3392+            self._fn.check(Monitor()))
3393+        d.addCallback(self.check_good, "test_check_mdmf_good")
3394+        return d
3395+
3396     def test_check_no_shares(self):
3397         for shares in self._storage._peers.values():
3398             shares.clear()
3399hunk ./src/allmydata/test/test_mutable.py 1454
3400         d.addCallback(self.check_bad, "test_check_no_shares")
3401         return d
3402 
3403+    def test_check_mdmf_no_shares(self):
3404+        d = self.publish_mdmf()
3405+        def _then(ignored):
3406+            for share in self._storage._peers.values():
3407+                share.clear()
3408+        d.addCallback(_then)
3409+        d.addCallback(lambda ignored:
3410+            self._fn.check(Monitor()))
3411+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3412+        return d
3413+
3414     def test_check_not_enough_shares(self):
3415         for shares in self._storage._peers.values():
3416             for shnum in shares.keys():
3417hunk ./src/allmydata/test/test_mutable.py 1474
3418         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3419         return d
3420 
3421+    def test_check_mdmf_not_enough_shares(self):
3422+        d = self.publish_mdmf()
3423+        def _then(ignored):
3424+            for shares in self._storage._peers.values():
3425+                for shnum in shares.keys():
3426+                    if shnum > 0:
3427+                        del shares[shnum]
3428+        d.addCallback(_then)
3429+        d.addCallback(lambda ignored:
3430+            self._fn.check(Monitor()))
3431+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
3432+        return d
3433+
3434+
3435     def test_check_all_bad_sig(self):
3436         d = corrupt(None, self._storage, 1) # bad sig
3437         d.addCallback(lambda ignored:
3438hunk ./src/allmydata/test/test_mutable.py 1495
3439         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3440         return d
3441 
3442+    def test_check_mdmf_all_bad_sig(self):
3443+        d = self.publish_mdmf()
3444+        d.addCallback(lambda ignored:
3445+            corrupt(None, self._storage, 1))
3446+        d.addCallback(lambda ignored:
3447+            self._fn.check(Monitor()))
3448+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3449+        return d
3450+
3451     def test_check_all_bad_blocks(self):
3452         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3453         # the Checker won't notice this.. it doesn't look at actual data
3454hunk ./src/allmydata/test/test_mutable.py 1512
3455         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3456         return d
3457 
3458+
3459+    def test_check_mdmf_all_bad_blocks(self):
3460+        d = self.publish_mdmf()
3461+        d.addCallback(lambda ignored:
3462+            corrupt(None, self._storage, "share_data"))
3463+        d.addCallback(lambda ignored:
3464+            self._fn.check(Monitor()))
3465+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3466+        return d
3467+
3468     def test_verify_good(self):
3469         d = self._fn.check(Monitor(), verify=True)
3470         d.addCallback(self.check_good, "test_verify_good")
3471hunk ./src/allmydata/test/test_mutable.py 1582
3472                       "test_verify_one_bad_encprivkey_uncheckable")
3473         return d
3474 
3475+
3476+    def test_verify_mdmf_good(self):
3477+        d = self.publish_mdmf()
3478+        d.addCallback(lambda ignored:
3479+            self._fn.check(Monitor(), verify=True))
3480+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3481+        return d
3482+
3483+
3484+    def test_verify_mdmf_one_bad_block(self):
3485+        d = self.publish_mdmf()
3486+        d.addCallback(lambda ignored:
3487+            corrupt(None, self._storage, "share_data", [1]))
3488+        d.addCallback(lambda ignored:
3489+            self._fn.check(Monitor(), verify=True))
3490+        # We should find one bad block here
3491+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3492+        d.addCallback(self.check_expected_failure,
3493+                      CorruptShareError, "block hash tree failure",
3494+                      "test_verify_mdmf_one_bad_block")
3495+        return d
3496+
3497+
3498+    def test_verify_mdmf_bad_encprivkey(self):
3499+        d = self.publish_mdmf()
3500+        d.addCallback(lambda ignored:
3501+            corrupt(None, self._storage, "enc_privkey", [1]))
3502+        d.addCallback(lambda ignored:
3503+            self._fn.check(Monitor(), verify=True))
3504+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3505+        d.addCallback(self.check_expected_failure,
3506+                      CorruptShareError, "privkey",
3507+                      "test_verify_mdmf_bad_encprivkey")
3508+        return d
3509+
3510+
3511+    def test_verify_mdmf_bad_sig(self):
3512+        d = self.publish_mdmf()
3513+        d.addCallback(lambda ignored:
3514+            corrupt(None, self._storage, 1, [1]))
3515+        d.addCallback(lambda ignored:
3516+            self._fn.check(Monitor(), verify=True))
3517+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3518+        return d
3519+
3520+
3521+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3522+        d = self.publish_mdmf()
3523+        d.addCallback(lambda ignored:
3524+            corrupt(None, self._storage, "enc_privkey", [1]))
3525+        d.addCallback(lambda ignored:
3526+            self._fn.get_readonly())
3527+        d.addCallback(lambda fn:
3528+            fn.check(Monitor(), verify=True))
3529+        d.addCallback(self.check_good,
3530+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3531+        return d
3532+
3533+
3534 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3535 
3536     def get_shares(self, s):
3537hunk ./src/allmydata/test/test_mutable.py 1706
3538         current_shares = self.old_shares[-1]
3539         self.failUnlessEqual(old_shares, current_shares)
3540 
3541+
3542     def test_unrepairable_0shares(self):
3543         d = self.publish_one()
3544         def _delete_all_shares(ign):
3545hunk ./src/allmydata/test/test_mutable.py 1721
3546         d.addCallback(_check)
3547         return d
3548 
3549+    def test_mdmf_unrepairable_0shares(self):
3550+        d = self.publish_mdmf()
3551+        def _delete_all_shares(ign):
3552+            shares = self._storage._peers
3553+            for peerid in shares:
3554+                shares[peerid] = {}
3555+        d.addCallback(_delete_all_shares)
3556+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3557+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3558+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3559+        return d
3560+
3561+
3562     def test_unrepairable_1share(self):
3563         d = self.publish_one()
3564         def _delete_all_shares(ign):
3565hunk ./src/allmydata/test/test_mutable.py 1750
3566         d.addCallback(_check)
3567         return d
3568 
3569+    def test_mdmf_unrepairable_1share(self):
3570+        d = self.publish_mdmf()
3571+        def _delete_all_shares(ign):
3572+            shares = self._storage._peers
3573+            for peerid in shares:
3574+                for shnum in list(shares[peerid]):
3575+                    if shnum > 0:
3576+                        del shares[peerid][shnum]
3577+        d.addCallback(_delete_all_shares)
3578+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3579+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3580+        def _check(crr):
3581+            self.failUnlessEqual(crr.get_successful(), False)
3582+        d.addCallback(_check)
3583+        return d
3584+
3585+    def test_repairable_5shares(self):
3586+        d = self.publish_one()
3587+        def _delete_all_shares(ign):
3588+            shares = self._storage._peers
3589+            for peerid in shares:
3590+                for shnum in list(shares[peerid]):
3591+                    if shnum > 4:
3592+                        del shares[peerid][shnum]
3593+        d.addCallback(_delete_all_shares)
3594+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3595+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3596+        def _check(crr):
3597+            self.failUnlessEqual(crr.get_successful(), True)
3598+        d.addCallback(_check)
3599+        return d
3600+
3601+    def test_mdmf_repairable_5shares(self):
3602+        d = self.publish_mdmf()
3603+        def _delete_all_shares(ign):
3604+            shares = self._storage._peers
3605+            for peerid in shares:
3606+                for shnum in list(shares[peerid]):
3607+                    if shnum > 5:
3608+                        del shares[peerid][shnum]
3609+        d.addCallback(_delete_all_shares)
3610+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3611+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3612+        def _check(crr):
3613+            self.failUnlessEqual(crr.get_successful(), True)
3614+        d.addCallback(_check)
3615+        return d
3616+
3617+
3618     def test_merge(self):
3619         self.old_shares = []
3620         d = self.publish_multiple()
3621}
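
The new repair tests above all follow the same publish / delete-shares / check / repair
shape. As a sketch only -- this helper is hypothetical and not part of the patch, and it
assumes the PublishMixin attributes used above -- the repeated pattern could be collapsed
into one method:

    def _check_repair_after_deletion(self, publish, highest_kept_shnum,
                                     expect_success):
        # publish is self.publish_one or self.publish_mdmf; every shnum
        # greater than highest_kept_shnum is deleted from every peer
        d = publish()
        def _delete_shares(ign):
            shares = self._storage._peers
            for peerid in shares:
                for shnum in list(shares[peerid]):
                    if shnum > highest_kept_shnum:
                        del shares[peerid][shnum]
        d.addCallback(_delete_shares)
        d.addCallback(lambda ign: self._fn.check(Monitor()))
        d.addCallback(lambda check_results: self._fn.repair(check_results))
        d.addCallback(lambda crr:
            self.failUnlessEqual(crr.get_successful(), expect_success))
        return d

test_mdmf_unrepairable_1share, for example, would then reduce to
"return self._check_repair_after_deletion(self.publish_mdmf, 0, False)".
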
3622[mutable/retrieve.py: learn how to verify mutable files
3623Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3624 Ignore-this: 989af7800c47589620918461ec989483
3625] {
3626hunk ./src/allmydata/mutable/retrieve.py 86
3627     # Retrieve object will remain tied to a specific version of the file, and
3628     # will use a single ServerMap instance.
3629 
3630-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3631+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3632+                 verify=False):
3633         self._node = filenode
3634         assert self._node.get_pubkey()
3635         self._storage_index = filenode.get_storage_index()
3636hunk ./src/allmydata/mutable/retrieve.py 106
3637         # during repair, we may be called upon to grab the private key, since
3638         # it wasn't picked up during a verify=False checker run, and we'll
3639         # need it for repair to generate a new version.
3640-        self._need_privkey = fetch_privkey
3641-        if self._node.get_privkey():
3642+        self._need_privkey = fetch_privkey or verify
3643+        if self._node.get_privkey() and not verify:
3644             self._need_privkey = False
3645 
3646         if self._need_privkey:
3647hunk ./src/allmydata/mutable/retrieve.py 117
3648             self._privkey_query_markers = [] # one Marker for each time we've
3649                                              # tried to get the privkey.
3650 
3651+        # verify means that we are using the downloader logic to verify all
3652+        # of our shares. This tells the downloader a few things.
3653+        #
3654+        # 1. We need to download all of the shares.
3655+        # 2. We don't need to decode or decrypt the shares, since our
3656+        #    caller doesn't care about the plaintext, only the
3657+        #    information about which shares are or are not valid.
3658+        # 3. When we are validating readers, we need to validate the
3659+        #    signature on the prefix. Do we? We already do this in the
3660+        #    servermap update?
3661+        #
3662+        # (just work on 1 and 2 for now, I guess)
3663+        self._verify = False
3664+        if verify:
3665+            self._verify = True
3666+
3667         self._status = RetrieveStatus()
3668         self._status.set_storage_index(self._storage_index)
3669         self._status.set_helper(False)
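
To make the new verify flag concrete, here is a minimal sketch of how a verifier is
expected to drive this class (hypothetical driver code, not part of the patch; it assumes
a filenode, servermap, and verinfo are already in hand, and that download() fires with the
bad-share list described in _done and _failed below):

    r = Retrieve(filenode, servermap, verinfo, verify=True)
    d = r.download()
    def _report(bad_shares):
        # with verify=True, the Deferred fires with a list of
        # (peerid, shnum, failure) tuples instead of plaintext
        for (peerid, shnum, f) in bad_shares:
            print "share %d failed verification: %s" % (shnum, f)
    d.addCallback(_report)
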
3670hunk ./src/allmydata/mutable/retrieve.py 323
3671 
3672         # We need at least self._required_shares readers to download a
3673         # segment.
3674-        needed = self._required_shares - len(self._active_readers)
3675+        if self._verify:
3676+            needed = self._total_shares
3677+        else:
3678+            needed = self._required_shares - len(self._active_readers)
3679         # XXX: Why don't format= log messages work here?
3680         self.log("adding %d peers to the active peers list" % needed)
3681 
3682hunk ./src/allmydata/mutable/retrieve.py 339
3683         # will cause problems later.
3684         active_shnums -= set([reader.shnum for reader in self._active_readers])
3685         active_shnums = list(active_shnums)[:needed]
3686-        if len(active_shnums) < needed:
3687+        if len(active_shnums) < needed and not self._verify:
3688             # We don't have enough readers to retrieve the file; fail.
3689             return self._failed()
3690 
3691hunk ./src/allmydata/mutable/retrieve.py 346
3692         for shnum in active_shnums:
3693             self._active_readers.append(self.readers[shnum])
3694             self.log("added reader for share %d" % shnum)
3695-        assert len(self._active_readers) == self._required_shares
3696+        assert len(self._active_readers) >= self._required_shares
3697         # Conceptually, this is part of the _add_active_peers step. It
3698         # validates the prefixes of newly added readers to make sure
3699         # that they match what we are expecting for self.verinfo. If
3700hunk ./src/allmydata/mutable/retrieve.py 416
3701                     # that we haven't gotten it at the end of
3702                     # segment decoding, then we'll take more drastic
3703                     # measures.
3704-                    if self._need_privkey:
3705+                    if self._need_privkey and not self._node.is_readonly():
3706                         d = reader.get_encprivkey()
3707                         d.addCallback(self._try_to_validate_privkey, reader)
3708             if bad_readers:
3709hunk ./src/allmydata/mutable/retrieve.py 423
3710                 # We do them all at once, or else we screw up list indexing.
3711                 for (reader, f) in bad_readers:
3712                     self._mark_bad_share(reader, f)
3713-                return self._add_active_peers()
3714+                if self._verify:
3715+                    if len(self._active_readers) >= self._required_shares:
3716+                        return self._download_current_segment()
3717+                    else:
3718+                        return self._failed()
3719+                else:
3720+                    return self._add_active_peers()
3721             else:
3722                 return self._download_current_segment()
3723             # The next step will assert that it has enough active
3724hunk ./src/allmydata/mutable/retrieve.py 518
3725         """
3726         self.log("marking share %d on server %s as bad" % \
3727                  (reader.shnum, reader))
3728+        prefix = self.verinfo[-2]
3729+        self.servermap.mark_bad_share(reader.peerid,
3730+                                      reader.shnum,
3731+                                      prefix)
3732         self._remove_reader(reader)
3733hunk ./src/allmydata/mutable/retrieve.py 523
3734-        self._bad_shares.add((reader.peerid, reader.shnum))
3735+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3736         self._status.problems[reader.peerid] = f
3737         self._last_failure = f
3738         self.notify_server_corruption(reader.peerid, reader.shnum,
3739hunk ./src/allmydata/mutable/retrieve.py 571
3740             ds.append(dl)
3741             reader.flush()
3742         dl = defer.DeferredList(ds)
3743-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3744+        if self._verify:
3745+            dl.addCallback(lambda ignored: "")
3746+            dl.addCallback(self._set_segment)
3747+        else:
3748+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3749         return dl
3750 
3751 
3752hunk ./src/allmydata/mutable/retrieve.py 701
3753         # shnum, which will be a leaf in the share hash tree, which
3754         # will allow us to validate the rest of the tree.
3755         if self.share_hash_tree.needed_hashes(reader.shnum,
3756-                                               include_leaf=True):
3757+                                              include_leaf=True) or \
3758+                                              self._verify:
3759             try:
3760                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3761                                             leaves={reader.shnum: bht[0]})
3762hunk ./src/allmydata/mutable/retrieve.py 832
3763 
3764 
3765     def _try_to_validate_privkey(self, enc_privkey, reader):
3766-
3767         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3768         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3769         if alleged_writekey != self._node.get_writekey():
3770hunk ./src/allmydata/mutable/retrieve.py 838
3771             self.log("invalid privkey from %s shnum %d" %
3772                      (reader, reader.shnum),
3773                      level=log.WEIRD, umid="YIw4tA")
3774+            if self._verify:
3775+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3776+                                              self.verinfo[-2])
3777+                e = CorruptShareError(reader.peerid,
3778+                                      reader.shnum,
3779+                                      "invalid privkey")
3780+                f = failure.Failure(e)
3781+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3782             return
3783 
3784         # it's good
3785hunk ./src/allmydata/mutable/retrieve.py 904
3786         statements, I return the decrypted contents to the owner of this
3787         Retrieve object through self._done_deferred.
3788         """
3789-        eventually(self._done_deferred.callback, self._plaintext)
3790+        if self._verify:
3791+            ret = list(self._bad_shares)
3792+            self.log("done verifying, found %d bad shares" % len(ret))
3793+        else:
3794+            ret = self._plaintext
3795+        eventually(self._done_deferred.callback, ret)
3796 
3797 
3798     def _failed(self):
3799hunk ./src/allmydata/mutable/retrieve.py 920
3800         to the caller of this Retrieve object through
3801         self._done_deferred.
3802         """
3803-        format = ("ran out of peers: "
3804-                  "have %(have)d of %(total)d segments "
3805-                  "found %(bad)d bad shares "
3806-                  "encoding %(k)d-of-%(n)d")
3807-        args = {"have": self._current_segment,
3808-                "total": self._num_segments,
3809-                "k": self._required_shares,
3810-                "n": self._total_shares,
3811-                "bad": len(self._bad_shares)}
3812-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3813-                                                        str(self._last_failure)))
3814-        f = failure.Failure(e)
3815-        eventually(self._done_deferred.callback, f)
3816+        if self._verify:
3817+            ret = list(self._bad_shares)
3818+        else:
3819+            format = ("ran out of peers: "
3820+                      "have %(have)d of %(total)d segments "
3821+                      "found %(bad)d bad shares "
3822+                      "encoding %(k)d-of-%(n)d")
3823+            args = {"have": self._current_segment,
3824+                    "total": self._num_segments,
3825+                    "k": self._required_shares,
3826+                    "n": self._total_shares,
3827+                    "bad": len(self._bad_shares)}
3828+            e = NotEnoughSharesError("%s, last failure: %s" % \
3829+                                     (format % args, str(self._last_failure)))
3830+            f = failure.Failure(e)
3831+            ret = f
3832+        eventually(self._done_deferred.callback, ret)
3833}
3834[interfaces.py: add IMutableSlotWriter
3835Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3836 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3837] hunk ./src/allmydata/interfaces.py 418
3838         """
3839 
3840 
3841+class IMutableSlotWriter(Interface):
3842+    """
3843+    The interface for a writer around a mutable slot on a remote server.
3844+    """
3845+    def set_checkstring(checkstring, *args):
3846+        """
3847+        Set the checkstring that I will pass to the remote server when
3848+        writing.
3849+
3850+            @param checkstring: A packed checkstring to use.
3851+
3852+        Note that implementations may differ in the semantics they
3853+        support for set_checkstring -- some may accept a packed
3854+        checkstring directly, while others may build the checkstring
3855+        from its constituent parts.
3856+        """
3857+
3858+    def get_checkstring():
3859+        """
3860+        Get the checkstring that I think currently exists on the remote
3861+        server.
3862+        """
3863+
3864+    def put_block(data, segnum, salt):
3865+        """
3866+        Add a block and salt to the share.
3867+        """
3868+
3869+    def put_encprivkey(encprivkey):
3870+        """
3871+        Add the encrypted private key to the share.
3872+        """
3873+
3874+    def put_blockhashes(blockhashes=list):
3875+        """
3876+        Add the block hash tree to the share.
3877+        """
3878+
3879+    def put_sharehashes(sharehashes=dict):
3880+        """
3881+        Add the share hash chain to the share.
3882+        """
3883+
3884+    def get_signable():
3885+        """
3886+        Return the part of the share that needs to be signed.
3887+        """
3888+
3889+    def put_signature(signature):
3890+        """
3891+        Add the signature to the share.
3892+        """
3893+
3894+    def put_verification_key(verification_key):
3895+        """
3896+        Add the verification key to the share.
3897+        """
3898+
3899+    def finish_publishing():
3900+        """
3901+        Do anything necessary to finish writing the share to a remote
3902+        server. Once this method has been called, no further publishing
3903+        should be needed for this share.
3904+        """
3905+
3906+
3907 class IURI(Interface):
3908     def init_from_string(uri):
3909         """Accept a string (as created by my to_string() method) and populate
3910[mutable/publish.py: cleanup + simplification
3911Kevan Carstensen <kevan@isnotajoke.com>**20100701232656
3912 Ignore-this: 78ab057bb8dc17632f29c6a841a75c59
3913] {
3914hunk ./src/allmydata/mutable/publish.py 19
3915      UncoordinatedWriteError, NotEnoughServersError
3916 from allmydata.mutable.servermap import ServerMap
3917 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
3918-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
3919+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
3920+     SDMFSlotWriteProxy
3921 
3922 KiB = 1024
3923 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
3924hunk ./src/allmydata/mutable/publish.py 24
3925+PUSHING_BLOCKS_STATE = 0
3926+PUSHING_EVERYTHING_ELSE_STATE = 1
3927+DONE_STATE = 2
3928 
3929 class PublishStatus:
3930     implements(IPublishStatus)
3931hunk ./src/allmydata/mutable/publish.py 242
3932             self.bad_share_checkstrings[key] = old_checkstring
3933             self.connections[peerid] = self._servermap.connections[peerid]
3934 
3935-        # Now, the process dovetails -- if this is an SDMF file, we need
3936-        # to write an SDMF file. Otherwise, we need to write an MDMF
3937-        # file.
3938-        if self._version == MDMF_VERSION:
3939-            return self._publish_mdmf()
3940-        else:
3941-            return self._publish_sdmf()
3942-        #return self.done_deferred
3943-
3944-    def _publish_mdmf(self):
3945-        # Next, we find homes for all of the shares that we don't have
3946-        # homes for yet.
3947         # TODO: Make this part do peer selection.
3948         self.update_goal()
3949         self.writers = {}
3950hunk ./src/allmydata/mutable/publish.py 245
3951-        # For each (peerid, shnum) in self.goal, we make an
3952-        # MDMFSlotWriteProxy for that peer. We'll use this to write
3953+        if self._version == MDMF_VERSION:
3954+            writer_class = MDMFSlotWriteProxy
3955+        else:
3956+            writer_class = SDMFSlotWriteProxy
3957+
3958+        # For each (peerid, shnum) in self.goal, we make a
3959+        # write proxy for that peer. We'll use this to write
3960         # shares to the peer.
3961         for key in self.goal:
3962             peerid, shnum = key
3963hunk ./src/allmydata/mutable/publish.py 260
3964             cancel_secret = self._node.get_cancel_secret(peerid)
3965             secrets = (write_enabler, renew_secret, cancel_secret)
3966 
3967-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
3968-                                                      self.connections[peerid],
3969-                                                      self._storage_index,
3970-                                                      secrets,
3971-                                                      self._new_seqnum,
3972-                                                      self.required_shares,
3973-                                                      self.total_shares,
3974-                                                      self.segment_size,
3975-                                                      len(self.newdata))
3976+            self.writers[shnum] =  writer_class(shnum,
3977+                                                self.connections[peerid],
3978+                                                self._storage_index,
3979+                                                secrets,
3980+                                                self._new_seqnum,
3981+                                                self.required_shares,
3982+                                                self.total_shares,
3983+                                                self.segment_size,
3984+                                                len(self.newdata))
3985+            self.writers[shnum].peerid = peerid
3986             if (peerid, shnum) in self._servermap.servermap:
3987                 old_versionid, old_timestamp = self._servermap.servermap[key]
3988                 (old_seqnum, old_root_hash, old_salt, old_segsize,
3989hunk ./src/allmydata/mutable/publish.py 275
3990                  old_datalength, old_k, old_N, old_prefix,
3991                  old_offsets_tuple) = old_versionid
3992-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
3993+                self.writers[shnum].set_checkstring(old_seqnum,
3994+                                                    old_root_hash,
3995+                                                    old_salt)
3996+            elif (peerid, shnum) in self.bad_share_checkstrings:
3997+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
3998+                self.writers[shnum].set_checkstring(old_checkstring)
3999+
4000+        # Our remote shares will not have a complete checkstring until
4001+        # after we are done writing share data and have started to write
4002+        # blocks. In the meantime, we need to know what to look for when
4003+        # writing, so that we can detect UncoordinatedWriteErrors.
4004+        self._checkstring = self.writers.values()[0].get_checkstring()
4005 
4006         # Now, we start pushing shares.
4007         self._status.timings["setup"] = time.time() - self._started
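
The checkstring being cached here is just the packed (seqnum, root hash, salt) prefix.
A quick illustration using the SDMF helpers this module already imports (the values are
made up; MDMF writers build the equivalent string themselves):

    checkstring = pack_checkstring(3, "\x11" * 32, "\x22" * 16)
    assert unpack_checkstring(checkstring) == (3, "\x11" * 32, "\x22" * 16)
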
4008hunk ./src/allmydata/mutable/publish.py 307
4009 
4010         d = defer.succeed(None)
4011         self.log("Starting push")
4012-        for i in xrange(self.num_segments - 1):
4013-            d.addCallback(lambda ignored, i=i:
4014-                self.push_segment(i))
4015-            d.addCallback(self._turn_barrier)
4016-        # We have at least one segment, so we will have a tail segment
4017-        if self.num_segments > 0:
4018-            d.addCallback(lambda ignored:
4019-                self.push_tail_segment())
4020-
4021-        d.addCallback(lambda ignored:
4022-            self.push_encprivkey())
4023-        d.addCallback(lambda ignored:
4024-            self.push_blockhashes())
4025-        d.addCallback(lambda ignored:
4026-            self.push_sharehashes())
4027-        d.addCallback(lambda ignored:
4028-            self.push_toplevel_hashes_and_signature())
4029-        d.addCallback(lambda ignored:
4030-            self.finish_publishing())
4031-        return d
4032-
4033-
4034-    def _publish_sdmf(self):
4035-        self._status.timings["setup"] = time.time() - self._started
4036-        self.salt = os.urandom(16)
4037 
4038hunk ./src/allmydata/mutable/publish.py 308
4039-        d = self._encrypt_and_encode()
4040-        d.addCallback(self._generate_shares)
4041-        def _start_pushing(res):
4042-            self._started_pushing = time.time()
4043-            return res
4044-        d.addCallback(_start_pushing)
4045-        d.addCallback(self.loop) # trigger delivery
4046-        d.addErrback(self._fatal_error)
4047+        self._state = PUSHING_BLOCKS_STATE
4048+        self._push()
4049 
4050         return self.done_deferred
4051 
4052hunk ./src/allmydata/mutable/publish.py 328
4053                                                   segment_size)
4054         else:
4055             self.num_segments = 0
4056+
4057+        self.log("building encoding parameters for file")
4058+        self.log("got segsize %d" % self.segment_size)
4059+        self.log("got %d segments" % self.num_segments)
4060+
4061         if self._version == SDMF_VERSION:
4062             assert self.num_segments in (0, 1) # SDMF
4063hunk ./src/allmydata/mutable/publish.py 335
4064-            return
4065         # calculate the tail segment size.
4066hunk ./src/allmydata/mutable/publish.py 336
4067-        self.tail_segment_size = len(self.newdata) % segment_size
4068 
4069hunk ./src/allmydata/mutable/publish.py 337
4070-        if self.tail_segment_size == 0:
4071+        if segment_size and self.newdata:
4072+            self.tail_segment_size = len(self.newdata) % segment_size
4073+        else:
4074+            self.tail_segment_size = 0
4075+
4076+        if self.tail_segment_size == 0 and segment_size:
4077             # The tail segment is the same size as the other segments.
4078             self.tail_segment_size = segment_size
4079 
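
A quick worked example of the segment arithmetic above, in plain Python (the 300 KiB file
size is chosen only for illustration):

    KiB = 1024
    segment_size = 128 * KiB
    datalength = 300 * KiB

    num_segments = (datalength + segment_size - 1) // segment_size   # 3
    tail_segment_size = datalength % segment_size                    # 44 KiB
    if tail_segment_size == 0 and segment_size:
        # the data divides evenly, so the tail is a full-sized segment
        tail_segment_size = segment_size
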
4080hunk ./src/allmydata/mutable/publish.py 346
4081-        # We'll make an encoder ahead-of-time for the normal-sized
4082-        # segments (defined as any segment of segment_size size.
4083-        # (the part of the code that puts the tail segment will make its
4084-        #  own encoder for that part)
4085+        # Make FEC encoders
4086         fec = codec.CRSEncoder()
4087         fec.set_params(self.segment_size,
4088                        self.required_shares, self.total_shares)
4089hunk ./src/allmydata/mutable/publish.py 353
4090         self.piece_size = fec.get_block_size()
4091         self.fec = fec
4092 
4093+        if self.tail_segment_size == self.segment_size:
4094+            self.tail_fec = self.fec
4095+        else:
4096+            tail_fec = codec.CRSEncoder()
4097+            tail_fec.set_params(self.tail_segment_size,
4098+                                self.required_shares,
4099+                                self.total_shares)
4100+            self.tail_fec = tail_fec
4101+
4102+        self._current_segment = 0
4103+
4104+
4105+    def _push(self, ignored=None):
4106+        """
4107+        I manage state transitions. In particular, I check that we
4108+        still have enough writers left to complete the upload
4109+        successfully.
4110+        """
4111+        # Can we still successfully publish this file?
4112+        # TODO: Keep track of outstanding queries before aborting the
4113+        #       process.
4114+        if len(self.writers) <= self.required_shares or self.surprised:
4115+            return self._failure()
4116+
4117+        # Figure out what we need to do next. Each of these needs to
4118+        # return a deferred so that we don't block execution when this
4119+        # is first called in the upload method.
4120+        if self._state == PUSHING_BLOCKS_STATE:
4121+            return self.push_segment(self._current_segment)
4122+
4123+        # XXX: Do we want more granularity in states? Is that useful at
4124+        #      all?
4125+        #      Yes -- quicker reaction to UCW.
4126+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
4127+            return self.push_everything_else()
4128+
4129+        # done
4130+        elif self._state == DONE_STATE:
4131+            # Depending on connection errors in the previous operation,
4132+            # we may or may not be successful -- _maybe_done tells us
4133+            # whether we are or aren't.
4134+            return self._done(None)
4135+
4136 
4137     def push_segment(self, segnum):
4138hunk ./src/allmydata/mutable/publish.py 398
4139+        if self.num_segments == 0 and self._version == SDMF_VERSION:
4140+            self._add_dummy_salts()
4141+
4142+        if segnum == self.num_segments:
4143+            # We don't have any more segments to push.
4144+            self._state = PUSHING_EVERYTHING_ELSE_STATE
4145+            return self._push()
4146+
4147+        d = self._encode_segment(segnum)
4148+        d.addCallback(self._push_segment, segnum)
4149+        def _increment_segnum(ign):
4150+            self._current_segment += 1
4151+        # XXX: I don't think we need to do addBoth here -- any errBacks
4152+        # should be handled within push_segment.
4153+        d.addBoth(_increment_segnum)
4154+        d.addBoth(self._push)
4155+
4156+
4157+    def _add_dummy_salts(self):
4158+        """
4159+        SDMF files need a salt even if they're empty, or the signature
4160+        won't make sense. This method adds a dummy salt to each of our
4161+        SDMF writers so that they can write the signature later.
4162+        """
4163+        salt = os.urandom(16)
4164+        assert self._version == SDMF_VERSION
4165+
4166+        for writer in self.writers.itervalues():
4167+            writer.put_salt(salt)
4168+
4169+
4170+    def _encode_segment(self, segnum):
4171+        """
4172+        I encrypt and encode the segment segnum.
4173+        """
4174         started = time.time()
4175hunk ./src/allmydata/mutable/publish.py 434
4176-        segsize = self.segment_size
4177+
4178+        if segnum + 1 == self.num_segments:
4179+            segsize = self.tail_segment_size
4180+        else:
4181+            segsize = self.segment_size
4182+
4183+
4184+        offset = self.segment_size * segnum
4185+        length = segsize + offset
4186         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
4187hunk ./src/allmydata/mutable/publish.py 444
4188-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
4189+        data = self.newdata[offset:length]
4190         assert len(data) == segsize
4191 
4192         salt = os.urandom(16)
4193hunk ./src/allmydata/mutable/publish.py 459
4194         started = now
4195 
4196         # now apply FEC
4197+        if segnum + 1 == self.num_segments:
4198+            fec = self.tail_fec
4199+        else:
4200+            fec = self.fec
4201 
4202         self._status.set_status("Encoding")
4203         crypttext_pieces = [None] * self.required_shares
4204hunk ./src/allmydata/mutable/publish.py 466
4205-        piece_size = self.piece_size
4206+        piece_size = fec.get_block_size()
4207         for i in range(len(crypttext_pieces)):
4208             offset = i * piece_size
4209             piece = crypttext[offset:offset+piece_size]
4210hunk ./src/allmydata/mutable/publish.py 473
4211             piece = piece + "\x00"*(piece_size - len(piece)) # padding
4212             crypttext_pieces[i] = piece
4213             assert len(piece) == piece_size
4214-        d = self.fec.encode(crypttext_pieces)
4215+        d = fec.encode(crypttext_pieces)
4216         def _done_encoding(res):
4217             elapsed = time.time() - started
4218             self._status.timings["encode"] = elapsed
4219hunk ./src/allmydata/mutable/publish.py 477
4220-            return res
4221+            return (res, salt)
4222         d.addCallback(_done_encoding)
4223hunk ./src/allmydata/mutable/publish.py 479
4224-
4225-        def _push_shares_and_salt(results):
4226-            shares, shareids = results
4227-            dl = []
4228-            for i in xrange(len(shares)):
4229-                sharedata = shares[i]
4230-                shareid = shareids[i]
4231-                block_hash = hashutil.block_hash(salt + sharedata)
4232-                self.blockhashes[shareid].append(block_hash)
4233-
4234-                # find the writer for this share
4235-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
4236-                dl.append(d)
4237-            # TODO: Naturally, we need to check on the results of these.
4238-            return defer.DeferredList(dl)
4239-        d.addCallback(_push_shares_and_salt)
4240         return d
4241 
4242 
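
The per-segment transform performed above, written as a standalone sketch (it reuses
pycryptopp's AES and allmydata's CRSEncoder, which this module already depends on; the key
below is random rather than the real ssk_readkey_data_hash(salt, readkey), and k and n are
made-up encoding parameters):

    import os
    from pycryptopp.cipher.aes import AES
    from allmydata import codec

    segment = "x" * 1000               # pretend plaintext segment
    salt = os.urandom(16)
    key = os.urandom(16)               # stand-in for the derived AES key
    crypttext = AES(key).process(segment)

    k, n = 3, 10                       # required_shares, total_shares
    fec = codec.CRSEncoder()
    fec.set_params(len(crypttext), k, n)
    piece_size = fec.get_block_size()
    pieces = [crypttext[i*piece_size:(i+1)*piece_size] for i in range(k)]
    pieces = [p + "\x00" * (piece_size - len(p)) for p in pieces]  # padding
    d = fec.encode(pieces)             # Deferred firing with (shares, shareids)
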
4243hunk ./src/allmydata/mutable/publish.py 482
4244-    def push_tail_segment(self):
4245-        # This is essentially the same as push_segment, except that we
4246-        # don't use the cached encoder that we use elsewhere.
4247-        self.log("Pushing tail segment")
4248+    def _push_segment(self, encoded_and_salt, segnum):
4249+        """
4250+        I push (data, salt) as segment number segnum.
4251+        """
4252+        results, salt = encoded_and_salt
4253+        shares, shareids = results
4254         started = time.time()
4255hunk ./src/allmydata/mutable/publish.py 489
4256-        segsize = self.segment_size
4257-        data = self.newdata[segsize * (self.num_segments-1):]
4258-        assert len(data) == self.tail_segment_size
4259-        salt = os.urandom(16)
4260-
4261-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
4262-        enc = AES(key)
4263-        crypttext = enc.process(data)
4264-        assert len(crypttext) == len(data)
4265+        dl = []
4266+        for i in xrange(len(shares)):
4267+            sharedata = shares[i]
4268+            shareid = shareids[i]
4269+            if self._version == MDMF_VERSION:
4270+                hashed = salt + sharedata
4271+            else:
4272+                hashed = sharedata
4273+            block_hash = hashutil.block_hash(hashed)
4274+            self.blockhashes[shareid].append(block_hash)
4275 
4276hunk ./src/allmydata/mutable/publish.py 500
4277-        now = time.time()
4278-        self._status.timings['encrypt'] = now - started
4279-        started = now
4280+            # find the writer for this share
4281+            writer = self.writers[shareid]
4282+            d = writer.put_block(sharedata, segnum, salt)
4283+            d.addCallback(self._got_write_answer, writer, started)
4284+            d.addErrback(self._connection_problem, writer)
4285+            dl.append(d)
4286+            # TODO: Naturally, we need to check on the results of these.
4287+        return defer.DeferredList(dl)
4288 
4289hunk ./src/allmydata/mutable/publish.py 509
4290-        self._status.set_status("Encoding")
4291-        tail_fec = codec.CRSEncoder()
4292-        tail_fec.set_params(self.tail_segment_size,
4293-                            self.required_shares,
4294-                            self.total_shares)
4295 
4296hunk ./src/allmydata/mutable/publish.py 510
4297-        crypttext_pieces = [None] * self.required_shares
4298-        piece_size = tail_fec.get_block_size()
4299-        for i in range(len(crypttext_pieces)):
4300-            offset = i * piece_size
4301-            piece = crypttext[offset:offset+piece_size]
4302-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
4303-            crypttext_pieces[i] = piece
4304-            assert len(piece) == piece_size
4305-        d = tail_fec.encode(crypttext_pieces)
4306-        def _push_shares_and_salt(results):
4307-            shares, shareids = results
4308-            dl = []
4309-            for i in xrange(len(shares)):
4310-                sharedata = shares[i]
4311-                shareid = shareids[i]
4312-                block_hash = hashutil.block_hash(salt + sharedata)
4313-                self.blockhashes[shareid].append(block_hash)
4314-                # find the writer for this share
4315-                d = self.writers[shareid].put_block(sharedata,
4316-                                                    self.num_segments - 1,
4317-                                                    salt)
4318-                dl.append(d)
4319-            # TODO: Naturally, we need to check on the results of these.
4320-            return defer.DeferredList(dl)
4321-        d.addCallback(_push_shares_and_salt)
4322+    def push_everything_else(self):
4323+        """
4324+        I put everything else associated with a share.
4325+        """
4326+        encprivkey = self._encprivkey
4327+        d = self.push_encprivkey()
4328+        d.addCallback(self.push_blockhashes)
4329+        d.addCallback(self.push_sharehashes)
4330+        d.addCallback(self.push_toplevel_hashes_and_signature)
4331+        d.addCallback(self.finish_publishing)
4332+        def _change_state(ignored):
4333+            self._state = DONE_STATE
4334+        d.addCallback(_change_state)
4335+        d.addCallback(self._push)
4336         return d
4337 
4338 
4339hunk ./src/allmydata/mutable/publish.py 531
4340         started = time.time()
4341         encprivkey = self._encprivkey
4342         dl = []
4343-        def _spy_on_writer(results):
4344-            print results
4345-            return results
4346-        for shnum, writer in self.writers.iteritems():
4347+        for writer in self.writers.itervalues():
4348             d = writer.put_encprivkey(encprivkey)
4349hunk ./src/allmydata/mutable/publish.py 533
4350+            d.addCallback(self._got_write_answer, writer, started)
4351+            d.addErrback(self._connection_problem, writer)
4352             dl.append(d)
4353         d = defer.DeferredList(dl)
4354         return d
4355hunk ./src/allmydata/mutable/publish.py 540
4356 
4357 
4358-    def push_blockhashes(self):
4359+    def push_blockhashes(self, ignored):
4360         started = time.time()
4361         dl = []
4362hunk ./src/allmydata/mutable/publish.py 543
4363-        def _spy_on_results(results):
4364-            print results
4365-            return results
4366         self.sharehash_leaves = [None] * len(self.blockhashes)
4367         for shnum, blockhashes in self.blockhashes.iteritems():
4368             t = hashtree.HashTree(blockhashes)
4369hunk ./src/allmydata/mutable/publish.py 549
4370             self.blockhashes[shnum] = list(t)
4371             # set the leaf for future use.
4372             self.sharehash_leaves[shnum] = t[0]
4373-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
4374+            writer = self.writers[shnum]
4375+            d = writer.put_blockhashes(self.blockhashes[shnum])
4376+            d.addCallback(self._got_write_answer, writer, started)
4377+            d.addErrback(self._connection_problem, self.writers[shnum])
4378             dl.append(d)
4379         d = defer.DeferredList(dl)
4380         return d
4381hunk ./src/allmydata/mutable/publish.py 558
4382 
4383 
4384-    def push_sharehashes(self):
4385+    def push_sharehashes(self, ignored):
4386+        started = time.time()
4387         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
4388         share_hash_chain = {}
4389         ds = []
4390hunk ./src/allmydata/mutable/publish.py 563
4391-        def _spy_on_results(results):
4392-            print results
4393-            return results
4394         for shnum in xrange(len(self.sharehash_leaves)):
4395             needed_indices = share_hash_tree.needed_hashes(shnum)
4396             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
4397hunk ./src/allmydata/mutable/publish.py 567
4398                                              for i in needed_indices] )
4399-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
4400+            writer = self.writers[shnum]
4401+            d = writer.put_sharehashes(self.sharehashes[shnum])
4402+            d.addCallback(self._got_write_answer, writer, started)
4403+            d.addErrback(self._connection_problem, writer)
4404             ds.append(d)
4405         self.root_hash = share_hash_tree[0]
4406         d = defer.DeferredList(ds)
4407hunk ./src/allmydata/mutable/publish.py 577
4408         return d
4409 
4410 
4411-    def push_toplevel_hashes_and_signature(self):
4412+    def push_toplevel_hashes_and_signature(self, ignored):
4413         # We need to do three things here:
4414         #   - Push the root hash and salt hash
4415         #   - Get the checkstring of the resulting layout; sign that.
4416hunk ./src/allmydata/mutable/publish.py 582
4417         #   - Push the signature
4418+        started = time.time()
4419         ds = []
4420hunk ./src/allmydata/mutable/publish.py 584
4421-        def _spy_on_results(results):
4422-            print results
4423-            return results
4424         for shnum in xrange(self.total_shares):
4425hunk ./src/allmydata/mutable/publish.py 585
4426-            d = self.writers[shnum].put_root_hash(self.root_hash)
4427+            writer = self.writers[shnum]
4428+            d = writer.put_root_hash(self.root_hash)
4429+            d.addCallback(self._got_write_answer, writer, started)
4430             ds.append(d)
4431         d = defer.DeferredList(ds)
4432hunk ./src/allmydata/mutable/publish.py 590
4433-        def _make_and_place_signature(ignored):
4434-            signable = self.writers[0].get_signable()
4435-            self.signature = self._privkey.sign(signable)
4436-
4437-            ds = []
4438-            for (shnum, writer) in self.writers.iteritems():
4439-                d = writer.put_signature(self.signature)
4440-                ds.append(d)
4441-            return defer.DeferredList(ds)
4442-        d.addCallback(_make_and_place_signature)
4443+        d.addCallback(self._update_checkstring)
4444+        d.addCallback(self._make_and_place_signature)
4445         return d
4446 
4447 
4448hunk ./src/allmydata/mutable/publish.py 595
4449-    def finish_publishing(self):
4450+    def _update_checkstring(self, ignored):
4451+        """
4452+        After putting the root hash, MDMF files will have the
4453+        checkstring written to the storage server. This means that we
4454+        can update our copy of the checkstring so we can detect
4455+        uncoordinated writes. SDMF files will have the same checkstring,
4456+        so we need not do anything.
4457+        """
4458+        self._checkstring = self.writers.values()[0].get_checkstring()
4459+
4460+
4461+    def _make_and_place_signature(self, ignored):
4462+        """
4463+        I create and place the signature.
4464+        """
4465+        started = time.time()
4466+        signable = self.writers[0].get_signable()
4467+        self.signature = self._privkey.sign(signable)
4468+
4469+        ds = []
4470+        for (shnum, writer) in self.writers.iteritems():
4471+            d = writer.put_signature(self.signature)
4472+            d.addCallback(self._got_write_answer, writer, started)
4473+            d.addErrback(self._connection_problem, writer)
4474+            ds.append(d)
4475+        return defer.DeferredList(ds)
4476+
4477+
4478+    def finish_publishing(self, ignored):
4479         # We're almost done -- we just need to put the verification key
4480         # and the offsets
4481hunk ./src/allmydata/mutable/publish.py 626
4482+        started = time.time()
4483         ds = []
4484         verification_key = self._pubkey.serialize()
4485 
4486hunk ./src/allmydata/mutable/publish.py 630
4487-        def _spy_on_results(results):
4488-            print results
4489-            return results
4490+        # TODO: Bad, since we may remove entries from this same dict
4491+        # while iterating over it. We should iterate over a copy instead.
4492         for (shnum, writer) in self.writers.iteritems():
4493             d = writer.put_verification_key(verification_key)
4494hunk ./src/allmydata/mutable/publish.py 634
4495+            d.addCallback(self._got_write_answer, writer, started)
4496             d.addCallback(lambda ignored, writer=writer:
4497                 writer.finish_publishing())
4498hunk ./src/allmydata/mutable/publish.py 637
4499+            d.addCallback(self._got_write_answer, writer, started)
4500+            d.addErrback(self._connection_problem, writer)
4501             ds.append(d)
4502         return defer.DeferredList(ds)
4503 
4504hunk ./src/allmydata/mutable/publish.py 643
4505 
4506+    def _connection_problem(self, f, writer):
4507+        """
4508+        We ran into a connection problem while working with writer, and
4509+        need to deal with that.
4510+        """
4511+        self.log("found problem: %s" % str(f))
4512+        self._last_failure = f
4513+        del(self.writers[writer.shnum])
4514+
4515+
4516     def _turn_barrier(self, res):
4517         # putting this method in a Deferred chain imposes a guaranteed
4518         # reactor turn between the pre- and post- portions of that chain.
4519hunk ./src/allmydata/mutable/publish.py 798
4520             self.log_goal(self.goal, "after update: ")
4521 
4522 
4523-    def _encrypt_and_encode(self):
4524-        # this returns a Deferred that fires with a list of (sharedata,
4525-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
4526-        # shares that we care about.
4527-        self.log("_encrypt_and_encode")
4528-
4529-        self._status.set_status("Encrypting")
4530-        started = time.time()
4531+    def _got_write_answer(self, answer, writer, started):
4532+        if not answer:
4533+            # SDMF writers only pretend to write when their callers set
4534+            # blocks, salts, and so on -- they actually just write once,
4535+            # at the end of the upload process. In fake writes, they
4536+            # return defer.succeed(None). If we see that, we shouldn't
4537+            # bother checking it.
4538+            return
4539 
4540hunk ./src/allmydata/mutable/publish.py 807
4541-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
4542-        enc = AES(key)
4543-        crypttext = enc.process(self.newdata)
4544-        assert len(crypttext) == len(self.newdata)
4545+        peerid = writer.peerid
4546+        lp = self.log("_got_write_answer from %s, share %d" %
4547+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
4548 
4549         now = time.time()
4550hunk ./src/allmydata/mutable/publish.py 812
4551-        self._status.timings["encrypt"] = now - started
4552-        started = now
4553-
4554-        # now apply FEC
4555-
4556-        self._status.set_status("Encoding")
4557-        fec = codec.CRSEncoder()
4558-        fec.set_params(self.segment_size,
4559-                       self.required_shares, self.total_shares)
4560-        piece_size = fec.get_block_size()
4561-        crypttext_pieces = [None] * self.required_shares
4562-        for i in range(len(crypttext_pieces)):
4563-            offset = i * piece_size
4564-            piece = crypttext[offset:offset+piece_size]
4565-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
4566-            crypttext_pieces[i] = piece
4567-            assert len(piece) == piece_size
4568-
4569-        d = fec.encode(crypttext_pieces)
4570-        def _done_encoding(res):
4571-            elapsed = time.time() - started
4572-            self._status.timings["encode"] = elapsed
4573-            return res
4574-        d.addCallback(_done_encoding)
4575-        return d
4576-
4577-
4578-    def _generate_shares(self, shares_and_shareids):
4579-        # this sets self.shares and self.root_hash
4580-        self.log("_generate_shares")
4581-        self._status.set_status("Generating Shares")
4582-        started = time.time()
4583-
4584-        # we should know these by now
4585-        privkey = self._privkey
4586-        encprivkey = self._encprivkey
4587-        pubkey = self._pubkey
4588-
4589-        (shares, share_ids) = shares_and_shareids
4590-
4591-        assert len(shares) == len(share_ids)
4592-        assert len(shares) == self.total_shares
4593-        all_shares = {}
4594-        block_hash_trees = {}
4595-        share_hash_leaves = [None] * len(shares)
4596-        for i in range(len(shares)):
4597-            share_data = shares[i]
4598-            shnum = share_ids[i]
4599-            all_shares[shnum] = share_data
4600-
4601-            # build the block hash tree. SDMF has only one leaf.
4602-            leaves = [hashutil.block_hash(share_data)]
4603-            t = hashtree.HashTree(leaves)
4604-            block_hash_trees[shnum] = list(t)
4605-            share_hash_leaves[shnum] = t[0]
4606-        for leaf in share_hash_leaves:
4607-            assert leaf is not None
4608-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
4609-        share_hash_chain = {}
4610-        for shnum in range(self.total_shares):
4611-            needed_hashes = share_hash_tree.needed_hashes(shnum)
4612-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
4613-                                              for i in needed_hashes ] )
4614-        root_hash = share_hash_tree[0]
4615-        assert len(root_hash) == 32
4616-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
4617-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
4618-
4619-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
4620-                             self.required_shares, self.total_shares,
4621-                             self.segment_size, len(self.newdata))
4622-
4623-        # now pack the beginning of the share. All shares are the same up
4624-        # to the signature, then they have divergent share hash chains,
4625-        # then completely different block hash trees + salt + share data,
4626-        # then they all share the same encprivkey at the end. The sizes
4627-        # of everything are the same for all shares.
4628-
4629-        sign_started = time.time()
4630-        signature = privkey.sign(prefix)
4631-        self._status.timings["sign"] = time.time() - sign_started
4632-
4633-        verification_key = pubkey.serialize()
4634-
4635-        final_shares = {}
4636-        for shnum in range(self.total_shares):
4637-            final_share = pack_share(prefix,
4638-                                     verification_key,
4639-                                     signature,
4640-                                     share_hash_chain[shnum],
4641-                                     block_hash_trees[shnum],
4642-                                     all_shares[shnum],
4643-                                     encprivkey)
4644-            final_shares[shnum] = final_share
4645-        elapsed = time.time() - started
4646-        self._status.timings["pack"] = elapsed
4647-        self.shares = final_shares
4648-        self.root_hash = root_hash
4649-
4650-        # we also need to build up the version identifier for what we're
4651-        # pushing. Extract the offsets from one of our shares.
4652-        assert final_shares
4653-        offsets = unpack_header(final_shares.values()[0])[-1]
4654-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
4655-        verinfo = (self._new_seqnum, root_hash, self.salt,
4656-                   self.segment_size, len(self.newdata),
4657-                   self.required_shares, self.total_shares,
4658-                   prefix, offsets_tuple)
4659-        self.versioninfo = verinfo
4660-
4661-
4662-
4663-    def _send_shares(self, needed):
4664-        self.log("_send_shares")
4665-
4666-        # we're finally ready to send out our shares. If we encounter any
4667-        # surprises here, it's because somebody else is writing at the same
4668-        # time. (Note: in the future, when we remove the _query_peers() step
4669-        # and instead speculate about [or remember] which shares are where,
4670-        # surprises here are *not* indications of UncoordinatedWriteError,
4671-        # and we'll need to respond to them more gracefully.)
4672-
4673-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
4674-        # organize it by peerid.
4675-
4676-        peermap = DictOfSets()
4677-        for (peerid, shnum) in needed:
4678-            peermap.add(peerid, shnum)
4679-
4680-        # the next thing is to build up a bunch of test vectors. The
4681-        # semantics of Publish are that we perform the operation if the world
4682-        # hasn't changed since the ServerMap was constructed (more or less).
4683-        # For every share we're trying to place, we create a test vector that
4684-        # tests to see if the server*share still corresponds to the
4685-        # map.
4686-
4687-        all_tw_vectors = {} # maps peerid to tw_vectors
4688-        sm = self._servermap.servermap
4689-
4690-        for key in needed:
4691-            (peerid, shnum) = key
4692-
4693-            if key in sm:
4694-                # an old version of that share already exists on the
4695-                # server, according to our servermap. We will create a
4696-                # request that attempts to replace it.
4697-                old_versionid, old_timestamp = sm[key]
4698-                (old_seqnum, old_root_hash, old_salt, old_segsize,
4699-                 old_datalength, old_k, old_N, old_prefix,
4700-                 old_offsets_tuple) = old_versionid
4701-                old_checkstring = pack_checkstring(old_seqnum,
4702-                                                   old_root_hash,
4703-                                                   old_salt)
4704-                testv = (0, len(old_checkstring), "eq", old_checkstring)
4705-
4706-            elif key in self.bad_share_checkstrings:
4707-                old_checkstring = self.bad_share_checkstrings[key]
4708-                testv = (0, len(old_checkstring), "eq", old_checkstring)
4709-
4710-            else:
4711-                # add a testv that requires the share not exist
4712-
4713-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
4714-                # constraints are handled. If the same object is referenced
4715-                # multiple times inside the arguments, foolscap emits a
4716-                # 'reference' token instead of a distinct copy of the
4717-                # argument. The bug is that these 'reference' tokens are not
4718-                # accepted by the inbound constraint code. To work around
4719-                # this, we need to prevent python from interning the
4720-                # (constant) tuple, by creating a new copy of this vector
4721-                # each time.
4722-
4723-                # This bug is fixed in foolscap-0.2.6, and even though this
4724-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
4725-                # supposed to be able to interoperate with older versions of
4726-                # Tahoe which are allowed to use older versions of foolscap,
4727-                # including foolscap-0.2.5 . In addition, I've seen other
4728-                # foolscap problems triggered by 'reference' tokens (see #541
4729-                # for details). So we must keep this workaround in place.
4730-
4731-                #testv = (0, 1, 'eq', "")
4732-                testv = tuple([0, 1, 'eq', ""])
4733-
4734-            testvs = [testv]
4735-            # the write vector is simply the share
4736-            writev = [(0, self.shares[shnum])]
4737-
4738-            if peerid not in all_tw_vectors:
4739-                all_tw_vectors[peerid] = {}
4740-                # maps shnum to (testvs, writevs, new_length)
4741-            assert shnum not in all_tw_vectors[peerid]
4742-
4743-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
4744-
4745-        # we read the checkstring back from each share, however we only use
4746-        # it to detect whether there was a new share that we didn't know
4747-        # about. The success or failure of the write will tell us whether
4748-        # there was a collision or not. If there is a collision, the first
4749-        # thing we'll do is update the servermap, which will find out what
4750-        # happened. We could conceivably reduce a roundtrip by using the
4751-        # readv checkstring to populate the servermap, but really we'd have
4752-        # to read enough data to validate the signatures too, so it wouldn't
4753-        # be an overall win.
4754-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
4755-
4756-        # ok, send the messages!
4757-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
4758-        started = time.time()
4759-        for (peerid, tw_vectors) in all_tw_vectors.items():
4760-
4761-            write_enabler = self._node.get_write_enabler(peerid)
4762-            renew_secret = self._node.get_renewal_secret(peerid)
4763-            cancel_secret = self._node.get_cancel_secret(peerid)
4764-            secrets = (write_enabler, renew_secret, cancel_secret)
4765-            shnums = tw_vectors.keys()
4766-
4767-            for shnum in shnums:
4768-                self.outstanding.add( (peerid, shnum) )
4769-
4770-            d = self._do_testreadwrite(peerid, secrets,
4771-                                       tw_vectors, read_vector)
4772-            d.addCallbacks(self._got_write_answer, self._got_write_error,
4773-                           callbackArgs=(peerid, shnums, started),
4774-                           errbackArgs=(peerid, shnums, started))
4775-            # tolerate immediate errback, like with DeadReferenceError
4776-            d.addBoth(fireEventually)
4777-            d.addCallback(self.loop)
4778-            d.addErrback(self._fatal_error)
4779-
4780-        self._update_status()
4781-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
4782+        elapsed = now - started
4783 
4784hunk ./src/allmydata/mutable/publish.py 814
4785-    def _do_testreadwrite(self, peerid, secrets,
4786-                          tw_vectors, read_vector):
4787-        storage_index = self._storage_index
4788-        ss = self.connections[peerid]
4789+        self._status.add_per_server_time(peerid, elapsed)
4790 
4791hunk ./src/allmydata/mutable/publish.py 816
4792-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
4793-        d = ss.callRemote("slot_testv_and_readv_and_writev",
4794-                          storage_index,
4795-                          secrets,
4796-                          tw_vectors,
4797-                          read_vector)
4798-        return d
4799+        wrote, read_data = answer
4800 
4801hunk ./src/allmydata/mutable/publish.py 818
4802-    def _got_write_answer(self, answer, peerid, shnums, started):
4803-        lp = self.log("_got_write_answer from %s" %
4804-                      idlib.shortnodeid_b2a(peerid))
4805-        for shnum in shnums:
4806-            self.outstanding.discard( (peerid, shnum) )
4807+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
4808 
4809hunk ./src/allmydata/mutable/publish.py 820
4810-        now = time.time()
4811-        elapsed = now - started
4812-        self._status.add_per_server_time(peerid, elapsed)
4813+        # We need to remove from surprise_shares any shares that we are
4814+        # knowingly also writing to that peer from other writers.
4815 
4816hunk ./src/allmydata/mutable/publish.py 823
4817-        wrote, read_data = answer
4818+        # TODO: Precompute this.
4819+        known_shnums = [x.shnum for x in self.writers.values()
4820+                        if x.peerid == peerid]
4821+        surprise_shares -= set(known_shnums)
4822+        self.log("found the following surprise shares: %s" %
4823+                 str(surprise_shares))
4824 
4825hunk ./src/allmydata/mutable/publish.py 830
4826-        surprise_shares = set(read_data.keys()) - set(shnums)
4827+        # Now surprise shares contains all of the shares that we did not
4828+        # expect to be there.
4829 
4830         surprised = False
4831         for shnum in surprise_shares:
4832hunk ./src/allmydata/mutable/publish.py 837
4833             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
4834             checkstring = read_data[shnum][0]
4835-            their_version_info = unpack_checkstring(checkstring)
4836-            if their_version_info == self._new_version_info:
4837+            # What we want to do here is to see if their (seqnum,
4838+            # roothash, salt) is the same as our (seqnum, roothash,
4839+            # salt), or the equivalent for MDMF. The best way to do this
4840+            # is to store a packed representation of our checkstring
4841+            # somewhere, then not bother unpacking the other
4842+            # checkstring.
4843+            if checkstring == self._checkstring:
4844                 # they have the right share, somehow
4845 
4846                 if (peerid,shnum) in self.goal:
4847hunk ./src/allmydata/mutable/publish.py 922
4848             self.log("our testv failed, so the write did not happen",
4849                      parent=lp, level=log.WEIRD, umid="8sc26g")
4850             self.surprised = True
4851-            self.bad_peers.add(peerid) # don't ask them again
4852+            # TODO: This needs to
4853+            self.bad_peers.add(writer) # don't ask them again
4854             # use the checkstring to add information to the log message
4855             for (shnum,readv) in read_data.items():
4856                 checkstring = readv[0]
4857hunk ./src/allmydata/mutable/publish.py 948
4858             # self.loop() will take care of finding new homes
4859             return
4860 
4861-        for shnum in shnums:
4862-            self.placed.add( (peerid, shnum) )
4863-            # and update the servermap
4864-            self._servermap.add_new_share(peerid, shnum,
4865-                                          self.versioninfo, started)
4866-
4867-        # self.loop() will take care of checking to see if we're done
4868-        return
4869+        self.placed.add( (peerid, writer.shnum) )
4870+        # and update the servermap
4871+        #self._servermap.add_new_share(peerid, writer.shnum,
4872+        #                              self.versioninfo, started)
4873 
4874hunk ./src/allmydata/mutable/publish.py 953
4875-    def _got_write_error(self, f, peerid, shnums, started):
4876-        for shnum in shnums:
4877-            self.outstanding.discard( (peerid, shnum) )
4878-        self.bad_peers.add(peerid)
4879-        if self._first_write_error is None:
4880-            self._first_write_error = f
4881-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
4882-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
4883-                 failure=f,
4884-                 level=log.UNUSUAL)
4885         # self.loop() will take care of checking to see if we're done
4886         return
4887 
4888hunk ./src/allmydata/mutable/publish.py 980
4889             self._status.set_progress(1.0)
4890         eventually(self.done_deferred.callback, res)
4891 
4892+    def _failure(self):
4893+
4894+        if not self.surprised:
4895+            # We ran out of servers
4896+            self.log("Publish ran out of good servers, "
4897+                     "last failure was: %s" % str(self._last_failure))
4898+            e = NotEnoughServersError("Ran out of non-bad servers, "
4899+                                      "last failure was %s" %
4900+                                      str(self._last_failure))
4901+        else:
4902+            # We ran into shares that we didn't recognize, which means
4903+            # that we need to return an UncoordinatedWriteError.
4904+            self.log("Publish failed with UncoordinatedWriteError")
4905+            e = UncoordinatedWriteError()
4906+        f = failure.Failure(e)
4907+        eventually(self.done_deferred.callback, f)
4908}
4909[test/test_mutable.py: temporarily disable two tests that are now irrelevant
4910Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
4911 Ignore-this: 701e143567f3954812ca6960af1d6ac7
4912] {
4913hunk ./src/allmydata/test/test_mutable.py 651
4914             self.failUnlessEqual(len(share_ids), 10)
4915         d.addCallback(_done)
4916         return d
4917+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
4918 
4919     def test_generate(self):
4920         nm = make_nodemaker()
4921hunk ./src/allmydata/test/test_mutable.py 713
4922                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
4923         d.addCallback(_generated)
4924         return d
4925+    test_generate.todo = "Write an equivalent of this for the new uploader"
4926 
4927     # TODO: when we publish to 20 peers, we should get one share per peer on 10
4928     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
4929}
4930[Add MDMF reader and writer, and SDMF writer
4931Kevan Carstensen <kevan@isnotajoke.com>**20100701232834
4932 Ignore-this: 129ffbb7675326e506d6131a13d62ab9
4933 
4934 The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
4935 object proxies that exist for immutable files. They abstract away
4936 details of connection, state, and caching from their callers (in this
4937 case, the download, servermap updater, and uploader), and expose methods
4938 to get and set information on the remote server.
4939 
4940 MDMFSlotReadProxy reads a mutable file from the server, doing the right
4941 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
4942 allows callers to tell it how to batch and flush reads.
4943 
4944 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
4945 
4946 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
4947 
4948 This patch also includes tests for MDMFSlotReadProxy,
4949 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
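 
 A rough usage sketch (illustrative only, not part of the patch; sign() and
 the share-piece variables are stand-ins for values the publisher already
 has) of how a caller might drive one of the write proxies:
 
     writer = SDMFSlotWriteProxy(shnum, rref, storage_index, secrets,
                                 seqnum, required_shares, total_shares,
                                 segment_size, data_length)
     writer.put_block(share_data, 0, salt)      # SDMF has a single segment
     writer.put_encprivkey(encprivkey)
     writer.put_blockhashes(block_hash_tree)    # list of 32-byte hashes
     writer.put_sharehashes(share_hash_chain)   # dict: shnum -> 32-byte hash
     writer.put_root_hash(root_hash)
     writer.put_signature(sign(writer.get_signable()))
     writer.put_verification_key(verification_key)
     d = writer.finish_publishing()             # one remote write for SDMF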
4950] {
4951hunk ./src/allmydata/mutable/layout.py 4
4952 
4953 import struct
4954 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
4955+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
4956+                                 MDMF_VERSION, IMutableSlotWriter
4957+from allmydata.util import mathutil, observer
4958+from twisted.python import failure
4959+from twisted.internet import defer
4960+from zope.interface import implements
4961+
4962+
4963+# These strings describe the format of the packed structs they help process
4964+# Here's what they mean:
4965+#
4966+#  PREFIX:
4967+#    >: Big-endian byte order; the most significant byte is first (leftmost).
4968+#    B: The version information; an 8 bit version identifier. Stored as
4969+#       an unsigned char. This is currently 0; our modifications
4970+#       will turn it into 1.
4971+#    Q: The sequence number; this is sort of like a revision history for
4972+#       mutable files; they start at 1 and increase as they are changed after
4973+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
4974+#       length.
4975+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4976+#       characters = 32 bytes to store the value.
4977+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4978+#       16 characters.
4979+#
4980+#  SIGNED_PREFIX additions, things that are covered by the signature:
4981+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4982+#       which is convenient because our erasure coding scheme cannot
4983+#       encode if you ask for more than 255 pieces.
4984+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4985+#       same reasons as above.
4986+#    Q: The segment size of the uploaded file. This will essentially be the
4987+#       length of the file in SDMF. An unsigned long long, so we can store
4988+#       files of quite large size.
4989+#    Q: The data length of the uploaded file. Modulo padding, this will be
4990+#       the same as the segment size field. Like the segment size field, it is
4991+#       an unsigned long long and can be quite large.
4992+#
4993+#   HEADER additions:
4994+#     L: The offset of the signature of this. An unsigned long.
4995+#     L: The offset of the share hash chain. An unsigned long.
4996+#     L: The offset of the block hash tree. An unsigned long.
4997+#     L: The offset of the share data. An unsigned long.
4998+#     Q: The offset of the encrypted private key. An unsigned long long, to
4999+#        account for the possibility of a lot of share data.
5000+#     Q: The offset of the EOF. An unsigned long long, to account for the
5001+#        possibility of a lot of share data.
5002+#
5003+#  After all of these, we have the following:
5004+#    - The verification key: Occupies the space between the end of the header
5005+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
5006+#    - The signature, which goes from the signature offset to the share hash
5007+#      chain offset.
5008+#    - The share hash chain, which goes from the share hash chain offset to
5009+#      the block hash tree offset.
5010+#    - The share data, which goes from the share data offset to the encrypted
5011+#      private key offset.
5012+#    - The encrypted private key offset, which goes until the end of the file.
5013+#
5014+#  The block hash tree in this encoding has only one leaf, so the offset of
5015+#  the share data will be 32 bytes more than the offset of the block hash tree.
5016+#  Given this, we may need to check to see how many bytes a reasonably sized
5017+#  block hash tree will take up.
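#
#  (Illustration only, not part of the patch: using the format strings
#  defined just below, a signed prefix can be packed and unpacked roughly
#  like this; the field values here are made up.)
#
#      import struct
#      prefix = struct.pack(">BQ32s16s BBQQ",
#                           0, 3, "\x00" * 32, "\x01" * 16,  # version, seqnum, roothash, IV
#                           3, 10, 4096, 4096)               # k, N, segsize, datalen
#      fields = struct.unpack(">BQ32s16s BBQQ", prefix)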
5018 
5019 PREFIX = ">BQ32s16s" # each version has a different prefix
5020 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
5021hunk ./src/allmydata/mutable/layout.py 73
5022 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
5023 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
5024 HEADER_LENGTH = struct.calcsize(HEADER)
5025+OFFSETS = ">LLLLQQ"
5026+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
5027 
5028 def unpack_header(data):
5029     o = {}
5030hunk ./src/allmydata/mutable/layout.py 194
5031     return (share_hash_chain, block_hash_tree, share_data)
5032 
5033 
5034-def pack_checkstring(seqnum, root_hash, IV):
5035+def pack_checkstring(seqnum, root_hash, IV, version=0):
5036     return struct.pack(PREFIX,
5037hunk ./src/allmydata/mutable/layout.py 196
5038-                       0, # version,
5039+                       version,
5040                        seqnum,
5041                        root_hash,
5042                        IV)
5043hunk ./src/allmydata/mutable/layout.py 269
5044                            encprivkey])
5045     return final_share
5046 
5047+def pack_prefix(seqnum, root_hash, IV,
5048+                required_shares, total_shares,
5049+                segment_size, data_length):
5050+    prefix = struct.pack(SIGNED_PREFIX,
5051+                         0, # version,
5052+                         seqnum,
5053+                         root_hash,
5054+                         IV,
5055+                         required_shares,
5056+                         total_shares,
5057+                         segment_size,
5058+                         data_length,
5059+                         )
5060+    return prefix
5061+
5062+
5063+class SDMFSlotWriteProxy:
5064+    implements(IMutableSlotWriter)
5065+    """
5066+    I represent a remote write slot for an SDMF mutable file. I build a
5067+    share in memory, and then write it in one piece to the remote
5068+    server. This mimics how SDMF shares were built before MDMF (and the
5069+    new MDMF uploader), but provides that functionality in a way that
5070+    allows the MDMF uploader to be built without much special-casing for
5071+    file format, which makes the uploader code more readable.
5072+    """
5073+    def __init__(self,
5074+                 shnum,
5075+                 rref, # a remote reference to a storage server
5076+                 storage_index,
5077+                 secrets, # (write_enabler, renew_secret, cancel_secret)
5078+                 seqnum, # the sequence number of the mutable file
5079+                 required_shares,
5080+                 total_shares,
5081+                 segment_size,
5082+                 data_length): # the length of the original file
5083+        self.shnum = shnum
5084+        self._rref = rref
5085+        self._storage_index = storage_index
5086+        self._secrets = secrets
5087+        self._seqnum = seqnum
5088+        self._required_shares = required_shares
5089+        self._total_shares = total_shares
5090+        self._segment_size = segment_size
5091+        self._data_length = data_length
5092+
5093+        # This is an SDMF file, so it should have only one segment, so,
5094+        # modulo padding of the data length, the segment size and the
5095+        # data length should be the same.
5096+        expected_segment_size = mathutil.next_multiple(data_length,
5097+                                                       self._required_shares)
5098+        assert expected_segment_size == segment_size
5099+
5100+        self._block_size = self._segment_size / self._required_shares
5101+
5102+        # This is meant to mimic how SDMF files were built before MDMF
5103+        # entered the picture: we generate each share in its entirety,
5104+        # then push it off to the storage server in one write. When
5105+        # callers call set_*, they are just populating this dict.
5106+        # finish_publishing will stitch these pieces together into a
5107+        # coherent share, and then write the coherent share to the
5108+        # storage server.
5109+        self._share_pieces = {}
5110+
5111+        # This tells the write logic what checkstring to use when
5112+        # writing remote shares.
5113+        self._testvs = []
5114+
5115+        self._readvs = [(0, struct.calcsize(PREFIX))]
5116+
5117+
5118+    def set_checkstring(self, checkstring_or_seqnum,
5119+                              root_hash=None,
5120+                              salt=None):
5121+        """
5122+        Set the checkstring that I will pass to the remote server when
5123+        writing.
5124+
5125+            @param checkstring_or_seqnum: A packed checkstring to use,
5126+                   or a sequence number. If root_hash and salt are not also given, I treat this as a packed checkstring.
5127+
5128+        Note that implementations can differ in which semantics they
5129+        wish to support for set_checkstring -- they can, for example,
5130+        build the checkstring themselves from its constituents, or
5131+        accept a pre-packed value.
5132+        """
5133+        if root_hash and salt:
5134+            checkstring = struct.pack(PREFIX,
5135+                                      0,
5136+                                      checkstring_or_seqnum,
5137+                                      root_hash,
5138+                                      salt)
5139+        else:
5140+            checkstring = checkstring_or_seqnum
5141+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
5142+
5143+
5144+    def get_checkstring(self):
5145+        """
5146+        Get the checkstring that I think currently exists on the remote
5147+        server.
5148+        """
5149+        if self._testvs:
5150+            return self._testvs[0][3]
5151+        return ""
5152+
5153+
5154+    def put_block(self, data, segnum, salt):
5155+        """
5156+        Add a block and salt to the share.
5157+        """
5158+        # SDMF files have only one segment
5159+        assert segnum == 0
5160+        assert len(data) == self._block_size
5161+        assert len(salt) == SALT_SIZE
5162+
5163+        self._share_pieces['sharedata'] = data
5164+        self._share_pieces['salt'] = salt
5165+
5166+        # TODO: Figure out something intelligent to return.
5167+        return defer.succeed(None)
5168+
5169+
5170+    def put_encprivkey(self, encprivkey):
5171+        """
5172+        Add the encrypted private key to the share.
5173+        """
5174+        self._share_pieces['encprivkey'] = encprivkey
5175+
5176+        return defer.succeed(None)
5177+
5178+
5179+    def put_blockhashes(self, blockhashes):
5180+        """
5181+        Add the block hash tree to the share.
5182+        """
5183+        assert isinstance(blockhashes, list)
5184+        for h in blockhashes:
5185+            assert len(h) == HASH_SIZE
5186+
5187+        # serialize the blockhashes, then set them.
5188+        blockhashes_s = "".join(blockhashes)
5189+        self._share_pieces['block_hash_tree'] = blockhashes_s
5190+
5191+        return defer.succeed(None)
5192+
5193+
5194+    def put_sharehashes(self, sharehashes):
5195+        """
5196+        Add the share hash chain to the share.
5197+        """
5198+        assert isinstance(sharehashes, dict)
5199+        for h in sharehashes.itervalues():
5200+            assert len(h) == HASH_SIZE
5201+
5202+        # serialize the sharehashes, then set them.
5203+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5204+                                 for i in sorted(sharehashes.keys())])
5205+        self._share_pieces['share_hash_chain'] = sharehashes_s
5206+
5207+        return defer.succeed(None)
5208+
5209+
5210+    def put_root_hash(self, root_hash):
5211+        """
5212+        Add the root hash to the share.
5213+        """
5214+        assert len(root_hash) == HASH_SIZE
5215+
5216+        self._share_pieces['root_hash'] = root_hash
5217+
5218+        return defer.succeed(None)
5219+
5220+
5221+    def put_salt(self, salt):
5222+        """
5223+        Add a salt to an empty SDMF file.
5224+        """
5225+        assert len(salt) == SALT_SIZE
5226+
5227+        self._share_pieces['salt'] = salt
5228+        self._share_pieces['sharedata'] = ""
5229+
5230+
5231+    def get_signable(self):
5232+        """
5233+        Return the part of the share that needs to be signed.
5234+
5235+        SDMF writers need to sign the packed representation of the
5236+        first eight fields of the remote share, that is:
5237+            - version number (0)
5238+            - sequence number
5239+            - root of the share hash tree
5240+            - salt
5241+            - k
5242+            - n
5243+            - segsize
5244+            - datalen
5245+
5246+        This method is responsible for returning that to callers.
5247+        """
5248+        return struct.pack(SIGNED_PREFIX,
5249+                           0,
5250+                           self._seqnum,
5251+                           self._share_pieces['root_hash'],
5252+                           self._share_pieces['salt'],
5253+                           self._required_shares,
5254+                           self._total_shares,
5255+                           self._segment_size,
5256+                           self._data_length)
5257+
5258+
5259+    def put_signature(self, signature):
5260+        """
5261+        Add the signature to the share.
5262+        """
5263+        self._share_pieces['signature'] = signature
5264+
5265+        return defer.succeed(None)
5266+
5267+
5268+    def put_verification_key(self, verification_key):
5269+        """
5270+        Add the verification key to the share.
5271+        """
5272+        self._share_pieces['verification_key'] = verification_key
5273+
5274+        return defer.succeed(None)
5275+
5276+
5277+    def _pack_offsets(self):
5278+        post_offset = HEADER_LENGTH
5279+        offsets = {}
5280+
5281+        verification_key_length = len(self._share_pieces['verification_key'])
5282+        o1 = offsets['signature'] = post_offset + verification_key_length
5283+
5284+        signature_length = len(self._share_pieces['signature'])
5285+        o2 = offsets['share_hash_chain'] = o1 + signature_length
5286+
5287+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
5288+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
5289+
5290+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
5291+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
5292+
5293+        share_data_length = len(self._share_pieces['sharedata'])
5294+        o5 = offsets['enc_privkey'] = o4 + share_data_length
5295+
5296+        encprivkey_length = len(self._share_pieces['encprivkey'])
5297+        offsets['EOF'] = o5 + encprivkey_length
5298+
5299+        return struct.pack(">LLLLQQ",
5300+                           offsets['signature'],
5301+                           offsets['share_hash_chain'],
5302+                           offsets['block_hash_tree'],
5303+                           offsets['share_data'],
5304+                           offsets['enc_privkey'],
5305+                           offsets['EOF'])
5306+
5307+
5308+    def finish_publishing(self):
5309+        """
5310+        Do anything necessary to finish writing the share to a remote
5311+        server. I require that no further publishing needs to take place
5312+        after this method has been called.
5313+        """
5314+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
5315+                  "share_hash_chain", "block_hash_tree"]:
5316+            assert k in self._share_pieces
5317+        # This is the only method that actually writes something to the
5318+        # remote server.
5319+        # First, we need to pack the share into data that we can write
5320+        # to the remote server in one write.
5321+        offsets = self._pack_offsets()
5322+        prefix = self.get_signable()
5323+        final_share = "".join([prefix,
5324+                               offsets,
5325+                               self._share_pieces['verification_key'],
5326+                               self._share_pieces['signature'],
5327+                               self._share_pieces['share_hash_chain'],
5328+                               self._share_pieces['block_hash_tree'],
5329+                               self._share_pieces['sharedata'],
5330+                               self._share_pieces['encprivkey']])
5331+
5332+        # Our only data vector is going to be writing the final share,
5333+        # in its entirety.
5334+        datavs = [(0, final_share)]
5335+
5336+        if not self._testvs:
5337+            # Our caller has not provided us with another checkstring
5338+            # yet, so we assume that we are writing a new share, and set
5339+            # a test vector that will allow a new share to be written.
5340+            self._testvs = []
5341+            self._testvs.append(tuple([0, 1, "eq", ""]))
5342+            new_share = True
5343+
5344+        tw_vectors = {}
5345+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5346+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
5347+                                     self._storage_index,
5348+                                     self._secrets,
5349+                                     tw_vectors,
5350+                                     # TODO is it useful to read something?
5351+                                     self._readvs)
5352+
5353+
5354+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
5355+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
5356+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
5357+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
5358+MDMFCHECKSTRING = ">BQ32s"
5359+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
5360+MDMFOFFSETS = ">QQQQQQ"
5361+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
5362+
5363+class MDMFSlotWriteProxy:
5364+    implements(IMutableSlotWriter)
5365+
5366+    """
5367+    I represent a remote write slot for an MDMF mutable file.
5368+
5369+    I abstract away from my caller the details of block and salt
5370+    management, and the implementation of the on-disk format for MDMF
5371+    shares.
5372+    """
5373+    # Expected layout, MDMF:
5374+    # offset:     size:       name:
5375+    #-- signed part --
5376+    # 0           1           version number (01)
5377+    # 1           8           sequence number
5378+    # 9           32          share tree root hash
5379+    # 41          1           The "k" encoding parameter
5380+    # 42          1           The "N" encoding parameter
5381+    # 43          8           The segment size of the uploaded file
5382+    # 51          8           The data length of the original plaintext
5383+    #-- end signed part --
5384+    # 59          8           The offset of the encrypted private key
5385+    # 67          8           The offset of the block hash tree
5386+    # 75          8           The offset of the share hash chain
5387+    # 83          8           The offset of the signature
5388+    # 91          8           The offset of the verification key
5389+    # 99          8           The offset of the EOF
5390+    #
5391+    # followed by salts and share data, the encrypted private key, the
5392+    # block hash tree, the salt hash tree, the share hash chain, a
5393+    # signature over the first eight fields, and a verification key.
5394+    #
5395+    # The checkstring is the first three fields -- the version number,
5396+    # sequence number, and root hash. This is consistent
5397+    # in meaning to what we have with SDMF files, except now instead of
5398+    # using the literal salt, we use a value derived from all of the
5399+    # salts -- the share hash root.
5400+    #
5401+    # The salt is stored before the block for each segment. The block
5402+    # hash tree is computed over the combination of block and salt for
5403+    # each segment. In this way, we get integrity checking for both
5404+    # block and salt with the current block hash tree arrangement.
5405+    #
5406+    # The ordering of the offsets is different to reflect the dependencies
5407+    # that we'll run into with an MDMF file. The expected write flow is
5408+    # something like this:
5409+    #
5410+    #   0: Initialize with the sequence number, encoding parameters and
5411+    #      data length. From this, we can deduce the number of segments,
5412+    #      and where they should go. We can also figure out where the
5413+    #      encrypted private key should go, because we can figure out how
5414+    #      big the share data will be.
5415+    #
5416+    #   1: Encrypt, encode, and upload the file in chunks. Do something
5417+    #      like
5418+    #
5419+    #       put_block(data, segnum, salt)
5420+    #
5421+    #      to write a block and a salt to the disk. We can do both of
5422+    #      these operations now because we have enough of the offsets to
5423+    #      know where to put them.
5424+    #
5425+    #   2: Put the encrypted private key. Use:
5426+    #
5427+    #        put_encprivkey(encprivkey)
5428+    #
5429+    #      Now that we know the length of the private key, we can fill
5430+    #      in the offset for the block hash tree.
5431+    #
5432+    #   3: We're now in a position to upload the block hash tree for
5433+    #      a share. Put that using something like:
5434+    #       
5435+    #        put_blockhashes(block_hash_tree)
5436+    #
5437+    #      Note that block_hash_tree is a list of hashes -- we'll take
5438+    #      care of the details of serializing that appropriately. When
5439+    #      we get the block hash tree, we are also in a position to
5440+    #      calculate the offset for the share hash chain, and fill that
5441+    #      into the offsets table.
5442+    #
5443+    #   4: At the same time, we're in a position to upload the salt hash
5444+    #      tree. This is a Merkle tree over all of the salts. We use a
5445+    #      Merkle tree so that we can validate each block,salt pair as
5446+    #      we download them later. We do this using
5447+    #
5448+    #        put_salthashes(salt_hash_tree)
5449+    #
5450+    #      When you do this, I automatically put the root of the tree
5451+    #      (the hash at index 0 of the list) in its appropriate slot in
5452+    #      the signed prefix of the share.
5453+    #
5454+    #   5: We're now in a position to upload the share hash chain for
5455+    #      a share. Do that with something like:
5456+    #     
5457+    #        put_sharehashes(share_hash_chain)
5458+    #
5459+    #      share_hash_chain should be a dictionary mapping shnums to
5460+    #      32-byte hashes -- the wrapper handles serialization.
5461+    #      We'll know where to put the signature at this point, also.
5462+    #      The root of this tree will be put explicitly in the next
5463+    #      step.
5464+    #
5465+    #      TODO: Why? Why not just include it in the tree here?
5466+    #
5467+    #   6: Before putting the signature, we must first put the
5468+    #      root_hash. Do this with:
5469+    #
5470+    #        put_root_hash(root_hash).
5471+    #     
5472+    #      In terms of knowing where to put this value, it was always
5473+    #      possible to place it, but it makes sense semantically to
5474+    #      place it after the share hash tree, so that's why you do it
5475+    #      in this order.
5476+    #
5477+    #   7: With the root hash put, we can now sign the header. Use:
5478+    #
5479+    #        get_signable()
5480+    #
5481+    #      to get the part of the header that you want to sign, and use:
5482+    #       
5483+    #        put_signature(signature)
5484+    #
5485+    #      to write your signature to the remote server.
5486+    #
5487+    #   8: Add the verification key, and finish. Do:
5488+    #
5489+    #        put_verification_key(key)
5490+    #
5491+    #      and
5492+    #
5493+    #        finish_publishing()
5494+    #
5495+    # Checkstring management:
5496+    #
5497+    # To write to a mutable slot, we have to provide test vectors to ensure
5498+    # that we are writing to the same data that we think we are. These
5499+    # vectors allow us to detect uncoordinated writes; that is, writes
5500+    # where both we and some other shareholder are writing to the
5501+    # mutable slot, and to report those back to the parts of the program
5502+    # doing the writing.
5503+    #
5504+    # With SDMF, this was easy -- all of the share data was written in
5505+    # one go, so it was easy to detect uncoordinated writes, and we only
5506+    # had to do it once. With MDMF, not all of the file is written at
5507+    # once.
5508+    #
5509+    # If a share is new, we write out as much of the header as we can
5510+    # before writing out anything else. This gives other writers a
5511+    # canary that they can use to detect uncoordinated writes, and, if
5512+    # they do the same thing, gives us the same canary. We then update
5513+    # the share. We won't be able to write out two fields of the header
5514+    # -- the share tree hash and the salt hash -- until we finish
5515+    # writing out the share. We only require the writer to provide the
5516+    # initial checkstring, and keep track of what it should be after
5517+    # updates ourselves.
5518+    #
5519+    # If we haven't written anything yet, then on the first write (which
5520+    # will probably be a block + salt of a share), we'll also write out
5521+    # the header. On subsequent passes, we'll expect to see the header.
5522+    # This changes in two places:
5523+    #
5524+    #   - When we write out the salt hash
5525+    #   - When we write out the root of the share hash tree
5526+    #
5527+    # since these values will change the header. It is possible that we
5528+    # can just make those be written in one operation to minimize
5529+    # disruption.
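    #
    # (Illustration only, not part of the patch: taken together, the flow
    # above amounts to a call sequence roughly like the following, where
    # sign() stands in for whatever the caller uses to sign the header.)
    #
    #   for segnum in range(num_segments):
    #       proxy.put_block(blocks[segnum], segnum, salts[segnum])
    #   proxy.put_encprivkey(encprivkey)
    #   proxy.put_blockhashes(block_hash_tree)   # fixes the share hash chain offset
    #   proxy.put_sharehashes(share_hash_chain)  # fixes the signature offset
    #   proxy.put_root_hash(root_hash)           # updates the remote checkstring
    #   proxy.put_signature(sign(proxy.get_signable()))
    #   proxy.put_verification_key(verification_key)
    #   proxy.finish_publishing()                # writes offsets + encoding params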
5530+    def __init__(self,
5531+                 shnum,
5532+                 rref, # a remote reference to a storage server
5533+                 storage_index,
5534+                 secrets, # (write_enabler, renew_secret, cancel_secret)
5535+                 seqnum, # the sequence number of the mutable file
5536+                 required_shares,
5537+                 total_shares,
5538+                 segment_size,
5539+                 data_length): # the length of the original file
5540+        self.shnum = shnum
5541+        self._rref = rref
5542+        self._storage_index = storage_index
5543+        self._seqnum = seqnum
5544+        self._required_shares = required_shares
5545+        assert self.shnum >= 0 and self.shnum < total_shares
5546+        self._total_shares = total_shares
5547+        # We build up the offset table as we write things. It is the
5548+        # last thing we write to the remote server.
5549+        self._offsets = {}
5550+        self._testvs = []
5551+        self._secrets = secrets
5552+        # The segment size needs to be a multiple of the k parameter --
5553+        # any padding should have been carried out by the publisher
5554+        # already.
5555+        assert segment_size % required_shares == 0
5556+        self._segment_size = segment_size
5557+        self._data_length = data_length
5558+
5559+        # These are set later -- we define them here so that we can
5560+        # check for their existence easily
5561+
5562+        # This is the root of the share hash tree -- the Merkle tree
5563+        # over the roots of the block hash trees computed for shares in
5564+        # this upload.
5565+        self._root_hash = None
5566+
5567+        # We haven't yet written anything to the remote bucket. By
5568+        # setting this, we tell the _write method as much. The write
5569+        # method will then know that it also needs to add a write vector
5570+        # for the checkstring (or what we have of it) to the first write
5571+        # request. We'll then record that value for future use.  If
5572+        # we're expecting something to be there already, we need to call
5573+        # set_checkstring before we write anything to tell the first
5574+        # write about that.
5575+        self._written = False
5576+
5577+        # When writing data to the storage servers, we get a read vector
5578+        # for free. We'll read the checkstring, which will help us
5579+        # figure out what's gone wrong if a write fails.
5580+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
5581+
5582+        # We calculate the number of segments because it tells us
5583+        # where the salt part of the file ends/share segment begins,
5584+        # and also because it provides a useful amount of bounds checking.
5585+        self._num_segments = mathutil.div_ceil(self._data_length,
5586+                                               self._segment_size)
5587+        self._block_size = self._segment_size / self._required_shares
5588+        # We also calculate the share size, to help us with block
5589+        # constraints later.
5590+        tail_size = self._data_length % self._segment_size
5591+        if not tail_size:
5592+            self._tail_block_size = self._block_size
5593+        else:
5594+            self._tail_block_size = mathutil.next_multiple(tail_size,
5595+                                                           self._required_shares)
5596+            self._tail_block_size /= self._required_shares
5597+
5598+        # We already know where the sharedata starts; right after the end
5599+        # of the header (which is defined as the signable part + the offsets).
5600+        # We can also calculate where the encrypted private key begins
5601+        # from what we now know.
5602+        self._actual_block_size = self._block_size + SALT_SIZE
5603+        data_size = self._actual_block_size * (self._num_segments - 1)
5604+        data_size += self._tail_block_size
5605+        data_size += SALT_SIZE
5606+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
5607+        self._offsets['enc_privkey'] += data_size
5608+        # We'll wait for the rest. Callers can now call my "put_block" and
5609+        # "set_checkstring" methods.
5610+
5611+
5612+    def set_checkstring(self,
5613+                        seqnum_or_checkstring,
5614+                        root_hash=None,
5615+                        salt=None):
5616+        """
5617+        Set the checkstring for the given shnum.
5618+
5619+        This can be invoked in one of two ways.
5620+
5621+        With one argument, I assume that you are giving me a literal
5622+        checkstring -- e.g., the output of get_checkstring. I will then
5623+        set that checkstring as it is. This form is used by unit tests.
5624+
5625+        With two arguments, I assume that you are giving me a sequence
5626+        number and root hash to make a checkstring from. In that case, I
5627+        will build a checkstring and set it for you. This form is used
5628+        by the publisher.
5629+
5630+        By default, I assume that I am writing new shares to the grid.
5631+        If you don't explicitly set your own checkstring, I will use
5632+        one that requires that the remote share not exist. You will want
5633+        to use this method if you are updating a share in-place;
5634+        otherwise, writes will fail.
5635+        """
5636+        # You're allowed to overwrite checkstrings with this method;
5637+        # I assume that users know what they are doing when they call
5638+        # it.
5639+        if root_hash:
5640+            checkstring = struct.pack(MDMFCHECKSTRING,
5641+                                      1,
5642+                                      seqnum_or_checkstring,
5643+                                      root_hash)
5644+        else:
5645+            checkstring = seqnum_or_checkstring
5646+
5647+        if checkstring == "":
5648+            # We special-case this, since len("") = 0, but we need
5649+            # length of 1 for the case of an empty share to work on the
5650+            # storage server, which is what a checkstring that is the
5651+            # empty string means.
5652+            self._testvs = []
5653+        else:
5654+            self._testvs = []
5655+            self._testvs.append((0, len(checkstring), "eq", checkstring))
5656+
5657+
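    # (Illustration only, not part of the patch: the two calling forms
    # described in the docstring above.)
    #
    #   proxy.set_checkstring(seqnum, root_hash)         # built from pieces
    #   proxy.set_checkstring(other.get_checkstring())   # a packed checkstring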
5658+    def __repr__(self):
5659+        return "MDMFSlotWriteProxy for share %d" % self.shnum
5660+
5661+
5662+    def get_checkstring(self):
5663+        """
5664+        Given a share number, I return a representation of what the
5665+        checkstring for that share on the server will look like.
5666+
5667+        I am mostly used for tests.
5668+        """
5669+        if self._root_hash:
5670+            roothash = self._root_hash
5671+        else:
5672+            roothash = "\x00" * 32
5673+        return struct.pack(MDMFCHECKSTRING,
5674+                           1,
5675+                           self._seqnum,
5676+                           roothash)
5677+
5678+
5679+    def put_block(self, data, segnum, salt):
5680+        """
5681+        Put the encrypted-and-encoded data segment in the slot, along
5682+        with the salt.
5683+        """
5684+        if segnum >= self._num_segments:
5685+            raise LayoutInvalid("I won't overwrite the private key")
5686+        if len(salt) != SALT_SIZE:
5687+            raise LayoutInvalid("I was given a salt of size %d, but "
5688+                                "I wanted a salt of size %d" % (len(salt), SALT_SIZE))
5689+        if segnum + 1 == self._num_segments:
5690+            if len(data) != self._tail_block_size:
5691+                raise LayoutInvalid("I was given the wrong size block to write")
5692+        elif len(data) != self._block_size:
5693+            raise LayoutInvalid("I was given the wrong size block to write")
5694+
5695+        # We want to write at len(MDMFHEADER) + segnum * block_size.
5696+
5697+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
5698+        data = salt + data
5699+
5700+        datavs = [tuple([offset, data])]
5701+        return self._write(datavs)
5702+
5703+
5704+    def put_encprivkey(self, encprivkey):
5705+        """
5706+        Put the encrypted private key in the remote slot.
5707+        """
5708+        assert self._offsets
5709+        assert self._offsets['enc_privkey']
5710+        # You shouldn't re-write the encprivkey after the block hash
5711+        # tree is written, since that could cause the private key to run
5712+        # into the block hash tree. Before it writes the block hash
5713+        # tree, the block hash tree writing method records the offset of
5714+        # the share hash chain. So that's a good indicator of whether or
5715+        # not the block hash tree has been written.
5716+        if "share_hash_chain" in self._offsets:
5717+            raise LayoutInvalid("You must write this before the block hash tree")
5718+
5719+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
5720+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
5721+        def _on_failure():
5722+            del(self._offsets['block_hash_tree'])
5723+        return self._write(datavs, on_failure=_on_failure)
5724+
5725+
5726+    def put_blockhashes(self, blockhashes):
5727+        """
5728+        Put the block hash tree in the remote slot.
5729+
5730+        The encrypted private key must be put before the block hash
5731+        tree, since we need to know how large it is to know where the
5732+        block hash tree should go. The block hash tree must be put
5733+        before the share hash chain, since its size determines the
5734+        offset of the share hash chain.
5735+        """
5736+        assert self._offsets
5737+        assert isinstance(blockhashes, list)
5738+        if "block_hash_tree" not in self._offsets:
5739+            raise LayoutInvalid("You must put the encrypted private key "
5740+                                "before you put the block hash tree")
5741+        # If written, the share hash chain causes the signature offset
5742+        # to be defined.
5743+        if "signature" in self._offsets:
5744+            raise LayoutInvalid("You must put the block hash tree before "
5745+                                "you put the share hash chain")
5746+        blockhashes_s = "".join(blockhashes)
5747+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
5748+        datavs = []
5749+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
5750+        def _on_failure():
5751+            del(self._offsets['share_hash_chain'])
5752+        return self._write(datavs, on_failure=_on_failure)
5753+
5754+
5755+    def put_sharehashes(self, sharehashes):
5756+        """
5757+        Put the share hash chain in the remote slot.
5758+
5759+        The block hash tree must be put before the share hash chain,
5760+        since we need to know where the block hash tree ends before we
5761+        can know where the share hash chain starts. The share hash chain
5762+        must be put before the signature, since the length of the packed
5763+        share hash chain determines the offset of the signature. Also,
5764+        semantically, you must know what the root of the share hash tree
5765+        is before you can generate a valid signature.
5766+        """
5767+        assert isinstance(sharehashes, dict)
5768+        if "share_hash_chain" not in self._offsets:
5769+            raise LayoutInvalid("You need to put the block hash tree before "
5770+                                "you can put the share hash chain")
5771+        # The signature comes after the share hash chain. If the
5772+        # signature has already been written, we must not write another
5773+        # share hash chain. The signature writes the verification key
5774+        # offset when it gets sent to the remote server, so we look for
5775+        # that.
5776+        if "verification_key" in self._offsets:
5777+            raise LayoutInvalid("You must write the share hash chain "
5778+                                "before you write the signature")
5779+        datavs = []
5780+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
5781+                                  for i in sorted(sharehashes.keys())])
5782+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
5783+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
5784+        def _on_failure():
5785+            del(self._offsets['signature'])
5786+        return self._write(datavs, on_failure=_on_failure)
5787+
5788+
5789+    def put_root_hash(self, roothash):
5790+        """
5791+        Put the root hash (the root of the share hash tree) in the
5792+        remote slot.
5793+        """
5794+        # It does not make sense to be able to put the root
5795+        # hash without first putting the share hashes, since you need
5796+        # the share hashes to generate the root hash.
5797+        #
5798+        # Signature is defined by the routine that places the share hash
5799+        # chain, so it's a good thing to look for in finding out whether
5800+        # or not the share hash chain exists on the remote server.
5801+        if "signature" not in self._offsets:
5802+            raise LayoutInvalid("You need to put the share hash chain "
5803+                                "before you can put the root share hash")
5804+        if len(roothash) != HASH_SIZE:
5805+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
5806+                                 % HASH_SIZE)
5807+        datavs = []
5808+        self._root_hash = roothash
5809+        # To write both of these values, we update the checkstring on
5810+        # the remote server, which includes them
5811+        checkstring = self.get_checkstring()
5812+        datavs.append(tuple([0, checkstring]))
5813+        # This write, if successful, changes the checkstring, so we need
5814+        # to update our internal checkstring to be consistent with the
5815+        # one on the server.
5816+        def _on_success():
5817+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
5818+        def _on_failure():
5819+            self._root_hash = None
5820+        return self._write(datavs,
5821+                           on_success=_on_success,
5822+                           on_failure=_on_failure)
5823+
5824+
5825+    def get_signable(self):
5826+        """
5827+        Get the first seven fields of the mutable file; the parts that
5828+        are signed.
5829+        """
5830+        if not self._root_hash:
5831+            raise LayoutInvalid("You need to set the root hash "
5832+                                "before getting something to "
5833+                                "sign")
5834+        return struct.pack(MDMFSIGNABLEHEADER,
5835+                           1,
5836+                           self._seqnum,
5837+                           self._root_hash,
5838+                           self._required_shares,
5839+                           self._total_shares,
5840+                           self._segment_size,
5841+                           self._data_length)
5842+
5843+
5844+    def put_signature(self, signature):
5845+        """
5846+        Put the signature field to the remote slot.
5847+
5848+        I require that the root hash and share hash chain have been put
5849+        to the grid before I will write the signature to the grid.
5850+        """
5851+        # It does not make sense to put a signature without first
5852+        # putting the root hash and the share hash chain (since otherwise
5853+        # the signature would be incomplete), so we don't allow that.
5854+        if "signature" not in self._offsets:
5855+            raise LayoutInvalid("You must put the share hash chain "
5856+                                "before putting the signature")
5857+        if not self._root_hash:
5858+            raise LayoutInvalid("You must complete the signed prefix "
5859+                                "before computing a signature")
5860+        # If we put the signature after we put the verification key, we
5861+        # could end up running into the verification key, and will
5862+        # probably screw up the offsets as well. So we don't allow that.
5863+        # The method that writes the verification key defines the EOF
5864+        # offset before writing the verification key, so look for that.
5865+        if "EOF" in self._offsets:
5866+            raise LayoutInvalid("You must write the signature before the verification key")
5867+
5868+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
5869+        datavs = []
5870+        datavs.append(tuple([self._offsets['signature'], signature]))
5871+        def _on_failure():
5872+            del(self._offsets['verification_key'])
5873+        return self._write(datavs, on_failure=_on_failure)
5874+
5875+
5876+    def put_verification_key(self, verification_key):
5877+        """
5878+        Put the verification key into the remote slot.
5879+
5880+        I require that the signature have been written to the storage
5881+        server before I allow the verification key to be written to the
5882+        remote server.
5883+        """
5884+        if "verification_key" not in self._offsets:
5885+            raise LayoutInvalid("You must put the signature before you "
5886+                                "can put the verification key")
5887+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
5888+        datavs = []
5889+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
5890+        def _on_failure():
5891+            del(self._offsets['EOF'])
5892+        return self._write(datavs, on_failure=_on_failure)
5893+
5894+
5895+    def finish_publishing(self):
5896+        """
5897+        Write the offset table and encoding parameters to the remote
5898+        slot, since that's the only thing we have yet to publish at this
5899+        point.
5900+        """
5901+        if "EOF" not in self._offsets:
5902+            raise LayoutInvalid("You must put the verification key before "
5903+                                "you can publish the offsets")
5904+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
5905+        offsets = struct.pack(MDMFOFFSETS,
5906+                              self._offsets['enc_privkey'],
5907+                              self._offsets['block_hash_tree'],
5908+                              self._offsets['share_hash_chain'],
5909+                              self._offsets['signature'],
5910+                              self._offsets['verification_key'],
5911+                              self._offsets['EOF'])
5912+        datavs = []
5913+        datavs.append(tuple([offsets_offset, offsets]))
5914+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5915+        params = struct.pack(">BBQQ",
5916+                             self._required_shares,
5917+                             self._total_shares,
5918+                             self._segment_size,
5919+                             self._data_length)
5920+        datavs.append(tuple([encoding_parameters_offset, params]))
5921+        return self._write(datavs)
5922+
5923+
5924+    def _write(self, datavs, on_failure=None, on_success=None):
5925+        """I write the data vectors in datavs to the remote slot."""
5926+        tw_vectors = {}
5927+        new_share = False
5928+        if not self._testvs:
5929+            self._testvs = []
5930+            self._testvs.append(tuple([0, 1, "eq", ""]))
5931+            new_share = True
5932+        if not self._written:
5933+            # Write a new checkstring to the share when we write it, so
5934+            # that we have something to check later.
5935+            new_checkstring = self.get_checkstring()
5936+            datavs.append((0, new_checkstring))
5937+            def _first_write():
5938+                self._written = True
5939+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5940+            on_success = _first_write
5941+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5942+        datalength = sum([len(x[1]) for x in datavs])
5943+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5944+                                  self._storage_index,
5945+                                  self._secrets,
5946+                                  tw_vectors,
5947+                                  self._readv)
5948+        def _result(results):
5949+            if isinstance(results, failure.Failure) or not results[0]:
5950+                # Do nothing; the write was unsuccessful.
5951+                if on_failure: on_failure()
5952+            else:
5953+                if on_success: on_success()
5954+            return results
5955+        d.addCallback(_result)
5956+        return d
5957+
5958+
5959+class MDMFSlotReadProxy:
5960+    """
5961+    I read from a mutable slot filled with data written in the MDMF data
5962+    format (which is described above).
5963+
5964+    I can be initialized with some amount of data, which I will use (if
5965+    it is valid) to eliminate some of the need to fetch it from servers.
5966+    """
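    # (Illustration only, not part of the patch: `cached` stands for any
    # share data the caller may already hold, e.g. from the filenode cache.)
    #
    #   reader = MDMFSlotReadProxy(rref, storage_index, shnum, data=cached)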
5967+    def __init__(self,
5968+                 rref,
5969+                 storage_index,
5970+                 shnum,
5971+                 data=""):
5972+        # Start the initialization process.
5973+        self._rref = rref
5974+        self._storage_index = storage_index
5975+        self.shnum = shnum
5976+
5977+        # Before doing anything, the reader is probably going to want to
5978+        # verify that the signature is correct. To do that, they'll need
5979+        # the verification key, and the signature. To get those, we'll
5980+        # need the offset table. So fetch the offset table on the
5981+        # assumption that that will be the first thing that a reader is
5982+        # going to do.
5983+
5984+        # The fact that these encoding parameters are None tells us
5985+        # that we haven't yet fetched them from the remote share, so we
5986+        # should. We could just not set them, but the checks will be
5987+        # easier to read if we don't have to use hasattr.
5988+        self._version_number = None
5989+        self._sequence_number = None
5990+        self._root_hash = None
5991+        # Filled in if we're dealing with an SDMF file. Unused
5992+        # otherwise.
5993+        self._salt = None
5994+        self._required_shares = None
5995+        self._total_shares = None
5996+        self._segment_size = None
5997+        self._data_length = None
5998+        self._offsets = None
5999+
6000+        # If the user has chosen to initialize us with some data, we'll
6001+        # try to satisfy subsequent data requests with that data before
6002+        # asking the storage server for it.
6003+        self._data = data
6004+        # The filenode's cache returns None when there isn't any cached
6005+        # data, but the way we index the cached data requires a string,
6006+        # so convert None to "".
6007+        if self._data is None:
6008+            self._data = ""
6009+
6010+        self._queue_observers = observer.ObserverList()
6011+        self._queue_errbacks = observer.ObserverList()
6012+        self._readvs = []
6013+
6014+
6015+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
6016+        """
6017+        I fetch the offset table and the header from the remote slot if
6018+        I don't already have them. If I do have them, I do nothing and
6019+        return an empty Deferred.
6020+        """
6021+        if self._offsets:
6022+            return defer.succeed(None)
6023+        # At this point, we may be either SDMF or MDMF. Fetching 107
6024+        # bytes will be enough to get header and offsets for both SDMF and
6025+        # MDMF, though we'll be left with 4 more bytes than we
6026+        # need if this ends up being MDMF. This is probably less
6027+        # expensive than the cost of a second roundtrip.
6028+        readvs = [(0, 107)]
6029+        d = self._read(readvs, force_remote)
6030+        d.addCallback(self._process_encoding_parameters)
6031+        d.addCallback(self._process_offsets)
6032+        return d
6033+
6034+
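# A sketch checking the 107-byte figure used above, built from the SDMF
# formats that _process_encoding_parameters and _process_offsets unpack
# below; the MDMF header is assumed to fit within the same 107 bytes,
# which is why a few of the fetched bytes may go unused in that case.
import struct
sdmf_prefix_size = struct.calcsize(">BQ32s16s BBQQ")  # SDMF signed prefix: 75 bytes
sdmf_offsets_size = struct.calcsize(">LLLLQQ")        # SDMF offset table: 32 bytes
assert sdmf_prefix_size + sdmf_offsets_size == 107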
6035+    def _process_encoding_parameters(self, encoding_parameters):
6036+        assert self.shnum in encoding_parameters
6037+        encoding_parameters = encoding_parameters[self.shnum][0]
6038+        # The first byte is the version number. It will tell us what
6039+        # to do next.
6040+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
6041+        if verno == MDMF_VERSION:
6042+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
6043+            (verno,
6044+             seqnum,
6045+             root_hash,
6046+             k,
6047+             n,
6048+             segsize,
6049+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
6050+                                      encoding_parameters[:read_size])
6051+            if segsize == 0 and datalen == 0:
6052+                # Empty file, no segments.
6053+                self._num_segments = 0
6054+            else:
6055+                self._num_segments = mathutil.div_ceil(datalen, segsize)
6056+
6057+        elif verno == SDMF_VERSION:
6058+            read_size = SIGNED_PREFIX_LENGTH
6059+            (verno,
6060+             seqnum,
6061+             root_hash,
6062+             salt,
6063+             k,
6064+             n,
6065+             segsize,
6066+             datalen) = struct.unpack(">BQ32s16s BBQQ",
6067+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
6068+            self._salt = salt
6069+            if segsize == 0 and datalen == 0:
6070+                # empty file
6071+                self._num_segments = 0
6072+            else:
6073+                # non-empty SDMF files have one segment.
6074+                self._num_segments = 1
6075+        else:
6076+            raise UnknownVersionError("You asked me to read mutable file "
6077+                                      "version %d, but I only understand "
6078+                                      "%d and %d" % (verno, SDMF_VERSION,
6079+                                                     MDMF_VERSION))
6080+
6081+        self._version_number = verno
6082+        self._sequence_number = seqnum
6083+        self._root_hash = root_hash
6084+        self._required_shares = k
6085+        self._total_shares = n
6086+        self._segment_size = segsize
6087+        self._data_length = datalen
6088+
6089+        self._block_size = self._segment_size / self._required_shares
6090+        # We can upload empty files, and need to account for this fact
6091+        # so as to avoid zero-division and zero-modulo errors.
6092+        if datalen > 0:
6093+            tail_size = self._data_length % self._segment_size
6094+        else:
6095+            tail_size = 0
6096+        if not tail_size:
6097+            self._tail_block_size = self._block_size
6098+        else:
6099+            self._tail_block_size = mathutil.next_multiple(tail_size,
6100+                                                    self._required_shares)
6101+            self._tail_block_size /= self._required_shares
6102+
6103+        return encoding_parameters
6104+
6105+
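# A worked sketch of the segment arithmetic above, using the encoding
# parameters exercised by the MDMFProxies tests below (k=3, segsize=6,
# datalen=36) and local stand-ins for mathutil.div_ceil and
# mathutil.next_multiple.
def div_ceil(n, d):
    return (n + d - 1) // d
def next_multiple(n, k):
    return div_ceil(n, k) * k
k, segsize, datalen = 3, 6, 36
num_segments = div_ceil(datalen, segsize)   # 6 segments
block_size = segsize // k                   # each share holds 2 bytes per segment
tail_size = datalen % segsize               # 0 here, so the tail block is full-sized
tail_block_size = block_size if not tail_size else next_multiple(tail_size, k) // k
assert (num_segments, block_size, tail_block_size) == (6, 2, 2)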
6106+    def _process_offsets(self, offsets):
6107+        if self._version_number == 0:
6108+            read_size = OFFSETS_LENGTH
6109+            read_offset = SIGNED_PREFIX_LENGTH
6110+            end = read_size + read_offset
6111+            (signature,
6112+             share_hash_chain,
6113+             block_hash_tree,
6114+             share_data,
6115+             enc_privkey,
6116+             EOF) = struct.unpack(">LLLLQQ",
6117+                                  offsets[read_offset:end])
6118+            self._offsets = {}
6119+            self._offsets['signature'] = signature
6120+            self._offsets['share_data'] = share_data
6121+            self._offsets['block_hash_tree'] = block_hash_tree
6122+            self._offsets['share_hash_chain'] = share_hash_chain
6123+            self._offsets['enc_privkey'] = enc_privkey
6124+            self._offsets['EOF'] = EOF
6125+
6126+        elif self._version_number == 1:
6127+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
6128+            read_length = MDMFOFFSETS_LENGTH
6129+            end = read_offset + read_length
6130+            (encprivkey,
6131+             blockhashes,
6132+             sharehashes,
6133+             signature,
6134+             verification_key,
6135+             eof) = struct.unpack(MDMFOFFSETS,
6136+                                  offsets[read_offset:end])
6137+            self._offsets = {}
6138+            self._offsets['enc_privkey'] = encprivkey
6139+            self._offsets['block_hash_tree'] = blockhashes
6140+            self._offsets['share_hash_chain'] = sharehashes
6141+            self._offsets['signature'] = signature
6142+            self._offsets['verification_key'] = verification_key
6143+            self._offsets['EOF'] = eof
6144+
6145+
6146+    def get_block_and_salt(self, segnum, queue=False):
6147+        """
6148+        I return (block, salt), where block is the block data and
6149+        salt is the salt used to encrypt that segment.
6150+        """
6151+        d = self._maybe_fetch_offsets_and_header()
6152+        def _then(ignored):
6153+            if self._version_number == 1:
6154+                base_share_offset = MDMFHEADERSIZE
6155+            else:
6156+                base_share_offset = self._offsets['share_data']
6157+
6158+            if segnum + 1 > self._num_segments:
6159+                raise LayoutInvalid("Not a valid segment number")
6160+
6161+            if self._version_number == 0:
6162+                share_offset = base_share_offset + self._block_size * segnum
6163+            else:
6164+                share_offset = base_share_offset + (self._block_size + \
6165+                                                    SALT_SIZE) * segnum
6166+            if segnum + 1 == self._num_segments:
6167+                data = self._tail_block_size
6168+            else:
6169+                data = self._block_size
6170+
6171+            if self._version_number == 1:
6172+                data += SALT_SIZE
6173+
6174+            readvs = [(share_offset, data)]
6175+            return readvs
6176+        d.addCallback(_then)
6177+        d.addCallback(lambda readvs:
6178+            self._read(readvs, queue=queue))
6179+        def _process_results(results):
6180+            assert self.shnum in results
6181+            if self._version_number == 0:
6182+                # We only read the share data, but we know the salt from
6183+                # when we fetched the header
6184+                data = results[self.shnum]
6185+                if not data:
6186+                    data = ""
6187+                else:
6188+                    assert len(data) == 1
6189+                    data = data[0]
6190+                salt = self._salt
6191+            else:
6192+                data = results[self.shnum]
6193+                if not data:
6194+                    salt = data = ""
6195+                else:
6196+                    salt_and_data = results[self.shnum][0]
6197+                    salt = salt_and_data[:SALT_SIZE]
6198+                    data = salt_and_data[SALT_SIZE:]
6199+            return data, salt
6200+        d.addCallback(_process_results)
6201+        return d
6202+
6203+
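# A sketch of the MDMF share-offset arithmetic used in get_block_and_salt
# above: each segment contributes a 16-byte per-segment salt followed by
# block_size bytes of block data, laid out consecutively after the header.
# MDMFHEADERSIZE and block_size below are assumed values for illustration.
SALT_SIZE = 16
MDMFHEADERSIZE = 107    # assumed combined size of header and offset table
block_size = 2
def mdmf_block_readv(segnum):
    share_offset = MDMFHEADERSIZE + (block_size + SALT_SIZE) * segnum
    return (share_offset, block_size + SALT_SIZE)
# Segment 0 is read at (107, 18), segment 1 at (125, 18); the fetched
# bytes then split as data[:SALT_SIZE] (salt) and data[SALT_SIZE:] (block).
assert mdmf_block_readv(1) == (125, 18)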
6204+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
6205+        """
6206+        I return the block hash tree
6207+
6208+        I take an optional argument, needed, which is a set of indices
6209+        correspond to hashes that I should fetch. If this argument is
6210+        that correspond to hashes that I should fetch. If this argument is
6211+        may attempt to fetch fewer hashes, based on what needed says
6212+        that I should do. Note that I may fetch as many hashes as I
6213+        want, so long as the set of hashes that I do fetch is a superset
6214+        of the ones that I am asked for, so callers should be prepared
6215+        to tolerate additional hashes.
6216+        """
6217+        # TODO: Return only the parts of the block hash tree necessary
6218+        # to validate the blocknum provided?
6219+        # This is a good idea, but it is hard to implement correctly. It
6220+        # is bad to fetch any one block hash more than once, so we
6221+        # probably just want to fetch the whole thing at once and then
6222+        # serve it.
6223+        if needed == set([]):
6224+            return defer.succeed([])
6225+        d = self._maybe_fetch_offsets_and_header()
6226+        def _then(ignored):
6227+            blockhashes_offset = self._offsets['block_hash_tree']
6228+            if self._version_number == 1:
6229+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
6230+            else:
6231+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
6232+            readvs = [(blockhashes_offset, blockhashes_length)]
6233+            return readvs
6234+        d.addCallback(_then)
6235+        d.addCallback(lambda readvs:
6236+            self._read(readvs, queue=queue, force_remote=force_remote))
6237+        def _build_block_hash_tree(results):
6238+            assert self.shnum in results
6239+
6240+            rawhashes = results[self.shnum][0]
6241+            results = [rawhashes[i:i+HASH_SIZE]
6242+                       for i in range(0, len(rawhashes), HASH_SIZE)]
6243+            return results
6244+        d.addCallback(_build_block_hash_tree)
6245+        return d
6246+
6247+
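# A sketch of how _build_block_hash_tree above slices the raw read result
# into 32-byte hashes; HASH_SIZE is assumed to be 32, matching the 32-byte
# hashes used by the tests below.
HASH_SIZE = 32
rawhashes = "a" * 32 + "b" * 32 + "c" * 32
blockhashes = [rawhashes[i:i + HASH_SIZE]
               for i in range(0, len(rawhashes), HASH_SIZE)]
assert blockhashes == ["a" * 32, "b" * 32, "c" * 32]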
6248+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
6249+        """
6250+        I return the part of the share hash chain needed to validate
6251+        this share.
6252+
6253+        I take an optional argument, needed. Needed is a set of indices
6254+        that correspond to the hashes that I should fetch. If needed is
6255+        not present, I will fetch and return the entire share hash
6256+        chain. Otherwise, I may fetch and return any part of the share
6257+        hash chain that is a superset of the part that I am asked to
6258+        fetch. Callers should be prepared to deal with more hashes than
6259+        they've asked for.
6260+        """
6261+        if needed == set([]):
6262+            return defer.succeed([])
6263+        d = self._maybe_fetch_offsets_and_header()
6264+
6265+        def _make_readvs(ignored):
6266+            sharehashes_offset = self._offsets['share_hash_chain']
6267+            if self._version_number == 0:
6268+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
6269+            else:
6270+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
6271+            readvs = [(sharehashes_offset, sharehashes_length)]
6272+            return readvs
6273+        d.addCallback(_make_readvs)
6274+        d.addCallback(lambda readvs:
6275+            self._read(readvs, queue=queue, force_remote=force_remote))
6276+        def _build_share_hash_chain(results):
6277+            assert self.shnum in results
6278+
6279+            sharehashes = results[self.shnum][0]
6280+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
6281+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
6282+            results = dict([struct.unpack(">H32s", data)
6283+                            for data in results])
6284+            return results
6285+        d.addCallback(_build_share_hash_chain)
6286+        return d
6287+
6288+
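# A sketch of the share hash chain parsing done in _build_share_hash_chain
# above: the chain is a flat sequence of (2-byte node number, 32-byte hash)
# records, each unpacked with ">H32s" and collected into a dict mapping
# node number to hash.
import struct
chain_bytes = struct.pack(">H32s", 1, "x" * 32) + struct.pack(">H32s", 2, "y" * 32)
records = [chain_bytes[i:i + 34] for i in range(0, len(chain_bytes), 34)]
chain = dict(struct.unpack(">H32s", r) for r in records)
assert chain == {1: "x" * 32, 2: "y" * 32}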
6289+    def get_encprivkey(self, queue=False):
6290+        """
6291+        I return the encrypted private key.
6292+        """
6293+        d = self._maybe_fetch_offsets_and_header()
6294+
6295+        def _make_readvs(ignored):
6296+            privkey_offset = self._offsets['enc_privkey']
6297+            if self._version_number == 0:
6298+                privkey_length = self._offsets['EOF'] - privkey_offset
6299+            else:
6300+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
6301+            readvs = [(privkey_offset, privkey_length)]
6302+            return readvs
6303+        d.addCallback(_make_readvs)
6304+        d.addCallback(lambda readvs:
6305+            self._read(readvs, queue=queue))
6306+        def _process_results(results):
6307+            assert self.shnum in results
6308+            privkey = results[self.shnum][0]
6309+            return privkey
6310+        d.addCallback(_process_results)
6311+        return d
6312+
6313+
6314+    def get_signature(self, queue=False):
6315+        """
6316+        I return the signature of my share.
6317+        """
6318+        d = self._maybe_fetch_offsets_and_header()
6319+
6320+        def _make_readvs(ignored):
6321+            signature_offset = self._offsets['signature']
6322+            if self._version_number == 1:
6323+                signature_length = self._offsets['verification_key'] - signature_offset
6324+            else:
6325+                signature_length = self._offsets['share_hash_chain'] - signature_offset
6326+            readvs = [(signature_offset, signature_length)]
6327+            return readvs
6328+        d.addCallback(_make_readvs)
6329+        d.addCallback(lambda readvs:
6330+            self._read(readvs, queue=queue))
6331+        def _process_results(results):
6332+            assert self.shnum in results
6333+            signature = results[self.shnum][0]
6334+            return signature
6335+        d.addCallback(_process_results)
6336+        return d
6337+
6338+
6339+    def get_verification_key(self, queue=False):
6340+        """
6341+        I return the verification key.
6342+        """
6343+        d = self._maybe_fetch_offsets_and_header()
6344+
6345+        def _make_readvs(ignored):
6346+            if self._version_number == 1:
6347+                vk_offset = self._offsets['verification_key']
6348+                vk_length = self._offsets['EOF'] - vk_offset
6349+            else:
6350+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
6351+                vk_length = self._offsets['signature'] - vk_offset
6352+            readvs = [(vk_offset, vk_length)]
6353+            return readvs
6354+        d.addCallback(_make_readvs)
6355+        d.addCallback(lambda readvs:
6356+            self._read(readvs, queue=queue))
6357+        def _process_results(results):
6358+            assert self.shnum in results
6359+            verification_key = results[self.shnum][0]
6360+            return verification_key
6361+        d.addCallback(_process_results)
6362+        return d
6363+
6364+
6365+    def get_encoding_parameters(self):
6366+        """
6367+        I return (k, n, segsize, datalen)
6368+        """
6369+        d = self._maybe_fetch_offsets_and_header()
6370+        d.addCallback(lambda ignored:
6371+            (self._required_shares,
6372+             self._total_shares,
6373+             self._segment_size,
6374+             self._data_length))
6375+        return d
6376+
6377+
6378+    def get_seqnum(self):
6379+        """
6380+        I return the sequence number for this share.
6381+        """
6382+        d = self._maybe_fetch_offsets_and_header()
6383+        d.addCallback(lambda ignored:
6384+            self._sequence_number)
6385+        return d
6386+
6387+
6388+    def get_root_hash(self):
6389+        """
6390+        I return the root of the block hash tree
6391+        """
6392+        d = self._maybe_fetch_offsets_and_header()
6393+        d.addCallback(lambda ignored: self._root_hash)
6394+        return d
6395+
6396+
6397+    def get_checkstring(self):
6398+        """
6399+        I return the packed representation of the following:
6400+
6401+            - version number
6402+            - sequence number
6403+            - root hash
6404+            - salt hash
6405+
6406+        which my users use as a checkstring to detect other writers.
6407+        """
6408+        d = self._maybe_fetch_offsets_and_header()
6409+        def _build_checkstring(ignored):
6410+            if self._salt:
6411+                checkstring = struct.pack(PREFIX,
6412+                                         self._version_number,
6413+                                         self._sequence_number,
6414+                                         self._root_hash,
6415+                                         self._salt)
6416+            else:
6417+                checkstring = struct.pack(MDMFCHECKSTRING,
6418+                                          self._version_number,
6419+                                          self._sequence_number,
6420+                                          self._root_hash)
6421+
6422+            return checkstring
6423+        d.addCallback(_build_checkstring)
6424+        return d
6425+
6426+
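# A sketch of the MDMF checkstring built in get_checkstring above, assuming
# MDMFCHECKSTRING is ">BQ32s" (version, sequence number, root hash), which
# matches the checkstring packed by build_test_mdmf_share in the tests below.
import struct
MDMFCHECKSTRING = ">BQ32s"
checkstring = struct.pack(MDMFCHECKSTRING, 1, 0, "r" * 32)
assert len(checkstring) == 41
assert struct.unpack(MDMFCHECKSTRING, checkstring) == (1, 0, "r" * 32)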
6427+    def get_prefix(self, force_remote):
6428+        d = self._maybe_fetch_offsets_and_header(force_remote)
6429+        d.addCallback(lambda ignored:
6430+            self._build_prefix())
6431+        return d
6432+
6433+
6434+    def _build_prefix(self):
6435+        # The prefix is another name for the part of the remote share
6436+        # that gets signed. It consists of everything up to and
6437+        # including the datalength, packed by struct.
6438+        if self._version_number == SDMF_VERSION:
6439+            return struct.pack(SIGNED_PREFIX,
6440+                           self._version_number,
6441+                           self._sequence_number,
6442+                           self._root_hash,
6443+                           self._salt,
6444+                           self._required_shares,
6445+                           self._total_shares,
6446+                           self._segment_size,
6447+                           self._data_length)
6448+
6449+        else:
6450+            return struct.pack(MDMFSIGNABLEHEADER,
6451+                           self._version_number,
6452+                           self._sequence_number,
6453+                           self._root_hash,
6454+                           self._required_shares,
6455+                           self._total_shares,
6456+                           self._segment_size,
6457+                           self._data_length)
6458+
6459+
6460+    def _get_offsets_tuple(self):
6461+        # The offsets tuple is another component of the version
6462+        # information tuple. It is basically our offsets dictionary,
6463+        # itemized and in a tuple.
6464+        return self._offsets.copy()
6465+
6466+
6467+    def get_verinfo(self):
6468+        """
6469+        I return my verinfo tuple. This is used by the ServermapUpdater
6470+        to keep track of versions of mutable files.
6471+
6472+        The verinfo tuple for MDMF files contains:
6473+            - seqnum
6474+            - root hash
6475+            - a blank (nothing)
6476+            - segsize
6477+            - datalen
6478+            - k
6479+            - n
6480+            - prefix (the thing that you sign)
6481+            - a tuple of offsets
6482+
6483+        We include a blank where the salt would go in the MDMF verinfo so
6484+        that it has the same shape as the SDMF verinfo; this simplifies
6485+        processing of version information tuples. The verinfo tuple for
6486+        SDMF files is the same, but contains the file's 16-byte IV (salt)
6487+        in place of the blank.
6488+        """
6489+        d = self._maybe_fetch_offsets_and_header()
6490+        def _build_verinfo(ignored):
6491+            if self._version_number == SDMF_VERSION:
6492+                salt_to_use = self._salt
6493+            else:
6494+                salt_to_use = None
6495+            return (self._sequence_number,
6496+                    self._root_hash,
6497+                    salt_to_use,
6498+                    self._segment_size,
6499+                    self._data_length,
6500+                    self._required_shares,
6501+                    self._total_shares,
6502+                    self._build_prefix(),
6503+                    self._get_offsets_tuple())
6504+        d.addCallback(_build_verinfo)
6505+        return d
6506+
6507+
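# A sketch of one way a caller such as the ServermapUpdater might use the
# verinfo tuple built above; the sequence number is its first element, and
# a higher sequence number indicates a newer version. The values here are
# illustrative only.
verinfo_a = (1, "r" * 32, None, 6, 36, 3, 10, "prefix-a", {})
verinfo_b = (2, "s" * 32, None, 6, 36, 3, 10, "prefix-b", {})
newest = max([verinfo_a, verinfo_b], key=lambda v: v[0])
assert newest is verinfo_b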
6508+    def flush(self):
6509+        """
6510+        I flush my queue of read vectors.
6511+        """
6512+        d = self._read(self._readvs)
6513+        def _then(results):
6514+            self._readvs = []
6515+            if isinstance(results, failure.Failure):
6516+                self._queue_errbacks.notify(results)
6517+            else:
6518+                self._queue_observers.notify(results)
6519+            self._queue_observers = observer.ObserverList()
6520+            self._queue_errbacks = observer.ObserverList()
6521+        d.addBoth(_then)
6522+
6523+
6524+    def _read(self, readvs, force_remote=False, queue=False):
6525+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
6526+        # TODO: It's entirely possible to tweak this so that it just
6527+        # fulfills the requests that it can, and not demand that all
6528+        # requests are satisfiable before running it.
6529+        if not unsatisfiable and not force_remote:
6530+            results = [self._data[offset:offset+length]
6531+                       for (offset, length) in readvs]
6532+            results = {self.shnum: results}
6533+            return defer.succeed(results)
6534+        else:
6535+            if queue:
6536+                start = len(self._readvs)
6537+                self._readvs += readvs
6538+                end = len(self._readvs)
6539+                def _get_results(results, start, end):
6540+                    if self.shnum not in results:
6541+                        return {self.shnum: [""]}
6542+                    return {self.shnum: results[self.shnum][start:end]}
6543+                d = defer.Deferred()
6544+                d.addCallback(_get_results, start, end)
6545+                self._queue_observers.subscribe(d.callback)
6546+                self._queue_errbacks.subscribe(d.errback)
6547+                return d
6548+            return self._rref.callRemote("slot_readv",
6549+                                         self._storage_index,
6550+                                         [self.shnum],
6551+                                         readvs)
6552+
6553+
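# A sketch of the cache-first behaviour of _read above: when every requested
# (offset, length) range fits inside the locally supplied data, the read is
# answered from that string instead of going to the storage server.
def read_from_cache(data, shnum, readvs):
    if any(offset + length > len(data) for (offset, length) in readvs):
        return None  # some range is unsatisfiable locally; fall back to slot_readv
    return {shnum: [data[offset:offset + length] for (offset, length) in readvs]}
assert read_from_cache("abcdef", 0, [(0, 2), (4, 2)]) == {0: ["ab", "ef"]}
assert read_from_cache("abc", 0, [(0, 10)]) is None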
6554+    def is_sdmf(self):
6555+        """I tell my caller whether my remote file is SDMF or MDMF
6556+        """
6557+        d = self._maybe_fetch_offsets_and_header()
6558+        d.addCallback(lambda ignored:
6559+            self._version_number == 0)
6560+        return d
6561+
6562+
6563+class LayoutInvalid(Exception):
6564+    """
6565+    This isn't a valid MDMF mutable file
6566+    """
6567hunk ./src/allmydata/test/test_storage.py 2
6568 
6569-import time, os.path, stat, re, simplejson, struct
6570+import time, os.path, stat, re, simplejson, struct, shutil
6571 
6572 from twisted.trial import unittest
6573 
6574hunk ./src/allmydata/test/test_storage.py 22
6575 from allmydata.storage.expirer import LeaseCheckingCrawler
6576 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
6577      ReadBucketProxy
6578-from allmydata.interfaces import BadWriteEnablerError
6579-from allmydata.test.common import LoggingServiceParent
6580+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
6581+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
6582+                                     SIGNED_PREFIX, MDMFHEADER, \
6583+                                     MDMFOFFSETS, SDMFSlotWriteProxy
6584+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
6585+                                 SDMF_VERSION
6586+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
6587 from allmydata.test.common_web import WebRenderingMixin
6588 from allmydata.web.storage import StorageStatus, remove_prefix
6589 
6590hunk ./src/allmydata/test/test_storage.py 106
6591 
6592 class RemoteBucket:
6593 
6594+    def __init__(self):
6595+        self.read_count = 0
6596+        self.write_count = 0
6597+
6598     def callRemote(self, methname, *args, **kwargs):
6599         def _call():
6600             meth = getattr(self.target, "remote_" + methname)
6601hunk ./src/allmydata/test/test_storage.py 114
6602             return meth(*args, **kwargs)
6603+
6604+        if methname == "slot_readv":
6605+            self.read_count += 1
6606+        if "writev" in methname:
6607+            self.write_count += 1
6608+
6609         return defer.maybeDeferred(_call)
6610 
6611hunk ./src/allmydata/test/test_storage.py 122
6612+
6613 class BucketProxy(unittest.TestCase):
6614     def make_bucket(self, name, size):
6615         basedir = os.path.join("storage", "BucketProxy", name)
6616hunk ./src/allmydata/test/test_storage.py 1299
6617         self.failUnless(os.path.exists(prefixdir), prefixdir)
6618         self.failIf(os.path.exists(bucketdir), bucketdir)
6619 
6620+
6621+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
6622+    def setUp(self):
6623+        self.sparent = LoggingServiceParent()
6624+        self._lease_secret = itertools.count()
6625+        self.ss = self.create("MDMFProxies storage test server")
6626+        self.rref = RemoteBucket()
6627+        self.rref.target = self.ss
6628+        self.secrets = (self.write_enabler("we_secret"),
6629+                        self.renew_secret("renew_secret"),
6630+                        self.cancel_secret("cancel_secret"))
6631+        self.segment = "aaaaaa"
6632+        self.block = "aa"
6633+        self.salt = "a" * 16
6634+        self.block_hash = "a" * 32
6635+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
6636+        self.share_hash = self.block_hash
6637+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
6638+        self.signature = "foobarbaz"
6639+        self.verification_key = "vvvvvv"
6640+        self.encprivkey = "private"
6641+        self.root_hash = self.block_hash
6642+        self.salt_hash = self.root_hash
6643+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
6644+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
6645+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
6646+        # blockhashes and salt hashes are serialized in the same way,
6647+        # only we lop off the first element and store that in the
6648+        # header.
6649+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
6650+
6651+
6652+    def tearDown(self):
6653+        self.sparent.stopService()
6654+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
6655+
6656+
6657+    def write_enabler(self, we_tag):
6658+        return hashutil.tagged_hash("we_blah", we_tag)
6659+
6660+
6661+    def renew_secret(self, tag):
6662+        return hashutil.tagged_hash("renew_blah", str(tag))
6663+
6664+
6665+    def cancel_secret(self, tag):
6666+        return hashutil.tagged_hash("cancel_blah", str(tag))
6667+
6668+
6669+    def workdir(self, name):
6670+        basedir = os.path.join("storage", "MutableServer", name)
6671+        return basedir
6672+
6673+
6674+    def create(self, name):
6675+        workdir = self.workdir(name)
6676+        ss = StorageServer(workdir, "\x00" * 20)
6677+        ss.setServiceParent(self.sparent)
6678+        return ss
6679+
6680+
6681+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
6682+        # Start with the checkstring
6683+        data = struct.pack(">BQ32s",
6684+                           1,
6685+                           0,
6686+                           self.root_hash)
6687+        self.checkstring = data
6688+        # Next, the encoding parameters
6689+        if tail_segment:
6690+            data += struct.pack(">BBQQ",
6691+                                3,
6692+                                10,
6693+                                6,
6694+                                33)
6695+        elif empty:
6696+            data += struct.pack(">BBQQ",
6697+                                3,
6698+                                10,
6699+                                0,
6700+                                0)
6701+        else:
6702+            data += struct.pack(">BBQQ",
6703+                                3,
6704+                                10,
6705+                                6,
6706+                                36)
6707+        # Now we'll build the offsets.
6708+        sharedata = ""
6709+        if not tail_segment and not empty:
6710+            for i in xrange(6):
6711+                sharedata += self.salt + self.block
6712+        elif tail_segment:
6713+            for i in xrange(5):
6714+                sharedata += self.salt + self.block
6715+            sharedata += self.salt + "a"
6716+
6717+        # The encrypted private key comes after the shares + salts
6718+        offset_size = struct.calcsize(MDMFOFFSETS)
6719+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
6720+        # The blockhashes come after the private key
6721+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
6722+        # The sharehashes come after the block hash tree
6723+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
6724+        # The signature comes after the share hash chain
6725+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
6726+        # The verification key comes after the signature
6727+        verification_offset = signature_offset + len(self.signature)
6728+        # The EOF comes after the verification key
6729+        eof_offset = verification_offset + len(self.verification_key)
6730+        data += struct.pack(MDMFOFFSETS,
6731+                            encrypted_private_key_offset,
6732+                            blockhashes_offset,
6733+                            sharehashes_offset,
6734+                            signature_offset,
6735+                            verification_offset,
6736+                            eof_offset)
6737+        self.offsets = {}
6738+        self.offsets['enc_privkey'] = encrypted_private_key_offset
6739+        self.offsets['block_hash_tree'] = blockhashes_offset
6740+        self.offsets['share_hash_chain'] = sharehashes_offset
6741+        self.offsets['signature'] = signature_offset
6742+        self.offsets['verification_key'] = verification_offset
6743+        self.offsets['EOF'] = eof_offset
6744+        # Next, we'll add in the salts and share data,
6745+        data += sharedata
6746+        # the private key,
6747+        data += self.encprivkey
6748+        # the block hash tree,
6749+        data += self.block_hash_tree_s
6750+        # the share hash chain,
6751+        data += self.share_hash_chain_s
6752+        # the signature,
6753+        data += self.signature
6754+        # and the verification key
6755+        data += self.verification_key
6756+        return data
6757+
6758+
6759+    def write_test_share_to_server(self,
6760+                                   storage_index,
6761+                                   tail_segment=False,
6762+                                   empty=False):
6763+        """
6764+        I write some data to self.ss for the read tests to read.
6765+
6766+        If tail_segment=True, then I will write a share that has a
6767+        smaller tail segment than other segments.
6768+        """
6769+        write = self.ss.remote_slot_testv_and_readv_and_writev
6770+        data = self.build_test_mdmf_share(tail_segment, empty)
6771+        # Finally, we write the whole thing to the storage server in one
6772+        # pass.
6773+        testvs = [(0, 1, "eq", "")]
6774+        tws = {}
6775+        tws[0] = (testvs, [(0, data)], None)
6776+        readv = [(0, 1)]
6777+        results = write(storage_index, self.secrets, tws, readv)
6778+        self.failUnless(results[0])
6779+
6780+
6781+    def build_test_sdmf_share(self, empty=False):
6782+        if empty:
6783+            sharedata = ""
6784+        else:
6785+            sharedata = self.segment * 6
6786+        self.sharedata = sharedata
6787+        blocksize = len(sharedata) / 3
6788+        block = sharedata[:blocksize]
6789+        self.blockdata = block
6790+        prefix = struct.pack(">BQ32s16s BBQQ",
6791+                             0, # version,
6792+                             0,
6793+                             self.root_hash,
6794+                             self.salt,
6795+                             3,
6796+                             10,
6797+                             len(sharedata),
6798+                             len(sharedata),
6799+                            )
6800+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
6801+        signature_offset = post_offset + len(self.verification_key)
6802+        sharehashes_offset = signature_offset + len(self.signature)
6803+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
6804+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
6805+        encprivkey_offset = sharedata_offset + len(block)
6806+        eof_offset = encprivkey_offset + len(self.encprivkey)
6807+        offsets = struct.pack(">LLLLQQ",
6808+                              signature_offset,
6809+                              sharehashes_offset,
6810+                              blockhashes_offset,
6811+                              sharedata_offset,
6812+                              encprivkey_offset,
6813+                              eof_offset)
6814+        final_share = "".join([prefix,
6815+                           offsets,
6816+                           self.verification_key,
6817+                           self.signature,
6818+                           self.share_hash_chain_s,
6819+                           self.block_hash_tree_s,
6820+                           block,
6821+                           self.encprivkey])
6822+        self.offsets = {}
6823+        self.offsets['signature'] = signature_offset
6824+        self.offsets['share_hash_chain'] = sharehashes_offset
6825+        self.offsets['block_hash_tree'] = blockhashes_offset
6826+        self.offsets['share_data'] = sharedata_offset
6827+        self.offsets['enc_privkey'] = encprivkey_offset
6828+        self.offsets['EOF'] = eof_offset
6829+        return final_share
6830+
6831+
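# A side-by-side sketch of the two share body layouts built by the helpers
# above (header and offset table come first in both formats):
mdmf_body_order = ["salt+block pairs", "encrypted private key",
                   "block hash tree", "share hash chain",
                   "signature", "verification key"]
sdmf_body_order = ["verification key", "signature", "share hash chain",
                   "block hash tree", "share data", "encrypted private key"]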
6832+    def write_sdmf_share_to_server(self,
6833+                                   storage_index,
6834+                                   empty=False):
6835+        # Some tests need SDMF shares to verify that we can still
6836+        # read them. This method writes one, built by build_test_sdmf_share.
6837+        assert self.rref
6838+        write = self.ss.remote_slot_testv_and_readv_and_writev
6839+        share = self.build_test_sdmf_share(empty)
6840+        testvs = [(0, 1, "eq", "")]
6841+        tws = {}
6842+        tws[0] = (testvs, [(0, share)], None)
6843+        readv = []
6844+        results = write(storage_index, self.secrets, tws, readv)
6845+        self.failUnless(results[0])
6846+
6847+
6848+    def test_read(self):
6849+        self.write_test_share_to_server("si1")
6850+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6851+        # Check that every method returns what we expect it to.
6852+        d = defer.succeed(None)
6853+        def _check_block_and_salt((block, salt)):
6854+            self.failUnlessEqual(block, self.block)
6855+            self.failUnlessEqual(salt, self.salt)
6856+
6857+        for i in xrange(6):
6858+            d.addCallback(lambda ignored, i=i:
6859+                mr.get_block_and_salt(i))
6860+            d.addCallback(_check_block_and_salt)
6861+
6862+        d.addCallback(lambda ignored:
6863+            mr.get_encprivkey())
6864+        d.addCallback(lambda encprivkey:
6865+            self.failUnlessEqual(self.encprivkey, encprivkey))
6866+
6867+        d.addCallback(lambda ignored:
6868+            mr.get_blockhashes())
6869+        d.addCallback(lambda blockhashes:
6870+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6871+
6872+        d.addCallback(lambda ignored:
6873+            mr.get_sharehashes())
6874+        d.addCallback(lambda sharehashes:
6875+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6876+
6877+        d.addCallback(lambda ignored:
6878+            mr.get_signature())
6879+        d.addCallback(lambda signature:
6880+            self.failUnlessEqual(signature, self.signature))
6881+
6882+        d.addCallback(lambda ignored:
6883+            mr.get_verification_key())
6884+        d.addCallback(lambda verification_key:
6885+            self.failUnlessEqual(verification_key, self.verification_key))
6886+
6887+        d.addCallback(lambda ignored:
6888+            mr.get_seqnum())
6889+        d.addCallback(lambda seqnum:
6890+            self.failUnlessEqual(seqnum, 0))
6891+
6892+        d.addCallback(lambda ignored:
6893+            mr.get_root_hash())
6894+        d.addCallback(lambda root_hash:
6895+            self.failUnlessEqual(self.root_hash, root_hash))
6896+
6897+        d.addCallback(lambda ignored:
6898+            mr.get_seqnum())
6899+        d.addCallback(lambda seqnum:
6900+            self.failUnlessEqual(0, seqnum))
6901+
6902+        d.addCallback(lambda ignored:
6903+            mr.get_encoding_parameters())
6904+        def _check_encoding_parameters((k, n, segsize, datalen)):
6905+            self.failUnlessEqual(k, 3)
6906+            self.failUnlessEqual(n, 10)
6907+            self.failUnlessEqual(segsize, 6)
6908+            self.failUnlessEqual(datalen, 36)
6909+        d.addCallback(_check_encoding_parameters)
6910+
6911+        d.addCallback(lambda ignored:
6912+            mr.get_checkstring())
6913+        d.addCallback(lambda checkstring:
6914+            self.failUnlessEqual(checkstring, self.checkstring))
6915+        return d
6916+
6917+
6918+    def test_read_with_different_tail_segment_size(self):
6919+        self.write_test_share_to_server("si1", tail_segment=True)
6920+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6921+        d = mr.get_block_and_salt(5)
6922+        def _check_tail_segment(results):
6923+            block, salt = results
6924+            self.failUnlessEqual(len(block), 1)
6925+            self.failUnlessEqual(block, "a")
6926+        d.addCallback(_check_tail_segment)
6927+        return d
6928+
6929+
6930+    def test_get_block_with_invalid_segnum(self):
6931+        self.write_test_share_to_server("si1")
6932+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6933+        d = defer.succeed(None)
6934+        d.addCallback(lambda ignored:
6935+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6936+                            None,
6937+                            mr.get_block_and_salt, 7))
6938+        return d
6939+
6940+
6941+    def test_get_encoding_parameters_first(self):
6942+        self.write_test_share_to_server("si1")
6943+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6944+        d = mr.get_encoding_parameters()
6945+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6946+            self.failUnlessEqual(k, 3)
6947+            self.failUnlessEqual(n, 10)
6948+            self.failUnlessEqual(segment_size, 6)
6949+            self.failUnlessEqual(datalen, 36)
6950+        d.addCallback(_check_encoding_parameters)
6951+        return d
6952+
6953+
6954+    def test_get_seqnum_first(self):
6955+        self.write_test_share_to_server("si1")
6956+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6957+        d = mr.get_seqnum()
6958+        d.addCallback(lambda seqnum:
6959+            self.failUnlessEqual(seqnum, 0))
6960+        return d
6961+
6962+
6963+    def test_get_root_hash_first(self):
6964+        self.write_test_share_to_server("si1")
6965+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6966+        d = mr.get_root_hash()
6967+        d.addCallback(lambda root_hash:
6968+            self.failUnlessEqual(root_hash, self.root_hash))
6969+        return d
6970+
6971+
6972+    def test_get_checkstring_first(self):
6973+        self.write_test_share_to_server("si1")
6974+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6975+        d = mr.get_checkstring()
6976+        d.addCallback(lambda checkstring:
6977+            self.failUnlessEqual(checkstring, self.checkstring))
6978+        return d
6979+
6980+
6981+    def test_write_read_vectors(self):
6982+        # When writing for us, the storage server will return to us a
6983+        # read vector, along with its result. If a write fails because
6984+        # the test vectors failed, this read vector can help us to
6985+        # diagnose the problem. This test ensures that the read vector
6986+        # is working appropriately.
6987+        mw = self._make_new_mw("si1", 0)
6988+        d = defer.succeed(None)
6989+
6990+        # Write one block. This should return a checkstring of nothing,
6991+        # since there is no data there.
6992+        d.addCallback(lambda ignored:
6993+            mw.put_block(self.block, 0, self.salt))
6994+        def _check_first_write(results):
6995+            result, readvs = results
6996+            self.failUnless(result)
6997+            self.failIf(readvs)
6998+        d.addCallback(_check_first_write)
6999+        # Now, there should be a different checkstring returned when
7000+        # we write other blocks
7001+        d.addCallback(lambda ignored:
7002+            mw.put_block(self.block, 1, self.salt))
7003+        def _check_next_write(results):
7004+            result, readvs = results
7005+            self.failUnless(result)
7006+            self.expected_checkstring = mw.get_checkstring()
7007+            self.failUnlessIn(0, readvs)
7008+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
7009+        d.addCallback(_check_next_write)
7010+        # Add the other four blocks
7011+        for i in xrange(2, 6):
7012+            d.addCallback(lambda ignored, i=i:
7013+                mw.put_block(self.block, i, self.salt))
7014+            d.addCallback(_check_next_write)
7015+        # Add the encrypted private key
7016+        d.addCallback(lambda ignored:
7017+            mw.put_encprivkey(self.encprivkey))
7018+        d.addCallback(_check_next_write)
7019+        # Add the block hash tree and share hash chain
7020+        d.addCallback(lambda ignored:
7021+            mw.put_blockhashes(self.block_hash_tree))
7022+        d.addCallback(_check_next_write)
7023+        d.addCallback(lambda ignored:
7024+            mw.put_sharehashes(self.share_hash_chain))
7025+        d.addCallback(_check_next_write)
7026+        # Add the root hash. This should change the
7027+        # checkstring, but not in a way that we'll be able to see right
7028+        # now, since the read vectors are applied before the write
7029+        # vectors.
7030+        d.addCallback(lambda ignored:
7031+            mw.put_root_hash(self.root_hash))
7032+        def _check_old_testv_after_new_one_is_written(results):
7033+            result, readvs = results
7034+            self.failUnless(result)
7035+            self.failUnlessIn(0, readvs)
7036+            self.failUnlessEqual(self.expected_checkstring,
7037+                                 readvs[0][0])
7038+            new_checkstring = mw.get_checkstring()
7039+            self.failIfEqual(new_checkstring,
7040+                             readvs[0][0])
7041+        d.addCallback(_check_old_testv_after_new_one_is_written)
7042+        # Now add the signature. This should succeed, meaning that the
7043+        # data gets written and the read vector matches what the writer
7044+        # thinks should be there.
7045+        d.addCallback(lambda ignored:
7046+            mw.put_signature(self.signature))
7047+        d.addCallback(_check_next_write)
7048+        # The checkstring remains the same for the rest of the process.
7049+        return d
7050+
7051+
7052+    def test_blockhashes_after_share_hash_chain(self):
7053+        mw = self._make_new_mw("si1", 0)
7054+        d = defer.succeed(None)
7055+        # Put everything up to and including the share hash chain
7056+        for i in xrange(6):
7057+            d.addCallback(lambda ignored, i=i:
7058+                mw.put_block(self.block, i, self.salt))
7059+        d.addCallback(lambda ignored:
7060+            mw.put_encprivkey(self.encprivkey))
7061+        d.addCallback(lambda ignored:
7062+            mw.put_blockhashes(self.block_hash_tree))
7063+        d.addCallback(lambda ignored:
7064+            mw.put_sharehashes(self.share_hash_chain))
7065+
7066+        # Now try to put the block hash tree again.
7067+        d.addCallback(lambda ignored:
7068+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
7069+                            None,
7070+                            mw.put_blockhashes, self.block_hash_tree))
7071+        return d
7072+
7073+
7074+    def test_encprivkey_after_blockhashes(self):
7075+        mw = self._make_new_mw("si1", 0)
7076+        d = defer.succeed(None)
7077+        # Put everything up to and including the block hash tree
7078+        for i in xrange(6):
7079+            d.addCallback(lambda ignored, i=i:
7080+                mw.put_block(self.block, i, self.salt))
7081+        d.addCallback(lambda ignored:
7082+            mw.put_encprivkey(self.encprivkey))
7083+        d.addCallback(lambda ignored:
7084+            mw.put_blockhashes(self.block_hash_tree))
7085+        d.addCallback(lambda ignored:
7086+            self.shouldFail(LayoutInvalid, "out of order private key",
7087+                            None,
7088+                            mw.put_encprivkey, self.encprivkey))
7089+        return d
7090+
7091+
7092+    def test_share_hash_chain_after_signature(self):
7093+        mw = self._make_new_mw("si1", 0)
7094+        d = defer.succeed(None)
7095+        # Put everything up to and including the signature
7096+        for i in xrange(6):
7097+            d.addCallback(lambda ignored, i=i:
7098+                mw.put_block(self.block, i, self.salt))
7099+        d.addCallback(lambda ignored:
7100+            mw.put_encprivkey(self.encprivkey))
7101+        d.addCallback(lambda ignored:
7102+            mw.put_blockhashes(self.block_hash_tree))
7103+        d.addCallback(lambda ignored:
7104+            mw.put_sharehashes(self.share_hash_chain))
7105+        d.addCallback(lambda ignored:
7106+            mw.put_root_hash(self.root_hash))
7107+        d.addCallback(lambda ignored:
7108+            mw.put_signature(self.signature))
7109+        # Now try to put the share hash chain again. This should fail
7110+        d.addCallback(lambda ignored:
7111+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
7112+                            None,
7113+                            mw.put_sharehashes, self.share_hash_chain))
7114+        return d
7115+
7116+
7117+    def test_signature_after_verification_key(self):
7118+        mw = self._make_new_mw("si1", 0)
7119+        d = defer.succeed(None)
7120+        # Put everything up to and including the verification key.
7121+        for i in xrange(6):
7122+            d.addCallback(lambda ignored, i=i:
7123+                mw.put_block(self.block, i, self.salt))
7124+        d.addCallback(lambda ignored:
7125+            mw.put_encprivkey(self.encprivkey))
7126+        d.addCallback(lambda ignored:
7127+            mw.put_blockhashes(self.block_hash_tree))
7128+        d.addCallback(lambda ignored:
7129+            mw.put_sharehashes(self.share_hash_chain))
7130+        d.addCallback(lambda ignored:
7131+            mw.put_root_hash(self.root_hash))
7132+        d.addCallback(lambda ignored:
7133+            mw.put_signature(self.signature))
7134+        d.addCallback(lambda ignored:
7135+            mw.put_verification_key(self.verification_key))
7136+        # Now try to put the signature again. This should fail
7137+        d.addCallback(lambda ignored:
7138+            self.shouldFail(LayoutInvalid, "signature after verification",
7139+                            None,
7140+                            mw.put_signature, self.signature))
7141+        return d
7142+
7143+
7144+    def test_uncoordinated_write(self):
7145+        # Make two mutable writers, both pointing to the same storage
7146+        # server, both at the same storage index, and try writing to the
7147+        # same share.
7148+        mw1 = self._make_new_mw("si1", 0)
7149+        mw2 = self._make_new_mw("si1", 0)
7150+        d = defer.succeed(None)
7151+        def _check_success(results):
7152+            result, readvs = results
7153+            self.failUnless(result)
7154+
7155+        def _check_failure(results):
7156+            result, readvs = results
7157+            self.failIf(result)
7158+
7159+        d.addCallback(lambda ignored:
7160+            mw1.put_block(self.block, 0, self.salt))
7161+        d.addCallback(_check_success)
7162+        d.addCallback(lambda ignored:
7163+            mw2.put_block(self.block, 0, self.salt))
7164+        d.addCallback(_check_failure)
7165+        return d
7166+
7167+
7168+    def test_invalid_salt_size(self):
7169+        # Salts need to be 16 bytes in size. Writes that attempt to
7170+        # write more or less than this should be rejected.
7171+        mw = self._make_new_mw("si1", 0)
7172+        invalid_salt = "a" * 17 # 17 bytes
7173+        another_invalid_salt = "b" * 15 # 15 bytes
7174+        d = defer.succeed(None)
7175+        d.addCallback(lambda ignored:
7176+            self.shouldFail(LayoutInvalid, "salt too big",
7177+                            None,
7178+                            mw.put_block, self.block, 0, invalid_salt))
7179+        d.addCallback(lambda ignored:
7180+            self.shouldFail(LayoutInvalid, "salt too small",
7181+                            None,
7182+                            mw.put_block, self.block, 0,
7183+                            another_invalid_salt))
7184+        return d
7185+
7186+
7187+    def test_write_test_vectors(self):
7188+        # If we give the write proxy a bogus test vector at
7189+        # any point during the process, it should fail to write.
7190+        mw = self._make_new_mw("si1", 0)
7191+        mw.set_checkstring("this is a lie")
7192+        # The initial write should be expecting to find the improbable
7193+        # checkstring above in place; finding nothing, it should fail.
7194+        d = defer.succeed(None)
7195+        d.addCallback(lambda ignored:
7196+            mw.put_block(self.block, 0, self.salt))
7197+        def _check_failure(results):
7198+            result, readv = results
7199+            self.failIf(result)
7200+        d.addCallback(_check_failure)
7201+        # Now set the checkstring to the empty string, which
7202+        # indicates that no share is there.
7203+        d.addCallback(lambda ignored:
7204+            mw.set_checkstring(""))
7205+        d.addCallback(lambda ignored:
7206+            mw.put_block(self.block, 0, self.salt))
7207+        def _check_success(results):
7208+            result, readv = results
7209+            self.failUnless(result)
7210+        d.addCallback(_check_success)
7211+        # Now set the checkstring to something wrong
7212+        d.addCallback(lambda ignored:
7213+            mw.set_checkstring("something wrong"))
7214+        # This should fail to do anything
7215+        d.addCallback(lambda ignored:
7216+            mw.put_block(self.block, 1, self.salt))
7217+        d.addCallback(_check_failure)
7218+        # Now set it back to what it should be.
7219+        d.addCallback(lambda ignored:
7220+            mw.set_checkstring(mw.get_checkstring()))
7221+        for i in xrange(1, 6):
7222+            d.addCallback(lambda ignored, i=i:
7223+                mw.put_block(self.block, i, self.salt))
7224+            d.addCallback(_check_success)
7225+        d.addCallback(lambda ignored:
7226+            mw.put_encprivkey(self.encprivkey))
7227+        d.addCallback(_check_success)
7228+        d.addCallback(lambda ignored:
7229+            mw.put_blockhashes(self.block_hash_tree))
7230+        d.addCallback(_check_success)
7231+        d.addCallback(lambda ignored:
7232+            mw.put_sharehashes(self.share_hash_chain))
7233+        d.addCallback(_check_success)
7234+        def _keep_old_checkstring(ignored):
7235+            self.old_checkstring = mw.get_checkstring()
7236+            mw.set_checkstring("foobarbaz")
7237+        d.addCallback(_keep_old_checkstring)
7238+        d.addCallback(lambda ignored:
7239+            mw.put_root_hash(self.root_hash))
7240+        d.addCallback(_check_failure)
7241+        d.addCallback(lambda ignored:
7242+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
7243+        def _restore_old_checkstring(ignored):
7244+            mw.set_checkstring(self.old_checkstring)
7245+        d.addCallback(_restore_old_checkstring)
7246+        d.addCallback(lambda ignored:
7247+            mw.put_root_hash(self.root_hash))
7248+        d.addCallback(_check_success)
7249+        # The checkstring should have been set appropriately for us on
7250+        # the last write; if we try to change it to something else,
7251+        # that change should cause the signature step to fail.
7252+        d.addCallback(lambda ignored:
7253+            mw.set_checkstring("something else"))
7254+        d.addCallback(lambda ignored:
7255+            mw.put_signature(self.signature))
7256+        d.addCallback(_check_failure)
7257+        d.addCallback(lambda ignored:
7258+            mw.set_checkstring(mw.get_checkstring()))
7259+        d.addCallback(lambda ignored:
7260+            mw.put_signature(self.signature))
7261+        d.addCallback(_check_success)
7262+        d.addCallback(lambda ignored:
7263+            mw.put_verification_key(self.verification_key))
7264+        d.addCallback(_check_success)
7265+        return d
7266+
7267+
7268+    def test_offset_only_set_on_success(self):
7269+        # The write proxy should be smart enough to detect when a write
7270+        # has failed, and to temper its definition of progress based on
7271+        # that.
7272+        mw = self._make_new_mw("si1", 0)
7273+        d = defer.succeed(None)
7274+        for i in xrange(1, 6):
7275+            d.addCallback(lambda ignored, i=i:
7276+                mw.put_block(self.block, i, self.salt))
7277+        def _break_checkstring(ignored):
7278+            self._old_checkstring = mw.get_checkstring()
7279+            mw.set_checkstring("foobarbaz")
7280+
7281+        def _fix_checkstring(ignored):
7282+            mw.set_checkstring(self._old_checkstring)
7283+
7284+        d.addCallback(_break_checkstring)
7285+
7286+        # Setting the encrypted private key shouldn't work now, which is
7287+        # to be expected and is tested elsewhere. We also want to make
7288+        # sure that we can't add the block hash tree after a failed
7289+        # write of this sort.
7290+        d.addCallback(lambda ignored:
7291+            mw.put_encprivkey(self.encprivkey))
7292+        d.addCallback(lambda ignored:
7293+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
7294+                            None,
7295+                            mw.put_blockhashes, self.block_hash_tree))
7296+        d.addCallback(_fix_checkstring)
7297+        d.addCallback(lambda ignored:
7298+            mw.put_encprivkey(self.encprivkey))
7299+        d.addCallback(_break_checkstring)
7300+        d.addCallback(lambda ignored:
7301+            mw.put_blockhashes(self.block_hash_tree))
7302+        d.addCallback(lambda ignored:
7303+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
7304+                            None,
7305+                            mw.put_sharehashes, self.share_hash_chain))
7306+        d.addCallback(_fix_checkstring)
7307+        d.addCallback(lambda ignored:
7308+            mw.put_blockhashes(self.block_hash_tree))
7309+        d.addCallback(_break_checkstring)
7310+        d.addCallback(lambda ignored:
7311+            mw.put_sharehashes(self.share_hash_chain))
7312+        d.addCallback(lambda ignored:
7313+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
7314+                            None,
7315+                            mw.put_root_hash, self.root_hash))
7316+        d.addCallback(_fix_checkstring)
7317+        d.addCallback(lambda ignored:
7318+            mw.put_sharehashes(self.share_hash_chain))
7319+        d.addCallback(_break_checkstring)
7320+        d.addCallback(lambda ignored:
7321+            mw.put_root_hash(self.root_hash))
7322+        d.addCallback(lambda ignored:
7323+            self.shouldFail(LayoutInvalid, "out-of-order signature",
7324+                            None,
7325+                            mw.put_signature, self.signature))
7326+        d.addCallback(_fix_checkstring)
7327+        d.addCallback(lambda ignored:
7328+            mw.put_root_hash(self.root_hash))
7329+        d.addCallback(_break_checkstring)
7330+        d.addCallback(lambda ignored:
7331+            mw.put_signature(self.signature))
7332+        d.addCallback(lambda ignored:
7333+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
7334+                            None,
7335+                            mw.put_verification_key,
7336+                            self.verification_key))
7337+        d.addCallback(_fix_checkstring)
7338+        d.addCallback(lambda ignored:
7339+            mw.put_signature(self.signature))
7340+        d.addCallback(_break_checkstring)
7341+        d.addCallback(lambda ignored:
7342+            mw.put_verification_key(self.verification_key))
7343+        d.addCallback(lambda ignored:
7344+            self.shouldFail(LayoutInvalid, "out-of-order finish",
7345+                            None,
7346+                            mw.finish_publishing))
7347+        return d
7348+
7349+
7350+    def serialize_blockhashes(self, blockhashes):
7351+        return "".join(blockhashes)
7352+
7353+
7354+    def serialize_sharehashes(self, sharehashes):
7355+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
7356+                        for i in sorted(sharehashes.keys())])
7357+        return ret
7358+
7359+
7360+    def test_write(self):
7361+        # This translates to a file with six 6-byte segments and 2-byte
7362+        # blocks.
7363+        mw = self._make_new_mw("si1", 0)
7364+        mw2 = self._make_new_mw("si1", 1)
7365+        # Test writing some blocks.
7366+        read = self.ss.remote_slot_readv
7367+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
7368+        written_block_size = 2 + len(self.salt)
7369+        written_block = self.block + self.salt
7370+        def _check_block_write(i, share):
7371+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
7372+                                {share: [written_block]})
7373+        d = defer.succeed(None)
7374+        for i in xrange(6):
7375+            d.addCallback(lambda ignored, i=i:
7376+                mw.put_block(self.block, i, self.salt))
7377+            d.addCallback(lambda ignored, i=i:
7378+                _check_block_write(i, 0))
7379+        # Now try the same thing, but with share 1 instead of share 0.
7380+        for i in xrange(6):
7381+            d.addCallback(lambda ignored, i=i:
7382+                mw2.put_block(self.block, i, self.salt))
7383+            d.addCallback(lambda ignored, i=i:
7384+                _check_block_write(i, 1))
7385+
7386+        # Next, we make a fake encrypted private key, and put it onto the
7387+        # storage server.
7388+        d.addCallback(lambda ignored:
7389+            mw.put_encprivkey(self.encprivkey))
7390+        expected_private_key_offset = expected_sharedata_offset + \
7391+                                      len(written_block) * 6
7392+        self.failUnlessEqual(len(self.encprivkey), 7)
7393+        d.addCallback(lambda ignored:
7394+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
7395+                                 {0: [self.encprivkey]}))
7396+
7397+        # Next, we put a fake block hash tree.
7398+        d.addCallback(lambda ignored:
7399+            mw.put_blockhashes(self.block_hash_tree))
7400+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
7401+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
7402+        d.addCallback(lambda ignored:
7403+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
7404+                                 {0: [self.block_hash_tree_s]}))
7405+
7406+        # Next, put a fake share hash chain
7407+        d.addCallback(lambda ignored:
7408+            mw.put_sharehashes(self.share_hash_chain))
7409+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
7410+        d.addCallback(lambda ignored:
7411+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
7412+                                 {0: [self.share_hash_chain_s]}))
7413+
7414+        # Next, we put what is supposed to be the root hash of
7415+        # our share hash tree, but really isn't.
7416+        d.addCallback(lambda ignored:
7417+            mw.put_root_hash(self.root_hash))
7418+        # The root hash gets inserted at byte 9 (its position in the
7419+        # header is fixed).
7420+        def _check(ignored):
7421+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
7422+                                 {0: [self.root_hash]})
7423+        d.addCallback(_check)
7424+
7425+        # Next, we put a signature of the header block.
7426+        d.addCallback(lambda ignored:
7427+            mw.put_signature(self.signature))
7428+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
7429+        self.failUnlessEqual(len(self.signature), 9)
7430+        d.addCallback(lambda ignored:
7431+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
7432+                                 {0: [self.signature]}))
7433+
7434+        # Next, we put the verification key
7435+        d.addCallback(lambda ignored:
7436+            mw.put_verification_key(self.verification_key))
7437+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
7438+        self.failUnlessEqual(len(self.verification_key), 6)
7439+        d.addCallback(lambda ignored:
7440+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
7441+                                 {0: [self.verification_key]}))
7442+
7443+        def _check_signable(ignored):
7444+            # Make sure that the signable is what we think it should be.
7445+            signable = mw.get_signable()
7446+            verno, seq, roothash, k, n, segsize, datalen = \
7447+                                            struct.unpack(">BQ32sBBQQ",
7448+                                                          signable)
7449+            self.failUnlessEqual(verno, 1)
7450+            self.failUnlessEqual(seq, 0)
7451+            self.failUnlessEqual(roothash, self.root_hash)
7452+            self.failUnlessEqual(k, 3)
7453+            self.failUnlessEqual(n, 10)
7454+            self.failUnlessEqual(segsize, 6)
7455+            self.failUnlessEqual(datalen, 36)
7456+        d.addCallback(_check_signable)
7457+        # Next, we cause the offset table to be published.
7458+        d.addCallback(lambda ignored:
7459+            mw.finish_publishing())
7460+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
7461+
7462+        def _check_offsets(ignored):
7463+            # Check the version number to make sure that it is correct.
7464+            expected_version_number = struct.pack(">B", 1)
7465+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
7466+                                 {0: [expected_version_number]})
7467+            # Check the sequence number to make sure that it is correct
7468+            expected_sequence_number = struct.pack(">Q", 0)
7469+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7470+                                 {0: [expected_sequence_number]})
7471+            # Check that the encoding parameters (k, N, segment size, data
7472+            # length) are what they should be: 3, 10, 6, and 36 (see the layout sketch below).
7473+            expected_k = struct.pack(">B", 3)
7474+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
7475+                                 {0: [expected_k]})
7476+            expected_n = struct.pack(">B", 10)
7477+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
7478+                                 {0: [expected_n]})
7479+            expected_segment_size = struct.pack(">Q", 6)
7480+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
7481+                                 {0: [expected_segment_size]})
7482+            expected_data_length = struct.pack(">Q", 36)
7483+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
7484+                                 {0: [expected_data_length]})
7485+            expected_offset = struct.pack(">Q", expected_private_key_offset)
7486+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
7487+                                 {0: [expected_offset]})
7488+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
7489+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
7490+                                 {0: [expected_offset]})
7491+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
7492+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
7493+                                 {0: [expected_offset]})
7494+            expected_offset = struct.pack(">Q", expected_signature_offset)
7495+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
7496+                                 {0: [expected_offset]})
7497+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
7498+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
7499+                                 {0: [expected_offset]})
7500+            expected_offset = struct.pack(">Q", expected_eof_offset)
7501+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
7502+                                 {0: [expected_offset]})
7503+        d.addCallback(_check_offsets)
7504+        return d
7505+
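The byte positions that _check_offsets reads back (0, 1, 9, 41, 42, 43, 51, and the six offset slots at 59-106) imply the share layout sketched here. This is reconstructed from the test, not copied from the MDMFHEADER constant, whose actual format string may be spelled differently:

    import struct

    # Reconstructed MDMF share header, as exercised by test_write above:
    #   offset  size  field
    #        0     1  version number (B), == 1 for MDMF
    #        1     8  sequence number (Q)
    #        9    32  root hash (32s)
    #       41     1  k (B)
    #       42     1  N (B)
    #       43     8  segment size (Q)
    #       51     8  data length (Q)
    #       59    48  offset table: six Q values (encprivkey, block hashes,
    #                 share hashes, signature, verification key, EOF)
    SKETCH_MDMF_HEADER = ">BQ32sBBQQ" + "Q" * 6
    assert struct.calcsize(SKETCH_MDMF_HEADER) == 107   # share data begins here
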
7506+    def _make_new_mw(self, si, share, datalength=36):
7507+        # This is a file of size 36 bytes. Since it has a segment
7508+        # size of 6, we know that it has 6 byte segments, which will
7509+        # be split into blocks of 2 bytes because our FEC k
7510+        # parameter is 3.
7511+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
7512+                                6, datalength)
7513+        return mw
7514+
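The parameters handed to MDMFSlotWriteProxy above (k=3, N=10, segment size 6, data length 36 by default) give the segment and block sizes that the comment describes. The arithmetic, worked through as a sketch:

    import math

    datalength, segsize, k = 36, 6, 3
    num_segments = int(math.ceil(float(datalength) / segsize))   # 6 segments
    block_size = segsize / k                                      # 2-byte blocks
    assert (num_segments, block_size) == (6, 2)

    # With datalength=33 (as test_write_rejected_with_invalid_blocksize uses),
    # the last segment is a 3-byte tail, so its blocks shrink to 1 byte.
    tail = 33 - (33 // segsize) * segsize
    assert (tail, tail // k) == (3, 1)
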
7515+
7516+    def test_write_rejected_with_too_many_blocks(self):
7517+        mw = self._make_new_mw("si0", 0)
7518+
7519+        # Try writing too many blocks. We should not be able to write
7520+        # more than 6 blocks into each share.
7521+
7522+        d = defer.succeed(None)
7523+        for i in xrange(6):
7524+            d.addCallback(lambda ignored, i=i:
7525+                mw.put_block(self.block, i, self.salt))
7526+        d.addCallback(lambda ignored:
7527+            self.shouldFail(LayoutInvalid, "too many blocks",
7528+                            None,
7529+                            mw.put_block, self.block, 7, self.salt))
7530+        return d
7531+
7532+
7533+    def test_write_rejected_with_invalid_salt(self):
7534+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
7535+        # less should cause an error.
7536+        mw = self._make_new_mw("si1", 0)
7537+        bad_salt = "a" * 17 # 17 bytes
7538+        d = defer.succeed(None)
7539+        d.addCallback(lambda ignored:
7540+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
7541+                            None, mw.put_block, self.block, 7, bad_salt))
7542+        return d
7543+
7544+
7545+    def test_write_rejected_with_invalid_root_hash(self):
7546+        # Try writing an invalid root hash. This should be SHA256d, and
7547+        # 32 bytes long as a result.
7548+        mw = self._make_new_mw("si2", 0)
7549+        # 17 bytes != 32 bytes
7550+        invalid_root_hash = "a" * 17
7551+        d = defer.succeed(None)
7552+        # Before this test can work, we need to put some blocks + salts,
7553+        # a block hash tree, and a share hash tree. Otherwise, we'll see
7554+        # failures that match what we are looking for, but are caused by
7555+        # the constraints imposed on operation ordering.
7556+        for i in xrange(6):
7557+            d.addCallback(lambda ignored, i=i:
7558+                mw.put_block(self.block, i, self.salt))
7559+        d.addCallback(lambda ignored:
7560+            mw.put_encprivkey(self.encprivkey))
7561+        d.addCallback(lambda ignored:
7562+            mw.put_blockhashes(self.block_hash_tree))
7563+        d.addCallback(lambda ignored:
7564+            mw.put_sharehashes(self.share_hash_chain))
7565+        d.addCallback(lambda ignored:
7566+            self.shouldFail(LayoutInvalid, "invalid root hash",
7567+                            None, mw.put_root_hash, invalid_root_hash))
7568+        return d
7569+
7570+
7571+    def test_write_rejected_with_invalid_blocksize(self):
7572+        # The blocksize implied by the writer that we get from
7573+        # _make_new_mw is 2 bytes -- any more or any less than this
7574+        # should cause a failure, unless it is the tail segment, in
7575+        # which case it may not.
7576+        invalid_block = "a"
7577+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
7578+                                             # one byte blocks
7579+        # 1 byte != 2 bytes
7580+        d = defer.succeed(None)
7581+        d.addCallback(lambda ignored, invalid_block=invalid_block:
7582+            self.shouldFail(LayoutInvalid, "test blocksize too small",
7583+                            None, mw.put_block, invalid_block, 0,
7584+                            self.salt))
7585+        invalid_block = invalid_block * 3
7586+        # 3 bytes != 2 bytes
7587+        d.addCallback(lambda ignored:
7588+            self.shouldFail(LayoutInvalid, "test blocksize too large",
7589+                            None,
7590+                            mw.put_block, invalid_block, 0, self.salt))
7591+        for i in xrange(5):
7592+            d.addCallback(lambda ignored, i=i:
7593+                mw.put_block(self.block, i, self.salt))
7594+        # Try to put an invalid tail segment
7595+        d.addCallback(lambda ignored:
7596+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
7597+                            None,
7598+                            mw.put_block, self.block, 5, self.salt))
7599+        valid_block = "a"
7600+        d.addCallback(lambda ignored:
7601+            mw.put_block(valid_block, 5, self.salt))
7602+        return d
7603+
7604+
7605+    def test_write_enforces_order_constraints(self):
7606+        # We require that the MDMFSlotWriteProxy be interacted with in a
7607+        # specific way.
7608+        # That way is:
7609+        # 0: __init__
7610+        # 1: Write blocks and salts
7611+        # 2: Write the encrypted private key
7612+        # 3: Write the block hashes
7613+        # 4: Write the share hashes
7614+        # 5: Write the root hash and salt hash
7615+        # 6: Write the signature and verification key
7616+        # 7: Write the file.
7617+        #
7618+        # Some of these can be performed out-of-order, and some can't.
7619+        # The dependencies that I want to test here are:
7620+        #  - Private key before block hashes
7621+        #  - share hashes and block hashes before root hash
7622+        #  - root hash before signature
7623+        #  - signature before verification key
7624+        mw0 = self._make_new_mw("si0", 0)
7625+        # Write some shares
7626+        d = defer.succeed(None)
7627+        for i in xrange(6):
7628+            d.addCallback(lambda ignored, i=i:
7629+                mw0.put_block(self.block, i, self.salt))
7630+        # Try to write the block hashes before writing the encrypted
7631+        # private key
7632+        d.addCallback(lambda ignored:
7633+            self.shouldFail(LayoutInvalid, "block hashes before key",
7634+                            None, mw0.put_blockhashes,
7635+                            self.block_hash_tree))
7636+
7637+        # Write the private key.
7638+        d.addCallback(lambda ignored:
7639+            mw0.put_encprivkey(self.encprivkey))
7640+
7641+
7642+        # Try to write the share hash chain without writing the block
7643+        # hash tree
7644+        d.addCallback(lambda ignored:
7645+            self.shouldFail(LayoutInvalid, "share hash chain before "
7646+                                           "block hash tree",
7647+                            None,
7648+                            mw0.put_sharehashes, self.share_hash_chain))
7649+
7650+        # Try to write the root hash without writing either the
7651+        # block hashes or the share hashes.
7652+        d.addCallback(lambda ignored:
7653+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
7654+                            None,
7655+                            mw0.put_root_hash, self.root_hash))
7656+
7657+        # Now write the block hashes and try again
7658+        d.addCallback(lambda ignored:
7659+            mw0.put_blockhashes(self.block_hash_tree))
7660+
7661+        d.addCallback(lambda ignored:
7662+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
7663+                            None, mw0.put_root_hash, self.root_hash))
7664+
7665+        # We haven't yet put the root hash on the share, so we shouldn't
7666+        # be able to sign it.
7667+        d.addCallback(lambda ignored:
7668+            self.shouldFail(LayoutInvalid, "signature before root hash",
7669+                            None, mw0.put_signature, self.signature))
7670+
7671+        d.addCallback(lambda ignored:
7672+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
7673+
7674+        # ...and, since that fails, we also shouldn't be able to put the
7675+        # verification key.
7676+        d.addCallback(lambda ignored:
7677+            self.shouldFail(LayoutInvalid, "key before signature",
7678+                            None, mw0.put_verification_key,
7679+                            self.verification_key))
7680+
7681+        # Now write the share hashes.
7682+        d.addCallback(lambda ignored:
7683+            mw0.put_sharehashes(self.share_hash_chain))
7684+        # We should be able to write the root hash now too
7685+        d.addCallback(lambda ignored:
7686+            mw0.put_root_hash(self.root_hash))
7687+
7688+        # We should still be unable to put the verification key
7689+        d.addCallback(lambda ignored:
7690+            self.shouldFail(LayoutInvalid, "key before signature",
7691+                            None, mw0.put_verification_key,
7692+                            self.verification_key))
7693+
7694+        d.addCallback(lambda ignored:
7695+            mw0.put_signature(self.signature))
7696+
7697+        # We shouldn't be able to write the offsets to the remote server
7698+        # until the offset table is finished; IOW, until we have written
7699+        # the verification key.
7700+        d.addCallback(lambda ignored:
7701+            self.shouldFail(LayoutInvalid, "offsets before verification key",
7702+                            None,
7703+                            mw0.finish_publishing))
7704+
7705+        d.addCallback(lambda ignored:
7706+            mw0.put_verification_key(self.verification_key))
7707+        return d
7708+
7709+
7710+    def test_end_to_end(self):
7711+        mw = self._make_new_mw("si1", 0)
7712+        # Write a share using the mutable writer, and make sure that the
7713+        # reader knows how to read everything back to us.
7714+        d = defer.succeed(None)
7715+        for i in xrange(6):
7716+            d.addCallback(lambda ignored, i=i:
7717+                mw.put_block(self.block, i, self.salt))
7718+        d.addCallback(lambda ignored:
7719+            mw.put_encprivkey(self.encprivkey))
7720+        d.addCallback(lambda ignored:
7721+            mw.put_blockhashes(self.block_hash_tree))
7722+        d.addCallback(lambda ignored:
7723+            mw.put_sharehashes(self.share_hash_chain))
7724+        d.addCallback(lambda ignored:
7725+            mw.put_root_hash(self.root_hash))
7726+        d.addCallback(lambda ignored:
7727+            mw.put_signature(self.signature))
7728+        d.addCallback(lambda ignored:
7729+            mw.put_verification_key(self.verification_key))
7730+        d.addCallback(lambda ignored:
7731+            mw.finish_publishing())
7732+
7733+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7734+        def _check_block_and_salt((block, salt)):
7735+            self.failUnlessEqual(block, self.block)
7736+            self.failUnlessEqual(salt, self.salt)
7737+
7738+        for i in xrange(6):
7739+            d.addCallback(lambda ignored, i=i:
7740+                mr.get_block_and_salt(i))
7741+            d.addCallback(_check_block_and_salt)
7742+
7743+        d.addCallback(lambda ignored:
7744+            mr.get_encprivkey())
7745+        d.addCallback(lambda encprivkey:
7746+            self.failUnlessEqual(self.encprivkey, encprivkey))
7747+
7748+        d.addCallback(lambda ignored:
7749+            mr.get_blockhashes())
7750+        d.addCallback(lambda blockhashes:
7751+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
7752+
7753+        d.addCallback(lambda ignored:
7754+            mr.get_sharehashes())
7755+        d.addCallback(lambda sharehashes:
7756+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
7757+
7758+        d.addCallback(lambda ignored:
7759+            mr.get_signature())
7760+        d.addCallback(lambda signature:
7761+            self.failUnlessEqual(signature, self.signature))
7762+
7763+        d.addCallback(lambda ignored:
7764+            mr.get_verification_key())
7765+        d.addCallback(lambda verification_key:
7766+            self.failUnlessEqual(verification_key, self.verification_key))
7767+
7768+        d.addCallback(lambda ignored:
7769+            mr.get_seqnum())
7770+        d.addCallback(lambda seqnum:
7771+            self.failUnlessEqual(seqnum, 0))
7772+
7773+        d.addCallback(lambda ignored:
7774+            mr.get_root_hash())
7775+        d.addCallback(lambda root_hash:
7776+            self.failUnlessEqual(self.root_hash, root_hash))
7777+
7778+        d.addCallback(lambda ignored:
7779+            mr.get_encoding_parameters())
7780+        def _check_encoding_parameters((k, n, segsize, datalen)):
7781+            self.failUnlessEqual(k, 3)
7782+            self.failUnlessEqual(n, 10)
7783+            self.failUnlessEqual(segsize, 6)
7784+            self.failUnlessEqual(datalen, 36)
7785+        d.addCallback(_check_encoding_parameters)
7786+
7787+        d.addCallback(lambda ignored:
7788+            mr.get_checkstring())
7789+        d.addCallback(lambda checkstring:
7790+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
7791+        return d
7792+
7793+
7794+    def test_is_sdmf(self):
7795+        # The MDMFSlotReadProxy should also know how to read SDMF files,
7796+        # since it will encounter them on the grid. Callers use the
7797+        # is_sdmf method to test this.
7798+        self.write_sdmf_share_to_server("si1")
7799+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7800+        d = mr.is_sdmf()
7801+        d.addCallback(lambda issdmf:
7802+            self.failUnless(issdmf))
7803+        return d
7804+
7805+
7806+    def test_reads_sdmf(self):
7807+        # The slot read proxy should, naturally, know how to tell us
7808+        # about data in the SDMF format
7809+        self.write_sdmf_share_to_server("si1")
7810+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7811+        d = defer.succeed(None)
7812+        d.addCallback(lambda ignored:
7813+            mr.is_sdmf())
7814+        d.addCallback(lambda issdmf:
7815+            self.failUnless(issdmf))
7816+
7817+        # What do we need to read?
7818+        #  - The sharedata
7819+        #  - The salt
7820+        d.addCallback(lambda ignored:
7821+            mr.get_block_and_salt(0))
7822+        def _check_block_and_salt(results):
7823+            block, salt = results
7824+            # Our original file is 36 bytes long, so each share is 12
7825+            # bytes in size. The share is composed entirely of the
7826+            # letter a. self.block contains two a's, so 6 * self.block is
7827+            # what we are looking for.
7828+            self.failUnlessEqual(block, self.block * 6)
7829+            self.failUnlessEqual(salt, self.salt)
7830+        d.addCallback(_check_block_and_salt)
7831+
7832+        #  - The blockhashes
7833+        d.addCallback(lambda ignored:
7834+            mr.get_blockhashes())
7835+        d.addCallback(lambda blockhashes:
7836+            self.failUnlessEqual(self.block_hash_tree,
7837+                                 blockhashes,
7838+                                 blockhashes))
7839+        #  - The sharehashes
7840+        d.addCallback(lambda ignored:
7841+            mr.get_sharehashes())
7842+        d.addCallback(lambda sharehashes:
7843+            self.failUnlessEqual(self.share_hash_chain,
7844+                                 sharehashes))
7845+        #  - The keys
7846+        d.addCallback(lambda ignored:
7847+            mr.get_encprivkey())
7848+        d.addCallback(lambda encprivkey:
7849+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
7850+        d.addCallback(lambda ignored:
7851+            mr.get_verification_key())
7852+        d.addCallback(lambda verification_key:
7853+            self.failUnlessEqual(verification_key,
7854+                                 self.verification_key,
7855+                                 verification_key))
7856+        #  - The signature
7857+        d.addCallback(lambda ignored:
7858+            mr.get_signature())
7859+        d.addCallback(lambda signature:
7860+            self.failUnlessEqual(signature, self.signature, signature))
7861+
7862+        #  - The sequence number
7863+        d.addCallback(lambda ignored:
7864+            mr.get_seqnum())
7865+        d.addCallback(lambda seqnum:
7866+            self.failUnlessEqual(seqnum, 0, seqnum))
7867+
7868+        #  - The root hash
7869+        d.addCallback(lambda ignored:
7870+            mr.get_root_hash())
7871+        d.addCallback(lambda root_hash:
7872+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
7873+        return d
7874+
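The expected block in _check_block_and_salt above follows directly from the encoding parameters; a sketch of the arithmetic, using the figures from the comment (36-byte file, k=3, and the 2-byte self.block fixture of letter a's):

    # SDMF has a single segment, so the whole share comes back as one block.
    filesize, k = 36, 3
    share_size = filesize / k              # 12 bytes per share
    block = "aa"                           # the self.block fixture
    assert share_size == len(block * 6)    # hence the expected value, block * 6
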
7875+
7876+    def test_only_reads_one_segment_sdmf(self):
7877+        # SDMF shares have only one segment, so it doesn't make sense to
7878+        # read more segments than that. The reader should know this and
7879+        # complain if we try to do that.
7880+        self.write_sdmf_share_to_server("si1")
7881+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7882+        d = defer.succeed(None)
7883+        d.addCallback(lambda ignored:
7884+            mr.is_sdmf())
7885+        d.addCallback(lambda issdmf:
7886+            self.failUnless(issdmf))
7887+        d.addCallback(lambda ignored:
7888+            self.shouldFail(LayoutInvalid, "test bad segment",
7889+                            None,
7890+                            mr.get_block_and_salt, 1))
7891+        return d
7892+
7893+
7894+    def test_read_with_prefetched_mdmf_data(self):
7895+        # The MDMFSlotReadProxy will prefill certain fields if you pass
7896+        # it data that you have already fetched. This is useful for
7897+        # cases like the Servermap, which prefetches ~2kb of data while
7898+        # finding out which shares are on the remote peer so that it
7899+        # doesn't waste round trips.
7900+        mdmf_data = self.build_test_mdmf_share()
7901+        self.write_test_share_to_server("si1")
7902+        def _make_mr(ignored, length):
7903+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
7904+            return mr
7905+
7906+        d = defer.succeed(None)
7907+        # This should be enough to fill in both the encoding parameters
7908+        # and the table of offsets, which will complete the version
7909+        # information tuple.
7910+        d.addCallback(_make_mr, 107)
7911+        d.addCallback(lambda mr:
7912+            mr.get_verinfo())
7913+        def _check_verinfo(verinfo):
7914+            self.failUnless(verinfo)
7915+            self.failUnlessEqual(len(verinfo), 9)
7916+            (seqnum,
7917+             root_hash,
7918+             salt_hash,
7919+             segsize,
7920+             datalen,
7921+             k,
7922+             n,
7923+             prefix,
7924+             offsets) = verinfo
7925+            self.failUnlessEqual(seqnum, 0)
7926+            self.failUnlessEqual(root_hash, self.root_hash)
7927+            self.failUnlessEqual(segsize, 6)
7928+            self.failUnlessEqual(datalen, 36)
7929+            self.failUnlessEqual(k, 3)
7930+            self.failUnlessEqual(n, 10)
7931+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7932+                                          1,
7933+                                          seqnum,
7934+                                          root_hash,
7935+                                          k,
7936+                                          n,
7937+                                          segsize,
7938+                                          datalen)
7939+            self.failUnlessEqual(expected_prefix, prefix)
7940+            self.failUnlessEqual(self.rref.read_count, 0)
7941+        d.addCallback(_check_verinfo)
7942+        # This is not enough data to read a block and a salt, so the
7943+        # wrapper should attempt to read this from the remote server.
7944+        d.addCallback(_make_mr, 107)
7945+        d.addCallback(lambda mr:
7946+            mr.get_block_and_salt(0))
7947+        def _check_block_and_salt((block, salt)):
7948+            self.failUnlessEqual(block, self.block)
7949+            self.failUnlessEqual(salt, self.salt)
7950+            self.failUnlessEqual(self.rref.read_count, 1)
7951+        # This should be enough data to read one block.
7952+        d.addCallback(_make_mr, 249)
7953+        d.addCallback(lambda mr:
7954+            mr.get_block_and_salt(0))
7955+        d.addCallback(_check_block_and_salt)
7956+        return d
7957+
7958+
7959+    def test_read_with_prefetched_sdmf_data(self):
7960+        sdmf_data = self.build_test_sdmf_share()
7961+        self.write_sdmf_share_to_server("si1")
7962+        def _make_mr(ignored, length):
7963+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7964+            return mr
7965+
7966+        d = defer.succeed(None)
7967+        # This should be enough to get us the encoding parameters,
7968+        # offset table, and everything else we need to build a verinfo
7969+        # string.
7970+        d.addCallback(_make_mr, 107)
7971+        d.addCallback(lambda mr:
7972+            mr.get_verinfo())
7973+        def _check_verinfo(verinfo):
7974+            self.failUnless(verinfo)
7975+            self.failUnlessEqual(len(verinfo), 9)
7976+            (seqnum,
7977+             root_hash,
7978+             salt,
7979+             segsize,
7980+             datalen,
7981+             k,
7982+             n,
7983+             prefix,
7984+             offsets) = verinfo
7985+            self.failUnlessEqual(seqnum, 0)
7986+            self.failUnlessEqual(root_hash, self.root_hash)
7987+            self.failUnlessEqual(salt, self.salt)
7988+            self.failUnlessEqual(segsize, 36)
7989+            self.failUnlessEqual(datalen, 36)
7990+            self.failUnlessEqual(k, 3)
7991+            self.failUnlessEqual(n, 10)
7992+            expected_prefix = struct.pack(SIGNED_PREFIX,
7993+                                          0,
7994+                                          seqnum,
7995+                                          root_hash,
7996+                                          salt,
7997+                                          k,
7998+                                          n,
7999+                                          segsize,
8000+                                          datalen)
8001+            self.failUnlessEqual(expected_prefix, prefix)
8002+            self.failUnlessEqual(self.rref.read_count, 0)
8003+        d.addCallback(_check_verinfo)
8004+        # This shouldn't be enough to read any share data.
8005+        d.addCallback(_make_mr, 107)
8006+        d.addCallback(lambda mr:
8007+            mr.get_block_and_salt(0))
8008+        def _check_block_and_salt((block, salt)):
8009+            self.failUnlessEqual(block, self.block * 6)
8010+            self.failUnlessEqual(salt, self.salt)
8011+            # TODO: Fix the read routine so that it reads only the data
8012+            #       that it has cached if it can't read all of it.
8013+            self.failUnlessEqual(self.rref.read_count, 2)
8014+
8015+        # This should be enough to read share data.
8016+        d.addCallback(_make_mr, self.offsets['share_data'])
8017+        d.addCallback(lambda mr:
8018+            mr.get_block_and_salt(0))
8019+        d.addCallback(_check_block_and_salt)
8020+        return d
8021+
8022+
8023+    def test_read_with_empty_mdmf_file(self):
8024+        # Some tests upload a file with no contents to test things
8025+        # unrelated to the actual handling of the content of the file.
8026+        # The reader should behave intelligently in these cases.
8027+        self.write_test_share_to_server("si1", empty=True)
8028+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8029+        # We should be able to get the encoding parameters, and they
8030+        # should be correct.
8031+        d = defer.succeed(None)
8032+        d.addCallback(lambda ignored:
8033+            mr.get_encoding_parameters())
8034+        def _check_encoding_parameters(params):
8035+            self.failUnlessEqual(len(params), 4)
8036+            k, n, segsize, datalen = params
8037+            self.failUnlessEqual(k, 3)
8038+            self.failUnlessEqual(n, 10)
8039+            self.failUnlessEqual(segsize, 0)
8040+            self.failUnlessEqual(datalen, 0)
8041+        d.addCallback(_check_encoding_parameters)
8042+
8043+        # We should not be able to fetch a block, since there are no
8044+        # blocks to fetch
8045+        d.addCallback(lambda ignored:
8046+            self.shouldFail(LayoutInvalid, "get block on empty file",
8047+                            None,
8048+                            mr.get_block_and_salt, 0))
8049+        return d
8050+
8051+
8052+    def test_read_with_empty_sdmf_file(self):
8053+        self.write_sdmf_share_to_server("si1", empty=True)
8054+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8055+        # We should be able to get the encoding parameters, and they
8056+        # should be correct
8057+        d = defer.succeed(None)
8058+        d.addCallback(lambda ignored:
8059+            mr.get_encoding_parameters())
8060+        def _check_encoding_parameters(params):
8061+            self.failUnlessEqual(len(params), 4)
8062+            k, n, segsize, datalen = params
8063+            self.failUnlessEqual(k, 3)
8064+            self.failUnlessEqual(n, 10)
8065+            self.failUnlessEqual(segsize, 0)
8066+            self.failUnlessEqual(datalen, 0)
8067+        d.addCallback(_check_encoding_parameters)
8068+
8069+        # It does not make sense to get a block in this format, so we
8070+        # should not be able to.
8071+        d.addCallback(lambda ignored:
8072+            self.shouldFail(LayoutInvalid, "get block on an empty file",
8073+                            None,
8074+                            mr.get_block_and_salt, 0))
8075+        return d
8076+
8077+
8078+    def test_verinfo_with_sdmf_file(self):
8079+        self.write_sdmf_share_to_server("si1")
8080+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8081+        # We should be able to get the version information.
8082+        d = defer.succeed(None)
8083+        d.addCallback(lambda ignored:
8084+            mr.get_verinfo())
8085+        def _check_verinfo(verinfo):
8086+            self.failUnless(verinfo)
8087+            self.failUnlessEqual(len(verinfo), 9)
8088+            (seqnum,
8089+             root_hash,
8090+             salt,
8091+             segsize,
8092+             datalen,
8093+             k,
8094+             n,
8095+             prefix,
8096+             offsets) = verinfo
8097+            self.failUnlessEqual(seqnum, 0)
8098+            self.failUnlessEqual(root_hash, self.root_hash)
8099+            self.failUnlessEqual(salt, self.salt)
8100+            self.failUnlessEqual(segsize, 36)
8101+            self.failUnlessEqual(datalen, 36)
8102+            self.failUnlessEqual(k, 3)
8103+            self.failUnlessEqual(n, 10)
8104+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
8105+                                          0,
8106+                                          seqnum,
8107+                                          root_hash,
8108+                                          salt,
8109+                                          k,
8110+                                          n,
8111+                                          segsize,
8112+                                          datalen)
8113+            self.failUnlessEqual(prefix, expected_prefix)
8114+            self.failUnlessEqual(offsets, self.offsets)
8115+        d.addCallback(_check_verinfo)
8116+        return d
8117+
8118+
8119+    def test_verinfo_with_mdmf_file(self):
8120+        self.write_test_share_to_server("si1")
8121+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8122+        d = defer.succeed(None)
8123+        d.addCallback(lambda ignored:
8124+            mr.get_verinfo())
8125+        def _check_verinfo(verinfo):
8126+            self.failUnless(verinfo)
8127+            self.failUnlessEqual(len(verinfo), 9)
8128+            (seqnum,
8129+             root_hash,
8130+             IV,
8131+             segsize,
8132+             datalen,
8133+             k,
8134+             n,
8135+             prefix,
8136+             offsets) = verinfo
8137+            self.failUnlessEqual(seqnum, 0)
8138+            self.failUnlessEqual(root_hash, self.root_hash)
8139+            self.failIf(IV)
8140+            self.failUnlessEqual(segsize, 6)
8141+            self.failUnlessEqual(datalen, 36)
8142+            self.failUnlessEqual(k, 3)
8143+            self.failUnlessEqual(n, 10)
8144+            expected_prefix = struct.pack(">BQ32s BBQQ",
8145+                                          1,
8146+                                          seqnum,
8147+                                          root_hash,
8148+                                          k,
8149+                                          n,
8150+                                          segsize,
8151+                                          datalen)
8152+            self.failUnlessEqual(prefix, expected_prefix)
8153+            self.failUnlessEqual(offsets, self.offsets)
8154+        d.addCallback(_check_verinfo)
8155+        return d
8156+
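The two signed prefixes unpacked in the verinfo tests above differ only in the 16-byte salt field, which SDMF carries in its signed header but MDMF does not (MDMF keeps a salt per segment, which is why the IV slot in the MDMF verinfo is empty). A sketch of the size difference, reusing the format strings from the tests:

    import struct

    sdmf_prefix = ">BQ32s16sBBQQ"   # version, seqnum, root hash, salt, k, N, segsize, datalen
    mdmf_prefix = ">BQ32sBBQQ"      # the same fields, minus the 16-byte salt
    assert struct.calcsize(sdmf_prefix) == 75
    assert struct.calcsize(mdmf_prefix) == 59   # the MDMF offset table starts right after this
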
8157+
8158+    def test_reader_queue(self):
8159+        self.write_test_share_to_server('si1')
8160+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
8161+        d1 = mr.get_block_and_salt(0, queue=True)
8162+        d2 = mr.get_blockhashes(queue=True)
8163+        d3 = mr.get_sharehashes(queue=True)
8164+        d4 = mr.get_signature(queue=True)
8165+        d5 = mr.get_verification_key(queue=True)
8166+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
8167+        mr.flush()
8168+        def _print(results):
8169+            self.failUnlessEqual(len(results), 5)
8170+            # We have one read for version information and offsets, and
8171+            # one for everything else.
8172+            self.failUnlessEqual(self.rref.read_count, 2)
8173+            block, salt = results[0][1] # results[0][0] is a boolean that
8174+                                           # says whether or not the queued
8175+                                           # read succeeded.
8176+            self.failUnlessEqual(self.block, block)
8177+            self.failUnlessEqual(self.salt, salt)
8178+
8179+            blockhashes = results[1][1]
8180+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
8181+
8182+            sharehashes = results[2][1]
8183+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
8184+
8185+            signature = results[3][1]
8186+            self.failUnlessEqual(self.signature, signature)
8187+
8188+            verification_key = results[4][1]
8189+            self.failUnlessEqual(self.verification_key, verification_key)
8190+        dl.addCallback(_print)
8191+        return dl
8192+
8193+
8194+    def test_sdmf_writer(self):
8195+        # Go through the motions of writing an SDMF share to the storage
8196+        # server. Then read the storage server to see that the share got
8197+        # written in the way that we think it should have.
8198+
8199+        # We do this first so that the necessary instance variables get
8200+        # set the way we want them for the tests below.
8201+        data = self.build_test_sdmf_share()
8202+        sdmfr = SDMFSlotWriteProxy(0,
8203+                                   self.rref,
8204+                                   "si1",
8205+                                   self.secrets,
8206+                                   0, 3, 10, 36, 36)
8207+        # Put the block and salt.
8208+        sdmfr.put_block(self.blockdata, 0, self.salt)
8209+
8210+        # Put the encprivkey
8211+        sdmfr.put_encprivkey(self.encprivkey)
8212+
8213+        # Put the block and share hash chains
8214+        sdmfr.put_blockhashes(self.block_hash_tree)
8215+        sdmfr.put_sharehashes(self.share_hash_chain)
8216+        sdmfr.put_root_hash(self.root_hash)
8217+
8218+        # Put the signature
8219+        sdmfr.put_signature(self.signature)
8220+
8221+        # Put the verification key
8222+        sdmfr.put_verification_key(self.verification_key)
8223+
8224+        # Now check to make sure that nothing has been written yet.
8225+        self.failUnlessEqual(self.rref.write_count, 0)
8226+
8227+        # Now finish publishing
8228+        d = sdmfr.finish_publishing()
8229+        def _then(ignored):
8230+            self.failUnlessEqual(self.rref.write_count, 1)
8231+            read = self.ss.remote_slot_readv
8232+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
8233+                                 {0: [data]})
8234+        d.addCallback(_then)
8235+        return d
8236+
8237+
8238+    def test_sdmf_writer_preexisting_share(self):
8239+        data = self.build_test_sdmf_share()
8240+        self.write_sdmf_share_to_server("si1")
8241+
8242+        # Now there is a share on the storage server. To successfully
8243+        # write, we need to set the checkstring correctly. When we
8244+        # don't, no write should occur.
8245+        sdmfw = SDMFSlotWriteProxy(0,
8246+                                   self.rref,
8247+                                   "si1",
8248+                                   self.secrets,
8249+                                   1, 3, 10, 36, 36)
8250+        sdmfw.put_block(self.blockdata, 0, self.salt)
8251+
8252+        # Put the encprivkey
8253+        sdmfw.put_encprivkey(self.encprivkey)
8254+
8255+        # Put the block and share hash chains
8256+        sdmfw.put_blockhashes(self.block_hash_tree)
8257+        sdmfw.put_sharehashes(self.share_hash_chain)
8258+
8259+        # Put the root hash
8260+        sdmfw.put_root_hash(self.root_hash)
8261+
8262+        # Put the signature
8263+        sdmfw.put_signature(self.signature)
8264+
8265+        # Put the verification key
8266+        sdmfw.put_verification_key(self.verification_key)
8267+
8268+        # We shouldn't have a checkstring yet
8269+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
8270+
8271+        d = sdmfw.finish_publishing()
8272+        def _then(results):
8273+            self.failIf(results[0])
8274+            # this is the correct checkstring
8275+            self._expected_checkstring = results[1][0][0]
8276+            return self._expected_checkstring
8277+
8278+        d.addCallback(_then)
8279+        d.addCallback(sdmfw.set_checkstring)
8280+        d.addCallback(lambda ignored:
8281+            sdmfw.get_checkstring())
8282+        d.addCallback(lambda checkstring:
8283+            self.failUnlessEqual(checkstring, self._expected_checkstring))
8284+        d.addCallback(lambda ignored:
8285+            sdmfw.finish_publishing())
8286+        def _then_again(results):
8287+            self.failUnless(results[0])
8288+            read = self.ss.remote_slot_readv
8289+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
8290+                                 {0: [struct.pack(">Q", 1)]})
8291+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
8292+                                 {0: [data[9:]]})
8293+        d.addCallback(_then_again)
8294+        return d
8295+
8296+
8297 class Stats(unittest.TestCase):
8298 
8299     def setUp(self):
8300}
8301
8302Context:
8303
8304[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
8305david-sarah@jacaranda.org**20100625223929
8306 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
8307] 
8308[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
8309zooko@zooko.com**20100619034928
8310 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
8311] 
8312[trivial: tiny update to in-line comment
8313zooko@zooko.com**20100614045715
8314 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
8315 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
8316] 
8317[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
8318zooko@zooko.com**20100619065318
8319 Ignore-this: dc6db03f696e5b6d2848699e754d8053
8320] 
8321[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
8322zooko@zooko.com**20100619065124
8323 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
8324] 
8325[TAG allmydata-tahoe-1.7.0
8326zooko@zooko.com**20100619052631
8327 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
8328] 
8329Patch bundle hash:
83309f2dee795b75428ecf00eda6be7a491d3e9e73ba