Ticket #393: 393status15.dpatch

File 393status15.dpatch, 374.5 KB (added by kevan, at 2010-07-06T23:03:20Z)
1Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * Misc. changes to support the work I'm doing
3 
4      - Add a notion of file version number to interfaces.py
5      - Alter mutable file node interfaces to have a notion of version,
6        though this may be changed later.
7      - Alter mutable/filenode.py to conform to these changes.
8      - Add a salt hasher to util/hashutil.py
9
10Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * nodemaker.py: create MDMF files when asked to
12
13Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * storage/server.py: minor code cleanup
15
16Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
18
19Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
20  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
21
22Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
23  * Alter the ServermapUpdater to find MDMF files
24 
25  The servermapupdater should find MDMF files on a grid in the same way
26  that it finds SDMF files. This patch makes it do that.
27
28Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
29  * Make a segmented mutable uploader
30 
31  The mutable file uploader should be able to publish files with one
32  segment and files with multiple segments. This patch makes it do that.
33  This is still incomplete, and rather ugly -- I need to flesh out error
34  handling, I need to write tests, and I need to remove some of the uglier
35  kludges in the process before I can call this done.
36
37Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * Write a segmented mutable downloader
39 
40  The segmented mutable downloader can deal with MDMF files (files with
41  one or more segments in MDMF format) and SDMF files (files with one
42  segment in SDMF format). It is backwards compatible with the old
43  file format.
44 
45  This patch also contains tests for the segmented mutable downloader.
46
47Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
48  * mutable/checker.py: check MDMF files
49 
50  This patch adapts the mutable file checker and verifier to check and
51  verify MDMF files. It does this by using the new segmented downloader,
52  which is trained to perform verification operations on request. This
53  removes some code duplication.
54
55Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
56  * mutable/retrieve.py: learn how to verify mutable files
57
58Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * interfaces.py: add IMutableSlotWriter
60
61Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
62  * test/test_mutable.py: temporarily disable two tests that are now irrelevant
63
64Fri Jul  2 15:55:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
65  * Add MDMF reader and writer, and SDMF writer
66 
67  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
68  object proxies that exist for immutable files. They abstract away
69  details of connection, state, and caching from their callers (in this
70  case, the download, servermap updater, and uploader), and expose methods
71  to get and set information on the remote server.
72 
73  MDMFSlotReadProxy reads a mutable file from the server, doing the right
74  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
75  allows callers to tell it how to batch and flush reads.
76 
77  MDMFSlotWriteProxy writes an MDMF mutable file to a server.
78 
79  SDMFSlotWriteProxy writes an SDMF mutable file to a server.
80 
81  This patch also includes tests for MDMFSlotReadProxy,
82  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
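
  As a hypothetical usage sketch of the read side (share_data and shnum are
  assumed to be supplied by the caller): when the caller already holds the
  raw share bytes, as the revised corrupt() helper in test_mutable.py does,
  the remote reference and storage index may both be None.

      from allmydata.mutable.layout import MDMFSlotReadProxy

      # Works on either MDMF or SDMF share data.
      reader = MDMFSlotReadProxy(None, None, shnum, share_data)
      d = reader.get_verinfo()
      def _got_verinfo(verinfo):
          (seqnum, root_hash, IV, segsize, datalen,
           k, n, prefix, offsets) = verinfo
          return offsets
      d.addCallback(_got_verinfo)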
83
84Fri Jul  2 15:55:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
85  * mutable/publish.py: cleanup + simplification
86
87Fri Jul  2 15:57:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
88  * test/test_mutable.py: remove tests that are no longer relevant
89
90Tue Jul  6 14:52:17 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
91  * interfaces.py: create IMutableUploadable
92
93Tue Jul  6 14:52:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
94  * mutable/publish.py: add MutableDataHandle and MutableFileHandle
95
96Tue Jul  6 14:55:41 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
97  * mutable/publish.py: reorganize in preparation of file-like uploadables
98
99Tue Jul  6 14:56:49 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
100  * test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
101
102New patches:
103
104[Misc. changes to support the work I'm doing
105Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
106 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
107 
108     - Add a notion of file version number to interfaces.py
109     - Alter mutable file node interfaces to have a notion of version,
110       though this may be changed later.
111     - Alter mutable/filenode.py to conform to these changes.
112     - Add a salt hasher to util/hashutil.py
113] {
114hunk ./src/allmydata/interfaces.py 7
115      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
116 
117 HASH_SIZE=32
118+SALT_SIZE=16
119+
120+SDMF_VERSION=0
121+MDMF_VERSION=1
122 
123 Hash = StringConstraint(maxLength=HASH_SIZE,
124                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
125hunk ./src/allmydata/interfaces.py 811
126         writer-visible data using this writekey.
127         """
128 
129+    def set_version(version):
130+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
131+        we upload in SDMF for reasons of compatibility. If you want to
132+        change this, set_version will let you do that.
133+
134+        To say that this file should be uploaded in SDMF, pass in a 0. To
135+        say that the file should be uploaded as MDMF, pass in a 1.
136+        """
137+
138+    def get_version():
139+        """Returns the mutable file protocol version."""
140+
141 class NotEnoughSharesError(Exception):
142     """Download was unable to get enough shares"""
143 
144hunk ./src/allmydata/mutable/filenode.py 8
145 from twisted.internet import defer, reactor
146 from foolscap.api import eventually
147 from allmydata.interfaces import IMutableFileNode, \
148-     ICheckable, ICheckResults, NotEnoughSharesError
149+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
150 from allmydata.util import hashutil, log
151 from allmydata.util.assertutil import precondition
152 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
153hunk ./src/allmydata/mutable/filenode.py 67
154         self._sharemap = {} # known shares, shnum-to-[nodeids]
155         self._cache = ResponseCache()
156         self._most_recent_size = None
157+        # filled in after __init__ if we're being created for the first time;
158+        # filled in by the servermap updater before publishing, otherwise.
159+        # set to this default value in case neither of those things happen,
160+        # or in case the servermap can't find any shares to tell us what
161+        # to publish as.
162+        # TODO: Set this back to None, and find out why the tests fail
163+        #       with it set to None.
164+        self._protocol_version = SDMF_VERSION
165 
166         # all users of this MutableFileNode go through the serializer. This
167         # takes advantage of the fact that Deferreds discard the callbacks
168hunk ./src/allmydata/mutable/filenode.py 472
169     def _did_upload(self, res, size):
170         self._most_recent_size = size
171         return res
172+
173+
174+    def set_version(self, version):
175+        # I can be set in two ways:
176+        #  1. When the node is created.
177+        #  2. (for an existing share) when the Servermap is updated
178+        #     before I am read.
179+        assert version in (MDMF_VERSION, SDMF_VERSION)
180+        self._protocol_version = version
181+
182+
183+    def get_version(self):
184+        return self._protocol_version
185hunk ./src/allmydata/util/hashutil.py 90
186 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
187 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
188 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
189+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
190 
191 # dirnodes
192 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
193hunk ./src/allmydata/util/hashutil.py 134
194 def plaintext_segment_hasher():
195     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
196 
197+def mutable_salt_hash(data):
198+    return tagged_hash(MUTABLE_SALT_TAG, data)
199+def mutable_salt_hasher():
200+    return tagged_hasher(MUTABLE_SALT_TAG)
201+
202 KEYLEN = 16
203 IVLEN = 16
204 
205}
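
A brief sketch of the API this patch adds (hypothetical usage; node is
assumed to be an existing MutableFileNode, and the string fed to the salt
hasher is only a placeholder):

    from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION
    from allmydata.util import hashutil

    # Ask for MDMF on the next upload; the default is SDMF_VERSION.
    node.set_version(MDMF_VERSION)
    assert node.get_version() == MDMF_VERSION

    # Derive a tagged salt hash with the new helper.
    salt = hashutil.mutable_salt_hash("segment 0 plaintext")
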
206[nodemaker.py: create MDMF files when asked to
207Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
208 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
209] {
210hunk ./src/allmydata/nodemaker.py 3
211 import weakref
212 from zope.interface import implements
213-from allmydata.interfaces import INodeMaker
214+from allmydata.util.assertutil import precondition
215+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
216+                                 SDMF_VERSION, MDMF_VERSION
217 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
218 from allmydata.immutable.upload import Data
219 from allmydata.mutable.filenode import MutableFileNode
220hunk ./src/allmydata/nodemaker.py 92
221             return self._create_dirnode(filenode)
222         return None
223 
224-    def create_mutable_file(self, contents=None, keysize=None):
225+    def create_mutable_file(self, contents=None, keysize=None,
226+                            version=SDMF_VERSION):
227         n = MutableFileNode(self.storage_broker, self.secret_holder,
228                             self.default_encoding_parameters, self.history)
229hunk ./src/allmydata/nodemaker.py 96
230+        n.set_version(version)
231         d = self.key_generator.generate(keysize)
232         d.addCallback(n.create_with_keys, contents)
233         d.addCallback(lambda res: n)
234hunk ./src/allmydata/nodemaker.py 102
235         return d
236 
237-    def create_new_mutable_directory(self, initial_children={}):
238+    def create_new_mutable_directory(self, initial_children={},
239+                                     version=SDMF_VERSION):
240+        # initial_children must have metadata (i.e. {} instead of None)
241+        for (name, (node, metadata)) in initial_children.iteritems():
242+            precondition(isinstance(metadata, dict),
243+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
244+            node.raise_error()
245         d = self.create_mutable_file(lambda n:
246hunk ./src/allmydata/nodemaker.py 110
247-                                     pack_children(n, initial_children))
248+                                     pack_children(n, initial_children),
249+                                     version)
250         d.addCallback(self._create_dirnode)
251         return d
252 
253}
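
For example (a sketch, assuming nodemaker is an existing NodeMaker and
MDMF_VERSION has been imported from allmydata.interfaces), the new keyword
argument lets callers ask for an MDMF file directly:

    d = nodemaker.create_mutable_file("contents " * 1000,
                                      version=MDMF_VERSION)
    def _created(node):
        assert node.get_version() == MDMF_VERSION
        return node
    d.addCallback(_created)
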
254[storage/server.py: minor code cleanup
255Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
256 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
257] {
258hunk ./src/allmydata/storage/server.py 569
259                                          self)
260         return share
261 
262-    def remote_slot_readv(self, storage_index, shares, readv):
263+    def remote_slot_readv(self, storage_index, shares, readvs):
264         start = time.time()
265         self.count("readv")
266         si_s = si_b2a(storage_index)
267hunk ./src/allmydata/storage/server.py 590
268             if sharenum in shares or not shares:
269                 filename = os.path.join(bucketdir, sharenum_s)
270                 msf = MutableShareFile(filename, self)
271-                datavs[sharenum] = msf.readv(readv)
272+                datavs[sharenum] = msf.readv(readvs)
273         log.msg("returning shares %s" % (datavs.keys(),),
274                 facility="tahoe.storage", level=log.NOISY, parent=lp)
275         self.add_latency("readv", time.time() - start)
276}
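
For reference, the renamed readvs argument is an ordinary read vector: a
list of (offset, length) tuples, and the call answers with a dict mapping
share numbers to lists of the requested byte ranges. A hypothetical sketch,
assuming ss is a remote reference to a storage server and storage_index
names the slot (passing an empty share list reads from every share):

    readvs = [(0, 2000)]    # first 2000 bytes of each share
    d = ss.callRemote("slot_readv", storage_index, [], readvs)
    d.addCallback(lambda datavs: sorted(datavs.keys()))
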
277[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
278Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
279 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
280] {
281hunk ./src/allmydata/test/test_mutable.py 151
282             chr(ord(original[byte_offset]) ^ 0x01) +
283             original[byte_offset+1:])
284 
285+def add_two(original, byte_offset):
286+    # It isn't enough to simply flip the bit for the version number,
287+    # because 1 is a valid version number. So we add two instead.
288+    return (original[:byte_offset] +
289+            chr(ord(original[byte_offset]) ^ 0x02) +
290+            original[byte_offset+1:])
291+
292 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
293     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
294     # list of shnums to corrupt.
295hunk ./src/allmydata/test/test_mutable.py 187
296                 real_offset = offset1
297             real_offset = int(real_offset) + offset2 + offset_offset
298             assert isinstance(real_offset, int), offset
299-            shares[shnum] = flip_bit(data, real_offset)
300+            if offset1 == 0: # verbyte
301+                f = add_two
302+            else:
303+                f = flip_bit
304+            shares[shnum] = f(data, real_offset)
305     return res
306 
307 def make_storagebroker(s=None, num_peers=10):
308hunk ./src/allmydata/test/test_mutable.py 423
309         d.addCallback(_created)
310         return d
311 
312+
313     def test_modify_backoffer(self):
314         def _modifier(old_contents, servermap, first_time):
315             return old_contents + "line2"
316hunk ./src/allmydata/test/test_mutable.py 658
317         d.addCallback(_created)
318         return d
319 
320+
321     def _copy_shares(self, ignored, index):
322         shares = self._storage._peers
323         # we need a deep copy
324}
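
The reasoning behind add_two, spelled out: both 0 (SDMF) and 1 (MDMF) are
valid version bytes, so flipping the low bit can turn one valid version
into the other and the corruption would go undetected. Despite its name,
add_two XORs the byte with 0x02, which for the values 0 and 1 is the same
as adding two and always yields an invalid version byte:

    assert 0 ^ 0x01 == 1 and 1 ^ 0x01 == 0   # still valid versions
    assert 0 ^ 0x02 == 2 and 1 ^ 0x02 == 3   # both invalid, so detectable
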
325[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
326Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
327 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
328] {
329hunk ./src/allmydata/test/test_mutable.py 168
330                 and shnum not in shnums_to_corrupt):
331                 continue
332             data = shares[shnum]
333-            (version,
334-             seqnum,
335-             root_hash,
336-             IV,
337-             k, N, segsize, datalen,
338-             o) = unpack_header(data)
339-            if isinstance(offset, tuple):
340-                offset1, offset2 = offset
341-            else:
342-                offset1 = offset
343-                offset2 = 0
344-            if offset1 == "pubkey":
345-                real_offset = 107
346-            elif offset1 in o:
347-                real_offset = o[offset1]
348-            else:
349-                real_offset = offset1
350-            real_offset = int(real_offset) + offset2 + offset_offset
351-            assert isinstance(real_offset, int), offset
352-            if offset1 == 0: # verbyte
353-                f = add_two
354-            else:
355-                f = flip_bit
356-            shares[shnum] = f(data, real_offset)
357-    return res
358+            # We're feeding the reader all of the share data, so it
359+            # won't need to use the rref that we didn't provide, nor the
360+            # storage index that we didn't provide. We do this because
361+            # the reader will work for both MDMF and SDMF.
362+            reader = MDMFSlotReadProxy(None, None, shnum, data)
363+            # We need to get the offsets for the next part.
364+            d = reader.get_verinfo()
365+            def _do_corruption(verinfo, data, shnum):
366+                (seqnum,
367+                 root_hash,
368+                 IV,
369+                 segsize,
370+                 datalen,
371+                 k, n, prefix, o) = verinfo
372+                if isinstance(offset, tuple):
373+                    offset1, offset2 = offset
374+                else:
375+                    offset1 = offset
376+                    offset2 = 0
377+                if offset1 == "pubkey":
378+                    real_offset = 107
379+                elif offset1 in o:
380+                    real_offset = o[offset1]
381+                else:
382+                    real_offset = offset1
383+                real_offset = int(real_offset) + offset2 + offset_offset
384+                assert isinstance(real_offset, int), offset
385+                if offset1 == 0: # verbyte
386+                    f = add_two
387+                else:
388+                    f = flip_bit
389+                shares[shnum] = f(data, real_offset)
390+            d.addCallback(_do_corruption, data, shnum)
391+            ds.append(d)
392+    dl = defer.DeferredList(ds)
393+    dl.addCallback(lambda ignored: res)
394+    return dl
395 
396 def make_storagebroker(s=None, num_peers=10):
397     if not s:
398hunk ./src/allmydata/test/test_mutable.py 1177
399         return d
400 
401     def test_download_fails(self):
402-        corrupt(None, self._storage, "signature")
403-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
404+        d = corrupt(None, self._storage, "signature")
405+        d.addCallback(lambda ignored:
406+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
407                             "no recoverable versions",
408                             self._fn.download_best_version)
409         return d
410hunk ./src/allmydata/test/test_mutable.py 1232
411         return d
412 
413     def test_check_all_bad_sig(self):
414-        corrupt(None, self._storage, 1) # bad sig
415-        d = self._fn.check(Monitor())
416+        d = corrupt(None, self._storage, 1) # bad sig
417+        d.addCallback(lambda ignored:
418+            self._fn.check(Monitor()))
419         d.addCallback(self.check_bad, "test_check_all_bad_sig")
420         return d
421 
422hunk ./src/allmydata/test/test_mutable.py 1239
423     def test_check_all_bad_blocks(self):
424-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
425+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
426         # the Checker won't notice this.. it doesn't look at actual data
427hunk ./src/allmydata/test/test_mutable.py 1241
428-        d = self._fn.check(Monitor())
429+        d.addCallback(lambda ignored:
430+            self._fn.check(Monitor()))
431         d.addCallback(self.check_good, "test_check_all_bad_blocks")
432         return d
433 
434hunk ./src/allmydata/test/test_mutable.py 1252
435         return d
436 
437     def test_verify_all_bad_sig(self):
438-        corrupt(None, self._storage, 1) # bad sig
439-        d = self._fn.check(Monitor(), verify=True)
440+        d = corrupt(None, self._storage, 1) # bad sig
441+        d.addCallback(lambda ignored:
442+            self._fn.check(Monitor(), verify=True))
443         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
444         return d
445 
446hunk ./src/allmydata/test/test_mutable.py 1259
447     def test_verify_one_bad_sig(self):
448-        corrupt(None, self._storage, 1, [9]) # bad sig
449-        d = self._fn.check(Monitor(), verify=True)
450+        d = corrupt(None, self._storage, 1, [9]) # bad sig
451+        d.addCallback(lambda ignored:
452+            self._fn.check(Monitor(), verify=True))
453         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
454         return d
455 
456hunk ./src/allmydata/test/test_mutable.py 1266
457     def test_verify_one_bad_block(self):
458-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
459+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
460         # the Verifier *will* notice this, since it examines every byte
461hunk ./src/allmydata/test/test_mutable.py 1268
462-        d = self._fn.check(Monitor(), verify=True)
463+        d.addCallback(lambda ignored:
464+            self._fn.check(Monitor(), verify=True))
465         d.addCallback(self.check_bad, "test_verify_one_bad_block")
466         d.addCallback(self.check_expected_failure,
467                       CorruptShareError, "block hash tree failure",
468hunk ./src/allmydata/test/test_mutable.py 1277
469         return d
470 
471     def test_verify_one_bad_sharehash(self):
472-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
473-        d = self._fn.check(Monitor(), verify=True)
474+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
475+        d.addCallback(lambda ignored:
476+            self._fn.check(Monitor(), verify=True))
477         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
478         d.addCallback(self.check_expected_failure,
479                       CorruptShareError, "corrupt hashes",
480hunk ./src/allmydata/test/test_mutable.py 1287
481         return d
482 
483     def test_verify_one_bad_encprivkey(self):
484-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
485-        d = self._fn.check(Monitor(), verify=True)
486+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
487+        d.addCallback(lambda ignored:
488+            self._fn.check(Monitor(), verify=True))
489         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
490         d.addCallback(self.check_expected_failure,
491                       CorruptShareError, "invalid privkey",
492hunk ./src/allmydata/test/test_mutable.py 1297
493         return d
494 
495     def test_verify_one_bad_encprivkey_uncheckable(self):
496-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
497+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
498         readonly_fn = self._fn.get_readonly()
499         # a read-only node has no way to validate the privkey
500hunk ./src/allmydata/test/test_mutable.py 1300
501-        d = readonly_fn.check(Monitor(), verify=True)
502+        d.addCallback(lambda ignored:
503+            readonly_fn.check(Monitor(), verify=True))
504         d.addCallback(self.check_good,
505                       "test_verify_one_bad_encprivkey_uncheckable")
506         return d
507}
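
Because corrupt() now goes through MDMFSlotReadProxy.get_verinfo(), it
returns a Deferred (fired with its res argument) instead of finishing
synchronously, so call sites chain their follow-up work onto it. A
condensed sketch of the new calling convention, assuming the same test
fixture (self._storage, self._fn, Monitor) used in the hunks above:

    d = corrupt(None, self._storage, "signature")
    d.addCallback(lambda ignored: self._fn.check(Monitor(), verify=True))
    d.addCallback(self.check_bad, "test_verify_all_bad_sig")
    return d
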
508[Alter the ServermapUpdater to find MDMF files
509Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
510 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
511 
512 The servermapupdater should find MDMF files on a grid in the same way
513 that it finds SDMF files. This patch makes it do that.
514] {
515hunk ./src/allmydata/mutable/servermap.py 7
516 from itertools import count
517 from twisted.internet import defer
518 from twisted.python import failure
519-from foolscap.api import DeadReferenceError, RemoteException, eventually
520+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
521+                         fireEventually
522 from allmydata.util import base32, hashutil, idlib, log
523 from allmydata.storage.server import si_b2a
524 from allmydata.interfaces import IServermapUpdaterStatus
525hunk ./src/allmydata/mutable/servermap.py 17
526 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
527      DictOfSets, CorruptShareError, NeedMoreDataError
528 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
529-     SIGNED_PREFIX_LENGTH
530+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
531 
532 class UpdateStatus:
533     implements(IServermapUpdaterStatus)
534hunk ./src/allmydata/mutable/servermap.py 254
535         """Return a set of versionids, one for each version that is currently
536         recoverable."""
537         versionmap = self.make_versionmap()
538-
539         recoverable_versions = set()
540         for (verinfo, shares) in versionmap.items():
541             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
542hunk ./src/allmydata/mutable/servermap.py 366
543         self._servers_responded = set()
544 
545         # how much data should we read?
546+        # SDMF:
547         #  * if we only need the checkstring, then [0:75]
548         #  * if we need to validate the checkstring sig, then [543ish:799ish]
549         #  * if we need the verification key, then [107:436ish]
550hunk ./src/allmydata/mutable/servermap.py 374
551         #  * if we need the encrypted private key, we want [-1216ish:]
552         #   * but we can't read from negative offsets
553         #   * the offset table tells us the 'ish', also the positive offset
554-        # A future version of the SMDF slot format should consider using
555-        # fixed-size slots so we can retrieve less data. For now, we'll just
556-        # read 2000 bytes, which also happens to read enough actual data to
557-        # pre-fetch a 9-entry dirnode.
558+        # MDMF:
559+        #  * Checkstring? [0:72]
560+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
561+        #    the offset table will tell us for sure.
562+        #  * If we need the verification key, we have to consult the offset
563+        #    table as well.
564+        # At this point, we don't know which we are. Our filenode can
565+        # tell us, but it might be lying -- in some cases, we're
566+        # responsible for telling it which kind of file it is.
567         self._read_size = 4000
568         if mode == MODE_CHECK:
569             # we use unpack_prefix_and_signature, so we need 1k
570hunk ./src/allmydata/mutable/servermap.py 432
571         self._queries_completed = 0
572 
573         sb = self._storage_broker
574+        # All of the peers, permuted by the storage index, as usual.
575         full_peerlist = sb.get_servers_for_index(self._storage_index)
576         self.full_peerlist = full_peerlist # for use later, immutable
577         self.extra_peers = full_peerlist[:] # peers are removed as we use them
578hunk ./src/allmydata/mutable/servermap.py 439
579         self._good_peers = set() # peers who had some shares
580         self._empty_peers = set() # peers who don't have any shares
581         self._bad_peers = set() # peers to whom our queries failed
582+        self._readers = {} # peerid -> dict(sharewriters), filled in
583+                           # after responses come in.
584 
585         k = self._node.get_required_shares()
586hunk ./src/allmydata/mutable/servermap.py 443
587+        # For what cases can these conditions work?
588         if k is None:
589             # make a guess
590             k = 3
591hunk ./src/allmydata/mutable/servermap.py 456
592         self.num_peers_to_query = k + self.EPSILON
593 
594         if self.mode == MODE_CHECK:
595+            # We want to query all of the peers.
596             initial_peers_to_query = dict(full_peerlist)
597             must_query = set(initial_peers_to_query.keys())
598             self.extra_peers = []
599hunk ./src/allmydata/mutable/servermap.py 464
600             # we're planning to replace all the shares, so we want a good
601             # chance of finding them all. We will keep searching until we've
602             # seen epsilon that don't have a share.
603+            # We don't query all of the peers because that could take a while.
604             self.num_peers_to_query = N + self.EPSILON
605             initial_peers_to_query, must_query = self._build_initial_querylist()
606             self.required_num_empty_peers = self.EPSILON
607hunk ./src/allmydata/mutable/servermap.py 474
608             # might also avoid the round trip required to read the encrypted
609             # private key.
610 
611-        else:
612+        else: # MODE_READ, MODE_ANYTHING
613+            # 2k peers is good enough.
614             initial_peers_to_query, must_query = self._build_initial_querylist()
615 
616         # this is a set of peers that we are required to get responses from:
617hunk ./src/allmydata/mutable/servermap.py 490
618         # before we can consider ourselves finished, and self.extra_peers
619         # contains the overflow (peers that we should tap if we don't get
620         # enough responses)
621+        # I guess that self._must_query is a subset of
622+        # initial_peers_to_query?
623+        assert set(must_query).issubset(set(initial_peers_to_query))
624 
625         self._send_initial_requests(initial_peers_to_query)
626         self._status.timings["initial_queries"] = time.time() - self._started
627hunk ./src/allmydata/mutable/servermap.py 549
628         # errors that aren't handled by _query_failed (and errors caused by
629         # _query_failed) get logged, but we still want to check for doneness.
630         d.addErrback(log.err)
631-        d.addBoth(self._check_for_done)
632         d.addErrback(self._fatal_error)
633hunk ./src/allmydata/mutable/servermap.py 550
634+        d.addCallback(self._check_for_done)
635         return d
636 
637     def _do_read(self, ss, peerid, storage_index, shnums, readv):
638hunk ./src/allmydata/mutable/servermap.py 569
639         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
640         return d
641 
642+
643+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
644+        """
645+        I am called when a remote server returns a corrupt share in
646+        response to one of our queries. By corrupt, I mean a share
647+        without a valid signature. I then record the failure, notify the
648+        server of the corruption, and record the share as bad.
649+        """
650+        f = failure.Failure(e)
651+        self.log(format="bad share: %(f_value)s", f_value=str(f),
652+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
653+        # Notify the server that its share is corrupt.
654+        self.notify_server_corruption(peerid, shnum, str(e))
655+        # By flagging this as a bad peer, we won't count any of
656+        # the other shares on that peer as valid, though if we
657+        # happen to find a valid version string amongst those
658+        # shares, we'll keep track of it so that we don't need
659+        # to validate the signature on those again.
660+        self._bad_peers.add(peerid)
661+        self._last_failure = f
662+        # XXX: Use the reader for this?
663+        checkstring = data[:SIGNED_PREFIX_LENGTH]
664+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
665+        self._servermap.problems.append(f)
666+
667+
668+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
669+        """
670+        If one of my queries returns successfully (which means that we
671+        were able to and successfully did validate the signature), I
672+        cache the data that we initially fetched from the storage
673+        server. This will help reduce the number of roundtrips that need
674+        to occur when the file is downloaded, or when the file is
675+        updated.
676+        """
677+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
678+
679+
680     def _got_results(self, datavs, peerid, readsize, stuff, started):
681         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
682                       peerid=idlib.shortnodeid_b2a(peerid),
683hunk ./src/allmydata/mutable/servermap.py 630
684         else:
685             self._empty_peers.add(peerid)
686 
687-        last_verinfo = None
688-        last_shnum = None
689+        ss, storage_index = stuff
690+        ds = []
691+
692         for shnum,datav in datavs.items():
693             data = datav[0]
694hunk ./src/allmydata/mutable/servermap.py 635
695-            try:
696-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
697-                last_verinfo = verinfo
698-                last_shnum = shnum
699-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
700-            except CorruptShareError, e:
701-                # log it and give the other shares a chance to be processed
702-                f = failure.Failure()
703-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
704-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
705-                self.notify_server_corruption(peerid, shnum, str(e))
706-                self._bad_peers.add(peerid)
707-                self._last_failure = f
708-                checkstring = data[:SIGNED_PREFIX_LENGTH]
709-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
710-                self._servermap.problems.append(f)
711-                pass
712-
713-        self._status.timings["cumulative_verify"] += (time.time() - now)
714+            reader = MDMFSlotReadProxy(ss,
715+                                       storage_index,
716+                                       shnum,
717+                                       data)
718+            self._readers.setdefault(peerid, dict())[shnum] = reader
719+            # our goal, with each response, is to validate the version
720+            # information and share data as best we can at this point --
721+            # we do this by validating the signature. To do this, we
722+            # need to do the following:
723+            #   - If we don't already have the public key, fetch the
724+            #     public key. We use this to validate the signature.
725+            if not self._node.get_pubkey():
726+                # fetch and set the public key.
727+                d = reader.get_verification_key()
728+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
729+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
730+                # XXX: Make self._pubkey_query_failed?
731+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
732+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
733+            else:
734+                # we already have the public key.
735+                d = defer.succeed(None)
736+            # Neither of these two branches return anything of
737+            # consequence, so the first entry in our deferredlist will
738+            # be None.
739 
740hunk ./src/allmydata/mutable/servermap.py 661
741-        if self._need_privkey and last_verinfo:
742-            # send them a request for the privkey. We send one request per
743-            # server.
744-            lp2 = self.log("sending privkey request",
745-                           parent=lp, level=log.NOISY)
746-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
747-             offsets_tuple) = last_verinfo
748-            o = dict(offsets_tuple)
749+            # - Next, we need the version information. We almost
750+            #   certainly got this by reading the first thousand or so
751+            #   bytes of the share on the storage server, so we
752+            #   shouldn't need to fetch anything at this step.
753+            d2 = reader.get_verinfo()
754+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
755+                self._got_corrupt_share(error, shnum, peerid, data, lp))
756+            # - Next, we need the signature. For an SDMF share, it is
757+            #   likely that we fetched this when doing our initial fetch
758+            #   to get the version information. In MDMF, this lives at
759+            #   the end of the share, so unless the file is quite small,
760+            #   we'll need to do a remote fetch to get it.
761+            d3 = reader.get_signature()
762+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
763+                self._got_corrupt_share(error, shnum, peerid, data, lp))
764+            #  Once we have all three of these responses, we can move on
765+            #  to validating the signature
766 
767hunk ./src/allmydata/mutable/servermap.py 679
768-            self._queries_outstanding.add(peerid)
769-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
770-            ss = self._servermap.connections[peerid]
771-            privkey_started = time.time()
772-            d = self._do_read(ss, peerid, self._storage_index,
773-                              [last_shnum], readv)
774-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
775-                          privkey_started, lp2)
776-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
777-            d.addErrback(log.err)
778-            d.addCallback(self._check_for_done)
779-            d.addErrback(self._fatal_error)
780+            # Does the node already have a privkey? If not, we'll try to
781+            # fetch it here.
782+            if self._need_privkey:
783+                d4 = reader.get_encprivkey()
784+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
785+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
786+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
787+                    self._privkey_query_failed(error, shnum, data, lp))
788+            else:
789+                d4 = defer.succeed(None)
790 
791hunk ./src/allmydata/mutable/servermap.py 690
792+            dl = defer.DeferredList([d, d2, d3, d4])
793+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
794+                self._got_signature_one_share(results, shnum, peerid, lp))
795+            dl.addErrback(lambda error, shnum=shnum, data=data:
796+               self._got_corrupt_share(error, shnum, peerid, data, lp))
797+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
798+                self._cache_good_sharedata(verinfo, shnum, now, data))
799+            ds.append(dl)
800+        # dl is a deferred list that will fire when all of the shares
801+        # that we found on this peer are done processing. When dl fires,
802+        # we know that processing is done, so we can decrement the
803+        # semaphore-like thing that we incremented earlier.
804+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
805+        # Are we done? Done means that there are no more queries to
806+        # send, that there are no outstanding queries, and that we
807+        # haven't received any queries that are still processing. If we
808+        # are done, self._check_for_done will cause the done deferred
809+        # that we returned to our caller to fire, which tells them that
810+        # they have a complete servermap, and that we won't be touching
811+        # the servermap anymore.
812+        dl.addCallback(self._check_for_done)
813+        dl.addErrback(self._fatal_error)
814         # all done!
815         self.log("_got_results done", parent=lp, level=log.NOISY)
816hunk ./src/allmydata/mutable/servermap.py 714
817+        return dl
818+
819+
820+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
821+        if self._node.get_pubkey():
822+            return # don't go through this again if we don't have to
823+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
824+        assert len(fingerprint) == 32
825+        if fingerprint != self._node.get_fingerprint():
826+            raise CorruptShareError(peerid, shnum,
827+                                "pubkey doesn't match fingerprint")
828+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
829+        assert self._node.get_pubkey()
830+
831 
832     def notify_server_corruption(self, peerid, shnum, reason):
833         ss = self._servermap.connections[peerid]
834hunk ./src/allmydata/mutable/servermap.py 734
835         ss.callRemoteOnly("advise_corrupt_share",
836                           "mutable", self._storage_index, shnum, reason)
837 
838-    def _got_results_one_share(self, shnum, data, peerid, lp):
839+
840+    def _got_signature_one_share(self, results, shnum, peerid, lp):
841+        # It is our job to give versioninfo to our caller. We need to
842+        # raise CorruptShareError if the share is corrupt for any
843+        # reason, something that our caller will handle.
844         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
845                  shnum=shnum,
846                  peerid=idlib.shortnodeid_b2a(peerid),
847hunk ./src/allmydata/mutable/servermap.py 744
848                  level=log.NOISY,
849                  parent=lp)
850-
851-        # this might raise NeedMoreDataError, if the pubkey and signature
852-        # live at some weird offset. That shouldn't happen, so I'm going to
853-        # treat it as a bad share.
854-        (seqnum, root_hash, IV, k, N, segsize, datalength,
855-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
856-
857-        if not self._node.get_pubkey():
858-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
859-            assert len(fingerprint) == 32
860-            if fingerprint != self._node.get_fingerprint():
861-                raise CorruptShareError(peerid, shnum,
862-                                        "pubkey doesn't match fingerprint")
863-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
864-
865-        if self._need_privkey:
866-            self._try_to_extract_privkey(data, peerid, shnum, lp)
867-
868-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
869-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
870+        _, verinfo, signature, __ = results
871+        (seqnum,
872+         root_hash,
873+         saltish,
874+         segsize,
875+         datalen,
876+         k,
877+         n,
878+         prefix,
879+         offsets) = verinfo[1]
880         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
881 
882hunk ./src/allmydata/mutable/servermap.py 756
883-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
884+        # XXX: This should be done for us in the method, so
885+        # presumably you can go in there and fix it.
886+        verinfo = (seqnum,
887+                   root_hash,
888+                   saltish,
889+                   segsize,
890+                   datalen,
891+                   k,
892+                   n,
893+                   prefix,
894                    offsets_tuple)
895hunk ./src/allmydata/mutable/servermap.py 767
896+        # This tuple uniquely identifies a share on the grid; we use it
897+        # to keep track of the ones that we've already seen.
898 
899         if verinfo not in self._valid_versions:
900hunk ./src/allmydata/mutable/servermap.py 771
901-            # it's a new pair. Verify the signature.
902-            valid = self._node.get_pubkey().verify(prefix, signature)
903+            # This is a new version tuple, and we need to validate it
904+            # against the public key before keeping track of it.
905+            assert self._node.get_pubkey()
906+            valid = self._node.get_pubkey().verify(prefix, signature[1])
907             if not valid:
908hunk ./src/allmydata/mutable/servermap.py 776
909-                raise CorruptShareError(peerid, shnum, "signature is invalid")
910+                raise CorruptShareError(peerid, shnum,
911+                                        "signature is invalid")
912 
913hunk ./src/allmydata/mutable/servermap.py 779
914-            # ok, it's a valid verinfo. Add it to the list of validated
915-            # versions.
916-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
917-                     % (seqnum, base32.b2a(root_hash)[:4],
918-                        idlib.shortnodeid_b2a(peerid), shnum,
919-                        k, N, segsize, datalength),
920-                     parent=lp)
921-            self._valid_versions.add(verinfo)
922-        # We now know that this is a valid candidate verinfo.
923+        # ok, it's a valid verinfo. Add it to the list of validated
924+        # versions.
925+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
926+                 % (seqnum, base32.b2a(root_hash)[:4],
927+                    idlib.shortnodeid_b2a(peerid), shnum,
928+                    k, n, segsize, datalen),
929+                    parent=lp)
930+        self._valid_versions.add(verinfo)
931+        # We now know that this is a valid candidate verinfo. Whether or
932+        # not this instance of it is valid is a matter for the next
933+        # statement; at this point, we just know that if we see this
934+        # version info again, that its signature checks out and that
935+        # we're okay to skip the signature-checking step.
936 
937hunk ./src/allmydata/mutable/servermap.py 793
938+        # (peerid, shnum) are bound in the method invocation.
939         if (peerid, shnum) in self._servermap.bad_shares:
940             # we've been told that the rest of the data in this share is
941             # unusable, so don't add it to the servermap.
942hunk ./src/allmydata/mutable/servermap.py 808
943         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
944         return verinfo
945 
946+
947     def _deserialize_pubkey(self, pubkey_s):
948         verifier = rsa.create_verifying_key_from_string(pubkey_s)
949         return verifier
950hunk ./src/allmydata/mutable/servermap.py 813
951 
952-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
953-        try:
954-            r = unpack_share(data)
955-        except NeedMoreDataError, e:
956-            # this share won't help us. oh well.
957-            offset = e.encprivkey_offset
958-            length = e.encprivkey_length
959-            self.log("shnum %d on peerid %s: share was too short (%dB) "
960-                     "to get the encprivkey; [%d:%d] ought to hold it" %
961-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
962-                      offset, offset+length),
963-                     parent=lp)
964-            # NOTE: if uncoordinated writes are taking place, someone might
965-            # change the share (and most probably move the encprivkey) before
966-            # we get a chance to do one of these reads and fetch it. This
967-            # will cause us to see a NotEnoughSharesError(unable to fetch
968-            # privkey) instead of an UncoordinatedWriteError . This is a
969-            # nuisance, but it will go away when we move to DSA-based mutable
970-            # files (since the privkey will be small enough to fit in the
971-            # write cap).
972-
973-            return
974-
975-        (seqnum, root_hash, IV, k, N, segsize, datalen,
976-         pubkey, signature, share_hash_chain, block_hash_tree,
977-         share_data, enc_privkey) = r
978-
979-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
980 
981     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
982hunk ./src/allmydata/mutable/servermap.py 815
983-
984+        """
985+        Given a writekey from a remote server, I validate it against the
986+        writekey stored in my node. If it is valid, then I set the
987+        privkey and encprivkey properties of the node.
988+        """
989         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
990         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
991         if alleged_writekey != self._node.get_writekey():
992hunk ./src/allmydata/mutable/servermap.py 892
993         self._queries_completed += 1
994         self._last_failure = f
995 
996-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
997-        now = time.time()
998-        elapsed = now - started
999-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
1000-        self._queries_outstanding.discard(peerid)
1001-        if not self._need_privkey:
1002-            return
1003-        if shnum not in datavs:
1004-            self.log("privkey wasn't there when we asked it",
1005-                     level=log.WEIRD, umid="VA9uDQ")
1006-            return
1007-        datav = datavs[shnum]
1008-        enc_privkey = datav[0]
1009-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
1010 
1011     def _privkey_query_failed(self, f, peerid, shnum, lp):
1012         self._queries_outstanding.discard(peerid)
1013hunk ./src/allmydata/mutable/servermap.py 906
1014         self._servermap.problems.append(f)
1015         self._last_failure = f
1016 
1017+
1018     def _check_for_done(self, res):
1019         # exit paths:
1020         #  return self._send_more_queries(outstanding) : send some more queries
1021hunk ./src/allmydata/mutable/servermap.py 912
1022         #  return self._done() : all done
1023         #  return : keep waiting, no new queries
1024-
1025         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1026                               "%(outstanding)d queries outstanding, "
1027                               "%(extra)d extra peers available, "
1028hunk ./src/allmydata/mutable/servermap.py 1117
1029         self._servermap.last_update_time = self._started
1030         # the servermap will not be touched after this
1031         self.log("servermap: %s" % self._servermap.summarize_versions())
1032+
1033         eventually(self._done_deferred.callback, self._servermap)
1034 
1035     def _fatal_error(self, f):
1036hunk ./src/allmydata/test/test_mutable.py 637
1037         d.addCallback(_created)
1038         return d
1039 
1040-    def publish_multiple(self):
1041+    def publish_mdmf(self):
1042+        # like publish_one, except that the result is guaranteed to be
1043+        # an MDMF file.
1044+        # self.CONTENTS should have more than one segment.
1045+        self.CONTENTS = "This is an MDMF file" * 100000
1046+        self._storage = FakeStorage()
1047+        self._nodemaker = make_nodemaker(self._storage)
1048+        self._storage_broker = self._nodemaker.storage_broker
1049+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1050+        def _created(node):
1051+            self._fn = node
1052+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1053+        d.addCallback(_created)
1054+        return d
1055+
1056+
1057+    def publish_sdmf(self):
1058+        # like publish_one, except that the result is guaranteed to be
1059+        # an SDMF file
1060+        self.CONTENTS = "This is an SDMF file" * 1000
1061+        self._storage = FakeStorage()
1062+        self._nodemaker = make_nodemaker(self._storage)
1063+        self._storage_broker = self._nodemaker.storage_broker
1064+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1065+        def _created(node):
1066+            self._fn = node
1067+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1068+        d.addCallback(_created)
1069+        return d
1070+
1071+
1072+    def publish_multiple(self, version=0):
1073         self.CONTENTS = ["Contents 0",
1074                          "Contents 1",
1075                          "Contents 2",
1076hunk ./src/allmydata/test/test_mutable.py 677
1077         self._copied_shares = {}
1078         self._storage = FakeStorage()
1079         self._nodemaker = make_nodemaker(self._storage)
1080-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1081+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1082         def _created(node):
1083             self._fn = node
1084             # now create multiple versions of the same file, and accumulate
1085hunk ./src/allmydata/test/test_mutable.py 906
1086         return d
1087 
1088 
1089+    def test_servermapupdater_finds_mdmf_files(self):
1090+        # setUp already published an MDMF file for us. We just need to
1091+        # make sure that when we run the ServermapUpdater, the file is
1092+        # reported to have one recoverable version.
1093+        d = defer.succeed(None)
1094+        d.addCallback(lambda ignored:
1095+            self.publish_mdmf())
1096+        d.addCallback(lambda ignored:
1097+            self.make_servermap(mode=MODE_CHECK))
1098+        # Calling make_servermap also updates the servermap in the mode
1099+        # that we specify, so we just need to see what it says.
1100+        def _check_servermap(sm):
1101+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1102+        d.addCallback(_check_servermap)
1103+        return d
1104+
1105+
1106+    def test_servermapupdater_finds_sdmf_files(self):
1107+        d = defer.succeed(None)
1108+        d.addCallback(lambda ignored:
1109+            self.publish_sdmf())
1110+        d.addCallback(lambda ignored:
1111+            self.make_servermap(mode=MODE_CHECK))
1112+        d.addCallback(lambda servermap:
1113+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1114+        return d
1115+
1116 
1117 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1118     def setUp(self):
1119hunk ./src/allmydata/test/test_mutable.py 1050
1120         return d
1121     test_no_servers_download.timeout = 15
1122 
1123+
1124     def _test_corrupt_all(self, offset, substring,
1125                           should_succeed=False, corrupt_early=True,
1126                           failure_checker=None):
1127}
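
Once the update finishes, the servermap describes MDMF versions with the
same kind of verinfo tuple already used for SDMF, except that the third
element (what this patch calls saltish) holds the MDMF salt rather than
the SDMF IV. A sketch, assuming servermap is the ServerMap produced by a
completed update:

    for verinfo in servermap.recoverable_versions():
        (seqnum, root_hash, salt_or_IV, segsize, datalen,
         k, N, prefix, offsets_tuple) = verinfo
        # offsets_tuple is a tuple of (field, offset) pairs taken from
        # the share's offset table.
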
1128[Make a segmented mutable uploader
1129Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1130 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1131 
1132 The mutable file uploader should be able to publish files with one
1133 segment and files with multiple segments. This patch makes it do that.
1134 This is still incomplete, and rather ugly -- I need to flesh out error
1135 handling, I need to write tests, and I need to remove some of the uglier
1136 kludges in the process before I can call this done.
1137] {
1138hunk ./src/allmydata/mutable/publish.py 8
1139 from zope.interface import implements
1140 from twisted.internet import defer
1141 from twisted.python import failure
1142-from allmydata.interfaces import IPublishStatus
1143+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1144 from allmydata.util import base32, hashutil, mathutil, idlib, log
1145 from allmydata import hashtree, codec
1146 from allmydata.storage.server import si_b2a
1147hunk ./src/allmydata/mutable/publish.py 19
1148      UncoordinatedWriteError, NotEnoughServersError
1149 from allmydata.mutable.servermap import ServerMap
1150 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1151-     unpack_checkstring, SIGNED_PREFIX
1152+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1153+
1154+KiB = 1024
1155+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1156 
1157 class PublishStatus:
1158     implements(IPublishStatus)
1159hunk ./src/allmydata/mutable/publish.py 112
1160         self._status.set_helper(False)
1161         self._status.set_progress(0.0)
1162         self._status.set_active(True)
1163+        # We use this to control how the file is written.
1164+        version = self._node.get_version()
1165+        assert version in (SDMF_VERSION, MDMF_VERSION)
1166+        self._version = version
1167 
1168     def get_status(self):
1169         return self._status
1170hunk ./src/allmydata/mutable/publish.py 134
1171         simultaneous write.
1172         """
1173 
1174-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1175-        # 2: perform peer selection, get candidate servers
1176-        #  2a: send queries to n+epsilon servers, to determine current shares
1177-        #  2b: based upon responses, create target map
1178-        # 3: send slot_testv_and_readv_and_writev messages
1179-        # 4: as responses return, update share-dispatch table
1180-        # 4a: may need to run recovery algorithm
1181-        # 5: when enough responses are back, we're done
1182+        # 0. Setup encoding parameters, encoder, and other such things.
1183+        # 1. Encrypt, encode, and publish segments.
1184 
1185         self.log("starting publish, datalen is %s" % len(newdata))
1186         self._status.set_size(len(newdata))
1187hunk ./src/allmydata/mutable/publish.py 187
1188         self.bad_peers = set() # peerids who have errbacked/refused requests
1189 
1190         self.newdata = newdata
1191-        self.salt = os.urandom(16)
1192 
1193hunk ./src/allmydata/mutable/publish.py 188
1194+        # This will set self.segment_size, self.num_segments, and
1195+        # self.fec.
1196         self.setup_encoding_parameters()
1197 
1198         # if we experience any surprises (writes which were rejected because
1199hunk ./src/allmydata/mutable/publish.py 238
1200             self.bad_share_checkstrings[key] = old_checkstring
1201             self.connections[peerid] = self._servermap.connections[peerid]
1202 
1203-        # create the shares. We'll discard these as they are delivered. SDMF:
1204-        # we're allowed to hold everything in memory.
1205+        # Now the process diverges: if this is an MDMF file, we write
1206+        # it using the MDMF code path. Otherwise, we fall through to
1207+        # the original SDMF publish path.
1208+        if self._version == MDMF_VERSION:
1209+            return self._publish_mdmf()
1210+        else:
1211+            return self._publish_sdmf()
1212+        #return self.done_deferred
1213+
1214+    def _publish_mdmf(self):
1215+        # Next, we find homes for all of the shares that we don't have
1216+        # homes for yet.
1217+        # TODO: Make this part do peer selection.
1218+        self.update_goal()
1219+        self.writers = {}
1220+        # For each (peerid, shnum) in self.goal, we make an
1221+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1222+        # shares to the peer.
1223+        for key in self.goal:
1224+            peerid, shnum = key
1225+            write_enabler = self._node.get_write_enabler(peerid)
1226+            renew_secret = self._node.get_renewal_secret(peerid)
1227+            cancel_secret = self._node.get_cancel_secret(peerid)
1228+            secrets = (write_enabler, renew_secret, cancel_secret)
1229+
1230+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1231+                                                      self.connections[peerid],
1232+                                                      self._storage_index,
1233+                                                      secrets,
1234+                                                      self._new_seqnum,
1235+                                                      self.required_shares,
1236+                                                      self.total_shares,
1237+                                                      self.segment_size,
1238+                                                      len(self.newdata))
1239+            if (peerid, shnum) in self._servermap.servermap:
1240+                old_versionid, old_timestamp = self._servermap.servermap[key]
1241+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1242+                 old_datalength, old_k, old_N, old_prefix,
1243+                 old_offsets_tuple) = old_versionid
1244+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1245+
1246+        # Now, we start pushing shares.
1247+        self._status.timings["setup"] = time.time() - self._started
1248+        def _start_pushing(res):
1249+            self._started_pushing = time.time()
1250+            return res
1251+
1252+        # First, we encrypt, encode, and publish the shares that we need
1253+        # to encrypt, encode, and publish.
1254+
1255+        # This will eventually hold the block hash chain for each share
1256+        # that we publish. We define it this way so that empty publishes
1257+        # will still have something to write to the remote slot.
1258+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1259+        self.sharehash_leaves = None # eventually [sharehashes]
1260+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1261+                              # validate the share]
1262 
1263hunk ./src/allmydata/mutable/publish.py 296
1264+        d = defer.succeed(None)
1265+        self.log("Starting push")
1266+        for i in xrange(self.num_segments - 1):
1267+            d.addCallback(lambda ignored, i=i:
1268+                self.push_segment(i))
1269+            d.addCallback(self._turn_barrier)
1270+        # If we have at least one segment, we will have a tail segment
1271+        if self.num_segments > 0:
1272+            d.addCallback(lambda ignored:
1273+                self.push_tail_segment())
1274+
1275+        d.addCallback(lambda ignored:
1276+            self.push_encprivkey())
1277+        d.addCallback(lambda ignored:
1278+            self.push_blockhashes())
1279+        d.addCallback(lambda ignored:
1280+            self.push_sharehashes())
1281+        d.addCallback(lambda ignored:
1282+            self.push_toplevel_hashes_and_signature())
1283+        d.addCallback(lambda ignored:
1284+            self.finish_publishing())
1285+        return d
1286+
1287+
1288+    def _publish_sdmf(self):
1289         self._status.timings["setup"] = time.time() - self._started
1290hunk ./src/allmydata/mutable/publish.py 322
1291+        self.salt = os.urandom(16)
1292+
1293         d = self._encrypt_and_encode()
1294         d.addCallback(self._generate_shares)
1295         def _start_pushing(res):
1296hunk ./src/allmydata/mutable/publish.py 335
1297 
1298         return self.done_deferred
1299 
1300+
1301     def setup_encoding_parameters(self):
1302hunk ./src/allmydata/mutable/publish.py 337
1303-        segment_size = len(self.newdata)
1304+        if self._version == MDMF_VERSION:
1305+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1306+        else:
1307+            segment_size = len(self.newdata) # SDMF is only one segment
1308         # this must be a multiple of self.required_shares
1309         segment_size = mathutil.next_multiple(segment_size,
1310                                               self.required_shares)
1311hunk ./src/allmydata/mutable/publish.py 350
1312                                                   segment_size)
1313         else:
1314             self.num_segments = 0
1315-        assert self.num_segments in [0, 1,] # SDMF restrictions
1316+        if self._version == SDMF_VERSION:
1317+            assert self.num_segments in (0, 1) # SDMF
1318+            return
1319+        # calculate the tail segment size.
1320+        self.tail_segment_size = len(self.newdata) % segment_size
1321+
1322+        if self.tail_segment_size == 0:
1323+            # The tail segment is the same size as the other segments.
1324+            self.tail_segment_size = segment_size
1325+
1326+        # We'll make an encoder ahead of time for the normal-sized
1327+        # segments (that is, any segment of exactly segment_size bytes);
1328+        # the part of the code that pushes the tail segment will make
1329+        # its own encoder for that segment.
1330+        fec = codec.CRSEncoder()
1331+        fec.set_params(self.segment_size,
1332+                       self.required_shares, self.total_shares)
1333+        self.piece_size = fec.get_block_size()
1334+        self.fec = fec
1335+
1336+
1337+    def push_segment(self, segnum):
1338+        started = time.time()
1339+        segsize = self.segment_size
1340+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1341+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1342+        assert len(data) == segsize
1343+
1344+        salt = os.urandom(16)
1345+
1346+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1347+        enc = AES(key)
1348+        crypttext = enc.process(data)
1349+        assert len(crypttext) == len(data)
1350+
1351+        now = time.time()
1352+        self._status.timings["encrypt"] = now - started
1353+        started = now
1354+
1355+        # now apply FEC
1356+
1357+        self._status.set_status("Encoding")
1358+        crypttext_pieces = [None] * self.required_shares
1359+        piece_size = self.piece_size
1360+        for i in range(len(crypttext_pieces)):
1361+            offset = i * piece_size
1362+            piece = crypttext[offset:offset+piece_size]
1363+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1364+            crypttext_pieces[i] = piece
1365+            assert len(piece) == piece_size
1366+        d = self.fec.encode(crypttext_pieces)
1367+        def _done_encoding(res):
1368+            elapsed = time.time() - started
1369+            self._status.timings["encode"] = elapsed
1370+            return res
1371+        d.addCallback(_done_encoding)
1372+
1373+        def _push_shares_and_salt(results):
1374+            shares, shareids = results
1375+            dl = []
1376+            for i in xrange(len(shares)):
1377+                sharedata = shares[i]
1378+                shareid = shareids[i]
1379+                block_hash = hashutil.block_hash(salt + sharedata)
1380+                self.blockhashes[shareid].append(block_hash)
1381+
1382+                # find the writer for this share
1383+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1384+                dl.append(d)
1385+            # TODO: Naturally, we need to check on the results of these.
1386+            return defer.DeferredList(dl)
1387+        d.addCallback(_push_shares_and_salt)
1388+        return d
1389+
1390+
1391+    def push_tail_segment(self):
1392+        # This is essentially the same as push_segment, except that we
1393+        # don't use the cached encoder that we use elsewhere.
1394+        self.log("Pushing tail segment")
1395+        started = time.time()
1396+        segsize = self.segment_size
1397+        data = self.newdata[segsize * (self.num_segments-1):]
1398+        assert len(data) == self.tail_segment_size
1399+        salt = os.urandom(16)
1400+
1401+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1402+        enc = AES(key)
1403+        crypttext = enc.process(data)
1404+        assert len(crypttext) == len(data)
1405+
1406+        now = time.time()
1407+        self._status.timings['encrypt'] = now - started
1408+        started = now
1409+
1410+        self._status.set_status("Encoding")
1411+        tail_fec = codec.CRSEncoder()
1412+        tail_fec.set_params(self.tail_segment_size,
1413+                            self.required_shares,
1414+                            self.total_shares)
1415+
1416+        crypttext_pieces = [None] * self.required_shares
1417+        piece_size = tail_fec.get_block_size()
1418+        for i in range(len(crypttext_pieces)):
1419+            offset = i * piece_size
1420+            piece = crypttext[offset:offset+piece_size]
1421+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1422+            crypttext_pieces[i] = piece
1423+            assert len(piece) == piece_size
1424+        d = tail_fec.encode(crypttext_pieces)
1425+        def _push_shares_and_salt(results):
1426+            shares, shareids = results
1427+            dl = []
1428+            for i in xrange(len(shares)):
1429+                sharedata = shares[i]
1430+                shareid = shareids[i]
1431+                block_hash = hashutil.block_hash(salt + sharedata)
1432+                self.blockhashes[shareid].append(block_hash)
1433+                # find the writer for this share
1434+                d = self.writers[shareid].put_block(sharedata,
1435+                                                    self.num_segments - 1,
1436+                                                    salt)
1437+                dl.append(d)
1438+            # TODO: Naturally, we need to check on the results of these.
1439+            return defer.DeferredList(dl)
1440+        d.addCallback(_push_shares_and_salt)
1441+        return d
1442+
1443+
1444+    def push_encprivkey(self):
1445+        started = time.time()
1446+        encprivkey = self._encprivkey
1447+        dl = []
1448+        def _spy_on_writer(results):
1449+            print results
1450+            return results
1451+        for shnum, writer in self.writers.iteritems():
1452+            d = writer.put_encprivkey(encprivkey)
1453+            dl.append(d)
1454+        d = defer.DeferredList(dl)
1455+        return d
1456+
1457+
1458+    def push_blockhashes(self):
1459+        started = time.time()
1460+        dl = []
1461+        def _spy_on_results(results):
1462+            print results
1463+            return results
1464+        self.sharehash_leaves = [None] * len(self.blockhashes)
1465+        for shnum, blockhashes in self.blockhashes.iteritems():
1466+            t = hashtree.HashTree(blockhashes)
1467+            self.blockhashes[shnum] = list(t)
1468+            # set the leaf for future use.
1469+            self.sharehash_leaves[shnum] = t[0]
1470+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1471+            dl.append(d)
1472+        d = defer.DeferredList(dl)
1473+        return d
1474+
1475+
1476+    def push_sharehashes(self):
1477+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1478+        share_hash_chain = {}
1479+        ds = []
1480+        def _spy_on_results(results):
1481+            print results
1482+            return results
1483+        for shnum in xrange(len(self.sharehash_leaves)):
1484+            needed_indices = share_hash_tree.needed_hashes(shnum)
1485+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1486+                                             for i in needed_indices] )
1487+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1488+            ds.append(d)
1489+        self.root_hash = share_hash_tree[0]
1490+        d = defer.DeferredList(ds)
1491+        return d
1492+
1493+
1494+    def push_toplevel_hashes_and_signature(self):
1495+        # We need to do three things here:
1496+        #   - Push the root hash and salt hash
1497+        #   - Get the checkstring of the resulting layout; sign that.
1498+        #   - Push the signature
1499+        ds = []
1500+        def _spy_on_results(results):
1501+            print results
1502+            return results
1503+        for shnum in xrange(self.total_shares):
1504+            d = self.writers[shnum].put_root_hash(self.root_hash)
1505+            ds.append(d)
1506+        d = defer.DeferredList(ds)
1507+        def _make_and_place_signature(ignored):
1508+            signable = self.writers[0].get_signable()
1509+            self.signature = self._privkey.sign(signable)
1510+
1511+            ds = []
1512+            for (shnum, writer) in self.writers.iteritems():
1513+                d = writer.put_signature(self.signature)
1514+                ds.append(d)
1515+            return defer.DeferredList(ds)
1516+        d.addCallback(_make_and_place_signature)
1517+        return d
1518+
1519+
1520+    def finish_publishing(self):
1521+        # We're almost done -- we just need to put the verification key
1522+        # and the offsets
1523+        ds = []
1524+        verification_key = self._pubkey.serialize()
1525+
1526+        def _spy_on_results(results):
1527+            print results
1528+            return results
1529+        for (shnum, writer) in self.writers.iteritems():
1530+            d = writer.put_verification_key(verification_key)
1531+            d.addCallback(lambda ignored, writer=writer:
1532+                writer.finish_publishing())
1533+            ds.append(d)
1534+        return defer.DeferredList(ds)
1535+
1536+
1537+    def _turn_barrier(self, res):
1538+        # putting this method in a Deferred chain imposes a guaranteed
1539+        # reactor turn between the pre- and post- portions of that chain.
1540+        # This can be useful to limit memory consumption: since Deferreds do
1541+        # not do tail recursion, code which uses defer.succeed(result) for
1542+        # consistency will cause objects to live for longer than you might
1543+        # normally expect.
1544+        return fireEventually(res)
1545+
1546 
1547     def _fatal_error(self, f):
1548         self.log("error during loop", failure=f, level=log.UNUSUAL)
1549hunk ./src/allmydata/mutable/publish.py 716
1550             self.log_goal(self.goal, "after update: ")
1551 
1552 
1553-
1554     def _encrypt_and_encode(self):
1555         # this returns a Deferred that fires with a list of (sharedata,
1556         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1557hunk ./src/allmydata/mutable/publish.py 757
1558         d.addCallback(_done_encoding)
1559         return d
1560 
1561+
1562     def _generate_shares(self, shares_and_shareids):
1563         # this sets self.shares and self.root_hash
1564         self.log("_generate_shares")
1565hunk ./src/allmydata/mutable/publish.py 1145
1566             self._status.set_progress(1.0)
1567         eventually(self.done_deferred.callback, res)
1568 
1569-
1570hunk ./src/allmydata/test/test_mutable.py 248
1571         d.addCallback(_created)
1572         return d
1573 
1574+
1575+    def test_create_mdmf(self):
1576+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1577+        def _created(n):
1578+            self.failUnless(isinstance(n, MutableFileNode))
1579+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1580+            sb = self.nodemaker.storage_broker
1581+            peer0 = sorted(sb.get_all_serverids())[0]
1582+            shnums = self._storage._peers[peer0].keys()
1583+            self.failUnlessEqual(len(shnums), 1)
1584+        d.addCallback(_created)
1585+        return d
1586+
1587+
1588     def test_serialize(self):
1589         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1590         calls = []
1591hunk ./src/allmydata/test/test_mutable.py 334
1592         d.addCallback(_created)
1593         return d
1594 
1595+
1596+    def test_create_mdmf_with_initial_contents(self):
1597+        initial_contents = "foobarbaz" * 131072 # about 1.1 MiB
1598+        d = self.nodemaker.create_mutable_file(initial_contents,
1599+                                               version=MDMF_VERSION)
1600+        def _created(n):
1601+            d = n.download_best_version()
1602+            d.addCallback(lambda data:
1603+                self.failUnlessEqual(data, initial_contents))
1604+            d.addCallback(lambda ignored:
1605+                n.overwrite(initial_contents + "foobarbaz"))
1606+            d.addCallback(lambda ignored:
1607+                n.download_best_version())
1608+            d.addCallback(lambda data:
1609+                self.failUnlessEqual(data, initial_contents +
1610+                                           "foobarbaz"))
1611+            return d
1612+        d.addCallback(_created)
1613+        return d
1614+
1615+
1616     def test_create_with_initial_contents_function(self):
1617         data = "initial contents"
1618         def _make_contents(n):
1619hunk ./src/allmydata/test/test_mutable.py 370
1620         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1621         return d
1622 
1623+
1624+    def test_create_mdmf_with_initial_contents_function(self):
1625+        data = "initial contents" * 100000
1626+        def _make_contents(n):
1627+            self.failUnless(isinstance(n, MutableFileNode))
1628+            key = n.get_writekey()
1629+            self.failUnless(isinstance(key, str), key)
1630+            self.failUnlessEqual(len(key), 16)
1631+            return data
1632+        d = self.nodemaker.create_mutable_file(_make_contents,
1633+                                               version=MDMF_VERSION)
1634+        d.addCallback(lambda n:
1635+            n.download_best_version())
1636+        d.addCallback(lambda data2:
1637+            self.failUnlessEqual(data2, data))
1638+        return d
1639+
1640+
1641     def test_create_with_too_large_contents(self):
1642         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1643         d = self.nodemaker.create_mutable_file(BIG)
1644}
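
To make the segment arithmetic in the uploader patch above easier to follow, here is a minimal standalone sketch of what setup_encoding_parameters computes for an MDMF upload. The helper name and its standalone form are illustrative only; they are not part of the patch.

    def describe_mdmf_segmentation(data_length, required_shares,
                                   max_segment_size=128 * 1024):
        # Segment size starts at 128 KiB and is rounded up to a multiple
        # of required_shares (k), so FEC can split every segment into k
        # equal-sized pieces.
        segment_size = max_segment_size
        remainder = segment_size % required_shares
        if remainder:
            segment_size += required_shares - remainder
        # Number of segments needed to hold the data.
        num_segments = (data_length + segment_size - 1) // segment_size
        # The tail segment holds the leftover bytes; if the data divides
        # evenly, the tail segment is simply a full-sized segment.
        tail_segment_size = data_length % segment_size
        if tail_segment_size == 0:
            tail_segment_size = segment_size
        return segment_size, num_segments, tail_segment_size

    # A 900 KiB file with k=3 becomes seven full segments of roughly
    # 128 KiB each plus a 4089-byte tail segment (eight segments total).
    print(describe_mdmf_segmentation(900 * 1024, 3))
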
1645[Write a segmented mutable downloader
1646Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1647 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1648 
1649 The segmented mutable downloader can deal with MDMF files (files with
1650 one or more segments in MDMF format) and SDMF files (files with one
1651 segment in SDMF format). It is backwards compatible with the old
1652 file format.
1653 
1654 This patch also contains tests for the segmented mutable downloader.
1655] {
1656hunk ./src/allmydata/mutable/retrieve.py 8
1657 from twisted.internet import defer
1658 from twisted.python import failure
1659 from foolscap.api import DeadReferenceError, eventually, fireEventually
1660-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1661-from allmydata.util import hashutil, idlib, log
1662+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1663+                                 MDMF_VERSION, SDMF_VERSION
1664+from allmydata.util import hashutil, idlib, log, mathutil
1665 from allmydata import hashtree, codec
1666 from allmydata.storage.server import si_b2a
1667 from pycryptopp.cipher.aes import AES
1668hunk ./src/allmydata/mutable/retrieve.py 17
1669 from pycryptopp.publickey import rsa
1670 
1671 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1672-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1673+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1674+                                     MDMFSlotReadProxy
1675 
1676 class RetrieveStatus:
1677     implements(IRetrieveStatus)
1678hunk ./src/allmydata/mutable/retrieve.py 104
1679         self.verinfo = verinfo
1680         # during repair, we may be called upon to grab the private key, since
1681         # it wasn't picked up during a verify=False checker run, and we'll
1682-        # need it for repair to generate the a new version.
1683+        # need it for repair to generate a new version.
1684         self._need_privkey = fetch_privkey
1685         if self._node.get_privkey():
1686             self._need_privkey = False
1687hunk ./src/allmydata/mutable/retrieve.py 109
1688 
1689+        if self._need_privkey:
1690+            # TODO: Evaluate the need for this. We'll use it if we want
1691+            # to limit how many queries are on the wire for the privkey
1692+            # at once.
1693+            self._privkey_query_markers = [] # one Marker for each time we've
1694+                                             # tried to get the privkey.
1695+
1696         self._status = RetrieveStatus()
1697         self._status.set_storage_index(self._storage_index)
1698         self._status.set_helper(False)
1699hunk ./src/allmydata/mutable/retrieve.py 125
1700          offsets_tuple) = self.verinfo
1701         self._status.set_size(datalength)
1702         self._status.set_encoding(k, N)
1703+        self.readers = {}
1704 
1705     def get_status(self):
1706         return self._status
1707hunk ./src/allmydata/mutable/retrieve.py 149
1708         self.remaining_sharemap = DictOfSets()
1709         for (shnum, peerid, timestamp) in shares:
1710             self.remaining_sharemap.add(shnum, peerid)
1711+            # If the servermap update fetched anything, it fetched at least 1
1712+            # KiB, so we ask for that much.
1713+            # TODO: Change the cache methods to allow us to fetch all of the
1714+            # data that they have, then change this method to do that.
1715+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1716+                                                               shnum,
1717+                                                               0,
1718+                                                               1000)
1719+            ss = self.servermap.connections[peerid]
1720+            reader = MDMFSlotReadProxy(ss,
1721+                                       self._storage_index,
1722+                                       shnum,
1723+                                       any_cache)
1724+            reader.peerid = peerid
1725+            self.readers[shnum] = reader
1726+
1727 
1728         self.shares = {} # maps shnum to validated blocks
1729hunk ./src/allmydata/mutable/retrieve.py 167
1730+        self._active_readers = [] # list of active readers for this dl.
1731+        self._validated_readers = set() # set of readers that we have
1732+                                        # validated the prefix of
1733+        self._block_hash_trees = {} # shnum => hashtree
1734+        # TODO: Make this into a file-backed consumer or something to
1735+        # conserve memory.
1736+        self._plaintext = ""
1737 
1738         # how many shares do we need?
1739hunk ./src/allmydata/mutable/retrieve.py 176
1740-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1741+        (seqnum,
1742+         root_hash,
1743+         IV,
1744+         segsize,
1745+         datalength,
1746+         k,
1747+         N,
1748+         prefix,
1749          offsets_tuple) = self.verinfo
1750hunk ./src/allmydata/mutable/retrieve.py 185
1751-        assert len(self.remaining_sharemap) >= k
1752-        # we start with the lowest shnums we have available, since FEC is
1753-        # faster if we're using "primary shares"
1754-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1755-        for shnum in self.active_shnums:
1756-            # we use an arbitrary peer who has the share. If shares are
1757-            # doubled up (more than one share per peer), we could make this
1758-            # run faster by spreading the load among multiple peers. But the
1759-            # algorithm to do that is more complicated than I want to write
1760-            # right now, and a well-provisioned grid shouldn't have multiple
1761-            # shares per peer.
1762-            peerid = list(self.remaining_sharemap[shnum])[0]
1763-            self.get_data(shnum, peerid)
1764 
1765hunk ./src/allmydata/mutable/retrieve.py 186
1766-        # control flow beyond this point: state machine. Receiving responses
1767-        # from queries is the input. We might send out more queries, or we
1768-        # might produce a result.
1769 
1770hunk ./src/allmydata/mutable/retrieve.py 187
1771+        # We need one share hash tree for the entire file; its leaves
1772+        # are the roots of the block hash trees for the shares that
1773+        # comprise it, and its root is in the verinfo.
1774+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1775+        self.share_hash_tree.set_hashes({0: root_hash})
1776+
1777+        # This will set up both the segment decoder and the tail segment
1778+        # decoder, as well as a variety of other instance variables that
1779+        # the download process will use.
1780+        self._setup_encoding_parameters()
1781+        assert len(self.remaining_sharemap) >= k
1782+
1783+        self.log("starting download")
1784+        self._add_active_peers()
1785+        # The download process beyond this is a state machine.
1786+        # _add_active_peers will select the peers that we want to use
1787+        # for the download, and then attempt to start downloading. After
1788+        # each segment, it will check for doneness, reacting to broken
1789+        # peers and corrupt shares as necessary. If it runs out of good
1790+        # peers before downloading all of the segments, _done_deferred
1791+        # will errback.  Otherwise, it will eventually callback with the
1792+        # contents of the mutable file.
1793         return self._done_deferred
1794 
1795hunk ./src/allmydata/mutable/retrieve.py 211
1796-    def get_data(self, shnum, peerid):
1797-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1798-                 shnum=shnum,
1799-                 peerid=idlib.shortnodeid_b2a(peerid),
1800-                 level=log.NOISY)
1801-        ss = self.servermap.connections[peerid]
1802-        started = time.time()
1803-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1804+
1805+    def _setup_encoding_parameters(self):
1806+        """
1807+        I set up the encoding parameters, including k, n, the number
1808+        of segments associated with this file, and the segment decoder.
1809+        """
1810+        (seqnum,
1811+         root_hash,
1812+         IV,
1813+         segsize,
1814+         datalength,
1815+         k,
1816+         n,
1817+         known_prefix,
1818          offsets_tuple) = self.verinfo
1819hunk ./src/allmydata/mutable/retrieve.py 226
1820-        offsets = dict(offsets_tuple)
1821+        self._required_shares = k
1822+        self._total_shares = n
1823+        self._segment_size = segsize
1824+        self._data_length = datalength
1825+
1826+        if not IV:
1827+            self._version = MDMF_VERSION
1828+        else:
1829+            self._version = SDMF_VERSION
1830+
1831+        if datalength and segsize:
1832+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1833+            self._tail_data_size = datalength % segsize
1834+        else:
1835+            self._num_segments = 0
1836+            self._tail_data_size = 0
1837 
1838hunk ./src/allmydata/mutable/retrieve.py 243
1839-        # we read the checkstring, to make sure that the data we grab is from
1840-        # the right version.
1841-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1842+        self._segment_decoder = codec.CRSDecoder()
1843+        self._segment_decoder.set_params(segsize, k, n)
1844+        self._current_segment = 0
1845 
1846hunk ./src/allmydata/mutable/retrieve.py 247
1847-        # We also read the data, and the hashes necessary to validate them
1848-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1849-        # signature or the pubkey, since that was handled during the
1850-        # servermap phase, and we'll be comparing the share hash chain
1851-        # against the roothash that was validated back then.
1852+        if not self._tail_data_size:
1853+            self._tail_data_size = segsize
1854 
1855hunk ./src/allmydata/mutable/retrieve.py 250
1856-        readv.append( (offsets['share_hash_chain'],
1857-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1858+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1859+                                                         self._required_shares)
1860+        if self._tail_segment_size == self._segment_size:
1861+            self._tail_decoder = self._segment_decoder
1862+        else:
1863+            self._tail_decoder = codec.CRSDecoder()
1864+            self._tail_decoder.set_params(self._tail_segment_size,
1865+                                          self._required_shares,
1866+                                          self._total_shares)
1867 
1868hunk ./src/allmydata/mutable/retrieve.py 260
1869-        # if we need the private key (for repair), we also fetch that
1870-        if self._need_privkey:
1871-            readv.append( (offsets['enc_privkey'],
1872-                           offsets['EOF'] - offsets['enc_privkey']) )
1873+        self.log("got encoding parameters: "
1874+                 "k: %d "
1875+                 "n: %d "
1876+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1877+                 (k, n, self._num_segments, self._segment_size,
1878+                  self._tail_segment_size))
1879 
1880hunk ./src/allmydata/mutable/retrieve.py 267
1881-        m = Marker()
1882-        self._outstanding_queries[m] = (peerid, shnum, started)
1883+        for i in xrange(self._total_shares):
1884+            # So we don't have to do this later.
1885+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1886 
1887hunk ./src/allmydata/mutable/retrieve.py 271
1888-        # ask the cache first
1889-        got_from_cache = False
1890-        datavs = []
1891-        for (offset, length) in readv:
1892-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1893-                                                            offset, length)
1894-            if data is not None:
1895-                datavs.append(data)
1896-        if len(datavs) == len(readv):
1897-            self.log("got data from cache")
1898-            got_from_cache = True
1899-            d = fireEventually({shnum: datavs})
1900-            # datavs is a dict mapping shnum to a pair of strings
1901-        else:
1902-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1903-        self.remaining_sharemap.discard(shnum, peerid)
1904+        # If we have more than one segment, we are an MDMF file, which
1905+        # means that we need to validate the salts as we receive them.
1906+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1907+        self._salt_hash_tree[0] = IV # from the prefix.
1908 
1909hunk ./src/allmydata/mutable/retrieve.py 276
1910-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1911-        d.addErrback(self._query_failed, m, peerid)
1912-        # errors that aren't handled by _query_failed (and errors caused by
1913-        # _query_failed) get logged, but we still want to check for doneness.
1914-        def _oops(f):
1915-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1916-                     shnum=shnum,
1917-                     peerid=idlib.shortnodeid_b2a(peerid),
1918-                     failure=f,
1919-                     level=log.WEIRD, umid="W0xnQA")
1920-        d.addErrback(_oops)
1921-        d.addBoth(self._check_for_done)
1922-        # any error during _check_for_done means the download fails. If the
1923-        # download is successful, _check_for_done will fire _done by itself.
1924-        d.addErrback(self._done)
1925-        d.addErrback(log.err)
1926-        return d # purely for testing convenience
1927 
1928hunk ./src/allmydata/mutable/retrieve.py 277
1929-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1930-        # isolate the callRemote to a separate method, so tests can subclass
1931-        # Publish and override it
1932-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1933-        return d
1934+    def _add_active_peers(self):
1935+        """
1936+        I populate self._active_readers with enough active readers to
1937+        retrieve the contents of this mutable file. I am called before
1938+        downloading starts, and (eventually) after each validation
1939+        error, connection error, or other problem in the download.
1940+        """
1941+        # TODO: It would be cool to investigate other heuristics for
1942+        # reader selection. For instance, the cost (in time the user
1943+        # spends waiting for their file) of selecting a really slow peer
1944+        # that happens to have a primary share is probably more than
1945+        # selecting a really fast peer that doesn't have a primary
1946+        # share. Maybe the servermap could be extended to provide this
1947+        # information; it could keep track of latency information while
1948+        # it gathers more important data, and then this routine could
1949+        # use that to select active readers.
1950+        #
1951+        # (these and other questions would be easier to answer with a
1952+        #  robust, configurable tahoe-lafs simulator, which modeled node
1953+        #  failures, differences in node speed, and other characteristics
1954+        #  that we expect storage servers to have.  You could have
1955+        #  presets for really stable grids (like allmydata.com),
1956+        #  friendnets, make it easy to configure your own settings, and
1957+        #  then simulate the effect of big changes on these use cases
1958+        #  instead of just reasoning about what the effect might be. Out
1959+        #  of scope for MDMF, though.)
1960 
1961hunk ./src/allmydata/mutable/retrieve.py 304
1962-    def remove_peer(self, peerid):
1963-        for shnum in list(self.remaining_sharemap.keys()):
1964-            self.remaining_sharemap.discard(shnum, peerid)
1965+        # We need at least self._required_shares readers to download a
1966+        # segment.
1967+        needed = self._required_shares - len(self._active_readers)
1968+        # XXX: Why don't format= log messages work here?
1969+        self.log("adding %d peers to the active peers list" % needed)
1970 
1971hunk ./src/allmydata/mutable/retrieve.py 310
1972-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1973-        now = time.time()
1974-        elapsed = now - started
1975-        if not got_from_cache:
1976-            self._status.add_fetch_timing(peerid, elapsed)
1977-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1978-                 shares=len(datavs),
1979-                 peerid=idlib.shortnodeid_b2a(peerid),
1980-                 level=log.NOISY)
1981-        self._outstanding_queries.pop(marker, None)
1982-        if not self._running:
1983-            return
1984+        # We favor lower numbered shares, since FEC is faster with
1985+        # primary shares than with other shares, and lower-numbered
1986+        # shares are more likely to be primary than higher numbered
1987+        # shares.
1988+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
1989+        # We shouldn't consider adding shares that we already have; this
1990+        # will cause problems later.
1991+        active_shnums -= set([reader.shnum for reader in self._active_readers])
1992+        active_shnums = sorted(active_shnums)[:needed]
1993+        if len(active_shnums) < needed:
1994+            # We don't have enough readers to retrieve the file; fail.
1995+            return self._failed()
1996 
1997hunk ./src/allmydata/mutable/retrieve.py 323
1998-        # note that we only ask for a single share per query, so we only
1999-        # expect a single share back. On the other hand, we use the extra
2000-        # shares if we get them.. seems better than an assert().
2001+        for shnum in active_shnums:
2002+            self._active_readers.append(self.readers[shnum])
2003+            self.log("added reader for share %d" % shnum)
2004+        assert len(self._active_readers) == self._required_shares
2005+        # Conceptually, this is part of the _add_active_peers step. It
2006+        # validates the prefixes of newly added readers to make sure
2007+        # that they match what we are expecting for self.verinfo. If
2008+        # validation is successful, _validate_active_prefixes will call
2009+        # _download_current_segment for us. If validation is
2010+        # unsuccessful, then _validate_active_prefixes will remove the peer and
2011+        # call _add_active_peers again, where we will attempt to rectify
2012+        # the problem by choosing another peer.
2013+        return self._validate_active_prefixes()
2014 
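
As a minimal sketch of the reader-selection heuristic that _add_active_peers applies above (the helper name and standalone form are hypothetical, not part of the patch): pick the lowest-numbered candidate shares that are not already active, and fail if fewer than k readers can be assembled.

    def choose_new_active_shnums(remaining_shnums, active_shnums, k):
        # How many more readers we need to reach k active readers.
        needed = k - len(active_shnums)
        # Prefer low-numbered shares: they are more likely to be primary
        # shares, and FEC decodes faster with primary shares.
        candidates = sorted(set(remaining_shnums) - set(active_shnums))
        chosen = candidates[:needed]
        if len(chosen) < needed:
            return None  # not enough shares; the download must fail
        return chosen

    # With shares 0, 1, 3, 4, and 7 available, share 1 already active, and
    # k=3, this picks shares 0 and 3.
    print(choose_new_active_shnums([0, 1, 3, 4, 7], [1], 3))
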
2015hunk ./src/allmydata/mutable/retrieve.py 337
2016-        for shnum,datav in datavs.items():
2017-            (prefix, hash_and_data) = datav[:2]
2018-            try:
2019-                self._got_results_one_share(shnum, peerid,
2020-                                            prefix, hash_and_data)
2021-            except CorruptShareError, e:
2022-                # log it and give the other shares a chance to be processed
2023-                f = failure.Failure()
2024-                self.log(format="bad share: %(f_value)s",
2025-                         f_value=str(f.value), failure=f,
2026-                         level=log.WEIRD, umid="7fzWZw")
2027-                self.notify_server_corruption(peerid, shnum, str(e))
2028-                self.remove_peer(peerid)
2029-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2030-                self._bad_shares.add( (peerid, shnum) )
2031-                self._status.problems[peerid] = f
2032-                self._last_failure = f
2033-                pass
2034-            if self._need_privkey and len(datav) > 2:
2035-                lp = None
2036-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2037-        # all done!
2038 
2039hunk ./src/allmydata/mutable/retrieve.py 338
2040-    def notify_server_corruption(self, peerid, shnum, reason):
2041-        ss = self.servermap.connections[peerid]
2042-        ss.callRemoteOnly("advise_corrupt_share",
2043-                          "mutable", self._storage_index, shnum, reason)
2044+    def _validate_active_prefixes(self):
2045+        """
2046+        I check to make sure that the prefixes on the peers that I am
2047+        currently reading from match the prefix that we want to see, as
2048+        said in self.verinfo.
2049 
2050hunk ./src/allmydata/mutable/retrieve.py 344
2051-    def _got_results_one_share(self, shnum, peerid,
2052-                               got_prefix, got_hash_and_data):
2053-        self.log("_got_results: got shnum #%d from peerid %s"
2054-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2055-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2056+        If I find that all of the active peers have acceptable prefixes,
2057+        I pass control to _download_current_segment, which will use
2058+        those peers to do cool things. If I find that some of the active
2059+        peers have unacceptable prefixes, I will remove them from active
2060+        peers (and from further consideration) and call
2061+        _add_active_peers to attempt to rectify the situation. I keep
2062+        track of which peers I have already validated so that I don't
2063+        need to do so again.
2064+        """
2065+        assert self._active_readers, "No more active readers"
2066+
2067+        ds = []
2068+        new_readers = set(self._active_readers) - self._validated_readers
2069+        self.log('validating %d newly-added active readers' % len(new_readers))
2070+
2071+        for reader in new_readers:
2072+            # We force a remote read here -- otherwise, we are relying
2073+            # on cached data that we already verified as valid, and we
2074+            # won't detect an uncoordinated write that has occurred
2075+            # since the last servermap update.
2076+            d = reader.get_prefix(force_remote=True)
2077+            d.addCallback(self._try_to_validate_prefix, reader)
2078+            ds.append(d)
2079+        dl = defer.DeferredList(ds, consumeErrors=True)
2080+        def _check_results(results):
2081+            # Each result in results will be of the form (success, msg).
2082+            # We don't care about msg, but success will tell us whether
2083+            # or not the checkstring validated. If it didn't, we need to
2084+            # remove the offending (peer,share) from our active readers,
2085+            # and ensure that active readers is again populated.
2086+            bad_readers = []
2087+            for i, result in enumerate(results):
2088+                if not result[0]:
2089+                    reader = self._active_readers[i]
2090+                    f = result[1]
2091+                    assert isinstance(f, failure.Failure)
2092+
2093+                    self.log("The reader %s failed to "
2094+                             "properly validate: %s" % \
2095+                             (reader, str(f.value)))
2096+                    bad_readers.append((reader, f))
2097+                else:
2098+                    reader = self._active_readers[i]
2099+                    self.log("the reader %s checks out, so we'll use it" % \
2100+                             reader)
2101+                    self._validated_readers.add(reader)
2102+                    # Each time we validate a reader, we check to see if
2103+                    # we need the private key. If we do, we politely ask
2104+                    # for it and then continue computing. If we find
2105+                    # that we haven't gotten it at the end of
2106+                    # segment decoding, then we'll take more drastic
2107+                    # measures.
2108+                    if self._need_privkey:
2109+                        d = reader.get_encprivkey()
2110+                        d.addCallback(self._try_to_validate_privkey, reader)
2111+            if bad_readers:
2112+                # We do them all at once, or else we screw up list indexing.
2113+                for (reader, f) in bad_readers:
2114+                    self._mark_bad_share(reader, f)
2115+                return self._add_active_peers()
2116+            else:
2117+                return self._download_current_segment()
2118+            # The next step will assert that it has enough active
2119+            # readers to fetch shares; we just need to remove it.
2120+        dl.addCallback(_check_results)
2121+        return dl
2122+
2123+
2124+    def _try_to_validate_prefix(self, prefix, reader):
2125+        """
2126+        I check that the prefix returned by a candidate server for
2127+        retrieval matches the prefix that the servermap knows about
2128+        (and, hence, the prefix that was validated earlier). If it does,
2129+        I return without incident, meaning that the candidate server can
2130+        be used for segment retrieval. If it doesn't, I raise
2131+        UncoordinatedWriteError, meaning that another server must be chosen.
2132+        """
2133+        (seqnum,
2134+         root_hash,
2135+         IV,
2136+         segsize,
2137+         datalength,
2138+         k,
2139+         N,
2140+         known_prefix,
2141          offsets_tuple) = self.verinfo
2142hunk ./src/allmydata/mutable/retrieve.py 430
2143-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2144-        if got_prefix != prefix:
2145-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2146-            raise UncoordinatedWriteError(msg)
2147-        (share_hash_chain, block_hash_tree,
2148-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2149+        if known_prefix != prefix:
2150+            self.log("prefix from share %d doesn't match" % reader.shnum)
2151+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2152+                                          "indicate an uncoordinated write")
2153+        # Otherwise, we're okay -- no issues.
2154 
2155hunk ./src/allmydata/mutable/retrieve.py 436
2156-        assert isinstance(share_data, str)
2157-        # build the block hash tree. SDMF has only one leaf.
2158-        leaves = [hashutil.block_hash(share_data)]
2159-        t = hashtree.HashTree(leaves)
2160-        if list(t) != block_hash_tree:
2161-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2162-        share_hash_leaf = t[0]
2163-        t2 = hashtree.IncompleteHashTree(N)
2164-        # root_hash was checked by the signature
2165-        t2.set_hashes({0: root_hash})
2166-        try:
2167-            t2.set_hashes(hashes=share_hash_chain,
2168-                          leaves={shnum: share_hash_leaf})
2169-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2170-                IndexError), e:
2171-            msg = "corrupt hashes: %s" % (e,)
2172-            raise CorruptShareError(peerid, shnum, msg)
2173-        self.log(" data valid! len=%d" % len(share_data))
2174-        # each query comes down to this: placing validated share data into
2175-        # self.shares
2176-        self.shares[shnum] = share_data
2177 
2178hunk ./src/allmydata/mutable/retrieve.py 437
2179-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2180+    def _remove_reader(self, reader):
2181+        """
2182+        At various points, we will wish to remove a peer from
2183+        consideration and/or use. These include, but are not necessarily
2184+        limited to:
2185 
2186hunk ./src/allmydata/mutable/retrieve.py 443
2187-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2188-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2189-        if alleged_writekey != self._node.get_writekey():
2190-            self.log("invalid privkey from %s shnum %d" %
2191-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2192-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2193-            return
2194+            - A connection error.
2195+            - A mismatched prefix (that is, a prefix that does not match
2196+              our conception of the version information string).
2197+            - A failing block hash, salt hash, or share hash, which can
2198+              indicate disk failure/bit flips, or network trouble.
2199 
2200hunk ./src/allmydata/mutable/retrieve.py 449
2201-        # it's good
2202-        self.log("got valid privkey from shnum %d on peerid %s" %
2203-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2204-                 parent=lp)
2205-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2206-        self._node._populate_encprivkey(enc_privkey)
2207-        self._node._populate_privkey(privkey)
2208-        self._need_privkey = False
2209+        This method will do that. I will make sure that the
2210+        (shnum,reader) combination represented by my reader argument is
2211+        not used for anything else during this download. I will not
2212+        advise the reader of any corruption, something that my callers
2213+        may wish to do on their own.
2214+        """
2215+        # TODO: When you're done writing this, see if this is ever
2216+        # actually used for something that _mark_bad_share isn't. I have
2217+        # a feeling that they will be used for very similar things, and
2218+        # that having them both here is just going to be an epic amount
2219+        # of code duplication.
2220+        #
2221+        # (well, okay, not epic, but meaningful)
2222+        self.log("removing reader %s" % reader)
2223+        # Remove the reader from _active_readers
2224+        self._active_readers.remove(reader)
2225+        # TODO: self.readers.remove(reader)?
2226+        for shnum in list(self.remaining_sharemap.keys()):
2227+            self.remaining_sharemap.discard(shnum, reader.peerid)
2228 
2229hunk ./src/allmydata/mutable/retrieve.py 469
2230-    def _query_failed(self, f, marker, peerid):
2231-        self.log(format="query to [%(peerid)s] failed",
2232-                 peerid=idlib.shortnodeid_b2a(peerid),
2233-                 level=log.NOISY)
2234-        self._status.problems[peerid] = f
2235-        self._outstanding_queries.pop(marker, None)
2236-        if not self._running:
2237-            return
2238-        self._last_failure = f
2239-        self.remove_peer(peerid)
2240-        level = log.WEIRD
2241-        if f.check(DeadReferenceError):
2242-            level = log.UNUSUAL
2243-        self.log(format="error during query: %(f_value)s",
2244-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2245 
2246hunk ./src/allmydata/mutable/retrieve.py 470
2247-    def _check_for_done(self, res):
2248-        # exit paths:
2249-        #  return : keep waiting, no new queries
2250-        #  return self._send_more_queries(outstanding) : send some more queries
2251-        #  fire self._done(plaintext) : download successful
2252-        #  raise exception : download fails
2253+    def _mark_bad_share(self, reader, f):
2254+        """
2255+        I mark the (peerid, shnum) encapsulated by my reader argument as
2256+        a bad share, which means that it will not be used anywhere else.
2257 
2258hunk ./src/allmydata/mutable/retrieve.py 475
2259-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2260-                 running=self._running, decoding=self._decoding,
2261-                 level=log.NOISY)
2262-        if not self._running:
2263-            return
2264-        if self._decoding:
2265-            return
2266-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2267-         offsets_tuple) = self.verinfo
2268+        There are several reasons to want to mark something as a bad
2269+        share. These include:
2270 
2271hunk ./src/allmydata/mutable/retrieve.py 478
2272-        if len(self.shares) < k:
2273-            # we don't have enough shares yet
2274-            return self._maybe_send_more_queries(k)
2275-        if self._need_privkey:
2276-            # we got k shares, but none of them had a valid privkey. TODO:
2277-            # look further. Adding code to do this is a bit complicated, and
2278-            # I want to avoid that complication, and this should be pretty
2279-            # rare (k shares with bitflips in the enc_privkey but not in the
2280-            # data blocks). If we actually do get here, the subsequent repair
2281-            # will fail for lack of a privkey.
2282-            self.log("got k shares but still need_privkey, bummer",
2283-                     level=log.WEIRD, umid="MdRHPA")
2284+            - A connection error to the peer.
2285+            - A mismatched prefix (that is, a prefix that does not match
2286+              our local conception of the version information string).
2287+            - A failing block hash, salt hash, share hash, or other
2288+              integrity check.
2289 
2290hunk ./src/allmydata/mutable/retrieve.py 484
2291-        # we have enough to finish. All the shares have had their hashes
2292-        # checked, so if something fails at this point, we don't know how
2293-        # to fix it, so the download will fail.
2294+        This method will ensure that readers that we wish to mark bad
2295+        (for these reasons or other reasons) are not used for the rest
2296+        of the download. Additionally, it will attempt to tell the
2297+        remote peer (with no guarantee of success) that its share is
2298+        corrupt.
2299+        """
2300+        self.log("marking share %d on server %s as bad" % \
2301+                 (reader.shnum, reader))
2302+        self._remove_reader(reader)
2303+        self._bad_shares.add((reader.peerid, reader.shnum))
2304+        self._status.problems[reader.peerid] = f
2305+        self._last_failure = f
2306+        self.notify_server_corruption(reader.peerid, reader.shnum,
2307+                                      str(f.value))
2308 
2309hunk ./src/allmydata/mutable/retrieve.py 499
2310-        self._decoding = True # avoid reentrancy
2311-        self._status.set_status("decoding")
2312-        now = time.time()
2313-        elapsed = now - self._started
2314-        self._status.timings["fetch"] = elapsed
2315 
2316hunk ./src/allmydata/mutable/retrieve.py 500
2317-        d = defer.maybeDeferred(self._decode)
2318-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2319-        d.addBoth(self._done)
2320-        return d # purely for test convenience
2321+    def _download_current_segment(self):
2322+        """
2323+        I download, validate, decode, decrypt, and assemble the segment
2324+        that this Retrieve is currently responsible for downloading.
2325+        """
2326+        assert len(self._active_readers) >= self._required_shares
2327+        if self._current_segment < self._num_segments:
2328+            d = self._process_segment(self._current_segment)
2329+        else:
2330+            d = defer.succeed(None)
2331+        d.addCallback(self._check_for_done)
2332+        return d
2333 
2334hunk ./src/allmydata/mutable/retrieve.py 513
2335-    def _maybe_send_more_queries(self, k):
2336-        # we don't have enough shares yet. Should we send out more queries?
2337-        # There are some number of queries outstanding, each for a single
2338-        # share. If we can generate 'needed_shares' additional queries, we do
2339-        # so. If we can't, then we know this file is a goner, and we raise
2340-        # NotEnoughSharesError.
2341-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2342-                         "outstanding=%(outstanding)d"),
2343-                 have=len(self.shares), k=k,
2344-                 outstanding=len(self._outstanding_queries),
2345-                 level=log.NOISY)
2346 
2347hunk ./src/allmydata/mutable/retrieve.py 514
2348-        remaining_shares = k - len(self.shares)
2349-        needed = remaining_shares - len(self._outstanding_queries)
2350-        if not needed:
2351-            # we have enough queries in flight already
2352+    def _process_segment(self, segnum):
2353+        """
2354+        I download, validate, decode, and decrypt one segment of the
2355+        file that this Retrieve is retrieving. This means coordinating
2356+        the process of getting k blocks for that segment, validating
2357+        them, assembling them into one segment with the decoder, and
2358+        then decrypting the result.
2359+        """
2360+        self.log("processing segment %d" % segnum)
2361 
2362hunk ./src/allmydata/mutable/retrieve.py 524
2363-            # TODO: but if they've been in flight for a long time, and we
2364-            # have reason to believe that new queries might respond faster
2365-            # (i.e. we've seen other queries come back faster, then consider
2366-            # sending out new queries. This could help with peers which have
2367-            # silently gone away since the servermap was updated, for which
2368-            # we're still waiting for the 15-minute TCP disconnect to happen.
2369-            self.log("enough queries are in flight, no more are needed",
2370-                     level=log.NOISY)
2371-            return
2372+        # TODO: The old code uses a marker. Should this code do that
2373+        # too? What did the Marker do?
2374+        assert len(self._active_readers) >= self._required_shares
2375+
2376+        # We need to ask each of our active readers for its block and
2377+        # salt. We will then validate those. If validation is
2378+        # successful, we will assemble the results into plaintext.
2379+        ds = []
2380+        for reader in self._active_readers:
2381+            d = reader.get_block_and_salt(segnum, queue=True)
2382+            d2 = self._get_needed_hashes(reader, segnum)
2383+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2384+            dl.addCallback(self._validate_block, segnum, reader)
2385+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2386+            ds.append(dl)
2387+            reader.flush()
2388+        dl = defer.DeferredList(ds)
2389+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2390+        return dl
2391 
2392hunk ./src/allmydata/mutable/retrieve.py 544
2393-        outstanding_shnums = set([shnum
2394-                                  for (peerid, shnum, started)
2395-                                  in self._outstanding_queries.values()])
2396-        # prefer low-numbered shares, they are more likely to be primary
2397-        available_shnums = sorted(self.remaining_sharemap.keys())
2398-        for shnum in available_shnums:
2399-            if shnum in outstanding_shnums:
2400-                # skip ones that are already in transit
2401-                continue
2402-            if shnum not in self.remaining_sharemap:
2403-                # no servers for that shnum. note that DictOfSets removes
2404-                # empty sets from the dict for us.
2405-                continue
2406-            peerid = list(self.remaining_sharemap[shnum])[0]
2407-            # get_data will remove that peerid from the sharemap, and add the
2408-            # query to self._outstanding_queries
2409-            self._status.set_status("Retrieving More Shares")
2410-            self.get_data(shnum, peerid)
2411-            needed -= 1
2412-            if not needed:
2413+
2414+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2415+        """
2416+        I take the results of fetching and validating the blocks from a
2417+        callback chain in another method. If those results show that
2418+        fetching and validation succeeded without incident, I proceed
2419+        with decoding and decryption.
2420+        Otherwise, I do nothing.
2421+        """
2422+        self.log("trying to decode and decrypt segment %d" % segnum)
2423+        failures = False
2424+        for block_and_salt in blocks_and_salts:
2425+            if not block_and_salt[0] or block_and_salt[1] is None:
2426+                self.log("some validation operations failed; not proceeding")
2427+                failures = True
2428                 break
2429hunk ./src/allmydata/mutable/retrieve.py 560
2430+        if not failures:
2431+            self.log("everything looks ok, building segment %d" % segnum)
2432+            d = self._decode_blocks(blocks_and_salts, segnum)
2433+            d.addCallback(self._decrypt_segment)
2434+            d.addErrback(self._validation_or_decoding_failed,
2435+                         self._active_readers)
2436+            d.addCallback(self._set_segment)
2437+            return d
2438+        else:
2439+            return defer.succeed(None)
2440+
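For reference, the entries handed to _maybe_decode_and_decrypt_segment are the (success, result) pairs produced by Twisted's DeferredList; a minimal sketch (not part of the patch, names are illustrative) of what the failure check above is inspecting:

    from twisted.internet import defer

    def _show(results):
        # Each entry is a (success, value) 2-tuple; a failed fetch shows up
        # as (False, <Failure ...>) because consumeErrors=True captured it.
        for (success, value) in results:
            print success, value

    d1 = defer.succeed({0: ("block bytes", "salt bytes")})
    d2 = defer.fail(ValueError("simulated fetch failure"))
    dl = defer.DeferredList([d1, d2], consumeErrors=True)
    dl.addCallback(_show)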
2441+
2442+    def _set_segment(self, segment):
2443+        """
2444+        Given a plaintext segment, I append it to the plaintext
2445+        accumulated so far and advance to the next segment.
2446+        """
2447+        self.log("got plaintext for segment %d" % self._current_segment)
2448+        self._plaintext += segment
2449+        self._current_segment += 1
2450 
2451hunk ./src/allmydata/mutable/retrieve.py 581
2452-        # at this point, we have as many outstanding queries as we can. If
2453-        # needed!=0 then we might not have enough to recover the file.
2454-        if needed:
2455-            format = ("ran out of peers: "
2456-                      "have %(have)d shares (k=%(k)d), "
2457-                      "%(outstanding)d queries in flight, "
2458-                      "need %(need)d more, "
2459-                      "found %(bad)d bad shares")
2460-            args = {"have": len(self.shares),
2461-                    "k": k,
2462-                    "outstanding": len(self._outstanding_queries),
2463-                    "need": needed,
2464-                    "bad": len(self._bad_shares),
2465-                    }
2466-            self.log(format=format,
2467-                     level=log.WEIRD, umid="ezTfjw", **args)
2468-            err = NotEnoughSharesError("%s, last failure: %s" %
2469-                                      (format % args, self._last_failure))
2470-            if self._bad_shares:
2471-                self.log("We found some bad shares this pass. You should "
2472-                         "update the servermap and try again to check "
2473-                         "more peers",
2474-                         level=log.WEIRD, umid="EFkOlA")
2475-                err.servermap = self.servermap
2476-            raise err
2477 
2478hunk ./src/allmydata/mutable/retrieve.py 582
2479+    def _validation_or_decoding_failed(self, f, readers):
2480+        """
2481+        I am called when a block or a salt fails to validate correctly, or when
2482+        the decryption or decoding operation fails for some reason. I react to
2483+        this failure by notifying the remote server of the corruption and
2484+        removing the remote peer from further activity.
2485+        """
2486+        assert isinstance(readers, list)
2487+        bad_shnums = [reader.shnum for reader in readers]
2488+
2489+        self.log("validation or decoding failed on share(s) %s, peer(s) %s, "
2490+                 "segment %d: %s" % \
2491+                 (bad_shnums, readers, self._current_segment, str(f)))
2492+        for reader in readers:
2493+            self._mark_bad_share(reader, f)
2494         return
2495 
2496hunk ./src/allmydata/mutable/retrieve.py 599
2497-    def _decode(self):
2498-        started = time.time()
2499-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2500-         offsets_tuple) = self.verinfo
2501 
2502hunk ./src/allmydata/mutable/retrieve.py 600
2503-        # shares_dict is a dict mapping shnum to share data, but the codec
2504-        # wants two lists.
2505-        shareids = []; shares = []
2506-        for shareid, share in self.shares.items():
2507+    def _validate_block(self, results, segnum, reader):
2508+        """
2509+        I validate a block from one share on a remote server.
2510+        """
2511+        # Grab the part of the block hash tree that is necessary to
2512+        # validate this block, then generate the block hash root.
2513+        self.log("validating share %d for segment %d" % (reader.shnum,
2514+                                                             segnum))
2515+        # Did we fail to fetch either of the things that we were
2516+        # supposed to? Fail if so.
2517+        if not results[0][0] or not results[1][0]:
2518+            # handled by the errback handler.
2519+
2520+            # These all get batched into one query, so the resulting
2521+            # failure should be the same for all of them, so we can just
2522+            # use the first one.
2523+            assert isinstance(results[0][1], failure.Failure)
2524+
2525+            f = results[0][1]
2526+            raise CorruptShareError(reader.peerid,
2527+                                    reader.shnum,
2528+                                    "Connection error: %s" % str(f))
2529+
2530+        block_and_salt, block_and_sharehashes = results
2531+        block, salt = block_and_salt[1]
2532+        blockhashes, sharehashes = block_and_sharehashes[1]
2533+
2534+        blockhashes = dict(enumerate(blockhashes[1]))
2535+        self.log("the reader gave me the following blockhashes: %s" % \
2536+                 blockhashes.keys())
2537+        self.log("the reader gave me the following sharehashes: %s" % \
2538+                 sharehashes[1].keys())
2539+        bht = self._block_hash_trees[reader.shnum]
2540+
2541+        if bht.needed_hashes(segnum, include_leaf=True):
2542+            try:
2543+                bht.set_hashes(blockhashes)
2544+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2545+                    IndexError), e:
2546+                raise CorruptShareError(reader.peerid,
2547+                                        reader.shnum,
2548+                                        "block hash tree failure: %s" % e)
2549+
2550+        if self._version == MDMF_VERSION:
2551+            blockhash = hashutil.block_hash(salt + block)
2552+        else:
2553+            blockhash = hashutil.block_hash(block)
2554+        # If this works without an error, then validation is
2555+        # successful.
2556+        try:
2557+            bht.set_hashes(leaves={segnum: blockhash})
2558+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2559+                IndexError), e:
2560+            raise CorruptShareError(reader.peerid,
2561+                                    reader.shnum,
2562+                                    "block hash tree failure: %s" % e)
2563+
2564+        # Reaching this point means that this segment is correct. Now
2565+        # we need to check whether the share hash chain is also
2566+        # correct.
2567+        # SDMF wrote share hash chains that didn't contain the
2568+        # leaves, which are produced from the block hash tree. So we
2569+        # need to validate the block hash tree first; if that
2570+        # succeeds, bht[0] holds the root hash for this shnum, which
2571+        # is a leaf in the share hash tree and lets us validate the
2572+        # rest of that tree.
2573+        if self.share_hash_tree.needed_hashes(reader.shnum,
2574+                                               include_leaf=True):
2575+            try:
2576+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2577+                                            leaves={reader.shnum: bht[0]})
2578+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2579+                    IndexError), e:
2580+                raise CorruptShareError(reader.peerid,
2581+                                        reader.shnum,
2582+                                        "corrupt hashes: %s" % e)
2583+
2584+        # TODO: Validate the salt, too.
2585+        self.log('share %d is valid for segment %d' % (reader.shnum,
2586+                                                       segnum))
2587+        return {reader.shnum: (block, salt)}
2588+
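The leaf computation described above is small enough to state on its own; a sketch (the helper is illustrative, the hashutil call is the one used by this code), with the resulting leaf then checked via bht.set_hashes(leaves={segnum: leaf}):

    from allmydata.util import hashutil

    def block_leaf_hash(block, salt, is_mdmf):
        # MDMF covers the per-segment salt with the block hash; SDMF's
        # file-wide IV is not covered, so only the block itself is hashed.
        if is_mdmf:
            return hashutil.block_hash(salt + block)
        return hashutil.block_hash(block)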
2589+
2590+    def _get_needed_hashes(self, reader, segnum):
2591+        """
2592+        I fetch the hashes needed to validate segnum from the reader, and
2593+        return a Deferred that fires when they have all been retrieved.
2594+        """
2595+        bht = self._block_hash_trees[reader.shnum]
2596+        needed = bht.needed_hashes(segnum, include_leaf=True)
2597+        # The root of the block hash tree is also a leaf in the share
2598+        # hash tree. So we don't need to fetch it from the remote
2599+        # server. For files with one segment, this means that we won't
2600+        # fetch any block hash tree from the remote server at all: the
2601+        # single block hash is itself the entire block hash tree, and it
2602+        # appears as a leaf in the share hash tree. This is fine, since
2603+        # any share corruption will still be detected via the share hash
2604+        # tree.
2605+        #needed.discard(0)
2606+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2607+                 (segnum, reader.shnum, str(needed)))
2608+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2609+        if self.share_hash_tree.needed_hashes(reader.shnum):
2610+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2611+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2612+                                                                 str(need)))
2613+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2614+        else:
2615+            d2 = defer.succeed({}) # the logic in the next method
2616+                                   # expects a dict
2617+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2618+        return dl
2619+
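To illustrate what needed_hashes is answering here (a sketch using the hashtree module the surrounding code relies on): for a tree that holds no hashes yet, it names every hash node we still have to fetch before the requested leaf can be checked.

    from allmydata import hashtree

    bht = hashtree.IncompleteHashTree(4)           # e.g. a 4-segment file
    # The set of hash-node indices still missing for validating leaf 2;
    # include_leaf=True also demands the leaf hash itself.
    print bht.needed_hashes(2, include_leaf=True)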
2620+
2621+    def _decode_blocks(self, blocks_and_salts, segnum):
2622+        """
2623+        I take a list of k blocks and salts, and decode that into a
2624+        single encrypted segment.
2625+        """
2626+        d = {}
2627+        # We want to merge our dictionaries to the form
2628+        # {shnum: blocks_and_salts}
2629+        #
2630+        # The dictionaries come from _validate_block in that form, so we just
2631+        # need to merge them.
2632+        for block_and_salt in blocks_and_salts:
2633+            d.update(block_and_salt[1])
2634+
2635+        # All of these blocks should have the same salt; in SDMF, it is
2636+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2637+        # either case, we just need to get one of them and use it.
2638+        #
2639+        # d.items()[0] is like (shnum, (block, salt))
2640+        # d.items()[0][1] is like (block, salt)
2641+        # d.items()[0][1][1] is the salt.
2642+        salt = d.items()[0][1][1]
2643+        # Next, extract just the blocks from the dict. We'll use the
2644+        # salt in the next step.
2645+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2646+        d2 = dict(share_and_shareids)
2647+        shareids = []
2648+        shares = []
2649+        for shareid, share in d2.items():
2650             shareids.append(shareid)
2651             shares.append(share)
2652 
2653hunk ./src/allmydata/mutable/retrieve.py 746
2654-        assert len(shareids) >= k, len(shareids)
2655+        assert len(shareids) >= self._required_shares, len(shareids)
2656         # zfec really doesn't want extra shares
2657hunk ./src/allmydata/mutable/retrieve.py 748
2658-        shareids = shareids[:k]
2659-        shares = shares[:k]
2660-
2661-        fec = codec.CRSDecoder()
2662-        fec.set_params(segsize, k, N)
2663-
2664-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2665-        self.log("about to decode, shareids=%s" % (shareids,))
2666-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2667-        def _done(buffers):
2668-            self._status.timings["decode"] = time.time() - started
2669-            self.log(" decode done, %d buffers" % len(buffers))
2670+        shareids = shareids[:self._required_shares]
2671+        shares = shares[:self._required_shares]
2672+        self.log("decoding segment %d" % segnum)
2673+        if segnum == self._num_segments - 1:
2674+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2675+        else:
2676+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2677+        def _process(buffers):
2678             segment = "".join(buffers)
2679hunk ./src/allmydata/mutable/retrieve.py 757
2680+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
2681+                     segnum=segnum,
2682+                     numsegs=self._num_segments,
2683+                     level=log.NOISY)
2684             self.log(" joined length %d, datalength %d" %
2685hunk ./src/allmydata/mutable/retrieve.py 762
2686-                     (len(segment), datalength))
2687-            segment = segment[:datalength]
2688+                     (len(segment), self._data_length))
2689+            if segnum == self._num_segments - 1:
2690+                size_to_use = self._tail_data_size
2691+            else:
2692+                size_to_use = self._segment_size
2693+            segment = segment[:size_to_use]
2694             self.log(" segment len=%d" % len(segment))
2695hunk ./src/allmydata/mutable/retrieve.py 769
2696-            return segment
2697-        def _err(f):
2698-            self.log(" decode failed: %s" % f)
2699-            return f
2700-        d.addCallback(_done)
2701-        d.addErrback(_err)
2702+            return segment, salt
2703+        d.addCallback(_process)
2704         return d
2705 
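The decode step is the same k-of-N operation as in the code being removed above, just driven per segment now; a minimal sketch (parameter names are illustrative):

    from allmydata import codec
    from twisted.internet import defer

    def decode_segment(shares, shareids, segsize, k, n):
        # Any k distinct shares (with their share ids) suffice; zfec does
        # not want extras, so the lists are trimmed to exactly k entries.
        fec = codec.CRSDecoder()
        fec.set_params(segsize, k, n)
        d = defer.maybeDeferred(fec.decode, shares[:k], shareids[:k])
        d.addCallback(lambda buffers: "".join(buffers))
        return d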
2706hunk ./src/allmydata/mutable/retrieve.py 773
2707-    def _decrypt(self, crypttext, IV, readkey):
2708+
2709+    def _decrypt_segment(self, segment_and_salt):
2710+        """
2711+        I take a single segment and its salt, and decrypt it. I return
2712+        the plaintext of the segment that is in my argument.
2713+        """
2714+        segment, salt = segment_and_salt
2715         self._status.set_status("decrypting")
2716hunk ./src/allmydata/mutable/retrieve.py 781
2717+        self.log("decrypting segment %d" % self._current_segment)
2718         started = time.time()
2719hunk ./src/allmydata/mutable/retrieve.py 783
2720-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2721+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2722         decryptor = AES(key)
2723hunk ./src/allmydata/mutable/retrieve.py 785
2724-        plaintext = decryptor.process(crypttext)
2725+        plaintext = decryptor.process(segment)
2726         self._status.timings["decrypt"] = time.time() - started
2727         return plaintext
2728 
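Spelled out on its own, the per-segment decryption above is just a key derivation plus one AES pass; a sketch (the helper is illustrative, assuming pycryptopp's AES, which is what AES(key) in this file refers to):

    from pycryptopp.cipher.aes import AES
    from allmydata.util import hashutil

    def decrypt_segment(readkey, salt, crypttext):
        # The segment key is derived from the file's readkey and the
        # segment's salt (the file-wide IV, in the SDMF case).
        key = hashutil.ssk_readkey_data_hash(salt, readkey)
        return AES(key).process(crypttext)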
2729hunk ./src/allmydata/mutable/retrieve.py 789
2730-    def _done(self, res):
2731-        if not self._running:
2732+
2733+    def notify_server_corruption(self, peerid, shnum, reason):
2734+        ss = self.servermap.connections[peerid]
2735+        ss.callRemoteOnly("advise_corrupt_share",
2736+                          "mutable", self._storage_index, shnum, reason)
2737+
2738+
2739+    def _try_to_validate_privkey(self, enc_privkey, reader):
2740+
2741+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2742+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2743+        if alleged_writekey != self._node.get_writekey():
2744+            self.log("invalid privkey from %s shnum %d" %
2745+                     (reader, reader.shnum),
2746+                     level=log.WEIRD, umid="YIw4tA")
2747             return
2748hunk ./src/allmydata/mutable/retrieve.py 805
2749-        self._running = False
2750-        self._status.set_active(False)
2751-        self._status.timings["total"] = time.time() - self._started
2752-        # res is either the new contents, or a Failure
2753-        if isinstance(res, failure.Failure):
2754-            self.log("Retrieve done, with failure", failure=res,
2755-                     level=log.UNUSUAL)
2756-            self._status.set_status("Failed")
2757-        else:
2758-            self.log("Retrieve done, success!")
2759-            self._status.set_status("Finished")
2760-            self._status.set_progress(1.0)
2761-            # remember the encoding parameters, use them again next time
2762-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2763-             offsets_tuple) = self.verinfo
2764-            self._node._populate_required_shares(k)
2765-            self._node._populate_total_shares(N)
2766-        eventually(self._done_deferred.callback, res)
2767 
2768hunk ./src/allmydata/mutable/retrieve.py 806
2769+        # it's good
2770+        self.log("got valid privkey from shnum %d on reader %s" %
2771+                 (reader.shnum, reader))
2772+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2773+        self._node._populate_encprivkey(enc_privkey)
2774+        self._node._populate_privkey(privkey)
2775+        self._need_privkey = False
2776+
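The acceptance test applied to a fetched private key is compact enough to restate; a sketch (the helper itself is hypothetical, the hash calls mirror the code above):

    from allmydata.util import hashutil

    def privkey_matches_writekey(node, alleged_privkey_s):
        # A recovered private key is trusted only if hashing it reproduces
        # the writekey we already hold; anything else means a corrupt share.
        return hashutil.ssk_writekey_hash(alleged_privkey_s) == node.get_writekey()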
2777+
2778+    def _check_for_done(self, res):
2779+        """
2780+        I check to see if this Retrieve object has successfully finished
2781+        its work.
2782+
2783+        I can exit in the following ways:
2784+            - If there are no more segments to download, then I exit by
2785+              causing self._done_deferred to fire with the plaintext
2786+              content requested by the caller.
2787+            - If there are still segments to be downloaded, and there
2788+              are enough active readers (readers which have not broken
2789+              and have not given us corrupt data) to continue
2790+              downloading, I send control back to
2791+              _download_current_segment.
2792+            - If there are still segments to be downloaded but there are
2793+              not enough active peers to download them, I ask
2794+              _add_active_peers to add more peers. If it is successful,
2795+              it will call _download_current_segment. If there are not
2796+              enough peers to retrieve the file, then that will cause
2797+              _done_deferred to errback.
2798+        """
2799+        self.log("checking for doneness")
2800+        if self._current_segment == self._num_segments:
2801+            # No more segments to download, we're done.
2802+            self.log("got plaintext, done")
2803+            return self._done()
2804+
2805+        if len(self._active_readers) >= self._required_shares:
2806+            # More segments to download, but we have enough good peers
2807+            # in self._active_readers that we can do that without issue,
2808+            # so go nab the next segment.
2809+            self.log("not done yet: on segment %d of %d" % \
2810+                     (self._current_segment + 1, self._num_segments))
2811+            return self._download_current_segment()
2812+
2813+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2814+                 (self._current_segment + 1, self._num_segments))
2815+        return self._add_active_peers()
2816+
2817+
2818+    def _done(self):
2819+        """
2820+        I am called by _check_for_done when the download process has
2821+        finished successfully. After making some useful logging
2822+        statements, I return the decrypted contents to the owner of this
2823+        Retrieve object through self._done_deferred.
2824+        """
2825+        eventually(self._done_deferred.callback, self._plaintext)
2826+
2827+
2828+    def _failed(self):
2829+        """
2830+        I am called by _add_active_peers when there are not enough
2831+        active peers left to complete the download. After making some
2832+        useful logging statements, I return an exception to that effect
2833+        to the caller of this Retrieve object through
2834+        self._done_deferred.
2835+        """
2836+        format = ("ran out of peers: "
2837+                  "have %(have)d of %(total)d segments "
2838+                  "found %(bad)d bad shares "
2839+                  "encoding %(k)d-of-%(n)d")
2840+        args = {"have": self._current_segment,
2841+                "total": self._num_segments,
2842+                "k": self._required_shares,
2843+                "n": self._total_shares,
2844+                "bad": len(self._bad_shares)}
2845+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2846+                                                        str(self._last_failure)))
2847+        f = failure.Failure(e)
2848+        eventually(self._done_deferred.callback, f)
2849hunk ./src/allmydata/test/test_mutable.py 12
2850 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2851      ssk_pubkey_fingerprint_hash
2852 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2853-     NotEnoughSharesError
2854+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2855 from allmydata.monitor import Monitor
2856 from allmydata.test.common import ShouldFailMixin
2857 from allmydata.test.no_network import GridTestMixin
2858hunk ./src/allmydata/test/test_mutable.py 28
2859 from allmydata.mutable.retrieve import Retrieve
2860 from allmydata.mutable.publish import Publish
2861 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2862-from allmydata.mutable.layout import unpack_header, unpack_share
2863+from allmydata.mutable.layout import unpack_header, unpack_share, \
2864+                                     MDMFSlotReadProxy
2865 from allmydata.mutable.repairer import MustForceRepairError
2866 
2867 import allmydata.test.common_util as testutil
2868hunk ./src/allmydata/test/test_mutable.py 104
2869         d = fireEventually()
2870         d.addCallback(lambda res: _call())
2871         return d
2872+
2873     def callRemoteOnly(self, methname, *args, **kwargs):
2874         d = self.callRemote(methname, *args, **kwargs)
2875         d.addBoth(lambda ignore: None)
2876hunk ./src/allmydata/test/test_mutable.py 163
2877 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2878     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2879     # list of shnums to corrupt.
2880+    ds = []
2881     for peerid in s._peers:
2882         shares = s._peers[peerid]
2883         for shnum in shares:
2884hunk ./src/allmydata/test/test_mutable.py 190
2885                 else:
2886                     offset1 = offset
2887                     offset2 = 0
2888-                if offset1 == "pubkey":
2889+                if offset1 == "pubkey" and IV:
2890                     real_offset = 107
2891hunk ./src/allmydata/test/test_mutable.py 192
2892+                elif offset1 == "share_data" and not IV:
2893+                    real_offset = 104
2894                 elif offset1 in o:
2895                     real_offset = o[offset1]
2896                 else:
2897hunk ./src/allmydata/test/test_mutable.py 327
2898         d.addCallback(_created)
2899         return d
2900 
2901+
2902+    def test_upload_and_download_mdmf(self):
2903+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2904+        def _created(n):
2905+            d = defer.succeed(None)
2906+            d.addCallback(lambda ignored:
2907+                n.get_servermap(MODE_READ))
2908+            def _then(servermap):
2909+                dumped = servermap.dump(StringIO())
2910+                self.failUnlessIn("3-of-10", dumped.getvalue())
2911+            d.addCallback(_then)
2912+            # Now overwrite the contents with some new contents. We want
2913+            # to make them big enough to force the file to be uploaded
2914+            # in more than one segment.
2915+            big_contents = "contents1" * 100000 # about 900 KiB
2916+            d.addCallback(lambda ignored:
2917+                n.overwrite(big_contents))
2918+            d.addCallback(lambda ignored:
2919+                n.download_best_version())
2920+            d.addCallback(lambda data:
2921+                self.failUnlessEqual(data, big_contents))
2922+            # Overwrite the contents again with some new contents. As
2923+            # before, they need to be big enough to force multiple
2924+            # segments, so that we make the downloader deal with
2925+            # multiple segments.
2926+            bigger_contents = "contents2" * 1000000 # about 9MiB
2927+            d.addCallback(lambda ignored:
2928+                n.overwrite(bigger_contents))
2929+            d.addCallback(lambda ignored:
2930+                n.download_best_version())
2931+            d.addCallback(lambda data:
2932+                self.failUnlessEqual(data, bigger_contents))
2933+            return d
2934+        d.addCallback(_created)
2935+        return d
2936+
2937+
2938     def test_create_with_initial_contents(self):
2939         d = self.nodemaker.create_mutable_file("contents 1")
2940         def _created(n):
2941hunk ./src/allmydata/test/test_mutable.py 1147
2942 
2943 
2944     def _test_corrupt_all(self, offset, substring,
2945-                          should_succeed=False, corrupt_early=True,
2946-                          failure_checker=None):
2947+                          should_succeed=False,
2948+                          corrupt_early=True,
2949+                          failure_checker=None,
2950+                          fetch_privkey=False):
2951         d = defer.succeed(None)
2952         if corrupt_early:
2953             d.addCallback(corrupt, self._storage, offset)
2954hunk ./src/allmydata/test/test_mutable.py 1167
2955                     self.failUnlessIn(substring, "".join(allproblems))
2956                 return servermap
2957             if should_succeed:
2958-                d1 = self._fn.download_version(servermap, ver)
2959+                d1 = self._fn.download_version(servermap, ver,
2960+                                               fetch_privkey)
2961                 d1.addCallback(lambda new_contents:
2962                                self.failUnlessEqual(new_contents, self.CONTENTS))
2963             else:
2964hunk ./src/allmydata/test/test_mutable.py 1175
2965                 d1 = self.shouldFail(NotEnoughSharesError,
2966                                      "_corrupt_all(offset=%s)" % (offset,),
2967                                      substring,
2968-                                     self._fn.download_version, servermap, ver)
2969+                                     self._fn.download_version, servermap,
2970+                                                                ver,
2971+                                                                fetch_privkey)
2972             if failure_checker:
2973                 d1.addCallback(failure_checker)
2974             d1.addCallback(lambda res: servermap)
2975hunk ./src/allmydata/test/test_mutable.py 1186
2976         return d
2977 
2978     def test_corrupt_all_verbyte(self):
2979-        # when the version byte is not 0, we hit an UnknownVersionError error
2980-        # in unpack_share().
2981+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
2982+        # error in unpack_share().
2983         d = self._test_corrupt_all(0, "UnknownVersionError")
2984         def _check_servermap(servermap):
2985             # and the dump should mention the problems
2986hunk ./src/allmydata/test/test_mutable.py 1193
2987             s = StringIO()
2988             dump = servermap.dump(s).getvalue()
2989-            self.failUnless("10 PROBLEMS" in dump, dump)
2990+            self.failUnless("30 PROBLEMS" in dump, dump)
2991         d.addCallback(_check_servermap)
2992         return d
2993 
2994hunk ./src/allmydata/test/test_mutable.py 1263
2995         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
2996 
2997 
2998+    def test_corrupt_all_encprivkey_late(self):
2999+        # this should work for the same reason as above, but we corrupt
3000+        # after the servermap update to exercise the error handling
3001+        # code.
3002+        # We need to remove the privkey from the node, or the retrieve
3003+        # process won't know to update it.
3004+        self._fn._privkey = None
3005+        return self._test_corrupt_all("enc_privkey",
3006+                                      None, # this shouldn't fail
3007+                                      should_succeed=True,
3008+                                      corrupt_early=False,
3009+                                      fetch_privkey=True)
3010+
3011+
3012     def test_corrupt_all_seqnum_late(self):
3013         # corrupting the seqnum between mapupdate and retrieve should result
3014         # in NotEnoughSharesError, since each share will look invalid
3015hunk ./src/allmydata/test/test_mutable.py 1283
3016         def _check(res):
3017             f = res[0]
3018             self.failUnless(f.check(NotEnoughSharesError))
3019-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3020+            self.failUnless("uncoordinated write" in str(f))
3021         return self._test_corrupt_all(1, "ran out of peers",
3022                                       corrupt_early=False,
3023                                       failure_checker=_check)
3024hunk ./src/allmydata/test/test_mutable.py 1333
3025                       self.failUnlessEqual(new_contents, self.CONTENTS))
3026         return d
3027 
3028-    def test_corrupt_some(self):
3029-        # corrupt the data of first five shares (so the servermap thinks
3030-        # they're good but retrieve marks them as bad), so that the
3031-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3032-        # retry with more servers.
3033-        corrupt(None, self._storage, "share_data", range(5))
3034-        d = self.make_servermap()
3035+
3036+    def _test_corrupt_some(self, offset, mdmf=False):
3037+        if mdmf:
3038+            d = self.publish_mdmf()
3039+        else:
3040+            d = defer.succeed(None)
3041+        d.addCallback(lambda ignored:
3042+            corrupt(None, self._storage, offset, range(5)))
3043+        d.addCallback(lambda ignored:
3044+            self.make_servermap())
3045         def _do_retrieve(servermap):
3046             ver = servermap.best_recoverable_version()
3047             self.failUnless(ver)
3048hunk ./src/allmydata/test/test_mutable.py 1349
3049             return self._fn.download_best_version()
3050         d.addCallback(_do_retrieve)
3051         d.addCallback(lambda new_contents:
3052-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3053+            self.failUnlessEqual(new_contents, self.CONTENTS))
3054         return d
3055 
3056hunk ./src/allmydata/test/test_mutable.py 1352
3057+
3058+    def test_corrupt_some(self):
3059+        # corrupt the data of first five shares (so the servermap thinks
3060+        # they're good but retrieve marks them as bad), so that the
3061+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3062+        # retry with more servers.
3063+        return self._test_corrupt_some("share_data")
3064+
3065+
3066     def test_download_fails(self):
3067         d = corrupt(None, self._storage, "signature")
3068         d.addCallback(lambda ignored:
3069hunk ./src/allmydata/test/test_mutable.py 1366
3070             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3071                             "no recoverable versions",
3072-                            self._fn.download_best_version)
3073+                            self._fn.download_best_version))
3074         return d
3075 
3076 
3077hunk ./src/allmydata/test/test_mutable.py 1370
3078+
3079+    def test_corrupt_mdmf_block_hash_tree(self):
3080+        d = self.publish_mdmf()
3081+        d.addCallback(lambda ignored:
3082+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3083+                                   "block hash tree failure",
3084+                                   corrupt_early=False,
3085+                                   should_succeed=False))
3086+        return d
3087+
3088+
3089+    def test_corrupt_mdmf_block_hash_tree_late(self):
3090+        d = self.publish_mdmf()
3091+        d.addCallback(lambda ignored:
3092+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3093+                                   "block hash tree failure",
3094+                                   corrupt_early=True,
3095+                                   should_succeed=False))
3096+        return d
3097+
3098+
3099+    def test_corrupt_mdmf_share_data(self):
3100+        d = self.publish_mdmf()
3101+        d.addCallback(lambda ignored:
3102+            # TODO: Find out what the block size is and corrupt a
3103+            # specific block, rather than just guessing.
3104+            self._test_corrupt_all(("share_data", 12 * 40),
3105+                                    "block hash tree failure",
3106+                                    corrupt_early=True,
3107+                                    should_succeed=False))
3108+        return d
3109+
3110+
3111+    def test_corrupt_some_mdmf(self):
3112+        return self._test_corrupt_some(("share_data", 12 * 40),
3113+                                       mdmf=True)
3114+
3115+
3116 class CheckerMixin:
3117     def check_good(self, r, where):
3118         self.failUnless(r.is_healthy(), where)
3119hunk ./src/allmydata/test/test_mutable.py 2116
3120             d.addCallback(lambda res:
3121                           self.shouldFail(NotEnoughSharesError,
3122                                           "test_retrieve_surprise",
3123-                                          "ran out of peers: have 0 shares (k=3)",
3124+                                          "ran out of peers: have 0 of 1",
3125                                           n.download_version,
3126                                           self.old_map,
3127                                           self.old_map.best_recoverable_version(),
3128hunk ./src/allmydata/test/test_mutable.py 2125
3129         d.addCallback(_created)
3130         return d
3131 
3132+
3133     def test_unexpected_shares(self):
3134         # upload the file, take a servermap, shut down one of the servers,
3135         # upload it again (causing shares to appear on a new server), then
3136hunk ./src/allmydata/test/test_mutable.py 2329
3137         self.basedir = "mutable/Problems/test_privkey_query_missing"
3138         self.set_up_grid(num_servers=20)
3139         nm = self.g.clients[0].nodemaker
3140-        LARGE = "These are Larger contents" * 2000 # about 50KB
3141+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3142         nm._node_cache = DevNullDictionary() # disable the nodecache
3143 
3144         d = nm.create_mutable_file(LARGE)
3145hunk ./src/allmydata/test/test_mutable.py 2342
3146         d.addCallback(_created)
3147         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3148         return d
3149+
3150+
3151+    def test_block_and_hash_query_error(self):
3152+        # This tests for what happens when a query to a remote server
3153+        # fails in either the hash validation step or the block getting
3154+        # step (because of batching, this is the same actual query).
3155+        # We need to have the storage server persist up until the point
3156+        # that its prefix is validated, then suddenly die. This
3157+        # exercises some exception handling code in Retrieve.
3158+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3159+        self.set_up_grid(num_servers=20)
3160+        nm = self.g.clients[0].nodemaker
3161+        CONTENTS = "contents" * 2000
3162+        d = nm.create_mutable_file(CONTENTS)
3163+        def _created(node):
3164+            self._node = node
3165+        d.addCallback(_created)
3166+        d.addCallback(lambda ignored:
3167+            self._node.get_servermap(MODE_READ))
3168+        def _then(servermap):
3169+            # we have our servermap. Now we set up the servers like the
3170+            # tests above -- the first one that gets a read call should
3171+            # start throwing errors, but only after returning its prefix
3172+            # for validation. Since we'll download without fetching the
3173+            # private key, the next query to the remote server will be
3174+            # for either a block and salt or for hashes, either of which
3175+            # will exercise the error handling code.
3176+            killer = FirstServerGetsKilled()
3177+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3178+                ss.post_call_notifier = killer.notify
3179+            ver = servermap.best_recoverable_version()
3180+            assert ver
3181+            return self._node.download_version(servermap, ver)
3182+        d.addCallback(_then)
3183+        d.addCallback(lambda data:
3184+            self.failUnlessEqual(data, CONTENTS))
3185+        return d
3186}
3187[mutable/checker.py: check MDMF files
3188Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3189 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3190 
3191 This patch adapts the mutable file checker and verifier to check and
3192 verify MDMF files. It does this by using the new segmented downloader,
3193 which is trained to perform verification operations on request. This
3194 removes some code duplication.
3195] {
3196hunk ./src/allmydata/mutable/checker.py 12
3197 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3198 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3199 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3200+from allmydata.mutable.retrieve import Retrieve # for verifying
3201 
3202 class MutableChecker:
3203 
3204hunk ./src/allmydata/mutable/checker.py 29
3205 
3206     def check(self, verify=False, add_lease=False):
3207         servermap = ServerMap()
3208+        # Updating the servermap in MODE_CHECK will stand a good chance
3209+        # of finding all of the shares, and getting a good idea of
3210+        # recoverability, etc, without verifying.
3211         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3212                              servermap, MODE_CHECK, add_lease=add_lease)
3213         if self._history:
3214hunk ./src/allmydata/mutable/checker.py 55
3215         if num_recoverable:
3216             self.best_version = servermap.best_recoverable_version()
3217 
3218+        # The file is unhealthy and needs to be repaired if:
3219+        # - There are unrecoverable versions.
3220         if servermap.unrecoverable_versions():
3221             self.need_repair = True
3222hunk ./src/allmydata/mutable/checker.py 59
3223+        # - There isn't a recoverable version.
3224         if num_recoverable != 1:
3225             self.need_repair = True
3226hunk ./src/allmydata/mutable/checker.py 62
3227+        # - The best recoverable version is missing some shares.
3228         if self.best_version:
3229             available_shares = servermap.shares_available()
3230             (num_distinct_shares, k, N) = available_shares[self.best_version]
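Taken together, the three conditions noted in the comments above amount to a small health predicate; a sketch (the helper is illustrative, the servermap calls are the ones this checker already uses):

    def needs_repair(servermap, best_version):
        # Unhealthy if any version is unrecoverable, if there is not exactly
        # one recoverable version, or if the best version is short on shares.
        if servermap.unrecoverable_versions():
            return True
        if len(servermap.recoverable_versions()) != 1:
            return True
        (num_distinct_shares, k, n) = servermap.shares_available()[best_version]
        return num_distinct_shares < n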
3231hunk ./src/allmydata/mutable/checker.py 73
3232 
3233     def _verify_all_shares(self, servermap):
3234         # read every byte of each share
3235+        #
3236+        # This logic is going to be very nearly the same as the
3237+        # downloader. I bet we could pass the downloader a flag that
3238+        # makes it do this, and piggyback onto that instead of
3239+        # duplicating a bunch of code.
3240+        #
3241+        # Like:
3242+        #  r = Retrieve(blah, blah, blah, verify=True)
3243+        #  d = r.download()
3244+        #  (wait, wait, wait, d.callback)
3245+        # 
3246+        #  Then, when it has finished, we can check the servermap (which
3247+        #  we provided to Retrieve) to figure out which shares are bad,
3248+        #  since the Retrieve process will have updated the servermap as
3249+        #  it went along.
3250+        #
3251+        #  By passing the verify=True flag to the constructor, we are
3252+        #  telling the downloader a few things.
3253+        #
3254+        #  1. It needs to download all N shares, not just K shares.
3255+        #  2. It doesn't need to decrypt or decode the shares, only
3256+        #     verify them.
3257         if not self.best_version:
3258             return
3259hunk ./src/allmydata/mutable/checker.py 97
3260-        versionmap = servermap.make_versionmap()
3261-        shares = versionmap[self.best_version]
3262-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3263-         offsets_tuple) = self.best_version
3264-        offsets = dict(offsets_tuple)
3265-        readv = [ (0, offsets["EOF"]) ]
3266-        dl = []
3267-        for (shnum, peerid, timestamp) in shares:
3268-            ss = servermap.connections[peerid]
3269-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3270-            d.addCallback(self._got_answer, peerid, servermap)
3271-            dl.append(d)
3272-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3273 
3274hunk ./src/allmydata/mutable/checker.py 98
3275-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3276-        # isolate the callRemote to a separate method, so tests can subclass
3277-        # Publish and override it
3278-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3279+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3280+        d = r.download()
3281+        d.addCallback(self._process_bad_shares)
3282         return d
3283 
3284hunk ./src/allmydata/mutable/checker.py 103
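The verify-mode use of the segmented downloader sketched in the comment block above reduces to the following (the driver function is hypothetical; the Retrieve call mirrors the code in this hunk):

    def verify_best_version(node, servermap, best_version):
        # verify=True asks the downloader to fetch and hash-check every
        # share of this version instead of decoding and decrypting k of them.
        r = Retrieve(node, servermap, best_version, verify=True)
        d = r.download()
        # The deferred fires with the bad shares found during verification,
        # which is what _process_bad_shares records above.
        return d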
3285-    def _got_answer(self, datavs, peerid, servermap):
3286-        for shnum,datav in datavs.items():
3287-            data = datav[0]
3288-            try:
3289-                self._got_results_one_share(shnum, peerid, data)
3290-            except CorruptShareError:
3291-                f = failure.Failure()
3292-                self.need_repair = True
3293-                self.bad_shares.append( (peerid, shnum, f) )
3294-                prefix = data[:SIGNED_PREFIX_LENGTH]
3295-                servermap.mark_bad_share(peerid, shnum, prefix)
3296-                ss = servermap.connections[peerid]
3297-                self.notify_server_corruption(ss, shnum, str(f.value))
3298-
3299-    def check_prefix(self, peerid, shnum, data):
3300-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3301-         offsets_tuple) = self.best_version
3302-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3303-        if got_prefix != prefix:
3304-            raise CorruptShareError(peerid, shnum,
3305-                                    "prefix mismatch: share changed while we were reading it")
3306-
3307-    def _got_results_one_share(self, shnum, peerid, data):
3308-        self.check_prefix(peerid, shnum, data)
3309-
3310-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3311-        # which checks their signature against the pubkey known to be
3312-        # associated with this file.
3313 
3314hunk ./src/allmydata/mutable/checker.py 104
3315-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3316-         share_hash_chain, block_hash_tree, share_data,
3317-         enc_privkey) = unpack_share(data)
3318-
3319-        # validate [share_hash_chain,block_hash_tree,share_data]
3320-
3321-        leaves = [hashutil.block_hash(share_data)]
3322-        t = hashtree.HashTree(leaves)
3323-        if list(t) != block_hash_tree:
3324-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3325-        share_hash_leaf = t[0]
3326-        t2 = hashtree.IncompleteHashTree(N)
3327-        # root_hash was checked by the signature
3328-        t2.set_hashes({0: root_hash})
3329-        try:
3330-            t2.set_hashes(hashes=share_hash_chain,
3331-                          leaves={shnum: share_hash_leaf})
3332-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3333-                IndexError), e:
3334-            msg = "corrupt hashes: %s" % (e,)
3335-            raise CorruptShareError(peerid, shnum, msg)
3336-
3337-        # validate enc_privkey: only possible if we have a write-cap
3338-        if not self._node.is_readonly():
3339-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3340-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3341-            if alleged_writekey != self._node.get_writekey():
3342-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3343+    def _process_bad_shares(self, bad_shares):
3344+        if bad_shares:
3345+            self.need_repair = True
3346+        self.bad_shares = bad_shares
3347 
3348hunk ./src/allmydata/mutable/checker.py 109
3349-    def notify_server_corruption(self, ss, shnum, reason):
3350-        ss.callRemoteOnly("advise_corrupt_share",
3351-                          "mutable", self._storage_index, shnum, reason)
3352 
3353     def _count_shares(self, smap, version):
3354         available_shares = smap.shares_available()
3355hunk ./src/allmydata/test/test_mutable.py 193
3356                 if offset1 == "pubkey" and IV:
3357                     real_offset = 107
3358                 elif offset1 == "share_data" and not IV:
3359-                    real_offset = 104
3360+                    real_offset = 107
3361                 elif offset1 in o:
3362                     real_offset = o[offset1]
3363                 else:
3364hunk ./src/allmydata/test/test_mutable.py 395
3365             return d
3366         d.addCallback(_created)
3367         return d
3368+    test_create_mdmf_with_initial_contents.timeout = 20
3369 
3370 
3371     def test_create_with_initial_contents_function(self):
3372hunk ./src/allmydata/test/test_mutable.py 700
3373                                            k, N, segsize, datalen)
3374                 self.failUnless(p._pubkey.verify(sig_material, signature))
3375                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3376-                self.failUnless(isinstance(share_hash_chain, dict))
3377-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3378+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3379                 for shnum,share_hash in share_hash_chain.items():
3380                     self.failUnless(isinstance(shnum, int))
3381                     self.failUnless(isinstance(share_hash, str))
3382hunk ./src/allmydata/test/test_mutable.py 820
3383                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3384 
3385 
3386+
3387+
3388 class Servermap(unittest.TestCase, PublishMixin):
3389     def setUp(self):
3390         return self.publish_one()
3391hunk ./src/allmydata/test/test_mutable.py 951
3392         self._storage._peers = {} # delete all shares
3393         ms = self.make_servermap
3394         d = defer.succeed(None)
3395-
3396+#
3397         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3398         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3399 
3400hunk ./src/allmydata/test/test_mutable.py 1440
3401         d.addCallback(self.check_good, "test_check_good")
3402         return d
3403 
3404+    def test_check_mdmf_good(self):
3405+        d = self.publish_mdmf()
3406+        d.addCallback(lambda ignored:
3407+            self._fn.check(Monitor()))
3408+        d.addCallback(self.check_good, "test_check_mdmf_good")
3409+        return d
3410+
3411     def test_check_no_shares(self):
3412         for shares in self._storage._peers.values():
3413             shares.clear()
3414hunk ./src/allmydata/test/test_mutable.py 1454
3415         d.addCallback(self.check_bad, "test_check_no_shares")
3416         return d
3417 
3418+    def test_check_mdmf_no_shares(self):
3419+        d = self.publish_mdmf()
3420+        def _then(ignored):
3421+            for share in self._storage._peers.values():
3422+                share.clear()
3423+        d.addCallback(_then)
3424+        d.addCallback(lambda ignored:
3425+            self._fn.check(Monitor()))
3426+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3427+        return d
3428+
3429     def test_check_not_enough_shares(self):
3430         for shares in self._storage._peers.values():
3431             for shnum in shares.keys():
3432hunk ./src/allmydata/test/test_mutable.py 1474
3433         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3434         return d
3435 
3436+    def test_check_mdmf_not_enough_shares(self):
3437+        d = self.publish_mdmf()
3438+        def _then(ignored):
3439+            for shares in self._storage._peers.values():
3440+                for shnum in shares.keys():
3441+                    if shnum > 0:
3442+                        del shares[shnum]
3443+        d.addCallback(_then)
3444+        d.addCallback(lambda ignored:
3445+            self._fn.check(Monitor()))
3446+        d.addCallback(self.check_bad, "test_check_mdmf_not_enough_shares")
3447+        return d
3448+
3449+
3450     def test_check_all_bad_sig(self):
3451         d = corrupt(None, self._storage, 1) # bad sig
3452         d.addCallback(lambda ignored:
3453hunk ./src/allmydata/test/test_mutable.py 1495
3454         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3455         return d
3456 
3457+    def test_check_mdmf_all_bad_sig(self):
3458+        d = self.publish_mdmf()
3459+        d.addCallback(lambda ignored:
3460+            corrupt(None, self._storage, 1))
3461+        d.addCallback(lambda ignored:
3462+            self._fn.check(Monitor()))
3463+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3464+        return d
3465+
3466     def test_check_all_bad_blocks(self):
3467         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3468         # the Checker won't notice this.. it doesn't look at actual data
3469hunk ./src/allmydata/test/test_mutable.py 1512
3470         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3471         return d
3472 
3473+
3474+    def test_check_mdmf_all_bad_blocks(self):
3475+        d = self.publish_mdmf()
3476+        d.addCallback(lambda ignored:
3477+            corrupt(None, self._storage, "share_data"))
3478+        d.addCallback(lambda ignored:
3479+            self._fn.check(Monitor()))
3480+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3481+        return d
3482+
3483     def test_verify_good(self):
3484         d = self._fn.check(Monitor(), verify=True)
3485         d.addCallback(self.check_good, "test_verify_good")
3486hunk ./src/allmydata/test/test_mutable.py 1582
3487                       "test_verify_one_bad_encprivkey_uncheckable")
3488         return d
3489 
3490+
3491+    def test_verify_mdmf_good(self):
3492+        d = self.publish_mdmf()
3493+        d.addCallback(lambda ignored:
3494+            self._fn.check(Monitor(), verify=True))
3495+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3496+        return d
3497+
3498+
3499+    def test_verify_mdmf_one_bad_block(self):
3500+        d = self.publish_mdmf()
3501+        d.addCallback(lambda ignored:
3502+            corrupt(None, self._storage, "share_data", [1]))
3503+        d.addCallback(lambda ignored:
3504+            self._fn.check(Monitor(), verify=True))
3505+        # We should find one bad block here
3506+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3507+        d.addCallback(self.check_expected_failure,
3508+                      CorruptShareError, "block hash tree failure",
3509+                      "test_verify_mdmf_one_bad_block")
3510+        return d
3511+
3512+
3513+    def test_verify_mdmf_bad_encprivkey(self):
3514+        d = self.publish_mdmf()
3515+        d.addCallback(lambda ignored:
3516+            corrupt(None, self._storage, "enc_privkey", [1]))
3517+        d.addCallback(lambda ignored:
3518+            self._fn.check(Monitor(), verify=True))
3519+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3520+        d.addCallback(self.check_expected_failure,
3521+                      CorruptShareError, "privkey",
3522+                      "test_verify_mdmf_bad_encprivkey")
3523+        return d
3524+
3525+
3526+    def test_verify_mdmf_bad_sig(self):
3527+        d = self.publish_mdmf()
3528+        d.addCallback(lambda ignored:
3529+            corrupt(None, self._storage, 1, [1]))
3530+        d.addCallback(lambda ignored:
3531+            self._fn.check(Monitor(), verify=True))
3532+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3533+        return d
3534+
3535+
3536+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3537+        d = self.publish_mdmf()
3538+        d.addCallback(lambda ignored:
3539+            corrupt(None, self._storage, "enc_privkey", [1]))
3540+        d.addCallback(lambda ignored:
3541+            self._fn.get_readonly())
3542+        d.addCallback(lambda fn:
3543+            fn.check(Monitor(), verify=True))
3544+        d.addCallback(self.check_good,
3545+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3546+        return d
3547+
3548+
3549 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3550 
3551     def get_shares(self, s):
3552hunk ./src/allmydata/test/test_mutable.py 1706
3553         current_shares = self.old_shares[-1]
3554         self.failUnlessEqual(old_shares, current_shares)
3555 
3556+
3557     def test_unrepairable_0shares(self):
3558         d = self.publish_one()
3559         def _delete_all_shares(ign):
3560hunk ./src/allmydata/test/test_mutable.py 1721
3561         d.addCallback(_check)
3562         return d
3563 
3564+    def test_mdmf_unrepairable_0shares(self):
3565+        d = self.publish_mdmf()
3566+        def _delete_all_shares(ign):
3567+            shares = self._storage._peers
3568+            for peerid in shares:
3569+                shares[peerid] = {}
3570+        d.addCallback(_delete_all_shares)
3571+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3572+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3573+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3574+        return d
3575+
3576+
3577     def test_unrepairable_1share(self):
3578         d = self.publish_one()
3579         def _delete_all_shares(ign):
3580hunk ./src/allmydata/test/test_mutable.py 1750
3581         d.addCallback(_check)
3582         return d
3583 
3584+    def test_mdmf_unrepairable_1share(self):
3585+        d = self.publish_mdmf()
3586+        def _delete_all_shares(ign):
3587+            shares = self._storage._peers
3588+            for peerid in shares:
3589+                for shnum in list(shares[peerid]):
3590+                    if shnum > 0:
3591+                        del shares[peerid][shnum]
3592+        d.addCallback(_delete_all_shares)
3593+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3594+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3595+        def _check(crr):
3596+            self.failUnlessEqual(crr.get_successful(), False)
3597+        d.addCallback(_check)
3598+        return d
3599+
3600+    def test_repairable_5shares(self):
3601+        d = self.publish_mdmf()
3602+        def _delete_all_shares(ign):
3603+            shares = self._storage._peers
3604+            for peerid in shares:
3605+                for shnum in list(shares[peerid]):
3606+                    if shnum > 4:
3607+                        del shares[peerid][shnum]
3608+        d.addCallback(_delete_all_shares)
3609+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3610+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3611+        def _check(crr):
3612+            self.failUnlessEqual(crr.get_successful(), True)
3613+        d.addCallback(_check)
3614+        return d
3615+
3616+    def test_mdmf_repairable_5shares(self):
3617+        d = self.publish_mdmf()
3618+        def _delete_all_shares(ign):
3619+            shares = self._storage._peers
3620+            for peerid in shares:
3621+                for shnum in list(shares[peerid]):
3622+                    if shnum > 5:
3623+                        del shares[peerid][shnum]
3624+        d.addCallback(_delete_all_shares)
3625+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3626+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3627+        def _check(crr):
3628+            self.failUnlessEqual(crr.get_successful(), True)
3629+        d.addCallback(_check)
3630+        return d
3631+
3632+
3633     def test_merge(self):
3634         self.old_shares = []
3635         d = self.publish_multiple()
3636}
3637[mutable/retrieve.py: learn how to verify mutable files
3638Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3639 Ignore-this: 989af7800c47589620918461ec989483
3640] {
3641hunk ./src/allmydata/mutable/retrieve.py 86
3642     # Retrieve object will remain tied to a specific version of the file, and
3643     # will use a single ServerMap instance.
3644 
3645-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3646+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3647+                 verify=False):
3648         self._node = filenode
3649         assert self._node.get_pubkey()
3650         self._storage_index = filenode.get_storage_index()
3651hunk ./src/allmydata/mutable/retrieve.py 106
3652         # during repair, we may be called upon to grab the private key, since
3653         # it wasn't picked up during a verify=False checker run, and we'll
3654         # need it for repair to generate a new version.
3655-        self._need_privkey = fetch_privkey
3656-        if self._node.get_privkey():
3657+        self._need_privkey = fetch_privkey or verify
3658+        if self._node.get_privkey() and not verify:
3659             self._need_privkey = False
3660 
3661         if self._need_privkey:
3662hunk ./src/allmydata/mutable/retrieve.py 117
3663             self._privkey_query_markers = [] # one Marker for each time we've
3664                                              # tried to get the privkey.
3665 
3666+        # verify means that we are using the downloader logic to verify all
3667+        # of our shares. This tells the downloader a few things.
3668+        #
3669+        # 1. We need to download all of the shares.
3670+        # 2. We don't need to decode or decrypt the shares, since our
3671+        #    caller doesn't care about the plaintext, only the
3672+        #    information about which shares are or are not valid.
3673+        # 3. When we are validating readers, we need to validate the
3673+        #    signature on the prefix. Do we need to? Don't we already do
3674+        #    this in the servermap update?
3676+        #
3677+        # (just work on 1 and 2 for now, I guess)
3678+        self._verify = False
3679+        if verify:
3680+            self._verify = True
3681+
3682         self._status = RetrieveStatus()
3683         self._status.set_storage_index(self._storage_index)
3684         self._status.set_helper(False)
3685hunk ./src/allmydata/mutable/retrieve.py 323
3686 
3687         # We need at least self._required_shares readers to download a
3688         # segment.
3689-        needed = self._required_shares - len(self._active_readers)
3690+        if self._verify:
3691+            needed = self._total_shares
3692+        else:
3693+            needed = self._required_shares - len(self._active_readers)
3694         # XXX: Why don't format= log messages work here?
3695         self.log("adding %d peers to the active peers list" % needed)
3696 
3697hunk ./src/allmydata/mutable/retrieve.py 339
3698         # will cause problems later.
3699         active_shnums -= set([reader.shnum for reader in self._active_readers])
3700         active_shnums = list(active_shnums)[:needed]
3701-        if len(active_shnums) < needed:
3702+        if len(active_shnums) < needed and not self._verify:
3703             # We don't have enough readers to retrieve the file; fail.
3704             return self._failed()
3705 
3706hunk ./src/allmydata/mutable/retrieve.py 346
3707         for shnum in active_shnums:
3708             self._active_readers.append(self.readers[shnum])
3709             self.log("added reader for share %d" % shnum)
3710-        assert len(self._active_readers) == self._required_shares
3711+        assert len(self._active_readers) >= self._required_shares
3712         # Conceptually, this is part of the _add_active_peers step. It
3713         # validates the prefixes of newly added readers to make sure
3714         # that they match what we are expecting for self.verinfo. If
3715hunk ./src/allmydata/mutable/retrieve.py 416
3716                     # that we haven't gotten it at the end of
3717                     # segment decoding, then we'll take more drastic
3718                     # measures.
3719-                    if self._need_privkey:
3720+                    if self._need_privkey and not self._node.is_readonly():
3721                         d = reader.get_encprivkey()
3722                         d.addCallback(self._try_to_validate_privkey, reader)
3723             if bad_readers:
3724hunk ./src/allmydata/mutable/retrieve.py 423
3725                 # We do them all at once, or else we screw up list indexing.
3726                 for (reader, f) in bad_readers:
3727                     self._mark_bad_share(reader, f)
3728-                return self._add_active_peers()
3729+                if self._verify:
3730+                    if len(self._active_readers) >= self._required_shares:
3731+                        return self._download_current_segment()
3732+                    else:
3733+                        return self._failed()
3734+                else:
3735+                    return self._add_active_peers()
3736             else:
3737                 return self._download_current_segment()
3738             # The next step will assert that it has enough active
3739hunk ./src/allmydata/mutable/retrieve.py 518
3740         """
3741         self.log("marking share %d on server %s as bad" % \
3742                  (reader.shnum, reader))
3743+        prefix = self.verinfo[-2]
3744+        self.servermap.mark_bad_share(reader.peerid,
3745+                                      reader.shnum,
3746+                                      prefix)
3747         self._remove_reader(reader)
3748hunk ./src/allmydata/mutable/retrieve.py 523
3749-        self._bad_shares.add((reader.peerid, reader.shnum))
3750+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3751         self._status.problems[reader.peerid] = f
3752         self._last_failure = f
3753         self.notify_server_corruption(reader.peerid, reader.shnum,
3754hunk ./src/allmydata/mutable/retrieve.py 571
3755             ds.append(dl)
3756             reader.flush()
3757         dl = defer.DeferredList(ds)
3758-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3759+        if self._verify:
3760+            dl.addCallback(lambda ignored: "")
3761+            dl.addCallback(self._set_segment)
3762+        else:
3763+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3764         return dl
3765 
3766 
3767hunk ./src/allmydata/mutable/retrieve.py 701
3768         # shnum, which will be a leaf in the share hash tree, which
3769         # will allow us to validate the rest of the tree.
3770         if self.share_hash_tree.needed_hashes(reader.shnum,
3771-                                               include_leaf=True):
3772+                                              include_leaf=True) or \
3773+                                              self._verify:
3774             try:
3775                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3776                                             leaves={reader.shnum: bht[0]})
3777hunk ./src/allmydata/mutable/retrieve.py 832
3778 
3779 
3780     def _try_to_validate_privkey(self, enc_privkey, reader):
3781-
3782         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3783         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3784         if alleged_writekey != self._node.get_writekey():
3785hunk ./src/allmydata/mutable/retrieve.py 838
3786             self.log("invalid privkey from %s shnum %d" %
3787                      (reader, reader.shnum),
3788                      level=log.WEIRD, umid="YIw4tA")
3789+            if self._verify:
3790+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3791+                                              self.verinfo[-2])
3792+                e = CorruptShareError(reader.peerid,
3793+                                      reader.shnum,
3794+                                      "invalid privkey")
3795+                f = failure.Failure(e)
3796+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3797             return
3798 
3799         # it's good
3800hunk ./src/allmydata/mutable/retrieve.py 904
3801         statements, I return the decrypted contents to the owner of this
3802         Retrieve object through self._done_deferred.
3803         """
3804-        eventually(self._done_deferred.callback, self._plaintext)
3805+        if self._verify:
3806+            ret = list(self._bad_shares)
3807+            self.log("done verifying, found %d bad shares" % len(ret))
3808+        else:
3809+            ret = self._plaintext
3810+        eventually(self._done_deferred.callback, ret)
3811 
3812 
3813     def _failed(self):
3814hunk ./src/allmydata/mutable/retrieve.py 920
3815         to the caller of this Retrieve object through
3816         self._done_deferred.
3817         """
3818-        format = ("ran out of peers: "
3819-                  "have %(have)d of %(total)d segments "
3820-                  "found %(bad)d bad shares "
3821-                  "encoding %(k)d-of-%(n)d")
3822-        args = {"have": self._current_segment,
3823-                "total": self._num_segments,
3824-                "k": self._required_shares,
3825-                "n": self._total_shares,
3826-                "bad": len(self._bad_shares)}
3827-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3828-                                                        str(self._last_failure)))
3829-        f = failure.Failure(e)
3830-        eventually(self._done_deferred.callback, f)
3831+        if self._verify:
3832+            ret = list(self._bad_shares)
3833+        else:
3834+            format = ("ran out of peers: "
3835+                      "have %(have)d of %(total)d segments "
3836+                      "found %(bad)d bad shares "
3837+                      "encoding %(k)d-of-%(n)d")
3838+            args = {"have": self._current_segment,
3839+                    "total": self._num_segments,
3840+                    "k": self._required_shares,
3841+                    "n": self._total_shares,
3842+                    "bad": len(self._bad_shares)}
3843+            e = NotEnoughSharesError("%s, last failure: %s" % \
3844+                                     (format % args, str(self._last_failure)))
3845+            f = failure.Failure(e)
3846+            ret = f
3847+        eventually(self._done_deferred.callback, ret)
3848}
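As a rough sketch of how this verify mode is meant to be driven (assuming Retrieve's usual download() entry point; the helper and callback names below are illustrative and not part of this patch), a caller such as the mutable checker might do:

    r = Retrieve(filenode, servermap, verinfo, verify=True)
    d = r.download()
    def _verified(bad_shares):
        # in verify mode the result is a list of (peerid, shnum, failure)
        # tuples, one per share that failed validation
        for (peerid, shnum, f) in bad_shares:
            report_corruption(peerid, shnum, f)   # hypothetical helper
    d.addCallback(_verified)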
3849[interfaces.py: add IMutableSlotWriter
3850Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3851 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3852] hunk ./src/allmydata/interfaces.py 418
3853         """
3854 
3855 
3856+class IMutableSlotWriter(Interface):
3857+    """
3858+    The interface for a writer around a mutable slot on a remote server.
3859+    """
3860+    def set_checkstring(checkstring, *args):
3861+        """
3862+        Set the checkstring that I will pass to the remote server when
3863+        writing.
3864+
3865+            @param checkstring: A packed checkstring to use.
3866+
3867+        Note that implementations can differ in which semantics they
3868+        wish to support for set_checkstring -- they can, for example,
3869+        build the checkstring themselves from its constituents, or
3870+        some other thing.
3871+        """
3872+
3873+    def get_checkstring():
3874+        """
3875+        Get the checkstring that I think currently exists on the remote
3876+        server.
3877+        """
3878+
3879+    def put_block(data, segnum, salt):
3880+        """
3881+        Add a block and salt to the share.
3882+        """
3883+
3884+    def put_encprivkey(encprivkey):
3885+        """
3886+        Add the encrypted private key to the share.
3887+        """
3888+
3889+    def put_blockhashes(blockhashes=list):
3890+        """
3891+        Add the block hash tree to the share.
3892+        """
3893+
3894+    def put_sharehashes(sharehashes=dict):
3895+        """
3896+        Add the share hash chain to the share.
3897+        """
3898+
3899+    def get_signable():
3900+        """
3901+        Return the part of the share that needs to be signed.
3902+        """
3903+
3904+    def put_signature(signature):
3905+        """
3906+        Add the signature to the share.
3907+        """
3908+
3909+    def put_verification_key(verification_key):
3910+        """
3911+        Add the verification key to the share.
3912+        """
3913+
3914+    def finish_publishing():
3915+        """
3916+        Do anything necessary to finish writing the share to a remote
3917+        server. I require that no further publishing needs to take place
3918+        after this method has been called.
3919+        """
3920+
3921+
3922 class IURI(Interface):
3923     def init_from_string(uri):
3924         """Accept a string (as created by my to_string() method) and populate
3925[test/test_mutable.py: temporarily disable two tests that are now irrelevant
3926Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
3927 Ignore-this: 701e143567f3954812ca6960af1d6ac7
3928] {
3929hunk ./src/allmydata/test/test_mutable.py 651
3930             self.failUnlessEqual(len(share_ids), 10)
3931         d.addCallback(_done)
3932         return d
3933+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
3934 
3935     def test_generate(self):
3936         nm = make_nodemaker()
3937hunk ./src/allmydata/test/test_mutable.py 713
3938                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
3939         d.addCallback(_generated)
3940         return d
3941+    test_generate.todo = "Write an equivalent of this for the new uploader"
3942 
3943     # TODO: when we publish to 20 peers, we should get one share per peer on 10
3944     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
3945}
3946[Add MDMF reader and writer, and SDMF writer
3947Kevan Carstensen <kevan@isnotajoke.com>**20100702225531
3948 Ignore-this: bf6276a91d27dcb4e779b0eb82ea1843
3949 
3950 The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
3951 object proxies that exist for immutable files. They abstract away
3952 details of connection, state, and caching from their callers (in this
3953 case, the download, servermap updater, and uploader), and expose methods
3954 to get and set information on the remote server.
3955 
3956 MDMFSlotReadProxy reads a mutable file from the server, doing the right
3957 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
3958 allows callers to tell it how to batch and flush reads.
3959 
3960 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
3961 
3962 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
3963 
3964 This patch also includes tests for MDMFSlotReadProxy,
3965 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
3966] {
3967hunk ./src/allmydata/mutable/layout.py 4
3968 
3969 import struct
3970 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
3971+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
3972+                                 MDMF_VERSION, IMutableSlotWriter
3973+from allmydata.util import mathutil, observer
3974+from twisted.python import failure
3975+from twisted.internet import defer
3976+from zope.interface import implements
3977+
3978+
3979+# These strings describe the format of the packed structs they help process
3980+# Here's what they mean:
3981+#
3982+#  PREFIX:
3983+#    >: Big-endian byte order; the most significant byte is first (leftmost).
3984+#    B: The version information; an 8 bit version identifier. Stored as
3985+#       an unsigned char. This is currently 00 00 00 00; our modifications
3986+#       will turn it into 00 00 00 01.
3987+#    Q: The sequence number; this is sort of like a revision history for
3988+#       mutable files; they start at 1 and increase as they are changed after
3989+#       being uploaded. Stored as an unsigned long long, which is 8 bytes in
3990+#       length.
3991+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
3992+#       characters = 32 bytes to store the value.
3993+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
3994+#       16 characters.
3995+#
3996+#  SIGNED_PREFIX additions, things that are covered by the signature:
3997+#    B: The "k" encoding parameter. We store this as an 8-bit character,
3998+#       which is convenient because our erasure coding scheme cannot
3999+#       encode if you ask for more than 255 pieces.
4000+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4001+#       same reasons as above.
4002+#    Q: The segment size of the uploaded file. This will essentially be the
4003+#       length of the file in SDMF. An unsigned long long, so we can store
4004+#       files of quite large size.
4005+#    Q: The data length of the uploaded file. Modulo padding, this will be
4006+#       the same as the segment size field. Like the segment size field, it
4007+#       is an unsigned long long and can be quite large.
4008+#
4009+#   HEADER additions:
4010+#     L: The offset of the signature of this. An unsigned long.
4011+#     L: The offset of the share hash chain. An unsigned long.
4012+#     L: The offset of the block hash tree. An unsigned long.
4013+#     L: The offset of the share data. An unsigned long.
4014+#     Q: The offset of the encrypted private key. An unsigned long long, to
4015+#        account for the possibility of a lot of share data.
4016+#     Q: The offset of the EOF. An unsigned long long, to account for the
4017+#        possibility of a lot of share data.
4018+#
4019+#  After all of these, we have the following:
4020+#    - The verification key: Occupies the space between the end of the header
4021+#      and the start of the signature (i.e.: data[HEADER_LENGTH:o['signature']]).
4022+#    - The signature, which goes from the signature offset to the share hash
4023+#      chain offset.
4024+#    - The share hash chain, which goes from the share hash chain offset to
4025+#      the block hash tree offset.
4026+#    - The share data, which goes from the share data offset to the encrypted
4027+#      private key offset.
4028+#    - The encrypted private key, which goes from its offset to the end of the file.
4029+#
4030+#  The block hash tree in this encoding has only one leaf, so the offset of
4031+#  the share data will be 32 bytes more than the offset of the block hash tree.
4032+#  Given this, we may need to check to see how many bytes a reasonably sized
4033+#  block hash tree will take up.
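+#
+#  As a rough illustration (a sketch only, not part of the layout
+#  description above), the signed prefix can be packed and unpacked with
+#  the format strings defined below:
+#
+#      prefix = struct.pack(SIGNED_PREFIX, 0, seqnum, root_hash, IV,
+#                           k, N, segsize, datalen)
+#      assert len(prefix) == SIGNED_PREFIX_LENGTH
+#      fields = struct.unpack(SIGNED_PREFIX, prefix)
+#
+#  where seqnum, root_hash, IV, k, N, segsize and datalen hold the values
+#  described above.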
4034 
4035 PREFIX = ">BQ32s16s" # each version has a different prefix
4036 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
4037hunk ./src/allmydata/mutable/layout.py 73
4038 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4039 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4040 HEADER_LENGTH = struct.calcsize(HEADER)
4041+OFFSETS = ">LLLLQQ"
4042+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4043 
4044 def unpack_header(data):
4045     o = {}
4046hunk ./src/allmydata/mutable/layout.py 194
4047     return (share_hash_chain, block_hash_tree, share_data)
4048 
4049 
4050-def pack_checkstring(seqnum, root_hash, IV):
4051+def pack_checkstring(seqnum, root_hash, IV, version=0):
4052     return struct.pack(PREFIX,
4053hunk ./src/allmydata/mutable/layout.py 196
4054-                       0, # version,
4055+                       version,
4056                        seqnum,
4057                        root_hash,
4058                        IV)
4059hunk ./src/allmydata/mutable/layout.py 269
4060                            encprivkey])
4061     return final_share
4062 
4063+def pack_prefix(seqnum, root_hash, IV,
4064+                required_shares, total_shares,
4065+                segment_size, data_length):
4066+    prefix = struct.pack(SIGNED_PREFIX,
4067+                         0, # version,
4068+                         seqnum,
4069+                         root_hash,
4070+                         IV,
4071+                         required_shares,
4072+                         total_shares,
4073+                         segment_size,
4074+                         data_length,
4075+                         )
4076+    return prefix
4077+
4078+
4079+class SDMFSlotWriteProxy:
4080+    implements(IMutableSlotWriter)
4081+    """
4082+    I represent a remote write slot for an SDMF mutable file. I build a
4083+    share in memory, and then write it in one piece to the remote
4084+    server. This mimics how SDMF shares were built before MDMF (and the
4085+    new MDMF uploader), but provides that functionality in a way that
4086+    allows the MDMF uploader to be built without much special-casing for
4087+    file format, which makes the uploader code more readable.
4088+    """
4089+    def __init__(self,
4090+                 shnum,
4091+                 rref, # a remote reference to a storage server
4092+                 storage_index,
4093+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4094+                 seqnum, # the sequence number of the mutable file
4095+                 required_shares,
4096+                 total_shares,
4097+                 segment_size,
4098+                 data_length): # the length of the original file
4099+        self.shnum = shnum
4100+        self._rref = rref
4101+        self._storage_index = storage_index
4102+        self._secrets = secrets
4103+        self._seqnum = seqnum
4104+        self._required_shares = required_shares
4105+        self._total_shares = total_shares
4106+        self._segment_size = segment_size
4107+        self._data_length = data_length
4108+
4109+        # This is an SDMF file, so it should have only one segment, so,
4110+        # modulo padding of the data length, the segment size and the
4111+        # data length should be the same.
4112+        expected_segment_size = mathutil.next_multiple(data_length,
4113+                                                       self._required_shares)
4114+        assert expected_segment_size == segment_size
4115+
4116+        self._block_size = self._segment_size / self._required_shares
4117+
4118+        # This is meant to mimic how SDMF files were built before MDMF
4119+        # entered the picture: we generate each share in its entirety,
4120+        # then push it off to the storage server in one write. When
4121+        # callers call set_*, they are just populating this dict.
4122+        # finish_publishing will stitch these pieces together into a
4123+        # coherent share, and then write the coherent share to the
4124+        # storage server.
4125+        self._share_pieces = {}
4126+
4127+        # This tells the write logic what checkstring to use when
4128+        # writing remote shares.
4129+        self._testvs = []
4130+
4131+        self._readvs = [(0, struct.calcsize(PREFIX))]
4132+
4133+
4134+    def set_checkstring(self, checkstring_or_seqnum,
4135+                              root_hash=None,
4136+                              salt=None):
4137+        """
4138+        Set the checkstring that I will pass to the remote server when
4139+        writing.
4140+
4141+            @param checkstring_or_seqnum: A packed checkstring, or a seqnum.
4142+                   I treat it as a checkstring unless root_hash and salt are given.
4143+
4144+        Note that implementations can differ in which semantics they
4145+        wish to support for set_checkstring -- they can, for example,
4146+        build the checkstring themselves from its constituents, or
4147+        some other thing.
4148+        """
4149+        if root_hash and salt:
4150+            checkstring = struct.pack(PREFIX,
4151+                                      0,
4152+                                      checkstring_or_seqnum,
4153+                                      root_hash,
4154+                                      salt)
4155+        else:
4156+            checkstring = checkstring_or_seqnum
4157+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
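+        # (A test vector is an (offset, length, operator, specimen) tuple;
+        # the storage server applies the write only if reading `length`
+        # bytes at `offset` compares as specified against the specimen.)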
4158+
4159+
4160+    def get_checkstring(self):
4161+        """
4162+        Get the checkstring that I think currently exists on the remote
4163+        server.
4164+        """
4165+        if self._testvs:
4166+            return self._testvs[0][3]
4167+        return ""
4168+
4169+
4170+    def put_block(self, data, segnum, salt):
4171+        """
4172+        Add a block and salt to the share.
4173+        """
4174+        # SDMF files have only one segment
4175+        assert segnum == 0
4176+        assert len(data) == self._block_size
4177+        assert len(salt) == SALT_SIZE
4178+
4179+        self._share_pieces['sharedata'] = data
4180+        self._share_pieces['salt'] = salt
4181+
4182+        # TODO: Figure out something intelligent to return.
4183+        return defer.succeed(None)
4184+
4185+
4186+    def put_encprivkey(self, encprivkey):
4187+        """
4188+        Add the encrypted private key to the share.
4189+        """
4190+        self._share_pieces['encprivkey'] = encprivkey
4191+
4192+        return defer.succeed(None)
4193+
4194+
4195+    def put_blockhashes(self, blockhashes):
4196+        """
4197+        Add the block hash tree to the share.
4198+        """
4199+        assert isinstance(blockhashes, list)
4200+        for h in blockhashes:
4201+            assert len(h) == HASH_SIZE
4202+
4203+        # serialize the blockhashes, then set them.
4204+        blockhashes_s = "".join(blockhashes)
4205+        self._share_pieces['block_hash_tree'] = blockhashes_s
4206+
4207+        return defer.succeed(None)
4208+
4209+
4210+    def put_sharehashes(self, sharehashes):
4211+        """
4212+        Add the share hash chain to the share.
4213+        """
4214+        assert isinstance(sharehashes, dict)
4215+        for h in sharehashes.itervalues():
4216+            assert len(h) == HASH_SIZE
4217+
4218+        # serialize the sharehashes, then set them.
4219+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4220+                                 for i in sorted(sharehashes.keys())])
4221+        self._share_pieces['share_hash_chain'] = sharehashes_s
4222+
4223+        return defer.succeed(None)
4224+
4225+
4226+    def put_root_hash(self, root_hash):
4227+        """
4228+        Add the root hash to the share.
4229+        """
4230+        assert len(root_hash) == HASH_SIZE
4231+
4232+        self._share_pieces['root_hash'] = root_hash
4233+
4234+        return defer.succeed(None)
4235+
4236+
4237+    def put_salt(self, salt):
4238+        """
4239+        Add a salt to an empty SDMF file.
4240+        """
4241+        assert len(salt) == SALT_SIZE
4242+
4243+        self._share_pieces['salt'] = salt
4244+        self._share_pieces['sharedata'] = ""
4245+
4246+
4247+    def get_signable(self):
4248+        """
4249+        Return the part of the share that needs to be signed.
4250+
4251+        SDMF writers need to sign the packed representation of the
4252+        first eight fields of the remote share, that is:
4253+            - version number (0)
4254+            - sequence number
4255+            - root of the share hash tree
4256+            - salt
4257+            - k
4258+            - n
4259+            - segsize
4260+            - datalen
4261+
4262+        This method is responsible for returning that to callers.
4263+        """
4264+        return struct.pack(SIGNED_PREFIX,
4265+                           0,
4266+                           self._seqnum,
4267+                           self._share_pieces['root_hash'],
4268+                           self._share_pieces['salt'],
4269+                           self._required_shares,
4270+                           self._total_shares,
4271+                           self._segment_size,
4272+                           self._data_length)
4273+
4274+
4275+    def put_signature(self, signature):
4276+        """
4277+        Add the signature to the share.
4278+        """
4279+        self._share_pieces['signature'] = signature
4280+
4281+        return defer.succeed(None)
4282+
4283+
4284+    def put_verification_key(self, verification_key):
4285+        """
4286+        Add the verification key to the share.
4287+        """
4288+        self._share_pieces['verification_key'] = verification_key
4289+
4290+        return defer.succeed(None)
4291+
4292+
4293+    def get_verinfo(self):
4294+        """
4295+        I return my verinfo tuple. This is used by the ServermapUpdater
4296+        to keep track of versions of mutable files.
4297+
4298+        The verinfo tuple for MDMF files contains:
4299+            - seqnum
4300+            - root hash
4301+            - a blank (nothing)
4302+            - segsize
4303+            - datalen
4304+            - k
4305+            - n
4306+            - prefix (the thing that you sign)
4307+            - a tuple of offsets
4308+
4309+        We include the nonce in MDMF to simplify processing of version
4310+        information tuples.
4311+
4312+        The verinfo tuple for SDMF files is the same, but contains a
4313+        16-byte IV instead of a hash of salts.
4314+        """
4315+        return (self._seqnum,
4316+                self._share_pieces['root_hash'],
4317+                self._share_pieces['salt'],
4318+                self._segment_size,
4319+                self._data_length,
4320+                self._required_shares,
4321+                self._total_shares,
4322+                self.get_signable(),
4323+                self._get_offsets_tuple())
4324+
4325+    def _get_offsets_dict(self):
4326+        post_offset = HEADER_LENGTH
4327+        offsets = {}
4328+
4329+        verification_key_length = len(self._share_pieces['verification_key'])
4330+        o1 = offsets['signature'] = post_offset + verification_key_length
4331+
4332+        signature_length = len(self._share_pieces['signature'])
4333+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4334+
4335+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4336+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4337+
4338+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4339+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4340+
4341+        share_data_length = len(self._share_pieces['sharedata'])
4342+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4343+
4344+        encprivkey_length = len(self._share_pieces['encprivkey'])
4345+        offsets['EOF'] = o5 + encprivkey_length
4346+        return offsets
4347+
4348+
4349+    def _get_offsets_tuple(self):
4350+        offsets = self._get_offsets_dict()
4351+        return tuple([(key, value) for key, value in offsets.items()])
4352+
4353+
4354+    def _pack_offsets(self):
4355+        offsets = self._get_offsets_dict()
4356+        return struct.pack(">LLLLQQ",
4357+                           offsets['signature'],
4358+                           offsets['share_hash_chain'],
4359+                           offsets['block_hash_tree'],
4360+                           offsets['share_data'],
4361+                           offsets['enc_privkey'],
4362+                           offsets['EOF'])
4363+
4364+
4365+    def finish_publishing(self):
4366+        """
4367+        Do anything necessary to finish writing the share to a remote
4368+        server. I require that no further publishing needs to take place
4369+        after this method has been called.
4370+        """
4371+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4372+                  "share_hash_chain", "block_hash_tree"]:
4373+            assert k in self._share_pieces
4374+        # This is the only method that actually writes something to the
4375+        # remote server.
4376+        # First, we need to pack the share into data that we can write
4377+        # to the remote server in one write.
4378+        offsets = self._pack_offsets()
4379+        prefix = self.get_signable()
4380+        final_share = "".join([prefix,
4381+                               offsets,
4382+                               self._share_pieces['verification_key'],
4383+                               self._share_pieces['signature'],
4384+                               self._share_pieces['share_hash_chain'],
4385+                               self._share_pieces['block_hash_tree'],
4386+                               self._share_pieces['sharedata'],
4387+                               self._share_pieces['encprivkey']])
4388+
4389+        # Our only data vector is going to be writing the final share,
4390+        # in its entirety.
4391+        datavs = [(0, final_share)]
4392+
4393+        if not self._testvs:
4394+            # Our caller has not provided us with another checkstring
4395+            # yet, so we assume that we are writing a new share, and set
4396+            # a test vector that will allow a new share to be written.
4397+            self._testvs = []
4398+            self._testvs.append(tuple([0, 1, "eq", ""]))
4399+            new_share = True
4400+
4401+        tw_vectors = {}
4402+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4403+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4404+                                     self._storage_index,
4405+                                     self._secrets,
4406+                                     tw_vectors,
4407+                                     # TODO is it useful to read something?
4408+                                     self._readvs)
4409+
4410+
4411+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4412+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4413+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4414+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4415+MDMFCHECKSTRING = ">BQ32s"
4416+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4417+MDMFOFFSETS = ">QQQQQQ"
4418+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4419+
4420+class MDMFSlotWriteProxy:
4421+    implements(IMutableSlotWriter)
4422+
4423+    """
4424+    I represent a remote write slot for an MDMF mutable file.
4425+
4426+    I abstract away from my caller the details of block and salt
4427+    management, and the implementation of the on-disk format for MDMF
4428+    shares.
4429+    """
4430+    # Expected layout, MDMF:
4431+    # offset:     size:       name:
4432+    #-- signed part --
4433+    # 0           1           version number (01)
4434+    # 1           8           sequence number
4435+    # 9           32          share tree root hash
4436+    # 41          1           The "k" encoding parameter
4437+    # 42          1           The "N" encoding parameter
4438+    # 43          8           The segment size of the uploaded file
4439+    # 51          8           The data length of the original plaintext
4440+    #-- end signed part --
4441+    # 59          8           The offset of the encrypted private key
4442+    # 67          8           The offset of the block hash tree
4443+    # 75          8           The offset of the share hash chain
4444+    # 83          8           The offset of the signature
4445+    # 91          8           The offset of the verification key
4446+    # 99          8           The offset of the EOF
4447+    #
4448+    # followed by salts and share data, the encrypted private key, the
4449+    # block hash tree, the salt hash tree, the share hash chain, a
4450+    # signature over the first eight fields, and a verification key.
4451+    #
4452+    # The checkstring is the first three fields -- the version number,
4453+    # sequence number, root hash and root salt hash. This is consistent
4454+    # in meaning to what we have with SDMF files, except now instead of
4455+    # using the literal salt, we use a value derived from all of the
4456+    # salts -- the share hash root.
4457+    #
4458+    # The salt is stored before the block for each segment. The block
4459+    # hash tree is computed over the combination of block and salt for
4460+    # each segment. In this way, we get integrity checking for both
4461+    # block and salt with the current block hash tree arrangement.
4462+    #
4463+    # The ordering of the offsets is different to reflect the dependencies
4464+    # that we'll run into with an MDMF file. The expected write flow is
4465+    # something like this:
4466+    #
4467+    #   0: Initialize with the sequence number, encoding parameters and
4468+    #      data length. From this, we can deduce the number of segments,
4469+    #      and where they should go.. We can also figure out where the
4470+    #      encrypted private key should go, because we can figure out how
4471+    #      big the share data will be.
4472+    #
4473+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4474+    #      like
4475+    #
4476+    #       put_block(data, segnum, salt)
4477+    #
4478+    #      to write a block and a salt to the disk. We can do both of
4479+    #      these operations now because we have enough of the offsets to
4480+    #      know where to put them.
4481+    #
4482+    #   2: Put the encrypted private key. Use:
4483+    #
4484+    #        put_encprivkey(encprivkey)
4485+    #
4486+    #      Now that we know the length of the private key, we can fill
4487+    #      in the offset for the block hash tree.
4488+    #
4489+    #   3: We're now in a position to upload the block hash tree for
4490+    #      a share. Put that using something like:
4491+    #       
4492+    #        put_blockhashes(block_hash_tree)
4493+    #
4494+    #      Note that block_hash_tree is a list of hashes -- we'll take
4495+    #      care of the details of serializing that appropriately. When
4496+    #      we get the block hash tree, we are also in a position to
4497+    #      calculate the offset for the share hash chain, and fill that
4498+    #      into the offsets table.
4499+    #
4500+    #   4: At the same time, we're in a position to upload the salt hash
4501+    #      tree. This is a Merkle tree over all of the salts. We use a
4502+    #      Merkle tree so that we can validate each block,salt pair as
4503+    #      we download them later. We do this using
4504+    #
4505+    #        put_salthashes(salt_hash_tree)
4506+    #
4507+    #      When you do this, I automatically put the root of the tree
4508+    #      (the hash at index 0 of the list) in its appropriate slot in
4509+    #      the signed prefix of the share.
4510+    #
4511+    #   5: We're now in a position to upload the share hash chain for
4512+    #      a share. Do that with something like:
4513+    #     
4514+    #        put_sharehashes(share_hash_chain)
4515+    #
4516+    #      share_hash_chain should be a dictionary mapping shnums to
4517+    #      32-byte hashes -- the wrapper handles serialization.
4518+    #      We'll know where to put the signature at this point, also.
4519+    #      The root of this tree will be put explicitly in the next
4520+    #      step.
4521+    #
4522+    #      TODO: Why? Why not just include it in the tree here?
4523+    #
4524+    #   6: Before putting the signature, we must first put the
4525+    #      root_hash. Do this with:
4526+    #
4527+    #        put_root_hash(root_hash).
4528+    #     
4529+    #      In terms of knowing where to put this value, it was always
4530+    #      possible to place it, but it makes sense semantically to
4531+    #      place it after the share hash tree, so that's why you do it
4532+    #      in this order.
4533+    #
4534+    #   7: With the root hash put, we can now sign the header. Use:
4535+    #
4536+    #        get_signable()
4537+    #
4538+    #      to get the part of the header that you want to sign, and use:
4539+    #       
4540+    #        put_signature(signature)
4541+    #
4542+    #      to write your signature to the remote server.
4543+    #
4544+    #   8: Add the verification key, and finish. Do:
4545+    #
4546+    #        put_verification_key(key)
4547+    #
4548+    #      and
4549+    #
4550+    #        finish_publishing()
4551+    #
4552+    # Checkstring management:
4553+    #
4554+    # To write to a mutable slot, we have to provide test vectors to ensure
4555+    # that we are writing to the same data that we think we are. These
4556+    # vectors allow us to detect uncoordinated writes; that is, writes
4557+    # where both we and some other shareholder are writing to the
4558+    # mutable slot, and to report those back to the parts of the program
4559+    # doing the writing.
4560+    #
4561+    # With SDMF, this was easy -- all of the share data was written in
4562+    # one go, so it was easy to detect uncoordinated writes, and we only
4563+    # had to do it once. With MDMF, not all of the file is written at
4564+    # once.
4565+    #
4566+    # If a share is new, we write out as much of the header as we can
4567+    # before writing out anything else. This gives other writers a
4568+    # canary that they can use to detect uncoordinated writes, and, if
4569+    # they do the same thing, gives us the same canary. We then update
4570+    # the share. We won't be able to write out two fields of the header
4571+    # -- the share tree hash and the salt hash -- until we finish
4572+    # writing out the share. We only require the writer to provide the
4573+    # initial checkstring, and keep track of what it should be after
4574+    # updates ourselves.
4575+    #
4576+    # If we haven't written anything yet, then on the first write (which
4577+    # will probably be a block + salt of a share), we'll also write out
4578+    # the header. On subsequent passes, we'll expect to see the header.
4579+    # This changes in two places:
4580+    #
4581+    #   - When we write out the salt hash
4582+    #   - When we write out the root of the share hash tree
4583+    #
4584+    # since these values will change the header. It is possible that we
4585+    # can just make those be written in one operation to minimize
4586+    # disruption.
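+    #
+    # As a rough sketch (illustrative only), the fixed-size header
+    # described above can be unpacked with the module-level format
+    # strings:
+    #
+    #   (version, seqnum, root_hash, k, N, segsize, datalen,
+    #    o_privkey, o_blockhashes, o_sharehashes, o_signature,
+    #    o_verification_key, o_eof) = struct.unpack(MDMFHEADER,
+    #                                               data[:MDMFHEADERSIZE])
+    #
+    # where data holds the raw share and the o_* names are illustrative.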
4587+    def __init__(self,
4588+                 shnum,
4589+                 rref, # a remote reference to a storage server
4590+                 storage_index,
4591+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4592+                 seqnum, # the sequence number of the mutable file
4593+                 required_shares,
4594+                 total_shares,
4595+                 segment_size,
4596+                 data_length): # the length of the original file
4597+        self.shnum = shnum
4598+        self._rref = rref
4599+        self._storage_index = storage_index
4600+        self._seqnum = seqnum
4601+        self._required_shares = required_shares
4602+        assert self.shnum >= 0 and self.shnum < total_shares
4603+        self._total_shares = total_shares
4604+        # We build up the offset table as we write things. It is the
4605+        # last thing we write to the remote server.
4606+        self._offsets = {}
4607+        self._testvs = []
4608+        self._secrets = secrets
4609+        # The segment size needs to be a multiple of the k parameter --
4610+        # any padding should have been carried out by the publisher
4611+        # already.
4612+        assert segment_size % required_shares == 0
4613+        self._segment_size = segment_size
4614+        self._data_length = data_length
4615+
4616+        # These are set later -- we define them here so that we can
4617+        # check for their existence easily
4618+
4619+        # This is the root of the share hash tree -- the Merkle tree
4620+        # over the roots of the block hash trees computed for shares in
4621+        # this upload.
4622+        self._root_hash = None
4623+
4624+        # We haven't yet written anything to the remote bucket. By
4625+        # setting this, we tell the _write method as much. The write
4626+        # method will then know that it also needs to add a write vector
4627+        # for the checkstring (or what we have of it) to the first write
4628+        # request. We'll then record that value for future use.  If
4629+        # we're expecting something to be there already, we need to call
4630+        # set_checkstring before we write anything to tell the first
4631+        # write about that.
4632+        self._written = False
4633+
4634+        # When writing data to the storage servers, we get a read vector
4635+        # for free. We'll read the checkstring, which will help us
4636+        # figure out what's gone wrong if a write fails.
4637+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4638+
4639+        # We calculate the number of segments because it tells us
4640+        # where the salt part of the file ends/share segment begins,
4641+        # and also because it provides a useful amount of bounds checking.
4642+        self._num_segments = mathutil.div_ceil(self._data_length,
4643+                                               self._segment_size)
4644+        self._block_size = self._segment_size / self._required_shares
4645+        # We also calculate the share size, to help us with block
4646+        # constraints later.
4647+        tail_size = self._data_length % self._segment_size
4648+        if not tail_size:
4649+            self._tail_block_size = self._block_size
4650+        else:
4651+            self._tail_block_size = mathutil.next_multiple(tail_size,
4652+                                                           self._required_shares)
4653+            self._tail_block_size /= self._required_shares
4654+
4655+        # We already know where the sharedata starts; right after the end
4656+        # of the header (which is defined as the signable part + the offsets)
4657+        # We can also calculate where the encrypted private key begins
4658+        # from what we already know.
4659+        self._actual_block_size = self._block_size + SALT_SIZE
4660+        data_size = self._actual_block_size * (self._num_segments - 1)
4661+        data_size += self._tail_block_size
4662+        data_size += SALT_SIZE
4663+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
4664+        self._offsets['enc_privkey'] += data_size
4665+        # We'll wait for the rest. Callers can now call my "put_block" and
4666+        # "set_checkstring" methods.
4667+
4668+
4669+    def set_checkstring(self,
4670+                        seqnum_or_checkstring,
4671+                        root_hash=None,
4672+                        salt=None):
4673+        """
4674+        Set the checkstring for the given shnum.
4675+
4676+        This can be invoked in one of two ways.
4677+
4678+        With one argument, I assume that you are giving me a literal
4679+        checkstring -- e.g., the output of get_checkstring. I will then
4680+        set that checkstring as it is. This form is used by unit tests.
4681+
4682+        With two arguments, I assume that you are giving me a sequence
4683+        number and root hash to make a checkstring from. In that case, I
4684+        will build a checkstring and set it for you. This form is used
4685+        by the publisher.
4686+
4687+        By default, I assume that I am writing new shares to the grid.
4688+        If you don't explicitly set your own checkstring, I will use
4689+        one that requires that the remote share not exist. You will want
4690+        to use this method if you are updating a share in-place;
4691+        otherwise, writes will fail.
4692+        """
4693+        # You're allowed to overwrite checkstrings with this method;
4694+        # I assume that users know what they are doing when they call
4695+        # it.
4696+        if root_hash:
4697+            checkstring = struct.pack(MDMFCHECKSTRING,
4698+                                      1,
4699+                                      seqnum_or_checkstring,
4700+                                      root_hash)
4701+        else:
4702+            checkstring = seqnum_or_checkstring
4703+
4704+        if checkstring == "":
4705+            # We special-case this, since len("") = 0, but we need
4706+            # length of 1 for the case of an empty share to work on the
4707+            # storage server, which is what a checkstring that is the
4708+            # empty string means.
4709+            self._testvs = []
4710+        else:
4711+            self._testvs = []
4712+            self._testvs.append((0, len(checkstring), "eq", checkstring))
4713+
4714+
4715+    def __repr__(self):
4716+        return "MDMFSlotWriteProxy for share %d" % self.shnum
4717+
4718+
4719+    def get_checkstring(self):
4720+        """
4721+        Given a share number, I return a representation of what the
4722+        checkstring for that share on the server will look like.
4723+
4724+        I am mostly used for tests.
4725+        """
4726+        if self._root_hash:
4727+            roothash = self._root_hash
4728+        else:
4729+            roothash = "\x00" * 32
4730+        return struct.pack(MDMFCHECKSTRING,
4731+                           1,
4732+                           self._seqnum,
4733+                           roothash)
4734+
4735+
4736+    def put_block(self, data, segnum, salt):
4737+        """
4738+        Put the encrypted-and-encoded data segment in the slot, along
4739+        with the salt.
4740+        """
4741+        if segnum >= self._num_segments:
4742+            raise LayoutInvalid("I won't overwrite the private key")
4743+        if len(salt) != SALT_SIZE:
4744+            raise LayoutInvalid("I was given a salt of size %d, but "
4745+                                "I wanted a salt of size %d")
4746+        if segnum + 1 == self._num_segments:
4747+            if len(data) != self._tail_block_size:
4748+                raise LayoutInvalid("I was given the wrong size block to write")
4749+        elif len(data) != self._block_size:
4750+            raise LayoutInvalid("I was given the wrong size block to write")
4751+
4752+        # We want to write at MDMFHEADERSIZE + segnum * self._actual_block_size.
4753+
4754+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
4755+        data = salt + data
4756+
4757+        datavs = [tuple([offset, data])]
4758+        return self._write(datavs)
4759+
4760+
4761+    def put_encprivkey(self, encprivkey):
4762+        """
4763+        Put the encrypted private key in the remote slot.
4764+        """
4765+        assert self._offsets
4766+        assert self._offsets['enc_privkey']
4767+        # You shouldn't re-write the encprivkey after the block hash
4768+        # tree is written, since that could cause the private key to run
4769+        # into the block hash tree. Before it writes the block hash
4770+        # tree, the block hash tree writing method writes the offset of
4771+        # the salt hash tree. So that's a good indicator of whether or
4772+        # not the block hash tree has been written.
4773+        if "share_hash_chain" in self._offsets:
4774+            raise LayoutInvalid("You must write this before the block hash tree")
4775+
4776+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
4777+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
4778+        def _on_failure():
4779+            del(self._offsets['block_hash_tree'])
4780+        return self._write(datavs, on_failure=_on_failure)
4781+
4782+
4783+    def put_blockhashes(self, blockhashes):
4784+        """
4785+        Put the block hash tree in the remote slot.
4786+
4787+        The encrypted private key must be put before the block hash
4788+        tree, since we need to know how large it is to know where the
4789+        block hash tree should go. The block hash tree must be put
4790+        before the salt hash tree, since its size determines the
4791+        offset of the share hash chain.
4792+        """
4793+        assert self._offsets
4794+        assert isinstance(blockhashes, list)
4795+        if "block_hash_tree" not in self._offsets:
4796+            raise LayoutInvalid("You must put the encrypted private key "
4797+                                "before you put the block hash tree")
4798+        # If written, the share hash chain causes the signature offset
4799+        # to be defined.
4800+        if "signature" in self._offsets:
4801+            raise LayoutInvalid("You must put the block hash tree before "
4802+                                "you put the share hash chain")
4803+        blockhashes_s = "".join(blockhashes)
4804+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
4805+        datavs = []
4806+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
4807+        def _on_failure():
4808+            del(self._offsets['share_hash_chain'])
4809+        return self._write(datavs, on_failure=_on_failure)
4810+
4811+
4812+    def put_sharehashes(self, sharehashes):
4813+        """
4814+        Put the share hash chain in the remote slot.
4815+
4816+        The block hash tree must be put before the share hash chain,
4817+        since we need to know where the block hash tree ends before we
4818+        can know where the share hash chain starts. The share hash chain
4819+        must be put before the signature, since the length of the packed
4820+        share hash chain determines the offset of the signature. Also,
4821+        semantically, you must know the root of the share hash tree
4822+        before you can generate a valid signature.
4823+        """
4824+        assert isinstance(sharehashes, dict)
4825+        if "share_hash_chain" not in self._offsets:
4826+            raise LayoutInvalid("You need to put the block hash tree before "
4827+                                "you can put the share hash chain")
4828+        # The signature comes after the share hash chain. If the
4829+        # signature has already been written, we must not write another
4830+        # share hash chain. The signature writes the verification key
4831+        # offset when it gets sent to the remote server, so we look for
4832+        # that.
4833+        if "verification_key" in self._offsets:
4834+            raise LayoutInvalid("You must write the share hash chain "
4835+                                "before you write the signature")
4836+        datavs = []
4837+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4838+                                  for i in sorted(sharehashes.keys())])
4839+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
4840+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
4841+        def _on_failure():
4842+            del(self._offsets['signature'])
4843+        return self._write(datavs, on_failure=_on_failure)
4844+
4845+
4846+    def put_root_hash(self, roothash):
4847+        """
4848+        Put the root hash (the root of the share hash tree) in the
4849+        remote slot.
4850+        """
4851+        # It does not make sense to be able to put the root
4852+        # hash without first putting the share hashes, since you need
4853+        # the share hashes to generate the root hash.
4854+        #
4855+        # Signature is defined by the routine that places the share hash
4856+        # chain, so it's a good thing to look for in finding out whether
4857+        # or not the share hash chain exists on the remote server.
4858+        if "signature" not in self._offsets:
4859+            raise LayoutInvalid("You need to put the share hash chain "
4860+                                "before you can put the root share hash")
4861+        if len(roothash) != HASH_SIZE:
4862+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
4863+                                 % HASH_SIZE)
4864+        datavs = []
4865+        self._root_hash = roothash
4866+        # To write the new root hash, we rewrite the checkstring on
4867+        # the remote server, which includes it.
4868+        checkstring = self.get_checkstring()
4869+        datavs.append(tuple([0, checkstring]))
4870+        # This write, if successful, changes the checkstring, so we need
4871+        # to update our internal checkstring to be consistent with the
4872+        # one on the server.
4873+        def _on_success():
4874+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
4875+        def _on_failure():
4876+            self._root_hash = None
4877+        return self._write(datavs,
4878+                           on_success=_on_success,
4879+                           on_failure=_on_failure)
4880+
4881+
4882+    def get_signable(self):
4883+        """
4884+        Get the first seven fields of the mutable file; the parts that
4885+        are signed.
4886+        """
4887+        if not self._root_hash:
4888+            raise LayoutInvalid("You need to set the root hash "
4889+                                "before getting something to "
4890+                                "sign")
4891+        return struct.pack(MDMFSIGNABLEHEADER,
4892+                           1,
4893+                           self._seqnum,
4894+                           self._root_hash,
4895+                           self._required_shares,
4896+                           self._total_shares,
4897+                           self._segment_size,
4898+                           self._data_length)
4899+
4900+
4901+    def put_signature(self, signature):
4902+        """
4903+        Put the signature field to the remote slot.
4904+
4905+        I require that the root hash and share hash chain have been put
4906+        to the grid before I will write the signature to the grid.
4907+        """
4908+        if "signature" not in self._offsets:
4909+            raise LayoutInvalid("You must put the share hash chain "
4910+        # It does not make sense to put a signature without first
4911+        # putting the root hash and the salt hash (since otherwise
4912+        # the signature would be incomplete), so we don't allow that.
4913+                       "before putting the signature")
4914+        if not self._root_hash:
4915+            raise LayoutInvalid("You must complete the signed prefix "
4916+                                "before computing a signature")
4917+        # If we put the signature after we put the verification key, we
4918+        # could end up running into the verification key, and will
4919+        # probably screw up the offsets as well. So we don't allow that.
4920+        # The method that writes the verification key defines the EOF
4921+        # offset before writing the verification key, so look for that.
4922+        if "EOF" in self._offsets:
4923+            raise LayoutInvalid("You must write the signature before the verification key")
4924+
4925+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
4926+        datavs = []
4927+        datavs.append(tuple([self._offsets['signature'], signature]))
4928+        def _on_failure():
4929+            del(self._offsets['verification_key'])
4930+        return self._write(datavs, on_failure=_on_failure)
4931+
4932+
4933+    def put_verification_key(self, verification_key):
4934+        """
4935+        Put the verification key into the remote slot.
4936+
4937+        I require that the signature have been written to the storage
4938+        server before I allow the verification key to be written to the
4939+        remote server.
4940+        """
4941+        if "verification_key" not in self._offsets:
4942+            raise LayoutInvalid("You must put the signature before you "
4943+                                "can put the verification key")
4944+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
4945+        datavs = []
4946+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
4947+        def _on_failure():
4948+            del(self._offsets['EOF'])
4949+        return self._write(datavs, on_failure=_on_failure)
4950+
4951+    def _get_offsets_tuple(self):
4952+        return tuple([(key, value) for key, value in self._offsets.items()])
4953+
4954+    def get_verinfo(self):
4955+        return (self._seqnum,
4956+                self._root_hash,
4957+                self._required_shares,
4958+                self._total_shares,
4959+                self._segment_size,
4960+                self._data_length,
4961+                self.get_signable(),
4962+                self._get_offsets_tuple())
4963+
4964+
4965+    def finish_publishing(self):
4966+        """
4967+        Write the offset table and encoding parameters to the remote
4968+        slot, since that's the only thing we have yet to publish at this
4969+        point.
4970+        """
4971+        if "EOF" not in self._offsets:
4972+            raise LayoutInvalid("You must put the verification key before "
4973+                                "you can publish the offsets")
4974+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4975+        offsets = struct.pack(MDMFOFFSETS,
4976+                              self._offsets['enc_privkey'],
4977+                              self._offsets['block_hash_tree'],
4978+                              self._offsets['share_hash_chain'],
4979+                              self._offsets['signature'],
4980+                              self._offsets['verification_key'],
4981+                              self._offsets['EOF'])
4982+        datavs = []
4983+        datavs.append(tuple([offsets_offset, offsets]))
4984+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
4985+        params = struct.pack(">BBQQ",
4986+                             self._required_shares,
4987+                             self._total_shares,
4988+                             self._segment_size,
4989+                             self._data_length)
4990+        datavs.append(tuple([encoding_parameters_offset, params]))
4991+        return self._write(datavs)
4992+
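+    # A rough sketch of the write ordering that the LayoutInvalid checks
+    # above enforce. The names below are illustrative only and are not
+    # defined in this module:
+    #
+    #   mw = MDMFSlotWriteProxy(...)
+    #   for segnum in xrange(num_segments):
+    #       mw.put_block(blocks[segnum], segnum, salts[segnum])
+    #   mw.put_encprivkey(encprivkey)
+    #   mw.put_blockhashes(blockhashes)
+    #   mw.put_sharehashes(sharehashes)
+    #   mw.put_root_hash(root_hash)
+    #   mw.put_signature(sign(mw.get_signable()))  # sign() stands in for the caller's signer
+    #   mw.put_verification_key(verification_key)
+    #   mw.finish_publishing()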
4993+
4994+    def _write(self, datavs, on_failure=None, on_success=None):
4995+        """I write the data vectors in datavs to the remote slot."""
4996+        tw_vectors = {}
4997+        new_share = False
4998+        if not self._testvs:
4999+            self._testvs = []
5000+            self._testvs.append(tuple([0, 1, "eq", ""]))
5001+            new_share = True
5002+        if not self._written:
5003+            # Write a new checkstring to the share when we write it, so
5004+            # that we have something to check later.
5005+            new_checkstring = self.get_checkstring()
5006+            datavs.append((0, new_checkstring))
5007+            def _first_write():
5008+                self._written = True
5009+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5010+            on_success = _first_write
5011+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5012+        datalength = sum([len(x[1]) for x in datavs])
5013+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5014+                                  self._storage_index,
5015+                                  self._secrets,
5016+                                  tw_vectors,
5017+                                  self._readv)
5018+        def _result(results):
5019+            if isinstance(results, failure.Failure) or not results[0]:
5020+                # Do nothing; the write was unsuccessful.
5021+                if on_failure: on_failure()
5022+            else:
5023+                if on_success: on_success()
5024+            return results
5025+        d.addCallback(_result)
5026+        return d
5027+
5028+
5029+class MDMFSlotReadProxy:
5030+    """
5031+    I read from a mutable slot filled with data written in the MDMF data
5032+    format (which is described above).
5033+
5034+    I can be initialized with some amount of data, which I will use (if
5035+    it is valid) to eliminate some of the need to fetch it from servers.
5036+    """
5037+    def __init__(self,
5038+                 rref,
5039+                 storage_index,
5040+                 shnum,
5041+                 data=""):
5042+        # Start the initialization process.
5043+        self._rref = rref
5044+        self._storage_index = storage_index
5045+        self.shnum = shnum
5046+
5047+        # Before doing anything, the reader is probably going to want to
5048+        # verify that the signature is correct. To do that, they'll need
5049+        # the verification key, and the signature. To get those, we'll
5050+        # need the offset table. So fetch the offset table on the
5051+        # assumption that that will be the first thing that a reader is
5052+        # going to do.
5053+
5054+        # The fact that these encoding parameters are None tells us
5055+        # that we haven't yet fetched them from the remote share, so we
5056+        # should. We could just not set them, but the checks will be
5057+        # easier to read if we don't have to use hasattr.
5058+        self._version_number = None
5059+        self._sequence_number = None
5060+        self._root_hash = None
5061+        # Filled in if we're dealing with an SDMF file. Unused
5062+        # otherwise.
5063+        self._salt = None
5064+        self._required_shares = None
5065+        self._total_shares = None
5066+        self._segment_size = None
5067+        self._data_length = None
5068+        self._offsets = None
5069+
5070+        # If the user has chosen to initialize us with some data, we'll
5071+        # try to satisfy subsequent data requests with that data before
5072+        # asking the storage server for it.
5073+        self._data = data
5074+        # The filenode's cache returns None when there is no cached
5075+        # data, but the way we index the cached data requires a string,
5076+        # so convert None to "".
5077+        if self._data == None:
5078+            self._data = ""
5079+
5080+        self._queue_observers = observer.ObserverList()
5081+        self._queue_errbacks = observer.ObserverList()
5082+        self._readvs = []
5083+
5084+
5085+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5086+        """
5087+        I fetch the offset table and the header from the remote slot if
5088+        I don't already have them. If I do have them, I do nothing and
5089+        return an empty Deferred.
5090+        """
5091+        if self._offsets:
5092+            return defer.succeed(None)
5093+        # At this point, we may be either SDMF or MDMF. Fetching 107
5094+        # bytes will be enough to get header and offsets for both SDMF and
5095+        # MDMF, though we'll be left with 4 more bytes than we
5096+        # need if this ends up being MDMF. This is probably less
5097+        # expensive than the cost of a second roundtrip.
5098+        readvs = [(0, 107)]
5099+        d = self._read(readvs, force_remote)
5100+        d.addCallback(self._process_encoding_parameters)
5101+        d.addCallback(self._process_offsets)
5102+        return d
5103+
5104+
5105+    def _process_encoding_parameters(self, encoding_parameters):
5106+        assert self.shnum in encoding_parameters
5107+        encoding_parameters = encoding_parameters[self.shnum][0]
5108+        # The first byte is the version number. It will tell us what
5109+        # to do next.
5110+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5111+        if verno == MDMF_VERSION:
5112+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5113+            (verno,
5114+             seqnum,
5115+             root_hash,
5116+             k,
5117+             n,
5118+             segsize,
5119+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5120+                                      encoding_parameters[:read_size])
5121+            if segsize == 0 and datalen == 0:
5122+                # Empty file, no segments.
5123+                self._num_segments = 0
5124+            else:
5125+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5126+
5127+        elif verno == SDMF_VERSION:
5128+            read_size = SIGNED_PREFIX_LENGTH
5129+            (verno,
5130+             seqnum,
5131+             root_hash,
5132+             salt,
5133+             k,
5134+             n,
5135+             segsize,
5136+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5137+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5138+            self._salt = salt
5139+            if segsize == 0 and datalen == 0:
5140+                # empty file
5141+                self._num_segments = 0
5142+            else:
5143+                # non-empty SDMF files have one segment.
5144+                self._num_segments = 1
5145+        else:
5146+            raise UnknownVersionError("You asked me to read mutable file "
5147+                                      "version %d, but I only understand "
5148+                                      "%d and %d" % (verno, SDMF_VERSION,
5149+                                                     MDMF_VERSION))
5150+
5151+        self._version_number = verno
5152+        self._sequence_number = seqnum
5153+        self._root_hash = root_hash
5154+        self._required_shares = k
5155+        self._total_shares = n
5156+        self._segment_size = segsize
5157+        self._data_length = datalen
5158+
5159+        self._block_size = self._segment_size / self._required_shares
5160+        # We can upload empty files, and need to account for this fact
5161+        # so as to avoid zero-division and zero-modulo errors.
5162+        if datalen > 0:
5163+            tail_size = self._data_length % self._segment_size
5164+        else:
5165+            tail_size = 0
5166+        if not tail_size:
5167+            self._tail_block_size = self._block_size
5168+        else:
5169+            self._tail_block_size = mathutil.next_multiple(tail_size,
5170+                                                    self._required_shares)
5171+            self._tail_block_size /= self._required_shares
5172+
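+        # Worked example, using the numbers from the tests later in this
+        # patch (k=3, 6-byte segments): a 33-byte file has a 3-byte tail
+        # segment, so the tail block is next_multiple(3, 3) / 3 = 1 byte,
+        # while blocks for full segments are 6 / 3 = 2 bytes.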
5173+        return encoding_parameters
5174+
5175+
5176+    def _process_offsets(self, offsets):
5177+        if self._version_number == 0:
5178+            read_size = OFFSETS_LENGTH
5179+            read_offset = SIGNED_PREFIX_LENGTH
5180+            end = read_size + read_offset
5181+            (signature,
5182+             share_hash_chain,
5183+             block_hash_tree,
5184+             share_data,
5185+             enc_privkey,
5186+             EOF) = struct.unpack(">LLLLQQ",
5187+                                  offsets[read_offset:end])
5188+            self._offsets = {}
5189+            self._offsets['signature'] = signature
5190+            self._offsets['share_data'] = share_data
5191+            self._offsets['block_hash_tree'] = block_hash_tree
5192+            self._offsets['share_hash_chain'] = share_hash_chain
5193+            self._offsets['enc_privkey'] = enc_privkey
5194+            self._offsets['EOF'] = EOF
5195+
5196+        elif self._version_number == 1:
5197+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5198+            read_length = MDMFOFFSETS_LENGTH
5199+            end = read_offset + read_length
5200+            (encprivkey,
5201+             blockhashes,
5202+             sharehashes,
5203+             signature,
5204+             verification_key,
5205+             eof) = struct.unpack(MDMFOFFSETS,
5206+                                  offsets[read_offset:end])
5207+            self._offsets = {}
5208+            self._offsets['enc_privkey'] = encprivkey
5209+            self._offsets['block_hash_tree'] = blockhashes
5210+            self._offsets['share_hash_chain'] = sharehashes
5211+            self._offsets['signature'] = signature
5212+            self._offsets['verification_key'] = verification_key
5213+            self._offsets['EOF'] = eof
5214+
5215+
5216+    def get_block_and_salt(self, segnum, queue=False):
5217+        """
5218+        I return (block, salt), where block is the block data and
5219+        salt is the salt used to encrypt that segment.
5220+        """
5221+        d = self._maybe_fetch_offsets_and_header()
5222+        def _then(ignored):
5223+            if self._version_number == 1:
5224+                base_share_offset = MDMFHEADERSIZE
5225+            else:
5226+                base_share_offset = self._offsets['share_data']
5227+
5228+            if segnum + 1 > self._num_segments:
5229+                raise LayoutInvalid("Not a valid segment number")
5230+
5231+            if self._version_number == 0:
5232+                share_offset = base_share_offset + self._block_size * segnum
5233+            else:
5234+                share_offset = base_share_offset + (self._block_size + \
5235+                                                    SALT_SIZE) * segnum
5236+            if segnum + 1 == self._num_segments:
5237+                data = self._tail_block_size
5238+            else:
5239+                data = self._block_size
5240+
5241+            if self._version_number == 1:
5242+                data += SALT_SIZE
5243+
5244+            readvs = [(share_offset, data)]
5245+            return readvs
5246+        d.addCallback(_then)
5247+        d.addCallback(lambda readvs:
5248+            self._read(readvs, queue=queue))
5249+        def _process_results(results):
5250+            assert self.shnum in results
5251+            if self._version_number == 0:
5252+                # We only read the share data, but we know the salt from
5253+                # when we fetched the header
5254+                data = results[self.shnum]
5255+                if not data:
5256+                    data = ""
5257+                else:
5258+                    assert len(data) == 1
5259+                    data = data[0]
5260+                salt = self._salt
5261+            else:
5262+                data = results[self.shnum]
5263+                if not data:
5264+                    salt = data = ""
5265+                else:
5266+                    salt_and_data = results[self.shnum][0]
5267+                    salt = salt_and_data[:SALT_SIZE]
5268+                    data = salt_and_data[SALT_SIZE:]
5269+            return data, salt
5270+        d.addCallback(_process_results)
5271+        return d
5272+
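+    # As a concrete illustration of the offset arithmetic in
+    # get_block_and_salt above (using the 2-byte blocks and 16-byte salts
+    # from the tests later in this patch): for MDMF, segment n's salt and
+    # block start at MDMFHEADERSIZE + 18 * n and span 18 bytes, or
+    # tail_block_size + 16 bytes for the final segment.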
5273+
5274+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5275+        """
5276+        I return the block hash tree
5277+
5278+        I take an optional argument, needed, which is a set of indices
5279+        that correspond to hashes that I should fetch. If this argument is
5280+        missing, I will fetch the entire block hash tree; otherwise, I
5281+        may attempt to fetch fewer hashes, based on what needed says
5282+        that I should do. Note that I may fetch as many hashes as I
5283+        want, so long as the set of hashes that I do fetch is a superset
5284+        of the ones that I am asked for, so callers should be prepared
5285+        to tolerate additional hashes.
5286+        """
5287+        # TODO: Return only the parts of the block hash tree necessary
5288+        # to validate the blocknum provided?
5289+        # This is a good idea, but it is hard to implement correctly. It
5290+        # is bad to fetch any one block hash more than once, so we
5291+        # probably just want to fetch the whole thing at once and then
5292+        # serve it.
5293+        if needed == set([]):
5294+            return defer.succeed([])
5295+        d = self._maybe_fetch_offsets_and_header()
5296+        def _then(ignored):
5297+            blockhashes_offset = self._offsets['block_hash_tree']
5298+            if self._version_number == 1:
5299+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5300+            else:
5301+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5302+            readvs = [(blockhashes_offset, blockhashes_length)]
5303+            return readvs
5304+        d.addCallback(_then)
5305+        d.addCallback(lambda readvs:
5306+            self._read(readvs, queue=queue, force_remote=force_remote))
5307+        def _build_block_hash_tree(results):
5308+            assert self.shnum in results
5309+
5310+            rawhashes = results[self.shnum][0]
5311+            results = [rawhashes[i:i+HASH_SIZE]
5312+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5313+            return results
5314+        d.addCallback(_build_block_hash_tree)
5315+        return d
5316+
5317+
5318+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5319+        """
5320+        I return the part of the share hash chain placed to validate
5321+        this share.
5322+
5323+        I take an optional argument, needed. Needed is a set of indices
5324+        that correspond to the hashes that I should fetch. If needed is
5325+        not present, I will fetch and return the entire share hash
5326+        chain. Otherwise, I may fetch and return any part of the share
5327+        hash chain that is a superset of the part that I am asked to
5328+        fetch. Callers should be prepared to deal with more hashes than
5329+        they've asked for.
5330+        """
5331+        if needed == set([]):
5332+            return defer.succeed([])
5333+        d = self._maybe_fetch_offsets_and_header()
5334+
5335+        def _make_readvs(ignored):
5336+            sharehashes_offset = self._offsets['share_hash_chain']
5337+            if self._version_number == 0:
5338+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5339+            else:
5340+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5341+            readvs = [(sharehashes_offset, sharehashes_length)]
5342+            return readvs
5343+        d.addCallback(_make_readvs)
5344+        d.addCallback(lambda readvs:
5345+            self._read(readvs, queue=queue, force_remote=force_remote))
5346+        def _build_share_hash_chain(results):
5347+            assert self.shnum in results
5348+
5349+            sharehashes = results[self.shnum][0]
5350+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5351+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5352+            results = dict([struct.unpack(">H32s", data)
5353+                            for data in results])
5354+            return results
5355+        d.addCallback(_build_share_hash_chain)
5356+        return d
5357+
5358+
5359+    def get_encprivkey(self, queue=False):
5360+        """
5361+        I return the encrypted private key.
5362+        """
5363+        d = self._maybe_fetch_offsets_and_header()
5364+
5365+        def _make_readvs(ignored):
5366+            privkey_offset = self._offsets['enc_privkey']
5367+            if self._version_number == 0:
5368+                privkey_length = self._offsets['EOF'] - privkey_offset
5369+            else:
5370+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5371+            readvs = [(privkey_offset, privkey_length)]
5372+            return readvs
5373+        d.addCallback(_make_readvs)
5374+        d.addCallback(lambda readvs:
5375+            self._read(readvs, queue=queue))
5376+        def _process_results(results):
5377+            assert self.shnum in results
5378+            privkey = results[self.shnum][0]
5379+            return privkey
5380+        d.addCallback(_process_results)
5381+        return d
5382+
5383+
5384+    def get_signature(self, queue=False):
5385+        """
5386+        I return the signature of my share.
5387+        """
5388+        d = self._maybe_fetch_offsets_and_header()
5389+
5390+        def _make_readvs(ignored):
5391+            signature_offset = self._offsets['signature']
5392+            if self._version_number == 1:
5393+                signature_length = self._offsets['verification_key'] - signature_offset
5394+            else:
5395+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5396+            readvs = [(signature_offset, signature_length)]
5397+            return readvs
5398+        d.addCallback(_make_readvs)
5399+        d.addCallback(lambda readvs:
5400+            self._read(readvs, queue=queue))
5401+        def _process_results(results):
5402+            assert self.shnum in results
5403+            signature = results[self.shnum][0]
5404+            return signature
5405+        d.addCallback(_process_results)
5406+        return d
5407+
5408+
5409+    def get_verification_key(self, queue=False):
5410+        """
5411+        I return the verification key.
5412+        """
5413+        d = self._maybe_fetch_offsets_and_header()
5414+
5415+        def _make_readvs(ignored):
5416+            if self._version_number == 1:
5417+                vk_offset = self._offsets['verification_key']
5418+                vk_length = self._offsets['EOF'] - vk_offset
5419+            else:
5420+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5421+                vk_length = self._offsets['signature'] - vk_offset
5422+            readvs = [(vk_offset, vk_length)]
5423+            return readvs
5424+        d.addCallback(_make_readvs)
5425+        d.addCallback(lambda readvs:
5426+            self._read(readvs, queue=queue))
5427+        def _process_results(results):
5428+            assert self.shnum in results
5429+            verification_key = results[self.shnum][0]
5430+            return verification_key
5431+        d.addCallback(_process_results)
5432+        return d
5433+
5434+
5435+    def get_encoding_parameters(self):
5436+        """
5437+        I return (k, n, segsize, datalen)
5438+        """
5439+        d = self._maybe_fetch_offsets_and_header()
5440+        d.addCallback(lambda ignored:
5441+            (self._required_shares,
5442+             self._total_shares,
5443+             self._segment_size,
5444+             self._data_length))
5445+        return d
5446+
5447+
5448+    def get_seqnum(self):
5449+        """
5450+        I return the sequence number for this share.
5451+        """
5452+        d = self._maybe_fetch_offsets_and_header()
5453+        d.addCallback(lambda ignored:
5454+            self._sequence_number)
5455+        return d
5456+
5457+
5458+    def get_root_hash(self):
5459+        """
5460+        I return the root of the block hash tree
5461+        """
5462+        d = self._maybe_fetch_offsets_and_header()
5463+        d.addCallback(lambda ignored: self._root_hash)
5464+        return d
5465+
5466+
5467+    def get_checkstring(self):
5468+        """
5469+        I return the packed representation of the following:
5470+
5471+            - version number
5472+            - sequence number
5473+            - root hash
5474+            - salt (SDMF files only)
5475+
5476+        which my users use as a checkstring to detect other writers.
5477+        """
5478+        d = self._maybe_fetch_offsets_and_header()
5479+        def _build_checkstring(ignored):
5480+            if self._salt:
5481+                checkstring = struct.pack(PREFIX,
5482+                                         self._version_number,
5483+                                         self._sequence_number,
5484+                                         self._root_hash,
5485+                                         self._salt)
5486+            else:
5487+                checkstring = struct.pack(MDMFCHECKSTRING,
5488+                                          self._version_number,
5489+                                          self._sequence_number,
5490+                                          self._root_hash)
5491+
5492+            return checkstring
5493+        d.addCallback(_build_checkstring)
5494+        return d
5495+
5496+
5497+    def get_prefix(self, force_remote):
5498+        d = self._maybe_fetch_offsets_and_header(force_remote)
5499+        d.addCallback(lambda ignored:
5500+            self._build_prefix())
5501+        return d
5502+
5503+
5504+    def _build_prefix(self):
5505+        # The prefix is another name for the part of the remote share
5506+        # that gets signed. It consists of everything up to and
5507+        # including the datalength, packed by struct.
5508+        if self._version_number == SDMF_VERSION:
5509+            return struct.pack(SIGNED_PREFIX,
5510+                           self._version_number,
5511+                           self._sequence_number,
5512+                           self._root_hash,
5513+                           self._salt,
5514+                           self._required_shares,
5515+                           self._total_shares,
5516+                           self._segment_size,
5517+                           self._data_length)
5518+
5519+        else:
5520+            return struct.pack(MDMFSIGNABLEHEADER,
5521+                           self._version_number,
5522+                           self._sequence_number,
5523+                           self._root_hash,
5524+                           self._required_shares,
5525+                           self._total_shares,
5526+                           self._segment_size,
5527+                           self._data_length)
5528+
5529+
5530+    def _get_offsets_tuple(self):
5531+        # The offsets tuple is another component of the version
5532+        # information tuple. It is basically our offsets dictionary,
5533+        # itemized and in a tuple.
5534+        return tuple([(key, value) for key, value in self._offsets.items()])
5535+
5536+
5537+    def get_verinfo(self):
5538+        """
5539+        I return my verinfo tuple. This is used by the ServermapUpdater
5540+        to keep track of versions of mutable files.
5541+
5542+        The verinfo tuple for MDMF files contains:
5543+            - seqnum
5544+            - root hash
5545+            - a blank (nothing)
5546+            - segsize
5547+            - datalen
5548+            - k
5549+            - n
5550+            - prefix (the thing that you sign)
5551+            - a tuple of offsets
5552+
5553+        We include the blank entry in MDMF to keep the shape of the
5554+        version information tuple the same for both formats.
5555+
5556+        The verinfo tuple for SDMF files is the same, but contains the
5557+        file's 16-byte IV (salt) in place of the blank entry.
5558+        """
5559+        d = self._maybe_fetch_offsets_and_header()
5560+        def _build_verinfo(ignored):
5561+            if self._version_number == SDMF_VERSION:
5562+                salt_to_use = self._salt
5563+            else:
5564+                salt_to_use = None
5565+            return (self._sequence_number,
5566+                    self._root_hash,
5567+                    salt_to_use,
5568+                    self._segment_size,
5569+                    self._data_length,
5570+                    self._required_shares,
5571+                    self._total_shares,
5572+                    self._build_prefix(),
5573+                    self._get_offsets_tuple())
5574+        d.addCallback(_build_verinfo)
5575+        return d
5576+
5577+
5578+    def flush(self):
5579+        """
5580+        I flush my queue of read vectors.
5581+        """
5582+        d = self._read(self._readvs)
5583+        def _then(results):
5584+            self._readvs = []
5585+            if isinstance(results, failure.Failure):
5586+                self._queue_errbacks.notify(results)
5587+            else:
5588+                self._queue_observers.notify(results)
5589+            self._queue_observers = observer.ObserverList()
5590+            self._queue_errbacks = observer.ObserverList()
5591+        d.addBoth(_then)
5592+
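+    # A sketch of how the read queue is meant to be used; the flow is
+    # inferred from flush above and _read below, and "mr" is an
+    # illustrative name:
+    #
+    #   d1 = mr.get_signature(queue=True)         # queued, nothing sent yet
+    #   d2 = mr.get_verification_key(queue=True)  # queued in the same batch
+    #   mr.flush()                                 # one read satisfies both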
5593+
5594+    def _read(self, readvs, force_remote=False, queue=False):
5595+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5596+        # TODO: It's entirely possible to tweak this so that it just
5597+        # fulfills the requests that it can, and not demand that all
5598+        # requests are satisfiable before running it.
5599+        if not unsatisfiable and not force_remote:
5600+            results = [self._data[offset:offset+length]
5601+                       for (offset, length) in readvs]
5602+            results = {self.shnum: results}
5603+            return defer.succeed(results)
5604+        else:
5605+            if queue:
5606+                start = len(self._readvs)
5607+                self._readvs += readvs
5608+                end = len(self._readvs)
5609+                def _get_results(results, start, end):
5610+                    if not self.shnum in results:
5611+                        return {self.shnum: [""]}
5612+                    return {self.shnum: results[self.shnum][start:end]}
5613+                d = defer.Deferred()
5614+                d.addCallback(_get_results, start, end)
5615+                self._queue_observers.subscribe(d.callback)
5616+                self._queue_errbacks.subscribe(d.errback)
5617+                return d
5618+            return self._rref.callRemote("slot_readv",
5619+                                         self._storage_index,
5620+                                         [self.shnum],
5621+                                         readvs)
5622+
5623+
5624+    def is_sdmf(self):
5625+        """I tell my caller whether or not my remote file is SDMF or MDMF
5626+        """
5627+        d = self._maybe_fetch_offsets_and_header()
5628+        d.addCallback(lambda ignored:
5629+            self._version_number == 0)
5630+        return d
5631+
5632+
5633+class LayoutInvalid(Exception):
5634+    """
5635+    This isn't a valid MDMF mutable file
5636+    """
5637hunk ./src/allmydata/test/test_storage.py 2
5638 
5639-import time, os.path, stat, re, simplejson, struct
5640+import time, os.path, stat, re, simplejson, struct, shutil
5641 
5642 from twisted.trial import unittest
5643 
5644hunk ./src/allmydata/test/test_storage.py 22
5645 from allmydata.storage.expirer import LeaseCheckingCrawler
5646 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5647      ReadBucketProxy
5648-from allmydata.interfaces import BadWriteEnablerError
5649-from allmydata.test.common import LoggingServiceParent
5650+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5651+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5652+                                     SIGNED_PREFIX, MDMFHEADER, \
5653+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5654+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5655+                                 SDMF_VERSION
5656+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5657 from allmydata.test.common_web import WebRenderingMixin
5658 from allmydata.web.storage import StorageStatus, remove_prefix
5659 
5660hunk ./src/allmydata/test/test_storage.py 106
5661 
5662 class RemoteBucket:
5663 
5664+    def __init__(self):
5665+        self.read_count = 0
5666+        self.write_count = 0
5667+
5668     def callRemote(self, methname, *args, **kwargs):
5669         def _call():
5670             meth = getattr(self.target, "remote_" + methname)
5671hunk ./src/allmydata/test/test_storage.py 114
5672             return meth(*args, **kwargs)
5673+
5674+        if methname == "slot_readv":
5675+            self.read_count += 1
5676+        if "writev" in methname:
5677+            self.write_count += 1
5678+
5679         return defer.maybeDeferred(_call)
5680 
5681hunk ./src/allmydata/test/test_storage.py 122
5682+
5683 class BucketProxy(unittest.TestCase):
5684     def make_bucket(self, name, size):
5685         basedir = os.path.join("storage", "BucketProxy", name)
5686hunk ./src/allmydata/test/test_storage.py 1299
5687         self.failUnless(os.path.exists(prefixdir), prefixdir)
5688         self.failIf(os.path.exists(bucketdir), bucketdir)
5689 
5690+
5691+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
5692+    def setUp(self):
5693+        self.sparent = LoggingServiceParent()
5694+        self._lease_secret = itertools.count()
5695+        self.ss = self.create("MDMFProxies storage test server")
5696+        self.rref = RemoteBucket()
5697+        self.rref.target = self.ss
5698+        self.secrets = (self.write_enabler("we_secret"),
5699+                        self.renew_secret("renew_secret"),
5700+                        self.cancel_secret("cancel_secret"))
5701+        self.segment = "aaaaaa"
5702+        self.block = "aa"
5703+        self.salt = "a" * 16
5704+        self.block_hash = "a" * 32
5705+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
5706+        self.share_hash = self.block_hash
5707+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
5708+        self.signature = "foobarbaz"
5709+        self.verification_key = "vvvvvv"
5710+        self.encprivkey = "private"
5711+        self.root_hash = self.block_hash
5712+        self.salt_hash = self.root_hash
5713+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
5714+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
5715+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
5716+        # blockhashes and salt hashes are serialized in the same way,
5717+        # only we lop off the first element and store that in the
5718+        # header.
5719+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
5720+
5721+
5722+    def tearDown(self):
5723+        self.sparent.stopService()
5724+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
5725+
5726+
5727+    def write_enabler(self, we_tag):
5728+        return hashutil.tagged_hash("we_blah", we_tag)
5729+
5730+
5731+    def renew_secret(self, tag):
5732+        return hashutil.tagged_hash("renew_blah", str(tag))
5733+
5734+
5735+    def cancel_secret(self, tag):
5736+        return hashutil.tagged_hash("cancel_blah", str(tag))
5737+
5738+
5739+    def workdir(self, name):
5740+        basedir = os.path.join("storage", "MutableServer", name)
5741+        return basedir
5742+
5743+
5744+    def create(self, name):
5745+        workdir = self.workdir(name)
5746+        ss = StorageServer(workdir, "\x00" * 20)
5747+        ss.setServiceParent(self.sparent)
5748+        return ss
5749+
5750+
5751+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
5752+        # Start with the checkstring
5753+        data = struct.pack(">BQ32s",
5754+                           1,
5755+                           0,
5756+                           self.root_hash)
5757+        self.checkstring = data
5758+        # Next, the encoding parameters
5759+        if tail_segment:
5760+            data += struct.pack(">BBQQ",
5761+                                3,
5762+                                10,
5763+                                6,
5764+                                33)
5765+        elif empty:
5766+            data += struct.pack(">BBQQ",
5767+                                3,
5768+                                10,
5769+                                0,
5770+                                0)
5771+        else:
5772+            data += struct.pack(">BBQQ",
5773+                                3,
5774+                                10,
5775+                                6,
5776+                                36)
5777+        # Now we'll build the offsets.
5778+        sharedata = ""
5779+        if not tail_segment and not empty:
5780+            for i in xrange(6):
5781+                sharedata += self.salt + self.block
5782+        elif tail_segment:
5783+            for i in xrange(5):
5784+                sharedata += self.salt + self.block
5785+            sharedata += self.salt + "a"
5786+
5787+        # The encrypted private key comes after the shares + salts
5788+        offset_size = struct.calcsize(MDMFOFFSETS)
5789+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
5790+        # The blockhashes come after the private key
5791+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
5792+        # The sharehashes come after the block hashes
5793+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
5794+        # The signature comes after the share hash chain
5795+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
5796+        # The verification key comes after the signature
5797+        verification_offset = signature_offset + len(self.signature)
5798+        # The EOF comes after the verification key
5799+        eof_offset = verification_offset + len(self.verification_key)
5800+        data += struct.pack(MDMFOFFSETS,
5801+                            encrypted_private_key_offset,
5802+                            blockhashes_offset,
5803+                            sharehashes_offset,
5804+                            signature_offset,
5805+                            verification_offset,
5806+                            eof_offset)
5807+        self.offsets = {}
5808+        self.offsets['enc_privkey'] = encrypted_private_key_offset
5809+        self.offsets['block_hash_tree'] = blockhashes_offset
5810+        self.offsets['share_hash_chain'] = sharehashes_offset
5811+        self.offsets['signature'] = signature_offset
5812+        self.offsets['verification_key'] = verification_offset
5813+        self.offsets['EOF'] = eof_offset
5814+        # Next, we'll add in the salts and share data,
5815+        data += sharedata
5816+        # the private key,
5817+        data += self.encprivkey
5818+        # the block hash tree,
5819+        data += self.block_hash_tree_s
5820+        # the share hash chain,
5821+        data += self.share_hash_chain_s
5822+        # the signature,
5823+        data += self.signature
5824+        # and the verification key
5825+        data += self.verification_key
5826+        return data
5827+
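+    # The share assembled above is laid out, in order, as: the checkstring
+    # (version, seqnum, root hash), the encoding parameters (k, n, segsize,
+    # datalen), the offsets table, the salted share blocks, the encrypted
+    # private key, the block hash tree, the share hash chain, the
+    # signature, and finally the verification key.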
5828+
5829+    def write_test_share_to_server(self,
5830+                                   storage_index,
5831+                                   tail_segment=False,
5832+                                   empty=False):
5833+        """
5834+        I write some data for the read tests to read to self.ss
5835+
5836+        If tail_segment=True, then I will write a share that has a
5837+        smaller tail segment than other segments.
5838+        """
5839+        write = self.ss.remote_slot_testv_and_readv_and_writev
5840+        data = self.build_test_mdmf_share(tail_segment, empty)
5841+        # Finally, we write the whole thing to the storage server in one
5842+        # pass.
5843+        testvs = [(0, 1, "eq", "")]
5844+        tws = {}
5845+        tws[0] = (testvs, [(0, data)], None)
5846+        readv = [(0, 1)]
5847+        results = write(storage_index, self.secrets, tws, readv)
5848+        self.failUnless(results[0])
5849+
5850+
5851+    def build_test_sdmf_share(self, empty=False):
5852+        if empty:
5853+            sharedata = ""
5854+        else:
5855+            sharedata = self.segment * 6
5856+        self.sharedata = sharedata
5857+        blocksize = len(sharedata) / 3
5858+        block = sharedata[:blocksize]
5859+        self.blockdata = block
5860+        prefix = struct.pack(">BQ32s16s BBQQ",
5861+                             0, # version,
5862+                             0,
5863+                             self.root_hash,
5864+                             self.salt,
5865+                             3,
5866+                             10,
5867+                             len(sharedata),
5868+                             len(sharedata),
5869+                            )
5870+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5871+        signature_offset = post_offset + len(self.verification_key)
5872+        sharehashes_offset = signature_offset + len(self.signature)
5873+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
5874+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
5875+        encprivkey_offset = sharedata_offset + len(block)
5876+        eof_offset = encprivkey_offset + len(self.encprivkey)
5877+        offsets = struct.pack(">LLLLQQ",
5878+                              signature_offset,
5879+                              sharehashes_offset,
5880+                              blockhashes_offset,
5881+                              sharedata_offset,
5882+                              encprivkey_offset,
5883+                              eof_offset)
5884+        final_share = "".join([prefix,
5885+                           offsets,
5886+                           self.verification_key,
5887+                           self.signature,
5888+                           self.share_hash_chain_s,
5889+                           self.block_hash_tree_s,
5890+                           block,
5891+                           self.encprivkey])
5892+        self.offsets = {}
5893+        self.offsets['signature'] = signature_offset
5894+        self.offsets['share_hash_chain'] = sharehashes_offset
5895+        self.offsets['block_hash_tree'] = blockhashes_offset
5896+        self.offsets['share_data'] = sharedata_offset
5897+        self.offsets['enc_privkey'] = encprivkey_offset
5898+        self.offsets['EOF'] = eof_offset
5899+        return final_share
5900+
5901+
5902+    def write_sdmf_share_to_server(self,
5903+                                   storage_index,
5904+                                   empty=False):
5905+        # Some tests need SDMF shares to verify that we can still
5906+        # read them. This method writes one that resembles, but is not, a real SDMF share.
5907+        assert self.rref
5908+        write = self.ss.remote_slot_testv_and_readv_and_writev
5909+        share = self.build_test_sdmf_share(empty)
5910+        testvs = [(0, 1, "eq", "")]
5911+        tws = {}
5912+        tws[0] = (testvs, [(0, share)], None)
5913+        readv = []
5914+        results = write(storage_index, self.secrets, tws, readv)
5915+        self.failUnless(results[0])
5916+
5917+
5918+    def test_read(self):
5919+        self.write_test_share_to_server("si1")
5920+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5921+        # Check that every getter returns what we expect it to.
5922+        d = defer.succeed(None)
5923+        def _check_block_and_salt((block, salt)):
5924+            self.failUnlessEqual(block, self.block)
5925+            self.failUnlessEqual(salt, self.salt)
5926+
5927+        for i in xrange(6):
5928+            d.addCallback(lambda ignored, i=i:
5929+                mr.get_block_and_salt(i))
5930+            d.addCallback(_check_block_and_salt)
5931+
5932+        d.addCallback(lambda ignored:
5933+            mr.get_encprivkey())
5934+        d.addCallback(lambda encprivkey:
5935+            self.failUnlessEqual(self.encprivkey, encprivkey))
5936+
5937+        d.addCallback(lambda ignored:
5938+            mr.get_blockhashes())
5939+        d.addCallback(lambda blockhashes:
5940+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
5941+
5942+        d.addCallback(lambda ignored:
5943+            mr.get_sharehashes())
5944+        d.addCallback(lambda sharehashes:
5945+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
5946+
5947+        d.addCallback(lambda ignored:
5948+            mr.get_signature())
5949+        d.addCallback(lambda signature:
5950+            self.failUnlessEqual(signature, self.signature))
5951+
5952+        d.addCallback(lambda ignored:
5953+            mr.get_verification_key())
5954+        d.addCallback(lambda verification_key:
5955+            self.failUnlessEqual(verification_key, self.verification_key))
5956+
5957+        d.addCallback(lambda ignored:
5958+            mr.get_seqnum())
5959+        d.addCallback(lambda seqnum:
5960+            self.failUnlessEqual(seqnum, 0))
5961+
5962+        d.addCallback(lambda ignored:
5963+            mr.get_root_hash())
5964+        d.addCallback(lambda root_hash:
5965+            self.failUnlessEqual(self.root_hash, root_hash))
5966+
5967+        d.addCallback(lambda ignored:
5968+            mr.get_seqnum())
5969+        d.addCallback(lambda seqnum:
5970+            self.failUnlessEqual(0, seqnum))
5971+
5972+        d.addCallback(lambda ignored:
5973+            mr.get_encoding_parameters())
5974+        def _check_encoding_parameters((k, n, segsize, datalen)):
5975+            self.failUnlessEqual(k, 3)
5976+            self.failUnlessEqual(n, 10)
5977+            self.failUnlessEqual(segsize, 6)
5978+            self.failUnlessEqual(datalen, 36)
5979+        d.addCallback(_check_encoding_parameters)
5980+
5981+        d.addCallback(lambda ignored:
5982+            mr.get_checkstring())
5983+        d.addCallback(lambda checkstring:
5984+            self.failUnlessEqual(checkstring, self.checkstring))
5985+        return d
5986+
5987+
5988+    def test_read_with_different_tail_segment_size(self):
5989+        self.write_test_share_to_server("si1", tail_segment=True)
5990+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5991+        d = mr.get_block_and_salt(5)
5992+        def _check_tail_segment(results):
5993+            block, salt = results
5994+            self.failUnlessEqual(len(block), 1)
5995+            self.failUnlessEqual(block, "a")
5996+        d.addCallback(_check_tail_segment)
5997+        return d
5998+
5999+
6000+    def test_get_block_with_invalid_segnum(self):
6001+        self.write_test_share_to_server("si1")
6002+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6003+        d = defer.succeed(None)
6004+        d.addCallback(lambda ignored:
6005+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6006+                            None,
6007+                            mr.get_block_and_salt, 7))
6008+        return d
6009+
6010+
6011+    def test_get_encoding_parameters_first(self):
6012+        self.write_test_share_to_server("si1")
6013+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6014+        d = mr.get_encoding_parameters()
6015+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6016+            self.failUnlessEqual(k, 3)
6017+            self.failUnlessEqual(n, 10)
6018+            self.failUnlessEqual(segment_size, 6)
6019+            self.failUnlessEqual(datalen, 36)
6020+        d.addCallback(_check_encoding_parameters)
6021+        return d
6022+
6023+
6024+    def test_get_seqnum_first(self):
6025+        self.write_test_share_to_server("si1")
6026+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6027+        d = mr.get_seqnum()
6028+        d.addCallback(lambda seqnum:
6029+            self.failUnlessEqual(seqnum, 0))
6030+        return d
6031+
6032+
6033+    def test_get_root_hash_first(self):
6034+        self.write_test_share_to_server("si1")
6035+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6036+        d = mr.get_root_hash()
6037+        d.addCallback(lambda root_hash:
6038+            self.failUnlessEqual(root_hash, self.root_hash))
6039+        return d
6040+
6041+
6042+    def test_get_checkstring_first(self):
6043+        self.write_test_share_to_server("si1")
6044+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6045+        d = mr.get_checkstring()
6046+        d.addCallback(lambda checkstring:
6047+            self.failUnlessEqual(checkstring, self.checkstring))
6048+        return d
6049+
6050+
6051+    def test_write_read_vectors(self):
6052+        # When processing a write for us, the storage server also returns
6053+        # the result of our read vector. If a write fails because
6054+        # the test vectors failed, this read vector can help us to
6055+        # diagnose the problem. This test ensures that the read vector
6056+        # is working appropriately.
6057+        mw = self._make_new_mw("si1", 0)
6058+        d = defer.succeed(None)
6059+
6060+        # Write one block. This should return a checkstring of nothing,
6061+        # since there is no data there.
6062+        d.addCallback(lambda ignored:
6063+            mw.put_block(self.block, 0, self.salt))
6064+        def _check_first_write(results):
6065+            result, readvs = results
6066+            self.failUnless(result)
6067+            self.failIf(readvs)
6068+        d.addCallback(_check_first_write)
6069+        # Now, there should be a different checkstring returned when
6070+        # we write other shares
6071+        d.addCallback(lambda ignored:
6072+            mw.put_block(self.block, 1, self.salt))
6073+        def _check_next_write(results):
6074+            result, readvs = results
6075+            self.failUnless(result)
6076+            self.expected_checkstring = mw.get_checkstring()
6077+            self.failUnlessIn(0, readvs)
6078+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
6079+        d.addCallback(_check_next_write)
6080+        # Add the other four shares
6081+        for i in xrange(2, 6):
6082+            d.addCallback(lambda ignored, i=i:
6083+                mw.put_block(self.block, i, self.salt))
6084+            d.addCallback(_check_next_write)
6085+        # Add the encrypted private key
6086+        d.addCallback(lambda ignored:
6087+            mw.put_encprivkey(self.encprivkey))
6088+        d.addCallback(_check_next_write)
6089+        # Add the block hash tree and share hash tree
6090+        d.addCallback(lambda ignored:
6091+            mw.put_blockhashes(self.block_hash_tree))
6092+        d.addCallback(_check_next_write)
6093+        d.addCallback(lambda ignored:
6094+            mw.put_sharehashes(self.share_hash_chain))
6095+        d.addCallback(_check_next_write)
6096+        # Add the root hash. This should change the
6097+        # checkstring, but not in a way that we'll be able to see right
6098+        # now, since the read vectors are applied before the write
6099+        # vectors.
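        # (In other words, the readv returned by put_root_hash is assumed
        # to still show the previous checkstring, even though
        # mw.get_checkstring() already reflects the new root hash -- which
        # is exactly what _check_old_testv_after_new_one_is_written
        # verifies below.)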
6100+        d.addCallback(lambda ignored:
6101+            mw.put_root_hash(self.root_hash))
6102+        def _check_old_testv_after_new_one_is_written(results):
6103+            result, readvs = results
6104+            self.failUnless(result)
6105+            self.failUnlessIn(0, readvs)
6106+            self.failUnlessEqual(self.expected_checkstring,
6107+                                 readvs[0][0])
6108+            new_checkstring = mw.get_checkstring()
6109+            self.failIfEqual(new_checkstring,
6110+                             readvs[0][0])
6111+        d.addCallback(_check_old_testv_after_new_one_is_written)
6112+        # Now add the signature. This should succeed, meaning that the
6113+        # data gets written and the read vector matches what the writer
6114+        # thinks should be there.
6115+        d.addCallback(lambda ignored:
6116+            mw.put_signature(self.signature))
6117+        d.addCallback(_check_next_write)
6118+        # The checkstring remains the same for the rest of the process.
6119+        return d
6120+
6121+
6122+    def test_blockhashes_after_share_hash_chain(self):
6123+        mw = self._make_new_mw("si1", 0)
6124+        d = defer.succeed(None)
6125+        # Put everything up to and including the share hash chain
6126+        for i in xrange(6):
6127+            d.addCallback(lambda ignored, i=i:
6128+                mw.put_block(self.block, i, self.salt))
6129+        d.addCallback(lambda ignored:
6130+            mw.put_encprivkey(self.encprivkey))
6131+        d.addCallback(lambda ignored:
6132+            mw.put_blockhashes(self.block_hash_tree))
6133+        d.addCallback(lambda ignored:
6134+            mw.put_sharehashes(self.share_hash_chain))
6135+
6136+        # Now try to put the block hash tree again.
6137+        d.addCallback(lambda ignored:
6138+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
6139+                            None,
6140+                            mw.put_blockhashes, self.block_hash_tree))
6141+        return d
6142+
6143+
6144+    def test_encprivkey_after_blockhashes(self):
6145+        mw = self._make_new_mw("si1", 0)
6146+        d = defer.succeed(None)
6147+        # Put everything up to and including the block hash tree
6148+        for i in xrange(6):
6149+            d.addCallback(lambda ignored, i=i:
6150+                mw.put_block(self.block, i, self.salt))
6151+        d.addCallback(lambda ignored:
6152+            mw.put_encprivkey(self.encprivkey))
6153+        d.addCallback(lambda ignored:
6154+            mw.put_blockhashes(self.block_hash_tree))
6155+        d.addCallback(lambda ignored:
6156+            self.shouldFail(LayoutInvalid, "out of order private key",
6157+                            None,
6158+                            mw.put_encprivkey, self.encprivkey))
6159+        return d
6160+
6161+
6162+    def test_share_hash_chain_after_signature(self):
6163+        mw = self._make_new_mw("si1", 0)
6164+        d = defer.succeed(None)
6165+        # Put everything up to and including the signature
6166+        for i in xrange(6):
6167+            d.addCallback(lambda ignored, i=i:
6168+                mw.put_block(self.block, i, self.salt))
6169+        d.addCallback(lambda ignored:
6170+            mw.put_encprivkey(self.encprivkey))
6171+        d.addCallback(lambda ignored:
6172+            mw.put_blockhashes(self.block_hash_tree))
6173+        d.addCallback(lambda ignored:
6174+            mw.put_sharehashes(self.share_hash_chain))
6175+        d.addCallback(lambda ignored:
6176+            mw.put_root_hash(self.root_hash))
6177+        d.addCallback(lambda ignored:
6178+            mw.put_signature(self.signature))
6179+        # Now try to put the share hash chain again. This should fail
6180+        d.addCallback(lambda ignored:
6181+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6182+                            None,
6183+                            mw.put_sharehashes, self.share_hash_chain))
6184+        return d
6185+
6186+
6187+    def test_signature_after_verification_key(self):
6188+        mw = self._make_new_mw("si1", 0)
6189+        d = defer.succeed(None)
6190+        # Put everything up to and including the verification key.
6191+        for i in xrange(6):
6192+            d.addCallback(lambda ignored, i=i:
6193+                mw.put_block(self.block, i, self.salt))
6194+        d.addCallback(lambda ignored:
6195+            mw.put_encprivkey(self.encprivkey))
6196+        d.addCallback(lambda ignored:
6197+            mw.put_blockhashes(self.block_hash_tree))
6198+        d.addCallback(lambda ignored:
6199+            mw.put_sharehashes(self.share_hash_chain))
6200+        d.addCallback(lambda ignored:
6201+            mw.put_root_hash(self.root_hash))
6202+        d.addCallback(lambda ignored:
6203+            mw.put_signature(self.signature))
6204+        d.addCallback(lambda ignored:
6205+            mw.put_verification_key(self.verification_key))
6206+        # Now try to put the signature again. This should fail
6207+        d.addCallback(lambda ignored:
6208+            self.shouldFail(LayoutInvalid, "signature after verification",
6209+                            None,
6210+                            mw.put_signature, self.signature))
6211+        return d
6212+
6213+
6214+    def test_uncoordinated_write(self):
6215+        # Make two mutable writers, both pointing to the same storage
6216+        # server, both at the same storage index, and try writing to the
6217+        # same share.
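        # (mw2 is assumed to still hold the initial "no share present"
        # checkstring; once mw1 has written, that test vector no longer
        # matches, so mw2's write should be rejected.)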
6218+        mw1 = self._make_new_mw("si1", 0)
6219+        mw2 = self._make_new_mw("si1", 0)
6220+        d = defer.succeed(None)
6221+        def _check_success(results):
6222+            result, readvs = results
6223+            self.failUnless(result)
6224+
6225+        def _check_failure(results):
6226+            result, readvs = results
6227+            self.failIf(result)
6228+
6229+        d.addCallback(lambda ignored:
6230+            mw1.put_block(self.block, 0, self.salt))
6231+        d.addCallback(_check_success)
6232+        d.addCallback(lambda ignored:
6233+            mw2.put_block(self.block, 0, self.salt))
6234+        d.addCallback(_check_failure)
6235+        return d
6236+
6237+
6238+    def test_invalid_salt_size(self):
6239+        # Salts need to be 16 bytes in size. Writes that attempt to
6240+        # write more or less than this should be rejected.
6241+        mw = self._make_new_mw("si1", 0)
6242+        invalid_salt = "a" * 17 # 17 bytes
6243+        another_invalid_salt = "b" * 15 # 15 bytes
6244+        d = defer.succeed(None)
6245+        d.addCallback(lambda ignored:
6246+            self.shouldFail(LayoutInvalid, "salt too big",
6247+                            None,
6248+                            mw.put_block, self.block, 0, invalid_salt))
6249+        d.addCallback(lambda ignored:
6250+            self.shouldFail(LayoutInvalid, "salt too small",
6251+                            None,
6252+                            mw.put_block, self.block, 0,
6253+                            another_invalid_salt))
6254+        return d
6255+
6256+
6257+    def test_write_test_vectors(self):
6258+        # If we give the write proxy a bogus test vector at
6259+        # any point during the process, it should fail to write.
6260+        mw = self._make_new_mw("si1", 0)
6261+        mw.set_checkstring("this is a lie")
6262+        # The initial write should be expecting to find the improbable
6263+        # checkstring above in place; finding nothing, it should fail.
6264+        d = defer.succeed(None)
6265+        d.addCallback(lambda ignored:
6266+            mw.put_block(self.block, 0, self.salt))
6267+        def _check_failure(results):
6268+            result, readv = results
6269+            self.failIf(result)
6270+        d.addCallback(_check_failure)
6271+        # Now set the checkstring to the empty string, which
6272+        # indicates that no share is there.
6273+        d.addCallback(lambda ignored:
6274+            mw.set_checkstring(""))
6275+        d.addCallback(lambda ignored:
6276+            mw.put_block(self.block, 0, self.salt))
6277+        def _check_success(results):
6278+            result, readv = results
6279+            self.failUnless(result)
6280+        d.addCallback(_check_success)
6281+        # Now set the checkstring to something wrong
6282+        d.addCallback(lambda ignored:
6283+            mw.set_checkstring("something wrong"))
6284+        # This should fail to do anything
6285+        d.addCallback(lambda ignored:
6286+            mw.put_block(self.block, 1, self.salt))
6287+        d.addCallback(_check_failure)
6288+        # Now set it back to what it should be.
6289+        d.addCallback(lambda ignored:
6290+            mw.set_checkstring(mw.get_checkstring()))
6291+        for i in xrange(1, 6):
6292+            d.addCallback(lambda ignored, i=i:
6293+                mw.put_block(self.block, i, self.salt))
6294+            d.addCallback(_check_success)
6295+        d.addCallback(lambda ignored:
6296+            mw.put_encprivkey(self.encprivkey))
6297+        d.addCallback(_check_success)
6298+        d.addCallback(lambda ignored:
6299+            mw.put_blockhashes(self.block_hash_tree))
6300+        d.addCallback(_check_success)
6301+        d.addCallback(lambda ignored:
6302+            mw.put_sharehashes(self.share_hash_chain))
6303+        d.addCallback(_check_success)
6304+        def _keep_old_checkstring(ignored):
6305+            self.old_checkstring = mw.get_checkstring()
6306+            mw.set_checkstring("foobarbaz")
6307+        d.addCallback(_keep_old_checkstring)
6308+        d.addCallback(lambda ignored:
6309+            mw.put_root_hash(self.root_hash))
6310+        d.addCallback(_check_failure)
6311+        d.addCallback(lambda ignored:
6312+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
6313+        def _restore_old_checkstring(ignored):
6314+            mw.set_checkstring(self.old_checkstring)
6315+        d.addCallback(_restore_old_checkstring)
6316+        d.addCallback(lambda ignored:
6317+            mw.put_root_hash(self.root_hash))
6318+        d.addCallback(_check_success)
6319+        # The checkstring should have been set appropriately for us on
6320+        # the last write; if we try to change it to something else,
6321+        # that change should cause the next write (the signature) to fail.
6322+        d.addCallback(lambda ignored:
6323+            mw.set_checkstring("something else"))
6324+        d.addCallback(lambda ignored:
6325+            mw.put_signature(self.signature))
6326+        d.addCallback(_check_failure)
6327+        d.addCallback(lambda ignored:
6328+            mw.set_checkstring(mw.get_checkstring()))
6329+        d.addCallback(lambda ignored:
6330+            mw.put_signature(self.signature))
6331+        d.addCallback(_check_success)
6332+        d.addCallback(lambda ignored:
6333+            mw.put_verification_key(self.verification_key))
6334+        d.addCallback(_check_success)
6335+        return d
6336+
6337+
6338+    def test_offset_only_set_on_success(self):
6339+        # The write proxy should be smart enough to detect when a write
6340+        # has failed, and to temper its definition of progress based on
6341+        # that.
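        # The pattern used below: break the checkstring so that a put
        # fails, then verify that the *next* step is still rejected as
        # out-of-order -- i.e. the failed write is assumed not to have
        # advanced the proxy's notion of progress -- and finally restore
        # the checkstring and retry.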
6342+        mw = self._make_new_mw("si1", 0)
6343+        d = defer.succeed(None)
6344+        for i in xrange(1, 6):
6345+            d.addCallback(lambda ignored, i=i:
6346+                mw.put_block(self.block, i, self.salt))
6347+        def _break_checkstring(ignored):
6348+            self._old_checkstring = mw.get_checkstring()
6349+            mw.set_checkstring("foobarbaz")
6350+
6351+        def _fix_checkstring(ignored):
6352+            mw.set_checkstring(self._old_checkstring)
6353+
6354+        d.addCallback(_break_checkstring)
6355+
6356+        # Setting the encrypted private key shouldn't work now, which is
6357+        # to be expected and is tested elsewhere. We also want to make
6358+        # sure that we can't add the block hash tree after a failed
6359+        # write of this sort.
6360+        d.addCallback(lambda ignored:
6361+            mw.put_encprivkey(self.encprivkey))
6362+        d.addCallback(lambda ignored:
6363+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
6364+                            None,
6365+                            mw.put_blockhashes, self.block_hash_tree))
6366+        d.addCallback(_fix_checkstring)
6367+        d.addCallback(lambda ignored:
6368+            mw.put_encprivkey(self.encprivkey))
6369+        d.addCallback(_break_checkstring)
6370+        d.addCallback(lambda ignored:
6371+            mw.put_blockhashes(self.block_hash_tree))
6372+        d.addCallback(lambda ignored:
6373+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
6374+                            None,
6375+                            mw.put_sharehashes, self.share_hash_chain))
6376+        d.addCallback(_fix_checkstring)
6377+        d.addCallback(lambda ignored:
6378+            mw.put_blockhashes(self.block_hash_tree))
6379+        d.addCallback(_break_checkstring)
6380+        d.addCallback(lambda ignored:
6381+            mw.put_sharehashes(self.share_hash_chain))
6382+        d.addCallback(lambda ignored:
6383+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
6384+                            None,
6385+                            mw.put_root_hash, self.root_hash))
6386+        d.addCallback(_fix_checkstring)
6387+        d.addCallback(lambda ignored:
6388+            mw.put_sharehashes(self.share_hash_chain))
6389+        d.addCallback(_break_checkstring)
6390+        d.addCallback(lambda ignored:
6391+            mw.put_root_hash(self.root_hash))
6392+        d.addCallback(lambda ignored:
6393+            self.shouldFail(LayoutInvalid, "out-of-order signature",
6394+                            None,
6395+                            mw.put_signature, self.signature))
6396+        d.addCallback(_fix_checkstring)
6397+        d.addCallback(lambda ignored:
6398+            mw.put_root_hash(self.root_hash))
6399+        d.addCallback(_break_checkstring)
6400+        d.addCallback(lambda ignored:
6401+            mw.put_signature(self.signature))
6402+        d.addCallback(lambda ignored:
6403+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
6404+                            None,
6405+                            mw.put_verification_key,
6406+                            self.verification_key))
6407+        d.addCallback(_fix_checkstring)
6408+        d.addCallback(lambda ignored:
6409+            mw.put_signature(self.signature))
6410+        d.addCallback(_break_checkstring)
6411+        d.addCallback(lambda ignored:
6412+            mw.put_verification_key(self.verification_key))
6413+        d.addCallback(lambda ignored:
6414+            self.shouldFail(LayoutInvalid, "out-of-order finish",
6415+                            None,
6416+                            mw.finish_publishing))
6417+        return d
6418+
6419+
6420+    def serialize_blockhashes(self, blockhashes):
6421+        return "".join(blockhashes)
6422+
6423+
6424+    def serialize_sharehashes(self, sharehashes):
6425+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6426+                        for i in sorted(sharehashes.keys())])
6427+        return ret
6428+
6429+
6430+    def test_write(self):
6431+        # This translates to a file with 6 6-byte segments, and with 2-byte
6432+        # blocks.
6433+        mw = self._make_new_mw("si1", 0)
6434+        mw2 = self._make_new_mw("si1", 1)
6435+        # Test writing some blocks.
6436+        read = self.ss.remote_slot_readv
6437+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6438+        written_block_size = 2 + len(self.salt)
6439+        written_block = self.block + self.salt
6440+        def _check_block_write(i, share):
6441+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6442+                                {share: [written_block]})
6443+        d = defer.succeed(None)
6444+        for i in xrange(6):
6445+            d.addCallback(lambda ignored, i=i:
6446+                mw.put_block(self.block, i, self.salt))
6447+            d.addCallback(lambda ignored, i=i:
6448+                _check_block_write(i, 0))
6449+        # Now try the same thing, but with share 1 instead of share 0.
6450+        for i in xrange(6):
6451+            d.addCallback(lambda ignored, i=i:
6452+                mw2.put_block(self.block, i, self.salt))
6453+            d.addCallback(lambda ignored, i=i:
6454+                _check_block_write(i, 1))
6455+
6456+        # Next, we make a fake encrypted private key, and put it onto the
6457+        # storage server.
6458+        d.addCallback(lambda ignored:
6459+            mw.put_encprivkey(self.encprivkey))
6460+        expected_private_key_offset = expected_sharedata_offset + \
6461+                                      len(written_block) * 6
6462+        self.failUnlessEqual(len(self.encprivkey), 7)
6463+        d.addCallback(lambda ignored:
6464+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6465+                                 {0: [self.encprivkey]}))
6466+
6467+        # Next, we put a fake block hash tree.
6468+        d.addCallback(lambda ignored:
6469+            mw.put_blockhashes(self.block_hash_tree))
6470+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6471+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6472+        d.addCallback(lambda ignored:
6473+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6474+                                 {0: [self.block_hash_tree_s]}))
6475+
6476+        # Next, put a fake share hash chain
6477+        d.addCallback(lambda ignored:
6478+            mw.put_sharehashes(self.share_hash_chain))
6479+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6480+        d.addCallback(lambda ignored:
6481+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6482+                                 {0: [self.share_hash_chain_s]}))
6483+
6484+        # Next, we put what is supposed to be the root hash of
6485+        # our share hash tree, but isn't.
6486+        d.addCallback(lambda ignored:
6487+            mw.put_root_hash(self.root_hash))
6488+        # The root hash gets inserted at byte 9 (its position is in the header,
6489+        # and is fixed).
6490+        def _check(ignored):
6491+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6492+                                 {0: [self.root_hash]})
6493+        d.addCallback(_check)
6494+
6495+        # Next, we put a signature of the header block.
6496+        d.addCallback(lambda ignored:
6497+            mw.put_signature(self.signature))
6498+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6499+        self.failUnlessEqual(len(self.signature), 9)
6500+        d.addCallback(lambda ignored:
6501+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6502+                                 {0: [self.signature]}))
6503+
6504+        # Next, we put the verification key
6505+        d.addCallback(lambda ignored:
6506+            mw.put_verification_key(self.verification_key))
6507+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
6508+        self.failUnlessEqual(len(self.verification_key), 6)
6509+        d.addCallback(lambda ignored:
6510+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6511+                                 {0: [self.verification_key]}))
6512+
6513+        def _check_signable(ignored):
6514+            # Make sure that the signable is what we think it should be.
6515+            signable = mw.get_signable()
6516+            verno, seq, roothash, k, n, segsize, datalen = \
6517+                                            struct.unpack(">BQ32sBBQQ",
6518+                                                          signable)
6519+            self.failUnlessEqual(verno, 1)
6520+            self.failUnlessEqual(seq, 0)
6521+            self.failUnlessEqual(roothash, self.root_hash)
6522+            self.failUnlessEqual(k, 3)
6523+            self.failUnlessEqual(n, 10)
6524+            self.failUnlessEqual(segsize, 6)
6525+            self.failUnlessEqual(datalen, 36)
6526+        d.addCallback(_check_signable)
6527+        # Next, we cause the offset table to be published.
6528+        d.addCallback(lambda ignored:
6529+            mw.finish_publishing())
6530+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6531+
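        # The reads below assume the following fixed MDMF header layout
        # (all integers big-endian):
        #   offset  0: version number (1 byte)
        #   offset  1: sequence number (8 bytes)
        #   offset  9: root hash (32 bytes)
        #   offset 41: k (1 byte), offset 42: N (1 byte)
        #   offset 43: segment size (8 bytes), offset 51: data length (8 bytes)
        #   offset 59: offset table -- six 8-byte offsets (encrypted
        #              private key, block hashes, share hashes, signature,
        #              verification key, EOF)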
6532+        def _check_offsets(ignored):
6533+            # Check the version number to make sure that it is correct.
6534+            expected_version_number = struct.pack(">B", 1)
6535+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6536+                                 {0: [expected_version_number]})
6537+            # Check the sequence number to make sure that it is correct
6538+            expected_sequence_number = struct.pack(">Q", 0)
6539+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6540+                                 {0: [expected_sequence_number]})
6541+            # Check that the encoding parameters (k, N, segment size, data
6542+            # length) are what they should be: 3, 10, 6, and 36 respectively.
6543+            expected_k = struct.pack(">B", 3)
6544+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6545+                                 {0: [expected_k]})
6546+            expected_n = struct.pack(">B", 10)
6547+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6548+                                 {0: [expected_n]})
6549+            expected_segment_size = struct.pack(">Q", 6)
6550+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6551+                                 {0: [expected_segment_size]})
6552+            expected_data_length = struct.pack(">Q", 36)
6553+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6554+                                 {0: [expected_data_length]})
6555+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6556+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6557+                                 {0: [expected_offset]})
6558+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6559+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6560+                                 {0: [expected_offset]})
6561+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6562+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6563+                                 {0: [expected_offset]})
6564+            expected_offset = struct.pack(">Q", expected_signature_offset)
6565+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6566+                                 {0: [expected_offset]})
6567+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6568+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6569+                                 {0: [expected_offset]})
6570+            expected_offset = struct.pack(">Q", expected_eof_offset)
6571+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6572+                                 {0: [expected_offset]})
6573+        d.addCallback(_check_offsets)
6574+        return d
6575+
6576+    def _make_new_mw(self, si, share, datalength=36):
6577+        # This is a file of size 36 bytes. Since it has a segment
6578+        # size of 6, we know that it has 6 byte segments, which will
6579+        # be split into blocks of 2 bytes because our FEC k
6580+        # parameter is 3.
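        # Roughly: 36 bytes of data / 6-byte segments = 6 segments, and
        # each 6-byte segment, erasure-coded with k = 3, yields 2-byte
        # blocks per share.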
6581+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6582+                                6, datalength)
6583+        return mw
6584+
6585+
6586+    def test_write_rejected_with_too_many_blocks(self):
6587+        mw = self._make_new_mw("si0", 0)
6588+
6589+        # Try writing too many blocks. We should not be able to write
6590+        # more than 6 blocks into each share.
6592+        d = defer.succeed(None)
6593+        for i in xrange(6):
6594+            d.addCallback(lambda ignored, i=i:
6595+                mw.put_block(self.block, i, self.salt))
6596+        d.addCallback(lambda ignored:
6597+            self.shouldFail(LayoutInvalid, "too many blocks",
6598+                            None,
6599+                            mw.put_block, self.block, 7, self.salt))
6600+        return d
6601+
6602+
6603+    def test_write_rejected_with_invalid_salt(self):
6604+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6605+        # less should cause an error.
6606+        mw = self._make_new_mw("si1", 0)
6607+        bad_salt = "a" * 17 # 17 bytes
6608+        d = defer.succeed(None)
6609+        d.addCallback(lambda ignored:
6610+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6611+                            None, mw.put_block, self.block, 7, bad_salt))
6612+        return d
6613+
6614+
6615+    def test_write_rejected_with_invalid_root_hash(self):
6616+        # Try writing an invalid root hash. This should be SHA256d, and
6617+        # 32 bytes long as a result.
6618+        mw = self._make_new_mw("si2", 0)
6619+        # 17 bytes != 32 bytes
6620+        invalid_root_hash = "a" * 17
6621+        d = defer.succeed(None)
6622+        # Before this test can work, we need to put some blocks + salts,
6623+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6624+        # failures that match what we are looking for, but are caused by
6625+        # the constraints imposed on operation ordering.
6626+        for i in xrange(6):
6627+            d.addCallback(lambda ignored, i=i:
6628+                mw.put_block(self.block, i, self.salt))
6629+        d.addCallback(lambda ignored:
6630+            mw.put_encprivkey(self.encprivkey))
6631+        d.addCallback(lambda ignored:
6632+            mw.put_blockhashes(self.block_hash_tree))
6633+        d.addCallback(lambda ignored:
6634+            mw.put_sharehashes(self.share_hash_chain))
6635+        d.addCallback(lambda ignored:
6636+            self.shouldFail(LayoutInvalid, "invalid root hash",
6637+                            None, mw.put_root_hash, invalid_root_hash))
6638+        return d
6639+
6640+
6641+    def test_write_rejected_with_invalid_blocksize(self):
6642+        # The blocksize implied by the writer that we get from
6643+        # _make_new_mw is 2 bytes -- any more or any less than this
6644+        # should be cause for failure, unless it is the tail segment, in
6645+        # which case it may not be a failure.
6646+        invalid_block = "a"
6647+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6648+                                             # one byte blocks
6649+        # 1 bytes != 2 bytes
6650+        d = defer.succeed(None)
6651+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6652+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6653+                            None, mw.put_block, invalid_block, 0,
6654+                            self.salt))
6655+        invalid_block = invalid_block * 3
6656+        # 3 bytes != 2 bytes
6657+        d.addCallback(lambda ignored:
6658+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6659+                            None,
6660+                            mw.put_block, invalid_block, 0, self.salt))
6661+        for i in xrange(5):
6662+            d.addCallback(lambda ignored, i=i:
6663+                mw.put_block(self.block, i, self.salt))
6664+        # Try to put an invalid tail segment
6665+        d.addCallback(lambda ignored:
6666+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6667+                            None,
6668+                            mw.put_block, self.block, 5, self.salt))
6669+        valid_block = "a"
6670+        d.addCallback(lambda ignored:
6671+            mw.put_block(valid_block, 5, self.salt))
6672+        return d
6673+
6674+
6675+    def test_write_enforces_order_constraints(self):
6676+        # We require that the MDMFSlotWriteProxy be interacted with in a
6677+        # specific way.
6678+        # That way is:
6679+        # 0: __init__
6680+        # 1: write blocks and salts
6681+        # 2: Write the encrypted private key
6682+        # 3: Write the block hashes
6683+        # 4: Write the share hashes
6684+        # 5: Write the root hash
6685+        # 6: Write the signature and verification key
6686+        # 7: Write the file.
6687+        #
6688+        # Some of these can be performed out-of-order, and some can't.
6689+        # The dependencies that I want to test here are:
6690+        #  - Private key before block hashes
6691+        #  - share hashes and block hashes before root hash
6692+        #  - root hash before signature
6693+        #  - signature before verification key
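        # Put another way, the happy path (exercised in full by
        # test_end_to_end) is assumed to be:
        #   put_block (x6) -> put_encprivkey -> put_blockhashes ->
        #   put_sharehashes -> put_root_hash -> put_signature ->
        #   put_verification_key -> finish_publishing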
6694+        mw0 = self._make_new_mw("si0", 0)
6695+        # Write some shares
6696+        d = defer.succeed(None)
6697+        for i in xrange(6):
6698+            d.addCallback(lambda ignored, i=i:
6699+                mw0.put_block(self.block, i, self.salt))
6700+        # Try to write the block hashes before writing the encrypted
6701+        # private key
6702+        d.addCallback(lambda ignored:
6703+            self.shouldFail(LayoutInvalid, "block hashes before key",
6704+                            None, mw0.put_blockhashes,
6705+                            self.block_hash_tree))
6706+
6707+        # Write the private key.
6708+        d.addCallback(lambda ignored:
6709+            mw0.put_encprivkey(self.encprivkey))
6710+
6711+
6712+        # Try to write the share hash chain without writing the block
6713+        # hash tree
6714+        d.addCallback(lambda ignored:
6715+            self.shouldFail(LayoutInvalid, "share hash chain before "
6716+                                           "block hash tree",
6717+                            None,
6718+                            mw0.put_sharehashes, self.share_hash_chain))
6719+
6720+        # Try to write the root hash without writing either the
6721+        # block hashes or the share hashes
6722+        d.addCallback(lambda ignored:
6723+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6724+                            None,
6725+                            mw0.put_root_hash, self.root_hash))
6726+
6727+        # Now write the block hashes and try again
6728+        d.addCallback(lambda ignored:
6729+            mw0.put_blockhashes(self.block_hash_tree))
6730+
6731+        d.addCallback(lambda ignored:
6732+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6733+                            None, mw0.put_root_hash, self.root_hash))
6734+
6735+        # We haven't yet put the root hash on the share, so we shouldn't
6736+        # be able to sign it.
6737+        d.addCallback(lambda ignored:
6738+            self.shouldFail(LayoutInvalid, "signature before root hash",
6739+                            None, mw0.put_signature, self.signature))
6740+
6741+        d.addCallback(lambda ignored:
6742+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6743+
6744+        # ..and, since that fails, we also shouldn't be able to put the
6745+        # verification key.
6746+        d.addCallback(lambda ignored:
6747+            self.shouldFail(LayoutInvalid, "key before signature",
6748+                            None, mw0.put_verification_key,
6749+                            self.verification_key))
6750+
6751+        # Now write the share hashes.
6752+        d.addCallback(lambda ignored:
6753+            mw0.put_sharehashes(self.share_hash_chain))
6754+        # We should be able to write the root hash now too
6755+        d.addCallback(lambda ignored:
6756+            mw0.put_root_hash(self.root_hash))
6757+
6758+        # We should still be unable to put the verification key
6759+        d.addCallback(lambda ignored:
6760+            self.shouldFail(LayoutInvalid, "key before signature",
6761+                            None, mw0.put_verification_key,
6762+                            self.verification_key))
6763+
6764+        d.addCallback(lambda ignored:
6765+            mw0.put_signature(self.signature))
6766+
6767+        # We shouldn't be able to write the offsets to the remote server
6768+        # until the offset table is finished; IOW, until we have written
6769+        # the verification key.
6770+        d.addCallback(lambda ignored:
6771+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6772+                            None,
6773+                            mw0.finish_publishing))
6774+
6775+        d.addCallback(lambda ignored:
6776+            mw0.put_verification_key(self.verification_key))
6777+        return d
6778+
6779+
6780+    def test_end_to_end(self):
6781+        mw = self._make_new_mw("si1", 0)
6782+        # Write a share using the mutable writer, and make sure that the
6783+        # reader knows how to read everything back to us.
6784+        d = defer.succeed(None)
6785+        for i in xrange(6):
6786+            d.addCallback(lambda ignored, i=i:
6787+                mw.put_block(self.block, i, self.salt))
6788+        d.addCallback(lambda ignored:
6789+            mw.put_encprivkey(self.encprivkey))
6790+        d.addCallback(lambda ignored:
6791+            mw.put_blockhashes(self.block_hash_tree))
6792+        d.addCallback(lambda ignored:
6793+            mw.put_sharehashes(self.share_hash_chain))
6794+        d.addCallback(lambda ignored:
6795+            mw.put_root_hash(self.root_hash))
6796+        d.addCallback(lambda ignored:
6797+            mw.put_signature(self.signature))
6798+        d.addCallback(lambda ignored:
6799+            mw.put_verification_key(self.verification_key))
6800+        d.addCallback(lambda ignored:
6801+            mw.finish_publishing())
6802+
6803+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6804+        def _check_block_and_salt((block, salt)):
6805+            self.failUnlessEqual(block, self.block)
6806+            self.failUnlessEqual(salt, self.salt)
6807+
6808+        for i in xrange(6):
6809+            d.addCallback(lambda ignored, i=i:
6810+                mr.get_block_and_salt(i))
6811+            d.addCallback(_check_block_and_salt)
6812+
6813+        d.addCallback(lambda ignored:
6814+            mr.get_encprivkey())
6815+        d.addCallback(lambda encprivkey:
6816+            self.failUnlessEqual(self.encprivkey, encprivkey))
6817+
6818+        d.addCallback(lambda ignored:
6819+            mr.get_blockhashes())
6820+        d.addCallback(lambda blockhashes:
6821+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6822+
6823+        d.addCallback(lambda ignored:
6824+            mr.get_sharehashes())
6825+        d.addCallback(lambda sharehashes:
6826+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6827+
6828+        d.addCallback(lambda ignored:
6829+            mr.get_signature())
6830+        d.addCallback(lambda signature:
6831+            self.failUnlessEqual(signature, self.signature))
6832+
6833+        d.addCallback(lambda ignored:
6834+            mr.get_verification_key())
6835+        d.addCallback(lambda verification_key:
6836+            self.failUnlessEqual(verification_key, self.verification_key))
6837+
6838+        d.addCallback(lambda ignored:
6839+            mr.get_seqnum())
6840+        d.addCallback(lambda seqnum:
6841+            self.failUnlessEqual(seqnum, 0))
6842+
6843+        d.addCallback(lambda ignored:
6844+            mr.get_root_hash())
6845+        d.addCallback(lambda root_hash:
6846+            self.failUnlessEqual(self.root_hash, root_hash))
6847+
6848+        d.addCallback(lambda ignored:
6849+            mr.get_encoding_parameters())
6850+        def _check_encoding_parameters((k, n, segsize, datalen)):
6851+            self.failUnlessEqual(k, 3)
6852+            self.failUnlessEqual(n, 10)
6853+            self.failUnlessEqual(segsize, 6)
6854+            self.failUnlessEqual(datalen, 36)
6855+        d.addCallback(_check_encoding_parameters)
6856+
6857+        d.addCallback(lambda ignored:
6858+            mr.get_checkstring())
6859+        d.addCallback(lambda checkstring:
6860+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
6861+        return d
6862+
6863+
6864+    def test_is_sdmf(self):
6865+        # The MDMFSlotReadProxy should also know how to read SDMF files,
6866+        # since it will encounter them on the grid. Callers use the
6867+        # is_sdmf method to test this.
6868+        self.write_sdmf_share_to_server("si1")
6869+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6870+        d = mr.is_sdmf()
6871+        d.addCallback(lambda issdmf:
6872+            self.failUnless(issdmf))
6873+        return d
6874+
6875+
6876+    def test_reads_sdmf(self):
6877+        # The slot read proxy should, naturally, know how to tell us
6878+        # about data in the SDMF format
6879+        self.write_sdmf_share_to_server("si1")
6880+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6881+        d = defer.succeed(None)
6882+        d.addCallback(lambda ignored:
6883+            mr.is_sdmf())
6884+        d.addCallback(lambda issdmf:
6885+            self.failUnless(issdmf))
6886+
6887+        # What do we need to read?
6888+        #  - The sharedata
6889+        #  - The salt
6890+        d.addCallback(lambda ignored:
6891+            mr.get_block_and_salt(0))
6892+        def _check_block_and_salt(results):
6893+            block, salt = results
6894+            # Our original file is 36 bytes long, so with k = 3 each
6895+            # share is 12 bytes in size. The share is composed entirely
6896+            # of the letter a; self.block contains 2 a's, so 6 * self.block
6897+            # is what we are looking for.
6898+            self.failUnlessEqual(block, self.block * 6)
6899+            self.failUnlessEqual(salt, self.salt)
6900+        d.addCallback(_check_block_and_salt)
6901+
6902+        #  - The blockhashes
6903+        d.addCallback(lambda ignored:
6904+            mr.get_blockhashes())
6905+        d.addCallback(lambda blockhashes:
6906+            self.failUnlessEqual(self.block_hash_tree,
6907+                                 blockhashes,
6908+                                 blockhashes))
6909+        #  - The sharehashes
6910+        d.addCallback(lambda ignored:
6911+            mr.get_sharehashes())
6912+        d.addCallback(lambda sharehashes:
6913+            self.failUnlessEqual(self.share_hash_chain,
6914+                                 sharehashes))
6915+        #  - The keys
6916+        d.addCallback(lambda ignored:
6917+            mr.get_encprivkey())
6918+        d.addCallback(lambda encprivkey:
6919+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
6920+        d.addCallback(lambda ignored:
6921+            mr.get_verification_key())
6922+        d.addCallback(lambda verification_key:
6923+            self.failUnlessEqual(verification_key,
6924+                                 self.verification_key,
6925+                                 verification_key))
6926+        #  - The signature
6927+        d.addCallback(lambda ignored:
6928+            mr.get_signature())
6929+        d.addCallback(lambda signature:
6930+            self.failUnlessEqual(signature, self.signature, signature))
6931+
6932+        #  - The sequence number
6933+        d.addCallback(lambda ignored:
6934+            mr.get_seqnum())
6935+        d.addCallback(lambda seqnum:
6936+            self.failUnlessEqual(seqnum, 0, seqnum))
6937+
6938+        #  - The root hash
6939+        d.addCallback(lambda ignored:
6940+            mr.get_root_hash())
6941+        d.addCallback(lambda root_hash:
6942+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
6943+        return d
6944+
6945+
6946+    def test_only_reads_one_segment_sdmf(self):
6947+        # SDMF shares have only one segment, so it doesn't make sense to
6948+        # read more segments than that. The reader should know this and
6949+        # complain if we try to do that.
6950+        self.write_sdmf_share_to_server("si1")
6951+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6952+        d = defer.succeed(None)
6953+        d.addCallback(lambda ignored:
6954+            mr.is_sdmf())
6955+        d.addCallback(lambda issdmf:
6956+            self.failUnless(issdmf))
6957+        d.addCallback(lambda ignored:
6958+            self.shouldFail(LayoutInvalid, "test bad segment",
6959+                            None,
6960+                            mr.get_block_and_salt, 1))
6961+        return d
6962+
6963+
6964+    def test_read_with_prefetched_mdmf_data(self):
6965+        # The MDMFSlotReadProxy will prefill certain fields if you pass
6966+        # it data that you have already fetched. This is useful for
6967+        # cases like the Servermap, which prefetches ~2kb of data while
6968+        # finding out which shares are on the remote peer so that it
6969+        # doesn't waste round trips.
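        # In this test, the prefetched bytes are passed as the (assumed
        # optional) fourth constructor argument to MDMFSlotReadProxy (see
        # _make_mr below); the proxy should satisfy requests from that
        # cache where it can, and only fall back to self.rref (counted via
        # read_count) when the cached prefix is too short.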
6970+        mdmf_data = self.build_test_mdmf_share()
6971+        self.write_test_share_to_server("si1")
6972+        def _make_mr(ignored, length):
6973+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
6974+            return mr
6975+
6976+        d = defer.succeed(None)
6977+        # This should be enough to fill in both the encoding parameters
6978+        # and the table of offsets, which will complete the version
6979+        # information tuple.
6980+        d.addCallback(_make_mr, 107)
6981+        d.addCallback(lambda mr:
6982+            mr.get_verinfo())
6983+        def _check_verinfo(verinfo):
6984+            self.failUnless(verinfo)
6985+            self.failUnlessEqual(len(verinfo), 9)
6986+            (seqnum,
6987+             root_hash,
6988+             salt_hash,
6989+             segsize,
6990+             datalen,
6991+             k,
6992+             n,
6993+             prefix,
6994+             offsets) = verinfo
6995+            self.failUnlessEqual(seqnum, 0)
6996+            self.failUnlessEqual(root_hash, self.root_hash)
6997+            self.failUnlessEqual(segsize, 6)
6998+            self.failUnlessEqual(datalen, 36)
6999+            self.failUnlessEqual(k, 3)
7000+            self.failUnlessEqual(n, 10)
7001+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7002+                                          1,
7003+                                          seqnum,
7004+                                          root_hash,
7005+                                          k,
7006+                                          n,
7007+                                          segsize,
7008+                                          datalen)
7009+            self.failUnlessEqual(expected_prefix, prefix)
7010+            self.failUnlessEqual(self.rref.read_count, 0)
7011+        d.addCallback(_check_verinfo)
7012+        # This is not enough data to read a block and a salt, so the
7013+        # wrapper should have to fetch them from the remote server.
7014+        d.addCallback(_make_mr, 107)
7015+        d.addCallback(lambda mr:
7016+            mr.get_block_and_salt(0))
7017+        def _check_block_and_salt((block, salt)):
7018+            self.failUnlessEqual(block, self.block)
7019+            self.failUnlessEqual(salt, self.salt)
7020+            self.failUnlessEqual(self.rref.read_count, 1)
7021+        # This should be enough data to read one block.
7022+        d.addCallback(_make_mr, 249)
7023+        d.addCallback(lambda mr:
7024+            mr.get_block_and_salt(0))
7025+        d.addCallback(_check_block_and_salt)
7026+        return d
7027+
7028+
7029+    def test_read_with_prefetched_sdmf_data(self):
7030+        sdmf_data = self.build_test_sdmf_share()
7031+        self.write_sdmf_share_to_server("si1")
7032+        def _make_mr(ignored, length):
7033+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7034+            return mr
7035+
7036+        d = defer.succeed(None)
7037+        # This should be enough to get us the encoding parameters,
7038+        # offset table, and everything else we need to build a verinfo
7039+        # string.
7040+        d.addCallback(_make_mr, 107)
7041+        d.addCallback(lambda mr:
7042+            mr.get_verinfo())
7043+        def _check_verinfo(verinfo):
7044+            self.failUnless(verinfo)
7045+            self.failUnlessEqual(len(verinfo), 9)
7046+            (seqnum,
7047+             root_hash,
7048+             salt,
7049+             segsize,
7050+             datalen,
7051+             k,
7052+             n,
7053+             prefix,
7054+             offsets) = verinfo
7055+            self.failUnlessEqual(seqnum, 0)
7056+            self.failUnlessEqual(root_hash, self.root_hash)
7057+            self.failUnlessEqual(salt, self.salt)
7058+            self.failUnlessEqual(segsize, 36)
7059+            self.failUnlessEqual(datalen, 36)
7060+            self.failUnlessEqual(k, 3)
7061+            self.failUnlessEqual(n, 10)
7062+            expected_prefix = struct.pack(SIGNED_PREFIX,
7063+                                          0,
7064+                                          seqnum,
7065+                                          root_hash,
7066+                                          salt,
7067+                                          k,
7068+                                          n,
7069+                                          segsize,
7070+                                          datalen)
7071+            self.failUnlessEqual(expected_prefix, prefix)
7072+            self.failUnlessEqual(self.rref.read_count, 0)
7073+        d.addCallback(_check_verinfo)
7074+        # This shouldn't be enough to read any share data.
7075+        d.addCallback(_make_mr, 107)
7076+        d.addCallback(lambda mr:
7077+            mr.get_block_and_salt(0))
7078+        def _check_block_and_salt((block, salt)):
7079+            self.failUnlessEqual(block, self.block * 6)
7080+            self.failUnlessEqual(salt, self.salt)
7081+            # TODO: Fix the read routine so that it reads only the data
7082+            #       that it has cached if it can't read all of it.
7083+            self.failUnlessEqual(self.rref.read_count, 2)
7084+
7085+        # This should be enough to read share data.
7086+        d.addCallback(_make_mr, self.offsets['share_data'])
7087+        d.addCallback(lambda mr:
7088+            mr.get_block_and_salt(0))
7089+        d.addCallback(_check_block_and_salt)
7090+        return d
7091+
7092+
7093+    def test_read_with_empty_mdmf_file(self):
7094+        # Some tests upload a file with no contents to test things
7095+        # unrelated to the actual handling of the content of the file.
7096+        # The reader should behave intelligently in these cases.
7097+        self.write_test_share_to_server("si1", empty=True)
7098+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7099+        # We should be able to get the encoding parameters, and they
7100+        # should be correct.
7101+        d = defer.succeed(None)
7102+        d.addCallback(lambda ignored:
7103+            mr.get_encoding_parameters())
7104+        def _check_encoding_parameters(params):
7105+            self.failUnlessEqual(len(params), 4)
7106+            k, n, segsize, datalen = params
7107+            self.failUnlessEqual(k, 3)
7108+            self.failUnlessEqual(n, 10)
7109+            self.failUnlessEqual(segsize, 0)
7110+            self.failUnlessEqual(datalen, 0)
7111+        d.addCallback(_check_encoding_parameters)
7112+
7113+        # We should not be able to fetch a block, since there are no
7114+        # blocks to fetch
7115+        d.addCallback(lambda ignored:
7116+            self.shouldFail(LayoutInvalid, "get block on empty file",
7117+                            None,
7118+                            mr.get_block_and_salt, 0))
7119+        return d
7120+
7121+
7122+    def test_read_with_empty_sdmf_file(self):
7123+        self.write_sdmf_share_to_server("si1", empty=True)
7124+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7125+        # We should be able to get the encoding parameters, and they
7126+        # should be correct
7127+        d = defer.succeed(None)
7128+        d.addCallback(lambda ignored:
7129+            mr.get_encoding_parameters())
7130+        def _check_encoding_parameters(params):
7131+            self.failUnlessEqual(len(params), 4)
7132+            k, n, segsize, datalen = params
7133+            self.failUnlessEqual(k, 3)
7134+            self.failUnlessEqual(n, 10)
7135+            self.failUnlessEqual(segsize, 0)
7136+            self.failUnlessEqual(datalen, 0)
7137+        d.addCallback(_check_encoding_parameters)
7138+
7139+        # It does not make sense to get a block in this format, so we
7140+        # should not be able to.
7141+        d.addCallback(lambda ignored:
7142+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7143+                            None,
7144+                            mr.get_block_and_salt, 0))
7145+        return d
7146+
7147+
7148+    def test_verinfo_with_sdmf_file(self):
7149+        self.write_sdmf_share_to_server("si1")
7150+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7151+        # We should be able to get the version information.
7152+        d = defer.succeed(None)
7153+        d.addCallback(lambda ignored:
7154+            mr.get_verinfo())
7155+        def _check_verinfo(verinfo):
7156+            self.failUnless(verinfo)
7157+            self.failUnlessEqual(len(verinfo), 9)
7158+            (seqnum,
7159+             root_hash,
7160+             salt,
7161+             segsize,
7162+             datalen,
7163+             k,
7164+             n,
7165+             prefix,
7166+             offsets) = verinfo
7167+            self.failUnlessEqual(seqnum, 0)
7168+            self.failUnlessEqual(root_hash, self.root_hash)
7169+            self.failUnlessEqual(salt, self.salt)
7170+            self.failUnlessEqual(segsize, 36)
7171+            self.failUnlessEqual(datalen, 36)
7172+            self.failUnlessEqual(k, 3)
7173+            self.failUnlessEqual(n, 10)
7174+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7175+                                          0,
7176+                                          seqnum,
7177+                                          root_hash,
7178+                                          salt,
7179+                                          k,
7180+                                          n,
7181+                                          segsize,
7182+                                          datalen)
7183+            self.failUnlessEqual(prefix, expected_prefix)
7184+            self.failUnlessEqual(offsets, self.offsets)
7185+        d.addCallback(_check_verinfo)
7186+        return d
7187+
7188+
7189+    def test_verinfo_with_mdmf_file(self):
7190+        self.write_test_share_to_server("si1")
7191+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7192+        d = defer.succeed(None)
7193+        d.addCallback(lambda ignored:
7194+            mr.get_verinfo())
7195+        def _check_verinfo(verinfo):
7196+            self.failUnless(verinfo)
7197+            self.failUnlessEqual(len(verinfo), 9)
7198+            (seqnum,
7199+             root_hash,
7200+             IV,
7201+             segsize,
7202+             datalen,
7203+             k,
7204+             n,
7205+             prefix,
7206+             offsets) = verinfo
7207+            self.failUnlessEqual(seqnum, 0)
7208+            self.failUnlessEqual(root_hash, self.root_hash)
7209+            self.failIf(IV)
7210+            self.failUnlessEqual(segsize, 6)
7211+            self.failUnlessEqual(datalen, 36)
7212+            self.failUnlessEqual(k, 3)
7213+            self.failUnlessEqual(n, 10)
7214+            expected_prefix = struct.pack(">BQ32s BBQQ",
7215+                                          1,
7216+                                          seqnum,
7217+                                          root_hash,
7218+                                          k,
7219+                                          n,
7220+                                          segsize,
7221+                                          datalen)
7222+            self.failUnlessEqual(prefix, expected_prefix)
7223+            self.failUnlessEqual(offsets, self.offsets)
7224+        d.addCallback(_check_verinfo)
7225+        return d
7226+
7227+
7228+    def test_reader_queue(self):
7229+        self.write_test_share_to_server('si1')
7230+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7231+        d1 = mr.get_block_and_salt(0, queue=True)
7232+        d2 = mr.get_blockhashes(queue=True)
7233+        d3 = mr.get_sharehashes(queue=True)
7234+        d4 = mr.get_signature(queue=True)
7235+        d5 = mr.get_verification_key(queue=True)
7236+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7237+        mr.flush()
7238+        def _print(results):
7239+            self.failUnlessEqual(len(results), 5)
7240+            # We have one read for version information and offsets, and
7241+            # one for everything else.
7242+            self.failUnlessEqual(self.rref.read_count, 2)
7243+            block, salt = results[0][1] # results[0] is a (success, value)
7244+                                           # pair; its first element says
7245+                                           # whether the operation worked.
7246+            self.failUnlessEqual(self.block, block)
7247+            self.failUnlessEqual(self.salt, salt)
7248+
7249+            blockhashes = results[1][1]
7250+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7251+
7252+            sharehashes = results[2][1]
7253+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7254+
7255+            signature = results[3][1]
7256+            self.failUnlessEqual(self.signature, signature)
7257+
7258+            verification_key = results[4][1]
7259+            self.failUnlessEqual(self.verification_key, verification_key)
7260+        dl.addCallback(_print)
7261+        return dl
7262+
7263+
7264+    def test_sdmf_writer(self):
7265+        # Go through the motions of writing an SDMF share to the storage
7266+        # server. Then read the storage server to see that the share got
7267+        # written in the way that we think it should have.
7268+
7269+        # We do this first so that the necessary instance variables get
7270+        # set the way we want them for the tests below.
7271+        data = self.build_test_sdmf_share()
7272+        sdmfr = SDMFSlotWriteProxy(0,
7273+                                   self.rref,
7274+                                   "si1",
7275+                                   self.secrets,
7276+                                   0, 3, 10, 36, 36)
7277+        # Put the block and salt.
7278+        sdmfr.put_block(self.blockdata, 0, self.salt)
7279+
7280+        # Put the encprivkey
7281+        sdmfr.put_encprivkey(self.encprivkey)
7282+
7283+        # Put the block and share hash chains
7284+        sdmfr.put_blockhashes(self.block_hash_tree)
7285+        sdmfr.put_sharehashes(self.share_hash_chain)
7286+        sdmfr.put_root_hash(self.root_hash)
7287+
7288+        # Put the signature
7289+        sdmfr.put_signature(self.signature)
7290+
7291+        # Put the verification key
7292+        sdmfr.put_verification_key(self.verification_key)
7293+
7294+        # Now check to make sure that nothing has been written yet.
7295+        self.failUnlessEqual(self.rref.write_count, 0)
7296+
7297+        # Now finish publishing
7298+        d = sdmfr.finish_publishing()
7299+        def _then(ignored):
7300+            self.failUnlessEqual(self.rref.write_count, 1)
7301+            read = self.ss.remote_slot_readv
7302+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7303+                                 {0: [data]})
7304+        d.addCallback(_then)
7305+        return d
7306+
7307+
7308+    def test_sdmf_writer_preexisting_share(self):
7309+        data = self.build_test_sdmf_share()
7310+        self.write_sdmf_share_to_server("si1")
7311+
7312+        # Now there is a share on the storage server. To successfully
7313+        # write, we need to set the checkstring correctly. When we
7314+        # don't, no write should occur.
7315+        sdmfw = SDMFSlotWriteProxy(0,
7316+                                   self.rref,
7317+                                   "si1",
7318+                                   self.secrets,
7319+                                   1, 3, 10, 36, 36)
7320+        sdmfw.put_block(self.blockdata, 0, self.salt)
7321+
7322+        # Put the encprivkey
7323+        sdmfw.put_encprivkey(self.encprivkey)
7324+
7325+        # Put the block and share hash chains
7326+        sdmfw.put_blockhashes(self.block_hash_tree)
7327+        sdmfw.put_sharehashes(self.share_hash_chain)
7328+
7329+        # Put the root hash
7330+        sdmfw.put_root_hash(self.root_hash)
7331+
7332+        # Put the signature
7333+        sdmfw.put_signature(self.signature)
7334+
7335+        # Put the verification key
7336+        sdmfw.put_verification_key(self.verification_key)
7337+
7338+        # We shouldn't have a checkstring yet
7339+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7340+
7341+        d = sdmfw.finish_publishing()
7342+        def _then(results):
7343+            self.failIf(results[0])
7344+            # this is the correct checkstring
7345+            self._expected_checkstring = results[1][0][0]
7346+            return self._expected_checkstring
7347+
7348+        d.addCallback(_then)
7349+        d.addCallback(sdmfw.set_checkstring)
7350+        d.addCallback(lambda ignored:
7351+            sdmfw.get_checkstring())
7352+        d.addCallback(lambda checkstring:
7353+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7354+        d.addCallback(lambda ignored:
7355+            sdmfw.finish_publishing())
7356+        def _then_again(results):
7357+            self.failUnless(results[0])
7358+            read = self.ss.remote_slot_readv
7359+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7360+                                 {0: [struct.pack(">Q", 1)]})
7361+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7362+                                 {0: [data[9:]]})
7363+        d.addCallback(_then_again)
7364+        return d
7365+
7366+
7367 class Stats(unittest.TestCase):
7368 
7369     def setUp(self):
7370}
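
The two SDMF writer tests above walk through the whole write-proxy protocol: every put_* call only stages data on the proxy (the server's write_count stays at 0), a single finish_publishing() performs the actual test-and-set write, and the checkstring is what keeps a pre-existing share from being silently overwritten. The following is a minimal sketch of that call sequence, not part of the patch; it assumes the SDMFSlotWriteProxy constructor arguments used in the tests, and publish_sdmf_share / share_pieces are hypothetical names used only for illustration.

    from allmydata.mutable.layout import SDMFSlotWriteProxy

    def publish_sdmf_share(rref, storage_index, secrets, share_pieces):
        # share_pieces is assumed to hold the same values the tests build
        # in build_test_sdmf_share(): block data, salt, encrypted private
        # key, hash trees, root hash, signature, and verification key.
        writer = SDMFSlotWriteProxy(0,             # shnum
                                    rref,          # remote storage server
                                    storage_index,
                                    secrets,       # (write enabler, renew, cancel)
                                    0,             # seqnum
                                    3, 10,         # k, N
                                    36, 36)        # segment size, data length
        # Each put_* call below only stages data on the proxy; nothing is
        # sent to the server until finish_publishing().
        writer.put_block(share_pieces['block'], 0, share_pieces['salt'])
        writer.put_encprivkey(share_pieces['encprivkey'])
        writer.put_blockhashes(share_pieces['block_hash_tree'])
        writer.put_sharehashes(share_pieces['share_hash_chain'])
        writer.put_root_hash(share_pieces['root_hash'])
        writer.put_signature(share_pieces['signature'])
        writer.put_verification_key(share_pieces['verification_key'])
        # One test-and-set write. As the preexisting-share test shows, this
        # fails (and returns the server's checkstring) if a share is already
        # there and no matching checkstring was set with set_checkstring().
        return writer.finish_publishing()
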
7371[mutable/publish.py: cleanup + simplification
7372Kevan Carstensen <kevan@isnotajoke.com>**20100702225554
7373 Ignore-this: 36a58424ceceffb1ddc55cc5934399e2
7374] {
7375hunk ./src/allmydata/mutable/publish.py 19
7376      UncoordinatedWriteError, NotEnoughServersError
7377 from allmydata.mutable.servermap import ServerMap
7378 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
7379-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
7380+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
7381+     SDMFSlotWriteProxy
7382 
7383 KiB = 1024
7384 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
7385hunk ./src/allmydata/mutable/publish.py 24
7386+PUSHING_BLOCKS_STATE = 0
7387+PUSHING_EVERYTHING_ELSE_STATE = 1
7388+DONE_STATE = 2
7389 
7390 class PublishStatus:
7391     implements(IPublishStatus)
7392hunk ./src/allmydata/mutable/publish.py 229
7393 
7394         self.bad_share_checkstrings = {}
7395 
7396+        # This is set at the last step of the publishing process.
7397+        self.versioninfo = ""
7398+
7399         # we use the servermap to populate the initial goal: this way we will
7400         # try to update each existing share in place.
7401         for (peerid, shnum) in self._servermap.servermap:
7402hunk ./src/allmydata/mutable/publish.py 245
7403             self.bad_share_checkstrings[key] = old_checkstring
7404             self.connections[peerid] = self._servermap.connections[peerid]
7405 
7406-        # Now, the process dovetails -- if this is an SDMF file, we need
7407-        # to write an SDMF file. Otherwise, we need to write an MDMF
7408-        # file.
7409-        if self._version == MDMF_VERSION:
7410-            return self._publish_mdmf()
7411-        else:
7412-            return self._publish_sdmf()
7413-        #return self.done_deferred
7414-
7415-    def _publish_mdmf(self):
7416-        # Next, we find homes for all of the shares that we don't have
7417-        # homes for yet.
7418         # TODO: Make this part do peer selection.
7419         self.update_goal()
7420         self.writers = {}
7421hunk ./src/allmydata/mutable/publish.py 248
7422-        # For each (peerid, shnum) in self.goal, we make an
7423-        # MDMFSlotWriteProxy for that peer. We'll use this to write
7424+        if self._version == MDMF_VERSION:
7425+            writer_class = MDMFSlotWriteProxy
7426+        else:
7427+            writer_class = SDMFSlotWriteProxy
7428+
7429+        # For each (peerid, shnum) in self.goal, we make a
7430+        # write proxy for that peer. We'll use this to write
7431         # shares to the peer.
7432         for key in self.goal:
7433             peerid, shnum = key
7434hunk ./src/allmydata/mutable/publish.py 263
7435             cancel_secret = self._node.get_cancel_secret(peerid)
7436             secrets = (write_enabler, renew_secret, cancel_secret)
7437 
7438-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
7439-                                                      self.connections[peerid],
7440-                                                      self._storage_index,
7441-                                                      secrets,
7442-                                                      self._new_seqnum,
7443-                                                      self.required_shares,
7444-                                                      self.total_shares,
7445-                                                      self.segment_size,
7446-                                                      len(self.newdata))
7447+            self.writers[shnum] =  writer_class(shnum,
7448+                                                self.connections[peerid],
7449+                                                self._storage_index,
7450+                                                secrets,
7451+                                                self._new_seqnum,
7452+                                                self.required_shares,
7453+                                                self.total_shares,
7454+                                                self.segment_size,
7455+                                                len(self.newdata))
7456+            self.writers[shnum].peerid = peerid
7457             if (peerid, shnum) in self._servermap.servermap:
7458                 old_versionid, old_timestamp = self._servermap.servermap[key]
7459                 (old_seqnum, old_root_hash, old_salt, old_segsize,
7460hunk ./src/allmydata/mutable/publish.py 278
7461                  old_datalength, old_k, old_N, old_prefix,
7462                  old_offsets_tuple) = old_versionid
7463-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
7464+                self.writers[shnum].set_checkstring(old_seqnum,
7465+                                                    old_root_hash,
7466+                                                    old_salt)
7467+            elif (peerid, shnum) in self.bad_share_checkstrings:
7468+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
7469+                self.writers[shnum].set_checkstring(old_checkstring)
7470+
7471+        # Our remote shares will not have a complete checkstring until
7472+        # after we are done writing share data and have started to write
7473+        # blocks. In the meantime, we need to know what to look for when
7474+        # writing, so that we can detect UncoordinatedWriteErrors.
7475+        self._checkstring = self.writers.values()[0].get_checkstring()
7476 
7477         # Now, we start pushing shares.
7478         self._status.timings["setup"] = time.time() - self._started
7479hunk ./src/allmydata/mutable/publish.py 293
7480-        def _start_pushing(res):
7481-            self._started_pushing = time.time()
7482-            return res
7483-
7484         # First, we encrypt, encode, and publish the shares that we need
7485         # to encrypt, encode, and publish.
7486 
7487hunk ./src/allmydata/mutable/publish.py 306
7488 
7489         d = defer.succeed(None)
7490         self.log("Starting push")
7491-        for i in xrange(self.num_segments - 1):
7492-            d.addCallback(lambda ignored, i=i:
7493-                self.push_segment(i))
7494-            d.addCallback(self._turn_barrier)
7495-        # We have at least one segment, so we will have a tail segment
7496-        if self.num_segments > 0:
7497-            d.addCallback(lambda ignored:
7498-                self.push_tail_segment())
7499-
7500-        d.addCallback(lambda ignored:
7501-            self.push_encprivkey())
7502-        d.addCallback(lambda ignored:
7503-            self.push_blockhashes())
7504-        d.addCallback(lambda ignored:
7505-            self.push_sharehashes())
7506-        d.addCallback(lambda ignored:
7507-            self.push_toplevel_hashes_and_signature())
7508-        d.addCallback(lambda ignored:
7509-            self.finish_publishing())
7510-        return d
7511-
7512-
7513-    def _publish_sdmf(self):
7514-        self._status.timings["setup"] = time.time() - self._started
7515-        self.salt = os.urandom(16)
7516 
7517hunk ./src/allmydata/mutable/publish.py 307
7518-        d = self._encrypt_and_encode()
7519-        d.addCallback(self._generate_shares)
7520-        def _start_pushing(res):
7521-            self._started_pushing = time.time()
7522-            return res
7523-        d.addCallback(_start_pushing)
7524-        d.addCallback(self.loop) # trigger delivery
7525-        d.addErrback(self._fatal_error)
7526+        self._state = PUSHING_BLOCKS_STATE
7527+        self._push()
7528 
7529         return self.done_deferred
7530 
7531hunk ./src/allmydata/mutable/publish.py 327
7532                                                   segment_size)
7533         else:
7534             self.num_segments = 0
7535+
7536+        self.log("building encoding parameters for file")
7537+        self.log("got segsize %d" % self.segment_size)
7538+        self.log("got %d segments" % self.num_segments)
7539+
7540         if self._version == SDMF_VERSION:
7541             assert self.num_segments in (0, 1) # SDMF
7542hunk ./src/allmydata/mutable/publish.py 334
7543-            return
7544         # calculate the tail segment size.
7545hunk ./src/allmydata/mutable/publish.py 335
7546-        self.tail_segment_size = len(self.newdata) % segment_size
7547 
7548hunk ./src/allmydata/mutable/publish.py 336
7549-        if self.tail_segment_size == 0:
7550+        if segment_size and self.newdata:
7551+            self.tail_segment_size = len(self.newdata) % segment_size
7552+        else:
7553+            self.tail_segment_size = 0
7554+
7555+        if self.tail_segment_size == 0 and segment_size:
7556             # The tail segment is the same size as the other segments.
7557             self.tail_segment_size = segment_size
7558 
7559hunk ./src/allmydata/mutable/publish.py 345
7560-        # We'll make an encoder ahead-of-time for the normal-sized
7561-        # segments (defined as any segment of segment_size size.
7562-        # (the part of the code that puts the tail segment will make its
7563-        #  own encoder for that part)
7564+        # Make FEC encoders
7565         fec = codec.CRSEncoder()
7566         fec.set_params(self.segment_size,
7567                        self.required_shares, self.total_shares)
7568hunk ./src/allmydata/mutable/publish.py 352
7569         self.piece_size = fec.get_block_size()
7570         self.fec = fec
7571 
7572+        if self.tail_segment_size == self.segment_size:
7573+            self.tail_fec = self.fec
7574+        else:
7575+            tail_fec = codec.CRSEncoder()
7576+            tail_fec.set_params(self.tail_segment_size,
7577+                                self.required_shares,
7578+                                self.total_shares)
7579+            self.tail_fec = tail_fec
7580+
7581+        self._current_segment = 0
7582+
7583+
7584+    def _push(self, ignored=None):
7585+        """
7586+        I manage state transitions. In particular, I check that we still
7587+        have enough writers to complete the upload successfully before
7588+        dispatching to the next step.
7589+        """
7590+        # Can we still successfully publish this file?
7591+        # TODO: Keep track of outstanding queries before aborting the
7592+        #       process.
7593+        if len(self.writers) <= self.required_shares or self.surprised:
7594+            return self._failure()
7595+
7596+        # Figure out what we need to do next. Each of these needs to
7597+        # return a deferred so that we don't block execution when this
7598+        # is first called in the upload method.
7599+        if self._state == PUSHING_BLOCKS_STATE:
7600+            return self.push_segment(self._current_segment)
7601+
7602+        # XXX: Do we want more granularity in states? Is that useful at
7603+        #      all?
7604+        #      Yes -- quicker reaction to UCW.
7605+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
7606+            return self.push_everything_else()
7607+
7608+        # If we make it to this point, we were successful in placing the
7609+        # file.
7610+        return self._done(None)
7611+
7612 
7613     def push_segment(self, segnum):
7614hunk ./src/allmydata/mutable/publish.py 394
7615+        if self.num_segments == 0 and self._version == SDMF_VERSION:
7616+            self._add_dummy_salts()
7617+
7618+        if segnum == self.num_segments:
7619+            # We don't have any more segments to push.
7620+            self._state = PUSHING_EVERYTHING_ELSE_STATE
7621+            return self._push()
7622+
7623+        d = self._encode_segment(segnum)
7624+        d.addCallback(self._push_segment, segnum)
7625+        def _increment_segnum(ign):
7626+            self._current_segment += 1
7627+        # XXX: I don't think we need to use addBoth here -- any errbacks
7628+        # should be handled within push_segment.
7629+        d.addBoth(_increment_segnum)
7630+        d.addBoth(self._push)
7631+
7632+
7633+    def _add_dummy_salts(self):
7634+        """
7635+        SDMF files need a salt even if they're empty, or the signature
7636+        won't make sense. This method adds a dummy salt to each of our
7637+        SDMF writers so that they can write the signature later.
7638+        """
7639+        salt = os.urandom(16)
7640+        assert self._version == SDMF_VERSION
7641+
7642+        for writer in self.writers.itervalues():
7643+            writer.put_salt(salt)
7644+
7645+
7646+    def _encode_segment(self, segnum):
7647+        """
7648+        I encrypt and encode the segment segnum.
7649+        """
7650         started = time.time()
7651hunk ./src/allmydata/mutable/publish.py 430
7652-        segsize = self.segment_size
7653+
7654+        if segnum + 1 == self.num_segments:
7655+            segsize = self.tail_segment_size
7656+        else:
7657+            segsize = self.segment_size
7658+
7659+
7660+        offset = self.segment_size * segnum
7661+        length = segsize + offset
7662         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
7663hunk ./src/allmydata/mutable/publish.py 440
7664-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
7665+        data = self.newdata[offset:length]
7666         assert len(data) == segsize
7667 
7668         salt = os.urandom(16)
7669hunk ./src/allmydata/mutable/publish.py 455
7670         started = now
7671 
7672         # now apply FEC
7673+        if segnum + 1 == self.num_segments:
7674+            fec = self.tail_fec
7675+        else:
7676+            fec = self.fec
7677 
7678         self._status.set_status("Encoding")
7679         crypttext_pieces = [None] * self.required_shares
7680hunk ./src/allmydata/mutable/publish.py 462
7681-        piece_size = self.piece_size
7682+        piece_size = fec.get_block_size()
7683         for i in range(len(crypttext_pieces)):
7684             offset = i * piece_size
7685             piece = crypttext[offset:offset+piece_size]
7686hunk ./src/allmydata/mutable/publish.py 469
7687             piece = piece + "\x00"*(piece_size - len(piece)) # padding
7688             crypttext_pieces[i] = piece
7689             assert len(piece) == piece_size
7690-        d = self.fec.encode(crypttext_pieces)
7691+        d = fec.encode(crypttext_pieces)
7692         def _done_encoding(res):
7693             elapsed = time.time() - started
7694             self._status.timings["encode"] = elapsed
7695hunk ./src/allmydata/mutable/publish.py 473
7696-            return res
7697+            return (res, salt)
7698         d.addCallback(_done_encoding)
7699hunk ./src/allmydata/mutable/publish.py 475
7700-
7701-        def _push_shares_and_salt(results):
7702-            shares, shareids = results
7703-            dl = []
7704-            for i in xrange(len(shares)):
7705-                sharedata = shares[i]
7706-                shareid = shareids[i]
7707-                block_hash = hashutil.block_hash(salt + sharedata)
7708-                self.blockhashes[shareid].append(block_hash)
7709-
7710-                # find the writer for this share
7711-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
7712-                dl.append(d)
7713-            # TODO: Naturally, we need to check on the results of these.
7714-            return defer.DeferredList(dl)
7715-        d.addCallback(_push_shares_and_salt)
7716         return d
7717 
7718 
7719hunk ./src/allmydata/mutable/publish.py 478
7720-    def push_tail_segment(self):
7721-        # This is essentially the same as push_segment, except that we
7722-        # don't use the cached encoder that we use elsewhere.
7723-        self.log("Pushing tail segment")
7724+    def _push_segment(self, encoded_and_salt, segnum):
7725+        """
7726+        I push (data, salt) as segment number segnum.
7727+        """
7728+        results, salt = encoded_and_salt
7729+        shares, shareids = results
7730         started = time.time()
7731hunk ./src/allmydata/mutable/publish.py 485
7732-        segsize = self.segment_size
7733-        data = self.newdata[segsize * (self.num_segments-1):]
7734-        assert len(data) == self.tail_segment_size
7735-        salt = os.urandom(16)
7736-
7737-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
7738-        enc = AES(key)
7739-        crypttext = enc.process(data)
7740-        assert len(crypttext) == len(data)
7741+        dl = []
7742+        for i in xrange(len(shares)):
7743+            sharedata = shares[i]
7744+            shareid = shareids[i]
7745+            if self._version == MDMF_VERSION:
7746+                hashed = salt + sharedata
7747+            else:
7748+                hashed = sharedata
7749+            block_hash = hashutil.block_hash(hashed)
7750+            self.blockhashes[shareid].append(block_hash)
7751 
7752hunk ./src/allmydata/mutable/publish.py 496
7753-        now = time.time()
7754-        self._status.timings['encrypt'] = now - started
7755-        started = now
7756+            # find the writer for this share
7757+            writer = self.writers[shareid]
7758+            d = writer.put_block(sharedata, segnum, salt)
7759+            d.addCallback(self._got_write_answer, writer, started)
7760+            d.addErrback(self._connection_problem, writer)
7761+            dl.append(d)
7762+            # TODO: Naturally, we need to check on the results of these.
7763+        return defer.DeferredList(dl)
7764 
7765hunk ./src/allmydata/mutable/publish.py 505
7766-        self._status.set_status("Encoding")
7767-        tail_fec = codec.CRSEncoder()
7768-        tail_fec.set_params(self.tail_segment_size,
7769-                            self.required_shares,
7770-                            self.total_shares)
7771 
7772hunk ./src/allmydata/mutable/publish.py 506
7773-        crypttext_pieces = [None] * self.required_shares
7774-        piece_size = tail_fec.get_block_size()
7775-        for i in range(len(crypttext_pieces)):
7776-            offset = i * piece_size
7777-            piece = crypttext[offset:offset+piece_size]
7778-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
7779-            crypttext_pieces[i] = piece
7780-            assert len(piece) == piece_size
7781-        d = tail_fec.encode(crypttext_pieces)
7782-        def _push_shares_and_salt(results):
7783-            shares, shareids = results
7784-            dl = []
7785-            for i in xrange(len(shares)):
7786-                sharedata = shares[i]
7787-                shareid = shareids[i]
7788-                block_hash = hashutil.block_hash(salt + sharedata)
7789-                self.blockhashes[shareid].append(block_hash)
7790-                # find the writer for this share
7791-                d = self.writers[shareid].put_block(sharedata,
7792-                                                    self.num_segments - 1,
7793-                                                    salt)
7794-                dl.append(d)
7795-            # TODO: Naturally, we need to check on the results of these.
7796-            return defer.DeferredList(dl)
7797-        d.addCallback(_push_shares_and_salt)
7798+    def push_everything_else(self):
7799+        """
7800+        I put everything else associated with a share.
7801+        """
7802+        encprivkey = self._encprivkey
7803+        d = self.push_encprivkey()
7804+        d.addCallback(self.push_blockhashes)
7805+        d.addCallback(self.push_sharehashes)
7806+        d.addCallback(self.push_toplevel_hashes_and_signature)
7807+        d.addCallback(self.finish_publishing)
7808+        def _change_state(ignored):
7809+            self._state = DONE_STATE
7810+        d.addCallback(_change_state)
7811+        d.addCallback(self._push)
7812         return d
7813 
7814 
7815hunk ./src/allmydata/mutable/publish.py 527
7816         started = time.time()
7817         encprivkey = self._encprivkey
7818         dl = []
7819-        def _spy_on_writer(results):
7820-            print results
7821-            return results
7822-        for shnum, writer in self.writers.iteritems():
7823+        for writer in self.writers.itervalues():
7824             d = writer.put_encprivkey(encprivkey)
7825hunk ./src/allmydata/mutable/publish.py 529
7826+            d.addCallback(self._got_write_answer, writer, started)
7827+            d.addErrback(self._connection_problem, writer)
7828             dl.append(d)
7829         d = defer.DeferredList(dl)
7830         return d
7831hunk ./src/allmydata/mutable/publish.py 536
7832 
7833 
7834-    def push_blockhashes(self):
7835+    def push_blockhashes(self, ignored):
7836         started = time.time()
7837         dl = []
7838hunk ./src/allmydata/mutable/publish.py 539
7839-        def _spy_on_results(results):
7840-            print results
7841-            return results
7842         self.sharehash_leaves = [None] * len(self.blockhashes)
7843         for shnum, blockhashes in self.blockhashes.iteritems():
7844             t = hashtree.HashTree(blockhashes)
7845hunk ./src/allmydata/mutable/publish.py 545
7846             self.blockhashes[shnum] = list(t)
7847             # set the leaf for future use.
7848             self.sharehash_leaves[shnum] = t[0]
7849-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
7850+            writer = self.writers[shnum]
7851+            d = writer.put_blockhashes(self.blockhashes[shnum])
7852+            d.addCallback(self._got_write_answer, writer, started)
7853+            d.addErrback(self._connection_problem, self.writers[shnum])
7854             dl.append(d)
7855         d = defer.DeferredList(dl)
7856         return d
7857hunk ./src/allmydata/mutable/publish.py 554
7858 
7859 
7860-    def push_sharehashes(self):
7861+    def push_sharehashes(self, ignored):
7862+        started = time.time()
7863         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
7864         share_hash_chain = {}
7865         ds = []
7866hunk ./src/allmydata/mutable/publish.py 559
7867-        def _spy_on_results(results):
7868-            print results
7869-            return results
7870         for shnum in xrange(len(self.sharehash_leaves)):
7871             needed_indices = share_hash_tree.needed_hashes(shnum)
7872             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
7873hunk ./src/allmydata/mutable/publish.py 563
7874                                              for i in needed_indices] )
7875-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
7876+            writer = self.writers[shnum]
7877+            d = writer.put_sharehashes(self.sharehashes[shnum])
7878+            d.addCallback(self._got_write_answer, writer, started)
7879+            d.addErrback(self._connection_problem, writer)
7880             ds.append(d)
7881         self.root_hash = share_hash_tree[0]
7882         d = defer.DeferredList(ds)
7883hunk ./src/allmydata/mutable/publish.py 573
7884         return d
7885 
7886 
7887-    def push_toplevel_hashes_and_signature(self):
7888+    def push_toplevel_hashes_and_signature(self, ignored):
7889         # We need to do three things here:
7890         #   - Push the root hash and salt hash
7891         #   - Get the checkstring of the resulting layout; sign that.
7892hunk ./src/allmydata/mutable/publish.py 578
7893         #   - Push the signature
7894+        started = time.time()
7895         ds = []
7896hunk ./src/allmydata/mutable/publish.py 580
7897-        def _spy_on_results(results):
7898-            print results
7899-            return results
7900         for shnum in xrange(self.total_shares):
7901hunk ./src/allmydata/mutable/publish.py 581
7902-            d = self.writers[shnum].put_root_hash(self.root_hash)
7903+            writer = self.writers[shnum]
7904+            d = writer.put_root_hash(self.root_hash)
7905+            d.addCallback(self._got_write_answer, writer, started)
7906             ds.append(d)
7907         d = defer.DeferredList(ds)
7908hunk ./src/allmydata/mutable/publish.py 586
7909-        def _make_and_place_signature(ignored):
7910-            signable = self.writers[0].get_signable()
7911-            self.signature = self._privkey.sign(signable)
7912-
7913-            ds = []
7914-            for (shnum, writer) in self.writers.iteritems():
7915-                d = writer.put_signature(self.signature)
7916-                ds.append(d)
7917-            return defer.DeferredList(ds)
7918-        d.addCallback(_make_and_place_signature)
7919+        d.addCallback(self._update_checkstring)
7920+        d.addCallback(self._make_and_place_signature)
7921         return d
7922 
7923 
7924hunk ./src/allmydata/mutable/publish.py 591
7925-    def finish_publishing(self):
7926+    def _update_checkstring(self, ignored):
7927+        """
7928+        After putting the root hash, MDMF files will have the
7929+        checkstring written to the storage server, so we update our
7930+        local copy of the checkstring to detect uncoordinated writes.
7931+        For SDMF files the checkstring does not change at this point,
7932+        so the update is a no-op.
7933+        """
7934+        self._checkstring = self.writers.values()[0].get_checkstring()
7935+
7936+
7937+    def _make_and_place_signature(self, ignored):
7938+        """
7939+        I create and place the signature.
7940+        """
7941+        started = time.time()
7942+        signable = self.writers[0].get_signable()
7943+        self.signature = self._privkey.sign(signable)
7944+
7945+        ds = []
7946+        for (shnum, writer) in self.writers.iteritems():
7947+            d = writer.put_signature(self.signature)
7948+            d.addCallback(self._got_write_answer, writer, started)
7949+            d.addErrback(self._connection_problem, writer)
7950+            ds.append(d)
7951+        return defer.DeferredList(ds)
7952+
7953+
7954+    def finish_publishing(self, ignored):
7955         # We're almost done -- we just need to put the verification key
7956         # and the offsets
7957hunk ./src/allmydata/mutable/publish.py 622
7958+        started = time.time()
7959         ds = []
7960         verification_key = self._pubkey.serialize()
7961 
7962hunk ./src/allmydata/mutable/publish.py 626
7963-        def _spy_on_results(results):
7964-            print results
7965-            return results
7966+
7967+        # TODO: Bad, since _connection_problem can remove entries from
7968+        # self.writers while we iterate over it; iterate over a copy.
7969         for (shnum, writer) in self.writers.iteritems():
7970             d = writer.put_verification_key(verification_key)
7971hunk ./src/allmydata/mutable/publish.py 631
7972+            d.addCallback(self._got_write_answer, writer, started)
7973+            d.addCallback(self._record_verinfo)
7974             d.addCallback(lambda ignored, writer=writer:
7975                 writer.finish_publishing())
7976hunk ./src/allmydata/mutable/publish.py 635
7977+            d.addCallback(self._got_write_answer, writer, started)
7978+            d.addErrback(self._connection_problem, writer)
7979             ds.append(d)
7980         return defer.DeferredList(ds)
7981 
7982hunk ./src/allmydata/mutable/publish.py 641
7983 
7984-    def _turn_barrier(self, res):
7985-        # putting this method in a Deferred chain imposes a guaranteed
7986-        # reactor turn between the pre- and post- portions of that chain.
7987-        # This can be useful to limit memory consumption: since Deferreds do
7988-        # not do tail recursion, code which uses defer.succeed(result) for
7989-        # consistency will cause objects to live for longer than you might
7990-        # normally expect.
7991-        return fireEventually(res)
7992+    def _record_verinfo(self, ignored):
7993+        self.versioninfo = self.writers.values()[0].get_verinfo()
7994 
7995 
7996hunk ./src/allmydata/mutable/publish.py 645
7997-    def _fatal_error(self, f):
7998-        self.log("error during loop", failure=f, level=log.UNUSUAL)
7999-        self._done(f)
8000+    def _connection_problem(self, f, writer):
8001+        """
8002+        We ran into a connection problem while working with writer, and
8003+        need to deal with that.
8004+        """
8005+        self.log("found problem: %s" % str(f))
8006+        self._last_failure = f
8007+        del(self.writers[writer.shnum])
8008 
8009hunk ./src/allmydata/mutable/publish.py 654
8010-    def _update_status(self):
8011-        self._status.set_status("Sending Shares: %d placed out of %d, "
8012-                                "%d messages outstanding" %
8013-                                (len(self.placed),
8014-                                 len(self.goal),
8015-                                 len(self.outstanding)))
8016-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
8017 
8018     def loop(self, ignored=None):
8019         self.log("entering loop", level=log.NOISY)
8020hunk ./src/allmydata/mutable/publish.py 778
8021             self.log_goal(self.goal, "after update: ")
8022 
8023 
8024-    def _encrypt_and_encode(self):
8025-        # this returns a Deferred that fires with a list of (sharedata,
8026-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
8027-        # shares that we care about.
8028-        self.log("_encrypt_and_encode")
8029-
8030-        self._status.set_status("Encrypting")
8031-        started = time.time()
8032+    def _got_write_answer(self, answer, writer, started):
8033+        if not answer:
8034+            # SDMF writers only pretend to write when callers set their
8035+            # blocks, salts, and so on -- they actually just write once,
8036+            # at the end of the upload process. For those fake writes,
8037+            # they return defer.succeed(None). If we see that, there is
8038+            # nothing to check.
8039+            return
8040 
8041hunk ./src/allmydata/mutable/publish.py 787
8042-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
8043-        enc = AES(key)
8044-        crypttext = enc.process(self.newdata)
8045-        assert len(crypttext) == len(self.newdata)
8046+        peerid = writer.peerid
8047+        lp = self.log("_got_write_answer from %s, share %d" %
8048+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
8049 
8050         now = time.time()
8051hunk ./src/allmydata/mutable/publish.py 792
8052-        self._status.timings["encrypt"] = now - started
8053-        started = now
8054-
8055-        # now apply FEC
8056-
8057-        self._status.set_status("Encoding")
8058-        fec = codec.CRSEncoder()
8059-        fec.set_params(self.segment_size,
8060-                       self.required_shares, self.total_shares)
8061-        piece_size = fec.get_block_size()
8062-        crypttext_pieces = [None] * self.required_shares
8063-        for i in range(len(crypttext_pieces)):
8064-            offset = i * piece_size
8065-            piece = crypttext[offset:offset+piece_size]
8066-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
8067-            crypttext_pieces[i] = piece
8068-            assert len(piece) == piece_size
8069-
8070-        d = fec.encode(crypttext_pieces)
8071-        def _done_encoding(res):
8072-            elapsed = time.time() - started
8073-            self._status.timings["encode"] = elapsed
8074-            return res
8075-        d.addCallback(_done_encoding)
8076-        return d
8077-
8078-
8079-    def _generate_shares(self, shares_and_shareids):
8080-        # this sets self.shares and self.root_hash
8081-        self.log("_generate_shares")
8082-        self._status.set_status("Generating Shares")
8083-        started = time.time()
8084-
8085-        # we should know these by now
8086-        privkey = self._privkey
8087-        encprivkey = self._encprivkey
8088-        pubkey = self._pubkey
8089-
8090-        (shares, share_ids) = shares_and_shareids
8091-
8092-        assert len(shares) == len(share_ids)
8093-        assert len(shares) == self.total_shares
8094-        all_shares = {}
8095-        block_hash_trees = {}
8096-        share_hash_leaves = [None] * len(shares)
8097-        for i in range(len(shares)):
8098-            share_data = shares[i]
8099-            shnum = share_ids[i]
8100-            all_shares[shnum] = share_data
8101-
8102-            # build the block hash tree. SDMF has only one leaf.
8103-            leaves = [hashutil.block_hash(share_data)]
8104-            t = hashtree.HashTree(leaves)
8105-            block_hash_trees[shnum] = list(t)
8106-            share_hash_leaves[shnum] = t[0]
8107-        for leaf in share_hash_leaves:
8108-            assert leaf is not None
8109-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
8110-        share_hash_chain = {}
8111-        for shnum in range(self.total_shares):
8112-            needed_hashes = share_hash_tree.needed_hashes(shnum)
8113-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
8114-                                              for i in needed_hashes ] )
8115-        root_hash = share_hash_tree[0]
8116-        assert len(root_hash) == 32
8117-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
8118-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
8119-
8120-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
8121-                             self.required_shares, self.total_shares,
8122-                             self.segment_size, len(self.newdata))
8123-
8124-        # now pack the beginning of the share. All shares are the same up
8125-        # to the signature, then they have divergent share hash chains,
8126-        # then completely different block hash trees + salt + share data,
8127-        # then they all share the same encprivkey at the end. The sizes
8128-        # of everything are the same for all shares.
8129-
8130-        sign_started = time.time()
8131-        signature = privkey.sign(prefix)
8132-        self._status.timings["sign"] = time.time() - sign_started
8133-
8134-        verification_key = pubkey.serialize()
8135-
8136-        final_shares = {}
8137-        for shnum in range(self.total_shares):
8138-            final_share = pack_share(prefix,
8139-                                     verification_key,
8140-                                     signature,
8141-                                     share_hash_chain[shnum],
8142-                                     block_hash_trees[shnum],
8143-                                     all_shares[shnum],
8144-                                     encprivkey)
8145-            final_shares[shnum] = final_share
8146-        elapsed = time.time() - started
8147-        self._status.timings["pack"] = elapsed
8148-        self.shares = final_shares
8149-        self.root_hash = root_hash
8150-
8151-        # we also need to build up the version identifier for what we're
8152-        # pushing. Extract the offsets from one of our shares.
8153-        assert final_shares
8154-        offsets = unpack_header(final_shares.values()[0])[-1]
8155-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8156-        verinfo = (self._new_seqnum, root_hash, self.salt,
8157-                   self.segment_size, len(self.newdata),
8158-                   self.required_shares, self.total_shares,
8159-                   prefix, offsets_tuple)
8160-        self.versioninfo = verinfo
8161-
8162-
8163-
8164-    def _send_shares(self, needed):
8165-        self.log("_send_shares")
8166-
8167-        # we're finally ready to send out our shares. If we encounter any
8168-        # surprises here, it's because somebody else is writing at the same
8169-        # time. (Note: in the future, when we remove the _query_peers() step
8170-        # and instead speculate about [or remember] which shares are where,
8171-        # surprises here are *not* indications of UncoordinatedWriteError,
8172-        # and we'll need to respond to them more gracefully.)
8173-
8174-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
8175-        # organize it by peerid.
8176-
8177-        peermap = DictOfSets()
8178-        for (peerid, shnum) in needed:
8179-            peermap.add(peerid, shnum)
8180-
8181-        # the next thing is to build up a bunch of test vectors. The
8182-        # semantics of Publish are that we perform the operation if the world
8183-        # hasn't changed since the ServerMap was constructed (more or less).
8184-        # For every share we're trying to place, we create a test vector that
8185-        # tests to see if the server*share still corresponds to the
8186-        # map.
8187-
8188-        all_tw_vectors = {} # maps peerid to tw_vectors
8189-        sm = self._servermap.servermap
8190-
8191-        for key in needed:
8192-            (peerid, shnum) = key
8193-
8194-            if key in sm:
8195-                # an old version of that share already exists on the
8196-                # server, according to our servermap. We will create a
8197-                # request that attempts to replace it.
8198-                old_versionid, old_timestamp = sm[key]
8199-                (old_seqnum, old_root_hash, old_salt, old_segsize,
8200-                 old_datalength, old_k, old_N, old_prefix,
8201-                 old_offsets_tuple) = old_versionid
8202-                old_checkstring = pack_checkstring(old_seqnum,
8203-                                                   old_root_hash,
8204-                                                   old_salt)
8205-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8206-
8207-            elif key in self.bad_share_checkstrings:
8208-                old_checkstring = self.bad_share_checkstrings[key]
8209-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8210-
8211-            else:
8212-                # add a testv that requires the share not exist
8213-
8214-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
8215-                # constraints are handled. If the same object is referenced
8216-                # multiple times inside the arguments, foolscap emits a
8217-                # 'reference' token instead of a distinct copy of the
8218-                # argument. The bug is that these 'reference' tokens are not
8219-                # accepted by the inbound constraint code. To work around
8220-                # this, we need to prevent python from interning the
8221-                # (constant) tuple, by creating a new copy of this vector
8222-                # each time.
8223-
8224-                # This bug is fixed in foolscap-0.2.6, and even though this
8225-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
8226-                # supposed to be able to interoperate with older versions of
8227-                # Tahoe which are allowed to use older versions of foolscap,
8228-                # including foolscap-0.2.5 . In addition, I've seen other
8229-                # foolscap problems triggered by 'reference' tokens (see #541
8230-                # for details). So we must keep this workaround in place.
8231-
8232-                #testv = (0, 1, 'eq', "")
8233-                testv = tuple([0, 1, 'eq', ""])
8234-
8235-            testvs = [testv]
8236-            # the write vector is simply the share
8237-            writev = [(0, self.shares[shnum])]
8238-
8239-            if peerid not in all_tw_vectors:
8240-                all_tw_vectors[peerid] = {}
8241-                # maps shnum to (testvs, writevs, new_length)
8242-            assert shnum not in all_tw_vectors[peerid]
8243-
8244-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
8245-
8246-        # we read the checkstring back from each share, however we only use
8247-        # it to detect whether there was a new share that we didn't know
8248-        # about. The success or failure of the write will tell us whether
8249-        # there was a collision or not. If there is a collision, the first
8250-        # thing we'll do is update the servermap, which will find out what
8251-        # happened. We could conceivably reduce a roundtrip by using the
8252-        # readv checkstring to populate the servermap, but really we'd have
8253-        # to read enough data to validate the signatures too, so it wouldn't
8254-        # be an overall win.
8255-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
8256-
8257-        # ok, send the messages!
8258-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
8259-        started = time.time()
8260-        for (peerid, tw_vectors) in all_tw_vectors.items():
8261-
8262-            write_enabler = self._node.get_write_enabler(peerid)
8263-            renew_secret = self._node.get_renewal_secret(peerid)
8264-            cancel_secret = self._node.get_cancel_secret(peerid)
8265-            secrets = (write_enabler, renew_secret, cancel_secret)
8266-            shnums = tw_vectors.keys()
8267-
8268-            for shnum in shnums:
8269-                self.outstanding.add( (peerid, shnum) )
8270-
8271-            d = self._do_testreadwrite(peerid, secrets,
8272-                                       tw_vectors, read_vector)
8273-            d.addCallbacks(self._got_write_answer, self._got_write_error,
8274-                           callbackArgs=(peerid, shnums, started),
8275-                           errbackArgs=(peerid, shnums, started))
8276-            # tolerate immediate errback, like with DeadReferenceError
8277-            d.addBoth(fireEventually)
8278-            d.addCallback(self.loop)
8279-            d.addErrback(self._fatal_error)
8280-
8281-        self._update_status()
8282-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
8283+        elapsed = now - started
8284 
8285hunk ./src/allmydata/mutable/publish.py 794
8286-    def _do_testreadwrite(self, peerid, secrets,
8287-                          tw_vectors, read_vector):
8288-        storage_index = self._storage_index
8289-        ss = self.connections[peerid]
8290+        self._status.add_per_server_time(peerid, elapsed)
8291 
8292hunk ./src/allmydata/mutable/publish.py 796
8293-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
8294-        d = ss.callRemote("slot_testv_and_readv_and_writev",
8295-                          storage_index,
8296-                          secrets,
8297-                          tw_vectors,
8298-                          read_vector)
8299-        return d
8300+        wrote, read_data = answer
8301 
8302hunk ./src/allmydata/mutable/publish.py 798
8303-    def _got_write_answer(self, answer, peerid, shnums, started):
8304-        lp = self.log("_got_write_answer from %s" %
8305-                      idlib.shortnodeid_b2a(peerid))
8306-        for shnum in shnums:
8307-            self.outstanding.discard( (peerid, shnum) )
8308+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
8309 
8310hunk ./src/allmydata/mutable/publish.py 800
8311-        now = time.time()
8312-        elapsed = now - started
8313-        self._status.add_per_server_time(peerid, elapsed)
8314+        # We need to remove from surprise_shares any shares that we are
8315+        # knowingly also writing to that peer from other writers.
8316 
8317hunk ./src/allmydata/mutable/publish.py 803
8318-        wrote, read_data = answer
8319+        # TODO: Precompute this.
8320+        known_shnums = [x.shnum for x in self.writers.values()
8321+                        if x.peerid == peerid]
8322+        surprise_shares -= set(known_shnums)
8323+        self.log("found the following surprise shares: %s" %
8324+                 str(surprise_shares))
8325 
8326hunk ./src/allmydata/mutable/publish.py 810
8327-        surprise_shares = set(read_data.keys()) - set(shnums)
8328+        # Now surprise shares contains all of the shares that we did not
8329+        # expect to be there.
8330 
8331         surprised = False
8332         for shnum in surprise_shares:
8333hunk ./src/allmydata/mutable/publish.py 817
8334             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
8335             checkstring = read_data[shnum][0]
8336-            their_version_info = unpack_checkstring(checkstring)
8337-            if their_version_info == self._new_version_info:
8338+            # What we want to do here is to see if their (seqnum,
8339+            # roothash, salt) is the same as our (seqnum, roothash,
8340+            # salt), or the equivalent for MDMF. The best way to do this
8341+            # is to store a packed representation of our checkstring
8342+            # somewhere, then not bother unpacking the other
8343+            # checkstring.
8344+            if checkstring == self._checkstring:
8345                 # they have the right share, somehow
8346 
8347                 if (peerid,shnum) in self.goal:
8348hunk ./src/allmydata/mutable/publish.py 902
8349             self.log("our testv failed, so the write did not happen",
8350                      parent=lp, level=log.WEIRD, umid="8sc26g")
8351             self.surprised = True
8352-            self.bad_peers.add(peerid) # don't ask them again
8353+            # TODO: This needs to
8354+            self.bad_peers.add(writer) # don't ask them again
8355             # use the checkstring to add information to the log message
8356             for (shnum,readv) in read_data.items():
8357                 checkstring = readv[0]
8358hunk ./src/allmydata/mutable/publish.py 928
8359             # self.loop() will take care of finding new homes
8360             return
8361 
8362-        for shnum in shnums:
8363-            self.placed.add( (peerid, shnum) )
8364-            # and update the servermap
8365-            self._servermap.add_new_share(peerid, shnum,
8366+        # and update the servermap
8367+        # self.versioninfo is set during the last phase of publishing.
8368+        # If we get there, we know that responses correspond to placed
8369+        # shares, and can safely execute these statements.
8370+        if self.versioninfo:
8371+            self.log("wrote successfully: adding new share to servermap")
8372+            self._servermap.add_new_share(peerid, writer.shnum,
8373                                           self.versioninfo, started)
8374hunk ./src/allmydata/mutable/publish.py 936
8375-
8376-        # self.loop() will take care of checking to see if we're done
8377-        return
8378+            self.placed.add( (peerid, writer.shnum) )
8379 
8380hunk ./src/allmydata/mutable/publish.py 938
8381-    def _got_write_error(self, f, peerid, shnums, started):
8382-        for shnum in shnums:
8383-            self.outstanding.discard( (peerid, shnum) )
8384-        self.bad_peers.add(peerid)
8385-        if self._first_write_error is None:
8386-            self._first_write_error = f
8387-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
8388-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
8389-                 failure=f,
8390-                 level=log.UNUSUAL)
8391         # self.loop() will take care of checking to see if we're done
8392         return
8393 
8394hunk ./src/allmydata/mutable/publish.py 949
8395         now = time.time()
8396         self._status.timings["total"] = now - self._started
8397         self._status.set_active(False)
8398-        if isinstance(res, failure.Failure):
8399-            self.log("Publish done, with failure", failure=res,
8400-                     level=log.WEIRD, umid="nRsR9Q")
8401-            self._status.set_status("Failed")
8402-        elif self.surprised:
8403-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
8404-            self._status.set_status("UncoordinatedWriteError")
8405-            # deliver a failure
8406-            res = failure.Failure(UncoordinatedWriteError())
8407-            # TODO: recovery
8408-        else:
8409-            self.log("Publish done, success")
8410-            self._status.set_status("Finished")
8411-            self._status.set_progress(1.0)
8412+        self.log("Publish done, success")
8413+        self._status.set_status("Finished")
8414+        self._status.set_progress(1.0)
8415         eventually(self.done_deferred.callback, res)
8416 
8417hunk ./src/allmydata/mutable/publish.py 954
8418+    def _failure(self):
8419+
8420+        if not self.surprised:
8421+            # We ran out of servers
8422+            self.log("Publish ran out of good servers, "
8423+                     "last failure was: %s" % str(self._last_failure))
8424+            e = NotEnoughServersError("Ran out of non-bad servers, "
8425+                                      "last failure was %s" %
8426+                                      str(self._last_failure))
8427+        else:
8428+            # We ran into shares that we didn't recognize, which means
8429+            # that we need to return an UncoordinatedWriteError.
8430+            self.log("Publish failed with UncoordinatedWriteError")
8431+            e = UncoordinatedWriteError()
8432+        f = failure.Failure(e)
8433+        eventually(self.done_deferred.callback, f)
8434}
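
The rewritten Publish above replaces the separate _publish_mdmf/_publish_sdmf code paths with one push loop driven by a small state machine: _push dispatches on PUSHING_BLOCKS_STATE, PUSHING_EVERYTHING_ELSE_STATE and DONE_STATE, and each step re-enters _push through its deferred chain. The following stripped-down sketch shows just that control flow; PushStateMachine is a hypothetical stand-in, and the real Publish class also tracks writers, per-server timings, surprises and failures.

    from twisted.internet import defer

    PUSHING_BLOCKS_STATE = 0
    PUSHING_EVERYTHING_ELSE_STATE = 1
    DONE_STATE = 2

    class PushStateMachine:
        """Illustrative only: mirrors the state transitions in Publish._push."""
        def __init__(self, num_segments):
            self.num_segments = num_segments
            self._current_segment = 0
            self._state = PUSHING_BLOCKS_STATE

        def _push(self, ignored=None):
            # Dispatch on the current state; always return a Deferred so
            # the caller never blocks.
            if self._state == PUSHING_BLOCKS_STATE:
                return self.push_segment(self._current_segment)
            elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
                return self.push_everything_else()
            return defer.succeed("publish complete")

        def push_segment(self, segnum):
            if segnum == self.num_segments:
                # No more blocks: move on to hashes, signature and offsets.
                self._state = PUSHING_EVERYTHING_ELSE_STATE
                return self._push()
            d = defer.succeed(None)   # stands in for encode + put_block
            def _increment(ign):
                self._current_segment += 1
            d.addBoth(_increment)
            d.addBoth(self._push)
            return d

        def push_everything_else(self):
            d = defer.succeed(None)   # encprivkey, hash chains, signature...
            def _change_state(ign):
                self._state = DONE_STATE
            d.addCallback(_change_state)
            d.addCallback(self._push)
            return d
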
8435[test/test_mutable.py: remove tests that are no longer relevant
8436Kevan Carstensen <kevan@isnotajoke.com>**20100702225710
8437 Ignore-this: 90a26b4cc4b2e190a635474ba7097e21
8438] hunk ./src/allmydata/test/test_mutable.py 627
8439         return d
8440 
8441 
8442-class MakeShares(unittest.TestCase):
8443-    def test_encrypt(self):
8444-        nm = make_nodemaker()
8445-        CONTENTS = "some initial contents"
8446-        d = nm.create_mutable_file(CONTENTS)
8447-        def _created(fn):
8448-            p = Publish(fn, nm.storage_broker, None)
8449-            p.salt = "SALT" * 4
8450-            p.readkey = "\x00" * 16
8451-            p.newdata = CONTENTS
8452-            p.required_shares = 3
8453-            p.total_shares = 10
8454-            p.setup_encoding_parameters()
8455-            return p._encrypt_and_encode()
8456-        d.addCallback(_created)
8457-        def _done(shares_and_shareids):
8458-            (shares, share_ids) = shares_and_shareids
8459-            self.failUnlessEqual(len(shares), 10)
8460-            for sh in shares:
8461-                self.failUnless(isinstance(sh, str))
8462-                self.failUnlessEqual(len(sh), 7)
8463-            self.failUnlessEqual(len(share_ids), 10)
8464-        d.addCallback(_done)
8465-        return d
8466-    test_encrypt.todo = "Write an equivalent of this for the new uploader"
8467-
8468-    def test_generate(self):
8469-        nm = make_nodemaker()
8470-        CONTENTS = "some initial contents"
8471-        d = nm.create_mutable_file(CONTENTS)
8472-        def _created(fn):
8473-            self._fn = fn
8474-            p = Publish(fn, nm.storage_broker, None)
8475-            self._p = p
8476-            p.newdata = CONTENTS
8477-            p.required_shares = 3
8478-            p.total_shares = 10
8479-            p.setup_encoding_parameters()
8480-            p._new_seqnum = 3
8481-            p.salt = "SALT" * 4
8482-            # make some fake shares
8483-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
8484-            p._privkey = fn.get_privkey()
8485-            p._encprivkey = fn.get_encprivkey()
8486-            p._pubkey = fn.get_pubkey()
8487-            return p._generate_shares(shares_and_ids)
8488-        d.addCallback(_created)
8489-        def _generated(res):
8490-            p = self._p
8491-            final_shares = p.shares
8492-            root_hash = p.root_hash
8493-            self.failUnlessEqual(len(root_hash), 32)
8494-            self.failUnless(isinstance(final_shares, dict))
8495-            self.failUnlessEqual(len(final_shares), 10)
8496-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
8497-            for i,sh in final_shares.items():
8498-                self.failUnless(isinstance(sh, str))
8499-                # feed the share through the unpacker as a sanity-check
8500-                pieces = unpack_share(sh)
8501-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
8502-                 pubkey, signature, share_hash_chain, block_hash_tree,
8503-                 share_data, enc_privkey) = pieces
8504-                self.failUnlessEqual(u_seqnum, 3)
8505-                self.failUnlessEqual(u_root_hash, root_hash)
8506-                self.failUnlessEqual(k, 3)
8507-                self.failUnlessEqual(N, 10)
8508-                self.failUnlessEqual(segsize, 21)
8509-                self.failUnlessEqual(datalen, len(CONTENTS))
8510-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
8511-                sig_material = struct.pack(">BQ32s16s BBQQ",
8512-                                           0, p._new_seqnum, root_hash, IV,
8513-                                           k, N, segsize, datalen)
8514-                self.failUnless(p._pubkey.verify(sig_material, signature))
8515-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
8516-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
8517-                for shnum,share_hash in share_hash_chain.items():
8518-                    self.failUnless(isinstance(shnum, int))
8519-                    self.failUnless(isinstance(share_hash, str))
8520-                    self.failUnlessEqual(len(share_hash), 32)
8521-                self.failUnless(isinstance(block_hash_tree, list))
8522-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
8523-                self.failUnlessEqual(IV, "SALT"*4)
8524-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
8525-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
8526-        d.addCallback(_generated)
8527-        return d
8528-    test_generate.todo = "Write an equivalent of this for the new uploader"
8529-
8530-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
8531-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
8532-    # when we publish to zero peers, we should get a NotEnoughSharesError
8533-
8534 class PublishMixin:
8535     def publish_one(self):
8536         # publish a file and create shares, which can then be manipulated
8537[interfaces.py: create IMutableUploadable
8538Kevan Carstensen <kevan@isnotajoke.com>**20100706215217
8539 Ignore-this: bee202ec2bfbd8e41f2d4019cce176c7
8540] hunk ./src/allmydata/interfaces.py 1693
8541         """The upload is finished, and whatever filehandle was in use may be
8542         closed."""
8543 
8544+
8545+class IMutableUploadable(Interface):
8546+    """
8547+    I represent content that is due to be uploaded to a mutable filecap.
8548+    """
8549+    # This is somewhat simpler than the IUploadable interface above
8550+    # because mutable files do not need to be concerned with possibly
8551+    # generating a CHK, nor with per-file keys. It is a subset of the
8552+    # methods in IUploadable, though, so we could just as well implement
8553+    # the mutable uploadables as IUploadables that don't happen to use
8554+    # those methods (with the understanding that the unused methods will
8555+    # never be called on such objects)
8556+    def get_size():
8557+        """
8558+        Returns a Deferred that fires with the size of the content held
8559+        by the uploadable.
8560+        """
8561+
8562+    def read(length):
8563+        """
8564+        Returns a list of strings which, when concatenated, are the next
8565+        length bytes of the file, or fewer if there are fewer bytes
8566+        between the current location and the end of the file.
8567+        """
8568+
8569+    def close():
8570+        """
8571+        The process that used the Uploadable is finished using it, so
8572+        the uploadable may be closed.
8573+        """
8574+
8575 class IUploadResults(Interface):
8576     """I am returned by upload() methods. I contain a number of public
8577     attributes which can be read to determine the results of the upload. Some
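The interface above is easiest to read alongside a toy consumer. The sketch below is illustrative only: it uses just the three methods IMutableUploadable defines, and it assumes get_size() returns a plain integer (as the concrete handles added in the next patch do, even though the docstring above allows a Deferred).

    # Illustrative consumer of an IMutableUploadable provider.
    def drain_uploadable(uploadable, chunk_size=1000):
        remaining = uploadable.get_size()  # assumed to be an int here
        pieces = []
        while remaining > 0:
            # read() returns a list of strings, possibly covering fewer
            # than chunk_size bytes near the end of the content.
            data = "".join(uploadable.read(chunk_size))
            if not data:
                break
            pieces.append(data)
            remaining -= len(data)
        uploadable.close()
        return "".join(pieces)
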
8578[mutable/publish.py: add MutableDataHandle and MutableFileHandle
8579Kevan Carstensen <kevan@isnotajoke.com>**20100706215257
8580 Ignore-this: 295ea3bc2a962fd14fb7877fc76c011c
8581] {
8582hunk ./src/allmydata/mutable/publish.py 8
8583 from zope.interface import implements
8584 from twisted.internet import defer
8585 from twisted.python import failure
8586-from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
8587+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
8588+                                 IMutableUploadable
8589 from allmydata.util import base32, hashutil, mathutil, idlib, log
8590 from allmydata import hashtree, codec
8591 from allmydata.storage.server import si_b2a
8592hunk ./src/allmydata/mutable/publish.py 971
8593             e = UncoordinatedWriteError()
8594         f = failure.Failure(e)
8595         eventually(self.done_deferred.callback, f)
8596+
8597+
8598+class MutableFileHandle:
8599+    """
8600+    I am a mutable uploadable built around a filehandle-like object,
8601+    usually either a StringIO instance or a handle to an actual file.
8602+    """
8603+    implements(IMutableUploadable)
8604+
8605+    def __init__(self, filehandle):
8606+        # The filehandle is defined as a generally file-like object that
8607+        # has these two methods. We don't care beyond that.
8608+        assert hasattr(filehandle, "read")
8609+        assert hasattr(filehandle, "close")
8610+
8611+        self._filehandle = filehandle
8612+
8613+
8614+    def get_size(self):
8615+        """
8616+        I return the amount of data in my filehandle.
8617+        """
8618+        if not hasattr(self, "_size"):
8619+            old_position = self._filehandle.tell()
8620+            # Seek to the end of the file by seeking 0 bytes from the
8621+            # file's end
8622+            self._filehandle.seek(0, os.SEEK_END)
8623+            self._size = self._filehandle.tell()
8624+            # Restore the previous position, in case this was called
8625+            # after a read.
8626+            self._filehandle.seek(old_position)
8627+            assert self._filehandle.tell() == old_position
8628+
8629+        assert hasattr(self, "_size")
8630+        return self._size
8631+
8632+
8633+    def read(self, length):
8634+        """
8635+        I return some data (up to length bytes) from my filehandle.
8636+
8637+        I usually return exactly length bytes. If fewer than length
8638+        bytes remain between my current position and the end of the
8639+        file that I represent, I return only the bytes that are left
8640+        before EOF.
8641+        """
8642+        return [self._filehandle.read(length)]
8643+
8644+
8645+    def close(self):
8646+        """
8647+        I close the underlying filehandle. Any further operations on the
8648+        filehandle fail at this point.
8649+        """
8650+        self._filehandle.close()
8651+
8652+
8653+class MutableDataHandle(MutableFileHandle):
8654+    """
8655+    I am a mutable uploadable built around a string, which I then cast
8656+    into a StringIO and treat as a filehandle.
8657+    """
8658+
8659+    def __init__(self, s):
8660+        # Take a string and return a file-like uploadable.
8661+        assert isinstance(s, str)
8662+
8663+        MutableFileHandle.__init__(self, StringIO(s))
8664}
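A short usage sketch of the two classes added above (the contents are made up, and the import assumes this patch has been applied to mutable/publish.py):

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle, MutableDataHandle

    # Wrap a plain string:
    dh = MutableDataHandle("some initial contents")
    print dh.get_size()            # 21
    print "".join(dh.read(4))      # "some"

    # Wrap anything with read/close (get_size also needs seek/tell):
    fh = MutableFileHandle(StringIO("more contents"))
    print fh.get_size()            # 13
    print "".join(fh.read(100))    # "more contents" (shorter than requested: EOF)
    fh.close()
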
8665[mutable/publish.py: reorganize in preparation of file-like uploadables
8666Kevan Carstensen <kevan@isnotajoke.com>**20100706215541
8667 Ignore-this: 5346c9f919ee5b73807c8f287c64e8ce
8668] {
8669hunk ./src/allmydata/mutable/publish.py 4
8670 
8671 
8672 import os, struct, time
8673+from StringIO import StringIO
8674 from itertools import count
8675 from zope.interface import implements
8676 from twisted.internet import defer
8677hunk ./src/allmydata/mutable/publish.py 118
8678         self._status.set_helper(False)
8679         self._status.set_progress(0.0)
8680         self._status.set_active(True)
8681-        # We use this to control how the file is written.
8682-        version = self._node.get_version()
8683-        assert version in (SDMF_VERSION, MDMF_VERSION)
8684-        self._version = version
8685+        self._version = self._node.get_version()
8686+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
8687+
8688 
8689     def get_status(self):
8690         return self._status
8691hunk ./src/allmydata/mutable/publish.py 141
8692 
8693         # 0. Setup encoding parameters, encoder, and other such things.
8694         # 1. Encrypt, encode, and publish segments.
8695+        self.data = StringIO(newdata)
8696+        self.datalength = len(newdata)
8697 
8698hunk ./src/allmydata/mutable/publish.py 144
8699-        self.log("starting publish, datalen is %s" % len(newdata))
8700-        self._status.set_size(len(newdata))
8701+        self.log("starting publish, datalen is %s" % self.datalength)
8702+        self._status.set_size(self.datalength)
8703         self._status.set_status("Started")
8704         self._started = time.time()
8705 
8706hunk ./src/allmydata/mutable/publish.py 193
8707         self.full_peerlist = full_peerlist # for use later, immutable
8708         self.bad_peers = set() # peerids who have errbacked/refused requests
8709 
8710-        self.newdata = newdata
8711-
8712         # This will set self.segment_size, self.num_segments, and
8713         # self.fec.
8714         self.setup_encoding_parameters()
8715hunk ./src/allmydata/mutable/publish.py 272
8716                                                 self.required_shares,
8717                                                 self.total_shares,
8718                                                 self.segment_size,
8719-                                                len(self.newdata))
8720+                                                self.datalength)
8721             self.writers[shnum].peerid = peerid
8722             if (peerid, shnum) in self._servermap.servermap:
8723                 old_versionid, old_timestamp = self._servermap.servermap[key]
8724hunk ./src/allmydata/mutable/publish.py 318
8725         if self._version == MDMF_VERSION:
8726             segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
8727         else:
8728-            segment_size = len(self.newdata) # SDMF is only one segment
8729+            segment_size = self.datalength # SDMF is only one segment
8730         # this must be a multiple of self.required_shares
8731         segment_size = mathutil.next_multiple(segment_size,
8732                                               self.required_shares)
8733hunk ./src/allmydata/mutable/publish.py 324
8734         self.segment_size = segment_size
8735         if segment_size:
8736-            self.num_segments = mathutil.div_ceil(len(self.newdata),
8737+            self.num_segments = mathutil.div_ceil(self.datalength,
8738                                                   segment_size)
8739         else:
8740             self.num_segments = 0
8741hunk ./src/allmydata/mutable/publish.py 337
8742             assert self.num_segments in (0, 1) # SDMF
8743         # calculate the tail segment size.
8744 
8745-        if segment_size and self.newdata:
8746-            self.tail_segment_size = len(self.newdata) % segment_size
8747+        if segment_size and self.datalength:
8748+            self.tail_segment_size = self.datalength % segment_size
8749         else:
8750             self.tail_segment_size = 0
8751 
8752hunk ./src/allmydata/mutable/publish.py 438
8753             segsize = self.segment_size
8754 
8755 
8756-        offset = self.segment_size * segnum
8757-        length = segsize + offset
8758         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
8759hunk ./src/allmydata/mutable/publish.py 439
8760-        data = self.newdata[offset:length]
8761+        data = self.data.read(segsize)
8762+
8763         assert len(data) == segsize
8764 
8765         salt = os.urandom(16)
8766hunk ./src/allmydata/mutable/publish.py 502
8767             d.addCallback(self._got_write_answer, writer, started)
8768             d.addErrback(self._connection_problem, writer)
8769             dl.append(d)
8770-            # TODO: Naturally, we need to check on the results of these.
8771         return defer.DeferredList(dl)
8772 
8773 
8774}
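To make the parameter bookkeeping above concrete, here is a worked example under assumed values (k=3, a 300,000-byte file, and the 128 KiB default MDMF segment size mentioned in the hunk); the helpers are the same mathutil functions used in setup_encoding_parameters:

    from allmydata.util import mathutil

    datalength      = 300000                                            # assumed file size
    required_shares = 3                                                 # k
    segment_size    = mathutil.next_multiple(131072, required_shares)   # 131073
    num_segments    = mathutil.div_ceil(datalength, segment_size)       # 3
    tail_segment    = datalength % segment_size                         # 37854

    # With this patch, push_segment no longer slices a string; each segment
    # is pulled from the uploadable-style object in self.data:
    #     data = self.data.read(segsize)
    # which, for now, is a StringIO wrapped around the caller's bytes.
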
8775[test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
8776Kevan Carstensen <kevan@isnotajoke.com>**20100706215649
8777 Ignore-this: df719a0c52b4bbe9be4fae206c7ab3e7
8778] {
8779hunk ./src/allmydata/test/test_mutable.py 2
8780 
8781-import struct
8782+import struct, os
8783 from cStringIO import StringIO
8784 from twisted.trial import unittest
8785 from twisted.internet import defer, reactor
8786hunk ./src/allmydata/test/test_mutable.py 26
8787      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
8788      NotEnoughServersError, CorruptShareError
8789 from allmydata.mutable.retrieve import Retrieve
8790-from allmydata.mutable.publish import Publish
8791+from allmydata.mutable.publish import Publish, MutableFileHandle, \
8792+                                      MutableDataHandle
8793 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
8794 from allmydata.mutable.layout import unpack_header, unpack_share, \
8795                                      MDMFSlotReadProxy
8796hunk ./src/allmydata/test/test_mutable.py 2465
8797         d.addCallback(lambda data:
8798             self.failUnlessEqual(data, CONTENTS))
8799         return d
8800+
8801+
8802+class FileHandle(unittest.TestCase):
8803+    def setUp(self):
8804+        self.test_data = "Test Data" * 50000
8805+        self.sio = StringIO(self.test_data)
8806+        self.uploadable = MutableFileHandle(self.sio)
8807+
8808+
8809+    def test_filehandle_read(self):
8810+        self.basedir = "mutable/FileHandle/test_filehandle_read"
8811+        chunk_size = 10
8812+        for i in xrange(0, len(self.test_data), chunk_size):
8813+            data = self.uploadable.read(chunk_size)
8814+            data = "".join(data)
8815+            start = i
8816+            end = i + chunk_size
8817+            self.failUnlessEqual(data, self.test_data[start:end])
8818+
8819+
8820+    def test_filehandle_get_size(self):
8821+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
8822+        actual_size = len(self.test_data)
8823+        size = self.uploadable.get_size()
8824+        self.failUnlessEqual(size, actual_size)
8825+
8826+
8827+    def test_filehandle_get_size_out_of_order(self):
8828+        # We should be able to call get_size whenever we want without
8829+        # disturbing the location of the seek pointer.
8830+        chunk_size = 100
8831+        data = self.uploadable.read(chunk_size)
8832+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8833+
8834+        # Now get the size.
8835+        size = self.uploadable.get_size()
8836+        self.failUnlessEqual(size, len(self.test_data))
8837+
8838+        # Now get more data. We should be right where we left off.
8839+        more_data = self.uploadable.read(chunk_size)
8840+        start = chunk_size
8841+        end = chunk_size * 2
8842+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8843+
8844+
8845+    def test_filehandle_file(self):
8846+        # Make sure that the MutableFileHandle works on a file as well
8847+        # as a StringIO object, since in some cases it will be asked to
8848+        # deal with files.
8849+        self.basedir = self.mktemp()
8850+        # self.mktemp() only returns a pathname; we must create the directory.
8851+        os.mkdir(self.basedir)
8852+        f_path = os.path.join(self.basedir, "test_file")
8853+        f = open(f_path, "w")
8854+        f.write(self.test_data)
8855+        f.close()
8856+        f = open(f_path, "r")
8857+
8858+        uploadable = MutableFileHandle(f)
8859+
8860+        data = uploadable.read(len(self.test_data))
8861+        self.failUnlessEqual("".join(data), self.test_data)
8862+        size = uploadable.get_size()
8863+        self.failUnlessEqual(size, len(self.test_data))
8864+
8865+
8866+    def test_close(self):
8867+        # Make sure that the MutableFileHandle closes its handle when
8868+        # told to do so.
8869+        self.uploadable.close()
8870+        self.failUnless(self.sio.closed)
8871+
8872+
8873+class DataHandle(unittest.TestCase):
8874+    def setUp(self):
8875+        self.test_data = "Test Data" * 50000
8876+        self.uploadable = MutableDataHandle(self.test_data)
8877+
8878+
8879+    def test_datahandle_read(self):
8880+        chunk_size = 10
8881+        for i in xrange(0, len(self.test_data), chunk_size):
8882+            data = self.uploadable.read(chunk_size)
8883+            data = "".join(data)
8884+            start = i
8885+            end = i + chunk_size
8886+            self.failUnlessEqual(data, self.test_data[start:end])
8887+
8888+
8889+    def test_datahandle_get_size(self):
8890+        actual_size = len(self.test_data)
8891+        size = self.uploadable.get_size()
8892+        self.failUnlessEqual(size, actual_size)
8893+
8894+
8895+    def test_datahandle_get_size_out_of_order(self):
8896+        # We should be able to call get_size whenever we want without
8897+        # disturbing the location of the seek pointer.
8898+        chunk_size = 100
8899+        data = self.uploadable.read(chunk_size)
8900+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8901+
8902+        # Now get the size.
8903+        size = self.uploadable.get_size()
8904+        self.failUnlessEqual(size, len(self.test_data))
8905+
8906+        # Now get more data. We should be right where we left off.
8907+        more_data = self.uploadable.read(chunk_size)
8908+        start = chunk_size
8909+        end = chunk_size * 2
8910+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8911}
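One detail the FileHandle and DataHandle tests rely on implicitly is the short-read behaviour near EOF, which the handles inherit from their underlying filehandles. A tiny sketch, again assuming the publish.py patch above is applied:

    from allmydata.mutable.publish import MutableDataHandle

    dh = MutableDataHandle("abc")
    print "".join(dh.read(10))     # "abc" -- fewer bytes than requested
    print "".join(dh.read(10))     # ""    -- already at EOF
    print dh.get_size()            # 3     -- unaffected by prior reads
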
8912
8913Context:
8914
8915[SFTP: don't call .stopProducing on the producer registered with OverwriteableFileConsumer (which breaks with warner's new downloader).
8916david-sarah@jacaranda.org**20100628231926
8917 Ignore-this: 131b7a5787bc85a9a356b5740d9d996f
8918] 
8919[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
8920david-sarah@jacaranda.org**20100625223929
8921 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
8922] 
8923[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
8924zooko@zooko.com**20100619034928
8925 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
8926] 
8927[trivial: tiny update to in-line comment
8928zooko@zooko.com**20100614045715
8929 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
8930 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
8931] 
8932[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
8933zooko@zooko.com**20100619065318
8934 Ignore-this: dc6db03f696e5b6d2848699e754d8053
8935] 
8936[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
8937zooko@zooko.com**20100619065124
8938 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
8939] 
8940[TAG allmydata-tahoe-1.7.0
8941zooko@zooko.com**20100619052631
8942 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
8943] 
8944Patch bundle hash:
8945f3e32a0a61bba807488a699894d459369596ceae