Ticket #393: 393status18.dpatch

File 393status18.dpatch, 451.9 KB (added by kevan at 2010-07-09T23:47:54Z)
Line 
1Thu Jun 24 16:46:37 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
2  * Misc. changes to support the work I'm doing
3 
4      - Add a notion of file version number to interfaces.py
5      - Alter mutable file node interfaces to have a notion of version,
6        though this may be changed later.
7      - Alter mutable/filenode.py to conform to these changes.
8      - Add a salt hasher to util/hashutil.py
9
10Thu Jun 24 16:48:33 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
11  * nodemaker.py: create MDMF files when asked to
12
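  A rough usage sketch of the resulting API (not part of the patches
  themselves), assuming an already-constructed NodeMaker. The names come from
  the hunks further down; at this point in the series the contents argument is
  still a plain string, which later patches in this bundle replace with
  file-like uploadables:

      from allmydata.interfaces import SDMF_VERSION, MDMF_VERSION  # 0 and 1

      d = nodemaker.create_mutable_file("initial contents",
                                        version=MDMF_VERSION)
      def _created(node):
          # The node remembers which protocol version it will publish as.
          assert node.get_version() == MDMF_VERSION
          # An existing node can also be switched explicitly.
          node.set_version(SDMF_VERSION)
          return node
      d.addCallback(_created)
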
13Thu Jun 24 16:49:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
14  * storage/server.py: minor code cleanup
15
16Thu Jun 24 16:49:24 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
17  * test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
18
19Fri Jun 25 17:35:20 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
20  * test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
21
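  Concretely, corrupt() now returns a Deferred instead of mutating the shares
  synchronously, so its callers chain their checks onto it. The lines below
  are lifted from the test hunks in this patch (self, Monitor, and check_bad
  are the usual test-fixture names there):

      # before: corrupt(None, self._storage, 1, [9]); d = self._fn.check(...)
      # after: the check is chained onto the Deferred that corrupt() returns
      d = corrupt(None, self._storage, 1, [9]) # bad sig in share 9
      d.addCallback(lambda ignored:
          self._fn.check(Monitor(), verify=True))
      d.addCallback(self.check_bad, "test_verify_one_bad_sig")
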
22Sat Jun 26 16:41:18 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
23  * Alter the ServermapUpdater to find MDMF files
24 
25  The servermapupdater should find MDMF files on a grid in the same way
26  that it finds SDMF files. This patch makes it do that.
27
28Sat Jun 26 16:42:04 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
29  * Make a segmented mutable uploader
30 
31  The mutable file uploader should be able to publish files with one
32  segment and files with multiple segments. This patch makes it do that.
33  This is still incomplete, and rather ugly -- I need to flesh out error
34  handling, I need to write tests, and I need to remove some of the uglier
35  kludges in the process before I can call this done.
36
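  Segmentation itself is simple arithmetic: the publish.py hunk below fixes
  DEFAULT_MAX_SEGMENT_SIZE at 128 KiB, so the segment count is a ceiling
  division over the plaintext length. A back-of-the-envelope check (not the
  uploader's actual setup_encoding_parameters() code) using the MDMF test
  contents defined later in this bundle:

      import math

      KiB = 1024
      DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB    # value set in the publish.py hunk

      data = "This is an MDMF file" * 100000  # 2,000,000 bytes, as in the tests
      num_segments = int(math.ceil(len(data) / float(DEFAULT_MAX_SEGMENT_SIZE)))
      assert num_segments == 16               # 2,000,000 / 131,072 rounds up to 16
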
37Sat Jun 26 16:43:14 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
38  * Write a segmented mutable downloader
39 
40  The segmented mutable downloader can deal with MDMF files (files with
41  one or more segments in MDMF format) and SDMF files (files with one
42  segment in SDMF format). It is backwards compatible with the old
43  file format.
44 
45  This patch also contains tests for the segmented mutable downloader.
46
47Mon Jun 28 15:50:48 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
48  * mutable/checker.py: check MDMF files
49 
50  This patch adapts the mutable file checker and verifier to check and
51  verify MDMF files. It does this by using the new segmented downloader,
52  which is trained to perform verification operations on request. This
53  removes some code duplication.
54
55Mon Jun 28 15:52:01 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
56  * mutable/retrieve.py: learn how to verify mutable files
57
58Wed Jun 30 11:33:05 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
59  * interfaces.py: add IMutableSlotWriter
60
61Thu Jul  1 16:28:06 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
62  * test/test_mutable.py: temporarily disable two tests that are now irrelevant
63
64Fri Jul  2 15:55:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
65  * Add MDMF reader and writer, and SDMF writer
66 
67  The MDMF/SDMF reader, MDMF writer, and SDMF writer are similar to the
68  object proxies that exist for immutable files. They abstract away
69  details of connection, state, and caching from their callers (in this
70  case, the download, servermap updater, and uploader), and expose methods
71  to get and set information on the remote server.
72 
73  MDMFSlotReadProxy reads a mutable file from the server, doing the right
74  thing (in most cases) regardless of whether the file is MDMF or SDMF. It
75  allows callers to tell it how to batch and flush reads.
76 
77  MDMFSlotWriteProxy writes an MDMF mutable file to a server.
78 
79  SDMFSlotWriteProxy writes an SDMF mutable file to a server.
80 
81  This patch also includes tests for MDMFSlotReadProxy,
82  SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
83
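  A rough sketch of how a caller drives MDMFSlotReadProxy, pieced together
  from the servermap and test hunks in this bundle (every getter returns a
  Deferred; ss is a storage-server RemoteReference, and if the full share data
  is passed in, ss and storage_index may simply be None):

      from allmydata.mutable.layout import MDMFSlotReadProxy

      reader = MDMFSlotReadProxy(ss, storage_index, shnum, prefetched_data)
      d = reader.get_verinfo()
      def _got_verinfo(verinfo):
          (seqnum, root_hash, salt, segsize, datalen,
           k, n, prefix, offsets) = verinfo
          # The offsets table says where the signature, share hash chain,
          # share data, and encrypted private key live in this share.
          return reader.get_signature()
      d.addCallback(_got_verinfo)
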
84Fri Jul  2 15:55:54 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
85  * mutable/publish.py: cleanup + simplification
86
87Fri Jul  2 15:57:10 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
88  * test/test_mutable.py: remove tests that are no longer relevant
89
90Tue Jul  6 14:52:17 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
91  * interfaces.py: create IMutableUploadable
92
93Tue Jul  6 14:52:57 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
94  * mutable/publish.py: add MutableDataHandle and MutableFileHandle
95
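  The handle classes themselves are not shown in this excerpt, so only a
  hypothetical illustration is possible: the idea is to put in-memory strings
  and open files behind one uploadable-style interface, so the publisher can
  read from either without caring which it has. The class and method names
  below (get_size, read) are guesses suggested by the IMutableUploadable entry
  above, not taken from the patch:

      from StringIO import StringIO   # Python 2, like the rest of this bundle

      class IllustrativeFileHandle:
          # hypothetical stand-in for MutableFileHandle: wraps an open file
          def __init__(self, filehandle):
              self._filehandle = filehandle

          def get_size(self):
              here = self._filehandle.tell()
              self._filehandle.seek(0, 2)      # seek to the end to learn the size
              size = self._filehandle.tell()
              self._filehandle.seek(here)
              return size

          def read(self, length):
              return self._filehandle.read(length)

      class IllustrativeDataHandle(IllustrativeFileHandle):
          # hypothetical stand-in for MutableDataHandle: wraps a string
          def __init__(self, data):
              IllustrativeFileHandle.__init__(self, StringIO(data))
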
96Tue Jul  6 14:55:41 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
97  * mutable/publish.py: reorganize in preparation of file-like uploadables
98
99Tue Jul  6 14:56:49 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
100  * test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
101
102Wed Jul  7 17:00:31 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
103  * Alter tests to work with the new APIs
104
105Wed Jul  7 17:07:32 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
106  * Alter mutable files to use file-like objects for publishing instead of strings.
107
108Thu Jul  8 12:35:22 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
109  * test/test_sftp.py: alter a setup routine to work with new mutable file APIs.
110
111Thu Jul  8 12:36:00 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
112  * mutable/publish.py: make MutableFileHandle seek to the beginning of its file handle before reading.
113
114Fri Jul  9 16:29:12 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
115  * Refactor download interfaces to be more uniform, per #993
116
117Fri Jul  9 16:29:51 PDT 2010  Kevan Carstensen <kevan@isnotajoke.com>
118  * frontends/sftpd.py: alter a mutable file overwrite to work with the new API
119
120New patches:
121
122[Misc. changes to support the work I'm doing
123Kevan Carstensen <kevan@isnotajoke.com>**20100624234637
124 Ignore-this: fdd18fa8cc05f4b4b15ff53ee24a1819
125 
126     - Add a notion of file version number to interfaces.py
127     - Alter mutable file node interfaces to have a notion of version,
128       though this may be changed later.
129     - Alter mutable/filenode.py to conform to these changes.
130     - Add a salt hasher to util/hashutil.py
131] {
132hunk ./src/allmydata/interfaces.py 7
133      ChoiceOf, IntegerConstraint, Any, RemoteInterface, Referenceable
134 
135 HASH_SIZE=32
136+SALT_SIZE=16
137+
138+SDMF_VERSION=0
139+MDMF_VERSION=1
140 
141 Hash = StringConstraint(maxLength=HASH_SIZE,
142                         minLength=HASH_SIZE)# binary format 32-byte SHA256 hash
143hunk ./src/allmydata/interfaces.py 811
144         writer-visible data using this writekey.
145         """
146 
147+    def set_version(version):
148+        """Tahoe-LAFS supports SDMF and MDMF mutable files. By default,
149+        we upload in SDMF for reasons of compatibility. If you want to
150+        change this, set_version will let you do that.
151+
152+        To say that this file should be uploaded in SDMF, pass in a 0. To
153+        say that the file should be uploaded as MDMF, pass in a 1.
154+        """
155+
156+    def get_version():
157+        """Returns the mutable file protocol version."""
158+
159 class NotEnoughSharesError(Exception):
160     """Download was unable to get enough shares"""
161 
162hunk ./src/allmydata/mutable/filenode.py 8
163 from twisted.internet import defer, reactor
164 from foolscap.api import eventually
165 from allmydata.interfaces import IMutableFileNode, \
166-     ICheckable, ICheckResults, NotEnoughSharesError
167+     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
168 from allmydata.util import hashutil, log
169 from allmydata.util.assertutil import precondition
170 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
171hunk ./src/allmydata/mutable/filenode.py 67
172         self._sharemap = {} # known shares, shnum-to-[nodeids]
173         self._cache = ResponseCache()
174         self._most_recent_size = None
175+        # filled in after __init__ if we're being created for the first time;
176+        # filled in by the servermap updater before publishing, otherwise.
177+        # set to this default value in case neither of those things happen,
178+        # or in case the servermap can't find any shares to tell us what
179+        # to publish as.
180+        # TODO: Set this back to None, and find out why the tests fail
181+        #       with it set to None.
182+        self._protocol_version = SDMF_VERSION
183 
184         # all users of this MutableFileNode go through the serializer. This
185         # takes advantage of the fact that Deferreds discard the callbacks
186hunk ./src/allmydata/mutable/filenode.py 472
187     def _did_upload(self, res, size):
188         self._most_recent_size = size
189         return res
190+
191+
192+    def set_version(self, version):
193+        # I can be set in two ways:
194+        #  1. When the node is created.
195+        #  2. (for an existing share) when the Servermap is updated
196+        #     before I am read.
197+        assert version in (MDMF_VERSION, SDMF_VERSION)
198+        self._protocol_version = version
199+
200+
201+    def get_version(self):
202+        return self._protocol_version
203hunk ./src/allmydata/util/hashutil.py 90
204 MUTABLE_READKEY_TAG = "allmydata_mutable_writekey_to_readkey_v1"
205 MUTABLE_DATAKEY_TAG = "allmydata_mutable_readkey_to_datakey_v1"
206 MUTABLE_STORAGEINDEX_TAG = "allmydata_mutable_readkey_to_storage_index_v1"
207+MUTABLE_SALT_TAG = "allmydata_mutable_segment_salt_v1"
208 
209 # dirnodes
210 DIRNODE_CHILD_WRITECAP_TAG = "allmydata_mutable_writekey_and_salt_to_dirnode_child_capkey_v1"
211hunk ./src/allmydata/util/hashutil.py 134
212 def plaintext_segment_hasher():
213     return tagged_hasher(PLAINTEXT_SEGMENT_TAG)
214 
215+def mutable_salt_hash(data):
216+    return tagged_hash(MUTABLE_SALT_TAG, data)
217+def mutable_salt_hasher():
218+    return tagged_hasher(MUTABLE_SALT_TAG)
219+
220 KEYLEN = 16
221 IVLEN = 16
222 
223}
224[nodemaker.py: create MDMF files when asked to
225Kevan Carstensen <kevan@isnotajoke.com>**20100624234833
226 Ignore-this: 26c16aaca9ddab7a7ce37a4530bc970
227] {
228hunk ./src/allmydata/nodemaker.py 3
229 import weakref
230 from zope.interface import implements
231-from allmydata.interfaces import INodeMaker
232+from allmydata.util.assertutil import precondition
233+from allmydata.interfaces import INodeMaker, MustBeDeepImmutableError, \
234+                                 SDMF_VERSION, MDMF_VERSION
235 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
236 from allmydata.immutable.upload import Data
237 from allmydata.mutable.filenode import MutableFileNode
238hunk ./src/allmydata/nodemaker.py 92
239             return self._create_dirnode(filenode)
240         return None
241 
242-    def create_mutable_file(self, contents=None, keysize=None):
243+    def create_mutable_file(self, contents=None, keysize=None,
244+                            version=SDMF_VERSION):
245         n = MutableFileNode(self.storage_broker, self.secret_holder,
246                             self.default_encoding_parameters, self.history)
247hunk ./src/allmydata/nodemaker.py 96
248+        n.set_version(version)
249         d = self.key_generator.generate(keysize)
250         d.addCallback(n.create_with_keys, contents)
251         d.addCallback(lambda res: n)
252hunk ./src/allmydata/nodemaker.py 102
253         return d
254 
255-    def create_new_mutable_directory(self, initial_children={}):
256+    def create_new_mutable_directory(self, initial_children={},
257+                                     version=SDMF_VERSION):
258+        # initial_children must have metadata (i.e. {} instead of None)
259+        for (name, (node, metadata)) in initial_children.iteritems():
260+            precondition(isinstance(metadata, dict),
261+                         "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
262+            node.raise_error()
263         d = self.create_mutable_file(lambda n:
264hunk ./src/allmydata/nodemaker.py 110
265-                                     pack_children(n, initial_children))
266+                                     pack_children(n, initial_children),
267+                                     version)
268         d.addCallback(self._create_dirnode)
269         return d
270 
271}
272[storage/server.py: minor code cleanup
273Kevan Carstensen <kevan@isnotajoke.com>**20100624234905
274 Ignore-this: 2358c531c39e48d3c8e56b62b5768228
275] {
276hunk ./src/allmydata/storage/server.py 569
277                                          self)
278         return share
279 
280-    def remote_slot_readv(self, storage_index, shares, readv):
281+    def remote_slot_readv(self, storage_index, shares, readvs):
282         start = time.time()
283         self.count("readv")
284         si_s = si_b2a(storage_index)
285hunk ./src/allmydata/storage/server.py 590
286             if sharenum in shares or not shares:
287                 filename = os.path.join(bucketdir, sharenum_s)
288                 msf = MutableShareFile(filename, self)
289-                datavs[sharenum] = msf.readv(readv)
290+                datavs[sharenum] = msf.readv(readvs)
291         log.msg("returning shares %s" % (datavs.keys(),),
292                 facility="tahoe.storage", level=log.NOISY, parent=lp)
293         self.add_latency("readv", time.time() - start)
294}
295[test/test_mutable.py: alter some tests that were failing due to MDMF; minor code cleanup.
296Kevan Carstensen <kevan@isnotajoke.com>**20100624234924
297 Ignore-this: afb86ec1fbdbfe1a5ef6f46f350273c0
298] {
299hunk ./src/allmydata/test/test_mutable.py 151
300             chr(ord(original[byte_offset]) ^ 0x01) +
301             original[byte_offset+1:])
302 
303+def add_two(original, byte_offset):
304+    # It isn't enough to simply flip the bit for the version number,
305+    # because 1 is a valid version number. So we add two instead.
306+    return (original[:byte_offset] +
307+            chr(ord(original[byte_offset]) ^ 0x02) +
308+            original[byte_offset+1:])
309+
310 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
311     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
312     # list of shnums to corrupt.
313hunk ./src/allmydata/test/test_mutable.py 187
314                 real_offset = offset1
315             real_offset = int(real_offset) + offset2 + offset_offset
316             assert isinstance(real_offset, int), offset
317-            shares[shnum] = flip_bit(data, real_offset)
318+            if offset1 == 0: # verbyte
319+                f = add_two
320+            else:
321+                f = flip_bit
322+            shares[shnum] = f(data, real_offset)
323     return res
324 
325 def make_storagebroker(s=None, num_peers=10):
326hunk ./src/allmydata/test/test_mutable.py 423
327         d.addCallback(_created)
328         return d
329 
330+
331     def test_modify_backoffer(self):
332         def _modifier(old_contents, servermap, first_time):
333             return old_contents + "line2"
334hunk ./src/allmydata/test/test_mutable.py 658
335         d.addCallback(_created)
336         return d
337 
338+
339     def _copy_shares(self, ignored, index):
340         shares = self._storage._peers
341         # we need a deep copy
342}
343[test/test_mutable.py: change the definition of corrupt() to work with MDMF as well as SDMF files, change users of corrupt to use the new definition
344Kevan Carstensen <kevan@isnotajoke.com>**20100626003520
345 Ignore-this: 836e59e2fde0535f6b4bea3468dc8244
346] {
347hunk ./src/allmydata/test/test_mutable.py 168
348                 and shnum not in shnums_to_corrupt):
349                 continue
350             data = shares[shnum]
351-            (version,
352-             seqnum,
353-             root_hash,
354-             IV,
355-             k, N, segsize, datalen,
356-             o) = unpack_header(data)
357-            if isinstance(offset, tuple):
358-                offset1, offset2 = offset
359-            else:
360-                offset1 = offset
361-                offset2 = 0
362-            if offset1 == "pubkey":
363-                real_offset = 107
364-            elif offset1 in o:
365-                real_offset = o[offset1]
366-            else:
367-                real_offset = offset1
368-            real_offset = int(real_offset) + offset2 + offset_offset
369-            assert isinstance(real_offset, int), offset
370-            if offset1 == 0: # verbyte
371-                f = add_two
372-            else:
373-                f = flip_bit
374-            shares[shnum] = f(data, real_offset)
375-    return res
376+            # We're feeding the reader all of the share data, so it
377+            # won't need to use the rref that we didn't provide, nor the
378+            # storage index that we didn't provide. We do this because
379+            # the reader will work for both MDMF and SDMF.
380+            reader = MDMFSlotReadProxy(None, None, shnum, data)
381+            # We need to get the offsets for the next part.
382+            d = reader.get_verinfo()
383+            def _do_corruption(verinfo, data, shnum):
384+                (seqnum,
385+                 root_hash,
386+                 IV,
387+                 segsize,
388+                 datalen,
389+                 k, n, prefix, o) = verinfo
390+                if isinstance(offset, tuple):
391+                    offset1, offset2 = offset
392+                else:
393+                    offset1 = offset
394+                    offset2 = 0
395+                if offset1 == "pubkey":
396+                    real_offset = 107
397+                elif offset1 in o:
398+                    real_offset = o[offset1]
399+                else:
400+                    real_offset = offset1
401+                real_offset = int(real_offset) + offset2 + offset_offset
402+                assert isinstance(real_offset, int), offset
403+                if offset1 == 0: # verbyte
404+                    f = add_two
405+                else:
406+                    f = flip_bit
407+                shares[shnum] = f(data, real_offset)
408+            d.addCallback(_do_corruption, data, shnum)
409+            ds.append(d)
410+    dl = defer.DeferredList(ds)
411+    dl.addCallback(lambda ignored: res)
412+    return dl
413 
414 def make_storagebroker(s=None, num_peers=10):
415     if not s:
416hunk ./src/allmydata/test/test_mutable.py 1177
417         return d
418 
419     def test_download_fails(self):
420-        corrupt(None, self._storage, "signature")
421-        d = self.shouldFail(UnrecoverableFileError, "test_download_anyway",
422+        d = corrupt(None, self._storage, "signature")
423+        d.addCallback(lambda ignored:
424+            self.shouldFail(UnrecoverableFileError, "test_download_anyway",
425                             "no recoverable versions",
426                             self._fn.download_best_version)
427         return d
428hunk ./src/allmydata/test/test_mutable.py 1232
429         return d
430 
431     def test_check_all_bad_sig(self):
432-        corrupt(None, self._storage, 1) # bad sig
433-        d = self._fn.check(Monitor())
434+        d = corrupt(None, self._storage, 1) # bad sig
435+        d.addCallback(lambda ignored:
436+            self._fn.check(Monitor()))
437         d.addCallback(self.check_bad, "test_check_all_bad_sig")
438         return d
439 
440hunk ./src/allmydata/test/test_mutable.py 1239
441     def test_check_all_bad_blocks(self):
442-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
443+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
444         # the Checker won't notice this.. it doesn't look at actual data
445hunk ./src/allmydata/test/test_mutable.py 1241
446-        d = self._fn.check(Monitor())
447+        d.addCallback(lambda ignored:
448+            self._fn.check(Monitor()))
449         d.addCallback(self.check_good, "test_check_all_bad_blocks")
450         return d
451 
452hunk ./src/allmydata/test/test_mutable.py 1252
453         return d
454 
455     def test_verify_all_bad_sig(self):
456-        corrupt(None, self._storage, 1) # bad sig
457-        d = self._fn.check(Monitor(), verify=True)
458+        d = corrupt(None, self._storage, 1) # bad sig
459+        d.addCallback(lambda ignored:
460+            self._fn.check(Monitor(), verify=True))
461         d.addCallback(self.check_bad, "test_verify_all_bad_sig")
462         return d
463 
464hunk ./src/allmydata/test/test_mutable.py 1259
465     def test_verify_one_bad_sig(self):
466-        corrupt(None, self._storage, 1, [9]) # bad sig
467-        d = self._fn.check(Monitor(), verify=True)
468+        d = corrupt(None, self._storage, 1, [9]) # bad sig
469+        d.addCallback(lambda ignored:
470+            self._fn.check(Monitor(), verify=True))
471         d.addCallback(self.check_bad, "test_verify_one_bad_sig")
472         return d
473 
474hunk ./src/allmydata/test/test_mutable.py 1266
475     def test_verify_one_bad_block(self):
476-        corrupt(None, self._storage, "share_data", [9]) # bad blocks
477+        d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
478         # the Verifier *will* notice this, since it examines every byte
479hunk ./src/allmydata/test/test_mutable.py 1268
480-        d = self._fn.check(Monitor(), verify=True)
481+        d.addCallback(lambda ignored:
482+            self._fn.check(Monitor(), verify=True))
483         d.addCallback(self.check_bad, "test_verify_one_bad_block")
484         d.addCallback(self.check_expected_failure,
485                       CorruptShareError, "block hash tree failure",
486hunk ./src/allmydata/test/test_mutable.py 1277
487         return d
488 
489     def test_verify_one_bad_sharehash(self):
490-        corrupt(None, self._storage, "share_hash_chain", [9], 5)
491-        d = self._fn.check(Monitor(), verify=True)
492+        d = corrupt(None, self._storage, "share_hash_chain", [9], 5)
493+        d.addCallback(lambda ignored:
494+            self._fn.check(Monitor(), verify=True))
495         d.addCallback(self.check_bad, "test_verify_one_bad_sharehash")
496         d.addCallback(self.check_expected_failure,
497                       CorruptShareError, "corrupt hashes",
498hunk ./src/allmydata/test/test_mutable.py 1287
499         return d
500 
501     def test_verify_one_bad_encprivkey(self):
502-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
503-        d = self._fn.check(Monitor(), verify=True)
504+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
505+        d.addCallback(lambda ignored:
506+            self._fn.check(Monitor(), verify=True))
507         d.addCallback(self.check_bad, "test_verify_one_bad_encprivkey")
508         d.addCallback(self.check_expected_failure,
509                       CorruptShareError, "invalid privkey",
510hunk ./src/allmydata/test/test_mutable.py 1297
511         return d
512 
513     def test_verify_one_bad_encprivkey_uncheckable(self):
514-        corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
515+        d = corrupt(None, self._storage, "enc_privkey", [9]) # bad privkey
516         readonly_fn = self._fn.get_readonly()
517         # a read-only node has no way to validate the privkey
518hunk ./src/allmydata/test/test_mutable.py 1300
519-        d = readonly_fn.check(Monitor(), verify=True)
520+        d.addCallback(lambda ignored:
521+            readonly_fn.check(Monitor(), verify=True))
522         d.addCallback(self.check_good,
523                       "test_verify_one_bad_encprivkey_uncheckable")
524         return d
525}
526[Alter the ServermapUpdater to find MDMF files
527Kevan Carstensen <kevan@isnotajoke.com>**20100626234118
528 Ignore-this: 25f6278209c2983ba8f307cfe0fde0
529 
530 The servermapupdater should find MDMF files on a grid in the same way
531 that it finds SDMF files. This patch makes it do that.
532] {
533hunk ./src/allmydata/mutable/servermap.py 7
534 from itertools import count
535 from twisted.internet import defer
536 from twisted.python import failure
537-from foolscap.api import DeadReferenceError, RemoteException, eventually
538+from foolscap.api import DeadReferenceError, RemoteException, eventually, \
539+                         fireEventually
540 from allmydata.util import base32, hashutil, idlib, log
541 from allmydata.storage.server import si_b2a
542 from allmydata.interfaces import IServermapUpdaterStatus
543hunk ./src/allmydata/mutable/servermap.py 17
544 from allmydata.mutable.common import MODE_CHECK, MODE_ANYTHING, MODE_WRITE, MODE_READ, \
545      DictOfSets, CorruptShareError, NeedMoreDataError
546 from allmydata.mutable.layout import unpack_prefix_and_signature, unpack_header, unpack_share, \
547-     SIGNED_PREFIX_LENGTH
548+     SIGNED_PREFIX_LENGTH, MDMFSlotReadProxy
549 
550 class UpdateStatus:
551     implements(IServermapUpdaterStatus)
552hunk ./src/allmydata/mutable/servermap.py 254
553         """Return a set of versionids, one for each version that is currently
554         recoverable."""
555         versionmap = self.make_versionmap()
556-
557         recoverable_versions = set()
558         for (verinfo, shares) in versionmap.items():
559             (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
560hunk ./src/allmydata/mutable/servermap.py 366
561         self._servers_responded = set()
562 
563         # how much data should we read?
564+        # SDMF:
565         #  * if we only need the checkstring, then [0:75]
566         #  * if we need to validate the checkstring sig, then [543ish:799ish]
567         #  * if we need the verification key, then [107:436ish]
568hunk ./src/allmydata/mutable/servermap.py 374
569         #  * if we need the encrypted private key, we want [-1216ish:]
570         #   * but we can't read from negative offsets
571         #   * the offset table tells us the 'ish', also the positive offset
572-        # A future version of the SMDF slot format should consider using
573-        # fixed-size slots so we can retrieve less data. For now, we'll just
574-        # read 2000 bytes, which also happens to read enough actual data to
575-        # pre-fetch a 9-entry dirnode.
576+        # MDMF:
577+        #  * Checkstring? [0:72]
578+        #  * If we want to validate the checkstring, then [0:72], [143:?] --
579+        #    the offset table will tell us for sure.
580+        #  * If we need the verification key, we have to consult the offset
581+        #    table as well.
582+        # At this point, we don't know which we are. Our filenode can
583+        # tell us, but it might be lying -- in some cases, we're
584+        # responsible for telling it which kind of file it is.
585         self._read_size = 4000
586         if mode == MODE_CHECK:
587             # we use unpack_prefix_and_signature, so we need 1k
588hunk ./src/allmydata/mutable/servermap.py 432
589         self._queries_completed = 0
590 
591         sb = self._storage_broker
592+        # All of the peers, permuted by the storage index, as usual.
593         full_peerlist = sb.get_servers_for_index(self._storage_index)
594         self.full_peerlist = full_peerlist # for use later, immutable
595         self.extra_peers = full_peerlist[:] # peers are removed as we use them
596hunk ./src/allmydata/mutable/servermap.py 439
597         self._good_peers = set() # peers who had some shares
598         self._empty_peers = set() # peers who don't have any shares
599         self._bad_peers = set() # peers to whom our queries failed
600+        self._readers = {} # peerid -> dict(sharewriters), filled in
601+                           # after responses come in.
602 
603         k = self._node.get_required_shares()
604hunk ./src/allmydata/mutable/servermap.py 443
605+        # For what cases can these conditions work?
606         if k is None:
607             # make a guess
608             k = 3
609hunk ./src/allmydata/mutable/servermap.py 456
610         self.num_peers_to_query = k + self.EPSILON
611 
612         if self.mode == MODE_CHECK:
613+            # We want to query all of the peers.
614             initial_peers_to_query = dict(full_peerlist)
615             must_query = set(initial_peers_to_query.keys())
616             self.extra_peers = []
617hunk ./src/allmydata/mutable/servermap.py 464
618             # we're planning to replace all the shares, so we want a good
619             # chance of finding them all. We will keep searching until we've
620             # seen epsilon that don't have a share.
621+            # We don't query all of the peers because that could take a while.
622             self.num_peers_to_query = N + self.EPSILON
623             initial_peers_to_query, must_query = self._build_initial_querylist()
624             self.required_num_empty_peers = self.EPSILON
625hunk ./src/allmydata/mutable/servermap.py 474
626             # might also avoid the round trip required to read the encrypted
627             # private key.
628 
629-        else:
630+        else: # MODE_READ, MODE_ANYTHING
631+            # 2k peers is good enough.
632             initial_peers_to_query, must_query = self._build_initial_querylist()
633 
634         # this is a set of peers that we are required to get responses from:
635hunk ./src/allmydata/mutable/servermap.py 490
636         # before we can consider ourselves finished, and self.extra_peers
637         # contains the overflow (peers that we should tap if we don't get
638         # enough responses)
639+        # I guess that self._must_query is a subset of
640+        # initial_peers_to_query?
641+        assert set(must_query).issubset(set(initial_peers_to_query))
642 
643         self._send_initial_requests(initial_peers_to_query)
644         self._status.timings["initial_queries"] = time.time() - self._started
645hunk ./src/allmydata/mutable/servermap.py 549
646         # errors that aren't handled by _query_failed (and errors caused by
647         # _query_failed) get logged, but we still want to check for doneness.
648         d.addErrback(log.err)
649-        d.addBoth(self._check_for_done)
650         d.addErrback(self._fatal_error)
651hunk ./src/allmydata/mutable/servermap.py 550
652+        d.addCallback(self._check_for_done)
653         return d
654 
655     def _do_read(self, ss, peerid, storage_index, shnums, readv):
656hunk ./src/allmydata/mutable/servermap.py 569
657         d = ss.callRemote("slot_readv", storage_index, shnums, readv)
658         return d
659 
660+
661+    def _got_corrupt_share(self, e, shnum, peerid, data, lp):
662+        """
663+        I am called when a remote server returns a corrupt share in
664+        response to one of our queries. By corrupt, I mean a share
665+        without a valid signature. I then record the failure, notify the
666+        server of the corruption, and record the share as bad.
667+        """
668+        f = failure.Failure(e)
669+        self.log(format="bad share: %(f_value)s", f_value=str(f),
670+                 failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
671+        # Notify the server that its share is corrupt.
672+        self.notify_server_corruption(peerid, shnum, str(e))
673+        # By flagging this as a bad peer, we won't count any of
674+        # the other shares on that peer as valid, though if we
675+        # happen to find a valid version string amongst those
676+        # shares, we'll keep track of it so that we don't need
677+        # to validate the signature on those again.
678+        self._bad_peers.add(peerid)
679+        self._last_failure = f
680+        # XXX: Use the reader for this?
681+        checkstring = data[:SIGNED_PREFIX_LENGTH]
682+        self._servermap.mark_bad_share(peerid, shnum, checkstring)
683+        self._servermap.problems.append(f)
684+
685+
686+    def _cache_good_sharedata(self, verinfo, shnum, now, data):
687+        """
688+        If one of my queries returns successfully (which means that we
689+        were able to and successfully did validate the signature), I
690+        cache the data that we initially fetched from the storage
691+        server. This will help reduce the number of roundtrips that need
692+        to occur when the file is downloaded, or when the file is
693+        updated.
694+        """
695+        self._node._add_to_cache(verinfo, shnum, 0, data, now)
696+
697+
698     def _got_results(self, datavs, peerid, readsize, stuff, started):
699         lp = self.log(format="got result from [%(peerid)s], %(numshares)d shares",
700                       peerid=idlib.shortnodeid_b2a(peerid),
701hunk ./src/allmydata/mutable/servermap.py 630
702         else:
703             self._empty_peers.add(peerid)
704 
705-        last_verinfo = None
706-        last_shnum = None
707+        ss, storage_index = stuff
708+        ds = []
709+
710         for shnum,datav in datavs.items():
711             data = datav[0]
712hunk ./src/allmydata/mutable/servermap.py 635
713-            try:
714-                verinfo = self._got_results_one_share(shnum, data, peerid, lp)
715-                last_verinfo = verinfo
716-                last_shnum = shnum
717-                self._node._add_to_cache(verinfo, shnum, 0, data, now)
718-            except CorruptShareError, e:
719-                # log it and give the other shares a chance to be processed
720-                f = failure.Failure()
721-                self.log(format="bad share: %(f_value)s", f_value=str(f.value),
722-                         failure=f, parent=lp, level=log.WEIRD, umid="h5llHg")
723-                self.notify_server_corruption(peerid, shnum, str(e))
724-                self._bad_peers.add(peerid)
725-                self._last_failure = f
726-                checkstring = data[:SIGNED_PREFIX_LENGTH]
727-                self._servermap.mark_bad_share(peerid, shnum, checkstring)
728-                self._servermap.problems.append(f)
729-                pass
730-
731-        self._status.timings["cumulative_verify"] += (time.time() - now)
732+            reader = MDMFSlotReadProxy(ss,
733+                                       storage_index,
734+                                       shnum,
735+                                       data)
736+            self._readers.setdefault(peerid, dict())[shnum] = reader
737+            # our goal, with each response, is to validate the version
738+            # information and share data as best we can at this point --
739+            # we do this by validating the signature. To do this, we
740+            # need to do the following:
741+            #   - If we don't already have the public key, fetch the
742+            #     public key. We use this to validate the signature.
743+            if not self._node.get_pubkey():
744+                # fetch and set the public key.
745+                d = reader.get_verification_key()
746+                d.addCallback(lambda results, shnum=shnum, peerid=peerid:
747+                    self._try_to_set_pubkey(results, peerid, shnum, lp))
748+                # XXX: Make self._pubkey_query_failed?
749+                d.addErrback(lambda error, shnum=shnum, peerid=peerid:
750+                    self._got_corrupt_share(error, shnum, peerid, data, lp))
751+            else:
752+                # we already have the public key.
753+                d = defer.succeed(None)
754+            # Neither of these two branches return anything of
755+            # consequence, so the first entry in our deferredlist will
756+            # be None.
757 
758hunk ./src/allmydata/mutable/servermap.py 661
759-        if self._need_privkey and last_verinfo:
760-            # send them a request for the privkey. We send one request per
761-            # server.
762-            lp2 = self.log("sending privkey request",
763-                           parent=lp, level=log.NOISY)
764-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
765-             offsets_tuple) = last_verinfo
766-            o = dict(offsets_tuple)
767+            # - Next, we need the version information. We almost
768+            #   certainly got this by reading the first thousand or so
769+            #   bytes of the share on the storage server, so we
770+            #   shouldn't need to fetch anything at this step.
771+            d2 = reader.get_verinfo()
772+            d2.addErrback(lambda error, shnum=shnum, peerid=peerid:
773+                self._got_corrupt_share(error, shnum, peerid, data, lp))
774+            # - Next, we need the signature. For an SDMF share, it is
775+            #   likely that we fetched this when doing our initial fetch
776+            #   to get the version information. In MDMF, this lives at
777+            #   the end of the share, so unless the file is quite small,
778+            #   we'll need to do a remote fetch to get it.
779+            d3 = reader.get_signature()
780+            d3.addErrback(lambda error, shnum=shnum, peerid=peerid:
781+                self._got_corrupt_share(error, shnum, peerid, data, lp))
782+            #  Once we have all three of these responses, we can move on
783+            #  to validating the signature
784 
785hunk ./src/allmydata/mutable/servermap.py 679
786-            self._queries_outstanding.add(peerid)
787-            readv = [ (o['enc_privkey'], (o['EOF'] - o['enc_privkey'])) ]
788-            ss = self._servermap.connections[peerid]
789-            privkey_started = time.time()
790-            d = self._do_read(ss, peerid, self._storage_index,
791-                              [last_shnum], readv)
792-            d.addCallback(self._got_privkey_results, peerid, last_shnum,
793-                          privkey_started, lp2)
794-            d.addErrback(self._privkey_query_failed, peerid, last_shnum, lp2)
795-            d.addErrback(log.err)
796-            d.addCallback(self._check_for_done)
797-            d.addErrback(self._fatal_error)
798+            # Does the node already have a privkey? If not, we'll try to
799+            # fetch it here.
800+            if self._need_privkey:
801+                d4 = reader.get_encprivkey()
802+                d4.addCallback(lambda results, shnum=shnum, peerid=peerid:
803+                    self._try_to_validate_privkey(results, peerid, shnum, lp))
804+                d4.addErrback(lambda error, shnum=shnum, peerid=peerid:
805+                    self._privkey_query_failed(error, shnum, data, lp))
806+            else:
807+                d4 = defer.succeed(None)
808 
809hunk ./src/allmydata/mutable/servermap.py 690
810+            dl = defer.DeferredList([d, d2, d3, d4])
811+            dl.addCallback(lambda results, shnum=shnum, peerid=peerid:
812+                self._got_signature_one_share(results, shnum, peerid, lp))
813+            dl.addErrback(lambda error, shnum=shnum, data=data:
814+               self._got_corrupt_share(error, shnum, peerid, data, lp))
815+            dl.addCallback(lambda verinfo, shnum=shnum, peerid=peerid, data=data:
816+                self._cache_good_sharedata(verinfo, shnum, now, data))
817+            ds.append(dl)
818+        # dl is a deferred list that will fire when all of the shares
819+        # that we found on this peer are done processing. When dl fires,
820+        # we know that processing is done, so we can decrement the
821+        # semaphore-like thing that we incremented earlier.
822+        dl = defer.DeferredList(ds, fireOnOneErrback=True)
823+        # Are we done? Done means that there are no more queries to
824+        # send, that there are no outstanding queries, and that we
825+        # haven't received any queries that are still processing. If we
826+        # are done, self._check_for_done will cause the done deferred
827+        # that we returned to our caller to fire, which tells them that
828+        # they have a complete servermap, and that we won't be touching
829+        # the servermap anymore.
830+        dl.addCallback(self._check_for_done)
831+        dl.addErrback(self._fatal_error)
832         # all done!
833         self.log("_got_results done", parent=lp, level=log.NOISY)
834hunk ./src/allmydata/mutable/servermap.py 714
835+        return dl
836+
837+
838+    def _try_to_set_pubkey(self, pubkey_s, peerid, shnum, lp):
839+        if self._node.get_pubkey():
840+            return # don't go through this again if we don't have to
841+        fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
842+        assert len(fingerprint) == 32
843+        if fingerprint != self._node.get_fingerprint():
844+            raise CorruptShareError(peerid, shnum,
845+                                "pubkey doesn't match fingerprint")
846+        self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
847+        assert self._node.get_pubkey()
848+
849 
850     def notify_server_corruption(self, peerid, shnum, reason):
851         ss = self._servermap.connections[peerid]
852hunk ./src/allmydata/mutable/servermap.py 734
853         ss.callRemoteOnly("advise_corrupt_share",
854                           "mutable", self._storage_index, shnum, reason)
855 
856-    def _got_results_one_share(self, shnum, data, peerid, lp):
857+
858+    def _got_signature_one_share(self, results, shnum, peerid, lp):
859+        # It is our job to give versioninfo to our caller. We need to
860+        # raise CorruptShareError if the share is corrupt for any
861+        # reason, something that our caller will handle.
862         self.log(format="_got_results: got shnum #%(shnum)d from peerid %(peerid)s",
863                  shnum=shnum,
864                  peerid=idlib.shortnodeid_b2a(peerid),
865hunk ./src/allmydata/mutable/servermap.py 744
866                  level=log.NOISY,
867                  parent=lp)
868-
869-        # this might raise NeedMoreDataError, if the pubkey and signature
870-        # live at some weird offset. That shouldn't happen, so I'm going to
871-        # treat it as a bad share.
872-        (seqnum, root_hash, IV, k, N, segsize, datalength,
873-         pubkey_s, signature, prefix) = unpack_prefix_and_signature(data)
874-
875-        if not self._node.get_pubkey():
876-            fingerprint = hashutil.ssk_pubkey_fingerprint_hash(pubkey_s)
877-            assert len(fingerprint) == 32
878-            if fingerprint != self._node.get_fingerprint():
879-                raise CorruptShareError(peerid, shnum,
880-                                        "pubkey doesn't match fingerprint")
881-            self._node._populate_pubkey(self._deserialize_pubkey(pubkey_s))
882-
883-        if self._need_privkey:
884-            self._try_to_extract_privkey(data, peerid, shnum, lp)
885-
886-        (ig_version, ig_seqnum, ig_root_hash, ig_IV, ig_k, ig_N,
887-         ig_segsize, ig_datalen, offsets) = unpack_header(data)
888+        _, verinfo, signature, __ = results
889+        (seqnum,
890+         root_hash,
891+         saltish,
892+         segsize,
893+         datalen,
894+         k,
895+         n,
896+         prefix,
897+         offsets) = verinfo[1]
898         offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
899 
900hunk ./src/allmydata/mutable/servermap.py 756
901-        verinfo = (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
902+        # XXX: This should be done for us in the method, so
903+        # presumably you can go in there and fix it.
904+        verinfo = (seqnum,
905+                   root_hash,
906+                   saltish,
907+                   segsize,
908+                   datalen,
909+                   k,
910+                   n,
911+                   prefix,
912                    offsets_tuple)
913hunk ./src/allmydata/mutable/servermap.py 767
914+        # This tuple uniquely identifies a share on the grid; we use it
915+        # to keep track of the ones that we've already seen.
916 
917         if verinfo not in self._valid_versions:
918hunk ./src/allmydata/mutable/servermap.py 771
919-            # it's a new pair. Verify the signature.
920-            valid = self._node.get_pubkey().verify(prefix, signature)
921+            # This is a new version tuple, and we need to validate it
922+            # against the public key before keeping track of it.
923+            assert self._node.get_pubkey()
924+            valid = self._node.get_pubkey().verify(prefix, signature[1])
925             if not valid:
926hunk ./src/allmydata/mutable/servermap.py 776
927-                raise CorruptShareError(peerid, shnum, "signature is invalid")
928+                raise CorruptShareError(peerid, shnum,
929+                                        "signature is invalid")
930 
931hunk ./src/allmydata/mutable/servermap.py 779
932-            # ok, it's a valid verinfo. Add it to the list of validated
933-            # versions.
934-            self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
935-                     % (seqnum, base32.b2a(root_hash)[:4],
936-                        idlib.shortnodeid_b2a(peerid), shnum,
937-                        k, N, segsize, datalength),
938-                     parent=lp)
939-            self._valid_versions.add(verinfo)
940-        # We now know that this is a valid candidate verinfo.
941+        # ok, it's a valid verinfo. Add it to the list of validated
942+        # versions.
943+        self.log(" found valid version %d-%s from %s-sh%d: %d-%d/%d/%d"
944+                 % (seqnum, base32.b2a(root_hash)[:4],
945+                    idlib.shortnodeid_b2a(peerid), shnum,
946+                    k, n, segsize, datalen),
947+                    parent=lp)
948+        self._valid_versions.add(verinfo)
949+        # We now know that this is a valid candidate verinfo. Whether or
950+        # not this instance of it is valid is a matter for the next
951+        # statement; at this point, we just know that if we see this
952+        # version info again, that its signature checks out and that
953+        # we're okay to skip the signature-checking step.
954 
955hunk ./src/allmydata/mutable/servermap.py 793
956+        # (peerid, shnum) are bound in the method invocation.
957         if (peerid, shnum) in self._servermap.bad_shares:
958             # we've been told that the rest of the data in this share is
959             # unusable, so don't add it to the servermap.
960hunk ./src/allmydata/mutable/servermap.py 808
961         self.versionmap.add(verinfo, (shnum, peerid, timestamp))
962         return verinfo
963 
964+
965     def _deserialize_pubkey(self, pubkey_s):
966         verifier = rsa.create_verifying_key_from_string(pubkey_s)
967         return verifier
968hunk ./src/allmydata/mutable/servermap.py 813
969 
970-    def _try_to_extract_privkey(self, data, peerid, shnum, lp):
971-        try:
972-            r = unpack_share(data)
973-        except NeedMoreDataError, e:
974-            # this share won't help us. oh well.
975-            offset = e.encprivkey_offset
976-            length = e.encprivkey_length
977-            self.log("shnum %d on peerid %s: share was too short (%dB) "
978-                     "to get the encprivkey; [%d:%d] ought to hold it" %
979-                     (shnum, idlib.shortnodeid_b2a(peerid), len(data),
980-                      offset, offset+length),
981-                     parent=lp)
982-            # NOTE: if uncoordinated writes are taking place, someone might
983-            # change the share (and most probably move the encprivkey) before
984-            # we get a chance to do one of these reads and fetch it. This
985-            # will cause us to see a NotEnoughSharesError(unable to fetch
986-            # privkey) instead of an UncoordinatedWriteError . This is a
987-            # nuisance, but it will go away when we move to DSA-based mutable
988-            # files (since the privkey will be small enough to fit in the
989-            # write cap).
990-
991-            return
992-
993-        (seqnum, root_hash, IV, k, N, segsize, datalen,
994-         pubkey, signature, share_hash_chain, block_hash_tree,
995-         share_data, enc_privkey) = r
996-
997-        return self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
998 
999     def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
1000hunk ./src/allmydata/mutable/servermap.py 815
1001-
1002+        """
1003+        Given a writekey from a remote server, I validate it against the
1004+        writekey stored in my node. If it is valid, then I set the
1005+        privkey and encprivkey properties of the node.
1006+        """
1007         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
1008         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
1009         if alleged_writekey != self._node.get_writekey():
1010hunk ./src/allmydata/mutable/servermap.py 892
1011         self._queries_completed += 1
1012         self._last_failure = f
1013 
1014-    def _got_privkey_results(self, datavs, peerid, shnum, started, lp):
1015-        now = time.time()
1016-        elapsed = now - started
1017-        self._status.add_per_server_time(peerid, "privkey", started, elapsed)
1018-        self._queries_outstanding.discard(peerid)
1019-        if not self._need_privkey:
1020-            return
1021-        if shnum not in datavs:
1022-            self.log("privkey wasn't there when we asked it",
1023-                     level=log.WEIRD, umid="VA9uDQ")
1024-            return
1025-        datav = datavs[shnum]
1026-        enc_privkey = datav[0]
1027-        self._try_to_validate_privkey(enc_privkey, peerid, shnum, lp)
1028 
1029     def _privkey_query_failed(self, f, peerid, shnum, lp):
1030         self._queries_outstanding.discard(peerid)
1031hunk ./src/allmydata/mutable/servermap.py 906
1032         self._servermap.problems.append(f)
1033         self._last_failure = f
1034 
1035+
1036     def _check_for_done(self, res):
1037         # exit paths:
1038         #  return self._send_more_queries(outstanding) : send some more queries
1039hunk ./src/allmydata/mutable/servermap.py 912
1040         #  return self._done() : all done
1041         #  return : keep waiting, no new queries
1042-
1043         lp = self.log(format=("_check_for_done, mode is '%(mode)s', "
1044                               "%(outstanding)d queries outstanding, "
1045                               "%(extra)d extra peers available, "
1046hunk ./src/allmydata/mutable/servermap.py 1117
1047         self._servermap.last_update_time = self._started
1048         # the servermap will not be touched after this
1049         self.log("servermap: %s" % self._servermap.summarize_versions())
1050+
1051         eventually(self._done_deferred.callback, self._servermap)
1052 
1053     def _fatal_error(self, f):
1054hunk ./src/allmydata/test/test_mutable.py 637
1055         d.addCallback(_created)
1056         return d
1057 
1058-    def publish_multiple(self):
1059+    def publish_mdmf(self):
1060+        # like publish_one, except that the result is guaranteed to be
1061+        # an MDMF file.
1062+        # self.CONTENTS should have more than one segment.
1063+        self.CONTENTS = "This is an MDMF file" * 100000
1064+        self._storage = FakeStorage()
1065+        self._nodemaker = make_nodemaker(self._storage)
1066+        self._storage_broker = self._nodemaker.storage_broker
1067+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
1068+        def _created(node):
1069+            self._fn = node
1070+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1071+        d.addCallback(_created)
1072+        return d
1073+
1074+
1075+    def publish_sdmf(self):
1076+        # like publish_one, except that the result is guaranteed to be
1077+        # an SDMF file
1078+        self.CONTENTS = "This is an SDMF file" * 1000
1079+        self._storage = FakeStorage()
1080+        self._nodemaker = make_nodemaker(self._storage)
1081+        self._storage_broker = self._nodemaker.storage_broker
1082+        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
1083+        def _created(node):
1084+            self._fn = node
1085+            self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
1086+        d.addCallback(_created)
1087+        return d
1088+
1089+
1090+    def publish_multiple(self, version=0):
1091         self.CONTENTS = ["Contents 0",
1092                          "Contents 1",
1093                          "Contents 2",
1094hunk ./src/allmydata/test/test_mutable.py 677
1095         self._copied_shares = {}
1096         self._storage = FakeStorage()
1097         self._nodemaker = make_nodemaker(self._storage)
1098-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0]) # seqnum=1
1099+        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
1100         def _created(node):
1101             self._fn = node
1102             # now create multiple versions of the same file, and accumulate
1103hunk ./src/allmydata/test/test_mutable.py 906
1104         return d
1105 
1106 
1107+    def test_servermapupdater_finds_mdmf_files(self):
1108+        # setUp already published an MDMF file for us. We just need to
1109+        # make sure that when we run the ServermapUpdater, the file is
1110+        # reported to have one recoverable version.
1111+        d = defer.succeed(None)
1112+        d.addCallback(lambda ignored:
1113+            self.publish_mdmf())
1114+        d.addCallback(lambda ignored:
1115+            self.make_servermap(mode=MODE_CHECK))
1116+        # Calling make_servermap also updates the servermap in the mode
1117+        # that we specify, so we just need to see what it says.
1118+        def _check_servermap(sm):
1119+            self.failUnlessEqual(len(sm.recoverable_versions()), 1)
1120+        d.addCallback(_check_servermap)
1121+        return d
1122+
1123+
1124+    def test_servermapupdater_finds_sdmf_files(self):
1125+        d = defer.succeed(None)
1126+        d.addCallback(lambda ignored:
1127+            self.publish_sdmf())
1128+        d.addCallback(lambda ignored:
1129+            self.make_servermap(mode=MODE_CHECK))
1130+        d.addCallback(lambda servermap:
1131+            self.failUnlessEqual(len(servermap.recoverable_versions()), 1))
1132+        return d
1133+
1134 
1135 class Roundtrip(unittest.TestCase, testutil.ShouldFailMixin, PublishMixin):
1136     def setUp(self):
1137hunk ./src/allmydata/test/test_mutable.py 1050
1138         return d
1139     test_no_servers_download.timeout = 15
1140 
1141+
1142     def _test_corrupt_all(self, offset, substring,
1143                           should_succeed=False, corrupt_early=True,
1144                           failure_checker=None):
1145}
1146[Make a segmented mutable uploader
1147Kevan Carstensen <kevan@isnotajoke.com>**20100626234204
1148 Ignore-this: d199af8ab0bc64d8ed2bc19c5437bfba
1149 
1150 The mutable file uploader should be able to publish files with one
1151 segment and files with multiple segments. This patch makes it do that.
1152 This is still incomplete, and rather ugly -- I need to flesh out error
1153 handling, I need to write tests, and I need to remove some of the uglier
1154 kludges in the process before I can call this done.
1155] {
1156hunk ./src/allmydata/mutable/publish.py 8
1157 from zope.interface import implements
1158 from twisted.internet import defer
1159 from twisted.python import failure
1160-from allmydata.interfaces import IPublishStatus
1161+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
1162 from allmydata.util import base32, hashutil, mathutil, idlib, log
1163 from allmydata import hashtree, codec
1164 from allmydata.storage.server import si_b2a
1165hunk ./src/allmydata/mutable/publish.py 19
1166      UncoordinatedWriteError, NotEnoughServersError
1167 from allmydata.mutable.servermap import ServerMap
1168 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
1169-     unpack_checkstring, SIGNED_PREFIX
1170+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
1171+
1172+KiB = 1024
1173+DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
1174 
1175 class PublishStatus:
1176     implements(IPublishStatus)
1177hunk ./src/allmydata/mutable/publish.py 112
1178         self._status.set_helper(False)
1179         self._status.set_progress(0.0)
1180         self._status.set_active(True)
1181+        # We use this to control how the file is written.
1182+        version = self._node.get_version()
1183+        assert version in (SDMF_VERSION, MDMF_VERSION)
1184+        self._version = version
1185 
1186     def get_status(self):
1187         return self._status
1188hunk ./src/allmydata/mutable/publish.py 134
1189         simultaneous write.
1190         """
1191 
1192-        # 1: generate shares (SDMF: files are small, so we can do it in RAM)
1193-        # 2: perform peer selection, get candidate servers
1194-        #  2a: send queries to n+epsilon servers, to determine current shares
1195-        #  2b: based upon responses, create target map
1196-        # 3: send slot_testv_and_readv_and_writev messages
1197-        # 4: as responses return, update share-dispatch table
1198-        # 4a: may need to run recovery algorithm
1199-        # 5: when enough responses are back, we're done
1200+        # 0. Setup encoding parameters, encoder, and other such things.
1201+        # 1. Encrypt, encode, and publish segments.
1202 
1203         self.log("starting publish, datalen is %s" % len(newdata))
1204         self._status.set_size(len(newdata))
1205hunk ./src/allmydata/mutable/publish.py 187
1206         self.bad_peers = set() # peerids who have errbacked/refused requests
1207 
1208         self.newdata = newdata
1209-        self.salt = os.urandom(16)
1210 
1211hunk ./src/allmydata/mutable/publish.py 188
1212+        # This will set self.segment_size, self.num_segments, and
1213+        # self.fec.
1214         self.setup_encoding_parameters()
1215 
1216         # if we experience any surprises (writes which were rejected because
1217hunk ./src/allmydata/mutable/publish.py 238
1218             self.bad_share_checkstrings[key] = old_checkstring
1219             self.connections[peerid] = self._servermap.connections[peerid]
1220 
1221-        # create the shares. We'll discard these as they are delivered. SDMF:
1222-        # we're allowed to hold everything in memory.
1223+        # Now the process forks -- if this is an SDMF file, we need
1224+        # to write an SDMF file. Otherwise, we need to write an MDMF
1225+        # file.
1226+        if self._version == MDMF_VERSION:
1227+            return self._publish_mdmf()
1228+        else:
1229+            return self._publish_sdmf()
1230+        #return self.done_deferred
1231+
1232+    def _publish_mdmf(self):
1233+        # Next, we find homes for all of the shares that we don't have
1234+        # homes for yet.
1235+        # TODO: Make this part do peer selection.
1236+        self.update_goal()
1237+        self.writers = {}
1238+        # For each (peerid, shnum) in self.goal, we make an
1239+        # MDMFSlotWriteProxy for that peer. We'll use this to write
1240+        # shares to the peer.
1241+        for key in self.goal:
1242+            peerid, shnum = key
1243+            write_enabler = self._node.get_write_enabler(peerid)
1244+            renew_secret = self._node.get_renewal_secret(peerid)
1245+            cancel_secret = self._node.get_cancel_secret(peerid)
1246+            secrets = (write_enabler, renew_secret, cancel_secret)
1247+
1248+            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
1249+                                                      self.connections[peerid],
1250+                                                      self._storage_index,
1251+                                                      secrets,
1252+                                                      self._new_seqnum,
1253+                                                      self.required_shares,
1254+                                                      self.total_shares,
1255+                                                      self.segment_size,
1256+                                                      len(self.newdata))
1257+            if (peerid, shnum) in self._servermap.servermap:
1258+                old_versionid, old_timestamp = self._servermap.servermap[key]
1259+                (old_seqnum, old_root_hash, old_salt, old_segsize,
1260+                 old_datalength, old_k, old_N, old_prefix,
1261+                 old_offsets_tuple) = old_versionid
1262+                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
1263+
1264+        # Now, we start pushing shares.
1265+        self._status.timings["setup"] = time.time() - self._started
1266+        def _start_pushing(res):
1267+            self._started_pushing = time.time()
1268+            return res
1269+
1270+        # First, we encrypt, encode, and publish the shares that we need
1271+        # to encrypt, encode, and publish.
1272+
1273+        # This will eventually hold the block hash chain for each share
1274+        # that we publish. We define it this way so that empty publishes
1275+        # will still have something to write to the remote slot.
1276+        self.blockhashes = dict([(i, []) for i in xrange(self.total_shares)])
1277+        self.sharehash_leaves = None # eventually [sharehashes]
1278+        self.sharehashes = {} # shnum -> [sharehash leaves necessary to
1279+                              # validate the share]
1280 
1281hunk ./src/allmydata/mutable/publish.py 296
1282+        d = defer.succeed(None)
1283+        self.log("Starting push")
1284+        for i in xrange(self.num_segments - 1):
1285+            d.addCallback(lambda ignored, i=i:
1286+                self.push_segment(i))
1287+            d.addCallback(self._turn_barrier)
1288+        # If we have at least one segment, the last one is a tail segment
1289+        if self.num_segments > 0:
1290+            d.addCallback(lambda ignored:
1291+                self.push_tail_segment())
1292+
1293+        d.addCallback(lambda ignored:
1294+            self.push_encprivkey())
1295+        d.addCallback(lambda ignored:
1296+            self.push_blockhashes())
1297+        d.addCallback(lambda ignored:
1298+            self.push_sharehashes())
1299+        d.addCallback(lambda ignored:
1300+            self.push_toplevel_hashes_and_signature())
1301+        d.addCallback(lambda ignored:
1302+            self.finish_publishing())
1303+        return d
1304+
1305+
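The segment loop in _publish_mdmf above leans on a small but important idiom: each lambda freezes the loop index through a default argument (i=i), so every queued callback pushes its own segment instead of whatever value i holds when the Deferreds finally fire. A minimal standalone sketch of that chaining pattern, with hypothetical push helpers standing in for the real Publish methods:

    from twisted.internet import defer

    def push_segment(segnum):
        # stand-in for Publish.push_segment; already-fired Deferred for brevity
        return defer.succeed("pushed segment %d" % segnum)

    def push_tail_segment():
        return defer.succeed("pushed tail segment")

    num_segments = 4
    d = defer.succeed(None)
    for i in xrange(num_segments - 1):
        # i=i captures the current value; a bare "lambda ignored:
        # push_segment(i)" would push segment num_segments - 2 every time.
        d.addCallback(lambda ignored, i=i: push_segment(i))
    if num_segments > 0:
        d.addCallback(lambda ignored: push_tail_segment())
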
1306+    def _publish_sdmf(self):
1307         self._status.timings["setup"] = time.time() - self._started
1308hunk ./src/allmydata/mutable/publish.py 322
1309+        self.salt = os.urandom(16)
1310+
1311         d = self._encrypt_and_encode()
1312         d.addCallback(self._generate_shares)
1313         def _start_pushing(res):
1314hunk ./src/allmydata/mutable/publish.py 335
1315 
1316         return self.done_deferred
1317 
1318+
1319     def setup_encoding_parameters(self):
1320hunk ./src/allmydata/mutable/publish.py 337
1321-        segment_size = len(self.newdata)
1322+        if self._version == MDMF_VERSION:
1323+            segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
1324+        else:
1325+            segment_size = len(self.newdata) # SDMF is only one segment
1326         # this must be a multiple of self.required_shares
1327         segment_size = mathutil.next_multiple(segment_size,
1328                                               self.required_shares)
1329hunk ./src/allmydata/mutable/publish.py 350
1330                                                   segment_size)
1331         else:
1332             self.num_segments = 0
1333-        assert self.num_segments in [0, 1,] # SDMF restrictions
1334+        if self._version == SDMF_VERSION:
1335+            assert self.num_segments in (0, 1) # SDMF
1336+            return
1337+        # calculate the tail segment size.
1338+        self.tail_segment_size = len(self.newdata) % segment_size
1339+
1340+        if self.tail_segment_size == 0:
1341+            # The tail segment is the same size as the other segments.
1342+            self.tail_segment_size = segment_size
1343+
1344+        # We'll make an encoder ahead-of-time for the normal-sized
1345+        # segments (defined as any segment of exactly segment_size bytes).
1346+        # (The part of the code that pushes the tail segment will make its
1347+        #  own encoder for that part.)
1348+        fec = codec.CRSEncoder()
1349+        fec.set_params(self.segment_size,
1350+                       self.required_shares, self.total_shares)
1351+        self.piece_size = fec.get_block_size()
1352+        self.fec = fec
1353+
1354+
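A worked example may make the size arithmetic above easier to follow. For a 300,000 byte MDMF file with k=3, the sketch below mirrors what next_multiple and div_ceil compute, with the helpers re-implemented locally so the snippet stands alone (the numbers are purely illustrative):

    def div_ceil(n, d):
        return (n + d - 1) // d

    def next_multiple(n, k):
        return div_ceil(n, k) * k

    datalength = 300000            # bytes of new contents
    required_shares = 3            # k
    # round the 128 KiB default up to a multiple of k
    segment_size = next_multiple(128 * 1024, required_shares)      # 131073
    num_segments = div_ceil(datalength, segment_size)              # 3
    tail_segment_size = datalength % segment_size or segment_size  # 37854
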
1355+    def push_segment(self, segnum):
1356+        started = time.time()
1357+        segsize = self.segment_size
1358+        self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
1359+        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
1360+        assert len(data) == segsize
1361+
1362+        salt = os.urandom(16)
1363+
1364+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1365+        enc = AES(key)
1366+        crypttext = enc.process(data)
1367+        assert len(crypttext) == len(data)
1368+
1369+        now = time.time()
1370+        self._status.timings["encrypt"] = now - started
1371+        started = now
1372+
1373+        # now apply FEC
1374+
1375+        self._status.set_status("Encoding")
1376+        crypttext_pieces = [None] * self.required_shares
1377+        piece_size = self.piece_size
1378+        for i in range(len(crypttext_pieces)):
1379+            offset = i * piece_size
1380+            piece = crypttext[offset:offset+piece_size]
1381+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1382+            crypttext_pieces[i] = piece
1383+            assert len(piece) == piece_size
1384+        d = self.fec.encode(crypttext_pieces)
1385+        def _done_encoding(res):
1386+            elapsed = time.time() - started
1387+            self._status.timings["encode"] = elapsed
1388+            return res
1389+        d.addCallback(_done_encoding)
1390+
1391+        def _push_shares_and_salt(results):
1392+            shares, shareids = results
1393+            dl = []
1394+            for i in xrange(len(shares)):
1395+                sharedata = shares[i]
1396+                shareid = shareids[i]
1397+                block_hash = hashutil.block_hash(salt + sharedata)
1398+                self.blockhashes[shareid].append(block_hash)
1399+
1400+                # find the writer for this share
1401+                d = self.writers[shareid].put_block(sharedata, segnum, salt)
1402+                dl.append(d)
1403+            # TODO: Naturally, we need to check on the results of these.
1404+            return defer.DeferredList(dl)
1405+        d.addCallback(_push_shares_and_salt)
1406+        return d
1407+
1408+
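Each pass through push_segment draws a fresh 16-byte salt and derives a per-segment AES key from that salt and the file's read key, so no two segments share a key stream. The real derivation is hashutil.ssk_readkey_data_hash with pycryptopp's AES; the sketch below substitutes a plain SHA-256 truncation purely to show the shape of the derivation (the hash choice is an assumption for illustration, not what Tahoe uses on the wire):

    import os, hashlib

    def derive_segment_key(readkey, salt):
        # stand-in for hashutil.ssk_readkey_data_hash(salt, readkey)
        return hashlib.sha256(salt + readkey).digest()[:16]

    readkey = os.urandom(16)
    salt_a, salt_b = os.urandom(16), os.urandom(16)
    key_a = derive_segment_key(readkey, salt_a)
    key_b = derive_segment_key(readkey, salt_b)
    assert key_a != key_b    # distinct salts give distinct segment keys
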
1409+    def push_tail_segment(self):
1410+        # This is essentially the same as push_segment, except that we
1411+        # don't use the cached encoder that we use elsewhere.
1412+        self.log("Pushing tail segment")
1413+        started = time.time()
1414+        segsize = self.segment_size
1415+        data = self.newdata[segsize * (self.num_segments-1):]
1416+        assert len(data) == self.tail_segment_size
1417+        salt = os.urandom(16)
1418+
1419+        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
1420+        enc = AES(key)
1421+        crypttext = enc.process(data)
1422+        assert len(crypttext) == len(data)
1423+
1424+        now = time.time()
1425+        self._status.timings['encrypt'] = now - started
1426+        started = now
1427+
1428+        self._status.set_status("Encoding")
1429+        tail_fec = codec.CRSEncoder()
1430+        tail_fec.set_params(self.tail_segment_size,
1431+                            self.required_shares,
1432+                            self.total_shares)
1433+
1434+        crypttext_pieces = [None] * self.required_shares
1435+        piece_size = tail_fec.get_block_size()
1436+        for i in range(len(crypttext_pieces)):
1437+            offset = i * piece_size
1438+            piece = crypttext[offset:offset+piece_size]
1439+            piece = piece + "\x00"*(piece_size - len(piece)) # padding
1440+            crypttext_pieces[i] = piece
1441+            assert len(piece) == piece_size
1442+        d = tail_fec.encode(crypttext_pieces)
1443+        def _push_shares_and_salt(results):
1444+            shares, shareids = results
1445+            dl = []
1446+            for i in xrange(len(shares)):
1447+                sharedata = shares[i]
1448+                shareid = shareids[i]
1449+                block_hash = hashutil.block_hash(salt + sharedata)
1450+                self.blockhashes[shareid].append(block_hash)
1451+                # find the writer for this share
1452+                d = self.writers[shareid].put_block(sharedata,
1453+                                                    self.num_segments - 1,
1454+                                                    salt)
1455+                dl.append(d)
1456+            # TODO: Naturally, we need to check on the results of these.
1457+            return defer.DeferredList(dl)
1458+        d.addCallback(_push_shares_and_salt)
1459+        return d
1460+
1461+
1462+    def push_encprivkey(self):
1463+        started = time.time()
1464+        encprivkey = self._encprivkey
1465+        dl = []
1466+        def _spy_on_writer(results):
1467+            print results
1468+            return results
1469+        for shnum, writer in self.writers.iteritems():
1470+            d = writer.put_encprivkey(encprivkey)
1471+            dl.append(d)
1472+        d = defer.DeferredList(dl)
1473+        return d
1474+
1475+
1476+    def push_blockhashes(self):
1477+        started = time.time()
1478+        dl = []
1479+        def _spy_on_results(results):
1480+            print results
1481+            return results
1482+        self.sharehash_leaves = [None] * len(self.blockhashes)
1483+        for shnum, blockhashes in self.blockhashes.iteritems():
1484+            t = hashtree.HashTree(blockhashes)
1485+            self.blockhashes[shnum] = list(t)
1486+            # set the leaf for future use.
1487+            self.sharehash_leaves[shnum] = t[0]
1488+            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
1489+            dl.append(d)
1490+        d = defer.DeferredList(dl)
1491+        return d
1492+
1493+
1494+    def push_sharehashes(self):
1495+        share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
1496+        share_hash_chain = {}
1497+        ds = []
1498+        def _spy_on_results(results):
1499+            print results
1500+            return results
1501+        for shnum in xrange(len(self.sharehash_leaves)):
1502+            needed_indices = share_hash_tree.needed_hashes(shnum)
1503+            self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
1504+                                             for i in needed_indices] )
1505+            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
1506+            ds.append(d)
1507+        self.root_hash = share_hash_tree[0]
1508+        d = defer.DeferredList(ds)
1509+        return d
1510+
1511+
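push_blockhashes and push_sharehashes build a two-level Merkle structure: the root of each share's block hash tree becomes one leaf of the single share hash tree, whose root is what ultimately gets signed. Assuming a Tahoe checkout is importable, a toy sketch of that layering with the same hashtree API the patch uses (four shares, two fake blocks each):

    from allmydata import hashtree
    from allmydata.util import hashutil

    # one block hash tree per share
    block_tree_roots = []
    for sharedata in ["share0", "share1", "share2", "share3"]:
        leaves = [hashutil.block_hash(sharedata + str(i)) for i in (0, 1)]
        t = hashtree.HashTree(leaves)
        block_tree_roots.append(t[0])          # t[0] is the root

    # the share hash tree is built over those roots
    share_hash_tree = hashtree.HashTree(block_tree_roots)
    root_hash = share_hash_tree[0]             # signed via the prefix
    needed = share_hash_tree.needed_hashes(2)  # nodes share #2 must carry
    share_hash_chain = dict((i, share_hash_tree[i]) for i in needed)
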
1512+    def push_toplevel_hashes_and_signature(self):
1513+        # We need to do three things here:
1514+        #   - Push the root hash and salt hash
1515+        #   - Get the checkstring of the resulting layout; sign that.
1516+        #   - Push the signature
1517+        ds = []
1518+        def _spy_on_results(results):
1519+            print results
1520+            return results
1521+        for shnum in xrange(self.total_shares):
1522+            d = self.writers[shnum].put_root_hash(self.root_hash)
1523+            ds.append(d)
1524+        d = defer.DeferredList(ds)
1525+        def _make_and_place_signature(ignored):
1526+            signable = self.writers[0].get_signable()
1527+            self.signature = self._privkey.sign(signable)
1528+
1529+            ds = []
1530+            for (shnum, writer) in self.writers.iteritems():
1531+                d = writer.put_signature(self.signature)
1532+                ds.append(d)
1533+            return defer.DeferredList(ds)
1534+        d.addCallback(_make_and_place_signature)
1535+        return d
1536+
1537+
1538+    def finish_publishing(self):
1539+        # We're almost done -- we just need to put the verification key
1540+        # and the offsets
1541+        ds = []
1542+        verification_key = self._pubkey.serialize()
1543+
1544+        def _spy_on_results(results):
1545+            print results
1546+            return results
1547+        for (shnum, writer) in self.writers.iteritems():
1548+            d = writer.put_verification_key(verification_key)
1549+            d.addCallback(lambda ignored, writer=writer:
1550+                writer.finish_publishing())
1551+            ds.append(d)
1552+        return defer.DeferredList(ds)
1553+
1554+
1555+    def _turn_barrier(self, res):
1556+        # putting this method in a Deferred chain imposes a guaranteed
1557+        # reactor turn between the pre- and post- portions of that chain.
1558+        # This can be useful to limit memory consumption: since Deferreds do
1559+        # not do tail recursion, code which uses defer.succeed(result) for
1560+        # consistency will cause objects to live for longer than you might
1561+        # normally expect.
1562+        return fireEventually(res)
1563+
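The turn barrier matters because a chain of already-fired Deferreds runs synchronously, keeping every intermediate result alive until the whole chain finishes. fireEventually re-fires its argument on a later reactor turn, which breaks the chain up and lets earlier results be collected. A minimal sketch of dropping a barrier between steps, assuming foolscap is installed as the code above does:

    from twisted.internet import defer, reactor
    from foolscap.api import fireEventually

    def _turn_barrier(res):
        # hand res back on a later reactor turn instead of synchronously
        return fireEventually(res)

    def process(i):
        return defer.succeed(i)

    d = defer.succeed(None)
    for i in range(3):
        d.addCallback(lambda ignored, i=i: process(i))
        d.addCallback(_turn_barrier)
    d.addCallback(lambda last: reactor.stop())
    reactor.run()
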
1564 
1565     def _fatal_error(self, f):
1566         self.log("error during loop", failure=f, level=log.UNUSUAL)
1567hunk ./src/allmydata/mutable/publish.py 716
1568             self.log_goal(self.goal, "after update: ")
1569 
1570 
1571-
1572     def _encrypt_and_encode(self):
1573         # this returns a Deferred that fires with a list of (sharedata,
1574         # sharenum) tuples. TODO: cache the ciphertext, only produce the
1575hunk ./src/allmydata/mutable/publish.py 757
1576         d.addCallback(_done_encoding)
1577         return d
1578 
1579+
1580     def _generate_shares(self, shares_and_shareids):
1581         # this sets self.shares and self.root_hash
1582         self.log("_generate_shares")
1583hunk ./src/allmydata/mutable/publish.py 1145
1584             self._status.set_progress(1.0)
1585         eventually(self.done_deferred.callback, res)
1586 
1587-
1588hunk ./src/allmydata/test/test_mutable.py 248
1589         d.addCallback(_created)
1590         return d
1591 
1592+
1593+    def test_create_mdmf(self):
1594+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
1595+        def _created(n):
1596+            self.failUnless(isinstance(n, MutableFileNode))
1597+            self.failUnlessEqual(n.get_storage_index(), n._storage_index)
1598+            sb = self.nodemaker.storage_broker
1599+            peer0 = sorted(sb.get_all_serverids())[0]
1600+            shnums = self._storage._peers[peer0].keys()
1601+            self.failUnlessEqual(len(shnums), 1)
1602+        d.addCallback(_created)
1603+        return d
1604+
1605+
1606     def test_serialize(self):
1607         n = MutableFileNode(None, None, {"k": 3, "n": 10}, None)
1608         calls = []
1609hunk ./src/allmydata/test/test_mutable.py 334
1610         d.addCallback(_created)
1611         return d
1612 
1613+
1614+    def test_create_mdmf_with_initial_contents(self):
1615+        initial_contents = "foobarbaz" * 131072 # 9 * 131072 bytes, about 1.1 MiB
1616+        d = self.nodemaker.create_mutable_file(initial_contents,
1617+                                               version=MDMF_VERSION)
1618+        def _created(n):
1619+            d = n.download_best_version()
1620+            d.addCallback(lambda data:
1621+                self.failUnlessEqual(data, initial_contents))
1622+            d.addCallback(lambda ignored:
1623+                n.overwrite(initial_contents + "foobarbaz"))
1624+            d.addCallback(lambda ignored:
1625+                n.download_best_version())
1626+            d.addCallback(lambda data:
1627+                self.failUnlessEqual(data, initial_contents +
1628+                                           "foobarbaz"))
1629+            return d
1630+        d.addCallback(_created)
1631+        return d
1632+
1633+
1634     def test_create_with_initial_contents_function(self):
1635         data = "initial contents"
1636         def _make_contents(n):
1637hunk ./src/allmydata/test/test_mutable.py 370
1638         d.addCallback(lambda data2: self.failUnlessEqual(data2, data))
1639         return d
1640 
1641+
1642+    def test_create_mdmf_with_initial_contents_function(self):
1643+        data = "initial contents" * 100000
1644+        def _make_contents(n):
1645+            self.failUnless(isinstance(n, MutableFileNode))
1646+            key = n.get_writekey()
1647+            self.failUnless(isinstance(key, str), key)
1648+            self.failUnlessEqual(len(key), 16)
1649+            return data
1650+        d = self.nodemaker.create_mutable_file(_make_contents,
1651+                                               version=MDMF_VERSION)
1652+        d.addCallback(lambda n:
1653+            n.download_best_version())
1654+        d.addCallback(lambda data2:
1655+            self.failUnlessEqual(data2, data))
1656+        return d
1657+
1658+
1659     def test_create_with_too_large_contents(self):
1660         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
1661         d = self.nodemaker.create_mutable_file(BIG)
1662}
1663[Write a segmented mutable downloader
1664Kevan Carstensen <kevan@isnotajoke.com>**20100626234314
1665 Ignore-this: d2bef531cde1b5c38f2eb28afdd4b17c
1666 
1667 The segmented mutable downloader can deal with MDMF files (files with
1668 one or more segments in MDMF format) and SDMF files (files with one
1669 segment in SDMF format). It is backwards compatible with the old
1670 file format.
1671 
1672 This patch also contains tests for the segmented mutable downloader.
1673] {
1674hunk ./src/allmydata/mutable/retrieve.py 8
1675 from twisted.internet import defer
1676 from twisted.python import failure
1677 from foolscap.api import DeadReferenceError, eventually, fireEventually
1678-from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError
1679-from allmydata.util import hashutil, idlib, log
1680+from allmydata.interfaces import IRetrieveStatus, NotEnoughSharesError, \
1681+                                 MDMF_VERSION, SDMF_VERSION
1682+from allmydata.util import hashutil, idlib, log, mathutil
1683 from allmydata import hashtree, codec
1684 from allmydata.storage.server import si_b2a
1685 from pycryptopp.cipher.aes import AES
1686hunk ./src/allmydata/mutable/retrieve.py 17
1687 from pycryptopp.publickey import rsa
1688 
1689 from allmydata.mutable.common import DictOfSets, CorruptShareError, UncoordinatedWriteError
1690-from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data
1691+from allmydata.mutable.layout import SIGNED_PREFIX, unpack_share_data, \
1692+                                     MDMFSlotReadProxy
1693 
1694 class RetrieveStatus:
1695     implements(IRetrieveStatus)
1696hunk ./src/allmydata/mutable/retrieve.py 104
1697         self.verinfo = verinfo
1698         # during repair, we may be called upon to grab the private key, since
1699         # it wasn't picked up during a verify=False checker run, and we'll
1700-        # need it for repair to generate the a new version.
1701+        # need it for repair to generate a new version.
1702         self._need_privkey = fetch_privkey
1703         if self._node.get_privkey():
1704             self._need_privkey = False
1705hunk ./src/allmydata/mutable/retrieve.py 109
1706 
1707+        if self._need_privkey:
1708+            # TODO: Evaluate the need for this. We'll use it if we want
1709+            # to limit how many queries are on the wire for the privkey
1710+            # at once.
1711+            self._privkey_query_markers = [] # one Marker for each time we've
1712+                                             # tried to get the privkey.
1713+
1714         self._status = RetrieveStatus()
1715         self._status.set_storage_index(self._storage_index)
1716         self._status.set_helper(False)
1717hunk ./src/allmydata/mutable/retrieve.py 125
1718          offsets_tuple) = self.verinfo
1719         self._status.set_size(datalength)
1720         self._status.set_encoding(k, N)
1721+        self.readers = {}
1722 
1723     def get_status(self):
1724         return self._status
1725hunk ./src/allmydata/mutable/retrieve.py 149
1726         self.remaining_sharemap = DictOfSets()
1727         for (shnum, peerid, timestamp) in shares:
1728             self.remaining_sharemap.add(shnum, peerid)
1729+            # If the servermap update fetched anything, it fetched at least 1
1730+            # KiB, so we ask for that much.
1731+            # TODO: Change the cache methods to allow us to fetch all of the
1732+            # data that they have, then change this method to do that.
1733+            any_cache, timestamp = self._node._read_from_cache(self.verinfo,
1734+                                                               shnum,
1735+                                                               0,
1736+                                                               1000)
1737+            ss = self.servermap.connections[peerid]
1738+            reader = MDMFSlotReadProxy(ss,
1739+                                       self._storage_index,
1740+                                       shnum,
1741+                                       any_cache)
1742+            reader.peerid = peerid
1743+            self.readers[shnum] = reader
1744+
1745 
1746         self.shares = {} # maps shnum to validated blocks
1747hunk ./src/allmydata/mutable/retrieve.py 167
1748+        self._active_readers = [] # list of active readers for this dl.
1749+        self._validated_readers = set() # set of readers that we have
1750+                                        # validated the prefix of
1751+        self._block_hash_trees = {} # shnum => hashtree
1752+        # TODO: Make this into a file-backed consumer or something to
1753+        # conserve memory.
1754+        self._plaintext = ""
1755 
1756         # how many shares do we need?
1757hunk ./src/allmydata/mutable/retrieve.py 176
1758-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1759+        (seqnum,
1760+         root_hash,
1761+         IV,
1762+         segsize,
1763+         datalength,
1764+         k,
1765+         N,
1766+         prefix,
1767          offsets_tuple) = self.verinfo
1768hunk ./src/allmydata/mutable/retrieve.py 185
1769-        assert len(self.remaining_sharemap) >= k
1770-        # we start with the lowest shnums we have available, since FEC is
1771-        # faster if we're using "primary shares"
1772-        self.active_shnums = set(sorted(self.remaining_sharemap.keys())[:k])
1773-        for shnum in self.active_shnums:
1774-            # we use an arbitrary peer who has the share. If shares are
1775-            # doubled up (more than one share per peer), we could make this
1776-            # run faster by spreading the load among multiple peers. But the
1777-            # algorithm to do that is more complicated than I want to write
1778-            # right now, and a well-provisioned grid shouldn't have multiple
1779-            # shares per peer.
1780-            peerid = list(self.remaining_sharemap[shnum])[0]
1781-            self.get_data(shnum, peerid)
1782 
1783hunk ./src/allmydata/mutable/retrieve.py 186
1784-        # control flow beyond this point: state machine. Receiving responses
1785-        # from queries is the input. We might send out more queries, or we
1786-        # might produce a result.
1787 
1788hunk ./src/allmydata/mutable/retrieve.py 187
1789+        # We need one share hash tree for the entire file; its leaves
1790+        # are the roots of the block hash trees for the shares that
1791+        # comprise it, and its root is in the verinfo.
1792+        self.share_hash_tree = hashtree.IncompleteHashTree(N)
1793+        self.share_hash_tree.set_hashes({0: root_hash})
1794+
1795+        # This will set up both the segment decoder and the tail segment
1796+        # decoder, as well as a variety of other instance variables that
1797+        # the download process will use.
1798+        self._setup_encoding_parameters()
1799+        assert len(self.remaining_sharemap) >= k
1800+
1801+        self.log("starting download")
1802+        self._add_active_peers()
1803+        # The download process beyond this is a state machine.
1804+        # _add_active_peers will select the peers that we want to use
1805+        # for the download, and then attempt to start downloading. After
1806+        # each segment, it will check for doneness, reacting to broken
1807+        # peers and corrupt shares as necessary. If it runs out of good
1808+        # peers before downloading all of the segments, _done_deferred
1809+        # will errback.  Otherwise, it will eventually callback with the
1810+        # contents of the mutable file.
1811         return self._done_deferred
1812 
1813hunk ./src/allmydata/mutable/retrieve.py 211
1814-    def get_data(self, shnum, peerid):
1815-        self.log(format="sending sh#%(shnum)d request to [%(peerid)s]",
1816-                 shnum=shnum,
1817-                 peerid=idlib.shortnodeid_b2a(peerid),
1818-                 level=log.NOISY)
1819-        ss = self.servermap.connections[peerid]
1820-        started = time.time()
1821-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
1822+
1823+    def _setup_encoding_parameters(self):
1824+        """
1825+        I set up the encoding parameters, including k, n, the number
1826+        of segments associated with this file, and the segment decoder.
1827+        """
1828+        (seqnum,
1829+         root_hash,
1830+         IV,
1831+         segsize,
1832+         datalength,
1833+         k,
1834+         n,
1835+         known_prefix,
1836          offsets_tuple) = self.verinfo
1837hunk ./src/allmydata/mutable/retrieve.py 226
1838-        offsets = dict(offsets_tuple)
1839+        self._required_shares = k
1840+        self._total_shares = n
1841+        self._segment_size = segsize
1842+        self._data_length = datalength
1843+
1844+        if not IV:
1845+            self._version = MDMF_VERSION
1846+        else:
1847+            self._version = SDMF_VERSION
1848+
1849+        if datalength and segsize:
1850+            self._num_segments = mathutil.div_ceil(datalength, segsize)
1851+            self._tail_data_size = datalength % segsize
1852+        else:
1853+            self._num_segments = 0
1854+            self._tail_data_size = 0
1855 
1856hunk ./src/allmydata/mutable/retrieve.py 243
1857-        # we read the checkstring, to make sure that the data we grab is from
1858-        # the right version.
1859-        readv = [ (0, struct.calcsize(SIGNED_PREFIX)) ]
1860+        self._segment_decoder = codec.CRSDecoder()
1861+        self._segment_decoder.set_params(segsize, k, n)
1862+        self._current_segment = 0
1863 
1864hunk ./src/allmydata/mutable/retrieve.py 247
1865-        # We also read the data, and the hashes necessary to validate them
1866-        # (share_hash_chain, block_hash_tree, share_data). We don't read the
1867-        # signature or the pubkey, since that was handled during the
1868-        # servermap phase, and we'll be comparing the share hash chain
1869-        # against the roothash that was validated back then.
1870+        if not self._tail_data_size:
1871+            self._tail_data_size = segsize
1872 
1873hunk ./src/allmydata/mutable/retrieve.py 250
1874-        readv.append( (offsets['share_hash_chain'],
1875-                       offsets['enc_privkey'] - offsets['share_hash_chain'] ) )
1876+        self._tail_segment_size = mathutil.next_multiple(self._tail_data_size,
1877+                                                         self._required_shares)
1878+        if self._tail_segment_size == self._segment_size:
1879+            self._tail_decoder = self._segment_decoder
1880+        else:
1881+            self._tail_decoder = codec.CRSDecoder()
1882+            self._tail_decoder.set_params(self._tail_segment_size,
1883+                                          self._required_shares,
1884+                                          self._total_shares)
1885 
1886hunk ./src/allmydata/mutable/retrieve.py 260
1887-        # if we need the private key (for repair), we also fetch that
1888-        if self._need_privkey:
1889-            readv.append( (offsets['enc_privkey'],
1890-                           offsets['EOF'] - offsets['enc_privkey']) )
1891+        self.log("got encoding parameters: "
1892+                 "k: %d "
1893+                 "n: %d "
1894+                 "%d segments of %d bytes each (%d byte tail segment)" % \
1895+                 (k, n, self._num_segments, self._segment_size,
1896+                  self._tail_segment_size))
1897 
1898hunk ./src/allmydata/mutable/retrieve.py 267
1899-        m = Marker()
1900-        self._outstanding_queries[m] = (peerid, shnum, started)
1901+        for i in xrange(self._total_shares):
1902+            # So we don't have to do this later.
1903+            self._block_hash_trees[i] = hashtree.IncompleteHashTree(self._num_segments)
1904 
1905hunk ./src/allmydata/mutable/retrieve.py 271
1906-        # ask the cache first
1907-        got_from_cache = False
1908-        datavs = []
1909-        for (offset, length) in readv:
1910-            (data, timestamp) = self._node._read_from_cache(self.verinfo, shnum,
1911-                                                            offset, length)
1912-            if data is not None:
1913-                datavs.append(data)
1914-        if len(datavs) == len(readv):
1915-            self.log("got data from cache")
1916-            got_from_cache = True
1917-            d = fireEventually({shnum: datavs})
1918-            # datavs is a dict mapping shnum to a pair of strings
1919-        else:
1920-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
1921-        self.remaining_sharemap.discard(shnum, peerid)
1922+        # If we have more than one segment, we are an MDMF file, which
1923+        # means that we need to validate the salts as we receive them.
1924+        self._salt_hash_tree = hashtree.IncompleteHashTree(self._num_segments)
1925+        self._salt_hash_tree[0] = IV # from the prefix.
1926 
1927hunk ./src/allmydata/mutable/retrieve.py 276
1928-        d.addCallback(self._got_results, m, peerid, started, got_from_cache)
1929-        d.addErrback(self._query_failed, m, peerid)
1930-        # errors that aren't handled by _query_failed (and errors caused by
1931-        # _query_failed) get logged, but we still want to check for doneness.
1932-        def _oops(f):
1933-            self.log(format="problem in _query_failed for sh#%(shnum)d to %(peerid)s",
1934-                     shnum=shnum,
1935-                     peerid=idlib.shortnodeid_b2a(peerid),
1936-                     failure=f,
1937-                     level=log.WEIRD, umid="W0xnQA")
1938-        d.addErrback(_oops)
1939-        d.addBoth(self._check_for_done)
1940-        # any error during _check_for_done means the download fails. If the
1941-        # download is successful, _check_for_done will fire _done by itself.
1942-        d.addErrback(self._done)
1943-        d.addErrback(log.err)
1944-        return d # purely for testing convenience
1945 
1946hunk ./src/allmydata/mutable/retrieve.py 277
1947-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
1948-        # isolate the callRemote to a separate method, so tests can subclass
1949-        # Publish and override it
1950-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
1951-        return d
1952+    def _add_active_peers(self):
1953+        """
1954+        I populate self._active_readers with enough active readers to
1955+        retrieve the contents of this mutable file. I am called before
1956+        downloading starts, and (eventually) after each validation
1957+        error, connection error, or other problem in the download.
1958+        """
1959+        # TODO: It would be cool to investigate other heuristics for
1960+        # reader selection. For instance, the cost (in time the user
1961+        # spends waiting for their file) of selecting a really slow peer
1962+        # that happens to have a primary share is probably higher than
1963+        # the cost of selecting a really fast peer that doesn't have a primary
1964+        # share. Maybe the servermap could be extended to provide this
1965+        # information; it could keep track of latency information while
1966+        # it gathers more important data, and then this routine could
1967+        # use that to select active readers.
1968+        #
1969+        # (these and other questions would be easier to answer with a
1970+        #  robust, configurable tahoe-lafs simulator, which modeled node
1971+        #  failures, differences in node speed, and other characteristics
1972+        #  that we expect storage servers to have.  You could have
1973+        #  presets for really stable grids (like allmydata.com),
1974+        #  friendnets, make it easy to configure your own settings, and
1975+        #  then simulate the effect of big changes on these use cases
1976+        #  instead of just reasoning about what the effect might be. Out
1977+        #  of scope for MDMF, though.)
1978 
1979hunk ./src/allmydata/mutable/retrieve.py 304
1980-    def remove_peer(self, peerid):
1981-        for shnum in list(self.remaining_sharemap.keys()):
1982-            self.remaining_sharemap.discard(shnum, peerid)
1983+        # We need at least self._required_shares readers to download a
1984+        # segment.
1985+        needed = self._required_shares - len(self._active_readers)
1986+        # XXX: Why don't format= log messages work here?
1987+        self.log("adding %d peers to the active peers list" % needed)
1988 
1989hunk ./src/allmydata/mutable/retrieve.py 310
1990-    def _got_results(self, datavs, marker, peerid, started, got_from_cache):
1991-        now = time.time()
1992-        elapsed = now - started
1993-        if not got_from_cache:
1994-            self._status.add_fetch_timing(peerid, elapsed)
1995-        self.log(format="got results (%(shares)d shares) from [%(peerid)s]",
1996-                 shares=len(datavs),
1997-                 peerid=idlib.shortnodeid_b2a(peerid),
1998-                 level=log.NOISY)
1999-        self._outstanding_queries.pop(marker, None)
2000-        if not self._running:
2001-            return
2002+        # We favor lower numbered shares, since FEC is faster with
2003+        # primary shares than with other shares, and lower-numbered
2004+        # shares are more likely to be primary than higher numbered
2005+        # shares.
2006+        active_shnums = set(sorted(self.remaining_sharemap.keys()))
2007+        # We shouldn't consider adding shares that we already have; this
2008+        # will cause problems later.
2009+        active_shnums -= set([reader.shnum for reader in self._active_readers])
2010+        active_shnums = sorted(active_shnums)[:needed]
2011+        if len(active_shnums) < needed:
2012+            # We don't have enough readers to retrieve the file; fail.
2013+            return self._failed()
2014 
2015hunk ./src/allmydata/mutable/retrieve.py 323
2016-        # note that we only ask for a single share per query, so we only
2017-        # expect a single share back. On the other hand, we use the extra
2018-        # shares if we get them.. seems better than an assert().
2019+        for shnum in active_shnums:
2020+            self._active_readers.append(self.readers[shnum])
2021+            self.log("added reader for share %d" % shnum)
2022+        assert len(self._active_readers) == self._required_shares
2023+        # Conceptually, this is part of the _add_active_peers step. It
2024+        # validates the prefixes of newly added readers to make sure
2025+        # that they match what we are expecting for self.verinfo. If
2026+        # validation is successful, _validate_active_prefixes will call
2027+        # _download_current_segment for us. If validation is
2028+        # unsuccessful, then _validate_prefixes will remove the peer and
2029+        # unsuccessful, then _validate_active_prefixes will remove the peer and
2030+        # the problem by choosing another peer.
2031+        return self._validate_active_prefixes()
2032 
2033hunk ./src/allmydata/mutable/retrieve.py 337
2034-        for shnum,datav in datavs.items():
2035-            (prefix, hash_and_data) = datav[:2]
2036-            try:
2037-                self._got_results_one_share(shnum, peerid,
2038-                                            prefix, hash_and_data)
2039-            except CorruptShareError, e:
2040-                # log it and give the other shares a chance to be processed
2041-                f = failure.Failure()
2042-                self.log(format="bad share: %(f_value)s",
2043-                         f_value=str(f.value), failure=f,
2044-                         level=log.WEIRD, umid="7fzWZw")
2045-                self.notify_server_corruption(peerid, shnum, str(e))
2046-                self.remove_peer(peerid)
2047-                self.servermap.mark_bad_share(peerid, shnum, prefix)
2048-                self._bad_shares.add( (peerid, shnum) )
2049-                self._status.problems[peerid] = f
2050-                self._last_failure = f
2051-                pass
2052-            if self._need_privkey and len(datav) > 2:
2053-                lp = None
2054-                self._try_to_validate_privkey(datav[2], peerid, shnum, lp)
2055-        # all done!
2056 
2057hunk ./src/allmydata/mutable/retrieve.py 338
2058-    def notify_server_corruption(self, peerid, shnum, reason):
2059-        ss = self.servermap.connections[peerid]
2060-        ss.callRemoteOnly("advise_corrupt_share",
2061-                          "mutable", self._storage_index, shnum, reason)
2062+    def _validate_active_prefixes(self):
2063+        """
2064+        I check to make sure that the prefixes on the peers that I am
2065+        currently reading from match the prefix that we want to see, as
2066+        said in self.verinfo.
2067 
2068hunk ./src/allmydata/mutable/retrieve.py 344
2069-    def _got_results_one_share(self, shnum, peerid,
2070-                               got_prefix, got_hash_and_data):
2071-        self.log("_got_results: got shnum #%d from peerid %s"
2072-                 % (shnum, idlib.shortnodeid_b2a(peerid)))
2073-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2074+        If I find that all of the active peers have acceptable prefixes,
2075+        I pass control to _download_current_segment, which will use
2076+        those peers to do cool things. If I find that some of the active
2077+        peers have unacceptable prefixes, I will remove them from active
2078+        peers (and from further consideration) and call
2079+        _add_active_peers to attempt to rectify the situation. I keep
2080+        track of which peers I have already validated so that I don't
2081+        need to do so again.
2082+        """
2083+        assert self._active_readers, "No more active readers"
2084+
2085+        ds = []
2086+        new_readers = list(set(self._active_readers) - self._validated_readers)
2087+        self.log('validating %d newly-added active readers' % len(new_readers))
2088+
2089+        for reader in new_readers:
2090+            # We force a remote read here -- otherwise, we are relying
2091+            # on cached data that we already verified as valid, and we
2092+            # won't detect an uncoordinated write that has occurred
2093+            # since the last servermap update.
2094+            d = reader.get_prefix(force_remote=True)
2095+            d.addCallback(self._try_to_validate_prefix, reader)
2096+            ds.append(d)
2097+        dl = defer.DeferredList(ds, consumeErrors=True)
2098+        def _check_results(results):
2099+            # Each result in results will be of the form (success, msg).
2100+            # We don't care about msg, but success will tell us whether
2101+            # or not the checkstring validated. If it didn't, we need to
2102+            # remove the offending (peer,share) from our active readers,
2103+            # and ensure that active readers is again populated.
2104+            bad_readers = []
2105+            for i, result in enumerate(results):
2106+                if not result[0]:
2107+                    reader = new_readers[i]
2108+                    f = result[1]
2109+                    assert isinstance(f, failure.Failure)
2110+
2111+                    self.log("The reader %s failed to "
2112+                             "properly validate: %s" % \
2113+                             (reader, str(f.value)))
2114+                    bad_readers.append((reader, f))
2115+                else:
2116+                    reader = new_readers[i]
2117+                    self.log("the reader %s checks out, so we'll use it" % \
2118+                             reader)
2119+                    self._validated_readers.add(reader)
2120+                    # Each time we validate a reader, we check to see if
2121+                    # we need the private key. If we do, we politely ask
2122+                    # for it and then continue computing. If we find
2123+                    # that we haven't gotten it at the end of
2124+                    # segment decoding, then we'll take more drastic
2125+                    # measures.
2126+                    if self._need_privkey:
2127+                        d = reader.get_encprivkey()
2128+                        d.addCallback(self._try_to_validate_privkey, reader)
2129+            if bad_readers:
2130+                # We do them all at once, or else we screw up list indexing.
2131+                for (reader, f) in bad_readers:
2132+                    self._mark_bad_share(reader, f)
2133+                return self._add_active_peers()
2134+            else:
2135+                return self._download_current_segment()
2136+            # The next step will assert that it has enough active
2137+            # readers to fetch shares; we just need to remove it.
2138+        dl.addCallback(_check_results)
2139+        return dl
2140+
2141+
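_check_results above depends on the shape DeferredList produces: with consumeErrors=True, each entry is a (success, value) pair where value is the callback result on success and a Failure otherwise, and no error escapes the list to the log. A tiny self-contained demonstration of that contract:

    from twisted.internet import defer
    from twisted.python import failure

    good = defer.succeed("prefix ok")
    bad = defer.fail(Exception("mismatched prefix"))

    dl = defer.DeferredList([good, bad], consumeErrors=True)

    def _check(results):
        for i, (success, value) in enumerate(results):
            if success:
                print "reader %d validated: %r" % (i, value)
            else:
                assert isinstance(value, failure.Failure)
                print "reader %d rejected: %s" % (i, value.value)

    dl.addCallback(_check)
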
2142+    def _try_to_validate_prefix(self, prefix, reader):
2143+        """
2144+        I check that the prefix returned by a candidate server for
2145+        retrieval matches the prefix that the servermap knows about
2146+        (and, hence, the prefix that was validated earlier). If it does,
2147+        I return True, which means that I approve of the use of the
2148+        candidate server for segment retrieval. If it doesn't, I return
2149+        False, which means that another server must be chosen.
2150+        """
2151+        (seqnum,
2152+         root_hash,
2153+         IV,
2154+         segsize,
2155+         datalength,
2156+         k,
2157+         N,
2158+         known_prefix,
2159          offsets_tuple) = self.verinfo
2160hunk ./src/allmydata/mutable/retrieve.py 430
2161-        assert len(got_prefix) == len(prefix), (len(got_prefix), len(prefix))
2162-        if got_prefix != prefix:
2163-            msg = "someone wrote to the data since we read the servermap: prefix changed"
2164-            raise UncoordinatedWriteError(msg)
2165-        (share_hash_chain, block_hash_tree,
2166-         share_data) = unpack_share_data(self.verinfo, got_hash_and_data)
2167+        if known_prefix != prefix:
2168+            self.log("prefix from share %d doesn't match" % reader.shnum)
2169+            raise UncoordinatedWriteError("Mismatched prefix -- this could "
2170+                                          "indicate an uncoordinated write")
2171+        # Otherwise, we're okay -- no issues.
2172 
2173hunk ./src/allmydata/mutable/retrieve.py 436
2174-        assert isinstance(share_data, str)
2175-        # build the block hash tree. SDMF has only one leaf.
2176-        leaves = [hashutil.block_hash(share_data)]
2177-        t = hashtree.HashTree(leaves)
2178-        if list(t) != block_hash_tree:
2179-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
2180-        share_hash_leaf = t[0]
2181-        t2 = hashtree.IncompleteHashTree(N)
2182-        # root_hash was checked by the signature
2183-        t2.set_hashes({0: root_hash})
2184-        try:
2185-            t2.set_hashes(hashes=share_hash_chain,
2186-                          leaves={shnum: share_hash_leaf})
2187-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
2188-                IndexError), e:
2189-            msg = "corrupt hashes: %s" % (e,)
2190-            raise CorruptShareError(peerid, shnum, msg)
2191-        self.log(" data valid! len=%d" % len(share_data))
2192-        # each query comes down to this: placing validated share data into
2193-        # self.shares
2194-        self.shares[shnum] = share_data
2195 
2196hunk ./src/allmydata/mutable/retrieve.py 437
2197-    def _try_to_validate_privkey(self, enc_privkey, peerid, shnum, lp):
2198+    def _remove_reader(self, reader):
2199+        """
2200+        At various points, we will wish to remove a peer from
2201+        consideration and/or use. The reasons for doing so include, but
2202+        are not necessarily limited to:
2203 
2204hunk ./src/allmydata/mutable/retrieve.py 443
2205-        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2206-        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2207-        if alleged_writekey != self._node.get_writekey():
2208-            self.log("invalid privkey from %s shnum %d" %
2209-                     (idlib.nodeid_b2a(peerid)[:8], shnum),
2210-                     parent=lp, level=log.WEIRD, umid="YIw4tA")
2211-            return
2212+            - A connection error.
2213+            - A mismatched prefix (that is, a prefix that does not match
2214+              our conception of the version information string).
2215+            - A failing block hash, salt hash, or share hash, which can
2216+              indicate disk failure/bit flips, or network trouble.
2217 
2218hunk ./src/allmydata/mutable/retrieve.py 449
2219-        # it's good
2220-        self.log("got valid privkey from shnum %d on peerid %s" %
2221-                 (shnum, idlib.shortnodeid_b2a(peerid)),
2222-                 parent=lp)
2223-        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2224-        self._node._populate_encprivkey(enc_privkey)
2225-        self._node._populate_privkey(privkey)
2226-        self._need_privkey = False
2227+        This method will do that. I will make sure that the
2228+        (shnum,reader) combination represented by my reader argument is
2229+        not used for anything else during this download. I will not
2230+        advise the reader of any corruption, something that my callers
2231+        may wish to do on their own.
2232+        """
2233+        # TODO: When you're done writing this, see if this is ever
2234+        # actually used for something that _mark_bad_share isn't. I have
2235+        # a feeling that they will be used for very similar things, and
2236+        # that having them both here is just going to be an epic amount
2237+        # of code duplication.
2238+        #
2239+        # (well, okay, not epic, but meaningful)
2240+        self.log("removing reader %s" % reader)
2241+        # Remove the reader from _active_readers
2242+        self._active_readers.remove(reader)
2243+        # TODO: self.readers.remove(reader)?
2244+        for shnum in list(self.remaining_sharemap.keys()):
2245+            self.remaining_sharemap.discard(shnum, reader.peerid)
2246 
2247hunk ./src/allmydata/mutable/retrieve.py 469
2248-    def _query_failed(self, f, marker, peerid):
2249-        self.log(format="query to [%(peerid)s] failed",
2250-                 peerid=idlib.shortnodeid_b2a(peerid),
2251-                 level=log.NOISY)
2252-        self._status.problems[peerid] = f
2253-        self._outstanding_queries.pop(marker, None)
2254-        if not self._running:
2255-            return
2256-        self._last_failure = f
2257-        self.remove_peer(peerid)
2258-        level = log.WEIRD
2259-        if f.check(DeadReferenceError):
2260-            level = log.UNUSUAL
2261-        self.log(format="error during query: %(f_value)s",
2262-                 f_value=str(f.value), failure=f, level=level, umid="gOJB5g")
2263 
2264hunk ./src/allmydata/mutable/retrieve.py 470
2265-    def _check_for_done(self, res):
2266-        # exit paths:
2267-        #  return : keep waiting, no new queries
2268-        #  return self._send_more_queries(outstanding) : send some more queries
2269-        #  fire self._done(plaintext) : download successful
2270-        #  raise exception : download fails
2271+    def _mark_bad_share(self, reader, f):
2272+        """
2273+        I mark the (peerid, shnum) encapsulated by my reader argument as
2274+        a bad share, which means that it will not be used anywhere else.
2275 
2276hunk ./src/allmydata/mutable/retrieve.py 475
2277-        self.log(format="_check_for_done: running=%(running)s, decoding=%(decoding)s",
2278-                 running=self._running, decoding=self._decoding,
2279-                 level=log.NOISY)
2280-        if not self._running:
2281-            return
2282-        if self._decoding:
2283-            return
2284-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2285-         offsets_tuple) = self.verinfo
2286+        There are several reasons to want to mark something as a bad
2287+        share. These include:
2288 
2289hunk ./src/allmydata/mutable/retrieve.py 478
2290-        if len(self.shares) < k:
2291-            # we don't have enough shares yet
2292-            return self._maybe_send_more_queries(k)
2293-        if self._need_privkey:
2294-            # we got k shares, but none of them had a valid privkey. TODO:
2295-            # look further. Adding code to do this is a bit complicated, and
2296-            # I want to avoid that complication, and this should be pretty
2297-            # rare (k shares with bitflips in the enc_privkey but not in the
2298-            # data blocks). If we actually do get here, the subsequent repair
2299-            # will fail for lack of a privkey.
2300-            self.log("got k shares but still need_privkey, bummer",
2301-                     level=log.WEIRD, umid="MdRHPA")
2302+            - A connection error to the peer.
2303+            - A mismatched prefix (that is, a prefix that does not match
2304+              our local conception of the version information string).
2305+            - A failing block hash, salt hash, share hash, or other
2306+              integrity check.
2307 
2308hunk ./src/allmydata/mutable/retrieve.py 484
2309-        # we have enough to finish. All the shares have had their hashes
2310-        # checked, so if something fails at this point, we don't know how
2311-        # to fix it, so the download will fail.
2312+        This method will ensure that readers that we wish to mark bad
2313+        (for these reasons or other reasons) are not used for the rest
2314+        of the download. Additionally, it will attempt to tell the
2315+        remote peer (with no guarantee of success) that its share is
2316+        corrupt.
2317+        """
2318+        self.log("marking share %d on server %s as bad" % \
2319+                 (reader.shnum, reader))
2320+        self._remove_reader(reader)
2321+        self._bad_shares.add((reader.peerid, reader.shnum))
2322+        self._status.problems[reader.peerid] = f
2323+        self._last_failure = f
2324+        self.notify_server_corruption(reader.peerid, reader.shnum,
2325+                                      str(f.value))
2326 
2327hunk ./src/allmydata/mutable/retrieve.py 499
2328-        self._decoding = True # avoid reentrancy
2329-        self._status.set_status("decoding")
2330-        now = time.time()
2331-        elapsed = now - self._started
2332-        self._status.timings["fetch"] = elapsed
2333 
2334hunk ./src/allmydata/mutable/retrieve.py 500
2335-        d = defer.maybeDeferred(self._decode)
2336-        d.addCallback(self._decrypt, IV, self._node.get_readkey())
2337-        d.addBoth(self._done)
2338-        return d # purely for test convenience
2339+    def _download_current_segment(self):
2340+        """
2341+        I download, validate, decode, decrypt, and assemble the segment
2342+        that this Retrieve is currently responsible for downloading.
2343+        """
2344+        assert len(self._active_readers) >= self._required_shares
2345+        if self._current_segment < self._num_segments:
2346+            d = self._process_segment(self._current_segment)
2347+        else:
2348+            d = defer.succeed(None)
2349+        d.addCallback(self._check_for_done)
2350+        return d
2351 
2352hunk ./src/allmydata/mutable/retrieve.py 513
2353-    def _maybe_send_more_queries(self, k):
2354-        # we don't have enough shares yet. Should we send out more queries?
2355-        # There are some number of queries outstanding, each for a single
2356-        # share. If we can generate 'needed_shares' additional queries, we do
2357-        # so. If we can't, then we know this file is a goner, and we raise
2358-        # NotEnoughSharesError.
2359-        self.log(format=("_maybe_send_more_queries, have=%(have)d, k=%(k)d, "
2360-                         "outstanding=%(outstanding)d"),
2361-                 have=len(self.shares), k=k,
2362-                 outstanding=len(self._outstanding_queries),
2363-                 level=log.NOISY)
2364 
2365hunk ./src/allmydata/mutable/retrieve.py 514
2366-        remaining_shares = k - len(self.shares)
2367-        needed = remaining_shares - len(self._outstanding_queries)
2368-        if not needed:
2369-            # we have enough queries in flight already
2370+    def _process_segment(self, segnum):
2371+        """
2372+        I download, validate, decode, and decrypt one segment of the
2373+        file that this Retrieve is retrieving. This means coordinating
2374+        the process of getting k blocks of that file, validating them,
2375+        assembling them into one segment with the decoder, and then
2376+        decrypting them.
2377+        """
2378+        self.log("processing segment %d" % segnum)
2379 
2380hunk ./src/allmydata/mutable/retrieve.py 524
2381-            # TODO: but if they've been in flight for a long time, and we
2382-            # have reason to believe that new queries might respond faster
2383-            # (i.e. we've seen other queries come back faster, then consider
2384-            # sending out new queries. This could help with peers which have
2385-            # silently gone away since the servermap was updated, for which
2386-            # we're still waiting for the 15-minute TCP disconnect to happen.
2387-            self.log("enough queries are in flight, no more are needed",
2388-                     level=log.NOISY)
2389-            return
2390+        # TODO: The old code uses a marker. Should this code do that
2391+        # too? What did the Marker do?
2392+        assert len(self._active_readers) >= self._required_shares
2393+
2394+        # We need to ask each of our active readers for its block and
2395+        # salt. We will then validate those. If validation is
2396+        # successful, we will assemble the results into plaintext.
2397+        ds = []
2398+        for reader in self._active_readers:
2399+            d = reader.get_block_and_salt(segnum, queue=True)
2400+            d2 = self._get_needed_hashes(reader, segnum)
2401+            dl = defer.DeferredList([d, d2], consumeErrors=True)
2402+            dl.addCallback(self._validate_block, segnum, reader)
2403+            dl.addErrback(self._validation_or_decoding_failed, [reader])
2404+            ds.append(dl)
2405+            reader.flush()
2406+        dl = defer.DeferredList(ds)
2407+        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
2408+        return dl
2409 
2410hunk ./src/allmydata/mutable/retrieve.py 544
2411-        outstanding_shnums = set([shnum
2412-                                  for (peerid, shnum, started)
2413-                                  in self._outstanding_queries.values()])
2414-        # prefer low-numbered shares, they are more likely to be primary
2415-        available_shnums = sorted(self.remaining_sharemap.keys())
2416-        for shnum in available_shnums:
2417-            if shnum in outstanding_shnums:
2418-                # skip ones that are already in transit
2419-                continue
2420-            if shnum not in self.remaining_sharemap:
2421-                # no servers for that shnum. note that DictOfSets removes
2422-                # empty sets from the dict for us.
2423-                continue
2424-            peerid = list(self.remaining_sharemap[shnum])[0]
2425-            # get_data will remove that peerid from the sharemap, and add the
2426-            # query to self._outstanding_queries
2427-            self._status.set_status("Retrieving More Shares")
2428-            self.get_data(shnum, peerid)
2429-            needed -= 1
2430-            if not needed:
2431+
2432+    def _maybe_decode_and_decrypt_segment(self, blocks_and_salts, segnum):
2433+        """
2434+        I take the results of fetching and validating the blocks from a
2435+        callback chain in another method. If the results are such that
2436+        they tell me that validation and fetching succeeded without
2437+        incident, I will proceed with decoding and decryption.
2438+        Otherwise, I will do nothing.
2439+        """
2440+        self.log("trying to decode and decrypt segment %d" % segnum)
2441+        failures = False
2442+        for block_and_salt in blocks_and_salts:
2443+            if not block_and_salt[0] or block_and_salt[1] is None:
2444+                self.log("some validation operations failed; not proceeding")
2445+                failures = True
2446                 break
2447hunk ./src/allmydata/mutable/retrieve.py 560
2448+        if not failures:
2449+            self.log("everything looks ok, building segment %d" % segnum)
2450+            d = self._decode_blocks(blocks_and_salts, segnum)
2451+            d.addCallback(self._decrypt_segment)
2452+            d.addErrback(self._validation_or_decoding_failed,
2453+                         self._active_readers)
2454+            d.addCallback(self._set_segment)
2455+            return d
2456+        else:
2457+            return defer.succeed(None)
2458+
2459+
2460+    def _set_segment(self, segment):
2461+        """
2462+        Given a plaintext segment, I register that segment with the
2463+        target that is handling the file download.
2464+        """
2465+        self.log("got plaintext for segment %d" % self._current_segment)
2466+        self._plaintext += segment
2467+        self._current_segment += 1
2468 
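(Editorial aside, not part of the patch: the per-reader results that _maybe_decode_and_decrypt_segment inspects above are the (success, result) pairs produced by a DeferredList built with consumeErrors=True. A minimal Twisted sketch of that shape, using made-up values:)

    from twisted.internet import defer

    # Illustration only: each entry is (success_flag, value_or_Failure), which
    # is exactly what the loop in _maybe_decode_and_decrypt_segment examines.
    d1 = defer.succeed(("block bytes", "salt bytes"))
    d2 = defer.fail(ValueError("simulated fetch error"))
    dl = defer.DeferredList([d1, d2], consumeErrors=True)
    def _show(results):
        for (success, value) in results:
            print success, value  # True (block, salt), then False Failure(...)
    dl.addCallback(_show)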
2469hunk ./src/allmydata/mutable/retrieve.py 581
2470-        # at this point, we have as many outstanding queries as we can. If
2471-        # needed!=0 then we might not have enough to recover the file.
2472-        if needed:
2473-            format = ("ran out of peers: "
2474-                      "have %(have)d shares (k=%(k)d), "
2475-                      "%(outstanding)d queries in flight, "
2476-                      "need %(need)d more, "
2477-                      "found %(bad)d bad shares")
2478-            args = {"have": len(self.shares),
2479-                    "k": k,
2480-                    "outstanding": len(self._outstanding_queries),
2481-                    "need": needed,
2482-                    "bad": len(self._bad_shares),
2483-                    }
2484-            self.log(format=format,
2485-                     level=log.WEIRD, umid="ezTfjw", **args)
2486-            err = NotEnoughSharesError("%s, last failure: %s" %
2487-                                      (format % args, self._last_failure))
2488-            if self._bad_shares:
2489-                self.log("We found some bad shares this pass. You should "
2490-                         "update the servermap and try again to check "
2491-                         "more peers",
2492-                         level=log.WEIRD, umid="EFkOlA")
2493-                err.servermap = self.servermap
2494-            raise err
2495 
2496hunk ./src/allmydata/mutable/retrieve.py 582
2497+    def _validation_or_decoding_failed(self, f, readers):
2498+        """
2499+        I am called when a block or a salt fails to correctly validate, or when
2500+        the decryption or decoding operation fails for some reason.  I react to
2501+        this failure by notifying the remote server of corruption, and then
2502+        removing the remote peer from further activity.
2503+        """
2504+        assert isinstance(readers, list)
2505+        bad_shnums = [reader.shnum for reader in readers]
2506+
2507+        self.log("validation or decoding failed on share(s) %s, peer(s) %s "
2508+                 ", segment %d: %s" % \
2509+                 (bad_shnums, readers, self._current_segment, str(f)))
2510+        for reader in readers:
2511+            self._mark_bad_share(reader, f)
2512         return
2513 
2514hunk ./src/allmydata/mutable/retrieve.py 599
2515-    def _decode(self):
2516-        started = time.time()
2517-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2518-         offsets_tuple) = self.verinfo
2519 
2520hunk ./src/allmydata/mutable/retrieve.py 600
2521-        # shares_dict is a dict mapping shnum to share data, but the codec
2522-        # wants two lists.
2523-        shareids = []; shares = []
2524-        for shareid, share in self.shares.items():
2525+    def _validate_block(self, results, segnum, reader):
2526+        """
2527+        I validate a block from one share on a remote server.
2528+        """
2529+        # Grab the part of the block hash tree that is necessary to
2530+        # validate this block, then generate the block hash root.
2531+        self.log("validating share %d for segment %d" % (reader.shnum,
2532+                                                             segnum))
2533+        # Did we fail to fetch either of the things that we were
2534+        # supposed to? Fail if so.
2535+        if not results[0][0] or not results[1][0]:
2536+            # handled by the errback handler.
2537+
2538+            # These all get batched into one query, so the resulting
2539+            # failure should be the same for all of them, so we can just
2540+            # use the first one.
2541+            assert isinstance(results[0][1], failure.Failure)
2542+
2543+            f = results[0][1]
2544+            raise CorruptShareError(reader.peerid,
2545+                                    reader.shnum,
2546+                                    "Connection error: %s" % str(f))
2547+
2548+        block_and_salt, block_and_sharehashes = results
2549+        block, salt = block_and_salt[1]
2550+        blockhashes, sharehashes = block_and_sharehashes[1]
2551+
2552+        blockhashes = dict(enumerate(blockhashes[1]))
2553+        self.log("the reader gave me the following blockhashes: %s" % \
2554+                 blockhashes.keys())
2555+        self.log("the reader gave me the following sharehashes: %s" % \
2556+                 sharehashes[1].keys())
2557+        bht = self._block_hash_trees[reader.shnum]
2558+
2559+        if bht.needed_hashes(segnum, include_leaf=True):
2560+            try:
2561+                bht.set_hashes(blockhashes)
2562+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2563+                    IndexError), e:
2564+                raise CorruptShareError(reader.peerid,
2565+                                        reader.shnum,
2566+                                        "block hash tree failure: %s" % e)
2567+
2568+        if self._version == MDMF_VERSION:
2569+            blockhash = hashutil.block_hash(salt + block)
2570+        else:
2571+            blockhash = hashutil.block_hash(block)
2572+        # If this works without an error, then validation is
2573+        # successful.
2574+        try:
2575+            bht.set_hashes(leaves={segnum: blockhash})
2576+        except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2577+                IndexError), e:
2578+            raise CorruptShareError(reader.peerid,
2579+                                    reader.shnum,
2580+                                    "block hash tree failure: %s" % e)
2581+
2582+        # Reaching this point means that we know that this segment
2583+        # is correct. Now we need to check to see whether the share
2584+        # hash chain is also correct.
2585+        # SDMF wrote share hash chains that didn't contain the
2586+        # leaves, which would be produced from the block hash tree.
2587+        # So we need to validate the block hash tree first. If
2588+        # successful, then bht[0] will contain the root for the
2589+        # shnum, which will be a leaf in the share hash tree, which
2590+        # will allow us to validate the rest of the tree.
2591+        if self.share_hash_tree.needed_hashes(reader.shnum,
2592+                                               include_leaf=True):
2593+            try:
2594+                self.share_hash_tree.set_hashes(hashes=sharehashes[1],
2595+                                            leaves={reader.shnum: bht[0]})
2596+            except (hashtree.BadHashError, hashtree.NotEnoughHashesError, \
2597+                    IndexError), e:
2598+                raise CorruptShareError(reader.peerid,
2599+                                        reader.shnum,
2600+                                        "corrupt hashes: %s" % e)
2601+
2602+        # TODO: Validate the salt, too.
2603+        self.log('share %d is valid for segment %d' % (reader.shnum,
2604+                                                       segnum))
2605+        return {reader.shnum: (block, salt)}
2606+
2607+
2608+    def _get_needed_hashes(self, reader, segnum):
2609+        """
2610+        I get the hashes needed to validate segnum from the reader, then return
2611+        to my caller when this is done.
2612+        """
2613+        bht = self._block_hash_trees[reader.shnum]
2614+        needed = bht.needed_hashes(segnum, include_leaf=True)
2615+        # The root of the block hash tree is also a leaf in the share
2616+        # hash tree. So we don't need to fetch it from the remote
2617+        # server. For files with one segment, this means that we won't
2618+        # fetch any block hash tree from the remote server, since each
2619+        # share's block hash tree consists of a single hash (the hash of
2620+        # its one block), which is itself a leaf in the share hash tree.
2621+        # This is fine, since any share corruption will still be detected
2622+        # by the share hash tree.
2623+        #needed.discard(0)
2624+        self.log("getting blockhashes for segment %d, share %d: %s" % \
2625+                 (segnum, reader.shnum, str(needed)))
2626+        d1 = reader.get_blockhashes(needed, queue=True, force_remote=True)
2627+        if self.share_hash_tree.needed_hashes(reader.shnum):
2628+            need = self.share_hash_tree.needed_hashes(reader.shnum)
2629+            self.log("also need sharehashes for share %d: %s" % (reader.shnum,
2630+                                                                 str(need)))
2631+            d2 = reader.get_sharehashes(need, queue=True, force_remote=True)
2632+        else:
2633+            d2 = defer.succeed({}) # the logic in the next method
2634+                                   # expects a dict
2635+        dl = defer.DeferredList([d1, d2], consumeErrors=True)
2636+        return dl
2637+
2638+
2639+    def _decode_blocks(self, blocks_and_salts, segnum):
2640+        """
2641+        I take a list of k blocks and salts, and decode that into a
2642+        single encrypted segment.
2643+        """
2644+        d = {}
2645+        # We want to merge our dictionaries to the form
2646+        # {shnum: blocks_and_salts}
2647+        #
2648+        # The dictionaries come back from _validate_block in that form, so
2649+        # we just need to merge them.
2650+        for block_and_salt in blocks_and_salts:
2651+            d.update(block_and_salt[1])
2652+
2653+        # All of these blocks should have the same salt; in SDMF, it is
2654+        # the file-wide IV, while in MDMF it is the per-segment salt. In
2655+        # either case, we just need to get one of them and use it.
2656+        #
2657+        # d.items()[0] is like (shnum, (block, salt))
2658+        # d.items()[0][1] is like (block, salt)
2659+        # d.items()[0][1][1] is the salt.
2660+        salt = d.items()[0][1][1]
2661+        # Next, extract just the blocks from the dict. We'll use the
2662+        # salt in the next step.
2663+        share_and_shareids = [(k, v[0]) for k, v in d.items()]
2664+        d2 = dict(share_and_shareids)
2665+        shareids = []
2666+        shares = []
2667+        for shareid, share in d2.items():
2668             shareids.append(shareid)
2669             shares.append(share)
2670 
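(Editorial aside, not part of the patch: the MDMF/SDMF distinction applied in _validate_block above boils down to whether the per-segment salt is hashed in with the block. A conceptual sketch, using plain hashlib as a stand-in for the tagged hashes that hashutil.block_hash actually computes:)

    import hashlib

    # Conceptual only: MDMF mixes the per-segment salt into the block hash,
    # SDMF hashes the bare block. The resulting leaf goes into that share's
    # block hash tree, whose root in turn serves as the share's leaf in the
    # file-wide share hash tree.
    def block_leaf_hash(block, salt, mdmf):
        if mdmf:
            return hashlib.sha256(salt + block).digest()
        else:
            return hashlib.sha256(block).digest()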
2671hunk ./src/allmydata/mutable/retrieve.py 746
2672-        assert len(shareids) >= k, len(shareids)
2673+        assert len(shareids) >= self._required_shares, len(shareids)
2674         # zfec really doesn't want extra shares
2675hunk ./src/allmydata/mutable/retrieve.py 748
2676-        shareids = shareids[:k]
2677-        shares = shares[:k]
2678-
2679-        fec = codec.CRSDecoder()
2680-        fec.set_params(segsize, k, N)
2681-
2682-        self.log("params %s, we have %d shares" % ((segsize, k, N), len(shares)))
2683-        self.log("about to decode, shareids=%s" % (shareids,))
2684-        d = defer.maybeDeferred(fec.decode, shares, shareids)
2685-        def _done(buffers):
2686-            self._status.timings["decode"] = time.time() - started
2687-            self.log(" decode done, %d buffers" % len(buffers))
2688+        shareids = shareids[:self._required_shares]
2689+        shares = shares[:self._required_shares]
2690+        self.log("decoding segment %d" % segnum)
2691+        if segnum == self._num_segments - 1:
2692+            d = defer.maybeDeferred(self._tail_decoder.decode, shares, shareids)
2693+        else:
2694+            d = defer.maybeDeferred(self._segment_decoder.decode, shares, shareids)
2695+        def _process(buffers):
2696             segment = "".join(buffers)
2697hunk ./src/allmydata/mutable/retrieve.py 757
2698+            self.log(format="now decoding segment %(segnum)s of %(numsegs)s",
2699+                     segnum=segnum,
2700+                     numsegs=self._num_segments,
2701+                     level=log.NOISY)
2702             self.log(" joined length %d, datalength %d" %
2703hunk ./src/allmydata/mutable/retrieve.py 762
2704-                     (len(segment), datalength))
2705-            segment = segment[:datalength]
2706+                     (len(segment), self._data_length))
2707+            if segnum == self._num_segments - 1:
2708+                size_to_use = self._tail_data_size
2709+            else:
2710+                size_to_use = self._segment_size
2711+            segment = segment[:size_to_use]
2712             self.log(" segment len=%d" % len(segment))
2713hunk ./src/allmydata/mutable/retrieve.py 769
2714-            return segment
2715-        def _err(f):
2716-            self.log(" decode failed: %s" % f)
2717-            return f
2718-        d.addCallback(_done)
2719-        d.addErrback(_err)
2720+            return segment, salt
2721+        d.addCallback(_process)
2722         return d
2723 
2724hunk ./src/allmydata/mutable/retrieve.py 773
2725-    def _decrypt(self, crypttext, IV, readkey):
2726+
2727+    def _decrypt_segment(self, segment_and_salt):
2728+        """
2729+        I take a single segment and its salt, and decrypt it. I return
2730+        the plaintext of the segment that is in my argument.
2731+        """
2732+        segment, salt = segment_and_salt
2733         self._status.set_status("decrypting")
2734hunk ./src/allmydata/mutable/retrieve.py 781
2735+        self.log("decrypting segment %d" % self._current_segment)
2736         started = time.time()
2737hunk ./src/allmydata/mutable/retrieve.py 783
2738-        key = hashutil.ssk_readkey_data_hash(IV, readkey)
2739+        key = hashutil.ssk_readkey_data_hash(salt, self._node.get_readkey())
2740         decryptor = AES(key)
2741hunk ./src/allmydata/mutable/retrieve.py 785
2742-        plaintext = decryptor.process(crypttext)
2743+        plaintext = decryptor.process(segment)
2744         self._status.timings["decrypt"] = time.time() - started
2745         return plaintext
2746 
2747hunk ./src/allmydata/mutable/retrieve.py 789
2748-    def _done(self, res):
2749-        if not self._running:
2750+
2751+    def notify_server_corruption(self, peerid, shnum, reason):
2752+        ss = self.servermap.connections[peerid]
2753+        ss.callRemoteOnly("advise_corrupt_share",
2754+                          "mutable", self._storage_index, shnum, reason)
2755+
2756+
2757+    def _try_to_validate_privkey(self, enc_privkey, reader):
2758+
2759+        alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
2760+        alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
2761+        if alleged_writekey != self._node.get_writekey():
2762+            self.log("invalid privkey from %s shnum %d" %
2763+                     (reader, reader.shnum),
2764+                     level=log.WEIRD, umid="YIw4tA")
2765             return
2766hunk ./src/allmydata/mutable/retrieve.py 805
2767-        self._running = False
2768-        self._status.set_active(False)
2769-        self._status.timings["total"] = time.time() - self._started
2770-        # res is either the new contents, or a Failure
2771-        if isinstance(res, failure.Failure):
2772-            self.log("Retrieve done, with failure", failure=res,
2773-                     level=log.UNUSUAL)
2774-            self._status.set_status("Failed")
2775-        else:
2776-            self.log("Retrieve done, success!")
2777-            self._status.set_status("Finished")
2778-            self._status.set_progress(1.0)
2779-            # remember the encoding parameters, use them again next time
2780-            (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
2781-             offsets_tuple) = self.verinfo
2782-            self._node._populate_required_shares(k)
2783-            self._node._populate_total_shares(N)
2784-        eventually(self._done_deferred.callback, res)
2785 
2786hunk ./src/allmydata/mutable/retrieve.py 806
2787+        # it's good
2788+        self.log("got valid privkey from shnum %d on reader %s" %
2789+                 (reader.shnum, reader))
2790+        privkey = rsa.create_signing_key_from_string(alleged_privkey_s)
2791+        self._node._populate_encprivkey(enc_privkey)
2792+        self._node._populate_privkey(privkey)
2793+        self._need_privkey = False
2794+
2795+
2796+    def _check_for_done(self, res):
2797+        """
2798+        I check to see if this Retrieve object has successfully finished
2799+        its work.
2800+
2801+        I can exit in the following ways:
2802+            - If there are no more segments to download, then I exit by
2803+              causing self._done_deferred to fire with the plaintext
2804+              content requested by the caller.
2805+            - If there are still segments to be downloaded, and there
2806+              are enough active readers (readers which have not broken
2807+              and have not given us corrupt data) to continue
2808+              downloading, I send control back to
2809+              _download_current_segment.
2810+            - If there are still segments to be downloaded but there are
2811+              not enough active peers to download them, I ask
2812+              _add_active_peers to add more peers. If it is successful,
2813+              it will call _download_current_segment. If there are not
2814+              enough peers to retrieve the file, then that will cause
2815+              _done_deferred to errback.
2816+        """
2817+        self.log("checking for doneness")
2818+        if self._current_segment == self._num_segments:
2819+            # No more segments to download, we're done.
2820+            self.log("got plaintext, done")
2821+            return self._done()
2822+
2823+        if len(self._active_readers) >= self._required_shares:
2824+            # More segments to download, but we have enough good peers
2825+            # in self._active_readers that we can do that without issue,
2826+            # so go nab the next segment.
2827+            self.log("not done yet: on segment %d of %d" % \
2828+                     (self._current_segment + 1, self._num_segments))
2829+            return self._download_current_segment()
2830+
2831+        self.log("not done yet: on segment %d of %d, need to add peers" % \
2832+                 (self._current_segment + 1, self._num_segments))
2833+        return self._add_active_peers()
2834+
2835+
2836+    def _done(self):
2837+        """
2838+        I am called by _check_for_done when the download process has
2839+        finished successfully. After making some useful logging
2840+        statements, I return the decrypted contents to the owner of this
2841+        Retrieve object through self._done_deferred.
2842+        """
2843+        eventually(self._done_deferred.callback, self._plaintext)
2844+
2845+
2846+    def _failed(self):
2847+        """
2848+        I am called by _add_active_peers when there are not enough
2849+        active peers left to complete the download. After making some
2850+        useful logging statements, I return an exception to that effect
2851+        to the caller of this Retrieve object through
2852+        self._done_deferred.
2853+        """
2854+        format = ("ran out of peers: "
2855+                  "have %(have)d of %(total)d segments "
2856+                  "found %(bad)d bad shares "
2857+                  "encoding %(k)d-of-%(n)d")
2858+        args = {"have": self._current_segment,
2859+                "total": self._num_segments,
2860+                "k": self._required_shares,
2861+                "n": self._total_shares,
2862+                "bad": len(self._bad_shares)}
2863+        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
2864+                                                        str(self._last_failure)))
2865+        f = failure.Failure(e)
2866+        eventually(self._done_deferred.callback, f)
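(Editorial aside, not part of the patch: _failed delivers its NotEnoughSharesError by firing the done Deferred's callback with a Failure instance; in Twisted, a Failure passed to callback() is routed down the errback chain, so the caller still observes an error. A minimal illustration:)

    from twisted.internet import defer
    from twisted.python import failure

    d = defer.Deferred()
    d.addCallback(lambda res: "never reached")
    d.addErrback(lambda f: f.getErrorMessage())
    # A Failure handed to callback() runs the errbacks, not the callbacks.
    d.callback(failure.Failure(ValueError("ran out of peers")))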
2867hunk ./src/allmydata/test/test_mutable.py 12
2868 from allmydata.util.hashutil import tagged_hash, ssk_writekey_hash, \
2869      ssk_pubkey_fingerprint_hash
2870 from allmydata.interfaces import IRepairResults, ICheckAndRepairResults, \
2871-     NotEnoughSharesError
2872+     NotEnoughSharesError, SDMF_VERSION, MDMF_VERSION
2873 from allmydata.monitor import Monitor
2874 from allmydata.test.common import ShouldFailMixin
2875 from allmydata.test.no_network import GridTestMixin
2876hunk ./src/allmydata/test/test_mutable.py 28
2877 from allmydata.mutable.retrieve import Retrieve
2878 from allmydata.mutable.publish import Publish
2879 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
2880-from allmydata.mutable.layout import unpack_header, unpack_share
2881+from allmydata.mutable.layout import unpack_header, unpack_share, \
2882+                                     MDMFSlotReadProxy
2883 from allmydata.mutable.repairer import MustForceRepairError
2884 
2885 import allmydata.test.common_util as testutil
2886hunk ./src/allmydata/test/test_mutable.py 104
2887         d = fireEventually()
2888         d.addCallback(lambda res: _call())
2889         return d
2890+
2891     def callRemoteOnly(self, methname, *args, **kwargs):
2892         d = self.callRemote(methname, *args, **kwargs)
2893         d.addBoth(lambda ignore: None)
2894hunk ./src/allmydata/test/test_mutable.py 163
2895 def corrupt(res, s, offset, shnums_to_corrupt=None, offset_offset=0):
2896     # if shnums_to_corrupt is None, corrupt all shares. Otherwise it is a
2897     # list of shnums to corrupt.
2898+    ds = []
2899     for peerid in s._peers:
2900         shares = s._peers[peerid]
2901         for shnum in shares:
2902hunk ./src/allmydata/test/test_mutable.py 190
2903                 else:
2904                     offset1 = offset
2905                     offset2 = 0
2906-                if offset1 == "pubkey":
2907+                if offset1 == "pubkey" and IV:
2908                     real_offset = 107
2909hunk ./src/allmydata/test/test_mutable.py 192
2910+                elif offset1 == "share_data" and not IV:
2911+                    real_offset = 104
2912                 elif offset1 in o:
2913                     real_offset = o[offset1]
2914                 else:
2915hunk ./src/allmydata/test/test_mutable.py 327
2916         d.addCallback(_created)
2917         return d
2918 
2919+
2920+    def test_upload_and_download_mdmf(self):
2921+        d = self.nodemaker.create_mutable_file(version=MDMF_VERSION)
2922+        def _created(n):
2923+            d = defer.succeed(None)
2924+            d.addCallback(lambda ignored:
2925+                n.get_servermap(MODE_READ))
2926+            def _then(servermap):
2927+                dumped = servermap.dump(StringIO())
2928+                self.failUnlessIn("3-of-10", dumped.getvalue())
2929+            d.addCallback(_then)
2930+            # Now overwrite the contents with some new contents. We want
2931+            # to make them big enough to force the file to be uploaded
2932+            # in more than one segment.
2933+            big_contents = "contents1" * 100000 # about 900 KiB
2934+            d.addCallback(lambda ignored:
2935+                n.overwrite(big_contents))
2936+            d.addCallback(lambda ignored:
2937+                n.download_best_version())
2938+            d.addCallback(lambda data:
2939+                self.failUnlessEqual(data, big_contents))
2940+            # Overwrite the contents again with some new contents. As
2941+            # before, they need to be big enough to force multiple
2942+            # segments, so that we make the downloader deal with
2943+            # multiple segments.
2944+            bigger_contents = "contents2" * 1000000 # about 9MiB
2945+            d.addCallback(lambda ignored:
2946+                n.overwrite(bigger_contents))
2947+            d.addCallback(lambda ignored:
2948+                n.download_best_version())
2949+            d.addCallback(lambda data:
2950+                self.failUnlessEqual(data, bigger_contents))
2951+            return d
2952+        d.addCallback(_created)
2953+        return d
2954+
2955+
2956     def test_create_with_initial_contents(self):
2957         d = self.nodemaker.create_mutable_file("contents 1")
2958         def _created(n):
2959hunk ./src/allmydata/test/test_mutable.py 1147
2960 
2961 
2962     def _test_corrupt_all(self, offset, substring,
2963-                          should_succeed=False, corrupt_early=True,
2964-                          failure_checker=None):
2965+                          should_succeed=False,
2966+                          corrupt_early=True,
2967+                          failure_checker=None,
2968+                          fetch_privkey=False):
2969         d = defer.succeed(None)
2970         if corrupt_early:
2971             d.addCallback(corrupt, self._storage, offset)
2972hunk ./src/allmydata/test/test_mutable.py 1167
2973                     self.failUnlessIn(substring, "".join(allproblems))
2974                 return servermap
2975             if should_succeed:
2976-                d1 = self._fn.download_version(servermap, ver)
2977+                d1 = self._fn.download_version(servermap, ver,
2978+                                               fetch_privkey)
2979                 d1.addCallback(lambda new_contents:
2980                                self.failUnlessEqual(new_contents, self.CONTENTS))
2981             else:
2982hunk ./src/allmydata/test/test_mutable.py 1175
2983                 d1 = self.shouldFail(NotEnoughSharesError,
2984                                      "_corrupt_all(offset=%s)" % (offset,),
2985                                      substring,
2986-                                     self._fn.download_version, servermap, ver)
2987+                                     self._fn.download_version, servermap,
2988+                                                                ver,
2989+                                                                fetch_privkey)
2990             if failure_checker:
2991                 d1.addCallback(failure_checker)
2992             d1.addCallback(lambda res: servermap)
2993hunk ./src/allmydata/test/test_mutable.py 1186
2994         return d
2995 
2996     def test_corrupt_all_verbyte(self):
2997-        # when the version byte is not 0, we hit an UnknownVersionError error
2998-        # in unpack_share().
2999+        # when the version byte is not 0 or 1, we hit an UnknownVersionError
3000+        # error in unpack_share().
3001         d = self._test_corrupt_all(0, "UnknownVersionError")
3002         def _check_servermap(servermap):
3003             # and the dump should mention the problems
3004hunk ./src/allmydata/test/test_mutable.py 1193
3005             s = StringIO()
3006             dump = servermap.dump(s).getvalue()
3007-            self.failUnless("10 PROBLEMS" in dump, dump)
3008+            self.failUnless("30 PROBLEMS" in dump, dump)
3009         d.addCallback(_check_servermap)
3010         return d
3011 
3012hunk ./src/allmydata/test/test_mutable.py 1263
3013         return self._test_corrupt_all("enc_privkey", None, should_succeed=True)
3014 
3015 
3016+    def test_corrupt_all_encprivkey_late(self):
3017+        # this should work for the same reason as above, but we corrupt
3018+        # after the servermap update to exercise the error handling
3019+        # code.
3020+        # We need to remove the privkey from the node, or the retrieve
3021+        # process won't know to update it.
3022+        self._fn._privkey = None
3023+        return self._test_corrupt_all("enc_privkey",
3024+                                      None, # this shouldn't fail
3025+                                      should_succeed=True,
3026+                                      corrupt_early=False,
3027+                                      fetch_privkey=True)
3028+
3029+
3030     def test_corrupt_all_seqnum_late(self):
3031         # corrupting the seqnum between mapupdate and retrieve should result
3032         # in NotEnoughSharesError, since each share will look invalid
3033hunk ./src/allmydata/test/test_mutable.py 1283
3034         def _check(res):
3035             f = res[0]
3036             self.failUnless(f.check(NotEnoughSharesError))
3037-            self.failUnless("someone wrote to the data since we read the servermap" in str(f))
3038+            self.failUnless("uncoordinated write" in str(f))
3039         return self._test_corrupt_all(1, "ran out of peers",
3040                                       corrupt_early=False,
3041                                       failure_checker=_check)
3042hunk ./src/allmydata/test/test_mutable.py 1333
3043                       self.failUnlessEqual(new_contents, self.CONTENTS))
3044         return d
3045 
3046-    def test_corrupt_some(self):
3047-        # corrupt the data of first five shares (so the servermap thinks
3048-        # they're good but retrieve marks them as bad), so that the
3049-        # MODE_READ set of 6 will be insufficient, forcing node.download to
3050-        # retry with more servers.
3051-        corrupt(None, self._storage, "share_data", range(5))
3052-        d = self.make_servermap()
3053+
3054+    def _test_corrupt_some(self, offset, mdmf=False):
3055+        if mdmf:
3056+            d = self.publish_mdmf()
3057+        else:
3058+            d = defer.succeed(None)
3059+        d.addCallback(lambda ignored:
3060+            corrupt(None, self._storage, offset, range(5)))
3061+        d.addCallback(lambda ignored:
3062+            self.make_servermap())
3063         def _do_retrieve(servermap):
3064             ver = servermap.best_recoverable_version()
3065             self.failUnless(ver)
3066hunk ./src/allmydata/test/test_mutable.py 1349
3067             return self._fn.download_best_version()
3068         d.addCallback(_do_retrieve)
3069         d.addCallback(lambda new_contents:
3070-                      self.failUnlessEqual(new_contents, self.CONTENTS))
3071+            self.failUnlessEqual(new_contents, self.CONTENTS))
3072         return d
3073 
3074hunk ./src/allmydata/test/test_mutable.py 1352
3075+
3076+    def test_corrupt_some(self):
3077+        # corrupt the data of first five shares (so the servermap thinks
3078+        # they're good but retrieve marks them as bad), so that the
3079+        # MODE_READ set of 6 will be insufficient, forcing node.download to
3080+        # retry with more servers.
3081+        return self._test_corrupt_some("share_data")
3082+
3083+
3084     def test_download_fails(self):
3085         d = corrupt(None, self._storage, "signature")
3086         d.addCallback(lambda ignored:
3087hunk ./src/allmydata/test/test_mutable.py 1366
3088             self.shouldFail(UnrecoverableFileError, "test_download_anyway",
3089                             "no recoverable versions",
3090-                            self._fn.download_best_version)
3091+                            self._fn.download_best_version))
3092         return d
3093 
3094 
3095hunk ./src/allmydata/test/test_mutable.py 1370
3096+
3097+    def test_corrupt_mdmf_block_hash_tree(self):
3098+        d = self.publish_mdmf()
3099+        d.addCallback(lambda ignored:
3100+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3101+                                   "block hash tree failure",
3102+                                   corrupt_early=False,
3103+                                   should_succeed=False))
3104+        return d
3105+
3106+
3107+    def test_corrupt_mdmf_block_hash_tree_late(self):
3108+        d = self.publish_mdmf()
3109+        d.addCallback(lambda ignored:
3110+            self._test_corrupt_all(("block_hash_tree", 12 * 32),
3111+                                   "block hash tree failure",
3112+                                   corrupt_early=True,
3113+                                   should_succeed=False))
3114+        return d
3115+
3116+
3117+    def test_corrupt_mdmf_share_data(self):
3118+        d = self.publish_mdmf()
3119+        d.addCallback(lambda ignored:
3120+            # TODO: Find out what the block size is and corrupt a
3121+            # specific block, rather than just guessing.
3122+            self._test_corrupt_all(("share_data", 12 * 40),
3123+                                    "block hash tree failure",
3124+                                    corrupt_early=True,
3125+                                    should_succeed=False))
3126+        return d
3127+
3128+
3129+    def test_corrupt_some_mdmf(self):
3130+        return self._test_corrupt_some(("share_data", 12 * 40),
3131+                                       mdmf=True)
3132+
3133+
3134 class CheckerMixin:
3135     def check_good(self, r, where):
3136         self.failUnless(r.is_healthy(), where)
3137hunk ./src/allmydata/test/test_mutable.py 2116
3138             d.addCallback(lambda res:
3139                           self.shouldFail(NotEnoughSharesError,
3140                                           "test_retrieve_surprise",
3141-                                          "ran out of peers: have 0 shares (k=3)",
3142+                                          "ran out of peers: have 0 of 1",
3143                                           n.download_version,
3144                                           self.old_map,
3145                                           self.old_map.best_recoverable_version(),
3146hunk ./src/allmydata/test/test_mutable.py 2125
3147         d.addCallback(_created)
3148         return d
3149 
3150+
3151     def test_unexpected_shares(self):
3152         # upload the file, take a servermap, shut down one of the servers,
3153         # upload it again (causing shares to appear on a new server), then
3154hunk ./src/allmydata/test/test_mutable.py 2329
3155         self.basedir = "mutable/Problems/test_privkey_query_missing"
3156         self.set_up_grid(num_servers=20)
3157         nm = self.g.clients[0].nodemaker
3158-        LARGE = "These are Larger contents" * 2000 # about 50KB
3159+        LARGE = "These are Larger contents" * 2000 # about 50KiB
3160         nm._node_cache = DevNullDictionary() # disable the nodecache
3161 
3162         d = nm.create_mutable_file(LARGE)
3163hunk ./src/allmydata/test/test_mutable.py 2342
3164         d.addCallback(_created)
3165         d.addCallback(lambda res: self.n2.get_servermap(MODE_WRITE))
3166         return d
3167+
3168+
3169+    def test_block_and_hash_query_error(self):
3170+        # This tests for what happens when a query to a remote server
3171+        # fails in either the hash validation step or the block getting
3172+        # step (because of batching, this is the same actual query).
3173+        # We need to have the storage server persist up until the point
3174+        # that its prefix is validated, then suddenly die. This
3175+        # exercises some exception handling code in Retrieve.
3176+        self.basedir = "mutable/Problems/test_block_and_hash_query_error"
3177+        self.set_up_grid(num_servers=20)
3178+        nm = self.g.clients[0].nodemaker
3179+        CONTENTS = "contents" * 2000
3180+        d = nm.create_mutable_file(CONTENTS)
3181+        def _created(node):
3182+            self._node = node
3183+        d.addCallback(_created)
3184+        d.addCallback(lambda ignored:
3185+            self._node.get_servermap(MODE_READ))
3186+        def _then(servermap):
3187+            # we have our servermap. Now we set up the servers like the
3188+            # tests above -- the first one that gets a read call should
3189+            # start throwing errors, but only after returning its prefix
3190+            # for validation. Since we'll download without fetching the
3191+            # private key, the next query to the remote server will be
3192+            # for either a block and salt or for hashes, either of which
3193+            # will exercise the error handling code.
3194+            killer = FirstServerGetsKilled()
3195+            for (serverid, ss) in nm.storage_broker.get_all_servers():
3196+                ss.post_call_notifier = killer.notify
3197+            ver = servermap.best_recoverable_version()
3198+            assert ver
3199+            return self._node.download_version(servermap, ver)
3200+        d.addCallback(_then)
3201+        d.addCallback(lambda data:
3202+            self.failUnlessEqual(data, CONTENTS))
3203+        return d
3204}
3205[mutable/checker.py: check MDMF files
3206Kevan Carstensen <kevan@isnotajoke.com>**20100628225048
3207 Ignore-this: fb697b36285d60552df6ca5ac6a37629
3208 
3209 This patch adapts the mutable file checker and verifier to check and
3210 verify MDMF files. It does this by using the new segmented downloader,
3211 which is trained to perform verification operations on request. This
3212 removes some code duplication.
3213] {
3214hunk ./src/allmydata/mutable/checker.py 12
3215 from allmydata.mutable.common import MODE_CHECK, CorruptShareError
3216 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
3217 from allmydata.mutable.layout import unpack_share, SIGNED_PREFIX_LENGTH
3218+from allmydata.mutable.retrieve import Retrieve # for verifying
3219 
3220 class MutableChecker:
3221 
3222hunk ./src/allmydata/mutable/checker.py 29
3223 
3224     def check(self, verify=False, add_lease=False):
3225         servermap = ServerMap()
3226+        # Updating the servermap in MODE_CHECK will stand a good chance
3227+        # of finding all of the shares, and getting a good idea of
3228+        # recoverability, etc, without verifying.
3229         u = ServermapUpdater(self._node, self._storage_broker, self._monitor,
3230                              servermap, MODE_CHECK, add_lease=add_lease)
3231         if self._history:
3232hunk ./src/allmydata/mutable/checker.py 55
3233         if num_recoverable:
3234             self.best_version = servermap.best_recoverable_version()
3235 
3236+        # The file is unhealthy and needs to be repaired if:
3237+        # - There are unrecoverable versions.
3238         if servermap.unrecoverable_versions():
3239             self.need_repair = True
3240hunk ./src/allmydata/mutable/checker.py 59
3241+        # - There isn't a recoverable version.
3242         if num_recoverable != 1:
3243             self.need_repair = True
3244hunk ./src/allmydata/mutable/checker.py 62
3245+        # - The best recoverable version is missing some shares.
3246         if self.best_version:
3247             available_shares = servermap.shares_available()
3248             (num_distinct_shares, k, N) = available_shares[self.best_version]
3249hunk ./src/allmydata/mutable/checker.py 73
3250 
3251     def _verify_all_shares(self, servermap):
3252         # read every byte of each share
3253+        #
3254+        # This logic is very nearly the same as the downloader's, so
3255+        # rather than duplicating a bunch of code we pass the downloader a
3256+        # flag that makes it do the verification for us, and piggyback on
3257+        # that.
3258+        #
3259+        # Like:
3260+        #  r = Retrieve(blah, blah, blah, verify=True)
3261+        #  d = r.download()
3262+        #  (wait, wait, wait, d.callback)
3263+        # 
3264+        #  Then, when it has finished, we can check the servermap (which
3265+        #  we provided to Retrieve) to figure out which shares are bad,
3266+        #  since the Retrieve process will have updated the servermap as
3267+        #  it went along.
3268+        #
3269+        #  By passing the verify=True flag to the constructor, we are
3270+        #  telling the downloader a few things.
3271+        #
3272+        #  1. It needs to download all N shares, not just K shares.
3273+        #  2. It doesn't need to decrypt or decode the shares, only
3274+        #     verify them.
3275         if not self.best_version:
3276             return
3277hunk ./src/allmydata/mutable/checker.py 97
3278-        versionmap = servermap.make_versionmap()
3279-        shares = versionmap[self.best_version]
3280-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3281-         offsets_tuple) = self.best_version
3282-        offsets = dict(offsets_tuple)
3283-        readv = [ (0, offsets["EOF"]) ]
3284-        dl = []
3285-        for (shnum, peerid, timestamp) in shares:
3286-            ss = servermap.connections[peerid]
3287-            d = self._do_read(ss, peerid, self._storage_index, [shnum], readv)
3288-            d.addCallback(self._got_answer, peerid, servermap)
3289-            dl.append(d)
3290-        return defer.DeferredList(dl, fireOnOneErrback=True, consumeErrors=True)
3291 
3292hunk ./src/allmydata/mutable/checker.py 98
3293-    def _do_read(self, ss, peerid, storage_index, shnums, readv):
3294-        # isolate the callRemote to a separate method, so tests can subclass
3295-        # Publish and override it
3296-        d = ss.callRemote("slot_readv", storage_index, shnums, readv)
3297+        r = Retrieve(self._node, servermap, self.best_version, verify=True)
3298+        d = r.download()
3299+        d.addCallback(self._process_bad_shares)
3300         return d
3301 
3302hunk ./src/allmydata/mutable/checker.py 103
3303-    def _got_answer(self, datavs, peerid, servermap):
3304-        for shnum,datav in datavs.items():
3305-            data = datav[0]
3306-            try:
3307-                self._got_results_one_share(shnum, peerid, data)
3308-            except CorruptShareError:
3309-                f = failure.Failure()
3310-                self.need_repair = True
3311-                self.bad_shares.append( (peerid, shnum, f) )
3312-                prefix = data[:SIGNED_PREFIX_LENGTH]
3313-                servermap.mark_bad_share(peerid, shnum, prefix)
3314-                ss = servermap.connections[peerid]
3315-                self.notify_server_corruption(ss, shnum, str(f.value))
3316-
3317-    def check_prefix(self, peerid, shnum, data):
3318-        (seqnum, root_hash, IV, segsize, datalength, k, N, prefix,
3319-         offsets_tuple) = self.best_version
3320-        got_prefix = data[:SIGNED_PREFIX_LENGTH]
3321-        if got_prefix != prefix:
3322-            raise CorruptShareError(peerid, shnum,
3323-                                    "prefix mismatch: share changed while we were reading it")
3324-
3325-    def _got_results_one_share(self, shnum, peerid, data):
3326-        self.check_prefix(peerid, shnum, data)
3327-
3328-        # the [seqnum:signature] pieces are validated by _compare_prefix,
3329-        # which checks their signature against the pubkey known to be
3330-        # associated with this file.
3331 
3332hunk ./src/allmydata/mutable/checker.py 104
3333-        (seqnum, root_hash, IV, k, N, segsize, datalen, pubkey, signature,
3334-         share_hash_chain, block_hash_tree, share_data,
3335-         enc_privkey) = unpack_share(data)
3336-
3337-        # validate [share_hash_chain,block_hash_tree,share_data]
3338-
3339-        leaves = [hashutil.block_hash(share_data)]
3340-        t = hashtree.HashTree(leaves)
3341-        if list(t) != block_hash_tree:
3342-            raise CorruptShareError(peerid, shnum, "block hash tree failure")
3343-        share_hash_leaf = t[0]
3344-        t2 = hashtree.IncompleteHashTree(N)
3345-        # root_hash was checked by the signature
3346-        t2.set_hashes({0: root_hash})
3347-        try:
3348-            t2.set_hashes(hashes=share_hash_chain,
3349-                          leaves={shnum: share_hash_leaf})
3350-        except (hashtree.BadHashError, hashtree.NotEnoughHashesError,
3351-                IndexError), e:
3352-            msg = "corrupt hashes: %s" % (e,)
3353-            raise CorruptShareError(peerid, shnum, msg)
3354-
3355-        # validate enc_privkey: only possible if we have a write-cap
3356-        if not self._node.is_readonly():
3357-            alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3358-            alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3359-            if alleged_writekey != self._node.get_writekey():
3360-                raise CorruptShareError(peerid, shnum, "invalid privkey")
3361+    def _process_bad_shares(self, bad_shares):
3362+        if bad_shares:
3363+            self.need_repair = True
3364+        self.bad_shares = bad_shares
3365 
3366hunk ./src/allmydata/mutable/checker.py 109
3367-    def notify_server_corruption(self, ss, shnum, reason):
3368-        ss.callRemoteOnly("advise_corrupt_share",
3369-                          "mutable", self._storage_index, shnum, reason)
3370 
3371     def _count_shares(self, smap, version):
3372         available_shares = smap.shares_available()
3373hunk ./src/allmydata/test/test_mutable.py 193
3374                 if offset1 == "pubkey" and IV:
3375                     real_offset = 107
3376                 elif offset1 == "share_data" and not IV:
3377-                    real_offset = 104
3378+                    real_offset = 107
3379                 elif offset1 in o:
3380                     real_offset = o[offset1]
3381                 else:
3382hunk ./src/allmydata/test/test_mutable.py 395
3383             return d
3384         d.addCallback(_created)
3385         return d
3386+    test_create_mdmf_with_initial_contents.timeout = 20
3387 
3388 
3389     def test_create_with_initial_contents_function(self):
3390hunk ./src/allmydata/test/test_mutable.py 700
3391                                            k, N, segsize, datalen)
3392                 self.failUnless(p._pubkey.verify(sig_material, signature))
3393                 #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
3394-                self.failUnless(isinstance(share_hash_chain, dict))
3395-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3396+                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
3397                 for shnum,share_hash in share_hash_chain.items():
3398                     self.failUnless(isinstance(shnum, int))
3399                     self.failUnless(isinstance(share_hash, str))
3400hunk ./src/allmydata/test/test_mutable.py 820
3401                     shares[peerid][shnum] = oldshares[index][peerid][shnum]
3402 
3403 
3404+
3405+
3406 class Servermap(unittest.TestCase, PublishMixin):
3407     def setUp(self):
3408         return self.publish_one()
3409hunk ./src/allmydata/test/test_mutable.py 951
3410         self._storage._peers = {} # delete all shares
3411         ms = self.make_servermap
3412         d = defer.succeed(None)
3413-
3414+#
3415         d.addCallback(lambda res: ms(mode=MODE_CHECK))
3416         d.addCallback(lambda sm: self.failUnlessNoneRecoverable(sm))
3417 
3418hunk ./src/allmydata/test/test_mutable.py 1440
3419         d.addCallback(self.check_good, "test_check_good")
3420         return d
3421 
3422+    def test_check_mdmf_good(self):
3423+        d = self.publish_mdmf()
3424+        d.addCallback(lambda ignored:
3425+            self._fn.check(Monitor()))
3426+        d.addCallback(self.check_good, "test_check_mdmf_good")
3427+        return d
3428+
3429     def test_check_no_shares(self):
3430         for shares in self._storage._peers.values():
3431             shares.clear()
3432hunk ./src/allmydata/test/test_mutable.py 1454
3433         d.addCallback(self.check_bad, "test_check_no_shares")
3434         return d
3435 
3436+    def test_check_mdmf_no_shares(self):
3437+        d = self.publish_mdmf()
3438+        def _then(ignored):
3439+            for share in self._storage._peers.values():
3440+                share.clear()
3441+        d.addCallback(_then)
3442+        d.addCallback(lambda ignored:
3443+            self._fn.check(Monitor()))
3444+        d.addCallback(self.check_bad, "test_check_mdmf_no_shares")
3445+        return d
3446+
3447     def test_check_not_enough_shares(self):
3448         for shares in self._storage._peers.values():
3449             for shnum in shares.keys():
3450hunk ./src/allmydata/test/test_mutable.py 1474
3451         d.addCallback(self.check_bad, "test_check_not_enough_shares")
3452         return d
3453 
3454+    def test_check_mdmf_not_enough_shares(self):
3455+        d = self.publish_mdmf()
3456+        def _then(ignored):
3457+            for shares in self._storage._peers.values():
3458+                for shnum in shares.keys():
3459+                    if shnum > 0:
3460+                        del shares[shnum]
3461+        d.addCallback(_then)
3462+        d.addCallback(lambda ignored:
3463+            self._fn.check(Monitor()))
3464+        d.addCallback(self.check_bad, "test_check_mdmf_not_enougH_shares")
3465+        return d
3466+
3467+
3468     def test_check_all_bad_sig(self):
3469         d = corrupt(None, self._storage, 1) # bad sig
3470         d.addCallback(lambda ignored:
3471hunk ./src/allmydata/test/test_mutable.py 1495
3472         d.addCallback(self.check_bad, "test_check_all_bad_sig")
3473         return d
3474 
3475+    def test_check_mdmf_all_bad_sig(self):
3476+        d = self.publish_mdmf()
3477+        d.addCallback(lambda ignored:
3478+            corrupt(None, self._storage, 1))
3479+        d.addCallback(lambda ignored:
3480+            self._fn.check(Monitor()))
3481+        d.addCallback(self.check_bad, "test_check_mdmf_all_bad_sig")
3482+        return d
3483+
3484     def test_check_all_bad_blocks(self):
3485         d = corrupt(None, self._storage, "share_data", [9]) # bad blocks
3486         # the Checker won't notice this.. it doesn't look at actual data
3487hunk ./src/allmydata/test/test_mutable.py 1512
3488         d.addCallback(self.check_good, "test_check_all_bad_blocks")
3489         return d
3490 
3491+
3492+    def test_check_mdmf_all_bad_blocks(self):
3493+        d = self.publish_mdmf()
3494+        d.addCallback(lambda ignored:
3495+            corrupt(None, self._storage, "share_data"))
3496+        d.addCallback(lambda ignored:
3497+            self._fn.check(Monitor()))
3498+        d.addCallback(self.check_good, "test_check_mdmf_all_bad_blocks")
3499+        return d
3500+
3501     def test_verify_good(self):
3502         d = self._fn.check(Monitor(), verify=True)
3503         d.addCallback(self.check_good, "test_verify_good")
3504hunk ./src/allmydata/test/test_mutable.py 1582
3505                       "test_verify_one_bad_encprivkey_uncheckable")
3506         return d
3507 
3508+
3509+    def test_verify_mdmf_good(self):
3510+        d = self.publish_mdmf()
3511+        d.addCallback(lambda ignored:
3512+            self._fn.check(Monitor(), verify=True))
3513+        d.addCallback(self.check_good, "test_verify_mdmf_good")
3514+        return d
3515+
3516+
3517+    def test_verify_mdmf_one_bad_block(self):
3518+        d = self.publish_mdmf()
3519+        d.addCallback(lambda ignored:
3520+            corrupt(None, self._storage, "share_data", [1]))
3521+        d.addCallback(lambda ignored:
3522+            self._fn.check(Monitor(), verify=True))
3523+        # We should find one bad block here
3524+        d.addCallback(self.check_bad, "test_verify_mdmf_one_bad_block")
3525+        d.addCallback(self.check_expected_failure,
3526+                      CorruptShareError, "block hash tree failure",
3527+                      "test_verify_mdmf_one_bad_block")
3528+        return d
3529+
3530+
3531+    def test_verify_mdmf_bad_encprivkey(self):
3532+        d = self.publish_mdmf()
3533+        d.addCallback(lambda ignored:
3534+            corrupt(None, self._storage, "enc_privkey", [1]))
3535+        d.addCallback(lambda ignored:
3536+            self._fn.check(Monitor(), verify=True))
3537+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_encprivkey")
3538+        d.addCallback(self.check_expected_failure,
3539+                      CorruptShareError, "privkey",
3540+                      "test_verify_mdmf_bad_encprivkey")
3541+        return d
3542+
3543+
3544+    def test_verify_mdmf_bad_sig(self):
3545+        d = self.publish_mdmf()
3546+        d.addCallback(lambda ignored:
3547+            corrupt(None, self._storage, 1, [1]))
3548+        d.addCallback(lambda ignored:
3549+            self._fn.check(Monitor(), verify=True))
3550+        d.addCallback(self.check_bad, "test_verify_mdmf_bad_sig")
3551+        return d
3552+
3553+
3554+    def test_verify_mdmf_bad_encprivkey_uncheckable(self):
3555+        d = self.publish_mdmf()
3556+        d.addCallback(lambda ignored:
3557+            corrupt(None, self._storage, "enc_privkey", [1]))
3558+        d.addCallback(lambda ignored:
3559+            self._fn.get_readonly())
3560+        d.addCallback(lambda fn:
3561+            fn.check(Monitor(), verify=True))
3562+        d.addCallback(self.check_good,
3563+                      "test_verify_mdmf_bad_encprivkey_uncheckable")
3564+        return d
3565+
3566+
3567 class Repair(unittest.TestCase, PublishMixin, ShouldFailMixin):
3568 
3569     def get_shares(self, s):
3570hunk ./src/allmydata/test/test_mutable.py 1706
3571         current_shares = self.old_shares[-1]
3572         self.failUnlessEqual(old_shares, current_shares)
3573 
3574+
3575     def test_unrepairable_0shares(self):
3576         d = self.publish_one()
3577         def _delete_all_shares(ign):
3578hunk ./src/allmydata/test/test_mutable.py 1721
3579         d.addCallback(_check)
3580         return d
3581 
3582+    def test_mdmf_unrepairable_0shares(self):
3583+        d = self.publish_mdmf()
3584+        def _delete_all_shares(ign):
3585+            shares = self._storage._peers
3586+            for peerid in shares:
3587+                shares[peerid] = {}
3588+        d.addCallback(_delete_all_shares)
3589+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3590+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3591+        d.addCallback(lambda crr: self.failIf(crr.get_successful()))
3592+        return d
3593+
3594+
3595     def test_unrepairable_1share(self):
3596         d = self.publish_one()
3597         def _delete_all_shares(ign):
3598hunk ./src/allmydata/test/test_mutable.py 1750
3599         d.addCallback(_check)
3600         return d
3601 
3602+    def test_mdmf_unrepairable_1share(self):
3603+        d = self.publish_mdmf()
3604+        def _delete_all_shares(ign):
3605+            shares = self._storage._peers
3606+            for peerid in shares:
3607+                for shnum in list(shares[peerid]):
3608+                    if shnum > 0:
3609+                        del shares[peerid][shnum]
3610+        d.addCallback(_delete_all_shares)
3611+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3612+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3613+        def _check(crr):
3614+            self.failUnlessEqual(crr.get_successful(), False)
3615+        d.addCallback(_check)
3616+        return d
3617+
3618+    def test_repairable_5shares(self):
3619+        d = self.publish_mdmf()
3620+        def _delete_all_shares(ign):
3621+            shares = self._storage._peers
3622+            for peerid in shares:
3623+                for shnum in list(shares[peerid]):
3624+                    if shnum > 4:
3625+                        del shares[peerid][shnum]
3626+        d.addCallback(_delete_all_shares)
3627+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3628+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3629+        def _check(crr):
3630+            self.failUnlessEqual(crr.get_successful(), True)
3631+        d.addCallback(_check)
3632+        return d
3633+
3634+    def test_mdmf_repairable_5shares(self):
3635+        d = self.publish_mdmf()
3636+        def _delete_all_shares(ign):
3637+            shares = self._storage._peers
3638+            for peerid in shares:
3639+                for shnum in list(shares[peerid]):
3640+                    if shnum > 5:
3641+                        del shares[peerid][shnum]
3642+        d.addCallback(_delete_all_shares)
3643+        d.addCallback(lambda ign: self._fn.check(Monitor()))
3644+        d.addCallback(lambda check_results: self._fn.repair(check_results))
3645+        def _check(crr):
3646+            self.failUnlessEqual(crr.get_successful(), True)
3647+        d.addCallback(_check)
3648+        return d
3649+
3650+
3651     def test_merge(self):
3652         self.old_shares = []
3653         d = self.publish_multiple()
3654}
3655[mutable/retrieve.py: learn how to verify mutable files
3656Kevan Carstensen <kevan@isnotajoke.com>**20100628225201
3657 Ignore-this: 989af7800c47589620918461ec989483
3658] {
3659hunk ./src/allmydata/mutable/retrieve.py 86
3660     # Retrieve object will remain tied to a specific version of the file, and
3661     # will use a single ServerMap instance.
3662 
3663-    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False):
3664+    def __init__(self, filenode, servermap, verinfo, fetch_privkey=False,
3665+                 verify=False):
3666         self._node = filenode
3667         assert self._node.get_pubkey()
3668         self._storage_index = filenode.get_storage_index()
3669hunk ./src/allmydata/mutable/retrieve.py 106
3670         # during repair, we may be called upon to grab the private key, since
3671         # it wasn't picked up during a verify=False checker run, and we'll
3672         # need it for repair to generate a new version.
3673-        self._need_privkey = fetch_privkey
3674-        if self._node.get_privkey():
3675+        self._need_privkey = fetch_privkey or verify
3676+        if self._node.get_privkey() and not verify:
3677             self._need_privkey = False
3678 
3679         if self._need_privkey:
3680hunk ./src/allmydata/mutable/retrieve.py 117
3681             self._privkey_query_markers = [] # one Marker for each time we've
3682                                              # tried to get the privkey.
3683 
3684+        # verify means that we are using the downloader logic to verify all
3685+        # of our shares. This tells the downloader a few things.
3686+        #
3687+        # 1. We need to download all of the shares.
3688+        # 2. We don't need to decode or decrypt the shares, since our
3689+        #    caller doesn't care about the plaintext, only the
3690+        #    information about which shares are or are not valid.
3691+        # 3. When we are validating readers, we need to validate the
3692+        #    signature on the prefix. Do we? We already do this in the
3693+        #    servermap update?
3694+        #
3695+        # (just work on 1 and 2 for now, I guess)
3696+        self._verify = False
3697+        if verify:
3698+            self._verify = True
3699+
3700         self._status = RetrieveStatus()
3701         self._status.set_storage_index(self._storage_index)
3702         self._status.set_helper(False)
3703hunk ./src/allmydata/mutable/retrieve.py 323
3704 
3705         # We need at least self._required_shares readers to download a
3706         # segment.
3707-        needed = self._required_shares - len(self._active_readers)
3708+        if self._verify:
3709+            needed = self._total_shares
3710+        else:
3711+            needed = self._required_shares - len(self._active_readers)
3712         # XXX: Why don't format= log messages work here?
3713         self.log("adding %d peers to the active peers list" % needed)
3714 
3715hunk ./src/allmydata/mutable/retrieve.py 339
3716         # will cause problems later.
3717         active_shnums -= set([reader.shnum for reader in self._active_readers])
3718         active_shnums = list(active_shnums)[:needed]
3719-        if len(active_shnums) < needed:
3720+        if len(active_shnums) < needed and not self._verify:
3721             # We don't have enough readers to retrieve the file; fail.
3722             return self._failed()
3723 
3724hunk ./src/allmydata/mutable/retrieve.py 346
3725         for shnum in active_shnums:
3726             self._active_readers.append(self.readers[shnum])
3727             self.log("added reader for share %d" % shnum)
3728-        assert len(self._active_readers) == self._required_shares
3729+        assert len(self._active_readers) >= self._required_shares
3730         # Conceptually, this is part of the _add_active_peers step. It
3731         # validates the prefixes of newly added readers to make sure
3732         # that they match what we are expecting for self.verinfo. If
3733hunk ./src/allmydata/mutable/retrieve.py 416
3734                     # that we haven't gotten it at the end of
3735                     # segment decoding, then we'll take more drastic
3736                     # measures.
3737-                    if self._need_privkey:
3738+                    if self._need_privkey and not self._node.is_readonly():
3739                         d = reader.get_encprivkey()
3740                         d.addCallback(self._try_to_validate_privkey, reader)
3741             if bad_readers:
3742hunk ./src/allmydata/mutable/retrieve.py 423
3743                 # We do them all at once, or else we screw up list indexing.
3744                 for (reader, f) in bad_readers:
3745                     self._mark_bad_share(reader, f)
3746-                return self._add_active_peers()
3747+                if self._verify:
3748+                    if len(self._active_readers) >= self._required_shares:
3749+                        return self._download_current_segment()
3750+                    else:
3751+                        return self._failed()
3752+                else:
3753+                    return self._add_active_peers()
3754             else:
3755                 return self._download_current_segment()
3756             # The next step will assert that it has enough active
3757hunk ./src/allmydata/mutable/retrieve.py 518
3758         """
3759         self.log("marking share %d on server %s as bad" % \
3760                  (reader.shnum, reader))
3761+        prefix = self.verinfo[-2]
3762+        self.servermap.mark_bad_share(reader.peerid,
3763+                                      reader.shnum,
3764+                                      prefix)
3765         self._remove_reader(reader)
3766hunk ./src/allmydata/mutable/retrieve.py 523
3767-        self._bad_shares.add((reader.peerid, reader.shnum))
3768+        self._bad_shares.add((reader.peerid, reader.shnum, f))
3769         self._status.problems[reader.peerid] = f
3770         self._last_failure = f
3771         self.notify_server_corruption(reader.peerid, reader.shnum,
3772hunk ./src/allmydata/mutable/retrieve.py 571
3773             ds.append(dl)
3774             reader.flush()
3775         dl = defer.DeferredList(ds)
3776-        dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3777+        if self._verify:
3778+            dl.addCallback(lambda ignored: "")
3779+            dl.addCallback(self._set_segment)
3780+        else:
3781+            dl.addCallback(self._maybe_decode_and_decrypt_segment, segnum)
3782         return dl
3783 
3784 
3785hunk ./src/allmydata/mutable/retrieve.py 701
3786         # shnum, which will be a leaf in the share hash tree, which
3787         # will allow us to validate the rest of the tree.
3788         if self.share_hash_tree.needed_hashes(reader.shnum,
3789-                                               include_leaf=True):
3790+                                              include_leaf=True) or \
3791+                                              self._verify:
3792             try:
3793                 self.share_hash_tree.set_hashes(hashes=sharehashes[1],
3794                                             leaves={reader.shnum: bht[0]})
3795hunk ./src/allmydata/mutable/retrieve.py 832
3796 
3797 
3798     def _try_to_validate_privkey(self, enc_privkey, reader):
3799-
3800         alleged_privkey_s = self._node._decrypt_privkey(enc_privkey)
3801         alleged_writekey = hashutil.ssk_writekey_hash(alleged_privkey_s)
3802         if alleged_writekey != self._node.get_writekey():
3803hunk ./src/allmydata/mutable/retrieve.py 838
3804             self.log("invalid privkey from %s shnum %d" %
3805                      (reader, reader.shnum),
3806                      level=log.WEIRD, umid="YIw4tA")
3807+            if self._verify:
3808+                self.servermap.mark_bad_share(reader.peerid, reader.shnum,
3809+                                              self.verinfo[-2])
3810+                e = CorruptShareError(reader.peerid,
3811+                                      reader.shnum,
3812+                                      "invalid privkey")
3813+                f = failure.Failure(e)
3814+                self._bad_shares.add((reader.peerid, reader.shnum, f))
3815             return
3816 
3817         # it's good
3818hunk ./src/allmydata/mutable/retrieve.py 904
3819         statements, I return the decrypted contents to the owner of this
3820         Retrieve object through self._done_deferred.
3821         """
3822-        eventually(self._done_deferred.callback, self._plaintext)
3823+        if self._verify:
3824+            ret = list(self._bad_shares)
3825+            self.log("done verifying, found %d bad shares" % len(ret))
3826+        else:
3827+            ret = self._plaintext
3828+        eventually(self._done_deferred.callback, ret)
3829 
3830 
3831     def _failed(self):
3832hunk ./src/allmydata/mutable/retrieve.py 920
3833         to the caller of this Retrieve object through
3834         self._done_deferred.
3835         """
3836-        format = ("ran out of peers: "
3837-                  "have %(have)d of %(total)d segments "
3838-                  "found %(bad)d bad shares "
3839-                  "encoding %(k)d-of-%(n)d")
3840-        args = {"have": self._current_segment,
3841-                "total": self._num_segments,
3842-                "k": self._required_shares,
3843-                "n": self._total_shares,
3844-                "bad": len(self._bad_shares)}
3845-        e = NotEnoughSharesError("%s, last failure: %s" % (format % args,
3846-                                                        str(self._last_failure)))
3847-        f = failure.Failure(e)
3848-        eventually(self._done_deferred.callback, f)
3849+        if self._verify:
3850+            ret = list(self._bad_shares)
3851+        else:
3852+            format = ("ran out of peers: "
3853+                      "have %(have)d of %(total)d segments "
3854+                      "found %(bad)d bad shares "
3855+                      "encoding %(k)d-of-%(n)d")
3856+            args = {"have": self._current_segment,
3857+                    "total": self._num_segments,
3858+                    "k": self._required_shares,
3859+                    "n": self._total_shares,
3860+                    "bad": len(self._bad_shares)}
3861+            e = NotEnoughSharesError("%s, last failure: %s" % \
3862+                                     (format % args, str(self._last_failure)))
3863+            f = failure.Failure(e)
3864+            ret = f
3865+        eventually(self._done_deferred.callback, ret)
3866}
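
(Illustrative sketch, not part of the patch: one way a checker could drive this
verify mode. The Retrieve.download() entry point and the surrounding wiring are
assumptions; what the patch itself establishes is that verify=True makes the
Deferred fire with a list of (peerid, shnum, failure) tuples, one per bad share.)

    from allmydata.mutable.retrieve import Retrieve

    def verify_version(node, servermap, verinfo):
        # verify=True: fetch every share, skip decode/decrypt, and report
        # which shares failed validation instead of returning plaintext.
        r = Retrieve(node, servermap, verinfo, verify=True)
        d = r.download()
        def _done(bad_shares):
            for (peerid, shnum, f) in bad_shares:
                # f typically wraps a CorruptShareError naming the problem
                print "bad share %d on %s: %s" % (shnum, peerid, f)
            return not bad_shares
        d.addCallback(_done)
        return d
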
3867[interfaces.py: add IMutableSlotWriter
3868Kevan Carstensen <kevan@isnotajoke.com>**20100630183305
3869 Ignore-this: ff9dca96ef1a009ae85485682f81ea5
3870] hunk ./src/allmydata/interfaces.py 418
3871         """
3872 
3873 
3874+class IMutableSlotWriter(Interface):
3875+    """
3876+    The interface for a writer around a mutable slot on a remote server.
3877+    """
3878+    def set_checkstring(checkstring, *args):
3879+        """
3880+        Set the checkstring that I will pass to the remote server when
3881+        writing.
3882+
3883+            @param checkstring A packed checkstring to use.
3884+
3885+        Note that implementations can differ in which semantics they
3886+        wish to support for set_checkstring -- they can, for example,
3887+        build the checkstring themselves from its constituents, or
3888+        some other thing.
3889+        """
3890+
3891+    def get_checkstring():
3892+        """
3893+        Get the checkstring that I think currently exists on the remote
3894+        server.
3895+        """
3896+
3897+    def put_block(data, segnum, salt):
3898+        """
3899+        Add a block and salt to the share.
3900+        """
3901+
3902+    def put_encprivkey(encprivkey):
3903+        """
3904+        Add the encrypted private key to the share.
3905+        """
3906+
3907+    def put_blockhashes(blockhashes=list):
3908+        """
3909+        Add the block hash tree to the share.
3910+        """
3911+
3912+    def put_sharehashes(sharehashes=dict):
3913+        """
3914+        Add the share hash chain to the share.
3915+        """
3916+
3917+    def get_signable():
3918+        """
3919+        Return the part of the share that needs to be signed.
3920+        """
3921+
3922+    def put_signature(signature):
3923+        """
3924+        Add the signature to the share.
3925+        """
3926+
3927+    def put_verification_key(verification_key):
3928+        """
3929+        Add the verification key to the share.
3930+        """
3931+
3932+    def finish_publishing():
3933+        """
3934+        Do anything necessary to finish writing the share to a remote
3935+        server. I require that no further publishing needs to take place
3936+        after this method has been called.
3937+        """
3938+
3939+
3940 class IURI(Interface):
3941     def init_from_string(uri):
3942         """Accept a string (as created by my to_string() method) and populate
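
(Illustrative sketch, not part of the patch: the rough calling order that an
IMutableSlotWriter implementation is meant to support, using only the methods
declared in the interface above. Writer construction, the actual data values,
and Deferred chaining are elided; each put_* call returns a Deferred in practice.)

    def publish_one_share(writer, blocks_and_salts, encprivkey,
                          blockhashes, sharehashes, sign, verification_key):
        # blocks_and_salts: list of (block, salt) pairs, one per segment
        for segnum, (block, salt) in enumerate(blocks_and_salts):
            writer.put_block(block, segnum, salt)
        writer.put_encprivkey(encprivkey)
        writer.put_blockhashes(blockhashes)     # list of hashes
        writer.put_sharehashes(sharehashes)     # dict: shnum -> hash
        writer.put_signature(sign(writer.get_signable()))
        writer.put_verification_key(verification_key)
        return writer.finish_publishing()
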
3943[test/test_mutable.py: temporarily disable two tests that are now irrelevant
3944Kevan Carstensen <kevan@isnotajoke.com>**20100701232806
3945 Ignore-this: 701e143567f3954812ca6960af1d6ac7
3946] {
3947hunk ./src/allmydata/test/test_mutable.py 651
3948             self.failUnlessEqual(len(share_ids), 10)
3949         d.addCallback(_done)
3950         return d
3951+    test_encrypt.todo = "Write an equivalent of this for the new uploader"
3952 
3953     def test_generate(self):
3954         nm = make_nodemaker()
3955hunk ./src/allmydata/test/test_mutable.py 713
3956                 self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
3957         d.addCallback(_generated)
3958         return d
3959+    test_generate.todo = "Write an equivalent of this for the new uploader"
3960 
3961     # TODO: when we publish to 20 peers, we should get one share per peer on 10
3962     # when we publish to 3 peers, we should get either 3 or 4 shares per peer
3963}
3964[Add MDMF reader and writer, and SDMF writer
3965Kevan Carstensen <kevan@isnotajoke.com>**20100702225531
3966 Ignore-this: bf6276a91d27dcb4e779b0eb82ea1843
3967 
3968 The MDMF/SDMF reader MDMF writer, and SDMF writer are similar to the
3969 object proxies that exist for immutable files. They abstract away
3970 details of connection, state, and caching from their callers (in this
3971 case, the download, servermap updater, and uploader), and expose methods
3972 to get and set information on the remote server.
3973 
3974 MDMFSlotReadProxy reads a mutable file from the server, doing the right
3975 thing (in most cases) regardless of whether the file is MDMF or SDMF. It
3976 allows callers to tell it how to batch and flush reads.
3977 
3978 MDMFSlotWriteProxy writes an MDMF mutable file to a server.
3979 
3980 SDMFSlotWriteProxy writes an SDMF mutable file to a server.
3981 
3982 This patch also includes tests for MDMFSlotReadProxy,
3983 SDMFSlotWriteProxy, and MDMFSlotWriteProxy.
3984] {
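
(Illustrative sketch, not part of the patch: constructing the MDMF write proxy
defined below for a single share. All values are placeholders; in the real
publisher the remote reference, storage index, and secrets come from the node
and the storage broker.)

    from allmydata.mutable.layout import MDMFSlotWriteProxy

    secrets = ("\x00" * 32, "\x11" * 32, "\x22" * 32)  # write enabler, renew, cancel
    writer = MDMFSlotWriteProxy(0,             # shnum
                                None,          # rref: a storage server RemoteReference
                                "\x33" * 16,   # storage index (16 bytes)
                                secrets,
                                1,             # seqnum
                                3,             # required_shares (k)
                                10,            # total_shares (N)
                                131073,        # segment size; must be a multiple of k
                                1000000)       # data length of the original file
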
3985hunk ./src/allmydata/mutable/layout.py 4
3986 
3987 import struct
3988 from allmydata.mutable.common import NeedMoreDataError, UnknownVersionError
3989+from allmydata.interfaces import HASH_SIZE, SALT_SIZE, SDMF_VERSION, \
3990+                                 MDMF_VERSION, IMutableSlotWriter
3991+from allmydata.util import mathutil, observer
3992+from twisted.python import failure
3993+from twisted.internet import defer
3994+from zope.interface import implements
3995+
3996+
3997+# These strings describe the format of the packed structs they help process
3998+# Here's what they mean:
3999+#
4000+#  PREFIX:
4001+#    >: Big-endian byte order; the most significant byte is first (leftmost).
4002+#    B: The version information; an 8-bit version identifier. Stored as
4003+#       an unsigned char. This is currently 0 (SDMF); our modifications
4004+#       will turn it into 1 (MDMF).
4005+#    Q: The sequence number; this acts as a revision number for mutable
4006+#       files. It starts at 1 and increases each time the file is changed
4007+#       after being uploaded. Stored as an unsigned long long, which is 8
4008+#       bytes in length.
4009+#  32s: The root hash of the share hash tree. We use sha-256d, so we use 32
4010+#       characters = 32 bytes to store the value.
4011+#  16s: The salt for the readkey. This is a 16-byte random value, stored as
4012+#       16 characters.
4013+#
4014+#  SIGNED_PREFIX additions, things that are covered by the signature:
4015+#    B: The "k" encoding parameter. We store this as an 8-bit character,
4016+#       which is convenient because our erasure coding scheme cannot
4017+#       encode if you ask for more than 255 pieces.
4018+#    B: The "N" encoding parameter. Stored as an 8-bit character for the
4019+#       same reasons as above.
4020+#    Q: The segment size of the uploaded file. This will essentially be the
4021+#       length of the file in SDMF. An unsigned long long, so we can store
4022+#       files of quite large size.
4023+#    Q: The data length of the uploaded file. Modulo padding, this will be
4024+#       the same as the segment size field. Like the segment size field, it is
4025+#       an unsigned long long and can be quite large.
4026+#
4027+#   HEADER additions:
4028+#     L: The offset of the signature of this. An unsigned long.
4029+#     L: The offset of the share hash chain. An unsigned long.
4030+#     L: The offset of the block hash tree. An unsigned long.
4031+#     L: The offset of the share data. An unsigned long.
4032+#     Q: The offset of the encrypted private key. An unsigned long long, to
4033+#        account for the possibility of a lot of share data.
4034+#     Q: The offset of the EOF. An unsigned long long, to account for the
4035+#        possibility of a lot of share data.
4036+#
4037+#  After all of these, we have the following:
4038+#    - The verification key: Occupies the space between the end of the header
4039+#      and the start of the signature (i.e. data[HEADER_LENGTH:o['signature']]).
4040+#    - The signature, which goes from the signature offset to the share hash
4041+#      chain offset.
4042+#    - The share hash chain, which goes from the share hash chain offset to
4043+#      the block hash tree offset.
4044+#    - The share data, which goes from the share data offset to the encrypted
4045+#      private key offset.
4046+#    - The encrypted private key, which goes from its offset until the end of the file.
4047+#
4048+#  The block hash tree in this encoding has only one leaf, so the offset of
4049+#  the share data will be 32 bytes more than the offset of the block hash tree.
4050+#  Given this, we may need to check to see how many bytes a reasonably sized
4051+#  block hash tree will take up.
4052 
4053 PREFIX = ">BQ32s16s" # each version has a different prefix
4054 SIGNED_PREFIX = ">BQ32s16s BBQQ" # this is covered by the signature
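
(Illustrative sketch, not part of the patch: the SDMF prefix/checkstring layout
documented above, exercised directly with the struct module. The field values
are made up.)

    import struct

    PREFIX = ">BQ32s16s"   # version, seqnum, root hash, IV/salt
    checkstring = struct.pack(PREFIX, 0, 3, "\x11" * 32, "\x22" * 16)
    assert len(checkstring) == struct.calcsize(PREFIX) == 57
    version, seqnum, root_hash, salt = struct.unpack(PREFIX, checkstring)
    assert (version, seqnum) == (0, 3)
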
4055hunk ./src/allmydata/mutable/layout.py 73
4056 SIGNED_PREFIX_LENGTH = struct.calcsize(SIGNED_PREFIX)
4057 HEADER = ">BQ32s16s BBQQ LLLLQQ" # includes offsets
4058 HEADER_LENGTH = struct.calcsize(HEADER)
4059+OFFSETS = ">LLLLQQ"
4060+OFFSETS_LENGTH = struct.calcsize(OFFSETS)
4061 
4062 def unpack_header(data):
4063     o = {}
4064hunk ./src/allmydata/mutable/layout.py 194
4065     return (share_hash_chain, block_hash_tree, share_data)
4066 
4067 
4068-def pack_checkstring(seqnum, root_hash, IV):
4069+def pack_checkstring(seqnum, root_hash, IV, version=0):
4070     return struct.pack(PREFIX,
4071hunk ./src/allmydata/mutable/layout.py 196
4072-                       0, # version,
4073+                       version,
4074                        seqnum,
4075                        root_hash,
4076                        IV)
4077hunk ./src/allmydata/mutable/layout.py 269
4078                            encprivkey])
4079     return final_share
4080 
4081+def pack_prefix(seqnum, root_hash, IV,
4082+                required_shares, total_shares,
4083+                segment_size, data_length):
4084+    prefix = struct.pack(SIGNED_PREFIX,
4085+                         0, # version,
4086+                         seqnum,
4087+                         root_hash,
4088+                         IV,
4089+                         required_shares,
4090+                         total_shares,
4091+                         segment_size,
4092+                         data_length,
4093+                         )
4094+    return prefix
4095+
4096+
4097+class SDMFSlotWriteProxy:
4098+    implements(IMutableSlotWriter)
4099+    """
4100+    I represent a remote write slot for an SDMF mutable file. I build a
4101+    share in memory, and then write it in one piece to the remote
4102+    server. This mimics how SDMF shares were built before MDMF (and the
4103+    new MDMF uploader), but provides that functionality in a way that
4104+    allows the MDMF uploader to be built without much special-casing for
4105+    file format, which makes the uploader code more readable.
4106+    """
4107+    def __init__(self,
4108+                 shnum,
4109+                 rref, # a remote reference to a storage server
4110+                 storage_index,
4111+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4112+                 seqnum, # the sequence number of the mutable file
4113+                 required_shares,
4114+                 total_shares,
4115+                 segment_size,
4116+                 data_length): # the length of the original file
4117+        self.shnum = shnum
4118+        self._rref = rref
4119+        self._storage_index = storage_index
4120+        self._secrets = secrets
4121+        self._seqnum = seqnum
4122+        self._required_shares = required_shares
4123+        self._total_shares = total_shares
4124+        self._segment_size = segment_size
4125+        self._data_length = data_length
4126+
4127+        # This is an SDMF file, so it should have only one segment, so,
4128+        # modulo padding of the data length, the segment size and the
4129+        # data length should be the same.
4130+        expected_segment_size = mathutil.next_multiple(data_length,
4131+                                                       self._required_shares)
4132+        assert expected_segment_size == segment_size
4133+
4134+        self._block_size = self._segment_size / self._required_shares
4135+
4136+        # This is meant to mimic how SDMF files were built before MDMF
4137+        # entered the picture: we generate each share in its entirety,
4138+        # then push it off to the storage server in one write. When
4139+        # callers call put_*, they are just populating this dict.
4140+        # finish_publishing will stitch these pieces together into a
4141+        # coherent share, and then write the coherent share to the
4142+        # storage server.
4143+        self._share_pieces = {}
4144+
4145+        # This tells the write logic what checkstring to use when
4146+        # writing remote shares.
4147+        self._testvs = []
4148+
4149+        self._readvs = [(0, struct.calcsize(PREFIX))]
4150+
4151+
4152+    def set_checkstring(self, checkstring_or_seqnum,
4153+                              root_hash=None,
4154+                              salt=None):
4155+        """
4156+        Set the checkstring that I will pass to the remote server when
4157+        writing.
4158+
4159+            @param checkstring_or_seqnum: A packed checkstring to use,
4160+                   or a sequence number. If root_hash and salt are also given, I will build the checkstring from them; otherwise I treat this value as the packed checkstring.
4161+
4162+        Note that implementations can differ in which semantics they
4163+        wish to support for set_checkstring -- they can, for example,
4164+        build the checkstring themselves from its constituents, or
4165+        some other thing.
4166+        """
4167+        if root_hash and salt:
4168+            checkstring = struct.pack(PREFIX,
4169+                                      0,
4170+                                      checkstring_or_seqnum,
4171+                                      root_hash,
4172+                                      salt)
4173+        else:
4174+            checkstring = checkstring_or_seqnum
4175+        self._testvs = [(0, len(checkstring), "eq", checkstring)]
4176+
4177+
4178+    def get_checkstring(self):
4179+        """
4180+        Get the checkstring that I think currently exists on the remote
4181+        server.
4182+        """
4183+        if self._testvs:
4184+            return self._testvs[0][3]
4185+        return ""
4186+
4187+
4188+    def put_block(self, data, segnum, salt):
4189+        """
4190+        Add a block and salt to the share.
4191+        """
4192+        # SDMF files have only one segment
4193+        assert segnum == 0
4194+        assert len(data) == self._block_size
4195+        assert len(salt) == SALT_SIZE
4196+
4197+        self._share_pieces['sharedata'] = data
4198+        self._share_pieces['salt'] = salt
4199+
4200+        # TODO: Figure out something intelligent to return.
4201+        return defer.succeed(None)
4202+
4203+
4204+    def put_encprivkey(self, encprivkey):
4205+        """
4206+        Add the encrypted private key to the share.
4207+        """
4208+        self._share_pieces['encprivkey'] = encprivkey
4209+
4210+        return defer.succeed(None)
4211+
4212+
4213+    def put_blockhashes(self, blockhashes):
4214+        """
4215+        Add the block hash tree to the share.
4216+        """
4217+        assert isinstance(blockhashes, list)
4218+        for h in blockhashes:
4219+            assert len(h) == HASH_SIZE
4220+
4221+        # serialize the blockhashes, then set them.
4222+        blockhashes_s = "".join(blockhashes)
4223+        self._share_pieces['block_hash_tree'] = blockhashes_s
4224+
4225+        return defer.succeed(None)
4226+
4227+
4228+    def put_sharehashes(self, sharehashes):
4229+        """
4230+        Add the share hash chain to the share.
4231+        """
4232+        assert isinstance(sharehashes, dict)
4233+        for h in sharehashes.itervalues():
4234+            assert len(h) == HASH_SIZE
4235+
4236+        # serialize the sharehashes, then set them.
4237+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4238+                                 for i in sorted(sharehashes.keys())])
4239+        self._share_pieces['share_hash_chain'] = sharehashes_s
4240+
4241+        return defer.succeed(None)
4242+
4243+
4244+    def put_root_hash(self, root_hash):
4245+        """
4246+        Add the root hash to the share.
4247+        """
4248+        assert len(root_hash) == HASH_SIZE
4249+
4250+        self._share_pieces['root_hash'] = root_hash
4251+
4252+        return defer.succeed(None)
4253+
4254+
4255+    def put_salt(self, salt):
4256+        """
4257+        Add a salt to an empty SDMF file.
4258+        """
4259+        assert len(salt) == SALT_SIZE
4260+
4261+        self._share_pieces['salt'] = salt
4262+        self._share_pieces['sharedata'] = ""
4263+
4264+
4265+    def get_signable(self):
4266+        """
4267+        Return the part of the share that needs to be signed.
4268+
4269+        SDMF writers need to sign the packed representation of the
4270+        first eight fields of the remote share, that is:
4271+            - version number (0)
4272+            - sequence number
4273+            - root of the share hash tree
4274+            - salt
4275+            - k
4276+            - n
4277+            - segsize
4278+            - datalen
4279+
4280+        This method is responsible for returning that to callers.
4281+        """
4282+        return struct.pack(SIGNED_PREFIX,
4283+                           0,
4284+                           self._seqnum,
4285+                           self._share_pieces['root_hash'],
4286+                           self._share_pieces['salt'],
4287+                           self._required_shares,
4288+                           self._total_shares,
4289+                           self._segment_size,
4290+                           self._data_length)
4291+
4292+
4293+    def put_signature(self, signature):
4294+        """
4295+        Add the signature to the share.
4296+        """
4297+        self._share_pieces['signature'] = signature
4298+
4299+        return defer.succeed(None)
4300+
4301+
4302+    def put_verification_key(self, verification_key):
4303+        """
4304+        Add the verification key to the share.
4305+        """
4306+        self._share_pieces['verification_key'] = verification_key
4307+
4308+        return defer.succeed(None)
4309+
4310+
4311+    def get_verinfo(self):
4312+        """
4313+        I return my verinfo tuple. This is used by the ServermapUpdater
4314+        to keep track of versions of mutable files.
4315+
4316+        The verinfo tuple for MDMF files contains:
4317+            - seqnum
4318+            - root hash
4319+            - a blank (nothing)
4320+            - segsize
4321+            - datalen
4322+            - k
4323+            - n
4324+            - prefix (the thing that you sign)
4325+            - a tuple of offsets
4326+
4327+        We include the nonce in MDMF to simplify processing of version
4328+        information tuples.
4329+
4330+        The verinfo tuple for SDMF files is the same, but contains a
4331+        16-byte IV instead of a hash of salts.
4332+        """
4333+        return (self._seqnum,
4334+                self._share_pieces['root_hash'],
4335+                self._share_pieces['salt'],
4336+                self._segment_size,
4337+                self._data_length,
4338+                self._required_shares,
4339+                self._total_shares,
4340+                self.get_signable(),
4341+                self._get_offsets_tuple())
4342+
4343+    def _get_offsets_dict(self):
4344+        post_offset = HEADER_LENGTH
4345+        offsets = {}
4346+
4347+        verification_key_length = len(self._share_pieces['verification_key'])
4348+        o1 = offsets['signature'] = post_offset + verification_key_length
4349+
4350+        signature_length = len(self._share_pieces['signature'])
4351+        o2 = offsets['share_hash_chain'] = o1 + signature_length
4352+
4353+        share_hash_chain_length = len(self._share_pieces['share_hash_chain'])
4354+        o3 = offsets['block_hash_tree'] = o2 + share_hash_chain_length
4355+
4356+        block_hash_tree_length = len(self._share_pieces['block_hash_tree'])
4357+        o4 = offsets['share_data'] = o3 + block_hash_tree_length
4358+
4359+        share_data_length = len(self._share_pieces['sharedata'])
4360+        o5 = offsets['enc_privkey'] = o4 + share_data_length
4361+
4362+        encprivkey_length = len(self._share_pieces['encprivkey'])
4363+        offsets['EOF'] = o5 + encprivkey_length
4364+        return offsets
4365+
4366+
4367+    def _get_offsets_tuple(self):
4368+        offsets = self._get_offsets_dict()
4369+        return tuple([(key, value) for key, value in offsets.items()])
4370+
4371+
4372+    def _pack_offsets(self):
4373+        offsets = self._get_offsets_dict()
4374+        return struct.pack(">LLLLQQ",
4375+                           offsets['signature'],
4376+                           offsets['share_hash_chain'],
4377+                           offsets['block_hash_tree'],
4378+                           offsets['share_data'],
4379+                           offsets['enc_privkey'],
4380+                           offsets['EOF'])
4381+
4382+
4383+    def finish_publishing(self):
4384+        """
4385+        Do anything necessary to finish writing the share to a remote
4386+        server. I require that no further publishing needs to take place
4387+        after this method has been called.
4388+        """
4389+        for k in ["sharedata", "encprivkey", "signature", "verification_key",
4390+                  "share_hash_chain", "block_hash_tree"]:
4391+            assert k in self._share_pieces
4392+        # This is the only method that actually writes something to the
4393+        # remote server.
4394+        # First, we need to pack the share into data that we can write
4395+        # to the remote server in one write.
4396+        offsets = self._pack_offsets()
4397+        prefix = self.get_signable()
4398+        final_share = "".join([prefix,
4399+                               offsets,
4400+                               self._share_pieces['verification_key'],
4401+                               self._share_pieces['signature'],
4402+                               self._share_pieces['share_hash_chain'],
4403+                               self._share_pieces['block_hash_tree'],
4404+                               self._share_pieces['sharedata'],
4405+                               self._share_pieces['encprivkey']])
4406+
4407+        # Our only data vector is going to be writing the final share,
4408+        # in its entirety.
4409+        datavs = [(0, final_share)]
4410+
4411+        if not self._testvs:
4412+            # Our caller has not provided us with another checkstring
4413+            # yet, so we assume that we are writing a new share, and set
4414+            # a test vector that will allow a new share to be written.
4415+            self._testvs = []
4416+            self._testvs.append(tuple([0, 1, "eq", ""]))
4417+            new_share = True
4418+
4419+        tw_vectors = {}
4420+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
4421+        return self._rref.callRemote("slot_testv_and_readv_and_writev",
4422+                                     self._storage_index,
4423+                                     self._secrets,
4424+                                     tw_vectors,
4425+                                     # TODO is it useful to read something?
4426+                                     self._readvs)
4427+
4428+
4429+MDMFHEADER = ">BQ32sBBQQ QQQQQQ"
4430+MDMFHEADERWITHOUTOFFSETS = ">BQ32sBBQQ"
4431+MDMFHEADERSIZE = struct.calcsize(MDMFHEADER)
4432+MDMFHEADERWITHOUTOFFSETSSIZE = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4433+MDMFCHECKSTRING = ">BQ32s"
4434+MDMFSIGNABLEHEADER = ">BQ32sBBQQ"
4435+MDMFOFFSETS = ">QQQQQQ"
4436+MDMFOFFSETS_LENGTH = struct.calcsize(MDMFOFFSETS)
4437+
4438+class MDMFSlotWriteProxy:
4439+    implements(IMutableSlotWriter)
4440+
4441+    """
4442+    I represent a remote write slot for an MDMF mutable file.
4443+
4444+    I abstract away from my caller the details of block and salt
4445+    management, and the implementation of the on-disk format for MDMF
4446+    shares.
4447+    """
4448+    # Expected layout, MDMF:
4449+    # offset:     size:       name:
4450+    #-- signed part --
4451+    # 0           1           version number (01)
4452+    # 1           8           sequence number
4453+    # 9           32          share tree root hash
4454+    # 41          1           The "k" encoding parameter
4455+    # 42          1           The "N" encoding parameter
4456+    # 43          8           The segment size of the uploaded file
4457+    # 51          8           The data length of the original plaintext
4458+    #-- end signed part --
4459+    # 59          8           The offset of the encrypted private key
4460+    # 67          8           The offset of the block hash tree
4461+    # 75          8           The offset of the share hash chain
4462+    # 83          8           The offset of the signature
4463+    # 91          8           The offset of the verification key
4464+    # 99          8           The offset of the EOF
4465+    #
4466+    # followed by salts and share data, the encrypted private key, the
4467+    # block hash tree, the salt hash tree, the share hash chain, a
4468+    # signature over the first eight fields, and a verification key.
4469+    #
4470+    # The checkstring is the first three fields -- the version number,
4471+    # sequence number, and root hash. This is consistent in meaning
4472+    # with what we have in SDMF files, except that now, instead of
4473+    # using the literal salt, we use a value derived from all of the
4474+    # salts -- the share hash root.
4475+    #
4476+    # The salt is stored before the block for each segment. The block
4477+    # hash tree is computed over the combination of block and salt for
4478+    # each segment. In this way, we get integrity checking for both
4479+    # block and salt with the current block hash tree arrangement.
4480+    #
4481+    # The ordering of the offsets is different to reflect the dependencies
4482+    # that we'll run into with an MDMF file. The expected write flow is
4483+    # something like this:
4484+    #
4485+    #   0: Initialize with the sequence number, encoding parameters and
4486+    #      data length. From this, we can deduce the number of segments,
4487+    #      and where they should go. We can also figure out where the
4488+    #      encrypted private key should go, because we can figure out how
4489+    #      big the share data will be.
4490+    #
4491+    #   1: Encrypt, encode, and upload the file in chunks. Do something
4492+    #      like
4493+    #
4494+    #       put_block(data, segnum, salt)
4495+    #
4496+    #      to write a block and a salt to the disk. We can do both of
4497+    #      these operations now because we have enough of the offsets to
4498+    #      know where to put them.
4499+    #
4500+    #   2: Put the encrypted private key. Use:
4501+    #
4502+    #        put_encprivkey(encprivkey)
4503+    #
4504+    #      Now that we know the length of the private key, we can fill
4505+    #      in the offset for the block hash tree.
4506+    #
4507+    #   3: We're now in a position to upload the block hash tree for
4508+    #      a share. Put that using something like:
4509+    #       
4510+    #        put_blockhashes(block_hash_tree)
4511+    #
4512+    #      Note that block_hash_tree is a list of hashes -- we'll take
4513+    #      care of the details of serializing that appropriately. When
4514+    #      we get the block hash tree, we are also in a position to
4515+    #      calculate the offset for the share hash chain, and fill that
4516+    #      into the offsets table.
4517+    #
4518+    #   4: At the same time, we're in a position to upload the salt hash
4519+    #      tree. This is a Merkle tree over all of the salts. We use a
4520+    #      Merkle tree so that we can validate each block,salt pair as
4521+    #      we download them later. We do this using
4522+    #
4523+    #        put_salthashes(salt_hash_tree)
4524+    #
4525+    #      When you do this, I automatically put the root of the tree
4526+    #      (the hash at index 0 of the list) in its appropriate slot in
4527+    #      the signed prefix of the share.
4528+    #
4529+    #   5: We're now in a position to upload the share hash chain for
4530+    #      a share. Do that with something like:
4531+    #     
4532+    #        put_sharehashes(share_hash_chain)
4533+    #
4534+    #      share_hash_chain should be a dictionary mapping shnums to
4535+    #      32-byte hashes -- the wrapper handles serialization.
4536+    #      We'll know where to put the signature at this point, also.
4537+    #      The root of this tree will be put explicitly in the next
4538+    #      step.
4539+    #
4540+    #      TODO: Why? Why not just include it in the tree here?
4541+    #
4542+    #   6: Before putting the signature, we must first put the
4543+    #      root_hash. Do this with:
4544+    #
4545+    #        put_root_hash(root_hash).
4546+    #     
4547+    #      In terms of knowing where to put this value, it was always
4548+    #      possible to place it, but it makes sense semantically to
4549+    #      place it after the share hash tree, so that's why you do it
4550+    #      in this order.
4551+    #
4552+    #   7: With the root hash put, we can now sign the header. Use:
4553+    #
4554+    #        get_signable()
4555+    #
4556+    #      to get the part of the header that you want to sign, and use:
4557+    #       
4558+    #        put_signature(signature)
4559+    #
4560+    #      to write your signature to the remote server.
4561+    #
4562+    #   8: Add the verification key, and finish. Do:
4563+    #
4564+    #        put_verification_key(key)
4565+    #
4566+    #      and
4567+    #
4568+    #        finish_publishing()
4569+    #
4570+    # Checkstring management:
4571+    #
4572+    # To write to a mutable slot, we have to provide test vectors to ensure
4573+    # that we are writing to the same data that we think we are. These
4574+    # vectors allow us to detect uncoordinated writes; that is, writes
4575+    # where both we and some other shareholder are writing to the
4576+    # mutable slot, and to report those back to the parts of the program
4577+    # doing the writing.
4578+    #
4579+    # With SDMF, this was easy -- all of the share data was written in
4580+    # one go, so it was easy to detect uncoordinated writes, and we only
4581+    # had to do it once. With MDMF, not all of the file is written at
4582+    # once.
4583+    #
4584+    # If a share is new, we write out as much of the header as we can
4585+    # before writing out anything else. This gives other writers a
4586+    # canary that they can use to detect uncoordinated writes, and, if
4587+    # they do the same thing, gives us the same canary. We then update
4588+    # the share. We won't be able to write out two fields of the header
4589+    # -- the share tree hash and the salt hash -- until we finish
4590+    # writing out the share. We only require the writer to provide the
4591+    # initial checkstring, and keep track of what it should be after
4592+    # updates ourselves.
4593+    #
4594+    # If we haven't written anything yet, then on the first write (which
4595+    # will probably be a block + salt of a share), we'll also write out
4596+    # the header. On subsequent passes, we'll expect to see the header.
4597+    # This changes in two places:
4598+    #
4599+    #   - When we write out the salt hash
4600+    #   - When we write out the root of the share hash tree
4601+    #
4602+    # since these values will change the header. It is possible that we
4603+    # can just make those be written in one operation to minimize
4604+    # disruption.
4605+    def __init__(self,
4606+                 shnum,
4607+                 rref, # a remote reference to a storage server
4608+                 storage_index,
4609+                 secrets, # (write_enabler, renew_secret, cancel_secret)
4610+                 seqnum, # the sequence number of the mutable file
4611+                 required_shares,
4612+                 total_shares,
4613+                 segment_size,
4614+                 data_length): # the length of the original file
4615+        self.shnum = shnum
4616+        self._rref = rref
4617+        self._storage_index = storage_index
4618+        self._seqnum = seqnum
4619+        self._required_shares = required_shares
4620+        assert self.shnum >= 0 and self.shnum < total_shares
4621+        self._total_shares = total_shares
4622+        # We build up the offset table as we write things. It is the
4623+        # last thing we write to the remote server.
4624+        self._offsets = {}
4625+        self._testvs = []
4626+        self._secrets = secrets
4627+        # The segment size needs to be a multiple of the k parameter --
4628+        # any padding should have been carried out by the publisher
4629+        # already.
4630+        assert segment_size % required_shares == 0
4631+        self._segment_size = segment_size
4632+        self._data_length = data_length
4633+
4634+        # These are set later -- we define them here so that we can
4635+        # check for their existence easily
4636+
4637+        # This is the root of the share hash tree -- the Merkle tree
4638+        # over the roots of the block hash trees computed for shares in
4639+        # this upload.
4640+        self._root_hash = None
4641+
4642+        # We haven't yet written anything to the remote bucket. By
4643+        # setting this, we tell the _write method as much. The write
4644+        # method will then know that it also needs to add a write vector
4645+        # for the checkstring (or what we have of it) to the first write
4646+        # request. We'll then record that value for future use.  If
4647+        # we're expecting something to be there already, we need to call
4648+        # set_checkstring before we write anything to tell the first
4649+        # write about that.
4650+        self._written = False
4651+
4652+        # When writing data to the storage servers, we get a read vector
4653+        # for free. We'll read the checkstring, which will help us
4654+        # figure out what's gone wrong if a write fails.
4655+        self._readv = [(0, struct.calcsize(MDMFCHECKSTRING))]
4656+
4657+        # We calculate the number of segments because it tells us
4658+        # where the salt part of the file ends/share segment begins,
4659+        # and also because it provides a useful amount of bounds checking.
4660+        self._num_segments = mathutil.div_ceil(self._data_length,
4661+                                               self._segment_size)
4662+        self._block_size = self._segment_size / self._required_shares
4663+        # We also calculate the share size, to help us with block
4664+        # constraints later.
4665+        tail_size = self._data_length % self._segment_size
4666+        if not tail_size:
4667+            self._tail_block_size = self._block_size
4668+        else:
4669+            self._tail_block_size = mathutil.next_multiple(tail_size,
4670+                                                           self._required_shares)
4671+            self._tail_block_size /= self._required_shares
4672+
4673+        # We already know where the sharedata starts; right after the end
4674+        # of the header (which is defined as the signable part + the offsets).
4675+        # We can also calculate where the encrypted private key begins
4676+        # from what we now know.
4677+        self._actual_block_size = self._block_size + SALT_SIZE
4678+        data_size = self._actual_block_size * (self._num_segments - 1)
4679+        data_size += self._tail_block_size
4680+        data_size += SALT_SIZE
4681+        self._offsets['enc_privkey'] = MDMFHEADERSIZE
4682+        self._offsets['enc_privkey'] += data_size
4683+        # We'll wait for the rest. Callers can now call my "put_block" and
4684+        # "set_checkstring" methods.
4685+
4686+
4687+    def set_checkstring(self,
4688+                        seqnum_or_checkstring,
4689+                        root_hash=None,
4690+                        salt=None):
4691+        """
4692+        Set the checkstring for the given shnum.
4693+
4694+        This can be invoked in one of two ways.
4695+
4696+        With one argument, I assume that you are giving me a literal
4697+        checkstring -- e.g., the output of get_checkstring. I will then
4698+        set that checkstring as it is. This form is used by unit tests.
4699+
4700+        With two arguments, I assume that you are giving me a sequence
4701+        number and root hash to make a checkstring from. In that case, I
4702+        will build a checkstring and set it for you. This form is used
4703+        by the publisher.
4704+
4705+        By default, I assume that I am writing new shares to the grid.
4706+        If you don't explicitly set your own checkstring, I will use
4707+        one that requires that the remote share not exist. You will want
4708+        to use this method if you are updating a share in-place;
4709+        otherwise, writes will fail.
4710+        """
4711+        # You're allowed to overwrite checkstrings with this method;
4712+        # I assume that users know what they are doing when they call
4713+        # it.
4714+        if root_hash:
4715+            checkstring = struct.pack(MDMFCHECKSTRING,
4716+                                      1,
4717+                                      seqnum_or_checkstring,
4718+                                      root_hash)
4719+        else:
4720+            checkstring = seqnum_or_checkstring
4721+
4722+        if checkstring == "":
4723+            # We special-case this, since len("") = 0, but we need
4724+            # length of 1 for the case of an empty share to work on the
4725+            # storage server, which is what a checkstring that is the
4726+            # empty string means.
4727+            self._testvs = []
4728+        else:
4729+            self._testvs = []
4730+            self._testvs.append((0, len(checkstring), "eq", checkstring))
4731+
4732+
4733+    def __repr__(self):
4734+        return "MDMFSlotWriteProxy for share %d" % self.shnum
4735+
4736+
4737+    def get_checkstring(self):
4738+        """
4739+        I return a representation of what the checkstring for this
4740+        share on the server will look like.
4741+
4742+        I am mostly used for tests.
4743+        """
4744+        if self._root_hash:
4745+            roothash = self._root_hash
4746+        else:
4747+            roothash = "\x00" * 32
4748+        return struct.pack(MDMFCHECKSTRING,
4749+                           1,
4750+                           self._seqnum,
4751+                           roothash)
4752+
4753+
4754+    def put_block(self, data, segnum, salt):
4755+        """
4756+        Put the encrypted-and-encoded data segment in the slot, along
4757+        with the salt.
4758+        """
4759+        if segnum >= self._num_segments:
4760+            raise LayoutInvalid("I won't overwrite the private key")
4761+        if len(salt) != SALT_SIZE:
4762+            raise LayoutInvalid("I was given a salt of size %d, but "
4763+                                "I wanted a salt of size %d")
4764+        if segnum + 1 == self._num_segments:
4765+            if len(data) != self._tail_block_size:
4766+                raise LayoutInvalid("I was given the wrong size block to write")
4767+        elif len(data) != self._block_size:
4768+            raise LayoutInvalid("I was given the wrong size block to write")
4769+
4770+        # We want to write at MDMFHEADERSIZE + segnum * (block size + salt size).
4771+
4772+        offset = MDMFHEADERSIZE + (self._actual_block_size * segnum)
4773+        data = salt + data
4774+
4775+        datavs = [tuple([offset, data])]
4776+        return self._write(datavs)
4777+
4778+
4779+    def put_encprivkey(self, encprivkey):
4780+        """
4781+        Put the encrypted private key in the remote slot.
4782+        """
4783+        assert self._offsets
4784+        assert self._offsets['enc_privkey']
4785+        # You shouldn't re-write the encprivkey after the block hash
4786+        # tree is written, since that could cause the private key to run
4787+        # into the block hash tree. Before it writes the block hash
4788+        # tree, the block hash tree writing method writes the offset of
4789+        # the share hash chain. So that's a good indicator of whether or
4790+        # not the block hash tree has been written.
4791+        if "share_hash_chain" in self._offsets:
4792+            raise LayoutInvalid("You must write this before the block hash tree")
4793+
4794+        self._offsets['block_hash_tree'] = self._offsets['enc_privkey'] + len(encprivkey)
4795+        datavs = [(tuple([self._offsets['enc_privkey'], encprivkey]))]
4796+        def _on_failure():
4797+            del(self._offsets['block_hash_tree'])
4798+        return self._write(datavs, on_failure=_on_failure)
4799+
4800+
4801+    def put_blockhashes(self, blockhashes):
4802+        """
4803+        Put the block hash tree in the remote slot.
4804+
4805+        The encrypted private key must be put before the block hash
4806+        tree, since we need to know how large it is to know where the
4807+        block hash tree should go. The block hash tree must be put
4808+        before the share hash chain, since its size determines the
4809+        offset of the share hash chain.
4810+        """
4811+        assert self._offsets
4812+        assert isinstance(blockhashes, list)
4813+        if "block_hash_tree" not in self._offsets:
4814+            raise LayoutInvalid("You must put the encrypted private key "
4815+                                "before you put the block hash tree")
4816+        # If written, the share hash chain causes the signature offset
4817+        # to be defined.
4818+        if "signature" in self._offsets:
4819+            raise LayoutInvalid("You must put the block hash tree before "
4820+                                "you put the share hash chain")
4821+        blockhashes_s = "".join(blockhashes)
4822+        self._offsets['share_hash_chain'] = self._offsets['block_hash_tree'] + len(blockhashes_s)
4823+        datavs = []
4824+        datavs.append(tuple([self._offsets['block_hash_tree'], blockhashes_s]))
4825+        def _on_failure():
4826+            del(self._offsets['share_hash_chain'])
4827+        return self._write(datavs, on_failure=_on_failure)
4828+
4829+
4830+    def put_sharehashes(self, sharehashes):
4831+        """
4832+        Put the share hash chain in the remote slot.
4833+
4834+        The block hash tree must be put before the share hash chain,
4835+        since we need to know where the block hash tree ends before we
4836+        can know where the share hash chain starts. The share hash chain
4837+        must be put before the signature, since the length of the packed
4838+        share hash chain determines the offset of the signature. Also,
4839+        semantically, you need the share hashes to compute the root
4840+        hash, which you must know before you can generate a valid signature.
4841+        """
4842+        assert isinstance(sharehashes, dict)
4843+        if "share_hash_chain" not in self._offsets:
4844+            raise LayoutInvalid("You need to put the salt hash tree before "
4845+                                "you can put the share hash chain")
4846+        # The signature comes after the share hash chain. If the
4847+        # signature has already been written, we must not write another
4848+        # share hash chain. The signature writes the verification key
4849+        # offset when it gets sent to the remote server, so we look for
4850+        # that.
4851+        if "verification_key" in self._offsets:
4852+            raise LayoutInvalid("You must write the share hash chain "
4853+                                "before you write the signature")
4854+        datavs = []
4855+        sharehashes_s = "".join([struct.pack(">H32s", i, sharehashes[i])
4856+                                  for i in sorted(sharehashes.keys())])
4857+        self._offsets['signature'] = self._offsets['share_hash_chain'] + len(sharehashes_s)
4858+        datavs.append(tuple([self._offsets['share_hash_chain'], sharehashes_s]))
4859+        def _on_failure():
4860+            del(self._offsets['signature'])
4861+        return self._write(datavs, on_failure=_on_failure)
4862+
4863+
4864+    def put_root_hash(self, roothash):
4865+        """
4866+        Put the root hash (the root of the share hash tree) in the
4867+        remote slot.
4868+        """
4869+        # It does not make sense to be able to put the root
4870+        # hash without first putting the share hashes, since you need
4871+        # the share hashes to generate the root hash.
4872+        #
4873+        # Signature is defined by the routine that places the share hash
4874+        # chain, so it's a good thing to look for in finding out whether
4875+        # or not the share hash chain exists on the remote server.
4876+        if "signature" not in self._offsets:
4877+            raise LayoutInvalid("You need to put the share hash chain "
4878+                                "before you can put the root share hash")
4879+        if len(roothash) != HASH_SIZE:
4880+            raise LayoutInvalid("hashes and salts must be exactly %d bytes"
4881+                                 % HASH_SIZE)
4882+        datavs = []
4883+        self._root_hash = roothash
4884+        # To write the root hash, we rewrite the checkstring on the
4885+        # remote server, which includes it.
4886+        checkstring = self.get_checkstring()
4887+        datavs.append(tuple([0, checkstring]))
4888+        # This write, if successful, changes the checkstring, so we need
4889+        # to update our internal checkstring to be consistent with the
4890+        # one on the server.
4891+        def _on_success():
4892+            self._testvs = [(0, len(checkstring), "eq", checkstring)]
4893+        def _on_failure():
4894+            self._root_hash = None
4895+        return self._write(datavs,
4896+                           on_success=_on_success,
4897+                           on_failure=_on_failure)
4898+
4899+
4900+    def get_signable(self):
4901+        """
4902+        Get the first seven fields of the mutable file; the parts that
4903+        are signed.
4904+        """
4905+        if not self._root_hash:
4906+            raise LayoutInvalid("You need to set the root hash "
4907+                                "before getting something to "
4908+                                "sign")
4909+        return struct.pack(MDMFSIGNABLEHEADER,
4910+                           1,
4911+                           self._seqnum,
4912+                           self._root_hash,
4913+                           self._required_shares,
4914+                           self._total_shares,
4915+                           self._segment_size,
4916+                           self._data_length)
4917+
4918+
4919+    def put_signature(self, signature):
4920+        """
4921+        Put the signature field to the remote slot.
4922+
4923+        I require that the root hash and share hash chain have been put
4924+        to the grid before I will write the signature to the grid.
4925+        """
4926+        if "signature" not in self._offsets:
4927+            raise LayoutInvalid("You must put the share hash chain "
4928+        # It does not make sense to put a signature without first
4929+        # putting the root hash and the salt hash (since otherwise
4930+        # the signature would be incomplete), so we don't allow that.
4931+                       "before putting the signature")
4932+        if not self._root_hash:
4933+            raise LayoutInvalid("You must complete the signed prefix "
4934+                                "before computing a signature")
4935+        # If we put the signature after we put the verification key, we
4936+        # could end up running into the verification key, and will
4937+        # probably screw up the offsets as well. So we don't allow that.
4938+        # The method that writes the verification key defines the EOF
4939+        # offset before writing the verification key, so look for that.
4940+        if "EOF" in self._offsets:
4941+            raise LayoutInvalid("You must write the signature before the verification key")
4942+
4943+        self._offsets['verification_key'] = self._offsets['signature'] + len(signature)
4944+        datavs = []
4945+        datavs.append(tuple([self._offsets['signature'], signature]))
4946+        def _on_failure():
4947+            del(self._offsets['verification_key'])
4948+        return self._write(datavs, on_failure=_on_failure)
4949+
4950+
4951+    def put_verification_key(self, verification_key):
4952+        """
4953+        Put the verification key into the remote slot.
4954+
4955+        I require that the signature have been written to the storage
4956+        server before I allow the verification key to be written to the
4957+        remote server.
4958+        """
4959+        if "verification_key" not in self._offsets:
4960+            raise LayoutInvalid("You must put the signature before you "
4961+                                "can put the verification key")
4962+        self._offsets['EOF'] = self._offsets['verification_key'] + len(verification_key)
4963+        datavs = []
4964+        datavs.append(tuple([self._offsets['verification_key'], verification_key]))
4965+        def _on_failure():
4966+            del(self._offsets['EOF'])
4967+        return self._write(datavs, on_failure=_on_failure)
4968+
4969+    def _get_offsets_tuple(self):
4970+        return tuple([(key, value) for key, value in self._offsets.items()])
4971+
4972+    def get_verinfo(self):
4973+        return (self._seqnum,
4974+                self._root_hash,
4975+                self._required_shares,
4976+                self._total_shares,
4977+                self._segment_size,
4978+                self._data_length,
4979+                self.get_signable(),
4980+                self._get_offsets_tuple())
4981+
4982+
4983+    def finish_publishing(self):
4984+        """
4985+        Write the offset table and encoding parameters to the remote
4986+        slot, since that's the only thing we have yet to publish at this
4987+        point.
4988+        """
4989+        if "EOF" not in self._offsets:
4990+            raise LayoutInvalid("You must put the verification key before "
4991+                                "you can publish the offsets")
4992+        offsets_offset = struct.calcsize(MDMFHEADERWITHOUTOFFSETS)
4993+        offsets = struct.pack(MDMFOFFSETS,
4994+                              self._offsets['enc_privkey'],
4995+                              self._offsets['block_hash_tree'],
4996+                              self._offsets['share_hash_chain'],
4997+                              self._offsets['signature'],
4998+                              self._offsets['verification_key'],
4999+                              self._offsets['EOF'])
5000+        datavs = []
5001+        datavs.append(tuple([offsets_offset, offsets]))
5002+        encoding_parameters_offset = struct.calcsize(MDMFCHECKSTRING)
5003+        params = struct.pack(">BBQQ",
5004+                             self._required_shares,
5005+                             self._total_shares,
5006+                             self._segment_size,
5007+                             self._data_length)
5008+        datavs.append(tuple([encoding_parameters_offset, params]))
5009+        return self._write(datavs)
5010+
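+    # A rough sketch of the write protocol that the LayoutInvalid checks
+    # above enforce (names other than the put_*/get_*/finish_publishing
+    # methods are illustrative):
+    #
+    #   for segnum in xrange(num_segments):
+    #       mw.put_block(blocks[segnum], segnum, salts[segnum])
+    #   mw.put_encprivkey(encprivkey)
+    #   mw.put_blockhashes(block_hash_tree)
+    #   mw.put_sharehashes(share_hash_chain)
+    #   mw.put_root_hash(root_hash)
+    #   signature = sign(mw.get_signable())  # sign() stands in for the
+    #                                        # caller's signing-key logic
+    #   mw.put_signature(signature)
+    #   mw.put_verification_key(verification_key)
+    #   mw.finish_publishing()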
5011+
5012+    def _write(self, datavs, on_failure=None, on_success=None):
5013+        """I write the data vectors in datavs to the remote slot."""
5014+        tw_vectors = {}
5015+        new_share = False
5016+        if not self._testvs:
5017+            self._testvs = []
5018+            self._testvs.append(tuple([0, 1, "eq", ""]))
5019+            new_share = True
5020+        if not self._written:
5021+            # Write a new checkstring to the share when we write it, so
5022+            # that we have something to check later.
5023+            new_checkstring = self.get_checkstring()
5024+            datavs.append((0, new_checkstring))
5025+            def _first_write():
5026+                self._written = True
5027+                self._testvs = [(0, len(new_checkstring), "eq", new_checkstring)]
5028+            on_success = _first_write
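+        # The storage server's slot_testv_and_readv_and_writev call takes
+        # a dict mapping share numbers to (testv, writev, new_length)
+        # tuples, where each test-vector entry is (offset, length,
+        # operator, specimen) and each write-vector entry is (offset,
+        # data). A new_length of None asks the server not to truncate the
+        # share.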
5029+        tw_vectors[self.shnum] = (self._testvs, datavs, None)
5030+        datalength = sum([len(x[1]) for x in datavs])
5031+        d = self._rref.callRemote("slot_testv_and_readv_and_writev",
5032+                                  self._storage_index,
5033+                                  self._secrets,
5034+                                  tw_vectors,
5035+                                  self._readv)
5036+        def _result(results):
5037+            if isinstance(results, failure.Failure) or not results[0]:
5038+                # Do nothing; the write was unsuccessful.
5039+                if on_failure: on_failure()
5040+            else:
5041+                if on_success: on_success()
5042+            return results
5043+        d.addCallback(_result)
5044+        return d
5045+
5046+
5047+class MDMFSlotReadProxy:
5048+    """
5049+    I read from a mutable slot filled with data written in the MDMF data
5050+    format (which is described above).
5051+
5052+    I can be initialized with some amount of data, which I will use (if
5053+    it is valid) to eliminate some of the need to fetch it from servers.
5054+    """
5055+    def __init__(self,
5056+                 rref,
5057+                 storage_index,
5058+                 shnum,
5059+                 data=""):
5060+        # Start the initialization process.
5061+        self._rref = rref
5062+        self._storage_index = storage_index
5063+        self.shnum = shnum
5064+
5065+        # Before doing anything, the reader is probably going to want to
5066+        # verify that the signature is correct. To do that, they'll need
5067+        # the verification key, and the signature. To get those, we'll
5068+        # need the offset table. So fetch the offset table on the
5069+        # assumption that that will be the first thing that a reader is
5070+        # going to do.
5071+
5072+        # The fact that these encoding parameters are None tells us
5073+        # that we haven't yet fetched them from the remote share, so we
5074+        # should. We could just not set them, but the checks will be
5075+        # easier to read if we don't have to use hasattr.
5076+        self._version_number = None
5077+        self._sequence_number = None
5078+        self._root_hash = None
5079+        # Filled in if we're dealing with an SDMF file. Unused
5080+        # otherwise.
5081+        self._salt = None
5082+        self._required_shares = None
5083+        self._total_shares = None
5084+        self._segment_size = None
5085+        self._data_length = None
5086+        self._offsets = None
5087+
5088+        # If the user has chosen to initialize us with some data, we'll
5089+        # try to satisfy subsequent data requests with that data before
5090+        # asking the storage server for it.
5091+        self._data = data
5092+        # The way callers interact with cache in the filenode returns
5093+        # None if there isn't any cached data, but the way we index the
5094+        # cached data requires a string, so convert None to "".
5095+        if self._data is None:
5096+            self._data = ""
5097+
5098+        self._queue_observers = observer.ObserverList()
5099+        self._queue_errbacks = observer.ObserverList()
5100+        self._readvs = []
5101+
5102+
5103+    def _maybe_fetch_offsets_and_header(self, force_remote=False):
5104+        """
5105+        I fetch the offset table and the header from the remote slot if
5106+        I don't already have them. If I do have them, I do nothing and
5107+        return an empty Deferred.
5108+        """
5109+        if self._offsets:
5110+            return defer.succeed(None)
5111+        # At this point, we may be either SDMF or MDMF. Fetching 107
5112+        # bytes will be enough to get header and offsets for both SDMF and
5113+        # MDMF, though we'll be left with 4 more bytes than we
5114+        # need if this ends up being MDMF. This is probably less
5115+        # expensive than the cost of a second roundtrip.
5116+        readvs = [(0, 107)]
5117+        d = self._read(readvs, force_remote)
5118+        d.addCallback(self._process_encoding_parameters)
5119+        d.addCallback(self._process_offsets)
5120+        return d
5121+
5122+
5123+    def _process_encoding_parameters(self, encoding_parameters):
5124+        assert self.shnum in encoding_parameters
5125+        encoding_parameters = encoding_parameters[self.shnum][0]
5126+        # The first byte is the version number. It will tell us what
5127+        # to do next.
5128+        (verno,) = struct.unpack(">B", encoding_parameters[:1])
5129+        if verno == MDMF_VERSION:
5130+            read_size = MDMFHEADERWITHOUTOFFSETSSIZE
5131+            (verno,
5132+             seqnum,
5133+             root_hash,
5134+             k,
5135+             n,
5136+             segsize,
5137+             datalen) = struct.unpack(MDMFHEADERWITHOUTOFFSETS,
5138+                                      encoding_parameters[:read_size])
5139+            if segsize == 0 and datalen == 0:
5140+                # Empty file, no segments.
5141+                self._num_segments = 0
5142+            else:
5143+                self._num_segments = mathutil.div_ceil(datalen, segsize)
5144+
5145+        elif verno == SDMF_VERSION:
5146+            read_size = SIGNED_PREFIX_LENGTH
5147+            (verno,
5148+             seqnum,
5149+             root_hash,
5150+             salt,
5151+             k,
5152+             n,
5153+             segsize,
5154+             datalen) = struct.unpack(">BQ32s16s BBQQ",
5155+                                encoding_parameters[:SIGNED_PREFIX_LENGTH])
5156+            self._salt = salt
5157+            if segsize == 0 and datalen == 0:
5158+                # empty file
5159+                self._num_segments = 0
5160+            else:
5161+                # non-empty SDMF files have one segment.
5162+                self._num_segments = 1
5163+        else:
5164+            raise UnknownVersionError("You asked me to read mutable file "
5165+                                      "version %d, but I only understand "
5166+                                      "%d and %d" % (verno, SDMF_VERSION,
5167+                                                     MDMF_VERSION))
5168+
5169+        self._version_number = verno
5170+        self._sequence_number = seqnum
5171+        self._root_hash = root_hash
5172+        self._required_shares = k
5173+        self._total_shares = n
5174+        self._segment_size = segsize
5175+        self._data_length = datalen
5176+
5177+        self._block_size = self._segment_size / self._required_shares
5178+        # We can upload empty files, and need to account for this fact
5179+        # so as to avoid zero-division and zero-modulo errors.
5180+        if datalen > 0:
5181+            tail_size = self._data_length % self._segment_size
5182+        else:
5183+            tail_size = 0
5184+        if not tail_size:
5185+            self._tail_block_size = self._block_size
5186+        else:
5187+            self._tail_block_size = mathutil.next_multiple(tail_size,
5188+                                                    self._required_shares)
5189+            self._tail_block_size /= self._required_shares
5190+
5191+        return encoding_parameters
5192+
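+    # Worked example, using the parameters from the tests below: with
+    # k = 3, segsize = 6 and datalen = 36, block_size = 6 / 3 = 2, and
+    # since 36 % 6 == 0 the tail block is also 2 bytes. For the "tail
+    # segment" share (datalen = 33), tail_size = 33 % 6 = 3 and
+    # next_multiple(3, 3) / 3 = 1, so the final block is a single byte,
+    # as test_read_with_different_tail_segment_size expects.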
5193+
5194+    def _process_offsets(self, offsets):
5195+        if self._version_number == 0:
5196+            read_size = OFFSETS_LENGTH
5197+            read_offset = SIGNED_PREFIX_LENGTH
5198+            end = read_size + read_offset
5199+            (signature,
5200+             share_hash_chain,
5201+             block_hash_tree,
5202+             share_data,
5203+             enc_privkey,
5204+             EOF) = struct.unpack(">LLLLQQ",
5205+                                  offsets[read_offset:end])
5206+            self._offsets = {}
5207+            self._offsets['signature'] = signature
5208+            self._offsets['share_data'] = share_data
5209+            self._offsets['block_hash_tree'] = block_hash_tree
5210+            self._offsets['share_hash_chain'] = share_hash_chain
5211+            self._offsets['enc_privkey'] = enc_privkey
5212+            self._offsets['EOF'] = EOF
5213+
5214+        elif self._version_number == 1:
5215+            read_offset = MDMFHEADERWITHOUTOFFSETSSIZE
5216+            read_length = MDMFOFFSETS_LENGTH
5217+            end = read_offset + read_length
5218+            (encprivkey,
5219+             blockhashes,
5220+             sharehashes,
5221+             signature,
5222+             verification_key,
5223+             eof) = struct.unpack(MDMFOFFSETS,
5224+                                  offsets[read_offset:end])
5225+            self._offsets = {}
5226+            self._offsets['enc_privkey'] = encprivkey
5227+            self._offsets['block_hash_tree'] = blockhashes
5228+            self._offsets['share_hash_chain'] = sharehashes
5229+            self._offsets['signature'] = signature
5230+            self._offsets['verification_key'] = verification_key
5231+            self._offsets['EOF'] = eof
5232+
5233+
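+    # For reference, the layout implied by these offsets (and written by
+    # MDMFSlotWriteProxy above) is: header (checkstring, encoding
+    # parameters, offset table), then one salt followed by one block per
+    # segment, then the encrypted private key, the block hash tree, the
+    # share hash chain, the signature, and finally the verification key.
+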
5234+    def get_block_and_salt(self, segnum, queue=False):
5235+        """
5236+        I return (block, salt), where block is the block data and
5237+        salt is the salt used to encrypt that segment.
5238+        """
5239+        d = self._maybe_fetch_offsets_and_header()
5240+        def _then(ignored):
5241+            if self._version_number == 1:
5242+                base_share_offset = MDMFHEADERSIZE
5243+            else:
5244+                base_share_offset = self._offsets['share_data']
5245+
5246+            if segnum + 1 > self._num_segments:
5247+                raise LayoutInvalid("Not a valid segment number")
5248+
5249+            if self._version_number == 0:
5250+                share_offset = base_share_offset + self._block_size * segnum
5251+            else:
5252+                share_offset = base_share_offset + (self._block_size + \
5253+                                                    SALT_SIZE) * segnum
5254+            if segnum + 1 == self._num_segments:
5255+                data = self._tail_block_size
5256+            else:
5257+                data = self._block_size
5258+
5259+            if self._version_number == 1:
5260+                data += SALT_SIZE
5261+
5262+            readvs = [(share_offset, data)]
5263+            return readvs
5264+        d.addCallback(_then)
5265+        d.addCallback(lambda readvs:
5266+            self._read(readvs, queue=queue))
5267+        def _process_results(results):
5268+            assert self.shnum in results
5269+            if self._version_number == 0:
5270+                # We only read the share data, but we know the salt from
5271+                # when we fetched the header
5272+                data = results[self.shnum]
5273+                if not data:
5274+                    data = ""
5275+                else:
5276+                    assert len(data) == 1
5277+                    data = data[0]
5278+                salt = self._salt
5279+            else:
5280+                data = results[self.shnum]
5281+                if not data:
5282+                    salt = data = ""
5283+                else:
5284+                    salt_and_data = results[self.shnum][0]
5285+                    salt = salt_and_data[:SALT_SIZE]
5286+                    data = salt_and_data[SALT_SIZE:]
5287+            return data, salt
5288+        d.addCallback(_process_results)
5289+        return d
5290+
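+    # For example, with the MDMF test share defined below (block_size = 2,
+    # 16-byte salts), segment i is read as 18 bytes starting at
+    # MDMFHEADERSIZE + 18 * i, and the salt is the first 16 bytes of
+    # whatever comes back.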
5291+
5292+    def get_blockhashes(self, needed=None, queue=False, force_remote=False):
5293+        """
5294+        I return the block hash tree
5295+
5296+        I take an optional argument, needed, which is a set of indices
5297+        corresponding to hashes that I should fetch. If this argument is
5298+        missing, I will fetch the entire block hash tree; otherwise, I
5299+        may attempt to fetch fewer hashes, based on what needed says
5300+        that I should do. Note that I may fetch as many hashes as I
5301+        want, so long as the set of hashes that I do fetch is a superset
5302+        of the ones that I am asked for, so callers should be prepared
5303+        to tolerate additional hashes.
5304+        """
5305+        # TODO: Return only the parts of the block hash tree necessary
5306+        # to validate the blocknum provided?
5307+        # This is a good idea, but it is hard to implement correctly. It
5308+        # is bad to fetch any one block hash more than once, so we
5309+        # probably just want to fetch the whole thing at once and then
5310+        # serve it.
5311+        if needed == set([]):
5312+            return defer.succeed([])
5313+        d = self._maybe_fetch_offsets_and_header()
5314+        def _then(ignored):
5315+            blockhashes_offset = self._offsets['block_hash_tree']
5316+            if self._version_number == 1:
5317+                blockhashes_length = self._offsets['share_hash_chain'] - blockhashes_offset
5318+            else:
5319+                blockhashes_length = self._offsets['share_data'] - blockhashes_offset
5320+            readvs = [(blockhashes_offset, blockhashes_length)]
5321+            return readvs
5322+        d.addCallback(_then)
5323+        d.addCallback(lambda readvs:
5324+            self._read(readvs, queue=queue, force_remote=force_remote))
5325+        def _build_block_hash_tree(results):
5326+            assert self.shnum in results
5327+
5328+            rawhashes = results[self.shnum][0]
5329+            results = [rawhashes[i:i+HASH_SIZE]
5330+                       for i in range(0, len(rawhashes), HASH_SIZE)]
5331+            return results
5332+        d.addCallback(_build_block_hash_tree)
5333+        return d
5334+
5335+
5336+    def get_sharehashes(self, needed=None, queue=False, force_remote=False):
5337+        """
5338+        I return the part of the share hash chain that is used to validate
5339+        this share.
5340+
5341+        I take an optional argument, needed. Needed is a set of indices
5342+        that correspond to the hashes that I should fetch. If needed is
5343+        not present, I will fetch and return the entire share hash
5344+        chain. Otherwise, I may fetch and return any part of the share
5345+        hash chain that is a superset of the part that I am asked to
5346+        fetch. Callers should be prepared to deal with more hashes than
5347+        they've asked for.
5348+        """
5349+        if needed == set([]):
5350+            return defer.succeed([])
5351+        d = self._maybe_fetch_offsets_and_header()
5352+
5353+        def _make_readvs(ignored):
5354+            sharehashes_offset = self._offsets['share_hash_chain']
5355+            if self._version_number == 0:
5356+                sharehashes_length = self._offsets['block_hash_tree'] - sharehashes_offset
5357+            else:
5358+                sharehashes_length = self._offsets['signature'] - sharehashes_offset
5359+            readvs = [(sharehashes_offset, sharehashes_length)]
5360+            return readvs
5361+        d.addCallback(_make_readvs)
5362+        d.addCallback(lambda readvs:
5363+            self._read(readvs, queue=queue, force_remote=force_remote))
5364+        def _build_share_hash_chain(results):
5365+            assert self.shnum in results
5366+
5367+            sharehashes = results[self.shnum][0]
5368+            results = [sharehashes[i:i+(HASH_SIZE + 2)]
5369+                       for i in range(0, len(sharehashes), HASH_SIZE + 2)]
5370+            results = dict([struct.unpack(">H32s", data)
5371+                            for data in results])
5372+            return results
5373+        d.addCallback(_build_share_hash_chain)
5374+        return d
5375+
5376+
5377+    def get_encprivkey(self, queue=False):
5378+        """
5379+        I return the encrypted private key.
5380+        """
5381+        d = self._maybe_fetch_offsets_and_header()
5382+
5383+        def _make_readvs(ignored):
5384+            privkey_offset = self._offsets['enc_privkey']
5385+            if self._version_number == 0:
5386+                privkey_length = self._offsets['EOF'] - privkey_offset
5387+            else:
5388+                privkey_length = self._offsets['block_hash_tree'] - privkey_offset
5389+            readvs = [(privkey_offset, privkey_length)]
5390+            return readvs
5391+        d.addCallback(_make_readvs)
5392+        d.addCallback(lambda readvs:
5393+            self._read(readvs, queue=queue))
5394+        def _process_results(results):
5395+            assert self.shnum in results
5396+            privkey = results[self.shnum][0]
5397+            return privkey
5398+        d.addCallback(_process_results)
5399+        return d
5400+
5401+
5402+    def get_signature(self, queue=False):
5403+        """
5404+        I return the signature of my share.
5405+        """
5406+        d = self._maybe_fetch_offsets_and_header()
5407+
5408+        def _make_readvs(ignored):
5409+            signature_offset = self._offsets['signature']
5410+            if self._version_number == 1:
5411+                signature_length = self._offsets['verification_key'] - signature_offset
5412+            else:
5413+                signature_length = self._offsets['share_hash_chain'] - signature_offset
5414+            readvs = [(signature_offset, signature_length)]
5415+            return readvs
5416+        d.addCallback(_make_readvs)
5417+        d.addCallback(lambda readvs:
5418+            self._read(readvs, queue=queue))
5419+        def _process_results(results):
5420+            assert self.shnum in results
5421+            signature = results[self.shnum][0]
5422+            return signature
5423+        d.addCallback(_process_results)
5424+        return d
5425+
5426+
5427+    def get_verification_key(self, queue=False):
5428+        """
5429+        I return the verification key.
5430+        """
5431+        d = self._maybe_fetch_offsets_and_header()
5432+
5433+        def _make_readvs(ignored):
5434+            if self._version_number == 1:
5435+                vk_offset = self._offsets['verification_key']
5436+                vk_length = self._offsets['EOF'] - vk_offset
5437+            else:
5438+                vk_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5439+                vk_length = self._offsets['signature'] - vk_offset
5440+            readvs = [(vk_offset, vk_length)]
5441+            return readvs
5442+        d.addCallback(_make_readvs)
5443+        d.addCallback(lambda readvs:
5444+            self._read(readvs, queue=queue))
5445+        def _process_results(results):
5446+            assert self.shnum in results
5447+            verification_key = results[self.shnum][0]
5448+            return verification_key
5449+        d.addCallback(_process_results)
5450+        return d
5451+
5452+
5453+    def get_encoding_parameters(self):
5454+        """
5455+        I return (k, n, segsize, datalen)
5456+        """
5457+        d = self._maybe_fetch_offsets_and_header()
5458+        d.addCallback(lambda ignored:
5459+            (self._required_shares,
5460+             self._total_shares,
5461+             self._segment_size,
5462+             self._data_length))
5463+        return d
5464+
5465+
5466+    def get_seqnum(self):
5467+        """
5468+        I return the sequence number for this share.
5469+        """
5470+        d = self._maybe_fetch_offsets_and_header()
5471+        d.addCallback(lambda ignored:
5472+            self._sequence_number)
5473+        return d
5474+
5475+
5476+    def get_root_hash(self):
5477+        """
5478+        I return the root of the block hash tree
5479+        """
5480+        d = self._maybe_fetch_offsets_and_header()
5481+        d.addCallback(lambda ignored: self._root_hash)
5482+        return d
5483+
5484+
5485+    def get_checkstring(self):
5486+        """
5487+        I return the packed representation of the following:
5488+
5489+            - version number
5490+            - sequence number
5491+            - root hash
5492+            - salt (SDMF shares only)
5493+
5494+        which my users use as a checkstring to detect other writers.
5495+        """
5496+        d = self._maybe_fetch_offsets_and_header()
5497+        def _build_checkstring(ignored):
5498+            if self._salt:
5499+                checkstring = struct.pack(PREFIX,
5500+                                         self._version_number,
5501+                                         self._sequence_number,
5502+                                         self._root_hash,
5503+                                         self._salt)
5504+            else:
5505+                checkstring = struct.pack(MDMFCHECKSTRING,
5506+                                          self._version_number,
5507+                                          self._sequence_number,
5508+                                          self._root_hash)
5509+
5510+            return checkstring
5511+        d.addCallback(_build_checkstring)
5512+        return d
5513+
5514+
5515+    def get_prefix(self, force_remote):
5516+        d = self._maybe_fetch_offsets_and_header(force_remote)
5517+        d.addCallback(lambda ignored:
5518+            self._build_prefix())
5519+        return d
5520+
5521+
5522+    def _build_prefix(self):
5523+        # The prefix is another name for the part of the remote share
5524+        # that gets signed. It consists of everything up to and
5525+        # including the datalength, packed by struct.
5526+        if self._version_number == SDMF_VERSION:
5527+            return struct.pack(SIGNED_PREFIX,
5528+                           self._version_number,
5529+                           self._sequence_number,
5530+                           self._root_hash,
5531+                           self._salt,
5532+                           self._required_shares,
5533+                           self._total_shares,
5534+                           self._segment_size,
5535+                           self._data_length)
5536+
5537+        else:
5538+            return struct.pack(MDMFSIGNABLEHEADER,
5539+                           self._version_number,
5540+                           self._sequence_number,
5541+                           self._root_hash,
5542+                           self._required_shares,
5543+                           self._total_shares,
5544+                           self._segment_size,
5545+                           self._data_length)
5546+
5547+
5548+    def _get_offsets_tuple(self):
5549+        # The offsets tuple is another component of the version
5550+        # information tuple. It is basically our offsets dictionary,
5551+        # itemized and in a tuple.
5552+        return tuple([(key, value) for key, value in self._offsets.items()])
5553+
5554+
5555+    def get_verinfo(self):
5556+        """
5557+        I return my verinfo tuple. This is used by the ServermapUpdater
5558+        to keep track of versions of mutable files.
5559+
5560+        The verinfo tuple for MDMF files contains:
5561+            - seqnum
5562+            - root hash
5563+            - a blank (nothing)
5564+            - segsize
5565+            - datalen
5566+            - k
5567+            - n
5568+            - prefix (the thing that you sign)
5569+            - a tuple of offsets
5570+
5571+        We include the blank salt field in MDMF verinfo tuples so that
5572+        they have the same shape as SDMF verinfo tuples.
5573+
5574+        The verinfo tuple for SDMF files is the same, but contains the
5575+        16-byte IV (salt) instead of a blank.
5576+        """
5577+        d = self._maybe_fetch_offsets_and_header()
5578+        def _build_verinfo(ignored):
5579+            if self._version_number == SDMF_VERSION:
5580+                salt_to_use = self._salt
5581+            else:
5582+                salt_to_use = None
5583+            return (self._sequence_number,
5584+                    self._root_hash,
5585+                    salt_to_use,
5586+                    self._segment_size,
5587+                    self._data_length,
5588+                    self._required_shares,
5589+                    self._total_shares,
5590+                    self._build_prefix(),
5591+                    self._get_offsets_tuple())
5592+        d.addCallback(_build_verinfo)
5593+        return d
5594+
5595+
5596+    def flush(self):
5597+        """
5598+        I flush my queue of read vectors.
5599+        """
5600+        d = self._read(self._readvs)
5601+        def _then(results):
5602+            self._readvs = []
5603+            if isinstance(results, failure.Failure):
5604+                self._queue_errbacks.notify(results)
5605+            else:
5606+                self._queue_observers.notify(results)
5607+            self._queue_observers = observer.ObserverList()
5608+            self._queue_errbacks = observer.ObserverList()
5609+        d.addBoth(_then)
5610+
5611+
5612+    def _read(self, readvs, force_remote=False, queue=False):
5613+        unsatisfiable = filter(lambda x: x[0] + x[1] > len(self._data), readvs)
5614+        # TODO: It's entirely possible to tweak this so that it just
5615+        # fulfills the requests that it can, and not demand that all
5616+        # requests are satisfiable before running it.
5617+        if not unsatisfiable and not force_remote:
5618+            results = [self._data[offset:offset+length]
5619+                       for (offset, length) in readvs]
5620+            results = {self.shnum: results}
5621+            return defer.succeed(results)
5622+        else:
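+            # Queued reads accumulate in self._readvs and are sent to the
+            # server in a single batch when flush() is called; each queued
+            # caller is handed back only its own [start:end] slice of the
+            # batched results via the observer lists.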
5623+            if queue:
5624+                start = len(self._readvs)
5625+                self._readvs += readvs
5626+                end = len(self._readvs)
5627+                def _get_results(results, start, end):
5628+                    if not self.shnum in results:
5629+                        return {self.shnum: [""]}
5630+                    return {self.shnum: results[self.shnum][start:end]}
5631+                d = defer.Deferred()
5632+                d.addCallback(_get_results, start, end)
5633+                self._queue_observers.subscribe(d.callback)
5634+                self._queue_errbacks.subscribe(d.errback)
5635+                return d
5636+            return self._rref.callRemote("slot_readv",
5637+                                         self._storage_index,
5638+                                         [self.shnum],
5639+                                         readvs)
5640+
5641+
5642+    def is_sdmf(self):
5643+        """I tell my caller whether or not my remote file is SDMF or MDMF
5644+        """
5645+        d = self._maybe_fetch_offsets_and_header()
5646+        d.addCallback(lambda ignored:
5647+            self._version_number == 0)
5648+        return d
5649+
5650+
5651+class LayoutInvalid(Exception):
5652+    """
5653+    This isn't a valid MDMF mutable file
5654+    """
5655hunk ./src/allmydata/test/test_storage.py 2
5656 
5657-import time, os.path, stat, re, simplejson, struct
5658+import time, os.path, stat, re, simplejson, struct, shutil
5659 
5660 from twisted.trial import unittest
5661 
5662hunk ./src/allmydata/test/test_storage.py 22
5663 from allmydata.storage.expirer import LeaseCheckingCrawler
5664 from allmydata.immutable.layout import WriteBucketProxy, WriteBucketProxy_v2, \
5665      ReadBucketProxy
5666-from allmydata.interfaces import BadWriteEnablerError
5667-from allmydata.test.common import LoggingServiceParent
5668+from allmydata.mutable.layout import MDMFSlotWriteProxy, MDMFSlotReadProxy, \
5669+                                     LayoutInvalid, MDMFSIGNABLEHEADER, \
5670+                                     SIGNED_PREFIX, MDMFHEADER, \
5671+                                     MDMFOFFSETS, SDMFSlotWriteProxy
5672+from allmydata.interfaces import BadWriteEnablerError, MDMF_VERSION, \
5673+                                 SDMF_VERSION
5674+from allmydata.test.common import LoggingServiceParent, ShouldFailMixin
5675 from allmydata.test.common_web import WebRenderingMixin
5676 from allmydata.web.storage import StorageStatus, remove_prefix
5677 
5678hunk ./src/allmydata/test/test_storage.py 106
5679 
5680 class RemoteBucket:
5681 
5682+    def __init__(self):
5683+        self.read_count = 0
5684+        self.write_count = 0
5685+
5686     def callRemote(self, methname, *args, **kwargs):
5687         def _call():
5688             meth = getattr(self.target, "remote_" + methname)
5689hunk ./src/allmydata/test/test_storage.py 114
5690             return meth(*args, **kwargs)
5691+
5692+        if methname == "slot_readv":
5693+            self.read_count += 1
5694+        if "writev" in methname:
5695+            self.write_count += 1
5696+
5697         return defer.maybeDeferred(_call)
5698 
5699hunk ./src/allmydata/test/test_storage.py 122
5700+
5701 class BucketProxy(unittest.TestCase):
5702     def make_bucket(self, name, size):
5703         basedir = os.path.join("storage", "BucketProxy", name)
5704hunk ./src/allmydata/test/test_storage.py 1299
5705         self.failUnless(os.path.exists(prefixdir), prefixdir)
5706         self.failIf(os.path.exists(bucketdir), bucketdir)
5707 
5708+
5709+class MDMFProxies(unittest.TestCase, ShouldFailMixin):
5710+    def setUp(self):
5711+        self.sparent = LoggingServiceParent()
5712+        self._lease_secret = itertools.count()
5713+        self.ss = self.create("MDMFProxies storage test server")
5714+        self.rref = RemoteBucket()
5715+        self.rref.target = self.ss
5716+        self.secrets = (self.write_enabler("we_secret"),
5717+                        self.renew_secret("renew_secret"),
5718+                        self.cancel_secret("cancel_secret"))
5719+        self.segment = "aaaaaa"
5720+        self.block = "aa"
5721+        self.salt = "a" * 16
5722+        self.block_hash = "a" * 32
5723+        self.block_hash_tree = [self.block_hash for i in xrange(6)]
5724+        self.share_hash = self.block_hash
5725+        self.share_hash_chain = dict([(i, self.share_hash) for i in xrange(6)])
5726+        self.signature = "foobarbaz"
5727+        self.verification_key = "vvvvvv"
5728+        self.encprivkey = "private"
5729+        self.root_hash = self.block_hash
5730+        self.salt_hash = self.root_hash
5731+        self.salt_hash_tree = [self.salt_hash for i in xrange(6)]
5732+        self.block_hash_tree_s = self.serialize_blockhashes(self.block_hash_tree)
5733+        self.share_hash_chain_s = self.serialize_sharehashes(self.share_hash_chain)
5734+        # blockhashes and salt hashes are serialized in the same way,
5735+        # only we lop off the first element and store that in the
5736+        # header.
5737+        self.salt_hash_tree_s = self.serialize_blockhashes(self.salt_hash_tree[1:])
5738+
5739+
5740+    def tearDown(self):
5741+        self.sparent.stopService()
5742+        shutil.rmtree(self.workdir("MDMFProxies storage test server"))
5743+
5744+
5745+    def write_enabler(self, we_tag):
5746+        return hashutil.tagged_hash("we_blah", we_tag)
5747+
5748+
5749+    def renew_secret(self, tag):
5750+        return hashutil.tagged_hash("renew_blah", str(tag))
5751+
5752+
5753+    def cancel_secret(self, tag):
5754+        return hashutil.tagged_hash("cancel_blah", str(tag))
5755+
5756+
5757+    def workdir(self, name):
5758+        basedir = os.path.join("storage", "MutableServer", name)
5759+        return basedir
5760+
5761+
5762+    def create(self, name):
5763+        workdir = self.workdir(name)
5764+        ss = StorageServer(workdir, "\x00" * 20)
5765+        ss.setServiceParent(self.sparent)
5766+        return ss
5767+
5768+
5769+    def build_test_mdmf_share(self, tail_segment=False, empty=False):
5770+        # Start with the checkstring
5771+        data = struct.pack(">BQ32s",
5772+                           1,
5773+                           0,
5774+                           self.root_hash)
5775+        self.checkstring = data
5776+        # Next, the encoding parameters
5777+        if tail_segment:
5778+            data += struct.pack(">BBQQ",
5779+                                3,
5780+                                10,
5781+                                6,
5782+                                33)
5783+        elif empty:
5784+            data += struct.pack(">BBQQ",
5785+                                3,
5786+                                10,
5787+                                0,
5788+                                0)
5789+        else:
5790+            data += struct.pack(">BBQQ",
5791+                                3,
5792+                                10,
5793+                                6,
5794+                                36)
5795+        # Now we'll build the offsets.
5796+        sharedata = ""
5797+        if not tail_segment and not empty:
5798+            for i in xrange(6):
5799+                sharedata += self.salt + self.block
5800+        elif tail_segment:
5801+            for i in xrange(5):
5802+                sharedata += self.salt + self.block
5803+            sharedata += self.salt + "a"
5804+
5805+        # The encrypted private key comes after the shares + salts
5806+        offset_size = struct.calcsize(MDMFOFFSETS)
5807+        encrypted_private_key_offset = len(data) + offset_size + len(sharedata)
5808+        # The blockhashes come after the private key
5809+        blockhashes_offset = encrypted_private_key_offset + len(self.encprivkey)
5810+        # The sharehashes come after the block hashes
5811+        sharehashes_offset = blockhashes_offset + len(self.block_hash_tree_s)
5812+        # The signature comes after the share hash chain
5813+        signature_offset = sharehashes_offset + len(self.share_hash_chain_s)
5814+        # The verification key comes after the signature
5815+        verification_offset = signature_offset + len(self.signature)
5816+        # The EOF comes after the verification key
5817+        eof_offset = verification_offset + len(self.verification_key)
5818+        data += struct.pack(MDMFOFFSETS,
5819+                            encrypted_private_key_offset,
5820+                            blockhashes_offset,
5821+                            sharehashes_offset,
5822+                            signature_offset,
5823+                            verification_offset,
5824+                            eof_offset)
5825+        self.offsets = {}
5826+        self.offsets['enc_privkey'] = encrypted_private_key_offset
5827+        self.offsets['block_hash_tree'] = blockhashes_offset
5828+        self.offsets['share_hash_chain'] = sharehashes_offset
5829+        self.offsets['signature'] = signature_offset
5830+        self.offsets['verification_key'] = verification_offset
5831+        self.offsets['EOF'] = eof_offset
5832+        # Next, we'll add in the salts and share data,
5833+        data += sharedata
5834+        # the private key,
5835+        data += self.encprivkey
5836+        # the block hash tree,
5837+        data += self.block_hash_tree_s
5838+        # the share hash chain,
5839+        data += self.share_hash_chain_s
5840+        # the signature,
5841+        data += self.signature
5842+        # and the verification key
5843+        data += self.verification_key
5844+        return data
5845+
5846+
5847+    def write_test_share_to_server(self,
5848+                                   storage_index,
5849+                                   tail_segment=False,
5850+                                   empty=False):
5851+        """
5852+        I write some data for the read tests to read to self.ss
5853+
5854+        If tail_segment=True, then I will write a share that has a
5855+        smaller tail segment than other segments.
5856+        """
5857+        write = self.ss.remote_slot_testv_and_readv_and_writev
5858+        data = self.build_test_mdmf_share(tail_segment, empty)
5859+        # Finally, we write the whole thing to the storage server in one
5860+        # pass.
5861+        testvs = [(0, 1, "eq", "")]
5862+        tws = {}
5863+        tws[0] = (testvs, [(0, data)], None)
5864+        readv = [(0, 1)]
5865+        results = write(storage_index, self.secrets, tws, readv)
5866+        self.failUnless(results[0])
5867+
5868+
5869+    def build_test_sdmf_share(self, empty=False):
5870+        if empty:
5871+            sharedata = ""
5872+        else:
5873+            sharedata = self.segment * 6
5874+        self.sharedata = sharedata
5875+        blocksize = len(sharedata) / 3
5876+        block = sharedata[:blocksize]
5877+        self.blockdata = block
5878+        prefix = struct.pack(">BQ32s16s BBQQ",
5879+                             0, # version,
5880+                             0,
5881+                             self.root_hash,
5882+                             self.salt,
5883+                             3,
5884+                             10,
5885+                             len(sharedata),
5886+                             len(sharedata),
5887+                            )
5888+        post_offset = struct.calcsize(">BQ32s16sBBQQLLLLQQ")
5889+        signature_offset = post_offset + len(self.verification_key)
5890+        sharehashes_offset = signature_offset + len(self.signature)
5891+        blockhashes_offset = sharehashes_offset + len(self.share_hash_chain_s)
5892+        sharedata_offset = blockhashes_offset + len(self.block_hash_tree_s)
5893+        encprivkey_offset = sharedata_offset + len(block)
5894+        eof_offset = encprivkey_offset + len(self.encprivkey)
5895+        offsets = struct.pack(">LLLLQQ",
5896+                              signature_offset,
5897+                              sharehashes_offset,
5898+                              blockhashes_offset,
5899+                              sharedata_offset,
5900+                              encprivkey_offset,
5901+                              eof_offset)
5902+        final_share = "".join([prefix,
5903+                           offsets,
5904+                           self.verification_key,
5905+                           self.signature,
5906+                           self.share_hash_chain_s,
5907+                           self.block_hash_tree_s,
5908+                           block,
5909+                           self.encprivkey])
5910+        self.offsets = {}
5911+        self.offsets['signature'] = signature_offset
5912+        self.offsets['share_hash_chain'] = sharehashes_offset
5913+        self.offsets['block_hash_tree'] = blockhashes_offset
5914+        self.offsets['share_data'] = sharedata_offset
5915+        self.offsets['enc_privkey'] = encprivkey_offset
5916+        self.offsets['EOF'] = eof_offset
5917+        return final_share
5918+
5919+
5920+    def write_sdmf_share_to_server(self,
5921+                                   storage_index,
5922+                                   empty=False):
5923+        # Some tests need SDMF shares to verify that we can still
5924+        # read them. This method writes one; it resembles, but is not identical to, a real SDMF share.
5925+        assert self.rref
5926+        write = self.ss.remote_slot_testv_and_readv_and_writev
5927+        share = self.build_test_sdmf_share(empty)
5928+        testvs = [(0, 1, "eq", "")]
5929+        tws = {}
5930+        tws[0] = (testvs, [(0, share)], None)
5931+        readv = []
5932+        results = write(storage_index, self.secrets, tws, readv)
5933+        self.failUnless(results[0])
5934+
5935+
5936+    def test_read(self):
5937+        self.write_test_share_to_server("si1")
5938+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
5939+        # Check that every method equals what we expect it to.
5940+        d = defer.succeed(None)
5941+        def _check_block_and_salt((block, salt)):
5942+            self.failUnlessEqual(block, self.block)
5943+            self.failUnlessEqual(salt, self.salt)
5944+
5945+        for i in xrange(6):
5946+            d.addCallback(lambda ignored, i=i:
5947+                mr.get_block_and_salt(i))
5948+            d.addCallback(_check_block_and_salt)
5949+
5950+        d.addCallback(lambda ignored:
5951+            mr.get_encprivkey())
5952+        d.addCallback(lambda encprivkey:
5953+            self.failUnlessEqual(self.encprivkey, encprivkey))
5954+
5955+        d.addCallback(lambda ignored:
5956+            mr.get_blockhashes())
5957+        d.addCallback(lambda blockhashes:
5958+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
5959+
5960+        d.addCallback(lambda ignored:
5961+            mr.get_sharehashes())
5962+        d.addCallback(lambda sharehashes:
5963+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
5964+
5965+        d.addCallback(lambda ignored:
5966+            mr.get_signature())
5967+        d.addCallback(lambda signature:
5968+            self.failUnlessEqual(signature, self.signature))
5969+
5970+        d.addCallback(lambda ignored:
5971+            mr.get_verification_key())
5972+        d.addCallback(lambda verification_key:
5973+            self.failUnlessEqual(verification_key, self.verification_key))
5974+
5975+        d.addCallback(lambda ignored:
5976+            mr.get_seqnum())
5977+        d.addCallback(lambda seqnum:
5978+            self.failUnlessEqual(seqnum, 0))
5979+
5980+        d.addCallback(lambda ignored:
5981+            mr.get_root_hash())
5982+        d.addCallback(lambda root_hash:
5983+            self.failUnlessEqual(self.root_hash, root_hash))
5984+
5985+        d.addCallback(lambda ignored:
5986+            mr.get_seqnum())
5987+        d.addCallback(lambda seqnum:
5988+            self.failUnlessEqual(0, seqnum))
5989+
5990+        d.addCallback(lambda ignored:
5991+            mr.get_encoding_parameters())
5992+        def _check_encoding_parameters((k, n, segsize, datalen)):
5993+            self.failUnlessEqual(k, 3)
5994+            self.failUnlessEqual(n, 10)
5995+            self.failUnlessEqual(segsize, 6)
5996+            self.failUnlessEqual(datalen, 36)
5997+        d.addCallback(_check_encoding_parameters)
5998+
5999+        d.addCallback(lambda ignored:
6000+            mr.get_checkstring())
6001+        d.addCallback(lambda checkstring:
6002+            self.failUnlessEqual(checkstring, self.checkstring))
6003+        return d
6004+
6005+
6006+    def test_read_with_different_tail_segment_size(self):
6007+        self.write_test_share_to_server("si1", tail_segment=True)
6008+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6009+        d = mr.get_block_and_salt(5)
6010+        def _check_tail_segment(results):
6011+            block, salt = results
6012+            self.failUnlessEqual(len(block), 1)
6013+            self.failUnlessEqual(block, "a")
6014+        d.addCallback(_check_tail_segment)
6015+        return d
6016+
6017+
6018+    def test_get_block_with_invalid_segnum(self):
6019+        self.write_test_share_to_server("si1")
6020+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6021+        d = defer.succeed(None)
6022+        d.addCallback(lambda ignored:
6023+            self.shouldFail(LayoutInvalid, "test invalid segnum",
6024+                            None,
6025+                            mr.get_block_and_salt, 7))
6026+        return d
6027+
6028+
6029+    def test_get_encoding_parameters_first(self):
6030+        self.write_test_share_to_server("si1")
6031+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6032+        d = mr.get_encoding_parameters()
6033+        def _check_encoding_parameters((k, n, segment_size, datalen)):
6034+            self.failUnlessEqual(k, 3)
6035+            self.failUnlessEqual(n, 10)
6036+            self.failUnlessEqual(segment_size, 6)
6037+            self.failUnlessEqual(datalen, 36)
6038+        d.addCallback(_check_encoding_parameters)
6039+        return d
6040+
6041+
6042+    def test_get_seqnum_first(self):
6043+        self.write_test_share_to_server("si1")
6044+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6045+        d = mr.get_seqnum()
6046+        d.addCallback(lambda seqnum:
6047+            self.failUnlessEqual(seqnum, 0))
6048+        return d
6049+
6050+
6051+    def test_get_root_hash_first(self):
6052+        self.write_test_share_to_server("si1")
6053+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6054+        d = mr.get_root_hash()
6055+        d.addCallback(lambda root_hash:
6056+            self.failUnlessEqual(root_hash, self.root_hash))
6057+        return d
6058+
6059+
6060+    def test_get_checkstring_first(self):
6061+        self.write_test_share_to_server("si1")
6062+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6063+        d = mr.get_checkstring()
6064+        d.addCallback(lambda checkstring:
6065+            self.failUnlessEqual(checkstring, self.checkstring))
6066+        return d
6067+
6068+
6069+    def test_write_read_vectors(self):
6070+        # When we write to it, the storage server returns a read
6071+        # vector along with the result of the write. If a write fails
6072+        # because its test vectors did not match, this read vector can
6073+        # help us to diagnose the problem. This test ensures that the
6074+        # read vector is working appropriately.
6075+        mw = self._make_new_mw("si1", 0)
6076+        d = defer.succeed(None)
6077+
6078+        # Write one share. This should return an empty checkstring,
6079+        # since there is no share data there yet.
6080+        d.addCallback(lambda ignored:
6081+            mw.put_block(self.block, 0, self.salt))
6082+        def _check_first_write(results):
6083+            result, readvs = results
6084+            self.failUnless(result)
6085+            self.failIf(readvs)
6086+        d.addCallback(_check_first_write)
6087+        # Now, there should be a different checkstring returned when
6088+        # we write other shares
6089+        d.addCallback(lambda ignored:
6090+            mw.put_block(self.block, 1, self.salt))
6091+        def _check_next_write(results):
6092+            result, readvs = results
6093+            self.failUnless(result)
6094+            self.expected_checkstring = mw.get_checkstring()
6095+            self.failUnlessIn(0, readvs)
6096+            self.failUnlessEqual(readvs[0][0], self.expected_checkstring)
6097+        d.addCallback(_check_next_write)
6098+        # Add the other four shares
6099+        for i in xrange(2, 6):
6100+            d.addCallback(lambda ignored, i=i:
6101+                mw.put_block(self.block, i, self.salt))
6102+            d.addCallback(_check_next_write)
6103+        # Add the encrypted private key
6104+        d.addCallback(lambda ignored:
6105+            mw.put_encprivkey(self.encprivkey))
6106+        d.addCallback(_check_next_write)
6107+        # Add the block hash tree and share hash tree
6108+        d.addCallback(lambda ignored:
6109+            mw.put_blockhashes(self.block_hash_tree))
6110+        d.addCallback(_check_next_write)
6111+        d.addCallback(lambda ignored:
6112+            mw.put_sharehashes(self.share_hash_chain))
6113+        d.addCallback(_check_next_write)
6114+        # Add the root hash. This should change the
6115+        # checkstring, but not in a way that we'll be able to see right
6116+        # now, since the read vectors are applied before the write
6117+        # vectors.
6118+        d.addCallback(lambda ignored:
6119+            mw.put_root_hash(self.root_hash))
6120+        def _check_old_testv_after_new_one_is_written(results):
6121+            result, readvs = results
6122+            self.failUnless(result)
6123+            self.failUnlessIn(0, readvs)
6124+            self.failUnlessEqual(self.expected_checkstring,
6125+                                 readvs[0][0])
6126+            new_checkstring = mw.get_checkstring()
6127+            self.failIfEqual(new_checkstring,
6128+                             readvs[0][0])
6129+        d.addCallback(_check_old_testv_after_new_one_is_written)
6130+        # Now add the signature. This should succeed, meaning that the
6131+        # data gets written and the read vector matches what the writer
6132+        # thinks should be there.
6133+        d.addCallback(lambda ignored:
6134+            mw.put_signature(self.signature))
6135+        d.addCallback(_check_next_write)
6136+        # The checkstring remains the same for the rest of the process.
6137+        return d
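    # (A note on the result shape relied on above, inferred from the
    # assertions rather than from any formal interface: each put_* call
    # is expected to fire with a (success, readvs) pair, where readvs
    # maps a share number to a list of strings read back by the
    # server's read vector -- here, readvs[0][0] is the checkstring
    # that share 0 advertised when the test vector was evaluated.)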
6138+
6139+
6140+    def test_blockhashes_after_share_hash_chain(self):
6141+        mw = self._make_new_mw("si1", 0)
6142+        d = defer.succeed(None)
6143+        # Put everything up to and including the share hash chain
6144+        for i in xrange(6):
6145+            d.addCallback(lambda ignored, i=i:
6146+                mw.put_block(self.block, i, self.salt))
6147+        d.addCallback(lambda ignored:
6148+            mw.put_encprivkey(self.encprivkey))
6149+        d.addCallback(lambda ignored:
6150+            mw.put_blockhashes(self.block_hash_tree))
6151+        d.addCallback(lambda ignored:
6152+            mw.put_sharehashes(self.share_hash_chain))
6153+
6154+        # Now try to put the block hash tree again.
6155+        d.addCallback(lambda ignored:
6156+            self.shouldFail(LayoutInvalid, "test repeat blockhashes",
6157+                            None,
6158+                            mw.put_blockhashes, self.block_hash_tree))
6159+        return d
6160+
6161+
6162+    def test_encprivkey_after_blockhashes(self):
6163+        mw = self._make_new_mw("si1", 0)
6164+        d = defer.succeed(None)
6165+        # Put everything up to and including the block hash tree
6166+        for i in xrange(6):
6167+            d.addCallback(lambda ignored, i=i:
6168+                mw.put_block(self.block, i, self.salt))
6169+        d.addCallback(lambda ignored:
6170+            mw.put_encprivkey(self.encprivkey))
6171+        d.addCallback(lambda ignored:
6172+            mw.put_blockhashes(self.block_hash_tree))
6173+        d.addCallback(lambda ignored:
6174+            self.shouldFail(LayoutInvalid, "out of order private key",
6175+                            None,
6176+                            mw.put_encprivkey, self.encprivkey))
6177+        return d
6178+
6179+
6180+    def test_share_hash_chain_after_signature(self):
6181+        mw = self._make_new_mw("si1", 0)
6182+        d = defer.succeed(None)
6183+        # Put everything up to and including the signature
6184+        for i in xrange(6):
6185+            d.addCallback(lambda ignored, i=i:
6186+                mw.put_block(self.block, i, self.salt))
6187+        d.addCallback(lambda ignored:
6188+            mw.put_encprivkey(self.encprivkey))
6189+        d.addCallback(lambda ignored:
6190+            mw.put_blockhashes(self.block_hash_tree))
6191+        d.addCallback(lambda ignored:
6192+            mw.put_sharehashes(self.share_hash_chain))
6193+        d.addCallback(lambda ignored:
6194+            mw.put_root_hash(self.root_hash))
6195+        d.addCallback(lambda ignored:
6196+            mw.put_signature(self.signature))
6197+        # Now try to put the share hash chain again. This should fail
6198+        d.addCallback(lambda ignored:
6199+            self.shouldFail(LayoutInvalid, "out of order share hash chain",
6200+                            None,
6201+                            mw.put_sharehashes, self.share_hash_chain))
6202+        return d
6203+
6204+
6205+    def test_signature_after_verification_key(self):
6206+        mw = self._make_new_mw("si1", 0)
6207+        d = defer.succeed(None)
6208+        # Put everything up to and including the verification key.
6209+        for i in xrange(6):
6210+            d.addCallback(lambda ignored, i=i:
6211+                mw.put_block(self.block, i, self.salt))
6212+        d.addCallback(lambda ignored:
6213+            mw.put_encprivkey(self.encprivkey))
6214+        d.addCallback(lambda ignored:
6215+            mw.put_blockhashes(self.block_hash_tree))
6216+        d.addCallback(lambda ignored:
6217+            mw.put_sharehashes(self.share_hash_chain))
6218+        d.addCallback(lambda ignored:
6219+            mw.put_root_hash(self.root_hash))
6220+        d.addCallback(lambda ignored:
6221+            mw.put_signature(self.signature))
6222+        d.addCallback(lambda ignored:
6223+            mw.put_verification_key(self.verification_key))
6224+        # Now try to put the signature again. This should fail
6225+        d.addCallback(lambda ignored:
6226+            self.shouldFail(LayoutInvalid, "signature after verification",
6227+                            None,
6228+                            mw.put_signature, self.signature))
6229+        return d
6230+
6231+
6232+    def test_uncoordinated_write(self):
6233+        # Make two mutable writers, both pointing to the same storage
6234+        # server, both at the same storage index, and try writing to the
6235+        # same share.
6236+        mw1 = self._make_new_mw("si1", 0)
6237+        mw2 = self._make_new_mw("si1", 0)
6238+        d = defer.succeed(None)
6239+        def _check_success(results):
6240+            result, readvs = results
6241+            self.failUnless(result)
6242+
6243+        def _check_failure(results):
6244+            result, readvs = results
6245+            self.failIf(result)
6246+
6247+        d.addCallback(lambda ignored:
6248+            mw1.put_block(self.block, 0, self.salt))
6249+        d.addCallback(_check_success)
6250+        d.addCallback(lambda ignored:
6251+            mw2.put_block(self.block, 0, self.salt))
6252+        d.addCallback(_check_failure)
6253+        return d
6254+
6255+
6256+    def test_invalid_salt_size(self):
6257+        # Salts need to be 16 bytes in size. Writes that attempt to
6258+        # write more or less than this should be rejected.
6259+        mw = self._make_new_mw("si1", 0)
6260+        invalid_salt = "a" * 17 # 17 bytes
6261+        another_invalid_salt = "b" * 15 # 15 bytes
6262+        d = defer.succeed(None)
6263+        d.addCallback(lambda ignored:
6264+            self.shouldFail(LayoutInvalid, "salt too big",
6265+                            None,
6266+                            mw.put_block, self.block, 0, invalid_salt))
6267+        d.addCallback(lambda ignored:
6268+            self.shouldFail(LayoutInvalid, "salt too small",
6269+                            None,
6270+                            mw.put_block, self.block, 0,
6271+                            another_invalid_salt))
6272+        return d
6273+
6274+
6275+    def test_write_test_vectors(self):
6276+        # If we give the write proxy a bogus test vector at
6277+        # any point during the process, it should fail to write.
6278+        mw = self._make_new_mw("si1", 0)
6279+        mw.set_checkstring("this is a lie")
6280+        # The initial write should be expecting to find the improbable
6281+        # checkstring above in place; finding nothing, it should fail.
6282+        d = defer.succeed(None)
6283+        d.addCallback(lambda ignored:
6284+            mw.put_block(self.block, 0, self.salt))
6285+        def _check_failure(results):
6286+            result, readv = results
6287+            self.failIf(result)
6288+        d.addCallback(_check_failure)
6289+        # Now set the checkstring to the empty string, which
6290+        # indicates that no share is there.
6291+        d.addCallback(lambda ignored:
6292+            mw.set_checkstring(""))
6293+        d.addCallback(lambda ignored:
6294+            mw.put_block(self.block, 0, self.salt))
6295+        def _check_success(results):
6296+            result, readv = results
6297+            self.failUnless(result)
6298+        d.addCallback(_check_success)
6299+        # Now set the checkstring to something wrong
6300+        d.addCallback(lambda ignored:
6301+            mw.set_checkstring("something wrong"))
6302+        # This should fail to do anything
6303+        d.addCallback(lambda ignored:
6304+            mw.put_block(self.block, 1, self.salt))
6305+        d.addCallback(_check_failure)
6306+        # Now set it back to what it should be.
6307+        d.addCallback(lambda ignored:
6308+            mw.set_checkstring(mw.get_checkstring()))
6309+        for i in xrange(1, 6):
6310+            d.addCallback(lambda ignored, i=i:
6311+                mw.put_block(self.block, i, self.salt))
6312+            d.addCallback(_check_success)
6313+        d.addCallback(lambda ignored:
6314+            mw.put_encprivkey(self.encprivkey))
6315+        d.addCallback(_check_success)
6316+        d.addCallback(lambda ignored:
6317+            mw.put_blockhashes(self.block_hash_tree))
6318+        d.addCallback(_check_success)
6319+        d.addCallback(lambda ignored:
6320+            mw.put_sharehashes(self.share_hash_chain))
6321+        d.addCallback(_check_success)
6322+        def _keep_old_checkstring(ignored):
6323+            self.old_checkstring = mw.get_checkstring()
6324+            mw.set_checkstring("foobarbaz")
6325+        d.addCallback(_keep_old_checkstring)
6326+        d.addCallback(lambda ignored:
6327+            mw.put_root_hash(self.root_hash))
6328+        d.addCallback(_check_failure)
6329+        d.addCallback(lambda ignored:
6330+            self.failUnlessEqual(self.old_checkstring, mw.get_checkstring()))
6331+        def _restore_old_checkstring(ignored):
6332+            mw.set_checkstring(self.old_checkstring)
6333+        d.addCallback(_restore_old_checkstring)
6334+        d.addCallback(lambda ignored:
6335+            mw.put_root_hash(self.root_hash))
6336+        d.addCallback(_check_success)
6337+        # The checkstring should have been set appropriately for us on
6338+        # the last write; if we try to change it to something else,
6339+        # that change should cause the next write (the signature) to fail.
6340+        d.addCallback(lambda ignored:
6341+            mw.set_checkstring("something else"))
6342+        d.addCallback(lambda ignored:
6343+            mw.put_signature(self.signature))
6344+        d.addCallback(_check_failure)
6345+        d.addCallback(lambda ignored:
6346+            mw.set_checkstring(mw.get_checkstring()))
6347+        d.addCallback(lambda ignored:
6348+            mw.put_signature(self.signature))
6349+        d.addCallback(_check_success)
6350+        d.addCallback(lambda ignored:
6351+            mw.put_verification_key(self.verification_key))
6352+        d.addCallback(_check_success)
6353+        return d
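    # (Summarizing the checkstring behaviour exercised above: an empty
    # checkstring ("") asserts that no share data exists yet; otherwise
    # the checkstring is compared against what the server reads back
    # from the slot before the write vectors are applied, and a
    # mismatch makes the write fail without modifying the share --
    # which is why the failed put_root_hash above leaves
    # mw.get_checkstring() unchanged.)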
6354+
6355+
6356+    def test_offset_only_set_on_success(self):
6357+        # The write proxy should be smart enough to detect when a write
6358+        # has failed, and should not advance its notion of which offsets
6359+        # have been successfully written when that happens.
6360+        mw = self._make_new_mw("si1", 0)
6361+        d = defer.succeed(None)
6362+        for i in xrange(1, 6):
6363+            d.addCallback(lambda ignored, i=i:
6364+                mw.put_block(self.block, i, self.salt))
6365+        def _break_checkstring(ignored):
6366+            self._old_checkstring = mw.get_checkstring()
6367+            mw.set_checkstring("foobarbaz")
6368+
6369+        def _fix_checkstring(ignored):
6370+            mw.set_checkstring(self._old_checkstring)
6371+
6372+        d.addCallback(_break_checkstring)
6373+
6374+        # Setting the encrypted private key shouldn't work now, which is
6375+        # to be expected and is tested elsewhere. We also want to make
6376+        # sure that we can't add the block hash tree after a failed
6377+        # write of this sort.
6378+        d.addCallback(lambda ignored:
6379+            mw.put_encprivkey(self.encprivkey))
6380+        d.addCallback(lambda ignored:
6381+            self.shouldFail(LayoutInvalid, "test out-of-order blockhashes",
6382+                            None,
6383+                            mw.put_blockhashes, self.block_hash_tree))
6384+        d.addCallback(_fix_checkstring)
6385+        d.addCallback(lambda ignored:
6386+            mw.put_encprivkey(self.encprivkey))
6387+        d.addCallback(_break_checkstring)
6388+        d.addCallback(lambda ignored:
6389+            mw.put_blockhashes(self.block_hash_tree))
6390+        d.addCallback(lambda ignored:
6391+            self.shouldFail(LayoutInvalid, "test out-of-order sharehashes",
6392+                            None,
6393+                            mw.put_sharehashes, self.share_hash_chain))
6394+        d.addCallback(_fix_checkstring)
6395+        d.addCallback(lambda ignored:
6396+            mw.put_blockhashes(self.block_hash_tree))
6397+        d.addCallback(_break_checkstring)
6398+        d.addCallback(lambda ignored:
6399+            mw.put_sharehashes(self.share_hash_chain))
6400+        d.addCallback(lambda ignored:
6401+            self.shouldFail(LayoutInvalid, "out-of-order root hash",
6402+                            None,
6403+                            mw.put_root_hash, self.root_hash))
6404+        d.addCallback(_fix_checkstring)
6405+        d.addCallback(lambda ignored:
6406+            mw.put_sharehashes(self.share_hash_chain))
6407+        d.addCallback(_break_checkstring)
6408+        d.addCallback(lambda ignored:
6409+            mw.put_root_hash(self.root_hash))
6410+        d.addCallback(lambda ignored:
6411+            self.shouldFail(LayoutInvalid, "out-of-order signature",
6412+                            None,
6413+                            mw.put_signature, self.signature))
6414+        d.addCallback(_fix_checkstring)
6415+        d.addCallback(lambda ignored:
6416+            mw.put_root_hash(self.root_hash))
6417+        d.addCallback(_break_checkstring)
6418+        d.addCallback(lambda ignored:
6419+            mw.put_signature(self.signature))
6420+        d.addCallback(lambda ignored:
6421+            self.shouldFail(LayoutInvalid, "out-of-order verification key",
6422+                            None,
6423+                            mw.put_verification_key,
6424+                            self.verification_key))
6425+        d.addCallback(_fix_checkstring)
6426+        d.addCallback(lambda ignored:
6427+            mw.put_signature(self.signature))
6428+        d.addCallback(_break_checkstring)
6429+        d.addCallback(lambda ignored:
6430+            mw.put_verification_key(self.verification_key))
6431+        d.addCallback(lambda ignored:
6432+            self.shouldFail(LayoutInvalid, "out-of-order finish",
6433+                            None,
6434+                            mw.finish_publishing))
6435+        return d
6436+
6437+
6438+    def serialize_blockhashes(self, blockhashes):
6439+        return "".join(blockhashes)
6440+
6441+
6442+    def serialize_sharehashes(self, sharehashes):
6443+        ret = "".join([struct.pack(">H32s", i, sharehashes[i])
6444+                        for i in sorted(sharehashes.keys())])
6445+        return ret
6446+
6447+
6448+    def test_write(self):
6449+        # This translates to a file with 6 6-byte segments, and with 2-byte
6450+        # blocks.
6451+        mw = self._make_new_mw("si1", 0)
6452+        mw2 = self._make_new_mw("si1", 1)
6453+        # Test writing some blocks.
6454+        read = self.ss.remote_slot_readv
6455+        expected_sharedata_offset = struct.calcsize(MDMFHEADER)
6456+        written_block_size = 2 + len(self.salt)
6457+        written_block = self.block + self.salt
6458+        def _check_block_write(i, share):
6459+            self.failUnlessEqual(read("si1", [share], [(expected_sharedata_offset + (i * written_block_size), written_block_size)]),
6460+                                {share: [written_block]})
6461+        d = defer.succeed(None)
6462+        for i in xrange(6):
6463+            d.addCallback(lambda ignored, i=i:
6464+                mw.put_block(self.block, i, self.salt))
6465+            d.addCallback(lambda ignored, i=i:
6466+                _check_block_write(i, 0))
6467+        # Now try the same thing, but with share 1 instead of share 0.
6468+        for i in xrange(6):
6469+            d.addCallback(lambda ignored, i=i:
6470+                mw2.put_block(self.block, i, self.salt))
6471+            d.addCallback(lambda ignored, i=i:
6472+                _check_block_write(i, 1))
6473+
6474+        # Next, we make a fake encrypted private key, and put it onto the
6475+        # storage server.
6476+        d.addCallback(lambda ignored:
6477+            mw.put_encprivkey(self.encprivkey))
6478+        expected_private_key_offset = expected_sharedata_offset + \
6479+                                      len(written_block) * 6
6480+        self.failUnlessEqual(len(self.encprivkey), 7)
6481+        d.addCallback(lambda ignored:
6482+            self.failUnlessEqual(read("si1", [0], [(expected_private_key_offset, 7)]),
6483+                                 {0: [self.encprivkey]}))
6484+
6485+        # Next, we put a fake block hash tree.
6486+        d.addCallback(lambda ignored:
6487+            mw.put_blockhashes(self.block_hash_tree))
6488+        expected_block_hash_offset = expected_private_key_offset + len(self.encprivkey)
6489+        self.failUnlessEqual(len(self.block_hash_tree_s), 32 * 6)
6490+        d.addCallback(lambda ignored:
6491+            self.failUnlessEqual(read("si1", [0], [(expected_block_hash_offset, 32 * 6)]),
6492+                                 {0: [self.block_hash_tree_s]}))
6493+
6494+        # Next, put a fake share hash chain
6495+        d.addCallback(lambda ignored:
6496+            mw.put_sharehashes(self.share_hash_chain))
6497+        expected_share_hash_offset = expected_block_hash_offset + len(self.block_hash_tree_s)
6498+        d.addCallback(lambda ignored:
6499+            self.failUnlessEqual(read("si1", [0],[(expected_share_hash_offset, (32 + 2) * 6)]),
6500+                                 {0: [self.share_hash_chain_s]}))
6501+
6502+        # Next, we put what is supposed to be the root hash of
6503+        # our share hash tree (really just fake test data).
6504+        d.addCallback(lambda ignored:
6505+            mw.put_root_hash(self.root_hash))
6506+        # The root hash gets inserted at byte 9 (its position is in the header,
6507+        # and is fixed).
6508+        def _check(ignored):
6509+            self.failUnlessEqual(read("si1", [0], [(9, 32)]),
6510+                                 {0: [self.root_hash]})
6511+        d.addCallback(_check)
6512+
6513+        # Next, we put a signature of the header block.
6514+        d.addCallback(lambda ignored:
6515+            mw.put_signature(self.signature))
6516+        expected_signature_offset = expected_share_hash_offset + len(self.share_hash_chain_s)
6517+        self.failUnlessEqual(len(self.signature), 9)
6518+        d.addCallback(lambda ignored:
6519+            self.failUnlessEqual(read("si1", [0], [(expected_signature_offset, 9)]),
6520+                                 {0: [self.signature]}))
6521+
6522+        # Next, we put the verification key
6523+        d.addCallback(lambda ignored:
6524+            mw.put_verification_key(self.verification_key))
6525+        expected_verification_key_offset = expected_signature_offset + len(self.signature)
6526+        self.failUnlessEqual(len(self.verification_key), 6)
6527+        d.addCallback(lambda ignored:
6528+            self.failUnlessEqual(read("si1", [0], [(expected_verification_key_offset, 6)]),
6529+                                 {0: [self.verification_key]}))
6530+
6531+        def _check_signable(ignored):
6532+            # Make sure that the signable is what we think it should be.
6533+            signable = mw.get_signable()
6534+            verno, seq, roothash, k, n, segsize, datalen = \
6535+                                            struct.unpack(">BQ32sBBQQ",
6536+                                                          signable)
6537+            self.failUnlessEqual(verno, 1)
6538+            self.failUnlessEqual(seq, 0)
6539+            self.failUnlessEqual(roothash, self.root_hash)
6540+            self.failUnlessEqual(k, 3)
6541+            self.failUnlessEqual(n, 10)
6542+            self.failUnlessEqual(segsize, 6)
6543+            self.failUnlessEqual(datalen, 36)
6544+        d.addCallback(_check_signable)
6545+        # Next, we cause the offset table to be published.
6546+        d.addCallback(lambda ignored:
6547+            mw.finish_publishing())
6548+        expected_eof_offset = expected_verification_key_offset + len(self.verification_key)
6549+
6550+        def _check_offsets(ignored):
6551+            # Check the version number to make sure that it is correct.
6552+            expected_version_number = struct.pack(">B", 1)
6553+            self.failUnlessEqual(read("si1", [0], [(0, 1)]),
6554+                                 {0: [expected_version_number]})
6555+            # Check the sequence number to make sure that it is correct
6556+            expected_sequence_number = struct.pack(">Q", 0)
6557+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
6558+                                 {0: [expected_sequence_number]})
6559+            # Check that the encoding parameters (k, N, segment size, data
6560+            # length) are what they should be. These are 3, 10, 6, 36.
6561+            expected_k = struct.pack(">B", 3)
6562+            self.failUnlessEqual(read("si1", [0], [(41, 1)]),
6563+                                 {0: [expected_k]})
6564+            expected_n = struct.pack(">B", 10)
6565+            self.failUnlessEqual(read("si1", [0], [(42, 1)]),
6566+                                 {0: [expected_n]})
6567+            expected_segment_size = struct.pack(">Q", 6)
6568+            self.failUnlessEqual(read("si1", [0], [(43, 8)]),
6569+                                 {0: [expected_segment_size]})
6570+            expected_data_length = struct.pack(">Q", 36)
6571+            self.failUnlessEqual(read("si1", [0], [(51, 8)]),
6572+                                 {0: [expected_data_length]})
6573+            expected_offset = struct.pack(">Q", expected_private_key_offset)
6574+            self.failUnlessEqual(read("si1", [0], [(59, 8)]),
6575+                                 {0: [expected_offset]})
6576+            expected_offset = struct.pack(">Q", expected_block_hash_offset)
6577+            self.failUnlessEqual(read("si1", [0], [(67, 8)]),
6578+                                 {0: [expected_offset]})
6579+            expected_offset = struct.pack(">Q", expected_share_hash_offset)
6580+            self.failUnlessEqual(read("si1", [0], [(75, 8)]),
6581+                                 {0: [expected_offset]})
6582+            expected_offset = struct.pack(">Q", expected_signature_offset)
6583+            self.failUnlessEqual(read("si1", [0], [(83, 8)]),
6584+                                 {0: [expected_offset]})
6585+            expected_offset = struct.pack(">Q", expected_verification_key_offset)
6586+            self.failUnlessEqual(read("si1", [0], [(91, 8)]),
6587+                                 {0: [expected_offset]})
6588+            expected_offset = struct.pack(">Q", expected_eof_offset)
6589+            self.failUnlessEqual(read("si1", [0], [(99, 8)]),
6590+                                 {0: [expected_offset]})
6591+        d.addCallback(_check_offsets)
6592+        return d
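    # (For reference, the header layout implied by the offsets checked
    # above: version number (1 byte) at offset 0, sequence number (8
    # bytes) at 1, root hash (32 bytes) at 9, k and N (1 byte each) at
    # 41 and 42, segment size (8 bytes) at 43, data length (8 bytes) at
    # 51, and six 8-byte offset-table entries at 59 through 99; share
    # data then begins at struct.calcsize(MDMFHEADER).)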
6593+
6594+    def _make_new_mw(self, si, share, datalength=36):
6595+        # This is a file of size 36 bytes. Since it has a segment
6596+        # size of 6, we know that it has 6 byte segments, which will
6597+        # be split into blocks of 2 bytes because our FEC k
6598+        # parameter is 3.
6599+        mw = MDMFSlotWriteProxy(share, self.rref, si, self.secrets, 0, 3, 10,
6600+                                6, datalength)
6601+        return mw
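    # (The arithmetic assumed by _make_new_mw above: 36 bytes of data
    # with a 6-byte segment size gives 36 / 6 = 6 segments, and
    # FEC-encoding each 6-byte segment with k = 3 yields 6 / 3 = 2-byte
    # blocks, which is why these tests write six two-byte blocks per
    # share.)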
6602+
6603+
6604+    def test_write_rejected_with_too_many_blocks(self):
6605+        mw = self._make_new_mw("si0", 0)
6606+
6607+        # Try writing too many blocks. We should not be able to write
6608+        # more than 6 blocks into each share, since the file has only
6609+        # 6 segments.
6610+        d = defer.succeed(None)
6611+        for i in xrange(6):
6612+            d.addCallback(lambda ignored, i=i:
6613+                mw.put_block(self.block, i, self.salt))
6614+        d.addCallback(lambda ignored:
6615+            self.shouldFail(LayoutInvalid, "too many blocks",
6616+                            None,
6617+                            mw.put_block, self.block, 7, self.salt))
6618+        return d
6619+
6620+
6621+    def test_write_rejected_with_invalid_salt(self):
6622+        # Try writing an invalid salt. Salts are 16 bytes -- any more or
6623+        # less should cause an error.
6624+        mw = self._make_new_mw("si1", 0)
6625+        bad_salt = "a" * 17 # 17 bytes
6626+        d = defer.succeed(None)
6627+        d.addCallback(lambda ignored:
6628+            self.shouldFail(LayoutInvalid, "test_invalid_salt",
6629+                            None, mw.put_block, self.block, 7, bad_salt))
6630+        return d
6631+
6632+
6633+    def test_write_rejected_with_invalid_root_hash(self):
6634+        # Try writing an invalid root hash. This should be SHA256d, and
6635+        # 32 bytes long as a result.
6636+        mw = self._make_new_mw("si2", 0)
6637+        # 17 bytes != 32 bytes
6638+        invalid_root_hash = "a" * 17
6639+        d = defer.succeed(None)
6640+        # Before this test can work, we need to put some blocks + salts,
6641+        # a block hash tree, and a share hash tree. Otherwise, we'll see
6642+        # failures that match what we are looking for, but are caused by
6643+        # the constraints imposed on operation ordering.
6644+        for i in xrange(6):
6645+            d.addCallback(lambda ignored, i=i:
6646+                mw.put_block(self.block, i, self.salt))
6647+        d.addCallback(lambda ignored:
6648+            mw.put_encprivkey(self.encprivkey))
6649+        d.addCallback(lambda ignored:
6650+            mw.put_blockhashes(self.block_hash_tree))
6651+        d.addCallback(lambda ignored:
6652+            mw.put_sharehashes(self.share_hash_chain))
6653+        d.addCallback(lambda ignored:
6654+            self.shouldFail(LayoutInvalid, "invalid root hash",
6655+                            None, mw.put_root_hash, invalid_root_hash))
6656+        return d
6657+
6658+
6659+    def test_write_rejected_with_invalid_blocksize(self):
6660+        # The blocksize implied by the writer that we get from
6661+        # _make_new_mw is 2 bytes -- any more or any less than this
6662+        # should cause a failure, unless it is the tail segment, in
6663+        # which case a smaller block may be acceptable.
6664+        invalid_block = "a"
6665+        mw = self._make_new_mw("si3", 0, 33) # implies a tail segment with
6666+                                             # one byte blocks
6667+        # 1 bytes != 2 bytes
6668+        d = defer.succeed(None)
6669+        d.addCallback(lambda ignored, invalid_block=invalid_block:
6670+            self.shouldFail(LayoutInvalid, "test blocksize too small",
6671+                            None, mw.put_block, invalid_block, 0,
6672+                            self.salt))
6673+        invalid_block = invalid_block * 3
6674+        # 3 bytes != 2 bytes
6675+        d.addCallback(lambda ignored:
6676+            self.shouldFail(LayoutInvalid, "test blocksize too large",
6677+                            None,
6678+                            mw.put_block, invalid_block, 0, self.salt))
6679+        for i in xrange(5):
6680+            d.addCallback(lambda ignored, i=i:
6681+                mw.put_block(self.block, i, self.salt))
6682+        # Try to put an invalid tail segment
6683+        d.addCallback(lambda ignored:
6684+            self.shouldFail(LayoutInvalid, "test invalid tail segment",
6685+                            None,
6686+                            mw.put_block, self.block, 5, self.salt))
6687+        valid_block = "a"
6688+        d.addCallback(lambda ignored:
6689+            mw.put_block(valid_block, 5, self.salt))
6690+        return d
6691+
6692+
6693+    def test_write_enforces_order_constraints(self):
6694+        # We require that the MDMFSlotWriteProxy be interacted with in a
6695+        # specific way.
6696+        # That way is:
6697+        # 0: __init__
6698+        # 1: write blocks and salts
6699+        # 2: Write the encrypted private key
6700+        # 3: Write the block hashes
6701+        # 4: Write the share hashes
6702+        # 5: Write the root hash
6703+        # 6: Write the signature and verification key
6704+        # 7: Finish publishing (write the offset table)
6705+        #
6706+        # Some of these can be performed out-of-order, and some can't.
6707+        # The dependencies that I want to test here are:
6708+        #  - Private key before block hashes
6709+        #  - share hashes and block hashes before root hash
6710+        #  - root hash before signature
6711+        #  - signature before verification key
6712+        mw0 = self._make_new_mw("si0", 0)
6713+        # Write some shares
6714+        d = defer.succeed(None)
6715+        for i in xrange(6):
6716+            d.addCallback(lambda ignored, i=i:
6717+                mw0.put_block(self.block, i, self.salt))
6718+        # Try to write the block hashes before writing the encrypted
6719+        # private key
6720+        d.addCallback(lambda ignored:
6721+            self.shouldFail(LayoutInvalid, "block hashes before key",
6722+                            None, mw0.put_blockhashes,
6723+                            self.block_hash_tree))
6724+
6725+        # Write the private key.
6726+        d.addCallback(lambda ignored:
6727+            mw0.put_encprivkey(self.encprivkey))
6728+
6729+
6730+        # Try to write the share hash chain without writing the block
6731+        # hash tree
6732+        d.addCallback(lambda ignored:
6733+            self.shouldFail(LayoutInvalid, "share hash chain before "
6734+                                           "block hash tree",
6735+                            None,
6736+                            mw0.put_sharehashes, self.share_hash_chain))
6737+
6738+        # Try to write the root hash without having written either the
6739+        # block hashes or the share hashes
6740+        d.addCallback(lambda ignored:
6741+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6742+                            None,
6743+                            mw0.put_root_hash, self.root_hash))
6744+
6745+        # Now write the block hashes and try again
6746+        d.addCallback(lambda ignored:
6747+            mw0.put_blockhashes(self.block_hash_tree))
6748+
6749+        d.addCallback(lambda ignored:
6750+            self.shouldFail(LayoutInvalid, "root hash before share hashes",
6751+                            None, mw0.put_root_hash, self.root_hash))
6752+
6753+        # We haven't yet put the root hash on the share, so we shouldn't
6754+        # be able to sign it.
6755+        d.addCallback(lambda ignored:
6756+            self.shouldFail(LayoutInvalid, "signature before root hash",
6757+                            None, mw0.put_signature, self.signature))
6758+
6759+        d.addCallback(lambda ignored:
6760+            self.failUnlessRaises(LayoutInvalid, mw0.get_signable))
6761+
6762+        # ...and, since that fails, we also shouldn't be able to put the
6763+        # verification key.
6764+        d.addCallback(lambda ignored:
6765+            self.shouldFail(LayoutInvalid, "key before signature",
6766+                            None, mw0.put_verification_key,
6767+                            self.verification_key))
6768+
6769+        # Now write the share hashes.
6770+        d.addCallback(lambda ignored:
6771+            mw0.put_sharehashes(self.share_hash_chain))
6772+        # We should be able to write the root hash now too
6773+        d.addCallback(lambda ignored:
6774+            mw0.put_root_hash(self.root_hash))
6775+
6776+        # We should still be unable to put the verification key
6777+        d.addCallback(lambda ignored:
6778+            self.shouldFail(LayoutInvalid, "key before signature",
6779+                            None, mw0.put_verification_key,
6780+                            self.verification_key))
6781+
6782+        d.addCallback(lambda ignored:
6783+            mw0.put_signature(self.signature))
6784+
6785+        # We shouldn't be able to write the offsets to the remote server
6786+        # until the offset table is finished; IOW, until we have written
6787+        # the verification key.
6788+        d.addCallback(lambda ignored:
6789+            self.shouldFail(LayoutInvalid, "offsets before verification key",
6790+                            None,
6791+                            mw0.finish_publishing))
6792+
6793+        d.addCallback(lambda ignored:
6794+            mw0.put_verification_key(self.verification_key))
6795+        return d
6796+
6797+
6798+    def test_end_to_end(self):
6799+        mw = self._make_new_mw("si1", 0)
6800+        # Write a share using the mutable writer, and make sure that the
6801+        # reader knows how to read everything back to us.
6802+        d = defer.succeed(None)
6803+        for i in xrange(6):
6804+            d.addCallback(lambda ignored, i=i:
6805+                mw.put_block(self.block, i, self.salt))
6806+        d.addCallback(lambda ignored:
6807+            mw.put_encprivkey(self.encprivkey))
6808+        d.addCallback(lambda ignored:
6809+            mw.put_blockhashes(self.block_hash_tree))
6810+        d.addCallback(lambda ignored:
6811+            mw.put_sharehashes(self.share_hash_chain))
6812+        d.addCallback(lambda ignored:
6813+            mw.put_root_hash(self.root_hash))
6814+        d.addCallback(lambda ignored:
6815+            mw.put_signature(self.signature))
6816+        d.addCallback(lambda ignored:
6817+            mw.put_verification_key(self.verification_key))
6818+        d.addCallback(lambda ignored:
6819+            mw.finish_publishing())
6820+
6821+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6822+        def _check_block_and_salt((block, salt)):
6823+            self.failUnlessEqual(block, self.block)
6824+            self.failUnlessEqual(salt, self.salt)
6825+
6826+        for i in xrange(6):
6827+            d.addCallback(lambda ignored, i=i:
6828+                mr.get_block_and_salt(i))
6829+            d.addCallback(_check_block_and_salt)
6830+
6831+        d.addCallback(lambda ignored:
6832+            mr.get_encprivkey())
6833+        d.addCallback(lambda encprivkey:
6834+            self.failUnlessEqual(self.encprivkey, encprivkey))
6835+
6836+        d.addCallback(lambda ignored:
6837+            mr.get_blockhashes())
6838+        d.addCallback(lambda blockhashes:
6839+            self.failUnlessEqual(self.block_hash_tree, blockhashes))
6840+
6841+        d.addCallback(lambda ignored:
6842+            mr.get_sharehashes())
6843+        d.addCallback(lambda sharehashes:
6844+            self.failUnlessEqual(self.share_hash_chain, sharehashes))
6845+
6846+        d.addCallback(lambda ignored:
6847+            mr.get_signature())
6848+        d.addCallback(lambda signature:
6849+            self.failUnlessEqual(signature, self.signature))
6850+
6851+        d.addCallback(lambda ignored:
6852+            mr.get_verification_key())
6853+        d.addCallback(lambda verification_key:
6854+            self.failUnlessEqual(verification_key, self.verification_key))
6855+
6856+        d.addCallback(lambda ignored:
6857+            mr.get_seqnum())
6858+        d.addCallback(lambda seqnum:
6859+            self.failUnlessEqual(seqnum, 0))
6860+
6861+        d.addCallback(lambda ignored:
6862+            mr.get_root_hash())
6863+        d.addCallback(lambda root_hash:
6864+            self.failUnlessEqual(self.root_hash, root_hash))
6865+
6866+        d.addCallback(lambda ignored:
6867+            mr.get_encoding_parameters())
6868+        def _check_encoding_parameters((k, n, segsize, datalen)):
6869+            self.failUnlessEqual(k, 3)
6870+            self.failUnlessEqual(n, 10)
6871+            self.failUnlessEqual(segsize, 6)
6872+            self.failUnlessEqual(datalen, 36)
6873+        d.addCallback(_check_encoding_parameters)
6874+
6875+        d.addCallback(lambda ignored:
6876+            mr.get_checkstring())
6877+        d.addCallback(lambda checkstring:
6878+            self.failUnlessEqual(checkstring, mw.get_checkstring()))
6879+        return d
6880+
6881+
6882+    def test_is_sdmf(self):
6883+        # The MDMFSlotReadProxy should also know how to read SDMF files,
6884+        # since it will encounter them on the grid. Callers use the
6885+        # is_sdmf method to test this.
6886+        self.write_sdmf_share_to_server("si1")
6887+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6888+        d = mr.is_sdmf()
6889+        d.addCallback(lambda issdmf:
6890+            self.failUnless(issdmf))
6891+        return d
6892+
6893+
6894+    def test_reads_sdmf(self):
6895+        # The slot read proxy should, naturally, know how to tell us
6896+        # about data in the SDMF format
6897+        self.write_sdmf_share_to_server("si1")
6898+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6899+        d = defer.succeed(None)
6900+        d.addCallback(lambda ignored:
6901+            mr.is_sdmf())
6902+        d.addCallback(lambda issdmf:
6903+            self.failUnless(issdmf))
6904+
6905+        # What do we need to read?
6906+        #  - The sharedata
6907+        #  - The salt
6908+        d.addCallback(lambda ignored:
6909+            mr.get_block_and_salt(0))
6910+        def _check_block_and_salt(results):
6911+            block, salt = results
6912+            # Our original file is 36 bytes long, so each share is 12
6913+            # bytes in size (36 / k, with k = 3). The share is composed
6914+            # entirely of the letter a. self.block contains two a's, so
6915+            # 6 * self.block is what we are looking for.
6916+            self.failUnlessEqual(block, self.block * 6)
6917+            self.failUnlessEqual(salt, self.salt)
6918+        d.addCallback(_check_block_and_salt)
6919+
6920+        #  - The blockhashes
6921+        d.addCallback(lambda ignored:
6922+            mr.get_blockhashes())
6923+        d.addCallback(lambda blockhashes:
6924+            self.failUnlessEqual(self.block_hash_tree,
6925+                                 blockhashes,
6926+                                 blockhashes))
6927+        #  - The sharehashes
6928+        d.addCallback(lambda ignored:
6929+            mr.get_sharehashes())
6930+        d.addCallback(lambda sharehashes:
6931+            self.failUnlessEqual(self.share_hash_chain,
6932+                                 sharehashes))
6933+        #  - The keys
6934+        d.addCallback(lambda ignored:
6935+            mr.get_encprivkey())
6936+        d.addCallback(lambda encprivkey:
6937+            self.failUnlessEqual(encprivkey, self.encprivkey, encprivkey))
6938+        d.addCallback(lambda ignored:
6939+            mr.get_verification_key())
6940+        d.addCallback(lambda verification_key:
6941+            self.failUnlessEqual(verification_key,
6942+                                 self.verification_key,
6943+                                 verification_key))
6944+        #  - The signature
6945+        d.addCallback(lambda ignored:
6946+            mr.get_signature())
6947+        d.addCallback(lambda signature:
6948+            self.failUnlessEqual(signature, self.signature, signature))
6949+
6950+        #  - The sequence number
6951+        d.addCallback(lambda ignored:
6952+            mr.get_seqnum())
6953+        d.addCallback(lambda seqnum:
6954+            self.failUnlessEqual(seqnum, 0, seqnum))
6955+
6956+        #  - The root hash
6957+        d.addCallback(lambda ignored:
6958+            mr.get_root_hash())
6959+        d.addCallback(lambda root_hash:
6960+            self.failUnlessEqual(root_hash, self.root_hash, root_hash))
6961+        return d
6962+
6963+
6964+    def test_only_reads_one_segment_sdmf(self):
6965+        # SDMF shares have only one segment, so it doesn't make sense to
6966+        # read more segments than that. The reader should know this and
6967+        # complain if we try to do that.
6968+        self.write_sdmf_share_to_server("si1")
6969+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
6970+        d = defer.succeed(None)
6971+        d.addCallback(lambda ignored:
6972+            mr.is_sdmf())
6973+        d.addCallback(lambda issdmf:
6974+            self.failUnless(issdmf))
6975+        d.addCallback(lambda ignored:
6976+            self.shouldFail(LayoutInvalid, "test bad segment",
6977+                            None,
6978+                            mr.get_block_and_salt, 1))
6979+        return d
6980+
6981+
6982+    def test_read_with_prefetched_mdmf_data(self):
6983+        # The MDMFSlotReadProxy will prefill certain fields if you pass
6984+        # it data that you have already fetched. This is useful for
6985+        # cases like the Servermap, which prefetches ~2kb of data while
6986+        # finding out which shares are on the remote peer so that it
6987+        # doesn't waste round trips.
6988+        mdmf_data = self.build_test_mdmf_share()
6989+        self.write_test_share_to_server("si1")
6990+        def _make_mr(ignored, length):
6991+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, mdmf_data[:length])
6992+            return mr
6993+
6994+        d = defer.succeed(None)
6995+        # This should be enough to fill in both the encoding parameters
6996+        # and the table of offsets, which will complete the version
6997+        # information tuple.
6998+        d.addCallback(_make_mr, 107)
6999+        d.addCallback(lambda mr:
7000+            mr.get_verinfo())
7001+        def _check_verinfo(verinfo):
7002+            self.failUnless(verinfo)
7003+            self.failUnlessEqual(len(verinfo), 9)
7004+            (seqnum,
7005+             root_hash,
7006+             salt_hash,
7007+             segsize,
7008+             datalen,
7009+             k,
7010+             n,
7011+             prefix,
7012+             offsets) = verinfo
7013+            self.failUnlessEqual(seqnum, 0)
7014+            self.failUnlessEqual(root_hash, self.root_hash)
7015+            self.failUnlessEqual(segsize, 6)
7016+            self.failUnlessEqual(datalen, 36)
7017+            self.failUnlessEqual(k, 3)
7018+            self.failUnlessEqual(n, 10)
7019+            expected_prefix = struct.pack(MDMFSIGNABLEHEADER,
7020+                                          1,
7021+                                          seqnum,
7022+                                          root_hash,
7023+                                          k,
7024+                                          n,
7025+                                          segsize,
7026+                                          datalen)
7027+            self.failUnlessEqual(expected_prefix, prefix)
7028+            self.failUnlessEqual(self.rref.read_count, 0)
7029+        d.addCallback(_check_verinfo)
7030+        # This is not enough data to read a block and a share, so the
7031+        # wrapper should attempt to read this from the remote server.
7032+        d.addCallback(_make_mr, 107)
7033+        d.addCallback(lambda mr:
7034+            mr.get_block_and_salt(0))
7035+        def _check_block_and_salt((block, salt)):
7036+            self.failUnlessEqual(block, self.block)
7037+            self.failUnlessEqual(salt, self.salt)
7038+            self.failUnlessEqual(self.rref.read_count, 1)
7039+        # This should be enough data to read one block.
7040+        d.addCallback(_make_mr, 249)
7041+        d.addCallback(lambda mr:
7042+            mr.get_block_and_salt(0))
7043+        d.addCallback(_check_block_and_salt)
7044+        return d
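    # (Why 107 bytes is enough above: the signed header fields occupy
    # the first 1 + 8 + 32 + 1 + 1 + 8 + 8 = 59 bytes, and the six
    # 8-byte offset-table entries that follow end at 59 + 6 * 8 = 107.
    # A 107-byte prefix therefore carries the encoding parameters and
    # the offset table but no share data, which is why
    # get_block_and_salt(0) with only 107 prefetched bytes still has to
    # read from the server.)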
7045+
7046+
7047+    def test_read_with_prefetched_sdmf_data(self):
7048+        sdmf_data = self.build_test_sdmf_share()
7049+        self.write_sdmf_share_to_server("si1")
7050+        def _make_mr(ignored, length):
7051+            mr = MDMFSlotReadProxy(self.rref, "si1", 0, sdmf_data[:length])
7052+            return mr
7053+
7054+        d = defer.succeed(None)
7055+        # This should be enough to get us the encoding parameters,
7056+        # offset table, and everything else we need to build a verinfo
7057+        # string.
7058+        d.addCallback(_make_mr, 107)
7059+        d.addCallback(lambda mr:
7060+            mr.get_verinfo())
7061+        def _check_verinfo(verinfo):
7062+            self.failUnless(verinfo)
7063+            self.failUnlessEqual(len(verinfo), 9)
7064+            (seqnum,
7065+             root_hash,
7066+             salt,
7067+             segsize,
7068+             datalen,
7069+             k,
7070+             n,
7071+             prefix,
7072+             offsets) = verinfo
7073+            self.failUnlessEqual(seqnum, 0)
7074+            self.failUnlessEqual(root_hash, self.root_hash)
7075+            self.failUnlessEqual(salt, self.salt)
7076+            self.failUnlessEqual(segsize, 36)
7077+            self.failUnlessEqual(datalen, 36)
7078+            self.failUnlessEqual(k, 3)
7079+            self.failUnlessEqual(n, 10)
7080+            expected_prefix = struct.pack(SIGNED_PREFIX,
7081+                                          0,
7082+                                          seqnum,
7083+                                          root_hash,
7084+                                          salt,
7085+                                          k,
7086+                                          n,
7087+                                          segsize,
7088+                                          datalen)
7089+            self.failUnlessEqual(expected_prefix, prefix)
7090+            self.failUnlessEqual(self.rref.read_count, 0)
7091+        d.addCallback(_check_verinfo)
7092+        # This shouldn't be enough to read any share data.
7093+        d.addCallback(_make_mr, 107)
7094+        d.addCallback(lambda mr:
7095+            mr.get_block_and_salt(0))
7096+        def _check_block_and_salt((block, salt)):
7097+            self.failUnlessEqual(block, self.block * 6)
7098+            self.failUnlessEqual(salt, self.salt)
7099+            # TODO: Fix the read routine so that it reads only the data
7100+            #       that it has cached if it can't read all of it.
7101+            self.failUnlessEqual(self.rref.read_count, 2)
7102+
7103+        # This should be enough to read share data.
7104+        d.addCallback(_make_mr, self.offsets['share_data'])
7105+        d.addCallback(lambda mr:
7106+            mr.get_block_and_salt(0))
7107+        d.addCallback(_check_block_and_salt)
7108+        return d
7109+
7110+
7111+    def test_read_with_empty_mdmf_file(self):
7112+        # Some tests upload a file with no contents to test things
7113+        # unrelated to the actual handling of the content of the file.
7114+        # The reader should behave intelligently in these cases.
7115+        self.write_test_share_to_server("si1", empty=True)
7116+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7117+        # We should be able to get the encoding parameters, and they
7118+        # should be correct.
7119+        d = defer.succeed(None)
7120+        d.addCallback(lambda ignored:
7121+            mr.get_encoding_parameters())
7122+        def _check_encoding_parameters(params):
7123+            self.failUnlessEqual(len(params), 4)
7124+            k, n, segsize, datalen = params
7125+            self.failUnlessEqual(k, 3)
7126+            self.failUnlessEqual(n, 10)
7127+            self.failUnlessEqual(segsize, 0)
7128+            self.failUnlessEqual(datalen, 0)
7129+        d.addCallback(_check_encoding_parameters)
7130+
7131+        # We should not be able to fetch a block, since there are no
7132+        # blocks to fetch
7133+        d.addCallback(lambda ignored:
7134+            self.shouldFail(LayoutInvalid, "get block on empty file",
7135+                            None,
7136+                            mr.get_block_and_salt, 0))
7137+        return d
7138+
7139+
7140+    def test_read_with_empty_sdmf_file(self):
7141+        self.write_sdmf_share_to_server("si1", empty=True)
7142+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7143+        # We should be able to get the encoding parameters, and they
7144+        # should be correct
7145+        d = defer.succeed(None)
7146+        d.addCallback(lambda ignored:
7147+            mr.get_encoding_parameters())
7148+        def _check_encoding_parameters(params):
7149+            self.failUnlessEqual(len(params), 4)
7150+            k, n, segsize, datalen = params
7151+            self.failUnlessEqual(k, 3)
7152+            self.failUnlessEqual(n, 10)
7153+            self.failUnlessEqual(segsize, 0)
7154+            self.failUnlessEqual(datalen, 0)
7155+        d.addCallback(_check_encoding_parameters)
7156+
7157+        # It does not make sense to get a block in this format, so we
7158+        # should not be able to.
7159+        d.addCallback(lambda ignored:
7160+            self.shouldFail(LayoutInvalid, "get block on an empty file",
7161+                            None,
7162+                            mr.get_block_and_salt, 0))
7163+        return d
7164+
7165+
7166+    def test_verinfo_with_sdmf_file(self):
7167+        self.write_sdmf_share_to_server("si1")
7168+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7169+        # We should be able to get the version information.
7170+        d = defer.succeed(None)
7171+        d.addCallback(lambda ignored:
7172+            mr.get_verinfo())
7173+        def _check_verinfo(verinfo):
7174+            self.failUnless(verinfo)
7175+            self.failUnlessEqual(len(verinfo), 9)
7176+            (seqnum,
7177+             root_hash,
7178+             salt,
7179+             segsize,
7180+             datalen,
7181+             k,
7182+             n,
7183+             prefix,
7184+             offsets) = verinfo
7185+            self.failUnlessEqual(seqnum, 0)
7186+            self.failUnlessEqual(root_hash, self.root_hash)
7187+            self.failUnlessEqual(salt, self.salt)
7188+            self.failUnlessEqual(segsize, 36)
7189+            self.failUnlessEqual(datalen, 36)
7190+            self.failUnlessEqual(k, 3)
7191+            self.failUnlessEqual(n, 10)
7192+            expected_prefix = struct.pack(">BQ32s16s BBQQ",
7193+                                          0,
7194+                                          seqnum,
7195+                                          root_hash,
7196+                                          salt,
7197+                                          k,
7198+                                          n,
7199+                                          segsize,
7200+                                          datalen)
7201+            self.failUnlessEqual(prefix, expected_prefix)
7202+            self.failUnlessEqual(offsets, self.offsets)
7203+        d.addCallback(_check_verinfo)
7204+        return d
7205+
7206+
7207+    def test_verinfo_with_mdmf_file(self):
7208+        self.write_test_share_to_server("si1")
7209+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7210+        d = defer.succeed(None)
7211+        d.addCallback(lambda ignored:
7212+            mr.get_verinfo())
7213+        def _check_verinfo(verinfo):
7214+            self.failUnless(verinfo)
7215+            self.failUnlessEqual(len(verinfo), 9)
7216+            (seqnum,
7217+             root_hash,
7218+             IV,
7219+             segsize,
7220+             datalen,
7221+             k,
7222+             n,
7223+             prefix,
7224+             offsets) = verinfo
7225+            self.failUnlessEqual(seqnum, 0)
7226+            self.failUnlessEqual(root_hash, self.root_hash)
7227+            self.failIf(IV)
7228+            self.failUnlessEqual(segsize, 6)
7229+            self.failUnlessEqual(datalen, 36)
7230+            self.failUnlessEqual(k, 3)
7231+            self.failUnlessEqual(n, 10)
7232+            expected_prefix = struct.pack(">BQ32s BBQQ",
7233+                                          1,
7234+                                          seqnum,
7235+                                          root_hash,
7236+                                          k,
7237+                                          n,
7238+                                          segsize,
7239+                                          datalen)
7240+            self.failUnlessEqual(prefix, expected_prefix)
7241+            self.failUnlessEqual(offsets, self.offsets)
7242+        d.addCallback(_check_verinfo)
7243+        return d
7244+
7245+
7246+    def test_reader_queue(self):
7247+        self.write_test_share_to_server('si1')
7248+        mr = MDMFSlotReadProxy(self.rref, "si1", 0)
7249+        d1 = mr.get_block_and_salt(0, queue=True)
7250+        d2 = mr.get_blockhashes(queue=True)
7251+        d3 = mr.get_sharehashes(queue=True)
7252+        d4 = mr.get_signature(queue=True)
7253+        d5 = mr.get_verification_key(queue=True)
7254+        dl = defer.DeferredList([d1, d2, d3, d4, d5])
7255+        mr.flush()
7256+        def _print(results):
7257+            self.failUnlessEqual(len(results), 5)
7258+            # We have one read for version information and offsets, and
7259+            # one for everything else.
7260+            self.failUnlessEqual(self.rref.read_count, 2)
7261+            block, salt = results[0][1] # each results[N] is a (success,
7262+                                           # value) pair, so results[0][1]
7263+                                           # is the (block, salt) tuple.
7264+            self.failUnlessEqual(self.block, block)
7265+            self.failUnlessEqual(self.salt, salt)
7266+
7267+            blockhashes = results[1][1]
7268+            self.failUnlessEqual(self.block_hash_tree, blockhashes)
7269+
7270+            sharehashes = results[2][1]
7271+            self.failUnlessEqual(self.share_hash_chain, sharehashes)
7272+
7273+            signature = results[3][1]
7274+            self.failUnlessEqual(self.signature, signature)
7275+
7276+            verification_key = results[4][1]
7277+            self.failUnlessEqual(self.verification_key, verification_key)
7278+        dl.addCallback(_print)
7279+        return dl
7280+
7281+
7282+    def test_sdmf_writer(self):
7283+        # Go through the motions of writing an SDMF share to the storage
7284+        # server. Then read the storage server to see that the share got
7285+        # written in the way that we think it should have.
7286+
7287+        # We do this first so that the necessary instance variables get
7288+        # set the way we want them for the tests below.
7289+        data = self.build_test_sdmf_share()
7290+        sdmfr = SDMFSlotWriteProxy(0,
7291+                                   self.rref,
7292+                                   "si1",
7293+                                   self.secrets,
7294+                                   0, 3, 10, 36, 36)
7295+        # Put the block and salt.
7296+        sdmfr.put_block(self.blockdata, 0, self.salt)
7297+
7298+        # Put the encprivkey
7299+        sdmfr.put_encprivkey(self.encprivkey)
7300+
7301+        # Put the block and share hash chains
7302+        sdmfr.put_blockhashes(self.block_hash_tree)
7303+        sdmfr.put_sharehashes(self.share_hash_chain)
7304+        sdmfr.put_root_hash(self.root_hash)
7305+
7306+        # Put the signature
7307+        sdmfr.put_signature(self.signature)
7308+
7309+        # Put the verification key
7310+        sdmfr.put_verification_key(self.verification_key)
7311+
7312+        # Now check to make sure that nothing has been written yet.
7313+        self.failUnlessEqual(self.rref.write_count, 0)
7314+
7315+        # Now finish publishing
7316+        d = sdmfr.finish_publishing()
7317+        def _then(ignored):
7318+            self.failUnlessEqual(self.rref.write_count, 1)
7319+            read = self.ss.remote_slot_readv
7320+            self.failUnlessEqual(read("si1", [0], [(0, len(data))]),
7321+                                 {0: [data]})
7322+        d.addCallback(_then)
7323+        return d
7324+
7325+
7326+    def test_sdmf_writer_preexisting_share(self):
7327+        data = self.build_test_sdmf_share()
7328+        self.write_sdmf_share_to_server("si1")
7329+
7330+        # Now there is a share on the storage server. To successfully
7331+        # write, we need to set the checkstring correctly. When we
7332+        # don't, no write should occur.
7333+        sdmfw = SDMFSlotWriteProxy(0,
7334+                                   self.rref,
7335+                                   "si1",
7336+                                   self.secrets,
7337+                                   1, 3, 10, 36, 36)
7338+        sdmfw.put_block(self.blockdata, 0, self.salt)
7339+
7340+        # Put the encprivkey
7341+        sdmfw.put_encprivkey(self.encprivkey)
7342+
7343+        # Put the block and share hash chains
7344+        sdmfw.put_blockhashes(self.block_hash_tree)
7345+        sdmfw.put_sharehashes(self.share_hash_chain)
7346+
7347+        # Put the root hash
7348+        sdmfw.put_root_hash(self.root_hash)
7349+
7350+        # Put the signature
7351+        sdmfw.put_signature(self.signature)
7352+
7353+        # Put the verification key
7354+        sdmfw.put_verification_key(self.verification_key)
7355+
7356+        # We shouldn't have a checkstring yet
7357+        self.failUnlessEqual(sdmfw.get_checkstring(), "")
7358+
7359+        d = sdmfw.finish_publishing()
7360+        def _then(results):
7361+            self.failIf(results[0])
7362+            # this is the correct checkstring
7363+            self._expected_checkstring = results[1][0][0]
7364+            return self._expected_checkstring
7365+
7366+        d.addCallback(_then)
7367+        d.addCallback(sdmfw.set_checkstring)
7368+        d.addCallback(lambda ignored:
7369+            sdmfw.get_checkstring())
7370+        d.addCallback(lambda checkstring:
7371+            self.failUnlessEqual(checkstring, self._expected_checkstring))
7372+        d.addCallback(lambda ignored:
7373+            sdmfw.finish_publishing())
7374+        def _then_again(results):
7375+            self.failUnless(results[0])
7376+            read = self.ss.remote_slot_readv
7377+            self.failUnlessEqual(read("si1", [0], [(1, 8)]),
7378+                                 {0: [struct.pack(">Q", 1)]})
7379+            self.failUnlessEqual(read("si1", [0], [(9, len(data) - 9)]),
7380+                                 {0: [data[9:]]})
7381+        d.addCallback(_then_again)
7382+        return d
7383+
7384+
7385 class Stats(unittest.TestCase):
7386 
7387     def setUp(self):
7388}
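
For reference, the two verinfo tests above exercise the difference between the SDMF and MDMF signed prefixes: SDMF uses version byte 0 and carries the 16-byte salt in the prefix, while MDMF uses version byte 1 and omits it (per-segment salts live alongside each block). A minimal standalone sketch of the two layouts, using placeholder values rather than the fixtures from the tests:

import struct

# Signed-prefix layouts checked by test_verinfo / test_verinfo_with_mdmf_file.
SDMF_PREFIX = ">BQ32s16s BBQQ"  # version, seqnum, root_hash, salt, k, N, segsize, datalen
MDMF_PREFIX = ">BQ32s BBQQ"     # version, seqnum, root_hash, k, N, segsize, datalen

seqnum = 0
root_hash = b"\x00" * 32        # placeholder values, not the test fixtures
salt = b"\x01" * 16
k, n, segsize, datalen = 3, 10, 36, 36

sdmf_prefix = struct.pack(SDMF_PREFIX, 0, seqnum, root_hash, salt,
                          k, n, segsize, datalen)
mdmf_prefix = struct.pack(MDMF_PREFIX, 1, seqnum, root_hash,
                          k, n, segsize, datalen)

# The only difference is the version byte and the 16-byte salt.
assert len(sdmf_prefix) - len(mdmf_prefix) == 16
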
7389[mutable/publish.py: cleanup + simplification
7390Kevan Carstensen <kevan@isnotajoke.com>**20100702225554
7391 Ignore-this: 36a58424ceceffb1ddc55cc5934399e2
7392] {
7393hunk ./src/allmydata/mutable/publish.py 19
7394      UncoordinatedWriteError, NotEnoughServersError
7395 from allmydata.mutable.servermap import ServerMap
7396 from allmydata.mutable.layout import pack_prefix, pack_share, unpack_header, pack_checkstring, \
7397-     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy
7398+     unpack_checkstring, SIGNED_PREFIX, MDMFSlotWriteProxy, \
7399+     SDMFSlotWriteProxy
7400 
7401 KiB = 1024
7402 DEFAULT_MAX_SEGMENT_SIZE = 128 * KiB
7403hunk ./src/allmydata/mutable/publish.py 24
7404+PUSHING_BLOCKS_STATE = 0
7405+PUSHING_EVERYTHING_ELSE_STATE = 1
7406+DONE_STATE = 2
7407 
7408 class PublishStatus:
7409     implements(IPublishStatus)
7410hunk ./src/allmydata/mutable/publish.py 229
7411 
7412         self.bad_share_checkstrings = {}
7413 
7414+        # This is set at the last step of the publishing process.
7415+        self.versioninfo = ""
7416+
7417         # we use the servermap to populate the initial goal: this way we will
7418         # try to update each existing share in place.
7419         for (peerid, shnum) in self._servermap.servermap:
7420hunk ./src/allmydata/mutable/publish.py 245
7421             self.bad_share_checkstrings[key] = old_checkstring
7422             self.connections[peerid] = self._servermap.connections[peerid]
7423 
7424-        # Now, the process dovetails -- if this is an SDMF file, we need
7425-        # to write an SDMF file. Otherwise, we need to write an MDMF
7426-        # file.
7427-        if self._version == MDMF_VERSION:
7428-            return self._publish_mdmf()
7429-        else:
7430-            return self._publish_sdmf()
7431-        #return self.done_deferred
7432-
7433-    def _publish_mdmf(self):
7434-        # Next, we find homes for all of the shares that we don't have
7435-        # homes for yet.
7436         # TODO: Make this part do peer selection.
7437         self.update_goal()
7438         self.writers = {}
7439hunk ./src/allmydata/mutable/publish.py 248
7440-        # For each (peerid, shnum) in self.goal, we make an
7441-        # MDMFSlotWriteProxy for that peer. We'll use this to write
7442+        if self._version == MDMF_VERSION:
7443+            writer_class = MDMFSlotWriteProxy
7444+        else:
7445+            writer_class = SDMFSlotWriteProxy
7446+
7447+        # For each (peerid, shnum) in self.goal, we make a
7448+        # write proxy for that peer. We'll use this to write
7449         # shares to the peer.
7450         for key in self.goal:
7451             peerid, shnum = key
7452hunk ./src/allmydata/mutable/publish.py 263
7453             cancel_secret = self._node.get_cancel_secret(peerid)
7454             secrets = (write_enabler, renew_secret, cancel_secret)
7455 
7456-            self.writers[shnum] =  MDMFSlotWriteProxy(shnum,
7457-                                                      self.connections[peerid],
7458-                                                      self._storage_index,
7459-                                                      secrets,
7460-                                                      self._new_seqnum,
7461-                                                      self.required_shares,
7462-                                                      self.total_shares,
7463-                                                      self.segment_size,
7464-                                                      len(self.newdata))
7465+            self.writers[shnum] =  writer_class(shnum,
7466+                                                self.connections[peerid],
7467+                                                self._storage_index,
7468+                                                secrets,
7469+                                                self._new_seqnum,
7470+                                                self.required_shares,
7471+                                                self.total_shares,
7472+                                                self.segment_size,
7473+                                                len(self.newdata))
7474+            self.writers[shnum].peerid = peerid
7475             if (peerid, shnum) in self._servermap.servermap:
7476                 old_versionid, old_timestamp = self._servermap.servermap[key]
7477                 (old_seqnum, old_root_hash, old_salt, old_segsize,
7478hunk ./src/allmydata/mutable/publish.py 278
7479                  old_datalength, old_k, old_N, old_prefix,
7480                  old_offsets_tuple) = old_versionid
7481-                self.writers[shnum].set_checkstring(old_seqnum, old_root_hash)
7482+                self.writers[shnum].set_checkstring(old_seqnum,
7483+                                                    old_root_hash,
7484+                                                    old_salt)
7485+            elif (peerid, shnum) in self.bad_share_checkstrings:
7486+                old_checkstring = self.bad_share_checkstrings[(peerid, shnum)]
7487+                self.writers[shnum].set_checkstring(old_checkstring)
7488+
7489+        # Our remote shares will not have a complete checkstring until
7490+        # after we are done writing share data and have started to write
7491+        # blocks. In the meantime, we need to know what to look for when
7492+        # writing, so that we can detect UncoordinatedWriteErrors.
7493+        self._checkstring = self.writers.values()[0].get_checkstring()
7494 
7495         # Now, we start pushing shares.
7496         self._status.timings["setup"] = time.time() - self._started
7497hunk ./src/allmydata/mutable/publish.py 293
7498-        def _start_pushing(res):
7499-            self._started_pushing = time.time()
7500-            return res
7501-
7502         # First, we encrypt, encode, and publish the shares that we need
7503         # to encrypt, encode, and publish.
7504 
7505hunk ./src/allmydata/mutable/publish.py 306
7506 
7507         d = defer.succeed(None)
7508         self.log("Starting push")
7509-        for i in xrange(self.num_segments - 1):
7510-            d.addCallback(lambda ignored, i=i:
7511-                self.push_segment(i))
7512-            d.addCallback(self._turn_barrier)
7513-        # We have at least one segment, so we will have a tail segment
7514-        if self.num_segments > 0:
7515-            d.addCallback(lambda ignored:
7516-                self.push_tail_segment())
7517-
7518-        d.addCallback(lambda ignored:
7519-            self.push_encprivkey())
7520-        d.addCallback(lambda ignored:
7521-            self.push_blockhashes())
7522-        d.addCallback(lambda ignored:
7523-            self.push_sharehashes())
7524-        d.addCallback(lambda ignored:
7525-            self.push_toplevel_hashes_and_signature())
7526-        d.addCallback(lambda ignored:
7527-            self.finish_publishing())
7528-        return d
7529-
7530-
7531-    def _publish_sdmf(self):
7532-        self._status.timings["setup"] = time.time() - self._started
7533-        self.salt = os.urandom(16)
7534 
7535hunk ./src/allmydata/mutable/publish.py 307
7536-        d = self._encrypt_and_encode()
7537-        d.addCallback(self._generate_shares)
7538-        def _start_pushing(res):
7539-            self._started_pushing = time.time()
7540-            return res
7541-        d.addCallback(_start_pushing)
7542-        d.addCallback(self.loop) # trigger delivery
7543-        d.addErrback(self._fatal_error)
7544+        self._state = PUSHING_BLOCKS_STATE
7545+        self._push()
7546 
7547         return self.done_deferred
7548 
7549hunk ./src/allmydata/mutable/publish.py 327
7550                                                   segment_size)
7551         else:
7552             self.num_segments = 0
7553+
7554+        self.log("building encoding parameters for file")
7555+        self.log("got segsize %d" % self.segment_size)
7556+        self.log("got %d segments" % self.num_segments)
7557+
7558         if self._version == SDMF_VERSION:
7559             assert self.num_segments in (0, 1) # SDMF
7560hunk ./src/allmydata/mutable/publish.py 334
7561-            return
7562         # calculate the tail segment size.
7563hunk ./src/allmydata/mutable/publish.py 335
7564-        self.tail_segment_size = len(self.newdata) % segment_size
7565 
7566hunk ./src/allmydata/mutable/publish.py 336
7567-        if self.tail_segment_size == 0:
7568+        if segment_size and self.newdata:
7569+            self.tail_segment_size = len(self.newdata) % segment_size
7570+        else:
7571+            self.tail_segment_size = 0
7572+
7573+        if self.tail_segment_size == 0 and segment_size:
7574             # The tail segment is the same size as the other segments.
7575             self.tail_segment_size = segment_size
7576 
7577hunk ./src/allmydata/mutable/publish.py 345
7578-        # We'll make an encoder ahead-of-time for the normal-sized
7579-        # segments (defined as any segment of segment_size size.
7580-        # (the part of the code that puts the tail segment will make its
7581-        #  own encoder for that part)
7582+        # Make FEC encoders
7583         fec = codec.CRSEncoder()
7584         fec.set_params(self.segment_size,
7585                        self.required_shares, self.total_shares)
7586hunk ./src/allmydata/mutable/publish.py 352
7587         self.piece_size = fec.get_block_size()
7588         self.fec = fec
7589 
7590+        if self.tail_segment_size == self.segment_size:
7591+            self.tail_fec = self.fec
7592+        else:
7593+            tail_fec = codec.CRSEncoder()
7594+            tail_fec.set_params(self.tail_segment_size,
7595+                                self.required_shares,
7596+                                self.total_shares)
7597+            self.tail_fec = tail_fec
7598+
7599+        self._current_segment = 0
7600+
7601+
7602+    def _push(self, ignored=None):
7603+        """
7604+        I manage state transitions. In particular, I check that we
7605+        still have enough writers to complete the upload
7606+        successfully.
7607+        """
7608+        # Can we still successfully publish this file?
7609+        # TODO: Keep track of outstanding queries before aborting the
7610+        #       process.
7611+        if len(self.writers) <= self.required_shares or self.surprised:
7612+            return self._failure()
7613+
7614+        # Figure out what we need to do next. Each of these needs to
7615+        # return a deferred so that we don't block execution when this
7616+        # is first called in the upload method.
7617+        if self._state == PUSHING_BLOCKS_STATE:
7618+            return self.push_segment(self._current_segment)
7619+
7620+        # XXX: Do we want more granularity in states? Is that useful at
7621+        #      all?
7622+        #      Yes -- quicker reaction to UCW.
7623+        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
7624+            return self.push_everything_else()
7625+
7626+        # If we make it to this point, we were successful in placing the
7627+        # file.
7628+        return self._done(None)
7629+
7630 
7631     def push_segment(self, segnum):
7632hunk ./src/allmydata/mutable/publish.py 394
7633+        if self.num_segments == 0 and self._version == SDMF_VERSION:
7634+            self._add_dummy_salts()
7635+
7636+        if segnum == self.num_segments:
7637+            # We don't have any more segments to push.
7638+            self._state = PUSHING_EVERYTHING_ELSE_STATE
7639+            return self._push()
7640+
7641+        d = self._encode_segment(segnum)
7642+        d.addCallback(self._push_segment, segnum)
7643+        def _increment_segnum(ign):
7644+            self._current_segment += 1
7645+        # XXX: I don't think we need to do addBoth here -- any errbacks
7646+        # should be handled within push_segment.
7647+        d.addBoth(_increment_segnum)
7648+        d.addBoth(self._push)
7649+
7650+
7651+    def _add_dummy_salts(self):
7652+        """
7653+        SDMF files need a salt even if they're empty, or the signature
7654+        won't make sense. This method adds a dummy salt to each of our
7655+        SDMF writers so that they can write the signature later.
7656+        """
7657+        salt = os.urandom(16)
7658+        assert self._version == SDMF_VERSION
7659+
7660+        for writer in self.writers.itervalues():
7661+            writer.put_salt(salt)
7662+
7663+
7664+    def _encode_segment(self, segnum):
7665+        """
7666+        I encrypt and encode the segment segnum.
7667+        """
7668         started = time.time()
7669hunk ./src/allmydata/mutable/publish.py 430
7670-        segsize = self.segment_size
7671+
7672+        if segnum + 1 == self.num_segments:
7673+            segsize = self.tail_segment_size
7674+        else:
7675+            segsize = self.segment_size
7676+
7677+
7678+        offset = self.segment_size * segnum
7679+        length = segsize + offset
7680         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
7681hunk ./src/allmydata/mutable/publish.py 440
7682-        data = self.newdata[segsize * segnum:segsize*(segnum + 1)]
7683+        data = self.newdata[offset:length]
7684         assert len(data) == segsize
7685 
7686         salt = os.urandom(16)
7687hunk ./src/allmydata/mutable/publish.py 455
7688         started = now
7689 
7690         # now apply FEC
7691+        if segnum + 1 == self.num_segments:
7692+            fec = self.tail_fec
7693+        else:
7694+            fec = self.fec
7695 
7696         self._status.set_status("Encoding")
7697         crypttext_pieces = [None] * self.required_shares
7698hunk ./src/allmydata/mutable/publish.py 462
7699-        piece_size = self.piece_size
7700+        piece_size = fec.get_block_size()
7701         for i in range(len(crypttext_pieces)):
7702             offset = i * piece_size
7703             piece = crypttext[offset:offset+piece_size]
7704hunk ./src/allmydata/mutable/publish.py 469
7705             piece = piece + "\x00"*(piece_size - len(piece)) # padding
7706             crypttext_pieces[i] = piece
7707             assert len(piece) == piece_size
7708-        d = self.fec.encode(crypttext_pieces)
7709+        d = fec.encode(crypttext_pieces)
7710         def _done_encoding(res):
7711             elapsed = time.time() - started
7712             self._status.timings["encode"] = elapsed
7713hunk ./src/allmydata/mutable/publish.py 473
7714-            return res
7715+            return (res, salt)
7716         d.addCallback(_done_encoding)
7717hunk ./src/allmydata/mutable/publish.py 475
7718-
7719-        def _push_shares_and_salt(results):
7720-            shares, shareids = results
7721-            dl = []
7722-            for i in xrange(len(shares)):
7723-                sharedata = shares[i]
7724-                shareid = shareids[i]
7725-                block_hash = hashutil.block_hash(salt + sharedata)
7726-                self.blockhashes[shareid].append(block_hash)
7727-
7728-                # find the writer for this share
7729-                d = self.writers[shareid].put_block(sharedata, segnum, salt)
7730-                dl.append(d)
7731-            # TODO: Naturally, we need to check on the results of these.
7732-            return defer.DeferredList(dl)
7733-        d.addCallback(_push_shares_and_salt)
7734         return d
7735 
7736 
7737hunk ./src/allmydata/mutable/publish.py 478
7738-    def push_tail_segment(self):
7739-        # This is essentially the same as push_segment, except that we
7740-        # don't use the cached encoder that we use elsewhere.
7741-        self.log("Pushing tail segment")
7742+    def _push_segment(self, encoded_and_salt, segnum):
7743+        """
7744+        I push (data, salt) as segment number segnum.
7745+        """
7746+        results, salt = encoded_and_salt
7747+        shares, shareids = results
7748         started = time.time()
7749hunk ./src/allmydata/mutable/publish.py 485
7750-        segsize = self.segment_size
7751-        data = self.newdata[segsize * (self.num_segments-1):]
7752-        assert len(data) == self.tail_segment_size
7753-        salt = os.urandom(16)
7754-
7755-        key = hashutil.ssk_readkey_data_hash(salt, self.readkey)
7756-        enc = AES(key)
7757-        crypttext = enc.process(data)
7758-        assert len(crypttext) == len(data)
7759+        dl = []
7760+        for i in xrange(len(shares)):
7761+            sharedata = shares[i]
7762+            shareid = shareids[i]
7763+            if self._version == MDMF_VERSION:
7764+                hashed = salt + sharedata
7765+            else:
7766+                hashed = sharedata
7767+            block_hash = hashutil.block_hash(hashed)
7768+            self.blockhashes[shareid].append(block_hash)
7769 
7770hunk ./src/allmydata/mutable/publish.py 496
7771-        now = time.time()
7772-        self._status.timings['encrypt'] = now - started
7773-        started = now
7774+            # find the writer for this share
7775+            writer = self.writers[shareid]
7776+            d = writer.put_block(sharedata, segnum, salt)
7777+            d.addCallback(self._got_write_answer, writer, started)
7778+            d.addErrback(self._connection_problem, writer)
7779+            dl.append(d)
7780+            # TODO: Naturally, we need to check on the results of these.
7781+        return defer.DeferredList(dl)
7782 
7783hunk ./src/allmydata/mutable/publish.py 505
7784-        self._status.set_status("Encoding")
7785-        tail_fec = codec.CRSEncoder()
7786-        tail_fec.set_params(self.tail_segment_size,
7787-                            self.required_shares,
7788-                            self.total_shares)
7789 
7790hunk ./src/allmydata/mutable/publish.py 506
7791-        crypttext_pieces = [None] * self.required_shares
7792-        piece_size = tail_fec.get_block_size()
7793-        for i in range(len(crypttext_pieces)):
7794-            offset = i * piece_size
7795-            piece = crypttext[offset:offset+piece_size]
7796-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
7797-            crypttext_pieces[i] = piece
7798-            assert len(piece) == piece_size
7799-        d = tail_fec.encode(crypttext_pieces)
7800-        def _push_shares_and_salt(results):
7801-            shares, shareids = results
7802-            dl = []
7803-            for i in xrange(len(shares)):
7804-                sharedata = shares[i]
7805-                shareid = shareids[i]
7806-                block_hash = hashutil.block_hash(salt + sharedata)
7807-                self.blockhashes[shareid].append(block_hash)
7808-                # find the writer for this share
7809-                d = self.writers[shareid].put_block(sharedata,
7810-                                                    self.num_segments - 1,
7811-                                                    salt)
7812-                dl.append(d)
7813-            # TODO: Naturally, we need to check on the results of these.
7814-            return defer.DeferredList(dl)
7815-        d.addCallback(_push_shares_and_salt)
7816+    def push_everything_else(self):
7817+        """
7818+        I put everything else associated with a share.
7819+        """
7820+        encprivkey = self._encprivkey
7821+        d = self.push_encprivkey()
7822+        d.addCallback(self.push_blockhashes)
7823+        d.addCallback(self.push_sharehashes)
7824+        d.addCallback(self.push_toplevel_hashes_and_signature)
7825+        d.addCallback(self.finish_publishing)
7826+        def _change_state(ignored):
7827+            self._state = DONE_STATE
7828+        d.addCallback(_change_state)
7829+        d.addCallback(self._push)
7830         return d
7831 
7832 
7833hunk ./src/allmydata/mutable/publish.py 527
7834         started = time.time()
7835         encprivkey = self._encprivkey
7836         dl = []
7837-        def _spy_on_writer(results):
7838-            print results
7839-            return results
7840-        for shnum, writer in self.writers.iteritems():
7841+        for writer in self.writers.itervalues():
7842             d = writer.put_encprivkey(encprivkey)
7843hunk ./src/allmydata/mutable/publish.py 529
7844+            d.addCallback(self._got_write_answer, writer, started)
7845+            d.addErrback(self._connection_problem, writer)
7846             dl.append(d)
7847         d = defer.DeferredList(dl)
7848         return d
7849hunk ./src/allmydata/mutable/publish.py 536
7850 
7851 
7852-    def push_blockhashes(self):
7853+    def push_blockhashes(self, ignored):
7854         started = time.time()
7855         dl = []
7856hunk ./src/allmydata/mutable/publish.py 539
7857-        def _spy_on_results(results):
7858-            print results
7859-            return results
7860         self.sharehash_leaves = [None] * len(self.blockhashes)
7861         for shnum, blockhashes in self.blockhashes.iteritems():
7862             t = hashtree.HashTree(blockhashes)
7863hunk ./src/allmydata/mutable/publish.py 545
7864             self.blockhashes[shnum] = list(t)
7865             # set the leaf for future use.
7866             self.sharehash_leaves[shnum] = t[0]
7867-            d = self.writers[shnum].put_blockhashes(self.blockhashes[shnum])
7868+            writer = self.writers[shnum]
7869+            d = writer.put_blockhashes(self.blockhashes[shnum])
7870+            d.addCallback(self._got_write_answer, writer, started)
7871+            d.addErrback(self._connection_problem, self.writers[shnum])
7872             dl.append(d)
7873         d = defer.DeferredList(dl)
7874         return d
7875hunk ./src/allmydata/mutable/publish.py 554
7876 
7877 
7878-    def push_sharehashes(self):
7879+    def push_sharehashes(self, ignored):
7880+        started = time.time()
7881         share_hash_tree = hashtree.HashTree(self.sharehash_leaves)
7882         share_hash_chain = {}
7883         ds = []
7884hunk ./src/allmydata/mutable/publish.py 559
7885-        def _spy_on_results(results):
7886-            print results
7887-            return results
7888         for shnum in xrange(len(self.sharehash_leaves)):
7889             needed_indices = share_hash_tree.needed_hashes(shnum)
7890             self.sharehashes[shnum] = dict( [ (i, share_hash_tree[i])
7891hunk ./src/allmydata/mutable/publish.py 563
7892                                              for i in needed_indices] )
7893-            d = self.writers[shnum].put_sharehashes(self.sharehashes[shnum])
7894+            writer = self.writers[shnum]
7895+            d = writer.put_sharehashes(self.sharehashes[shnum])
7896+            d.addCallback(self._got_write_answer, writer, started)
7897+            d.addErrback(self._connection_problem, writer)
7898             ds.append(d)
7899         self.root_hash = share_hash_tree[0]
7900         d = defer.DeferredList(ds)
7901hunk ./src/allmydata/mutable/publish.py 573
7902         return d
7903 
7904 
7905-    def push_toplevel_hashes_and_signature(self):
7906+    def push_toplevel_hashes_and_signature(self, ignored):
7907         # We need to do three things here:
7908         #   - Push the root hash and salt hash
7909         #   - Get the checkstring of the resulting layout; sign that.
7910hunk ./src/allmydata/mutable/publish.py 578
7911         #   - Push the signature
7912+        started = time.time()
7913         ds = []
7914hunk ./src/allmydata/mutable/publish.py 580
7915-        def _spy_on_results(results):
7916-            print results
7917-            return results
7918         for shnum in xrange(self.total_shares):
7919hunk ./src/allmydata/mutable/publish.py 581
7920-            d = self.writers[shnum].put_root_hash(self.root_hash)
7921+            writer = self.writers[shnum]
7922+            d = writer.put_root_hash(self.root_hash)
7923+            d.addCallback(self._got_write_answer, writer, started)
7924             ds.append(d)
7925         d = defer.DeferredList(ds)
7926hunk ./src/allmydata/mutable/publish.py 586
7927-        def _make_and_place_signature(ignored):
7928-            signable = self.writers[0].get_signable()
7929-            self.signature = self._privkey.sign(signable)
7930-
7931-            ds = []
7932-            for (shnum, writer) in self.writers.iteritems():
7933-                d = writer.put_signature(self.signature)
7934-                ds.append(d)
7935-            return defer.DeferredList(ds)
7936-        d.addCallback(_make_and_place_signature)
7937+        d.addCallback(self._update_checkstring)
7938+        d.addCallback(self._make_and_place_signature)
7939         return d
7940 
7941 
7942hunk ./src/allmydata/mutable/publish.py 591
7943-    def finish_publishing(self):
7944+    def _update_checkstring(self, ignored):
7945+        """
7946+        After putting the root hash, MDMF files will have the
7947+        checkstring written to the storage server. This means that we
7948+        can update our copy of the checkstring so we can detect
7949+        uncoordinated writes. SDMF files will have the same checkstring,
7950+        so we need not do anything.
7951+        """
7952+        self._checkstring = self.writers.values()[0].get_checkstring()
7953+
7954+
7955+    def _make_and_place_signature(self, ignored):
7956+        """
7957+        I create and place the signature.
7958+        """
7959+        started = time.time()
7960+        signable = self.writers[0].get_signable()
7961+        self.signature = self._privkey.sign(signable)
7962+
7963+        ds = []
7964+        for (shnum, writer) in self.writers.iteritems():
7965+            d = writer.put_signature(self.signature)
7966+            d.addCallback(self._got_write_answer, writer, started)
7967+            d.addErrback(self._connection_problem, writer)
7968+            ds.append(d)
7969+        return defer.DeferredList(ds)
7970+
7971+
7972+    def finish_publishing(self, ignored):
7973         # We're almost done -- we just need to put the verification key
7974         # and the offsets
7975hunk ./src/allmydata/mutable/publish.py 622
7976+        started = time.time()
7977         ds = []
7978         verification_key = self._pubkey.serialize()
7979 
7980hunk ./src/allmydata/mutable/publish.py 626
7981-        def _spy_on_results(results):
7982-            print results
7983-            return results
7984+
7985+        # TODO: Bad, since we remove from this same dict. We need to
7986+        # make a copy, or just use a non-iterated value.
7987         for (shnum, writer) in self.writers.iteritems():
7988             d = writer.put_verification_key(verification_key)
7989hunk ./src/allmydata/mutable/publish.py 631
7990+            d.addCallback(self._got_write_answer, writer, started)
7991+            d.addCallback(self._record_verinfo)
7992             d.addCallback(lambda ignored, writer=writer:
7993                 writer.finish_publishing())
7994hunk ./src/allmydata/mutable/publish.py 635
7995+            d.addCallback(self._got_write_answer, writer, started)
7996+            d.addErrback(self._connection_problem, writer)
7997             ds.append(d)
7998         return defer.DeferredList(ds)
7999 
8000hunk ./src/allmydata/mutable/publish.py 641
8001 
8002-    def _turn_barrier(self, res):
8003-        # putting this method in a Deferred chain imposes a guaranteed
8004-        # reactor turn between the pre- and post- portions of that chain.
8005-        # This can be useful to limit memory consumption: since Deferreds do
8006-        # not do tail recursion, code which uses defer.succeed(result) for
8007-        # consistency will cause objects to live for longer than you might
8008-        # normally expect.
8009-        return fireEventually(res)
8010+    def _record_verinfo(self, ignored):
8011+        self.versioninfo = self.writers.values()[0].get_verinfo()
8012 
8013 
8014hunk ./src/allmydata/mutable/publish.py 645
8015-    def _fatal_error(self, f):
8016-        self.log("error during loop", failure=f, level=log.UNUSUAL)
8017-        self._done(f)
8018+    def _connection_problem(self, f, writer):
8019+        """
8020+        We ran into a connection problem while working with writer, and
8021+        need to deal with that.
8022+        """
8023+        self.log("found problem: %s" % str(f))
8024+        self._last_failure = f
8025+        del(self.writers[writer.shnum])
8026 
8027hunk ./src/allmydata/mutable/publish.py 654
8028-    def _update_status(self):
8029-        self._status.set_status("Sending Shares: %d placed out of %d, "
8030-                                "%d messages outstanding" %
8031-                                (len(self.placed),
8032-                                 len(self.goal),
8033-                                 len(self.outstanding)))
8034-        self._status.set_progress(1.0 * len(self.placed) / len(self.goal))
8035 
8036     def loop(self, ignored=None):
8037         self.log("entering loop", level=log.NOISY)
8038hunk ./src/allmydata/mutable/publish.py 778
8039             self.log_goal(self.goal, "after update: ")
8040 
8041 
8042-    def _encrypt_and_encode(self):
8043-        # this returns a Deferred that fires with a list of (sharedata,
8044-        # sharenum) tuples. TODO: cache the ciphertext, only produce the
8045-        # shares that we care about.
8046-        self.log("_encrypt_and_encode")
8047-
8048-        self._status.set_status("Encrypting")
8049-        started = time.time()
8050+    def _got_write_answer(self, answer, writer, started):
8051+        if not answer:
8052+            # SDMF writers only pretend to write when their callers set
8053+            # blocks, salts, and so on -- they actually just write once,
8054+            # at the end of the upload process. In fake writes, they
8055+            # return defer.succeed(None). If we see that, we shouldn't
8056+            # bother checking it.
8057+            return
8058 
8059hunk ./src/allmydata/mutable/publish.py 787
8060-        key = hashutil.ssk_readkey_data_hash(self.salt, self.readkey)
8061-        enc = AES(key)
8062-        crypttext = enc.process(self.newdata)
8063-        assert len(crypttext) == len(self.newdata)
8064+        peerid = writer.peerid
8065+        lp = self.log("_got_write_answer from %s, share %d" %
8066+                      (idlib.shortnodeid_b2a(peerid), writer.shnum))
8067 
8068         now = time.time()
8069hunk ./src/allmydata/mutable/publish.py 792
8070-        self._status.timings["encrypt"] = now - started
8071-        started = now
8072-
8073-        # now apply FEC
8074-
8075-        self._status.set_status("Encoding")
8076-        fec = codec.CRSEncoder()
8077-        fec.set_params(self.segment_size,
8078-                       self.required_shares, self.total_shares)
8079-        piece_size = fec.get_block_size()
8080-        crypttext_pieces = [None] * self.required_shares
8081-        for i in range(len(crypttext_pieces)):
8082-            offset = i * piece_size
8083-            piece = crypttext[offset:offset+piece_size]
8084-            piece = piece + "\x00"*(piece_size - len(piece)) # padding
8085-            crypttext_pieces[i] = piece
8086-            assert len(piece) == piece_size
8087-
8088-        d = fec.encode(crypttext_pieces)
8089-        def _done_encoding(res):
8090-            elapsed = time.time() - started
8091-            self._status.timings["encode"] = elapsed
8092-            return res
8093-        d.addCallback(_done_encoding)
8094-        return d
8095-
8096-
8097-    def _generate_shares(self, shares_and_shareids):
8098-        # this sets self.shares and self.root_hash
8099-        self.log("_generate_shares")
8100-        self._status.set_status("Generating Shares")
8101-        started = time.time()
8102-
8103-        # we should know these by now
8104-        privkey = self._privkey
8105-        encprivkey = self._encprivkey
8106-        pubkey = self._pubkey
8107-
8108-        (shares, share_ids) = shares_and_shareids
8109-
8110-        assert len(shares) == len(share_ids)
8111-        assert len(shares) == self.total_shares
8112-        all_shares = {}
8113-        block_hash_trees = {}
8114-        share_hash_leaves = [None] * len(shares)
8115-        for i in range(len(shares)):
8116-            share_data = shares[i]
8117-            shnum = share_ids[i]
8118-            all_shares[shnum] = share_data
8119-
8120-            # build the block hash tree. SDMF has only one leaf.
8121-            leaves = [hashutil.block_hash(share_data)]
8122-            t = hashtree.HashTree(leaves)
8123-            block_hash_trees[shnum] = list(t)
8124-            share_hash_leaves[shnum] = t[0]
8125-        for leaf in share_hash_leaves:
8126-            assert leaf is not None
8127-        share_hash_tree = hashtree.HashTree(share_hash_leaves)
8128-        share_hash_chain = {}
8129-        for shnum in range(self.total_shares):
8130-            needed_hashes = share_hash_tree.needed_hashes(shnum)
8131-            share_hash_chain[shnum] = dict( [ (i, share_hash_tree[i])
8132-                                              for i in needed_hashes ] )
8133-        root_hash = share_hash_tree[0]
8134-        assert len(root_hash) == 32
8135-        self.log("my new root_hash is %s" % base32.b2a(root_hash))
8136-        self._new_version_info = (self._new_seqnum, root_hash, self.salt)
8137-
8138-        prefix = pack_prefix(self._new_seqnum, root_hash, self.salt,
8139-                             self.required_shares, self.total_shares,
8140-                             self.segment_size, len(self.newdata))
8141-
8142-        # now pack the beginning of the share. All shares are the same up
8143-        # to the signature, then they have divergent share hash chains,
8144-        # then completely different block hash trees + salt + share data,
8145-        # then they all share the same encprivkey at the end. The sizes
8146-        # of everything are the same for all shares.
8147-
8148-        sign_started = time.time()
8149-        signature = privkey.sign(prefix)
8150-        self._status.timings["sign"] = time.time() - sign_started
8151-
8152-        verification_key = pubkey.serialize()
8153-
8154-        final_shares = {}
8155-        for shnum in range(self.total_shares):
8156-            final_share = pack_share(prefix,
8157-                                     verification_key,
8158-                                     signature,
8159-                                     share_hash_chain[shnum],
8160-                                     block_hash_trees[shnum],
8161-                                     all_shares[shnum],
8162-                                     encprivkey)
8163-            final_shares[shnum] = final_share
8164-        elapsed = time.time() - started
8165-        self._status.timings["pack"] = elapsed
8166-        self.shares = final_shares
8167-        self.root_hash = root_hash
8168-
8169-        # we also need to build up the version identifier for what we're
8170-        # pushing. Extract the offsets from one of our shares.
8171-        assert final_shares
8172-        offsets = unpack_header(final_shares.values()[0])[-1]
8173-        offsets_tuple = tuple( [(key,value) for key,value in offsets.items()] )
8174-        verinfo = (self._new_seqnum, root_hash, self.salt,
8175-                   self.segment_size, len(self.newdata),
8176-                   self.required_shares, self.total_shares,
8177-                   prefix, offsets_tuple)
8178-        self.versioninfo = verinfo
8179-
8180-
8181-
8182-    def _send_shares(self, needed):
8183-        self.log("_send_shares")
8184-
8185-        # we're finally ready to send out our shares. If we encounter any
8186-        # surprises here, it's because somebody else is writing at the same
8187-        # time. (Note: in the future, when we remove the _query_peers() step
8188-        # and instead speculate about [or remember] which shares are where,
8189-        # surprises here are *not* indications of UncoordinatedWriteError,
8190-        # and we'll need to respond to them more gracefully.)
8191-
8192-        # needed is a set of (peerid, shnum) tuples. The first thing we do is
8193-        # organize it by peerid.
8194-
8195-        peermap = DictOfSets()
8196-        for (peerid, shnum) in needed:
8197-            peermap.add(peerid, shnum)
8198-
8199-        # the next thing is to build up a bunch of test vectors. The
8200-        # semantics of Publish are that we perform the operation if the world
8201-        # hasn't changed since the ServerMap was constructed (more or less).
8202-        # For every share we're trying to place, we create a test vector that
8203-        # tests to see if the server*share still corresponds to the
8204-        # map.
8205-
8206-        all_tw_vectors = {} # maps peerid to tw_vectors
8207-        sm = self._servermap.servermap
8208-
8209-        for key in needed:
8210-            (peerid, shnum) = key
8211-
8212-            if key in sm:
8213-                # an old version of that share already exists on the
8214-                # server, according to our servermap. We will create a
8215-                # request that attempts to replace it.
8216-                old_versionid, old_timestamp = sm[key]
8217-                (old_seqnum, old_root_hash, old_salt, old_segsize,
8218-                 old_datalength, old_k, old_N, old_prefix,
8219-                 old_offsets_tuple) = old_versionid
8220-                old_checkstring = pack_checkstring(old_seqnum,
8221-                                                   old_root_hash,
8222-                                                   old_salt)
8223-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8224-
8225-            elif key in self.bad_share_checkstrings:
8226-                old_checkstring = self.bad_share_checkstrings[key]
8227-                testv = (0, len(old_checkstring), "eq", old_checkstring)
8228-
8229-            else:
8230-                # add a testv that requires the share not exist
8231-
8232-                # Unfortunately, foolscap-0.2.5 has a bug in the way inbound
8233-                # constraints are handled. If the same object is referenced
8234-                # multiple times inside the arguments, foolscap emits a
8235-                # 'reference' token instead of a distinct copy of the
8236-                # argument. The bug is that these 'reference' tokens are not
8237-                # accepted by the inbound constraint code. To work around
8238-                # this, we need to prevent python from interning the
8239-                # (constant) tuple, by creating a new copy of this vector
8240-                # each time.
8241-
8242-                # This bug is fixed in foolscap-0.2.6, and even though this
8243-                # version of Tahoe requires foolscap-0.3.1 or newer, we are
8244-                # supposed to be able to interoperate with older versions of
8245-                # Tahoe which are allowed to use older versions of foolscap,
8246-                # including foolscap-0.2.5 . In addition, I've seen other
8247-                # foolscap problems triggered by 'reference' tokens (see #541
8248-                # for details). So we must keep this workaround in place.
8249-
8250-                #testv = (0, 1, 'eq', "")
8251-                testv = tuple([0, 1, 'eq', ""])
8252-
8253-            testvs = [testv]
8254-            # the write vector is simply the share
8255-            writev = [(0, self.shares[shnum])]
8256-
8257-            if peerid not in all_tw_vectors:
8258-                all_tw_vectors[peerid] = {}
8259-                # maps shnum to (testvs, writevs, new_length)
8260-            assert shnum not in all_tw_vectors[peerid]
8261-
8262-            all_tw_vectors[peerid][shnum] = (testvs, writev, None)
8263-
8264-        # we read the checkstring back from each share, however we only use
8265-        # it to detect whether there was a new share that we didn't know
8266-        # about. The success or failure of the write will tell us whether
8267-        # there was a collision or not. If there is a collision, the first
8268-        # thing we'll do is update the servermap, which will find out what
8269-        # happened. We could conceivably reduce a roundtrip by using the
8270-        # readv checkstring to populate the servermap, but really we'd have
8271-        # to read enough data to validate the signatures too, so it wouldn't
8272-        # be an overall win.
8273-        read_vector = [(0, struct.calcsize(SIGNED_PREFIX))]
8274-
8275-        # ok, send the messages!
8276-        self.log("sending %d shares" % len(all_tw_vectors), level=log.NOISY)
8277-        started = time.time()
8278-        for (peerid, tw_vectors) in all_tw_vectors.items():
8279-
8280-            write_enabler = self._node.get_write_enabler(peerid)
8281-            renew_secret = self._node.get_renewal_secret(peerid)
8282-            cancel_secret = self._node.get_cancel_secret(peerid)
8283-            secrets = (write_enabler, renew_secret, cancel_secret)
8284-            shnums = tw_vectors.keys()
8285-
8286-            for shnum in shnums:
8287-                self.outstanding.add( (peerid, shnum) )
8288-
8289-            d = self._do_testreadwrite(peerid, secrets,
8290-                                       tw_vectors, read_vector)
8291-            d.addCallbacks(self._got_write_answer, self._got_write_error,
8292-                           callbackArgs=(peerid, shnums, started),
8293-                           errbackArgs=(peerid, shnums, started))
8294-            # tolerate immediate errback, like with DeadReferenceError
8295-            d.addBoth(fireEventually)
8296-            d.addCallback(self.loop)
8297-            d.addErrback(self._fatal_error)
8298-
8299-        self._update_status()
8300-        self.log("%d shares sent" % len(all_tw_vectors), level=log.NOISY)
8301+        elapsed = now - started
8302 
8303hunk ./src/allmydata/mutable/publish.py 794
8304-    def _do_testreadwrite(self, peerid, secrets,
8305-                          tw_vectors, read_vector):
8306-        storage_index = self._storage_index
8307-        ss = self.connections[peerid]
8308+        self._status.add_per_server_time(peerid, elapsed)
8309 
8310hunk ./src/allmydata/mutable/publish.py 796
8311-        #print "SS[%s] is %s" % (idlib.shortnodeid_b2a(peerid), ss), ss.tracker.interfaceName
8312-        d = ss.callRemote("slot_testv_and_readv_and_writev",
8313-                          storage_index,
8314-                          secrets,
8315-                          tw_vectors,
8316-                          read_vector)
8317-        return d
8318+        wrote, read_data = answer
8319 
8320hunk ./src/allmydata/mutable/publish.py 798
8321-    def _got_write_answer(self, answer, peerid, shnums, started):
8322-        lp = self.log("_got_write_answer from %s" %
8323-                      idlib.shortnodeid_b2a(peerid))
8324-        for shnum in shnums:
8325-            self.outstanding.discard( (peerid, shnum) )
8326+        surprise_shares = set(read_data.keys()) - set([writer.shnum])
8327 
8328hunk ./src/allmydata/mutable/publish.py 800
8329-        now = time.time()
8330-        elapsed = now - started
8331-        self._status.add_per_server_time(peerid, elapsed)
8332+        # We need to remove from surprise_shares any shares that we
8333+        # know we are also writing to that peer through other writers.
8334 
8335hunk ./src/allmydata/mutable/publish.py 803
8336-        wrote, read_data = answer
8337+        # TODO: Precompute this.
8338+        known_shnums = [x.shnum for x in self.writers.values()
8339+                        if x.peerid == peerid]
8340+        surprise_shares -= set(known_shnums)
8341+        self.log("found the following surprise shares: %s" %
8342+                 str(surprise_shares))
8343 
8344hunk ./src/allmydata/mutable/publish.py 810
8345-        surprise_shares = set(read_data.keys()) - set(shnums)
8346+        # Now surprise shares contains all of the shares that we did not
8347+        # expect to be there.
8348 
8349         surprised = False
8350         for shnum in surprise_shares:
8351hunk ./src/allmydata/mutable/publish.py 817
8352             # read_data is a dict mapping shnum to checkstring (SIGNED_PREFIX)
8353             checkstring = read_data[shnum][0]
8354-            their_version_info = unpack_checkstring(checkstring)
8355-            if their_version_info == self._new_version_info:
8356+            # What we want to do here is to see if their (seqnum,
8357+            # roothash, salt) is the same as our (seqnum, roothash,
8358+            # salt), or the equivalent for MDMF. The best way to do this
8359+            # is to store a packed representation of our checkstring
8360+            # somewhere, then not bother unpacking the other
8361+            # checkstring.
8362+            if checkstring == self._checkstring:
8363                 # they have the right share, somehow
8364 
8365                 if (peerid,shnum) in self.goal:
8366hunk ./src/allmydata/mutable/publish.py 902
8367             self.log("our testv failed, so the write did not happen",
8368                      parent=lp, level=log.WEIRD, umid="8sc26g")
8369             self.surprised = True
8370-            self.bad_peers.add(peerid) # don't ask them again
8371+            # TODO: This needs to
8372+            self.bad_peers.add(writer) # don't ask them again
8373             # use the checkstring to add information to the log message
8374             for (shnum,readv) in read_data.items():
8375                 checkstring = readv[0]
8376hunk ./src/allmydata/mutable/publish.py 928
8377             # self.loop() will take care of finding new homes
8378             return
8379 
8380-        for shnum in shnums:
8381-            self.placed.add( (peerid, shnum) )
8382-            # and update the servermap
8383-            self._servermap.add_new_share(peerid, shnum,
8384+        # and update the servermap
8385+        # self.versioninfo is set during the last phase of publishing.
8386+        # If we get there, we know that responses correspond to placed
8387+        # shares, and can safely execute these statements.
8388+        if self.versioninfo:
8389+            self.log("wrote successfully: adding new share to servermap")
8390+            self._servermap.add_new_share(peerid, writer.shnum,
8391                                           self.versioninfo, started)
8392hunk ./src/allmydata/mutable/publish.py 936
8393-
8394-        # self.loop() will take care of checking to see if we're done
8395-        return
8396+            self.placed.add( (peerid, writer.shnum) )
8397 
8398hunk ./src/allmydata/mutable/publish.py 938
8399-    def _got_write_error(self, f, peerid, shnums, started):
8400-        for shnum in shnums:
8401-            self.outstanding.discard( (peerid, shnum) )
8402-        self.bad_peers.add(peerid)
8403-        if self._first_write_error is None:
8404-            self._first_write_error = f
8405-        self.log(format="error while writing shares %(shnums)s to peerid %(peerid)s",
8406-                 shnums=list(shnums), peerid=idlib.shortnodeid_b2a(peerid),
8407-                 failure=f,
8408-                 level=log.UNUSUAL)
8409         # self.loop() will take care of checking to see if we're done
8410         return
8411 
8412hunk ./src/allmydata/mutable/publish.py 949
8413         now = time.time()
8414         self._status.timings["total"] = now - self._started
8415         self._status.set_active(False)
8416-        if isinstance(res, failure.Failure):
8417-            self.log("Publish done, with failure", failure=res,
8418-                     level=log.WEIRD, umid="nRsR9Q")
8419-            self._status.set_status("Failed")
8420-        elif self.surprised:
8421-            self.log("Publish done, UncoordinatedWriteError", level=log.UNUSUAL)
8422-            self._status.set_status("UncoordinatedWriteError")
8423-            # deliver a failure
8424-            res = failure.Failure(UncoordinatedWriteError())
8425-            # TODO: recovery
8426-        else:
8427-            self.log("Publish done, success")
8428-            self._status.set_status("Finished")
8429-            self._status.set_progress(1.0)
8430+        self.log("Publish done, success")
8431+        self._status.set_status("Finished")
8432+        self._status.set_progress(1.0)
8433         eventually(self.done_deferred.callback, res)
8434 
8435hunk ./src/allmydata/mutable/publish.py 954
8436+    def _failure(self):
8437+
8438+        if not self.surprised:
8439+            # We ran out of servers
8440+            self.log("Publish ran out of good servers, "
8441+                     "last failure was: %s" % str(self._last_failure))
8442+            e = NotEnoughServersError("Ran out of non-bad servers, "
8443+                                      "last failure was %s" %
8444+                                      str(self._last_failure))
8445+        else:
8446+            # We ran into shares that we didn't recognize, which means
8447+            # that we need to return an UncoordinatedWriteError.
8448+            self.log("Publish failed with UncoordinatedWriteError")
8449+            e = UncoordinatedWriteError()
8450+        f = failure.Failure(e)
8451+        eventually(self.done_deferred.callback, f)
8452}
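
The rewrite above collapses the separate _publish_mdmf/_publish_sdmf paths into a single state machine: _push() dispatches on PUSHING_BLOCKS_STATE, PUSHING_EVERYTHING_ELSE_STATE, and DONE_STATE, pushing one encoded segment per pass and then the remaining share fields. A minimal, synchronous sketch of that control flow (the class and its recorded steps are illustrative; the real Publish drives each step through Twisted Deferreds and per-writer proxies):

PUSHING_BLOCKS_STATE = 0
PUSHING_EVERYTHING_ELSE_STATE = 1
DONE_STATE = 2

class PushLoopSketch:
    """Illustrative stand-in for Publish; records which steps would run."""
    def __init__(self, num_segments):
        self.num_segments = num_segments
        self._current_segment = 0
        self._state = PUSHING_BLOCKS_STATE
        self.steps = []

    def _push(self):
        # State dispatcher: push blocks first, then everything else, then stop.
        if self._state == PUSHING_BLOCKS_STATE:
            return self.push_segment(self._current_segment)
        elif self._state == PUSHING_EVERYTHING_ELSE_STATE:
            return self.push_everything_else()
        return self.steps   # DONE_STATE: nothing left to do

    def push_segment(self, segnum):
        if segnum == self.num_segments:
            # No more segments: move on to the non-block share fields.
            self._state = PUSHING_EVERYTHING_ELSE_STATE
            return self._push()
        # The real code encrypts, FEC-encodes, and put_block()s here.
        self.steps.append("segment %d" % segnum)
        self._current_segment += 1
        return self._push()

    def push_everything_else(self):
        # encprivkey, block/share hash chains, root hash, signature, verification key.
        self.steps.append("everything else")
        self._state = DONE_STATE
        return self._push()

assert PushLoopSketch(3)._push() == ["segment 0", "segment 1",
                                     "segment 2", "everything else"]
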
8453[test/test_mutable.py: remove tests that are no longer relevant
8454Kevan Carstensen <kevan@isnotajoke.com>**20100702225710
8455 Ignore-this: 90a26b4cc4b2e190a635474ba7097e21
8456] hunk ./src/allmydata/test/test_mutable.py 627
8457         return d
8458 
8459 
8460-class MakeShares(unittest.TestCase):
8461-    def test_encrypt(self):
8462-        nm = make_nodemaker()
8463-        CONTENTS = "some initial contents"
8464-        d = nm.create_mutable_file(CONTENTS)
8465-        def _created(fn):
8466-            p = Publish(fn, nm.storage_broker, None)
8467-            p.salt = "SALT" * 4
8468-            p.readkey = "\x00" * 16
8469-            p.newdata = CONTENTS
8470-            p.required_shares = 3
8471-            p.total_shares = 10
8472-            p.setup_encoding_parameters()
8473-            return p._encrypt_and_encode()
8474-        d.addCallback(_created)
8475-        def _done(shares_and_shareids):
8476-            (shares, share_ids) = shares_and_shareids
8477-            self.failUnlessEqual(len(shares), 10)
8478-            for sh in shares:
8479-                self.failUnless(isinstance(sh, str))
8480-                self.failUnlessEqual(len(sh), 7)
8481-            self.failUnlessEqual(len(share_ids), 10)
8482-        d.addCallback(_done)
8483-        return d
8484-    test_encrypt.todo = "Write an equivalent of this for the new uploader"
8485-
8486-    def test_generate(self):
8487-        nm = make_nodemaker()
8488-        CONTENTS = "some initial contents"
8489-        d = nm.create_mutable_file(CONTENTS)
8490-        def _created(fn):
8491-            self._fn = fn
8492-            p = Publish(fn, nm.storage_broker, None)
8493-            self._p = p
8494-            p.newdata = CONTENTS
8495-            p.required_shares = 3
8496-            p.total_shares = 10
8497-            p.setup_encoding_parameters()
8498-            p._new_seqnum = 3
8499-            p.salt = "SALT" * 4
8500-            # make some fake shares
8501-            shares_and_ids = ( ["%07d" % i for i in range(10)], range(10) )
8502-            p._privkey = fn.get_privkey()
8503-            p._encprivkey = fn.get_encprivkey()
8504-            p._pubkey = fn.get_pubkey()
8505-            return p._generate_shares(shares_and_ids)
8506-        d.addCallback(_created)
8507-        def _generated(res):
8508-            p = self._p
8509-            final_shares = p.shares
8510-            root_hash = p.root_hash
8511-            self.failUnlessEqual(len(root_hash), 32)
8512-            self.failUnless(isinstance(final_shares, dict))
8513-            self.failUnlessEqual(len(final_shares), 10)
8514-            self.failUnlessEqual(sorted(final_shares.keys()), range(10))
8515-            for i,sh in final_shares.items():
8516-                self.failUnless(isinstance(sh, str))
8517-                # feed the share through the unpacker as a sanity-check
8518-                pieces = unpack_share(sh)
8519-                (u_seqnum, u_root_hash, IV, k, N, segsize, datalen,
8520-                 pubkey, signature, share_hash_chain, block_hash_tree,
8521-                 share_data, enc_privkey) = pieces
8522-                self.failUnlessEqual(u_seqnum, 3)
8523-                self.failUnlessEqual(u_root_hash, root_hash)
8524-                self.failUnlessEqual(k, 3)
8525-                self.failUnlessEqual(N, 10)
8526-                self.failUnlessEqual(segsize, 21)
8527-                self.failUnlessEqual(datalen, len(CONTENTS))
8528-                self.failUnlessEqual(pubkey, p._pubkey.serialize())
8529-                sig_material = struct.pack(">BQ32s16s BBQQ",
8530-                                           0, p._new_seqnum, root_hash, IV,
8531-                                           k, N, segsize, datalen)
8532-                self.failUnless(p._pubkey.verify(sig_material, signature))
8533-                #self.failUnlessEqual(signature, p._privkey.sign(sig_material))
8534-                self.failUnlessEqual(len(share_hash_chain), 4) # ln2(10)++
8535-                for shnum,share_hash in share_hash_chain.items():
8536-                    self.failUnless(isinstance(shnum, int))
8537-                    self.failUnless(isinstance(share_hash, str))
8538-                    self.failUnlessEqual(len(share_hash), 32)
8539-                self.failUnless(isinstance(block_hash_tree, list))
8540-                self.failUnlessEqual(len(block_hash_tree), 1) # very small tree
8541-                self.failUnlessEqual(IV, "SALT"*4)
8542-                self.failUnlessEqual(len(share_data), len("%07d" % 1))
8543-                self.failUnlessEqual(enc_privkey, self._fn.get_encprivkey())
8544-        d.addCallback(_generated)
8545-        return d
8546-    test_generate.todo = "Write an equivalent of this for the new uploader"
8547-
8548-    # TODO: when we publish to 20 peers, we should get one share per peer on 10
8549-    # when we publish to 3 peers, we should get either 3 or 4 shares per peer
8550-    # when we publish to zero peers, we should get a NotEnoughSharesError
8551-
8552 class PublishMixin:
8553     def publish_one(self):
8554         # publish a file and create shares, which can then be manipulated
8555[interfaces.py: create IMutableUploadable
8556Kevan Carstensen <kevan@isnotajoke.com>**20100706215217
8557 Ignore-this: bee202ec2bfbd8e41f2d4019cce176c7
8558] hunk ./src/allmydata/interfaces.py 1693
8559         """The upload is finished, and whatever filehandle was in use may be
8560         closed."""
8561 
8562+
8563+class IMutableUploadable(Interface):
8564+    """
8565+    I represent content that is due to be uploaded to a mutable filecap.
8566+    """
8567+    # This is somewhat simpler than the IUploadable interface above
8568+    # because mutable files do not need to be concerned with possibly
8569+    # generating a CHK, nor with per-file keys. It is a subset of the
8570+    # methods in IUploadable, though, so we could just as well implement
8571+    # the mutable uploadables as IUploadables that don't happen to use
8572+    # those methods (with the understanding that the unused methods will
8573+    # never be called on such objects)
8574+    def get_size():
8575+        """
8576+        Returns the size, in bytes, of the content held by the
8577+        uploadable.
8578+        """
8579+
8580+    def read(length):
8581+        """
8582+        Returns a list of strings which, when concatenated, are the next
8583+        length bytes of the file, or fewer if there are fewer bytes
8584+        between the current location and the end of the file.
8585+        """
8586+
8587+    def close():
8588+        """
8589+        The process that used the Uploadable is finished using it, so
8590+        the uploadable may be closed.
8591+        """
8592+
8593 class IUploadResults(Interface):
8594     """I am returned by upload() methods. I contain a number of public
8595     attributes which can be read to determine the results of the upload. Some
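(Editorial sketch, not part of the patch: the IMutableUploadable contract above can be exercised with a small consumer. This assumes get_size() returns the size synchronously, as the concrete uploadables added in the next patch do, and that read() returns a list of strings as documented.)

    def read_entire_uploadable(uploadable, chunk_size=65536):
        # Drain an IMutableUploadable-style object: join each chunk that
        # read() returns, stop after get_size() bytes, then close it.
        remaining = uploadable.get_size()
        pieces = []
        while remaining > 0:
            chunk = "".join(uploadable.read(min(chunk_size, remaining)))
            if not chunk:
                break
            pieces.append(chunk)
            remaining -= len(chunk)
        uploadable.close()
        return "".join(pieces)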
8596[mutable/publish.py: add MutableDataHandle and MutableFileHandle
8597Kevan Carstensen <kevan@isnotajoke.com>**20100706215257
8598 Ignore-this: 295ea3bc2a962fd14fb7877fc76c011c
8599] {
8600hunk ./src/allmydata/mutable/publish.py 8
8601 from zope.interface import implements
8602 from twisted.internet import defer
8603 from twisted.python import failure
8604-from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION
8605+from allmydata.interfaces import IPublishStatus, SDMF_VERSION, MDMF_VERSION, \
8606+                                 IMutableUploadable
8607 from allmydata.util import base32, hashutil, mathutil, idlib, log
8608 from allmydata import hashtree, codec
8609 from allmydata.storage.server import si_b2a
8610hunk ./src/allmydata/mutable/publish.py 971
8611             e = UncoordinatedWriteError()
8612         f = failure.Failure(e)
8613         eventually(self.done_deferred.callback, f)
8614+
8615+
8616+class MutableFileHandle:
8617+    """
8618+    I am a mutable uploadable built around a filehandle-like object,
8619+    usually either a StringIO instance or a handle to an actual file.
8620+    """
8621+    implements(IMutableUploadable)
8622+
8623+    def __init__(self, filehandle):
8624+        # The filehandle can be any file-like object; we only check for
8625+        # read and close here, but get_size below also uses seek and tell.
8626+        assert hasattr(filehandle, "read")
8627+        assert hasattr(filehandle, "close")
8628+
8629+        self._filehandle = filehandle
8630+
8631+
8632+    def get_size(self):
8633+        """
8634+        I return the amount of data in my filehandle.
8635+        """
8636+        if not hasattr(self, "_size"):
8637+            old_position = self._filehandle.tell()
8638+            # Seek to the end of the file by seeking 0 bytes from the
8639+            # file's end
8640+            self._filehandle.seek(0, os.SEEK_END)
8641+            self._size = self._filehandle.tell()
8642+            # Restore the previous position, in case this was called
8643+            # after a read.
8644+            self._filehandle.seek(old_position)
8645+            assert self._filehandle.tell() == old_position
8646+
8647+        assert hasattr(self, "_size")
8648+        return self._size
8649+
8650+
8651+    def read(self, length):
8652+        """
8653+        I return some data (up to length bytes) from my filehandle.
8654+
8655+        In most cases, I return length bytes. If I don't, it is because
8656+        length is longer than the distance between my current position
8657+        in the file that I represent and its end. In that case, I return
8658+        as many bytes as I can before going over the EOF.
8659+        """
8660+        return [self._filehandle.read(length)]
8661+
8662+
8663+    def close(self):
8664+        """
8665+        I close the underlying filehandle. Any further operations on the
8666+        filehandle fail at this point.
8667+        """
8668+        self._filehandle.close()
8669+
8670+
8671+class MutableDataHandle(MutableFileHandle):
8672+    """
8673+    I am a mutable uploadable built around a string, which I wrap in a
8674+    StringIO and treat as a filehandle.
8675+    """
8676+
8677+    def __init__(self, s):
8678+        # Take a string and return a file-like uploadable.
8679+        assert isinstance(s, str)
8680+
8681+        MutableFileHandle.__init__(self, StringIO(s))
8682}
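(Editorial sketch, not part of the patch: a quick illustration of the two uploadables just added, assuming they are importable from allmydata.mutable.publish as the tests below do.)

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle, MutableDataHandle

    data = "test data" * 100

    # Wrap a plain string.
    u1 = MutableDataHandle(data)
    assert u1.get_size() == len(data)
    first = "".join(u1.read(10))            # the first ten bytes
    rest = "".join(u1.read(u1.get_size()))  # read() stops at EOF, so this is the rest
    assert first + rest == data
    u1.close()

    # Wrap a file-like object directly.
    u2 = MutableFileHandle(StringIO(data))
    assert "".join(u2.read(len(data))) == data
    u2.close()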
8683[mutable/publish.py: reorganize in preparation of file-like uploadables
8684Kevan Carstensen <kevan@isnotajoke.com>**20100706215541
8685 Ignore-this: 5346c9f919ee5b73807c8f287c64e8ce
8686] {
8687hunk ./src/allmydata/mutable/publish.py 4
8688 
8689 
8690 import os, struct, time
8691+from StringIO import StringIO
8692 from itertools import count
8693 from zope.interface import implements
8694 from twisted.internet import defer
8695hunk ./src/allmydata/mutable/publish.py 118
8696         self._status.set_helper(False)
8697         self._status.set_progress(0.0)
8698         self._status.set_active(True)
8699-        # We use this to control how the file is written.
8700-        version = self._node.get_version()
8701-        assert version in (SDMF_VERSION, MDMF_VERSION)
8702-        self._version = version
8703+        self._version = self._node.get_version()
8704+        assert self._version in (SDMF_VERSION, MDMF_VERSION)
8705+
8706 
8707     def get_status(self):
8708         return self._status
8709hunk ./src/allmydata/mutable/publish.py 141
8710 
8711         # 0. Setup encoding parameters, encoder, and other such things.
8712         # 1. Encrypt, encode, and publish segments.
8713+        self.data = StringIO(newdata)
8714+        self.datalength = len(newdata)
8715 
8716hunk ./src/allmydata/mutable/publish.py 144
8717-        self.log("starting publish, datalen is %s" % len(newdata))
8718-        self._status.set_size(len(newdata))
8719+        self.log("starting publish, datalen is %s" % self.datalength)
8720+        self._status.set_size(self.datalength)
8721         self._status.set_status("Started")
8722         self._started = time.time()
8723 
8724hunk ./src/allmydata/mutable/publish.py 193
8725         self.full_peerlist = full_peerlist # for use later, immutable
8726         self.bad_peers = set() # peerids who have errbacked/refused requests
8727 
8728-        self.newdata = newdata
8729-
8730         # This will set self.segment_size, self.num_segments, and
8731         # self.fec.
8732         self.setup_encoding_parameters()
8733hunk ./src/allmydata/mutable/publish.py 272
8734                                                 self.required_shares,
8735                                                 self.total_shares,
8736                                                 self.segment_size,
8737-                                                len(self.newdata))
8738+                                                self.datalength)
8739             self.writers[shnum].peerid = peerid
8740             if (peerid, shnum) in self._servermap.servermap:
8741                 old_versionid, old_timestamp = self._servermap.servermap[key]
8742hunk ./src/allmydata/mutable/publish.py 318
8743         if self._version == MDMF_VERSION:
8744             segment_size = DEFAULT_MAX_SEGMENT_SIZE # 128 KiB by default
8745         else:
8746-            segment_size = len(self.newdata) # SDMF is only one segment
8747+            segment_size = self.datalength # SDMF is only one segment
8748         # this must be a multiple of self.required_shares
8749         segment_size = mathutil.next_multiple(segment_size,
8750                                               self.required_shares)
8751hunk ./src/allmydata/mutable/publish.py 324
8752         self.segment_size = segment_size
8753         if segment_size:
8754-            self.num_segments = mathutil.div_ceil(len(self.newdata),
8755+            self.num_segments = mathutil.div_ceil(self.datalength,
8756                                                   segment_size)
8757         else:
8758             self.num_segments = 0
8759hunk ./src/allmydata/mutable/publish.py 337
8760             assert self.num_segments in (0, 1) # SDMF
8761         # calculate the tail segment size.
8762 
8763-        if segment_size and self.newdata:
8764-            self.tail_segment_size = len(self.newdata) % segment_size
8765+        if segment_size and self.datalength:
8766+            self.tail_segment_size = self.datalength % segment_size
8767         else:
8768             self.tail_segment_size = 0
8769 
8770hunk ./src/allmydata/mutable/publish.py 438
8771             segsize = self.segment_size
8772 
8773 
8774-        offset = self.segment_size * segnum
8775-        length = segsize + offset
8776         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
8777hunk ./src/allmydata/mutable/publish.py 439
8778-        data = self.newdata[offset:length]
8779+        data = self.data.read(segsize)
8780+
8781         assert len(data) == segsize
8782 
8783         salt = os.urandom(16)
8784hunk ./src/allmydata/mutable/publish.py 502
8785             d.addCallback(self._got_write_answer, writer, started)
8786             d.addErrback(self._connection_problem, writer)
8787             dl.append(d)
8788-            # TODO: Naturally, we need to check on the results of these.
8789         return defer.DeferredList(dl)
8790 
8791 
8792}
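(Editorial sketch, not part of the patch: the segment accounting done in setup_encoding_parameters above, worked through with plain integers. The helper functions stand in for allmydata.util.mathutil, and the parameter values are illustrative assumptions.)

    def div_ceil(n, d):
        # ceiling division, standing in for mathutil.div_ceil
        return (n + d - 1) // d

    def next_multiple(n, k):
        # smallest multiple of k that is >= n, like mathutil.next_multiple
        return div_ceil(n, k) * k

    DEFAULT_MAX_SEGMENT_SIZE = 128 * 1024  # 128 KiB, the MDMF default above
    required_shares = 3                    # illustrative k
    datalength = 900 * 1024                # e.g. a 900 KiB MDMF file

    # MDMF caps segments at DEFAULT_MAX_SEGMENT_SIZE and rounds up to a
    # multiple of required_shares; SDMF would use datalength instead.
    segment_size = next_multiple(DEFAULT_MAX_SEGMENT_SIZE, required_shares)
    num_segments = div_ceil(datalength, segment_size)
    tail_segment_size = datalength % segment_size

    # With these numbers: segment_size == 131073, num_segments == 8, and
    # the tail segment carries the final 4089 bytes.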
8793[test/test_mutable.py: write tests for MutableFileHandle and MutableDataHandle
8794Kevan Carstensen <kevan@isnotajoke.com>**20100706215649
8795 Ignore-this: df719a0c52b4bbe9be4fae206c7ab3e7
8796] {
8797hunk ./src/allmydata/test/test_mutable.py 2
8798 
8799-import struct
8800+import struct, os
8801 from cStringIO import StringIO
8802 from twisted.trial import unittest
8803 from twisted.internet import defer, reactor
8804hunk ./src/allmydata/test/test_mutable.py 26
8805      NeedMoreDataError, UnrecoverableFileError, UncoordinatedWriteError, \
8806      NotEnoughServersError, CorruptShareError
8807 from allmydata.mutable.retrieve import Retrieve
8808-from allmydata.mutable.publish import Publish
8809+from allmydata.mutable.publish import Publish, MutableFileHandle, \
8810+                                      MutableDataHandle
8811 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
8812 from allmydata.mutable.layout import unpack_header, unpack_share, \
8813                                      MDMFSlotReadProxy
8814hunk ./src/allmydata/test/test_mutable.py 2465
8815         d.addCallback(lambda data:
8816             self.failUnlessEqual(data, CONTENTS))
8817         return d
8818+
8819+
8820+class FileHandle(unittest.TestCase):
8821+    def setUp(self):
8822+        self.test_data = "Test Data" * 50000
8823+        self.sio = StringIO(self.test_data)
8824+        self.uploadable = MutableFileHandle(self.sio)
8825+
8826+
8827+    def test_filehandle_read(self):
8828+        self.basedir = "mutable/FileHandle/test_filehandle_read"
8829+        chunk_size = 10
8830+        for i in xrange(0, len(self.test_data), chunk_size):
8831+            data = self.uploadable.read(chunk_size)
8832+            data = "".join(data)
8833+            start = i
8834+            end = i + chunk_size
8835+            self.failUnlessEqual(data, self.test_data[start:end])
8836+
8837+
8838+    def test_filehandle_get_size(self):
8839+        self.basedir = "mutable/FileHandle/test_filehandle_get_size"
8840+        actual_size = len(self.test_data)
8841+        size = self.uploadable.get_size()
8842+        self.failUnlessEqual(size, actual_size)
8843+
8844+
8845+    def test_filehandle_get_size_out_of_order(self):
8846+        # We should be able to call get_size whenever we want without
8847+        # disturbing the location of the seek pointer.
8848+        chunk_size = 100
8849+        data = self.uploadable.read(chunk_size)
8850+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8851+
8852+        # Now get the size.
8853+        size = self.uploadable.get_size()
8854+        self.failUnlessEqual(size, len(self.test_data))
8855+
8856+        # Now get more data. We should be right where we left off.
8857+        more_data = self.uploadable.read(chunk_size)
8858+        start = chunk_size
8859+        end = chunk_size * 2
8860+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8861+
8862+
8863+    def test_filehandle_file(self):
8864+        # Make sure that the MutableFileHandle works on a file as well
8865+        # as a StringIO object, since in some cases it will be asked to
8866+        # deal with files.
8867+        self.basedir = self.mktemp()
8868+        # self.mktemp() only returns a path; it does not create the directory.
8869+        os.mkdir(self.basedir)
8870+        f_path = os.path.join(self.basedir, "test_file")
8871+        f = open(f_path, "w")
8872+        f.write(self.test_data)
8873+        f.close()
8874+        f = open(f_path, "r")
8875+
8876+        uploadable = MutableFileHandle(f)
8877+
8878+        data = uploadable.read(len(self.test_data))
8879+        self.failUnlessEqual("".join(data), self.test_data)
8880+        size = uploadable.get_size()
8881+        self.failUnlessEqual(size, len(self.test_data))
8882+
8883+
8884+    def test_close(self):
8885+        # Make sure that the MutableFileHandle closes its handle when
8886+        # told to do so.
8887+        self.uploadable.close()
8888+        self.failUnless(self.sio.closed)
8889+
8890+
8891+class DataHandle(unittest.TestCase):
8892+    def setUp(self):
8893+        self.test_data = "Test Data" * 50000
8894+        self.uploadable = MutableDataHandle(self.test_data)
8895+
8896+
8897+    def test_datahandle_read(self):
8898+        chunk_size = 10
8899+        for i in xrange(0, len(self.test_data), chunk_size):
8900+            data = self.uploadable.read(chunk_size)
8901+            data = "".join(data)
8902+            start = i
8903+            end = i + chunk_size
8904+            self.failUnlessEqual(data, self.test_data[start:end])
8905+
8906+
8907+    def test_datahandle_get_size(self):
8908+        actual_size = len(self.test_data)
8909+        size = self.uploadable.get_size()
8910+        self.failUnlessEqual(size, actual_size)
8911+
8912+
8913+    def test_datahandle_get_size_out_of_order(self):
8914+        # We should be able to call get_size whenever we want without
8915+        # disturbing the location of the seek pointer.
8916+        chunk_size = 100
8917+        data = self.uploadable.read(chunk_size)
8918+        self.failUnlessEqual("".join(data), self.test_data[:chunk_size])
8919+
8920+        # Now get the size.
8921+        size = self.uploadable.get_size()
8922+        self.failUnlessEqual(size, len(self.test_data))
8923+
8924+        # Now get more data. We should be right where we left off.
8925+        more_data = self.uploadable.read(chunk_size)
8926+        start = chunk_size
8927+        end = chunk_size * 2
8928+        self.failUnlessEqual("".join(more_data), self.test_data[start:end])
8929}
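(Editorial note: the FileHandle and DataHandle cases above need no grid setup, so they can be run on their own with Twisted's trial runner, e.g. trial allmydata.test.test_mutable.FileHandle allmydata.test.test_mutable.DataHandle, assuming the usual Tahoe development environment.)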
8930[Alter tests to work with the new APIs
8931Kevan Carstensen <kevan@isnotajoke.com>**20100708000031
8932 Ignore-this: 1f377904ac61ce40e9a04716fbd2ad95
8933] {
8934hunk ./src/allmydata/test/common.py 12
8935 from allmydata import uri, dirnode, client
8936 from allmydata.introducer.server import IntroducerNode
8937 from allmydata.interfaces import IMutableFileNode, IImmutableFileNode, \
8938-     FileTooLargeError, NotEnoughSharesError, ICheckable
8939+     FileTooLargeError, NotEnoughSharesError, ICheckable, \
8940+     IMutableUploadable
8941 from allmydata.check_results import CheckResults, CheckAndRepairResults, \
8942      DeepCheckResults, DeepCheckAndRepairResults
8943 from allmydata.mutable.common import CorruptShareError
8944hunk ./src/allmydata/test/common.py 18
8945 from allmydata.mutable.layout import unpack_header
8946+from allmydata.mutable.publish import MutableDataHandle
8947 from allmydata.storage.server import storage_index_to_dir
8948 from allmydata.storage.mutable import MutableShareFile
8949 from allmydata.util import hashutil, log, fileutil, pollmixin
8950hunk ./src/allmydata/test/common.py 182
8951         self.init_from_cap(make_mutable_file_cap())
8952     def create(self, contents, key_generator=None, keysize=None):
8953         initial_contents = self._get_initial_contents(contents)
8954-        if len(initial_contents) > self.MUTABLE_SIZELIMIT:
8955+        if initial_contents.get_size() > self.MUTABLE_SIZELIMIT:
8956             raise FileTooLargeError("SDMF is limited to one segment, and "
8957hunk ./src/allmydata/test/common.py 184
8958-                                    "%d > %d" % (len(initial_contents),
8959+                                    "%d > %d" % (initial_contents.get_size(),
8960                                                  self.MUTABLE_SIZELIMIT))
8961hunk ./src/allmydata/test/common.py 186
8962-        self.all_contents[self.storage_index] = initial_contents
8963+        data = initial_contents.read(initial_contents.get_size())
8964+        data = "".join(data)
8965+        self.all_contents[self.storage_index] = data
8966         return defer.succeed(self)
8967     def _get_initial_contents(self, contents):
8968hunk ./src/allmydata/test/common.py 191
8969-        if isinstance(contents, str):
8970-            return contents
8971         if contents is None:
8972hunk ./src/allmydata/test/common.py 192
8973-            return ""
8974+            return MutableDataHandle("")
8975+
8976+        if IMutableUploadable.providedBy(contents):
8977+            return contents
8978+
8979         assert callable(contents), "%s should be callable, not %s" % \
8980                (contents, type(contents))
8981         return contents(self)
8982hunk ./src/allmydata/test/common.py 309
8983         return defer.succeed(self.all_contents[self.storage_index])
8984 
8985     def overwrite(self, new_contents):
8986-        if len(new_contents) > self.MUTABLE_SIZELIMIT:
8987+        if new_contents.get_size() > self.MUTABLE_SIZELIMIT:
8988             raise FileTooLargeError("SDMF is limited to one segment, and "
8989hunk ./src/allmydata/test/common.py 311
8990-                                    "%d > %d" % (len(new_contents),
8991+                                    "%d > %d" % (new_contents.get_size(),
8992                                                  self.MUTABLE_SIZELIMIT))
8993         assert not self.is_readonly()
8994hunk ./src/allmydata/test/common.py 314
8995-        self.all_contents[self.storage_index] = new_contents
8996+        new_data = new_contents.read(new_contents.get_size())
8997+        new_data = "".join(new_data)
8998+        self.all_contents[self.storage_index] = new_data
8999         return defer.succeed(None)
9000     def modify(self, modifier):
9001         # this does not implement FileTooLargeError, but the real one does
9002hunk ./src/allmydata/test/common.py 324
9003     def _modify(self, modifier):
9004         assert not self.is_readonly()
9005         old_contents = self.all_contents[self.storage_index]
9006-        self.all_contents[self.storage_index] = modifier(old_contents, None, True)
9007+        new_data = modifier(old_contents, None, True)
9008+        if new_data is not None:
9009+            new_data = new_data.read(new_data.get_size())
9010+            new_data = "".join(new_data)
9011+        self.all_contents[self.storage_index] = new_data
9012         return None
9013 
9014 def make_mutable_file_cap():
9015hunk ./src/allmydata/test/test_checker.py 11
9016 from allmydata.test.no_network import GridTestMixin
9017 from allmydata.immutable.upload import Data
9018 from allmydata.test.common_web import WebRenderingMixin
9019+from allmydata.mutable.publish import MutableDataHandle
9020 
9021 class FakeClient:
9022     def get_storage_broker(self):
9023hunk ./src/allmydata/test/test_checker.py 291
9024         def _stash_immutable(ur):
9025             self.imm = c0.create_node_from_uri(ur.uri)
9026         d.addCallback(_stash_immutable)
9027-        d.addCallback(lambda ign: c0.create_mutable_file("contents"))
9028+        d.addCallback(lambda ign:
9029+            c0.create_mutable_file(MutableDataHandle("contents")))
9030         def _stash_mutable(node):
9031             self.mut = node
9032         d.addCallback(_stash_mutable)
9033hunk ./src/allmydata/test/test_cli.py 12
9034 from allmydata.util import fileutil, hashutil, base32
9035 from allmydata import uri
9036 from allmydata.immutable import upload
9037+from allmydata.mutable.publish import MutableDataHandle
9038 from allmydata.dirnode import normalize
9039 
9040 # Test that the scripts can be imported -- although the actual tests of their
9041hunk ./src/allmydata/test/test_cli.py 1983
9042         self.set_up_grid()
9043         c0 = self.g.clients[0]
9044         DATA = "data" * 100
9045-        d = c0.create_mutable_file(DATA)
9046+        DATA_uploadable = MutableDataHandle(DATA)
9047+        d = c0.create_mutable_file(DATA_uploadable)
9048         def _stash_uri(n):
9049             self.uri = n.get_uri()
9050         d.addCallback(_stash_uri)
9051hunk ./src/allmydata/test/test_cli.py 2085
9052                                            upload.Data("literal",
9053                                                         convergence="")))
9054         d.addCallback(_stash_uri, "small")
9055-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"1"))
9056+        d.addCallback(lambda ign:
9057+            c0.create_mutable_file(MutableDataHandle(DATA+"1")))
9058         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9059         d.addCallback(_stash_uri, "mutable")
9060 
9061hunk ./src/allmydata/test/test_cli.py 2104
9062         # root/small
9063         # root/mutable
9064 
9065+        # We haven't broken anything yet, so this should all be healthy.
9066         d.addCallback(lambda ign: self.do_cli("deep-check", "--verbose",
9067                                               self.rooturi))
9068         def _check2((rc, out, err)):
9069hunk ./src/allmydata/test/test_cli.py 2119
9070                             in lines, out)
9071         d.addCallback(_check2)
9072 
9073+        # Similarly, all of these results should be as we expect them to
9074+        # be for a healthy file layout.
9075         d.addCallback(lambda ign: self.do_cli("stats", self.rooturi))
9076         def _check_stats((rc, out, err)):
9077             self.failUnlessReallyEqual(err, "")
9078hunk ./src/allmydata/test/test_cli.py 2136
9079             self.failUnlessIn(" 317-1000 : 1    (1000 B, 1000 B)", lines)
9080         d.addCallback(_check_stats)
9081 
9082+        # Now we break things.
9083         def _clobber_shares(ignored):
9084             shares = self.find_shares(self.uris[u"gööd"])
9085             self.failUnlessReallyEqual(len(shares), 10)
9086hunk ./src/allmydata/test/test_cli.py 2155
9087         d.addCallback(_clobber_shares)
9088 
9089         # root
9090-        # root/gööd  [9 shares]
9091+        # root/gööd  [1 missing share]
9092         # root/small
9093         # root/mutable [1 corrupt share]
9094 
9095hunk ./src/allmydata/test/test_cli.py 2161
9096         d.addCallback(lambda ign:
9097                       self.do_cli("deep-check", "--verbose", self.rooturi))
9098+        # This should reveal the missing share, but not the corrupt
9099+        # share, since we didn't tell the deep check operation to also
9100+        # verify.
9101         def _check3((rc, out, err)):
9102             self.failUnlessReallyEqual(err, "")
9103             self.failUnlessReallyEqual(rc, 0)
9104hunk ./src/allmydata/test/test_cli.py 2212
9105                                   "--verbose", "--verify", "--repair",
9106                                   self.rooturi))
9107         def _check6((rc, out, err)):
9108+            # We've just repaired the directory. There is no reason for
9109+            # that repair to be unsuccessful.
9110             self.failUnlessReallyEqual(err, "")
9111             self.failUnlessReallyEqual(rc, 0)
9112             lines = out.splitlines()
9113hunk ./src/allmydata/test/test_deepcheck.py 9
9114 from twisted.internet import threads # CLI tests use deferToThread
9115 from allmydata.immutable import upload
9116 from allmydata.mutable.common import UnrecoverableFileError
9117+from allmydata.mutable.publish import MutableDataHandle
9118 from allmydata.util import idlib
9119 from allmydata.util import base32
9120 from allmydata.scripts import runner
9121hunk ./src/allmydata/test/test_deepcheck.py 38
9122         self.basedir = "deepcheck/MutableChecker/good"
9123         self.set_up_grid()
9124         CONTENTS = "a little bit of data"
9125-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9126+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9127+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9128         def _created(node):
9129             self.node = node
9130             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9131hunk ./src/allmydata/test/test_deepcheck.py 61
9132         self.basedir = "deepcheck/MutableChecker/corrupt"
9133         self.set_up_grid()
9134         CONTENTS = "a little bit of data"
9135-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9136+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9137+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9138         def _stash_and_corrupt(node):
9139             self.node = node
9140             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9141hunk ./src/allmydata/test/test_deepcheck.py 99
9142         self.basedir = "deepcheck/MutableChecker/delete_share"
9143         self.set_up_grid()
9144         CONTENTS = "a little bit of data"
9145-        d = self.g.clients[0].create_mutable_file(CONTENTS)
9146+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9147+        d = self.g.clients[0].create_mutable_file(CONTENTS_uploadable)
9148         def _stash_and_delete(node):
9149             self.node = node
9150             self.fileurl = "uri/" + urllib.quote(node.get_uri())
9151hunk ./src/allmydata/test/test_deepcheck.py 223
9152             self.root = n
9153             self.root_uri = n.get_uri()
9154         d.addCallback(_created_root)
9155-        d.addCallback(lambda ign: c0.create_mutable_file("mutable file contents"))
9156+        d.addCallback(lambda ign:
9157+            c0.create_mutable_file(MutableDataHandle("mutable file contents")))
9158         d.addCallback(lambda n: self.root.set_node(u"mutable", n))
9159         def _created_mutable(n):
9160             self.mutable = n
9161hunk ./src/allmydata/test/test_deepcheck.py 965
9162     def create_mangled(self, ignored, name):
9163         nodetype, mangletype = name.split("-", 1)
9164         if nodetype == "mutable":
9165-            d = self.g.clients[0].create_mutable_file("mutable file contents")
9166+            mutable_uploadable = MutableDataHandle("mutable file contents")
9167+            d = self.g.clients[0].create_mutable_file(mutable_uploadable)
9168             d.addCallback(lambda n: self.root.set_node(unicode(name), n))
9169         elif nodetype == "large":
9170             large = upload.Data("Lots of data\n" * 1000 + name + "\n", None)
9171hunk ./src/allmydata/test/test_dirnode.py 1281
9172     implements(IMutableFileNode)
9173     counter = 0
9174     def __init__(self, initial_contents=""):
9175-        self.data = self._get_initial_contents(initial_contents)
9176+        data = self._get_initial_contents(initial_contents)
9177+        self.data = data.read(data.get_size())
9178+        self.data = "".join(self.data)
9179+
9180         counter = FakeMutableFile.counter
9181         FakeMutableFile.counter += 1
9182         writekey = hashutil.ssk_writekey_hash(str(counter))
9183hunk ./src/allmydata/test/test_dirnode.py 1331
9184         pass
9185 
9186     def modify(self, modifier):
9187-        self.data = modifier(self.data, None, True)
9188+        data = modifier(self.data, None, True)
9189+        self.data = data.read(data.get_size())
9190+        self.data = "".join(self.data)
9191         return defer.succeed(None)
9192 
9193 class FakeNodeMaker(NodeMaker):
9194hunk ./src/allmydata/test/test_hung_server.py 10
9195 from allmydata.util.consumer import download_to_data
9196 from allmydata.immutable import upload
9197 from allmydata.mutable.common import UnrecoverableFileError
9198+from allmydata.mutable.publish import MutableDataHandle
9199 from allmydata.storage.common import storage_index_to_dir
9200 from allmydata.test.no_network import GridTestMixin
9201 from allmydata.test.common import ShouldFailMixin, _corrupt_share_data
9202hunk ./src/allmydata/test/test_hung_server.py 96
9203         self.servers = [(id, ss) for (id, ss) in nm.storage_broker.get_all_servers()]
9204 
9205         if mutable:
9206-            d = nm.create_mutable_file(mutable_plaintext)
9207+            uploadable = MutableDataHandle(mutable_plaintext)
9208+            d = nm.create_mutable_file(uploadable)
9209             def _uploaded_mutable(node):
9210                 self.uri = node.get_uri()
9211                 self.shares = self.find_shares(self.uri)
9212hunk ./src/allmydata/test/test_mutable.py 297
9213             d.addCallback(lambda smap: smap.dump(StringIO()))
9214             d.addCallback(lambda sio:
9215                           self.failUnless("3-of-10" in sio.getvalue()))
9216-            d.addCallback(lambda res: n.overwrite("contents 1"))
9217+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9218             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9219             d.addCallback(lambda res: n.download_best_version())
9220             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9221hunk ./src/allmydata/test/test_mutable.py 304
9222             d.addCallback(lambda res: n.get_size_of_best_version())
9223             d.addCallback(lambda size:
9224                           self.failUnlessEqual(size, len("contents 1")))
9225-            d.addCallback(lambda res: n.overwrite("contents 2"))
9226+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9227             d.addCallback(lambda res: n.download_best_version())
9228             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9229             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9230hunk ./src/allmydata/test/test_mutable.py 308
9231-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9232+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9233             d.addCallback(lambda res: n.download_best_version())
9234             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9235             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9236hunk ./src/allmydata/test/test_mutable.py 320
9237             # mapupdate-to-retrieve data caching (i.e. make the shares larger
9238             # than the default readsize, which is 2000 bytes). A 15kB file
9239             # will have 5kB shares.
9240-            d.addCallback(lambda res: n.overwrite("large size file" * 1000))
9241+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("large size file" * 1000)))
9242             d.addCallback(lambda res: n.download_best_version())
9243             d.addCallback(lambda res:
9244                           self.failUnlessEqual(res, "large size file" * 1000))
9245hunk ./src/allmydata/test/test_mutable.py 343
9246             # to make them big enough to force the file to be uploaded
9247             # in more than one segment.
9248             big_contents = "contents1" * 100000 # about 900 KiB
9249+            big_contents_uploadable = MutableDataHandle(big_contents)
9250             d.addCallback(lambda ignored:
9251hunk ./src/allmydata/test/test_mutable.py 345
9252-                n.overwrite(big_contents))
9253+                n.overwrite(big_contents_uploadable))
9254             d.addCallback(lambda ignored:
9255                 n.download_best_version())
9256             d.addCallback(lambda data:
9257hunk ./src/allmydata/test/test_mutable.py 355
9258             # segments, so that we make the downloader deal with
9259             # multiple segments.
9260             bigger_contents = "contents2" * 1000000 # about 9MiB
9261+            bigger_contents_uploadable = MutableDataHandle(bigger_contents)
9262             d.addCallback(lambda ignored:
9263hunk ./src/allmydata/test/test_mutable.py 357
9264-                n.overwrite(bigger_contents))
9265+                n.overwrite(bigger_contents_uploadable))
9266             d.addCallback(lambda ignored:
9267                 n.download_best_version())
9268             d.addCallback(lambda data:
9269hunk ./src/allmydata/test/test_mutable.py 368
9270 
9271 
9272     def test_create_with_initial_contents(self):
9273-        d = self.nodemaker.create_mutable_file("contents 1")
9274+        upload1 = MutableDataHandle("contents 1")
9275+        d = self.nodemaker.create_mutable_file(upload1)
9276         def _created(n):
9277             d = n.download_best_version()
9278             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9279hunk ./src/allmydata/test/test_mutable.py 373
9280-            d.addCallback(lambda res: n.overwrite("contents 2"))
9281+            upload2 = MutableDataHandle("contents 2")
9282+            d.addCallback(lambda res: n.overwrite(upload2))
9283             d.addCallback(lambda res: n.download_best_version())
9284             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9285             return d
9286hunk ./src/allmydata/test/test_mutable.py 380
9287         d.addCallback(_created)
9288         return d
9289+    test_create_with_initial_contents.timeout = 15
9290 
9291 
9292     def test_create_mdmf_with_initial_contents(self):
9293hunk ./src/allmydata/test/test_mutable.py 385
9294         initial_contents = "foobarbaz" * 131072 # 900KiB
9295-        d = self.nodemaker.create_mutable_file(initial_contents,
9296+        initial_contents_uploadable = MutableDataHandle(initial_contents)
9297+        d = self.nodemaker.create_mutable_file(initial_contents_uploadable,
9298                                                version=MDMF_VERSION)
9299         def _created(n):
9300             d = n.download_best_version()
9301hunk ./src/allmydata/test/test_mutable.py 392
9302             d.addCallback(lambda data:
9303                 self.failUnlessEqual(data, initial_contents))
9304+            uploadable2 = MutableDataHandle(initial_contents + "foobarbaz")
9305             d.addCallback(lambda ignored:
9306hunk ./src/allmydata/test/test_mutable.py 394
9307-                n.overwrite(initial_contents + "foobarbaz"))
9308+                n.overwrite(uploadable2))
9309             d.addCallback(lambda ignored:
9310                 n.download_best_version())
9311             d.addCallback(lambda data:
9312hunk ./src/allmydata/test/test_mutable.py 413
9313             key = n.get_writekey()
9314             self.failUnless(isinstance(key, str), key)
9315             self.failUnlessEqual(len(key), 16) # AES key size
9316-            return data
9317+            return MutableDataHandle(data)
9318         d = self.nodemaker.create_mutable_file(_make_contents)
9319         def _created(n):
9320             return n.download_best_version()
9321hunk ./src/allmydata/test/test_mutable.py 429
9322             key = n.get_writekey()
9323             self.failUnless(isinstance(key, str), key)
9324             self.failUnlessEqual(len(key), 16)
9325-            return data
9326+            return MutableDataHandle(data)
9327         d = self.nodemaker.create_mutable_file(_make_contents,
9328                                                version=MDMF_VERSION)
9329         d.addCallback(lambda n:
9330hunk ./src/allmydata/test/test_mutable.py 441
9331 
9332     def test_create_with_too_large_contents(self):
9333         BIG = "a" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9334-        d = self.nodemaker.create_mutable_file(BIG)
9335+        BIG_uploadable = MutableDataHandle(BIG)
9336+        d = self.nodemaker.create_mutable_file(BIG_uploadable)
9337         def _created(n):
9338hunk ./src/allmydata/test/test_mutable.py 444
9339-            d = n.overwrite(BIG)
9340+            other_BIG_uploadable = MutableDataHandle(BIG)
9341+            d = n.overwrite(other_BIG_uploadable)
9342             return d
9343         d.addCallback(_created)
9344         return d
9345hunk ./src/allmydata/test/test_mutable.py 459
9346 
9347     def test_modify(self):
9348         def _modifier(old_contents, servermap, first_time):
9349-            return old_contents + "line2"
9350+            new_contents = old_contents + "line2"
9351+            return MutableDataHandle(new_contents)
9352         def _non_modifier(old_contents, servermap, first_time):
9353hunk ./src/allmydata/test/test_mutable.py 462
9354-            return old_contents
9355+            return MutableDataHandle(old_contents)
9356         def _none_modifier(old_contents, servermap, first_time):
9357             return None
9358         def _error_modifier(old_contents, servermap, first_time):
9359hunk ./src/allmydata/test/test_mutable.py 468
9360             raise ValueError("oops")
9361         def _toobig_modifier(old_contents, servermap, first_time):
9362-            return "b" * (self.OLD_MAX_SEGMENT_SIZE+1)
9363+            new_content = "b" * (self.OLD_MAX_SEGMENT_SIZE + 1)
9364+            return MutableDataHandle(new_content)
9365         calls = []
9366         def _ucw_error_modifier(old_contents, servermap, first_time):
9367             # simulate an UncoordinatedWriteError once
9368hunk ./src/allmydata/test/test_mutable.py 476
9369             calls.append(1)
9370             if len(calls) <= 1:
9371                 raise UncoordinatedWriteError("simulated")
9372-            return old_contents + "line3"
9373+            new_contents = old_contents + "line3"
9374+            return MutableDataHandle(new_contents)
9375         def _ucw_error_non_modifier(old_contents, servermap, first_time):
9376             # simulate an UncoordinatedWriteError once, and don't actually
9377             # modify the contents on subsequent invocations
9378hunk ./src/allmydata/test/test_mutable.py 484
9379             calls.append(1)
9380             if len(calls) <= 1:
9381                 raise UncoordinatedWriteError("simulated")
9382-            return old_contents
9383+            return MutableDataHandle(old_contents)
9384 
9385hunk ./src/allmydata/test/test_mutable.py 486
9386-        d = self.nodemaker.create_mutable_file("line1")
9387+        initial_contents = "line1"
9388+        d = self.nodemaker.create_mutable_file(MutableDataHandle(initial_contents))
9389         def _created(n):
9390             d = n.modify(_modifier)
9391             d.addCallback(lambda res: n.download_best_version())
9392hunk ./src/allmydata/test/test_mutable.py 548
9393 
9394     def test_modify_backoffer(self):
9395         def _modifier(old_contents, servermap, first_time):
9396-            return old_contents + "line2"
9397+            return MutableDataHandle(old_contents + "line2")
9398         calls = []
9399         def _ucw_error_modifier(old_contents, servermap, first_time):
9400             # simulate an UncoordinatedWriteError once
9401hunk ./src/allmydata/test/test_mutable.py 555
9402             calls.append(1)
9403             if len(calls) <= 1:
9404                 raise UncoordinatedWriteError("simulated")
9405-            return old_contents + "line3"
9406+            return MutableDataHandle(old_contents + "line3")
9407         def _always_ucw_error_modifier(old_contents, servermap, first_time):
9408             raise UncoordinatedWriteError("simulated")
9409         def _backoff_stopper(node, f):
9410hunk ./src/allmydata/test/test_mutable.py 570
9411         giveuper._delay = 0.1
9412         giveuper.factor = 1
9413 
9414-        d = self.nodemaker.create_mutable_file("line1")
9415+        d = self.nodemaker.create_mutable_file(MutableDataHandle("line1"))
9416         def _created(n):
9417             d = n.modify(_modifier)
9418             d.addCallback(lambda res: n.download_best_version())
9419hunk ./src/allmydata/test/test_mutable.py 620
9420             d.addCallback(lambda smap: smap.dump(StringIO()))
9421             d.addCallback(lambda sio:
9422                           self.failUnless("3-of-10" in sio.getvalue()))
9423-            d.addCallback(lambda res: n.overwrite("contents 1"))
9424+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 1")))
9425             d.addCallback(lambda res: self.failUnlessIdentical(res, None))
9426             d.addCallback(lambda res: n.download_best_version())
9427             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9428hunk ./src/allmydata/test/test_mutable.py 624
9429-            d.addCallback(lambda res: n.overwrite("contents 2"))
9430+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9431             d.addCallback(lambda res: n.download_best_version())
9432             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9433             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9434hunk ./src/allmydata/test/test_mutable.py 628
9435-            d.addCallback(lambda smap: n.upload("contents 3", smap))
9436+            d.addCallback(lambda smap: n.upload(MutableDataHandle("contents 3"), smap))
9437             d.addCallback(lambda res: n.download_best_version())
9438             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 3"))
9439             d.addCallback(lambda res: n.get_servermap(MODE_ANYTHING))
9440hunk ./src/allmydata/test/test_mutable.py 646
9441         # publish a file and create shares, which can then be manipulated
9442         # later.
9443         self.CONTENTS = "New contents go here" * 1000
9444+        self.uploadable = MutableDataHandle(self.CONTENTS)
9445         self._storage = FakeStorage()
9446         self._nodemaker = make_nodemaker(self._storage)
9447         self._storage_broker = self._nodemaker.storage_broker
9448hunk ./src/allmydata/test/test_mutable.py 650
9449-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9450+        d = self._nodemaker.create_mutable_file(self.uploadable)
9451         def _created(node):
9452             self._fn = node
9453             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9454hunk ./src/allmydata/test/test_mutable.py 662
9455         # an MDMF file.
9456         # self.CONTENTS should have more than one segment.
9457         self.CONTENTS = "This is an MDMF file" * 100000
9458+        self.uploadable = MutableDataHandle(self.CONTENTS)
9459         self._storage = FakeStorage()
9460         self._nodemaker = make_nodemaker(self._storage)
9461         self._storage_broker = self._nodemaker.storage_broker
9462hunk ./src/allmydata/test/test_mutable.py 666
9463-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=1)
9464+        d = self._nodemaker.create_mutable_file(self.uploadable, version=MDMF_VERSION)
9465         def _created(node):
9466             self._fn = node
9467             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9468hunk ./src/allmydata/test/test_mutable.py 678
9469         # like publish_one, except that the result is guaranteed to be
9470         # an SDMF file
9471         self.CONTENTS = "This is an SDMF file" * 1000
9472+        self.uploadable = MutableDataHandle(self.CONTENTS)
9473         self._storage = FakeStorage()
9474         self._nodemaker = make_nodemaker(self._storage)
9475         self._storage_broker = self._nodemaker.storage_broker
9476hunk ./src/allmydata/test/test_mutable.py 682
9477-        d = self._nodemaker.create_mutable_file(self.CONTENTS, version=0)
9478+        d = self._nodemaker.create_mutable_file(self.uploadable, version=SDMF_VERSION)
9479         def _created(node):
9480             self._fn = node
9481             self._fn2 = self._nodemaker.create_from_cap(node.get_uri())
9482hunk ./src/allmydata/test/test_mutable.py 696
9483                          "Contents 2",
9484                          "Contents 3a",
9485                          "Contents 3b"]
9486+        self.uploadables = [MutableDataHandle(d) for d in self.CONTENTS]
9487         self._copied_shares = {}
9488         self._storage = FakeStorage()
9489         self._nodemaker = make_nodemaker(self._storage)
9490hunk ./src/allmydata/test/test_mutable.py 700
9491-        d = self._nodemaker.create_mutable_file(self.CONTENTS[0], version=version) # seqnum=1
9492+        d = self._nodemaker.create_mutable_file(self.uploadables[0], version=version) # seqnum=1
9493         def _created(node):
9494             self._fn = node
9495             # now create multiple versions of the same file, and accumulate
9496hunk ./src/allmydata/test/test_mutable.py 707
9497             # their shares, so we can mix and match them later.
9498             d = defer.succeed(None)
9499             d.addCallback(self._copy_shares, 0)
9500-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[1])) #s2
9501+            d.addCallback(lambda res: node.overwrite(self.uploadables[1])) #s2
9502             d.addCallback(self._copy_shares, 1)
9503hunk ./src/allmydata/test/test_mutable.py 709
9504-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[2])) #s3
9505+            d.addCallback(lambda res: node.overwrite(self.uploadables[2])) #s3
9506             d.addCallback(self._copy_shares, 2)
9507hunk ./src/allmydata/test/test_mutable.py 711
9508-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[3])) #s4a
9509+            d.addCallback(lambda res: node.overwrite(self.uploadables[3])) #s4a
9510             d.addCallback(self._copy_shares, 3)
9511             # now we replace all the shares with version s3, and upload a new
9512             # version to get s4b.
9513hunk ./src/allmydata/test/test_mutable.py 717
9514             rollback = dict([(i,2) for i in range(10)])
9515             d.addCallback(lambda res: self._set_versions(rollback))
9516-            d.addCallback(lambda res: node.overwrite(self.CONTENTS[4])) #s4b
9517+            d.addCallback(lambda res: node.overwrite(self.uploadables[4])) #s4b
9518             d.addCallback(self._copy_shares, 4)
9519             # we leave the storage in state 4
9520             return d
9521hunk ./src/allmydata/test/test_mutable.py 826
9522         # create a new file, which is large enough to knock the privkey out
9523         # of the early part of the file
9524         LARGE = "These are Larger contents" * 200 # about 5KB
9525-        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE))
9526+        LARGE_uploadable = MutableDataHandle(LARGE)
9527+        d.addCallback(lambda res: self._nodemaker.create_mutable_file(LARGE_uploadable))
9528         def _created(large_fn):
9529             large_fn2 = self._nodemaker.create_from_cap(large_fn.get_uri())
9530             return self.make_servermap(MODE_WRITE, large_fn2)
9531hunk ./src/allmydata/test/test_mutable.py 1842
9532 class MultipleEncodings(unittest.TestCase):
9533     def setUp(self):
9534         self.CONTENTS = "New contents go here"
9535+        self.uploadable = MutableDataHandle(self.CONTENTS)
9536         self._storage = FakeStorage()
9537         self._nodemaker = make_nodemaker(self._storage, num_peers=20)
9538         self._storage_broker = self._nodemaker.storage_broker
9539hunk ./src/allmydata/test/test_mutable.py 1846
9540-        d = self._nodemaker.create_mutable_file(self.CONTENTS)
9541+        d = self._nodemaker.create_mutable_file(self.uploadable)
9542         def _created(node):
9543             self._fn = node
9544         d.addCallback(_created)
9545hunk ./src/allmydata/test/test_mutable.py 1872
9546         s = self._storage
9547         s._peers = {} # clear existing storage
9548         p2 = Publish(fn2, self._storage_broker, None)
9549-        d = p2.publish(data)
9550+        uploadable = MutableDataHandle(data)
9551+        d = p2.publish(uploadable)
9552         def _published(res):
9553             shares = s._peers
9554             s._peers = {}
9555hunk ./src/allmydata/test/test_mutable.py 2049
9556         self._set_versions(target)
9557 
9558         def _modify(oldversion, servermap, first_time):
9559-            return oldversion + " modified"
9560+            return MutableDataHandle(oldversion + " modified")
9561         d = self._fn.modify(_modify)
9562         d.addCallback(lambda res: self._fn.download_best_version())
9563         expected = self.CONTENTS[2] + " modified"
9564hunk ./src/allmydata/test/test_mutable.py 2175
9565         self.basedir = "mutable/Problems/test_publish_surprise"
9566         self.set_up_grid()
9567         nm = self.g.clients[0].nodemaker
9568-        d = nm.create_mutable_file("contents 1")
9569+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9570         def _created(n):
9571             d = defer.succeed(None)
9572             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9573hunk ./src/allmydata/test/test_mutable.py 2185
9574             d.addCallback(_got_smap1)
9575             # then modify the file, leaving the old map untouched
9576             d.addCallback(lambda res: log.msg("starting winning write"))
9577-            d.addCallback(lambda res: n.overwrite("contents 2"))
9578+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9579             # now attempt to modify the file with the old servermap. This
9580             # will look just like an uncoordinated write, in which every
9581             # single share got updated between our mapupdate and our publish
9582hunk ./src/allmydata/test/test_mutable.py 2194
9583                           self.shouldFail(UncoordinatedWriteError,
9584                                           "test_publish_surprise", None,
9585                                           n.upload,
9586-                                          "contents 2a", self.old_map))
9587+                                          MutableDataHandle("contents 2a"), self.old_map))
9588             return d
9589         d.addCallback(_created)
9590         return d
9591hunk ./src/allmydata/test/test_mutable.py 2203
9592         self.basedir = "mutable/Problems/test_retrieve_surprise"
9593         self.set_up_grid()
9594         nm = self.g.clients[0].nodemaker
9595-        d = nm.create_mutable_file("contents 1")
9596+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9597         def _created(n):
9598             d = defer.succeed(None)
9599             d.addCallback(lambda res: n.get_servermap(MODE_READ))
9600hunk ./src/allmydata/test/test_mutable.py 2213
9601             d.addCallback(_got_smap1)
9602             # then modify the file, leaving the old map untouched
9603             d.addCallback(lambda res: log.msg("starting winning write"))
9604-            d.addCallback(lambda res: n.overwrite("contents 2"))
9605+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9606             # now attempt to retrieve the old version with the old servermap.
9607             # This will look like someone has changed the file since we
9608             # updated the servermap.
9609hunk ./src/allmydata/test/test_mutable.py 2241
9610         self.basedir = "mutable/Problems/test_unexpected_shares"
9611         self.set_up_grid()
9612         nm = self.g.clients[0].nodemaker
9613-        d = nm.create_mutable_file("contents 1")
9614+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9615         def _created(n):
9616             d = defer.succeed(None)
9617             d.addCallback(lambda res: n.get_servermap(MODE_WRITE))
9618hunk ./src/allmydata/test/test_mutable.py 2253
9619                 self.g.remove_server(peer0)
9620                 # then modify the file, leaving the old map untouched
9621                 log.msg("starting winning write")
9622-                return n.overwrite("contents 2")
9623+                return n.overwrite(MutableDataHandle("contents 2"))
9624             d.addCallback(_got_smap1)
9625             # now attempt to modify the file with the old servermap. This
9626             # will look just like an uncoordinated write, in which every
9627hunk ./src/allmydata/test/test_mutable.py 2263
9628                           self.shouldFail(UncoordinatedWriteError,
9629                                           "test_surprise", None,
9630                                           n.upload,
9631-                                          "contents 2a", self.old_map))
9632+                                          MutableDataHandle("contents 2a"), self.old_map))
9633             return d
9634         d.addCallback(_created)
9635         return d
9636hunk ./src/allmydata/test/test_mutable.py 2267
9637+    test_unexpected_shares.timeout = 15
9638 
9639     def test_bad_server(self):
9640         # Break one server, then create the file: the initial publish should
9641hunk ./src/allmydata/test/test_mutable.py 2303
9642         d.addCallback(_break_peer0)
9643         # now "create" the file, using the pre-established key, and let the
9644         # initial publish finally happen
9645-        d.addCallback(lambda res: nm.create_mutable_file("contents 1"))
9646+        d.addCallback(lambda res: nm.create_mutable_file(MutableDataHandle("contents 1")))
9647         # that ought to work
9648         def _got_node(n):
9649             d = n.download_best_version()
9650hunk ./src/allmydata/test/test_mutable.py 2312
9651             def _break_peer1(res):
9652                 self.connection1.broken = True
9653             d.addCallback(_break_peer1)
9654-            d.addCallback(lambda res: n.overwrite("contents 2"))
9655+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9656             # that ought to work too
9657             d.addCallback(lambda res: n.download_best_version())
9658             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9659hunk ./src/allmydata/test/test_mutable.py 2344
9660         peerids = [serverid for (serverid,ss) in sb.get_all_servers()]
9661         self.g.break_server(peerids[0])
9662 
9663-        d = nm.create_mutable_file("contents 1")
9664+        d = nm.create_mutable_file(MutableDataHandle("contents 1"))
9665         def _created(n):
9666             d = n.download_best_version()
9667             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 1"))
9668hunk ./src/allmydata/test/test_mutable.py 2352
9669             def _break_second_server(res):
9670                 self.g.break_server(peerids[1])
9671             d.addCallback(_break_second_server)
9672-            d.addCallback(lambda res: n.overwrite("contents 2"))
9673+            d.addCallback(lambda res: n.overwrite(MutableDataHandle("contents 2")))
9674             # that ought to work too
9675             d.addCallback(lambda res: n.download_best_version())
9676             d.addCallback(lambda res: self.failUnlessEqual(res, "contents 2"))
9677hunk ./src/allmydata/test/test_mutable.py 2371
9678         d = self.shouldFail(NotEnoughServersError,
9679                             "test_publish_all_servers_bad",
9680                             "Ran out of non-bad servers",
9681-                            nm.create_mutable_file, "contents")
9682+                            nm.create_mutable_file, MutableDataHandle("contents"))
9683         return d
9684 
9685     def test_publish_no_servers(self):
9686hunk ./src/allmydata/test/test_mutable.py 2383
9687         d = self.shouldFail(NotEnoughServersError,
9688                             "test_publish_no_servers",
9689                             "Ran out of non-bad servers",
9690-                            nm.create_mutable_file, "contents")
9691+                            nm.create_mutable_file, MutableDataHandle("contents"))
9692         return d
9693     test_publish_no_servers.timeout = 30
9694 
9695hunk ./src/allmydata/test/test_mutable.py 2401
9696         # we need some contents that are large enough to push the privkey out
9697         # of the early part of the file
9698         LARGE = "These are Larger contents" * 2000 # about 50KB
9699-        d = nm.create_mutable_file(LARGE)
9700+        LARGE_uploadable = MutableDataHandle(LARGE)
9701+        d = nm.create_mutable_file(LARGE_uploadable)
9702         def _created(n):
9703             self.uri = n.get_uri()
9704             self.n2 = nm.create_from_cap(self.uri)
9705hunk ./src/allmydata/test/test_mutable.py 2438
9706         self.set_up_grid(num_servers=20)
9707         nm = self.g.clients[0].nodemaker
9708         LARGE = "These are Larger contents" * 2000 # about 50KiB
9709+        LARGE_uploadable = MutableDataHandle(LARGE)
9710         nm._node_cache = DevNullDictionary() # disable the nodecache
9711 
9712hunk ./src/allmydata/test/test_mutable.py 2441
9713-        d = nm.create_mutable_file(LARGE)
9714+        d = nm.create_mutable_file(LARGE_uploadable)
9715         def _created(n):
9716             self.uri = n.get_uri()
9717             self.n2 = nm.create_from_cap(self.uri)
9718hunk ./src/allmydata/test/test_mutable.py 2464
9719         self.set_up_grid(num_servers=20)
9720         nm = self.g.clients[0].nodemaker
9721         CONTENTS = "contents" * 2000
9722-        d = nm.create_mutable_file(CONTENTS)
9723+        CONTENTS_uploadable = MutableDataHandle(CONTENTS)
9724+        d = nm.create_mutable_file(CONTENTS_uploadable)
9725         def _created(node):
9726             self._node = node
9727         d.addCallback(_created)
9728hunk ./src/allmydata/test/test_system.py 22
9729 from allmydata.monitor import Monitor
9730 from allmydata.mutable.common import NotWriteableError
9731 from allmydata.mutable import layout as mutable_layout
9732+from allmydata.mutable.publish import MutableDataHandle
9733 from foolscap.api import DeadReferenceError
9734 from twisted.python.failure import Failure
9735 from twisted.web.client import getPage
9736hunk ./src/allmydata/test/test_system.py 460
9737     def test_mutable(self):
9738         self.basedir = "system/SystemTest/test_mutable"
9739         DATA = "initial contents go here."  # 25 bytes % 3 != 0
9740+        DATA_uploadable = MutableDataHandle(DATA)
9741         NEWDATA = "new contents yay"
9742hunk ./src/allmydata/test/test_system.py 462
9743+        NEWDATA_uploadable = MutableDataHandle(NEWDATA)
9744         NEWERDATA = "this is getting old"
9745hunk ./src/allmydata/test/test_system.py 464
9746+        NEWERDATA_uploadable = MutableDataHandle(NEWERDATA)
9747 
9748         d = self.set_up_nodes(use_key_generator=True)
9749 
9750hunk ./src/allmydata/test/test_system.py 471
9751         def _create_mutable(res):
9752             c = self.clients[0]
9753             log.msg("starting create_mutable_file")
9754-            d1 = c.create_mutable_file(DATA)
9755+            d1 = c.create_mutable_file(DATA_uploadable)
9756             def _done(res):
9757                 log.msg("DONE: %s" % (res,))
9758                 self._mutable_node_1 = res
9759hunk ./src/allmydata/test/test_system.py 558
9760             self.failUnlessEqual(res, DATA)
9761             # replace the data
9762             log.msg("starting replace1")
9763-            d1 = newnode.overwrite(NEWDATA)
9764+            d1 = newnode.overwrite(NEWDATA_uploadable)
9765             d1.addCallback(lambda res: newnode.download_best_version())
9766             return d1
9767         d.addCallback(_check_download_3)
9768hunk ./src/allmydata/test/test_system.py 572
9769             newnode2 = self.clients[3].create_node_from_uri(uri)
9770             self._newnode3 = self.clients[3].create_node_from_uri(uri)
9771             log.msg("starting replace2")
9772-            d1 = newnode1.overwrite(NEWERDATA)
9773+            d1 = newnode1.overwrite(NEWERDATA_uploadable)
9774             d1.addCallback(lambda res: newnode2.download_best_version())
9775             return d1
9776         d.addCallback(_check_download_4)
9777hunk ./src/allmydata/test/test_system.py 642
9778         def _check_empty_file(res):
9779             # make sure we can create empty files, this usually screws up the
9780             # segsize math
9781-            d1 = self.clients[2].create_mutable_file("")
9782+            d1 = self.clients[2].create_mutable_file(MutableDataHandle(""))
9783             d1.addCallback(lambda newnode: newnode.download_best_version())
9784             d1.addCallback(lambda res: self.failUnlessEqual("", res))
9785             return d1
9786hunk ./src/allmydata/test/test_system.py 673
9787                                  self.key_generator_svc.key_generator.pool_size + size_delta)
9788 
9789         d.addCallback(check_kg_poolsize, 0)
9790-        d.addCallback(lambda junk: self.clients[3].create_mutable_file('hello, world'))
9791+        d.addCallback(lambda junk:
9792+            self.clients[3].create_mutable_file(MutableDataHandle('hello, world')))
9793         d.addCallback(check_kg_poolsize, -1)
9794         d.addCallback(lambda junk: self.clients[3].create_dirnode())
9795         d.addCallback(check_kg_poolsize, -2)
9796hunk ./src/allmydata/test/test_web.py 3166
9797         def _stash_mutable_uri(n, which):
9798             self.uris[which] = n.get_uri()
9799             assert isinstance(self.uris[which], str)
9800-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9801+        d.addCallback(lambda ign:
9802+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9803         d.addCallback(_stash_mutable_uri, "corrupt")
9804         d.addCallback(lambda ign:
9805                       c0.upload(upload.Data("literal", convergence="")))
9806hunk ./src/allmydata/test/test_web.py 3313
9807         def _stash_mutable_uri(n, which):
9808             self.uris[which] = n.get_uri()
9809             assert isinstance(self.uris[which], str)
9810-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"3"))
9811+        d.addCallback(lambda ign:
9812+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"3")))
9813         d.addCallback(_stash_mutable_uri, "corrupt")
9814 
9815         def _compute_fileurls(ignored):
9816hunk ./src/allmydata/test/test_web.py 3976
9817         def _stash_mutable_uri(n, which):
9818             self.uris[which] = n.get_uri()
9819             assert isinstance(self.uris[which], str)
9820-        d.addCallback(lambda ign: c0.create_mutable_file(DATA+"2"))
9821+        d.addCallback(lambda ign:
9822+            c0.create_mutable_file(publish.MutableDataHandle(DATA+"2")))
9823         d.addCallback(_stash_mutable_uri, "mutable")
9824 
9825         def _compute_fileurls(ignored):
9826hunk ./src/allmydata/test/test_web.py 4076
9827                                                         convergence="")))
9828         d.addCallback(_stash_uri, "small")
9829 
9830-        d.addCallback(lambda ign: c0.create_mutable_file("mutable"))
9831+        d.addCallback(lambda ign:
9832+            c0.create_mutable_file(publish.MutableDataHandle("mutable")))
9833         d.addCallback(lambda fn: self.rootnode.set_node(u"mutable", fn))
9834         d.addCallback(_stash_uri, "mutable")
9835 
9836}
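The test conversions above all follow the same pattern: wherever a test used to hand a raw string to create_mutable_file() or overwrite(), it now wraps that string in a MutableDataHandle first. A minimal sketch of the pattern (illustrative only; nm is assumed to be a nodemaker from a test grid, as in the tests above):

    from allmydata.mutable.publish import MutableDataHandle

    d = nm.create_mutable_file(MutableDataHandle("contents 1"))
    def _created(n):
        # overwrite() now also takes an uploadable rather than a bare string
        return n.overwrite(MutableDataHandle("contents 2"))
    d.addCallback(_created)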
9837[Alter mutable files to use file-like objects for publishing instead of strings.
9838Kevan Carstensen <kevan@isnotajoke.com>**20100708000732
9839 Ignore-this: 8dd07d95386b6d540bc21289f981ebd0
9840] {
9841hunk ./src/allmydata/dirnode.py 11
9842 from allmydata.mutable.common import NotWriteableError
9843 from allmydata.mutable.filenode import MutableFileNode
9844 from allmydata.unknown import UnknownNode, strip_prefix_for_ro
9845+from allmydata.mutable.publish import MutableDataHandle
9846 from allmydata.interfaces import IFilesystemNode, IDirectoryNode, IFileNode, \
9847      IImmutableFileNode, IMutableFileNode, \
9848      ExistingChildError, NoSuchChildError, ICheckable, IDeepCheckable, \
9849hunk ./src/allmydata/dirnode.py 104
9850 
9851         del children[self.name]
9852         new_contents = self.node._pack_contents(children)
9853-        return new_contents
9854+        uploadable = MutableDataHandle(new_contents)
9855+        return uploadable
9856 
9857 
9858 class MetadataSetter:
9859hunk ./src/allmydata/dirnode.py 130
9860 
9861         children[name] = (child, metadata)
9862         new_contents = self.node._pack_contents(children)
9863-        return new_contents
9864+        uploadable = MutableDataHandle(new_contents)
9865+        return uploadable
9866 
9867 
9868 class Adder:
9869hunk ./src/allmydata/dirnode.py 175
9870 
9871             children[name] = (child, metadata)
9872         new_contents = self.node._pack_contents(children)
9873-        return new_contents
9874+        uploadable = MutableDataHandle(new_contents)
9875+        return uploadable
9876 
9877 
9878 def _encrypt_rw_uri(filenode, rw_uri):
9879hunk ./src/allmydata/mutable/filenode.py 7
9880 from zope.interface import implements
9881 from twisted.internet import defer, reactor
9882 from foolscap.api import eventually
9883-from allmydata.interfaces import IMutableFileNode, \
9884-     ICheckable, ICheckResults, NotEnoughSharesError, MDMF_VERSION, SDMF_VERSION
9885+from allmydata.interfaces import IMutableFileNode, ICheckable, ICheckResults, \
9886+                                 NotEnoughSharesError, \
9887+                                 MDMF_VERSION, SDMF_VERSION, IMutableUploadable
9888 from allmydata.util import hashutil, log
9889 from allmydata.util.assertutil import precondition
9890 from allmydata.uri import WriteableSSKFileURI, ReadonlySSKFileURI
9891hunk ./src/allmydata/mutable/filenode.py 16
9892 from allmydata.monitor import Monitor
9893 from pycryptopp.cipher.aes import AES
9894 
9895-from allmydata.mutable.publish import Publish
9896+from allmydata.mutable.publish import Publish, MutableFileHandle, \
9897+                                      MutableDataHandle
9898 from allmydata.mutable.common import MODE_READ, MODE_WRITE, UnrecoverableFileError, \
9899      ResponseCache, UncoordinatedWriteError
9900 from allmydata.mutable.servermap import ServerMap, ServermapUpdater
9901hunk ./src/allmydata/mutable/filenode.py 133
9902         return self._upload(initial_contents, None)
9903 
9904     def _get_initial_contents(self, contents):
9905-        if isinstance(contents, str):
9906-            return contents
9907         if contents is None:
9908hunk ./src/allmydata/mutable/filenode.py 134
9909-            return ""
9910+            return MutableDataHandle("")
9911+
9912+        if IMutableUploadable.providedBy(contents):
9913+            return contents
9914+
9915         assert callable(contents), "%s should be callable, not %s" % \
9916                (contents, type(contents))
9917         return contents(self)
9918hunk ./src/allmydata/mutable/filenode.py 353
9919     def overwrite(self, new_contents):
9920         return self._do_serialized(self._overwrite, new_contents)
9921     def _overwrite(self, new_contents):
9922+        assert IMutableUploadable.providedBy(new_contents)
9923+
9924         servermap = ServerMap()
9925         d = self._update_servermap(servermap, mode=MODE_WRITE)
9926         d.addCallback(lambda ignored: self._upload(new_contents, servermap))
9927hunk ./src/allmydata/mutable/filenode.py 431
9928                 # recovery when it observes UCWE, we need to do a second
9929                 # publish. See #551 for details. We'll basically loop until
9930                 # we managed an uncontested publish.
9931-                new_contents = old_contents
9932-            precondition(isinstance(new_contents, str),
9933-                         "Modifier function must return a string or None")
9934+                old_uploadable = MutableDataHandle(old_contents)
9935+                new_contents = old_uploadable
9936+            precondition((IMutableUploadable.providedBy(new_contents) or
9937+                          new_contents is None),
9938+                         "Modifier function must return an IMutableUploadable "
9939+                         "or None")
9940             return self._upload(new_contents, servermap)
9941         d.addCallback(_apply)
9942         return d
9943hunk ./src/allmydata/mutable/filenode.py 472
9944         return self._do_serialized(self._upload, new_contents, servermap)
9945     def _upload(self, new_contents, servermap):
9946         assert self._pubkey, "update_servermap must be called before publish"
9947+        assert IMutableUploadable.providedBy(new_contents)
9948+
9949         p = Publish(self, self._storage_broker, servermap)
9950         if self._history:
9951hunk ./src/allmydata/mutable/filenode.py 476
9952-            self._history.notify_publish(p.get_status(), len(new_contents))
9953+            self._history.notify_publish(p.get_status(), new_contents.get_size())
9954         d = p.publish(new_contents)
9955hunk ./src/allmydata/mutable/filenode.py 478
9956-        d.addCallback(self._did_upload, len(new_contents))
9957+        d.addCallback(self._did_upload, new_contents.get_size())
9958         return d
9959     def _did_upload(self, res, size):
9960         self._most_recent_size = size
9961hunk ./src/allmydata/mutable/publish.py 141
9962 
9963         # 0. Setup encoding parameters, encoder, and other such things.
9964         # 1. Encrypt, encode, and publish segments.
9965-        self.data = StringIO(newdata)
9966-        self.datalength = len(newdata)
9967+        assert IMutableUploadable.providedBy(newdata)
9968+
9969+        self.data = newdata
9970+        self.datalength = newdata.get_size()
9971 
9972         self.log("starting publish, datalen is %s" % self.datalength)
9973         self._status.set_size(self.datalength)
9974hunk ./src/allmydata/mutable/publish.py 442
9975 
9976         self.log("Pushing segment %d of %d" % (segnum + 1, self.num_segments))
9977         data = self.data.read(segsize)
9978+        # XXX: This is dumb. Why return a list?
9979+        data = "".join(data)
9980 
9981         assert len(data) == segsize
9982 
9983hunk ./src/allmydata/mutable/repairer.py 5
9984 from zope.interface import implements
9985 from twisted.internet import defer
9986 from allmydata.interfaces import IRepairResults, ICheckResults
9987+from allmydata.mutable.publish import MutableDataHandle
9988 
9989 class RepairResults:
9990     implements(IRepairResults)
9991hunk ./src/allmydata/mutable/repairer.py 108
9992             raise RepairRequiresWritecapError("Sorry, repair currently requires a writecap, to set the write-enabler properly.")
9993 
9994         d = self.node.download_version(smap, best_version, fetch_privkey=True)
9995+        d.addCallback(lambda data:
9996+            MutableDataHandle(data))
9997         d.addCallback(self.node.upload, smap)
9998         d.addCallback(self.get_results, smap)
9999         return d
10000hunk ./src/allmydata/nodemaker.py 9
10001 from allmydata.immutable.filenode import ImmutableFileNode, LiteralFileNode
10002 from allmydata.immutable.upload import Data
10003 from allmydata.mutable.filenode import MutableFileNode
10004+from allmydata.mutable.publish import MutableDataHandle
10005 from allmydata.dirnode import DirectoryNode, pack_children
10006 from allmydata.unknown import UnknownNode
10007 from allmydata import uri
10008hunk ./src/allmydata/nodemaker.py 111
10009                          "create_new_mutable_directory requires metadata to be a dict, not None", metadata)
10010             node.raise_error()
10011         d = self.create_mutable_file(lambda n:
10012-                                     pack_children(n, initial_children),
10013+                                     MutableDataHandle(
10014+                                        pack_children(n, initial_children)),
10015                                      version)
10016         d.addCallback(self._create_dirnode)
10017         return d
10018hunk ./src/allmydata/web/filenode.py 12
10019 from allmydata.interfaces import ExistingChildError
10020 from allmydata.monitor import Monitor
10021 from allmydata.immutable.upload import FileHandle
10022+from allmydata.mutable.publish import MutableFileHandle
10023 from allmydata.util import log, base32
10024 
10025 from allmydata.web.common import text_plain, WebError, RenderMixin, \
10026hunk ./src/allmydata/web/filenode.py 27
10027         # a new file is being uploaded in our place.
10028         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10029         if mutable:
10030-            req.content.seek(0)
10031-            data = req.content.read()
10032+            data = MutableFileHandle(req.content)
10033             d = client.create_mutable_file(data)
10034             def _uploaded(newnode):
10035                 d2 = self.parentnode.set_node(self.name, newnode,
10036hunk ./src/allmydata/web/filenode.py 61
10037         d.addCallback(lambda res: childnode.get_uri())
10038         return d
10039 
10040-    def _read_data_from_formpost(self, req):
10041-        # SDMF: files are small, and we can only upload data, so we read
10042-        # the whole file into memory before uploading.
10043-        contents = req.fields["file"]
10044-        contents.file.seek(0)
10045-        data = contents.file.read()
10046-        return data
10047 
10048     def replace_me_with_a_formpost(self, req, client, replace):
10049         # create a new file, maybe mutable, maybe immutable
10050hunk ./src/allmydata/web/filenode.py 66
10051         mutable = boolean_of_arg(get_arg(req, "mutable", "false"))
10052 
10053+        # create an immutable file
10054+        contents = req.fields["file"]
10055         if mutable:
10056hunk ./src/allmydata/web/filenode.py 69
10057-            data = self._read_data_from_formpost(req)
10058-            d = client.create_mutable_file(data)
10059+            uploadable = MutableFileHandle(contents.file)
10060+            d = client.create_mutable_file(uploadable)
10061             def _uploaded(newnode):
10062                 d2 = self.parentnode.set_node(self.name, newnode,
10063                                               overwrite=replace)
10064hunk ./src/allmydata/web/filenode.py 78
10065                 return d2
10066             d.addCallback(_uploaded)
10067             return d
10068-        # create an immutable file
10069-        contents = req.fields["file"]
10070+
10071         uploadable = FileHandle(contents.file, convergence=client.convergence)
10072         d = self.parentnode.add_file(self.name, uploadable, overwrite=replace)
10073         d.addCallback(lambda newnode: newnode.get_uri())
10074hunk ./src/allmydata/web/filenode.py 84
10075         return d
10076 
10077+
10078 class PlaceHolderNodeHandler(RenderMixin, rend.Page, ReplaceMeMixin):
10079     def __init__(self, client, parentnode, name):
10080         rend.Page.__init__(self)
10081hunk ./src/allmydata/web/filenode.py 278
10082 
10083     def replace_my_contents(self, req):
10084         req.content.seek(0)
10085-        new_contents = req.content.read()
10086+        new_contents = MutableFileHandle(req.content)
10087         d = self.node.overwrite(new_contents)
10088         d.addCallback(lambda res: self.node.get_uri())
10089         return d
10090hunk ./src/allmydata/web/filenode.py 286
10091     def replace_my_contents_with_a_formpost(self, req):
10092         # we have a mutable file. Get the data from the formpost, and replace
10093         # the mutable file's contents with it.
10094-        new_contents = self._read_data_from_formpost(req)
10095+        new_contents = req.fields['file']
10096+        new_contents = MutableFileHandle(new_contents.file)
10097+
10098         d = self.node.overwrite(new_contents)
10099         d.addCallback(lambda res: self.node.get_uri())
10100         return d
10101hunk ./src/allmydata/web/unlinked.py 7
10102 from twisted.internet import defer
10103 from nevow import rend, url, tags as T
10104 from allmydata.immutable.upload import FileHandle
10105+from allmydata.mutable.publish import MutableFileHandle
10106 from allmydata.web.common import getxmlfile, get_arg, boolean_of_arg, \
10107      convert_children_json, WebError
10108 from allmydata.web import status
10109hunk ./src/allmydata/web/unlinked.py 23
10110 def PUTUnlinkedSSK(req, client):
10111     # SDMF: files are small, and we can only upload data
10112     req.content.seek(0)
10113-    data = req.content.read()
10114+    data = MutableFileHandle(req.content)
10115     d = client.create_mutable_file(data)
10116     d.addCallback(lambda n: n.get_uri())
10117     return d
10118hunk ./src/allmydata/web/unlinked.py 87
10119     # "POST /uri", to create an unlinked file.
10120     # SDMF: files are small, and we can only upload data
10121     contents = req.fields["file"]
10122-    contents.file.seek(0)
10123-    data = contents.file.read()
10124+    data = MutableFileHandle(contents.file)
10125     d = client.create_mutable_file(data)
10126     d.addCallback(lambda n: n.get_uri())
10127     return d
10128}
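Taken together, the hunks above switch every publish path over to IMutableUploadable objects: MutableDataHandle wraps in-memory strings (dirnode packing, the repairer, the nodemaker), while MutableFileHandle wraps an already-open file-like object (the web frontend's req.content and formpost fields). A hedged sketch of both wrappers in use; the client object and file name here are assumptions standing in for the real call sites shown in the hunks:

    from allmydata.mutable.publish import MutableDataHandle, MutableFileHandle

    # small, in-memory contents (e.g. packed dirnode children)
    d1 = client.create_mutable_file(MutableDataHandle("packed children"))

    # contents already sitting in an open file-like object (e.g. req.content
    # in the web frontend); the handle is read during publish instead of
    # being slurped into a string first
    f = open("upload.bin", "rb")
    d2 = client.create_mutable_file(MutableFileHandle(f))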
10129[test/test_sftp.py: alter a setup routine to work with new mutable file APIs.
10130Kevan Carstensen <kevan@isnotajoke.com>**20100708193522
10131 Ignore-this: 434bbe1347072076c0836d26fca8ac8a
10132] {
10133hunk ./src/allmydata/test/test_sftp.py 32
10134 
10135 from allmydata.util.consumer import download_to_data
10136 from allmydata.immutable import upload
10137+from allmydata.mutable import publish
10138 from allmydata.test.no_network import GridTestMixin
10139 from allmydata.test.common import ShouldFailMixin
10140 from allmydata.test.common_util import ReallyEqualMixin
10141hunk ./src/allmydata/test/test_sftp.py 84
10142         return d
10143 
10144     def _set_up_tree(self):
10145-        d = self.client.create_mutable_file("mutable file contents")
10146+        u = publish.MutableDataHandle("mutable file contents")
10147+        d = self.client.create_mutable_file(u)
10148         d.addCallback(lambda node: self.root.set_node(u"mutable", node))
10149         def _created_mutable(n):
10150             self.mutable = n
10151}
10152[mutable/publish.py: make MutableFileHandle seek to the beginning of its file handle before reading.
10153Kevan Carstensen <kevan@isnotajoke.com>**20100708193600
10154 Ignore-this: 453a737dc62a79c77b3d360fed9000ab
10155] hunk ./src/allmydata/mutable/publish.py 989
10156         assert hasattr(filehandle, "close")
10157 
10158         self._filehandle = filehandle
10159+        # We must start reading at the beginning of the file, or we risk
10160+        # encountering errors when the data read does not match the size
10161+        # reported to the uploader.
10162+        self._filehandle.seek(0)
10163 
10164 
10165     def get_size(self):
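The rationale for the seek is that publish reads the handle from its current position, while the size reported to the uploader covers the whole file; a handle that arrives with its position already at EOF (for example, one that was just written or fully read) would otherwise yield less data than advertised. A small sketch of the situation being guarded against (illustrative only; StringIO stands in for whatever file-like object the caller provides):

    from StringIO import StringIO
    from allmydata.mutable.publish import MutableFileHandle

    f = StringIO("new mutable contents")
    f.read()                           # position is now at EOF
    uploadable = MutableFileHandle(f)  # constructor rewinds to offset 0
    # without the seek(0) above, reads during publish would return fewer
    # bytes than the size reported to the uploader, and the upload would fail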
10166[Refactor download interfaces to be more uniform, per #993
10167Kevan Carstensen <kevan@isnotajoke.com>**20100709232912
10168 Ignore-this: 277c5699c4a2dd7c03ecfa0a28458f5b
10169] {
10170hunk ./src/allmydata/immutable/filenode.py 10
10171 from foolscap.api import eventually
10172 from allmydata.interfaces import IImmutableFileNode, ICheckable, \
10173      IDownloadTarget, IUploadResults
10174-from allmydata.util import dictutil, log, base32
10175+from allmydata.util import dictutil, log, base32, consumer
10176 from allmydata.uri import CHKFileURI, LiteralFileURI
10177 from allmydata.immutable.checker import Checker
10178 from allmydata.check_results import CheckResults, CheckAndRepairResults
10179hunk ./src/allmydata/immutable/filenode.py 318
10180                       self.download_cache.read(consumer, offset, size))
10181         return d
10182 
10183+    # IReadable, IFileNode
10184+
10185+    def get_best_readable_version(self):
10186+        """
10187+        Return an IReadable of the best version of this file. Since
10188+        immutable files can have only one version, we just return the
10189+        current filenode.
10190+        """
10191+        return self
10192+
10193+
10194+    def download_best_version(self):
10195+        """
10196+        Download the best version of this file, returning its contents
10197+        as a bytestring. Since there is only one version of an immutable
10198+        file, we download and return the contents of this file.
10199+        """
10200+        d = consumer.download_to_data(self)
10201+        return d
10202+
10203+    # for an immutable file, download_to_data (specified in IReadable)
10204+    # is the same as download_best_version (specified in IFileNode). For
10205+    # mutable files, the difference is more meaningful, since they can
10206+    # have multiple versions.
10207+    download_to_data = download_best_version
10208+
10209+
10210+    # get_size() (IReadable), get_current_size() (IFilesystemNode), and
10211+    # get_size_of_best_version(IFileNode) are all the same for immutable
10212+    # files.
10213+    get_size_of_best_version = get_current_size
10214+
10215+
10216 class LiteralProducer:
10217     implements(IPushProducer)
10218     def resumeProducing(self):
10219hunk ./src/allmydata/immutable/filenode.py 409
10220         d = basic.FileSender().beginFileTransfer(StringIO(data), consumer)
10221         d.addCallback(lambda lastSent: consumer)
10222         return d
10223+
10224+    # IReadable, IFileNode, IFilesystemNode
10225+    def get_best_readable_version(self):
10226+        return self
10227+
10228+
10229+    def download_best_version(self):
10230+        return defer.succeed(self.u.data)
10231+
10232+
10233+    download_to_data = download_best_version
10234+    get_size_of_best_version = get_current_size
10235hunk ./src/allmydata/interfaces.py 563
10236 class MustNotBeUnknownRWError(CapConstraintError):
10237     """Cannot add an unknown child cap specified in a rw_uri field."""
10238 
10239+
10240+class IReadable(Interface):
10241+    """I represent a readable object -- either an immutable file, or a
10242+    specific version of a mutable file.
10243+    """
10244+
10245+    def is_readonly():
10246+        """Return True if this reference provides mutable access to the given
10247+        file or directory (i.e. if you can modify it), or False if not. Note
10248+        that even if this reference is read-only, someone else may hold a
10249+        read-write reference to it.
10250+
10251+        For an IReadable returned by get_best_readable_version(), this will
10252+        always return True, but for instances of subinterfaces such as
10253+        IMutableFileVersion, it may return False."""
10254+
10255+    def is_mutable():
10256+        """Return True if this file or directory is mutable (by *somebody*,
10257+        not necessarily you), False if it is is immutable. Note that a file
10258+        might be mutable overall, but your reference to it might be
10259+        read-only. On the other hand, all references to an immutable file
10260+        will be read-only; there are no read-write references to an immutable
10261+        file."""
10262+
10263+    def get_storage_index():
10264+        """Return the storage index of the file."""
10265+
10266+    def get_size():
10267+        """Return the length (in bytes) of this readable object."""
10268+
10269+    def download_to_data():
10270+        """Download all of the file contents. I return a Deferred that fires
10271+        with the contents as a byte string."""
10272+
10273+    def read(consumer, offset=0, size=None):
10274+        """Download a portion (possibly all) of the file's contents, making
10275+        them available to the given IConsumer. Return a Deferred that fires
10276+        (with the consumer) when the consumer is unregistered (either because
10277+        the last byte has been given to it, or because the consumer threw an
10278+        exception during write(), possibly because it no longer wants to
10279+        receive data). The portion downloaded will start at 'offset' and
10280+        contain 'size' bytes (or the remainder of the file if size==None).
10281+
10282+        The consumer will be used in non-streaming mode: an IPullProducer
10283+        will be attached to it.
10284+
10285+        The consumer will not receive data right away: several network trips
10286+        must occur first. The order of events will be::
10287+
10288+         consumer.registerProducer(p, streaming)
10289+          (if streaming == False)::
10290+           consumer does p.resumeProducing()
10291+            consumer.write(data)
10292+           consumer does p.resumeProducing()
10293+            consumer.write(data).. (repeat until all data is written)
10294+         consumer.unregisterProducer()
10295+         deferred.callback(consumer)
10296+
10297+        If a download error occurs, or an exception is raised by
10298+        consumer.registerProducer() or consumer.write(), I will call
10299+        consumer.unregisterProducer() and then deliver the exception via
10300+        deferred.errback(). To cancel the download, the consumer should call
10301+        p.stopProducing(), which will result in an exception being delivered
10302+        via deferred.errback().
10303+
10304+        See src/allmydata/util/consumer.py for an example of a simple
10305+        download-to-memory consumer.
10306+        """
10307+
10308+
10309+class IMutableFileVersion(IReadable):
10310+    """I provide access to a particular version of a mutable file. The
10311+    access is read/write if I was obtained from a filenode derived from
10312+    a write cap, or read-only if the filenode was derived from a read cap.
10313+    """
10314+
10315+    def get_sequence_number():
10316+        """Return the sequence number of this version."""
10317+
10318+    def get_servermap():
10319+        """Return the IMutableFileServerMap instance that was used to create
10320+        this object.
10321+        """
10322+
10323+    def get_writekey():
10324+        """Return this filenode's writekey, or None if the node does not have
10325+        write-capability. This may be used to assist with data structures
10326+        that need to make certain data available only to writers, such as the
10327+        read-write child caps in dirnodes. The recommended process is to have
10328+        reader-visible data be submitted to the filenode in the clear (where
10329+        it will be encrypted by the filenode using the readkey), but encrypt
10330+        writer-visible data using this writekey.
10331+        """
10332+
10333+    # TODO: Can this be overwrite instead of replace?
10334+    def replace(new_contents):
10335+        """Replace the contents of the mutable file, provided that no other
10336+        node has published (or is attempting to publish, concurrently) a
10337+        newer version of the file than this one.
10338+
10339+        I will avoid modifying any share that is different than the version
10340+        given by get_sequence_number(). However, if another node is writing
10341+        to the file at the same time as me, I may manage to update some shares
10342+        while they update others. If I see any evidence of this, I will signal
10343+        UncoordinatedWriteError, and the file will be left in an inconsistent
10344+        state (possibly the version you provided, possibly the old version,
10345+        possibly somebody else's version, and possibly a mix of shares from
10346+        all of these).
10347+
10348+        The recommended response to UncoordinatedWriteError is to either
10349+        return it to the caller (since they failed to coordinate their
10350+        writes), or to attempt some sort of recovery. It may be sufficient to
10351+        wait a random interval (with exponential backoff) and repeat your
10352+        operation. If I do not signal UncoordinatedWriteError, then I was
10353+        able to write the new version without incident.
10354+
10355+        I return a Deferred that fires (with a PublishStatus object) when the
10356+        update has completed.
10357+        """
10358+
10359+    def modify(modifier_cb):
10360+        """Modify the contents of the file, by downloading this version,
10361+        applying the modifier function (or bound method), then uploading
10362+        the new version. This will succeed as long as no other node
10363+        publishes a version between the download and the upload.
10364+        I return a Deferred that fires (with a PublishStatus object) when
10365+        the update is complete.
10366+
10367+        The modifier callable will be given three arguments: a string (with
10368+        the old contents), a 'first_time' boolean, and a servermap. As with
10369+        download_to_data(), the old contents will be from this version,
10370+        but the modifier can use the servermap to make other decisions
10371+        (such as refusing to apply the delta if there are multiple parallel
10372+        versions, or if there is evidence of a newer unrecoverable version).
10373+        'first_time' will be True the first time the modifier is called,
10374+        and False on any subsequent calls.
10375+
10376+        The callable should return a string with the new contents. The
10377+        callable must be prepared to be called multiple times, and must
10378+        examine the input string to see if the change that it wants to make
10379+        is already present in the old version. If it does not need to make
10380+        any changes, it can either return None, or return its input string.
10381+
10382+        If the modifier raises an exception, it will be returned in the
10383+        errback.
10384+        """
10385+
10386+
10387 # The hierarchy looks like this:
10388 #  IFilesystemNode
10389 #   IFileNode
10390hunk ./src/allmydata/interfaces.py 801
10391     def raise_error():
10392         """Raise any error associated with this node."""
10393 
10394+    # XXX: These may not be appropriate outside the context of an IReadable.
10395     def get_size():
10396         """Return the length (in bytes) of the data this node represents. For
10397         directory nodes, I return the size of the backing store. I return
10398hunk ./src/allmydata/interfaces.py 818
10399 class IFileNode(IFilesystemNode):
10400     """I am a node which represents a file: a sequence of bytes. I am not a
10401     container, like IDirectoryNode."""
10402+    def get_best_readable_version():
10403+        """Return a Deferred that fires with an IReadable for the 'best'
10404+        available version of the file. The IReadable provides only read
10405+        access, even if this filenode was derived from a write cap.
10406 
10407hunk ./src/allmydata/interfaces.py 823
10408-class IImmutableFileNode(IFileNode):
10409-    def read(consumer, offset=0, size=None):
10410-        """Download a portion (possibly all) of the file's contents, making
10411-        them available to the given IConsumer. Return a Deferred that fires
10412-        (with the consumer) when the consumer is unregistered (either because
10413-        the last byte has been given to it, or because the consumer threw an
10414-        exception during write(), possibly because it no longer wants to
10415-        receive data). The portion downloaded will start at 'offset' and
10416-        contain 'size' bytes (or the remainder of the file if size==None).
10417-
10418-        The consumer will be used in non-streaming mode: an IPullProducer
10419-        will be attached to it.
10420+        For an immutable file, there is only one version. For a mutable
10421+        file, the 'best' version is the recoverable version with the
10422+        highest sequence number. If no uncoordinated writes have occurred,
10423+        and if enough shares are available, then this will be the most
10424+        recent version that has been uploaded. If no version is recoverable,
10425+        the Deferred will errback with an UnrecoverableFileError.
10426+        """
10427 
10428hunk ./src/allmydata/interfaces.py 831
10429-        The consumer will not receive data right away: several network trips
10430-        must occur first. The order of events will be::
10431+    def download_best_version():
10432+        """Download the contents of the version that would be returned
10433+        by get_best_readable_version(). This is equivalent to calling
10434+        download_to_data() on the IReadable given by that method.
10435 
10436hunk ./src/allmydata/interfaces.py 836
10437-         consumer.registerProducer(p, streaming)
10438-          (if streaming == False)::
10439-           consumer does p.resumeProducing()
10440-            consumer.write(data)
10441-           consumer does p.resumeProducing()
10442-            consumer.write(data).. (repeat until all data is written)
10443-         consumer.unregisterProducer()
10444-         deferred.callback(consumer)
10445+        I return a Deferred that fires with a byte string when the file
10446+        has been fully downloaded. To support streaming download, use
10447+        the 'read' method of IReadable. If no version is recoverable,
10448+        the Deferred will errback with an UnrecoverableFileError.
10449+        """
10450 
10451hunk ./src/allmydata/interfaces.py 842
10452-        If a download error occurs, or an exception is raised by
10453-        consumer.registerProducer() or consumer.write(), I will call
10454-        consumer.unregisterProducer() and then deliver the exception via
10455-        deferred.errback(). To cancel the download, the consumer should call
10456-        p.stopProducing(), which will result in an exception being delivered
10457-        via deferred.errback().
10458+    def get_size_of_best_version():
10459+        """Find the size of the version that would be returned by
10460+        get_best_readable_version().
10461 
10462hunk ./src/allmydata/interfaces.py 846
10463-        See src/allmydata/util/consumer.py for an example of a simple
10464-        download-to-memory consumer.
10465+        I return a Deferred that fires with an integer. If no version
10466+        is recoverable, the Deferred will errback with an
10467+        UnrecoverableFileError.
10468         """
10469 
10470hunk ./src/allmydata/interfaces.py 851
10471+
10472+class IImmutableFileNode(IFileNode, IReadable):
10473+    """I am a node representing an immutable file. Immutable files have
10474+    only one version."""
10475+
10476+
10477 class IMutableFileNode(IFileNode):
10478     """I provide access to a 'mutable file', which retains its identity
10479     regardless of what contents are put in it.
10480hunk ./src/allmydata/interfaces.py 916
10481     only be retrieved and updated all-at-once, as a single big string. Future
10482     versions of our mutable files will remove this restriction.
10483     """
10484-
10485-    def download_best_version():
10486-        """Download the 'best' available version of the file, meaning one of
10487-        the recoverable versions with the highest sequence number. If no
10488+    def get_best_mutable_version():
10489+        """Return a Deferred that fires with an IMutableFileVersion for
10490+        the 'best' available version of the file. The best version is
10491+        the recoverable version with the highest sequence number. If no
10492         uncoordinated writes have occurred, and if enough shares are
10493hunk ./src/allmydata/interfaces.py 921
10494-        available, then this will be the most recent version that has been
10495-        uploaded.
10496-
10497-        I update an internal servermap with MODE_READ, determine which
10498-        version of the file is indicated by
10499-        servermap.best_recoverable_version(), and return a Deferred that
10500-        fires with its contents. If no version is recoverable, the Deferred
10501-        will errback with UnrecoverableFileError.
10502-        """
10503-
10504-    def get_size_of_best_version():
10505-        """Find the size of the version that would be downloaded with
10506-        download_best_version(), without actually downloading the whole file.
10507+        available, then this will be the most recent version that has
10508+        been uploaded.
10509 
10510hunk ./src/allmydata/interfaces.py 924
10511-        I return a Deferred that fires with an integer.
10512+        If no version is recoverable, the Deferred will errback with an
10513+        UnrecoverableFileError.
10514         """
10515 
10516     def overwrite(new_contents):
10517hunk ./src/allmydata/interfaces.py 964
10518         errback.
10519         """
10520 
10521-
10522     def get_servermap(mode):
10523         """Return a Deferred that fires with an IMutableFileServerMap
10524         instance, updated using the given mode.
10525hunk ./src/allmydata/test/test_filenode.py 98
10526         def _check_segment(res):
10527             self.failUnlessEqual(res, DATA[1:1+5])
10528         d.addCallback(_check_segment)
10529+        d.addCallback(lambda ignored:
10530+            self.failUnlessEqual(fn1.get_best_readable_version(), fn1))
10531+        d.addCallback(lambda ignored:
10532+            fn1.get_size_of_best_version())
10533+        d.addCallback(lambda size:
10534+            self.failUnlessEqual(size, len(DATA)))
10535+        d.addCallback(lambda ignored:
10536+            fn1.download_to_data())
10537+        d.addCallback(lambda data:
10538+            self.failUnlessEqual(data, DATA))
10539+        d.addCallback(lambda ignored:
10540+            fn1.download_best_version())
10541+        d.addCallback(lambda data:
10542+            self.failUnlessEqual(data, DATA))
10543 
10544         return d
10545 
10546hunk ./src/allmydata/test/test_immutable.py 153
10547         return d
10548 
10549 
10550+    def test_download_to_data(self):
10551+        d = self.n.download_to_data()
10552+        d.addCallback(lambda data:
10553+            self.failUnlessEqual(data, common.TEST_DATA))
10554+        return d
10555+
10556+
10557+    def test_download_best_version(self):
10558+        d = self.n.download_best_version()
10559+        d.addCallback(lambda data:
10560+            self.failUnlessEqual(data, common.TEST_DATA))
10561+        return d
10562+
10563+
10564+    def test_get_best_readable_version(self):
10565+        n = self.n.get_best_readable_version()
10566+        self.failUnlessEqual(n, self.n)
10567+
10568+    def test_get_size_of_best_version(self):
10569+        d = self.n.get_size_of_best_version()
10570+        d.addCallback(lambda size:
10571+            self.failUnlessEqual(size, len(common.TEST_DATA)))
10572+        return d
10573+
10574+
10575 # XXX extend these tests to show bad behavior of various kinds from servers: raising exception from each remove_foo() method, for example
10576 
10577 # XXX test disconnect DeadReferenceError from get_buckets and get_block_whatsit
10578}
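With this refactoring, immutable and mutable file nodes answer the same read-side IFileNode methods (get_best_readable_version, download_best_version, download_to_data, get_size_of_best_version), which is what the new assertions in test_filenode.py and test_immutable.py exercise. A minimal sketch, assuming n is any filenode obtained from a client or nodemaker:

    def read_best_version(n):
        # works for both immutable and mutable filenodes after this patch
        d = n.get_size_of_best_version()
        d.addCallback(lambda size: n.download_best_version())
        # fires with the full contents of the best recoverable version
        return d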
10579[frontends/sftpd.py: alter a mutable file overwrite to work with the new API
10580Kevan Carstensen <kevan@isnotajoke.com>**20100709232951
10581 Ignore-this: e0441c3ef2dfe78a1cac3f423d613e40
10582] {
10583hunk ./src/allmydata/frontends/sftpd.py 33
10584 from allmydata.interfaces import IFileNode, IDirectoryNode, ExistingChildError, \
10585      NoSuchChildError, ChildOfWrongTypeError
10586 from allmydata.mutable.common import NotWriteableError
10587+from allmydata.mutable.publish import MutableFileHandle
10588 from allmydata.immutable.upload import FileHandle
10589 from allmydata.dirnode import update_metadata
10590 
10591hunk ./src/allmydata/frontends/sftpd.py 867
10592                     assert parent and childname, (parent, childname, self.metadata)
10593                     d2.addCallback(lambda ign: parent.set_metadata_for(childname, self.metadata))
10594 
10595-                d2.addCallback(lambda ign: self.consumer.get_current_size())
10596-                d2.addCallback(lambda size: self.consumer.read(0, size))
10597-                d2.addCallback(lambda new_contents: self.filenode.overwrite(new_contents))
10598+                d2.addCallback(lambda ign: self.filenode.overwrite(MutableFileHandle(self.consumer.get_file())))
10599             else:
10600                 def _add_file(ign):
10601                     self.log("_add_file childname=%r" % (childname,), level=OPERATIONAL)
10602}
10603
10604Context:
10605
10606[SFTP: don't call .stopProducing on the producer registered with OverwriteableFileConsumer (which breaks with warner's new downloader).
10607david-sarah@jacaranda.org**20100628231926
10608 Ignore-this: 131b7a5787bc85a9a356b5740d9d996f
10609] 
10610[docs/how_to_make_a_tahoe-lafs_release.txt: trivial correction, install.html should now be quickstart.html.
10611david-sarah@jacaranda.org**20100625223929
10612 Ignore-this: 99a5459cac51bd867cc11ad06927ff30
10613] 
10614[setup: in the Makefile, refuse to upload tarballs unless someone has passed the environment variable "BB_BRANCH" with value "trunk"
10615zooko@zooko.com**20100619034928
10616 Ignore-this: 276ddf9b6ad7ec79e27474862e0f7d6
10617] 
10618[trivial: tiny update to in-line comment
10619zooko@zooko.com**20100614045715
10620 Ignore-this: 10851b0ed2abfed542c97749e5d280bc
10621 (I'm actually committing this patch as a test of the new eager-annotation-computation of trac-darcs.)
10622] 
10623[docs: about.html link to home page early on, and be decentralized storage instead of cloud storage this time around
10624zooko@zooko.com**20100619065318
10625 Ignore-this: dc6db03f696e5b6d2848699e754d8053
10626] 
10627[docs: update about.html, especially to have a non-broken link to quickstart.html, and also to comment out the broken links to "for Paranoids" and "for Corporates"
10628zooko@zooko.com**20100619065124
10629 Ignore-this: e292c7f51c337a84ebfeb366fbd24d6c
10630] 
10631[TAG allmydata-tahoe-1.7.0
10632zooko@zooko.com**20100619052631
10633 Ignore-this: d21e27afe6d85e2e3ba6a3292ba2be1
10634] 
10635Patch bundle hash:
10636b1149541897336a5792c01467ae530bfbbb0a5ff