Ticket #1392: patchcompletewithdocchanges.darcs.patch

File patchcompletewithdocchanges.darcs.patch, 20.9 KB (added by arch_o_median, at 2011-05-24T15:50:34Z)

Complete patch: test_storage.py, stats.rst, interfaces.py, and NEWS.rst all modified.
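The logic this patch introduces can be sketched as a standalone function: for each latency category, each percentile is reported only when there are enough samples to pin it down, and `None` otherwise. This is a minimal sketch of the behavior in the server.py hunk below, not the patched method itself (the real code is a `StorageServer` method reading `self.latencies`).

```python
def get_latencies(latencies):
    """Summarize per-category latency samples, reporting each percentile
    only when there are enough observations for it to be unambiguous."""
    # (fraction, output key, minimum number of samples required)
    orderstatlist = [(0.01, "01_0_percentile", 100), (0.1, "10_0_percentile", 10),
                     (0.50, "50_0_percentile", 10), (0.90, "90_0_percentile", 10),
                     (0.95, "95_0_percentile", 20), (0.99, "99_0_percentile", 100),
                     (0.999, "99_9_percentile", 1000)]
    output = {}
    for category, samples in latencies.items():
        if not samples:
            continue  # categories with no samples are omitted entirely
        samples = sorted(samples)
        count = len(samples)
        stats = {"samplesize": count}
        # a single observation is not enough to report a meaningful mean
        stats["mean"] = sum(samples) / count if count > 1 else None
        for fraction, key, needed in orderstatlist:
            stats[key] = samples[int(fraction * count)] if count >= needed else None
        output[category] = stats
    return output
```

For example, a category with 20 samples gets a 95th percentile (threshold 20) but `None` for the 99th and 99.9th (thresholds 100 and 1000), which is exactly what the modified test_latencies asserts for the "write" category.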

Tue Apr 26 14:59:58 MDT 2011  wilcoxjg@gmail.com
  * test_storage.py:  test_latencies now expects None in output categories that contain too few samples for the associated percentile to be unambiguously reported.

Tue Apr 26 15:16:41 MDT 2011  wilcoxjg@gmail.com
  * server.py:  get_latencies now reports percentiles _only_ if there are sufficient observations for the interpretation of the percentile to be unambiguous.

Thu May 19 11:10:41 MDT 2011  wilcoxjg@gmail.com
  * stats.rst: now documents percentile modification in get_latencies

Mon May 23 16:29:08 MDT 2011  wilcoxjg@gmail.com
  * interfaces.py:  modified the return type of RIStatsProvider.get_stats to allow for None as a return value

Tue May 24 09:46:39 MDT 2011  wilcoxjg@gmail.com
  * NEWS.rst, stats.py: documentation of change to get_latencies

New patches:

[test_storage.py:  test_latencies now expects None in output categories that contain too few samples for the associated percentile to be unambiguously reported.
wilcoxjg@gmail.com**20110426205958
 Ignore-this: 2cf1920eb878f97394940584c470f43a
] {
hunk ./src/allmydata/test/test_storage.py 1314
             ss.add_latency("allocate", 1.0 * i)
         for i in range(1000):
             ss.add_latency("renew", 1.0 * i)
+        for i in range(20):
+            ss.add_latency("write", 1.0 * i)
         for i in range(10):
             ss.add_latency("cancel", 2.0 * i)
         ss.add_latency("get", 5.0)
hunk ./src/allmydata/test/test_storage.py 1323
         output = ss.get_latencies()

         self.failUnlessEqual(sorted(output.keys()),
-                             sorted(["allocate", "renew", "cancel", "get"]))
+                             sorted(["allocate", "renew", "cancel", "write", "get"]))
         self.failUnlessEqual(len(ss.latencies["allocate"]), 1000)
         self.failUnless(abs(output["allocate"]["mean"] - 9500) < 1, output)
         self.failUnless(abs(output["allocate"]["01_0_percentile"] - 9010) < 1, output)
hunk ./src/allmydata/test/test_storage.py 1344
         self.failUnless(abs(output["renew"]["99_0_percentile"] - 990) < 1, output)
         self.failUnless(abs(output["renew"]["99_9_percentile"] - 999) < 1, output)

+        self.failUnlessEqual(len(ss.latencies["write"]), 20)
+        self.failUnless(abs(output["write"]["mean"] - 9) < 1, output)
+        self.failUnless(output["write"]["01_0_percentile"] == None, output)
+        self.failUnless(abs(output["write"]["10_0_percentile"] -  2) < 1, output)
+        self.failUnless(abs(output["write"]["50_0_percentile"] - 10) < 1, output)
+        self.failUnless(abs(output["write"]["90_0_percentile"] - 18) < 1, output)
+        self.failUnless(abs(output["write"]["95_0_percentile"] - 19) < 1, output)
+        self.failUnless(output["write"]["99_0_percentile"] == None, output)
+        self.failUnless(output["write"]["99_9_percentile"] == None, output)
+
         self.failUnlessEqual(len(ss.latencies["cancel"]), 10)
         self.failUnless(abs(output["cancel"]["mean"] - 9) < 1, output)
hunk ./src/allmydata/test/test_storage.py 1356
-        self.failUnless(abs(output["cancel"]["01_0_percentile"] -  0) < 1, output)
+        self.failUnless(output["cancel"]["01_0_percentile"] == None, output)
         self.failUnless(abs(output["cancel"]["10_0_percentile"] -  2) < 1, output)
         self.failUnless(abs(output["cancel"]["50_0_percentile"] - 10) < 1, output)
         self.failUnless(abs(output["cancel"]["90_0_percentile"] - 18) < 1, output)
hunk ./src/allmydata/test/test_storage.py 1360
-        self.failUnless(abs(output["cancel"]["95_0_percentile"] - 18) < 1, output)
-        self.failUnless(abs(output["cancel"]["99_0_percentile"] - 18) < 1, output)
-        self.failUnless(abs(output["cancel"]["99_9_percentile"] - 18) < 1, output)
+        self.failUnless(output["cancel"]["95_0_percentile"] == None, output)
+        self.failUnless(output["cancel"]["99_0_percentile"] == None, output)
+        self.failUnless(output["cancel"]["99_9_percentile"] == None, output)

         self.failUnlessEqual(len(ss.latencies["get"]), 1)
hunk ./src/allmydata/test/test_storage.py 1365
-        self.failUnless(abs(output["get"]["mean"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["01_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["10_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["50_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["90_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["95_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["99_0_percentile"] - 5) < 1, output)
-        self.failUnless(abs(output["get"]["99_9_percentile"] - 5) < 1, output)
+        self.failUnless(output["get"]["mean"] == None, output)
+        self.failUnless(output["get"]["01_0_percentile"] == None, output)
+        self.failUnless(output["get"]["10_0_percentile"] == None, output)
+        self.failUnless(output["get"]["50_0_percentile"] == None, output)
+        self.failUnless(output["get"]["90_0_percentile"] == None, output)
+        self.failUnless(output["get"]["95_0_percentile"] == None, output)
+        self.failUnless(output["get"]["99_0_percentile"] == None, output)
+        self.failUnless(output["get"]["99_9_percentile"] == None, output)

 def remove_tags(s):
     s = re.sub(r'<[^>]*>', ' ', s)
}
[server.py:  get_latencies now reports percentiles _only_ if there are sufficient observations for the interpretation of the percentile to be unambiguous.
wilcoxjg@gmail.com**20110426211641
 Ignore-this: 546001f34d53e35ce2025b05b4ea66b6
] {
hunk ./src/allmydata/storage/server.py 119

     def get_latencies(self):
         """Return a dict, indexed by category, that contains a dict of
-        latency numbers for each category. Each dict will contain the
+        latency numbers for each category. If there are sufficient samples
+        for unambiguous interpretation, each dict will contain the
         following keys: mean, 01_0_percentile, 10_0_percentile,
         50_0_percentile (median), 90_0_percentile, 95_0_percentile,
hunk ./src/allmydata/storage/server.py 123
-        99_0_percentile, 99_9_percentile. If no samples have been collected
-        for the given category, then that category name will not be present
-        in the return value."""
+        99_0_percentile, 99_9_percentile. If there are insufficient
+        samples for a given percentile to be interpreted unambiguously,
+        that percentile will be reported as None. If no samples have
+        been collected for the given category, then that category name
+        will not be present in the return value."""
         # note that Amazon's Dynamo paper says they use 99.9% percentile.
         output = {}
         for category in self.latencies:
hunk ./src/allmydata/storage/server.py 135
                 continue
             stats = {}
             samples = self.latencies[category][:]
-            samples.sort()
             count = len(samples)
hunk ./src/allmydata/storage/server.py 136
-            stats["mean"] = sum(samples) / count
-            stats["01_0_percentile"] = samples[int(0.01 * count)]
-            stats["10_0_percentile"] = samples[int(0.1 * count)]
-            stats["50_0_percentile"] = samples[int(0.5 * count)]
-            stats["90_0_percentile"] = samples[int(0.9 * count)]
-            stats["95_0_percentile"] = samples[int(0.95 * count)]
-            stats["99_0_percentile"] = samples[int(0.99 * count)]
-            stats["99_9_percentile"] = samples[int(0.999 * count)]
+            stats["samplesize"] = count
+            samples.sort()
+            if count > 1:
+                stats["mean"] = sum(samples) / count
+            else:
+                stats["mean"] = None
+
+            orderstatlist = [(0.01, "01_0_percentile", 100), (0.1, "10_0_percentile", 10),
+                             (0.50, "50_0_percentile", 10), (0.90, "90_0_percentile", 10),
+                             (0.95, "95_0_percentile", 20), (0.99, "99_0_percentile", 100),
+                             (0.999, "99_9_percentile", 1000)]
+
+            for percentile, percentilestring, minnumtoobserve in orderstatlist:
+                if count >= minnumtoobserve:
+                    stats[percentilestring] = samples[int(percentile*count)]
+                else:
+                    stats[percentilestring] = None
+
             output[category] = stats
         return output

}
[stats.rst: now documents percentile modification in get_latencies
wilcoxjg@gmail.com**20110519171041
 Ignore-this: ab728a6f8d382a046c84e152f00c0171
] hunk ./docs/stats.rst 137
         999 out of the last 1000 operations were faster than the
         given number, and is the same threshold used by Amazon's
         internal SLA, according to the Dynamo paper).
+        Percentiles are only reported when there are enough
+        observations for unambiguous interpretation. For example, the
+        99.9th percentile is only distinct from the 99th percentile
+        when there are at least 1000 observations, so the 99.9th
+        percentile is only reported for samples of 1000 or more
+        observations.
+

 **counters.uploader.files_uploaded**

[interfaces.py:  modified the return type of RIStatsProvider.get_stats to allow for None as a return value
wilcoxjg@gmail.com**20110523222908
 Ignore-this: 569051254e18b521faaba5203c93d10c
] hunk ./src/allmydata/interfaces.py 2398
         stats are instantaneous measures (potentially time averaged
         internally)
         """
-        return DictOf(str, DictOf(str, ChoiceOf(float, int, long)))
+        return DictOf(str, DictOf(str, ChoiceOf(float, int, long, None)))

 class RIStatsGatherer(RemoteInterface):
     __remote_name__ = "RIStatsGatherer.tahoe.allmydata.com"
[NEWS.rst, stats.py: documentation of change to get_latencies
wilcoxjg@gmail.com**20110524154639
 Ignore-this: 207603196767c497306610b27abda9ce
] {
hunk ./NEWS.rst 5
 User-Visible Changes in Tahoe-LAFS
 ==================================

+Release 1.9.0 (2011-??-??)
+--------------------------
+
+
+- Nodes now emit "None" for percentiles with higher implied precision
+  than the number of observations can support. Older stats gatherers
+  will throw an exception if they gather stats from a new storage
+  server and it sends a "None" for a percentile. (`#1392`_)
+
+
 Release 1.8.2 (2011-01-30)
 --------------------------

hunk ./src/allmydata/interfaces.py 2393
     def get_stats():
         """
         returns a dictionary containing 'counters' and 'stats', each a
-        dictionary with string counter/stat name keys, and numeric values.
+        dictionary with string counter/stat name keys, and numeric or None values.
         counters are monotonically increasing measures of work done, and
         stats are instantaneous measures (potentially time averaged
         internally)
}

Context:

[docs: revert link in relnotes.txt from NEWS.rst to NEWS, since the former did not exist at revision 5000.
david-sarah@jacaranda.org**20110517011214
 Ignore-this: 6a5be6e70241e3ec0575641f64343df7
]
[docs: convert NEWS to NEWS.rst and change all references to it.
david-sarah@jacaranda.org**20110517010255
 Ignore-this: a820b93ea10577c77e9c8206dbfe770d
]
[docs: remove out-of-date docs/testgrid/introducer.furl and containing directory. fixes #1404
david-sarah@jacaranda.org**20110512140559
 Ignore-this: 784548fc5367fac5450df1c46890876d
]
[scripts/common.py: don't assume that the default alias is always 'tahoe' (it is, but the API of get_alias doesn't say so). refs #1342
david-sarah@jacaranda.org**20110130164923
 Ignore-this: a271e77ce81d84bb4c43645b891d92eb
]
[setup: don't catch all Exception from check_requirement(), but only PackagingError and ImportError
zooko@zooko.com**20110128142006
 Ignore-this: 57d4bc9298b711e4bc9dc832c75295de
 I noticed this because I had accidentally inserted a bug which caused AssertionError to be raised from check_requirement().
]
[M-x whitespace-cleanup
zooko@zooko.com**20110510193653
 Ignore-this: dea02f831298c0f65ad096960e7df5c7
]
[docs: fix typo in running.rst, thanks to arch_o_median
zooko@zooko.com**20110510193633
 Ignore-this: ca06de166a46abbc61140513918e79e8
]
[relnotes.txt: don't claim to work on Cygwin (which has been untested for some time). refs #1342
david-sarah@jacaranda.org**20110204204902
 Ignore-this: 85ef118a48453d93fa4cddc32d65b25b
]
[relnotes.txt: forseeable -> foreseeable. refs #1342
david-sarah@jacaranda.org**20110204204116
 Ignore-this: 746debc4d82f4031ebf75ab4031b3a9
]
[replace remaining .html docs with .rst docs
zooko@zooko.com**20110510191650
 Ignore-this: d557d960a986d4ac8216d1677d236399
 Remove install.html (long since deprecated).
 Also replace some obsolete references to install.html with references to quickstart.rst.
 Fix some broken internal references within docs/historical/historical_known_issues.txt.
 Thanks to Ravi Pinjala and Patrick McDonald.
 refs #1227
]
[docs: FTP-and-SFTP.rst: fix a minor error and update the information about which version of Twisted fixes #1297
zooko@zooko.com**20110428055232
 Ignore-this: b63cfb4ebdbe32fb3b5f885255db4d39
]
[munin tahoe_files plugin: fix incorrect file count
francois@ctrlaltdel.ch**20110428055312
 Ignore-this: 334ba49a0bbd93b4a7b06a25697aba34
 fixes #1391
]
[corrected "k must never be smaller than N" to "k must never be greater than N"
secorp@allmydata.org**20110425010308
 Ignore-this: 233129505d6c70860087f22541805eac
]
[Fix a test failure in test_package_initialization on Python 2.4.x due to exceptions being stringified differently than in later versions of Python. refs #1389
david-sarah@jacaranda.org**20110411190738
 Ignore-this: 7847d26bc117c328c679f08a7baee519
]
[tests: add test for including the ImportError message and traceback entry in the summary of errors from importing dependencies. refs #1389
david-sarah@jacaranda.org**20110410155844
 Ignore-this: fbecdbeb0d06a0f875fe8d4030aabafa
]
[allmydata/__init__.py: preserve the message and last traceback entry (file, line number, function, and source line) of ImportErrors in the package versions string. fixes #1389
david-sarah@jacaranda.org**20110410155705
 Ignore-this: 2f87b8b327906cf8bfca9440a0904900
]
[remove unused variable detected by pyflakes
zooko@zooko.com**20110407172231
 Ignore-this: 7344652d5e0720af822070d91f03daf9
]
[allmydata/__init__.py: Nicer reporting of unparseable version numbers in dependencies. fixes #1388
david-sarah@jacaranda.org**20110401202750
 Ignore-this: 9c6bd599259d2405e1caadbb3e0d8c7f
]
[update FTP-and-SFTP.rst: the necessary patch is included in Twisted-10.1
Brian Warner <warner@lothar.com>**20110325232511
 Ignore-this: d5307faa6900f143193bfbe14e0f01a
]
[control.py: remove all uses of s.get_serverid()
warner@lothar.com**20110227011203
 Ignore-this: f80a787953bd7fa3d40e828bde00e855
]
[web: remove some uses of s.get_serverid(), not all
warner@lothar.com**20110227011159
 Ignore-this: a9347d9cf6436537a47edc6efde9f8be
]
[immutable/downloader/fetcher.py: remove all get_serverid() calls
warner@lothar.com**20110227011156
 Ignore-this: fb5ef018ade1749348b546ec24f7f09a
]
[immutable/downloader/fetcher.py: fix diversity bug in server-response handling
warner@lothar.com**20110227011153
 Ignore-this: bcd62232c9159371ae8a16ff63d22c1b

 When blocks terminate (either COMPLETE or CORRUPT/DEAD/BADSEGNUM), the
 _shares_from_server dict was being popped incorrectly (using shnum as the
 index instead of serverid). I'm still thinking through the consequences of
 this bug. It was probably benign and really hard to detect. I think it would
 cause us to incorrectly believe that we're pulling too many shares from a
 server, and thus prefer a different server rather than asking for a second
 share from the first server. The diversity code is intended to spread out the
 number of shares simultaneously being requested from each server, but with
 this bug, it might be spreading out the total number of shares requested at
 all, not just simultaneously. (note that SegmentFetcher is scoped to a single
 segment, so the effect doesn't last very long).
]
[immutable/downloader/share.py: reduce get_serverid(), one left, update ext deps
warner@lothar.com**20110227011150
 Ignore-this: d8d56dd8e7b280792b40105e13664554

 test_download.py: create+check MyShare instances better, make sure they share
 Server objects, now that finder.py cares
]
[immutable/downloader/finder.py: reduce use of get_serverid(), one left
warner@lothar.com**20110227011146
 Ignore-this: 5785be173b491ae8a78faf5142892020
]
[immutable/offloaded.py: reduce use of get_serverid() a bit more
warner@lothar.com**20110227011142
 Ignore-this: b48acc1b2ae1b311da7f3ba4ffba38f
]
[immutable/upload.py: reduce use of get_serverid()
warner@lothar.com**20110227011138
 Ignore-this: ffdd7ff32bca890782119a6e9f1495f6
]
[immutable/checker.py: remove some uses of s.get_serverid(), not all
warner@lothar.com**20110227011134
 Ignore-this: e480a37efa9e94e8016d826c492f626e
]
[add remaining get_* methods to storage_client.Server, NoNetworkServer, and
warner@lothar.com**20110227011132
 Ignore-this: 6078279ddf42b179996a4b53bee8c421
 MockIServer stubs
]
[upload.py: rearrange _make_trackers a bit, no behavior changes
warner@lothar.com**20110227011128
 Ignore-this: 296d4819e2af452b107177aef6ebb40f
]
[happinessutil.py: finally rename merge_peers to merge_servers
warner@lothar.com**20110227011124
 Ignore-this: c8cd381fea1dd888899cb71e4f86de6e
]
[test_upload.py: factor out FakeServerTracker
warner@lothar.com**20110227011120
 Ignore-this: 6c182cba90e908221099472cc159325b
]
[test_upload.py: server-vs-tracker cleanup
warner@lothar.com**20110227011115
 Ignore-this: 2915133be1a3ba456e8603885437e03
]
[happinessutil.py: server-vs-tracker cleanup
warner@lothar.com**20110227011111
 Ignore-this: b856c84033562d7d718cae7cb01085a9
]
[upload.py: more tracker-vs-server cleanup
warner@lothar.com**20110227011107
 Ignore-this: bb75ed2afef55e47c085b35def2de315
]
[upload.py: fix var names to avoid confusion between 'trackers' and 'servers'
warner@lothar.com**20110227011103
 Ignore-this: 5d5e3415b7d2732d92f42413c25d205d
]
[refactor: s/peer/server/ in immutable/upload, happinessutil.py, test_upload
warner@lothar.com**20110227011100
 Ignore-this: 7ea858755cbe5896ac212a925840fe68

 No behavioral changes, just updating variable/method names and log messages.
 The effects outside these three files should be minimal: some exception
 messages changed (to say "server" instead of "peer"), and some internal class
 names were changed. A few things still use "peer" to minimize external
 changes, like UploadResults.timings["peer_selection"] and
 happinessutil.merge_peers, which can be changed later.
]
[storage_client.py: clean up test_add_server/test_add_descriptor, remove .test_servers
warner@lothar.com**20110227011056
 Ignore-this: efad933e78179d3d5fdcd6d1ef2b19cc
]
[test_client.py, upload.py:: remove KiB/MiB/etc constants, and other dead code
warner@lothar.com**20110227011051
 Ignore-this: dc83c5794c2afc4f81e592f689c0dc2d
]
[test: increase timeout on a network test because Francois's ARM machine hit that timeout
zooko@zooko.com**20110317165909
 Ignore-this: 380c345cdcbd196268ca5b65664ac85b
 I'm skeptical that the test was proceeding correctly but ran out of time. It seems more likely that it had gotten hung. But if we raise the timeout to an even more extravagant number then we can be even more certain that the test was never going to finish.
]
[docs/configuration.rst: add a "Frontend Configuration" section
Brian Warner <warner@lothar.com>**20110222014323
 Ignore-this: 657018aa501fe4f0efef9851628444ca

 this points to docs/frontends/*.rst, which were previously underlinked
]
[web/filenode.py: avoid calling req.finish() on closed HTTP connections. Closes #1366
"Brian Warner <warner@lothar.com>"**20110221061544
 Ignore-this: 799d4de19933f2309b3c0c19a63bb888
]
[Add unit tests for cross_check_pkg_resources_versus_import, and a regression test for ref #1355. This requires a little refactoring to make it testable.
david-sarah@jacaranda.org**20110221015817
 Ignore-this: 51d181698f8c20d3aca58b057e9c475a
]
[allmydata/__init__.py: .name was used in place of the correct .__name__ when printing an exception. Also, robustify string formatting by using %r instead of %s in some places. fixes #1355.
david-sarah@jacaranda.org**20110221020125
 Ignore-this: b0744ed58f161bf188e037bad077fc48
]
[Refactor StorageFarmBroker handling of servers
Brian Warner <warner@lothar.com>**20110221015804
 Ignore-this: 842144ed92f5717699b8f580eab32a51

 Pass around IServer instance instead of (peerid, rref) tuple. Replace
 "descriptor" with "server". Other replacements:

  get_all_servers -> get_connected_servers/get_known_servers
  get_servers_for_index -> get_servers_for_psi (now returns IServers)

 This change still needs to be pushed further down: lots of code is now
 getting the IServer and then distributing (peerid, rref) internally.
 Instead, it ought to distribute the IServer internally and delay
 extracting a serverid or rref until the last moment.

 no_network.py was updated to retain parallelism.
]
[TAG allmydata-tahoe-1.8.2
warner@lothar.com**20110131020101]
Patch bundle hash:
12716d4a167ffc91d3273dee770cd8e6a41bb786