Changeset 67ad0175 in trunk


Timestamp: 2011-05-27T12:01:35Z
Author: wilcoxjg <wilcoxjg@…>
Branches: master
Children: ff136b8e
Parents: d566e46
Message:

server.py: get_latencies now reports percentiles _only_ if there are sufficient observations for the interpretation of the percentile to be unambiguous.
interfaces.py: modified the return type of RIStatsProvider.get_stats to allow for None as a return value
NEWS.rst, stats.py: documentation of change to get_latencies
stats.rst: now documents percentile modification in get_latencies
test_storage.py: test_latencies now expects None in output categories that contain too few samples for the associated percentile to be unambiguously reported.
fixes #1392
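
The gating rule described in this message can be sketched in isolation. This is a minimal standalone sketch, not the code in server.py: the helper name `summarize_latencies` and the constant `ORDER_STATS` are hypothetical, while the fractions, key names, and minimum sample counts are copied from the `orderstatlist` tuples in the server.py diff.

```python
# (fraction, key, minimum sample count), copied from the changeset's orderstatlist.
ORDER_STATS = [
    (0.01, "01_0_percentile", 100),
    (0.10, "10_0_percentile", 10),
    (0.50, "50_0_percentile", 10),
    (0.90, "90_0_percentile", 10),
    (0.95, "95_0_percentile", 20),
    (0.99, "99_0_percentile", 100),
    (0.999, "99_9_percentile", 1000),
]


def summarize_latencies(samples):
    """Hypothetical helper: summarize one category's latency samples.

    A percentile is reported only when the sample count reaches its
    threshold; otherwise the key is present with the value None.
    """
    ordered = sorted(samples)
    count = len(ordered)
    stats = {"samplesize": count}
    # As in the changeset, the mean is withheld unless there is more than one sample.
    stats["mean"] = sum(ordered) / count if count > 1 else None
    for fraction, key, min_count in ORDER_STATS:
        if count >= min_count:
            stats[key] = ordered[int(fraction * count)]
        else:
            stats[key] = None
    return stats


write_stats = summarize_latencies([1.0 * i for i in range(20)])
```

With 20 samples, as in the "write" category added to test_latencies, the 10th through 95th percentiles are reported and the 1st, 99th, and 99.9th come back as None.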

Files: 5 edited

  • NEWS.rst

    --- NEWS.rst (rd566e46)
    +++ NEWS.rst (r67ad0175)
    @@ -1,5 +1,15 @@
    -==================================
    +==================================
     User-Visible Changes in Tahoe-LAFS
     ==================================
    +
    +Release 1.9.0 (2011-??-??)
    +--------------------------
    +
    +
    +- Nodes now emit "None" for percentiles with higher implied precision
    +  than the number of observations can support. Older stats gatherers
    +  will throw an exception if they gather stats from a new storage
    +  server and it sends a "None" for a percentile. (`#1392`_)
    +
     
     Release 1.8.2 (2011-01-30)
  • docs/stats.rst

    --- docs/stats.rst (rd566e46)
    +++ docs/stats.rst (r67ad0175)
    @@ -1,3 +1,3 @@
    -================
    +================
     Tahoe Statistics
     ================
    @@ -45,5 +45,5 @@
         by client-only nodes which have been configured to not run a storage server
         (with [storage]enabled=false in tahoe.cfg)
    -                           
    +
         allocate, write, close, abort
             these are for immutable file uploads. 'allocate' is incremented when a
    @@ -135,4 +135,12 @@
             given number, and is the same threshold used by Amazon's
             internal SLA, according to the Dynamo paper).
    +        Percentiles are only reported in the case of a sufficient
    +        number of observations for unambiguous interpretation. For
    +        example, the 99.9th percentile is (at the level of thousandths
    +        precision) 9 thousandths greater than the 99th
    +        percentile for sample sizes greater than or equal to 1000,
    +        thus the 99.9th percentile is only reported for samples of 1000
    +        or more observations.
    +
     
     **counters.uploader.files_uploaded**
    @@ -196,5 +204,5 @@
         active_uploads
             how many files are currently being uploaded. 0 when idle.
    -    
    +
         incoming_count
             how many cache files are present in the incoming/ directory,
  • src/allmydata/interfaces.py

    --- src/allmydata/interfaces.py (rd566e46)
    +++ src/allmydata/interfaces.py (r67ad0175)
    @@ -2391,10 +2391,10 @@
             """
             returns a dictionary containing 'counters' and 'stats', each a
    -        dictionary with string counter/stat name keys, and numeric values.
    +        dictionary with string counter/stat name keys, and numeric or None values.
             counters are monotonically increasing measures of work done, and
             stats are instantaneous measures (potentially time averaged
             internally)
             """
    -        return DictOf(str, DictOf(str, ChoiceOf(float, int, long)))
    +        return DictOf(str, DictOf(str, ChoiceOf(float, int, long, None)))
     
     class RIStatsGatherer(RemoteInterface):
  • src/allmydata/storage/server.py

    --- src/allmydata/storage/server.py (rd566e46)
    +++ src/allmydata/storage/server.py (r67ad0175)
    @@ -117,10 +117,13 @@
         def get_latencies(self):
             """Return a dict, indexed by category, that contains a dict of
    -        latency numbers for each category. Each dict will contain the
    +        latency numbers for each category. If there are sufficient samples
    +        for unambiguous interpretation, each dict will contain the
             following keys: mean, 01_0_percentile, 10_0_percentile,
             50_0_percentile (median), 90_0_percentile, 95_0_percentile,
    -        99_0_percentile, 99_9_percentile. If no samples have been collected
    -        for the given category, then that category name will not be present
    -        in the return value."""
    +        99_0_percentile, 99_9_percentile.  If there are insufficient
    +        samples for a given percentile to be interpreted unambiguously
    +        that percentile will be reported as None. If no samples have been
    +        collected for the given category, then that category name will
    +        not be present in the return value. """
             # note that Amazon's Dynamo paper says they use 99.9% percentile.
             output = {}
    @@ -130,14 +133,23 @@
                 stats = {}
                 samples = self.latencies[category][:]
    +            count = len(samples)
    +            stats["samplesize"] = count
                 samples.sort()
    -            count = len(samples)
    -            stats["mean"] = sum(samples) / count
    -            stats["01_0_percentile"] = samples[int(0.01 * count)]
    -            stats["10_0_percentile"] = samples[int(0.1 * count)]
    -            stats["50_0_percentile"] = samples[int(0.5 * count)]
    -            stats["90_0_percentile"] = samples[int(0.9 * count)]
    -            stats["95_0_percentile"] = samples[int(0.95 * count)]
    -            stats["99_0_percentile"] = samples[int(0.99 * count)]
    -            stats["99_9_percentile"] = samples[int(0.999 * count)]
    +            if count > 1:
    +                stats["mean"] = sum(samples) / count
    +            else:
    +                stats["mean"] = None
    +
    +            orderstatlist = [(0.01, "01_0_percentile", 100), (0.1, "10_0_percentile", 10),\
    +                             (0.50, "50_0_percentile", 10), (0.90, "90_0_percentile", 10),\
    +                             (0.95, "95_0_percentile", 20), (0.99, "99_0_percentile", 100),\
    +                             (0.999, "99_9_percentile", 1000)]
    +
    +            for percentile, percentilestring, minnumtoobserve in orderstatlist:
    +                if count >= minnumtoobserve:
    +                    stats[percentilestring] = samples[int(percentile*count)]
    +                else:
    +                    stats[percentilestring] = None
    +
                 output[category] = stats
             return output
    @@ -552,3 +564,2 @@
                     level=log.SCARY, umid="SGx2fA")
             return None
    -
  • src/allmydata/test/test_storage.py

    --- src/allmydata/test/test_storage.py (rd566e46)
    +++ src/allmydata/test/test_storage.py (r67ad0175)
    @@ -1312,4 +1312,6 @@
             for i in range(1000):
                 ss.add_latency("renew", 1.0 * i)
    +        for i in range(20):
    +            ss.add_latency("write", 1.0 * i)
             for i in range(10):
                 ss.add_latency("cancel", 2.0 * i)
    @@ -1319,5 +1321,5 @@
     
             self.failUnlessEqual(sorted(output.keys()),
    -                             sorted(["allocate", "renew", "cancel", "get"]))
    +                             sorted(["allocate", "renew", "cancel", "write", "get"]))
             self.failUnlessEqual(len(ss.latencies["allocate"]), 1000)
             self.failUnless(abs(output["allocate"]["mean"] - 9500) < 1, output)
    @@ -1340,23 +1342,33 @@
             self.failUnless(abs(output["renew"]["99_9_percentile"] - 999) < 1, output)
     
    +        self.failUnlessEqual(len(ss.latencies["write"]), 20)
    +        self.failUnless(abs(output["write"]["mean"] - 9) < 1, output)
    +        self.failUnless(output["write"]["01_0_percentile"] is None, output)
    +        self.failUnless(abs(output["write"]["10_0_percentile"] -  2) < 1, output)
    +        self.failUnless(abs(output["write"]["50_0_percentile"] - 10) < 1, output)
    +        self.failUnless(abs(output["write"]["90_0_percentile"] - 18) < 1, output)
    +        self.failUnless(abs(output["write"]["95_0_percentile"] - 19) < 1, output)
    +        self.failUnless(output["write"]["99_0_percentile"] is None, output)
    +        self.failUnless(output["write"]["99_9_percentile"] is None, output)
    +
             self.failUnlessEqual(len(ss.latencies["cancel"]), 10)
             self.failUnless(abs(output["cancel"]["mean"] - 9) < 1, output)
    -        self.failUnless(abs(output["cancel"]["01_0_percentile"] -  0) < 1, output)
    +        self.failUnless(output["cancel"]["01_0_percentile"] is None, output)
             self.failUnless(abs(output["cancel"]["10_0_percentile"] -  2) < 1, output)
             self.failUnless(abs(output["cancel"]["50_0_percentile"] - 10) < 1, output)
             self.failUnless(abs(output["cancel"]["90_0_percentile"] - 18) < 1, output)
    -        self.failUnless(abs(output["cancel"]["95_0_percentile"] - 18) < 1, output)
    -        self.failUnless(abs(output["cancel"]["99_0_percentile"] - 18) < 1, output)
    -        self.failUnless(abs(output["cancel"]["99_9_percentile"] - 18) < 1, output)
    +        self.failUnless(output["cancel"]["95_0_percentile"] is None, output)
    +        self.failUnless(output["cancel"]["99_0_percentile"] is None, output)
    +        self.failUnless(output["cancel"]["99_9_percentile"] is None, output)
     
             self.failUnlessEqual(len(ss.latencies["get"]), 1)
    -        self.failUnless(abs(output["get"]["mean"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["01_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["10_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["50_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["90_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["95_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["99_0_percentile"] - 5) < 1, output)
    -        self.failUnless(abs(output["get"]["99_9_percentile"] - 5) < 1, output)
    +        self.failUnless(output["get"]["mean"] is None, output)
    +        self.failUnless(output["get"]["01_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["10_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["50_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["90_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["95_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["99_0_percentile"] is None, output)
    +        self.failUnless(output["get"]["99_9_percentile"] is None, output)
     
     def remove_tags(s):
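
The NEWS.rst entry in this changeset warns that older stats gatherers throw an exception when a storage server sends None for a percentile. One way a consumer of these stats might guard against that is sketched below. The function `format_latency_stats` and the "n/a" placeholder are hypothetical and do not appear in Tahoe-LAFS; the only assumption taken from the changeset is that any value in a category's dict may be None.

```python
def format_latency_stats(stats):
    """Hypothetical consumer-side helper: render one category's stats dict.

    Values that the server withheld (None) are shown as "n/a" instead of
    being passed to numeric formatting, which would raise a TypeError.
    """
    lines = []
    for key in sorted(stats):
        value = stats[key]
        if value is None:
            lines.append(f"{key}: n/a")
        else:
            lines.append(f"{key}: {value:.3f}")
    return lines


report = format_latency_stats({"mean": None, "50_0_percentile": 10.0})
```

Checking for None before formatting keeps the display code working against both older servers, which always send numbers, and servers that include this change.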