Changes between Initial Version and Version 1 of Ticket #1392, comment 12


Timestamp:
2011-04-23T17:47:58Z (14 years ago)
Author:
arch_o_median

(The Problem)
The notion of a percentile becomes ambiguous when the precision of the 'percentile' reported is overly specific for the quantity of data provided. For example, if the size of a sample is less than 10, then the 1st percentile and the 10th percentile refer to the same index (the first) in the sorted list of samples. This is consistent with the definition of a percentile, in that the index computed for both 1 percent and 10 percent of the data falls on the first element, but it can be misleading in interpretation. If the consumer believes that the 1st and 10th percentiles should refer to different indices in the list, then they will be mistaken.
The intuition that different percentiles refer to different indices is reasonable and should be supported. The degree to which the percentiles _are_ distinct is a function of their precision and the size of the sample. Larger samples permit more precise percentiles to be meaningful. I use the word 'resolution' in my head when I think of this concept. Larger sample sizes permit higher 'resolution'.
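To make the collision concrete, here is a minimal sketch. It assumes percentiles are given as fractions (0.01 for the 1st percentile) and that a percentile maps to an index by a simple floor(p * n) rule; the ticket does not say which rule the real code uses, and the names `percentile_index` and `colliding_percentiles` are made up for illustration.

```python
def percentile_index(p, n):
    """Map percentile p (a fraction in [0, 1]) to an index into a sorted
    sample of size n. Assumed rule: floor(p * n), clamped to the last index."""
    return min(int(p * n), n - 1)


def colliding_percentiles(percentiles, n):
    """Return {index: [percentiles]} for every index that is referenced
    by more than one of the requested percentiles."""
    groups = {}
    for p in percentiles:
        groups.setdefault(percentile_index(p, n), []).append(p)
    return {idx: ps for idx, ps in groups.items() if len(ps) > 1}


# With 9 samples the 1st and 10th percentiles both land on index 0.
print(colliding_percentiles([0.01, 0.10], 9))
# With 1000 samples they land on different indices, so nothing collides.
print(colliding_percentiles([0.01, 0.10], 1000))
```

Under that assumed rule, the sample size at which two percentiles stop colliding is what the comment calls 'resolution'.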

(The Solution):
Indistinct percentiles indicate insufficient resolution for the specified percentile. 'Indistinct' can be defined simply: two or more different percentiles refer to the same index. The fix is quite simple: if percentiles are indistinct, they should return/report None instead of an index.
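The rule above might be sketched as follows. This is not the patch attached to the ticket, only an illustration under stated assumptions: percentiles are fractions, the index rule is floor(p * n), and the function reports the sample value at the index (or None) rather than the index itself. The name `report_percentiles` is hypothetical.

```python
from collections import Counter


def percentile_index(p, n):
    """Assumed index rule: floor(p * n), clamped to the last valid index."""
    return min(int(p * n), n - 1)


def report_percentiles(samples, percentiles):
    """Return {percentile: value or None}.

    A percentile is reported as None when at least one other requested
    percentile maps to the same index, i.e. when it is 'indistinct'.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        # No data at all: nothing can be reported.
        return {p: None for p in percentiles}
    indices = {p: percentile_index(p, n) for p in percentiles}
    uses = Counter(indices.values())  # how many percentiles hit each index
    return {p: (ordered[i] if uses[i] == 1 else None)
            for p, i in indices.items()}


# 50 samples cannot resolve .99 from .999, so both are reported as None.
print(report_percentiles(range(1, 51), [0.5, 0.9, 0.99, 0.999]))
# {0.5: 26, 0.9: 46, 0.99: None, 0.999: None}
```

Note that every percentile sharing an index is suppressed, not just the more precise one; that is one reading of "they should return/report None".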

Caveat: It is, of course, possible to render all percentiles indistinct by specifying over-precise adjacent percentiles. This hack was created with the given percentile list in mind; that is, I am operating on the assumption that the consumer believes .99 and .999 to be different things but does not need to know whether .999 and .9999 are different quantities.