[tahoe-dev] darcs patch: Add statistics module

Shawn Willden shawn-tahoe at willden.org
Sun Jan 11 13:26:01 PST 2009


On Sunday 11 January 2009 12:36:54 pm Drew Perttula wrote:
> I have some notes regarding the Python style. I didn't check the
> algorithms themselves.

Exactly what I was looking for, thanks.

> Python supports the expression '0 <= p <= 1', which is slightly easier
> to read.

Cool.  I like that.
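
For reference, a minimal sketch of the chained comparison applied to a range check. The helper name is_probability is hypothetical and does not come from the patch.

```python
def is_probability(p):
    """Return True when p lies in the closed interval [0, 1]."""
    # Chained comparison: equivalent to (0 <= p) and (p <= 1),
    # except that p is evaluated only once.
    return 0 <= p <= 1


if __name__ == "__main__":
    for candidate in (-0.1, 0, 0.5, 1, 1.5):
        print(candidate, is_probability(candidate))
```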

> +    def bisect_k(low_k, high_k):
> +        """
> +        Recursive function to perform the search.
> +        """
>
> This part looks a lot like the stdlib 'bisect' module. That one is
> expecting to receive a sequence, and yours calculates elements on demand,
> but you could make an object that lazily runs pr_backup_file_loss() when
> bisect asks for an element.

I did spend a little time looking to see if there might be a more general 
implementation available.  I'll look at it and see whether I think it might 
be cleaner and more maintainable to use that one.
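
A rough sketch of what that suggestion might look like: a small sequence-like wrapper that computes each element only when bisect indexes it. The names LazyValues and toy_loss are hypothetical, and toy_loss is only a stand-in for pr_backup_file_loss(), whose real signature is not shown here. The sketch assumes the wrapped function is non-decreasing in its argument, since bisect requires sorted input.

```python
import bisect


class LazyValues(object):
    """Read-only sequence whose items are computed on demand."""

    def __init__(self, func, length):
        self._func = func
        self._length = length

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        if not 0 <= index < self._length:
            raise IndexError(index)
        return self._func(index)


calls = []  # records which indices bisect actually evaluated


def toy_loss(k):
    # Hypothetical stand-in for pr_backup_file_loss(); assumed to be
    # non-decreasing in k so that bisect sees a sorted sequence.
    calls.append(k)
    return (k ** 3) / 1000.0


values = LazyValues(toy_loss, 11)           # valid indices are k = 0..10
first_k = bisect.bisect_left(values, 0.2)   # smallest k with toy_loss(k) >= 0.2

if __name__ == "__main__":
    print(first_k, sorted(calls))
```

Only the indices that the binary search probes are ever evaluated, so an expensive function is called a logarithmic number of times rather than once per candidate.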

> +    if k > n/2:
> +        k = n - k
> +
> +    accum = 1.0
> +    for i in range(1, k+1):
> +        accum = accum * (n - k + i) / i;
>
> These lines are doing integer division for legacy reasons. It would be more
> future-proof and robust to put 'from __future__ import division' at the top
> of the file and use the explicit // operator in all the cases where you
> want int division.

Good to know, thanks.  I'll make that change.  Alternatively, is there a 
binomial calculation function in a standard library somewhere?  I couldn't 
find one.  I found a couple of other libs that had them, but I didn't want to 
pull in a whole math library for one trivial function.
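
As a sketch of one alternative, the same loop can be written with integers only, using the explicit // operator so that no floating-point rounding is needed at the end. The function name binomial is an assumption, not the name used in the patch. Later Python releases (3.8 and newer) do ship math.comb in the standard library, which computes the same value.

```python
def binomial(n, k):
    """Return the binomial coefficient C(n, k) as an exact integer."""
    if k < 0 or k > n:
        return 0
    # Use the symmetry C(n, k) == C(n, n - k) to shorten the loop.
    k = min(k, n - k)
    result = 1
    for i in range(1, k + 1):
        # After this step result == C(n - k + i, i), so the product is
        # always divisible by i and the floor division is exact.
        result = result * (n - k + i) // i
    return result


if __name__ == "__main__":
    print(binomial(10, 3))
```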

> +    return long(accum + 0.5)
>
> long is going away, and in modern Python versions, I think int can do
> everything that long can.

Also good to know.  I'll change it to int.

	Shawn.

