[SciPy-User] scoreatpercentile behaviour

Sturla Molden sturla@molden...
Fri Jan 25 08:07:52 CST 2013


On 24.01.2013 18:46, Andreas Hilboll wrote:

> Sorry for the noise, it turns out I was just too blind / tired /
> whatever to notice the very first line of the function
>
>     values = np.sort(a, axis=0)
>
> My dumb fault. Thanks for making me realize.

I wonder if it's not better to use quickselect than quicksort for 
percentiles? We can quickselect to find the sample closest to the 
desired percentile. Then (since quickselect leaves the data partially 
sorted), we can find the k-nearest neighbors by linear search in each 
direction, possibly using a heap, leaving us with 2*k+1 samples with 
which to interpolate the score at the percentile. That should give us 
the percentiles in average O(n) instead of average O(n log n) time.

Sturla



More information about the SciPy-User mailing list