[Numpy-discussion] Bin selection

Todd Miller jmiller at stsci.edu
Tue Dec 31 05:25:04 CST 2002


Magnus Lie Hetland wrote:

>I have a set of limits, e.g. array([0, 35, 45, 55, 75]) and I want to
>use these to "classify" a set of numbers (another one-dimensional
>array). The "class" is the number of the first limit that is lower
>than or equal to the number I want to classify. E.g. I'd classify 17
>as 0 and 42 as 1.
>
>My current approach is:
>
>  sum(nums[:,NewAxis] >= lims, dim=-1)
>
>It seems a bit unnecessary to compare each number with all the limits
>when O(log(n)) would suffice (the limits are ordered); or even with
>O(n) running time, a smarter implementation could get an average of
>n/2 comparisons...
>
>Suggestions?
>
Try searchsorted().  For each number being classified, it returns the
index of the first bin >= that number, and each lookup takes O(log(n))
time.

 >>> import numarray
 >>> a = numarray.array([0, 35, 45, 55, 75])
 >>> numarray.searchsorted(a, [1, 42, 35])
array([1, 2, 1])
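
Note that these indices do not match the classes asked for in the
question (42 comes out as 2, not 1, and 35 comes out as 1 only because
it equals a limit).  searchsorted() finds the first bin >= the number,
whereas the question wants the last limit <= the number.  A minimal
sketch of the adjustment follows.  It uses the standard-library bisect
module instead of numarray so that it runs without extra packages, and
classify() is a hypothetical helper name, not part of any library.  In
current NumPy the same result should come from
searchsorted(lims, nums, side='right') - 1.

```python
from bisect import bisect_right


def classify(nums, lims):
    """Return, for each number, the index of the last limit <= that number.

    Assumes lims is sorted in ascending order. A number smaller than
    lims[0] has no such limit and maps to -1.
    """
    # bisect_right gives the insertion point to the right of any equal
    # entries, i.e. the count of limits <= x; subtracting 1 turns that
    # count into the index of the last such limit.
    return [bisect_right(lims, x) - 1 for x in nums]


lims = [0, 35, 45, 55, 75]
print(classify([17, 42, 35], lims))  # prints [0, 1, 1]
```

Each lookup is still a binary search, so the O(log(n)) cost per number
is unchanged.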

Todd




