[Numpy-discussion] Find the N maximum values and corresponding indexes in an array

Anne Archibald peridot.faceted@gmail....
Wed Dec 2 19:27:08 CST 2009


2009/12/2 Keith Goodman <kwgoodman@gmail.com>:
> On Wed, Dec 2, 2009 at 5:09 PM, Neal Becker <ndbecker2@gmail.com> wrote:
>> David Warde-Farley wrote:
>>
>>> On 2-Dec-09, at 6:55 PM, Howard Chong wrote:
>>>
>>>> def myFindMaxA(myList):
>>>>    """implement finding maximum value with for loop iteration"""
>>>>    maxIndex=0
>>>>    maxVal=myList[0]
>>>>    for index, item in enumerate(myList):
>>>>        if item[0]>maxVal:
>>>>            maxVal=item[0]
>>>>            maxIndex=index
>>>>    return [maxIndex, maxVal]
>>>>
>>>>
>>>>
>>>> My question is: how can I make the latter version run faster? I
>>>> think the
>>>> answer is that I have to do the iteration in C.
>>>
>>>
>>> def find_biggest_n(myarray, n):
>>> ind = np.argsort(myarray)
>>> return ind[-n:], myarray[ind[-n:]]
>>>
>>> David
>> Not bad, although I wonder whether a partial sort could be faster.
>
> I'm doing a lot of sorting right now. I only need to sort the lowest
> 30% of values in a 1d array (about 250k elements), the rest I don't
> need to sort. How do I do a partial sort?

Algorithmically, if you're doing a quicksort, you just don't sort one
side of a partition when it's outside the range you want sorted (which
could even be data-dependent, I suppose, as well as
number-of-items-dependent). This is useful for sorting out only the
extreme elements, as well as finding quantiles (and those elements
above/below them). Unfortunately I'm not aware of any implementation,
useful though it might be.

Anne


More information about the NumPy-Discussion mailing list