[Numpy-discussion] Find indices of largest elements

Keith Goodman kwgoodman@gmail....
Thu Apr 15 19:22:46 CDT 2010


On Thu, Apr 15, 2010 at 1:48 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>> Keith Goodman <kwgoodman@gmail.com> writes:
>>> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>> Keith Goodman <kwgoodman@gmail.com> writes:
>>>>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>>>>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> How do I best find out the indices of the largest x elements in an
>>>>>>> array?
>>>>>>>
>>>>>>> Example:
>>>>>>>
>>>>>>> a = [ [1,8,2], [2,1,3] ]
>>>>>>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>>>>>>
>>>>>>> Since the largest 2 elements are at positions (0,1) and (1,2).
>>>>>>
>>>>>> Here's a quick way to rank the data if there are no ties and no NaNs:
>>>>>
>>>>> ...or if you need the indices in order:
>>>>>
>>>>> >> shape = (3,2)
>>>>> >> x = np.random.rand(*shape)
>>>>> >> x
>>>>> array([[ 0.52420123,  0.43231286],
>>>>>        [ 0.97995333,  0.87416228],
>>>>>        [ 0.71604075,  0.66018382]])
>>>>> >> r = x.reshape(-1).argsort().argsort()
>>>>
>>>> I don't understand why this works. Why do you call argsort() twice?
>>>> Doesn't that give you the indices of the sorted indices?
>>>
>>> It is confusing. Let's look at an example:
>>>
>>> >> x = np.random.rand(4)
>>> >> x
>>>    array([ 0.37412289,  0.68248559,  0.12935131,  0.42510212])
>>>
>>> If we call argsort once we get the index that will sort x:
>>>
>>> >> idx = x.argsort()
>>> >> idx
>>>    array([2, 0, 3, 1])
>>> >> x[idx]
>>>    array([ 0.12935131,  0.37412289,  0.42510212,  0.68248559])
>>>
>>> Notice that the first element of idx is 2. That's because element x[2]
>>> is the min of x. But that's not what we want.
>>
>> I think that's exactly what I want, the index of the smallest element.
>> It also seems to work:
>>
>> In [3]: x = np.random.rand(3,3)
>> In [4]: x
>> Out[4]:
>> array([[ 0.49064281,  0.54989584,  0.05319183],
>>       [ 0.50510206,  0.39683101,  0.22801874],
>>       [ 0.04595144,  0.3329171 ,  0.61156205]])
>> In [5]: idx = x.reshape(-1).argsort()
>> In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
>> Out[6]: [(1, 0), (0, 1), (2, 2)]
>
> Yes, you are right. My first thought was to approach the problem by
> ranking the data. But that is not needed here since the position in
> the argsorted index tells us the rank. I guess my approach was to rank
> first and then ask questions later. Well, at least we got to see
> Anne's fast ranking method.

I see now that the first method I tried in this thread requires
ranking. But the second method, the one that uses unravel_index,
doesn't.
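
To make the difference between the two methods concrete, here is a minimal sketch. The function names largest_indices and ranks are made up for illustration (they are not NumPy functions), and the sketch assumes n >= 1, no NaNs, and no ties among the largest n values. The first function is the unravel_index approach from Nikolaus's example; the second is the double argsort, which turns each element into its rank.

```python
import numpy as np

def largest_indices(a, n):
    """Index tuples of the n largest elements of a, in ascending order of value.

    Hypothetical helper for illustration; assumes n >= 1.
    """
    a = np.asarray(a)
    # One argsort of the flattened array; its last n entries are the
    # flat positions of the n largest values.
    flat_idx = a.reshape(-1).argsort()[-n:]
    # Convert each flat position back to an index tuple for a.shape.
    return [tuple(int(j) for j in np.unravel_index(i, a.shape))
            for i in flat_idx]

def ranks(x):
    """Rank of each element of x (0 = smallest), via a double argsort.

    Hypothetical helper for illustration; assumes no ties and no NaNs.
    """
    x = np.asarray(x)
    return x.reshape(-1).argsort().argsort().reshape(x.shape)

a = [[1, 8, 2], [2, 1, 3]]
top2 = largest_indices(a, 2)
r = ranks([0.37412289, 0.68248559, 0.12935131, 0.42510212])
print(top2)
print(r)
```

For the array in the original question, largest_indices(a, 2) gives the positions (1, 2) and (0, 1), smallest of the two first. Reverse the list if you want the largest element first.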

