[Numpy-discussion] Find indices of largest elements
Keith Goodman
kwgoodman@gmail....
Thu Apr 15 19:22:46 CDT 2010
On Thu, Apr 15, 2010 at 1:48 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Thu, Apr 15, 2010 at 12:41 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>> Keith Goodman <kwgoodman@gmail.com> writes:
>>> On Wed, Apr 14, 2010 at 12:39 PM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>> Keith Goodman <kwgoodman@gmail.com> writes:
>>>>> On Wed, Apr 14, 2010 at 8:49 AM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>>>>> On Wed, Apr 14, 2010 at 8:16 AM, Nikolaus Rath <Nikolaus@rath.org> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> How do I best find out the indices of the largest x elements in an
>>>>>>> array?
>>>>>>>
>>>>>>> Example:
>>>>>>>
>>>>>>> a = [ [1,8,2], [2,1,3] ]
>>>>>>> magic_function(a, 2) == [ (0,1), (1,2) ]
>>>>>>>
>>>>>>> Since the largest 2 elements are at positions (0,1) and (1,2).
>>>>>>
>>>>>> Here's a quick way to rank the data if there are no ties and no NaNs:
>>>>>
>>>>> ...or if you need the indices in order:
>>>>>
>>>>> >> shape = (3,2)
>>>>> >> x = np.random.rand(*shape)
>>>>> >> x
>>>>> array([[ 0.52420123, 0.43231286],
>>>>>        [ 0.97995333, 0.87416228],
>>>>>        [ 0.71604075, 0.66018382]])
>>>>> >> r = x.reshape(-1).argsort().argsort()
>>>>
>>>> I don't understand why this works. Why do you call argsort() twice?
>>>> Doesn't that give you the indices of the sorted indices?
>>>
>>> It is confusing. Let's look at an example:
>>>
>>> >> x = np.random.rand(4)
>>> >> x
>>> array([ 0.37412289, 0.68248559, 0.12935131, 0.42510212])
>>>
>>> If we call argsort once we get the index that will sort x:
>>>
>>> >> idx = x.argsort()
>>> >> idx
>>> array([2, 0, 3, 1])
>>> >> x[idx]
>>> array([ 0.12935131, 0.37412289, 0.42510212, 0.68248559])
>>>
>>> Notice that the first element of idx is 2. That's because element x[2]
>>> is the min of x. But that's not what we want.
>>
>> I think that's exactly what I want, the index of the smallest element.
>> It also seems to work:
>>
>> In [3]: x = np.random.rand(3,3)
>> In [4]: x
>> Out[4]:
>> array([[ 0.49064281, 0.54989584, 0.05319183],
>>        [ 0.50510206, 0.39683101, 0.22801874],
>>        [ 0.04595144, 0.3329171 , 0.61156205]])
>> In [5]: idx = x.reshape(-1).argsort()
>> In [6]: [ np.unravel_index(i, x.shape) for i in idx[-3:] ]
>> Out[6]: [(1, 0), (0, 1), (2, 2)]
>
> Yes, you are right. My first thought was to approach the problem by
> ranking the data. But that is not needed here since the position in
> the argsorted index tells us the rank. I guess my approach was to rank
> first and then ask questions later. Well, at least we got to see
> Anne's fast ranking method.
I see now that the first method I tried in this thread requires
ranking. But the second method, the one that uses unravel_index,
doesn't.
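
For reference, here is a minimal sketch of the two approaches discussed in
this thread, run on the 3x3 array quoted above. The function name
largest_indices and the variable names are made up for illustration; only
argsort, unravel_index and argwhere are NumPy calls. Like the original
examples, it assumes no ties and no NaNs.

```python
import numpy as np

def largest_indices(a, n):
    """Indices of the n largest elements of a, in ascending order of value.

    Hypothetical helper: a single argsort on the flattened array, then
    unravel_index to map each flat position back to an N-d index.
    """
    a = np.asarray(a)
    flat_idx = a.reshape(-1).argsort()[-n:]
    return [tuple(int(i) for i in np.unravel_index(k, a.shape))
            for k in flat_idx]

x = np.array([[0.49064281, 0.54989584, 0.05319183],
              [0.50510206, 0.39683101, 0.22801874],
              [0.04595144, 0.3329171, 0.61156205]])

# Method 2: one argsort plus unravel_index (no ranking needed).
top3 = largest_indices(x, 3)

# Method 1: rank every element with a double argsort, then select by rank.
# ranks[i, j] is the number of elements smaller than x[i, j].
ranks = x.reshape(-1).argsort().argsort().reshape(x.shape)
top3_by_rank = [tuple(int(i) for i in ij)
                for ij in np.argwhere(ranks >= x.size - 3)]
```

Both lists contain the same positions. The first is ordered by value,
the second by position in the array, since argwhere scans in row-major
order.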