[Numpy-discussion] A newbie question: How to get the "rank" of an 1-d array
Tim Hochberg
tim.hochberg at cox.net
Mon Mar 27 11:43:04 CST 2006
CL wrote:
> Thanks, Tim. Your "listrank" function is indeed equivalent to
> mine. My typical vector size is 150K, and there are thousands of such
> vectors that need to be processed.
Ouch!
> I think the performance would be boosted if there is any way other than
> the pure Python function "listrank".
OK. Well, still no great ideas on this end. If you have some a priori
knowledge about the vector, there might be some tricks using take. But
that would require that the values in 'v' be integers that are on the
order of len(v): is that the case?
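A minimal sketch of the kind of take-based trick meant here, assuming the values in 'v' are non-negative integers no larger than roughly len(v). The name listrank_small_ints and the use of numpy.bincount to build the lookup table are illustrative choices, not something from the original thread:

```python
import numpy

def listrank_small_ints(v):
    # Assumption: v holds non-negative integers of modest size,
    # so a table indexed by value is affordable.
    v = numpy.asarray(v)
    counts = numpy.bincount(v)                    # counts[k] = items equal to k
    at_least = numpy.cumsum(counts[::-1])[::-1]   # at_least[k] = items >= k
    table = at_least - counts + 1                 # items strictly greater than k, plus 1
    return numpy.take(table, v)                   # look up the rank of every item

print(listrank_small_ints([1, 2, 5, 3, 3, 2]))    # [6 4 1 2 2 4]
```

The table has max(v) + 1 entries, so this only pays off when the values really are on the order of len(v).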
In any event, the following is considerably faster than the previous
version:
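import numpy

def listrank_4(v):
    # Map each distinct value to the index of its first occurrence
    # in the descending sort, so tied values share the lowest rank.
    rank = {}
    for i, x in enumerate(numpy.sort(v)[::-1]):
        if x not in rank:
            rank[x] = i
    # Look up each item and convert to 1-based ranks.
    return numpy.array([rank[x] for x in v]) + 1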
The execution time here appears to be dominated by the time it takes to
insert items in the dictionary. If we know enough about the items of
'v', we could potentially replace the dictionary with a vector and speed
things up quite a bit more.
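For comparison, here is a sketch that avoids the dictionary without assuming anything about the items of 'v'. It is a different technique from the dictionary-to-vector replacement described above: it sorts once and uses numpy.searchsorted to count, for each item, how many items are less than or equal to it. The name listrank_sorted is hypothetical, and the 'side' argument assumes a reasonably current NumPy:

```python
import numpy

def listrank_sorted(v):
    v = numpy.asarray(v)
    s = numpy.sort(v)
    # For each item, the number of items <= it (insertion point on the right).
    num_le = numpy.searchsorted(s, v, side='right')
    # Items strictly greater = len(v) - num_le; add 1 for a 1-based rank.
    return len(v) - num_le + 1

print(listrank_sorted([1, 2, 5, 3, 3, 2]))    # [6 4 1 2 2 4]
```

Whether this is actually faster for 150K-element vectors would need to be timed; it is offered only as an alternative to try.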
Regards,
-tim
>
> Thanks again,
>
> CL
>
> Tim Hochberg wrote:
>
>> CL wrote:
>>
>>> Hi, group,
>>> I need to get the "rank" of a 1-D array (i.e. a vector).
>>> Note that "rank" here is not the value returned from
>>> "rank(a_array)". It is the order of each item in the sorted array.
>>> For example, I have a python function called "listrank" to return
>>> the "rank" as below:
>>
>>
>>
>> In the future, please include the relevant function. This saves us
>> (me anyway) time reverse engineering said function from the
>> description you give. Is the function below equivalent to your
>> listrank function?
>>
>> def listrank(v):
>>     rank = {}
>>     for i, x in enumerate(reversed(sorted(v))):
>>         if x not in rank:
>>             rank[x] = i
>>     return [rank[x]+1 for x in v]
>>
>>>
>>> In [19]: x
>>> Out[19]: array([1, 2, 5, 3, 3, 2])
>>>
>>> In [20]: listrank(x)
>>> Out[20]: [6, 4, 1, 2, 2, 4]
>>>
>>> Somebody suggested that I use "argsort(argsort(x))". But the problem
>>> is that it does not handle ties. See the output:
>>>
>>> In [21]: argsort(argsort(x))
>>> Out[21]: array([0, 1, 5, 3, 4, 2])
>>>
>>> I am wondering if there is a solution in numpy/numarray/numeric to
>>> get this done nicely.
>>
>>
>>
>> Unfortunately, nothing comes to mind immediately. This kind of
>> problem, where the values at one index depend on the values at a
>> different index, is often hard to deal with in the array framework.
>> How large are the vectors you are typically dealing with? If they are
>> not extremely large, or this isn't performance critical, a Python
>> solution like the one above, possibly somewhat optimized, may well be
>> sufficient.
>>
>> Perhaps someone else will come up with something though.
>>
>> Regards,
>>
>> -tim
>>
>> _______________________________________________
>> Numpy-discussion mailing list
>> Numpy-discussion at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>>