[Numpy-discussion] A newbie question: How to get the "rank" of an 1-d array
anewgene at gmail.com
Mon Mar 27 11:25:11 CST 2006
Thanks, Tim. Your "listrank" function is indeed equivalent to mine.
My typical vector size is 150K, and there are thousands of such vectors
that need to be processed. I think performance would improve if there
were a way to do this other than the pure Python function "listrank".
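
A minimal sketch of one vectorized way to get the same tie-aware ranking,
assuming a NumPy version that provides np.sort and np.searchsorted with
side="right". The name rank_with_ties is hypothetical and only used for
illustration. The idea is that an item's rank is one plus the number of
items strictly greater than it, which can be read off a sorted copy.

```python
import numpy as np

def rank_with_ties(v):
    """Descending rank with ties sharing the lowest rank (hypothetical helper).

    Intended to reproduce the listrank output quoted in this thread.
    """
    v = np.asarray(v)
    s = np.sort(v)  # ascending sorted copy
    # For each item, count how many items are <= it in the sorted copy.
    le_count = np.searchsorted(s, v, side="right")
    # Items strictly greater = n - le_count; rank is that count plus one.
    return v.size - le_count + 1

x = np.array([1, 2, 5, 3, 3, 2])
print(rank_with_ties(x))
```

This costs one sort plus one binary search per item, so it should scale as
O(n log n) without a Python-level loop. I have not timed it on 150K vectors.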
Tim Hochberg wrote:
>> Hi, group,
>> I need to get the "rank" of a 1-D array (i.e., a vector).
>> Note that "rank" here is not the value returned from "rank(a_array)".
>> It is the position of each item in the sorted array. For example, I have
>> a Python function called "listrank" that returns the "rank" as below:
> In the future, please include the relevant function. This saves us (me
> anyway) time reverse engineering said function from the description
> you give. Is the function below equivalent to your listrank function?
>
> def listrank(v):
>     rank = {}
>     for i, x in enumerate(reversed(sorted(v))):
>         if x not in rank:
>             rank[x] = i
>     return [rank[x]+1 for x in v]
>> In [19]: x
>> Out[19]: array([1, 2, 5, 3, 3, 2])
>>
>> In [20]: listrank(x)
>> Out[20]: [6, 4, 1, 2, 2, 4]
>>
>> Somebody suggested that I use "argsort(argsort(x))". But the problem
>> is that it does not handle ties. See the output:
>>
>> In [21]: argsort(argsort(x))
>> Out[21]: array([0, 1, 5, 3, 4, 2])
>> I am wondering if there is a solution in numpy/numarray/numeric to
>> get this done nicely.
> Unfortunately, nothing comes to mind immediately. This kind of
> problem, where the values at one index depend on the values at a
> different index, is often hard to deal with in the array framework. How
> large are the vectors you are typically dealing with? If they are not
> extremely large, or this isn't performance critical, a Python solution
> like the one above, possibly somewhat optimized, may well be sufficient.
>
> Perhaps someone else will come up with something though.
>
> Regards,
>
> -tim
