[Numpy-discussion] bug in lexsort with two different dtypes?
Charles R Harris
charlesr.harris@gmail....
Tue Jun 26 19:47:53 CDT 2007
On 6/26/07, Tom Denniston <tom.denniston@alum.dartmouth.org> wrote:
>
> In [1]: intArr1 = numpy.array([ 0, 1, 2,-2,-1, 5,-5,-5])
> In [2]: intArr2 = numpy.array([1,1,1,2,2,2,3,4])
> In [3]: charArr = numpy.array(['a','a','a','b','b','b','c','d'])
>
> Here I sort two int arrays. As expected, intArr2 dominates intArr1, but
> the items with the same intArr2 values are sorted forwards according
> to intArr1:
> In [6]: numpy.lexsort((intArr1, intArr2))
> Out[6]: array([0, 1, 2, 3, 4, 5, 6, 7])
>
> This, however, looks like a bug to me. Here I sort an int array and
> a str array. As expected, charArr dominates intArr1, but the items
> with the same charArr values are sorted *backwards* according to
> intArr1:
> In [5]: numpy.lexsort((intArr1, charArr))
> Out[5]: array([2, 1, 0, 5, 4, 3, 6, 7])
>
> Is this a bug or am I missing something?
Looks like a bug.
In [12]: numpy.argsort([charArr], kind='m')
Out[12]: array([[2, 1, 0, 5, 4, 3, 6, 7]])
In [13]: numpy.argsort([intArr2], kind='m')
Out[13]: array([[0, 1, 2, 3, 4, 5, 6, 7]])
Both of these are stable sorts, and since the elements are already in order,
both should return [[0, 1, 2, 3, 4, 5, 6, 7]]. Actually, I think they should
return [0, 1, 2, 3, 4, 5, 6, 7]; I'm not sure why the returned array is 2D,
and I suspect that is a bug also. As to why the string array sorts
incorrectly, I am not sure. It could be that the sort isn't stable, there
could be a stride error, or the comparison could be returning wrong values.
My bet is on the first.
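For anyone who wants to check this on their own build, here is a minimal sketch, not a definitive test. The helper reference_lexsort and the snake_case variable names are made up for illustration; it assumes only numpy.argsort and numpy.lexsort as used above, with kind='mergesort' spelled out in full. It computes the expected order in pure Python, where sorted() is guaranteed stable, and compares NumPy's answers against it. It also prints the shape of the list-wrapped call: [charArr] is a one-element list, which NumPy converts to a 1x8 array before sorting along the last axis, and that may be all there is to the 2D result.

```python
import numpy as np

int_arr1 = np.array([0, 1, 2, -2, -1, 5, -5, -5])
int_arr2 = np.array([1, 1, 1, 2, 2, 2, 3, 4])
char_arr = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'c', 'd'])


def reference_lexsort(keys):
    """Hypothetical pure-Python reference: the last key is the primary key,
    and ties keep their original order because sorted() is stable."""
    columns = [k.tolist() for k in reversed(keys)]
    n = len(columns[0])
    return sorted(range(n), key=lambda i: tuple(col[i] for col in columns))


expected = reference_lexsort((int_arr1, char_arr))
print(expected)  # [0, 1, 2, 3, 4, 5, 6, 7]

# A stable sort of already-ordered keys should be the identity permutation.
stable_char = np.argsort(char_arr, kind='mergesort')
stable_int = np.argsort(int_arr2, kind='mergesort')
identity = list(range(len(char_arr)))
print(stable_char.tolist() == identity, stable_int.tolist() == identity)

# Compare lexsort with the reference; False here would reproduce the report.
lexsort_result = np.lexsort((int_arr1, char_arr))
print(lexsort_result.tolist() == expected)

# Wrapping the array in a list gives argsort a 1x8 input, hence a 1x8 output.
wrapped = np.argsort([char_arr], kind='mergesort')
print(wrapped.shape)  # (1, 8)
```

If either comparison prints False, the build shows the behaviour reported above; passing the 1D array directly, as in the stable_char line, avoids the extra dimension.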
Please file a ticket on this.
Chuck
