[Numpy-discussion] weird searchsorted behavior for unicode array

josef.pktd@gmai... josef.pktd@gmai...
Wed Mar 28 12:17:04 CDT 2012


On Wed, Mar 28, 2012 at 11:51 AM,  <josef.pktd@gmail.com> wrote:
> On Wed, Mar 28, 2012 at 10:55 AM, Thouis (Ray) Jones <thouis@gmail.com> wrote:
>> I am seeing some very strange behavior searching a unicode array.  The
>> attached code outputs the following:
>> UNICODE
>> Is sorted: True
>> Search sorted by iteration, left: [0, 1, 2, 4, 4, 6, 6, 8, 8, 10, 10,
>> 12, 12, 13]
>> Search sorted by iteration, right: [0, 2, 2, 4, 4, 6, 6, 8, 8, 10, 10,
>> 12, 12, 13]
>> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
>> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
>> 12, 13, 13]
>> Search sorted by indexing with copy, left: [1, 2, 3, 4, 5, 6, 7, 8, 9,
>> 10, 11, 12, 13, 13]
>> Search sorted by indexing with copy, right: [1, 2, 3, 4, 5, 6, 7, 8,
>> 9, 10, 11, 12, 13, 13]
>>
>> If I remove the first print, it produces:
>> Is sorted: True
>> Search sorted by iteration, left: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
>> Search sorted by iteration, right: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
>> 11, 12, 13]
>> Search sorted by indexing, left: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 13]
>> Search sorted by indexing, right: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
>> 12, 13, 13]
>> Search sorted by indexing with copy, left: [0, 1, 2, 3, 4, 5, 6, 7, 8,
>> 9, 10, 11, 12, 13]
>> Search sorted by indexing with copy, right: [0, 1, 2, 3, 4, 5, 6, 7,
>> 8, 9, 10, 11, 12, 13]
>>
>> Neither answer is correct, since left and right should be offset by 1
>> when searching for an element in the array, by my reading of the docs.
>>
>> This is numpy 1.6.1 on OSX 10.6, python 2.7
>>
>> Am I missing something?
>
> adding this
> # -*- coding: utf-8 -*-
>
> produces consistent results for me
> maybe the regex for encoding, but I thought it has to be the first line

By "consistent" I mean the output is the same with and without the
UNICODE print commented out, but searchsorted still doesn't distinguish
between left and right. That looks like a bug (I see it in numpy 1.4.1).

Using an object array, or a string view a.view('<S524'), produces the
correct left/right shift.
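
For reference, a minimal sketch of the check and the object-array
workaround. The array contents below are made up for illustration (the
attached script's data is not reproduced here). The point is only that
when every searched value is present exactly once, side='right' should
land one past side='left', as the docs describe.

```python
import numpy as np

# Hypothetical sorted unicode data, standing in for the attached script's array.
a = np.array([u"alpha", u"beta", u"delta", u"gamma"])
assert np.all(a[:-1] <= a[1:])  # the array is sorted

# Search the array for its own elements on both sides.
left = np.searchsorted(a, a, side="left")
right = np.searchsorted(a, a, side="right")

# Each value occurs exactly once, so a correct searchsorted gives
# right - left == 1 everywhere. The buggy behavior in this thread
# returned identical left and right results instead.
unicode_ok = bool(np.all(right - left == 1))

# Workaround mentioned above: search an object-dtype copy instead.
a_obj = a.astype(object)
left_obj = np.searchsorted(a_obj, a_obj, side="left")
right_obj = np.searchsorted(a_obj, a_obj, side="right")
object_ok = bool(np.all(right_obj - left_obj == 1))

print("unicode dtype left/right offset correct:", unicode_ok)
print("object dtype left/right offset correct:", object_ok)
```

If the first line prints False and the second prints True, that matches
what I see with the unicode array here.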

Josef

>
> Josef
>
>
>>
>> Thanks,
>> Ray Jones
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>

