[Numpy-discussion] workaround for searchsorted with strings?
Charles R Harris
charlesr.harris@gmail....
Thu May 22 13:38:17 CDT 2008
On Thu, May 22, 2008 at 12:29 PM, Lewis Hyatt <lhyatt@gmail.com> wrote:
> I see from this thread:
> http://article.gmane.org/gmane.comp.python.numeric.general/18746/
> that searchsorted does not work correctly with strings. Is there a workaround,
> though, that I can use with 1.0.4 until there is a new official numpy release
> that includes the fix mentioned in the reference above? Using the latest SVN
> version is not an option for me.
> My understanding was that searchsorted works OK if the strings are all the same
> data type, but that does not appear to be the case:
>
> p >>> x=array(['0', '1', '2', '12'])
> p >>> y=array(['0', '0', '2', '3', '123'])
> p >>> x.searchsorted(y)
> array([0, 0, 0, 2, 0])
> p >>> x.astype(y.dtype).searchsorted(y)
> array([0, 0, 2, 4, 2])
> I understand that the first call to searchsorted fails because y has type S3 and
> x has type S2. But it seems that changing the type of x still produces
> incorrect (albeit different) results. Is there something similar I can do to
> make this work for now? Thanks very much.
The x array is not sorted. Strings compare lexicographically, character by
character, so '12' sorts before '2'. Try
x = array(['0', '1', '12', '2'])
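
For reference, here is a minimal sketch of the whole workaround, assuming numpy is imported as np. The names x_sorted and idx are only illustrative, and the cast to y.dtype is carried over from the original post; whether it is needed will depend on the numpy release in use.

```python
import numpy as np

x = np.array(['0', '1', '2', '12'])
y = np.array(['0', '0', '2', '3', '123'])

# searchsorted assumes its input is already sorted. For strings the order
# is lexicographic, so sort first rather than relying on numeric order.
x_sorted = np.sort(x)  # ['0', '1', '12', '2']

# Cast to the wider string dtype of y, as in the original post, so both
# arrays share one string type before searching.
idx = x_sorted.astype(y.dtype).searchsorted(y)

print(x_sorted)
print(idx)  # [0 0 3 4 3]
```

Each index is the leftmost position in x_sorted at which the corresponding element of y could be inserted while keeping x_sorted in order.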
