[Numpy-discussion] Multiplicity of an entry

Christopher Barker Chris.Barker@noaa....
Tue Oct 27 11:09:53 CDT 2009

Nadav Horesh wrote:
> np.equal(a,a).sum(0)
> but, for unknown reason, np.equal operates only on "normal" arrays.


In [25]: a
array(['abc', 'def', 'abc', 'ghij'],

In [27]: np.equal(a,a)
Out[27]: NotImplemented


In [28]: a == a
Out[28]: array([ True,  True,  True,  True], dtype=bool)

Don't they use the same code? Or is "==" falling back to plain old generic 
Python sequence comparison, which would partly explain why it is so slow?
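
Since "==" does work on the string array, here is a minimal sketch of counting the multiplicity of each entry with it. It assumes the array holds fixed-width byte strings (dtype 'S4', which is my guess at what the example above uses) and that a pairwise comparison through broadcasting is acceptable. The names `matches` and `counts` are just for illustration.

```python
import numpy as np

# Fixed-width byte strings, matching the 4-character items in the example.
a = np.array([b'abc', b'def', b'abc', b'ghij'], dtype='S4')

# Compare every entry with every other entry via broadcasting; this
# builds an n-by-n boolean array, so it only suits small inputs.
matches = a[:, None] == a

# Summing each column gives how many times that entry occurs in a.
counts = matches.sum(axis=0)
print(counts)
```

Each position in `counts` holds the number of occurrences of the corresponding entry of `a`. Memory use grows with the square of the array length, so this is only a sketch for short arrays.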

> maybe you can transform the array to arrays of numbers, for example by hash.

or even easier:

In [32]: a2 = a.view(dtype=np.int32)

In [33]: a2
Out[33]: array([1633837824, 1684366848, 1633837824, 1734895978])

In [34]: np.equal(a2, a2[0])
Out[34]: array([ True, False,  True, False], dtype=bool)

Though that only works if your strings have a fixed width that matches an 
integer size, like 4 bytes...
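
A self-contained sketch of the view trick, for anyone who wants to try it. It assumes the array is created explicitly with dtype 'S4' (4-byte strings), so each item can be reinterpreted as one int32. The `np.unique(..., return_counts=True)` step is my addition for getting the multiplicities, not something from the session above, and the variable names are illustrative. The integer values you see depend on the machine's byte order, so only equality between them is meaningful.

```python
import numpy as np

# 4-byte fixed-width strings, so each item maps onto one 32-bit integer.
a = np.array([b'abc', b'def', b'abc', b'ghij'], dtype='S4')

# Reinterpret the same memory as int32; no data is copied.
a2 = a.view(np.int32)

# np.equal works on the integer view.
same_as_first = np.equal(a2, a2[0])

# Multiplicity of every distinct entry, computed on the integers.
values, counts = np.unique(a2, return_counts=True)

# View the integer keys as strings again to label the counts.
labels = values.view('S4')
multiplicity = dict(zip(labels.tolist(), counts.tolist()))
print(same_as_first)
print(multiplicity)
```

The order of `labels` follows the integer sort order, which can differ between little-endian and big-endian machines, so the result is collected into a dict rather than relying on position.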


Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

