[Numpy-discussion] Multiplicity of an entry
Christopher Barker
Chris.Barker@noaa....
Tue Oct 27 11:09:53 CDT 2009
Nadav Horesh wrote:
> np.equal(a,a).sum(0)
>
> but, for unknown reason, np.equal operates only on "normal" arrays.
true:
In [25]: a
Out[25]:
array(['abc', 'def', 'abc', 'ghij'],
dtype='|S4')
In [27]: np.equal(a,a)
Out[27]: NotImplemented
however:
In [28]: a == a
Out[28]: array([ True, True, True, True], dtype=bool)
Don't they use the same code? Or is "==" falling back to plain old generic
Python sequence comparison, which would partly explain why it is so slow?
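For reference, here is a minimal sketch of the multiplicity count done with "==" and broadcasting rather than np.equal. It assumes a small byte-string array like the one above; the names `same`, `pairwise` and `multiplicity` are made up for illustration. It deliberately avoids calling np.equal on the string array, since (as shown above) that may not be supported, depending on the NumPy version.

```python
import numpy as np

# Same kind of array as in the session above: fixed-width byte strings ('|S4').
a = np.array([b'abc', b'def', b'abc', b'ghij'])

# Elementwise comparison with the == operator works on string arrays.
same = (a == a)

# Compare every entry against every other entry via broadcasting:
# pairwise[i, j] is True where a[i] == a[j].
pairwise = (a[:, None] == a[None, :])

# Summing down each column counts how often each entry occurs in the array.
multiplicity = pairwise.sum(axis=0)
print(multiplicity)
```

Note that the pairwise comparison builds an N x N boolean array, so this is only reasonable for fairly small inputs.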
> maybe you can transform the array to arrays of numbers, for example by hash.
or even easier:
In [32]: a2 = a.view(dtype=np.int32)
In [33]: a2
Out[33]: array([1633837824, 1684366848, 1633837824, 1734895978])
In [34]: np.equal(a2, a2[0])
Out[34]: array([ True, False, True, False], dtype=bool)
Though that only works if your strings are a handy length, like 4 bytes, so that each element maps onto exactly one int32...
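A short sketch of the view trick with the length restriction made explicit, plus np.unique with return_counts as one possible alternative for strings of arbitrary length. The np.unique route is my suggestion here, not something proposed earlier in the thread, and return_counts needs a reasonably recent NumPy. The integer values produced by the view depend on the machine's byte order, so only the equality result is relied on.

```python
import numpy as np

a = np.array([b'abc', b'def', b'abc', b'ghij'])  # dtype '|S4'

# The view trick only makes sense when each string is exactly 4 bytes wide,
# so that one element maps onto one int32.
assert a.dtype.itemsize == np.dtype(np.int32).itemsize

# Reinterpret the same memory as integers; no data is copied.
a2 = a.view(np.int32)

# Now np.equal is operating on a "normal" numeric array.
mask = np.equal(a2, a2[0])
print(mask)

# Alternative that does not depend on the string width:
# np.unique returns the sorted distinct entries and how often each occurs.
values, counts = np.unique(a, return_counts=True)
print(values, counts)
```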
-Chris
