[Numpy-discussion] dtype comparison and hashing
Robert Kern
robert.kern@gmail....
Wed Oct 15 14:56:52 CDT 2008
On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving <irving@naml.us> wrote:
> Hello,
>
> Currently in numpy comparing dtypes for equality with == does an
> internal PyArray_EquivTypes check, which means that the dtypes NPY_INT
> and NPY_LONG compare as equal in python. However, the hash function
> for dtypes reduces to id(), which is therefore inconsistent with ==.
> Unfortunately I can't produce a python snippet showing this since I
> don't know how to create a NPY_INT dtype in pure python.
>
> Based on the source it looks like hash should raise a type error,
> since tp_hash is null but tp_richcompare is not. Does the following
> snippet throw an exception for others?
>
> >>> import numpy
> >>> hash(numpy.dtype('int'))
> 5708736
>
> This might be the problem:
>
> /* Macro to get the tp_richcompare field of a type if defined */
> #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
> ? (t)->tp_richcompare : NULL)
>
> I'm using the default Mac OS X 10.5 installation of python 2.5 and
> numpy, so maybe those weren't compiled correctly. Has anyone else
> seen this issue?
Actually, the problem is that we provide a hash function explicitly.
In multiarraymodule.c:
PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;
That is a violation of the hashing protocol (objects which compare
equal and are hashable need to hash equal), and should be fixed.
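
To illustrate why this matters, here is a minimal pure-Python sketch. It does
not use numpy; the classes IdHashed and ValueHashed and the helper
distinct_count are hypothetical stand-ins for a dtype-like object whose ==
checks equivalence. The first class hashes by id(), roughly what a
pointer-based hash does, and the second hashes the same fields that ==
compares.

```python
class IdHashed(object):
    """Hypothetical dtype-like object: == by value, hash by identity."""

    def __init__(self, kind, itemsize):
        self.kind = kind
        self.itemsize = itemsize

    def __eq__(self, other):
        # Equivalence check: two instances with the same fields are equal.
        return (isinstance(other, IdHashed)
                and (self.kind, self.itemsize) == (other.kind, other.itemsize))

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Violates the protocol: equal objects get different hashes.
        return id(self)


class ValueHashed(IdHashed):
    """Same equality, but the hash is derived from the compared fields."""

    def __hash__(self):
        return hash((self.kind, self.itemsize))


def distinct_count(cls):
    """Number of entries a set keeps for two equal instances of cls."""
    a = cls('i', 4)
    b = cls('i', 4)
    assert a == b
    return len(set([a, b]))


if __name__ == "__main__":
    print(distinct_count(IdHashed))     # equal objects stored as separate keys
    print(distinct_count(ValueHashed))  # equal objects collapse to one key
```

With the identity hash, two objects that compare equal end up as separate
keys in a set or dict, so lookups with an equivalent object can miss. Hashing
the compared fields is one way to restore consistency; what the actual fix in
numpy should hash is a separate question.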
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco