[Numpy-discussion] dtype comparison and hashing

Robert Kern robert.kern@gmail....
Wed Oct 15 14:56:52 CDT 2008

On Wed, Oct 15, 2008 at 02:20, Geoffrey Irving <irving@naml.us> wrote:
> Hello,
> Currently in numpy comparing dtypes for equality with == does an
> internal PyArray_EquivTypes check, which means that the dtypes NPY_INT
> and NPY_LONG compare as equal in python.  However, the hash function
> for dtypes reduces to id(), which is therefore inconsistent with ==.
> Unfortunately I can't produce a python snippet showing this since I
> don't know how to create a NPY_INT dtype in pure python.
> Based on the source it looks like hash should raise a type error,
> since tp_hash is null but tp_richcompare is not.  Does the following
> snippet throw an exception for others?
> >>> import numpy
> >>> hash(numpy.dtype('int'))
> 5708736
> This might be the problem:
> /* Macro to get the tp_richcompare field of a type if defined */
> #define RICHCOMPARE(t) (PyType_HasFeature((t), Py_TPFLAGS_HAVE_RICHCOMPARE) \
>                         ? (t)->tp_richcompare : NULL)
> I'm using the default Mac OS X 10.5 installation of python 2.5 and
> numpy, so maybe those weren't compiled correctly.  Has anyone else
> seen this issue?

Actually, the problem is that we provide a hash function explicitly.
In multiarraymodule.c:

    PyArrayDescr_Type.tp_hash = (hashfunc)_Py_HashPointer;

That is a violation of the hashing protocol (objects which compare
equal and are hashable need to hash equal), and should be fixed.
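
To illustrate why this matters, here is a minimal pure-Python sketch. It
does not use numpy at all: IdHashedType and ValueHashedType are
hypothetical toy classes standing in for a dtype whose == is based on
equivalence. The first hashes by identity (roughly what hashing the
pointer does), the second hashes the same fields that == compares. The
field names kind and size are assumptions made for the example, not the
actual dtype layout.

```python
class IdHashedType:
    """Toy stand-in for a dtype: equality by value, hash by identity."""

    def __init__(self, kind, size):
        self.kind = kind
        self.size = size

    def __eq__(self, other):
        if not isinstance(other, IdHashedType):
            return NotImplemented
        return (self.kind, self.size) == (other.kind, other.size)

    def __hash__(self):
        # Mimics hashing the object's address: two equal objects
        # that are distinct instances get different hashes.
        return id(self)


class ValueHashedType(IdHashedType):
    """Same equality, but the hash is built from the compared fields."""

    def __hash__(self):
        return hash((self.kind, self.size))


def lookup_works(cls):
    """Return True if an equal-but-distinct instance is found in a set."""
    stored = cls("i", 4)
    probe = cls("i", 4)
    return probe in {stored}


if __name__ == "__main__":
    print("identity hash, set lookup works:", lookup_works(IdHashedType))
    print("value hash, set lookup works:", lookup_works(ValueHashedType))
```

With the identity-based hash, two objects that compare equal can land in
different hash buckets, so dict and set lookups miss them. Hashing the
same data that == compares is one way to restore consistency; whether
that is the right fix for dtypes is a separate question.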

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
