[Numpy-discussion] numpy type mismatch

Olivier Delalleau shish@keba...
Fri Jun 10 20:50:30 CDT 2011


2011/6/10 Charles R Harris <charlesr.harris@gmail.com>

>
>
> On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau <shish@keba.be> wrote:
>
>> 2011/6/10 Charles R Harris <charlesr.harris@gmail.com>
>>
>>>
>>>
>>> On Fri, Jun 10, 2011 at 3:43 PM, Benjamin Root <ben.root@ou.edu> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jun 10, 2011 at 3:24 PM, Charles R Harris <
>>>> charlesr.harris@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 10, 2011 at 2:17 PM, Benjamin Root <ben.root@ou.edu> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 10, 2011 at 3:02 PM, Charles R Harris <
>>>>>> charlesr.harris@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 10, 2011 at 1:50 PM, Benjamin Root <ben.root@ou.edu> wrote:
>>>>>>>
>>>>>>>> Came across an odd error while using numpy master.  Note, my system
>>>>>>>> is 32-bits.
>>>>>>>>
>>>>>>>> >>> import numpy as np
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32)) == np.int32
>>>>>>>> False
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int64)) == np.int64
>>>>>>>> True
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float32)) == np.float32
>>>>>>>> True
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.float64)) == np.float64
>>>>>>>> True
>>>>>>>>
>>>>>>>> So, only the summation performed with a np.int32 accumulator results
>>>>>>>> in a type that doesn't match the expected type.  Now, for even more
>>>>>>>> strangeness:
>>>>>>>>
>>>>>>>> >>> type(np.sum([1, 2, 3], dtype=np.int32))
>>>>>>>> <type 'numpy.int32'>
>>>>>>>> >>> hex(id(type(np.sum([1, 2, 3], dtype=np.int32))))
>>>>>>>> '0x9599a0'
>>>>>>>> >>> hex(id(np.int32))
>>>>>>>> '0x959a80'
>>>>>>>>
>>>>>>>> So, the type from the sum() reports itself as a numpy int, but its
>>>>>>>> memory address is different from the memory address for np.int32.
>>>>>>>>
>>>>>>>>
>>>>>>> One of them is probably a long, print out the typecode, dtype.char.
>>>>>>>
>>>>>>> Chuck
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Good intuition, but odd result...
>>>>>>
>>>>>> >>> import numpy as np
>>>>>> >>> a = np.sum([1, 2, 3], dtype=np.int32)
>>>>>> >>> b = np.int32(6)
>>>>>> >>> type(a)
>>>>>> <type 'numpy.int32'>
>>>>>> >>> type(b)
>>>>>> <type 'numpy.int32'>
>>>>>> >>> a.dtype.char
>>>>>> 'i'
>>>>>> >>> b.dtype.char
>>>>>> 'l'
>>>>>>
>>>>>> So, the standard np.int32 is getting listed as a long somehow?  To
>>>>>> further investigate:
>>>>>>
>>>>>>
>>>>> Yes, long shifts around from int32 to int64 depending on the OS. For
>>>>> instance, in 64 bit Windows it's 32 bits while in 64 bit Linux it's 64 bits.
>>>>> On 32 bit systems it is 32 bits.
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>> Right, that makes sense.  But, the question is why does sum() put out a
>>>> result dtype that is not identical to the dtype that I requested, or even
>>>> the dtype of the input array?  Could this be an indication of a bug
>>>> somewhere?  Even if the bug is harmless (it was only noticed within the test
>>>> suite of larry), is this unexpected?
>>>>
>>>>
>>> I expect sum is using a ufunc and it acts differently on account of the
>>> cleanup of the ufunc casting rules. And yes, a long *is* int32 on your
>>> machine. On mine
>>>
>>> In [4]: dtype('q') # long long
>>> Out[4]: dtype('int64')
>>>
>>> In [5]: dtype('l') # long
>>> Out[5]: dtype('int64')
>>>
>>> The mapping from C types to numpy width types isn't 1-1. Personally, I
>>> think we should drop long ;) But it used to be the standard Python type in
>>> the C API. Mark has also pointed out the problems/confusion this ambiguity
>>> causes and someday we should probably think it out and fix it. But I don't
>>> think it is the most pressing problem.
>>>
>>> Chuck
>>>
>>>
>> But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit
>> computer where both are int32?
>>
>>
> Maybe yes, maybe no ;) They have different descriptors, so from numpy's
> perspective they are different, but at the hardware/precision level they are
> the same. It's more of a decision as to what != means in this case. Since
> numpy started as Numeric with only the C types, the current behavior is
> consistent, but that doesn't mean it shouldn't change at some point.
>
> Chuck
>

Well, apparently this changed recently: in NumPy 1.5.1 on a 32-bit Windows
machine, the two are considered equal with '=='.
Personally, I think that if two dtypes both have the string representation
"int32", they should compare equal with '=='. Otherwise it makes little sense,
given that you can directly test a dtype for equality against a string such as
"int32" (e.g. dtype('i') == "int32" and dtype('l') == "int32").
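
For anyone who wants to check how their own build behaves, here is a minimal
sketch. The helper names describe_int_codes and same_width_means_equal are
hypothetical, made up for this illustration and not part of NumPy. It assumes
a NumPy version in which '==' on dtypes compares by equivalence rather than by
descriptor identity. No expected output is shown because the width of 'l'
depends on the platform.

```python
import numpy as np


def describe_int_codes(codes=("i", "l", "q")):
    """Map each C-type character code to (dtype name, size in bytes)."""
    # Hypothetical helper, for illustration only.
    return {c: (np.dtype(c).name, np.dtype(c).itemsize) for c in codes}


def same_width_means_equal(code_a, code_b):
    """True if '==' on the two dtypes agrees with comparing their widths."""
    # Hypothetical helper; only meaningful for signed integer codes.
    a, b = np.dtype(code_a), np.dtype(code_b)
    return (a == b) == (a.itemsize == b.itemsize)


info = describe_int_codes()
for code, (name, size) in info.items():
    # The last column tests the dtype against its own string name.
    print(code, name, size, np.dtype(code) == name)

# Compare dtypes rather than scalar types: this does not depend on
# which C type the result happens to be built from.
total = np.sum([1, 2, 3], dtype=np.int32)
print(total.dtype == np.dtype("int32"))
```

Comparing total.dtype against "int32" sidesteps the type(...) == np.int32
check from the start of the thread, which is the test that depends on the
underlying C type.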

-=- Olivier