[Numpy-discussion] comparison operators (e.g. ==) on array with dtype object do not work
josef.pktd@gmai...
Thu Jan 14 17:40:20 CST 2010
On Thu, Jan 14, 2010 at 5:49 PM, Warren Weckesser
<warren.weckesser@enthought.com> wrote:
> Yaroslav Halchenko wrote:
>> Dear NumPy People,
>>
>> First I want to apologize if I misbehaved on NumPy Trac by reopening the
>> closed ticket
>> http://projects.scipy.org/numpy/ticket/1362
>> but I still feel strongly that there is misunderstanding
>> and the bug/defect is valid. I would appreciate if someone would waste
>> more of his time to persuade me that I am wrong but please first read
>> till the end:
>>
>> The issue, as originally reported, is demonstrated with:
>>
>> ,---
>> | > python -c 'import numpy as N; print N.__version__; a=N.array([1, (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)'
>> | 1.5.0.dev
>> | [ True False]
>> | [False False] True
>> `---
>>
>> whereas I expected the last line to be
>>
>> [False True] True
>>
>> charris (thanks for all the efforts to enlighten me) summarized it as
>>
>> """the result was correct given that the tuple (0,1) was converted to an
>> object array with elements 0 and 1. It is *not* converted to an array
>> containing a tuple. """
>>
>> and I was trying to argue that it is not the case in my example. It is
>> the case in charris's example though whenever both elements are of
>> the same length, or there is just a single tuple, i.e.
>>
>>
>
> The "problem" is that the tuple is converted to an array in the
> statement that
> does the comparison, not in the construction of the array. Numpy attempts
> to convert the right hand side of the == operator into an array. It
> then does
> the comparison using the two arrays.
>
> One way to get what you want is to create your own array and then do
> the comparison:
>
> In [1]: import numpy as np
>
> In [2]: a = np.array([1, (0,1)], dtype='O')
>
> In [3]: t = np.empty(1, dtype='O')
>
> In [4]: t[0] = (0,1)
>
> In [5]: a == t
> Out[5]: array([False, True], dtype=bool)
>
>
> In the above code, a numpy array 't' of objects with shape (1,) is created,
> and the single element is assigned the value (0,1). Then the comparison
> works as expected.
>
> More food for thought:
>
> In [6]: b = np.array([1, (0,1), "foo"], dtype='O')
>
> In [7]: b == 1
> Out[7]: array([ True, False, False], dtype=bool)
>
> In [8]: b == (0,1)
> Out[8]: False
>
> In [9]: b == "foo"
> Out[9]: array([False, False, True], dtype=bool)
>
It looks difficult to construct an object array with only one element
that is itself a tuple, since the tuple's items are interpreted as
separate array elements.
>>> N.array([(0,1)],dtype=object).shape
(1, 2)
>>> N.array([(0,1),()],dtype=object).shape
(2,)
>>> c = N.array([(0,1),()],dtype=object)[:1]
>>> c.shape
(1,)
>>> a == c
array([False, True], dtype=bool)
It looks like some convention is necessary for interpreting a tuple in
the array construction, but this doesn't look like a problem with the
comparison operator, just a consequence of that convention.
Josef
> Warren
>
>> ,---
>> | In [1]: array((0,1), dtype=object)
>> | Out[1]: array([0, 1], dtype=object)
>> |
>> | In [2]: array((0,1), dtype=object).shape
>> | Out[2]: (2,)
>> `---
>>
>> There I would not expect my comparison to be valid indeed. But let's see what
>> happens in my case:
>>
>> ,---
>> | In [2]: array([1, (0,1)],dtype=object)
>> | Out[2]: array([1, (0, 1)], dtype=object)
>> |
>> | *In [3]: array([1, (0,1)],dtype=object).shape
>> | Out[3]: (2,)
>> |
>> | *In [4]: array([1, (0,1)],dtype=object)[1].shape
>> | ---------------------------------------------------------------------------
>> | AttributeError Traceback (most recent call
>> | last)
>> |
>> | /home/yoh/proj/<ipython console> in <module>()
>> |
>> | AttributeError: 'tuple' object has no attribute 'shape'
>> `---
>>
>> So, as far as I see it, the array does contain an object of type tuple,
>> which does not get correctly compared upon __eq__ operation. Am I
>> wrong? Or does numpy internally somehow convert the 1st item (i.e. the
>> tuple) into an array, but cast it back to a tuple upon __repr__ or
>> __getitem__?
>>
>> Thanks in advance for feedback
>>
>> On Thu, 14 Jan 2010, NumPy Trac wrote:
>>
>>
>>> #1362: comparison operators (e.g. ==) on array with dtype object do not work
>>> -------------------------+--------------------------------------------------
>>> Reporter: yarikoptic | Owner: somebody
>>> Type: defect | Status: closed
>>> Priority: normal | Milestone:
>>> Component: Other | Version:
>>> Resolution: invalid | Keywords:
>>> -------------------------+--------------------------------------------------
>>> Changes (by charris):
>>>
>>
>>
>>> * status: reopened => closed
>>> * resolution: => invalid
>>>
>>
>>
>>
>>> Old description:
>>>
>>
>>
>>>> You can see this better with the '*' operator:
>>>>
>>
>>
>>
>>>> {{{
>>>> In [8]: a * (0,2)
>>>> Out[8]: array([0, (0, 1, 0, 1)], dtype=object)
>>>> }}}
>>>>
>>
>>
>>
>>>> Note how the tuple is concatenated with itself. The reason the original
>>>> instance of a worked was that 1 and (0,1) are of different lengths, so
>>>> the descent into the nested sequence types stopped at one level and a
>>>> tuple is one of the elements. When you do something like ((0,1),(0,1))
>>>> the descent goes down two levels and you end up with a 2x2 array of
>>>> integer objects. The rule of thumb for object arrays is that you get an
>>>> array with as many indices as possible. Which is why object arrays are
>>>> hard to create. Another example:
>>>>
>>
>>
>>
>>>> {{{
>>>> In [10]: array([(1,2,3),(1,2)], dtype=object)
>>>> Out[10]: array([(1, 2, 3), (1, 2)], dtype=object)
>>>>
>>
>>
>>>> In [11]: array([(1,2),(1,2)], dtype=object)
>>>> Out[11]:
>>>> array([[1, 2],
>>>> [1, 2]], dtype=object)
>>>> }}}
>>>>
>>
>>
>>> New description:
>>>
>>
>>
>>> {{{
>>> python -c 'import numpy as N; print N.__version__; a=N.array([1,
>>> (0,1)],dtype=object); print a==1; print a == (0,1), a[1] == (0,1)'
>>> }}}
>>> results in
>>> {{{
>>> 1.5.0.dev
>>> [ True False]
>>> [False False] True
>>> }}}
>>> I expected last line to be
>>> {{{
>>> [False True] True
>>> }}}
>>> So, it works for int but doesn't work for tuple... I guess it doesn't try
>>> to compare element by element but does something else.
>>>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>