[Numpy-discussion] dtype=object behavior change from 0.9.6 to beta 1

Tom Denniston tom.denniston at alum.dartmouth.org
Thu Aug 31 12:00:06 CDT 2006

But I have heterogeneous arrays that contain numbers, strings, NoneType, etc.

Take for instance:

In [11]: numpy.array([numpy.array([1,'A', None]),
numpy.array([2,2,'Some string'])], dtype=object)
array([[1, A, None],
       [2, 2, Some string]], dtype=object)

In [12]: numpy.array([numpy.array([1,'A', None]),
numpy.array([2,2,'Some string'])], dtype=object).shape
Out[12]: (2, 3)

This works fine in Numeric and pre-beta numpy, but in the beta numpy versions I get:

In [6]: numpy.array([numpy.array([1,'A', None]),
numpy.array([2,2,'Some string'])], dtype=object)
Out[6]: array([[1 A None], [2 2 Some string]], dtype=object)

In [7]: numpy.array([numpy.array([1,'A', None]),
numpy.array([2,2,'Some string'])], dtype=object).shape
Out[7]: (2,)

But a list of lists still gives:

In [9]: numpy.array([[1,'A', None], [2,2,'Some string']], dtype=object).shape
Out[9]: (2, 3)

And if you omit the dtype and use a list of arrays, then you get a
string array of shape (2, 3):
In [11]: numpy.array([numpy.array([1,'A', None]),
numpy.array([2,2,'Some string'])]).shape
Out[11]: (2, 3)

This new behavior strikes me as inconsistent.  One of the things I
love about numpy is that the ufuncs, array constructors, etc. don't care
whether you pass a combination of lists, arrays, tuples, etc.
They just know what you _mean_.  And what you _mean_ by a list of
lists, tuple of arrays, list of arrays, etc. is very consistent across
constructors and functions.  I think making an exception for
dtype=object introduces a lot of inconsistencies, and it isn't clear to
me what is gained.  Does anyone commonly use the array interface in a
manner for which this new behavior is actually favorable?  I may be
overlooking a common use case or something like that.
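
For what it's worth, one way to get the (2, 3) object array regardless of
how a given numpy version interprets a list of arrays is to allocate the
result explicitly and fill it element by element.  A minimal sketch,
assuming all rows have the same length; stack_as_objects is a hypothetical
helper name, not a numpy function:

```python
import numpy as np

def stack_as_objects(rows):
    """Hypothetical helper: build a 2-D object array from equal-length sequences."""
    rows = list(rows)
    n_cols = len(rows[0]) if rows else 0
    # Allocate the target shape up front so numpy never has to guess
    # how deep to look into the nested sequences.
    out = np.empty((len(rows), n_cols), dtype=object)
    for i, row in enumerate(rows):
        if len(row) != n_cols:
            raise ValueError("all rows must have the same length")
        for j, item in enumerate(row):
            # Full integer indexing stores the item itself as one object.
            out[i, j] = item
    return out

rows = [np.array([1, 'A', None], dtype=object),
        np.array([2, 2, 'Some string'], dtype=object)]
table = stack_as_objects(rows)
print(table.shape)
```

The same helper accepts lists or tuples as rows, since it only relies on
len() and iteration.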

On 8/31/06, Charles R Harris <charlesr.harris at gmail.com> wrote:
> On 8/31/06, Tom Denniston
> <tom.denniston at alum.dartmouth.org> wrote:
> >
> > For this simple example yes, but one of the nice things about the array
> > constructors is that they know that lists, tuples and arrays are just
> > sequences and any combination of them is valid numpy input.  So for instance
> > a list of tuples yields a 2d array.  A list of tuples of 1d arrays yields a
> > 3d array.  A list of 1d arrays yields a 2d array.  This was the case
> > consistently across all dtypes.  Now it is the case across all of them
> > except for dtype=object, which has this unusual behavior.  I agree that
> > vstack is a better choice when you know you have a list of arrays, but it is
> > less useful when you don't know, and you can't force a type in the vstack
> > function, so there is no longer an equivalent to the dtype=object behavior:
> >
> > In [7]: numpy.array([numpy.array([1,2,3]), numpy.array([4,5,6])],
> > dtype=object)
> > Out[7]:
> > array([[1, 2, 3],
> >        [4, 5, 6]], dtype=object)
> What are you trying to do? If you want integers:
> In [32]: a = array([array([1,2,3]), array([4,5,6])], dtype=int)
> In [33]: a.shape
> Out[33]: (2, 3)
> If you want objects, you have them:
> In [30]: a = array([array([1,2,3]), array([4,5,6])], dtype=object)
> In [31]: a.shape
> Out[31]: (2,)
> i.e., a is an array containing two array objects.
> Chuck
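
On the vstack point in the quoted text: when the inputs are plain numeric
arrays, stacking first and converting afterwards gives a (2, 3) object
array.  A sketch, assuming the rows share a length; note that vstack applies
its usual type promotion before the conversion, so rows that are not already
object arrays may be coerced to a common type first:

```python
import numpy as np

rows = [np.array([1, 2, 3]), np.array([4, 5, 6])]

# Stack into a 2-D integer array, then convert the element type to object.
stacked = np.vstack(rows).astype(object)
print(stacked.shape, stacked.dtype)
```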