[Numpy-discussion] Problem with concatenate and object arrays

Christopher Barker Chris.Barker at noaa.gov
Thu Sep 14 11:29:02 CDT 2006


Charles R Harris wrote:
>> > Why not simply
>> > write a wrapper function in python that does Numeric-style guesswork,
>> > and put it in the compatibility modules? 

>> Can I encourage any more comments? 

+1

> The main problem in constructing arrays
> of objects is more information needs to be supplied because the user's
> intention can't be reliably deduced from the current syntax.

I wrote about this a bit earlier in this conversation, and as I think 
about it, I'm not sure it's possible -- you could specify a rank, or a 
shape, but in general, there wouldn't be a unique way to translate a 
given hierarchy of sequences into a particular shape: imagine four 
levels of nested lists, asked to turn into a rank-3 array.

This is why it may be best to simply recommend that people create an 
empty array of the shape they need, then put the objects into it - it's 
the only way to construct what you need reliably.
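
As a rough sketch of that recommendation (the nested-list data here is made 
up for illustration, and dtype=object is used because later numpy releases 
dropped the numpy.object alias):

```python
import numpy as np

# Nested lists whose intended shape is ambiguous: a (2, 2) array holding
# lists, or a (2, 2, 2) array holding ints?
data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Letting numpy guess: it descends through every level it can.
guessed = np.array(data, dtype=object)

# Stating the shape explicitly, then putting the objects in one by one.
arr = np.empty((2, 2), dtype=object)
for i in range(2):
    for j in range(2):
        arr[i, j] = data[i][j]  # each element is an ordinary Python list
```

Here guessed ends up rank-3, while arr has exactly the shape that was 
asked for, with the innermost lists kept as elements.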

However, an object array constructor that takes a rank as an argument 
might well work for most cases, as long as there is a clearly documented 
and consistent way to handle extra levels of sequences: perhaps specify 
that any extra levels of nesting always go to the last dimension (or the 
first). That being said, it's still dangerous -- what levels of nesting 
are allowed would depend on which sequences *happen* to be the same 
size. Also the code would be a pain to write!
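
For what it's worth, a minimal sketch of such a constructor might look like 
the following. The name object_array and its signature are hypothetical 
(nothing like it exists in numpy), and it assumes one particular convention: 
the first `rank` levels of nesting become dimensions, and anything nested 
deeper is stored as-is in the elements. It refuses ragged input rather than 
guessing.

```python
import numpy as np

def object_array(seq, rank):
    """Hypothetical helper: build a rank-`rank` object array from nested sequences.

    Only the outer `rank` levels of nesting become array dimensions;
    deeper levels are left untouched inside the elements.
    """
    # Take the shape from the first item at each of the outer `rank` levels.
    shape = []
    level = seq
    for _ in range(rank):
        if len(level) == 0:
            raise ValueError("empty sequences are not handled in this sketch")
        shape.append(len(level))
        level = level[0]

    out = np.empty(shape, dtype=object)
    for idx in np.ndindex(*shape):
        item = seq
        for depth, i in enumerate(idx):
            # Every sub-sequence at a given depth must have the same length.
            if len(item) != shape[depth]:
                raise ValueError("sequences at depth %d differ in length" % depth)
            item = item[i]
        out[idx] = item
    return out

# Four levels of nesting, asked to turn into a rank-3 array.
nested = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]
a3 = object_array(nested, 3)   # shape (1, 2, 2), elements are lists
a2 = object_array(nested, 2)   # shape (1, 2), elements are lists of lists
```

Even in this small form, the result depends on which sequences happen to 
have matching lengths, which is the danger described above.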

I wonder how often people need to use object arrays when they don't 
know when writing the code what shape they need?


This is making me think that maybe all we really need is a little 
syntactic sugar for creating empty object arrays:

numpy.ObjectArray(shape)

Not much different than:

numpy.empty(shape, dtype=numpy.object)

but a little cleaner and more obvious to new users that are primarily 
interested in object arrays -- analogous to ones() and zeros()
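
To make the idea concrete, such sugar could be as small as the wrapper 
below. ObjectArray is only the name proposed here, not an existing numpy 
function, and the sketch spells the dtype as the builtin object.

```python
import numpy as np

def ObjectArray(shape):
    """Hypothetical convenience wrapper: an empty object array of `shape`."""
    return np.empty(shape, dtype=object)

a = ObjectArray((2, 3))
a[0, 0] = {"any": "Python object"}   # elements can hold arbitrary objects
a[1, 2] = [1, 2, 3]                  # including sequences, stored whole
```

Elements that have not been assigned yet are None.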

> That said, I
> have no idea how widespread the use of object arrays is and so don't know
> how much it really matters. 

If we ever get nd-arrays into the standard lib (or want to see wider use 
of them in any case), I think that object arrays are critical. Right 
now, people think they don't have a use for numpy if they aren't doing 
serious number crunching -- it's seen mostly as a way to speed up 
computations on lots of numbers. However, I think nd-arrays have LOTS of 
other applications, for anything where the data fits well into a 
"rectangular" data structure. n-d slicing is a wonderful thing! As numpy 
gets wider use, object arrays will be a very big draw.

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer
                                     		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



