[SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy)

Francesc Alted faltet@pytables....
Mon Jan 26 09:32:56 CST 2009


On Monday 26 January 2009, Vicent wrote:
> On Mon, Jan 26, 2009 at 11:22, Francesc Alted <faltet@pytables.org> wrote:
> > Hi, Vicent,
> >
> > Yes.  In general, having arrays of 'object' dtype is a problem in
> > NumPy because you won't be able to reach the high performance that
> > NumPy can usually reach by specifying other dtypes like 'float' or
> > 'int'.  This is because many of the NumPy accelerations are based
> > on two facts:
> >
> > 1. That every element of the array is of equal size (in order to
> > allow high memory performance on common access patterns).
> >
> > 2. That the hardware can perform operations between these elements
> > directly and quickly.
> >
> > On today's architectures, the kinds of elements that satisfy those
> > conditions are mainly these types:
> >
> > boolean, integer, float, complex and fixed-length strings
> >
> > Another kind of array element that can benefit from NumPy's better
> > computational abilities is the compound object made of the above
> > types, commonly referred to as a 'record type'.
> > However, in order to preserve condition 1, these compound objects
> > cannot vary in size from element to element (so your example does
> > not fit here).  Moreover, such record arrays normally lack
> > property 2 for most operations, so they are normally seen more as
> > data containers than as computational objects "per se".
> >
> > So, you have two options here:
> >
> > - If you want to stick with collections of classes with attributes
> > that can be general Python objects, then try to use Python
> > containers for your case.  You will find that, in general, they are
> > better suited for doing most of your desired operations.
> >
> > - If you need extreme computational speed, then you need to change
> > your data schema (and perhaps the way your brain works too) and
> > start to think in terms of homogeneous NumPy arrays as your
> > building blocks.
> >
> > This is why people wanted you to be more explicit in describing
> > your situation: they tried to see whether NumPy arrays could be
> > used as the basic building blocks for your data schema or not.  My
> > advice here is that you first try regular Python containers.
> > If you are not satisfied with speed or memory consumption, then try
> > to restate your problem in terms of arrays and use NumPy to
> > accelerate the computations (and to consume far less memory too).
> >
> > Hope that helps,
>
> Of course it helps!!   :-)    Thanks, Francesc.
>
> That answers my question. I realize the importance of adapting my
> mind and my data structures to NumPy arrays, dtypes, "records" and so
> on.
>
> But, it leads me to another question:
>
> (1) How can I match/join object-oriented programming with the
> array+record NumPy philosophy?
>
> I mean, as far as I understand, what I thought should be defined
> as an object with properties and methods may be better defined as a
> "record dtype" + some functions that operate on that kind of
> record. Right?
>
> So... Isn't it possible to "embed" the second approach into the
> first?? Maybe it makes no sense, but I would like to know it.
>
> [To answer myself: I think I could keep classes for the few "big",
> unique or infrequent kinds of object (those that don't require much
> computation), and use arrays + NumPy-like records for massive
> computations over "grids" or "matrices" of "similar" elements.]

Yeah, you are getting the idea.  It is common sense to use general 
Python machinery for building the skeleton of your application, and 
then, when you want to accelerate the parts of the code that take most 
of the runtime, that is where NumPy/SciPy can come into play.
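
A minimal sketch of that split, under assumptions of my own: the `Cell`
class, the `cell_dtype` record layout and the helper functions below are
hypothetical names made up for illustration, not anything from your code.
The skeleton works with ordinary Python objects, and only the hot part
converts them to a fixed-size record array and operates on it with NumPy.

```python
import numpy as np

# Hypothetical record layout: every element has the same fixed-size fields.
cell_dtype = np.dtype([("x", np.float64), ("y", np.float64), ("mass", np.float64)])


class Cell:
    """Plain Python object, convenient for the application's skeleton."""

    def __init__(self, x, y, mass):
        self.x, self.y, self.mass = x, y, mass


def total_mass_objects(cells):
    # Pure-Python version: one attribute lookup per object.
    return sum(c.mass for c in cells)


def to_records(cells):
    # Copy the numeric attributes into a homogeneous record array.
    records = np.empty(len(cells), dtype=cell_dtype)
    for i, c in enumerate(cells):
        records[i] = (c.x, c.y, c.mass)
    return records


def total_mass_records(records):
    # NumPy version: a single vectorized reduction over one field.
    return records["mass"].sum()


cells = [Cell(float(i), 2.0 * i, 0.5) for i in range(1000)]
records = to_records(cells)

print(total_mass_objects(cells), total_mass_records(records))
print(records.dtype.itemsize)  # bytes per record, identical for every element
```

Whether the conversion pays off depends on how often the hot part runs
compared with how often the objects change, so it is worth measuring first.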

> (2) Just to be sure: An array can be assigned to a property of an
> object, can't it?

David has already answered this: there is no problem doing that.
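
For the record, a small sketch of what that can look like.  The `Grid`
class and its methods are hypothetical, only meant to show an array
stored as an ordinary attribute, with the methods delegating the
computation to NumPy.

```python
import numpy as np


class Grid:
    """Ordinary Python class that keeps its bulk data in a NumPy array."""

    def __init__(self, rows, cols):
        # The array is a regular attribute, like any other Python object.
        self.values = np.zeros((rows, cols), dtype=np.float64)

    def scale(self, factor):
        # In-place, vectorized update of every element.
        self.values *= factor

    def mean(self):
        return float(self.values.mean())


grid = Grid(3, 4)
grid.values[:] = 2.0
grid.scale(1.5)
print(grid.mean())
```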

> Sorry if I'm being too general again!

Don't be afraid to ask, as many people here are really willing to help.  
If we need more concrete details, we will ask you for them.

Bye!

-- 
Francesc Alted

