[SciPy-user] NumPy arrays of Python objects (it was Re: How to start with SciPy and NumPy)

Vicent vginer@gmail....
Mon Jan 26 05:45:02 CST 2009


On Mon, Jan 26, 2009 at 11:22, Francesc Alted <faltet@pytables.org> wrote:

> Ei, Vicent,
>
> Yes.  In general, having arrays of 'object' dtype is a problem in NumPy
> because you won't get the high performance that NumPy usually
> delivers with other dtypes like 'float' or 'int'.  This
> is because many of the NumPy accelerations rely on two facts:
>
> 1. That every element of the array is of equal size (in order to allow
> high memory performance on common access patterns).
>
> 2. That operations between these elements map onto hardware that can
> perform them fast.
>
> On today's architectures, the kinds of elements that satisfy those
> conditions are mainly these types:
>
> boolean, integer, float, complex and fixed-length strings
>
> Another kind of array element that can benefit from NumPy's better
> computational abilities is a compound object made of the above
> types, commonly referred to as a 'record type'.  However, in order
> to preserve condition 1, these compound objects cannot vary in size
> from element to element (so your example does not fit here).  Also,
> such record arrays normally lack property 2 for most operations,
> so they are normally seen more as data containers than as
> computational objects per se.
>
> So, you have two options here:
>
> - If you want to stick with collections of classes whose attributes
> can be general Python objects, then try to use Python containers for
> your case.  You will find that, in general, they are better suited for
> most of the operations you want to do.
>
> - If you need extreme computational speed, then you need to change your
> data schema (and perhaps the way your brain works too) and start to
> think in terms of homogeneous NumPy array objects as your building
> blocks.
>
> This is why people wanted you to be more explicit in describing your
> situation: they were trying to see whether NumPy arrays could be used as
> the basic building blocks for your data schema or not.  My advice here is
> that you first try regular Python containers.  If you are not
> satisfied with speed or memory consumption, then try to restate your
> problem in terms of arrays and use NumPy to accelerate them (and to
> consume far less memory too).
>
> Hope that helps,
>


Of course it helps!!   :-)    Thanks, Francesc.

That solves my question. I realize the importance of adapting my mind and
my data structures to NumPy arrays, dtypes, "records" and so on.
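
For example, just to check that I got the main point about the 'object'
dtype, I think the difference is roughly this (a toy sketch I made up; the
values mean nothing):

    import numpy as np

    a_obj = np.array([1.0, 2.5, 3.0], dtype=object)     # every element is a Python object
    a_f8 = np.array([1.0, 2.5, 3.0], dtype='float64')   # fixed-size, contiguous elements

    # Both accept the same syntax, but only the 'float64' version can use
    # the fast type-specialised loops (conditions 1 and 2 above); the
    # 'object' one falls back to Python-level operations, element by element.
    print(a_obj * 2)
    print(a_f8 * 2)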

But this leads me to another question:

(1) How can I reconcile object-oriented programming with the array+record
NumPy philosophy?

I mean, as far as I understand it, what I thought should be defined as an
object with properties and methods may be better defined as a "record
dtype" + some functions that operate on that kind of record. Right?

So... isn't it possible to "embed" the second approach into the first?
Maybe it makes no sense, but I would like to know.

[I answer myself: I think I could keep classes for the few "big", unique
or infrequently used objects (the ones that don't require much
computation), and use arrays + NumPy-like records for massive computations
over "grids" or "matrices" of "similar" elements.]

(2) Just to be sure: An array can be assigned to a property of an object,
can't it?
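
I mean something like this (the class and attribute names are invented,
just to be sure I understand it; it is also more or less the "mixed"
approach I described above):

    import numpy as np

    class Grid(object):
        # A "big", infrequent object kept as a normal Python class,
        # holding a NumPy array as one of its attributes for the heavy work.
        def __init__(self, n):
            self.name = "my grid"                            # ordinary Python attribute
            self.values = np.zeros((n, n), dtype='float64')  # array assigned to an attribute

        def total(self):
            return self.values.sum()                         # vectorised computation

    g = Grid(100)
    g.values[0, 0] = 1.0
    print(g.total())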

Sorry if I'm being too general again!

In fact, I know that some of my colleagues don't work with objects, but just
with "structs" or "records" and functions that directly manipulate those
"records". They work with C++ and Delphi, by the way.

Thank you in advance for your answers.

--
Vicent

