[Numpy-discussion] Questions about the array interface.
oliphant at ee.byu.edu
Wed Apr 6 16:59:05 CDT 2005
Chris Barker wrote:
> Hi all, (but mostly Travis),
> I've taken a look at:
> to try and see how I would use this with wxPython. I have a few
> questions, and a little code I'd like you to look at to see if I
> understand how this works.
> Here's a first stab on how I might use this for the wxPython
> DrawPointsList method. The method takes a sequence of length-2
> sequences of numbers, and draws a point at each point described by
> coordinates in the data:
> [(x,y), (x2,y2), (x3,y3), ...] (or a NX2 NumPy array of Ints)
> Here's what I have:
> def DrawPointList(self, points, pens=None):
>     # some checking code on the pens
>     if (hasattr(points, '__array_shape__') and
>         hasattr(points, '__array_typestr__') and
>         len(points.__array_shape__) == 2 and
>         points.__array_shape__[1] == 2 and
>         points.__array_typestr__ == 'i4'
>         ):  # this means we have a compliant array
>         # return the array protocol version
You should account for the '<' or '>' that might be present in
__array_typestr__ (Numeric won't put it there, but scipy.base and
numarray will---since they can have byteswapped arrays internally).
A more generic interface would handle multiple integer types if possible
(but this is a good start...)
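A minimal sketch of a check that tolerates the optional '<' or '>' prefix and accepts several integer widths. The helper name parse_int_typestr is hypothetical and is not part of the array interface; it only illustrates the point above.

```python
import re

# Hypothetical helper: optional byte-order prefix, then 'i', then a byte width.
_INT_TYPESTR = re.compile(r'^([<>]?)i(\d+)$')


def parse_int_typestr(typestr):
    """Return (byteorder, itemsize) for a signed-integer typestr, else None.

    byteorder is '<', '>' or '' (no prefix, as Numeric would report it).
    """
    match = _INT_TYPESTR.match(typestr)
    if match is None:
        return None
    byteorder, size = match.groups()
    return byteorder, int(size)
```

A consumer that only handles native 4-byte integers could then test that the returned itemsize is 4 and that the byte order is either absent or matches the machine.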
>         return self._DrawPointArray(points.__array_data__, pens,)
>         #This needs to be written now!
>     #return the generic python sequence version
>     return self._DrawPointList(points, pens, )
> Then we'll need a function (in C++):
> _DrawPointArray(points.__array_data__, pens,)
> That takes a buffer object, and does the drawing.
> My questions:
> 1) Is this what you had in mind for how to use this?
Yes, pretty much.
> 2) As __array_strides__ is optional, I'd kind of like to have a
> __contiguous__ flag that I could just check, rather than checking for
> the existence of strides, then calculating what the strides should be,
> then checking them.
I don't want to add too much. The other approach is to establish a set
of helper functions in Python to check this sort of thing. Thus, if
you can't handle a general array, you check:
where obj exports the array interface.
But, it could really go either way. What do others think?
I think one idea here is that if __array_strides__ returns None, then
C-style contiguousness is assumed. In fact, I like that idea so much
that I just changed the interface. Thanks for the suggestion.
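A rough sketch of what a contiguity helper could look like under the rule that strides of None mean C-style contiguous. The function names and the itemsize argument are assumptions made for illustration, not helpers defined by the interface.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered (row-major) array with this shape."""
    strides = []
    step = itemsize
    # Walk the dimensions from last to first; the last axis varies fastest.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(obj, itemsize):
    """True if obj reports no strides, or strides equal to the C-ordered ones."""
    strides = getattr(obj, '__array_strides__', None)
    if strides is None:
        # Missing or None strides are taken to mean C-style contiguous.
        return True
    return tuple(strides) == c_contiguous_strides(obj.__array_shape__, itemsize)
```

The comparison is deliberately strict: an array with a length-1 axis and an unusual stride on that axis would be reported as non-contiguous even though its memory is in fact contiguous.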
> 3) A number of the attributes are optional, but will always be there
> with SciPy arrays..(I assume) have you documented them anywhere?
No, they won't always be there for SciPy arrays (currently 4 of them
are). Only record-arrays will provide __array_descr__ for example and
__array_offset__ is unnecessary for SciPy arrays. I actually don't much
like the __array_offset__ parameter myself, but Scott convinced me that
it could be useful for very complicated array classes.
> 4) a wxWidgets wxPoint is defined as such:
> class WXDLLEXPORT wxPoint
> int x, y;
> As wxWidgets is using "int", I'd like to be able to use "int". If I
> define it as a 4 byte integer, I'm losing platform independence,
> aren't I? Or can I use something like sizeof(int) ?
Ah, yes.. here is where we need some standard Python functions to help
establish the array interface. Sometimes you want to match a
particular c-type, other times you want to match a particular bit
width. So, what do you do? I had considered having an additional
interface called ctypestr but decided against it for fear of creep. I
think in general we need to have in Python some constants to make this
easier, e.g. ndarray.cint (gives 'iX' on the correct platform).
For now, I would check (__array_typestr__ == 'i%d' %
On most platforms these days an int is 4 bytes, but the above would
be just to make sure.
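One way to build the typestr that matches a C int on the current platform is sketched below. It assumes the standard struct module, whose native 'i' format has the size of a C int; the function name c_int_typestr and its with_byteorder flag are hypothetical.

```python
import struct
import sys


def c_int_typestr(with_byteorder=False):
    """Typestr for a native C int, e.g. 'i4' where sizeof(int) == 4."""
    size = struct.calcsize('i')  # native mode: same size as a C int
    typestr = 'i%d' % size
    if with_byteorder:
        # Prefix with the machine's byte order for producers that report it.
        prefix = '<' if sys.byteorder == 'little' else '>'
        typestr = prefix + typestr
    return typestr
```

A consumer could compare __array_typestr__ against both forms, since some producers include the byte-order character and others do not.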
> 5) Why is: __array_data__ optional? Isn't that the whole point of this?
Because the object itself might expose the buffer interface. We could
make __array_data__ required and prefer that it return a buffer object.
But, really all that is needed is something that exposes the buffer
interface: remember the difference between the buffer object and the
buffer interface. So, the correct consumer usage for grabbing the data is
data = getattr(obj, '__array_data__', obj)
Then, in C you use the Buffer *Protocol* to get a pointer to memory.
For example, the function:
int PyObject_AsReadBuffer(PyObject *obj, const void **buffer, int
Of course this approach has the 32-bit limit until we get this changed.
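The same consumer pattern can be sketched at the Python level. This assumes a modern Python, where memoryview is the way to reach the buffer protocol from Python code; it postdates this thread. The Points class is a hypothetical exporter used only for the demonstration.

```python
import array


def as_readable_buffer(obj):
    """Return a memoryview over the object's data.

    Prefer __array_data__ when present; otherwise assume the object
    itself exposes the buffer interface.
    """
    data = getattr(obj, '__array_data__', obj)
    return memoryview(data)


class Points:
    """Hypothetical N x 2 exporter that keeps its data in an array.array."""

    def __init__(self, values):
        self._storage = array.array('i', values)
        self.__array_shape__ = (len(values) // 2, 2)
        self.__array_typestr__ = 'i%d' % self._storage.itemsize
        self.__array_data__ = self._storage
```

Calling as_readable_buffer(Points([1, 2, 3, 4])) goes through __array_data__, while as_readable_buffer(bytearray(b'abcd')) falls back to the object itself.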
> 6) Should __array_offset__ be optional? I'd rather it were required,
> but default to zero. This way I have to check for it, then use it.
> Also, I assume it is an integer number of bytes, is that right?
A consumer has to check for most of the optional stuff if they want to
support all types of arrays.
Again a simple:
getattr(obj, '__array_offset__', 0)
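The getattr defaults mentioned in this thread can be collected in one place. This is only a sketch: the function name read_interface is hypothetical, and it assumes that __array_shape__ and __array_typestr__ are the attributes a consumer can rely on being present.

```python
def read_interface(obj):
    """Gather array-interface attributes, filling in defaults for optional ones."""
    return {
        # Assumed to be always present on a compliant object.
        'shape': tuple(obj.__array_shape__),
        'typestr': obj.__array_typestr__,
        # Optional: fall back to the object itself as the buffer exporter.
        'data': getattr(obj, '__array_data__', obj),
        # Optional: None is taken to mean C-style contiguous.
        'strides': getattr(obj, '__array_strides__', None),
        # Optional: byte offset into the data, defaulting to zero.
        'offset': getattr(obj, '__array_offset__', 0),
    }
```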
> 7) An alternative to the above: A __simple_ flag, that means the data
> is a simple, C array of contiguous data of a single type. The most
> common use, and it would be nice to just check that flag and not have
> to take all other options into account.
I think if __array_strides__ returns None (and if an object doesn't
expose it you can assume it) it is probably good enough.