[Numpy-discussion] Questions about the array interface.

Travis Oliphant oliphant at ee.byu.edu
Wed Apr 6 16:59:05 CDT 2005


Chris Barker wrote:

> Hi all, (but mostly Travis),
>
> I've taken a look at:
>
> http://numeric.scipy.org/array_interface.html
>
> to try and see how I would use this with wxPython. I have a few 
> questions, and a little code I'd like you to look at to see if I 
> understand how this works.


Great, fantastic!!!

>
> Here's a first stab on how I might use this for the wxPython 
> DrawPointsList method. The method takes a sequence of length-2 
> sequences of numbers, and draws a point at each point described by 
> coordinates in the data:
>
> [(x,y), (x2,y2), (x3,y3), ...] (or a NX2 NumPy array of Ints)
>
> Here's what I have:
>
>     def DrawPointList(self, points, pens=None):
>         ...
>         # some checking code on the pens
>         ...
>         if (hasattr(points,'__array_shape__') and
>                 hasattr(points,'__array_typestr__') and
>                 len(points.__array_shape__) == 2 and
>                 points.__array_shape__[1] == 2 and
>                 points.__array_typestr__ == 'i4'
>                 ): # this means we have a compliant array
>            # return the array protocol version



You should account for the '<' or '>' that might be present in 
__array_typestr__   (Numeric won't put it there, but scipy.base and 
numarray will---since they can have byteswapped arrays internally).  

A more generic interface would handle multiple integer types if possible 
(but this is a good start...)
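As a rough sketch of the point above (not part of the interface itself), a consumer could split the typestr into its pieces before comparing. The helper names parse_typestr and is_native_int are made up for this illustration, and treating a missing byte-order character as "native" is an assumption based on Numeric's behaviour described above.

```python
import sys

# Byte-order character that matches the machine running this code.
NATIVE = '<' if sys.byteorder == 'little' else '>'


def parse_typestr(typestr):
    """Split a typestr such as '<i4' into (byteorder, kind, itemsize).

    A missing byte-order prefix is reported as '|' here, meaning
    "not specified" (an assumption of this sketch).
    """
    byteorder = '|'
    if typestr and typestr[0] in '<>|':
        byteorder, typestr = typestr[0], typestr[1:]
    return byteorder, typestr[0], int(typestr[1:])


def is_native_int(typestr, sizes=(1, 2, 4, 8)):
    """True for signed integers of an accepted size in native byte order."""
    byteorder, kind, itemsize = parse_typestr(typestr)
    return kind == 'i' and itemsize in sizes and byteorder in ('|', NATIVE)
```

With something like this, the test on __array_typestr__ accepts 'i4' with or without a matching byte-order prefix and rejects byteswapped data, which would need a copy or a swap before drawing.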


>            return self._DrawPointArray(points.__array_data__, pens,[])
>                    #This needs to be written now!
>         else:
>             #return the generic python sequence version
>             return self._DrawPointList(points, pens, [])
>
> Then we'll need a function (in C++):
>  _DrawPointArray(points.__array_data__, pens,[])
> That takes a buffer object, and does the drawing.
>
> My questions:
>
> 1) Is this what you had in mind for how to use this?


Yes, pretty much.

>
> 2) As __array_strides__ is optional, I'd kind of like to have a 
> __contiguous__ flag that I could just check, rather than checking for 
> the existence of strides, then calculating what the strides should be, 
> then checking them.


I don't want to add too much.  The other approach is to establish a set 
of helper functions in Python to check this sort of thing.  Thus, if 
you can't handle a general array, you check:

ndarray.iscontiguous(obj) 

where obj exports the array interface.

But, it could really go either way.   What do others think?

I think one idea here is that if __array_strides__ returns None, then 
C-style contiguousness is assumed.   In fact, I like that idea so much 
that I just changed the interface.  Thanks for the suggestion.
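A helper of the kind mentioned above might look roughly like the following. This is only a sketch: the function names are invented here, the item size is passed in by the caller, and arrays with zero-length or length-1 dimensions (where other strides can also be contiguous) are not given special treatment.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered (row-major) array with this shape."""
    strides = []
    step = itemsize
    # The last axis varies fastest, so walk the shape from the end.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_c_contiguous(obj, itemsize):
    """Check an object exporting __array_shape__ for C-style contiguity."""
    strides = getattr(obj, '__array_strides__', None)
    if strides is None:
        # Missing or None strides mean C-style contiguous by convention.
        return True
    return tuple(strides) == c_contiguous_strides(obj.__array_shape__, itemsize)
```

For an N x 2 array of 4-byte integers the expected strides are (8, 4), so anything else reported in __array_strides__ would send the consumer down the generic path.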

>
> 3) A number of the attributes are optional, but will always be there 
> with SciPy arrays..(I assume) have you documented them anywhere?


No, they won't always be there for SciPy arrays (currently 4 of them 
are).  Only record-arrays will provide __array_descr__ for example and 
__array_offset__ is unnecessary for SciPy arrays.  I actually don't much 
like the __array_offset__  parameter myself, but Scott convinced me that 
it could be useful for very complicated array classes. 

>
> 4) a wxWidgets wxPoint is defined as such:
>
> class WXDLLEXPORT wxPoint
> {
> public:
>     int x, y;
>
> etc.
>
> As wxWidgets is using "int", I'd like to be able to use "int". If I 
> define it as a 4 byte integer, I'm losing platform independence, 
> aren't I? Or can I use something like sizeof(int) ?


Ah, yes.. here is where we need some standard Python functions to help 
establish the array interface.   Sometimes you want to match a 
particular c-type, other times you want to match a particular bit 
width.  So, what do you do?  I had considered having an additional 
interface called ctypestr but decided against it for fear of creep.   I 
think in general we need to have in Python some constants to make this 
conversion easy

e.g.  ndarray.cint  (gives 'iX' on the correct platform). 

For now, I would check (__array_typestr__ == 'i%d' % 
array.array('i',[0]).itemsize)

On most platforms these days an int is 4 bytes, but the above check 
makes sure.
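Spelled out as a small sketch, the check above could be wrapped like this. C_INT_TYPESTR and matches_c_int are names invented for the illustration (they stand in for something like the proposed ndarray.cint, which does not exist yet), and byte order is deliberately ignored here and has to be checked separately.

```python
import array

# Typestr for the platform's C int, e.g. 'i4' where an int is 4 bytes.
C_INT_TYPESTR = 'i%d' % array.array('i', [0]).itemsize


def matches_c_int(typestr):
    """True if typestr describes a signed integer the size of a C int.

    Any leading byte-order character ('<', '>' or '|') is dropped before
    comparing, so byte order is not verified by this function.
    """
    return typestr.lstrip('<>|') == C_INT_TYPESTR
```

A consumer such as the wxPoint code would then compare against C_INT_TYPESTR instead of a hard-coded 'i4'.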

>
> 5) Why is: __array_data__ optional? Isn't that the whole point of this?

Because the object itself might expose the buffer interface.  We could 
make __array_data__ required and prefer that it return a buffer object.  
But, really all that is needed is something that exposes the buffer 
interface:  remember the difference between the buffer object and the 
buffer interface.   So, the correct consumer usage for grabbing the data is

data = getattr(obj, '__array_data__', obj)

Then, in C you use the Buffer *Protocol* to get a pointer to memory.  
For example, the function:

int PyObject_AsReadBuffer(PyObject *obj, const void **buffer, int 
*buffer_len)

Of course this approach has the 32-bit limit until we get this changed 
in Python. 
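The getattr fallback above can be tried out from Python as well. In the sketch below, memoryview is used as a present-day stand-in for the C-level buffer call (that substitution is an assumption of this example, not part of the interface), and Wrapper is a hypothetical exporter that keeps its data in a separate object.

```python
import array


def get_data_buffer(obj):
    """Return a view on the memory of an object exporting the interface."""
    # Prefer __array_data__ if present; otherwise the object itself is
    # expected to expose the buffer interface.
    data = getattr(obj, '__array_data__', obj)
    return memoryview(data)


class Wrapper:
    """Hypothetical exporter whose data lives in a separate buffer."""

    def __init__(self, values):
        self.__array_data__ = array.array('i', values)


direct = get_data_buffer(array.array('i', [1, 2, 3]))   # object is the buffer
indirect = get_data_buffer(Wrapper([4, 5]))             # via __array_data__
```

Both calls end up with a view on the same kind of memory, which is the point of keeping __array_data__ optional.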

>
> 6) Should __array_offset__ be optional? I'd rather it were required, 
> but  default to zero. This way I have to check for it, then use it. 
> Also, I assume it is an integer number of bytes, is that right?


A consumer has to check for most of the optional stuff if they want to 
support all types of arrays.

Again a simple:

getattr(obj, '__array_offset__', 0)

works fine.
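Putting the defaults discussed in this thread in one place, a consumer could collect everything up front. This is a sketch only: describe is an invented helper name, and it assumes __array_shape__ and __array_typestr__ are always present while the other attributes may be missing.

```python
def describe(obj):
    """Gather array-interface attributes, filling in the defaults."""
    return {
        # Assumed to be required attributes.
        'shape': tuple(obj.__array_shape__),
        'typestr': obj.__array_typestr__,
        # Optional attributes and the defaults discussed above.
        'data': getattr(obj, '__array_data__', obj),
        'strides': getattr(obj, '__array_strides__', None),  # None: C-contiguous
        'offset': getattr(obj, '__array_offset__', 0),
    }
```

The rest of the consumer code can then work from the returned dictionary without repeating the attribute checks.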

>
> 7) An alternative to the above: A __simple__ flag, that means the data 
> is a simple, C array of contiguous data of a single type. The most 
> common use, and it would be nice to just check that flag and not have 
> to take all other options into account.


I think checking whether __array_strides__ is None (which you can assume 
if an object doesn't expose it) is probably good enough.


-Travis






