[Numpy-discussion] PyArray_SETITEM with object arrays in Cython

Wes McKinney wesmckinn@gmail....
Wed Feb 11 16:12:01 CST 2009


I actually got it to work: the function prototype in the pxi file was
wrong. It needed to be:

int PyArray_SETITEM(object obj, void* itemptr, object item)

This still doesn't explain why the buffer interface was slow.

The general problem here is an indexed array (by dates or strings, for
example) that you want to conform to a new index. Most of the time the arrays
contain floats, but occasionally PyObjects. For some reason the access and
assignment are slow (this function can be faster by a factor of 50 with C
API macros, so clearly something is awry). Let me know if you see anything
obviously wrong with this:

def reindexObject(ndarray[object, ndim=1] index,
                  ndarray[object, ndim=1] arr,
                  dict idxMap):
    '''
    Using the provided new index, a given array, and a mapping of
    index-value correspondences in the value array, return a new ndarray
    conforming to the new index.
    '''
    cdef object idx, value

    cdef int length  = index.shape[0]
    cdef ndarray[object, ndim = 1] result = np.empty(length, dtype=object)

    cdef int i = 0
    for i from 0 <= i < length:
        idx = index[i]
        if not PyDict_Contains(idxMap, idx):
            result[i] = None
            continue
        value = arr[idxMap[idx]]
        result[i] = value
    return result
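
For checking the output of a faster version, a plain-Python reference of the
same logic can be handy. What follows is only a minimal sketch and not part of
the code above: the name reindex_object_reference and the sample data are made
up for illustration, and it assumes idxMap maps each old index label to its
integer position in arr.

```python
import numpy as np


def reindex_object_reference(index, arr, idx_map):
    """Slow reference version of reindexObject (hypothetical helper name)."""
    # An object-dtype array from np.empty starts out filled with None.
    result = np.empty(len(index), dtype=object)
    for i, idx in enumerate(index):
        if idx not in idx_map:
            # Label missing from the old index: leave a None placeholder.
            result[i] = None
            continue
        # idx_map is assumed to give the position of the label in arr.
        result[i] = arr[idx_map[idx]]
    return result


# Illustrative data only.
old_index = np.array(['a', 'b', 'c'], dtype=object)
values = np.array([1.0, 'x', 3.0], dtype=object)
idx_map = {label: pos for pos, label in enumerate(old_index)}
new_index = np.array(['c', 'z', 'a'], dtype=object)

out = reindex_object_reference(new_index, values, idx_map)
```

Here 'z' is absent from the old index, so its slot in out stays None while
the other two slots pick up the values stored under 'c' and 'a'.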

On Wed, Feb 11, 2009 at 3:25 PM, Dag Sverre Seljebotn <
dagss@student.matnat.uio.no> wrote:

> Wes McKinney wrote:
> > I am writing some Cython code and have noted that the buffer interface
> > offers very little speedup for PyObject arrays. In trying to rewrite the
> > same code using the C API in Cython, I find I can't get PyArray_SETITEM
> > to work, in a call like:
> >
> > PyArray_SETITEM(result, <void *> iterresult.dataptr, obj)
> >
> > where result is an ndarray of dtype object, and obj is a PyObject*.
>
> Interesting. Whatever you end up doing, I'll make sure to integrate
> whatever works faster into Cython.
>
> I do doubt your results a bit though -- the buffer interface in Cython
> increfs/decrefs the objects, but otherwise it should be completely raw
> access, so using SETITEM shouldn't be faster except one INCREF/DECREF per
> object (i.e. still way faster than using Python).
>
> Could you perhaps post your Cython code?
>
> Dag Sverre