[Numpy-discussion] A Cython apply_along_axis function

Dag Sverre Seljebotn dagss@student.matnat.uio...
Thu Dec 2 03:08:12 CST 2010


On 12/02/2010 08:17 AM, Robert Bradshaw wrote:
> On Wed, Dec 1, 2010 at 6:09 PM, John Salvatier
> <jsalvati@u.washington.edu>  wrote:
>    
>> On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman<kwgoodman@gmail.com>  wrote:
>>      
>>> On Wed, Dec 1, 2010 at 5:53 PM, David<david@silveregg.co.jp>  wrote:
>>>
>>>        
>>>> On 12/02/2010 04:47 AM, Keith Goodman wrote:
>>>>          
>>>>> It's hard to write Cython code that can handle all dtypes and
>>>>> arbitrary number of dimensions. The former is typically dealt with
>>>>> using templates, but what do people do about the latter?
>>>>>            
>>>> The only way that I know to do that systematically is iterator. There is
>>>> a relatively simple example in scipy/signal (lfilter.c.src).
>>>>
>>>> I wonder if it would be possible to add better support for numpy
>>>> iterators in cython...
>>>>          
>>> Thanks for the tip. I'm starting to think that for now I should just
>>> template both dtype and ndim.
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@scipy.org
>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>>        
>> I enthusiastically support better iterator support for cython
>>      
> I enthusiastically welcome contributions along this line.
>    

Me too :-)

I guess we're moving into more Cython-list territory, so let's move any 
follow-ups there (posting this one both places).

Just in case anybody is wondering what something like this could look 
like, here's a rough scetch complete with bugs. The idea would be to a) 
add some rudimentary support for using the yield keyword in Cython to 
make a generator function, b) inline the generator function if the 
generator is used directly in a for-loop. This should result in very 
efficient code, and would also be much easier to implement than a 
general purpose generator.

@cython.inline
cdef array_iter_double(np.ndarray a, int axis=-1):
     cdef np.flatiter it
     ita = np.PyArray_IterAllButAxis(a, &axis)
     cdef Py_ssize_t stride = a.strides[axis], length = a.shape[axis], i
     while np.PyArray_ITER_NOTDONE(ita):
         for i in range(length):
             yield <double*>(np.PyArray_ITER_DATA(it) + )[i * stride])[0]
             # TODO: Probably yield indices as well
         np.PyArray_ITER_NEXT(it)
     # TODO: add faster special-cases for stride == sizeof(double)


# Use NumPy iterator API to sum all values of array with
# arbitrary number of dimensions:
cdef double s = 0, value
for value in array_iter_double(myarray):
     s += value
     # at this point, the contents of the array_iter_double function is 
copied,
     # and "s += value" simply inserted everywhere "yield" occurs in the 
function

Dag Sverre



More information about the NumPy-Discussion mailing list