[SciPy-dev] Inclusion of cython code in scipy

Prabhu Ramachandran prabhu@aero.iitb.ac...
Thu Apr 24 07:09:50 CDT 2008


Hi Stéfan,

Stéfan van der Walt wrote:
> 2008/4/24 Prabhu Ramachandran <prabhu@aero.iitb.ac.in>:
>>  Let's take a simple case of someone wanting to handle a growing
>>  collection of, say, a million particles and do something to them.  How
>>  do you do that in cython/pyrex and get the performance of C and
>>  interface to numpy?  Worse, even if it were possible, you'll still
>>  need to know something about allocating memory in C and manipulating
>>  pointers.  I can do that with C++ and SWIG today.
> 
> That's the point: you, being a well-established programmer can do it
> easily, but most Python programmers would struggle doing that through
> some C or C++ API.  I think this would be pretty easy to do in Cython:
> 1. Write a function, say create_workspace(nr_elements), that creates a
> new ndarray and returns it:
> 
>     cdef ndarray results_arr = np.empty((nr_elements,), dtype=np.double)

This is not what I want, and that is precisely the point.  I don't want 
an array of doubles.  I want an array of objects (particles).  I am not 
talking about manipulating an array of numbers -- I can do that just 
fine, thanks.
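
For concreteness, the distinction between an array of doubles and an array of particle records can be sketched with a NumPy structured dtype. This is only an illustration: the field names (position, velocity, mass) and the time step are hypothetical, and the result is a block of plain records, not Python objects with methods, so it covers only part of the use case described above.

```python
import numpy as np

# Hypothetical particle layout; the field names are illustrative only.
particle_dtype = np.dtype([
    ("position", np.float64, (3,)),
    ("velocity", np.float64, (3,)),
    ("mass", np.float64),
])

# Four particles stored contiguously as records rather than bare doubles.
particles = np.zeros(4, dtype=particle_dtype)
particles["mass"] = 1.0
particles["velocity"][:, 0] = 2.0  # every particle moves along x

# One explicit Euler step applied to all particles at once.
dt = 0.5
particles["position"] += dt * particles["velocity"]
```

Each field is a view onto the same memory, so particles["position"] behaves like an ordinary (4, 3) array while the data stays grouped per particle.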

> 3. Run your loop in which you produce data points.  The moment you
> have more results than the output array can hold, call
> create_workspace(current_size**2), and use normal numpy indexing to
> copy the old results to the new location:
> 
>     new_results_arr[:current_size] = old_results_arr
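
The grow-and-copy step quoted above can be sketched in plain NumPy as follows. This is a minimal sketch under stated assumptions: create_workspace mirrors the quoted snippet, while collect, initial_size, and the choice to double the capacity (rather than the quoted current_size**2) are assumptions made here for illustration.

```python
import numpy as np

def create_workspace(nr_elements):
    # Mirrors the quoted snippet: an uninitialised array of doubles.
    return np.empty((nr_elements,), dtype=np.double)

def collect(values, initial_size=4):
    # Hypothetical helper; initial_size must be at least 1.
    results = create_workspace(initial_size)
    count = 0
    for v in values:
        if count == results.shape[0]:
            # Workspace is full: allocate a larger one (doubling is an
            # assumption of this sketch) and copy the old results over.
            bigger = create_workspace(2 * results.shape[0])
            bigger[:count] = results
            results = bigger
        results[count] = v
        count += 1
    # Return only the filled part of the workspace.
    return results[:count]

out = collect(range(10))
```

Doubling keeps the total number of copied elements proportional to the final size, which is the usual reason for growing geometrically.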

This just shows how bad things are w.r.t. the basic data types that you 
expect when you program at any lower level.  Is there a fast linked 
list?  What about maps (dicts)?  What about tree structures containing 
arbitrary data structures?  Sure, they can be done, but you need a 
one-to-one (or at least something close) mapping between commonly used 
data types and those we are familiar with in Python.  All you seem to 
get with cython/pyrex is arrays, and you have to implement everything 
else.  It's almost like having to reinvent a full-fledged language.

Once again, I urge you to look beyond the simple functions that 
manipulate arrays of numbers to something more realistic (at least for 
me).  To this end I'll write something up that explains what I am 
talking about with real code and shows you comparisons with different 
approaches.  I'll try to do it this weekend.

> The beauty of the Cython approach is that you
> 
> a) Never have to worry about INCREF and DECREF

I don't have to with SWIG or weave either, for that matter, and I have 
one less language to worry about.

> c) Debug in a much cleaner way than C++ or C code: fewer memory leaks,
> introspection of source etc.

I'm afraid I can't buy this at all.  Good test-driven programming 
practice makes debugging easier, but when it comes down to it, cython or 
otherwise, you are just going to have to roll up your sleeves and debug 
C anyway.  Let's face it, C is underneath all of this, and if something 
goes wrong at that level you need to know how it works to debug at all.

cheers,
prabhu

