[SciPy-dev] Inclusion of cython code in scipy
Thu Apr 24 07:09:50 CDT 2008
Stéfan van der Walt wrote:
> 2008/4/24 Prabhu Ramachandran <email@example.com>:
>> Let's take a simple case of someone wanting to handle a growing
>> collection of, say, a million particles and do something to them. How do
>> you do that in cython/pyrex and get the performance of C and interface
>> to numpy? Worse, even if it were possible, you'll still need to know
>> something about allocating memory in C and manipulating pointers. I can
>> do that with C++ and SWIG today.
> That's the point: you, being a well-established programmer, can do it
> easily, but most Python programmers would struggle doing that through
> some C or C++ API. I think this would be pretty easy to do in Cython:
> 1. Write a function, say create_workspace(nr_elements), that creates a
> new ndarray and returns it:
> cdef ndarray results_arr = np.empty((nr_elements,), dtype=np.double)
This is not what I want and precisely the point. I don't want an array
of doubles. I want an array of objects (particles). I am not talking
about manipulating an array of numbers -- I can do that just fine, thanks.
> 3. Run your loop in which you produce data points. The moment you
> have more results than
> the output array can hold, call create_workspace(current_size**2), and
> use normal numpy indexing to copy the old results to the new location:
> new_results_arr[:current_size] = old_results_arr
This just shows how bad things are w.r.t. basic data types that you
expect when you program at any lower level. Is there a fast linked list?
What about maps (dicts)? What about tree structures containing arbitrary
data structures? Sure, they can be done, but you need a one-to-one (or
at least something close) mapping between commonly used data types and
those we are familiar with in Python. All you seem to get with
cython/pyrex is arrays, and you have to implement everything else. It's
almost like having to reinvent a full-fledged language.
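
For reference, the grow-and-copy scheme quoted above can be sketched in
plain Python/NumPy roughly as follows. This is only an illustrative
sketch, not code from either poster: collect() and its initial_size
parameter are hypothetical names, and the capacity is doubled on each
reallocation (an assumption; the quoted text passes current_size**2).
It handles an array of doubles only, which is exactly the limitation
being argued about here.

```python
import numpy as np


def create_workspace(nr_elements):
    """Allocate an uninitialised 1-D array of doubles (name from the quoted post)."""
    return np.empty((nr_elements,), dtype=np.double)


def collect(values, initial_size=4):
    """Hypothetical helper: store an unknown number of values, growing as needed."""
    results_arr = create_workspace(max(1, initial_size))
    current_size = 0  # number of slots actually filled so far
    for value in values:
        if current_size == results_arr.shape[0]:
            # Assumption: double the capacity when the workspace is full.
            new_results_arr = create_workspace(2 * results_arr.shape[0])
            # Copy the old results to the new location with normal indexing.
            new_results_arr[:current_size] = results_arr
            results_arr = new_results_arr
        results_arr[current_size] = value
        current_size += 1
    # Return only the filled part of the workspace.
    return results_arr[:current_size]
```

Calling collect(range(10), initial_size=2) reallocates three times
(2 -> 4 -> 8 -> 16 slots) and returns the ten stored values.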
Once again, I urge you to look beyond the simple functions that
manipulate arrays of numbers to something more realistic (at least for
me). To this end I'll write something up that explains what I am
talking about with real code and show you comparisons with different
approaches. I'll try and do it this weekend.
> The beauty of the Cython approach is that you
> a) Never have to worry about INCREF and DECREF
I don't have to with SWIG or weave either, for that matter, and I have
one less language to worry about.
> c) Debug in a much cleaner way than C++ or C code: fewer memory leaks,
> introspection of source etc.
I'm afraid I can't buy this at all. Good test-driven programming
practice makes debugging easier, but when it comes down to it, Cython or
otherwise, you are just going to have to roll up your sleeves and debug C
anyway. Let's face it, C is underneath all of this, and if something goes
wrong at that level you need to know how it works to debug at all.