[Numpy-discussion] Getting C-function pointers from Python to C
Dag Sverre Seljebotn
Tue Apr 10 07:39:17 CDT 2012
On 04/10/2012 12:37 PM, Nathaniel Smith wrote:
> On Tue, Apr 10, 2012 at 1:57 AM, Travis Oliphant<firstname.lastname@example.org> wrote:
>> On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote:
>> ...isn't this an operation that will be performed once per compiled
>> function? Is the overhead of the easy, robust method (calling ctypes.cast)
>> actually measurable as compared to, you know, running an optimizing
>> Yes, there can be significant overhead. The compiler is run once and
>> creates the function. This function is then potentially used many, many
>> times. Also, it is entirely conceivable that the "build" step happens at
>> a separate "compilation" time, and Numba actually loads a pre-compiled
>> version of the function from disk which it then uses at run-time.
>> I have been playing with a version of this using scipy.integrate and
>> unfortunately the overhead of ctypes.cast is rather significant, to the
>> point of making the code path that uses these function pointers useless,
>> even though without the ctypes.cast overhead the speed-up is 3-5x.
> Ah, I was assuming that you'd do the cast once outside of the inner
> loop (at the same time you did type compatibility checking and so
>> In general, I think NumPy will need its own simple function-pointer object
>> to use when handing over raw-function pointers between Python and C. SciPy
>> can then re-use this object which also has a useful C-API for things like
>> signature checking. I have seen that ctypes is nice but very slow and
>> without a compelling C-API.
> Sounds reasonable to me. Probably nicer than violating ctypes's
> abstraction boundary, and with no real downsides.
>> The kind of new C-level cfuncptr object I imagine has attributes:
>> void *func_ptr;
>> char *signature; /* something like 'dd->d' to indicate a function
>> that takes two doubles and returns a double */
> This looks like it's setting us up for trouble later. We already have
> a robust mechanism for describing types -- dtypes. We should use that
> instead of inventing Yet Another baby type system. We'll need to
> convert between this representation and dtypes anyway if you want to
> use these pointers for ufunc loops... and if we just use dtypes from
> the start, we'll avoid having to break the API the first time someone
> wants to pass a struct or array or something.
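To make the object under discussion concrete, here is a rough Python model of a
func_ptr plus signature-string pair. It is only a sketch under assumptions: the
class name FuncPtr, the helper parse_signature, and every type code other than
'd' are hypothetical and not part of any NumPy API. The real proposal is a
C-level object, and a dtype-based signature would replace the string parsing
shown here.

```python
import ctypes

# Hypothetical type codes. Only 'd' (double) appears in the thread;
# the remaining entries are assumptions made for illustration.
_CODES = {
    "d": ctypes.c_double,
    "f": ctypes.c_float,
    "i": ctypes.c_int,
    "l": ctypes.c_long,
}


def parse_signature(sig):
    """Split a string such as 'dd->d' into (argument types, return type)."""
    args, sep, ret = sig.partition("->")
    if not sep or len(ret) != 1:
        raise ValueError("malformed signature: %r" % sig)
    try:
        return [_CODES[c] for c in args], _CODES[ret]
    except KeyError as exc:
        raise ValueError("unknown type code in %r" % sig) from exc


class FuncPtr:
    """Hypothetical Python stand-in for the proposed C-level object."""

    def __init__(self, func_ptr, signature):
        self.func_ptr = func_ptr      # raw address, stored as an integer
        self.signature = signature    # e.g. 'dd->d'

    def check(self, expected):
        # Signature checking is a plain string comparison in this model.
        return self.signature == expected

    def as_callable(self):
        # Build a ctypes prototype from the signature and bind the address.
        argtypes, restype = parse_signature(self.signature)
        proto = ctypes.CFUNCTYPE(restype, *argtypes)
        return proto(self.func_ptr)


# A Python callback stands in for a compiled function so that the example
# does not depend on any particular shared library being present.
@ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)
def add(x, y):
    return x + y


# 'add' must stay referenced for as long as its address is in use.
fp = FuncPtr(ctypes.cast(add, ctypes.c_void_p).value, "dd->d")
```

A consumer would call fp.check("dd->d") before fp.as_callable(), which is the
step where a struct or array argument would no longer fit into single-character
codes.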
For some of the things we'd like to do with Cython down the line,
something very fast like what Travis describes is exactly what we need;
specifically, if you have Cython code like
cdef double f(func):
that may NOT be called in a loop.
But I do agree that this sounds like overkill for NumPy+numba at the moment;
certainly for scipy.integrate, where you can amortize over N function
samples. But Travis perhaps has a use case I didn't think of.
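The amortization argument can be sketched in pure Python with ctypes. The
integrator below is a hypothetical midpoint rule, not scipy.integrate, and the
names DOUBLE_FUNC, square and integrate_midpoint are made up for the example.
The point is only that ctypes.cast runs once per call to the integrator and
not once per function sample.

```python
import ctypes

# Prototype for double f(double).
DOUBLE_FUNC = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)


# A Python callback stands in for a compiled function; it must stay
# referenced for as long as its address is in use.
@DOUBLE_FUNC
def square(x):
    return x * x


# The raw address, as it might be handed from one library to another.
addr = ctypes.cast(square, ctypes.c_void_p).value


def integrate_midpoint(address, a, b, n):
    """Midpoint-rule integral of the function at 'address' over [a, b]."""
    # The cast happens once here, outside the sampling loop, so its cost
    # is spread over all n function evaluations.
    f = ctypes.cast(ctypes.c_void_p(address), DOUBLE_FUNC)
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        total += f(a + (i + 0.5) * h)
    return total * h


result = integrate_midpoint(addr, 0.0, 1.0, 1000)  # close to 1/3
```

When the callee takes the pointer and invokes it only once, there is no loop to
hoist the cast out of, and the conversion cost is paid in full on every call.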