[Numpy-discussion] NumPy re-factoring project
Sebastian Walter
sebastian.walter@gmail....
Sat Jun 12 15:12:50 CDT 2010
On Sat, Jun 12, 2010 at 3:57 PM, David Cournapeau <cournape@gmail.com> wrote:
> On Sat, Jun 12, 2010 at 10:27 PM, Sebastian Walter
> <sebastian.walter@gmail.com> wrote:
>> On Thu, Jun 10, 2010 at 6:48 PM, Sturla Molden <sturla@molden.no> wrote:
>>>
>>> I have a few radical suggestions:
>>>
>>> 1. Use ctypes as glue to the core DLL, so we can completely forget about
>>> refcounts and similar mess. Why put manual reference counting and error
>>> handling in the core? It's stupid.
>>
>> I totally agree. I thought that the refactoring was supposed to provide
>> simple data structures and simple algorithms to perform the C equivalents of
>> sin, cos, exp, dot, +, -, *, /, inv, ...
>>
>> Let me explain with an example what I expected:
>>
>> In the core C numpy library there would be a new "numpy_array" struct
>> with attributes
>>
>> numpy_array->buffer
>> numpy_array->dtype
>> numpy_array->ndim
>> numpy_array->shape
>> numpy_array->strides
>> numpy_array->owndata
>> etc.
>>
>> that replaces the current PyArrayObject which contains Python C API stuff:
>>
>> typedef struct PyArrayObject {
>>     PyObject_HEAD
>>     char *data;             /* pointer to raw data buffer */
>>     int nd;                 /* number of dimensions, also called ndim */
>>     npy_intp *dimensions;   /* size in each dimension */
>>     npy_intp *strides;      /* bytes to jump to get to the
>>                                next element in each dimension */
>>     PyObject *base;         /* This object should be decref'd
>>                                upon deletion of array */
>>                             /* For views it points to the original array */
>>                             /* For creation from buffer object it points
>>                                to an object that should be decref'd on
>>                                deletion */
>>                             /* For UPDATEIFCOPY flag this is an array
>>                                to-be-updated upon deletion of this one */
>>     PyArray_Descr *descr;   /* Pointer to type structure */
>>     int flags;              /* Flags describing array -- see below*/
>>     PyObject *weakreflist;  /* For weakreferences */
>>     void *buffer_info;      /* Data used by the buffer interface */
>> } PyArrayObject;
>>
>>
>>
>> Example:
>> --------------
>>
>> If one calls the following Python code
>> x = numpy.zeros((N,M,K), dtype=float)
>> the memory allocation would be done on the Python side.
>>
>> Calling a ufunc like
>> y = numpy.sin(x)
>> would first allocate the memory for y on the Python side
>> and then call a C function a la
>> numpy_unary_ufunc( double (*fcn_ptr)(double), numpy_array *x, numpy_array *y)
>>
>> If y is already allocated, one would call
>> y = numpy.sin(x, out = y)
>>
>> Similarly z = x*y
>> would first allocate the memory for z and then call a C function a la
>> numpy_binary_ufunc( double (*fcn_ptr)(double, double), numpy_array *x,
>> numpy_array *y, numpy_array *z)
>>
>>
>> similarly other functions like dot:
>> z = dot(x,y, out = z)
>>
>> would simply call a C function a la
>> numpy_dot( numpy_array *x, numpy_array *y, numpy_array *z)
>>
>>
>> If one wants to use numpy functions on the C side only, one would use
>> the numpy_array struct manually.
>> I.e. one has to do the memory management oneself in C, which is
>> perfectly OK since one is just interested in using the algorithms.
>
> Anything nontrivial will require memory allocation and object
> ownership conventions. If the goal is interoperation with other
> languages and VMs, you may want to use something other than plain
> malloc, to interact better with the allocation strategies of the host
> platform (reference counting, garbage collection).
I'm just saying that the "host platform" could do the memory
management and not libnumpy.
I.e. libnumpy could be just a collection of algorithms.
Reimplementing half of the Python C API somehow doesn't feel right to me.
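
To make "just a collection of algorithms" concrete, here is a minimal
sketch of what such a routine could look like. Everything in it is an
assumption for illustration: the struct is a cut-down, double-only
version of the numpy_array idea above, strides are counted in elements
rather than bytes, and the int return code is my own convention. The
point is only that the function neither allocates nor frees anything;
the caller hands in both the input and the output buffer.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for the proposed numpy_array struct:
// doubles only, strides counted in elements rather than bytes.
struct numpy_array {
    double *buffer;              // memory owned by the caller, never by the library
    int ndim;                    // number of dimensions
    const std::size_t *shape;    // extent of each dimension
    const std::size_t *strides;  // step between neighbours, in elements
};

// Apply fcn_ptr element-wise from x into y. No allocation, no ownership.
// Returns 0 on success and -1 if the shapes of x and y disagree.
int numpy_unary_ufunc(double (*fcn_ptr)(double),
                      const numpy_array *x, numpy_array *y) {
    if (x->ndim != y->ndim) return -1;
    std::size_t total = 1;
    for (int d = 0; d < x->ndim; ++d) {
        if (x->shape[d] != y->shape[d]) return -1;
        total *= x->shape[d];
    }
    for (std::size_t i = 0; i < total; ++i) {
        // Decompose the flat index i into one index per dimension and
        // turn it into a buffer offset for each array via its strides.
        std::size_t rest = i, xoff = 0, yoff = 0;
        for (int d = x->ndim - 1; d >= 0; --d) {
            std::size_t k = rest % x->shape[d];
            rest /= x->shape[d];
            xoff += k * x->strides[d];
            yoff += k * y->strides[d];
        }
        y->buffer[yoff] = fcn_ptr(x->buffer[xoff]);
    }
    return 0;
}
```

Because x and y carry their own strides, the same routine also works on
views (e.g. a transposed input) without copying. Whoever calls it
(Python, C++, plain C) decides where the buffers come from.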
Those users who like to use C++ could write a class with methods that
internally call the libnumpy functions:
-------------- example code -----------------
class Array {
    numpy_array *_array;   // allocated and freed by this class, not by libnumpy
public:
    Array operator+(const Array &rhs) const {
        Array retval( ... arguments for the right type and dimensions of
                      the output...);
        // libnumpy only fills the output that C++ has already allocated
        numpy_add(this->_array, rhs._array, retval._array);
        return retval;
    }
};
-------------- end code -----------------
I.e. let C++ do all the memory management and type inference but the
numpy core C API does the number crunching.
In other languages (Python, Ruby, R, whatever) one would implement a
similar class.
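
A compilable variant of the same idea might look like the sketch
below. It is not a proposal for the real API: the 1-D, double-only
numpy_array, the body of numpy_add and its int return code are all
stand-ins I made up so that the example is self-contained. What it is
meant to show is the split of responsibilities: std::vector owns the
memory, and the library function only sees borrowed pointers.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical 1-D, double-only stand-in for the proposed struct.
struct numpy_array {
    double *buffer;    // borrowed pointer; the library never frees it
    std::size_t size;  // number of elements
};

// Stand-in for a libnumpy routine: z = x + y, written into caller-owned z.
// Returns 0 on success, -1 on a size mismatch.
int numpy_add(const numpy_array *x, const numpy_array *y, numpy_array *z) {
    if (x->size != y->size || x->size != z->size) return -1;
    for (std::size_t i = 0; i < x->size; ++i)
        z->buffer[i] = x->buffer[i] + y->buffer[i];
    return 0;
}

// The host-language side: std::vector owns the memory, and the class
// builds short-lived numpy_array views to hand to the library.
class Array {
    std::vector<double> data_;

    // The C-style struct has no notion of const, hence the cast; only
    // the output view of a non-const Array is ever written through.
    numpy_array view() const {
        return numpy_array{const_cast<double *>(data_.data()), data_.size()};
    }

public:
    explicit Array(std::size_t n, double fill = 0.0) : data_(n, fill) {}

    std::size_t size() const { return data_.size(); }
    double &operator[](std::size_t i) { return data_[i]; }
    double operator[](std::size_t i) const { return data_[i]; }

    Array operator+(const Array &rhs) const {
        Array retval(size());            // C++ allocates the output
        numpy_array x = view(), y = rhs.view(), z = retval.view();
        if (numpy_add(&x, &y, &z) != 0)  // the library only computes
            throw std::invalid_argument("Array sizes differ");
        return retval;
    }
};
```

With this, "Array c = a + b;" allocates c through std::vector and frees
it when c goes out of scope; no reference counting is needed inside the
core.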
I cannot speak for others, but something along these lines is what I'd
love to see, since it would make it relatively easy to use numpy
functionality even in existing C/C++/R/Ruby code.
Sebastian
>
>
>> The only reason I see for C++ is the possibility to use metaprogramming, which
>> is very ill-designed. I'd rather see some simple code preprocessing on C code
>> than C++ template metaprogramming.
>
> I don't think anyone is seriously considering changing languages.
> Especially if interoperation is desired, C is miles ahead of C++
> anyway.
>
> David