[Numpy-discussion] NumPy re-factoring project
Charles R Harris
Sat Jun 12 16:10:54 CDT 2010
On Sat, Jun 12, 2010 at 2:56 PM, Dag Sverre Seljebotn <
> Charles Harris wrote:
> > On Sat, Jun 12, 2010 at 11:38 AM, Dag Sverre Seljebotn <
> > email@example.com> wrote:
> >> Christopher Barker wrote:
> >> > David Cournapeau wrote:
> >> >>> In the core C numpy library there would be new "numpy_array" struct
> >> >>> with attributes
> >> >>>
> >> >>> numpy_array->buffer
> >> >
> >> >> Anything non trivial will require memory allocation and object
> >> >> ownership conventions.
> >> >
> >> > I totally agree -- I've been thinking for a while about a core array
> >> > data structure that you could use from C/C++, and would interact well
> >> > with numpy and Python -- it would be even better if it WAS numpy.
> >> >
> >> > I was thinking that at the root of it would be a "data_block" object
> >> > (the buffer in the above), that would have a reference counting system.
> >> > It would be its own system, but hopefully be able to link to Python's
> >> > easily when used with Python.
> >> I think taking PEP 3118, stripping out the Python-specific things, and
> >> then adding memory management conventions would be a good starting point:
> >> simply a header file/struct definition and specification, which
> >> could in time become a de facto way of exporting multidimensional array
> >> data between C libraries, between Fortran and C and so on (Kurt Smith's
> >> fwrap could easily be adapted to support it). The C-NumPy would then be a
> >> library on top of this spec (mainly ufuncs operating on such structs).
> >> The memory management conventions need some thought, as you say, because
> >> of slices -- but a central memory allocator is not good enough because one
> >> would often be accessing memory that's allocated with other purposes in
> >> mind (and should not be deallocated, or deallocated in a special way). So
> >> refcounts + deallocator callback seems reasonable.
> >> (Not that I'm involved in this, just my 2 cents.)
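As a rough illustration of the "refcounts + deallocator callback" idea
quoted above, here is a minimal C sketch. All names (mem_block,
mem_block_wrap, notify_owner, ...) are hypothetical and not part of any
existing NumPy API. It assumes single-threaded use and keeps error
handling to a bare minimum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types, for illustration only. */
typedef void (*mem_free_fn)(void *data, void *ctx);

typedef struct {
    void *data;          /* start of the memory region */
    size_t nbytes;       /* size of the region in bytes */
    long refcount;       /* number of holders (views, callers, ...) */
    mem_free_fn free_fn; /* NULL means: borrowed memory, never released here */
    void *ctx;           /* opaque pointer handed back to free_fn */
} mem_block;

/* Wrap existing memory; the returned block starts with one reference. */
static mem_block *mem_block_wrap(void *data, size_t nbytes,
                                 mem_free_fn free_fn, void *ctx)
{
    mem_block *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = data;
    b->nbytes = nbytes;
    b->refcount = 1;
    b->free_fn = free_fn;
    b->ctx = ctx;
    return b;
}

static void mem_block_incref(mem_block *b)
{
    assert(b != NULL && b->refcount > 0);
    b->refcount++;
}

/* Drop one reference. Returns 1 if this was the last one and the block
   (plus, via the callback, its memory) was released, otherwise 0. */
static int mem_block_decref(mem_block *b)
{
    assert(b != NULL && b->refcount > 0);
    if (--b->refcount > 0)
        return 0;
    if (b->free_fn != NULL)
        b->free_fn(b->data, b->ctx);
    free(b);
    return 1;
}

/* Example "special" deallocator: the memory belongs to someone else, so
   instead of freeing it we only tell the owner (a counter at ctx). */
static void notify_owner(void *data, void *ctx)
{
    (void)data;
    *(int *)ctx += 1;
}
```

The callback is what lets memory "allocated with other purposes in mind"
take part: a memmap could unmap in its callback, a user-supplied buffer
could pass NULL, and a plain malloc'ed buffer could pass a callback that
calls free().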
> > This is more the way I see things, except I would divide the bottom layer
> > into two parts, views and memory. The memory can come from many places --
> > memmaps, user supplied buffers, etc. -- but we should provide a simple
> > reference counted allocator for the default. The views correspond more to
> > PEP 3118 and simply provide data types, dimensions, and strides, much as
> > arrays do now. However, I would confine the data types to those available
> > in C with a bit of extra information as to precision, because. Object arrays
> > would be a special case of pointer arrays (void pointer arrays?) and
> > structured arrays/Unicode might be a special case of char arrays. The
> > complicated dtypes would then be built on top of those. Some things just
> > won't be portable, pointers in particular, but such is life.
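To make the views/memory split a little more concrete, here is a small
C sketch of what a view could look like: an element kind and size, plus
dimensions and byte strides over memory that is owned elsewhere. The
names (array_view, elem_kind, view_item, view_slice) and the fixed
VIEW_MAXDIMS limit are assumptions made for this example, not a
proposed API.

```c
#include <assert.h>
#include <stddef.h>

#define VIEW_MAXDIMS 8 /* arbitrary limit chosen for this sketch */

/* Element kinds restricted to what C itself offers. */
typedef enum {
    ELEM_INT, ELEM_UINT, ELEM_FLOAT, ELEM_POINTER, ELEM_CHAR
} elem_kind;

typedef struct {
    elem_kind kind;                  /* basic C category of the element */
    size_t itemsize;                 /* precision: element size in bytes */
    int ndim;                        /* number of dimensions */
    ptrdiff_t shape[VIEW_MAXDIMS];   /* extent along each dimension */
    ptrdiff_t strides[VIEW_MAXDIMS]; /* step in bytes along each dimension */
    char *data;                      /* first element; memory owned elsewhere */
} array_view;

/* Address of the element at the given index (one entry per dimension). */
static void *view_item(const array_view *v, const ptrdiff_t *index)
{
    char *p = v->data;
    for (int i = 0; i < v->ndim; i++) {
        assert(index[i] >= 0 && index[i] < v->shape[i]);
        p += index[i] * v->strides[i];
    }
    return p;
}

/* New view of elements start, start+step, ... (below stop) along one
   axis. No data is copied; only the pointer, extent and stride change. */
static array_view view_slice(const array_view *v, int axis,
                             ptrdiff_t start, ptrdiff_t stop, ptrdiff_t step)
{
    array_view s = *v;
    assert(axis >= 0 && axis < v->ndim && step > 0);
    assert(start >= 0 && start <= stop && stop <= v->shape[axis]);
    s.data += start * v->strides[axis];
    s.shape[axis] = (stop - start + step - 1) / step;
    s.strides[axis] = v->strides[axis] * step;
    return s;
}
```

Because a slice shares memory with the view it came from, the view
needs some way to keep that memory alive, which is where the reference
counted memory layer underneath comes in.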
> > As to languages, I think we should stay with C. C++ has much to offer for
> > this sort of thing but would be quite a big jump and maybe not as universal
> > as C. FORTRAN is harder to come by than C and older versions didn't have
> > such things as unsigned integers. I really haven't used FORTRAN since the 77
> > version, so haven't much idea what the modern version looks like, but I
> > suspect we have more C programmers than FORTRAN programmers, and adding a
> > language translation on top of a design refactoring is just going to
> > complicate things.
> I'm not sure how Fortran got into this, but if it was from what I wrote: I
> certainly didn't suggest any part of NumPy is written in Fortran. Sorry
> for causing confusion.
It was a suggestion by someone else way back in the beginning. I think
changing languages is a great way to get sidetracked and spend the next
year just hacking about.
> What I meant: there could be an "ndarray.h" which basically just contains the
> minimum C version of PEP 3118 for passing around and accessing arrays,
> without any runtime dependencies -- of course, writing code using it will
> be much easier when using the C-NumPy runtime library, but for simply
> exporting data from an existing library one could do without the runtime
Well, that would be just one bit. If we want to offer a *functioning* system
we have to implement quite a bit more.
> (To be more precise about fwrap: In fwrap there's a need to pass
> multidimensional, flexible-size array data back and forth between
> (autogenerated) Fortran and (autogenerated) C. It currently defines its
> own structs (or extra arguments, but boils down to the same thing...) for
> this purpose, but if an "ndarray.h" could create a standard for passing
> array data then that would be a natural choice instead.)
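One detail such a shared header would need to carry for the Fortran/C
case is memory order: Fortran arrays are column-major, C arrays are
row-major, and explicit byte strides can describe both without copying.
A minimal sketch, with hypothetical helper names (strides_fortran,
strides_c) that are not taken from fwrap or NumPy:

```c
#include <assert.h>
#include <stddef.h>

/* Byte strides for a contiguous column-major (Fortran order) array:
   the first index varies fastest. */
static void strides_fortran(int ndim, const ptrdiff_t *shape,
                            ptrdiff_t itemsize, ptrdiff_t *strides)
{
    ptrdiff_t step = itemsize;
    assert(ndim > 0 && itemsize > 0);
    for (int i = 0; i < ndim; i++) {
        strides[i] = step;
        step *= shape[i];
    }
}

/* Byte strides for a contiguous row-major (C order) array:
   the last index varies fastest. */
static void strides_c(int ndim, const ptrdiff_t *shape,
                      ptrdiff_t itemsize, ptrdiff_t *strides)
{
    ptrdiff_t step = itemsize;
    assert(ndim > 0 && itemsize > 0);
    for (int i = ndim - 1; i >= 0; i--) {
        strides[i] = step;
        step *= shape[i];
    }
}
```

With shape and strides passed alongside the data pointer, generated C
code can index an array produced by Fortran directly, and vice versa,
instead of each wrapper defining its own struct for the purpose.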