[Numpy-discussion] Proposed Roadmap Overview

Eric Firing efiring@hawaii....
Fri Feb 17 11:52:10 CST 2012

On 02/17/2012 05:39 AM, Charles R Harris wrote:
> On Fri, Feb 17, 2012 at 8:01 AM, David Cournapeau <cournape@gmail.com
> <mailto:cournape@gmail.com>> wrote:
>     Hi Travis,
>     On Thu, Feb 16, 2012 at 10:39 PM, Travis Oliphant
>     <travis@continuum.io <mailto:travis@continuum.io>> wrote:
>      > Mark Wiebe and I have been discussing off and on (as well as
>     talking with Charles) a good way forward to balance two competing
>     desires:
>      >
>      >        * addition of new features that are needed in NumPy
>      >        * improving the code-base generally and moving towards a
>     more maintainable NumPy
>      >
>      > I know there are loud voices for just focusing on the second of
>     these and avoiding the first until we have finished that.  I
>     recognize the need to improve the code base, but I will also be
>     pushing for improvements to the feature-set and user experience in
>     the process.
>      >
>      > As a result, I am proposing a rough outline for releases over the
>     next year:
>      >
>      >        * NumPy 1.7 to come out as soon as the serious bugs can be
>     eliminated.  Bryan, Francesc, Mark, and I are able to help triage
>     some of those.
>      >
>      >        * NumPy 1.8 to come out in July which will have as many
>     ABI-compatible feature enhancements as we can add while improving
>     test coverage and code cleanup.   I will post to this list more
>     details of what we plan to address with it later.  Candidates for
>     inclusion are:
>      >        * resolving the NA/missing-data issues
>      >        * finishing group-by
>      >        * incorporating the start of label arrays
>      >        * incorporating a meta-object
>      >        * a few new dtypes (variable-length string,
>     variable-length unicode and an enum type)
>      >        * adding ufunc support for flexible dtypes and possibly
>     structured arrays
>      >        * allowing generalized ufuncs to work on more kinds of
>     arrays besides just contiguous
>      >        * improving the ability for NumPy to receive JIT-generated
>     function pointers for ufuncs and other calculation opportunities
>      >        * adding "filters" to Input and Output
>      >        * simple computed fields for dtypes
>      >        * accepting a Data-Type specification as a class or JSON file
>      >        * work towards improving the dtype-addition mechanism
>      >        * re-factoring of code so that it can compile with a C++
>     compiler and be minimally dependent on Python data-structures.
>     This is a pretty exciting list of features. What is the rationale for
>     code being compiled as C++ ? IMO, it will be difficult to do so
>     without preventing useful C constructs, and without removing some of
>     the existing features (like our use of C99 complex). The subset that
>     is both C and C++ compatible is quite constraining.
> I'm in favor of this myself, C++ would allow a lot of code cleanup and make
> it easier to provide an extensible base, I think it would be a natural
> fit with numpy. Of course, some C++ projects become tangled messes of
> inheritance, but I'd be very interested in seeing what a good C++
> designer like Mark, intimately familiar with the numpy code base, could
> do. This opportunity might not come by again anytime soon and I think we
> should grab onto it. The initial step would be a release whose code
> would compile as both C and C++, which mostly comes down to no longer
> using C++ keywords like 'new' as identifiers.
> I did suggest running it by you for build issues, so please raise any
> you can think of. Note that MatPlotLib is in C++, so I don't think the
> problems are insurmountable. And choosing a set of compilers to support
> is something that will need to be done.

It's true that matplotlib relies heavily on C++, both via the Agg 
library and in its own extension code.  Personally, I don't like this; I 
think it raises the barrier to contributing.  C++ is an order of 
magnitude more complicated than C--harder to read, and much harder to 
write, unless one is a true expert. In mpl it brings reliance on the CXX 
library, which Mike D. has had to help maintain.  And if it does 
increase compiler specificity, that's bad.

I would much rather see development in the direction of sticking with C 
where direct low-level control and speed are needed, and using Cython to 
gain higher level language benefits where appropriate.  Of course, that 
brings in the danger of reliance on another complex tool, Cython.  If 
that danger is considered excessive, then just stick with C.


> Chuck
