[Numpy-discussion] Proposed Roadmap Overview

David Cournapeau cournape@gmail....
Fri Feb 17 10:27:32 CST 2012


On Fri, Feb 17, 2012 at 3:39 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Fri, Feb 17, 2012 at 8:01 AM, David Cournapeau <cournape@gmail.com>
> wrote:
>>
>> Hi Travis,
>>
>> On Thu, Feb 16, 2012 at 10:39 PM, Travis Oliphant <travis@continuum.io>
>> wrote:
>> > Mark Wiebe and I have been discussing off and on (as well as talking
>> > with Charles) a good way forward to balance two competing desires:
>> >
>> >        * addition of new features that are needed in NumPy
>> >        * improving the code-base generally and moving towards a more
>> > maintainable NumPy
>> >
>> > I know there are loud voices for just focusing on the second of these
>> > and avoiding the first until we have finished that.  I recognize the need to
>> > improve the code base, but I will also be pushing for improvements to the
>> > feature-set and user experience in the process.
>> >
>> > As a result, I am proposing a rough outline for releases over the next
>> > year:
>> >
>> >        * NumPy 1.7 to come out as soon as the serious bugs can be
>> > eliminated.  Bryan, Francesc, Mark, and I are able to help triage some of
>> > those.
>> >
>> >        * NumPy 1.8 to come out in July which will have as many
>> > ABI-compatible feature enhancements as we can add while improving test
>> > coverage and code cleanup.   I will post to this list more details of what
>> > we plan to address with it later.  Candidates for inclusion are:
>> >        * resolving the NA/missing-data issues
>> >        * finishing group-by
>> >        * incorporating the start of label arrays
>> >        * incorporating a meta-object
>> >        * a few new dtypes (variable-length string, variable-length
>> > unicode and an enum type)
>> >        * adding ufunc support for flexible dtypes and possibly
>> > structured arrays
>> >        * allowing generalized ufuncs to work on more kinds of arrays
>> > besides just contiguous
>> >        * improving the ability for NumPy to receive JIT-generated
>> > function pointers for ufuncs and other calculation opportunities
>> >        * adding "filters" to Input and Output
>> >        * simple computed fields for dtypes
>> >        * accepting a Data-Type specification as a class or JSON file
>> >        * work towards improving the dtype-addition mechanism
>> >        * re-factoring of code so that it can compile with a C++ compiler
>> > and be minimally dependent on Python data-structures.
>>
>> This is a pretty exciting list of features. What is the rationale for
>> code being compiled as C++ ? IMO, it will be difficult to do so
>> without preventing useful C constructs, and without removing some of
>> the existing features (like our use of C99 complex). The subset that
>> is both C and C++ compatible is quite constraining.
>>
>
> I'm in favor of this myself. C++ would allow a lot of code cleanup and make it
> easier to provide an extensible base, and I think it would be a natural fit with
> numpy. Of course, some C++ projects become tangled messes of inheritance,
> but I'd be very interested in seeing what a good C++ designer like Mark,
> intimately familiar with the numpy code base, could do. This opportunity
> might not come by again anytime soon and I think we should grab onto it. The
> initial step would be a release whose code would compile as both C and C++,
> which mostly comes down to removing C++ keywords like 'new'.

C++ will make integration with external environments much harder
(calling a C++ library from a non-C++ program is very hard, especially
for cross-platform projects), and I am not convinced by the
extensibility argument.

Making the numpy C code buildable by a C++ compiler is harder than
removing keywords.
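
To make that concrete, here is a small sketch of the kind of C99 code
that a C++ compiler will generally not accept. All names below
(demo_info, demo_replace, and so on) are hypothetical and written for
illustration only; none of this is actual NumPy code. Only the first
function is a keyword problem; the other three need real changes to
the code.

```c
#include <assert.h>
#include <complex.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not a NumPy structure. */
typedef struct {
    int ndim;
    int itemsize;
} demo_info;

/* 'new' and 'class' are ordinary identifiers in C but reserved
 * keywords in C++. This is the easy case: a rename fixes it. */
int demo_replace(int old, int new)
{
    int class = (new > old) ? new : old;
    return class;
}

/* C99 designated initializers may appear in any order. C++ only
 * gained a restricted form in C++20, which requires declaration order. */
demo_info demo_make_info(void)
{
    demo_info info = { .itemsize = 8, .ndim = 2 };
    return info;
}

/* malloc returns void *, which C converts implicitly to int *.
 * C++ requires an explicit cast at every such call site. */
int demo_sum(int n)
{
    assert(n > 0);
    int *buf = malloc((size_t)n * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    int total = 0;
    for (int i = 0; i < n; i++) {
        buf[i] = i + 1;      /* fill with 1..n */
        total += buf[i];
    }
    free(buf);
    return total;
}

/* C99 complex: 'double complex' is a built-in C type and is not the
 * same thing as the C++ std::complex<double> class template. */
double complex demo_add(double complex a, double complex b)
{
    return a + b;
}
```

This should compile as C99 (for example with "cc -std=c99 -c"), while
a C++ compiler stops at the first function. Fixing that one by
renaming still leaves the initializer, the malloc call, and the
complex type to deal with.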

> I did suggest running it by you for build issues, so please raise any you
> can think of. Note that MatPlotLib is in C++, so I don't think the problems
> are insurmountable. And choosing a set of compilers to support is something
> that will need to be done.

I don't know about matplotlib, but for scipy, quite a few issues were
caused by our C++ extensions in scipy.sparse. But build issues are
not a strong argument against C++ - I am sure those could be worked
out.

regards,

David