[Numpy-discussion] Proposed Roadmap Overview

Mark Wiebe mwwiebe@gmail....
Fri Feb 17 11:57:57 CST 2012


On Fri, Feb 17, 2012 at 10:27 AM, David Cournapeau <cournape@gmail.com> wrote:

> On Fri, Feb 17, 2012 at 3:39 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > On Fri, Feb 17, 2012 at 8:01 AM, David Cournapeau <cournape@gmail.com>
> > wrote:
> >>
> >> Hi Travis,
> >>
> >> On Thu, Feb 16, 2012 at 10:39 PM, Travis Oliphant <travis@continuum.io>
> >> wrote:
> >> > Mark Wiebe and I have been discussing off and on (as well as talking
> >> > with Charles) a good way forward to balance two competing desires:
> >> >
> >> >        * addition of new features that are needed in NumPy
> >> >        * improving the code-base generally and moving towards a more
> >> > maintainable NumPy
> >> >
> >> > I know there are loud voices for just focusing on the second of these
> >> > and avoiding the first until we have finished that.  I recognize the
> >> > need to improve the code base, but I will also be pushing for
> >> > improvements to the feature-set and user experience in the process.
> >> >
> >> > As a result, I am proposing a rough outline for releases over the next
> >> > year:
> >> >
> >> >        * NumPy 1.7 to come out as soon as the serious bugs can be
> >> > eliminated.  Bryan, Francesc, Mark, and I are able to help triage
> >> > some of those.
> >> >
> >> >        * NumPy 1.8 to come out in July which will have as many
> >> > ABI-compatible feature enhancements as we can add while improving test
> >> > coverage and code cleanup.   I will post to this list more details of
> >> > what we plan to address with it later.    Candidates for inclusion are:
> >> >        * resolving the NA/missing-data issues
> >> >        * finishing group-by
> >> >        * incorporating the start of label arrays
> >> >        * incorporating a meta-object
> >> >        * a few new dtypes (variable-length string, variable-length
> >> > unicode and an enum type)
> >> >        * adding ufunc support for flexible dtypes and possibly
> >> > structured arrays
> >> >        * allowing generalized ufuncs to work on more kinds of arrays
> >> > besides just contiguous
> >> >        * improving the ability for NumPy to receive JIT-generated
> >> > function pointers for ufuncs and other calculation opportunities
> >> >        * adding "filters" to Input and Output
> >> >        * simple computed fields for dtypes
> >> >        * accepting a Data-Type specification as a class or JSON file
> >> >        * work towards improving the dtype-addition mechanism
> >> >        * re-factoring of code so that it can compile with a C++
> >> > compiler and be minimally dependent on Python data-structures.
> >>
> >> This is a pretty exciting list of features. What is the rationale for
> >> code being compiled as C++ ? IMO, it will be difficult to do so
> >> without preventing useful C constructs, and without removing some of
> >> the existing features (like our use of C99 complex). The subset that
> >> is both C and C++ compatible is quite constraining.
> >>
> >
> > I'm in favor of this myself, C++ would allow a lot of code cleanup and
> > make it easier to provide an extensible base, I think it would be a
> > natural fit with numpy. Of course, some C++ projects become tangled
> > messes of inheritance, but I'd be very interested in seeing what a good
> > C++ designer like Mark, intimately familiar with the numpy code base,
> > could do. This opportunity might not come by again anytime soon and I
> > think we should grab onto it. The initial step would be a release whose
> > code would compile as both C and C++, which mostly comes down to no
> > longer using C++ keywords like 'new' as identifiers.
>
> C++ will make integration with external environments much harder
> (calling a C++ library from a non C++ program is very hard, especially
> for cross-platform projects), and I am not convinced by the more
> extensible argument.
>

The whole of NumPy could be written utilizing C++ extensively while still
using exactly the same API and ABI numpy has now. C++ does not force
anything about API/ABI design decisions.
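
As a minimal sketch of what I mean (the names `example_sum_doubles` and
`detail::Accumulator` are hypothetical, made up for illustration, and are not
part of numpy): the implementation can use classes and standard containers
freely, while the exported symbol keeps C linkage and a plain C signature.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Internal C++ implementation detail; never visible to C callers.
namespace detail {
class Accumulator {
public:
    void add(const double *data, std::size_t n) {
        values_.insert(values_.end(), data, data + n);
    }
    double sum() const {
        return std::accumulate(values_.begin(), values_.end(), 0.0);
    }
private:
    std::vector<double> values_;
};
}  // namespace detail

// Hypothetical public entry point. extern "C" gives it an unmangled symbol
// name, and the signature uses only C types, so a C caller sees an ordinary
// C function. Returns 0 on success and -1 on error.
extern "C" int example_sum_doubles(const double *data, std::size_t n,
                                   double *out) {
    if (out == nullptr || (data == nullptr && n != 0)) {
        return -1;  // report errors with a status code, C style
    }
    try {
        detail::Accumulator acc;
        acc.add(data, n);
        *out = acc.sum();
        return 0;
    } catch (...) {
        return -1;  // no C++ exception may cross the C boundary
    }
}
```

A C program would declare `int example_sum_doubles(const double *, size_t,
double *);` in a header and call it like any other C function. The catch-all
handler is the one piece of discipline the boundary really demands.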

A good document on how a major open source project transitioned from C to
C++ is the one from gcc. Their points comparing C and C++ apply to numpy
quite well, and being compiler authors, they're intimately familiar with ABI
and performance issues:

http://gcc.gnu.org/wiki/gcc-in-cxx#The_gcc-in-cxx_branch

> Making the numpy C code buildable by a C++ compiler is harder than
> removing keywords.


Certainly, but it's not a difficult task for someone who's familiar with
both C and C++.
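
To give a flavour of the mechanical part, here is a small sketch written in
the common subset of C and C++ (the `example_buffer` type and its functions
are hypothetical, not numpy code). The original C might have read
`example_buffer *new = malloc(sizeof *new);`, which a C++ compiler rejects
twice: `new` is a keyword, and `void *` does not convert implicitly to another
pointer type. It covers only those two issues; things like C99 complex need
separate treatment.

```cpp
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type for illustration only. */
typedef struct example_buffer {
    double *data;
    size_t length;
} example_buffer;

/* Allocates a zero-filled buffer; returns NULL on allocation failure. */
example_buffer *example_buffer_create(size_t length)
{
    /* Renamed from `new`, with the explicit cast that C++ requires. */
    example_buffer *buf = (example_buffer *)malloc(sizeof *buf);
    if (buf == NULL) {
        return NULL;
    }
    buf->data = (double *)calloc(length, sizeof *buf->data);
    /* calloc may legitimately return NULL when length is 0. */
    if (buf->data == NULL && length != 0) {
        free(buf);
        return NULL;
    }
    buf->length = length;
    return buf;
}

/* Releases a buffer created by example_buffer_create; NULL is allowed. */
void example_buffer_destroy(example_buffer *buf)
{
    if (buf != NULL) {
        free(buf->data);
        free(buf);
    }
}
```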


> > I did suggest running it by you for build issues, so please raise any you
> > can think of. Note that MatPlotLib is in C++, so I don't think the
> > problems are insurmountable. And choosing a set of compilers to support
> > is something that will need to be done.
>
> I don't know about matplotlib, but for scipy, quite a few issues were
> caused by our C++ extensions in scipy.sparse. But build issues are
> not a strong argument against C++ - I am sure those could be worked
> out.
>

On this topic, I'd like to ask what it would take to change the default
warning levels in all the build configurations? Building with no warnings
under high warning levels is a pretty standard practice as a basic
mechanism for catching some classes of bugs, and it would be nice for
numpy to do this. The only way this is reasonable, though, is if it's the
default in the build system.
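
As one illustration of the class of bug I have in mind (the function
`below_size` is a hypothetical example, not numpy code): a mixed
signed/unsigned comparison compiles silently at low warning levels, while GCC
reports it under -Wsign-compare, which -Wall enables for C++ and -Wextra
enables for C. The sketch below shows the corrected form, with the buggy
one-liner described in the comment.

```cpp
#include <cstddef>
#include <vector>

// Returns true when `threshold` is strictly less than the number of elements.
//
// The tempting one-liner `return threshold < values.size();` mixes int with
// std::size_t: a negative threshold is converted to a huge unsigned value, so
// the function would wrongly return false. That comparison is what
// -Wsign-compare points at.
bool below_size(int threshold, const std::vector<double> &values) {
    if (threshold < 0) {
        return true;  // every negative number is below any size
    }
    return static_cast<std::size_t>(threshold) < values.size();
}
```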

Thanks,
Mark


> regards,
>
> David
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>