[Numpy-discussion] How a transition to C++ could work

Mark Wiebe mwwiebe@gmail....
Sun Feb 19 03:19:05 CST 2012

On Sun, Feb 19, 2012 at 2:56 AM, David Cournapeau <cournape@gmail.com> wrote:

> Hi Mark,
> thank you for joining this discussion.
> On Sun, Feb 19, 2012 at 7:18 AM, Mark Wiebe <mwwiebe@gmail.com> wrote:
> > The suggestion of transitioning the NumPy core code from C to C++ has
> > sparked a vigorous debate, and I thought I'd start a new thread to give my
> > perspective on some of the issues raised, and describe how such a
> > transition could occur.
> >
> > First, I'd like to reiterate the gcc rationale for their choice to switch:
> > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
> >
> > In particular, these points deserve emphasis:
> >
> > The C subset of C++ is just as efficient as C.
> > C++ supports cleaner code in several significant cases.
> > C++ makes it easier to write cleaner interfaces by making it harder to
> > break interface boundaries.
> > C++ never requires uglier code.
> I think those arguments will not be very useful: they are subjective,
> and unlikely to convince people who prefer C to C++.

They are arguments from a team that implements both a C and a C++ compiler.
In the spectrum of possible authorities on the matter, they rate about as
high as I can imagine.

> >
> > There are concerns about ABI/API interoperability and interactions with
> > C++ exceptions. I've dealt with these types of issues on enough platforms
> > to know that while they're important, they're a lot easier to handle than
> > the issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been
> > that providing a C API from a C++ library is no harder than providing a C
> > API from a C library.
> This needs more details. I have some experience in both areas as well,
> and mine is quite different. Reiterating a few examples that worry me:
>  - how can you ensure that exceptions happening in C++ will never
> cross different .so/.dll ?

Keeping exceptions from crossing the library boundary is a necessary part of
providing a C API, and is included as a requirement of doing that. All C++
libraries which expose a C API deal with this.
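
A minimal sketch of such a boundary, assuming a made-up function demo_mean
and made-up status codes (none of these names come from NumPy): the C++
implementation is free to throw, and the extern "C" entry point catches
everything and converts it to a return code.

```cpp
#include <cstddef>
#include <exception>
#include <new>
#include <stdexcept>

// Hypothetical status codes for a C-facing API (not NumPy's).
enum demo_status {
    DEMO_OK = 0,
    DEMO_ERR_MEMORY = -1,
    DEMO_ERR_VALUE = -2,
    DEMO_ERR_UNKNOWN = -3
};

// Internal C++ implementation; it may throw.
static double mean_impl(const double *data, std::size_t n) {
    if (n == 0) {
        throw std::invalid_argument("empty input");
    }
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total / static_cast<double>(n);
}

// C entry point: no exception is allowed to leave this function.
extern "C" int demo_mean(const double *data, std::size_t n, double *out) noexcept {
    try {
        *out = mean_impl(data, n);
        return DEMO_OK;
    } catch (const std::bad_alloc &) {
        return DEMO_ERR_MEMORY;
    } catch (const std::invalid_argument &) {
        return DEMO_ERR_VALUE;
    } catch (...) {
        return DEMO_ERR_UNKNOWN;
    }
}
```

In a CPython extension the catch blocks would presumably set a Python error
(for example with PyErr_NoMemory) before returning, rather than only
returning a code.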

> How can one make sure C++ extensions built
> by different compilers can work ?

This is no different from the situation in C. Already in C on Windows, one
can't build NumPy with a different version of Visual C++ than the one used
to build CPython.

> Is not using exceptions like it is
> done in zeromq acceptable ? (would be nice to find out more about the
> decisions made by the zeromq team about their usage of C++).

I prefer to use exceptions in C++, but some major projects have decided to
disable them. LLVM/Clang is the most notable example. My experience working
with high-performance graphics code has been that appropriate use of
exceptions (i.e. not using them for control flow) does not pose a problem.

> I cannot find a recent example, but I have seen errors similar to
> this (http://software.intel.com/en-us/forums/showthread.php?t=42940)
> quite a few times.

This kind of thing would happen when using 'new' to allocate memory, and
with the compiler setting enabled to raise bad_alloc on such allocation
failures (the default for most compilers nowadays). If exception handling
is disabled in the compiler, new will return NULL instead. Unless the
compiler has a bizarre issue, catching either std::exception or
std::bad_alloc specifically within NumPy should be sufficient to deal with
it. Also note that the possibility of something like this will only arise
once more advanced C++ features are being adopted.
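
As a rough illustration, assuming hypothetical helper names, catching
std::bad_alloc at the point of allocation could look like the first function
below. The last function uses new (std::nothrow), a standard alternative that
returns a null pointer on failure without disabling exceptions in the
compiler; it is a different mechanism from the compiler setting mentioned
above.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: allocate `count` doubles and report failure through
// the return value instead of letting std::bad_alloc escape.
extern "C" int demo_alloc_doubles(std::size_t count, double **out) noexcept {
    try {
        *out = new double[count];  // throws std::bad_alloc (or a subclass) on failure
        return 0;
    } catch (const std::bad_alloc &) {
        *out = nullptr;
        return -1;
    }
}

// Memory from either allocator below or above is released with delete[].
extern "C" void demo_free_doubles(double *p) noexcept {
    delete[] p;
}

// Non-throwing form: returns a null pointer when the allocation fails.
extern "C" double *demo_alloc_doubles_nothrow(std::size_t count) noexcept {
    return new (std::nothrow) double[count];
}
```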

>  - how can you expose in C some heavily-using C++ features ?

If the advantages of those C++ features depend on the C++ language, you
have to map them to a limited subset of the feature in C. For example, if a
feature is based on a C++ template, you can instantiate specific instances
of the template for all the types you want to support from C.
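
A small sketch of that mapping, with invented names (sum_impl and the
demo_sum_* functions are illustrative only): the template stays private to
the C++ side, and one extern "C" wrapper per supported element type is what C
callers see.

```cpp
#include <cstddef>
#include <cstdint>

// Generic C++ implementation, not visible to C callers.
template <typename T>
static T sum_impl(const T *data, std::size_t n) {
    T total = T(0);
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// One C-callable instantiation per element type supported from C.
extern "C" std::int32_t demo_sum_int32(const std::int32_t *data, std::size_t n) {
    return sum_impl<std::int32_t>(data, n);
}

extern "C" std::int64_t demo_sum_int64(const std::int64_t *data, std::size_t n) {
    return sum_impl<std::int64_t>(data, n);
}

extern "C" double demo_sum_double(const double *data, std::size_t n) {
    return sum_impl<double>(data, n);
}
```

The C header would declare only the three demo_sum_* functions; adding a new
element type means adding one more wrapper.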

> I would
> expect you would like to use templates for iterators in numpy - how
> can you make them available to 3rd party extensions without requiring
> C++ ?

Yes, something like the nditer is a good example. From C, it would have to
retain an API in the current style, but C++ users could gain an
easier-to-use variant.
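
To make the idea concrete, here is a toy sketch. It is not the nditer API:
DemoIter, the demo_iter_* functions, and the Iter class are all hypothetical.
The C-style interface keeps a create/next/free shape, and a thin C++ class
layered on top of it handles cleanup automatically.

```cpp
#include <cstddef>
#include <new>

// ---- C-style API (hypothetical toy iterator over a flat buffer) ----
struct DemoIter {
    const double *data;
    std::size_t size;
    std::size_t index;
};

// Returns a null pointer if the iterator cannot be allocated.
extern "C" DemoIter *demo_iter_new(const double *data, std::size_t size) {
    return new (std::nothrow) DemoIter{data, size, 0};
}

// Stores the next value and returns 1, or returns 0 when exhausted.
extern "C" int demo_iter_next(DemoIter *it, double *value) {
    if (it->index >= it->size) {
        return 0;
    }
    *value = it->data[it->index++];
    return 1;
}

extern "C" void demo_iter_free(DemoIter *it) {
    delete it;
}

// ---- C++ convenience wrapper: owns the iterator and frees it on scope exit ----
class Iter {
public:
    Iter(const double *data, std::size_t size) : it_(demo_iter_new(data, size)) {
        if (!it_) {
            throw std::bad_alloc();
        }
    }
    ~Iter() { demo_iter_free(it_); }
    Iter(const Iter &) = delete;
    Iter &operator=(const Iter &) = delete;

    bool next(double &value) { return demo_iter_next(it_, &value) != 0; }

private:
    DemoIter *it_;
};

// Example C++ caller: no explicit free, even if the loop body were to throw.
double sum_with_wrapper(const double *data, std::size_t size) {
    Iter it(data, size);
    double total = 0.0;
    double v = 0.0;
    while (it.next(v)) {
        total += v;
    }
    return total;
}
```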

> >
> > It's worth comparing the possibility of C++ versus the possibility of
> > other languages, and the ones that have been suggested for consideration
> > are D, Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target
> > language has to interact naturally with the CPython API. It needs to
> > provide direct access to all the various sizes of signed int, unsigned
> > int, and float. It needs to have mature compiler support wherever we want
> > to deploy NumPy. Taken together, these requirements eliminate a majority
> > of these possibilities. From these criteria, the only languages which seem
> > to have a clear possibility for the implementation of NumPy are C, C++,
> > and D. For D, I suspect the tooling is not mature enough, but I'm not 100%
> > certain of that.
> While I agree that no other language is realistic, staying in C has
> the nice advantage that we can more easily use one of them if they
> mature (rust/D - go, rpython, C#/java can be dismissed for fundamental
> technical reasons right away). This is not a very strong argument
> against using C++, obviously.

To provide a counterpoint to this argument, switching to C++ could actually
make a transition to another language easier. C++ classes and templates map
to equivalent features in D quite naturally, to provide a specific example.

> >
> > 1) Immediately after branching for 1.7, we minimally patch all the .c
> > files so that they can build with a C++ compiler and with a C compiler at
> > the same time. Then we rename all .c -> .cpp, and update the build systems
> > for C++.
> > 2) During the 1.8 development cycle, we heavily restrict C++ feature
> > usage. But, where a feature implementation would be arguably easier and
> > less error-prone with C++, we allow it. This is a period for learning
> > about C++ and how it can benefit NumPy.
> > 3) After the 1.8 release, the community will have developed more
> > experience with C++, and will be in a better position to discuss a way
> > forward.
> A step that would be useful sooner rather than later is one where
> numpy has been split into smaller extensions (instead of
> multiarray/ufunc, essentially). This would help avoid recompilation
> of lots of code for any small change. It is already quite painful with
> C, but with C++, it will be unbearable. This can be done in C, and
> would be useful whether the decision to move to C++ is accepted or
> not.

I'm pretty confident that the current code will compile in C++ in nearly
identical time to C. Having a properly working incremental build system
would be a nice step to take numpy builds out of the dark ages, though.
Your tireless efforts to make this happen are appreciated!
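
For a sense of what the minimal patching in step 1 might involve, here is a
sketch of a source file written to build unchanged with both a C compiler and
a C++ compiler. The demo_buffer type and its functions are hypothetical and
are not taken from the NumPy sources. Typical changes of this kind are adding
explicit casts on malloc results, guarding declarations with extern "C", and
renaming identifiers that collide with C++ keywords.

```cpp
/* Builds unchanged as C and as C++. All names here are hypothetical. */
#include <stdlib.h>

#ifdef __cplusplus
extern "C" {
#endif

typedef struct {
    double *values;
    size_t length;
} demo_buffer;

/* Returns 0 on success and -1 if the allocation fails. */
int demo_buffer_init(demo_buffer *buf, size_t length, double fill_value)
{
    size_t i;

    /* C converts void* to double* implicitly; C++ requires the cast. */
    buf->values = (double *)malloc(length * sizeof(double));
    if (buf->values == NULL && length != 0) {
        buf->length = 0;
        return -1;
    }
    for (i = 0; i < length; ++i) {
        buf->values[i] = fill_value;
    }
    buf->length = length;
    return 0;
}

void demo_buffer_clear(demo_buffer *buf)
{
    free(buf->values);
    buf->values = NULL;
    buf->length = 0;
}

#ifdef __cplusplus
}
#endif
```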


> cheers,
> David