[Numpy-discussion] How a transition to C++ could work
Sun Feb 19 03:19:05 CST 2012
On Sun, Feb 19, 2012 at 2:56 AM, David Cournapeau <firstname.lastname@example.org> wrote:
> Hi Mark,
> thank you for joining this discussion.
> On Sun, Feb 19, 2012 at 7:18 AM, Mark Wiebe <email@example.com> wrote:
> > The suggestion of transitioning the NumPy core code from C to C++ has
> > sparked a vigorous debate, and I thought I'd start a new thread to give
> > perspective on some of the issues raised, and describe how such a transition
> > could occur.
> > First, I'd like to reiterate the gcc rationale for their choice to
> > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale
> > In particular, these points deserve emphasis:
> > The C subset of C++ is just as efficient as C.
> > C++ supports cleaner code in several significant cases.
> > C++ makes it easier to write cleaner interfaces by making it harder to break
> > interface boundaries.
> > C++ never requires uglier code.
> I think those arguments will not be very useful: they are subjective,
> and unlikely to convince people who prefer C to C++.
They are arguments from a team that implements both a C and a C++ compiler.
In the spectrum of possible authorities on the matter, they rate about as
high as I can imagine.
> > There are concerns about ABI/API interoperability and interactions with
> > exceptions. I've dealt with these types of issues on enough platforms to
> > know that while they're important, they're a lot easier to handle than
> > issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been that
> > providing a C API from a C++ library is no harder than providing a C API
> > from a C library.
> This needs more details. I have some experience in both areas as well,
> and mine is quite different. Reiterating a few examples that worry me:
> - how can you ensure that exceptions happening in C++ will never
> cross different .so/.dll ?
This is a necessary part of providing a C API, and is included as a
requirement of doing that. All C++ libraries which expose a C API deal with
> How can one make sure C++ extensions built
> by different compilers can work ?
This is no different from the situation in C. Already in C on Windows, one
can't build NumPy with a different version of Visual C++ than the one used
to build CPython.
> Is not using exceptions like it is
> done in zeromq acceptable ? (would be nice to find out more about the
> decisions made by the zeromq team about their usage of C++).
I prefer to use exceptions in C++, but some major projects have decided to
disable them. LLVM/Clang is the most notable example. My experience working
with high-performance graphics code has been that appropriate use of
exceptions (i.e. not doing something like using them for control flow) does
not pose a problem.
> find a recent example, but I have seen errors similar to
> quite a few times.
This kind of thing would happen when using 'new' to allocate memory, and
with the compiler setting enabled to raise bad_alloc on such allocation
failures (the default for most compilers nowadays). If exception handling
is disabled in the compiler, new will return NULL instead. Unless the
compiler has a bizarre issue, catching either std::exception or
std::bad_alloc specifically within NumPy should be sufficient to deal with
it. Also note that the possibility of something like this will only arise
once more advanced C++ features are being adopted.
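As a minimal sketch of what catching at the boundary could look like: the function name `example_sum_of_squares` and the error codes below are hypothetical, made up for illustration, and are not part of the NumPy C API. The idea is only that every `extern "C"` entry point wraps its C++ body in a try/catch and turns exceptions into return codes, so nothing propagates across the .so/.dll boundary.

```cpp
#include <cstddef>
#include <exception>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical error codes for illustration only.
enum { EXAMPLE_OK = 0, EXAMPLE_NO_MEMORY = -1, EXAMPLE_FAILURE = -2 };

// Internal C++ code: free to throw. std::vector allocates through
// operator new, so it may raise std::bad_alloc.
static double sum_of_squares_impl(const double* data, std::size_t n) {
    if (data == NULL && n > 0) {
        throw std::invalid_argument("null input with nonzero length");
    }
    std::vector<double> squares(data, data + n);
    double total = 0.0;
    for (std::size_t i = 0; i < squares.size(); ++i) {
        total += squares[i] * squares[i];
    }
    return total;
}

// C entry point: no exception is allowed to escape this function.
extern "C" int example_sum_of_squares(const double* data, std::size_t n,
                                      double* out) {
    try {
        *out = sum_of_squares_impl(data, n);
        return EXAMPLE_OK;
    } catch (const std::bad_alloc&) {
        return EXAMPLE_NO_MEMORY;   // allocation failure
    } catch (const std::exception&) {
        return EXAMPLE_FAILURE;     // any other standard exception
    } catch (...) {
        return EXAMPLE_FAILURE;     // anything else, just in case
    }
}
```

A C caller only ever sees the integer return value, in the same way it would with an error-returning function written in C.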
> - how can you expose in C some features that make heavy use of C++ ?
If the advantages of those C++ features depend on the C++ language, you
have to map them to a limited subset of the feature in C. For example, if a
feature is based on a C++ template, you can instantiate specific instances
of the template for all the types you want to support from C.
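To illustrate the template case with a rough sketch: the names `sum_values`, `example_sum_double` and `example_sum_int32` are invented for this example and do not exist in NumPy. The template stays internal to the C++ code, and one plain function per supported type is exported with C linkage.

```cpp
#include <cstddef>
#include <stdint.h>

// Internal C++ template: written once, usable for any arithmetic type.
template <typename T>
static T sum_values(const T* data, std::size_t n) {
    T total = T();
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// C-visible wrappers: one explicit instantiation per type we choose
// to support from C. A C header would declare only these functions.
extern "C" double example_sum_double(const double* data, std::size_t n) {
    return sum_values<double>(data, n);
}

extern "C" int32_t example_sum_int32(const int32_t* data, std::size_t n) {
    return sum_values<int32_t>(data, n);
}
```

C callers get a fixed set of functions, while C++ callers inside the library can use the template directly for any type.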
> I would
> expect you would like to use templates for iterators in numpy - you
> can you make them available to 3rd party extensions without requiring
Yes, something like the nditer is a good example. From C, it would have to
retain an API in the current style, but C++ users could gain an
> > It's worth comparing the possibility of C++ versus the possibility of
> > languages, and the ones that have been suggested for consideration are D,
> > Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target language
> > has to interact naturally with the CPython API. It needs to provide
> > access to all the various sizes of signed int, unsigned int, and float. It
> > needs to have mature compiler support wherever we want to deploy NumPy.
> > Taken together, these requirements eliminate a majority of these
> > possibilities. From these criteria, the only languages which seem to have a
> > clear possibility for the implementation of NumPy are C, C++, and D. For D,
> > I suspect the tooling is not mature enough, but I'm not 100% certain of
> > that.
> While I agree that no other language is realistic, staying in C has
> the nice advantage that we can more easily use one of them if they
> mature (rust/D - go, rpython, C#/java can be dismissed for fundamental
> technical reasons right away). This is not a very strong argument
> against using C++, obviously.
To provide a counterpoint to this argument, switching to C++ could actually
make a transition to another language easier. C++ classes and templates map
to equivalent features in D quite naturally, to provide a specific example.
> > 1) Immediately after branching for 1.7, we minimally patch all the .c files
> > so that they can build with a C++ compiler and with a C compiler at the same
> > time. Then we rename all .c -> .cpp, and update the build systems for
> > 2) During the 1.8 development cycle, we heavily restrict C++ feature usage.
> > But, where a feature implementation would be arguably easier and less
> > error-prone with C++, we allow it. This is a period for learning about C++
> > and how it can benefit NumPy.
> > 3) After the 1.8 release, the community will have developed more experience
> > with C++, and will be in a better position to discuss a way forward.
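For step 1, the kind of minimal patching involved might look like the sketch below. The helper `example_copy_doubles` is hypothetical and not taken from the NumPy sources. The file is valid as both C and C++: the linkage guard keeps the exported symbol unmangled, and the explicit cast on the result of malloc is needed because C++ does not convert `void *` implicitly. Identifiers that are C++ keywords (`new`, `class`, `template`, and so on) would also need renaming.

```cpp
#include <stdlib.h>
#include <string.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical helper: returns a heap copy of n doubles, or NULL if
 * the allocation fails. The caller releases the result with free(). */
double *example_copy_doubles(const double *src, size_t n)
{
    /* C accepts the void* from malloc without a cast; C++ requires it. */
    double *dst = (double *)malloc(n * sizeof(double));
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, n * sizeof(double));
    return dst;
}

#ifdef __cplusplus
}
#endif
```

Nothing here uses a C++-only feature, so the same source can keep building with a C compiler until the rename to .cpp happens.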
> A step that would be useful sooner rather than later is one where
> numpy has been split into smaller extensions (instead of
> multiarray/ufunc, essentially). This would help avoiding recompilation
> of lots of code for any small change. It is already quite painful with
> C, but with C++, it will be unbearable. This can be done in C, and
> would be useful whether the decision to move to C++ is accepted or not.
I'm pretty confident that the current code will compile in C++ in nearly
identical time to C. Having a properly working incremental build system
would be a nice step to take numpy builds out of the dark ages, though.
Your tireless efforts to make this happen are appreciated!