[Numpy-discussion] Proposed Roadmap Overview
Sat Feb 18 16:54:55 CST 2012
On Feb 18, 2012, at 4:03 PM, Matthew Brett wrote:
> On Sat, Feb 18, 2012 at 1:57 PM, Travis Oliphant <email@example.com> wrote:
>> The C/C++ discussion is just getting started. Everyone should keep in mind
>> that this is not something that is going to happen quickly. This will
>> be a point of discussion throughout the year. I'm not a huge supporter of
>> C++, but C++11 does look like it's made some nice progress, and as I think
>> about making a core-set of NumPy into a library that can be called by
>> multiple languages (and even multiple implementations of Python), tempered
>> C++ seems like it might be an appropriate way to go.
> Could you say more about this? Do you have any idea when the decision
> about C++ is likely to be made? At what point does it make most sense
> to make the argument for or against? Can you suggest a good way for
> us to be able to make more substantial arguments either way?
I think early arguments against are always appropriate --- if you believe they have a chance of swaying Mark or Chuck, who are the strongest supporters of C++ at this point. I will be quite nervous about going crazy with C++. It was suggested that I use C++ 7 years ago when I wrote NumPy. I didn't go that route then largely because of compiler issues and ABI concerns, and because I knew C better than C++, so I felt it would have taken me longer to do something in C++. I made the right decision for me. If you think my C-code is horrible, you would have been completely offended by whatever C++ I might have done at the time.
But I basically agree with Chuck that there is a lot of C-code in NumPy and template-based-code that is really trying to be C++ spelled differently.
The decision will not be made until NumPy 2.0 work is farther along. The most likely outcome is that Mark will develop something quite nice in C++, which he is already toying with, and we will either choose to use it in NumPy to build 2.0 on --- or not. I'm interested in sponsoring Mark and working as closely as I can with him and Chuck to see what emerges.
I'm reading very carefully any arguments against using C++ because I've actually pushed back on Mark pretty hard as we've discussed these things over the past months. I am nervous about corner use-cases that will be unpleasant for some groups and some platforms. But that vague nervousness is not enough to discount the clear benefits. I'm curious about the state of C++ compilers for Blue-Gene and other big-iron machines as well. My impression is that most of them use g++, which has pretty good support for C++. David and others raised some important concerns (merging multiple compilers seems like the biggest issue --- it already is...). If someone out there seriously opposes judicious and careful use of C++ and can show a clear reason why it would be harmful --- feel free to speak up at any time. We are leaning that way with Mark out in front of us leading the charge.
> Can you say a little more about your impression of the previous Cython
> refactor and why it was not successful?
Sure. This list actually deserves a long writeup about that. First, there wasn't a "Cython-refactor" of NumPy. There was a Cython-refactor of SciPy. I'm not sure of its current status. I'm still very supportive of that sort of thing. I don't know if Cython ever solved the "raising an exception in a Fortran-called call-back" issue. I used setjmp and longjmp in several places in SciPy originally in order to enable exceptions raised in a Python-callback that is wrapped in a C-function pointer and being handed to a Fortran-routine that asks for a function-pointer.
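A minimal sketch of the setjmp/longjmp pattern described above, in plain C. All names here are hypothetical and this is not SciPy's actual code: `integrate` stands in for a Fortran routine that only accepts a bare function pointer, and `thunk` stands in for the C wrapper around a Python callable, with an arbitrary condition (`x > 0.5`) playing the role of a raised Python exception.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf callback_env;     /* where to resume if the callback fails */
static int callback_failed = 0;  /* stands in for "a Python exception is set" */

/* Stand-in for a Fortran routine: it takes a plain function pointer and
   offers no channel for the callback to report an error. */
static double integrate(double (*f)(double), double a, double b, int n)
{
    double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += f(a + (i + 0.5) * h) * h;   /* midpoint rule */
    }
    return sum;
}

/* Stand-in for the C thunk that would call back into Python. */
static double thunk(double x)
{
    if (x > 0.5) {                 /* pretend the Python callable raised */
        callback_failed = 1;
        longjmp(callback_env, 1);  /* jump straight back past integrate() */
    }
    return x;
}

/* Wrapper: returns 0 on success, -1 if the callback "raised". */
int safe_integrate(double a, double b, int n, double *result)
{
    assert(result != NULL && n > 0);
    callback_failed = 0;
    if (setjmp(callback_env) != 0) {
        /* Reached via longjmp: integrate() was abandoned mid-loop. */
        return -1;
    }
    *result = integrate(thunk, a, b, n);
    return 0;
}
```

Calling `safe_integrate(0.0, 0.5, 10, &r)` completes normally, while `safe_integrate(0.0, 1.0, 10, &r)` returns -1 as soon as the callback sees a point above 0.5. In a real extension module the wrapper would return NULL with the Python exception still set; jumping over real Fortran frames also skips any cleanup that routine would have done, which is part of why the technique is delicate.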
What happened in NumPy was that the code was re-factored to become a library. I don't think much NumPy code actually ended up in Cython (the random-number generators have been in Cython from the beginning).
The biggest problem with merging the code was that Mark Wiebe got active at about that same time :-) He ended up changing several things in the code-base that made it difficult to merge in the changes. Some of the bug-fixes, memory-leak patches, and tests did get into the code-base, but the essential creation of the NumPy library did not make it. There was some very good work done that I hope we can still take advantage of.
Another factor: the decision to make an extra layer of indirection makes small arrays that much slower. I agree with Mark that in a core library we need to go the other way, with small arrays being completely allocated in the data-structure itself (reducing the number of pointer de-references).
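One way to picture "small arrays allocated in the data-structure itself" is an inline buffer inside the array struct, with the heap used only above some size. The sketch below is an illustration under assumptions, not NumPy's layout or Mark's design: the `small_array` type, its functions, and the `INLINE_CAPACITY` threshold of 8 are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_CAPACITY 8   /* assumed threshold, chosen for illustration */

typedef struct {
    size_t  size;                         /* number of elements */
    double *data;                         /* points at inline_buf or the heap */
    double  inline_buf[INLINE_CAPACITY];  /* storage for small arrays */
} small_array;

/* Initialise a zero-filled array; returns 0 on success, -1 on failure. */
int small_array_init(small_array *a, size_t size)
{
    assert(a != NULL);
    a->size = size;
    if (size <= INLINE_CAPACITY) {
        a->data = a->inline_buf;          /* no separate allocation */
    } else {
        a->data = malloc(size * sizeof *a->data);
        if (a->data == NULL) {
            return -1;
        }
    }
    memset(a->data, 0, size * sizeof *a->data);
    return 0;
}

/* True when the elements live inside the struct itself. */
int small_array_is_inline(const small_array *a)
{
    return a->data == a->inline_buf;
}

/* Release heap storage if any was used. */
void small_array_free(small_array *a)
{
    if (a->data != a->inline_buf) {
        free(a->data);
    }
    a->data = NULL;
    a->size = 0;
}
```

With this layout a 4-element array needs no `malloc` and its elements sit next to the header in memory, while a 100-element array falls back to the heap. The trade-off is that the struct can no longer be copied with a plain assignment, because `data` may point into the struct being copied from.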
So, Cython did not play a major role on the NumPy side of things. It played a very nice role on the SciPy side of things.
> Thanks a lot,