[Numpy-discussion] Proposed Roadmap Overview

Travis Oliphant travis@continuum...
Sat Feb 18 16:54:55 CST 2012


On Feb 18, 2012, at 4:03 PM, Matthew Brett wrote:

> Hi,
> 
> On Sat, Feb 18, 2012 at 1:57 PM, Travis Oliphant <travis@continuum.io> wrote:
>> The C/C++ discussion is just getting started.  Everyone should keep in mind
>> that this is not something that is going to happen quickly.   This will
>> be a point of discussion throughout the year.    I'm not a huge supporter of
>> C++, but C++11 does look like it's made some nice progress, and as I think
>> about making a core-set of NumPy into a library that can be called by
>> multiple languages (and even multiple implementations of Python), tempered
>> C++ seems like it might be an appropriate way to go.
> 
> Could you say more about this?  Do you have any idea when the decision
> about C++ is likely to be made?  At what point does it make most sense
> to make the argument for or against?  Can you suggest a good way for
> us to be able to make more substantial arguments either way?

I think early arguments against are always appropriate --- if you believe they have a chance of swaying Mark or Chuck, who are the strongest supporters of C++ at this point. I will be quite nervous about going crazy with C++. It was suggested that I use C++ 7 years ago when I wrote NumPy. I didn't go that route then, largely because of compiler issues and ABI concerns, and because I knew C better than C++, so I felt it would have taken me longer to do something in C++. I made the right decision for me. If you think my C code is horrible, you would have been completely offended by whatever C++ I might have done at the time.

But I basically agree with Chuck that there is a lot of C-code in NumPy and template-based-code that is really trying to be C++ spelled differently. 

The decision will not be made until NumPy 2.0 work is farther along. The most likely outcome is that Mark will develop something quite nice in C++, which he is already toying with, and we will either choose to use it in NumPy to build 2.0 on --- or not. I'm interested in sponsoring Mark and working as closely as I can with him and Chuck to see what emerges.

I'm reading any arguments against using C++ very carefully, because I've actually pushed back on Mark pretty hard as we've discussed these things over the past months. I am nervous about corner use-cases that will be unpleasant for some groups and some platforms. But that vague nervousness is not enough to discount the clear benefits. I'm curious about the state of C++ compilers for Blue-Gene and other big-iron machines as well. My impression is that most of them use g++, which has pretty good support for C++. David and others raised some important concerns (merging multiple compilers seems like the biggest issue --- it already is...). If someone out there seriously opposes judicious and careful use of C++ and can show a clear reason why it would be harmful --- feel free to speak up at any time. We are leaning that way, with Mark out in front of us leading the charge.

> 
> Can you say a little more about your impression of the previous Cython
> refactor and why it was not successful?
> 

Sure. This list actually deserves a long writeup about that. First, there wasn't a "Cython-refactor" of NumPy. There was a Cython-refactor of SciPy. I'm not sure of its current status. I'm still very supportive of that sort of thing. I don't know if Cython ever solved the "raising an exception in a Fortran-called call-back" issue. I originally used setjmp and longjmp in several places in SciPy so that an exception raised in a Python callback could propagate when that callback is wrapped in a C function pointer and handed to a Fortran routine that asks for a function pointer.
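
For readers who have not seen the setjmp/longjmp trick, here is a minimal sketch of the general pattern in plain C. It is not SciPy's actual code: every name below (integrate_stub, thunk, call_with_callback) is hypothetical, the "Fortran routine" is a C stand-in, and the "Python exception" is simulated by a simple condition.

```c
#include <setjmp.h>

/* Hypothetical names throughout; this is not SciPy's actual code. */

/* Where to resume if the callback fails. */
static jmp_buf callback_env;

/* Stand-in for a Fortran routine that accepts a function pointer and
   has no channel for reporting an error raised inside the callback. */
static double integrate_stub(double (*f)(double), int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += f((double)i);
    }
    return total;
}

/* C thunk handed to the routine. A real wrapper would call the Python
   callable here and check whether an exception is pending. */
static double thunk(double x)
{
    if (x >= 3.0) {
        /* Pretend the Python callback raised: jump straight back to
           the setjmp point, skipping the rest of integrate_stub. */
        longjmp(callback_env, 1);
    }
    return x;
}

/* Returns 0 on success and stores the sum in *result;
   returns -1 if the callback "raised". */
int call_with_callback(int n, double *result)
{
    if (setjmp(callback_env) != 0) {
        /* We arrive here via longjmp from thunk. */
        return -1;
    }
    *result = integrate_stub(thunk, n);
    return 0;
}
```

With these stand-ins, call_with_callback(3, &r) completes normally, while call_with_callback(5, &r) returns -1 because the thunk jumps out on its fourth call. The jump skips any cleanup in the frames it passes over, which is part of why this approach needs care.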

What happened in NumPy was that the code was refactored to become a library. I don't think much NumPy code actually ended up in Cython (the random-number generators have been in Cython from the beginning).


The biggest problem with merging the code was that Mark Wiebe got active at about that same time :-) He ended up changing several things in the code-base that made it difficult to merge in the changes. Some of the bug fixes, memory-leak patches, and tests did get into the code-base, but the essential creation of the NumPy library did not make it. There was some very good work done that I hope we can still take advantage of.

Another factor: the decision to add an extra layer of indirection makes small arrays that much slower. I agree with Mark that in a core library we need to go the other way, with small arrays being completely allocated in the data-structure itself (reducing the number of pointer de-references).
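
To make the small-array idea concrete, here is a rough sketch in C of a structure that keeps small buffers inside itself and only goes to the heap for larger ones. This is an illustration under my own assumptions, not a proposed NumPy layout: the type small_array, the INLINE_CAPACITY threshold, and the function names are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: arrays up to this many elements live inline. */
#define INLINE_CAPACITY 4

typedef struct {
    size_t size;
    double *data;                        /* points at inline_buf or the heap */
    double inline_buf[INLINE_CAPACITY];  /* storage for small arrays */
} small_array;

/* Initialise a zero-filled array of `size` doubles.
   Returns 0 on success, -1 if a heap allocation fails. */
int small_array_init(small_array *a, size_t size)
{
    a->size = size;
    if (size <= INLINE_CAPACITY) {
        /* Small case: no separate allocation; the elements sit in the
           same block of memory as the header. */
        memset(a->inline_buf, 0, sizeof a->inline_buf);
        a->data = a->inline_buf;
        return 0;
    }
    a->data = calloc(size, sizeof(double));
    return a->data == NULL ? -1 : 0;
}

/* Nonzero when the elements are stored inside the struct itself. */
int small_array_is_inline(const small_array *a)
{
    return a->data == a->inline_buf;
}

/* Release heap storage, if any, and reset the header. */
void small_array_free(small_array *a)
{
    if (!small_array_is_inline(a)) {
        free(a->data);
    }
    a->data = NULL;
    a->size = 0;
}
```

In the small case the header and the elements share one allocation, so there is no second block of memory to chase. One cost of this sketch is that copying the struct by value leaves data pointing into the original, so a real design would have to handle copies explicitly.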

So, Cython did not play a major role on the NumPy side of things.   It played a very nice role on the SciPy side of things. 

-Travis



> Thanks a lot,
> 
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


