[Numpy-discussion] Proposed Roadmap Overview

Matthew Brett matthew.brett@gmail....
Sat Feb 18 19:32:34 CST 2012


On Sat, Feb 18, 2012 at 5:18 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> Hi,
>
> On Sat, Feb 18, 2012 at 2:54 PM, Travis Oliphant <travis@continuum.io> wrote:
>>
>> On Feb 18, 2012, at 4:03 PM, Matthew Brett wrote:
>>
>>> Hi,
>>>
>>> On Sat, Feb 18, 2012 at 1:57 PM, Travis Oliphant <travis@continuum.io> wrote:
>>>> The C/C++ discussion is just getting started.  Everyone should keep in mind
>>>> that this is not something that is going to happen quickly.   This will
>>>> be a point of discussion throughout the year.    I'm not a huge supporter of
>>>> C++, but C++11 does look like it's made some nice progress, and as I think
>>>> about making a core-set of NumPy into a library that can be called by
>>>> multiple languages (and even multiple implementations of Python), tempered
>>>> C++ seems like it might be an appropriate way to go.
>>>
>>> Could you say more about this?  Do you have any idea when the decision
>>> about C++ is likely to be made?  At what point does it make most sense
>>> to make the argument for or against?  Can you suggest a good way for
>>> us to be able to make more substantial arguments either way?
>>
>> I think early arguments against are always appropriate --- if you believe they have a chance of swaying Mark or Chuck who are the strongest supporters of C++ at this point.     I will be quite nervous about going crazy with C++.   It was suggested that I use C++ 7 years ago when I wrote NumPy.   I didn't go that route then largely because of compiler issues,  ABI-concerns, and I knew C better than C++ so I felt like it would have taken me longer to do something in C++.     I made the right decision for me.   If you think my C-code is horrible, you would have been completely offended by whatever C++ I might have done at the time.
>>
>> But I basically agree with Chuck that there is a lot of C-code in NumPy and template-based-code that is really trying to be C++ spelled differently.
>>
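[Aside for anyone following along: the template-based C being referred to is the text-substitution format used in NumPy's .c.src files to stamp out per-type copies of the same function.  The snippet below is only a simplified illustration of the flavour of it, not an exact excerpt from the NumPy sources; a C++ template would express the same thing directly.]

    /**begin repeat
     * #TYPE = FLOAT, DOUBLE#
     * #type = npy_float, npy_double#
     */
    /* The code generator expands this into FLOAT_add and DOUBLE_add. */
    static void
    @TYPE@_add(@type@ *a, @type@ *b, @type@ *out, npy_intp n)
    {
        npy_intp i;
        for (i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }
    /**end repeat**/
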
>> The decision will not be made until NumPy 2.0 work is farther along.     The most likely outcome is that Mark will develop something quite nice in C++ which he is already toying with, and we will either choose to use it in NumPy to build 2.0 on --- or not.   I'm interested in sponsoring Mark and working as closely as I can with him and Chuck to see what emerges.
>
> Would it be fair to say then, that you are expecting the discussion
> about C++ will mainly arise after Mark has written the code?   I
> can see that it will be easier to be specific at that point, but there
> must be a serious risk that it will be too late to seriously consider
> an alternative approach.
>
>>> Can you say a little more about your impression of the previous Cython
>>> refactor and why it was not successful?
>>>
>>
>> Sure.  This list actually deserves a long writeup about that.   First, there wasn't a "Cython-refactor" of NumPy.   There was a Cython-refactor of SciPy.   I'm not sure of its current status.   I'm still very supportive of that sort of thing.
>
> I think I missed that - is it on git somewhere?
>
>> I don't know if Cython ever solved the "raising an exception in a Fortran-called call-back" issue.   I used setjmp and longjmp in several places in SciPy originally in order to enable exceptions raised in a Python-callback that is wrapped in a C-function pointer and handed to a Fortran-routine that asks for a function-pointer.
>>
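[For anyone unfamiliar with the setjmp/longjmp trick described above, the shape of it is roughly the following.  This is only a sketch, not actual SciPy code; py_callback and fortran_solver_ are made-up names standing in for the real callable and Fortran routine.]

    #include <setjmp.h>
    #include <Python.h>

    static jmp_buf callback_env;
    static PyObject *py_callback;   /* the user's Python callable (hypothetical) */

    /* Hypothetical Fortran routine that wants a C function pointer. */
    extern void fortran_solver_(double (*f)(double *));

    /* C wrapper with the signature the Fortran routine expects. */
    static double callback_wrapper(double *x)
    {
        double val;
        PyObject *result = PyObject_CallFunction(py_callback, "d", *x);
        if (result == NULL) {
            /* A Python exception was raised; it cannot be propagated up
               through the Fortran stack frames, so jump straight back to
               the setjmp point at the call site. */
            longjmp(callback_env, 1);
        }
        val = PyFloat_AsDouble(result);
        Py_DECREF(result);
        return val;
    }

    /* Call site, inside the extension function that returns a PyObject*: */
    if (setjmp(callback_env) == 0) {
        fortran_solver_(callback_wrapper);   /* may longjmp out of here */
    }
    else {
        return NULL;   /* the Python exception is already set */
    }
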
>> What happened in NumPy was that the code was re-factored to become a library.   I don't think much NumPy code actually ended up in Cython (the random-number generators have been in Cython from the beginning).
>>
>>
>> The biggest problem with merging the code was that Mark Wiebe got active at about that same time :-)   He ended up changing several things in the code-base that made it difficult to merge in the changes.   Some of the bug fixes, memory-leak patches, and tests did get into the code-base, but the essential creation of the NumPy library did not make it.   There was some very good work done that I hope we can still take advantage of.
>
>> Another factor: the decision to add an extra layer of indirection makes small arrays that much slower.   I agree with Mark that in a core library we need to go the other way, with small arrays being completely allocated in the data-structure itself (reducing the number of pointer de-references).
>
> Does that imply there was a review of the refactor at some point to do
> things like benchmarking?   Are there any sources to get started
> trying to understand the nature of the NumPy refactor and where it ran
> into trouble?  Was it just the small arrays?
>
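[To make the small-array point above concrete: the idea is roughly the layout below, where small arrays reuse storage inside the struct instead of a separate heap block, so reading the data does not require chasing a second pointer.  This is just a sketch of the general technique, not NumPy's actual layout.]

    #include <stdlib.h>

    #define INLINE_ELEMS 8

    typedef struct {
        size_t  size;
        double *data;                       /* points at inline_data or heap memory */
        double  inline_data[INLINE_ELEMS];  /* used when size <= INLINE_ELEMS */
    } small_array;

    static void small_array_init(small_array *a, size_t n)
    {
        a->size = n;
        /* Small arrays need no second allocation and no extra indirection. */
        a->data = (n <= INLINE_ELEMS) ? a->inline_data
                                      : malloc(n * sizeof(double));
    }
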
>> So, Cython did not play a major role on the NumPy side of things.   It played a very nice role on the SciPy side of things.
>
> I guess Cython was attractive because the desire was to make a

Sorry - that should read "I guess Cython was _not_ attractive ... "

> stand-alone library?   If that is still the goal, presumably that
> excludes Cython from serious consideration?  What are the primary
> advantages of making the standalone library?  Are there any serious
> disbenefits?

Best,

Matthew

