[Numpy-discussion] Proposed Roadmap Overview

Christopher Jordan-Squire cjordan1@uw....
Sat Feb 18 02:17:29 CST 2012


On Fri, Feb 17, 2012 at 11:55 PM, David Cournapeau <cournape@gmail.com> wrote:
>
> On Feb 18, 2012 at 06:18, "Christopher Jordan-Squire" <cjordan1@uw.edu>
> wrote:
>
>
>>
>> On Fri, Feb 17, 2012 at 8:30 PM, Sturla Molden <sturla@molden.no> wrote:
>> >
>> >
>> > On Feb 18, 2012 at 05:01, Jason Grout
>> > <jason-sage@creativetrax.com> wrote:
>> >
>> >> On 2/17/12 9:54 PM, Sturla Molden wrote:
>> >>> We would have to write a C++ programming tutorial that is based on
>> >>> Python knowledge instead of C knowledge.
>> >>
>> >> I personally would love such a thing.  It's been a while since I did
>> >> anything nontrivial on my own in C++.
>> >>
>> >
>> > One example: How do we code multiple return values?
>> >
>> > In Python:
>> > - Return a tuple.
>> >
>> > In C:
>> > - Use pointers (evilness)
>> >
>> > In C++:
>> > - Return a std::tuple, as you would in Python.
>> > - Use references, as you would in Fortran or Pascal.
>> > - Use pointers, as you would in C.
>> >
>> > C++ textbooks always pick the last...
>> >
>> > I would show the first and the second method, and perhaps intentionally
>> > forget the last.
>> >
>> > Sturla
>> >
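
A minimal sketch of the first two C++ options listed above (return a
std::tuple, or write through references), assuming a C++11 compiler. The
names min_max and min_max_ref are hypothetical, chosen only for
illustration; they do not come from numpy or any library.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Option 1: return a std::tuple, much like returning a tuple in Python.
// Hypothetical helper; assumes v is non-empty.
std::tuple<double, double> min_max(const std::vector<double>& v) {
    double lo = v[0];
    double hi = v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] < lo) lo = v[i];
        if (v[i] > hi) hi = v[i];
    }
    return std::make_tuple(lo, hi);
}

// Option 2: write the results through reference parameters, as one
// would with output arguments in Fortran or Pascal.
void min_max_ref(const std::vector<double>& v, double& lo, double& hi) {
    // std::tie unpacks the tuple into existing variables.
    std::tie(lo, hi) = min_max(v);
}
```

A caller can unpack the tuple with std::tie(lo, hi) = min_max(v), which
reads close to Python's lo, hi = min_max(v). No raw pointers are needed
in either version.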
>>
>> I can add my own 2 cents about cython vs. C vs. C++, based on summer
>> coding experiences.
>>
>> I was an intern at Enthought, sharing an office with Mark W. (Which
>> was a treat. I recommend you all quit your day jobs and haunt whatever
>> office Mark is inhabiting.) I was trying to optimize some code and
>> that led to experimenting with both cython and C.
>>
>> Dealing with the C internals of numpy was frustrating. Since C doesn't
>> have templates but numpy kinda needs them, python scripts instead go
>> over the source and perform the templating manually. Not the most
>> obvious thing. There were other issues in the background, including
>> that C doesn't allow for much abstraction (i.e. easy-to-read code),
>> lots of pointer-fu is required, and the C API is lightly documented
>> and already plenty difficult.
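
For comparison, a minimal sketch of what language-level templating looks
like in C++: one generic definition that the compiler instantiates per
element type, rather than a script generating one C function per type.
The function sum_all is hypothetical and is not taken from numpy.

```cpp
#include <cstddef>

// Hypothetical example, not numpy code: a single generic definition.
// The compiler generates a separate version for each element type T
// that the function is actually called with.
template <typename T>
T sum_all(const T* data, std::size_t n) {
    T total = T();  // value-initialized: zero for arithmetic types
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Calling sum_all on an int array and on a double array produces two
separate instantiations from the same source text.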
>
> Please understand that the argument is not to maintain a status quo.
>
> Lack of API documentation and internals that need significant work are
> certainly issues. I fail to see how writing in C++ will solve the
> documentation issues.
>
> On the abstraction side of things, let's agree to disagree. Plenty of
> complex projects are written in both languages, which makes this a
> mostly subjective matter.
>
>>
>> On the flip side, cython looked pretty...but I didn't get the
>> performance gains I wanted, and had to spend a lot of time figuring
>> out if it was cython, needing to add types, buggy support for numpy,
>> or actually the algorithm. The C files generated by cython were
>> enormous and difficult to read. They really weren't meant for human
>> consumption. As Sturla has said, regardless of the quality of the
>> current product, it isn't stable.
>
> Sturla represents only himself on this issue. Cython is widely regarded as a
> successful and very useful tool. Many more projects in the scipy community
> use cython than C++.
>
>> And even if it looks friendly
>> there's magic going on under the hood. Magic means it's hard to
>> diagnose and fix problems. At least one very smart person has told me
>> they find cython most useful for wrapping C/C++ libraries and exposing
>> them to python, which is a far cry from library writing. (Of course
>> Wes McKinney, a cython evangelist, uses it all over his pandas
>> library.)
>
> I am not very smart, but this is certainly close to what I had in mind as
> well :) As you know, the lack of clear abstraction between C and CPython
> wrapping is one of the major issues in numpy. Cython is certainly one of the
> most capable tools out there to avoid tedious reference bug chasing.
>
>>
>> In comparison, there are a number of high quality, performant,
>> open-source C++ based array libraries out there with very friendly
>> APIs. Things like eigen
>> (http://eigen.tuxfamily.org/index.php?title=Main_Page) and Armadillo
>> (http://arma.sourceforge.net/). They seem to have plenty of users and
>> more devs than
>
> eigen is a typical example of code I hope numpy will never be close to. This
> is again quite subjective, but it also shows that we have quite different
> ideas on what maintainable/readable code means. Which is of course quite
> alright. But it means a choice needs to be made. If a majority of people
> find eigen more readable than a well written C library, then I don't think
> anyone can reasonably argue against going to c++.
>

Fair point, obviously. I haven't dug into eigen's internals much. I
just like their performance benchmarks and API.

<joke>
Also their cute owl mascot, but I suppose that's not a meaningful
standard for future coding practices.
</joke>

>>
>> On the broader topic of recruitment...sure, cython has a lower barrier
>> to entry than C++. But there are many, many more C++ developers and
>> resources out there than cython resources. And it likely will stay
>> that way for quite some
>
> I may not have explained it very well: my whole point is that we don't
> recruit people, where I understand recruit as hiring full-time, professional
> programmers. We need more people who can casually spend a few hours -
> typically grad students, scientists with an itch. There is no doubt that
> more professional programmers know c++ compared to C. But a community
> project like numpy has different requirements than a "professional" project.
>

I'm not sure you really mean casually spend a few *hours*, but I get
your point. It's important for people to be able to add onto it
incrementally as an off-hours hobby.

But for itches to scratch, is numpy the realistic place for scientists
and grad students to go? As opposed to one of the extension packages,
like scipy, sklearn, etc.? If anywhere is going to be more akin to a
"professional" project, code-style wise, it seems like the numpy core
is the place to do it.

-Chris

> David
>
>
>>
>> -Chris
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion@scipy.org
>> > http://mail.scipy.org/mailman/listinfo/numpy-discussion