[SciPy-user] python (against java) advocacy for scientific projects

Ravi lists_ravi@lavabit....
Tue Jan 20 15:15:17 CST 2009


On Tuesday 20 January 2009 13:13:27 David Cournapeau wrote:
> On Tue, Jan 20, 2009 at 11:27 PM, Ravi <lists_ravi@lavabit.com> wrote:
> > Really? Try writing just a fixed-point radix-8 FFT 
[snip]
> The FFT reference is FFTW. It uses neither C++ nor Fortran. It does not
> have rounding /clipping strategies that I know of, but is certainly as
> flexible as you can make in C++.

Please notice that I specifically mentioned *fixed-point* FFTs. The area I 
work in is an intersection of algebraic geometry, signal processing and 
discrete mathematics. FFTW has no idea how to model 13-bit fixed point values, 
and certainly does not handle minimization of error propagation by choice of 
rounding vs. truncation in intermediate steps (rounding does not always lead 
to better error propagation compared to truncation, which is computationally 
much less intensive).
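
To make the rounding-versus-truncation trade-off concrete, here is a minimal Python sketch of a single quantization step. Everything in it is an assumption made for illustration: the 13-bit word is taken to be 1 sign bit plus 12 fractional bits, the helper names are hypothetical, and the test samples are arbitrary. It only measures the one-step quantization error; it says nothing about how errors propagate through the stages of a real fixed-point FFT, where (as noted above) rounding is not always the better choice.

```python
import math

# Assumed format (not from the original post): 13-bit signed word,
# 1 sign bit + 12 fractional bits, representing values in [-1, 1).
WORD_BITS = 13
FRAC_BITS = 12
LSB = 1.0 / (1 << FRAC_BITS)


def saturate(n):
    """Clip an integer to the range of a signed WORD_BITS-bit word."""
    lo = -(1 << (WORD_BITS - 1))
    hi = (1 << (WORD_BITS - 1)) - 1
    return max(lo, min(hi, n))


def quantize_truncate(x):
    """Truncation: discard the bits below the LSB (floor).

    In hardware this is just dropping low-order bits, hence cheap.
    """
    return saturate(math.floor(x * (1 << FRAC_BITS)))


def quantize_round(x):
    """Round to nearest: add half an LSB, then floor (costs an extra add)."""
    return saturate(math.floor(x * (1 << FRAC_BITS) + 0.5))


def to_float(n):
    """Convert a fixed-point integer back to a real value."""
    return n / (1 << FRAC_BITS)


def mean_error(quantize, samples):
    """Average signed quantization error, i.e. the bias of the scheme."""
    return sum(to_float(quantize(x)) - x for x in samples) / len(samples)


# Arbitrary test samples spread over (-1, 1); chosen to avoid saturation.
samples = [-1.0 + 2.0 * (k + 0.37) / 1000 for k in range(1000)]

trunc_bias = mean_error(quantize_truncate, samples)
round_bias = mean_error(quantize_round, samples)

print("truncation bias in LSBs:", trunc_bias / LSB)
print("rounding bias in LSBs:  ", round_bias / LSB)
```

For these samples, truncation carries a systematic negative bias of roughly half an LSB per operation, while rounding keeps the per-step error within half an LSB in either direction at the cost of an extra addition. Which of the two gives less accumulated error over many butterfly stages depends on the algorithm and the data, which is exactly the kind of choice a floating-point library does not expose.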

> > Code maintainability works by using clearly defined idioms.
>
> That's really only part of the story. Code maintainability also
> requires the idioms to be well shared and understood by the community
> - which C++ makes really hard to ensure because it is such a complex
> beast.

The second part is not really true. Of course C++ is a very young language 
with features that were completely unappreciated in the beginning by its 
target audience: C programmers looking for something more scalable. Well 
understood and shared idioms do not just appear on the scene. A significant 
body of work and experience is required before such idioms percolate down to 
the journeyman programmer. C++ reached that stage only circa 2005. (For a 
simple example, see how much work has been going on in the ipython-dev lists 
regarding asynchronous operations; this is not because asynchronous operations 
are inherently difficult to understand - just that the standard idioms for 
handling asynchronous events are not yet commonly understood outside of a very 
small community (and even those idioms are still under refinement)).

> C++ is unmaintainable without a strong set of coding rules,
> which only really works in companies, or when you have an already
> strong framework (in open source, it is quite striking that C++ is
> seldom used, except for complex GUI programs).

Of course you have coding rules, but you have such rules even in small C 
projects. Boost does not really have many coding rules other than naming 
conventions and Boost is widely deployed. Please read the CERN ROOT 
information page for the reason they switched from Fortran to C++ (speed & 
scalability). C++ is not the best language for every task; my only claim was 
that C++ is just as good as Fortran for a lot of tasks, and for some even better. After 
all, I participate in this list because I use python just as much as C++.

> I have no reason to doubt your experience that template leads to
> maintainable code - but it is exactly the contrary in my experience,
> and often for code which is supposed to be state of the art (boost).

This is the fundamental misunderstanding. People treat C++ as an extension of 
C and then templates tie them into knots. I had the very same problem until I 
used some functional languages (Common Lisp, in my case) and realized that C++ 
is a new object-oriented language that has certain C features. This coincided 
with the time I became frustrated with Fortran and wished that I had a hybrid 
between C & Lisp and then it became clear to me that C++ is very near that.

> > First, Fortran, as I pointed out above, is generally worthless for a lot
> > of computation-intensive problems that don't map to its native data
> > types.
> >
> > Second, Fortran is not magic; it simply uses optimized libraries
> > underneath and the speed of Fortran compiled code depends upon the
> > libraries
>
> Part of the fortran speed comes from the fact that fortran does not
> have pointer.

Not true for Fortran 95, as pointed out by Matthieu & Sturla already.

> I think something like eigen will not suit python developers much.
> First, it has dreadful compilation time (like everything
> template-based), and their performance numbers, I never could
> reproduce them. I have never seen such a difference between MKL and
> ATLAS as shown on their benchmark - since they don't give enough
> information, it is hard to tell which atlas they used, but in my
> experience, ATLAS (and of course MKL) was always much faster than
> eigen, on both mac os X (with accelerate, which is mostly customized
> atlas, at least at its code) and Linux, with the benchmark they
> provide. At this point, I don't understand what they are measuring.

I used to work for a certain major competitor to the producers of MKL. ATLAS 
and FFTW can both be beaten by a significant margin. In fact, with a certain 
compiler from the major competitor and our own libraries, we could beat 
Fortran performance (from the same competitor's compiler) on L2 & L3 BLAS from 
C/C++/Fortran.

> I also note that they are so much faster than blitz, which itself was
> supposed to match fortran speed. This puzzles me as a fundamental
> contradiction somewhere :)

Never used blitz seriously because of the painful interface; so, no comment.

> > Third, computation speed now on CotS processors depends more on cache &
> > memory access optimization than anything else, which compilers can do
> > with C/C++ just as well as with Fortran;
>
> No, they can't. At least in standard C++, you can't provide enough
> information about pointers. But even then, it is often only 2 or 3
> times slower - which rarely matters for scientific programming, except
> for the biggest simulations.

Unfortunately, at least in my line of work, these "biggest simulations" are 
very common ones. One example from my past is LDPC code searches, where 
we sometimes had to resort to FPGAs when we could not speed up the 
computations any further; the 3 months we lost programming the FPGAs were 
amply repaid within a few weeks.

> But the point is that it is difficult for no reason but a dreadful
> syntax. Something like eigen could be done in a higher level language.
> To everyone his own interest, I guess, but I don't understand the joy
> of spending time coding and debugging template code. It is just awful
> - the compiler often cannot tell you even the line which has a syntax
> error.

I partly agree (and assert that you need to use better compilers, like 
Comeau). I wish it were possible to write DSELs easily in some other language 
(preferably some enhancement of OCaml), but I haven't yet found such a 
language that has sufficient mindshare in my area of work :-(

> Something like fftw, with a code generator written in a high level
> language, is a much better example of meta programming IMHO. It is
> readable, flexible, and portable, at least in comparison to anything
> C++ has to offer today.

Completely agreed, but tool availability is a big problem. In my case, I quote 
the zen of python: practicality beats purity :-) and so I stick with 
C++/python.

Just in case the main point was lost: (1) C++ does not fill every niche but 
has its place when used with Python. (2) Fortran is not a replacement for C++.

Regards,
Ravi



