[SciPy-user] Numpy/SciPy and performance optimizations
Wed Jan 7 06:52:15 CST 2009
In the last few days I went through some tutorials about Python and
performance optimization; the two main articles I worked through (and
the references therein) are:
So it seems that there are now many possibilities to speed up the
essential parts of Python code; however, I am still not satisfied with
those solutions.
There are parts of my projects where I have to iterate in loops (some
recursive algorithms). In the past I developed the basic library in C++
(using SWIG to generate Python modules) - but now I want to switch fully
to Python and only optimize some small parts, because I waste too much
time trying to extend the C++ library, which has already grown quite
complex.
Okay, of course weave in combination with blitz looked very attractive.
After struggling through the documentation of weave and blitz++, I
understood the concept and tried to implement an example.
One example of such a typical loop would be (all variables are arrays,
after from numpy import *):

    for n in range(steps):
        x = dot(A, x)
        x += dot(B, u[:, n])
        x = tanh(x)
        y[:, n] = dot(C, r_[x, u[:, n]])
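For reference, here is a self-contained, runnable version of that loop.
The sizes and the random initialization are my own assumptions for
illustration; the original post does not give the shapes of A, B, C, u:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
nx, nu, ny, steps = 4, 2, 3, 50

rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((nx, nx))  # state matrix
B = rng.standard_normal((nx, nu))        # input matrix
C = rng.standard_normal((ny, nx + nu))   # readout acts on [x; u]
u = rng.standard_normal((nu, steps))     # input sequence
x = np.zeros(nx)
y = np.zeros((ny, steps))

for n in range(steps):
    x = np.dot(A, x)              # recurrent term
    x += np.dot(B, u[:, n])       # driven by the current input
    x = np.tanh(x)                # nonlinearity keeps x in (-1, 1)
    y[:, n] = np.dot(C, np.r_[x, u[:, n]])  # readout of state + input
```

Note that the recursion on x makes the loop inherently sequential, which
is why plain NumPy vectorization over n does not help here.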
So in blitz++ I need some matrix-vector multiplications and similar
operations, which are unfortunately not very intuitive to express.
One way is to use the blitz::sum function, which is IMHO unintuitive
and also very slow - slower than plain numpy (see for instance the
benchmark of C/C++ libraries I made last year:
http://grh.mur.at/misc/sparselib_benchmark/index.html).
Another way would be to use BLAS and write support code for every
needed BLAS (or maybe also LAPACK) function - as for instance
demonstrated in:
However, that was too much work for me ...
What I want:
- easily embeddable C/C++ code, without having to handle the
complicated Python C API (as weave allows)
- basic matrix operations (blas, maybe also lapack) available in C/C++
- nice indexing, slicing etc. also in C/C++ (which is nice with blitz++)
- handling of sparse matrices also in C/C++ (at least basic blas methods
for sparse matrices)
OK, this is quite a big wishlist ;)
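As a point of comparison for the BLAS item on that list: current SciPy
exposes the low-level BLAS routines directly from Python, so a
matrix-vector product can be routed through dgemv without any C glue.
This is a sketch using the modern scipy.linalg.blas module path, which
may not have existed in this form at the time of this post:

```python
import numpy as np
from scipy.linalg.blas import dgemv

rng = np.random.default_rng(1)
A = np.asfortranarray(rng.standard_normal((4, 4)))  # Fortran order avoids a copy
x = rng.standard_normal(4)

# y = alpha * A @ x, computed by the BLAS routine dgemv
y = dgemv(1.0, A, x)

assert np.allclose(y, A @ x)
```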
However, ATM I can think of two possible solutions:
1. Add some additional header files to weave/blitz, so that at least
the BLAS functions are available out of the box
2. Write a new type converter for weave that supports a more
feature-rich (and faster) C++ library than blitz++
I don't know how hard option 2 would be.
At least I played with quite a few C++ libraries last year (see again
the benchmark http://grh.mur.at/misc/sparselib_benchmark/index.html),
and there are three nice candidates:
- MTL: http://www.osl.iu.edu/research/mtl/
- gmm++: http://home.gna.org/getfem/gmm_intro
- flens: http://flens.sourceforge.net/
(- maybe also Boost uBLAS)
These three libraries are very fast, header-only libs (like blitz++),
and they also have BLAS, LAPACK and sparse-matrix support.
See also this more general benchmark, which shows advantages of MTL
compared to Intel BLAS, blitz, Fortran and C:
So, it would be nice to get some feedback - maybe there are other
solutions I don't know of?
(Maybe it is easier to do all this in Fortran and use f2py?)
How do other people optimize more complicated code?
I would also be happy to hear whether it would be useful to implement
weave type converters for another C++ library than blitz++ (e.g. MTL
or gmm++) - and maybe some suggestions for that ...
Thanks for any hints,