[SciPy-user] Numpy/SciPy and performance optimizations

Georg Holzmann grh@mur...
Wed Jan 7 06:52:15 CST 2009


Hello!

Over the last few days I have gone through some tutorials on Python and 
performance optimization; the two main articles I looked at (including 
the references in them) are:
- http://www.scipy.org/PerformancePython
- http://wiki.cython.org/tutorials/numpy

So there now seem to be many possibilities for speeding up the 
essential parts of Python code; however, I am still not satisfied with 
these solutions.


My problem:

I have parts in my projects where I have to iterate over loops (some 
recursive algorithms). In the past I developed the basic library in C++ 
(using SWIG to generate Python modules) - but now I want to switch fully 
to Python and only optimize some small parts, because I waste too much 
time trying to extend the C++ library, which is already quite 
complex ...

Okay, of course weave in combination with blitz looked very attractive 
to me.
After working through the documentation of weave and blitz++, I 
understood the concept and tried to implement an example.
A typical loop of this kind looks like this (all variables are arrays; 
assume from numpy import *):

for n in range(steps):
    x = dot(A, x)                       # x <- A*x
    x += dot(B, u[:, n])                # x <- x + B*u[:,n]
    x = tanh(x)                         # elementwise tanh
    y[:, n] = dot(C, r_[x, u[:, n]])    # y[:,n] <- C*[x; u[:,n]]

So I need matrix-vector multiplications and similar operations in 
blitz++, which is unfortunately not very intuitive.
One way is to use the blitz::sum function with tensor indices, which is 
IMHO not intuitive and also quite slow - slower than plain numpy (see 
for instance the benchmark of C/C++ libraries I made last year: 
http://grh.mur.at/misc/sparselib_benchmark/index.html).
Another way would be to use BLAS and write support code for every needed 
BLAS (or maybe also LAPACK) function - as demonstrated for instance in 
http://www.math.washington.edu/~jkantor/Numerical_Sage/node14.html.
However, that was too much work for me ...
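
Just to show what I mean with the blitz::sum route: here is a rough, 
untested sketch of how the x = dot(A, x) step from the loop above could 
look with weave.inline and the blitz type converters (array names and 
sizes are only placeholders):

from scipy import weave
from scipy.weave import converters
import numpy

A = numpy.random.rand(10, 10)
x = numpy.random.rand(10)
xnew = numpy.empty_like(x)

code = """
// matrix-vector product written as a blitz partial reduction over j
blitz::firstIndex i;
blitz::secondIndex j;
xnew = blitz::sum(A(i, j) * x(j), j);
"""
weave.inline(code, ['A', 'x', 'xnew'], type_converters=converters.blitz)

This works, but writing every dot() and every slice in that style gets 
ugly very quickly - that is exactly what I mean by "not intuitive".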


What I want:

- easily embeddable C/C++ code, without having to handle a complicated 
Python API (like in weave)
- basic matrix operations (BLAS, maybe also LAPACK) available in C/C++
- nice indexing, slicing etc. also in C/C++ (which is nice with blitz++)
- handling of sparse matrices also in C/C++ (at least basic BLAS 
operations for sparse matrices)

OK, this is quite a big wishlist ;)
However, ATM I can think of two possible solutions:

1. Add some additional header files to weave/blitz, so that at least 
BLAS functions are available out of the box (a rough sketch of what I 
mean follows below this list)

2. Write a new type converter for weave which supports a more 
feature-rich (and faster) C++ library than blitz++
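
To make option 1 a bit more concrete: what I have in mind is something 
like the following rough, untested sketch, which calls CBLAS directly 
from weave.inline (I simply assume a system CBLAS here, so the header 
and library names, and probably the build keywords, would need 
adjusting):

from scipy import weave
from scipy.weave import converters
import numpy

A = numpy.random.rand(10, 10)
x = numpy.random.rand(10)
y = numpy.zeros_like(x)
n = int(A.shape[0])

code = """
// y <- 1.0 * A*x + 0.0 * y, using the raw data pointers of the blitz arrays
cblas_dgemv(CblasRowMajor, CblasNoTrans, n, n, 1.0,
            A.data(), n, x.data(), 1, 0.0, y.data(), 1);
"""
weave.inline(code, ['A', 'x', 'y', 'n'],
             type_converters=converters.blitz,
             headers=['<cblas.h>'], libraries=['cblas'])

If weave shipped the necessary headers (or thin wrappers) itself, one 
could write such calls without any extra support code.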

I don't know how hard option 2 would be.
I played with quite a few C++ libraries last year (see again the 
benchmark http://grh.mur.at/misc/sparselib_benchmark/index.html), and 
there are three nice candidates:
- MTL: http://www.osl.iu.edu/research/mtl/
- gmm++: http://home.gna.org/getfem/gmm_intro
- flens: http://flens.sourceforge.net/
(- maybe also boost ublas: http://www.boost.org/libs/numeric/)

These three libraries are very fast, header-only libraries (like 
blitz++) and also have BLAS, LAPACK and sparse-matrix support.
See also this more general benchmark, which shows advantages of MTL 
compared to Intel BLAS, blitz, Fortran and C: 
http://projects.opencascade.org/btl/



So it would be nice to get some feedback - maybe there are other 
solutions I don't know of?
(Maybe it would be easier to do all this in Fortran and use f2py?)
How do other people optimize more complicated code?

I would also be happy to hear whether it would be useful to implement 
type converters for another C++ library than blitz++ (e.g. MTL or 
gmm++) - and maybe get some suggestions for that ...

Thanks for any hints,
Best regards,
Georg

