[Numpy-discussion] Converting a large package from Numeric to numpy
Rick Muller
rmuller at sandia.gov
Thu Mar 16 09:07:07 CST 2006
Before I ask for help I would first like to thank everyone on the
Numpy team for the hard work that they've done. A longtime user of
Numeric, I'm really impressed with what I'm seeing in Numpy thus far,
particularly in the configuration routines, which seem to work
wonderfully so far.
I maintain a Python Quantum Chemistry package called PyQuante (http://
pyquante.sourceforge.net) that is built on top of Numeric/
LinearAlgebra. I wanted to see what would be involved in converting
everything over to the new packages. The big workhorse for me is the
dot routine, and I ran a few tests that implied that dot ran faster
in Numpy than it did in Numeric, so I was ready to go.
The conversion turned out to be easy. I wrote an interface routine
called NumWrap that has a flag (use_numpy) that determines whether to
import from Numeric or numpy. I then modified my code to import
everything from NumWrap. So far very easy.
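A minimal sketch of what such a wrapper module might look like. The module name (NumWrap) and the use_numpy flag come from the post above; the particular names re-exported here, and the mapping of LinearAlgebra's Heigenvectors to numpy.linalg.eigh, are illustrative assumptions rather than the actual PyQuante code:

```python
# NumWrap.py -- sketch of a switchable import layer between Numeric and numpy.
use_numpy = True

if use_numpy:
    from numpy import array, dot, zeros
    from numpy.linalg import eigh
else:
    # Legacy path: requires Numeric/LinearAlgebra to be installed.
    from Numeric import array, dot, zeros
    from LinearAlgebra import Heigenvectors as eigh
```

Client code then does `from NumWrap import dot` everywhere, so flipping the flag switches the whole package between backends.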
I have a test-suite that exercises many different parts of my code. I
ran the test suite using Numeric and it took 351 sec. I then ran
using numpy and it took 615 sec!! Almost twice as long. (I should
point out that the very good news was that the test suite ran the
first time I tried it with the numpy routines without any errors!)
I started digging into the differences using the python profiler, and
was surprised what turned up. The big difference turned out to be
routines that were taking various mathematical functions (pow, exp,
even simple scalar multiplication) of array elements. This routine:
def vwn_xx(x, b, c): return x*x + b*x + c
which is called on many elements of a very large grid, went from 17 sec to 68 sec in the test suite. Another routine that took a log() and a pow() of grid elements went from 15 sec to 100 sec.
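The call pattern behind those numbers can be sketched as follows. The grid size and the constants b and c are illustrative, not taken from PyQuante:

```python
import numpy

def vwn_xx(x, b, c):
    return x*x + b*x + c

grid = numpy.linspace(0.0, 1.0, 1000)
b, c = 1.0, 2.0

# Each grid[i] is a numpy scalar, so every * and + in vwn_xx goes
# through numpy's scalar machinery. That per-element overhead, repeated
# over a very large grid, is where the slowdown showed up.
values = [vwn_xx(grid[i], b, c) for i in range(len(grid))]
```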
I looked through Travis' new documentation. I first tried
the .toscalar() function, as follows:
rho = 0.5*dens[i].toscalar()
but this didn't work; I got
AttributeError: 'float64scalar' object has no attribute 'toscalar'
So now I'm using:
rho = 0.5*float(dens[i])
and this fixed the problem: it brings the timings back to the same as (or faster than) the Numeric values.
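The fix in isolation, with an illustrative dens array standing in for the real density grid. (Numpy scalars also support .item(), which performs the same native-Python conversion as float() for a single element.)

```python
import numpy

dens = numpy.array([0.2, 0.4, 0.6])
i = 1

# Slow path: dens[i] is a numpy float64 scalar, and the result stays one,
# so later scalar arithmetic keeps paying numpy's per-operation overhead.
rho_slow = 0.5 * dens[i]

# Fix from the post: convert to a native Python float first.
rho_fast = 0.5 * float(dens[i])

# Equivalent conversion via .item().
rho_item = 0.5 * dens[i].item()
```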
So my question is: has anyone else run into problems of this sort? If so, is there a "proper", sci-pythonic way of handling conversions like this?
Thanks in advance for any help you can offer, and thanks again for
the great new package!
Rick