[Numpy-discussion] numpy.vectorize performance
tim.hochberg at cox.net
Fri Jul 14 11:56:32 CDT 2006
Nick Fotopoulos wrote:
> On Jul 13, 2006, at 10:17 PM, Tim Hochberg wrote:
>> Nick Fotopoulos wrote:
>>> Dear all,
>>> I often make use of numpy.vectorize to make programs read more
>>> like the physics equations I write on paper. numpy.vectorize is
>>> basically a wrapper for numpy.frompyfunc. Reading Travis's Scipy
>>> Book (mine is dated Jan 6 2005) kind of suggests to me that it
>>> returns a full-fledged ufunc exactly like built-in ufuncs.
>>> First, is this true?
>> Well according to type(), the result of frompyfunc is indeed of
>> type ufunc, so I would say the answer to that is yes.
>>> Second, how is the performance?
>> A little timing indicates that it's not good (about 30x slower for
>> computing x**2 than doing it using x*x on an array). That's not
>> frompyfunc's (or vectorize's) fault though. It's calling a Python
>> function at each point, so the Python function call overhead is
>> going to kill you. Not to mention instantiating an actual Python
>> object or objects at each point.
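
A rough way to reproduce that kind of timing is sketched below. The function name `square`, the array size, and the repeat count are arbitrary choices for illustration, not the setup used in the thread, and the measured ratio will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def square(v):
    # Plain Python function; np.vectorize calls it once per element.
    return v ** 2

vsquare = np.vectorize(square)

x = np.linspace(0.0, 1.0, 100_000)

# Both paths compute the same values...
assert np.allclose(vsquare(x), x * x)

# ...but the vectorized wrapper pays Python call overhead for every element,
# while x * x runs as a single compiled loop over the array.
t_vec = timeit.timeit(lambda: vsquare(x), number=5)
t_arr = timeit.timeit(lambda: x * x, number=5)
print(f"np.vectorize: {t_vec:.4f} s, x*x: {t_arr:.4f} s, ratio ~{t_vec / t_arr:.0f}x")
```
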
> That's unfortunate since I tend to nest functions quite deeply and
> then scipy.integrate.quad over them, which I'm sure results in a
> ridiculous number of function calls. Are anonymous lambdas any
> different than named functions in terms of performance?
Sorry, no. Under the covers they're the same.
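
A small check along these lines, with the illustrative names `named` and `anonymous`, shows that both forms produce the same kind of function object and can be timed the same way. The timings are only indicative.

```python
import timeit

def named(x):
    return x * x

anonymous = lambda x: x * x

# Both are ordinary function objects; only the __name__ differs.
print(type(named) is type(anonymous))
print(named.__name__, anonymous.__name__)

# Time the bare call in each case; any difference should be within noise.
t_named = timeit.timeit("f(3.0)", globals={"f": named}, number=200_000)
t_anon = timeit.timeit("f(3.0)", globals={"f": anonymous}, number=200_000)
print(f"named: {t_named:.4f} s, lambda: {t_anon:.4f} s")
```
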
>>> i.e., are my functions performing approximately as fast as they
>>> could be, or would they still gain a great deal of speed from
>>> being rewritten in C or with some compiled Python accelerator?
>> Can you give examples of what these functions look like? You might
>> gain a great deal of speed by rewriting them in numpy in the
>> correct way. Or perhaps not, but it's probably worth showing some
>> examples so we can offer suggestions or at least admit that we are
> This is by far the slowest bit of my code. I cache the results, so
> it's not too bad, but any upstream tweak can take a lot of CPU time
> to propagate.
> def dnsratezfunc(z):
>     """Take coalescence time into account."""
>     def integrand(zf):
>         return Pz(z,zf)*NSbirthzfunc(zf)
>     return quad(integrand,delayedZ(2e5*secperyear+1,z),5)
> dnsratez = lambdap*dnsratezfunc(zs)
> # Neutron star formation rate is a delayed version of star formation
> NSbirthzfunc = autovectorized(lambda z: SFRz(delayedZ
> def Pz(z_c,z_f):
>     """Return the probability density per unit redshift of a DNS
>     coalescence at z_c given a progenitor formation at z_f."""
>     return P(t(z_c,z_f))*dtdz(z_c)
> and there are many further nested levels of function calls. If the
> act of calling a function is more expensive than actually executing
> it and I value speed over readability/code reuse, I can inline Pz's
> function calls and inline the unvectorized NSbirthzfunc to reduce the
> calling stack a bit. Any other suggestions?
I think I'd try psyco (http://psyco.sourceforge.net/). That's pretty
painless to try and may result in a significant improvement.
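
As for rewriting in numpy, one possible shape is sketched below. It is only a sketch: `pz`, `ns_birth`, and `lower_limit` are hypothetical stand-ins for Pz, NSbirthzfunc, and the delayedZ lower bound, with made-up formulas. It also swaps the adaptive quad call for a fixed-grid trapezoid rule, which gives up quad's error control in exchange for evaluating every z in one array expression.

```python
import numpy as np

def pz(z_c, z_f):
    # Hypothetical stand-in for Pz: any expression built from array operations.
    return np.exp(-(z_f - z_c))

def ns_birth(z_f):
    # Hypothetical stand-in for NSbirthzfunc.
    return 1.0 / (1.0 + z_f) ** 2

def lower_limit(z_c):
    # Hypothetical stand-in for the delayedZ(...) bound: depends on z_c.
    return z_c + 0.01

def rate_on_grid(zs, upper=5.0, n=2001):
    """Integrate pz * ns_birth over z_f for every z in zs at once."""
    zs = np.asarray(zs, dtype=float)
    lo = lower_limit(zs)[:, None]            # shape (len(zs), 1)
    u = np.linspace(0.0, 1.0, n)[None, :]    # shape (1, n)
    zf = lo + (upper - lo) * u               # shape (len(zs), n)
    y = pz(zs[:, None], zf) * ns_birth(zf)   # one array expression, no Python loop
    dz = np.diff(zf, axis=1)
    # Trapezoid rule along the z_f axis.
    return np.sum(0.5 * (y[:, 1:] + y[:, :-1]) * dz, axis=1)

zs = np.linspace(0.0, 3.0, 50)
rates = rate_on_grid(zs)
print(rates.shape)
```

Whether this is accurate enough depends on how smooth the real integrand is. Comparing against quad at a few values of z, or against a finer grid, would be a reasonable sanity check.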