[Numpy-discussion] Quick Question about Optimization
Anne Archibald
peridot.faceted@gmail....
Mon May 19 14:55:14 CDT 2008
2008/5/19 James Snyder <jbsnyder@gmail.com>:
> First off, I know that optimization is evil, and I should make sure
> that everything works as expected prior to bothering with squeezing
> out extra performance, but the situation is that this particular block
> of code works, but it is about half as fast with numpy as in matlab,
> and I'm wondering if there's a better approach than what I'm doing.
>
> I have a chunk of code, below, that generally iterates over 2000
> iterations, and the vectors that are being worked on at a given step
> generally have ~14000 elements in them.
With arrays this size, I wouldn't worry about Python overhead - things
like range versus xrange or attribute lookups on self.
> Is there anything in practice here that could be done to speed this
> up? I'm looking more for general numpy usage tips, that I can use
> while writing further code and not so things that would be obscure or
> difficult to maintain in the future.
Try using a profiler to find which steps are using most of your time.
With such a simple function it may not be very informative, but it's
worth a try.
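For instance, the standard library's cProfile/pstats can wrap a single
step function (the step() below is just a stand-in, not your code):

```python
import cProfile
import io
import pstats

def step(n=100000):
    # stand-in for one pass of the simulation loop
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
step()
profiler.disable()

# print the five most expensive calls by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```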
> Also, the results of this are a binary array, I'm wondering if there's
> anything more compact for expressing than using 8 bits to represent
> each single bit. I've poked around, but I haven't come up with any
> clean and unhackish ideas :-)
There's a tradeoff between compactness and speed here. The *fastest*
is probably one boolean per 32-bit integer. It sounds awful, I know,
but most modern CPUs have to work harder to access bytes individually
than they do to access them four at a time. On the other hand, cache
performance can make a huge difference, so compactness might actually
amount to speed. I don't think numpy has a packed bit array data type
(which is a shame, but would require substantial implementation
effort).
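For pure storage (not arithmetic) there is at least np.packbits /
np.unpackbits, which squeeze a boolean array down to one bit per
element; a small sketch with made-up spike data:

```python
import numpy as np

# a boolean spike record, one byte per element as numpy stores it
spikes = np.array([True, False, True, True, False, False, True, False])

# pack to one bit per element: 8 booleans -> 1 byte
packed = np.packbits(spikes.astype(np.uint8))
restored = np.unpackbits(packed).astype(bool)

print(packed.nbytes, spikes.nbytes)  # 1 byte packed vs 8 bytes unpacked
```

You still have to unpack before doing any vectorized math, so this only
helps if memory, not speed, is the constraint.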
> I can provide the rest of the code if needed, but it's basically just
> filling some vectors with random and empty data and initializing a few
> things.
It would kind of help, since it would make it clearer what's a scalar
and what's an array, and what the dimensions of the various arrays
are.
> for n in range(0,time_milliseconds):
>     self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:]
>     self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff))
You can use "scale" to rescale the random numbers on creation; that'll
save you a temporary.
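Concretely, with illustrative values for sigma and naff:

```python
import numpy as np

sigma, naff = 0.5, 14000

# two arrays: the standard normals, then a scaled copy
noise_scaled = sigma * np.random.standard_normal(size=naff)

# one array: the samples come out already scaled
noise_direct = np.random.normal(scale=sigma, size=naff)
```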
>     self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta)
>
>     idx_spk = np.where(self.v>=self.theta)
You can probably skip the "where"; the result of the expression
self.v>=self.theta is a boolean array, which you can use directly for
indexing.
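A toy example (values invented, not from James's vectors):

```python
import numpy as np

v = np.array([0.2, 1.5, 0.7, 2.0])
theta = np.array([1.0, 1.0, 1.0, 1.0])

# the comparison already yields a boolean array...
spiked = v >= theta

# ...which indexes directly, with no np.where in between
S = np.zeros(4)
S[spiked] = 1

print(S)  # [0. 1. 0. 1.]
```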
>     self.S[n,idx_spk] = 1
>     self.theta[idx_spk] = self.theta[idx_spk] + self.b
+= here might speed things up, not just in terms of temporaries but by
saving a fancy-indexing operation.
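That is, continuing the toy values from above (a sketch, not the real
update):

```python
import numpy as np

theta = np.array([1.0, 1.0, 1.0, 1.0])
b = 0.25
spiked = np.array([False, True, False, True])

# in-place augmented assignment on the masked elements
theta[spiked] += b

print(theta)  # [1.   1.25 1.   1.25]
```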
>     self.prev_u = self.u
>     self.prev_theta = self.theta
Anne