[Numpy-discussion] Index Array Performance

Marcel Oliver m.oliver@jacobs-university...
Mon Feb 13 17:23:49 CST 2012


Hi,

I have a short piece of code where the use of an index array "feels
right", but incurs a severe performance penalty: It's about an order
of magnitude slower than all other operations with arrays of that
size.

It comes up in a piece of code which is doing a large number of "on
the fly" histograms via

  hist[i,j] += 1

where i is an array with the bin index to be incremented and j is
simply enumerating the histograms.  I attach a full short sample code
below which shows how it's being used in context, and corresponding
timeit output from the critical code section.

Questions:

- Is this a matter of principle, or due to an inefficient
  implementation?
- Is there an equivalent way of doing it which is fast?

Regards,
Marcel

=================================================================

#! /usr/bin/env python
# Plot the bifurcation diagram of the logistic map

from pylab import *

Nx = 800
Ny = 600
I = 50000

rmin = 2.5
rmax = 4.0
ymin = 0.0
ymax = 1.0

rr = linspace (rmin, rmax, Nx)
x = 0.5*ones(rr.shape)
hist = zeros((Ny+1,Nx), dtype=int)
j = arange(Nx)

dy = ymax/Ny

def f(x):
    return rr*x*(1.0-x)

for n in xrange(1000):
    x = f(x)

for n in xrange(I):
    x = f(x)
    i = array(x/dy, dtype=int)
    hist[i,j] += 1

figure()

imshow(hist,
       cmap='binary',
       origin='lower',
       interpolation='nearest',
       extent=(rmin,rmax,ymin,ymax),
       norm=matplotlib.colors.LogNorm())

xlabel ('$r$')
ylabel ('$x$')

title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$')

show()

====================================================================

In [4]: timeit y=f(x)
10000 loops, best of 3: 19.4 us per loop

In [5]: timeit i = array(x/dy, dtype=int)
10000 loops, best of 3: 22 us per loop

In [6]: timeit img[i,j] += 1
10000 loops, best of 3: 119 us per loop


More information about the NumPy-Discussion mailing list