[Numpy-discussion] Rookie problems - Why is C-code much faster?

Mads Ipsen mpi at osc.kiku.dk
Tue Feb 21 06:18:04 CST 2006


Hey,

I am relatively new to python and Numeric, and am currently involved
in a project of developing molecular dynamics code written in python
and using Numeric for number crunching. During this, I've run into
some problems, that I hope I can get some assistance with here. My
main issues here are:

1. around() appears to be slow
2. C code appears to be much faster


1. One of the bottle necks in MD is the calculation of the distances
between all particle pairs.

In MD simulations with periodic boundary conditions, you have to
estimate the shortest distance between all the particle pairs in your
system. Fx. on a line of length box = 10, the distance dx between two
points x0 = 1 and x1 = 9, will be dx = -2 (and NOT dx = 9). One way to
do this in numpy is

    dx  = x1 - x0
    dx -= around(dx/box)

My first observation here, is that around() seems to be very slow. So
I looked in umathmodule.c and implemented rint() from the C math
library and made my own custom Numeric module. This gives a speed-up
by a factor of app. 4 compared to around().

I suggest that rint() is added as a ufunc or is there any concerns
here that I am not aware of?

2. Here is the main loop for finding all possible pair distances,
   which corresponds to a loop over the upper triangular part of a
   square matrix

     # Loop over all particles
     for i in range(n-1):
        dx = x[i+1:] - x[i]
        dy = y[i+1:] - y[i]

        dx -= box*rint(dx/box)
        dy -= box*rint(dy/box)

        r2 = dx**2 + dy**2      # square of dist. between points

where x and y contain the positions of the particles. A naive
implementation in C is


    // loop over all particles
    for (int i=0; i<n-1; i++) {
        for (int j=i+1; j<n; j++) {
            dx = x[j] - x[i];
            dy = y[j] - y[i];

            dx -= box*rint(dx/box);
            dy -= box*rint(dy/box);

            r2 = dx*dx + dy*dy;
        }
    }

For n = 2500 particles, i.e. 3123750 particle pairs, the C loop is
app. 10 times faster than the Python/Numeric counterpart. This is of
course not satisfactory.

Are there any things I am doing completely wrong here, basic
approaches completely misunderstood, misuses etc?

Any suggestions, guidelines, hints are most welcome.

Best regards,

Mads Ipsen


+---------------------------------+-------------------------+
| Mads Ipsen                      |                         |
| Dept. of Chemistry              | phone:     +45-35320220 |
| H.C.Ørsted Institute            | fax:       +45-35320322 |
| Universitetsparken 5            |                         |
| DK-2100 Copenhagen Ø, Denmark   | mpi at osc.kiku.dk         |
+---------------------------------+-------------------------+




More information about the Numpy-discussion mailing list