[Numpy-discussion] huge array calculation speed
Thu Jul 10 14:13:00 CDT 2008
Charles R Harris wrote:
> On Thu, Jul 10, 2008 at 10:38 AM, Dan Lussier <email@example.com
> <mailto:firstname.lastname@example.org>> wrote:
> I am relatively new to numpy and am having trouble with the speed of
> a specific array based calculation that I'm trying to do.
> What I'm trying to do is to calculate the total total potential
> energy and coordination number of each atom within a relatively large
> simulation. Each atom is at a position (x,y,z) given by a row in a
> large array (approximately 1e6 by 3) and presently I have no
> information about its nearest neighbours so each its position must be
> checked against all others before cutting the list down prior to
> calculating the energy.
> This looks to be O(n^2) and might well be the bottle neck. There are
> various ways to speed up such things but more information would help
> determine the method, i.e., is this operation within a loop so that
> the values change a lot. However, one quick thing to try is a sort on
> one of the coordinates so you only need to check a subset of the
> vectors. Searchsorted could be useful here also.
> Numpy-discussion mailing list
If I understand correctly, I notice that you are doing many scalar
multiplications where some are the same or constants. For example, you
compute r2*r2 a few times:
But you don't need this numpy.power(r2,6) as it can be factored. Also the division is unnecessary.
So the liTotal is really:
ljTotal[i] = 2.0*((numpy.power(r2,-3)*(numpy.power(r2,-3)-1)).sum(axis=0))
However, the real problem is getting the coordinates that I do not
follow and limits how you can remove the loop over xyz.
Like why the need for the repeated (r0-xyz)?
This is a huge cost that you need to avoid perhaps by just extracting
those elements within your criteria.
More information about the Numpy-discussion