[Numpy-discussion] fast way of doing "cross-multiplications" ?

Perry Greenfield perry at stsci.edu
Tue Jul 18 10:23:53 CDT 2006

On Jul 18, 2006, at 10:23 AM, Eric Emsellem wrote:

> Hi,
> I have a specific quantity to derive from an array, and I am at the
> moment unable to do it for a too large array because it just takes too
> long! So I am looking for an advice on how to efficiently compute  
> such a
> quantity:
> I have 3 arrays of N floats (x[...], y[..], z[..]) and I wish to do:
> result = 0.
> for i in range(N) :
>    for j in range(i+1,N,1) :
>       result += 1. / sqrt((x[j] - x[i])**2 + (y[j] - y[i])**2 + (z 
> [j] -
> z[i])**2)
> Of course the procedure written above is very inefficient and I  
> thought
> of doing:
> result = 0.
> for i in range(N) :
>    result += 1. / sqrt((x[i+1:] - x[i])**2 + (y[i+1:] - y[i])**2 +
> (z[i+1:] - z[i])**2)
> Still, this is quite slow and not workable for very large arrays (>  
> 10^6
> floats per array).
> Any hint on how to speed things up here?
> Thanks in advance!
> Eric

Perhaps I'm misunderstanding the last variant but don't you want  
something like:

result = 0.
for i in range(N) :
    result += add.reduce(1. / sqrt((x[i+1:] - x[i])**2 + (y[i+1:] - y 
[i])**2 +
(z[i+1:] - z[i])**2))

instead since the expression yields an array with a decreasing size  
each iteration?

But besides that, it seems you are asking to do roughly 10^12 of  
these computations  for 10^6 points. I don't see any way to avoid  
that given what you are computing. The solution Bill Baxter gives is  
fine (I think, I haven't looked at it closely), but the usual problem  
of doing it without any looping is that it requires an enormous  
amount of memory (~10^12 element arrays) if I'm not mistaken. Since  
your second example is iterating over large arrays (most of the time,  
not near the end), I'd be surprised if you can do much better than  
that (the looping overhead should be negligible for such large  
arrays). Do you have examples written in other languages that run  
much faster? I guess I would be surprised to see it possible to do  
more than a few times faster in any language without some very clever  


More information about the Numpy-discussion mailing list