[Numpy-discussion] distance matrix speed
Michael Sorich
michael.sorich at gmail.com
Fri Jun 16 01:26:37 CDT 2006
Hi Sebastian,
I am not sure if there is a function already defined in numpy, but
something like this may be what you are after:
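from numpy import newaxis, sqrt, sum
def distance(a1, a2):
    # a1 has shape (m, k), a2 has shape (n, k); the result has shape (m, n)
    return sqrt(sum((a1[:,newaxis,:] - a2[newaxis,:,:])**2, axis=2))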
The general idea is to avoid explicit Python loops and operate on whole
arrays at once if you want the code to execute fast. I hope this helps.
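
To make the broadcasting idea concrete, here is a minimal, self-contained sketch of the same function next to a loop-based reference, using the array sizes from your example. The names distance_loops, rng, A, B and d are only illustrative, and np.random.default_rng is a modern NumPy call that stands in for random() in the original code.

```python
import numpy as np

def distance(a1, a2):
    """Pairwise Euclidean distances between rows of a1 (m, k) and a2 (n, k)."""
    # a1[:, np.newaxis, :] has shape (m, 1, k) and a2[np.newaxis, :, :] has
    # shape (1, n, k), so broadcasting gives a difference of shape (m, n, k).
    diff = a1[:, np.newaxis, :] - a2[np.newaxis, :, :]
    # Summing the squares over the last axis leaves an (m, n) result.
    return np.sqrt(np.sum(diff**2, axis=2))

def distance_loops(a1, a2):
    """Loop-based reference (illustrative name), used only to check the result."""
    d = np.zeros((a1.shape[0], a2.shape[0]))
    for i in range(a1.shape[0]):
        for j in range(a2.shape[0]):
            d[i, j] = np.sqrt(np.sum((a1[i] - a2[j])**2))
    return d

# Same sizes as in the original question: 4 points against 1000 points in 2-D.
rng = np.random.default_rng(0)
A = rng.random((4, 2))
B = rng.random((1000, 2))
d = distance(A, B)
```

One caveat: the broadcast version builds an intermediate array of shape (m, n, k), so for very large inputs memory use may become the limiting factor. Current SciPy releases also provide scipy.spatial.distance.cdist, which computes the same pairwise Euclidean distances.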
Mike
On 6/16/06, Sebastian Beca <sebastian.beca at gmail.com> wrote:
> Hi,
> I'm working with NumPy/SciPy on some algorithms and I've run into some
> significant speed differences compared with Matlab 7. I've narrowed the
> main speed problem down to computing the Euclidean distance between the
> rows of two matrices that share one dimension (dist in Matlab):
>
> Python:
> def dtest():
>     A = random( [4,2])
>     B = random( [1000,2])
>
>     d = zeros([4, 1000], dtype='f')
>     for i in range(4):
>         for j in range(1000):
>             d[i, j] = sqrt( sum( (A[i] - B[j])**2 ) )
>     return d
>
> Matlab:
> A = rand( [4,2])
> B = rand( [1000,2])
> d = dist(A, B')
>
> Running both of these 100 times, I've found the Python version to run
> 10 to 20 times slower. Is there a faster way to do this? Perhaps I'm
> not using the correct functions/structures? Or is this as good as it
> gets?
>
> Thanks in advance,
>
> Sebastian Beca
> Department of Computer Science Engineering
> University of Chile
>
> PS: I'm using NumPy 0.9.8, SciPy 0.4.8. I also understand I have
> ATLAS, BLAS and LAPACK all installed, but I haven't confirmed that.