[Numpy-discussion] Distance Matrix speed

Bruce Southey bsouthey at gmail.com
Fri Jun 16 09:20:40 CDT 2006


Hi,
Please run exactly the same code in Matlab that you are running in
NumPy. Many Matlab functions are highly optimized and are provided as
compiled binary functions. I think you are running into this, so you
are comparing an interpreted Python loop against compiled code, which
is not a like-for-like comparison.

So the ways around it are to write an extension in C or Fortran, use
Psyco etc. if possible, and vectorize your algorithm to remove the
loops (especially the inner one).
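
As a rough illustration of the vectorized route, here is a minimal
sketch that replaces both loops with broadcasting. It assumes a
reasonably current NumPy imported as np; the function names dist_loop
and dist_vectorized are made up for this example and are not part of
NumPy or SciPy.

```python
import numpy as np

def dist_loop(A, B):
    """Reference version: the double loop from the quoted message."""
    d = np.zeros((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        for j in range(B.shape[0]):
            d[i, j] = np.sqrt(np.sum((A[i] - B[j]) ** 2))
    return d

def dist_vectorized(A, B):
    """Same distances, with both loops replaced by broadcasting."""
    # A[:, newaxis, :] has shape (m, 1, k); B[newaxis, :, :] has shape (1, n, k).
    # Subtracting them broadcasts to every pair of rows: shape (m, n, k).
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    # Sum the squared differences over the shared last axis, then take the root.
    return np.sqrt((diff ** 2).sum(axis=-1))

A = np.random.random((4, 2))
B = np.random.random((1000, 2))
d = dist_vectorized(A, B)   # shape (4, 1000)
```

The intermediate array holds m*n*k values, so this trades memory for
speed; for very large inputs it may be worth processing B in chunks.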

Bruce

On 6/14/06, Sebastian Beca <sebastian.beca at gmail.com> wrote:
> Hi,
> I'm working with NumPy/SciPy on some algorithms and I've run into some
> important speed differences wrt Matlab 7. I've narrowed the main speed
> problem down to the operation of finding the Euclidean distance
> between two matrices that share one dimension (dist in Matlab):
>
> Python:
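> from numpy import sqrt, sum, zeros
> from numpy.random import random
>
> def dtest():
>     A = random([4, 2])
>     B = random([1000, 2])
>
>     # d[i, j] is the Euclidean distance between row i of A and row j of B
>     d = zeros([4, 1000], dtype='f')
>     for i in range(4):
>         for j in range(1000):
>             d[i, j] = sqrt(sum((A[i] - B[j])**2))
>     return d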
>
> Matlab:
>     A = rand( [4,2])
>     B = rand( [1000,2])
>     d = dist(A, B')
>
> Running both of these 100 times, I've found the Python version to run
> 10 to 20 times slower. Is there a faster way to do this? Perhaps I'm
> not using the correct functions/structures? Or is this as good as it
> gets?
>
> Thanks in advance,
>
> Sebastian Beca
> Department of Computer Science Engineering
> University of Chile
>
> PS: I'm using NumPy 0.9.8, SciPy 0.4.8. I also understand I have
> ATLAS, BLAS and LAPACK all installed, but I haven't confirmed that.
>
>
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>



