[Numpy-discussion] Distance Matrix speed

Sebastian Beca sebastian.beca at gmail.com
Fri Jun 16 18:01:44 CDT 2006


Thanks! Avoiding the inner loop is MUCH faster (~20-300 times faster than the
original). Nevertheless I don't think I can use hypot as it only works
for two dimensions. The general problem I have is:

A = random( [C, K] )
B = random( [N, K] )

C ~ 1-10
N ~ Large (thousands, millions.. i.e. my dataset)
K ~ 2-100 (dimensions of my problem, i.e. not fixed a priori.)

I adapted your proposed version to this for K dimensions:

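from numpy import zeros, sqrt, sum

def d4():
    # Row i holds the Euclidean distances from A[i] to every row of B.
    # Sizes come from A and B instead of being hard-coded.
    d = zeros([A.shape[0], B.shape[0]], dtype=float)
    for i in range(A.shape[0]):
        xy = A[i] - B                      # A[i] (K,) broadcast against B (N, K)
        d[i] = sqrt( sum(xy**2, axis=1) )  # sum over the K dimensions
    return d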

Maybe there's another alternative to d4?
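
Two loop-free candidates, sketched only as possibilities and not benchmarked here. The names d_broadcast and d_dot are made up for illustration, and both assume A has shape (C, K) and B has shape (N, K) as described above. The first broadcasts the difference over all pairs at once; the second uses the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b so that the heavy work is a single matrix product.

```python
import numpy as np

def d_broadcast(A, B):
    """Euclidean distances between rows of A (C, K) and rows of B (N, K)."""
    # diff has shape (C, N, K), so it needs C*N*K floats of temporary memory.
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def d_dot(A, B):
    """Same distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b."""
    a2 = (A ** 2).sum(axis=1)[:, np.newaxis]   # shape (C, 1)
    b2 = (B ** 2).sum(axis=1)[np.newaxis, :]   # shape (1, N)
    sq = a2 + b2 - 2.0 * A.dot(B.T)            # shape (C, N)
    # Rounding can leave tiny negative values; clip them before the sqrt.
    return np.sqrt(np.maximum(sq, 0.0))
```

With N in the millions, the (C, N, K) temporary in d_broadcast may be too large to hold in memory, while d_dot only allocates (C, N) arrays. The trade-off is that d_dot can lose some precision when two points are nearly identical.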
Thanks again,

Sebastian.

>     def d_2():
>         d = zeros([4, 10000], dtype=float)
>         for i in range(4):
>             xy = A[i] - B
>             d[i] = xy[:,0]**2 + xy[:,1]**2
>         return d
>
> This is something like 250 times as fast as the naive Python solution;
> another five times faster than the fastest distance computing version
> that I could come up with (using hypot).
>
> -tim