[Numpy-discussion] Distance Matrix speed

Sebastian Beca sebastian.beca at gmail.com
Sun Jun 18 17:49:27 CDT 2006


I checked the Matlab version's code and it does the same as discussed
here. The only thing to check is that you loop over the shorter
dimension of the output array. Speed-wise, the Matlab code still runs
about twice as fast for large data sets (timed by hand, nothing more
rigorous). Nevertheless, the improvement over calculating each value
individually, as in d1, is significant (10-300 times) and enough for my
needs. Thanks to all.

Sebastian Beca

PS: I also tried the d5 version Alex sent, but its results are not the
same, so I couldn't compare.

My final version was:

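from numpy import zeros, sqrt, sum, random  # numpy's sum accepts axis=

K = 10    # number of coordinates per point
C = 3     # number of points in B
N = 2500  # number of points in A. One could switch around C and N now.
A = random.random([N, K])
B = random.random([C, K])

def dist():
    # d[i, j] is the Euclidean distance between A[i] and B[j]
    d = zeros([N, C], dtype=float)
    # loop over the shorter dimension of the output array
    if N < C:
        for i in range(N):
            xy = A[i] - B
            d[i,:] = sqrt(sum(xy**2, axis=1))
    else:
        for j in range(C):
            xy = A - B[j]
            d[:,j] = sqrt(sum(xy**2, axis=1))
    return d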
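
As a cross-check on the loop version above, here is a minimal sketch
that compares it against a loop-free broadcasting variant. The names
dist_loop and dist_broadcast are mine, not from the thread, and the
broadcasting variant is a different technique from the one discussed:
it builds a temporary (N, C, K) array, so it may well be slower or use
too much memory for large inputs. It also uses the .sum() method that
Johannes suggested. Timings will depend on the machine, so none are
quoted here.

```python
import timeit

import numpy as np


def dist_loop(A, B):
    """Euclidean distances, looping over the shorter output dimension."""
    N, C = A.shape[0], B.shape[0]
    d = np.zeros((N, C), dtype=float)
    if N < C:
        for i in range(N):
            d[i, :] = np.sqrt(((A[i] - B) ** 2).sum(axis=1))
    else:
        for j in range(C):
            d[:, j] = np.sqrt(((A - B[j]) ** 2).sum(axis=1))
    return d


def dist_broadcast(A, B):
    """Loop-free variant; allocates an (N, C, K) temporary array."""
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


# Same sizes as in the post; the seed is arbitrary.
rng = np.random.RandomState(0)
A = rng.random_sample((2500, 10))
B = rng.random_sample((3, 10))

# Both variants should agree up to floating-point rounding.
print(np.allclose(dist_loop(A, B), dist_broadcast(A, B)))

# Rough timing of 20 calls each, in seconds.
print(timeit.timeit(lambda: dist_loop(A, B), number=20))
print(timeit.timeit(lambda: dist_broadcast(A, B), number=20))
```

Swapping the two arguments exercises the other branch of dist_loop
(fewer rows than columns) and should give the transposed result.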


On 6/17/06, Johannes Loehnert <a.u.r.e.l.i.a.n at gmx.net> wrote:
> Hi,
>
> > def d4():
> >     d = zeros([4, 1000], dtype=float)
> >     for i in range(4):
> >         xy = A[i] - B
> >         d[i] = sqrt( sum(xy**2, axis=1) )
> >     return d
> >
> > Maybe there's another alternative to d4?
> > Thanks again,
>
> I think this is the fastest you can get. Maybe it would be nicer to use
> the .sum() method instead of sum function, but that is just my personal
> opinion.
>
> I am curious how this compares to the matlab version. :)
>
> Johannes
