[Numpy-discussion] Fastest distance matrix calc
Fri Apr 13 05:43:44 CDT 2007
I think someone posted some timings about this before but I don't recall.
The task is to compute the matrix D from two sets of vectors x (M,d)
and y (N,d).
The output should be D where D[i,j] is norm(x[i]-y[j])
The Matlab NetLab toolkit uses something like this to compute it:
d2 = (x*x).sum(1)[:,numpy.newaxis] + (y*y).sum(1) - 2*mult(x,y.T)
And then does
because roundoff in the above can sometimes create negative values. And finally
d = sqrt(d2)
But it just doesn't seem like the best way to do it. Whatever was
posted before I don't remember anything about a subtract.outer
solution. Seems like something along the lines of
might be faster, and also avoid the need to check for negatives.
I'll do some timings if no one's already done it. Just wanted to check first.
More information about the Numpy-discussion