[Numpy-discussion] faster way for calculating distance matrix...
Jeffery D. Collins
jcollins at boulder.net
Wed May 21 18:03:02 CDT 2003
Kasper Souren wrote:
>Hi,
>
>I was happy to see solutions coming up for Cliff's problems and it made me
>think: maybe I can mention my problem here as well.
>
>>From an array of feature vectors I want to calculate it's distance matrix.
>Something like [1]. Currently it can take quite a while to calculate the
>stuff for a long array.
>
>Some questions:
>
>1) Is there a smart speed up possible? Like, a way to avoid the double loop?
>It's no problem if this would lead to less generality (like the choice for a
>distance function). I know there is a little speed up to be gained by leaving
>out the array(f) thing, but that's not what I'm looking for.
>
>2) Is it possible (in Numeric or numarray) to define a class DiagonalMatrix
>that at least saves half of the memory?
>
>3) If 1) is not possible, what would be the way to go for speeding it up by
>writing it in C? weave because of its availability in scipy or would pyrex be
>more interesting, or are there even more options..?
>
>bye,
>Kasper
>
>
>
>[1] Example program:
>
>import Numeric
>
>def euclidean_dist(a, b):
> diff = a - b
> return Numeric.dot(diff, diff)
>
>def calc_dist_matrix(f, distance_function):
> W = Numeric.array(f)
> length = W.shape[0]
> S = Numeric.zeros((length, length)) * 1.0
>
> for i in range(length):
> for j in range(i):
> S[j, i] = S[i, j] = distance_function(W[i], W[j])
> return S
>
>print calc_dist_matrix(Numeric.sin(Numeric.arange(30)), euclidean_dist)
>
>
The following is a bit faster (at the cost of some memory):
def euclidean_dist(a, b):
diff = a - b
return diff**2
def calc_dist_matrix2(f, distance_function):
W = Numeric.array(f)
i,j = indices((len(W),)*2)
S = distance_function(take(W,i), take(W,j))
return S
--
Jeffery Collins (http://www.boulder.net/~jcollins)
More information about the Numpy-discussion
mailing list