Charles R Harris
Fri May 2 21:36:53 CDT 2008
On Fri, May 2, 2008 at 8:02 PM, Keith Goodman <email@example.com> wrote:
> On Fri, May 2, 2008 at 6:29 PM, Charles R Harris
> <firstname.lastname@example.org> wrote:
> > Isn't the lengthy part finding the distance between clusters? I can think
> > of several ways to do that, but I think you will get a real speedup by doing
> > that in C or C++. I have a module made in boost python that holds
> > and returns a list of lists containing their elements. Clusters are merged
> > by joining any two elements, one from each. It wouldn't take much to add a
> > distance function, but you could use the list of indices in each cluster to
> > pull a subset out of the distance matrix and then find the minimum
> > in that. This also reminds me of Huffman codes.
> You're right. Finding the distance is slow. Is there any way to speed
> up the function below? It returns the row and column indices of the
> min value of the NxN array x.
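> import numpy as np
> def dist(x):
>     # Add a large value to the diagonal so self-distances are never the minimum
>     x = x + 1e10 * np.eye(x.shape[0])
>     i, j = np.where(x == x.min())
>     return i, j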
> >> x = np.random.rand(500,500)
> >> timeit dist(x)
> 100 loops, best of 3: 14.1 ms per loop
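One possible way to trim this, offered as an untimed sketch rather than a measured result: mask the diagonal in a single copy and let argmin locate the minimum, instead of building the eye matrix and the boolean comparison array. The name dist_argmin is made up for illustration. Note that it returns only the first minimum it finds, whereas the np.where version returns every tie (for a symmetric matrix, both (i, j) and (j, i)).

```python
import numpy as np

def dist_argmin(x):
    """Return (i, j) of the smallest off-diagonal entry of the square array x."""
    x = x.copy()                 # leave the caller's array untouched
    np.fill_diagonal(x, np.inf)  # self-distances can never be the minimum
    # argmin scans the flattened array; unravel_index maps the flat
    # position back to a (row, column) pair
    return np.unravel_index(np.argmin(x), x.shape)
```

Whether this is actually faster on a 500x500 array would need to be checked with timeit on your machine.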
> If the clustering gives me useful results, I'll ask you about your
> boost code. I'll also take a look at Damian Eads's scipy-cluster.
That package looks nice. I think your time would be better spent learning
how to use it than in rolling your own routines.
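For reference, the idea of pulling a subset out of the distance matrix with the index lists of two clusters could look roughly like the following. This is only a sketch in plain NumPy, not the boost python module; the function name cluster_distance and its arguments are hypothetical.

```python
import numpy as np

def cluster_distance(d, a, b):
    """Smallest pairwise distance between clusters a and b.

    d is the full NxN distance matrix; a and b are lists of row/column
    indices naming the elements of each cluster.
    """
    # np.ix_ builds an open mesh, so this selects the len(a) x len(b)
    # block of distances from elements of a to elements of b
    sub = d[np.ix_(a, b)]
    return sub.min()
```

For example, with clusters [0, 1] and [2, 3] this looks only at the four entries d[0, 2], d[0, 3], d[1, 2] and d[1, 3].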