# [SciPy-User] Speeding up a search algorithm

R. Padraic Springuel R.Springuel@umit.maine....
Wed Jun 2 16:24:19 CDT 2010

```Eric wrote:
> You may want to work only on one part of your array.
>
> For instance this :
> [(x,y) for x in range(4) for y in range(x)]
>
> will give you
> [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
>
> that can be used as the indexes for the lower half of your matrix, without the
> diagonal.
>
> Then, once you have selected only this part of the array, you can use
> array.min().

I'm not sure I understand what you're proposing.  I tried these possible
variations:
>>> a = arange(100).reshape(10,10)
>>> a[(x,y) for x in range(4) for y in range(x)]
File "<stdin>", line 1
a[(x,y) for x in range(4) for y in range(x)]
^
SyntaxError: invalid syntax
>>> b = [(x,y) for x in range(4) for y in range(x)]
>>> a[b]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many indices for array
>>> a.take(b)
array([[1, 0],
       [2, 0],
       [2, 1],
       [3, 0],
       [3, 1],
       [3, 2]])

None of these returns the appropriate part of the array (two raise
errors instead of returning anything).
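
# A minimal sketch, not taken from the thread, of one way the index list
# above could be applied. It assumes NumPy is imported as np; the names
# rows, cols, lower, r, c, smallest and i are only illustrative.
import numpy as np

a = np.arange(100).reshape(10, 10)
b = [(x, y) for x in range(4) for y in range(x)]

# a.take(b) reads b as positions in the flattened array, and since a is
# arange(100) the values it returns equal the positions themselves.
# Indexing several elements at once needs one index sequence per axis,
# so the (row, col) pairs are split into rows and columns first.
rows, cols = zip(*b)
lower = a[list(rows), list(cols)]  # elements at (1,0), (2,0), (2,1), (3,0), (3,1), (3,2)

# np.tril_indices builds the same kind of index for the whole matrix;
# k=-1 leaves out the diagonal.
r, c = np.tril_indices(a.shape[0], k=-1)
smallest = a[r, c].min()

# Position of that minimum in the original matrix, if it is needed.
i = a[r, c].argmin()
position = (int(r[i]), int(c[i]))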

Chuck wrote:
> What is the larger picture here? This sounds like a bit like you are
> shooting for some
> sort of clustering and there may already be an appropriate algorithm for it.

Yes, this is part of a clustering algorithm.  The problem is that
algorithms already in scipy don't support everything that I need to do
here and I don't know anything about C (and so can't figure out how to
modify them to actually do what I want them to do).  I had the same
problem with PyCluster back before scipy had clustering algorithms
incorporated into it (except for kmeans) and wrote my own package that
did everything that I wanted it to do, though it is written in pure
Python.  This algorithm comes from that package of mine.  I'm trying to
speed it up because it can take 24 hours or so to complete the
clustering on the ~3500 point data sets I'm working with now.

--