[SciPy-User] Speeding up a search algorithm

Éric Depagne edepagne@lcogt....
Wed Jun 2 13:24:22 CDT 2010


Le mercredi 2 juin 2010 11:08:54, R. Padraic Springuel a écrit :
> I've got a search algorithm I've written which is designed to find the
> minimum value of a rank 2 ndarray (called search) and remember it
> (calling it m) and where that value was located (with the values of n1
> and n2).  While that may seem a simple enough task using min and argmin
> functions, there are a few of caveats that make using those functions
> impractical (at least they appear impractical to me):
> 
> 1) The array being searched is an extract from a larger array (made
> using take twice on the larger array with the rows and columns being
> taken specified by a list of indecies called current).  What we want to
> know is the location of the minimum in the original array, not in the
> extract, so the values for n1 and n2 have to be taken from current.
> 
> 2) The array is distance array.  I.e. it is symmetric about the main
> diagonal and all the main diagonal elements are 0.  However, these main
> diagonal elements should not count as being the minimum value of the array.
> 
> 3) The minimum value does not necessarily occur only once in the array
> (even when the symmetry is taken into account) and the way to choose
> between multiple minimums can vary.  Which variation is used is
> specified by the value of aggr (None, True, or False).
> 
> As it stands, the algorithm looks like this:
> 
> search = distancematrix.take(current,axis=0).take(current,axis=1)
> m = numpy.max(search)
> n1 = current[0]
> n2 = current[1]
> if aggr:
>      p1 = 0
> else:
>      p1 = N
> for i in range(len(search)):
>      for j in range(len(search)):
>          if i == j:
>              break
>          else:
>              if search[i][j] < m or numpy.isnan(m):
>                  m = search[i][j]
>                  n1 = current[i]
>                  n2 = current[j]
>                  if n1 < 0:
>                      p1 = tree.pop[n1]
>                  else:
>                      p1 = 1
>                  if n2 < 0:
>                      p1 += tree.pop[n2]
>                  else:
>                      p1 += 1
>              elif search[i][j] == m and aggr != None:
>                  if current[i] < 0:
>                      p2 = tree.pop[current[i]]
>                  else:
>                      p2 = 1
>                  if current[j] < 0:
>                      p2 += tree.pop[current[j]]
>                  else:
>                      p2 += 1
>                  if p2 < p1 and not aggr:
>                      n1 = current[i]
>                      n2 = current[j]
>                      p1 = p2
>                  elif p2 > p1 and aggr:
>                      n1 = current[i]
>                      n2 = current[j]
>                      p1 = p2
> 
> 
> However, I've found it to be fairly slow, especially on large arrays,
> when compared to min and argmin (probably due to all the looping in
> python).  Does anyone have any suggestions for optimizing this function
> or otherwise speeding it up?
> 
You may want to work only on one part of your array.

For instance this :
[(x,y) for x in range(4) for y in range(x)]

will give you
[(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]

that can be used as the indexes for the lower half of your matrix, without the 
diagonal.

Then, once you have selected only this part of the array, you can use 
array.min().

Éric.
-- 
Un clavier azerty en vaut deux
----------------------------------------------------------
Éric Depagne                            edepagne@lcogt.net
Las Cumbres Observatory 
6740 Cortona Dr
Goleta CA, 93117
----------------------------------------------------------


More information about the SciPy-User mailing list