[SciPy-user] Scipy for Cluster Detection

Stefan van der Walt stefan@sun.ac...
Sat Dec 29 10:43:25 CST 2007


On Thu, Dec 27, 2007 at 01:24:26AM -0500, David Warde-Farley wrote:
> Once you have this your problem is what graph theorists would call  
> identifying the "connected components", which is a well-studied  
> problem and can be done fairly easily, Google "finding connected  
> components" or pick up any algorithms textbook at your local library  
> such as the Cormen et al "Introduction to Algorithms".

I have implemented the O(N) algorithm described in

  Christophe Fiorio and Jens Gustedt,
  "Two linear time Union-Find strategies for image processing",
  Theoretical Computer Science 154 (1996), pp. 165-181.

and

  Kensheng Wu, Ekow Otoo and Arie Shoshani,
  "Optimizing connected component labeling algorithms",
  Paper LBNL-56864, 2005,
  Lawrence Berkeley National Laboratory
  (University of California),
  http://repositories.cdlib.org/lbnl/LBNL-56864.

The code can be found at

http://mentat.za.net/source/connected_components.tar.bz2

Run 'scons' to compile (I can send a setup.py if necessary).

Typical usage:

x = N.array([[0, 0, 3, 2, 1, 9],
             [0, 1, 1, 9, 2, 9],
	     [0, 0, 1, 9, 9, 9],
             [3, 1, 1, 5, 3, 0]])

labels = connected_components.linear(x)

Result:

[[0, 0, 1, 2, 3, 4],
 [0, 5, 5, 4, 6, 4],
 [0, 0, 5, 4, 4, 4],
 [7, 5, 5, 8, 9, 10]])

The given source searches for neighbours that have the same values,
but this can easily be modified to look for values within a certain
difference range.

Regards
Stéfan


More information about the SciPy-user mailing list