[SciPy-user] Looking for a way to cluster data

Anne Archibald peridot.faceted@gmail....
Mon Apr 27 13:51:26 CDT 2009

2009/4/25 Gary Ruben <gruben@bigpond.net.au>:
> Hi all,
> I'm looking for some advice on how to order data points so that I can
> visualise them. I've been looking at scipy.cluster for this purpose but
> I'm not sure whether it is suitable so I thought I'd see whether anyone
> had suggestions for a simpler way to order the coordinates.
> I have a binary 3D array containing 1's that form a shape in a 3D volume
> against a background of 0's - they form a skeleton of a connected,
> branched structure. Furthermore, the points are all 26-connected to each
> other, i.e. there are no gaps in the skeleton. The longest chains may be
> 1000's of points long.
> It would be nice to visualise these using the mayavi mlab plot3d
> function, which draws tubes and which requires ordered coordinates as
> input, so I need to get ordered coordinate lists that traverse the
> points along the branches of the skeleton. It would also be nice to
> preferentially cluster long chains since then I can cull very short
> chains from the visualisation.
> scipy.cluster seems to be able to cluster the points but I'm not sure
> how to get the x,y,z coordinates of the original points out of its
> linkage data. This may not be possible. Maybe the scipy.spatial module
> is a better match to my problem.
> Any suggestions?

If I understand you correctly, what you have is not a list of
coordinates of freely-located points, it's a binary mask indicating
which voxels are part of your object. So first of all, have you
considered using volumetric visualization tools? These seem like they
might be a better fit to your problem.

If what you want to know about is the connectivity of your object,
though, I can see why you might want to build chains of rods. The most
direct approach is, for each cell that is a 1, to draw rods from it to
each of its neighbors that is also on. This may not give you what you
want: if you have regions where all the cells are on, you'll get a
dense grid of rods. Nor will it let you pass long strings of rods to
your 3D toolkit, or eliminate short chains.
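
To make the direct approach concrete, here is a minimal sketch that lists
every rod (pair of adjacent "on" voxels) in a binary mask, assuming
26-connectivity as in the original question. The function name
neighbor_edges is made up for illustration; only NumPy is assumed.

```python
import itertools

import numpy as np


def neighbor_edges(mask):
    """Return pairs of 26-connected "on" voxels, each pair listed once."""
    # Coordinates of all "on" voxels, as plain tuples so they can be hashed.
    on = {tuple(int(c) for c in p) for p in np.argwhere(mask)}
    # Keep only the 13 lexicographically positive offsets (of the 26),
    # so that every undirected edge is generated exactly once.
    offsets = [d for d in itertools.product((-1, 0, 1), repeat=3)
               if d > (0, 0, 0)]
    edges = []
    for p in sorted(on):
        for d in offsets:
            q = (p[0] + d[0], p[1] + d[1], p[2] + d[2])
            if q in on:
                edges.append((p, q))
    return edges
```

On a thin diagonal line this gives one rod per step, but a fully "on"
2x2x2 block already produces a rod between every pair of its 8 voxels,
which is the dense-grid problem described above.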

As I see it, then, your problem is graph-theoretic: you have this
fairly dense adjacency graph of "on" cells, and you want to pare it
down. One good choice would be to produce a (minimum diameter?)
spanning tree, which should be pretty easy to convert to a collection
of strings of rods.  But I think what you want is a graph library of
some sort.
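
As a rough sketch of the spanning-tree idea without a graph library, the
code below builds a breadth-first spanning tree from a list of edges and
then cuts it into ordered chains that run between endpoints and branch
points. Note the assumptions: this is a plain BFS tree, not a
minimum-diameter one, and the names spanning_tree and chains are
invented for this example.

```python
from collections import defaultdict, deque


def spanning_tree(edges):
    """Build a BFS spanning forest; returns {node: [tree neighbors]}."""
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    tree = defaultdict(list)
    seen = set()
    for root in adj:  # one BFS per connected component
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    tree[u].append(v)
                    tree[v].append(u)
                    queue.append(v)
    return dict(tree)


def chains(tree):
    """Split a tree into ordered paths between endpoints and branch points."""
    out = []
    walked = set()  # undirected tree edges already assigned to a chain
    starts = [n for n in tree if len(tree[n]) != 2]
    for s in starts:
        for nxt in tree[s]:
            if frozenset((s, nxt)) in walked:
                continue
            walked.add(frozenset((s, nxt)))
            path = [s, nxt]
            prev, cur = s, nxt
            # Follow the chain while the current node is a simple link.
            while len(tree[cur]) == 2:
                a, b = tree[cur]
                step = b if a == prev else a
                walked.add(frozenset((cur, step)))
                path.append(step)
                prev, cur = cur, step
            out.append(path)
    return out
```

Each returned chain is an ordered list of nodes, so with voxel
coordinates as nodes it can be unpacked into x, y, z sequences for a
tube-drawing call such as mlab.plot3d, and short chains can be dropped
with something like [c for c in chains(tree) if len(c) >= min_len],
where min_len is a threshold of your choosing.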

On the other hand, if what you have is "fat" chains of on cells, and
you want to build up a "skeleton" of them (like converting the pixels
of an image of the letter o back to a circle), you might look to the
machine vision literature for help; people there do this sort of thing
often.

