[SciPy-user] 2D clustering question
Hazen Babcock
hbabcock@mac....
Mon May 4 18:06:07 CDT 2009
Hello,
I've been using scipy.cluster.hierarchy.fclusterdata() to cluster groups
of points based on their x and y position. This works well for data sets
without too many points, but seems to get pretty slow as the number
of points gets into the high thousands (i.e. 6000+). Does anyone know of
a more specialized clustering algorithm that might be able to handle
even larger numbers of points, i.e. up to 10e6 or so? The points are
spread out over 0 - 200 or so in X and Y and I'm clustering with a 0.5
cutoff. One approach is to break the data set down into smaller sections
based on X,Y coordinate, but perhaps something like this already exists?
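
A minimal sketch of one possible alternative, under the assumption that the 0.5 cutoff is a plain Euclidean distance threshold with single linkage (that is, two points belong to the same cluster whenever a chain of neighbours no more than 0.5 apart connects them). Under that assumption the clusters are the connected components of a "within cutoff" graph, which can be built with a k-d tree rather than a full pairwise distance matrix. The function name cluster_by_cutoff is hypothetical; cKDTree.query_pairs and connected_components are existing SciPy calls (the output_type argument requires a reasonably recent SciPy).

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components


def cluster_by_cutoff(points, cutoff=0.5):
    """Label points so that points linked by steps <= cutoff share a cluster.

    Hypothetical helper: assumes single linkage with a Euclidean
    distance threshold, which may differ from the original settings.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Spatial index over the (x, y) positions.
    tree = cKDTree(points)

    # All index pairs (i, j) with i < j whose distance is at most cutoff.
    pairs = tree.query_pairs(cutoff, output_type="ndarray")

    # Sparse graph with one edge per close pair.
    graph = coo_matrix(
        (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
        shape=(n, n),
    )

    # Each connected component of the graph is one cluster.
    n_clusters, labels = connected_components(graph, directed=False)
    return n_clusters, labels


if __name__ == "__main__":
    # Synthetic points spread over 0 - 200 in X and Y, as in the question.
    rng = np.random.default_rng(0)
    xy = rng.uniform(0.0, 200.0, size=(100000, 2))
    n_clusters, labels = cluster_by_cutoff(xy, cutoff=0.5)
    print(n_clusters, labels[:10])
```

The returned labels start at 0, whereas the flat cluster labels from scipy.cluster.hierarchy start at 1. If the original clustering used a different linkage method or criterion, the results will not match and this sketch would need to be adapted.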
thanks,
-Hazen