[SciPy-user] Kmeans help and C source
costasm at hotmail.com
Mon Jan 7 17:50:48 CST 2002
I need to use a modified K-means algorithm for a project and I was delighted
to discover that SciPy includes a python wrapper for a kmeans() function.
However, I am not quite following the kmeans() functionality (I am new to
this clustering business, so this may be a stupid newbie question): my docs
tell me that kmeans should partition a dataset into k clusters. So, I
expect vq.kmeans(dataset, 2) to return the dataset split up into two
"equivalent" datasets. However, no matter how I feed my data into vq.kmeans(), this
doesn't happen (e.g. I feed it a 5x4 dataset and I get back two 5x1
vectors). My guess is that either this vq.kmeans() does something different
--I confess to not understanding the docstring as the observation/codebook
terminology has no parallel to the docs I've read-- or that I am not doing
something right. Any pointers? Even some documentation on the algorithm
would be a great help.
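
For reference, a minimal sketch of how the observation/codebook terminology maps onto a partition, assuming the scipy.cluster.vq interface of a recent SciPy release (the behaviour of the 2002 wrapper may differ). There, kmeans() returns a codebook (one centroid per row) plus a distortion value, not the split dataset; vq() then assigns each observation to its nearest centroid. The 5x4 array below is made up for illustration.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Hypothetical 5x4 dataset: 5 observations (rows), 4 features (columns).
dataset = np.array([
    [1.0, 1.1, 0.9, 1.0],
    [1.2, 0.9, 1.0, 1.1],
    [0.9, 1.0, 1.1, 0.9],
    [8.0, 8.2, 7.9, 8.1],
    [8.1, 7.9, 8.0, 8.2],
])

# kmeans() returns the centroids ("codebook") and the mean distortion,
# not the partitioned data.
codebook, distortion = kmeans(dataset, 2)

# vq() assigns each observation the index of its nearest centroid.
labels, dists = vq(dataset, codebook)

# Split the dataset into one sub-array per cluster.
clusters = [dataset[labels == i] for i in range(len(codebook))]

print(codebook.shape)             # one row per centroid
print([len(c) for c in clusters]) # number of observations in each cluster
```

The partition comes from combining the two calls: the codebook alone only says where the cluster centres are.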
Secondly, as I mentioned above, I need a modified kmeans. However, I see no
C/Fortran code in the src tarball or CVS that seems related to kmeans. Is
the base code available? If so, is it hackable by a SWIG newbie? (I am
aware of SWIG, but I have never used it for anything serious).
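
As a possible starting point for a modified version, the standard k-means iteration (Lloyd's algorithm) is short enough to write directly with NumPy. The sketch below is not SciPy's implementation; the function name simple_kmeans and its parameters are hypothetical, and it assumes the caller supplies the initial centroids.

```python
import numpy as np

def simple_kmeans(obs, init_centroids, max_iter=100):
    """Plain Lloyd's algorithm (hypothetical helper, not SciPy's code)."""
    obs = np.asarray(obs, dtype=float)
    centroids = np.array(init_centroids, dtype=float)

    def assign(cents):
        # Squared Euclidean distance from every observation to every centroid.
        d2 = ((obs[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(max_iter):
        labels = assign(centroids)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = obs[labels == j]
            # Leave a centroid where it is if no observation was assigned to it.
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Labels consistent with the final centroids.
    return centroids, assign(centroids)

# Made-up data: two obvious groups, seeded with one point from each.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels = simple_kmeans(data, data[[0, 2]])
print(centroids)
print(labels)
```

The assignment step and the centroid update are the two places where a variant of the algorithm would usually be changed.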
Any and all info will be greatly appreciated :-) --and thanks for SciPy!