[SciPy-user] Mysterious kmeans() error

David Cournapeau cournape@gmail....
Fri Feb 6 10:05:55 CST 2009


On Fri, Feb 6, 2009 at 11:37 PM, Roy H. Han
<starsareblueandfaraway@gmail.com> wrote:
> Well I feel like there are numerical problems with scipy's kmeans2(),
> at least in the 0.6.0 version of scipy.

kmeans and kmeans2 are fairly low level - they will indeed fail if you
end up with an empty cluster.
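
To make the failure mode concrete, here is a minimal NumPy-only sketch of
the assignment step, showing how a badly placed initial centroid ends up
with no members. It is not scipy's actual code: the helper names
(assign_labels, empty_clusters) and the data are hypothetical and chosen
only for illustration.

```python
import numpy as np

def assign_labels(data, centroids):
    """Assign each row of `data` to the index of its nearest centroid."""
    # Squared Euclidean distance from every point to every centroid: shape (n, k).
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def empty_clusters(labels, k):
    """Return the indices of clusters that received no points."""
    counts = np.bincount(labels, minlength=k)
    return np.flatnonzero(counts == 0)

# Three 3-column vectors, as in the original report (values are made up).
data = np.array([[0.0, 0.0, 0.0],
                 [0.1, 0.0, 0.0],
                 [5.0, 5.0, 5.0]])

# Hypothetical initial centroids; the third one is far from every point.
centroids = np.array([[0.0, 0.0, 0.0],
                      [5.0, 5.0, 5.0],
                      [100.0, 100.0, 100.0]])

labels = assign_labels(data, centroids)
empty = empty_clusters(labels, k=3)
print(labels, empty)  # cluster 2 receives no points
```

The update step would then have to take the mean of zero vectors for
cluster 2, which is where a low-level implementation breaks down. Newer
SciPy releases expose a `missing` argument on kmeans2 ('warn' or 'raise')
for this situation; whether your installed version has it is something to
check.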

> I changed the code to try to ensure that no clusters were empty.
> Pycluster seems to be the better clustering algorithm for now.

Maybe - I am not familiar with Pycluster.

> Even though the size (number of columns = 3) of each vector in the
> cluster is three, kmeans should still work even if one of the clusters
> contained a single vector (number of rows = 1).

Strictly speaking, kmeans is undefined in that case - there are
various strategies that could be implemented to handle it, such as
cluster splitting. In general, I agree the code is not great.
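
As an illustration of one such strategy, the sketch below runs plain
Lloyd iterations and, whenever a cluster comes up empty, moves its
centroid onto the point that is currently farthest from its own centroid.
This is only one of several possible choices, it is not what scipy does,
and the function name kmeans_reseed and the test data are hypothetical.

```python
import numpy as np

def kmeans_reseed(data, init_centroids, n_iter=20):
    """Lloyd's algorithm that re-seeds an empty cluster instead of failing."""
    centroids = np.array(init_centroids, dtype=float)
    k = len(centroids)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        nearest = d2.min(axis=1)
        # Update step, with special handling for empty clusters.
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
            else:
                # Empty cluster: place its centroid on the point that is
                # worst served by its current centroid.
                worst = nearest.argmax()
                centroids[j] = data[worst]
                # Do not hand the same point to another empty cluster.
                nearest[worst] = 0.0
    # Final assignment against the final centroids.
    d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)

# Two well-separated pairs of 3-column vectors (made-up data).
data = np.array([[0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [10.0, 0.0, 0.0],
                 [10.0, 1.0, 0.0]])

# The second initial centroid is far from everything, so it starts out empty.
init = [[0.0, 0.5, 0.0], [100.0, 100.0, 100.0]]

centroids, labels = kmeans_reseed(data, init)
print(centroids)
print(labels)
```

On this data the second cluster is empty after the first assignment, gets
re-seeded onto one of the distant points, and the iteration then settles
on the two obvious groups. The sketch does not guarantee that no cluster
is empty in the final assignment for arbitrary data (duplicate points are
one way to defeat it), so a production version would need a further check.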

David

