Fri Jul 23 12:27:56 CDT 2010
On Sat, Jul 24, 2010 at 2:19 AM, Benjamin Root <email@example.com> wrote:
> Examining further, I see that SciPy's implementation is fairly simplistic
> and has some issues. In the given example, the reason why 3 is never
> returned is not because of the use of the distortion metric, but rather
> because the kmeans function never sees the distance for using 3. As a
> matter of fact, the actual code that does the convergence is in vq and py_vq
> (vector quantization) and it tries to minimize the sum of squared errors.
> kmeans just keeps on retrying the convergence with random guesses to see if
> different convergences occur.
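To make the behaviour described above concrete, here is a minimal sketch,
assuming a recent SciPy. The toy data and variable names are made up for
illustration. It shows the two ways of calling kmeans (an explicit initial
guess versus an integer k with random restarts) and the separate vq step
that assigns observations to centroids.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Toy data (made up for illustration): two tight, well-separated groups.
obs = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
])

# Passing an array as k_or_guess: a single run starting from these centroids.
guess = np.array([[0.0, 0.0], [10.0, 10.0]])
codebook, distortion = kmeans(obs, guess)

# Passing an integer: kmeans picks random starting centroids, repeats the
# run `iter` times, and returns the codebook with the lowest distortion.
codebook_r, distortion_r = kmeans(obs, 2, iter=20)

# vq assigns each observation to its nearest centroid in the codebook.
labels, dists = vq(obs, codebook)

print(codebook)
print(distortion, distortion_r)
print(labels)
```

Note that kmeans only ever evaluates the k it is given, so comparing
different values of k means calling it once per candidate k.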
As one of the maintainers of kmeans, I would be the first to admit the
code is basic, for better or worse. Something more elaborate for
clustering may indeed be useful, as long as the interface stays
For more complex needs, turn to scikits.learn or more specialized packages,