Hi James,

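Usually, we run the optimisation several times and keep the solution with
the smallest inertia (the within-cluster sum of squared distances). The
technique you use doesn't guarantee that you keep the best solution: two
identical runs in a row may simply have hit the same local minimum.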

There's a full implementation in scikit-learn with several runs. You can
have a look at the code to see how it works.
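
As a rough illustration of the restart idea (a stdlib-only Python 3 sketch,
not the scipy or scikit-learn code; the helper names lloyd_1d and
best_of_n_runs are made up for this example), each run starts from
different random centres and only the run with the lowest inertia is kept:

```python
import random


def lloyd_1d(points, k, rng, n_iter=100):
    """One k-means run on 1-D data, starting from k random data points."""
    centers = rng.sample(points, k)

    def assign(cs):
        # index of the nearest center for every point
        return [min(range(k), key=lambda j: (p - cs[j]) ** 2) for p in points]

    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # keep the old center if a cluster ends up empty
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break  # assignments are stable, so the run has converged
        centers = new_centers

    labels = assign(centers)
    # inertia: sum of squared distances from each point to its center
    inertia = sum((p - centers[lab]) ** 2 for p, lab in zip(points, labels))
    return inertia, centers, labels


def best_of_n_runs(points, k, n_runs=10, seed=0):
    """Repeat the optimisation and keep the run with the smallest inertia."""
    rng = random.Random(seed)
    runs = [lloyd_1d(points, k, rng) for _ in range(n_runs)]
    return min(runs, key=lambda run: run[0])


vals = [0.0, 0.1, 0.5, 0.6, 1.0, 1.1]
inertia, centers, labels = best_of_n_runs(vals, 3, n_runs=20)
print(inertia, sorted(centers), labels)
```

For this data the lowest-inertia partition puts each pair (0.0, 0.1),
(0.5, 0.6), (1.0, 1.1) in its own cluster. A single run can stop at a worse
partition, which is why the minimum over several runs is kept.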

Cheers,
N
On 8 Aug 2012 20:53, "James Abel" <j@abel.co> wrote:

> BTW, I modified my code to loop until it gets the same clustering twice in
> a row.  This yields more consistent results.  I don’t know if this is a
> general solution but it worked for my simple test case.  Code below.
>
> import sys
> import scipy
> import warnings
> from scipy.cluster.vq import *
> print sys.version
> vals = scipy.array((0.0,0.1,0.5,0.6,1.0,1.1))
> print vals
> white_vals = whiten(vals)
> print white_vals.shape, white_vals
> # Check for same clustering
> def clustering_test(a,b):
>     # have to create copies, then sort so we don't modify the original
>     ea = a.copy()
>     eb = b.copy()
>     ea.sort()
>     eb.sort()
>     r = (ea == eb).all()
>     print a,b,ea,eb,r
>     return r
> # try it until we get the same clustering twice in a row
> found = False
> prior_idx = None
> while not found:
>     with warnings.catch_warnings():
>         # suppress the warning message (happens if it doesn't find the
>         # right number of clusters)
>         warnings.simplefilter("ignore")
>         # changing iter doesn't seem to matter
>         res, idx = kmeans2(white_vals, 3)
>     #print res, idx
>     if prior_idx is not None:
>         eq = clustering_test(idx, prior_idx)
>         if eq:
>             found = True
>     prior_idx = idx
> print "result", res, idx
