[SciPy-dev] denoising spatial point process data

Sturla Molden sturla@molden...
Fri Jan 9 12:30:01 CST 2009

> 2009/1/9 Sturla Molden <sturla@molden.no>:

> As such, I wonder
> whether it belongs more in with the clustering/machine learning code?

It is not correct to call it a 'clustering' technique, but machine
learning is an acceptable label. Like many clustering techniques
(e.g. k-means), it fits a mixture model using the EM algorithm. It does not
fit clusters, but separates target from clutter, based on the idea that
clutter points will be widely scattered without adjacent neighbours.

As for clustering:

A clustering technique related to nnclean is 'superparamagnetic
clustering'. It uses KNN distances and some form of MCMC (Swendsen-Wang or
Wolff's algorithm). It is to my knowledge the only clustering method that
is insensitive to initialization, does not need the number of clusters
specified in advance, can fit clusters of arbitrary shape, and is guaranteed
to converge to the globally correct solution. I have an implementation of
that as well (it needs some more testing). Unfortunately it seems to be
protected by a patent. I am not a lawyer, but it seems strange that a pure
numerical method can be patented. At least in Europe, patents must include
some sort of physical action, not just plain mathematics.

