[SciPy-User] [Scikit-learn-general] kriging module

Robert Kern robert.kern@gmail....
Sun Nov 21 18:38:20 CST 2010

On Sun, Nov 21, 2010 at 03:07, Gael Varoquaux
<gael.varoquaux@normalesup.org> wrote:
> On Sat, Nov 20, 2010 at 10:28:15PM -0600, Joe Kington wrote:

>>    I'm not trying to say that it's a bad thing to combine similar code, just
>>    be aware that the first thing that someone's going to think when they hear
>>    "kriging" is "How do I build and fit a variogram with this module?".
> Thank you. I was certainly not aware (I am certainly not a Kriging nor a
> Gaussian Process expert). I have no clue what a variogram is. It does seem
> that any code that wants to cater for 'Kriging' users will need some
> Kriging-specific functionality.

FWIW, a variogram is a different way of representing the covariance
function of a GP. It obscures the relationship kriging/GPs have with
multidimensional Gaussian distributions, but it arguably has a closer
relationship to observable or estimable quantities. Assuming isotropy
for the moment, it is a function of radius that describes the variance
of an r-distant point conditioned on knowing the value of the point at
r=0. That's where the "nugget" and "sill" values I described earlier
come from. Exactly at r=0, the variance is 0, naturally, but
infinitesimally close to 0, it takes the nonzero nugget value. The
nugget roughly represents the uncertainty of any individual
observation. The variance (usually) increases as the radius increases
up to a limiting value called the sill. This is the overall variance
in the data. The variogram can be estimated by looking at all of the
squared pairwise differences in the observed values plotted as a
function of the pairwise distances.
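
As a rough illustration of that last step, here is a minimal NumPy sketch
of a binned empirical estimate. It is not taken from scikit-learn or PyMC.
The function name empirical_variogram, its arguments, and the synthetic
data are all made up for this example. It also assumes the common
"semivariogram" convention, in which each squared pairwise difference is
halved so that the plateau of the curve lines up with the variance of the
data (the sill).

```python
import numpy as np


def empirical_variogram(coords, values, bin_edges):
    """Binned empirical semivariogram (hypothetical helper, not a library API).

    coords    : (n,) or (n, d) array of sample locations
    values    : (n,) array of observed values
    bin_edges : increasing array of distance bin edges
    Returns (bin centers, mean half squared difference per bin, pair counts).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    if coords.ndim == 1:
        coords = coords[:, None]

    # Every unordered pair (i < j) of observations.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    # Assumed convention: half the squared difference (semivariogram).
    half_sq_diff = 0.5 * (values[i] - values[j]) ** 2

    # Bin index of each pair; pairs outside the edges fall out of range
    # and are simply ignored by the loop below.
    which = np.digitize(dist, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)  # NaN marks bins with no pairs
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = half_sq_diff[mask].mean()

    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centers, gamma, counts


# Synthetic 2-D example: a smooth signal plus a little observation noise.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 10, size=(200, 2))
z = np.sin(xy[:, 0]) + 0.1 * rng.normal(size=200)
centers, gamma, counts = empirical_variogram(xy, z, np.linspace(0, 5, 11))
```

Plotting gamma against centers gives the usual picture: the value the
curve heads toward as r shrinks is an estimate of the nugget, and the
level it flattens out at (if it does) is an estimate of the sill. Fitting
a parametric model to these points is a separate step not shown here.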


Naturally, there is an extensive and well-developed literature using
this methodology, possibly more so than the GP regression formulation.
The geostatisticians were doing GPs before everyone else caught on.

> If people are (still) interested in the effort underway in the
> scikit-learn[*], it might be great to contribute a Kriging-specific
> module that uses the more general-purpose Gaussian process code to
> achieve what geostatisticians call Kriging. If there is some
> freely-downloadable geostatistics data, it would be great to make an
> example (similar to the one in PyMC) that ensures that common tasks in
> geostatistics can easily be done.
> As a side note, now that I am having a closer look at the PyMC GP
> documentation, there seems to be some really nice and fancy code in
> there, and it is very well documented.

Yup! They've done some good work.

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
