[SciPy-User] Weighted KDE
Mon Jan 14 13:58:08 CST 2013
On Jan 14, 2013 11:31 AM, "Jackson Li" <email@example.com> wrote:
> On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington <firstname.lastname@example.org> wrote:
> > For what it's worth, the code you linked to is much slower for small
> > sample sizes. It's only faster with large numbers (>1e4) of points. It
> > also has a bit of a different use case than gaussian_kde. It's only
> > intended for making a regularly gridded KDE of a very large number of
> > points on a relatively fine grid. It bins the data onto a regular grid and
> > convolves it with an appropriate gaussian kernel. This is a reasonable
> > approximation when you're dealing with a large number of points, but not
> > reasonable if you only have a handful. Because the size of the gaussian
> > kernel can be very large when the sample size is low, the convolution will
> > be very slow for small sample sizes. Also, if I recall correctly, there's
> > a stray flipud that got left in there. You'll want to take it out.
> > (While I think that got posted only a couple of years ago, I wrote it
> > longer ago than that... There's some less-than-ideal code in there...)
> > However, are you sure that you want a kernel density estimate? What
> > you're describing sounds like interpolation, not a weighted KDE.
> > As an example, a weighted KDE would be used when you wanted to show the
> > density of point estimates while weighting it by error in the location of
> > the point.
> >>I shouldn't have said "error in the location of the point". I guess it
> >>would be more like "confidence that the point exists" or more accurately
> >>"magnitude of the point". Otherwise, the size of the Gaussian kernel would
> >>have to change depending on the data involved.
> >>As another (not exact) example, it can be handy when you want to sum an
> >>attribute over a map to yield a density estimate per-unit-area (e.g.
> >>population density, where you have populations of cities as your point
> >>measurements). In other words, if you want your temperature values to be
> >>summed-per-unit-area, then it's what you want. If you want to interpolate,
> >>it's not what you want.
> > Instead, it sounds like you have a third variable that you want to make a
> > continuous map of based on irregularly sampled points. If so, have a look
> > at scipy.interpolate (and particularly scipy.interpolate.Rbf).
> > Hope that helps,
> > -Joe
> Thanks for the quick reply.
> What you described for the population of cities is indeed what I want.
> I have several data points spread out randomly in XY space, and each data
> point has an independent third variable.
> (e.g. for 2 points very close to each other, one 50 and another 10, and
> all other data points are far away.
You're describing interpolation, for whatever it's worth.
You want to interpolate your "z" values, not determine the number of
samples you have per unit area.
A KDE will give you "bulls eyes" around where you have data and the
resulting values won't directly reflect the weight values you pass in.
Instead, the values will mostly reflect where you have clusters of point
measurements, modified by the localized sum of the weights. The exact value
you get will depend on the covariance of your sampled point distribution.
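
To make that concrete, here is a small sketch. It assumes a SciPy release (1.2 or later) in which gaussian_kde accepts a weights argument, which was not available when this was written. The data are made up: a tight cluster of 50 points with z = 10 and one isolated point with z = 50. The KDE comes out higher over the cluster, and neither value is on the scale of z.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical data (not from the thread): 50 low-valued points clustered
# near (0, 0) and a single high-valued point at (3, 3).
rng = np.random.default_rng(0)
cluster_xy = rng.normal(loc=0.0, scale=0.1, size=(2, 50))
lone_xy = np.array([[3.0], [3.0]])
xy = np.hstack([cluster_xy, lone_xy])          # shape (2, 51)
z = np.concatenate([np.full(50, 10.0), [50.0]])

# Assumes SciPy >= 1.2, where gaussian_kde takes per-point weights.
kde = gaussian_kde(xy, weights=z)

at_cluster = kde([[0.0], [0.0]])[0]   # density over the cluster
at_lone = kde([[3.0], [3.0]])[0]      # density at the isolated point

# The cluster wins because its summed weight (50 * 10) exceeds the lone
# point's weight (50), and the result is a density, not a z value.
print("KDE at cluster:", at_cluster)
print("KDE at lone point:", at_lone)
```

The estimate integrates to one over the plane, so it cannot return 10 or 50 anywhere; it only says where the weighted mass is concentrated.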
Instead, you want a smooth surface that reflects your sampled z values.
Have a look at some of the examples involving scipy.interpolate.griddata or
scipy.interpolate.Rbf.
The cookbook is a bit out of date, but take a look at the second example on
this page: http://www.scipy.org/Cookbook/RadialBasisFunctions
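
As a rough sketch of the interpolation route (the coordinates and values below are invented for illustration, loosely following the "one 50 and another 10" case in the question), linear griddata gives the average at the midpoint of the two close points, and Rbf gives a smooth surface that passes through the samples:

```python
import numpy as np
from scipy.interpolate import griddata, Rbf

# Hypothetical samples: two nearby points (z = 50 and z = 10) and four
# distant points at the corners of a square.
x = np.array([0.0, 0.2, 5.0, -5.0, 5.0, -5.0])
y = np.array([0.0, 0.0, 5.0, 5.0, -5.0, -5.0])
z = np.array([50.0, 10.0, 20.0, 20.0, 20.0, 20.0])

# Piecewise-linear interpolation on a triangulation of the samples,
# evaluated midway between the two close points.
mid = griddata((x, y), z, (np.array([0.1]), np.array([0.0])),
               method="linear")[0]
print("linear griddata at midpoint:", mid)

# Radial basis function interpolation; with the default smooth=0 the
# surface reproduces the sampled z values at the sample locations.
rbf = Rbf(x, y, z, function="linear")
z_at_samples = rbf(x, y)

# Evaluate on a regular grid to get a map that could be passed to, e.g.,
# a pcolormesh or imshow call.
gx, gy = np.mgrid[-5:5:50j, -5:5:50j]
surface = rbf(gx, gy)
print("surface shape:", surface.shape)
```

The choice of function="linear" here is only for the sake of the example; other basis functions and a nonzero smooth value will give different surfaces between the samples.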
Hope that helps!
> --> I would like that patch to get a value of 30 (average))
> Hence, I would like to obtain an XY graph showing the density estimate of
> the third variable.
> (if that patch is mostly high temperature on average, it should be "red",
> and if it is empty or has a lot of low temperature data points, then it
> should be "blue".)