[SciPy-User] Weighted KDE
Sun Jan 13 11:53:25 CST 2013
On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington <email@example.com> wrote:

> For what it's worth, the code you linked to is much slower for small
> sample sizes. It's only faster with large numbers (>1e4) of points. It
> also has a bit of a different use case than gaussian_kde. It's only
> intended for making a regularly gridded KDE of a very large number of
> points on a relatively fine grid. It bins the data onto a regular grid and
> convolves it with an appropriate gaussian kernel. This is a reasonable
> approximation when you're dealing with a large number of points, but not so
> reasonable if you only have a handful. Because the size of the gaussian
> kernel can be very large when the sample size is low, the convolution can
> be very slow for small sample sizes. Also, if I recall correctly, there's
> a stray flipud that got left in there. You'll want to take it out. (Also,
> while I think that got posted only a couple of years ago, I wrote it much
> longer ago than that... There's some less-than-ideal code in there...)
>
> However, are you sure that you want a kernel density estimate? What
> you're describing sounds like interpolation, not a weighted KDE.
>
> As an example, a weighted KDE would be used when you wanted to show the
> density of point estimates while weighting it by error in the location of
> the point.
>
>> I shouldn't have said "error in the location of the point". I guess it
>> would be more like "confidence that the point exists" or more accurately,
>> "magnitude of the point". Otherwise, the size of the Gaussian kernel would
>> have to change depending on the data involved.
>>
>> As another (not exact) example, it can be handy when you want to sum some
>> attribute over a map to yield a density estimate per-unit-area (e.g.
>> population density, where you have populations of cities as your point
>> measurements). In other words, if you want your temperature values to be
>> summed-per-unit-area, then it's what you want. If you want to interpolate,
>> it's not what you want.
>
> Instead, it sounds like you have a third variable that you want to make a
> continuous map of based on irregularly sampled points. If so, have a look
> at scipy.interpolate (and particularly scipy.interpolate.Rbf).
>
> Hope that helps,
> -Joe
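
To make the "summed-per-unit-area" reading concrete, here is a minimal sketch of a weighted KDE for the city-population example. It assumes a SciPy release in which gaussian_kde accepts a `weights` argument (that argument was added after this thread was written). The locations, populations, and grid extent are made-up illustration data, not anything from the thread.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Made-up data: 200 random "city" locations and made-up populations.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 10.0, size=(2, 200))      # shape (n_dims, n_points)
population = rng.uniform(1e3, 1e5, size=200)    # one weight per point

# Weighted KDE: each point contributes in proportion to its weight.
# Assumes a SciPy version whose gaussian_kde supports `weights`.
kde = gaussian_kde(xy, weights=population)

# Evaluate the estimate on a regular 50 x 50 grid.
gx, gy = np.mgrid[0:10:50j, 0:10:50j]
density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)

# gaussian_kde is normalised to integrate to 1, so rescale by the total
# weight to get an estimate of population per unit area.
pop_per_area = density * population.sum()
```

Note that the result is a sum per unit area: two nearby cities add up, and regions without any cities tend toward zero.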
Thanks for the quick reply.
What you described for the population of cities is indeed what I want.
I have several data points spread out randomly in XY space, and each data point has an independent third variable.
For example, suppose two points are very close to each other, one with a value of 50 and the other with 10, and all other data points are far away. I would like that patch to get a value of 30 (the average).
Hence, I would like to obtain an XY plot showing the density estimate of the third variable.
(If that patch is mostly high temperature on average, it should be "red"; if it is empty or has a lot of low-temperature data points, it should be "blue".)
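
The 50-and-10 example asks for a local average rather than a sum, so one possible way to express it is a Gaussian-kernel-weighted mean (a Nadaraya-Watson style smoother). This is only a sketch of that swapped-in technique, not something proposed in the thread; the function name `kernel_weighted_mean`, the bandwidth of 0.5, and the sample coordinates are all hypothetical.

```python
import numpy as np

def kernel_weighted_mean(xy, values, grid_xy, bandwidth):
    """Gaussian-kernel-weighted local average (hypothetical helper).

    xy:        (n, 2) sample locations
    values:    (n,) third variable at each sample
    grid_xy:   (m, 2) locations at which to evaluate the average
    bandwidth: kernel width in the same units as xy (an assumed choice)
    """
    # Squared distance between every grid location and every sample: (m, n)
    d2 = ((grid_xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-0.5 * d2 / bandwidth**2)
    num = w @ values
    den = w.sum(axis=1)
    # Where there is essentially no nearby data, the average is undefined.
    out = np.full(len(grid_xy), np.nan)
    ok = den > 1e-12
    out[ok] = num[ok] / den[ok]
    return out

# Two close points (50 and 10) plus one point far away.
xy = np.array([[1.0, 1.0], [1.1, 1.0], [8.0, 8.0]])
values = np.array([50.0, 10.0, 20.0])

near = kernel_weighted_mean(xy, values, np.array([[1.05, 1.0]]), bandwidth=0.5)
empty = kernel_weighted_mean(xy, values, np.array([[4.5, 4.5]]), bandwidth=0.5)
```

Here `near[0]` comes out at roughly 30, while `empty[0]` is NaN because no data lies within reach of the kernel. A weighted KDE behaves differently: it would add the two nearby values instead of averaging them, and it would go to zero in the empty patch.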