[SciPy-user] KDE question
Fri Nov 9 15:26:15 CST 2007
Thanks a lot, Robert -- I appreciate your time and suggestions!
On Nov 9, 2007, at 3:13 PM, Robert Kern wrote:
> Zachary Pincus wrote:
>> Hello all,
>> I'm using scipy.stats.gaussian_kde to estimate the a probability
>> density based on some scattered 2D samples. What I would like to do
>> eventually is plot a line that surrounds (say) 90% of the probability
>> Clearly this line corresponds to some contour of the density function
>> (which I can extract out easily enough from KDE density estimates on
>> a grid), but I'm not sure how to figure out what exact contour value
>> it is that I want.
>> Given the KDE density estimates on a grid, I could empirically figure
>> out which value is closest, but I was wondering if there's a simpler
>> closed-form solution... (And apologies if it's somehow obvious; I'm
>> feeling a bit foggy-headed about this.)
> I don't think there is a closed-form solution for n>1-D. It
> shouldn't be too
> hard to evaluate the KDE on a grid, then search for the value
> between 0 and the
> peak where the integral is around 90%.
> from scipy import stats
> from scipy.optimize import brentq
> k = stats.gaussian_kde(data)
> xy = ... # the grid
> dxdy = ... # the area represented by each grid point
> z = k(xy)
> peak = z.max()
> # You can't quite do this since the function is discontinuous, and
> # actually equals 0, but you can do a search along these lines.
> # Exercise left for the reader.
> def f(s):
> mask = z > s
> return 0.9 - (z[mask].sum() * dxdy)
> brentq(f, 0, peak)
> Robert Kern
> "I have come to believe that the whole world is an enigma, a
> harmless enigma
> that is made terrible by our own mad attempt to interpret it as
> though it had
> an underlying truth."
> -- Umberto Eco
> SciPy-user mailing list
More information about the SciPy-user