[SciPy-Dev] scipy.stats.kde

josef.pktd@gmai... josef.pktd@gmai...
Fri Aug 27 13:38:45 CDT 2010


On Fri, Aug 27, 2010 at 2:17 PM, Sam Birch <sam.m.birch@gmail.com> wrote:
> Hi all,
> I was thinking of renovating the kernel density estimation package (although
> no promises; I'm leaving for college tomorrow morning!). I was wondering:
> a) whether anyone had started code in that direction

Mike Crowe wrote code for kernel regression, and Skipper started a 1D
kernel density estimator in scikits.statsmodels, which cover a larger
number of kernels.

I don't think I have seen any higher-dimensional kernel density
estimation in Python besides scipy.stats.kde. The Gaussian kde in
scipy.stats is tailored to the underlying Fortran code for the
multivariate normal cdf.
It's not clear to me what other n-dimensional kdes would require or
whether they would fit well with the current code.
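
For reference, a minimal sketch of what the existing Gaussian kde does
in more than one dimension. The sample data, evaluation points and box
limits are made up for illustration; only gaussian_kde and its
integrate_box method come from scipy.stats.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# gaussian_kde expects data shaped (n_dims, n_points).
data = rng.standard_normal((2, 500))

# With no bandwidth argument, Scott's rule is used.
kde = stats.gaussian_kde(data)

# Evaluate the density at three 2D points, again shaped (n_dims, n_points).
points = np.array([[0.0, 1.0, 3.0],
                   [0.0, 1.0, 3.0]])
density = kde(points)

# Probability mass inside a rectangular box. This is the part that
# relies on a multivariate normal cdf computation.
mass = kde.integrate_box([-10.0, -10.0], [10.0, 10.0])
```

The box integral is what ties the class to Gaussian kernels: another
kernel would need its own way of integrating over a box.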

One extension that Robert also mentioned in the past is adaptive
kernels, which would be nice to have and which I also haven't seen in
Python yet.
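
To make "adaptive" concrete, here is a rough 1D sketch of one common
scheme: a fixed-bandwidth pilot estimate followed by per-sample
bandwidths scaled by the pilot density (alpha=0.5 is Abramson's
square-root law). This is my choice of scheme for illustration, not
necessarily what Robert had in mind, and the function names are
hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def adaptive_kde_1d(data, grid, h, alpha=0.5):
    """Variable-bandwidth KDE on `grid` (hypothetical helper).

    h     : global bandwidth, also used for the pilot estimate
    alpha : sensitivity of the local bandwidths to the pilot density
    """
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # Step 1: fixed-bandwidth pilot estimate at the data points.
    pilot = gaussian_kernel((data[:, None] - data[None, :]) / h).mean(axis=1) / h

    # Step 2: local factors, normalised by the geometric mean of the pilot.
    geo_mean = np.exp(np.mean(np.log(pilot)))
    local_factor = (pilot / geo_mean) ** (-alpha)

    # Step 3: each sample gets its own bandwidth, wider where data are sparse.
    h_i = h * local_factor
    u = (grid[:, None] - data[None, :]) / h_i[None, :]
    return (gaussian_kernel(u) / h_i[None, :]).mean(axis=1)
```

The pairwise arrays make this O(n * m) in memory, so it is only meant to
show the idea, not to scale.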

> b) what people want in it
> I was thinking (as an ideal, not necessarily goal):
> - Support for more than Gaussian kernels (e.g. custom,
> uniform, Epanechnikov, triangular, quartic, cosine, etc.)
> - More options for bandwidth selection (custom bandwidth matrices, AMISE
> optimization, cross-validation, etc.)

Definitely yes. I don't think they are even available for 1D yet.
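
As a rough sketch of what this could look like in 1D: a table of the
kernels Sam lists (each normalised on [-1, 1], except the Gaussian) and
bandwidth selection by leave-one-out likelihood cross-validation. All
names here (KERNELS, kde_1d, loo_log_likelihood, select_bandwidth) are
hypothetical, and likelihood cross-validation is just one of several
possible criteria.

```python
import numpy as np

# Kernels as functions of the scaled distance u = (x - x_i) / h.
KERNELS = {
    "gaussian": lambda u: np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi),
    "uniform": lambda u: 0.5 * (np.abs(u) <= 1),
    "epanechnikov": lambda u: 0.75 * (1 - u**2) * (np.abs(u) <= 1),
    "triangular": lambda u: (1 - np.abs(u)) * (np.abs(u) <= 1),
    "quartic": lambda u: (15.0 / 16.0) * (1 - u**2) ** 2 * (np.abs(u) <= 1),
    "cosine": lambda u: (np.pi / 4) * np.cos(np.pi * u / 2) * (np.abs(u) <= 1),
}


def kde_1d(data, grid, h, kernel="gaussian"):
    """Fixed-bandwidth 1D density estimate evaluated on `grid`."""
    u = (np.asarray(grid, dtype=float)[:, None]
         - np.asarray(data, dtype=float)[None, :]) / h
    return KERNELS[kernel](u).mean(axis=1) / h


def loo_log_likelihood(data, h, kernel="gaussian"):
    """Sum of log leave-one-out densities at the data points."""
    data = np.asarray(data, dtype=float)
    n = data.size
    k = KERNELS[kernel]((data[:, None] - data[None, :]) / h).astype(float)
    np.fill_diagonal(k, 0.0)  # leave each point out of its own estimate
    dens = k.sum(axis=1) / ((n - 1) * h)
    with np.errstate(divide="ignore"):  # log(0) -> -inf for isolated points
        return np.log(dens).sum()


def select_bandwidth(data, candidates, kernel="gaussian"):
    """Pick the candidate bandwidth with the highest leave-one-out score."""
    scores = [loo_log_likelihood(data, h, kernel) for h in candidates]
    return candidates[int(np.argmax(scores))]
```

With bounded-support kernels a small bandwidth can leave a point with no
neighbours, which gives a score of -inf and rules that bandwidth out.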

> - Assorted conveniences: automatically generate the mesh, limit the kernel's
> support for speed

Using scipy.spatial to limit the number of neighbors in a bounded
support kernel might be a good idea.
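
Something along these lines, as a sketch: a radial Epanechnikov kernel
in d dimensions, where a cKDTree ball query restricts each evaluation to
the samples within one bandwidth. The function name and the
(n_points, n_dims) layout are my assumptions (note that this layout is
the transpose of what gaussian_kde uses).

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gamma


def epanechnikov_kde(data, points, h):
    """Radial Epanechnikov KDE using a tree query (hypothetical helper).

    data   : array of shape (n_points, n_dims)
    points : array of shape (n_eval, n_dims)
    h      : scalar bandwidth, which is also the kernel's support radius
    """
    data = np.asarray(data, dtype=float)
    points = np.asarray(points, dtype=float)
    n, d = data.shape

    # Normalising constant so the kernel integrates to 1 over the unit ball.
    unit_ball_volume = np.pi ** (d / 2) / gamma(d / 2 + 1)
    norm = (d + 2) / (2 * unit_ball_volume)

    tree = cKDTree(data)
    # For each evaluation point, indices of the samples within distance h.
    neighbours = tree.query_ball_point(points, r=h)

    density = np.zeros(len(points))
    for i, idx in enumerate(neighbours):
        if idx:  # points with no neighbours keep density 0
            sq = np.sum((data[idx] - points[i]) ** 2, axis=1) / h**2
            density[i] = norm * np.sum(np.clip(1.0 - sq, 0.0, None))
    return density / (n * h**d)
```

Only the samples returned by the ball query contribute, so the cost per
evaluation point depends on the local neighbour count rather than on the
full sample size.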

(just some thoughts on the topic)

Josef

> So, thoughts anyone? I figure it's better to over-specify and then
> under-produce, so don't hold back.
> Thanks,
> Sam

