[SciPy-Dev] Expanding Scipy's KDE functionality

josef.pktd@gmai... josef.pktd@gmai...
Fri Jan 25 09:28:06 CST 2013


On Fri, Jan 25, 2013 at 8:53 AM, Sturla Molden <sturla@molden.no> wrote:
> On 25.01.2013 14:45, Sturla Molden wrote:
>
>> One can always use a delta function as kernel though. It retains all the
>> information we have about the sampled distribution.
>
> This is not as crazy as it might sound. It is the basis for bootstrap and
> jack-knife procedures, PRESS in regression analysis, etc. Also, by
> viewing a data sample as an "analog signal" consisting of a sum of delta
> functions (or equivalently: a KDE using a delta kernel), all the methods
> of DSP become available to statistical data analysis. The first step in
> that case is to digitize the signal by anti-alias filtering and regular
> sampling. And as it turns out, the anti-alias filtering is just another
> case of KDE.

I'm not sure what you mean.

If you just use a delta function, you get the original data back, and
we get the empirical distribution function, don't we? I don't
understand how this relates to digitizing the data.
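
To make the delta-kernel case concrete, here is a minimal sketch,
assuming only numpy. Integrating a sum of delta functions placed at the
data points gives a step function, which is the empirical distribution
function. The helper name `ecdf` and the toy data are illustrative only,
not part of scipy.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF: fraction of sample points <= x.

    This is what integrating a "delta kernel" density estimate
    (equal mass 1/n at each data point) gives.
    """
    sample = np.sort(np.asarray(sample, dtype=float))
    # side="right" counts the points that are <= x
    return np.searchsorted(sample, x, side="right") / sample.size

# Hypothetical toy data, with one repeated value
data = np.array([0.3, 1.2, 1.2, 2.5])
values = ecdf(data, [0.0, 1.2, 3.0])
print(values)
```

The result jumps by 1/n at each observation (by 2/n at the repeated
value), so no information in the sample is smoothed away.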

It's useful for many applications, but it's not the point of KDE. IIRC,
the empirical distribution has a large variance, and the point of KDE
is to "lose" information and remove the noise.
The pointwise variance of the density estimate is much smaller with a
smooth, large-bandwidth kernel, and the main task is to find the right
bias-variance trade-off.
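
A rough way to see the trade-off is a small Monte Carlo sketch, assuming
standard normal data and scipy.stats.gaussian_kde with a scalar
bw_method (used as the bandwidth factor that multiplies the sample
standard deviation). The evaluation point, sample size, replication
count and bandwidth factors below are arbitrary choices for
illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x0 = 0.0                              # point where the density is estimated
true_density = stats.norm.pdf(x0)     # known here because we simulate N(0, 1)

def kde_estimates_at_x0(bw_factor, n=200, n_rep=300):
    """Repeat the KDE estimate at x0 over fresh samples of size n."""
    est = np.empty(n_rep)
    for i in range(n_rep):
        sample = rng.standard_normal(n)
        kde = stats.gaussian_kde(sample, bw_method=bw_factor)
        est[i] = kde([x0])[0]
    return est

results = {}
for bw in (0.05, 0.3, 1.5):
    est = kde_estimates_at_x0(bw)
    bias = est.mean() - true_density
    results[bw] = (bias, est.var())
    print("bw factor %4.2f  bias % .4f  variance %.6f" % (bw, bias, est.var()))
```

With these settings the pointwise variance should shrink as the
bandwidth factor grows, while the largest factor oversmooths and
underestimates the peak, which is the bias side of the trade-off.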

Or do I misinterpret what you have in mind?

Josef


>
> Sturla
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev

