[Scipy-tickets] [SciPy] #1854: scipy.stats.gaussian_kde.integrate functions are slow
SciPy Trac
scipy-tickets@scipy....
Fri Mar 1 07:27:50 CST 2013
#1854: scipy.stats.gaussian_kde.integrate functions are slow
-------------------------------------------------+--------------------------
Reporter: itissid | Owner: pv
Type: enhancement | Status: new
Priority: normal | Milestone: Unscheduled
Component: scipy.special | Version: 0.11.0
Keywords: gaussian_kde, numerical integration |
-------------------------------------------------+--------------------------
Comment(by josefpktd):
Usage: I don't think gaussian_kde is an efficient approach for
one-dimensional datasets this large.
For example, statsmodels (and some other packages) use an FFT on the binned
data, which is much faster. IIRC, statsmodels only uses integrate.quad to
get the cdf from the kde; I don't know if an FFT could be used for that.
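
To illustrate the binning + FFT idea, here is a minimal NumPy sketch. It is
not the statsmodels implementation; the function name binned_fft_kde, the
fixed bandwidth argument, the grid size and the cut parameter are all
assumptions made for this example.

```python
import numpy as np

def binned_fft_kde(data, bandwidth, gridsize=1024, cut=3.0):
    """Approximate a 1-D Gaussian KDE on a regular grid (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Extend the grid a few bandwidths past the data so the tails are covered.
    lo = data.min() - cut * bandwidth
    hi = data.max() + cut * bandwidth

    # Step 1: bin the data. Every point is replaced by its bin center.
    counts, edges = np.histogram(data, bins=gridsize, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])
    dx = edges[1] - edges[0]

    # Step 2: sample the Gaussian kernel at every possible grid offset.
    offsets = dx * np.arange(-(gridsize - 1), gridsize)
    kernel = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    kernel /= bandwidth * np.sqrt(2.0 * np.pi)

    # Step 3: linear convolution of counts and kernel via zero-padded FFTs.
    n = counts.size + kernel.size - 1
    conv = np.fft.irfft(np.fft.rfft(counts, n) * np.fft.rfft(kernel, n), n)

    # Keep the slice of the full convolution that lines up with the centers.
    density = conv[gridsize - 1:2 * gridsize - 1] / data.size
    return centers, density
```

The cost is roughly one histogram pass plus an FFT of about three times the
grid size, independent of how many points you evaluate at. The price is a
small binning error, which shrinks as the grid spacing gets small relative to
the bandwidth.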
Also, unless you want tail probabilities with few observations, using the
empirical cdf (just count the number of points in the interval) might be
pretty accurate with this many points.
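
A minimal sketch of that counting approach, assuming the interval is closed
on both ends; the function name empirical_interval_prob is made up for this
example.

```python
import numpy as np

def empirical_interval_prob(data, a, b):
    """Fraction of observations falling in the closed interval [a, b]."""
    sorted_data = np.sort(np.asarray(data, dtype=float))
    # Index of the first point >= a.
    lo = np.searchsorted(sorted_data, a, side="left")
    # Index just past the last point <= b.
    hi = np.searchsorted(sorted_data, b, side="right")
    return (hi - lo) / sorted_data.size
```

If many intervals are needed, sorting once outside the function and reusing
the sorted array makes each query a pair of binary searches.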
(aside: statsmodels has a kernel estimator for the cdf directly, but that
will be slow since it loops over points.)
--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1854#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.