[Scipy-tickets] [SciPy] #1854: scipy.stats.gaussian_kde.integrate functions are slow

SciPy Trac scipy-tickets@scipy....
Fri Mar 1 07:27:50 CST 2013


#1854: scipy.stats.gaussian_kde.integrate functions are slow
-------------------------------------------------+--------------------------
 Reporter:  itissid                              |       Owner:  pv         
     Type:  enhancement                          |      Status:  new        
 Priority:  normal                               |   Milestone:  Unscheduled
Component:  scipy.special                        |     Version:  0.11.0     
 Keywords:  gaussian_kde, numerical integration  |  
-------------------------------------------------+--------------------------

Comment(by josefpktd):

 Usage: for a one-dimensional dataset this large, I don't think
 gaussian_kde is an efficient approach.

 For example, statsmodels (and some other packages) run an FFT on the
 binned data, which is much faster. IIRC, statsmodels only uses
 integrate.quad to get the cdf from the kde; I don't know if an FFT could
 be used for that.
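
 A rough sketch of the binned-FFT idea, to make it concrete. This is not
 the statsmodels implementation; the function name fft_kde and the
 parameters bandwidth, gridsize and cut are made up for illustration, and
 the bandwidth has to be supplied by the caller.

```python
import numpy as np

def fft_kde(data, bandwidth, gridsize=1024, cut=3.0):
    """Approximate a 1-D Gaussian KDE by binning the data and convolving via FFT.

    Hypothetical helper for illustration only; not a SciPy or statsmodels API.
    """
    data = np.asarray(data, dtype=float)
    # Extend the grid a few bandwidths past the data so the tails are not clipped.
    lo = data.min() - cut * bandwidth
    hi = data.max() + cut * bandwidth

    # Bin the data on a regular grid; each bin carries its share of probability mass.
    counts, edges = np.histogram(data, bins=gridsize, range=(lo, hi))
    dx = edges[1] - edges[0]
    centers = edges[:-1] + 0.5 * dx
    weights = counts / data.size

    # Gaussian kernel sampled at every possible offset between two grid points.
    n = gridsize
    offsets = (np.arange(2 * n - 1) - (n - 1)) * dx
    kernel = np.exp(-0.5 * (offsets / bandwidth) ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))

    # Zero-padded FFT gives the full linear convolution (length 3n - 2).
    size = 3 * n - 2
    conv = np.fft.irfft(np.fft.rfft(weights, size) * np.fft.rfft(kernel, size), size)

    # Keep the part of the convolution that lines up with the grid centers.
    density = conv[n - 1 : 2 * n - 1]
    return centers, density
```

 The cost is one histogram plus a few FFTs of length about 3 * gridsize,
 so it barely depends on the number of observations. A crude cdf on the
 grid would then be np.cumsum(density) * dx, which is only a Riemann sum
 and not a replacement for a proper integration routine.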

 Also, unless you want tail probabilities with few observations, using the
 empirical cdf (just count the number of points in the interval) might be
 pretty accurate with this many points.
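
 A minimal sketch of the empirical-cdf approach, assuming the data fit in
 memory as a NumPy array (the function name ecdf_interval_prob is made up
 for this example):

```python
import numpy as np

def ecdf_interval_prob(data, a, b):
    """Fraction of observations that fall in the closed interval [a, b]."""
    x = np.sort(np.asarray(data, dtype=float))
    # Index of the first point >= a and one past the last point <= b.
    lo = np.searchsorted(x, a, side="left")
    hi = np.searchsorted(x, b, side="right")
    return (hi - lo) / x.size
```

 If many intervals are queried, sorting once and reusing the sorted array
 makes each lookup O(log n).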

 (aside: statsmodels has a kernel estimator for the cdf directly, but that
 will be slow since it loops over points.)

-- 
Ticket URL: <http://projects.scipy.org/scipy/ticket/1854#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

