[SciPy-dev] MCMC, Kalman Filtering, AI for SciPy?

Travis Oliphant oliphant at ee.byu.edu
Tue Sep 28 12:01:16 CDT 2004


chris at fisher.forestry.uga.edu wrote:

>On 9/27/2004, "Travis Oliphant" <oliphant at ee.byu.edu> wrote:
>
>>We are not against reorganizations, but odd to you is not necessarily
>>odd to someone else, and vice versa.  So, let's just figure out where
>>monte_carlo should go.  I think it would go well under stats, or else a
>>new AI subpackage.
>
>MCMC isn't really artificial intelligence, it is Bayesian stats, so stats
>would be best, I think.
>
You have a like-minded friend in your thinking there.  I actually don't 
like the name stats (except that it is short) and prefer probability or 
prob.

>>I agree that stats could use reorganization.  Many routines were lifted
>>from an old pstats.py file.  While it has been significantly cleaned up,
>>there are still problems.
>
>I would be glad to take a run at re-doing stats, if only because I would
>be using it a *lot*. If so, I wouldn't mind getting a bit of background
>from Travis (and others) about the module, particularly with respect to
>what would need to be retained, and what the most desirable changes
>would be.
>
The distributions.py file received a lot of attention from me; the rest 
of the package received only a bit.  So, I'd have the most to say about 
anything in distributions.py.

>>I hear various complaints occasionally about slowness in some of the
>>distributions in stats.  In order to improve things, these need to be
>>better described.  There shouldn't be a lot of slow-down in most of the
>>routines (aside from domain validity slowness).   Some distributions
>>don't have exactly defined cdf's or ppf's and these must be computed by
>>SciPy using integration and zero-finding routines.  This will be very slow.
>
>The distributions that most folks would be using 95% of the time
>(normal, mv-normal, gamma, beta, binomial, exponential, Poisson) are well
>defined and can be made pretty fast with FORTRAN extensions.
>
And these in particular should already be fast (i.e. based on FORTRAN 
and C extensions).  If there is something specific in the interface that 
is slowing them down unacceptably then that needs to be addressed if at 
all possible.
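
For contrast, the slow case from the quoted text above (a cdf or ppf that 
has to be computed by integration and zero-finding) can be sketched 
roughly as follows.  This is only an illustration in pure Python, not the 
code in distributions.py: the names cdf_numeric and ppf_numeric are 
made up, Simpson's rule and bisection are stand-ins for whatever 
integration and zero-finding routines are actually used, and the normal 
pdf is used only because its exact cdf is available for comparison.

```python
import math


def pdf(x):
    # Standard normal density, standing in for a distribution whose
    # cdf is assumed to have no closed form.
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf_numeric(x, lower=-10.0, n=2000):
    # Hypothetical helper: composite Simpson's rule on [lower, x].
    # n must be even; the mass below `lower` is assumed negligible.
    if x <= lower:
        return 0.0
    h = (x - lower) / n
    total = pdf(lower) + pdf(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(lower + i * h)
    return total * h / 3.0


def ppf_numeric(q, lo=-10.0, hi=10.0, tol=1e-8):
    # Hypothetical helper: bisection on cdf_numeric(x) - q.
    # Every bisection step pays for a full numerical integration.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf_numeric(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cdf_exact(x):
    # Closed form for the normal case, used only as a check.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

Each ppf_numeric call here costs a few dozen integrations of a couple 
of thousand pdf evaluations each, which is where the slowness for such 
distributions comes from.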

There will always be an overhead for making a library call robust.  I 
don't mind having exposed "faster" methods if the overhead for checking 
arguments is unacceptable, but it will not be the front-line command.

-Travis




