[SciPy-dev] Statistics review months

Jonathan Taylor jonathan.taylor at stanford.edu
Sat Apr 1 22:20:02 CST 2006

on this topic: as an honest-to-goodness statistician, i would like to 
see more statistical modelling in scipy. i know Rpy exists, but the 
interface is not very pythonic.

i have some "home-brew" modules for linear regression, formula building 
(something like R's) and a few other things. if they went into something 
like scipy, they might benefit from the criticism of others....

is there any interest in making the equivalent of a



i think an easily achievable medium-term goal is:

i) linear (least-squares) regression models with/without weights or 
non-diagonal covariance matrices (in R: lm + more)

ii) generalized linear models (in R: glm)

iii) iteratively reweighted least squares algorithms (glm is a special 
case), e.g. robust regression (in R: rlm).

iv) ordinary least squares multivariate linear models (i.e. multivariate 
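
To make items i) and iii) concrete, here is a minimal NumPy sketch, not code from the home-brew modules mentioned above. The function names (weighted_least_squares, huber_irls), the Huber tuning constant 1.345 and the MAD-based scale estimate are all assumptions chosen for illustration; R's rlm uses a similar recipe but differs in its details.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimise sum_i w_i * (y_i - x_i . beta)**2 (hypothetical helper).

    Rescaling each row by sqrt(w_i) turns the problem into ordinary least squares.
    """
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Robust regression with Huber weights, fitted by IRLS (illustrative only)."""
    # Start from the unweighted (ordinary least squares) fit.
    beta = weighted_least_squares(X, y, np.ones(len(y)))
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale estimate: normalised median absolute deviation.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, c / u for large ones.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        new_beta = weighted_least_squares(X, y, w)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

Each pass is just a weighted least-squares fit with weights recomputed from the current residuals, which is why a glm fit and a robust fit can share the same inner solver.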

some of these models can easily be "broadcast", others not so easily....
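
One reading of "broadcast" here is fitting the same design matrix to many response vectors at once. The sketch below assumes that reading and uses simulated data; the sizes n, p, k and the variable names are made up for illustration. Ordinary least squares fits all responses in a single call because the design matrix is shared.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 200, 3, 1000  # observations, regressors, separate response vectors

# Simulated design matrix (with intercept column) and k response columns.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
B_true = rng.normal(size=(p, k))
Y = X @ B_true + 0.1 * rng.normal(size=(n, k))

# One call fits all k regressions: column j of B_hat is the OLS fit for Y[:, j].
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Per-response residual variance estimates, computed column-wise.
resid = Y - X @ B_hat
sigma2_hat = (resid ** 2).sum(axis=0) / (n - p)
```

Once the weights depend on the response, as they do in glm or robust fits, each column effectively gets its own weighted design matrix, so the single-call trick no longer applies directly.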

further goals are more general models: classification, constrained model 
fitting, model selection.... for some of these things, it may not be 
worth duplicating R's (or other packages') efforts.

-- jonathan

Robert Kern wrote:

>In the interest of improving the quality of the scipy.stats package, I hereby
>declare April and May of 2006 to be Statistics Review Months. I propose that we
>set ourselves a goal to review each function in stats.py and morestats.py (and a
>few others) for correctness and completeness of implementation by the end of
>May. By my count, that's about 2.5 functions every day. Surely this is a
>reasonable amount of effort for a rather large payoff: a robust, well-tested and
>thorough statistics library.
>I have added a Wiki page describing the details:
>  http://projects.scipy.org/scipy/scipy/wiki/StatisticsReview
>Barring any objections, I will be irretrievably creating the ~150 tickets or so
>for all of the functions to be reviewed later tonight. So if you object, act fast!
>[Disclosure: this idea isn't mine. Eric Jones mentioned it to me once, and I'm
>just running with it.]

I'm part of the Team in Training: please support our efforts for the
Leukemia and Lymphoma Society!



Jonathan Taylor                           Tel:   650.723.9230
Dept. of Statistics                       Fax:   650.725.8977
Sequoia Hall, 137                         www-stat.stanford.edu/~jtaylo
390 Serra Mall
Stanford, CA 94305
