[SciPy-user] predicting values based on (linear) models

josef.pktd@gmai... josef.pktd@gmai...
Wed Jan 14 21:15:10 CST 2009

On Wed, Jan 14, 2009 at 7:37 PM, Tim Michelsen
<timmichelsen@gmx-topmail.de> wrote:
> Hello Josef,
> thanks for your extensive answer.
> I really appreciate it and will see how I could use it.
>> olsexample.py in the attachment is from the cookbook, and I'm slowly reworking it.
>> fancier models will be in scipy.stats.models when they are ready for inclusion.
> Do you have scipy.stats.models in an SVN repository somewhere?

The main current location is at

I made a few changes to stats.models, so that all existing tests pass at

>> I'm using rpy (version 1) to check scipy.stats functions, and for sure
>> the available methods are very extensive in R, while coverage of
>> statistics and econometrics in python packages including scipy is
>> spotty, some good spots and many missing pieces.
> As you are checking against R with rpy, do you think that the R
> functions are more accurate?

The functions in stats that I tested or rewrote usually agree with R to
around 1e-15, but in some cases R has a more accurate test
distribution for small samples (option "exact" in R), while in
scipy.stats we only have the asymptotic distribution. Also, not all
existing functions in scipy.stats are tested (yet).
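
To illustrate why an exact small-sample distribution can matter, here is a
minimal sketch comparing an exact and an asymptotic p-value for a one-sided
binomial test. The numbers (9 successes in 10 trials, null probability 0.5)
and the variable names are made up for illustration; this is not how R or
scipy.stats implement any particular test.

```python
import numpy as np
from scipy import stats

# Hypothetical small sample: 9 successes in 10 trials, H0: p = 0.5,
# alternative: p > 0.5.
n, k, p0 = 10, 9, 0.5

# Exact p-value from the binomial distribution: P(X >= k) = P(X > k - 1).
p_exact = stats.binom.sf(k - 1, n, p0)

# Asymptotic p-value from the normal approximation to the binomial
# (no continuity correction).
z = (k - n * p0) / np.sqrt(n * p0 * (1 - p0))
p_asymptotic = stats.norm.sf(z)

print("exact:      %.6f" % p_exact)
print("asymptotic: %.6f" % p_asymptotic)
```

With so few observations the two p-values differ noticeably, and the gap
shrinks as n grows.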

> Do you see benefit from re-programming the stats functions in scipy?

(Since R and its packages are GPL we cannot copy from it directly, but
I was looking at R and matlab for the interface/signature of
statistical functions.)

I would like to see many of the basic statistics functions included in
scipy (or in an add-on, or initially as cookbook recipes). Many of the
basic supporting tools for statistics, such as optimize, linalg,
distributions, special and signal, are available, but it is a pain to
figure out each time how to use them; for example, how to get the error
and covariance estimates for linear or non-linear regression. There
are many good specialized packages available for python, for example
for machine learning or MCMC, but no complete collection of basic
statistical functionality.
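
As an example of the kind of recipe meant here, the following is a minimal
sketch of getting coefficient covariance and standard errors for a linear
regression with plain numpy, and using them for a prediction. It is not the
attached olsexample.py and not the scipy.stats.models code; the function name
ols_fit, the simulated data and the use of a current numpy (with
default_rng and the @ operator) are assumptions made for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares: return coefficients, their covariance
    matrix, and their standard errors (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)            # unbiased residual variance
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)  # covariance of the estimates
    se = np.sqrt(np.diag(cov_beta))             # standard errors
    return beta, cov_beta, se

# Simulated data: y = 1 + 2*x + noise (made-up numbers).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
X = np.column_stack([np.ones_like(x), x])       # constant plus one regressor
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

beta, cov_beta, se = ols_fit(X, y)

# Predicted mean at a new point x = 12, with the standard error of that
# predicted mean: sqrt(x_new' Cov(beta) x_new).
x_new = np.array([[1.0, 12.0]])
y_pred = x_new @ beta
pred_se = np.sqrt(np.einsum("ij,jk,ik->i", x_new, cov_beta, x_new))

print("coefficients:", beta)
print("standard errors:", se)
print("prediction:", y_pred, "+/-", pred_se)
```

For the non-linear case, scipy.optimize.curve_fit returns an estimated
covariance matrix of the parameters as its second return value.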

But my impression is that, since scipy is mostly developer driven
(?), what finally ends up in scipy depends on the needs of the
developers and on their willingness to share code and to incorporate
user feedback.

