[SciPy-user] predicting values based on (linear) models

Pierre GM pgmdevlist@gmail....
Wed Jan 14 22:24:36 CST 2009


On Jan 14, 2009, at 10:15 PM, josef.pktd@gmail.com wrote:
> The functions in stats that I tested or rewrote are usually identical
> to around 1e-15, but in some cases R has a more accurate test
> distribution for small samples (option "exact" in R), while in
> scipy.stats we only have the asymptotic distribution.

We could try to reimplement part of it in C. In any case, it might
be worth emitting a warning (or at least being very explicit in the doc)
that the results may not hold for samples smaller than 10-20.
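
As a rough illustration of the warning idea, here is a minimal sketch using only the standard library. The helper name `warn_if_small_sample` and the default threshold of 20 are assumptions made for this example (the threshold is simply the upper end of the "10-20" range above); nothing like this is claimed to exist in scipy.stats.

```python
import warnings

# Hypothetical cutoff, taken from the "10-20" range mentioned above.
SMALL_SAMPLE_THRESHOLD = 20


def warn_if_small_sample(n, threshold=SMALL_SAMPLE_THRESHOLD):
    """Emit a RuntimeWarning when n is below threshold.

    Returns True if a warning was issued, False otherwise.
    """
    if n < threshold:
        warnings.warn(
            "sample size %d is below %d; results based on the asymptotic "
            "distribution may be inaccurate" % (n, threshold),
            RuntimeWarning,
            stacklevel=2,  # point at the caller, not at this helper
        )
        return True
    return False
```

A test function could call this helper on its input length before computing a p-value, so users can still silence or escalate the message through the usual `warnings` filters.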

> Also, not all
> existing functions in scipy.stats are tested (yet).

We should also try to make sure missing data are properly supported  
(not always possible) and that the results are consistent between the  
masked and non-masked versions.
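
One way to check that kind of consistency, sketched with numpy only: compute a statistic on a masked array and compare it with the plain version applied to the valid values. The function name `check_masked_consistency` is made up for this example, and the mean and sample standard deviation merely stand in for whatever stats function is being tested.

```python
import numpy as np


def check_masked_consistency(data):
    """Compare masked statistics with plain ones computed on the valid values."""
    data = np.asarray(data, dtype=float)
    masked = np.ma.masked_invalid(data)   # mask NaN/inf entries
    valid = data[np.isfinite(data)]       # same values with missing ones dropped
    return bool(np.allclose(
        [masked.mean(), masked.std(ddof=1)],
        [valid.mean(), valid.std(ddof=1)],
    ))
```

With no missing values, both paths see the same data, so the check also covers the "masked input without any mask" case.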



>> Do you see benefit from re-programming the stats functions in scipy?
>>
>
> (Since R and its packages are GPL we cannot copy from it directly, but
> I was looking at R and matlab for the interface/signature of
> statistical functions.)

There's one obvious advantage (on top of the pedagogical exercise):  
that's one dependency less.


> I would like to see many of the basic statistics functions included in
> scipy (or in an addon, or initially as cookbook recipes). Much of the
> basic supporting tools for statistics like optimize, linalg,
> distributions, special and signal, are available but it is a pain to
> figure out each time how to use them; for example, how to get the error
> and covariance estimates for linear or non-linear regression.

Very true, but it's also what attracted me to numpy/scipy in the first
place: the functions I needed did not exist at the time, and I was
reluctant to rely on other software which, albeit more powerful,
hid how values were actually calculated (what assumptions were made,
what the validity domains were...). It was nice to have some time on
my hands.
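
For the linear regression case mentioned in the quoted text, the calculation can be spelled out in a few lines of numpy. This is only a sketch under the textbook assumptions (independent errors with constant variance, full-rank design matrix); the function name `ols_with_errors` is invented for the example.

```python
import numpy as np


def ols_with_errors(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (beta, cov, stderr): the coefficients [a, b], their estimated
    covariance matrix, and the standard error of each coefficient.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta                           # residuals of the fit
    dof = len(y) - X.shape[1]                      # n minus number of parameters
    sigma2 = resid @ resid / dof                   # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # covariance of the estimates
    return beta, cov, np.sqrt(np.diag(cov))
```

Every assumption is visible here: the covariance is the residual variance times the inverse of X'X, and the standard errors are the square roots of its diagonal. For the non-linear case, `scipy.optimize.curve_fit` returns a parameter covariance estimate as its second return value.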
>
> But, my impression is that, since scipy is mostly developer driven
> (?), what finally ends up in scipy, depends on the needs of the
> developers, and their willingness to share the code and to incorporate
> user feedback.

IMHO, the readiness to incorporate user feedback is here. The feedback  
is not, or at least not as much as we'd like.

