[Scipy-tickets] [SciPy] #8: What are the return values of stats.linregress ?

SciPy scipy-tickets at scipy.net
Wed Jan 10 04:05:45 CST 2007


#8: What are the return values of stats.linregress ?
-----------------------------------+----------------------------------------
 Reporter:  baptiste13 at altern.org  |        Owner:  somebody
     Type:  defect                 |       Status:  new     
 Priority:  normal                 |    Milestone:          
Component:  scipy.stats            |      Version:          
 Severity:  normal                 |   Resolution:          
 Keywords:                         |  
-----------------------------------+----------------------------------------
Old description:

> Hello,
>
> The linregress function in scipy.stats returns a value called stderr-of-
> the-estimate which is equal to:
> sqrt(1-r^2) * samplestd(y) = sqrt( (1-r^2) * sum((y - mean(y))^2) / N )
> with r the correlation coefficient and N the number of data points
>
> This is different from the usual estimator for the standard error, which
> is
> sqrt( (1-r^2) * sum((y - mean(y))^2) / df ) = sqrt( (1-r^2) * sum((y -
> mean(y))^2) / (N-2) )
> where df stands for the number of degrees of freedom
>
> From the docstring, one could assume that stderr-of-the-estimate is the
> usual estimator for stderr, as this result is relevant in most cases
> where linear regression is used. By contrast, I don't see any
> application where the stderr-of-the-estimate result, as currently
> computed, would be relevant.
>
> If stderr-of-the-estimate is meant to be the usual estimator for stderr,
> the calculation should be corrected. If not, the docstring should
> describe more specifically what it stands for.
>
> Cheers,
> BC

New description:

 Hello,

 The linregress function in scipy.stats returns a value called stderr-of-
 the-estimate which is equal to:
 sqrt(1-r^2^) * samplestd(y) = sqrt( (1-r^2^) * sum((y - mean(y))^2^) / N )
 with r the correlation coefficient and N the number of data points

 This is different from the usual estimator for the standard error, which is
 sqrt( (1-r^2^) * sum((y - mean(y))^2^) / df ) = sqrt( (1-r^2^) * sum((y -
 mean(y))^2^) / (N-2) )
 where df stands for the number of degrees of freedom
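
 To make the difference concrete, here is a minimal sketch comparing the two
 expressions above, taking the sum in both to mean the sum of squared
 deviations of y from its mean. It uses plain NumPy rather than linregress
 itself; the helper name stderr_of_estimate, its ddof parameter, and the data
 are made up for illustration.

```python
import numpy as np


def stderr_of_estimate(x, y, ddof):
    """Return sqrt((1 - r^2) * sum((y - mean(y))^2) / (N - ddof)).

    Hypothetical helper: ddof=0 divides by N (the value described in this
    ticket), ddof=2 divides by N - 2 (the usual estimator).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    r = np.corrcoef(x, y)[0, 1]          # Pearson correlation coefficient
    ss_y = np.sum((y - y.mean()) ** 2)   # sum of squared deviations of y
    return np.sqrt((1.0 - r ** 2) * ss_y / (n - ddof))


# Made-up sample data, roughly linear with some noise.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = y.size

divided_by_n = stderr_of_estimate(x, y, ddof=0)
divided_by_df = stderr_of_estimate(x, y, ddof=2)

# Cross-check: the N - 2 version equals sqrt(SSE / (N - 2)), where SSE is
# the residual sum of squares of the least-squares line.
slope, intercept = np.polyfit(x, y, 1)
sse = np.sum((y - (slope * x + intercept)) ** 2)
from_residuals = np.sqrt(sse / (n - 2))

print("divided by N:    ", divided_by_n)
print("divided by N - 2:", divided_by_df)
print("from residuals:  ", from_residuals)
```

 The two values differ by a factor of sqrt(N / (N-2)), which is about 1.29
 for N = 5 and approaches 1 only for large N.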

 From the docstring, one could assume that stderr-of-the-estimate is the
 usual estimator for stderr, as this result is relevant in most cases where
 linear regression is used. By contrast, I don't see any application where
 the stderr-of-the-estimate result, as currently computed, would be relevant.

 If stderr-of-the-estimate is meant to be the usual estimator for stderr,
 the calculation should be corrected. If not, the docstring should describe
 more specifically what it stands for.

 Cheers,
 BC

-- 
Ticket URL: <http://projects.scipy.org/scipy/scipy/ticket/8#comment:2>
SciPy <http://www.scipy.org/>
SciPy is open-source software for mathematics, science, and engineering.

