[SciPy-user] Limits of linregress - underflow encountered in stdtr

wierob wierob83@googlemail....
Mon Jun 8 15:09:37 CDT 2009


Hi,

I'm trying to do a regression analysis for a large data set using 
stats.linregress. Unfortunately, I keep getting strange results or 
errors. I've tested linregress with the code below and I get the error 
message:

Traceback (most recent call last):
  File "C:/Users/wierob/Documents/Masterarbeit/underflow.py", line 25, in <module>
    res = stats.linregress(x, y_es)
  File "C:\Python26\lib\site-packages\scipy\stats\stats.py", line 1799, in linregress
    prob = distributions.t.sf(np.abs(t),df)*2
  File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 665, in sf
    place(output,cond,self._sf(*goodargs))
  File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 531, in _sf
    return 1.0-self._cdf(x,*args)
  File "C:\Python26\lib\site-packages\scipy\stats\distributions.py", line 2829, in _cdf
    res = special.stdtr(df, x)
FloatingPointError: underflow encountered in stdtr

With z > 1 (len(x) >= 40), stats.linregress(x, y["dependency"]) fails.
With z > 29 (len(x) >= 600), both stats.linregress(x, y["dependency"]) 
and stats.linregress(x, y["dependency_with_noise"]) fail.

from scipy import stats

import numpy
numpy.seterr(all="raise")

# linregress fails -> FloatingPointError: underflow encountered in stdtr
#   for x and y["dependency"] if z > 1
#   for x and y["dependency_with_noise"] if z > 29
z = 30

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]*z
print len(x)

y = {}
y["dependency"] = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38]*z

y["dependency_with_noise"] = [0, -1, 4, 3, 8, 10, 12, 14, 10, 25, 20, 22, 24, 27, 28, 17, 32, 34, 36, 40]*z


for key, y_es in y.iteritems():
    print "="*5, key, "="*5

    res = stats.linregress(x, y_es)
    
    print "slope:", res[0]
    print "intercept:", res[1]
    print "r-value:", res[2]
    print "p-value:", res[3]
    print "stderr:", res[4]
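
To make the failure condition more concrete, here is a minimal sketch in plain Python (no scipy) of the quantity that ends up in the t.sf(np.abs(t), df) call shown in the traceback. It assumes the usual textbook t statistic for a regression slope, t = r * sqrt(df / (1 - r^2)); that formula is an assumption and is not copied from the scipy source. The helper names pearson_r and t_statistic are hypothetical and exist only for this illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient from centered sums of squares."""
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Assumed textbook t statistic for the slope, with n - 2 degrees of freedom."""
    df = n - 2
    denom = (1.0 - r) * (1.0 + r)  # equals 1 - r^2
    if denom <= 0.0:
        # A perfect fit leaves no residual variance, so t is unbounded.
        return float("inf")
    return r * math.sqrt(df / denom)


z = 30
x = list(range(20)) * z

# Same data as in the script above: an exact dependency and a noisy one.
y_exact = [2 * xi for xi in x]
y_noisy = [0, -1, 4, 3, 8, 10, 12, 14, 10, 25,
           20, 22, 24, 27, 28, 17, 32, 34, 36, 40] * z

r_exact = pearson_r(x, y_exact)
r_noisy = pearson_r(x, y_noisy)
t_exact = t_statistic(r_exact, len(x))
t_noisy = t_statistic(r_noisy, len(x))
```

Under that assumption, the exact dependency gives r = 1 and an unbounded t, and the noisy data still gives a large t that grows with len(x). In both cases the two-sided p-value would be extremely close to zero, which may be why the underflow is only reported once numpy.seterr(all="raise") is active.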


What's my mistake? Are there any restrictions for the use of linregress?

regards
robert

