[SciPy-Dev] Subversion scipy.stats irregular problem with source code example

Charles charles.moliere@gmail....
Thu Dec 9 13:34:55 CST 2010


Skipper Seabold <jsseabold <at> gmail.com> writes:

> 
> On Tue, Sep 28, 2010 at 1:12 PM, James Phillips <zunzun <at> zunzun.com> wrote:
> > Since I observed the following behavior in the SVN repository version
> > of SciPy, it seemed to me proper to post to the dev mailing list.  I'm
> > using Ubuntu Lucid Lynx 32 bit and a fresh GIT of Numpy.  I am not
> > sure if a Trac bug report needs to be entered.
> >
> >
> > Below is some example code for fitting two statistical distributions.
> > Sometimes the numpy-generated data is fit, as I can see the estimated
> > and fitted parameters are different.  Sometimes I receive many
> > messages repeated on the command line like:
> >
> > Warning: invalid value encountered in absolute
> > Warning: invalid value encountered in subtract
> >
> > and the estimated parameters equal the fitted parameter values,
> > indicating no fitting took place.  Sometimes I receive on the command
> > line:
> >
> > Traceback (most recent call last):
> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1987, in func
> >    sk = 2*(b-a)*math.sqrt(a + b + 1) / (a + b + 2) / math.sqrt(a*b)
> > ValueError: math domain error
> > Traceback (most recent call last):
> >  File "example.py", line 10, in <module>
> >    fitStart_beta = scipy.stats.beta._fitstart(data)
> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1992, in _fitstart
> >    a, b = optimize.fsolve(func, (1.0, 1.0))
> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 125, in fsolve
> >    maxfev, ml, mu, epsfcn, factor, diag)
> > minpack.error: Error occured while calling the Python function named func
> >
> > and program flow is stopped.
> >
> >
> > In summary, three behaviors: (1) Fits OK (2) Many exceptions with no
> > fitting (3) minpack error.  Running the program 10 times or so will
> > reproduce these behaviors without fail from the "bleeding-edge"
> > repository code.
> >
> >     James Phillips
> >
> >
> > ########################################################
> >
> > import numpy, scipy, scipy.stats
> >
> > # test uniform distribution fitting
> > data = numpy.random.uniform(2.0, 3.0, size=100)
> > fitStart_uniform = scipy.stats.uniform._fitstart(data)
> > fittedParameters_uniform = scipy.stats.uniform.fit(data)
> >
> > # test beta distribution fitting
> > data = numpy.random.beta(2.0, 3.0, size=100)
> > fitStart_beta = scipy.stats.beta._fitstart(data)
> > fittedParameters_beta = scipy.stats.beta.fit(data)
> >
> > print
> > print 'uniform._fitstart returns', fitStart_uniform
> > print 'fitted parameters for uniform =', fittedParameters_uniform
> > print
> > print 'beta._fitstart returns', fitStart_beta
> > print 'fitted parameters for beta =', fittedParameters_beta
> > print
> > _______________________________________________
> 
> Is there an existing bug ticket for this?  If not there probably should be...
> 
> I think the fitting code should be looked at as experimental.  It's
> good that you caught that no fitting is actually done in these cases.
> The problem stems (for the most part) from bad starting values
> (outside the support of the distribution for those with bounded
> support).  I've tried to go through and fix this, giving very naive
> (but correct) starting values to fit methods, but I haven't gotten
> much further than that.
> 
> I don't know if Travis or Josef have gone back to look at this.
> Hopefully one of these days I will find some more time to look at this
> and try to give a systematic fix.
> 
> Skipper
> 
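
To illustrate the point in the quoted reply about starting values that fall outside a bounded support, here is a minimal sketch. The helper `start_in_support` is hypothetical (it is not part of SciPy), and the padded "naive" start is only one possible choice; the sketch just checks whether every data point has a finite log-density at the proposed starting parameters.

```python
import numpy as np
import scipy.stats


def start_in_support(dist, data, params):
    """Hypothetical helper: True if every data point has a finite
    log-density under `dist` at the proposed starting parameters."""
    return bool(np.all(np.isfinite(dist.logpdf(data, *params))))


rng = np.random.RandomState(0)
data = rng.uniform(2.0, 3.0, size=100)

# A start of loc=0, scale=1 puts the support at [0, 1]; the data lie in [2, 3].
bad_start = (0.0, 1.0)

# Naive start: span the observed range, padded slightly (the pad is arbitrary).
pad = 1e-6
naive_start = (data.min() - pad, (data.max() - data.min()) + 2 * pad)

bad_ok = start_in_support(scipy.stats.uniform, data, bad_start)
naive_ok = start_in_support(scipy.stats.uniform, data, naive_start)
print(bad_ok, naive_ok)
```

A start that fails this check gives the optimizer an infinite or undefined objective at its first step, which is consistent with the "invalid value" warnings reported above.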


Hi,
I'm very sorry for entering the thread like this, but after a long search of 
the web, this thread is the most relevant to the problem I'm stuck on. 
I'm trying to fit a gamma distribution to a set of experimental 
values with gamma.fit() in scipy 0.8.0. Here is the very simple code I'm using 
with a sample of my data:

##########################
import numpy as np
import scipy.stats as ss

exp_data = [25.6, 35.8, 100.2, 115.2, 125.2, 140.1, 160.6, 210.1, 250.5, 4500.3]
data = np.array(exp_data)

# gamma.fit returns (shape, loc, scale)
fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data)
print(fit_alpha, fit_loc, fit_beta)
#########################

I then receive many messages on the command line:
Warning: invalid value encountered in subtract

The run ends with no fitting of the parameters:
(1.0, 0.0, 1.0)

With an earlier version of scipy (0.7.2), the error messages are absent but 
still no fitting is done. Apparently, it is the extreme value of "4500.3" that 
is causing the problem with the fitting in this case. 

I know you mentioned earlier that the fitting code should be considered 
experimental, but I was wondering whether this should be considered a bug 
or whether I'm making a mistake. In either case, is there a fix that lets the 
fit method work with a gamma distribution?
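
One possible workaround, sketched here under assumptions: SciPy releases that accept the `floc` keyword in `fit()` allow the location to be held fixed, so only the shape and scale are estimated. Whether 0.8.0 accepts this keyword is not confirmed here, and fixing the location at 0 is itself an assumption that the data are strictly positive and that a two-parameter gamma is appropriate.

```python
import numpy as np
import scipy.stats as ss

data = np.array([25.6, 35.8, 100.2, 115.2, 125.2, 140.1, 160.6,
                 210.1, 250.5, 4500.3])

# Assumption: a location of 0 is acceptable for these strictly positive data.
# floc=0 holds loc fixed, so only the shape and scale are estimated.
fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data, floc=0)
print(fit_alpha, fit_loc, fit_beta)
```

With the location fixed, the maximum-likelihood estimates satisfy shape * scale = sample mean, which gives a quick sanity check that a fit actually took place.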

Many thanks,
Charles



