[SciPy-Dev] Subversion scipy.stats irregular problem with source code example

josef.pktd@gmai... josef.pktd@gmai...
Thu Dec 9 14:53:38 CST 2010


On Thu, Dec 9, 2010 at 3:12 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
> On Thu, Dec 9, 2010 at 2:34 PM, Charles <charles.moliere@gmail.com> wrote:
>> Skipper Seabold <jsseabold <at> gmail.com> writes:
>>
>>>
>>> On Tue, Sep 28, 2010 at 1:12 PM, James Phillips <zunzun <at> zunzun.com> wrote:
>>> > Since I observed the following behavior in the SVN repository version
>>> > of SciPy, it seemed proper to post to the dev mailing list.  I'm
>>> > using Ubuntu Lucid Lynx 32 bit and a fresh Git checkout of NumPy.  I am
>>> > not sure if a Trac bug report needs to be entered.
>>> >
>>> >
>>> > Below is some example code for fitting two statistical distributions.
>>> > Sometimes the numpy-generated data is fit, as I can see the estimated
>>> > and fitted parameters are different.  Sometimes I receive many
>>> > messages repeated on the command line like:
>>> >
>>> > Warning: invalid value encountered in absolute
>>> > Warning: invalid value encountered in subtract
>>> >
>>> > and the estimated parameters equal the fitted parameter values,
>>> > indicating no fitting took place.  Sometimes I receive on the command
>>> > line:
>>> >
>>> > Traceback (most recent call last):
>>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1987, in func
>>> >    sk = 2*(b-a)*math.sqrt(a + b + 1) / (a + b + 2) / math.sqrt(a*b)
>>> > ValueError: math domain error
>>> > Traceback (most recent call last):
>>> >  File "example.py", line 10, in <module>
>>> >    fitStart_beta = scipy.stats.beta._fitstart(data)
>>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/stats/distributions.py", line 1992, in _fitstart
>>> >    a, b = optimize.fsolve(func, (1.0, 1.0))
>>> >  File "/home/zunzun/local/lib/python2.6/site-packages/scipy/optimize/minpack.py", line 125, in fsolve
>>> >    maxfev, ml, mu, epsfcn, factor, diag)
>>> > minpack.error: Error occured while calling the Python function named func
>>> >
>>> > and program flow is stopped.
>>> >
>>> >
>>> > In summary, three behaviors: (1) Fits OK (2) Many exceptions with no
>>> > fitting (3) minpack error.  Running the program 10 times or so will
>>> > reproduce these behaviors without fail from the "bleeding-edge"
>>> > repository code.
>>> >
>>> >     James Phillips
>>> >
>>> >
>>> > ########################################################
>>> >
>>> > import numpy, scipy, scipy.stats
>>> >
>>> > # test uniform distribution fitting
>>> > data = numpy.random.uniform(2.0, 3.0, size=100)
>>> > fitStart_uniform = scipy.stats.uniform._fitstart(data)
>>> > fittedParameters_uniform = scipy.stats.uniform.fit(data)
>>> >
>>> > # test beta distribution fitting
>>> > data = numpy.random.beta(2.0, 3.0, size=100)
>>> > fitStart_beta = scipy.stats.beta._fitstart(data)
>>> > fittedParameters_beta = scipy.stats.beta.fit(data)
>>> >
>>> > print
>>> > print 'uniform._fitstart returns', fitStart_uniform
>>> > print 'fitted parameters for uniform =', fittedParameters_uniform
>>> > print
>>> > print 'beta._fitstart returns', fitStart_beta
>>> > print 'fitted parameters for beta =', fittedParameters_beta
>>> > print
>>> > _______________________________________________
>>>
>>> Is there an existing bug ticket for this?  If not there probably should be...
>>>
>>> I think the fitting code should be looked at as experimental.  It's
>>> good that you caught that no fitting is actually done in these cases.
>>> The problem stems (for the most part) from bad starting values
>>> (outside the support of the distribution for those with bounded
>>> support).  I've tried to go through and fix this, giving very naive
>>> (but correct) starting values to fit methods, but I haven't gotten
>>> much further than that.
>>>
>>> I don't know if Travis or Josef have gone back to look at this.
>>> Hopefully one of these days I will find some more time to look at this
>>> and try to give a systematic fix.
>>>
>>> Skipper
>>>
>>
>>
>> Hi,
>> I'm very sorry for entering the thread like this, but after a long search
>> of the web, this thread is the most relevant to the problem I'm stuck on.
>> I'm trying to fit a gamma distribution to a set of experimental
>> values with gamma.fit() in scipy 0.8.0. Here is the very simple code I'm
>> using, with a sample of my data:
>>
>> ##########################
>> import scipy as sp
>> import scipy.stats as ss
>>
>> exp_data =[25.6,35.8,100.2,115.2,125.2,140.1,160.6,210.1,250.5,4500.3]
>> data = sp.array(exp_data)
>>
>> fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data)
>> print(fit_alpha,fit_loc,fit_beta)
>> #########################
>>
>> I then receive many messages on the command line:
>> Warning: invalid value encountered in subtract
>>
>> Which ends with no fitting of the parameters:
>> (1.0, 0.0, 1.0)
>>
>> With an earlier version of scipy (0.7.2), the error messages are absent but
>> still no fitting is done. Apparently, it is the extreme value of "4500.3"
>> that is causing the problem with the fitting in this case.
>>
>> I know you mentioned earlier that the fitting code should be considered
>> experimental; however, I was wondering whether this should be considered a
>> bug or whether I'm making a mistake. In either case, is there a fix that
>> makes the fit method work with a gamma distribution?
>>
>
> It looks like Josef's recent changes have got this working with the
> most recent trunk, so you might want to upgrade or see the changeset

I don't remember any changes, but in this case choosing the right
starting values will be important, and the default ones might work
with one scipy version but not with another.
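
As a rough illustration of taking starting values from the data instead
of relying on the defaults, here is a minimal sketch that uses the method
of moments with the location held at zero. The helper name
gamma_moment_start is made up for this example and is not part of scipy;
it needs only the standard library (Python 3.8 or later for fmean).

```python
import statistics


def gamma_moment_start(data):
    """Method-of-moments guesses (shape, scale) for a gamma with loc = 0.

    Hypothetical helper for illustration only; not a scipy function.
    """
    mean = statistics.fmean(data)
    var = statistics.variance(data)  # sample variance (n - 1 denominator)
    if mean <= 0 or var <= 0:
        raise ValueError("need a positive mean and a positive variance")
    # For a gamma with loc = 0: mean = shape * scale, var = shape * scale**2
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


# The sample data posted earlier in the thread
exp_data = [25.6, 35.8, 100.2, 115.2, 125.2, 140.1, 160.6, 210.1, 250.5, 4500.3]
shape0, scale0 = gamma_moment_start(exp_data)
print(shape0, scale0)
```

Assuming a SciPy version whose fit method takes starting values this way,
these could be passed as ss.gamma.fit(data, shape0, loc=0, scale=scale0).
The sketch does not guarantee that the optimizer then converges on this
sample.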

I think the warnings mean that the starting values don't make much
sense and the likelihood is evaluated at "bad" places.
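
To make "bad" places concrete, here is a minimal sketch of a gamma
negative log-likelihood written with the standard library only. It is
not scipy's implementation; the function name gamma_nll and the choice to
return inf outside the support are assumptions made for this example.

```python
import math


def gamma_nll(params, data):
    """Negative log-likelihood of a gamma(shape, loc, scale) sample.

    Illustrative only: returns inf when the parameters are invalid or an
    observation falls outside the support (x <= loc).
    """
    shape, loc, scale = params
    if shape <= 0 or scale <= 0:
        return math.inf
    total = 0.0
    for x in data:
        z = (x - loc) / scale
        if z <= 0:  # the density is zero here, so log-density is -inf
            return math.inf
        # log pdf = (shape - 1) * log(z) - z - lgamma(shape) - log(scale)
        total += (shape - 1) * math.log(z) - z - math.lgamma(shape) - math.log(scale)
    return -total


exp_data = [25.6, 35.8, 100.2, 115.2, 125.2, 140.1, 160.6, 210.1, 250.5, 4500.3]

# The (1.0, 0.0, 1.0) values reported when no fitting took place
at_default = gamma_nll((1.0, 0.0, 1.0), exp_data)
# A location above the smallest observation lies outside the support
at_bad_loc = gamma_nll((1.0, 30.0, 1.0), exp_data)
print(at_default, at_bad_loc)
```

At (1.0, 0.0, 1.0) the value is finite but large (it reduces to the sum
of the data), while a location of 30.0 puts the observation 25.6 outside
the support, so the likelihood is undefined there. An objective that
produces nan or raises at such points instead of returning inf could
plausibly lead to warnings like the ones quoted above.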

Josef

>
> In [1]: import scipy as sp
>
> In [2]: import scipy.stats as ss
>
> In [3]:
>
> In [4]: exp_data =[25.6,35.8,100.2,115.2,125.2,140.1,160.6,210.1,250.5,4500.3]
>
> In [5]: data = sp.array(exp_data)
>
> In [6]:
>
> In [7]: fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data)
>
> In [8]: print(fit_alpha, fit_loc, fit_beta)
> (0.37079887324711569, 25.599999999999998, 2459.7323873048508)
>
> Skipper
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>

