[SciPy-User] MLE with stats.lognorm
Mon Oct 10 18:35:54 CDT 2011
On Mon, Oct 10, 2011 at 3:22 PM, <email@example.com> wrote:
> On Mon, Oct 10, 2011 at 10:26 AM, Christian K. <firstname.lastname@example.org> wrote:
>>> >> for example with starting value for loc
>>> >> >>> print stats.lognorm.fit(x, loc=0)
>>> >> (0.23800805074491538, 0.034900026034516723, 196.31113801786194)
>>> > I see. Is there any workaround/patch to force loc=0.0? What is the
>>> > meaning of loc anyway?
>>> loc is the starting value for fmin. I don't remember how to specify
>>> starting values for shape parameters; I have never used it.
>>> As in the ticket you could monkey patch the _fitstart function
>>> >>> stats.cauchy._fitstart = lambda x:(0,1)
>>> >>> stats.cauchy.fit(x)
>>> or what I do to experiment with starting values is
>>> stats.distributions.lognorm_gen._fitstart = fitstart_lognormal
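On the question of starting values for shape parameters, here is a minimal sketch. It assumes a reasonably recent SciPy, where fit() takes starting values for the shape parameter(s) as positional arguments after the data, and loc= / scale= as starting values for location and scale. The seed and the starting values are illustrative choices, not taken from the thread.

```python
import numpy as np
from scipy import stats

# Illustrative sample: shape s=0.25, loc=0, scale=20 (the true values used in the thread).
x = stats.lognorm.rvs(0.25, loc=0.0, scale=20.0, size=2000, random_state=12345)

# Positional arguments after the data are starting values for the shape
# parameter(s). loc= and scale= are starting values only, so all three
# parameters are still estimated by the optimizer.
s_hat, loc_hat, scale_hat = stats.lognorm.fit(x, 0.3, loc=0.0, scale=15.0)
print(s_hat, loc_hat, scale_hat)
```

This avoids monkey patching _fitstart when all that is needed is a different starting point.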
>> Ok, but this is not different from calling fit like
>> stats.lognorm.fit(samples, loc=0.0)
>> I would really need to force loc=0.0
>> stats.lognorm.fit(samples, loc=0.0, floc=0.0)
>> does not work either.
> OK, I had not understood that you want to fix the location parameter at zero.
> This looks like a different bug.
> floc=0 doesn't seem to work: I don't get any results that look close
> to the true values.
> With a sample size of 2000 the MLE should be pretty close to the true values.
This is now http://projects.scipy.org/scipy/ticket/1536
I ran a few more distributions as examples, and my conclusion is: at
this stage, don't trust any results obtained with floc set.
As far as I know, nobody has ever checked the fixed-parameter cases in
the distributions' fit method. Patches welcome.
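One way to check a floc=0 result independently of the optimizer: with the location fixed at zero, log(x) is normally distributed, so the maximum-likelihood estimates have a closed form (s is the standard deviation of log(x) with ddof=0, and scale is exp of the mean of log(x)). The sketch below compares that closed form with stats.lognorm.fit(x, floc=0). Whether the two agree depends on the SciPy version in use. The seed and tolerance are arbitrary choices.

```python
import numpy as np
from scipy import stats

# Same true parameters as in the thread: s=0.25, loc=0, scale=20
x = stats.lognorm.rvs(0.25, loc=0.0, scale=20.0, size=2000, random_state=12345)

# Closed-form MLE with loc fixed at 0: log(x) ~ Normal(mu, sigma**2)
logx = np.log(x)
s_mle = logx.std(ddof=0)          # shape s = sigma
scale_mle = np.exp(logx.mean())   # scale = exp(mu)

# Fit with the location held fixed at zero
s_fit, loc_fit, scale_fit = stats.lognorm.fit(x, floc=0)

print("closed form:", s_mle, 0.0, scale_mle)
print("fit(floc=0):", s_fit, loc_fit, scale_fit)
print("agree:", np.allclose([s_fit, scale_fit], [s_mle, scale_mle], rtol=1e-3))
```

If the last line prints False, the fixed-location fit in that SciPy version should not be trusted, and the closed-form values can be used instead.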
> import numpy as np
> from scipy import stats
> print 'true'
> print 0.25, 0., 20.0
> print 'estimated, floc=0, loc=0'
> for i in range(10):
>     # draw a new sample with shape=0.25, loc=0, scale=20
>     x = stats.lognorm.rvs(0.25, 0., 20.0, size=2000)
>     # left: loc fixed at 0 (floc); right: loc=0 used only as starting value
>     print np.array(stats.lognorm.fit(x, floc=0)), \
>           np.array(stats.lognorm.fit(x, loc=0))
> true
> 0.25 0.0 20.0
> estimated, floc=0, loc=0
> [ 2.1271 0. 2.3999] [ 0.2623 1.0211 18.7911]
> [ 2.1393 0. 2.3952] [ 0.2523 0.0294 20.0117]
> [ 2.1356 0. 2.3978] [ 0.2477 0.03 19.9703]
> [ 2.1378 0. 2.3874] [ 0.2496 0.0301 19.9231]
> [ 2.1463 0. 2.3641] [ 0.2474 0.0292 19.9051]
> [ 2.1408 0. 2.3898] [ 0.2459 0.0303 20.0118]
> [ 2.1252 0. 2.4326] [ 0.251 0.029 20.0412]
> [ 2.1296 0. 2.3943] [ 0.2476 0.0296 19.8208]
> [ 2.1344 0. 2.401 ] [ 0.2472 0.0299 19.9744]
> [ 2.1383 0. 2.4133] [ 0.247 0.0301 20.1544]
> floc=0 is supposed to fix the location at 0; loc=0 only provides a
> starting value for loc, which is still estimated.
>> Btw., I think the extradoc is quite misleading:
> I think this might just be the non-standard parameterization of the
> log-normal distribution, which arises because we use generic loc and
> scale handling.
> The parameterization has been discussed on the mailing list and, for
> example, in http://projects.scipy.org/scipy/ticket/1502
> Clearer documentation for this, or a reparameterized distribution, would
> be helpful for lognorm.
>> lognorm.pdf(x,s) = 1/(s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2)
>> for x > 0, s > 0.
>> If log x is normally distributed with mean mu and variance sigma**2,
>> then x is log-normally distributed with shape parameter sigma and scale
>> parameter exp(mu).
>> sigma seems to equal s in the function definition, but mu does not appear
>> at all. It seems to enter via _pdf()/scale when looking at distributions.py,
>> where scale = exp(mu)?
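The relationship asked about here can be checked numerically. A small sketch, assuming the textbook log-normal density with parameters mu and sigma; the values of mu and sigma and the grid are illustrative. It compares the textbook formula with stats.lognorm.pdf called with s=sigma and scale=exp(mu), and also checks the generic loc/scale rule pdf(x, s, loc, scale) = pdf((x - loc)/scale, s)/scale.

```python
import numpy as np
from scipy import stats

mu, sigma = np.log(20.0), 0.25      # illustrative values
x = np.linspace(5.0, 60.0, 50)

# Textbook density of x when log(x) ~ Normal(mu, sigma**2)
pdf_textbook = (np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2))
                / (x * sigma * np.sqrt(2 * np.pi)))

# SciPy parameterization: shape s = sigma, loc = 0, scale = exp(mu)
pdf_scipy = stats.lognorm.pdf(x, sigma, loc=0.0, scale=np.exp(mu))

# Generic loc/scale handling applied to the standardized pdf from the docstring
scale = np.exp(mu)
pdf_generic = stats.lognorm.pdf(x / scale, sigma) / scale

print(np.allclose(pdf_textbook, pdf_scipy))
print(np.allclose(pdf_scipy, pdf_generic))
```

So mu does not appear in the docstring formula because that formula is the standardized form (loc=0, scale=1, that is mu=0); mu enters only through scale = exp(mu).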