[SciPy-Dev] Subversion scipy.stats irregular problem with source code example

Per.Brodtkorb@f... Per.Brodtkorb@f...
Tue Oct 12 03:17:11 CDT 2010


On Mon, Oct 11, 2010 at 4:24 PM, James Phillips <zunzun@zunzun.com> wrote:
> On Mon, Oct 11, 2010 at 10:10 AM,  <josef.pktd@gmail.com> wrote:
>>
>> typo should be p4
>
> Oops - thank you.
>
>
>> If I remember correctly, you have observations that are too close to
>> the upper boundary.
>>
>> If you have an observation at the boundary, the loglikelihood is inf
>>
>> I think in these cases you have to keep the boundary of the support
>> away from the max and min of the data. Similar in other distributions,
>> as I mentioned before.
>
> Thank you.
>
>
>> If MLE doesn't work for a distribution then a global optimizer
>> wouldn't help either. In these cases, usually another estimation
>> method is recommended in the literature. For example matching
>> quantiles similar to your initial version.
>

The maximum product of spacings (MPS) method is a general method of estimating parameters in
continuous univariate distributions that in many cases solves this problem.
It is especially suited to cases where one of the parameters is an unknown
shifted origin. This occurs, for example, in the three-parameter lognormal, gamma,
Generalized Extreme Value, Generalized Pareto and Weibull models.
For such distributions it is known that maximum likelihood
(ML) estimation can break down because the likelihood is unbounded, and this can
lead to inconsistent estimators.
In particular, MPS has been shown to give consistent estimators with asymptotic efficiency
equal to that of ML estimators when these exist. Moreover, it gives consistent, asymptotically
efficient estimators in situations where ML fails.
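
To make the idea concrete, here is a minimal sketch of MPS estimation for a three-parameter Weibull, written against the public scipy.stats and scipy.optimize APIs. It is not the code from distributions_per.py: the helper name neg_log_product_spacings, the simulated data, the starting values and the choice of Nelder-Mead are all assumptions made for illustration. The objective is the negative sum of the logs of the n + 1 spacings of the fitted CDF at the sorted observations.

```python
import numpy as np
from scipy import optimize, stats


def neg_log_product_spacings(params, data, dist):
    """Negative log product of spacings (hypothetical helper, for illustration)."""
    shape, loc, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf
    x = np.sort(np.asarray(data, dtype=float))
    cdf = dist.cdf(x, shape, loc=loc, scale=scale)
    # Pad with F = 0 and F = 1 so that n observations give n + 1 spacings.
    spacings = np.diff(np.concatenate(([0.0], cdf, [1.0])))
    # A zero spacing (e.g. loc at or above the smallest observation, or ties)
    # makes the product zero, so treat it as an infeasible point.
    if np.any(spacings <= 0):
        return np.inf
    return -np.sum(np.log(spacings))


# Simulated data; the "true" parameters are arbitrary illustration values.
rng = np.random.default_rng(0)
data = stats.weibull_min.rvs(1.5, loc=10.0, scale=2.0, size=200, random_state=rng)

# Assumed starting point: origin slightly below the smallest observation.
start = (1.0, data.min() - 0.1 * data.std(), data.std())
result = optimize.minimize(
    neg_log_product_spacings,
    start,
    args=(data, stats.weibull_min),
    method="Nelder-Mead",
)
shape_hat, loc_hat, scale_hat = result.x
print(shape_hat, loc_hat, scale_hat)
```

Unlike the log-likelihood, this objective stays bounded as the origin approaches the smallest observation, because the first spacing then shrinks to zero and the point is rejected. Tied observations also produce zero spacings and would need special handling that this sketch does not attempt.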

Finally, as a by-product of the MPS, a goodness of fit statistic, Moran’s statistic,
is available for evaluating the fit to the selected distribution. 
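
As a rough sketch of how that statistic can be computed (again an illustration, not the implementation linked in this message): Moran's statistic is the negative sum of the log spacings of the fitted CDF, evaluated at the estimated parameters. The function name moran_statistic and the example distributions are assumptions. The reference distribution and critical values are not reproduced here; they are given in the Cheng and Stephens paper.

```python
import numpy as np
from scipy import stats


def moran_statistic(data, frozen_dist):
    """Moran's statistic: minus the sum of log CDF spacings (hypothetical helper)."""
    x = np.sort(np.asarray(data, dtype=float))
    cdf = frozen_dist.cdf(x)
    # n observations give n + 1 spacings between F = 0 and F = 1.
    spacings = np.diff(np.concatenate(([0.0], cdf, [1.0])))
    return -np.sum(np.log(spacings))


rng = np.random.default_rng(1)
sample = stats.norm.rvs(loc=0.0, scale=1.0, size=100, random_state=rng)

# Compare the distribution that generated the sample with a badly shifted one.
m_good = moran_statistic(sample, stats.norm(loc=0.0, scale=1.0))
m_bad = moran_statistic(sample, stats.norm(loc=3.0, scale=1.0))

# Equal spacings of 1 / (n + 1) give the smallest possible value.
lower_bound = (sample.size + 1) * np.log(sample.size + 1)
print(lower_bound, m_good, m_bad)
```

Smaller values mean more even spacings, so a poorly fitting distribution shows up as a large value of the statistic.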

Two years ago I implemented this method + some other enhancements. The source code is available here:
http://code.google.com/p/joepython/source/browse/trunk/joepython/scipystats/enhance/per/distributions_per.py

The parameters are estimated by minimizing the nlogps method of the rv_continuous class.

You will find more details on the method in the following references:

Estimating Parameters in Continuous Univariate Distributions with a Shifted Origin
R. C. H. Cheng and N. A. K. Amin
(http://links.jstor.org/sici?sici=0035-9246%281983%2945%3A3%3C394%3AEPICUD%3E2.0.CO%3B2-N)

A Note on the Estimation of Extreme Value Distributions Using Maximum Product of Spacings
T. S. T. Wong and W. K. Li
Lecture Notes-Monograph Series
Vol. 52, Time Series and Related Topics: In Memory of Ching-Zong Wei (2006), pp. 272-283 
(article consists of 12 pages)
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/20461444

A Goodness-Of-Fit Test Using Moran's Statistic with Estimated Parameters
R. C. H. Cheng and M. A. Stephens
(http://links.jstor.org/sici?sici=0006-3444%28198906%2976%3A2%3C385%3AAGTUMS%3E2.0.CO%3B2-1)

