[SciPy-User] how to fit a given pdf
josef.pktd@gmai...
josef.pktd@gmai...
Wed Aug 11 13:15:56 CDT 2010
On Wed, Aug 11, 2010 at 1:48 PM, Benjamin Root <ben.root@ou.edu> wrote:
> On Wed, Aug 11, 2010 at 11:14 AM, Jonathan Stickel <jjstickel@vcn.com>
> wrote:
>>
>> On 8/11/10 09:00 , scipy-user-request@scipy.org wrote:
>> > Date: Wed, 11 Aug 2010 12:00:22 -0300
>> > From: Renato Fabbri<renato.fabbri@gmail.com>
>> > Subject: [SciPy-User] how to fit a given pdf
>> > To: Discussion of Numerical Python<numpy-discussion@scipy.org>,
>> > scipy-user@scipy.org
>> > Message-ID:
>> > <AANLkTinmmEbSo9_-ZmCXDzqDXLZb1fM8zFVoM-gnHo+c@mail.gmail.com>
>> > Content-Type: text/plain; charset="iso-8859-1"
>> >
>> > Dear All,
>> >
>> > help appreciated, thanks in advance.
>> >
>> > how do you fit a pdf you have with a given pdf (say gamma).
>> >
>> > with the file attached, you can go like:
>> >
>> > a=open("AC-010_ED-1m37F100P0.txt","rb")
>> > aa=a.read()
>> > aaa=aa[1:-1].split(",")
>> > data=[int(i) for i in aaa]
>> >
>> > if you do pylab.plot(data); pylab.show()
>> >
>> > The data is something like:
>> > ___|\___
>> >
>> > It is my pdf (probability density function).
>> >
>> > how can i find the right parameters to make that fit with a gamma?
>> >
>> > if i was looking for a normal pdf, for example, i would just find mean
>> > and std and ask for the pdf.
>> >
>> > i've been playing with scipy.stats.distributions.gamma but i have not
>> > reached anything.
>> >
>> > we can extend the discussion further, but this is a good starting point.
>> >
>> > any idea?
>> >
>>
>> I am not familiar with the scipy.stats module, and so I do not know what
>> it can do for you. However, I would just generate a model gamma
>> distribution from the mean and variance, just as for a normal
>> distribution. The gamma distribution equation can be written as
>>
>> p(x) = p0/(b^a*Gamma(a))*x^(a-1)*exp(-x/b)
>>
>> assuming x starts at zero (x>=0)
>> (http://en.wikipedia.org/wiki/Gamma_distribution). Then the the
>> parameters a and b are related to the mean and variance by
>>
>> a = mean^2/var
>> b = var/mean
>>
>> (I did the algebra quickly just now, and so you might want to
>> double-check). p0 is the area under the distribution and may be simply
>> 1 if your distribution is normalized.
>>
>> Hope this helps.
>>
>> Jonathan
>
> Note, those are the point estimators for gamma distribution, but it does not
> use maximum likeihood estimation (MLE). Using MLE, finding the estimator
> for alpha is tricky, requiring convergence. Probably best to use the fit
> module as Josef mentioned. However, there is a decent estimator for the
> estimator (I heard you liked estimation...)
>
> a = 0.5 / (ln(mean(X)) - mean(ln(X)))
>
> Note that if you have a zero value in your dataset this won't work. Also,
> the reference is not 100% clear if it is a base 10 log or not. I am pretty
> certain it is a natural log.
Travis also added a different estimator in
http://projects.scipy.org/scipy/browser/trunk/scipy/stats/distributions.py#L2867
I never looked specifically at gamma, only as example for generic MLE,
and have no idea what this estimator is.
Ben, do you have the reference or know what type of estimators your's is?
(I like references, the R package POT has 17 estimators for genpareto,
and I have 20 pdf files for pareto/genpareto/genextreme open)
Josef
>
> Ben Root
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>
More information about the SciPy-User
mailing list