[SciPy-User] how to fit a given pdf

josef.pktd@gmai... josef.pktd@gmai...
Wed Aug 11 13:49:25 CDT 2010


On Wed, Aug 11, 2010 at 2:29 PM, Benjamin Root <ben.root@ou.edu> wrote:
> On Wed, Aug 11, 2010 at 1:15 PM, <josef.pktd@gmail.com> wrote:
>>
>> On Wed, Aug 11, 2010 at 1:48 PM, Benjamin Root <ben.root@ou.edu> wrote:
>> > On Wed, Aug 11, 2010 at 11:14 AM, Jonathan Stickel <jjstickel@vcn.com>
>> > wrote:
>> >>
>> >> On 8/11/10 09:00 , scipy-user-request@scipy.org wrote:
>> >> > Date: Wed, 11 Aug 2010 12:00:22 -0300
>> >> > From: Renato Fabbri<renato.fabbri@gmail.com>
>> >> > Subject: [SciPy-User] how to fit a given pdf
>> >> > To: Discussion of Numerical Python<numpy-discussion@scipy.org>,
>> >> >       scipy-user@scipy.org
>> >> > Message-ID:
>> >> >       <AANLkTinmmEbSo9_-ZmCXDzqDXLZb1fM8zFVoM-gnHo+c@mail.gmail.com>
>> >> > Content-Type: text/plain; charset="iso-8859-1"
>> >> >
>> >> > Dear All,
>> >> >
>> >> > help appreciated, thanks in advance.
>> >> >
>> >> > how do you fit a pdf you have with a given pdf (say gamma).
>> >> >
>> >> > with the file attached, you can go like:
>> >> >
>> >> > a=open("AC-010_ED-1m37F100P0.txt","rb")
>> >> > aa=a.read()
>> >> > aaa=aa[1:-1].split(",")
>> >> > data=[int(i) for i in aaa]
>> >> >
>> >> > if you do pylab.plot(data); pylab.show()
>> >> >
>> >> > The data is something like:
>> >> > ___|\___
>> >> >
>> >> > It is my pdf (probability density function).
>> >> >
>> >> > how can i find the right parameters to make that fit with a gamma?
>> >> >
>> >> > if i was looking for a normal pdf, for example, i would just find
>> >> > mean
>> >> > and std and ask for the pdf.
>> >> >
>> >> > i've been playing with scipy.stats.distributions.gamma but i have not
>> >> > reached anything.
>> >> >
>> >> > we can extend the discussion further, but this is a good starting
>> >> > point.
>> >> >
>> >> > any idea?
>> >> >
>> >>
>> >> I am not familiar with the scipy.stats module, and so I do not know
>> >> what
>> >> it can do for you.  However, I would just generate a model gamma
>> >> distribution from the mean and variance, just as for a normal
>> >> distribution.  The gamma distribution equation can be written as
>> >>
>> >> p(x) = p0/(b^a*Gamma(a))*x^(a-1)*exp(-x/b)
>> >>
>> >> assuming x starts at zero (x>=0)
>> >> (http://en.wikipedia.org/wiki/Gamma_distribution).  Then the the
>> >> parameters a and b are related to the mean and variance by
>> >>
>> >> a = mean^2/var
>> >> b = var/mean
>> >>
>> >> (I did the algebra quickly just now, and so you might want to
>> >> double-check).  p0 is the area under the distribution and may be simply
>> >> 1 if your distribution is normalized.
>> >>
>> >> Hope this helps.
>> >>
>> >> Jonathan
>> >
>> > Note, those are the point estimators for gamma distribution, but it does
>> > not
>> > use maximum likeihood estimation (MLE).  Using MLE, finding the
>> > estimator
>> > for alpha is tricky, requiring convergence.  Probably best to use the
>> > fit
>> > module as Josef mentioned.  However, there is a decent estimator for the
>> > estimator (I heard you liked estimation...)
>> >
>> > a = 0.5 / (ln(mean(X)) - mean(ln(X)))
>> >
>> > Note that if you have a zero value in your dataset this won't work.
>> > Also,
>> > the reference is not 100% clear if it is a base 10 log or not.  I am
>> > pretty
>> > certain it is a natural log.
>>
>> Travis also added a different estimator in
>>
>> http://projects.scipy.org/scipy/browser/trunk/scipy/stats/distributions.py#L2867
>> I never looked specifically at gamma, only as example for generic MLE,
>> and have no idea what this estimator is.
>>
>> Ben, do you have the reference or know what type of estimators your's is?
>>
>> (I like references, the R package POT has 17 estimators for genpareto,
>> and I have 20 pdf files for pareto/genpareto/genextreme open)
>>
>> Josef
>>
>
> Josef,
>
> The reference for that particular estimator is
> http://research.microsoft.com/en-us/um/people/minka/papers/minka-gamma.pdf

thanks, that also describes the current scipy trunk estimator.

Josef

>
> There is another estimator that I have used that came from a very good
> 'statistics for meteorologists' book, but I don't have that on me.  I can
> get it to you tomorrow, though.
>
> Ben Root
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
>


More information about the SciPy-User mailing list