[SciPy-User] Need help to use rv_continuous.fit

josef.pktd@gmai... josef.pktd@gmai...
Wed Dec 5 10:44:53 CST 2012


On Wed, Dec 5, 2012 at 10:11 AM, Sepideh Rastin <sepideh@isc.ac.uk> wrote:
> Hi there,
> I have histograms that will form the likelihood function and need to
> find the best normal distributions.
> I would like to know if it would be possible to create my input array to
> rv_continuous.fit using the data of my histograms.

Several answers:

No, you cannot use the histogram for fit() directly. stats.norm.fit()
expects the original data with individual observations. It's possible
to create an artificial dataset by generating observations from
the histogram, e.g.
np.repeat(bin_centers, bin_counts) or something like this.
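
A minimal sketch of the np.repeat idea, assuming the histogram is
available as counts plus bin edges. The numbers and the names
bin_edges, bin_counts and pseudo_data are made up for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical histogram: bin edges and the number of observations per bin.
bin_edges = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
bin_counts = np.array([2, 14, 34, 34, 14, 2])

# Place every observation of a bin at the bin center.
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
pseudo_data = np.repeat(bin_centers, bin_counts)

# Maximum likelihood estimates of loc and scale from the artificial data.
loc, scale = stats.norm.fit(pseudo_data)
print(loc, scale)
```

bin_counts has to be an integer array here, since np.repeat needs
integer repeat counts.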

Fitting a normal distribution is "boring": the sample mean and standard
deviation are the estimates of loc and scale.
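
Because of this, the artificial dataset does not have to be built at
all: the count-weighted mean and standard deviation of the bin centers
give the same numbers. A rough sketch, with hypothetical bin_centers
and bin_counts, using the divide-by-n (maximum likelihood) standard
deviation:

```python
import numpy as np

# Hypothetical histogram given as bin centers and counts.
bin_centers = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])
bin_counts = np.array([2, 14, 34, 34, 14, 2])

n = bin_counts.sum()

# Count-weighted mean and (divide-by-n) standard deviation.
loc = np.sum(bin_counts * bin_centers) / n
scale = np.sqrt(np.sum(bin_counts * (bin_centers - loc) ** 2) / n)

# Same result as expanding the histogram into individual observations.
expanded = np.repeat(bin_centers, bin_counts)
print(loc, expanded.mean())
print(scale, expanded.std())
```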

There are several scripts on the web (scipy cookbook, scipy central,
... ?) that show how to fit a normal pdf directly to a histogram.
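
One way such a script might look, as a sketch only and not taken from
any of those sources: least squares of the normal pdf against a
density-normalized histogram with scipy.optimize.curve_fit. The data
are simulated, and normal_pdf, loc0 and scale0 are names made up here:

```python
import numpy as np
from scipy import optimize, stats

# Simulated data, only to have a histogram to work with.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=5000)

# density=True makes the bar heights comparable to a pdf.
density, bin_edges = np.histogram(data, bins=30, density=True)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

def normal_pdf(x, loc, scale):
    return stats.norm.pdf(x, loc=loc, scale=scale)

# Starting values from the weighted moments of the histogram.
loc0 = np.average(bin_centers, weights=density)
scale0 = np.sqrt(np.average((bin_centers - loc0) ** 2, weights=density))

(loc, scale), pcov = optimize.curve_fit(
    normal_pdf, bin_centers, density, p0=[loc0, scale0]
)
print(loc, scale)
```

Starting close to the moment estimates keeps the optimizer away from
negative scale values, for which the pdf is not defined.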

If you have only a small number of bins, then using the above will
cause a discretization bias (reference ?).
In this case, I would fit either the histogram or the cumulative
histogram to the discrete probabilities from the discretization, for
example something like minimizing
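from scipy import stats

def fun(cumhist, bin_edges, loc, scale):
    # cumhist has one value per bin (normalized to end at 1), so drop the
    # first edge and compare with the cdf at the right edge of each bin
    diff = cumhist - stats.norm.cdf(bin_edges[1:], loc=loc, scale=scale)
    return (diff**2).sum()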
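
A sketch of how this could be run end to end, assuming the cumulative
histogram is normalized to end at 1 and is compared with the cdf at the
right edge of each bin. The histogram values, the name sse, the
starting point and the choice of Nelder-Mead are all assumptions for
illustration:

```python
import numpy as np
from scipy import optimize, stats

# Hypothetical histogram with only a few bins.
bin_edges = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
bin_counts = np.array([2, 14, 34, 34, 14, 2])

# Empirical cdf evaluated at the right edge of each bin.
cumhist = np.cumsum(bin_counts) / bin_counts.sum()

def sse(params, cumhist, bin_edges):
    loc, scale = params
    # Drop the first edge so both arrays have one entry per bin.
    model = stats.norm.cdf(bin_edges[1:], loc=loc, scale=scale)
    return np.sum((cumhist - model) ** 2)

res = optimize.minimize(
    sse, x0=[0.5, 2.0], args=(cumhist, bin_edges), method="Nelder-Mead"
)
loc, scale = res.x
print(loc, scale)
```

The optimizer wants the parameters packed into one argument, which is
why loc and scale are passed as params here instead of separately.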

Josef





> Kind regards,
> Sep