[SciPy-User] Need help to use rv_continuous.fit
Wed Dec 5 10:44:53 CST 2012
On Wed, Dec 5, 2012 at 10:11 AM, Sepideh Rastin <firstname.lastname@example.org> wrote:
> Hi there,
> I have histograms that will form the likelihood function and need to
> find the best normal distributions.
> I would like to know if it would be possible to create my input array to
> rv_continuous.fit using the data of my histograms.
No, you cannot use the histogram for fit() directly. stats.norm.fit()
expects the original data with individual observations. It's possible
to create an artificial dataset by repeating each bin center as many
times as its bin count, i.e. np.repeat(bin_centers, bin_counts) or
something like this.
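A minimal sketch of the np.repeat idea, assuming the histogram is available as integer counts plus bin edges. The numbers below are made up for illustration and are not from the original question:

```python
import numpy as np
from scipy import stats

# Hypothetical histogram: 5 bins of width 1 with integer counts.
bin_edges = np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])
bin_counts = np.array([6, 24, 40, 24, 6])

# Represent every observation in a bin by that bin's center.
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
pseudo_data = np.repeat(bin_centers, bin_counts)

# fit() now sees one value per (pseudo-)observation.
loc, scale = stats.norm.fit(pseudo_data)
print(loc, scale)
```

np.repeat needs integer counts, so a normalized (density) histogram would first have to be converted back to counts.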
Fitting a normal distribution is "boring": the sample mean and standard
deviation are the estimates of loc and scale.
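As a quick check of that statement, the sketch below compares stats.norm.fit() with the plain sample statistics on simulated data (the sample itself is an assumption for illustration). The fit is a maximum likelihood estimate, so the matching standard deviation is the one with ddof=0:

```python
import numpy as np
from scipy import stats

# Simulated observations, for illustration only.
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=1000)

loc, scale = stats.norm.fit(data)

# Maximum likelihood estimates coincide with mean and std (ddof=0).
print(np.isclose(loc, data.mean()))
print(np.isclose(scale, data.std(ddof=0)))
```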
There are several scripts on the web (scipy cookbook, scipy central,
... ?) that show how to fit a normal pdf directly to a histogram.
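One way such a direct fit could look is a least squares fit of the normal pdf to the density-normalized bin heights with scipy.optimize.curve_fit. This is only a sketch on simulated data, not one of the scripts referred to above; the helper normal_pdf and the starting values are assumptions:

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

# Simulated data, only used to produce a histogram for the example.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=5000)
density, bin_edges = np.histogram(data, bins=30, density=True)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

def normal_pdf(x, loc, scale):
    # Model evaluated at the bin centers (hypothetical helper).
    return stats.norm.pdf(x, loc=loc, scale=scale)

# Starting values taken from the histogram itself.
loc0 = np.average(bin_centers, weights=density)
scale0 = np.sqrt(np.average((bin_centers - loc0) ** 2, weights=density))

# Bounds keep scale positive during the search.
popt, pcov = curve_fit(
    normal_pdf, bin_centers, density,
    p0=[loc0, scale0],
    bounds=([-np.inf, 1e-8], [np.inf, np.inf]),
)
loc_hat, scale_hat = popt
print(loc_hat, scale_hat)
```

Evaluating the pdf at the bin centers treats the bin height as a point value, which is reasonable only when the bins are narrow relative to scale.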
If you have only a small number of bins, then the np.repeat approach
above will cause a discretization bias (reference ?).
In this case, I would fit either the histogram or the cumulative
histogram to the discrete probabilities from the discretization, for
example something like minimizing
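    from scipy import stats

    def fun(cumhist, bin_edges, loc, scale):
        # drop the first edge so the cdf lines up with cumhist (one value per bin)
        diff = cumhist - stats.norm.cdf(bin_edges[1:], loc=loc, scale=scale)
        return (diff**2).sum()  # sum of squares, one possible objective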
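The cumulative histogram idea could be completed roughly as follows. This is a sketch under several assumptions not stated in the post: the objective is the sum of squared differences, cumhist is normalized to end at 1, the first bin edge is dropped so that each cdf value matches the upper edge of a bin, and scale is optimized on the log scale to keep it positive. The data are simulated:

```python
import numpy as np
from scipy import stats, optimize

# Simulated data reduced to a coarse histogram with few bins.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=5000)
counts, bin_edges = np.histogram(data, bins=6)
cumhist = np.cumsum(counts) / counts.sum()  # one value per bin, ends at 1

def fun(params, cumhist, bin_edges):
    loc, log_scale = params
    scale = np.exp(log_scale)  # guarantees scale > 0
    # Compare at the upper edge of each bin (first edge dropped).
    diff = cumhist - stats.norm.cdf(bin_edges[1:], loc=loc, scale=scale)
    return np.sum(diff ** 2)

# Starting values from the bin centers (the np.repeat-style estimate).
centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
loc0 = np.average(centers, weights=counts)
scale0 = np.sqrt(np.average((centers - loc0) ** 2, weights=counts))

res = optimize.minimize(
    fun, x0=[loc0, np.log(scale0)],
    args=(cumhist, bin_edges), method="Nelder-Mead",
)
loc_hat, scale_hat = res.x[0], np.exp(res.x[1])
print(loc_hat, scale_hat)
```

The last element of cumhist is exactly 1 while the cdf at the last edge is slightly below 1, so that term could also be dropped if the tails matter.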
> Kind regards,