[SciPy-User] scipy.stats.fit inquiry

Anne Archibald peridot.faceted@gmail....
Mon Oct 19 22:53:16 CDT 2009


2009/10/19 Leon Adams <leon_r_adams@hotmail.com>:

> I am using the scipy.stats module to perform some distribution fitting. What I
> cannot seem to get a handle on is how to compare the quality of fit
> achieved. At this stage the docs do not seem to be quite as useful... As
> an example, I fit my data using
>
>
> fitExp = st.expon.fit(data)
>
> which returns an array [ 0.99999999  1.33310547]
>
> How do we access the resulting maximized likelihood, mean square errors ...
> Also, how would we go about calculating KS tests for the fitted parameters?
> Mainly I am interested in how good this fit is, and what diagnostics we
> have available.

I'm not sure what tools we have in scipy, but there's always the
everything-looks-like-a-nail approach: fit for the parameters, then
use the fitted distribution to generate many data sets and see how
many of them are a better fit than yours.

We do have a K-S test, which would serve as a reasonable way to answer
"how well does this data fit this distribution". The p value you get
will be wrong if you obtained the distribution by fitting it to the
same data (it will come out too large, since the fit has already pulled
the distribution toward the data), but the K-S statistic will still be a
reasonable measure of quality-of-fit (which you can compare to the
quality of models fit to generated data sets). The scatter in model
parameters obtained by fitting generated data sets will give you an
estimate of the uncertainties on the fitted parameters.
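
As a rough sketch of the simulate-and-refit idea described above (not a
scipy built-in): the data here are synthetic stand-ins for yours, and the
names n_sim, d_sim, params_sim, p_boot and param_err are made up for the
example. It assumes a scipy recent enough that rvs() accepts a numpy
Generator as random_state.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Stand-in data (assumption: replace with your own sample).
data = stats.expon.rvs(loc=1.0, scale=1.3, size=200, random_state=rng)

# Fit, then measure how far the data are from the fitted distribution.
loc, scale = stats.expon.fit(data)
d_obs = stats.kstest(data, "expon", args=(loc, scale)).statistic

# Generate data sets from the fitted model, refit each one, and record
# both the refitted parameters and the resulting K-S statistic.
n_sim = 500
d_sim = np.empty(n_sim)
params_sim = np.empty((n_sim, 2))
for i in range(n_sim):
    fake = stats.expon.rvs(loc=loc, scale=scale, size=data.size,
                           random_state=rng)
    params_sim[i] = stats.expon.fit(fake)
    d_sim[i] = stats.kstest(fake, "expon",
                            args=tuple(params_sim[i])).statistic

# Fraction of simulated data sets that fit at least as badly as the real one.
p_boot = np.mean(d_sim >= d_obs)

# Scatter of the refitted (loc, scale) as a rough parameter uncertainty.
param_err = params_sim.std(axis=0)

print("K-S statistic:", d_obs)
print("simulated p value:", p_boot)
print("loc, scale uncertainties:", param_err)
```

A small p_boot means few simulated data sets fit as badly as yours, which
is evidence against the model. Refitting inside the loop is what makes
this comparison fair, since your own fit was tuned to your data too.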

For smarter approaches, for example Cash statistics, I'm not sure
whether scipy has anything more sophisticated, but at least scipy's
distributions will give you PDFs you can take negative logs of.
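
For instance, the maximized likelihood asked about above can be recovered
by summing log-PDF values at the fitted parameters. This is only a
sketch on synthetic stand-in data, and the comparison against a normal
fit is an illustrative choice, not something from the original question.
It is not a Cash statistic, just the plain negative log-likelihood.

```python
import numpy as np
from scipy import stats

# Stand-in data (assumption: replace with your own sample).
data = stats.expon.rvs(loc=1.0, scale=1.3, size=200, random_state=0)

# Negative log-likelihood of the fitted exponential.
loc, scale = stats.expon.fit(data)
nll_expon = -np.sum(stats.expon.logpdf(data, loc=loc, scale=scale))

# Same quantity for a fitted normal, as a point of comparison.
mu, sigma = stats.norm.fit(data)
nll_norm = -np.sum(stats.norm.logpdf(data, loc=mu, scale=sigma))

# Lower is better; both models have two fitted parameters here.
print("expon:", nll_expon)
print("norm: ", nll_norm)
```

If the candidate models had different numbers of parameters, the raw
values would need some penalty for model size before being compared.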

Anne

