[SciPy-User] adding distributions from hydroclimpy to stats.distributions
Pierre GM
pgmdevlist@gmail....
Sun Aug 2 14:52:56 CDT 2009
On Aug 2, 2009, at 9:41 AM, josef.pktd@gmail.com wrote:
>
> I looked briefly at the distributions in hydroclimpy
> http://projects.scipy.org/scikits/browser/trunk/hydroclimpy/scikits/hydroclimpy/stats/extradistributions.py
>
> my first impression:
>
> kappa, glogistic, gennorm and wakeby
> can be added almost without changes to stats distributions, since they
> are already in the standard format
> cosmetic changes: add longname and extradocs (from module docstring)
Agreed. I still have an issue about defining a proper template for
describing the distributions (eqns for pdf/cdf/ppf, example of usage,
plots...), hence the nudge. What are our doc exegetes' recommendations ?
> pearson3
> This one overwrites the main public methods, pdf, cdf, ...,
> Can this be rewritten to define only the private, distribution
> specific methods, _cdf, _pdf, or is there a special reason for the
> public methods?
Depending on the values of the parameters, Pearson III can reduce to a
normal. Overwriting .pdf and .cdf was IMHO more efficient than trying
to stick to the _pdf/_cdf methods. The same problem arises when a
distribution reduces to another in some particular cases.
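
For comparison, here is a minimal sketch of how the degenerate case could
be handled inside the private _pdf method alone, leaving the public pdf
untouched. It is not the hydroclimpy code: the class name, the
NORMAL_CUTOFF tolerance, and the standardized (mean 0, variance 1)
parametrization by a single `skew` shape are all assumptions made for
illustration.

```python
import numpy as np
from scipy import stats

# Assumed tolerance below which the skew is treated as exactly zero.
NORMAL_CUTOFF = 1e-5


class Pearson3Sketch(stats.rv_continuous):
    """Hypothetical standardized Pearson III with one shape, `skew`."""

    def _argcheck(self, skew):
        # Accept any finite skew, including zero and negative values.
        return np.isfinite(skew)

    def _pdf(self, x, skew):
        x, skew = np.broadcast_arrays(
            np.atleast_1d(np.asarray(x, dtype=float)),
            np.atleast_1d(np.asarray(skew, dtype=float)),
        )
        out = np.empty(x.shape)

        # Degenerate case: (near-)zero skew reduces to the standard normal.
        is_normal = np.abs(skew) < NORMAL_CUTOFF
        out[is_normal] = stats.norm.pdf(x[is_normal])

        # General case: a shifted, scaled (possibly reflected) gamma density.
        s = skew[~is_normal]
        beta = 2.0 / s
        alpha = beta ** 2
        zeta = -alpha / beta
        out[~is_normal] = np.abs(beta) * stats.gamma.pdf(
            beta * (x[~is_normal] - zeta), alpha
        )
        return out


pearson3_sketch = Pearson3Sketch(name="pearson3_sketch")
```

The cost of this route is the masking on every call; whether that is
slower than overriding pdf/cdf outright would have to be measured.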
> ztnbinom and logseries look like duplicates of stats.nbinom and
> stats.logser
> ztnbinom uses a different way to calculate stats
ztnbinom is the zero-truncated negative binomial distribution, a
particular case of the negative binomial where support is restricted
to integers greater than or equal to 1 (no zero class). Yes, the stats are
slightly different because of the truncation. Similarly, we can define
a zero-inflated Poisson.
I considered developing a generic trunc_dist class from rv_discrete to
handle arbitrary truncation, but realized that the scope was too large
for me to handle, and I already have more than enough on my plate(s)
for now.
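
To make the zero truncation concrete, here is a rough sketch built on
stats.nbinom rather than on the hydroclimpy implementation. The function
names are hypothetical; the only idea used is that removing the zero
class rescales the remaining probabilities by 1 / (1 - P(X = 0)).

```python
import numpy as np
from scipy import stats


def zt_nbinom_pmf(k, n, p):
    """pmf of a negative binomial conditioned on k >= 1 (illustrative)."""
    k = np.asarray(k)
    p0 = stats.nbinom.pmf(0, n, p)  # probability of the removed zero class
    pmf = stats.nbinom.pmf(k, n, p) / (1.0 - p0)
    return np.where(k >= 1, pmf, 0.0)


def zt_nbinom_mean(n, p):
    """Mean after truncation: the untruncated mean rescaled by 1 / (1 - p0)."""
    p0 = stats.nbinom.pmf(0, n, p)
    return stats.nbinom.mean(n, p) / (1.0 - p0)
```

The same rescaling shows why the stats differ from those of nbinom: every
raw moment is divided by (1 - p0), so the variance and higher central
moments do not simply carry over.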
> logseries adds a fit function
> Is there a difference that I'm missing after my only brief look?
I had overlooked the logser distribution (silly me). Adding the fit
method is required for my own applications (analyzing dry/wet spells
distributions). I'm about to add fit methods to other discrete
distribution as I need them.
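
As an illustration of what such a fit could look like for the log-series
case, here is a sketch of a maximum-likelihood estimate. It assumes the
usual parametrization of stats.logser (a single p in (0, 1)) and is not
necessarily the method used in hydroclimpy; fit_logser is a hypothetical
name. For this distribution the likelihood equation reduces to matching
the sample mean to the theoretical mean -p / ((1 - p) log(1 - p)), which
is solved numerically here.

```python
import numpy as np
from scipy import optimize, stats


def fit_logser(data):
    """Estimate the log-series parameter p from positive integer counts."""
    xbar = np.mean(data)
    if xbar <= 1.0:
        # The theoretical mean exceeds 1 for every p in (0, 1).
        raise ValueError("sample mean must be greater than 1")

    def mean_gap(p):
        # Theoretical mean minus sample mean; increasing in p.
        return -p / ((1.0 - p) * np.log1p(-p)) - xbar

    return optimize.brentq(mean_gap, 1e-12, 1.0 - 1e-12)


# Usage on simulated data (seed chosen arbitrarily).
sample = stats.logser.rvs(0.7, size=20000, random_state=0)
p_hat = fit_logser(sample)
```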
> I don't know anything about L moments and only briefly looked up the
> definitions. Is there a generic method, that works (reasonably well)
> for all distributions?
L-moments are defined for continuous distributions only. You can find
a nice description of their definition and use here:
http://www.research.ibm.com/people/h/hosking/lmoments.html
In short, they tend to be more robust than the classical moments. The
fact that the L-kurtosis and L-skewness lie in the interval [-1;+1]
simplifies the comparisons between different distributions when trying
to define the most adequate one.
L-moments of some specific distributions have an explicit formulation
that can help estimating the parameters of these distributions (hence
the whole lmoments.py module).
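
For readers new to L-moments, here is a short sketch of the standard
sample estimates computed from probability-weighted moments, followed by
one example of an explicit formulation (the Gumbel distribution, whose
first two L-moments are loc + euler_gamma * scale and scale * log(2)).
The function names are hypothetical and this is not the lmoments.py code.

```python
import numpy as np


def sample_lmoments(data):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    if n < 4:
        raise ValueError("need at least 4 observations")
    i = np.arange(n, dtype=float)  # zero-based rank of each ordered value

    # Unbiased probability-weighted moments b0..b3.
    b0 = x.mean()
    b1 = np.sum(i * x) / (n * (n - 1))
    b2 = np.sum(i * (i - 1) * x) / (n * (n - 1) * (n - 2))
    b3 = np.sum(i * (i - 1) * (i - 2) * x) / (n * (n - 1) * (n - 2) * (n - 3))

    # L-moments are fixed linear combinations of the b's.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2


def gumbel_from_lmoments(l1, l2):
    """Invert the Gumbel L-moment formulas to get (loc, scale)."""
    scale = l2 / np.log(2.0)
    loc = l1 - np.euler_gamma * scale
    return loc, scale
```

Given a sample, gumbel_from_lmoments(*sample_lmoments(sample)[:2]) would
give location and scale estimates without any iterative optimization.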
> I assume the main work would be to make sure that adding a new method
> would work with all distributions. I would gladly review a patch, but
> I don't have the time to do the integration into stats.distributions
> and the testing myself.
OK, how about we keep them on the back burner for now? Hopefully I'll
have more time to deal with polishing the docs and adding more tests
soon. I advertised these new distributions primarily to let
other users know that they're already implemented somewhere, and to
illustrate the need for a doc template.