[SciPy-User] stats-mstats plotting_positions, quantiles
Mon Aug 9 00:14:10 CDT 2010
On Aug 9, 2010, at 12:02 AM, email@example.com wrote:
> I stumbled over plotting positions in mstats while looking at the
> Pareto family of distributions.
> plotting_positions, quantiles in stats.mstats has more options than
> the functions in stats (similar to an ecdf proposal by David Huard)
I followed the example of R on that one, and took the definitions from the Hydrology Handbook. I kinda suspect that David did the same...
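
For readers following along, here is a minimal sketch of that definition for the simplest case (a 1-D ndarray with no missing values). It assumes the formula given in the scipy.stats.mstats.plotting_positions docstring, (i - alpha) / (n + 1 - alpha - beta), with the defaults alpha = beta = 0.4. The helper name plotting_positions_plain is made up for this illustration and is not part of scipy.

```python
import numpy as np

def plotting_positions_plain(x, alpha=0.4, beta=0.4):
    """Plotting positions (i - alpha) / (n + 1 - alpha - beta).

    Hypothetical helper: assumes a 1-D array without NaNs or masked values.
    Results are returned in the original order of the data.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # rank of each observation, 1..n (ties broken by position)
    ranks = np.empty(n)
    ranks[np.argsort(x)] = np.arange(1, n + 1)
    return (ranks - alpha) / (n + 1.0 - alpha - beta)

data = np.array([3.0, 1.0, 4.0, 1.5, 9.0])
pp = plotting_positions_plain(data)
# alpha = beta = 0 gives the i / (n + 1) form
pp_weibull = plotting_positions_plain(data, alpha=0.0, beta=0.0)
```

Changing alpha and beta switches between the usual conventions, which is the extra flexibility the mstats version offers over a fixed formula.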
> Instead of writing plain ndarray versions, I was trying to have a
> common interface for plain ndarrays, ndarrays with nans and masked
> arrays. The implementations are different enough that merging the
> ndarray and masked array version didn't look useful (e.g. masked
> arrays or nans require apply_along_axis).
Because of a variable number of NaNs/missing values in each column...
Mmh, looks like the ndarray case is the special one here.
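
A small sketch of why the missing-value case ends up column by column: each column has its own count of valid entries, so n in the formula differs per column. This is only an illustration using np.apply_along_axis on a NaN-carrying ndarray. The function names are hypothetical and this is not the mstats implementation.

```python
import numpy as np

def _pp_1d_nan(col, alpha=0.4, beta=0.4):
    """Plotting positions for one column, leaving NaN where data is missing."""
    col = np.asarray(col, dtype=float)
    out = np.full(col.shape, np.nan)
    valid = ~np.isnan(col)
    n = int(valid.sum())          # number of valid entries in THIS column
    if n:
        ranks = np.empty(n)
        ranks[np.argsort(col[valid])] = np.arange(1, n + 1)
        out[valid] = (ranks - alpha) / (n + 1.0 - alpha - beta)
    return out

def plotting_positions_nan(x, axis=0, alpha=0.4, beta=0.4):
    """Hypothetical NaN-aware wrapper: apply the 1-D helper along `axis`."""
    x = np.asarray(x, dtype=float)
    return np.apply_along_axis(_pp_1d_nan, axis, x, alpha, beta)

x = np.array([[1.0, 5.0],
              [np.nan, 2.0],
              [3.0, 8.0],
              [2.0, np.nan],
              [np.nan, 1.0]])
# column 0 has 3 valid values, column 1 has 4
pp_cols = plotting_positions_nan(x, axis=0)
```

With a plain ndarray and no NaNs, n is the same for every column and the whole thing can be vectorized, which is what makes that case the special one.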
> Instead I just delegated the messy cases (ma, nans, limit) to
> stats.mstats and only the nice cases go through the plain ndarray
> Main question: Would it be useful to have this delegation in
> scipy.stats so that there is a single entry point for users, or is it
> better to keep the plain ndarray, nan and ma versions separate?
> The pattern could apply to quite a few functions in stats-mstats that
> are too difficult to merge.
As long as you import numpy.ma inside the function, and not at the module level... I expect some people won't like numpy.ma overhead by default.
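
Something along these lines is what I have in mind; treat it as a rough sketch of the delegation pattern, not a proposed implementation. The function shown is a hypothetical single entry point restricted to 1-D input: the clean ndarray case is handled directly, and numpy.ma and scipy.stats.mstats are only imported when a masked array or NaNs show up.

```python
import numpy as np

def plotting_positions(x, alpha=0.4, beta=0.4):
    """Hypothetical single entry point (1-D input only in this sketch)."""
    if not hasattr(x, "mask"):            # duck-typed check, no numpy.ma needed
        arr = np.asarray(x, dtype=float)
        if arr.ndim == 1 and not np.isnan(arr).any():
            # nice case: plain ndarray, nothing missing
            n = arr.size
            ranks = np.empty(n)
            ranks[np.argsort(arr)] = np.arange(1, n + 1)
            return (ranks - alpha) / (n + 1.0 - alpha - beta)
        x = arr
    # messy case: import lazily, inside the function, and delegate
    import numpy.ma as ma
    from scipy.stats import mstats
    return mstats.plotting_positions(ma.masked_invalid(x), alpha=alpha, beta=beta)

plain = plotting_positions([3.0, 1.0, 4.0, 1.5, 9.0])   # returns an ndarray
messy = plotting_positions([3.0, np.nan, 1.0])          # returns a masked array
```

Note the return type differs between the two paths (ndarray vs. masked array), which is one of the things a common interface would have to settle.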
> As a bonus, I added a plotting_positions_w1d that handles weights
> (since I recently saw the question somewhere). I am not completely
> sure about the definition for the plotting position correction, but if
> desired it will be easy enough to include it in the other functions or
> write also versions of quantiles and scoreatpercentile that take
Well, me neither. Float weights kinda defeat the purpose of plotting positions, don't you think?
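
To make the concern concrete, here is one possible weighted definition, purely as an assumption for discussion: replace the rank i by the cumulative weight and n by the total weight. This is not the plotting_positions_w1d mentioned above, whose correction I have not seen, and the function name is hypothetical. With integer weights it reproduces the position of the last copy of each value in the repeated sample; with float weights the "sample size" in the denominator no longer counts anything.

```python
import numpy as np

def plotting_positions_weighted(x, weights, alpha=0.4, beta=0.4):
    """One possible weighted plotting position (an assumption, not a standard).

    Uses (cumulative weight - alpha) / (total weight + 1 - alpha - beta),
    which reduces to the usual formula when all weights are 1.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(x, kind="stable")
    cum_w = np.cumsum(w[order])           # cumulative weight in sorted order
    pp = np.empty(x.size)
    pp[order] = (cum_w - alpha) / (w.sum() + 1.0 - alpha - beta)
    return pp

x = np.array([2.0, 5.0, 3.0])
w = np.array([2, 1, 3])                   # integer weights = repeat counts
pp_w = plotting_positions_weighted(x, w)
pp_unit = plotting_positions_weighted([3.0, 1.0, 4.0], [1, 1, 1])
```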