[SciPy-User] stats-mstats plotting_positions, quantiles

Pierre GM pgmdevlist@gmail....
Mon Aug 9 00:14:10 CDT 2010


On Aug 9, 2010, at 12:02 AM, josef.pktd@gmail.com wrote:

> I stumbled over plotting positions in mstats while looking at the
> Pareto family of distributions.
> 
> plotting_positions, quantiles in stats.mstats have more options than
> the functions in stats (similar to an ecdf proposal by David Huard)

I followed the example of R on that one, and took the definitions from the Hydrology Handbook. I kinda suspect that David did the same...
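
For reference, the family of definitions in question can be written as (i - alpha) / (n + 1 - alpha - beta), where i is the rank of an observation and n the sample size. Below is a minimal sketch for the clean 1-D case only; the function name plotting_positions_1d is made up for illustration and is not the mstats implementation. Typical choices are alpha = beta = 0 (Weibull, i / (n + 1)), alpha = beta = 0.5 (Hazen) and alpha = beta = 0.4 (Cunnane).

```python
import numpy as np


def plotting_positions_1d(x, alpha=0.4, beta=0.4):
    """Plotting positions (i - alpha) / (n + 1 - alpha - beta).

    Illustrative helper: assumes 1-D input without NaNs or masked values.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Ranks 1..n; ties are broken by order of appearance.
    ranks = np.empty(n)
    ranks[np.argsort(x, kind="stable")] = np.arange(1, n + 1)
    return (ranks - alpha) / (n + 1.0 - alpha - beta)


# Weibull positions i / (n + 1) for three observations.
print(plotting_positions_1d([3.0, 1.0, 2.0], alpha=0.0, beta=0.0))
```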


> Instead of writing plain ndarray versions, I was trying to have a
> common interface for plain ndarrays, ndarrays with nans and masked
> arrays. The implementations are different enough that merging the
> ndarray and masked array version didn't look useful (e.g. masked
> arrays or nans require apply_along_axis).

Because of the variable number of NaNs/missing values in each column...
Mmh, looks like the ndarray case is the special one here.
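
To illustrate why the NaN case ends up going through apply_along_axis: each column has its own effective n, so the positions cannot be computed for the whole array in one vectorized step. A rough sketch, with hypothetical helper names and NaN as the only kind of missing value:

```python
import numpy as np


def _pp_1d_nan(col, alpha=0.4, beta=0.4):
    """Plotting positions for one column; NaNs stay NaN and do not count in n."""
    col = np.asarray(col, dtype=float)
    out = np.full(col.shape, np.nan)
    valid = ~np.isnan(col)
    n = int(valid.sum())
    if n == 0:
        return out
    ranks = np.empty(n)
    ranks[np.argsort(col[valid], kind="stable")] = np.arange(1, n + 1)
    out[valid] = (ranks - alpha) / (n + 1.0 - alpha - beta)
    return out


def plotting_positions_nan(a, alpha=0.4, beta=0.4, axis=0):
    """Apply the 1-D helper along `axis`, one slice at a time."""
    a = np.asarray(a, dtype=float)
    return np.apply_along_axis(_pp_1d_nan, axis, a, alpha, beta)


a = np.array([[1.0, 4.0],
              [np.nan, 2.0],
              [3.0, 6.0]])
# Column 0 uses n = 2, column 1 uses n = 3.
print(plotting_positions_nan(a, alpha=0.0, beta=0.0))
```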

> 
> Instead I just delegated the messy cases (ma, nans, limit) to
> stats.mstats and only the nice cases go through the plain ndarray
> version.
> 
> Main question: Would it be useful to have this delegation in
> scipy.stats so that there is a single entry point for users, or is it
> better to keep the plain ndarray, nan and ma versions separate?
> The pattern could apply to quite a few functions in stats-mstats that
> are too difficult to merge.

As long as you import numpy.ma inside the function, and not at the module level... I expect some people won't like numpy.ma overhead by default.
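
Something along these lines, as a sketch only. The names are hypothetical, it handles 1-D input only, and the messy branch is a small local stand-in rather than the actual delegation to stats.mstats, so that the example stays self-contained. The point is just that numpy.ma is imported inside the branch that needs it.

```python
import numpy as np


def _pp_plain(x, alpha, beta):
    # Fast path: 1-D float ndarray, no NaNs, no mask.
    n = x.size
    ranks = np.empty(n)
    ranks[np.argsort(x, kind="stable")] = np.arange(1, n + 1)
    return (ranks - alpha) / (n + 1.0 - alpha - beta)


def _pp_masked(x, alpha, beta):
    # Deferred import: only calls that reach the messy path need numpy.ma.
    import numpy.ma as ma

    x = ma.asarray(x, dtype=float)
    # Treat NaNs as missing, on top of any existing mask.
    x = ma.masked_where(np.isnan(ma.getdata(x)), x)
    valid = ~ma.getmaskarray(x)
    out = ma.masked_all(x.shape, dtype=float)
    if valid.any():
        out[valid] = _pp_plain(x.compressed(), alpha, beta)
    return out


def plotting_positions(data, alpha=0.4, beta=0.4):
    """Single entry point (hypothetical): plain ndarray in, plain ndarray out."""
    # Duck-typed check, so numpy.ma is not needed at this point.
    if not hasattr(data, "mask"):
        x = np.asarray(data, dtype=float)
        if not np.isnan(x).any():
            return _pp_plain(x, alpha, beta)
    return _pp_masked(data, alpha, beta)


print(plotting_positions([3.0, 1.0, 2.0], 0.0, 0.0))      # plain ndarray result
print(plotting_positions([3.0, np.nan, 1.0], 0.0, 0.0))   # masked result
```

Users who only ever pass clean ndarrays get ndarrays back and never touch the masked branch; whether a mixed return type is acceptable for a public function is a separate design question.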


> As a bonus, I added a plotting_positions_w1d that handles weights
> (since I recently saw the question somewhere). I am not completely
> sure about the definition for the plotting position correction, but if
> desired it will be easy enough to include it in the other functions or
> also write versions of quantiles and scoreatpercentile that take
> weights.

Well, me neither. Float weights kinda defeat the purpose of plotting positions, don't you think?
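
To make the open question concrete, here is one possible weighted variant, as a sketch under assumptions: the rank i is replaced by the cumulative weight and n by the total weight, with the alpha/beta correction carried over unchanged. That last step is an assumption rather than an established definition, and the function name is hypothetical. With unit weights it reduces to the usual (i - alpha) / (n + 1 - alpha - beta).

```python
import numpy as np


def plotting_positions_weighted(x, weights=None, alpha=0.4, beta=0.4):
    """Weighted plotting positions for 1-D data (illustrative sketch).

    Assumption: rank -> cumulative weight, n -> total weight, with the
    alpha/beta correction applied as in the unweighted formula.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1:
        raise ValueError("only 1-D data is handled in this sketch")
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    if w.shape != x.shape:
        raise ValueError("weights must have the same shape as x")
    order = np.argsort(x, kind="stable")
    cw = np.cumsum(w[order])  # cumulative weight in sorted order
    out = np.empty_like(x)
    out[order] = (cw - alpha) / (cw[-1] + 1.0 - alpha - beta)
    return out


# Integer weights act like repeated observations: [1, 1, 2] in place of
# x = [1, 2] with weights [2, 1]; each value gets the position of its
# last replicate.
print(plotting_positions_weighted([1.0, 2.0], weights=[2.0, 1.0],
                                  alpha=0.0, beta=0.0))
```

With non-integer weights the "rank" is no longer an integer, which is where the interpretation of the correction gets murky.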



