[SciPy-dev] Homogenizing stats & mstats
Pierre GM
pgmdevlist@gmail....
Fri Jul 24 01:15:39 CDT 2009
All,
I was browsing some recent tickets for scipy.stats, and couldn't help
but notice that a significant number of them (#845, #822, #901...)
are related to a lack of consistency between stats and mstats.
I'd like to eventually get rid of mstats altogether, provided the
same functionality is supported in stats.
* A first step would be to use np.asanyarray instead of np.asarray.
That should be sufficient for functions like gmean and hmean for
example.
* A second step would be to use numpy.ma under the hood, returning
either a MaskedArray if the input is a MaskedArray itself, or just a
standard ndarray otherwise. That should take care of the functions
related to ranking and tie handling (I'm pretty confident in the
mstats routines, and we can always double-check the results w/ R). If
needed, we could also add a usemask flag, like we do in
np.genfromtxt.
* A third step would be to port the remaining routines of mstats.extras
to stats or morestats (Harrell-Davis quantiles could be implemented
more efficiently in Cython, for example).
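
To make the first two steps concrete, here is a minimal sketch. The
names gmean_sketch and standardize_sketch are hypothetical and are not
the existing stats or mstats code; the usemask keyword is only modelled
on the flag mentioned above. The first function shows why
np.asanyarray is enough for something like gmean, the second shows
computing with numpy.ma internally and choosing the return type from
the input.

```python
import numpy as np
import numpy.ma as ma


def gmean_sketch(a, axis=0):
    """Geometric mean (hypothetical helper, not the scipy.stats code)."""
    # asanyarray lets ndarray subclasses such as MaskedArray pass through,
    # whereas asarray would convert to a base ndarray and drop the mask.
    a = np.asanyarray(a)
    return np.exp(np.log(a).mean(axis=axis))


def standardize_sketch(a, usemask=None):
    """Center and scale `a`, computing with numpy.ma internally.

    Hypothetical helper: returns a MaskedArray when the input is one
    (or when usemask=True), and a plain ndarray otherwise.
    """
    if usemask is None:
        usemask = isinstance(a, ma.MaskedArray)
    work = ma.asanyarray(a)
    result = (work - work.mean()) / work.std()
    if usemask:
        return result
    # Plain ndarray output; any masked entries become NaN.
    return result.filled(np.nan)


masked = ma.array([1.0, 2.0, 4.0, 1000.0], mask=[False, False, False, True])
plain = np.array([1.0, 2.0, 4.0])

g_masked = gmean_sketch(masked)        # the masked 1000.0 is ignored
g_plain = gmean_sketch(plain)
z_masked = standardize_sketch(masked)  # stays a MaskedArray
z_plain = standardize_sketch(plain)    # stays a plain ndarray
```

With the masked value ignored, both geometric means come out the same,
which is the kind of consistency the tickets are asking for.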
At each step, we could add a DeprecationWarning to each reviewed mstats
function and have it call the corresponding stats function instead.
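
The deprecation step could look roughly like the sketch below. The
helper deprecated_alias and the stand-in function new_hmean are
hypothetical names used only for illustration; nothing here is an
existing scipy API.

```python
import functools
import warnings

import numpy as np


def deprecated_alias(new_func, old_name):
    """Build a wrapper that warns, then forwards to `new_func`.

    Hypothetical helper: `old_name` is the name shown in the warning.
    """
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead." % (old_name, new_func.__name__),
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)
    return wrapper


def new_hmean(a, axis=0):
    """Stand-in for a reviewed stats function (harmonic mean)."""
    a = np.asanyarray(a)
    return a.shape[axis] / (1.0 / a).sum(axis=axis)


# The old mstats entry point would become a thin forwarding alias.
old_hmean = deprecated_alias(new_hmean, "mstats.hmean")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_hmean([1.0, 2.0, 4.0])
```

The old name keeps working during the transition, and callers see a
warning pointing at the replacement.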
What would be a good timeline? 0.8.0, or is it too late? 0.9.0?
Comments welcome.
Thx in advance
P.