[SciPy-dev] PEP: Improving the basic statistical functions in Scipy

josef.pktd@gmai... josef.pktd@gmai...
Fri Feb 27 17:13:06 CST 2009


On Fri, Feb 27, 2009 at 5:47 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
>
> On Feb 27, 2009, at 4:52 PM, josef.pktd@gmail.com wrote:
>>
>> For most of the current statistical functions, with the exception of
>> different tie handling, I think that we can expand the _chk_asarray to
>> do the necessary preprocessing.
>
> Mmh. _chk_asarray will always return a MA. Is it what you want? Are you
>
No. What I meant was that _chk_asarray is currently called for
preprocessing in most functions, so it will be easy to swap in a
replacement function that returns the preprocessed (e.g. compressed)
data together with whatever flags (e.g. usemask) we need, both in the
main body of the function and for deciding the return type.


> An idea is then to use the 'usemask' parameter I was talking about
> earlier:
> * if usemask is False (default), return an ndarray
> * if usemask is True, return a MA
> * if the input is a MA (w/ or w/o missing values), set usemask to
> True, and mask the NaNs/Infs first w/ ma.fix_invalid.
>
> That way, we need only one function. If we really need it, we can have
> duplicate functions in scipy.mstats where usemask is set to True by
> default.
>
> Now, for the actual implementation:
> * usemask=False and some NaNs: return NaN
> * usemask=True: use the ma implementation.
>

That clarifies the API. I will try to write a prototype, but I have
already spent too much time on scipy this week.
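
To make sure I read the rules correctly, here is a minimal sketch for a
single function. The name mean_sketch is made up, and the whole thing is
my assumption about how the usemask rules would translate into code, not
a proposed implementation.

```python
import numpy as np
import numpy.ma as ma


def mean_sketch(a, axis=0, usemask=False):
    """Illustrative only: one function serving both plain and masked input."""
    # Masked-array input forces usemask=True.
    if isinstance(a, ma.MaskedArray):
        usemask = True
    if usemask:
        # Mask NaNs/Infs first, then use the ma implementation.
        return ma.fix_invalid(a).mean(axis=axis)
    # usemask=False: plain ndarray arithmetic, so NaNs propagate to the result.
    return np.asarray(a).mean(axis=axis)


x = np.array([[1.0, 2.0], [np.nan, 4.0], [3.0, 6.0]])
plain = mean_sketch(x)                  # NaN in the first column
masked = mean_sketch(x, usemask=True)   # NaN is masked out and ignored
```

With 2-d input as above, the usemask=True branch gives back a masked
array and the default branch a plain ndarray.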


>
>>>>> stats.mstats.moment(np.ma.fix_invalid(np.ma.column_stack([x,x])),
>>>>> 1) #inconsistent return type
>> array([ 0.,  0.])
>
> That's a bug, we should have a MA.
>

