[SciPy-dev] scipy.stats._chk_asarray

josef.pktd@gmai... josef.pktd@gmai...
Tue Jun 2 23:51:36 CDT 2009


On Wed, Jun 3, 2009 at 12:07 AM,  <josef.pktd@gmail.com> wrote:
> On Tue, Jun 2, 2009 at 11:40 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>>
>>
>> On Tue, Jun 2, 2009 at 9:09 PM, <josef.pktd@gmail.com> wrote:
>>>
>>> On Tue, Jun 2, 2009 at 5:58 PM, Robert Kern <robert.kern@gmail.com> wrote:
>>> > On Tue, Jun 2, 2009 at 16:20, ctw <lists.20.chth@xoxy.net> wrote:
>>> >>> Please revert that.
>>> >>
>>> >> Done! Sorry about that. I am having some issues with the current
>>> >> behavior of changing all inputs to ndarrays. Would it be possible to
>>> >> add a nanmean function to numpy that behaves just as np.nansum in the
>>> >> sense that it preserves the type of the input?
>>> >
>>> > I would prefer a comprehensive approach rather than hacking in the one
>>> > function you want. I wouldn't be opposed to more NaN-aware functions
>>> > in numpy if they were corralled into their own module. However, that
>>> > leaves all of the rest of scipy.stats untouched.
>>> >
>>> > Alternately, you could help write a decorator that would wrap a
>>> > function to cast its arguments to ndarrays (bonus points: any
>>> > specified subclass) and then cast the result(s) back to the
>>> > appropriate subclass determined by the inputs' classes according to
>>> > the ufunc rules. You just have to be careful to deal with functions
>>> > that take multiple arraylike and non-arraylike inputs and return
>>> > multiple outputs (some of which aren't arraylike, either). This would
>>> > take some care, but would be a great asset to numpy.
>>> >
>>>
>>> I tried to see if I can introduce a second version, _check_asanyarray,
>>> that doesn't convert to a basic np.array, but I didn't get very far.
>>> nanmedian and nanstd are not easy to convert to work with matrices:
>>> nanstd uses multiplication and nanmedian uses np.compress.
>>>
>>> I usually avoid matrices because it is too confusing in numpy to keep
>>> track of the type for the basic operations.
>>>
>>> As an alternative, I looked at np.core.fromnumeric._wrapit, which is
>>> the wrapper for np.mean
>>>
>>> Doing a variation on it seems to work for matrices, see below. I
>>> haven't tried it on other array types. This is just a trial balloon to
>>> see whether this would make sense for some of the stats functions. It
>>> would be relevant mostly for the descriptive statistics; the
>>> statistical tests just return test statistics and p-values, and the
>>> plan for models is that they get explicit array subclass handling.
>>>
>>> Is it a good idea to try to work this way?
>>> And what is the best way to check whether an array is a plain ndarray
>>> and not a subclass instance?
>>> Something like this?
>>> >>> isinstance(np.matrix(range(4)),np.ndarray)
>>> True
>>> >>> np.matrix(range(4)).__class__ is np.ndarray
>>> False
>>> >>> np.arange(5).__class__ is np.ndarray
>>> True
>>
>> The linear algebra routines do a lot of that _wrapit stuff so that they can
>> handle both ndarrays and matrices. They might be useful examples.
>>
>
> Thanks for the pointer, this looks simple enough for me. Do you also
> have some easily readable examples of the usage of __array_priority__
> for the multi-input case?
> (Even though in scipy.stats most multi-input cases will be of the same type.)
>

So far I have checked that these functions can be wrapped to accept
matrices (there will be more):

ops = ['nanmean', 'nanstd', 'nanmedian', 'moment', 'variation',
       'skew', 'kurtosis', 'hmean', 'gmean', 'mode']

opsarg = [('tstd', (None,)), ('trim_mean', (0.1,))]

Following the example in np.linalg.linalg, this will require adding
only 1 or 2 lines per function, plus the wrap function.
But it still needs a new set of tests.

Is this change OK for scipy 0.8, or would it break for other
subclasses of arrays?
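
For the multi-input case, one possible rule is to return the class of
the input with the largest __array_priority__. The sketch below is a
simplification of what the ufunc machinery does, not a copy of it;
pick_output_class, paired_difference and the two subclasses are
hypothetical names used only for this example.

```python
import numpy as np


def pick_output_class(*arrays):
    """Hypothetical helper: class of the ndarray input with the
    largest __array_priority__ (np.ndarray if none is an ndarray)."""
    best_cls, best_priority = np.ndarray, float("-inf")
    for a in arrays:
        if isinstance(a, np.ndarray):
            priority = getattr(a, "__array_priority__", 0.0)
            if priority > best_priority:
                best_cls, best_priority = type(a), priority
    return best_cls


class LowPriority(np.ndarray):
    __array_priority__ = 1.0


class HighPriority(np.ndarray):
    __array_priority__ = 5.0


def paired_difference(x, y):
    """Toy two-input function: compute on plain arrays, then view the
    result as the class selected from the inputs."""
    out_cls = pick_output_class(x, y)
    diff = np.asarray(x) - np.asarray(y)
    return diff.view(out_cls)
```

The result class does not depend on argument order here, and inputs
that are not arrays at all (lists, for example) are ignored when
picking the class.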

Josef

