[Numpy-discussion] Warnings in numpy.ma.test()
Wed Mar 17 13:28:21 CDT 2010
Charles R Harris wrote:
> On Wed, Mar 17, 2010 at 6:19 AM, Darren Dale wrote:
> On Wed, Mar 17, 2010 at 2:07 AM, Pierre GM wrote:
> > All,
> > As you're probably aware, the current test suite for numpy.ma
> raises some nagging warnings such as "invalid value in ...". These
> warnings are only issued when a standard numpy ufunc (e.g., np.sqrt)
> is called on a MaskedArray, instead of its numpy.ma equivalent
> (e.g., np.ma.sqrt). The reason is that the masked versions of the
> ufuncs temporarily set the numpy error status to 'ignore' before the
> operation takes place, and reset the status to its original value
> afterwards.
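
The save/ignore/restore pattern described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual numpy.ma code; the helper name sqrt_ignoring_invalid is made up for the example.

```python
import numpy as np

def sqrt_ignoring_invalid(values):
    """Hypothetical helper: take a square root with FP errors silenced."""
    # np.seterr returns the previous settings, so they can be restored later.
    old_settings = np.seterr(divide='ignore', invalid='ignore')
    try:
        return np.sqrt(values)
    finally:
        # Restore the caller's error status even if the ufunc fails.
        np.seterr(**old_settings)

status_before = np.geterr()
roots = sqrt_ignoring_invalid(np.array([4.0, -1.0]))
status_after = np.geterr()
```

The try/finally is what keeps the error status from staying stuck; np.errstate packages the same behaviour as a context manager.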
> > I thought I could use the new __array_prepare__ method to
> intercept the call of a standard ufunc. After actual testing, that
> can't work. __array_prepare__ only helps to prepare the *output* of the
> operation, not to change the input on the fly, just for this
> operation. Actually, you can modify the input in place, but it's
> usually not what you want.
> That is correct, __array_prepare__ is called just after the output
> array is created, but before the ufunc actually gets down to business.
> I have the same limitation in quantities that you are now seeing with
> masked arrays; in my case I want the opportunity to rescale different
> but compatible quantities for the operation (without changing the
> original arrays in place, of course).
> > Then, I tried to use __array_prepare__ to store the current
> error status in the input, force it to ignore divide/invalid errors
> and send the input to the ufunc. Doesn't work either: np.seterr in
> __array_prepare__ does change the error status, but as far as I
> understand, the ufunc is still called with the original
> error status. That means that if something goes wrong, your error
> status can stay stuck. Not a good idea either.
> > I'm running out of ideas at this point. For the test suite, I'd
> suggest to disable the warnings in test_fix_invalid and
> > An additional issue is that if one of the error statuses is set to
> 'raise', the numpy ufunc will raise the exception (as expected),
> while its numpy.ma version will not. I'll also put a warning in the
> docs to that effect.
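
A small sketch of the difference just described, assuming the installed NumPy behaves as the message says. The variable names are made up for the example.

```python
import numpy as np

values = np.array([-1.0, 4.0])

with np.errstate(invalid='raise'):
    # The plain ufunc honours the 'raise' setting.
    try:
        np.sqrt(values)
        plain_raised = False
    except FloatingPointError:
        plain_raised = True

    # The numpy.ma version silences the error internally and masks
    # the out-of-domain entry instead of raising.
    masked_result = np.ma.sqrt(values)

masked_flags = np.ma.getmaskarray(masked_result).tolist()
```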
> > Please send me your comments before I commit any changes.
> I started thinking about a third method called __input_prepare__ that
> would be called on the way into the ufunc, which would allow you to
> intercept the input and pass a somehow modified copy back to the
> ufunc. The total flow would be:
> 1) Call myufunc(x, y[, z])
> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
> y' (or simply passes through x,y by default)
> 3) myufunc creates the output array z (if not specified) and calls
> ?.__array_prepare__(z, (myufunc, x, y, ...))
> 4) myufunc finally gets around to performing the calculation
> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
> the result to the caller
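
The flow above can be mimicked in plain Python. Everything here is hypothetical: __input_prepare__ is only the proposal from this message and not a NumPy hook, the Length class and call_with_input_prepare are made-up stand-ins for a quantity type and for the dispatch a ufunc would do, and step 3's __array_prepare__ call is left out to keep the sketch short.

```python
import numpy as np

class Length:
    """Toy quantity: raw values plus a factor converting them to metres."""
    def __init__(self, values, to_metres):
        self.values = np.asarray(values, dtype=float)
        self.to_metres = to_metres

    def __input_prepare__(self, func, *inputs):
        # Step 2: return rescaled copies; the original arrays are untouched.
        return tuple(x.values * x.to_metres for x in inputs)

    def __array_wrap__(self, result, context=None):
        # Step 5: wrap the raw result, which is now expressed in metres.
        return Length(result, 1.0)


def call_with_input_prepare(func, *inputs):
    """Hypothetical dispatcher imitating steps 1-5 (equal shapes assumed)."""
    owner = inputs[0]
    prepared = owner.__input_prepare__(func, *inputs)      # step 2
    out = np.empty_like(prepared[0])                       # step 3
    func(*prepared, out=out)                               # step 4
    return owner.__array_wrap__(out, (func, inputs, 0))    # step 5


metres = Length([500.0, 250.0], 1.0)
kilometres = Length([1.0, 2.0], 1000.0)
total = call_with_input_prepare(np.add, metres, kilometres)
```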
> Is this general enough for your use case? I haven't tried to think
> about how to change some global state at one point and change it back
> at another, that seems like a bad idea and difficult to support.
> I'm not a masked array user and not familiar with the specific problems
> here, but as an outsider it's beginning to look like one little fix
> after another. Is there some larger framework that would help here?
> Changes to the ufuncs themselves? There was some code for masked ufuncs
> on the c level posted a while back that I thought was interesting, would
> it help to have masked versions of the ufuncs? So on and so
> forth. It just looks like a larger design issue needs to be addressed here.
I'm glad you found it interesting, and I'm sorry I haven't had time to
follow up on the work with masked ufuncs in C. My motivation for going
to the C level was speed and control; many ma operations are very slow
compared to their numpy counterparts, and moving the mask handling to C
can erase nearly all of this penalty. Regarding nan-handling, using
masked ufuncs in C means that calculations are simply not done with
masked values, so it doesn't matter whether a masked value is invalid or
not; consequently, so long as an invalid value is masked, the seterr
state doesn't matter. And, the seterr state then applies normally to
the unmasked values. I'm not sure whether this solves the problem at
hand, but it does seem to me to be sensible behavior and a step in the
right direction. The devil is in the details--coming up with some basic
masked ufunc functionality in C was fairly easy, but figuring out how to
handle all ufuncs, and especially their methods (reduce, etc.) would be
quite a bit of work. It might be a good project for a student.
Realistically, I don't think I will ever have the time to do it myself.
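
The behavior described above (no calculation at masked positions, normal seterr handling elsewhere) can be approximated in Python with the ufunc where= argument. This is only a sketch of the idea, not the C implementation mentioned here; the function name sqrt_skipping_masked is made up for the example.

```python
import numpy as np

def sqrt_skipping_masked(data, mask):
    """Hypothetical helper: compute sqrt only where mask is False."""
    data = np.asarray(data, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Masked slots keep this filler value; the ufunc never touches them.
    out = np.zeros_like(data)
    np.sqrt(data, out=out, where=~mask)
    return np.ma.array(out, mask=mask)

# The invalid value is masked, so even 'raise' does not trigger.
with np.errstate(invalid='raise'):
    result = sqrt_skipping_masked([4.0, -1.0], [False, True])
```

An unmasked negative value passed through the same helper would still raise under 'raise', since the error status applies normally to the values that are actually computed.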
In case anyone is interested, my initial feeble attempt nearly a year
ago is still on github: