[Numpy-discussion] Another suggestion for making numpy's functions generic

Darren Dale dsdale24@gmail....
Mon Oct 19 06:55:36 CDT 2009


On Mon, Oct 19, 2009 at 3:10 AM, Sebastian Walter
<sebastian.walter@gmail.com> wrote:
> On Sat, Oct 17, 2009 at 2:49 PM, Darren Dale <dsdale24@gmail.com> wrote:
>> numpy's functions, especially ufuncs, have had some ability to support
>> subclasses through the ndarray.__array_wrap__ method, which provides
>> masked arrays or quantities (for example) with an opportunity to set
>> the class and metadata of the output array at the end of an operation.
>> An example is
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'meters')
>> numpy.add(q1, q2) # yields Quantity(3, 'meters')
>>
>> At SciPy2009 we committed a change to the numpy trunk that provides a
>> chance to determine the class and some metadata of the output *before*
>> the ufunc performs its calculation, but after the output array has been
>> established (and its data is still uninitialized). Consider:
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'J')
>> numpy.add(q1, q2, q1)
>> # or equivalently:
>> # q1 += q2
>>
>> With only __array_wrap__, the attempt to propagate the units happens
>> after q1's data was updated in place, which is too late to raise an
>> error: the data is already corrupted. __array_prepare__ solves that
>> problem, because an exception can be raised in time.
>>
>> Now I'd like to suggest one more improvement to numpy to make its
>> functions more generic. Consider one more example:
>>
>> q1 = Quantity(1, 'meter')
>> q2 = Quantity(2, 'feet')
>> numpy.add(q1, q2)
>>
>> In this case, I'd like an opportunity to operate on the input arrays
>> on the way in to the ufunc, to rescale the second input to meters. I
>> think it would be a hack to try to stuff this capability into
>> __array_prepare__. One form of this particular example is already
>> supported in quantities, "q1 + q2", by overriding the __add__ method
>> to rescale the second input, but there are ufuncs that do not have an
>> associated special method. So I'd like to look into adding another
>> check for a special method, perhaps called __input_prepare__. My time
>> is really tight for the next month, so I'd rather not start if there
>> are strong objections, but otherwise, I'd like to try to get it
>> in in time for numpy-1.4. (Has a timeline been established?)
>>
>> I think it will not be too difficult to document this overall scheme:
>>
>> When calling numpy functions:
>>
>> 1) __input_prepare__ provides an opportunity to operate on the inputs
>> to yield versions that are compatible with the operation (they should
>> obviously not be modified in place)
>>
>> 2) the output array is established
>>
>> 3) __array_prepare__ is used to determine the class of the output
>> array, as well as any metadata that needs to be established before the
>> operation proceeds
>>
>> 4) the ufunc performs its operations
>>
>> 5) __array_wrap__ provides an opportunity to update the output array
>> based on the results of the computation
>>
>> Comments, criticisms? If PEP 3124 were already a part of the standard
>> library, that could serve as the basis for generalizing numpy's
>> functions. But I think the PEP will not be approved in its current
>> form, and it is unclear when and if the author will revisit the
>> proposal. The scheme I'm imagining might be sufficient for our
>> purposes.
>
> I'm all for generic (u)funcs, since they might come in handy for me:
> I'm doing lots of operations on arrays of polynomials.
> I don't quite get the reasoning, though.
> Could you correct me where I get it wrong?
> * the class Quantity derives from numpy.ndarray
> * Quantity overrides __add__, __mul__ etc. and you get the correct behaviour for
> q1 = Quantity(1, 'meter')
> q2 = Quantity(2, 'J')
> by raising an exception when performing q1+=q2

No, Quantity does not override __iadd__ to catch this. Quantity
implements __array_prepare__ to perform the dimensional analysis based
on the identity of the ufunc and the inputs, and set the class and
dimensionality of the output array, or raise an error when dimensional
analysis fails. This approach lets quantities support all ufuncs (in
principle), not just built-in numerical operations. It should also
make it easier to subclass from MaskedArray, so we could have a
MaskedQuantity without having to establish yet another suite of ufuncs
specific to quantities or masked quantities.
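
To make the ordering concrete, here is a toy model in plain Python. It does
not use numpy at all, and ToyQuantity, prepare_output and toy_add_inplace are
hypothetical names, not part of numpy or quantities. It only illustrates why
the dimensional check has to run before the output buffer is written:

```python
class ToyQuantity:
    """Hypothetical stand-in for a Quantity: a list of values plus a unit string."""

    def __init__(self, values, unit):
        self.values = list(values)
        self.unit = unit


def prepare_output(a, b, out):
    """Plays the role of __array_prepare__: validate before any data is written."""
    if a.unit != b.unit:
        raise ValueError(f"cannot add {a.unit!r} and {b.unit!r}")
    out.unit = a.unit
    return out


def toy_add_inplace(a, b, out):
    """Toy 'ufunc': prepare the output first, then overwrite out.values."""
    out = prepare_output(a, b, out)  # may raise; out.values is untouched so far
    out.values[:] = [x + y for x, y in zip(a.values, b.values)]
    return out


q1 = ToyQuantity([1.0], "meter")
q2 = ToyQuantity([2.0], "J")
try:
    toy_add_inplace(q1, q2, q1)  # analogous to numpy.add(q1, q2, q1)
except ValueError:
    pass  # the error arrives before q1's data has been modified
```

If the check ran only after the write, as with __array_wrap__ alone, q1.values
would already hold the meaningless sum by the time the error was raised.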

> * The problem is that numpy.add(q1,q2,q1) would corrupt q1 before
> raising an exception

That was solved by the addition of __array_prepare__ to numpy back in
August. What I am proposing now is supporting operations on arrays
that would be compatible if we had a chance to transform them on the
way into the ufunc, like "meter + foot".
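
A rough sketch of what such an input hook might look like, again as a toy
model in plain Python rather than anything numpy does today. The signature
__input_prepare__(func, inputs), the ToyLength class, the toy_add function and
the TO_METERS table are all assumptions made up for illustration; only the
hook's name comes from the proposal:

```python
# Conversion factors to meters; only the two units used in this thread.
TO_METERS = {"meter": 1.0, "foot": 0.3048}


class ToyLength:
    """Hypothetical length type: a float magnitude and a unit name."""

    def __init__(self, magnitude, unit):
        self.magnitude = magnitude
        self.unit = unit

    def __input_prepare__(self, func, inputs):
        """Hypothetical hook: return new inputs rescaled to self's unit."""
        prepared = []
        for x in inputs:
            factor = TO_METERS[x.unit] / TO_METERS[self.unit]
            prepared.append(ToyLength(x.magnitude * factor, self.unit))
        return tuple(prepared)  # the original inputs are left unmodified


def toy_add(a, b):
    """Toy stand-in for a ufunc that consults the first input's hook."""
    a, b = a.__input_prepare__(toy_add, (a, b))
    return ToyLength(a.magnitude + b.magnitude, a.unit)


q1 = ToyLength(1.0, "meter")
q2 = ToyLength(2.0, "foot")
total = toy_add(q1, q2)  # 1 m + 2 ft, expressed in meters
```

The hook returns fresh objects instead of rescaling q2 in place, which matches
the requirement that the inputs not be modified. Which input's hook a real
implementation would consult is left open here; the sketch simply uses the
first one.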

Darren

