[Numpy-discussion] Another suggestion for making numpy's functions generic
Darren Dale
dsdale24@gmail....
Sat Oct 17 07:49:11 CDT 2009
numpy's functions, especially ufuncs, have had some ability to support
subclasses through the ndarray.__array_wrap__ method, which provides
masked arrays or quantities (for example) with an opportunity to set
the class and metadata of the output array at the end of an operation.
An example is
    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'meters')
    numpy.add(q1, q2)  # yields Quantity(3, 'meters')
At SciPy2009 we committed a change to the numpy trunk, the
__array_prepare__ method, that provides a chance to determine the class
and some metadata of the output *before* the ufunc performs its
calculation, but after the output array has been established (while its
data is still uninitialized). Consider:
    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'J')
    numpy.add(q1, q2, q1)
    # or equivalently:
    # q1 += q2
With only __array_wrap__, the attempt to propagate the units happens
after q1's data has been updated in place. That is too late to raise an
error: the data is already corrupted. __array_prepare__ solves that
problem, because an exception can be raised in time.
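
To make the ordering problem concrete, here is a toy sketch in plain
Python (no numpy involved). The ToyQuantity class and the two add_*
helpers are hypothetical stand-ins, not numpy or quantities APIs; they
only mimic *when* the unit check runs relative to the in-place write.

```python
class ToyQuantity:
    """Hypothetical stand-in for a quantity: a list of values plus a unit string."""

    def __init__(self, values, units):
        self.values = list(values)
        self.units = units


def check_units(a, b):
    # Stand-in for the unit-propagation logic a subclass would run.
    if a.units != b.units:
        raise ValueError("cannot add %r and %r" % (a.units, b.units))


def add_wrap_only(a, b, out):
    # The data is written first and the check runs afterwards, as with a
    # wrap-style hook that only sees the finished result.
    out.values[:] = [x + y for x, y in zip(a.values, b.values)]
    check_units(a, b)
    return out


def add_with_prepare(a, b, out):
    # The check runs before any data is written, as with a prepare-style
    # hook that sees the output before the computation starts.
    check_units(a, b)
    out.values[:] = [x + y for x, y in zip(a.values, b.values)]
    return out


q2 = ToyQuantity([2.0], "J")

q1 = ToyQuantity([1.0], "meter")
try:
    add_wrap_only(q1, q2, q1)  # analogous to q1 += q2
except ValueError:
    pass
corrupted = q1.values  # q1 was overwritten before the error was raised

q3 = ToyQuantity([1.0], "meter")
try:
    add_with_prepare(q3, q2, q3)
except ValueError:
    pass
intact = q3.values  # q3 still holds its original data
```

In both cases the error is raised; the difference is only whether the
first operand survives it.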
Now I'd like to suggest one more improvement to numpy to make its
functions more generic. Consider one more example:
    q1 = Quantity(1, 'meter')
    q2 = Quantity(2, 'feet')
    numpy.add(q1, q2)
In this case, I'd like an opportunity to operate on the input arrays
on the way in to the ufunc, to rescale the second input to meters. I
think it would be a hack to try to stuff this capability into
__array_prepare__. One form of this particular example is already
supported in quantities, "q1 + q2", by overriding the __add__ method
to rescale the second input, but there are ufuncs that do not have an
associated special method. So I'd like to look into adding another
check for a special method, perhaps called __input_prepare__. My time
is really tight for the next month, so I'd rather not start if there
are strong objections, but otherwise I'd like to try to get it in for
numpy-1.4. (Has a timeline been established?)
I think it will not be too difficult to document this overall scheme:
When calling numpy functions:
1) __input_prepare__ provides an opportunity to operate on the inputs
to yield versions that are compatible with the operation (they should
obviously not be modified in place)
2) the output array is established
3) __array_prepare__ is used to determine the class of the output
array, as well as any metadata that needs to be established before the
operation proceeds
4) the ufunc performs its operations
5) __array_wrap__ provides an opportunity to update the output array
based on the results of the computation
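
The five steps above can be sketched in plain Python. Everything here is
an assumption made for illustration: ToyQuantity and toy_add are
hypothetical, the hook signatures are simplified (numpy's real hooks
take different arguments), __input_prepare__ is only the name suggested
in this message, and consulting the first input's hooks is just one
possible dispatch rule.

```python
FEET_TO_METERS = 0.3048  # one foot is exactly 0.3048 meters


class ToyQuantity:
    """Hypothetical quantity: a list of values plus a unit string."""

    def __init__(self, values, units):
        self.values = list(values)
        self.units = units

    def __input_prepare__(self, inputs):
        # Step 1: return inputs compatible with the operation. Rescaled
        # inputs are new objects; the originals are never modified.
        prepared = []
        for q in inputs:
            if q.units == self.units:
                prepared.append(q)
            elif q.units == "feet" and self.units == "meter":
                scaled = [v * FEET_TO_METERS for v in q.values]
                prepared.append(ToyQuantity(scaled, "meter"))
            else:
                raise ValueError("cannot convert %r to %r" % (q.units, self.units))
        return prepared

    def __array_prepare__(self, out):
        # Step 3: set metadata on the output before any data is computed.
        out.units = self.units
        return out

    def __array_wrap__(self, out):
        # Step 5: last chance to adjust the output based on the results.
        return out


def toy_add(a, b):
    first = a
    a, b = first.__input_prepare__([a, b])           # step 1
    out = ToyQuantity([0.0] * len(a.values), None)   # step 2
    out = first.__array_prepare__(out)               # step 3
    out.values[:] = [x + y for x, y in zip(a.values, b.values)]  # step 4
    return first.__array_wrap__(out)                 # step 5


q1 = ToyQuantity([1.0], "meter")
q2 = ToyQuantity([2.0], "feet")
result = toy_add(q1, q2)  # result is in meters; q2 is left untouched
```

Keeping the rescaling in step 1 means step 3 only has to deal with
inputs that are already compatible, which is why it need not be stuffed
into __array_prepare__.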
Comments, criticisms? If PEP 3124^ were already a part of the standard
library, that could serve as the basis for generalizing numpy's
functions. But I think the PEP will not be approved in its current
form, and it is unclear when and if the author will revisit the
proposal. The scheme I'm imagining might be sufficient for our
purposes.
Darren
^ http://www.python.org/dev/peps/pep-3124/