[Numpy-discussion] suggestion for generalizing numpy functions

Charles R Harris charlesr.harris@gmail....
Wed Jun 24 15:08:59 CDT 2009


On Wed, Jun 24, 2009 at 1:49 PM, Darren Dale<dsdale24@gmail.com> wrote:
> On Wed, Jun 24, 2009 at 3:37 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
>>
>> On Wed, Jun 24, 2009 at 8:52 AM, Darren Dale<dsdale24@gmail.com> wrote:
>> > On Wed, Jun 24, 2009 at 9:42 AM, Charles R Harris
>> > <charlesr.harris@gmail.com> wrote:
>> >>
>> >> On Wed, Jun 24, 2009 at 7:08 AM, Darren Dale<dsdale24@gmail.com> wrote:
>> >> > On Wed, May 27, 2009 at 11:30 AM, Darren Dale <dsdale24@gmail.com> wrote:
>> >> >>
>> >> >> Now that numpy-1.3 has been released, I was hoping I could engage
>> >> >> the numpy developers and community concerning my suggestion to
>> >> >> improve the ufunc wrapping mechanism. Currently, ufuncs call, on
>> >> >> the way out, the __array_wrap__ method of the input array with the
>> >> >> highest __array_priority__.
>> >> >>
>> >> >> There are use cases, like masked arrays or arrays with units,
>> >> >> where it is imperative to run some code on the way in to the ufunc
>> >> >> as well. MaskedArrays do this by reimplementing or wrapping
>> >> >> ufuncs, but this approach puts some pretty severe constraints on
>> >> >> subclassing. For example, in my Quantities package I have a
>> >> >> Quantity object that derives from ndarray. It has been suggested
>> >> >> that in order to make ufuncs work with Quantity, I should wrap
>> >> >> numpy's built-in ufuncs. But I intend to make a MaskedQuantity
>> >> >> object as well, deriving from MaskedArray, and would therefore
>> >> >> have to wrap the MaskedArray ufuncs as well.
>> >> >>
>> >> >> If ufuncs would simply call a method both on the way in and on the
>> >> >> way out, I think this would go a long way to improving this
>> >> >> situation. I whipped up a simple proof of concept and posted it in
>> >> >> this thread a while back. For example, a MaskedQuantity would
>> >> >> implement a method like __gfunc_pre__ to check the validity of the
>> >> >> units operation etc, and would then call
>> >> >> MaskedArray.__gfunc_pre__ (if defined) to determine the domain
>> >> >> etc. __gfunc_pre__ would return a dict containing any metadata the
>> >> >> subclasses wish to provide based on the inputs, and that dict
>> >> >> would be passed along with the inputs, output and context to
>> >> >> __gfunc_post__, so postprocessing can be done (__gfunc_post__
>> >> >> replacing __array_wrap__).
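
The pre/post hook idea quoted above might be sketched roughly as
follows, in pure Python. This is only an illustration of the flow, not
NumPy API and not the proof of concept mentioned in the thread:
apply_gfunc, WithUnits, and add are hypothetical names, the hook
signatures are assumptions, and the function itself stands in for the
"context" argument.

```python
# Pure-Python sketch of a pre/post hook protocol. Nothing here is NumPy
# API: apply_gfunc, WithUnits, and add are hypothetical names, and the
# hook signatures are assumed for illustration.

def apply_gfunc(func, *inputs):
    """Call func on the raw values, running hooks on the way in and out."""
    # Pick the input with the highest __array_priority__ as the hook owner.
    owner = max(inputs, key=lambda x: getattr(x, "__array_priority__", 0.0))
    pre = getattr(owner, "__gfunc_pre__", None)
    # On the way in: the pre hook may validate inputs and returns a dict.
    metadata = pre(func, inputs) if pre is not None else {}
    raw = func(*(getattr(x, "value", x) for x in inputs))
    post = getattr(owner, "__gfunc_post__", None)
    # On the way out: the post hook gets inputs, output, and the metadata.
    return post(func, inputs, raw, metadata) if post is not None else raw


class WithUnits:
    """Toy stand-in for an array subclass that carries units."""
    __array_priority__ = 10.0

    def __init__(self, value, units):
        self.value = value
        self.units = units

    def __gfunc_pre__(self, func, inputs):
        # This check only suits addition-like operations.
        units = {getattr(x, "units", None) for x in inputs}
        if len(units) != 1:
            raise ValueError("incompatible units: %r" % sorted(map(str, units)))
        # A subclass could call its parent's __gfunc_pre__ and extend the dict.
        return {"units": units.pop()}

    def __gfunc_post__(self, func, inputs, output, metadata):
        # Rewrap the raw result using metadata computed on the way in.
        return WithUnits(output, metadata["units"])


def add(a, b):
    return a + b


total = apply_gfunc(add, WithUnits(2.0, "m"), WithUnits(3.0, "m"))
print(total.value, total.units)
```

In this sketch a mismatch such as metres plus seconds is rejected in
the pre hook, before the underlying function ever runs, which is the
kind of check that a hook on the way out cannot perform.
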
>> >> >>
>> >> >> Of course, packages like MaskedArray may still wish to reimplement
>> >> >> ufuncs, like Eric Firing is investigating right now. The point is
>> >> >> that classes that don't care about the implementation of ufuncs,
>> >> >> that only need to provide metadata based on the inputs and the
>> >> >> output, can do so using this mechanism and can build upon other
>> >> >> specialized arrays.
>> >> >>
>> >> >> I would really appreciate input from numpy developers and other
>> >> >> interested parties. I would like to continue developing the
>> >> >> Quantities package this summer, and have been approached by
>> >> >> numerous people interested in using Quantities with sage, sympy,
>> >> >> matplotlib. But I would prefer to improve the ufunc mechanism (or
>> >> >> establish that there is no interest among the community to do so)
>> >> >> so I can improve the package (or limit its scope) before making an
>> >> >> official announcement.
>> >> >
>> >> > There was some discussion of this proposal to allow better
>> >> > interaction of ufuncs with ndarray subclasses in another thread
>> >> > (Plans for numpy-1.4.0 and scipy-0.8.0) and the comments were
>> >> > encouraging. I have been trying to gather feedback as to whether
>> >> > the numpy devs were receptive to the idea, and it seems the answer
>> >> > is tentatively yes, although there were questions about who would
>> >> > actually write the code. I guess I have not made clear that I
>> >> > intend to write the implementation and tests. I gained some
>> >> > familiarity with the relevant code while squashing a few bugs for
>> >> > numpy-1.3, but it would be helpful if someone else who is familiar
>> >> > with the existing __array_wrap__ machinery would be willing to
>> >> > discuss this proposal in more detail and offer constructive
>> >> > criticism along the way. Is anyone willing?
>> >> >
>> >>
>> >> I think Travis would be the only one familiar with that code and that
>> >> would be from a couple of years back when he wrote it. Most of us have
>> >> followed the same route as yourself, finding our way into the code by
>> >> squashing bugs.
>> >>
>> >
>> > Do you mean that you would require Travis to sign off on the
>> > implementation (assuming he would agree to review my work)? I would
>> > really like to avoid a situation where I invest the time and then the
>> > code bitrots because I can't find a route to committing it to svn.
>> >
>>
>> No, just that Travis would know the most about that subsystem if you
>> are looking for help. I and others here can look over the code and
>> commit it without Travis signing off on it. You could ask for commit
>> privileges yourself. The important thing is having some tests and an
>> agreement that the interface is appropriate. Pierre also seems
>> interested in the functionality so it would be useful for him to say
>> that it serves his needs also.
>
> Ok, I'll start working on it then. Any idea what you are targeting for
> numpy-1.4? Scipy-2009, or much earlier? I'd like to gauge how to budget my
> time.
>

The timeline is open for discussion. A six-month timeline would put it
sometime in November but David might want it earlier for scipy 0.8. My
guess would be sometime after Scipy-2009, late September at the
earliest. But as I say, it is open for discussion. What schedule would
you prefer?

Chuck

