[Numpy-discussion] Proposal for new ufunc functionality

Warren Weckesser warren.weckesser@enthought....
Mon Apr 12 17:54:52 CDT 2010


Robert Kern wrote:
> On Mon, Apr 12, 2010 at 17:26, Travis Oliphant <oliphant@enthought.com> wrote:
>   
>> On Apr 11, 2010, at 2:56 PM, Anne Archibald wrote:
>>
>> 2010/4/10 Stéfan van der Walt <stefan@sun.ac.za>:
>>
>> On 10 April 2010 19:45, Pauli Virtanen <pav@iki.fi> wrote:
>>
>> Another addition to ufuncs that should be thought about is specifying the
>> Python-side interface to generalized ufuncs.
>>
>> This is an interesting idea; what do you have in mind?
>>
>> I can see two different kinds of answer to this question: one is a
>> tool like vectorize/frompyfunc that allows construction of generalized
>> ufuncs from python functions, and the other is thinking out what
>> methods and support functions generalized ufuncs need.
>>
>> The former would be very handy for prototyping gufunc-based libraries
>> before delving into the templated C required to make them actually
>> efficient.
>>
>> The latter is more essential in the long run: it'd be nice to have a
>> reduce-like function, but obviously only when the arity and dimensions
>> work out right (which I think means (shape1,shape2)->(shape2) ). This
>> could be applied along an axis or over a whole array. reduceat and the
>> other, more sophisticated, schemes might also be worth supporting. At
>> a more elementary level, gufunc objects should have good introspection
>> - docstrings, shape specification accessible from python, named formal
>> arguments, et cetera. (So should ufuncs, for that matter.)
>>
>> We should collect all of these proposals into a NEP. To clarify what I
>> mean by "group-by" behavior:
>> Suppose I have an array of floats and an array of integers. Each element
>> in the array of integers marks the corresponding element of the float
>> array as being of a certain "kind". The reduction should take place over
>> like-kind values:
>> Example:
>> add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,2,0,0,2,2])
>> results in the calculations:
>> 1 + 3 + 6 + 7
>> 2 + 4
>> 5 + 8 + 9
>> and therefore the output (notice the two arrays --- perhaps a structured
>> array should be returned instead...)
>> [0,1,2],
>> [17, 6, 22]
>>
>> The real value is when you have tabular data and you want to do reductions
>> in one field based on values in another field.   This happens all the time
>> in relational algebra and would be a relatively straightforward thing to
>> support in ufuncs.
>>     
>
> I might suggest a simplification where the by array must be an array
> of non-negative ints such that they are indices into the output. For
> example (note that I replace 2 with 3 and have no 2s in the by array):
>
> add.reduceby(array=[1,2,3,4,5,6,7,8,9], by=[0,1,0,1,3,0,0,3,3]) ==
> [17, 6, 0, 22]
>
> This basically generalizes bincount() to other binary ufuncs.
>
>   
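
For reference, both quoted forms of the proposed reduceby can be approximated in present-day NumPy. The sketch below is only an illustration: `reduceby` here is a hypothetical helper, not an existing ufunc method, and it relies on `ufunc.at`, which was added to NumPy after this thread. It assumes `by` is a non-empty array of non-negative ints and that the caller supplies the ufunc's identity as `initial`.

```python
import numpy as np

def reduceby(ufunc, values, by, initial):
    """Hypothetical helper: reduce `values` with a binary ufunc, grouped by
    the non-negative integer labels in `by` (used as output indices)."""
    values = np.asarray(values)
    by = np.asarray(by)
    # One output slot per possible index; slots never hit keep `initial`.
    out = np.full(by.max() + 1, initial, dtype=values.dtype)
    # Unbuffered in-place update: out[by[i]] = ufunc(out[by[i]], values[i])
    ufunc.at(out, by, values)
    return out

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Index form: `by` holds indices into the output, so the unused index 2
# keeps the initial value.
sums = reduceby(np.add, values, [0, 1, 0, 1, 3, 0, 0, 3, 3], initial=0)

# Group-by form: map arbitrary labels to dense indices first, then return
# the distinct keys alongside the reduced values.
keys, inverse = np.unique([0, 1, 0, 1, 2, 0, 0, 2, 2], return_inverse=True)
grouped = reduceby(np.add, values, inverse, initial=0)

# For addition only, bincount with weights gives the same sums (as floats).
via_bincount = np.bincount([0, 1, 0, 1, 3, 0, 0, 3, 3], weights=values)
```

With the inputs from the thread, `sums` is [17, 6, 0, 22], and `keys`, `grouped` are [0, 1, 2] and [17, 6, 22]. Other binary ufuncs need a different `initial` (1 for multiply, for instance).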


Generalizing `by` a bit further gives behavior like MATLAB's accumarray
(http://www.mathworks.com/access/helpdesk/help/techdoc/ref/accumarray.html),
which I partly cloned here:
[This would be a link to the scipy cookbook, but scipy.org is not 
responding.]
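
As a rough illustration of that generalization (this is not the cookbook code, and `accum_sketch` is a hypothetical name), each value can carry a full N-dimensional subscript into the output instead of a single index. The sketch assumes summation only, non-negative integer subscripts, and a NumPy recent enough to provide `np.add.at`.

```python
import numpy as np

def accum_sketch(subs, values, shape=None):
    """Hypothetical accumarray-like sum: row i of `subs` is the output
    subscript that values[i] is accumulated into."""
    subs = np.asarray(subs)
    values = np.asarray(values)
    if shape is None:
        # Smallest output shape that holds every subscript.
        shape = tuple(subs.max(axis=0) + 1)
    out = np.zeros(shape, dtype=values.dtype)
    # A tuple of per-axis index arrays addresses one output cell per value;
    # repeated cells are accumulated rather than overwritten.
    np.add.at(out, tuple(subs.T), values)
    return out

subs = [[0, 0], [0, 1], [1, 1], [0, 0], [1, 0]]
values = [10, 20, 30, 40, 50]
result = accum_sketch(subs, values)
```

Here `result` is the 2x2 array [[50, 20], [50, 30]]: the two values sent to cell (0, 0) are summed, and every other cell receives one value.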

Warren


