[Numpy-discussion] NEP for faster ufuncs
John Salvatier
jsalvati@u.washington....
Tue Dec 21 20:00:49 CST 2010
I applaud your vision. I have only one small suggestion: put a table of
contents at the beginning of your NEP so that people can skip to the part
that interests them most.
On Tue, Dec 21, 2010 at 4:59 PM, John Salvatier
<jsalvati@u.washington.edu> wrote:
> That is an amazing Christmas present.
>
> On Tue, Dec 21, 2010 at 4:53 PM, Mark Wiebe <mwwiebe@gmail.com> wrote:
>
>> Hello NumPy-ers,
>>
>> After some performance analysis, I've designed and implemented a new
>> iterator that speeds up ufuncs and makes multi-dimensional iteration
>> easier. The new code is fairly large, but works quite well already. If
>> some people could read the NEP and give some feedback, that would be great!
>> Here's a link:
>>
>>
>> https://github.com/m-paradox/numpy/blob/mw_neps/doc/neps/new-iterator-ufunc.rst
>>
>> I would also love it if someone could try building the code and play
>> around with it a bit. The github branch is here:
>>
>> https://github.com/m-paradox/numpy/tree/new_iterator
>>
>> To give a taste of the iterator's functionality, below is an example from
>> the NEP for how to implement a "Lambda UFunc." With just a few lines of
>> code, it's possible to replicate something similar to the numexpr library
>> (numexpr still gets a bigger speedup, though). In the example expression I
>> chose, execution time went from 138ms to 61ms.
>>
>> Hopefully this is a good Christmas present for NumPy. :)
>>
>> Cheers,
>> Mark
>>
>> Here is the definition of the ``luf`` function.::
>>
>>     def luf(lamdaexpr, *args, **kwargs):
>>         """Lambda UFunc
>>
>>             e.g.
>>             c = luf(lambda i,j:i+j, a, b, order='K',
>>                                 casting='safe', buffersize=8192)
>>
>>             c = np.empty(...)
>>             luf(lambda i,j:i+j, a, b, out=c, order='K',
>>                                 casting='safe', buffersize=8192)
>>         """
>>
>>         nargs = len(args)
>>         op = args + (kwargs.get('out',None),)
>>         it = np.newiter(op, ['buffered','no_inner_iteration'],
>>                 [['readonly','nbo_aligned']]*nargs +
>>                                 [['writeonly','allocate','no_broadcast']],
>>                 order=kwargs.get('order','K'),
>>                 casting=kwargs.get('casting','safe'),
>>                 buffersize=kwargs.get('buffersize',0))
>>         while not it.finished:
>>             it[-1] = lamdaexpr(*it[:-1])
>>             it.iternext()
>>
>>         return it.operands[-1]
>>
>> Then, by using ``luf`` instead of straight Python expressions, we
>> can gain some performance from better cache behavior.::
>>
>>     In [2]: a = np.random.random((50,50,50,10))
>>     In [3]: b = np.random.random((50,50,1,10))
>>     In [4]: c = np.random.random((50,50,50,1))
>>
>>     In [5]: timeit 3*a+b-(a/c)
>>     1 loops, best of 3: 138 ms per loop
>>
>>     In [6]: timeit luf(lambda a,b,c:3*a+b-(a/c), a, b, c)
>>     10 loops, best of 3: 60.9 ms per loop
>>
>>     In [7]: np.all(3*a+b-(a/c) == luf(lambda a,b,c:3*a+b-(a/c), a, b, c))
>>     Out[7]: True
>>
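For readers trying the quoted ``luf`` example on a released NumPy: the sketch below assumes the iterator is available as ``np.nditer`` rather than the branch's ``np.newiter``, that ``no_inner_iteration`` corresponds to the ``external_loop`` flag, and that ``nbo_aligned`` corresponds to the separate ``nbo`` and ``aligned`` operand flags. It is not the code from the branch above. The output operand is placed first here purely for convenience, and explicit keyword arguments replace ``**kwargs``.

```python
import numpy as np

def luf(lamdaexpr, *args, out=None, order='K', casting='safe', buffersize=0):
    """Lambda UFunc sketch: evaluate lamdaexpr chunk by chunk over the inputs.

    Assumes np.nditer with the 'external_loop', 'nbo' and 'aligned' flags
    as stand-ins for the np.newiter flags used in the quoted message.
    """
    nargs = len(args)
    # Output operand first; None asks the iterator to allocate it.
    op = (out,) + args
    with np.nditer(op,
                   flags=['buffered', 'external_loop'],
                   op_flags=[['writeonly', 'allocate', 'no_broadcast']] +
                            [['readonly', 'nbo', 'aligned']] * nargs,
                   order=order, casting=casting,
                   buffersize=buffersize) as it:
        for chunks in it:
            # chunks[0] is a 1-D view of the output buffer,
            # chunks[1:] are matching 1-D views of the inputs.
            chunks[0][...] = lamdaexpr(*chunks[1:])
        return it.operands[0]

a = np.random.random((5, 5, 5, 2))
b = np.random.random((5, 5, 1, 2))
c = np.random.random((5, 5, 5, 1)) + 0.5
result = luf(lambda a, b, c: 3*a + b - (a/c), a, b, c)
```

Because each chunk is a small buffer, the intermediate results of the expression stay small as well, which is the cache effect the quoted message describes. Actual timings will depend on the machine and the NumPy version.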