[Numpy-discussion] NEP for faster ufuncs

Mark Wiebe mwwiebe@gmail....
Wed Dec 22 12:52:45 CST 2010


On Wed, Dec 22, 2010 at 10:41 AM, Francesc Alted <faltet@pytables.org> wrote:

> NumPy version 2.0.0.dev-147f817
>

There's your problem: it looks like your PYTHONPATH isn't picking up the new
build for some reason.  That build is from this commit on the NumPy master
branch:

https://github.com/numpy/numpy/commit/147f817eefd5efa56fa26b03953a51d533cc27ec

> > The reason I think it might help with 'luf' is that it's
> > calculating the expression on smaller sized arrays, which possibly
> > just got buffered. If the memory allocator for the temporaries keeps
> > giving back the same addresses, all this will be in one of the
> > caches very close to the CPU. Unless this cache is still too slow to
> > feed the SSE instructions, there should be a speed benefit.  The
> > ufunc inner loops could also use the SSE prefetch instructions based
> > on the stride to give some strong hints about where the next memory
> > bytes to use will be.
>
> Ah, okay.  However, Numexpr is not meant to accelerate calculations with
> small operands.  I suppose that this is where your new iterator makes
> more sense: accelerating operations where some of the operands are small
> (i.e. fit in cache) and have to be broadcast to match the
> dimensionality of the others.
>

It's not about small operands, but small chunks of the operands at a time,
with temporary arrays for intermediate calculations.  It's the small chunks
+ temporaries which must fit in cache to get the benefit, not the whole
array.  The numexpr front page explains this fairly well in the section "Why
It Works":

http://code.google.com/p/numexpr/#Why_It_Works
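To make the chunking idea concrete, here is a minimal sketch of blocked evaluation in plain NumPy.  The expression, the blocksize, and the helper name are arbitrary illustrations, not numexpr's actual machinery; the point is only that each block's temporaries are blocksize elements, not the full array, so they can stay resident in a cache close to the CPU while the expression is evaluated:

```python
import numpy as np

def chunked_eval(a, b, c, blocksize=4096):
    """Evaluate 2*a + 3*b - c block by block, numexpr-style.

    Hypothetical helper for illustration: each pass over a block
    creates temporaries of at most `blocksize` elements instead of
    full-array temporaries, so intermediates fit in cache.
    """
    out = np.empty_like(a)
    for start in range(0, a.size, blocksize):
        s = slice(start, start + blocksize)
        # The temporaries 2*a[s] and 3*b[s] are small; with a warm
        # allocator they tend to land at the same addresses each pass.
        out[s] = 2 * a[s] + 3 * b[s] - c[s]
    return out

a = np.linspace(0.0, 1.0, 100_000)
b = np.linspace(1.0, 2.0, 100_000)
c = np.linspace(2.0, 3.0, 100_000)
# Same result as the naive expression, which materializes
# three full-size temporary arrays along the way.
assert np.allclose(chunked_eval(a, b, c), 2 * a + 3 * b - c)
```

Whether the blocked version actually wins depends on the array sizes and the cache hierarchy, as discussed above; for operands that already fit in cache there is nothing to gain.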

-Mark