[Numpy-discussion] Optimization question for ufuncs
oliphant at ee.byu.edu
Fri Feb 4 14:23:08 CST 2005
I've been thinking lately about ufuncs and I would love to hear the
opinion of others.
I like what numarray has done with the temporary buffer ideas so that
full copies are never made if they are just going to be thrown away.
This has led to other thoughts about possible improvements to the ufunc
object to support "ufunc chaining" so that array operations on
expressions don't have to create any temporary copies (using buffers
instead) --- I think I remember the numarray guys thinking along these
lines as well.
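As a rough illustration of the buffer idea, here is a minimal C sketch of how an expression like a*b + c could be evaluated chunk by chunk through a small fixed-size buffer instead of a full-size temporary array. The function name mul_add_chunked and the CHUNK size are made up for this example; it is not taken from the numarray or Numeric sources.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 256  /* hypothetical buffer length, small enough to stay in cache */

/* Hypothetical helper: computes out[i] = a[i] * b[i] + c[i] for i in [0, n)
 * without allocating an n-element temporary for the product a * b. */
void mul_add_chunked(const double *a, const double *b, const double *c,
                     double *out, size_t n)
{
    double buf[CHUNK];  /* reused for every chunk */

    assert(n == 0 || (a && b && c && out));

    for (size_t start = 0; start < n; start += CHUNK) {
        size_t len = (n - start < CHUNK) ? (n - start) : CHUNK;

        /* first "ufunc" in the chain: multiply into the buffer */
        for (size_t i = 0; i < len; i++)
            buf[i] = a[start + i] * b[start + i];

        /* second "ufunc" in the chain: add, reading from the buffer */
        for (size_t i = 0; i < len; i++)
            out[start + i] = buf[i] + c[start + i];
    }
}
```

The intermediate product only ever occupies CHUNK doubles, however large the arrays are.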
Regardless, there is always an inner for loop (for each type) that
performs the requested operation. The question I have is whether to
assume unit strides for the inner loop. The current Numeric ufunc inner
loops allow for discontiguous memory to be accessed during the loop
(non-unit strides). I'm not sure what numarray does; I think it only
allows for unit strides and uses temporary buffers to support
discontiguous memory.
Is this requirement for unit strides on the inner loop a good one? Does
it allow faster code to be compiled? Is it part of the reason that
numarray is a little faster on large arrays?
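To make the question concrete, the two kinds of inner loop might look roughly like the following C sketch. The names add_strided and add_contig are invented for this example, and the signatures are simplified; this is not the actual Numeric or numarray code. The strided version steps each pointer by a byte stride, so it can walk discontiguous memory directly. The unit-stride version indexes plain arrays, which is the form compilers generally find easiest to optimize. Whether the difference matters in practice would have to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Strided inner loop: each operand advances by its own stride in bytes,
 * so the data does not need to be contiguous. */
void add_strided(const char *a, ptrdiff_t stride_a,
                 const char *b, ptrdiff_t stride_b,
                 char *out, ptrdiff_t stride_out, size_t n)
{
    assert(n == 0 || (a && b && out));

    for (size_t i = 0; i < n; i++) {
        *(double *)out = *(const double *)a + *(const double *)b;
        a += stride_a;
        b += stride_b;
        out += stride_out;
    }
}

/* Unit-stride inner loop: all operands are assumed contiguous and
 * non-overlapping (hence restrict, which needs C99 or later). */
void add_contig(const double *restrict a, const double *restrict b,
                double *restrict out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

With the unit-stride form, a discontiguous operand would first have to be copied into a contiguous buffer, so the cost of that copy has to be weighed against any speedup in the loop itself.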
I am not an optimization expert, though I've read a bit as of late. I'm
just wondering what the experts on this list think about unit strides
versus non-unit strides on the inner loop.