[SciPy-dev] [SciPy-user] some benchmark data for numarray, Numeric and scipy-newcore
Charles R Harris
charlesr.harris at gmail.com
Sun Dec 4 22:03:19 CST 2005
> In short, numarray is doing a better job of handling the memory for the
> misbehaved cases and we could learn something from that.
Sounds like cache might be coming into play here. Using the stack pointer
can also help on the Intel architecture: moving data to the stack often
seems to give a speedup. As for the cache, I have seen speedups of 2x to 5x
just from working in chunks small enough (under 16 KB or so) to fit in cache.
Unrolling the innermost loop of the ufunc might also help, by which I don't
really mean unrolling so much as simply writing efficient, explicit C code.
Of course, I haven't yet looked at how you implemented these things, so I am
just tossing out ideas and maybe making a fool of myself.
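
To make the chunking idea concrete, here is a rough sketch of evaluating a * b + c one cache-sized piece at a time, so that the temporary holding a * b is still in cache when c is added to it. This is only an illustration, not how numarray or scipy-newcore actually do it: it uses present-day NumPy as a stand-in, the function name chunked_multiply_add and the chunk_bytes parameter are made up for the example, and the 16 KB default is just the rough figure mentioned above. Whether it helps at all depends on the machine and the array sizes.

```python
import numpy as np


def chunked_multiply_add(a, b, c, chunk_bytes=16 * 1024):
    """Compute a * b + c over 1-D arrays, one small chunk at a time.

    Hypothetical helper for illustration only. chunk_bytes is the size
    of one chunk of one array; the working set per step is several
    times that (a, b, c, the scratch buffer and the output slice).
    """
    a, b, c = (np.asarray(x, dtype=np.float64) for x in (a, b, c))
    if not (a.ndim == 1 and a.shape == b.shape == c.shape):
        raise ValueError("expected three 1-D arrays of equal length")

    out = np.empty_like(a)
    # Number of elements per chunk, at least one.
    step = max(1, chunk_bytes // a.itemsize)
    # Scratch buffer reused for every chunk, so no full-size temporary
    # for a * b is ever allocated.
    tmp = np.empty(step, dtype=a.dtype)

    for start in range(0, a.size, step):
        stop = min(start + step, a.size)
        t = tmp[: stop - start]
        # Multiply into the scratch buffer, then add c straight into
        # the matching slice of the result.
        np.multiply(a[start:stop], b[start:stop], out=t)
        np.add(t, c[start:stop], out=out[start:stop])
    return out
```

The result should match the plain expression a * b + c; the only difference is the order in which memory is touched. Timing both versions on large arrays, and trying a few values of chunk_bytes, would show whether cache effects matter for a given case.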