[Numpy-discussion] numpy ufuncs and COREPY - any info?
Fri May 22 07:33:18 CDT 2009
On Friday 22 May 2009 13:52:46 Andrew Friedley wrote:
> (sending again)
> I'm the student doing the project. I have a blog here, which contains
> some initial performance numbers for a couple test ufuncs I did:
> It's really too early yet to give definitive results though; GSoC
> officially starts in two days :) What I'm finding is that the existing
> ufuncs are already pretty fast; it appears right now that the main
> limitation is memory bandwidth. If that's really the case, the
> performance gains I'll get will be through cache tricks (non-temporal
> loads/stores), reducing memory accesses and using multiple cores to get
> more bandwidth.
> Another alternative we've talked about, and I (more and more likely) may
> look into is composing multiple operations together into a single ufunc.
> Again the main idea being that memory accesses can be reduced/eliminated.
IMHO, composing multiple operations together is the most promising avenue for
leveraging current multicore systems.
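
To make the idea of composing operations concrete, here is a minimal sketch
in plain NumPy. It is only an illustration, not what the GSoC project
implements: the function name fused_multiply_add and the block size are
hypothetical, and it assumes 1-D arrays of the same shape and dtype. Instead
of computing a*b over the whole array (which writes a full-size temporary to
memory) and then adding c in a second pass, it works block by block, so the
intermediate result stays small enough to remain in cache.

```python
import numpy as np

def fused_multiply_add(a, b, c, block=4096):
    """Compute a*b + c block by block (hypothetical helper).

    Assumes a, b and c are 1-D arrays with the same shape and dtype.
    """
    out = np.empty_like(a)
    # Small scratch buffer reused for every block; it replaces the
    # full-size temporary that `a * b + c` would allocate.
    tmp = np.empty(block, dtype=a.dtype)
    for start in range(0, a.size, block):
        stop = min(start + block, a.size)
        t = tmp[:stop - start]
        np.multiply(a[start:stop], b[start:stop], out=t)
        np.add(t, c[start:stop], out=out[start:stop])
    return out

a = np.arange(10000, dtype=np.float64)
b = np.full(10000, 0.5)
c = np.ones(10000)
result = fused_multiply_add(a, b, c)
```

Whether this is actually faster than the naive expression depends on array
size, block size and the machine; the point is only that each element of the
intermediate is consumed while it is still in cache.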
Another interesting approach is to implement costly operations (from the point
of view of CPU resources), namely transcendental functions like sin, cos or
tan, but also others like sqrt or pow, in a parallel way. If, besides, you can
combine this with vectorized versions of them (by using the widespread SSE2
instruction set, see  for an example), then you should be able to achieve
really good results (at least Intel did with its VML library ;)
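
As a rough sketch of the "parallel way" for a costly function, the snippet
below splits an array into chunks and evaluates np.sin on each chunk in its
own thread. The name parallel_sin and the default of 4 workers are
hypothetical, and it assumes a 1-D floating-point array. It also relies on
NumPy releasing the GIL while running the ufunc inner loop, which I believe
holds for the standard math ufuncs but should be checked on your version. It
says nothing about SSE2 vectorization, which has to happen inside the inner
loop itself.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_sin(x, n_workers=4):
    """Evaluate sin(x) chunk by chunk in a thread pool (hypothetical helper).

    Assumes x is a 1-D floating-point array.
    """
    out = np.empty_like(x)
    # Chunk boundaries: n_workers contiguous, roughly equal slices.
    bounds = np.linspace(0, x.size, n_workers + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Each thread writes into its own disjoint slice of `out`.
        np.sin(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(n_workers)))
    return out

x = np.linspace(0.0, 10.0, 100001)
y = parallel_sin(x)
```

Any speedup will depend on the number of cores and on how expensive the
function is compared with the memory traffic, which is why this pays off more
for sin or pow than for a simple addition.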