Fri May 22 06:52:46 CDT 2009

I'm the student doing the project.  I have a blog here, which contains 
some initial performance numbers for a couple of test ufuncs I wrote:


It's still too early to give definitive results, though; GSoC 
officially starts in two days :)  What I'm finding is that the existing 
ufuncs are already pretty fast; right now the main limitation appears 
to be memory bandwidth.  If that's really the case, the performance 
gains I get will come from cache tricks (non-temporal loads/stores), 
reducing memory accesses, and using multiple cores to get more 
bandwidth.

Another alternative we've talked about, and one I'm increasingly likely 
to look into, is composing multiple operations into a single ufunc.  
Again, the main idea is that memory accesses can be reduced or 
eliminated, since intermediate results need not be written out to 
memory and read back.
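
To make the memory-access argument concrete, here is a rough sketch in 
plain Python (standard library only).  It is not CorePy or NumPy code; 
the function names and the a*b + c example are made up for 
illustration, and the per-element loops only stand in for what a 
generated ufunc loop would do.  It compares doing the two operations as 
separate passes, with a full-size temporary, against one composed pass:

```python
from array import array


def two_pass(a, b, c):
    """Compute a*b + c as two separate elementwise operations.

    This mirrors evaluating two ufuncs back to back: the product is
    written out to a full-size temporary and then read back in.
    """
    tmp = array('d', (x * y for x, y in zip(a, b)))     # read a, b; write tmp
    return array('d', (t + z for t, z in zip(tmp, c)))  # read tmp, c; write out


def fused(a, b, c):
    """Compute a*b + c in a single pass with no temporary array."""
    return array('d', (x * y + z for x, y, z in zip(a, b, c)))  # read a, b, c; write out


def element_traffic(n, composed):
    """Count array element reads plus writes for inputs of length n.

    Two passes: (2 reads + 1 write) + (2 reads + 1 write) = 6 per element.
    Composed:    3 reads + 1 write                        = 4 per element.
    """
    return (4 if composed else 6) * n


if __name__ == "__main__":
    a = array('d', [1.0, 2.0, 3.0])
    b = array('d', [4.0, 5.0, 6.0])
    c = array('d', [0.5, 0.5, 0.5])
    print(list(two_pass(a, b, c)))
    print(list(fused(a, b, c)))
    print(element_traffic(1000000, composed=False))
    print(element_traffic(1000000, composed=True))
```

Both versions give the same values; the composed one touches a third 
fewer array elements by this count.  If the loop really is limited by 
memory bandwidth, that reduction is where a gain would come from, 
though how much of it shows up in practice is something only 
measurement can tell.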


dmitrey wrote:
> hi all,
> has anyone already tried to compare using an ordinary numpy ufunc vs
> that one from corepy, first of all I mean the project
> http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235
> It would be interesting to know what is speedup for (eg) vec ** 0.5 or
> (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or
> any another example.
