[Numpy-discussion] Help speeding up element-wise operations for video processing

David Cournapeau david@ar.media.kyoto-u.ac...
Tue Sep 16 20:47:28 CDT 2008

Brendan Simons wrote:
> Why would I need the GPU to do parallel operations?  I thought most
> modern processors have vector units.  I just don't know if there's a
> way to have my code use them.

Yes, modern CPUs have vector units, but to use them efficiently you have
to use assembly or specially written routines, which are architecture
dependent (in most cases, compilers still do not auto-vectorize code
that could be vectorized). NumPy itself does not use routines specially
optimized for vector units, except through BLAS/LAPACK.
You certainly can't use the vector units efficiently from simple python
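
To make the BLAS exception concrete, here is a rough sketch, not a benchmark. The frame shape, the luma-style weights, and the function names are all made up for illustration. It computes the same per-pixel weighted sum twice: once with a plain Python loop, and once rewritten as a matrix-vector product with np.dot. Whether np.dot actually ends up in an optimized, vectorized BLAS routine depends on how your NumPy was built, so treat any speedup as something to measure on your own machine.

```python
import numpy as np


def to_gray_loop(frame, weights):
    """Weighted channel sum with a pure-Python loop over every pixel."""
    h, w, _ = frame.shape
    out = np.empty((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            out[i, j] = (frame[i, j, 0] * weights[0]
                         + frame[i, j, 1] * weights[1]
                         + frame[i, j, 2] * weights[2])
    return out


def to_gray_dot(frame, weights):
    """Same computation expressed as one matrix-vector product."""
    h, w, c = frame.shape
    # Flatten to (pixels, channels) so np.dot sees a 2-D by 1-D product,
    # which NumPy can hand to BLAS if it was built against one.
    return np.dot(frame.reshape(-1, c), weights).reshape(h, w)


# Hypothetical small test frame (height 48, width 64, 3 channels).
frame = np.random.RandomState(0).rand(48, 64, 3)
weights = np.array([0.299, 0.587, 0.114])  # illustrative weights only

gray_loop = to_gray_loop(frame, weights)
gray_dot = to_gray_dot(frame, weights)
```

Both versions give the same result up to rounding. The only point is that the second form gives NumPy the chance to call a BLAS routine instead of going through the interpreter once per pixel.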



