[Numpy-discussion] Object-oriented SIMD API for Numpy

Mathieu Blondel mathieu@mblondel....
Wed Oct 21 02:48:22 CDT 2009

Hi David,

On Wed, Oct 21, 2009 at 3:56 PM, David Cournapeau
<david@ar.media.kyoto-u.ac.jp> wrote:
> I am not sure how this could be applied to numpy case ? From what I can
> understand, this cannot be directly applied to python: the described
> changes are vm changes, and we cannot do anything at python vm level (I
> would guess the python vm to be too primitive to implement this kind of
> things anyway).

Yes, in Mono this is done through Just-In-Time compilation, so at the VM level.

The reason I thought of Numpy rather than Python itself is that Python's
support for vectors/matrices is limited and Numpy has more or less become
the standard for that in the Python world.

I saw the video of Peter Norvig's talk at the last Scipy conference, in
which he suggested merging Numpy into Python. The SIMD API would be an
argument in favor of this too, because of the possible interactions
between such a SIMD API and an array API.

> I don't see how the high level API at the assembly level (Mono.Simd)
> would work either: the overhead of python and numpy to deal with 4 or 8
> items in python would make this API useless from a speed POV.

My original idea was to write the code in C with Intel/AltiVec/NEON
intrinsics and to wrap it so that it can be called from Python. The SIMD
code would thus already be compiled, ready to be called from Python.
Like you said, there's a risk that the overhead of the call from Python
is bigger than the benefit of using SIMD instructions. An experiment
could be made with Vector4f alone to see whether it's worth continuing
with other types.
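
To make the idea concrete, here is a rough sketch of the kind of C
routine I have in mind. The name vec_add_f32 is made up for
illustration, and only the SSE path is written out (AltiVec and NEON
would each need their own branch). It takes whole buffers rather than a
single Vector4f, which is one possible way to spread the call overhead
over many elements; whether that is enough is exactly what the
experiment would have to show.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics, only when the compiler enables SSE */
#endif

/* Hypothetical helper: element-wise out[i] = a[i] + b[i] for n floats. */
void vec_add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* Four floats per iteration. Unaligned loads/stores are used so the
       caller does not have to guarantee 16-byte alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail, and the complete fallback when SSE is not compiled in. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

A Python wrapper would then pass the data pointers of two arrays plus
their length, so that a single call covers the whole array.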

> This is only my opinion (read other numpy dev may disagree), but I think
> that the numpy C code should be cleaned up before adding this kind of
> features: there is still too much coupling between the pure C core and
> the python machinery. Also, any use of SIMD code should be done at
> runtime IMHO (so that one binary can be used on multiple architectures),
> which has some issues on its own from a cross platform POV.
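
For what it's worth, runtime selection could look roughly like the
sketch below. It assumes GCC or Clang on x86: __builtin_cpu_supports
and the target attribute are compiler extensions, not standard C, which
illustrates the cross-platform issue you mention. The names select_add,
add_sse and add_scalar are hypothetical.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <xmmintrin.h>
#endif

typedef void (*add_fn)(const float *, const float *, float *, size_t);

/* Portable version, usable on every architecture. */
static void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

#ifdef HAVE_X86_DISPATCH
/* SSE version; the target attribute enables SSE for this function only,
   so the rest of the binary stays general-purpose. */
__attribute__((target("sse")))
static void add_sse(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i,
                      _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; ++i)          /* scalar tail */
        out[i] = a[i] + b[i];
}
#endif

/* Pick an implementation once, based on what the running CPU reports. */
add_fn select_add(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse"))
        return add_sse;
#endif
    return add_scalar;
}
```

The caller would store the result of select_add() once at import time
and go through that pointer afterwards.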

I recently used SIMD instructions for a project and I realized that
they cannot be activated in a standard Debian package, because the
package has to remain general-purpose. So people who want to benefit from
the speedup have to compile my project from source... I also see that
sometimes packages are available in different flavors (-msse,
