[Numpy-discussion] SSEPlus + Framewave

David Cournapeau cournape@gmail....
Mon Aug 11 07:17:40 CDT 2008

On Mon, Aug 11, 2008 at 5:17 AM, Holger Rapp <Rapp@mrt.uka.de> wrote:
> I realize that it was not really feasible to support a proprietary
> library like the IPP in a beautifully crafted Open Source Project, but
> quite recently, AMD came up with two very interesting projects
> (SSEPlus and Framewave, links provided below) which are more or less a
> direct response to Intel's IPP. And the best part: they are open source
> (under an Apache license, afaik). My question is now: Is it intended/Is
> there interest to get this performance gain into numpy? Are there any
> political restrictions (license/project identity)? Is there already
> work underway?

Hi Holger,

Thank you for this information. I would say that in general, we are
really interested in this kind of thing in numpy/scipy (SSE
optimization, etc.), but as always, manpower is short. So to answer
your questions quickly:
 - Is there any interest? Definitely.
 - Are there "political" restrictions? There is the restriction that
any contribution to numpy/scipy has to be BSD. AFAIK, the Apache license
falls into the BSD camp. Otherwise, what you are proposing is not against
any policy I am aware of (implicit or explicit).

> I for one would consider helping in an effort like that, because it
> would probably save me time in the long run.


> (Sidenote: I'm aware that this optimization would only help Intel/AMD
> boxes, but hardware acceleration is so common these days that it is a
> shame NOT to use it in a number-crunching library.)

This is not a problem. x86 is pervasive, and is here to stay; I think
most numpy/scipy users fall into the x86 users category, so it
definitely makes sense to do it. Reasonable x86-only optimization will
be accepted (as long as it does not break the other architectures, of
course).

I will take a deeper look at those libraries, but my own opinion
(which I don't claim represents any consensus of numpy developers at
large) is that for this kind of task to be effective and "pervasive",
we need some code infrastructure which is not there yet in numpy. In
particular, I would like to separate the purely computational part of
numpy from the "boilerplate", so that from an architectural POV, any
optimization would be done in the computational part only. There are
also build/deployment issues with SSE, but using an external library
(which hopefully takes care of it) would alleviate most problems in
that category.

But if you have code that works now and uses those two libraries,
provide a patch on the numpy trac (http://projects.scipy.org/scipy/numpy).


