[Numpy-discussion] SSEPlus + Framewave
Wed Aug 13 03:28:54 CDT 2008
> The problem is not so much the build part, but the clear separation I
> was talking about. My experience with ATLAS convinced me the only way
> to make sse work reliably is to detect the CPU arch at runtime;
> compiling binaries incompatible on different arch is just not scalable
> and confuses users.
What do you mean by compiling incompatible binaries? It is my understanding
that (for example) Framewave (but also IPP) comes in different flavors
(32-bit, 64-bit), which of course must be selected at compile time.
But detecting which CPU is available and which features it offers is indeed
done at runtime (Framewave: fwStaticInit()); the choice of which assembler
code to use for each operation is then up to Framewave.

I do not consider it a good idea to write our own dispatcher library
into numpy to choose which opcodes to use.
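
To make the runtime-dispatch idea concrete, here is a minimal sketch in Python of the general pattern: detect CPU features once at startup, then route every call through whichever implementation was selected. All names here (static_init, add_sse2, add_generic, the "sse2" feature string) are hypothetical and for illustration only; this is not Framewave's or IPP's actual API, and the "optimized" kernel is just a stand-in.

```python
# Hypothetical sketch of one-time runtime dispatch.
# None of these names come from Framewave, IPP, or numpy.

def add_generic(a, b):
    """Portable fallback that works on any CPU."""
    return [x + y for x, y in zip(a, b)]


def add_sse2(a, b):
    """Stand-in for an SSE2-optimized kernel (same result, different code path)."""
    return [x + y for x, y in zip(a, b)]


# Candidate kernels, ordered from most to least specialized.
# Each entry is (required CPU feature or None, implementation).
_CANDIDATES = [("sse2", add_sse2), (None, add_generic)]

_add_impl = None  # chosen once by static_init()


def static_init(cpu_features):
    """Select the best kernel for the given set of detected CPU features.

    Returns the name of the chosen implementation.
    """
    global _add_impl
    for required, impl in _CANDIDATES:
        if required is None or required in cpu_features:
            _add_impl = impl
            return impl.__name__
    raise RuntimeError("no usable implementation found")


def add(a, b):
    """Public entry point: callers never see which kernel is used."""
    if _add_impl is None:
        static_init(frozenset())  # assume no special features if uninitialized
    return _add_impl(a, b)
```

The point of the pattern is that the binary stays the same on every machine of a given word size; only the internal function pointer differs, and that choice lives inside the library rather than in the code that calls it.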

Or do I get you completely wrong? Is your intention to make a plugin
architecture, in the sense of: copy some directory with libs and config
into your site-packages and then your multiplications are much faster? I
would consider such a framework a bit overengineered, since fast
calculations are a nice feature for every numpy user.