[Numpy-discussion] about SIMD (SSE2 & SSE3)
Sun Nov 7 02:39:27 CST 2010
2010/11/7 qihua wu <email@example.com>
> Thanks David,
> the Java program takes 3 hours to read the data; after reading the data into
> memory, it takes 4 hours to process/calculate something on all of it.
> The data is sales data containing both promoted and non-promoted sales,
> and the program needs to predict the non-promoted sales: the input is a
> series of promoted and non-promoted sales, and the output is a series of
> non-promoted sales. E.g.
> days 1, 2, 3 have non-promoted sales, day 4 has promoted sales, and days
> 5, 6, 7 have non-promoted sales; the output for days 1~7 is all
> non-promoted sales. During processing, we might need to sum all the data
> for days 1~7. Is this what you called "elementwise addition,
> multiplication", which can't be SIMDed in numpy?
There is little sense in implementing element-wise additions and
multiplications in SIMD, because these operations are memory bound on modern
computers. SIMD is only useful when you want to accelerate operations that are
CPU bound (e.g. evaluation of transcendental functions or matrix-matrix
operations).
You can get a better grasp on this limitation (I like to call it the
starving CPU problem) by having a look at this material:
It also includes exercises, so that you can do your own experiments.
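
As a rough back-of-the-envelope illustration of the memory-bound versus
CPU-bound distinction, the sketch below estimates how many floating-point
operations each kind of operation performs per byte of memory traffic. It is
not taken from the material mentioned above. The function names and the
traffic model are assumptions made for illustration only: float64 data, each
operand read from main memory once and each result written once, and perfect
cache reuse for the matrix product (a best case).

```python
# Rough arithmetic-intensity estimate: flops per byte of memory traffic.
# Assumptions (illustrative only): float64 elements, every operand is read
# from main memory once and every result is written once.

ITEMSIZE = 8  # bytes per float64 element


def intensity_elementwise_add(n):
    """c = a + b over n elements: n flops, 2 reads + 1 write per element."""
    flops = n
    bytes_moved = 3 * n * ITEMSIZE
    return flops / bytes_moved


def intensity_matmul(n):
    """C = A @ B for n x n matrices: about 2*n**3 flops.

    Best case: each of the three matrices crosses the memory bus once.
    """
    flops = 2 * n ** 3
    bytes_moved = 3 * n * n * ITEMSIZE
    return flops / bytes_moved


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, intensity_elementwise_add(n), intensity_matmul(n))
```

Under these assumptions the element-wise add stays at 1/24 flop per byte no
matter how large the arrays are, so the CPU mostly waits for data and faster
arithmetic does not help. The matrix product does about n/12 flops per byte,
which grows with n, so there is enough arithmetic per byte fetched for SIMD
to pay off.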