[Numpy-discussion] Floating Point Difference between numpy and numarray
David Cournapeau
cournape@gmail....
Tue Sep 9 01:25:19 CDT 2008
On Tue, Sep 9, 2008 at 6:55 AM, Sebastian Stephan Berg
<sebastian@sipsolutions.net> wrote:
> Yeah, memory-wise it doesn't matter to sum to a double, but just
> trying it out, it seems that mixing float and double is very slow
Yes, the memory argument explains why you would use float32 data
rather than float64 data, not why you would choose the accumulator
type (that is certainly what Matthieu meant). Using a float64
accumulator on float32 data means you care less about speed than about
memory, and are willing to trade some speed for a smaller footprint.
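
As a minimal sketch of that trade-off (the array size and values are
made up for illustration), the data can stay in float32 while only the
running sum is kept in float64, using the dtype argument of sum:

```python
import numpy as np

# Hypothetical data: stored as float32, i.e. 4 bytes per item instead of 8.
data = np.ones(1_000_000, dtype=np.float32)

# Ask for a float64 accumulator without converting the whole array first.
total = data.sum(dtype=np.float64)

print(data.nbytes)         # storage used by the float32 array, in bytes
print(total.dtype, total)  # the result is a float64 scalar
```

The array itself is never stored as float64; only the accumulator (and
the result) is double precision.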
> (at least on my comp) while if the starting array is already double
> there is almost no difference for summing. Generally double precision
> calculations should be slower though. Don't extensions like SSE2 operate
> either on 2 doubles or 4 floats at once and thus should be about twice
> as fast for floats? For add/multiply, at least, I do see this
> behaviour.
We don't use SSE and co in numpy, and I doubt the compilers (even
Intel's) are able to generate effective SSE code for numpy ATM.
Actually, double and float are about the same speed on x86 (using the
x87 FPU and not the SSE units), because internally the registers are
80 bits wide when doing computation. The real differences are the
memory pressure induced by double (8 bytes per item, against 4 for
float), and certain operations where, for a reason I don't understand,
float is faster: log, sin and co are as fast for float and double
using the FPU, but sqrt and divide are twice as fast for float, for
example.
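
A rough way to check the per-operation claim on a given machine is
sketched below. The helper name bench, the array size and the choice
of np.reciprocal as a stand-in for divide are assumptions made for
illustration. The numbers depend heavily on the CPU, the compiler and
the numpy version, so they may not match the ratios quoted above.

```python
import timeit
import numpy as np

n = 200_000
x32 = np.linspace(1.0, 2.0, n).astype(np.float32)
x64 = x32.astype(np.float64)

def bench(func, arr, repeat=5):
    # Best-of-N wall time (seconds) for one pass of func over arr.
    return min(timeit.repeat(lambda: func(arr), number=1, repeat=repeat))

operations = [
    ("log", np.log),
    ("sin", np.sin),
    ("sqrt", np.sqrt),
    ("divide", np.reciprocal),  # 1/x, used here as a simple division test
]

timings = {}
for name, func in operations:
    timings[name] = (bench(func, x32), bench(func, x64))
    t32, t64 = timings[name]
    print(f"{name:>6}: float32 {t32 * 1e3:.3f} ms, float64 {t64 * 1e3:.3f} ms")
```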
I don't have any clear explanation for why mixing float with a double
accumulator would be slow; maybe the default code to convert float to
double - as generated by the compiler - is bad. Or maybe numpy does
something funny.
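
To see whether the mixed case is slow on a particular setup, a small
timing sketch such as the following can help. The helper best_time,
the array size and the repeat counts are arbitrary choices for
illustration, and it only measures the effect; it says nothing about
the cause.

```python
import timeit
import numpy as np

n = 200_000
a32 = np.random.default_rng(0).random(n, dtype=np.float32)
a64 = a32.astype(np.float64)

def best_time(fn, repeat=5, number=20):
    # Best average time (seconds) per call over several repeats.
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number

cases = {
    "float32 data, float32 accumulator": lambda: a32.sum(dtype=np.float32),
    "float32 data, float64 accumulator": lambda: a32.sum(dtype=np.float64),
    "float64 data, float64 accumulator": lambda: a64.sum(dtype=np.float64),
}

results = {label: best_time(fn) for label, fn in cases.items()}
for label, seconds in results.items():
    print(f"{label}: {seconds * 1e6:.1f} us per sum")
```

If the second case is much slower than the other two, the conversion
from float to double is the likely place to look.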
cheers,
David