[Numpy-discussion] Why is numpy.abs so much slower on complex64 than complex128 under windows 32-bit?
Henry Gomersall
heng@cantab....
Tue Apr 10 09:55:53 CDT 2012
On 10/04/2012 16:36, Francesc Alted wrote:
> In [10]: timeit c = numpy.complex64(numpy.abs(numpy.complex128(b)))
> 100 loops, best of 3: 12.3 ms per loop
>
> In [11]: timeit c = numpy.abs(b)
> 100 loops, best of 3: 8.45 ms per loop
>
> in your windows box and see if they raise similar results?
>
No, the results are much the same as before: ~40 ms for the first
(upcast/downcast) case and ~150 ms for the direct case (both *much*
slower than yours!). This compares with ~28 ms when operating directly
on double precision.
I'm using numexpr in the end, but this is slower than numpy.abs under Linux.
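
For anyone wanting to reproduce the comparison, here is a minimal sketch
of the two cases being timed. The array size and contents are
assumptions (the original array is not shown in this message), the
helper names are made up for illustration, and the downcast target is
float32 because the magnitude is real-valued. Absolute timings depend
on platform and build.

```python
import timeit

import numpy as np

# Assumed test data: the size and contents of the original array `b`
# are not given in this message.
rng = np.random.default_rng(0)
n = 100_000
b = (rng.standard_normal(n) + 1j * rng.standard_normal(n)).astype(np.complex64)


def abs_direct(x):
    """Magnitude computed directly on the complex64 array."""
    return np.abs(x)


def abs_via_double(x):
    """Upcast to complex128, take the magnitude, downcast to float32."""
    return np.abs(x.astype(np.complex128)).astype(np.float32)


def best_ms(fn, x, repeat=3, number=10):
    """Best-of-`repeat` time per call in milliseconds, as timeit reports."""
    return 1e3 * min(timeit.repeat(lambda: fn(x), repeat=repeat, number=number)) / number


print("direct     : %.3f ms" % best_ms(abs_direct, b))
print("via double : %.3f ms" % best_ms(abs_via_double, b))
```

If the direct case comes out several times slower than the upcast case,
that matches what I see on the 32-bit Windows build.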
>> On a related note of confusion, the times above are notably (and
>> consistently) different from (shorter than) those I get doing a naive
>> `st = time.time(); numpy.abs(a); print time.time()-st`. Is this to be
>> expected?
>>
>
> This happens a lot, yes, especially when your code is
> memory-bottlenecked (a very common situation). The explanation is
> simple: when your datasets are small enough to fit in CPU cache, the
> first time the timing loop runs, it brings all your working set into
> cache, so the second time the computation is evaluated, it does
> not have to fetch data from memory, and by the time you run the loop
> 10 times or more, you are discarding any memory effect. However, when
> you run the loop only once, you are including the memory fetch time
> too (which is often much more realistic).
Ah, that makes sense. Thanks!
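
For the record, a rough sketch of the two measurements being compared:
one single-shot timing and one best-of-several timing of the same call.
The array size is an assumption, chosen to be small enough that it may
fit in cache, and the helper names are made up for illustration. A
freshly created array may already be partly in cache, so the gap
between the two numbers will vary from machine to machine.

```python
import time
import timeit

import numpy as np

# Assumed working set: 10,000 complex128 values is about 160 KB.
a = np.ones(10_000, dtype=np.complex128)


def single_shot(x):
    """Time one call, including any cost of fetching data from memory."""
    start = time.perf_counter()
    np.abs(x)
    return time.perf_counter() - start


def best_of(x, repeat=3, number=10):
    """Best average time per call over repeated runs, as timeit reports."""
    return min(timeit.repeat(lambda: np.abs(x), repeat=repeat, number=number)) / number


print("single shot: %.1f us" % (1e6 * single_shot(a)))
print("best of 3  : %.1f us" % (1e6 * best_of(a)))
```

The best-of figure reflects the warm-cache case, while the single-shot
figure is closer to what code that touches the data only once would see.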
Cheers,
Henry