[Numpy-discussion] Why is numpy.abs so much slower on complex64 than complex128 under windows 32-bit?
Tue Apr 10 11:43:14 CDT 2012
On 10/04/2012 17:57, Francesc Alted wrote:
>> I'm using numexpr in the end, but this is slower than numpy.abs under linux.
> Oh, you mean the windows version of abs(complex64) in numexpr is slower
> than a pure numpy.abs(complex64) under linux? That's weird, because
> numexpr has an independent implementation of the complex operations from
> NumPy machinery. Here is how abs() is implemented in numexpr:
> static void
> nc_abs(cdouble *x, cdouble *r)
> {
>     r->real = sqrt(x->real*x->real + x->imag*x->imag);
>     r->imag = 0;
> }
> [as I said, only the double precision version is implemented, so you
> have to add here the cost of the cast too]
hmmm, I can't seem to reproduce that assertion, so ignore it.
> Hmm, considering all of these facts, it might be that sqrtf() on windows
> is under-performing? Can you try this:
> In : a = numpy.linspace(0, 1, int(1e6))
> In : b = numpy.float32(a)
> In : timeit c = numpy.sqrt(a)
> 100 loops, best of 3: 5.64 ms per loop
> In : timeit c = numpy.sqrt(b)
> 100 loops, best of 3: 3.77 ms per loop
> and tell us the results for windows?
In : timeit c = numpy.sqrt(a)
100 loops, best of 3: 21.4 ms per loop
In : timeit c = numpy.sqrt(b)
100 loops, best of 3: 12.5 ms per loop
So, all sensible there it seems.
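The same comparison can be run outside IPython with the standard `timeit` module. The sketch below is one possible way to do it; the helper `best_ms` and the repeat counts are illustrative choices, not part of the original session, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

a = np.linspace(0, 1, 10**6)   # float64 input
b = a.astype(np.float32)       # same values in single precision

def best_ms(func, repeat=3, number=10):
    """Best per-call time in milliseconds over `repeat` runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number * 1e3

t64 = best_ms(lambda: np.sqrt(a))
t32 = best_ms(lambda: np.sqrt(b))
print(f"sqrt float64: {t64:.2f} ms   sqrt float32: {t32:.2f} ms")
```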
Taking this to the next stage...
In : a = numpy.random.randn(256,2048) + 1j*numpy.random.randn(256,2048)
In : b = numpy.complex64(a)
In : timeit numpy.sqrt(a*numpy.conj(a))
10 loops, best of 3: 61.9 ms per loop
In : timeit numpy.sqrt(b*numpy.conj(b))
10 loops, best of 3: 27.2 ms per loop
In : timeit numpy.abs(a) # for comparison
10 loops, best of 3: 30 ms per loop
In : timeit numpy.abs(b) # and again (slow slow slow)
1 loops, best of 3: 153 ms per loop
I can confirm the results are correct. So the slowdown really is in numpy.abs on complex64.
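A standalone script along the following lines could be used to reproduce the comparison and to try alternatives to numpy.abs on complex64. It is a sketch only: the two alternatives (sqrt of the summed squares and numpy.hypot on the real and imaginary parts) are suggestions that were not timed in this thread, and whether they are any faster on a given platform is something to measure rather than assume.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 2048)) + 1j * rng.standard_normal((256, 2048))
b = a.astype(np.complex64)

# Each candidate returns a real-valued magnitude array.
candidates = {
    "abs(complex128)": lambda: np.abs(a),
    "abs(complex64)": lambda: np.abs(b),
    "sqrt(re^2 + im^2), complex64": lambda: np.sqrt(b.real**2 + b.imag**2),
    "hypot(re, im), complex64": lambda: np.hypot(b.real, b.imag),
}

timings = {}
for name, func in candidates.items():
    best = min(timeit.repeat(func, repeat=3, number=5)) / 5
    timings[name] = best * 1e3
    print(f"{name:32s} {timings[name]:8.2f} ms")

# Check that every candidate agrees with the double-precision result.
reference = np.abs(a)
max_err = {name: float(np.max(np.abs(func() - reference)))
           for name, func in candidates.items()}
```

The agreement check matters because the single-precision variants only match the double-precision result to roughly float32 accuracy.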
> PS: if you are using numexpr on windows, you may want to use the MKL
> linked version, which uses the abs of MKL, that should have considerably
> better performance.
I'd love to - I presume this would mean me buying an MKL license? If
not, where do I find the MKL linked version?