[SciPy-dev] scipy.fft module slow for complex inputs when linked to fftw3

David M. Cooke cookedm at physics.mcmaster.ca
Fri Aug 25 17:29:38 CDT 2006


On Aug 19, 2006, at 05:40 , David Cournapeau wrote:

> I build a patch anyway; I am not sure the format is OK, as I am not
> familiar with
> patch/diff options:
>
> Before patch (last numpy and scipy SVN),  scipy.fftpack.test(10)  
> gives:
>
>                  Fast Fourier Transform
> =================================================
>       |    real input     |   complex input
> -------------------------------------------------
>  size |  scipy  |  numpy  |  scipy  |  numpy
> -------------------------------------------------
>   100 |    0.17 |    0.14 |    1.93 |    0.11  (secs for 7000 calls)
>  1000 |    0.12 |    0.16 |    1.12 |    0.14  (secs for 2000 calls)
>   256 |    0.28 |    0.28 |    3.09 |    0.22  (secs for 10000 calls)
>   512 |    0.37 |    0.48 |    3.82 |    0.38  (secs for 10000 calls)
>  1024 |    0.05 |    0.09 |    0.53 |    0.07  (secs for 1000 calls)
>  2048 |    0.10 |    0.16 |    0.88 |    0.15  (secs for 1000 calls)
>  4096 |    0.08 |    0.17 |    0.75 |    0.17  (secs for 500 calls)
>  8192 |    0.20 |    0.53 |    1.39 |    0.47  (secs for 500 calls)
> ....
>        Inverse Fast Fourier Transform
> ===============================================
>       |     real input    |    complex input
> -----------------------------------------------
>  size |  scipy  |  numpy  |  scipy  |  numpy
> -----------------------------------------------
>   100 |    0.16 |    0.29 |    2.29 |    0.28  (secs for 7000 calls)
>  1000 |    0.13 |    0.27 |    1.22 |    0.26  (secs for 2000 calls)
>   256 |    0.29 |    0.57 |    3.43 |    0.51  (secs for 10000 calls)
>   512 |    0.41 |    0.84 |    4.14 |    0.77  (secs for 10000 calls)
>  1024 |    0.06 |    0.14 |    0.62 |    0.13  (secs for 1000 calls)
>  2048 |    0.10 |    0.26 |    0.91 |    0.25  (secs for 1000 calls)
>  4096 |    0.11 |    0.31 |    0.81 |    0.26  (secs for 500 calls)
>  8192 |    0.23 |    0.78 |    1.53 |    0.72  (secs for 500 calls)
> .......
>
> After patch (last numpy and scipy SVN),  scipy.fftpack.test(10) gives:
>
>                  Fast Fourier Transform
> =================================================
>       |    real input     |   complex input
> -------------------------------------------------
>  size |  scipy  |  numpy  |  scipy  |  numpy
> -------------------------------------------------
>   100 |    0.17 |    0.14 |    0.12 |    0.11  (secs for 7000 calls)
>  1000 |    0.12 |    0.16 |    0.10 |    0.14  (secs for 2000 calls)
>   256 |    0.28 |    0.28 |    0.20 |    0.20  (secs for 10000 calls)
>   512 |    0.36 |    0.47 |    0.27 |    0.36  (secs for 10000 calls)
>  1024 |    0.05 |    0.08 |    0.05 |    0.07  (secs for 1000 calls)
>  2048 |    0.09 |    0.16 |    0.08 |    0.14  (secs for 1000 calls)
>  4096 |    0.10 |    0.17 |    0.10 |    0.16  (secs for 500 calls)
>  8192 |    0.23 |    0.53 |    0.23 |    0.47  (secs for 500 calls)
> ....
>        Inverse Fast Fourier Transform
> ===============================================
>       |     real input    |    complex input
> -----------------------------------------------
>  size |  scipy  |  numpy  |  scipy  |  numpy
> -----------------------------------------------
>   100 |    0.17 |    0.29 |    0.14 |    0.26  (secs for 7000 calls)
>  1000 |    0.13 |    0.27 |    0.15 |    0.25  (secs for 2000 calls)
>   256 |    0.29 |    0.54 |    0.27 |    0.50  (secs for 10000 calls)
>   512 |    0.38 |    0.81 |    0.43 |    0.73  (secs for 10000 calls)
>  1024 |    0.06 |    0.14 |    0.08 |    0.13  (secs for 1000 calls)
>  2048 |    0.09 |    0.24 |    0.13 |    0.23  (secs for 1000 calls)
>  4096 |    0.09 |    0.24 |    0.13 |    0.23  (secs for 500 calls)
>  8192 |    0.22 |    0.75 |    0.37 |    0.73  (secs for 500 calls)
>
> This makes things much faster, particularly for small sizes (which is
> logical considering the main cause of slowness is the building of  
> plans).
>
> http://projects.scipy.org/scipy/scipy/attachment/ticket/1/ 
> fftw3slow.patch

The patch format is fine, btw. Although, it looks like you used diff;  
it's much easier to just do 'svn diff' (add the file or directory  
names to the end if you're only interested in some of what you've  
changed).

I'm working on splitting the _fttpack module into separate extensions  
for each fft library, so it's clearer. Then, you could link in both  
fftw2 and fftw3, and use either (or fftpack). Along the way, I'd like  
to add more of their API functions (such as the wisdom functions), or  
have routines that may have better performance (fftw3, for instance,  
can be more efficient [or not] if it's allowed to destroy its input  
array, or if the input and output array are different).

Also, it looks like using the guru interface, plans could be cached  
(but for each length, there would be possibly two plans: for aligned  
and unaligned data).

-- 
|>|\/|<
/------------------------------------------------------------------\
|David M. Cooke              http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca



More information about the Scipy-dev mailing list