[SciPy-dev] fftpack2 for review

eric jones eric at enthought.com
Tue Oct 1 05:19:58 CDT 2002


Hey Pearu,

Overall, first tests went fairly smoothly.  Even without djbfft and
fftw, things appeared to install correctly on win32.  *very* nice job on
system checking and handling the lack of libraries.

All tests passed.  Sure am glad to finally have some unit tests for fft
stuff! 

I do not have fftw installed.  I tested once without djbfft and once
with it.  I didn't notice any material difference in timings (the
non-djbfft set given below).  My tests are on a 850 MHz PIII and they
are slower than yours.  This leads me to believe, even when trying to
use djbfft, I'm not getting it linked correctly.  Setup said it detected
my djbfft libraries though (I defined the environment variable pointing
to my libs to get this to work).

I'll look over the tests tomorrow.  Also, Travis O., could you give the
package a once over?

I've also copied your "proposal" here to solicit comments from the
signal processing geeks:

Differences between fftpack and FFT from Numeric
================================================

* Functions rfft and irfft accept and return only real sequences. So,
  the corresponding functions real_fft, inverse_real_fft from FFT are
  not equivalent with rfft and irfft. The difference is in the storage
  of data, see the definitions of corresponding functions for details.

* Functions fftshift and ifftshift from Numeric.FFT are renamed to
  freqshift and ifreqshift, respectively. Rationale: there is nothing
  FFT algorithm specific what these functions aim to accomplish.

  Also freqshift puts the Nyquist component at the end of the result
  while fftshift puts it at the beginning of the result. This is for
  the consistency among varios functions in the fftpack module.

* PROPOSAL: When calling ifft with forced truncation or zero-padding
  then I would like to propose that the corresponding action is
  applied to the middle of data. For example, ifft([1,2,3,4,5,6],n=8)
  is equivalent to ifft([1,2,3,4,0,0,5,6]), that is, the Fourier terms
  with higher frequencies and zero coefficients are introduced. In the
  Numeric.FFT case, the example above would be equivalent to
  ifft([1,2,3,4,5,6,0,0],n=8), which would mean that Fourier
  coefficients [5,6] become the coefficients of higher frequency terms
  and the original terms are zerod.

  Note that this proposal is **not implemented** because it needs to
  be discussed. For instance, Matlab uses the same convention as FFT
  and this change would be confusing for Matlab users.  On the other
  hand, FFT or Matlab's conventions change the spectrum of the
  original signal and I don't see any sense in this behaviour (if you
  don't agree then please provide an example). Namely, one of the
  applications of the argument n would be to compose a new signal with
  a more dense or sparse grid than the original one by using

    new_signal = ifft(fft(signal),n)

  Note that the new_signal would have the same Fourier spectrum as
  original signal. With Matlab/FFT convention this is not true.  Any
  thoughts?

And now some installation notes about what I did to get djbfft running:

Install notes:

1.  
win32 (cygwin and mingw) required  #include "errno.h" in any djbfft file
that used errno.

2.
I installed the latest f2py but still got the warning error:

WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!

  fftpack2 requires F2PY version 2.23.190-1367 or higher but got
2.23.190-1354

WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!WARNING!

I'm hitting the hay right now, so I'll check out why this happened
tomorrow.

3.
How hard would it be to get setup.py to build the djbfft library if one
wasn't detected.  The makefile builds a lot of the files on the fly
using stuff like:

4c0.c: \
idea/c0.c pentium/c0.c ppro/c0.c sparc/c0.c auto_opt
	sed 1s/PRE/pre4.c/ `cat auto_opt`/c0.c > 4c0.c

4c0.o: \
compile 4c0.c pre4.c fftc4.h complex4.h real4.h fftr4.h real4.h 4i.c
	./compile 4c0.c

I'm not sure what all is going on here, but it seems like we could
either do the same in python or auto-generate a generic version of this
stuff and include it in SciPy.  Of course, since djbfft isn't required,
this isn't necessary.  

TEST RESULTS:

>>> fftpack2.test()
 creating test suite for: fftpack2.basic
 creating test suite for: fftpack2.helper
 creating test suite for: fftpack2.pseudo_diffs
...................
                 Fast Fourier Transform
========================================================
      |       real input       |       complex input
--------------------------------------------------------
 size |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
--------------------------------------------------------
  100 |   0.39 |  0.36 |  0.35 |   0.40 |  0.37 |  0.36  (secs for 7000
call
 1000 |   0.31 |  0.48 |  0.43 |   0.44 |  0.48 |  0.41  (secs for 2000
call
  256 |   0.65 |  0.75 |  0.70 |   0.75 |  0.75 |  0.70  (secs for 10000
cal
  512 |   0.89 |  1.32 |  1.11 |   1.16 |  1.33 |  1.12  (secs for 10000
cal
 1024 |   0.14 |  0.35 |  0.30 |   0.21 |  0.24 |  0.20  (secs for 1000
call
 2048 |   0.50 |  0.76 |  0.64 |   0.49 |  0.77 |  0.68  (secs for 1000
call
 4096 |   0.60 |  0.93 |  0.82 |   0.67 |  0.79 |  0.65  (secs for 500
calls
 8192 |   1.23 |  3.16 |  2.85 |   2.88 |  3.23 |  2.91  (secs for 500
calls
.
          Inverse Fast Fourier Transform
========================================================
      |       real input       |        complex input
--------------------------------------------------------
 size |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
--------------------------------------------------------
  100 |   0.38 |  0.88 |  0.87 |   0.45 |  0.89 |  0.88  (secs for 7000
call
 1000 |   0.45 |  1.93 |  1.85 |   0.80 |  1.44 |  1.36  (secs for 2000
call
  256 |   0.66 |  3.07 |  2.85 |   0.81 |  1.79 |  1.74  (secs for 10000
cal
  512 |   0.91 |  2.89 |  2.78 |   1.27 |  3.22 |  3.23  (secs for 10000
cal
 1024 |   0.24 |  0.80 |  0.74 |   0.37 |  0.90 |  0.85  (secs for 1000
call
 2048 |   0.51 |  1.85 |  1.79 |   0.78 |  1.90 |  1.84  (secs for 1000
call
 4096 |   0.35 |  1.99 |  1.86 |   0.84 |  2.18 |  2.02  (secs for 500
calls
 8192 |   1.56 |  5.35 |  5.07 |   2.82 |  5.45 |  5.19  (secs for 500
calls
.
      Multi-dimensional Fast Fourier Transform
========================================================
          |       real input       |       complex input
--------------------------------------------------------
   size   |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
--------------------------------------------------------
  100x100 |   0.51 |  0.67 |  0.59 |   0.55 |  0.72 |  0.64  (secs for
100 c
)
 1000x100 |   0.70 |  0.67 |  0.62 |   0.70 |  0.70 |  0.64  (secs for 7
cal
  256x256 |   0.70 |  0.64 |  0.61 |   0.75 |  0.67 |  0.63  (secs for
10 ca

  512x512 |   0.93 |  0.84 |  0.79 |   0.94 |  0.86 |  0.80  (secs for 3
cal
.
Fast Fourier Transform (real data)
==================================
 size |fftpack2|  FFT  | scipy
----------------------------------
  100 |   0.35 |  0.43 |  0.47  (secs for 7000 calls)
 1000 |   0.27 |  0.32 |  0.34  (secs for 2000 calls)
  256 |   0.60 |  0.78 |  0.77  (secs for 10000 calls)
  512 |   0.83 |  1.04 |  1.01  (secs for 10000 calls)
 1024 |   0.13 |  0.22 |  0.20  (secs for 1000 calls)
 2048 |   0.28 |  0.39 |  0.41  (secs for 1000 calls)
 4096 |   0.25 |  0.47 |  0.43  (secs for 500 calls)
 8192 |   0.81 |  1.15 |  1.20  (secs for 500 calls)
.
Inverse Fast Fourier Transform (real data)
==================================
 size |fftpack2|  FFT  | scipy
----------------------------------
  100 |   0.36 |  0.92 |  0.87  (secs for 7000 calls)
 1000 |   0.41 |  1.12 |  1.15  (secs for 2000 calls)
  256 |   0.62 |  1.61 |  2.49  (secs for 10000 calls)
  512 |   0.89 |  2.28 |  2.57  (secs for 10000 calls)
 1024 |   0.17 |  0.52 |  0.24  (secs for 1000 calls)
 2048 |   0.27 |  1.08 |  1.02  (secs for 1000 calls)
 4096 |   0.44 |  1.10 |  1.08  (secs for 500 calls)
 8192 |   1.10 |  2.22 |  2.18  (secs for 500 calls)
.........................
 Shifting periodic functions
==============================
 size  | optimized |    naive
------------------------------
   100 |      0.10 |      0.53  (secs for 1500 calls)
  1000 |      0.07 |      0.76  (secs for 300 calls)
   256 |      0.12 |      0.94  (secs for 1500 calls)
   512 |      0.13 |      1.11  (secs for 1000 calls)
  1024 |      0.11 |      1.08  (secs for 500 calls)
  2048 |      0.08 |      1.06  (secs for 200 calls)
  4096 |      0.12 |      1.16  (secs for 100 calls)
  8192 |      0.17 |      1.45  (secs for 50 calls)
.
Differentiation of periodic functions
=====================================
 size  |  convolve |    naive
-------------------------------------
   100 |      0.09 |      0.78  (secs for 1500 calls)
  1000 |      0.07 |      1.04  (secs for 300 calls)
   256 |      0.13 |      1.44  (secs for 1500 calls)
   512 |      0.13 |      1.74  (secs for 1000 calls)
  1024 |      0.11 |      2.37  (secs for 500 calls)
  2048 |      0.09 |      1.73  (secs for 200 calls)
  4096 |      0.14 |      1.74  (secs for 100 calls)
  8192 |      0.19 |      2.06  (secs for 50 calls)
.
 Tilbert transform of periodic functions
=========================================
 size  | optimized |    naive
-----------------------------------------
   100 |      0.10 |      0.63  (secs for 1500 calls)
  1000 |      0.07 |      0.93  (secs for 300 calls)
   256 |      0.13 |      1.02  (secs for 1500 calls)
   512 |      0.12 |      1.25  (secs for 1000 calls)
  1024 |      0.10 |      1.62  (secs for 500 calls)
  2048 |      0.08 |      1.12  (secs for 200 calls)
  4096 |      0.12 |      1.25  (secs for 100 calls)
  8192 |      0.17 |      1.36  (secs for 50 calls)
.
 Hilbert transform of periodic functions
=========================================
 size  | optimized |    naive
-----------------------------------------
   100 |      0.09 |      0.51  (secs for 1500 calls)
  1000 |      0.07 |      0.49  (secs for 300 calls)
   256 |      0.12 |      0.76  (secs for 1500 calls)
   512 |      0.11 |      1.27  (secs for 1000 calls)
  1024 |      0.10 |      1.15  (secs for 500 calls)
  2048 |      0.08 |      0.88  (secs for 200 calls)
  4096 |      0.13 |      0.90  (secs for 100 calls)
  8192 |      0.16 |      1.09  (secs for 50 calls)
.
----------------------------------------------------------------------
Ran 52 tests in 268.065s

OK

> -----Original Message-----
> From: scipy-dev-admin at scipy.net [mailto:scipy-dev-admin at scipy.net] On
> Behalf Of Pearu Peterson
> Sent: Saturday, September 28, 2002 3:54 PM
> To: scipy-dev at scipy.org
> Subject: [SciPy-dev] fftpack2 for review
> 
> 
> Dear SciPy devels,
> 
> Please find a scipy.fftpack replacement candidate fftpack2 from
> 
>   http://cens.ioc.ee/~pearu/misc/fftpack2-0.1.tar.gz
> 
> for review. fftpack2 main features include:
> 
>   - Optional support for high-performance FFT libraries such as FFTW
and
>     DJBFFT libraries. By default, Fortran FFTPACK is used and other
>     libraries are automatically detected by
>     the scipy_distutils/system_info.py script.
> 
>   - Support for both real and complex DFT's and their inverse
>     transforms. When used with FFTW and/or DJBFFT libraries then
fftpack2
>     routines are considerably faster than Numeric.FFT and the current
>     scipy.fftpack.
>     Some comparison timings are included at the end of this message.
> 
>   - Implementation of differentiation and integration of periodic
>     sequences, various pseudo-differential operators such as Tilbert
>     transform, Hilbert transform, hyperbolic pseudo-differential
>     operators, their inverses, etc.
> 
>   - Because different FFT libraries use different data storage
convention,
>     fftpack2 implements interface to these conventions so that users
>     and developers need not to learn all these conventions, just
>     one is enough. Namely, fftpack2 uses the same data storage
>     specification as Fortran FFTPACK (also used by Numeric.FFT and
>     Matlab, though with somewhat different formal definitions).
> 
>   - Generic convolution routines that allow quick and easy way of
>     introducing new pseudo-differential operators with the same
>     performance as core pseudo-differential operators.
> 
>   - All functions have complete documentation strings.
> 
>   - A rather complete testing site.
> 
> Note that fftpack2 requires the latest F2PY that you can get from
> 
>   http://cens.ioc.ee/projects/f2py2e/2.x/F2PY-2-latest.tar.gz
> 
> For testing, fftpack2 uses scipy_test from SciPy CVS repository.
> 
> For building/testing instructions and for various notes, including
open
> questions, see NOTES.txt file after unpacking fftpack2-0.1.tar.gz.
> 
> As usual, comments, suggestions, and criticism are most welcome.
> 
> Regards,
> 	Pearu
> 
> -------------------------------
> fftpack2 timing results:
> -------------------------------
>                  Fast Fourier Transform
> ========================================================
>       |       real input       |       complex input
> --------------------------------------------------------
>  size |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
> --------------------------------------------------------
>   100 |   0.38 |  0.39 |  0.38 |   0.40 |  0.40 |  0.38 (secs
> for7000calls)
>  1000 |   0.31 |  0.63 |  0.60 |   0.54 |  0.64 |  0.61  (secs for
2000
>   256 |   0.60 |  0.75 |  0.62 |   0.71 |  0.78 |  0.66  (secs for
10000
>   512 |   0.77 |  1.30 |  1.00 |   0.95 |  1.30 |  1.01  (secs for
10000
>  1024 |   0.14 |  0.26 |  0.18 |   0.17 |  0.25 |  0.19  (secs for
1000
>  2048 |   0.21 |  0.51 |  0.36 |   0.34 |  0.65 |  0.54  (secs for
1000
>  4096 |   0.28 |  1.08 |  0.66 |   0.56 |  0.93 |  0.79  (secs for 500
>  8192 |   1.09 |  3.88 |  3.94 |   2.19 |  4.36 |  3.48  (secs for 500
> 
>           Inverse Fast Fourier Transform
> ========================================================
>       |       real input       |        complex input
> --------------------------------------------------------
>  size |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
> --------------------------------------------------------
>   100 |   0.17 |  0.26 |  0.63 |   0.45 |  0.87 |  0.85  (secs for
7000
>  1000 |   0.34 |  1.24 |  1.25 |   0.72 |  1.29 |  1.29  (secs for
2000
>   256 |   0.62 |  1.57 |  1.44 |   0.69 |  1.62 |  1.47  (secs for
10000
>   512 |   0.80 |  2.69 |  2.30 |   1.00 |  2.62 |  2.20  (secs for
10000
>  1024 |   0.12 |  0.51 |  0.41 |   0.18 |  0.54 |  0.44  (secs for
1000
>  2048 |   0.23 |  1.16 |  1.09 |   0.40 |  1.33 |  1.27  (secs for
1000
>  4096 |   0.35 |  1.75 |  1.44 |   0.63 |  1.72 |  1.54  (secs for 500
>  8192 |   1.34 |  5.71 |  6.02 |   2.40 |  6.14 |  5.67  (secs for 500
> 
>       Multi-dimensional Fast Fourier Transform
> ========================================================
>           |       real input       |       complex input
> --------------------------------------------------------
>    size   |fftpack2|  FFT  | scipy |fftpack2|  FFT  | scipy
> --------------------------------------------------------
>   100x100 |   0.43 |  0.87 |  0.81 |   0.48 |  0.88 |  0.82  (secs for
100
>  1000x100 |   0.54 |  0.78 |  0.72 |   0.56 |  0.73 |  0.75  (secs for
7
>   256x256 |   0.69 |  0.72 |  0.62 |   0.67 |  0.72 |  0.64  (secs for
10
>   512x512 |   0.81 |  0.91 |  0.84 |   0.81 |  0.87 |  0.81  (secs for
3
> 
> Fast Fourier Transform (real data)
> ==================================
>  size |fftpack2|  FFT  | scipy
> ----------------------------------
>   100 |   0.36 |  0.47 |  0.47  (secs for 7000 calls)
>  1000 |   0.29 |  0.38 |  0.38  (secs for 2000 calls)
>   256 |   0.55 |  0.76 |  0.81  (secs for 10000 calls)
>   512 |   0.67 |  1.06 |  1.07  (secs for 10000 calls)
>  1024 |   0.10 |  0.18 |  0.17  (secs for 1000 calls)
>  2048 |   0.16 |  0.30 |  0.30  (secs for 1000 calls)
>  4096 |   0.20 |  0.37 |  0.32  (secs for 500 calls)
>  8192 |   0.75 |  1.29 |  1.26  (secs for 500 calls)
> 
> Inverse Fast Fourier Transform (real data)
> ==================================
>  size |fftpack2|  FFT  | scipy
> ----------------------------------
>   100 |   0.39 |  0.90 |  0.92  (secs for 7000 calls)
>  1000 |   0.32 |  0.63 |  0.70  (secs for 2000 calls)
>   256 |   0.59 |  1.37 |  1.39  (secs for 10000 calls)
>   512 |   0.73 |  1.70 |  1.80  (secs for 10000 calls)
>  1024 |   0.11 |  0.27 |  0.27  (secs for 1000 calls)
>  2048 |   0.20 |  0.45 |  0.44  (secs for 1000 calls)
>  4096 |   0.24 |  0.65 |  0.68  (secs for 500 calls)
>  8192 |   0.73 |  1.98 |  2.06  (secs for 500 calls)
> 
> 
>  Shifting periodic functions
> ==============================
>  size  | optimized |    naive
> ------------------------------
>    100 |      0.09 |      0.52  (secs for 1500 calls)
>   1000 |      0.06 |      0.66  (secs for 300 calls)
>    256 |      0.10 |      0.84  (secs for 1500 calls)
>    512 |      0.10 |      1.02  (secs for 1000 calls)
>   1024 |      0.07 |      1.07  (secs for 500 calls)
>   2048 |      0.06 |      0.80  (secs for 200 calls)
>   4096 |      0.00 |      0.86  (secs for 100 calls)
>   8192 |      0.13 |      1.23  (secs for 50 calls)
> 
> Differentiation of periodic functions
> =====================================
>  size  |  convolve |    naive
> -------------------------------------
>    100 |      0.09 |      0.59  (secs for 1500 calls)
>   1000 |      0.06 |      0.75  (secs for 300 calls)
>    256 |      0.11 |      0.97  (secs for 1500 calls)
>    512 |      0.09 |      1.21  (secs for 1000 calls)
>   1024 |      0.08 |      1.19  (secs for 500 calls)
>   2048 |      0.06 |      1.11  (secs for 200 calls)
>   4096 |      0.09 |      1.19  (secs for 100 calls)
>   8192 |      0.14 |      1.48  (secs for 50 calls)
> 
>  Tilbert transform of periodic functions
> =========================================
>  size  | optimized |    naive
> -----------------------------------------
>    100 |      0.09 |      0.54  (secs for 1500 calls)
>   1000 |      0.06 |      0.60  (secs for 300 calls)
>    256 |      0.10 |      0.77  (secs for 1500 calls)
>    512 |      0.09 |      0.89  (secs for 1000 calls)
>   1024 |      0.07 |      0.90  (secs for 500 calls)
>   2048 |      0.05 |      0.83  (secs for 200 calls)
>   4096 |      0.07 |      0.92  (secs for 100 calls)
>   8192 |      0.11 |      1.04  (secs for 50 calls)
> .
>  Hilbert transform of periodic functions
> =========================================
>  size  | optimized |    naive
> -----------------------------------------
>    100 |      0.09 |      0.47  (secs for 1500 calls)
>   1000 |      0.06 |      0.48  (secs for 300 calls)
>    256 |      0.10 |      0.68  (secs for 1500 calls)
>    512 |      0.09 |      0.77  (secs for 1000 calls)
>   1024 |      0.07 |      0.74  (secs for 500 calls)
>   2048 |      0.06 |      0.72  (secs for 200 calls)
>   4096 |      0.07 |      0.79  (secs for 100 calls)
>   8192 |      0.11 |      0.93  (secs for 50 calls)
> 
> 
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev at scipy.net
> http://www.scipy.net/mailman/listinfo/scipy-dev




More information about the Scipy-dev mailing list