[SciPy-user] test_fft, test_ifft results

Arnd Baecker arnd.baecker at web.de
Mon Dec 12 02:28:07 CST 2005


Hi Darren and all fft-enthusiasts,

On Sun, 11 Dec 2005, Darren Dale wrote:

> On Saturday 10 December 2005 5:57 pm, Darren Dale wrote:
> > I have been wondering about the results of
> > fftpack.basic.test_basic.test_fft, fftpack.basic.test_basic.test_fftn and
> > fftpack.basic.test_basic.test_ifft. On my system, with scipy built against
> > fftw2 or 3, ffts of complex input take over 8 times as long as real input.
>
> I would like to clarify this report. I thought that editing site.cfg to find
> fftw-2 would make scipy build against it, but this is not the case. One can
> build scipy with support for only fftw-2 by commenting out the fftw3
> dictionary in the fftw_info class in scipy/distutils/systeminfo.py. The
> performance of fft's for complex and real input are comparable if scipy is
> built with fftw-2 in this way.

On Itanium2, we also see that fftw3 performs badly for the complex case:
  http://www.scipy.net/pipermail/scipy-dev/2005-December/004408.html
The same holds for the 64-bit Opteron machine.
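
For anyone who wants to check the complex-vs-real behaviour on their own
machine, here is a rough sketch of a timing harness. It is not the scipy
test code: naive_dft, time_calls and complex_to_real_ratio are hypothetical
names made up for this example, and the slow pure-Python DFT is only a
stand-in so that the script runs without any FFT library. In practice one
would pass the fft function of the library under test instead.

```python
import cmath
import time


def naive_dft(x):
    """O(n^2) reference DFT; a stand-in for the FFT routine under test."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def time_calls(func, data, calls):
    """Return wall-clock seconds needed for `calls` invocations of func(data)."""
    start = time.perf_counter()
    for _ in range(calls):
        func(data)
    return time.perf_counter() - start


def complex_to_real_ratio(func, size, calls):
    """Time func on real and complex input of equal size; return t_complex / t_real."""
    real_input = [float(i % 7) for i in range(size)]
    complex_input = [complex(i % 7, i % 3) for i in range(size)]
    t_real = time_calls(func, real_input, calls)
    t_complex = time_calls(func, complex_input, calls)
    return t_complex / t_real


if __name__ == "__main__":
    # Replace naive_dft with the library fft function you want to measure.
    ratio = complex_to_real_ratio(naive_dft, size=64, calls=20)
    print(f"complex/real time ratio: {ratio:.2f}")
```

A ratio far above 1 would correspond to the slowdown for complex input
reported above.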

Also note that fftw3 support was added to new scipy only recently
(http://www.scipy.net/pipermail/scipy-dev/2005-November/003989.html).
I also think that it might not have received much testing
in old scipy, as it was added only in July:
http://www.scipy.org/documentation/mailman?fn=scipy-dev/2005-July/003061.html

Over the weekend I did some checks comparing the fftw
performance for "old" scipy (fftw2 only) and new scipy (fftw2 and fftw3)
on my PIII laptop, see test_AB.png.
Darren has also sent me his results off-list, see test_DD.png.
Both plots (and the script + input file) are at
   http://www.physik.tu-dresden.de/~baecker/tmp/fftw/
and display the ratio of the scipy time vs. Numeric time for the fftw,
so anything below 1 is not ok.

E.g. for data like:

                 Fast Fourier Transform
=================================================
      |    real input     |   complex input
-------------------------------------------------
 size |  scipy  | Numeric |  scipy  | Numeric
-------------------------------------------------
  100 |    0.23 |    0.23 |    1.56 |    0.23  (secs for 7000 calls)
 1000 |    0.17 |    0.31 |    1.62 |    0.30  (secs for 2000 calls)
  256 |    0.40 |    0.46 |    3.21 |    0.47  (secs for 10000 calls)
  512 |    0.55 |    0.84 |    4.19 |    0.81  (secs for 10000 calls)
 1024 |    0.09 |    0.16 |    0.67 |    0.15  (secs for 1000 calls)
 2048 |    0.16 |    0.28 |    1.16 |    0.30  (secs for 1000 calls)
 4096 |    0.17 |    0.30 |    1.08 |    0.29  (secs for 500 calls)
 8192 |    0.46 |    1.04 |    2.38 |    1.01  (secs for 500 calls)
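
To make the slowdown in this table explicit, the following short sketch
divides the scipy time by the Numeric time for each row. The numbers are
copied from the table above; the ratio convention (scipy / Numeric, so that
values above 1 mean scipy is slower) is my assumption for this example and
need not match the convention used in the plots.

```python
# Rows copied from the table above:
# (size, calls, scipy_real, numeric_real, scipy_complex, numeric_complex)
ROWS = [
    (100, 7000, 0.23, 0.23, 1.56, 0.23),
    (1000, 2000, 0.17, 0.31, 1.62, 0.30),
    (256, 10000, 0.40, 0.46, 3.21, 0.47),
    (512, 10000, 0.55, 0.84, 4.19, 0.81),
    (1024, 1000, 0.09, 0.16, 0.67, 0.15),
    (2048, 1000, 0.16, 0.28, 1.16, 0.30),
    (4096, 500, 0.17, 0.30, 1.08, 0.29),
    (8192, 500, 0.46, 1.04, 2.38, 1.01),
]


def ratios(rows):
    """Return (size, real_ratio, complex_ratio), each ratio being scipy / Numeric."""
    return [
        (size, s_real / n_real, s_cplx / n_cplx)
        for size, _calls, s_real, n_real, s_cplx, n_cplx in rows
    ]


if __name__ == "__main__":
    print(" size | real ratio | complex ratio")
    for size, real_ratio, complex_ratio in ratios(ROWS):
        print(f"{size:5d} | {real_ratio:10.2f} | {complex_ratio:13.2f}")
```

With this convention scipy is never slower than Numeric for real input in
this table, while for complex input it is roughly 2.4 to 6.8 times slower.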


I looked at the profiling output (on the scipy side)
for the fftw2 and fftw3 cases, but could not see
any difference which could cause the above effect.

> According to some benchmarks posted at
> http://www.fftw.org/speed/p4-2.2GHz-gcc/ , version 3 should be faster than
> version 2.

And for 1D complex, size 8192 fftw3 would almost be a
factor of 2 faster than fftw2!!
For 1D real, size 8192 it is still about 1.5.

> However, I haven't been able to build benchfft and test my own
> installation independent of scipy.

There is another way, if you built fftw3 from source:
  see fftw-3.0.1/tests/README

  ./bench -opatient -s icf8192
  ./bench -opatient -s irf8192

and for 2D:
  ./bench -opatient -s icf256x256

Remarks
 - ./bench does not exist for fftw2
 - i: in-place   (o: out-of-place)
 - c: complex    (r: real)
 - f: forward    (b: backwards fft)
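
If you want to run several of these in a row, the problem strings can be
assembled from the letters listed above. This is only a small sketch:
bench_problem and bench_command are hypothetical helper names made up for
this example, and the only ./bench options used are the ones shown in the
commands above.

```python
def bench_problem(dims, in_place=True, complex_data=True, forward=True):
    """Build a problem string such as 'icf8192' or 'icf256x256'."""
    placement = "i" if in_place else "o"      # i: in-place, o: out-of-place
    kind = "c" if complex_data else "r"       # c: complex, r: real
    direction = "f" if forward else "b"       # f: forward, b: backwards
    size = "x".join(str(d) for d in dims)     # e.g. [256, 256] -> '256x256'
    return placement + kind + direction + size


def bench_command(dims, **kwargs):
    """Return the argument list for one ./bench run (not executed here)."""
    return ["./bench", "-opatient", "-s", bench_problem(dims, **kwargs)]


if __name__ == "__main__":
    for dims, opts in [
        ([8192], {}),
        ([8192], {"complex_data": False}),
        ([256, 256], {}),
    ]:
        print(" ".join(bench_command(dims, **opts)))
```

The three lines printed by the script are exactly the three commands
given above.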

Hope that this somehow helps to get all this sorted!

Could some scipy/fft(pack) expert explain why no caching is needed
for FFTW3 (see scipy/Lib/fftpack/src/zfft.c),
whereas it is done for FFTW2?
(Presumably all this is explained in
http://www.fftw.org/fftw-paper-ieee.pdf
but at a quick glance I could not extract the relevant information...)

Best, Arnd


