[Scipy-tickets] [SciPy] #189: Add support for p4fftwgel - extremly fast fft for Intel P4
SciPy
scipy-tickets@scipy....
Thu Jun 28 05:07:23 CDT 2007
#189: Add support for p4fftwgel - extremly fast fft for Intel P4
---------------------------+------------------------------------------------
Reporter: pearu | Owner: pearu
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: scipy.fftpack | Version:
Severity: normal | Resolution:
Keywords: |
---------------------------+------------------------------------------------
Comment (by cdavid):
There are two different problems: getting aligned data, and dealing with
non-aligned data in the current fft system. Numpy arrays are not aligned on
16-byte boundaries. I don't know how difficult it would be to force
alignment: for arrays with a newly created data buffer, this is really easy
to support (replacing PyDataMem_NEW with a 16-byte aligned allocator
instead of malloc, e.g. posix_memalign on unix and whatever on windows); but
what should be done for arrays created from existing data?
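
A minimal sketch of the "easy" case above, assuming a POSIX system. The
helper names aligned16_alloc and is_aligned16 are hypothetical and are not
part of numpy or fftpack; only posix_memalign and free are real library
calls here.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if ptr sits on a 16-byte boundary,
 * i.e. it is usable for aligned SIMD loads and stores. */
static int is_aligned16(const void *ptr)
{
    return ((uintptr_t)ptr % 16) == 0;
}

/* Hypothetical helper: allocate nbytes on a 16-byte boundary.
 * Returns NULL on failure; release the memory with free(). */
static void *aligned16_alloc(size_t nbytes)
{
    void *p = NULL;

    if (posix_memalign(&p, 16, nbytes) != 0)
        return NULL;
    assert(is_aligned16(p));
    return p;
}
```

A check like is_aligned16 is also what would be needed for arrays wrapping
existing data, since there the buffer address is outside our control.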
Now, assuming we may have both aligned and unaligned arrays, things get a
bit hairy. As you know, by default fftw3 defines plans on given arrays (one
reason is directly linked to alignment issues). So you have two basic
choices:
* caching plans with aligned buffers and copying data back and forth
between the arrays and those working buffers (this is the approach I
followed to solve ticket #1). Copying data to the buffers takes a large
amount of time (about half the time for sizes around 2^9 - 2^12 on my P4),
but this uses SIMD
* caching plans using advanced and guru planning. Guru planning can be
used to apply a given plan to a different array provided several conditions
are met, including alignment. But then the problem is that for in-place
transforms, if you use FFTW_MEASURE, the input is destroyed... So you have
to use FFTW_ESTIMATE for plans, which leads to worse results than using
FFTW_MEASURE with copies...
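
The first choice (a cached plan tied to an aligned working buffer, with
copies in and out) can be sketched roughly as follows. This is only an
illustration, not the fftpack code: plan_cache, transform_fn and
scale_by_two are hypothetical names, and the callback stands in for
executing an fftw3 plan that was created on the working buffer (for example
with fftw_execute).

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for running a plan built on the working buffer. */
typedef void (*transform_fn)(double *work, size_t n);

typedef struct {
    size_t n;          /* transform size the cached plan was made for */
    double *work;      /* 16-byte aligned working buffer */
    transform_fn run;  /* executes the cached plan in place on work */
} plan_cache;

/* Allocate the aligned working buffer once; returns 0 on success. */
static int plan_cache_init(plan_cache *c, size_t n, transform_fn run)
{
    void *p = NULL;

    if (posix_memalign(&p, 16, n * sizeof(double)) != 0)
        return -1;
    c->n = n;
    c->work = p;
    c->run = run;
    return 0;
}

/* Transform data in place from the caller's point of view. data may have
 * any alignment: it is copied in, transformed, and copied back out. */
static void plan_cache_apply(plan_cache *c, double *data)
{
    assert(c->work != NULL);
    memcpy(c->work, data, c->n * sizeof(double));
    c->run(c->work, c->n);
    memcpy(data, c->work, c->n * sizeof(double));
}

static void plan_cache_free(plan_cache *c)
{
    free(c->work);
    c->work = NULL;
}

/* Placeholder "transform" so the sketch can run without fftw3. */
static void scale_by_two(double *work, size_t n)
{
    for (size_t i = 0; i < n; i++)
        work[i] *= 2.0;
}
```

The two memcpy calls are the copying cost mentioned in the first choice;
the plan itself only ever sees the aligned buffer.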
Basically, if we want optimal performance with fftw3, I don't think we
can use the current cache system; we need something a bit more
sophisticated (that's exactly why I started to rewrite the fftpack code).
--
Ticket URL: <http://projects.scipy.org/scipy/scipy/ticket/189#comment:3>
SciPy <http://www.scipy.org/>
SciPy is open-source software for mathematics, science, and engineering.