[Numpy-discussion] Enhancements for NumPy's FFTs
Charles R Harris
Sat Mar 14 21:52:57 CDT 2009
On Sat, Mar 14, 2009 at 8:26 PM, Sturla Molden <firstname.lastname@example.org> wrote:
> > On Sat, Mar 14, 2009 at 7:23 PM, Sturla Molden <email@example.com> wrote:
> > We can't count on C99 at this point. Maybe David will add something so we
> > can use c99 when it is available.
> Ok, but most GNU compilers have a __restrict__ extension for C89 and C++
> that we could use. And MSVC has a compiler pragma in VS2003 and a
> __restrict extension in VS2005 and later versions. So we could define a macro
> RESTRICT to be restrict in ISO C99, __restrict__ in GCC 3 and 4,
> __restrict in recent versions of MSVC, and nothing elsewhere.
> > I don't have a problem with this, although I'm not sure what npy type is
> > appropriate without looking. Were you thinking of size_t? I was tempted
> > by that. But why is it more efficient? I haven't seen any special
> > instructions
> > at the assembly level, so unless there is some sort of global
> > that isn't obvious I don't know where the advantage is.
> It may be that my memory serves me badly. I thought I read it here, but it
> does not show examples of different assembly code being generated. So I
> think I'll just leave it for now and experiment with this later.
I suspect the biggest gains can be made from careful attention to cache
issues. I had a prototype block-based FFT -- using a different algorithm
than the usual -- that outperformed FFTW at that time.
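
For reference, the RESTRICT macro proposed in the quoted text could be sketched roughly as below. This is only an illustration under assumptions, not anything that exists in NumPy: the version checks are one plausible way to detect C99, GCC 3 or later, and VS2005 or later, and the helper scale_add is a hypothetical function added only to show the macro in use.

```c
#include <stddef.h>

/* RESTRICT expands to whichever no-alias qualifier the compiler offers,
 * and to nothing when none is known. The thresholds are assumptions:
 * 199901L marks ISO C99, and _MSC_VER 1400 corresponds to VS2005. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define RESTRICT restrict
#elif defined(__GNUC__) && __GNUC__ >= 3
#  define RESTRICT __restrict__
#elif defined(_MSC_VER) && _MSC_VER >= 1400
#  define RESTRICT __restrict
#else
#  define RESTRICT
#endif

/* Hypothetical helper: out[i] = a[i] + s * b[i].
 * The qualifiers promise the compiler that out, a and b do not overlap,
 * so it need not reload a[i] or b[i] after each store to out[i]. */
static void scale_add(double *RESTRICT out,
                      const double *RESTRICT a,
                      const double *RESTRICT b,
                      double s, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = a[i] + s * b[i];
    }
}
```

Callers would have to guarantee that the three arrays really are distinct, since passing overlapping buffers through restrict-qualified pointers is undefined behaviour. Whether this produces a measurable gain would need to be checked against the generated assembly.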