[Numpy-discussion] Enhancements for NumPy's FFTs

Charles R Harris charlesr.harris@gmail....
Sat Mar 14 21:52:57 CDT 2009

On Sat, Mar 14, 2009 at 8:26 PM, Sturla Molden <sturla@molden.no> wrote:

> > On Sat, Mar 14, 2009 at 7:23 PM, Sturla Molden <sturla@molden.no> wrote:
> > We can't count on C99 at this point. Maybe David will add something so we
> > can use c99 when it is available.
> Ok, but most GNU compilers have a __restrict__ extension for C89 and C++
> that we could use. And MSVC has a compiler pragma in VS2003 and a
> __restrict extension in VS2005 and later versions. So we could define a
> macro RESTRICT to be restrict in ISO C99, __restrict__ in GCC 3 and 4,
> __restrict in recent versions of MSVC, and nothing elsewhere.
> > I don't have a problem with this, although I'm not sure what npy type is
> > appropriate without looking. Were you thinking of size_t? I was tempted
> > by that. But why is it more efficient? I haven't seen any special
> > instructions at the assembly level, so unless there is some sort of
> > global optimization that isn't obvious I don't know where the advantage
> > is.
> It may be that my memory serves me badly. I thought I read it here, but it
> does not show examples of different assembly code being generated. So I
> think I'll just leave it for now and experiment with this later.
> http://support.amd.com/us/Processor_TechDocs/22007.pdf
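
A rough sketch of the RESTRICT macro described in the quoted text might look like the following. The exact version checks (`__GNUC__ >= 3`, `_MSC_VER >= 1400` for VS2005) and the `scale_add` helper are illustrative assumptions, not existing NumPy code.

```c
#include <stddef.h>

/* RESTRICT is the macro name proposed in the thread. The feature tests
 * below are assumptions chosen for illustration:
 *   - ISO C99 or later: the standard `restrict` keyword
 *   - GCC 3 and later:  the `__restrict__` extension (also valid in C++)
 *   - MSVC 2005+:       the `__restrict` extension (_MSC_VER 1400 is VS2005)
 *   - anything else:    expand to nothing, so the code still compiles
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define RESTRICT restrict
#elif defined(__GNUC__) && __GNUC__ >= 3
#  define RESTRICT __restrict__
#elif defined(_MSC_VER) && _MSC_VER >= 1400
#  define RESTRICT __restrict
#else
#  define RESTRICT
#endif

/* Hypothetical helper: out[i] += a * in[i].
 * RESTRICT promises the compiler that `out` and `in` do not overlap,
 * which lets it keep values in registers and vectorize the loop. */
static void scale_add(size_t n, double *RESTRICT out,
                      const double *RESTRICT in, double a)
{
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] += a * in[i];
    }
}
```

On a compiler that matches none of the branches the macro expands to nothing, so the function behaves identically and only the aliasing hint is lost. Callers must still never pass overlapping buffers.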

I suspect the biggest gains can be made from careful attention to cache
issues. I had a prototype block-based FFT -- using a different algorithm
than the usual -- that outperformed FFTW at that time.
