[SciPy-user] numpy aligned memory

Andrew Straw strawman@astraw....
Thu Mar 12 10:05:14 CDT 2009

Sturla Molden wrote:
> On 3/8/2009 6:03 PM, Rohit Garg wrote:
>> http://www.mail-archive.com/numpy-discussion@scipy.org/msg04005.html
>> while googling for numpy memory alignment. I wish to know if anything
>> on that account has come to pass yet? On linux 64 bit platform, can I
>> assume anything beyond the glibc alignment as of now?
> If you are willing to waste a few bytes, nothing prevents you from
> ensuring arbitrary alignment manually. You just allocate more space
> than you need (16 extra bytes for 16-byte alignment) and return a view
> of a properly aligned segment. Something like this:
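> import numpy as np
> def aligned_zeros(shape, boundary=16, dtype=float, order='C'):
>      N = int(np.prod(shape))  # total number of elements
>      d = np.dtype(dtype)
>      # over-allocate by `boundary` bytes so an aligned start always fits
>      tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8)
>      address = tmp.__array_interface__['data'][0]
>      # bytes to skip to reach the next multiple of `boundary`
>      offset = (boundary - address % boundary) % boundary
>      # slice N * itemsize bytes (not N), then reinterpret as dtype d
>      return tmp[offset:offset + N * d.itemsize]\
>                .view(dtype=d)\
>                .reshape(shape, order=order)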
> We had questions regarding this for an FFTW interface as well (how to
> use fftw_malloc instead of malloc). It also affects all code using SIMD
> extensions on x86 (MMX, SSE, SSE2). I don't use PPC, so I don't know what
> AltiVec needs. In any case, should this be in the cookbook? Or even in
> numpy? It seems a bit redundant to answer this question over and over again.
> Sturla Molden
Sturla, I just tried your example, and I discovered that for a 2D array
it does not align rows on boundaries -- just the first element of the
first row. My understanding is that for image processing with SIMD,
having every row aligned is what is desired. For example, Intel IPP
allocates images such that each image row is 32-byte aligned. (I just
checked that Framewave does _not_ do this, so maybe times have changed
or maybe Framewave just isn't optimized in this regard.)
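
To make the observation concrete, here is a minimal sketch that prints where
each row starts relative to a 16-byte boundary. It assumes the quoted routine
is meant to slice N * itemsize bytes before taking the view, and row_addresses
is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

def aligned_zeros(shape, boundary=16, dtype=float, order='C'):
    # Same idea as the quoted routine; assumes the slice should cover
    # N * itemsize bytes so that the view has exactly N elements.
    N = int(np.prod(shape))
    d = np.dtype(dtype)
    tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8)
    address = tmp.__array_interface__['data'][0]
    offset = (boundary - address % boundary) % boundary
    flat = tmp[offset:offset + N * d.itemsize].view(dtype=d)
    return flat.reshape(shape, order=order)

def row_addresses(a):
    # Hypothetical helper: memory address of the first element of each row.
    base = a.__array_interface__['data'][0]
    return [base + i * a.strides[0] for i in range(a.shape[0])]

# 5 float32 values per row = 20 bytes per row, not a multiple of 16.
a = aligned_zeros((3, 5), boundary=16, dtype=np.float32)
misalignment = [addr % 16 for addr in row_addresses(a)]
print(misalignment)  # only the first entry is 0
```

Only the first row starts on the boundary here, because the row stride
(20 bytes) is not a multiple of the boundary.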

So, what's your take on having each row aligned? Is this also useful for
FFTW, for example? If so, we should perhaps come up with a better
routine for the cookbook.
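
One possible shape for such a routine is sketched below: pad each row up to a
multiple of the boundary and return a strided view of the unpadded columns.
This is only a sketch under stated assumptions, not a tested cookbook entry:
the name aligned_rows_zeros is hypothetical, it handles only 2D C-ordered
arrays, and it assumes the boundary is a multiple of the item size.

```python
import numpy as np

def aligned_rows_zeros(shape, boundary=32, dtype=float):
    """Hypothetical helper: 2D zeros where every row starts on `boundary`."""
    rows, cols = shape
    d = np.dtype(dtype)
    if boundary % d.itemsize != 0:
        # Assumption of this sketch: padded rows hold a whole number of items.
        raise ValueError("boundary must be a multiple of the item size")
    row_bytes = cols * d.itemsize
    # Round the row length up to the next multiple of `boundary`.
    stride = -(-row_bytes // boundary) * boundary
    # Over-allocate by `boundary` bytes so an aligned start always fits.
    tmp = np.zeros(rows * stride + boundary, dtype=np.uint8)
    address = tmp.__array_interface__['data'][0]
    offset = (boundary - address % boundary) % boundary
    # Reinterpret the aligned bytes as dtype d, one padded row per line.
    padded = tmp[offset:offset + rows * stride].view(d)
    padded = padded.reshape(rows, stride // d.itemsize)
    # Drop the padding columns; the result is a non-contiguous view.
    return padded[:, :cols]

img = aligned_rows_zeros((3, 5), boundary=32, dtype=np.float32)
base = img.__array_interface__['data'][0]
row_offsets = [(base + i * img.strides[0]) % 32 for i in range(img.shape[0])]
print(img.shape, img.strides, row_offsets)
```

The price is that the result is no longer contiguous, so any C code receiving
it has to honor the row stride rather than assume rows are packed.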

