[Numpy-discussion] "aligned" matrix / ctypes

David Cournapeau cournapeau@cslab.kecl.ntt.co...
Thu Apr 24 21:34:59 CDT 2008


On Fri, 2008-04-25 at 04:15 +0200, Sturla Molden wrote:
> The problem with alignment on 3-byte boundaries is that 3 is a prime and
> not a factor of the size of any common data type. (The only exception I
> can think of is 24-bit RGB values.) So in general, the elements in an
> array for which the first element is aligned on a 3-byte boundary may or
> may not be 3-byte aligned.
> Byte boundary alignment should thus be a bit intelligent. If the size of
> the dtype is not divisible by the byte boundary, an exception should be
> raised.
> 
> In practice, only alignment on 2-, 4- and perhaps 8-byte boundaries is
> really required.
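
To make the quoted point concrete, here is a minimal Python sketch. The
function names are made up for this illustration and are not part of
NumPy; addresses are plain integers, so nothing is actually allocated.

```python
def all_elements_aligned(first_address, itemsize, boundary, count):
    """True if every element of a contiguous array starts on `boundary`."""
    return all((first_address + i * itemsize) % boundary == 0
               for i in range(count))


def check_boundary(itemsize, boundary):
    """Raise if the element size is not a multiple of the boundary.

    This mirrors the rule proposed in the quoted text; it is an
    illustration, not an existing NumPy check.
    """
    if itemsize % boundary != 0:
        raise ValueError(
            "itemsize %d is not divisible by boundary %d" % (itemsize, boundary))


# 4-byte elements, first element on a 3-byte boundary: later ones drift off.
print(all_elements_aligned(3, 4, 3, 8))    # False
# 3-byte elements (24-bit RGB) stay on the 3-byte boundary.
print(all_elements_aligned(3, 3, 3, 8))    # True
# 8-byte doubles on a 4-byte boundary: itemsize is a multiple of 4.
print(all_elements_aligned(16, 8, 4, 8))   # True
```

When the first element is aligned, every later element is aligned exactly
when the item size is a multiple of the boundary, which is why a simple
divisibility check is enough.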

There are other useful alignments. I don't know about MMX, but for SSE,
16-byte alignment is almost required for a useful speedup (to be able to
use movaps instead of movups, which is extremely slow, when loading data
from memory into SSE registers). I once saw a mention that the MKL also
sometimes requires 64-byte alignment.

>  Alignment on 2 byte boundaries should perhaps be NumPy's
> default (over no alignment), as MMX and SSE extensions depend on it.

malloc on glibc aligns allocations on 8-byte boundaries by default, and
malloc on Mac OS X on 16-byte boundaries. I would guess, but should
check, that the same is true on Solaris, since SPARC does not like
unusual alignment (bus errors if floats are not 4-byte aligned, for
example).

I have somewhere the code for portable aligned allocators (mostly given
by Steve Johnson of FFTW fame), plus a C API to access them in C
extensions, plus a default alignment of 16 bytes for PyDataMem_NEW
(which is just a wrapper around those aligned allocators).
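
The code mentioned above is not reproduced here. As a rough illustration
of the usual trick behind portable aligned allocation (over-allocate,
then skip forward to the first aligned address), here is a minimal sketch
using only ctypes. The name aligned_buffer is hypothetical, and the
restriction to power-of-two alignments is an assumption of this sketch.

```python
import ctypes


def aligned_buffer(nbytes, alignment=16):
    """Return an nbytes-long ctypes char array whose address is a
    multiple of `alignment` (assumed here to be a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Over-allocate so that an aligned start always fits inside the block.
    raw = ctypes.create_string_buffer(nbytes + alignment - 1)
    base = ctypes.addressof(raw)
    # Number of bytes to skip to reach the next multiple of `alignment`.
    offset = (-base) % alignment
    # The view shares memory with `raw` and keeps it referenced.
    return (ctypes.c_char * nbytes).from_buffer(raw, offset)


buf = aligned_buffer(1024, 16)
print(ctypes.addressof(buf) % 16)   # 0
print(len(buf))                     # 1024
```

The cost is at most alignment - 1 wasted bytes per allocation. A ctypes
array like this exposes the buffer protocol, so it should be possible to
wrap it in an array without copying, for example with numpy.frombuffer.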

cheers,

David


