[SciPy-user] numpy aligned memory
Sturla Molden
sturla@molden...
Thu Mar 12 13:15:32 CDT 2009
On 3/12/2009 4:05 PM, Andrew Straw wrote:
> So, what's your take on having each row aligned? Is this also useful for
> FFTW, for example? If so, we should perhaps come up with a better
> routine for the cookbook.
Ok, so here is how it could be done. It fails for a reason I'll
attribute to a bug in NumPy.
import numpy as np
def _nextpow(b,isize):
i = 1
while b**i < isize:
i += 1
return b**i
def aligned_zeros(shape, boundary=16, dtype=float, order='C',
imagealign=True):
if (not imagealign) or (not hasattr(shape,'__len__')):
N = np.prod(shape)
d = np.dtype(dtype)
tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8)
address = tmp.__array_interface__['data'][0]
offset = (boundary - address % boundary) % boundary
return tmp[offset:offset+N*d.itemsize]\
.view(dtype=d)\
.reshape(shape, order=order)
else:
if order == 'C':
ndim0 = shape[-1]
dim0 = -1
else:
ndim0 = shape[0]
dim0 = 0
d = np.dtype(dtype)
bshape = [i for i in shape]
padding = boundary + _nextpow(boundary, d.itemsize) - d.itemsize
bshape[dim0] = ndim0*d.itemsize + padding
print bshape
tmp = np.zeros(bshape, dtype=np.uint8, order=order)
address = tmp.__array_interface__['data'][0]
offset = (boundary - address % boundary) % boundary
aligned_slice = slice(offset, offset + ndim0*d.itemsize)
if tmp.flags['C_CONTIGUOUS']:
tmp = tmp[..., aligned_slice]
print tmp.shape
else:
tmp = tmp[aligned_slice, ...]
print tmp.shape
return tmp.view(dtype=dtype) # this will often fail,
# probably a bug in numpy
So lets reproduce the NumPy issue:
>>> a = zeros((10,52), dtype=uint8)
>>> b = a[:, 3:8*2+3]
>>> b.shape
(10, 16)
>>> b.view(dtype=float)
Traceback (most recent call last):
File "<pyshell#86>", line 1, in <module>
b.view(dtype=float)
ValueError: new type not compatible with array.
However:
>>> a = zeros((10,16), dtype=uint8)
>>> a.view(dtype=float)
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
Until we find a way to overcome this, it will be difficult to align rows
to particular byte boundaries. It fails even if we make sure the padding
is a multiple of the item size:
padding = (boundary + _nextpow(boundary, d.itemsize) \
- d.itemsize) * d.itemsize
Very annoying..
Using allocators in libraries (e.g. FFTW) would not help either, as
NumPy would fail in the same way.
Maybe we can force NumPy to do the right thing by hard-coding an array
descriptor?
We can do this in Cython though, as it supports pointers and double
indirection. But it would be like using C.
Sturla Molden
More information about the SciPy-user
mailing list