[Numpy-discussion] Byte aligned arrays
Charles R Harris
Wed Dec 19 09:27:54 CST 2012
On Wed, Dec 19, 2012 at 8:10 AM, Nathaniel Smith <firstname.lastname@example.org> wrote:
> On Wed, Dec 19, 2012 at 2:57 PM, Charles R Harris
> <email@example.com> wrote:
> > On Wed, Dec 19, 2012 at 7:43 AM, Nathaniel Smith <firstname.lastname@example.org> wrote:
> >> On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall <email@example.com> wrote:
> >> > I've written a few simple cython routines for assisting in creating
> >> > byte-aligned numpy arrays. The point being for the arrays to work with
> >> > SSE/AVX code.
> >> >
> >> > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi
> >> >
> >> > The recent change has been to add a check on the CPU as to what
> >> > is supported (though it's not complete, I should make the default
> >> > return 0 or something).
> >> >
> >> > It occurred to me that this is something that (a) other people almost
> >> > certainly need and are solving themselves and (b) I lack the necessary
> >> > platforms to test all the possible CPU/OS combinations to make sure
> >> > something sensible happens in all cases.
> >> >
> >> > Is this something that can be rolled into Numpy (the feature, not my
> >> > particular implementation or interface - though I'd be happy for it to
> >> > be so)?
> >> >
> >> > Regarding (b), I've written a test case that works for Linux on x86-64
> >> > with GCC (my platform!). I can test it on 32-bit Windows, but that's
> >> > it. Is ARM supported by Numpy? Neon would be great to include as well.
> >> > What other platforms might need this?
> >> Your code looks simple and portable to me (at least the alignment
> >> part). I can see a good argument for adding this sort of functionality
> >> directly to numpy with a nice interface, though, since these kinds of
> >> requirements seem quite common these days. Maybe an interface like:
> >>     a = np.asarray([1, 2, 3], base_alignment=32)  # should this be in bits or in bytes?
> >>     b = np.empty((10, 10), order="C", base_alignment=32)
> >>     # etc.
> >>     assert a.base_alignment == 32
> >> which underneath tries to use posix_memalign/_aligned_malloc when
> >> possible, or falls back on the overallocation trick otherwise?
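
[Note: the overallocation trick mentioned above can be sketched in pure NumPy
roughly as follows. This is only an illustration under stated assumptions:
`empty_aligned` and its `alignment` parameter (taken to be in bytes) are
hypothetical names, not an existing NumPy API, and the sketch does not use
posix_memalign or _aligned_malloc at all.]

```python
import numpy as np

def empty_aligned(shape, dtype=np.float64, alignment=32):
    """Return an uninitialized array whose data address is a multiple of
    `alignment` bytes. Hypothetical helper, not part of NumPy."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape, dtype=np.intp)) * dtype.itemsize
    # Over-allocate by `alignment` bytes so an aligned start always fits.
    raw = np.empty(nbytes + alignment, dtype=np.uint8)
    # Bytes to skip from the raw start to reach the next aligned address.
    offset = -raw.ctypes.data % alignment
    # Reinterpret the aligned window; the view keeps `raw` alive via .base.
    return raw[offset:offset + nbytes].view(dtype).reshape(shape)

a = empty_aligned((10, 10), alignment=32)
print(a.ctypes.data % 32 == 0)
```

The cost is up to `alignment` wasted bytes per array, which is why a real
implementation would presumably prefer the platform allocator when one is
available.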
> > There is a thread about this from several years back. IIRC, David
> > was interested in the same problem. At first glance, the alignment
> > looks interesting. One possible concern is keeping alignment for rows,
> > views, etc., which is probably not possible in any sensible way. But
> > those who need this most likely know what they are doing and just need
> > memory allocated on the proper boundary.
> Right, my intuition is that it's like order="C" -- if you make a new
> array by, say, indexing, then it may or may not have order="C", no
> guarantees. So when you care, you call asarray(a, order="C") and that
> either makes a copy or not as needed. Similarly for base alignment.
> I guess to push this analogy even further we could define a set of
> array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2
> alignment matters, I think, so the number of flags would remain
> manageable?) That would make the C API easier to deal with too, no
> need to add PyArray_FromAnyAligned.
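
[Note: the "copy only when needed" semantics described above, by analogy with
asarray(a, order="C"), might look something like the sketch below. The names
`base_alignment` and `as_aligned` are hypothetical and do not exist in NumPy;
the sketch assumes a plain numeric dtype and a byte-valued, power-of-2
alignment.]

```python
import numpy as np

def base_alignment(a):
    """Largest power of two (in bytes) dividing the array's data address."""
    addr = a.ctypes.data
    return addr & -addr  # isolates the lowest set bit of the address

def as_aligned(a, alignment=32):
    """Return `a` unchanged if its data is already aligned, else an aligned
    copy. Hypothetical helper mirroring asarray(a, order="C")."""
    a = np.asarray(a)
    if a.ctypes.data % alignment == 0:
        return a  # already satisfies the request, no copy
    # Otherwise copy into an over-allocated buffer at an aligned offset.
    raw = np.empty(a.nbytes + alignment, dtype=np.uint8)
    offset = -raw.ctypes.data % alignment
    out = raw[offset:offset + a.nbytes].view(a.dtype).reshape(a.shape)
    out[...] = a
    return out

base = as_aligned(np.arange(16, dtype=np.float64), 32)
view = base[1:]               # starts 8 bytes in, so no longer 32-byte aligned
fixed = as_aligned(view, 32)  # forces a copy onto a 32-byte boundary
```

As with order="C", a view made by indexing carries no guarantee, so callers
that care would re-request the alignment at the point of use.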
Another possibility is an aligned datatype, basically an aligned structured
array with floats/ints in chunks of the appropriate size. IIRC, GCC support
for SSE is something like that.
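
[Note: one way to picture the "chunks of the appropriate size" idea is a
structured dtype whose single field packs four float32 values into a 16-byte
element. The name `v4sf` is only illustrative. This fixes the element size
only; as far as I can tell it does not by itself make NumPy allocate the
buffer on a 16-byte boundary, so the base address would still have to be
handled separately.]

```python
import numpy as np

# Hypothetical "vector" dtype: one field holding four float32 values,
# so each element occupies exactly 16 bytes (one SSE register's worth).
v4sf = np.dtype([("v", np.float32, (4,))])

data = np.arange(8, dtype=np.float32)
chunks = data.view(v4sf)      # reinterpret 8 floats as two 16-byte elements

print(v4sf.itemsize)          # size of one chunk in bytes
print(chunks.shape)           # number of chunks
print(chunks["v"])            # the floats, grouped four per chunk
```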