[Numpy-discussion] object array alignment issues
Charles R Harris
Thu Oct 15 12:00:04 CDT 2009
On Thu, Oct 15, 2009 at 10:40 AM, Michael Droettboom <firstname.lastname@example.org> wrote:
> I recently committed a regression test and bugfix for object pointers in
> record arrays of unaligned size (meaning where each record is not a
> multiple of sizeof(PyObject **)).
> For example:
> a1 = np.zeros((10,), dtype=[('o', 'O'), ('c', 'c')])
> a2 = np.zeros((10,), 'S10')
> # This copying would segfault
> a1['o'] = a2
> Unfortunately, this unit test has opened up a whole hornet's nest of
> alignment issues on Solaris.
No surprise there. Good unit tests seem to routinely uncover hornet's nests
and Solaris is a platform that exercises the alignment part of the code. I
think it is great that you are finding these problems. We folks working on
Intel don't see them so much.
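To make "unaligned size" concrete, here is a minimal sketch that only inspects the dtype layout and never touches the object data. It assumes a reasonably recent NumPy; the variable names are mine, not anything from the test suite.

```python
import numpy as np

# Packed layout: an object pointer followed by a single char, no padding.
packed = np.dtype([('o', 'O'), ('c', 'c')])
ptr_size = np.dtype('O').itemsize      # size of one object pointer
o_offset = packed.fields['o'][1]       # byte offset of 'o' inside a record

# Byte offset of the 'o' pointer in each of the first four records,
# measured from the start of the array's buffer.
o_offsets = [i * packed.itemsize + o_offset for i in range(4)]

# An offset that is not a multiple of the pointer size is misaligned
# (assuming the buffer itself starts on an aligned address).
misaligned = [off % ptr_size != 0 for off in o_offsets]

print(packed.itemsize, ptr_size)
print(o_offsets, misaligned)
```

Only the first record's pointer sits on a pointer-sized boundary; every later one is off by the accumulated extra byte per record.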
> The various reference counting functions
> (PyArray_INCREF etc.) in refcnt.c all fail on unaligned object pointers,
> for instance. Interestingly, there are comments in there saying
> "handles misaligned data" (e.g. line 190), but in fact it doesn't, and
> doesn't look to me like it would. But I won't rule out a mistake in
> building it on my part.
> So, how to fix this?
> One obvious workaround is for users to pass "align=True" to the dtype
> constructor. This works if the dtype descriptor is a dictionary or
> comma-separated string. Is there a reason I'm missing why it couldn't be
> made to work with the list-of-tuples form? It would be marginally more
> convenient for my application, but that's just a finesse issue.
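For reference, a small sketch of the workaround using the two forms mentioned above, the dictionary and the comma-separated string. It assumes a reasonably recent NumPy, and 'S1' stands in for the one-byte 'c' field; it shows only the resulting layout, not a claim about what any particular release accepts beyond these two forms.

```python
import numpy as np

ptr_size = np.dtype('O').itemsize

# Dictionary form with align=True: padding is inserted much as a C
# compiler would pad a struct.
aligned_dict = np.dtype({'names': ['o', 'c'], 'formats': ['O', 'S1']},
                        align=True)

# Comma-separated string form; the fields get the default names f0 and f1.
aligned_str = np.dtype('O,S1', align=True)

# Packed equivalent, for comparison.
packed = np.dtype({'names': ['o', 'c'], 'formats': ['O', 'S1']})

print(packed.itemsize, aligned_dict.itemsize, aligned_str.itemsize)
```

With align=True the record size is rounded up to a multiple of the pointer size, so every record's object pointer lands on an aligned offset, at the cost of the padding bytes.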
> However, perhaps we should try to fix the underlying alignment
> problems? Unfortunately, it's not clear to me how to resolve them
> without at least some performance penalty. You either do an alignment
> check of the pointer, and then memcpy if unaligned, or just always use
> memcpy. Not sure which is faster, as memcpy may have a fast path
> already. These are object arrays anyway, so there's plenty of overhead
> already, and I don't think this would affect regular numerical arrays.
I believe the memcpy approach is used for other unaligned parts of void
types. There is an inherent performance penalty there, but I don't see how
it can be avoided when using what are essentially packed structures. As to
memcpy, its performance seems to depend on the compiler and compiler version;
old versions of gcc had *horrible* implementations of memcpy. I believe the
situation has since improved. However, I'm not sure we should be coding to
compiler issues unless it is unavoidable or the gain is huge.
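A rough sketch of the "check alignment, otherwise memcpy" idea, written with Python's ctypes purely as a stand-in for what would be C inside refcnt.c. The helper names are hypothetical and this is not NumPy's actual code; it assumes size_t and a pointer have the same size, which holds on the usual platforms.

```python
import ctypes

PTR_SIZE = ctypes.sizeof(ctypes.c_void_p)

def is_aligned(address, alignment=PTR_SIZE):
    """True if address is a multiple of alignment."""
    return address % alignment == 0

def load_pointer_sized(address):
    """Read a pointer-sized unsigned value from a possibly unaligned address."""
    if is_aligned(address):
        # Fast path: a direct typed read is safe on every platform.
        return ctypes.c_size_t.from_address(address).value
    # Slow path: copy byte-wise into an aligned temporary, then read that.
    tmp = ctypes.c_size_t()
    ctypes.memmove(ctypes.addressof(tmp), address, PTR_SIZE)
    return tmp.value

def store_pointer_sized(address, value):
    """Write a pointer-sized value; memmove makes no alignment assumption."""
    tmp = ctypes.c_size_t(value)
    ctypes.memmove(address, ctypes.addressof(tmp), PTR_SIZE)

# Two packed (pointer, char) records, like the dtype in the example above.
record_size = PTR_SIZE + 1
buf = ctypes.create_string_buffer(2 * record_size)
base = ctypes.addressof(buf)

store_pointer_sized(base, 0x1234)
store_pointer_sized(base + record_size, 0x5678)
values = [load_pointer_sized(base + i * record_size) for i in range(2)]
print([hex(v) for v in values])
```

Whether the explicit check pays for itself over always calling memcpy is exactly the open question; this only shows that both paths give the same answer.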
> If we choose not to fix it, perhaps we should try to warn when
> creating an unaligned recarray on platforms where it matters? I do
> worry about having something that works perfectly well on one platform
> fail on another.
> In the meantime, I'll just mark the new regression test to "skip on