[Numpy-discussion] object array alignment issues

Travis Oliphant oliphant@enthought....
Fri Oct 16 22:35:13 CDT 2009


On Oct 15, 2009, at 11:40 AM, Michael Droettboom wrote:

> I recently committed a regression test and bugfix for object pointers in
> record arrays of unaligned size (meaning where each record is not a
> multiple of sizeof(PyObject **)).
>
> For example:
>
>        a1 = np.zeros((10,), dtype=[('o', 'O'), ('c', 'c')])
>        a2 = np.zeros((10,), 'S10')
>        # This copying would segfault
>        a1['o'] = a2
>
> http://projects.scipy.org/numpy/ticket/1198
>
> Unfortunately, this unit test has opened up a whole hornet's nest of
> alignment issues on Solaris.  The various reference counting functions
> (PyArray_INCREF etc.) in refcnt.c all fail on unaligned object pointers,
> for instance.  Interestingly, there are comments in there saying
> "handles misaligned data" (e.g. line 190), but in fact it doesn't, and
> doesn't look to me like it would.  But I won't rule out a mistake in
> building it on my part.

Thanks for this bug report.  It would be very helpful if you could
provide the line number where the code gives a bus error and explain
why you think the code in question does not handle misaligned data.
It still seems to me that it should, but perhaps I am missing
something (I don't have a Solaris box to test on).  Perhaps the real
problem is elsewhere: for example, in other places that make the
mistake you pointed out in another part of the code, where the fast
aligned path is taken without first checking that the strides are
aligned as well.
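
To illustrate the stride point: an aligned fast path is only safe when
every element address is aligned, which requires both the base pointer
and the stride to be multiples of the required alignment.  A minimal
sketch in C, where the function name can_use_aligned_path and its
parameters are hypothetical and not taken from the NumPy source:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a NumPy function: returns 1 only when both the
   base address and the stride (in bytes) are multiples of `alignment`,
   so that every element reached by striding is aligned as well. */
static int can_use_aligned_path(const void *data, ptrdiff_t stride,
                                size_t alignment)
{
    uintptr_t addr = (uintptr_t)data;

    assert(alignment != 0);
    /* The remainder is zero for any exact multiple, including the
       negative strides used when walking an array backwards. */
    return (addr % alignment == 0) &&
           (stride % (ptrdiff_t)alignment == 0);
}
```

For a record made of one object pointer plus one char, the stride would
presumably be sizeof(PyObject *) + 1 bytes, so a check like this would
reject the fast path even when the first element happens to be aligned.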

This was the thinking behind why the code (that I think is in question)
should handle misaligned data:

1) pointers that are not stored at a correctly aligned address need to
be copied to an aligned memory area before being dereferenced.
2) local variables defined in a function will be aligned by the C
compiler.

So, what the code in refcnt.c does is copy the value in the NumPy
data area (i.e. pointed to by it->dataptr) to another memory location
(the stack variable temp), dereference it, and then increment its
reference count.

196:  temp = (PyObject **)it->dataptr;
197:  Py_XINCREF(*temp);

I'm puzzled why this should fail.  A stack trace showing where
it fails would be very useful in figuring out what to fix.
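
One way to realize points 1) and 2) above is to copy the bytes of the
stored pointer into an aligned local variable with memcpy before using
it.  The sketch below is only an illustration under assumptions: Obj,
load_obj_ptr, store_obj_ptr and incref_at are hypothetical stand-ins,
not the PyObject type or the actual code in refcnt.c.  Note that a
pointer cast by itself does not copy anything; in this sketch it is the
memcpy that moves the value into aligned storage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted object
   (deliberately not the real PyObject). */
typedef struct {
    long refcnt;
} Obj;

/* Store an object pointer at a possibly unaligned address. */
static void store_obj_ptr(char *dataptr, Obj *obj)
{
    assert(dataptr != NULL);
    memcpy(dataptr, &obj, sizeof(obj));
}

/* Load an object pointer from a possibly unaligned address.
   memcpy copies the bytes into the local `temp`, which the
   compiler aligns correctly. */
static Obj *load_obj_ptr(const char *dataptr)
{
    Obj *temp;

    assert(dataptr != NULL);
    memcpy(&temp, dataptr, sizeof(temp));
    return temp;
}

/* Increment the reference count of the object whose pointer is stored
   at dataptr; a NULL pointer is skipped, similar in spirit to
   Py_XINCREF. */
static void incref_at(const char *dataptr)
{
    Obj *temp = load_obj_ptr(dataptr);

    if (temp != NULL) {
        temp->refcnt++;
    }
}
```

With a buffer such as char buf[sizeof(Obj *) + 1], calling
store_obj_ptr(buf + 1, &o) followed by incref_at(buf + 1) increments
o.refcnt without ever dereferencing the address buf + 1 as a pointer.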


This is all independent of defining a variable that decides whether we
need to handle unaligned data at all (which we could skip on Intel and
AMD).

I'm all in favor of such a flag if it would speed up code, but I don't  
see it as the central issue here.
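
Such a flag could, for instance, be a compile-time macro that selects
between a direct load and a byte-wise copy.  A rough sketch, assuming a
hypothetical macro name HYPOTHETICAL_FORCE_ALIGNED_COPY and a
hypothetical helper load_pointer (neither is an existing NumPy setting
or function):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: 1 = always copy through an aligned local (safe on
   every platform); 0 = load directly, for platforms known to tolerate
   unaligned access. */
#ifndef HYPOTHETICAL_FORCE_ALIGNED_COPY
#define HYPOTHETICAL_FORCE_ALIGNED_COPY 1
#endif

/* Read a pointer value stored at a possibly unaligned address. */
static void *load_pointer(const char *dataptr)
{
    void *value;

    assert(dataptr != NULL);
#if HYPOTHETICAL_FORCE_ALIGNED_COPY
    /* Portable path: copy the bytes into the aligned local `value`. */
    memcpy(&value, dataptr, sizeof(value));
#else
    /* Direct load: relies on the hardware tolerating unaligned access;
       the C standard does not guarantee that this works. */
    value = *(void *const *)dataptr;
#endif
    return value;
}
```

The default here is the copying path, so the direct load is only used
when a build explicitly opts in.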

Any more details about the bug you have found would be greatly  
appreciated.

-Travis



