[Numpy-discussion] object array alignment issues

Michael Droettboom mdroe@stsci....
Sun Oct 18 07:04:15 CDT 2009


On 10/16/2009 11:35 PM, Travis Oliphant wrote:
>
> On Oct 15, 2009, at 11:40 AM, Michael Droettboom wrote:
>
>> I recently committed a regression test and bugfix for object pointers in
>> record arrays of unaligned size (meaning where each record is not a
>> multiple of sizeof(PyObject **)).
>>
>> For example:
>>
>>        a1 = np.zeros((10,), dtype=[('o', 'O'), ('c', 'c')])
>>        a2 = np.zeros((10,), 'S10')
>>        # This copying would segfault
>>        a1['o'] = a2
>>
>> http://projects.scipy.org/numpy/ticket/1198
>>
>> Unfortunately, this unit test has opened up a whole hornet's nest of
>> alignment issues on Solaris.  The various reference counting functions
>> (PyArray_INCREF etc.) in refcnt.c all fail on unaligned object pointers,
>> for instance.  Interestingly, there are comments in there saying
>> "handles misaligned data" (eg. line 190), but in fact it doesn't, and
>> doesn't look to me like it would.  But I won't rule out a mistake in
>> building it on my part.
>
> Thanks for this bug report.      It would be very helpful if you could 
> provide the line number where the code is giving a bus error and 
> explain why you think the code in question does not handle misaligned 
> data (it still seems like it should to me --- but perhaps I must be 
> missing something --- I don't have a Solaris box to test on).   
> Perhaps, the real problem is elsewhere (such as other places where the 
> mistake of forgetting about striding needing to be aligned also before 
> pursuing the fast alignment path that you pointed out in another place 
> of code).
>
> This was the thinking for why the code (that I think is in question) 
> should handle mis-aligned data:
>
> 1) pointers that are not aligned to the correct size need to be copied 
> to an aligned memory area before being de-referenced.
> 2) static variables defined in a function will be aligned by the C 
> compiler.
>
> So, what the code in refcnt.c does is to copy the value in the NumPy 
> data-area (i.e. pointed to by it->dataptr) to another memory location 
> (the stack variable temp), dereference it and then increment its 
> reference count.
>
> 196:  temp = (PyObject **)it->dataptr;
> 197:  Py_XINCREF(*temp);

This is exactly an instance that fails.  Let's say we have a PyObject at 
an aligned location 0x4000 (PyObjects themselves always seem to be 
aligned -- I strongly suspect CPython is enforcing that).  Then, we can 
create a recarray such that some of the PyObject*'s in it are at 
unaligned locations.  For example, if the dtype is 'O,c', you have a 
record stride of 5 which creates unaligned PyObject*'s:

    OOOOcOOOOcOOOOc
    0123456789abcde
         ^    ^

Now in the code above, let's assume that it->dataptr points to an 
unaligned location, 0x8005.  Assigning it to temp puts the same 
unaligned value in temp, 0x8005.  That is:

    &temp == 0x1000 /* The location of temp *is* on the stack and aligned */
    temp == 0x8005  /* But its value as a pointer points to an unaligned
                       memory location */
    *temp == 0x4000 /* Dereferencing it should get us back to the original
                       PyObject * pointer, but dereferencing an unaligned
                       memory location fails with a bus error on Solaris */

So the bus error occurs on line 197.
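
As a sanity check on the 'O,c' layout drawn earlier, the unaligned offsets can be worked out with a little arithmetic. This is only a sketch, assuming 4-byte pointers and a 1-byte char field as in the diagram; `record_pointer_offset` and `is_aligned` are hypothetical helper names, not NumPy functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the pointer field in record i,
   assuming the pointer is the first field of every record. */
static size_t record_pointer_offset(size_t i, size_t record_size)
{
    return i * record_size;
}

/* Hypothetical helper: nonzero if offset is a multiple of align. */
static int is_aligned(size_t offset, size_t align)
{
    assert(align != 0);
    return offset % align == 0;
}
```

With a record size of 5 and an alignment of 4, records 1 and 2 start at offsets 5 and 10, neither of which is a multiple of 4; the pointer field only lands on an aligned offset again at record 4 (offset 20).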

Note that something like:

    PyObject* temp;
    temp = *(PyObject **)it->dataptr;

would also fail.

The solution (this is what works for me, though there may be a better way):

     PyObject *temp; /* NB: temp is now a (PyObject *), not a (PyObject **) */
     /* memcpy works byte-by-byte, so can handle an unaligned assignment */
     memcpy(&temp, it->dataptr, sizeof(PyObject *));
     Py_XINCREF(temp);
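
The same idea can be tried without the Python headers. What follows is a minimal sketch, not the actual refcount code: `FakeObject` is a hypothetical stand-in for PyObject, and `incref_at` is an illustrative name for "load the pointer with memcpy, then bump the count".

```c
#include <assert.h>
#include <string.h>

/* Stand-in for a reference-counted object; NOT the real PyObject. */
typedef struct {
    long refcnt;
} FakeObject;

/* Read an object pointer from a possibly unaligned address and
   increment its reference count.  A NULL pointer is skipped, which
   mirrors how Py_XINCREF treats NULL. */
static void incref_at(const char *dataptr)
{
    FakeObject *temp;

    assert(dataptr != NULL);
    /* memcpy places no alignment requirement on its arguments, so the
       pointer value is copied into the aligned local before use. */
    memcpy(&temp, dataptr, sizeof(FakeObject *));
    if (temp != NULL) {
        temp->refcnt++;
    }
}
```

The key point is that the unaligned address is only ever handed to memcpy; the only pointer that gets dereferenced is the aligned copy in `temp`.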

I'm proposing adding a macro which on Intel/AMD would be defined as:

#define COPY_PYOBJECT_PTR(dst, src) (*(PyObject **)(dst) = *(PyObject **)(src))

and on alignment-required platforms as:

#define COPY_PYOBJECT_PTR(dst, src) (memcpy((dst), (src), sizeof(PyObject *)))

and it would be used something like:

COPY_PYOBJECT_PTR(&temp, it->dataptr);
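
A rough sketch of how such a conditional definition might look, using `void *` in place of `PyObject *` so it compiles without the Python headers. `HYPOTHETICAL_UNALIGNED_OK`, `COPY_POINTER` and `load_pointer` are names made up for this illustration; how the platform would actually be detected is not settled here.

```c
#include <assert.h>
#include <string.h>

/* HYPOTHETICAL_UNALIGNED_OK is an assumed build flag, not an existing
   NumPy define.  When it is not set, the memcpy path is used. */
#ifdef HYPOTHETICAL_UNALIGNED_OK
/* Direct load/store; both sides are cast because src is typically a
   char * such as it->dataptr. */
#define COPY_POINTER(dst, src) (*(void **)(dst) = *(void **)(src))
#else
/* Byte-wise copy for platforms that trap on unaligned access. */
#define COPY_POINTER(dst, src) ((void)memcpy((dst), (src), sizeof(void *)))
#endif

/* Illustrative wrapper: fetch a pointer stored at src. */
static void *load_pointer(const char *src)
{
    void *out = NULL;

    assert(src != NULL);
    COPY_POINTER(&out, src);
    return out;
}
```

Defaulting to the memcpy branch keeps the sketch safe everywhere; the direct branch is only an optimization for platforms known to tolerate unaligned access.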

If you agree with this assessment, I'm working on a patch for all of the 
locations that require this change.  All that I've found so far are 
related to object arrays.  It seems that many places where this would be 
an issue for numeric types are already using this memcpy technique (e.g. 
*_copyswap in arraytype.c.src:1716).  I think this issue shows up in 
object arrays much more because there are many more places where the 
unaligned memory is dereferenced (in order to do reference counting).

So here's the traceback from:

a1 = np.zeros((10,), dtype=[('o', 'O'), ('c', 'c'), ('i', 'i'), ('c2', 'c')])

Unfortunately, I'm having trouble getting line numbers out of the 
debugger, but "print statement debugging" tells me the innermost frame 
here is in refcount.c:

275        PyObject **temp;
276        Py_XINCREF(obj);
277        temp = (PyObject **)optr;
278        *temp = obj; /* <-- here */
279        return;

My fix was:

Py_XINCREF(obj);
memcpy(optr, &obj, sizeof(PyObject*));
return;
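
The store direction can be exercised the same way. Again a sketch under stated assumptions: `FakeObject` is a hypothetical stand-in for PyObject, and `fill_slot` / `fill_records` are illustrative names, not the real _fillobject code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a reference-counted object; NOT the real PyObject. */
typedef struct {
    long refcnt;
} FakeObject;

/* Store obj at a possibly unaligned slot and take a reference to it. */
static void fill_slot(char *optr, FakeObject *obj)
{
    assert(optr != NULL);
    if (obj != NULL) {
        obj->refcnt++;
    }
    /* Byte-wise copy: safe even when optr is not pointer-aligned. */
    memcpy(optr, &obj, sizeof(FakeObject *));
}

/* Fill the leading pointer field of n packed records. */
static void fill_records(char *data, size_t n, size_t record_size,
                         FakeObject *obj)
{
    size_t i;

    for (i = 0; i < n; i++) {
        fill_slot(data + i * record_size, obj);
    }
}
```

With a record size of `sizeof(FakeObject *) + 1`, every record after the first starts at an address that is not a multiple of the pointer size relative to the buffer start, which is the case that triggered the bus error.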

0xfeefaf60 in _fillobject ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
(gdb) bt
#0  0xfeefaf60 in _fillobject ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#1  0xfeefaf20 in _fillobject ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#2  0xfeefad40 in PyArray_FillObjectArray ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#3  0xfee90e04 in _zerofill ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#4  0xfeed48c4 in PyArray_Zeros ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#5  0xfef05638 in array_zeros ()
    from /home/mdroe/numpy_clean/build/lib.solaris-2.8-sun4u-2.5/numpy/core/multiarray.so
#6  0x37e8c in PyObject_Call ()
#7  0x9a7e8 in do_call ()
#8  0x9a264 in call_function ()
#9  0x9754c in PyEval_EvalFrameEx ()
#10 0x988d4 in PyEval_EvalCodeEx ()
#11 0x93d44 in PyEval_EvalCode ()
#12 0xb9150 in run_mod ()
#13 0xb9108 in PyRun_FileExFlags ()
#14 0xb80c4 in PyRun_SimpleFileExFlags ()
#15 0x3171c in Py_Main ()

Hope that illustrates the point better.  Sorry for the vagueness of my 
initial report.

Mike


