[Numpy-tickets] [NumPy] #551: numpy.ndarray messed up after unpickling

NumPy numpy-tickets@scipy....
Sat Apr 12 18:14:52 CDT 2008


#551: numpy.ndarray messed up after unpickling
---------------------------------------+------------------------------------
 Reporter:  cotackst                   |        Owner:  somebody
     Type:  defect                     |       Status:  new     
 Priority:  normal                     |    Milestone:  1.0.5   
Component:  numpy.core                 |      Version:  1.0.1   
 Severity:  critical                   |   Resolution:          
 Keywords:  pickle, ndarray, segfault  |  
---------------------------------------+------------------------------------
Comment (by pv):

 SOLVED: this is an alignment issue (see below for proof).

 Apparently, something in the SSE2 code in Atlas requires that incoming
 doubles be aligned to an 8-byte boundary and crashes if they are not.
 This is surprising, as documentation on the net says that SSE2 can
 optionally assume that values lie at 16-byte boundaries, not 8-byte ones!
 However, the GNU libc docs state that "In the GNU system, the address is
 always a multiple of eight on most systems, and a multiple of 16 on 64-bit
 systems.", so maybe there is some broken optimization that relies on this
 assumption. (And apparently the byte data in Python strings is
 consistently off by 4 from 8-byte alignment.)
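
 A minimal sketch (not part of the original report) of how the alignment
 of a buffer address can be inspected from Python with the standard
 `ctypes` module. The helper name `misalignment` and the 64-byte buffer
 size are illustrative assumptions; the point is only that a start address
 4 bytes past an 8-byte boundary gives a remainder of 4.

```python
import ctypes


def misalignment(address, boundary=8):
    """Return how many bytes `address` lies past a multiple of `boundary`."""
    return address % boundary


# Over-allocate so that a start address can be chosen inside the buffer.
raw = ctypes.create_string_buffer(64)
base = ctypes.addressof(raw)

# Smallest offset that puts the start address on an 8-byte boundary.
aligned_addr = base + (-base) % 8
# Shifting by 4 more bytes mimics the "off by 4" string data described above.
shifted_addr = aligned_addr + 4

print("aligned:", misalignment(aligned_addr))
print("shifted:", misalignment(shifted_addr))
```

 With numpy itself, the same remainder can be read from
 `arr.__array_interface__['data'][0] % 8`, as the test script in this
 comment does.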

 So Atlas + SSE2 is somehow broken (at least, unless this is clearly
 explained in its docs...). The question is what numpy should do about
 this.

   ***

 Finally, to convince yourself that the alignment really is at fault here,
 apply the patch
 {{{
 diff -r d6b1b710efac numpy/core/src/arraymethods.c
 --- a/numpy/core/src/arraymethods.c     Sat Apr 12 06:12:09 2008 +0300
 +++ b/numpy/core/src/arraymethods.c     Sun Apr 13 02:04:58 2008 +0300
 @@ -1229,6 +1229,16 @@ array_setstate(PyArrayObject *self, PyOb
          if (PyString_AsStringAndSize(rawdata, &datastr, &len))
              return NULL;

 +        {
 +            char *dupmem;
 +            dupmem = malloc(len + 16);
 +            dupmem += (((long long)dupmem) % 8) + 0; /* doesn't crash */
 +            /* dupmem += (((long long)dupmem) % 8) + 4;*/ /* crashes */
 +            memcpy(dupmem, datastr, len);
 +            datastr = dupmem;
 +            printf("--xx-- %p, %lld\n", dupmem, ((long long)dupmem) % 16);
 +        }
 +
          if ((len != (self->descr->elsize * size))) {
              PyErr_SetString(PyExc_ValueError,
                              "buffer size does not"  \
 }}}
 and run the following test code, first with numpy compiled from the patch
 as shown (the "crashes" line commented out), and then with the "doesn't
 crash" line commented out and the "crashes" line enabled:
 {{{
 $ cat test.py
 import numpy

 x = numpy.zeros((3,4))
 v = numpy.ones((1, x.shape[0]))

 z = x.__reduce__()[2]
 y = numpy.empty(x.shape, x.dtype)
 y.__setstate__(z)

 print "x:", x.__array_interface__['data'][0] % 16
 print "y:", y.__array_interface__['data'][0] % 16

 print "1"
 numpy.dot(v, x)
 print "2"
 numpy.dot(v, y)
 print "3"
 $ python test.py   # first run
 --xx-- 0x8327128, 8
 x: 8
 y: 8
 1
 2
 3
 $ # now comment out the first marked line, uncomment the second, and recompile numpy
 $ python test.py
 --xx-- 0x832712c, 12
 x: 8
 y: 12
 1
 2
 Segmentation fault
 }}}
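
 The addresses printed by the two runs can be checked by hand. The sketch
 below is an illustration only, and the helper names `naive_bump` and
 `align_up` are hypothetical. It also shows that the patch's
 `dupmem += dupmem % 8` lands on an 8-byte boundary only when malloc
 already returned one, which the glibc statement quoted above suggests is
 the case here. A general round-up would need `(-address) % boundary`
 instead.

```python
def naive_bump(address, boundary=8):
    """Mimic the patch: add the remainder itself to the address."""
    return address + address % boundary


def align_up(address, boundary=8):
    """Round `address` up to the next multiple of `boundary`."""
    return address + (-address) % boundary


# Addresses printed by the two runs in the transcript.
first_run = 0x8327128   # numpy.dot succeeds
second_run = 0x832712c  # numpy.dot segfaults

for addr in (first_run, second_run):
    print(hex(addr), "mod 8 =", addr % 8, "mod 16 =", addr % 16)

# A hypothetical address 2 bytes past an 8-byte boundary.
odd = 0x832712a
print(hex(naive_bump(odd)), "mod 8 =", naive_bump(odd) % 8)
print(hex(align_up(odd)), "mod 8 =", align_up(odd) % 8)
```

 For the debugging patch the shortcut is harmless, since it only needs one
 aligned and one misaligned copy of the same data.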

-- 
Ticket URL: <http://scipy.org/scipy/numpy/ticket/551#comment:21>
NumPy <http://projects.scipy.org/scipy/numpy>
The fundamental package needed for scientific computing with Python.

