[Numpy-tickets] [NumPy] #551: numpy.ndarray messed up after unpickling
NumPy
numpy-tickets@scipy....
Sat Apr 12 18:14:52 CDT 2008
#551: numpy.ndarray messed up after unpickling
---------------------------------------+------------------------------------
Reporter: cotackst | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: 1.0.5
Component: numpy.core | Version: 1.0.1
Severity: critical | Resolution:
Keywords: pickle, ndarray, segfault |
---------------------------------------+------------------------------------
Comment (by pv):
SOLVED: this is an alignment issue (see below for proof).
Apparently, something in the SSE2 code in Atlas requires that incoming
doubles are aligned to an 8-byte boundary and causes a crash if they are
not. This is surprising, as documentation on the net says that SSE2 can
optionally assume that values lie at 16-byte boundaries! However, GNU libc
docs state that "In the GNU system, the address is always a multiple of
eight on most systems, and a multiple of 16 on 64-bit systems.", so maybe
there is some broken optimization done based on this assumption. (And
apparently the byte data in Python strings is consistently off by 4 from
8-byte alignment.)
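To check the claim about string data on a given interpreter, the payload address can be inspected directly. The following is only a minimal sketch, written for Python 3 and CPython, and it uses `ctypes` rather than anything from the ticket; the helper names `payload_address` and `misalignment` are made up for illustration. The observation above was made on a 32-bit Python 2 build, so the numbers printed on a current 64-bit interpreter may well differ.

```python
import ctypes


def payload_address(data: bytes) -> int:
    """Address of the first payload byte of a bytes object (CPython only).

    Hypothetical helper, not part of numpy or the ticket.
    """
    return ctypes.cast(ctypes.c_char_p(data), ctypes.c_void_p).value


def misalignment(address: int, boundary: int) -> int:
    """How many bytes `address` lies past the previous multiple of `boundary`."""
    return address % boundary


# A few byte strings of different sizes, so that more than one
# allocation is looked at.
samples = [bytes(n) for n in (24, 96, 1000)]

for s in samples:
    addr = payload_address(s)
    print(hex(addr), "mod 8 =", misalignment(addr, 8),
          "mod 16 =", misalignment(addr, 16))
```

A nonzero "mod 8" value here would correspond to the situation described above, where the raw string data handed to `__setstate__` does not start on an 8-byte boundary.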
So Atlas + SSE2 is somehow broken (at least, unless this is clearly
explained in its docs...). The question is what numpy should do about
this.
***
Finally, to confirm that the alignment really is at fault here, apply
the patch
{{{
diff -r d6b1b710efac numpy/core/src/arraymethods.c
--- a/numpy/core/src/arraymethods.c Sat Apr 12 06:12:09 2008 +0300
+++ b/numpy/core/src/arraymethods.c Sun Apr 13 02:04:58 2008 +0300
@@ -1229,6 +1229,16 @@ array_setstate(PyArrayObject *self, PyOb
if (PyString_AsStringAndSize(rawdata, &datastr, &len))
return NULL;
+ {
+ char *dupmem;
+ dupmem = malloc(len + 16);
+ dupmem += (((long long)dupmem) % 8) + 0; /* doesn't crash */
+ /* dupmem += (((long long)dupmem) % 8) + 4;*/ /* crashes */
+ memcpy(dupmem, datastr, len);
+ datastr = dupmem;
+ printf("--xx-- %p, %lld\n", dupmem, ((long long)dupmem) % 16);
+ }
+
if ((len != (self->descr->elsize * size))) {
PyErr_SetString(PyExc_ValueError,
"buffer size does not" \
}}}
and run the following test code, first with numpy compiled as patched
(the first marked line active, the second commented out), and then with
the first marked line commented out and the second one active:
{{{
$ cat test.py
import numpy
x = numpy.zeros((3,4))
v = numpy.ones((1, x.shape[0]))
z = x.__reduce__()[2]
y = numpy.empty(x.shape, x.dtype)
y.__setstate__(z)
print "x:", x.__array_interface__['data'][0] % 16
print "y:", y.__array_interface__['data'][0] % 16
print "1"
numpy.dot(v, x)
print "2"
numpy.dot(v, y)
print "3"
$ python test.py # first run
--xx-- 0x8327128, 8
x: 8
y: 8
1
2
3
$ # uncomment the second marked line, comment out the first, and recompile numpy
$ python test.py
--xx-- 0x832712c, 12
x: 8
y: 12
1
2
Segmentation fault
}}}
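The trick used in the patch (over-allocate, then shift the start pointer to a chosen offset from an 8-byte boundary) can be sketched in Python 3 with `ctypes`. This is an illustration under stated assumptions, not the numpy code: `copy_with_offset` and its `skew` parameter are hypothetical names. Note one difference: the patch adds `dupmem % 8`, which leaves the pointer unchanged when `malloc` already returns an 8-byte-aligned address, while the sketch rounds up to the next boundary explicitly, so it does not depend on the allocator.

```python
import ctypes


def copy_with_offset(data: bytes, boundary: int = 8, skew: int = 0):
    """Copy `data` into a new buffer starting `skew` bytes past a multiple
    of `boundary`.

    Hypothetical helper for illustration. Returns (owner, address); keep
    `owner` alive for as long as `address` is used.
    """
    # Over-allocate so that there is room to move the start forward.
    owner = ctypes.create_string_buffer(len(data) + boundary + skew)
    base = ctypes.addressof(owner)
    # Bytes to skip to reach the next boundary, plus the deliberate skew.
    pad = (-base) % boundary + skew
    ctypes.memmove(base + pad, data, len(data))
    return owner, base + pad


payload = bytes(range(96))  # same size as a 3x4 array of doubles

# Like the "+ 0" line of the patch: the copy starts on an 8-byte boundary.
owner_ok, addr_ok = copy_with_offset(payload, boundary=8, skew=0)
print(hex(addr_ok), addr_ok % 8)

# Like the "+ 4" line of the patch: the copy starts 4 bytes past a boundary.
owner_bad, addr_bad = copy_with_offset(payload, boundary=8, skew=4)
print(hex(addr_bad), addr_bad % 8)
```

With `skew=0` the data sits where the non-crashing build puts it; with `skew=4` it reproduces the layout that led to the segmentation fault in the second run.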
--
Ticket URL: <http://scipy.org/scipy/numpy/ticket/551#comment:21>
NumPy <http://projects.scipy.org/scipy/numpy>
The fundamental package needed for scientific computing with Python.