[Numpy-discussion] Another reference count leak: ticket #848
Michael Abbott
michael@araneidae.co...
Wed Jul 9 03:36:27 CDT 2008
There are three separate patches in this message plus some remarks on
"stealing" reference counts at the bottom.
On Tue, 8 Jul 2008, Travis E. Oliphant wrote:
> Michael Abbott wrote:
> > On Tue, 8 Jul 2008, Travis E. Oliphant wrote:
> >> The first part of this patch is good. The second is not needed.
> > I don't see that.
> Don't forget that PyArray_FromAny consumes the reference even if it
> returns with an error.
Oh dear. That's not good.
Well then, I need to redo my patch. Here's the new patch for
..._arrtype_new:
commit 431d99f40ca200201ba59c74a88b0bd972022ff0
Author: Michael Abbott <michael.abbott@diamond.ac.uk>
Date: Tue Jul 8 10:10:59 2008 +0100
Another reference leak using PyArray_DescrFromType
This change fixes two issues: a spurious ADDREF on a typecode returned
from PyArray_DescrFromType and an awkward interaction with PyArray_FromAny.
diff --git a/numpy/core/src/scalartypes.inc.src b/numpy/core/src/scalartypes.inc.src
index 3feefc0..7d3e562 100644
--- a/numpy/core/src/scalartypes.inc.src
+++ b/numpy/core/src/scalartypes.inc.src
@@ -1886,7 +1886,6 @@ static PyObject *
if (!PyArg_ParseTuple(args, "|O", &obj)) return NULL;
typecode = PyArray_DescrFromType(PyArray_@TYPE@);
- Py_INCREF(typecode);
if (obj == NULL) {
#if @default@ == 0
char *mem;
@@ -1903,8 +1902,12 @@ static PyObject *
goto finish;
}
+ Py_XINCREF(typecode);
arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL);
- if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr;
+ if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) {
+ Py_XDECREF(typecode);
+ return arr;
+ }
robj = PyArray_Return((PyArrayObject *)arr);
finish:
I don't think we can dispense with the extra INCREF and DECREF.
Looking at the uses of PyArray_FromAny I can see the motivation for this
design: core/include/numpy/ndarrayobject.h has a lot of calls which take a
value returned by PyArray_DescrFromType as argument. This has prompted me
to take a trawl through the code to see what else is going on, and I note
a couple more issues with patches below.
In the patch below the problem being fixed is that the first call to
PyArray_FromAny can result in the erasure of dtype *before* Py_INCREF is
called. Perhaps you can argue that this only occurs when NULL is
returned...
diff --git a/numpy/core/blasdot/_dotblas.c b/numpy/core/blasdot/_dotblas.c
index e2619b6..0b34ec7 100644
--- a/numpy/core/blasdot/_dotblas.c
+++ b/numpy/core/blasdot/_dotblas.c
@@ -234,9 +234,9 @@ dotblas_matrixproduct(PyObject *dummy, PyObject *args)
}
dtype = PyArray_DescrFromType(typenum);
- ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL);
- if (ap1 == NULL) return NULL;
Py_INCREF(dtype);
+ ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL);
+ if (ap1 == NULL) { Py_DECREF(dtype); return NULL; }
ap2 = (PyArrayObject *)PyArray_FromAny(op2, dtype, 0, 0, ALIGNED, NULL);
if (ap2 == NULL) goto fail;
The next patch deals with an interestingly subtle memory leak in
_string_richcompare where if casting to a common type fails then a
reference count will leaked. Actually this one has nothing to do with
PyArray_FromAny, but I spotted it in passing.
diff --git a/numpy/core/src/arrayobject.c b/numpy/core/src/arrayobject.c
index ee4e945..2294b8d 100644
--- a/numpy/core/src/arrayobject.c
+++ b/numpy/core/src/arrayobject.c
@@ -4715,7 +4715,6 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op,
PyObject *new;
if (self->descr->type_num == PyArray_STRING && \
other->descr->type_num == PyArray_UNICODE) {
- Py_INCREF(other);
Py_INCREF(other->descr);
new = PyArray_FromAny((PyObject *)self, other->descr,
0, 0, 0, NULL);
@@ -4723,16 +4722,17 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op,
return NULL;
}
self = (PyArrayObject *)new;
+ Py_INCREF(other);
}
else if (self->descr->type_num == PyArray_UNICODE && \
other->descr->type_num == PyArray_STRING) {
- Py_INCREF(self);
Py_INCREF(self->descr);
new = PyArray_FromAny((PyObject *)other, self->descr,
0, 0, 0, NULL);
if (new == NULL) {
return NULL;
}
+ Py_INCREF(self);
other = (PyArrayObject *)new;
}
else {
I really don't think that this design of reference count handling in
PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea.
Unfortunately these seem to be part of the published API, so presumably
it's too late to change this? (Otherwise I might see how the
corresponding patch comes out.)
Not only is this not a good idea, it's not documented in the API
documentation (I'm referring to the "Guide to NumPy" book), although at
least the inline comment on the implementation of PyArray_FromAny does
mention that the reference is "stolen".
I've been trying to find some documentation on stealing references. The
Python C API reference (http://docs.python.org/api/refcountDetails.html)
says
Few functions steal references; the two notable exceptions are
PyList_SetItem() and PyTuple_SetItem()
An interesting essay on reference counting is at
http://lists.blender.org/pipermail/bf-python/2005-September/003092.html
In conclusion, I can't find much about the role of stealing in reference
count management, but it's such a source of surprise (and frankly doesn't
actually work out all that well in numpy) that I don't think it's
justified. If PyList_SetItem() and PyTuple_SetItem() could remain the
only examples of this it would be good.
More information about the Numpy-discussion
mailing list