[Numpy-discussion] How to debug reference counting errors
Dag Sverre Seljebotn
Fri Aug 31 06:22:27 CDT 2012
On 08/31/2012 09:03 AM, Ondřej Čertík wrote:
> There is segfault reported here:
> I've managed to isolate the problem and even provide a simple patch,
> that fixes it here:
> however the patch simply doesn't decrease the proper reference, so it
> might leak. I've used
> bisection (took the whole evening unfortunately...) but the good news
> is that I've isolated commits
> that actually broke it. See the github issue #398 for details, diffs etc.
> Unfortunately, it's 12 commits from Mark and the individual commits
> raise an exception on the segfaulting code,
> so I can't pinpoint the problem further.
> In general, how can I debug this sort of problem? I tried to use
> valgrind, with a debugging build of numpy,
> but it provides tons of false (?) positives: https://gist.github.com/3549063
> Mark, by looking at the changes that broke it, as well as at my "fix",
> do you see where the problem could be?
> I suspect it is something with the changes in PyArray_FromAny() or
> PyArray_FromArray() in ctors.c.
> But I don't see anything so far that could cause it.
> Thanks for any help. This is one of the issues blocking the 1.7.0 release.
IIRC you can recompile Python with some support for detecting memory
leaks (a debug build, --with-pydebug). One of the issues with using
Valgrind, even after suppressing the false positives, is that Python uses
its own memory allocator (pymalloc), which sits between the bug and what
Valgrind detects. So at least recompile Python to not do that
(--without-pymalloc).
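As a rough illustration of the kind of check that helps narrow down a
leak from the Python side, here is a minimal sketch that compares
sys.getrefcount() before and after calling a routine many times. The
names refcount_delta, leaky and clean are hypothetical, and leaky only
stands in for a C routine that forgets a DECREF. This catches counts
that grow; a routine that DECREFs once too often would more likely
crash than show up here.

```python
import sys


def refcount_delta(func, obj, repeats=100):
    """Return how much obj's reference count changed after calling
    func(obj) `repeats` times (hypothetical helper, not a NumPy API)."""
    before = sys.getrefcount(obj)
    for _ in range(repeats):
        func(obj)
    after = sys.getrefcount(obj)
    return after - before


_kept = []


def leaky(obj):
    # Stand-in for a C routine that forgets a Py_DECREF:
    # every call keeps one extra reference alive.
    _kept.append(obj)


def clean(obj):
    # Uses the object but holds no reference after returning.
    return len(obj)


target = ["payload"]
clean_delta = refcount_delta(clean, target)
leaky_delta = refcount_delta(leaky, target)
```

A delta of zero for the suspect call is what you want; a delta that
scales with the number of repeats points at a missing DECREF somewhere
under that call.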
As for hardening the NumPy source in general, you should at least be
aware of these two options:
1) David Malcolm was writing a static code
analysis plugin for gcc that would check, for every routine, that the
reference count semantics are correct. (I don't know how far he's got.)

2) In Cython we have a "reference count nanny". This requires changes to
all the code though, so not an option just for finding this bug, just
thought I'd mention it. In addition to the INCREF/DECREF you need to
insert new "GIVEREF" and "GOTREF" calls (which are no-ops in a normal
compile) to declare where you get and give away a reference. When
Cython-generated sources are compiled with -DCYTHON_REFNANNY,
INCREF/DECREF/GIVEREF/GOTREF are tracked within each function and a
failure is raised if the function violates any contract.
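
To make the bookkeeping idea concrete, here is a toy model in Python of
a per-function ledger in the spirit of the refnanny. It is only a
sketch: ToyRefNanny, RefNannyError and the string labels are made up
for illustration and are not Cython's actual implementation, which
works on real object pointers in C.

```python
class RefNannyError(Exception):
    """Raised when a function breaks its reference-count contract."""


class ToyRefNanny:
    """Per-function ledger of owned references (illustrative only)."""

    def __init__(self, func_name):
        self.func_name = func_name
        self.owned = {}  # label -> references this function currently owns

    def _acquire(self, label):
        self.owned[label] = self.owned.get(label, 0) + 1

    def _release(self, label, op):
        if self.owned.get(label, 0) == 0:
            raise RefNannyError(
                f"{self.func_name}: {op} on {label!r} without an owned reference")
        self.owned[label] -= 1

    def incref(self, label):
        self._acquire(label)            # function took a new reference

    def gotref(self, label):
        self._acquire(label)            # function was handed a reference

    def decref(self, label):
        self._release(label, "DECREF")  # function dropped a reference

    def giveref(self, label):
        self._release(label, "GIVEREF") # function gave ownership away

    def finish(self):
        # At function exit, every owned reference must have been
        # released or given away.
        leaked = {k: n for k, n in self.owned.items() if n}
        if leaked:
            raise RefNannyError(f"{self.func_name}: leaked references {leaked}")


def balanced():
    nanny = ToyRefNanny("balanced")
    nanny.gotref("tmp")       # e.g. a new reference from a constructor
    nanny.decref("tmp")       # released before returning
    nanny.gotref("result")
    nanny.giveref("result")   # ownership handed to the caller
    nanny.finish()


def leaks():
    nanny = ToyRefNanny("leaks")
    nanny.gotref("tmp")       # acquired but never released or given away
    nanny.finish()


def over_decref():
    nanny = ToyRefNanny("over_decref")
    nanny.gotref("tmp")
    nanny.decref("tmp")
    nanny.decref("tmp")       # second DECREF on a reference no longer owned
    nanny.finish()
```

Calling balanced() passes silently, while leaks() and over_decref()
each raise RefNannyError, which is the kind of per-function failure
described above.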