[Numpy-discussion] 30% speedup when deactivating NumArray.__del__ !!!
Todd Miller
jmiller at stsci.edu
Mon Jan 31 10:30:21 CST 2005
Thanks Andrew, that was a useful summary. I wish I had more time to
work on optimizing numarray personally, but I don't. Instead I'll try
to share what I know of the state of __del__/tp_dealloc so that people
who want to work on it can come up with something better:
1. We need __del__/tp_dealloc. (This may be controversial, but I hope
not.) Using the destructor keeps the high-level C-API clean; getting
rid of it means changing the C-API. __del__/tp_dealloc is used to
transparently copy the contents of a working array back onto an
ill-behaved (byteswapped, etc.) source array when an extension function
exits.
2. The tp_dealloc I originally implemented segfaults under a Python
built with ./configure --with-pydebug. Looking at it today, it may be
an exit-time garbage collection problem: there is no explicit garbage
collection support in _numarray or _ndarray, so that may be the cause.
3. We're definitely not exploiting the "single underscore rule" yet.
We use single underscores mostly to hide globals from module export. I
don't think this is really critical, but that's the state of things.
4. Circular references should only be a problem for numerical arrays
with "user introduced" cycles. numarray ObjectArrays have no __del__.
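To make point 1 concrete, here is a minimal Python sketch of the copy-back that __del__ performs, modeled on the __del__ method that the attached patch removes from numarraycore.py. ShadowedArray and its list-backed data attribute are hypothetical stand-ins, not numarray's classes, and the immediate finalization on `del` assumes CPython reference counting.

```python
class ShadowedArray:
    """Toy stand-in for a NumArray working copy (hypothetical class)."""

    def __init__(self, data, shadows=None):
        self.data = list(data)
        # The ill-behaved source array this working copy shadows, or None.
        self._shadows = shadows

    def _copyFrom(self, other):
        # Overwrite our contents in place with the other array's contents.
        self.data[:] = other.data

    def __del__(self):
        # Same logic as the removed NumArray.__del__: push the working
        # copy's contents back onto the source, then drop the reference.
        if self._shadows is not None:
            self._shadows._copyFrom(self)
            self._shadows = None


source = ShadowedArray([1, 2, 3])
work = ShadowedArray(source.data, shadows=source)
work.data[0] = 99   # modify only the working copy
del work            # on CPython the refcount hits zero and __del__ runs here
print(source.data)  # [99, 2, 3]
```

The caller never asks for the copy-back explicitly, which is what keeps the C-API transparent. The cost is that the write happens whenever the object is finalized.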
I attached a patch against CVS that reinstates the old tp_dealloc; this
shows where I left off in case someone has insight on how to fix it. I
haven't tested it recently for a non-debug Python; I think it works.
The patch segfaults after the C-API examples/selftest for debug Pythons:
% python setup.py install --selftest
Using EXTRA_COMPILE_ARGS = []
running install
running build
running build_py
copying Lib/numinclude.py -> build/lib.linux-i686-2.4/numarray
running build_ext
running install_lib
copying build/lib.linux-i686-2.4/numarray/numinclude.py -> /home/jmiller/work/lib/python2.4/site-packages/numarray
byte-compiling /home/jmiller/work/lib/python2.4/site-packages/numarray/numinclude.py to numinclude.pyc
running install_headers
copying Include/numarray/numconfig.h -> /home/jmiller/work/include/python2.4/numarray
running install_data
Testing numarray 1.2a on Python (2, 4, 0, 'final', 0)
numarray.numtest: ((0, 1231), (0, 1231))
numarray.ieeespecial: (0, 20)
numarray.records: (0, 48)
numarray.strings: (0, 186)
numarray.memmap: (0, 82)
numarray.objects: (0, 105)
numarray.memorytest: (0, 16)
numarray.examples.convolve: ((0, 20), (0, 20), (0, 20), (0, 20))
Segmentation fault (core dumped)
That's all I can add for now.
Regards,
Todd
On Mon, 2005-01-31 at 11:54 +1100, Andrew McNamara wrote:
> >This benchmark made me suspicious since I had already found it odd before that
> >killing a numarray calculation with Ctrl-C nearly always gives a backtrace
> >starting in __del__
>
> Much of the python machinery may have been torn down when your __del__
> method is called while the interpreter is exiting (I'm assuming you're
> talking about a script, rather than interactive mode). Code should
> be prepared for anything to fail - it's quite common for parts of
> __builtins__ to have been disassembled, etc.
>
> The language reference has this to say:
>
> http://python.org/doc/2.3.4/ref/customization.html#l2h-174
>
> Warning: Due to the precarious circumstances under which __del__()
> methods are invoked, exceptions that occur during their execution
> are ignored, and a warning is printed to sys.stderr instead. Also,
> when __del__() is invoked in response to a module being deleted (e.g.,
> when execution of the program is done), other globals referenced by
> the __del__() method may already have been deleted. For this reason,
> __del__() methods should do the absolute minimum needed to maintain
> external invariants. Starting with version 1.5, Python guarantees that
> globals whose name begins with a single underscore are deleted from
> their module before other globals are deleted; if no other references
> to such globals exist, this may help in assuring that imported modules
> are still available at the time when the __del__() method is called.
>
> Another important caveat of classes with __del__ methods is mentioned in
> the library reference for the "gc" module:
>
> http://python.org/doc/2.3.4/lib/module-gc.html
>
> Objects that have __del__() methods and are part of a reference
> cycle cause the entire reference cycle to be uncollectable,
> including objects not necessarily in the cycle but reachable only
> from it. Python doesn't collect such cycles automatically because,
> in general, it isn't possible for Python to guess a safe order in
> which to run the __del__() methods.
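
The gc documentation quoted above describes Python 2.3. As a point of comparison, CPython 3.4 and later (PEP 442) can finalize reference cycles whose objects define __del__. The sketch below assumes such an interpreter and should see both finalizers run with nothing left in gc.garbage. Node and make_cycle are hypothetical names used only for illustration.

```python
import gc

finalized = []  # names of objects whose __del__ has run


class Node:
    """Hypothetical object with a finalizer, used to build a cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    # a and b reference each other, so reference counting alone
    # can never free them once the local names go away.
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a


gc.collect()   # start from a clean slate
make_cycle()
gc.collect()   # the cycle collector finds and finalizes the pair

print(sorted(finalized))
print([o for o in gc.garbage if isinstance(o, Node)])
```

On the Python 2.x interpreters discussed in this thread, the same cycle would instead be left uncollected in gc.garbage, as the quoted documentation says.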
-------------- next part --------------
? Lib/numinclude.py
? Lib/ufunc.warnings
? Lib/codegenerator/basecode.pyc
? Lib/codegenerator/bytescode.pyc
? Lib/codegenerator/convcode.pyc
? Lib/codegenerator/sortcode.pyc
? Lib/codegenerator/template.pyc
? Lib/codegenerator/ufunccode.pyc
? Src/_ufuncmodule.new
Index: Lib/numarraycore.py
===================================================================
RCS file: /cvsroot/numpy/numarray/Lib/numarraycore.py,v
retrieving revision 1.101
diff -c -r1.101 numarraycore.py
*** Lib/numarraycore.py 25 Jan 2005 11:25:09 -0000 1.101
--- Lib/numarraycore.py 31 Jan 2005 16:36:47 -0000
***************
*** 693,703 ****
v._byteorder = self._byteorder
return v
- def __del__(self):
- if self._shadows != None:
- self._shadows._copyFrom(self)
- self._shadows = None
-
def __getstate__(self):
"""returns state of NumArray for pickling."""
# assert not hasattr(self, "_shadows") # Not a good idea for pickling.
--- 693,698 ----
Index: Src/_numarraymodule.c
===================================================================
RCS file: /cvsroot/numpy/numarray/Src/_numarraymodule.c,v
retrieving revision 1.65
diff -c -r1.65 _numarraymodule.c
*** Src/_numarraymodule.c 5 Jan 2005 19:49:02 -0000 1.65
--- Src/_numarraymodule.c 31 Jan 2005 16:36:47 -0000
***************
*** 105,128 ****
}
static PyObject *
! _numarray_shadows_get(PyArrayObject *self)
{
! if (self->_shadows) {
! Py_INCREF(self->_shadows);
! return self->_shadows;
! } else {
! Py_INCREF(Py_None);
! return Py_None;
}
}
! static int
! _numarray_shadows_set(PyArrayObject *self, PyObject *s)
{
! Py_XDECREF(self->_shadows);
! if (s) Py_INCREF(s);
! self->_shadows = s;
! return 0;
}
static PyObject *
--- 105,138 ----
}
static PyObject *
! _numarray_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
! PyArrayObject *self;
! self = (PyArrayObject *)
! _numarray_type.tp_base->tp_new(type, args, kwds);
! if (!self) return NULL;
! if (!(self->descr = PyArray_DescrFromType( tAny))) {
! PyErr_Format(PyExc_RuntimeError,
! "_numarray_new: bad type number");
! return NULL;
}
+ return (PyObject *) self;
}
! static void
! _numarray_dealloc(PyObject *self)
{
! PyArrayObject *me = (PyArrayObject *) self;
! Py_INCREF(self);
! if (me->_shadows) {
! PyObject *result = PyObject_CallMethod(me->_shadows,
! "_copyFrom", "(O)", self);
! Py_XDECREF(result); /* Should be None. */
! Py_DECREF(me->_shadows);
! me->_shadows = NULL;
! }
! self->ob_refcnt = 0;
! _numarray_type.tp_base->tp_dealloc(self);
}
static PyObject *
***************
*** 218,226 ****
}
static PyGetSetDef _numarray_getsets[] = {
- {"_shadows",
- (getter)_numarray_shadows_get,
- (setter) _numarray_shadows_set, "numeric shadows object"},
{"_type",
(getter)_numarray_type_get,
(setter) _numarray_type_set, "numeric type object"},
--- 228,233 ----
***************
*** 418,424 ****
"numarray._numarray._numarray",
sizeof(PyArrayObject),
0,
! 0, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
--- 425,431 ----
"numarray._numarray._numarray",
sizeof(PyArrayObject),
0,
! _numarray_dealloc, /* tp_dealloc */
0, /* tp_print */
0, /* tp_getattr */
0, /* tp_setattr */
***************
*** 452,458 ****
0, /* tp_dictoffset */
(initproc)_numarray_init, /* tp_init */
0, /* tp_alloc */
! 0, /* tp_new */
};
typedef void Sigfunc(int);
--- 459,465 ----
0, /* tp_dictoffset */
(initproc)_numarray_init, /* tp_init */
0, /* tp_alloc */
! _numarray_new, /* tp_new */
};
typedef void Sigfunc(int);