[Numpy-discussion] 30% speedup when deactivating NumArray.__del__ !!!

Todd Miller jmiller at stsci.edu
Mon Jan 31 10:30:21 CST 2005


Thanks Andrew, that was a useful summary.  I wish I had more time to
work on optimizing numarray personally, but I don't.  Instead I'll try
to share what I know of the state of __del__/tp_dealloc so that people
who want to work on it can come up with something better:

1.  We need __del__/tp_dealloc.  (This may be controversial, but I hope
not.)  Using the destructor makes the high-level C-API cleaner; getting
rid of it means changing the C-API.  __del__/tp_dealloc is used to
transparently copy the contents of a working array back onto an
ill-behaved (byteswapped, etc.) source array at extension function
exit time.

2.  There's a problem with the tp_dealloc I originally implemented which
causes it to segfault on a Python configured with --with-pydebug.
Looking at it today, it looks like it may be an exit-time garbage
collection problem.  There is no explicit garbage collection support in
_numarray or _ndarray, so that may be the cause.

3.  We're definitely not exploiting the "single underscore rule" yet. 
We use single underscores mostly to hide globals from module export.  I
don't think this is really critical,  but that's the state of things.

4. Circular references should only be a problem for numerical arrays
with "user introduced" cycles.  numarray ObjectArrays have no __del__.
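
To make point 1 concrete, here is a rough pure-Python sketch of the
copy-back idea.  It is not numarray code: ShadowedBuffer and
make_working_copy are hypothetical names made up for illustration, only
_shadows and _copyFrom are borrowed from the patch, and the immediate
copy-back on "del" assumes CPython reference counting.

```python
class ShadowedBuffer:
    """Toy stand-in for NumArray; hypothetical, not the numarray class."""

    def __init__(self, data, byteswapped=False):
        self.data = list(data)
        self.byteswapped = byteswapped  # marks an "ill-behaved" array
        self._shadows = None            # source array to update on destruction

    def _copyFrom(self, other):
        # Copy the contents of `other` back into this buffer.
        self.data[:] = other.data

    def __del__(self):
        # Same logic as the Python-level __del__ removed by the patch.
        if self._shadows is not None:
            self._shadows._copyFrom(self)
            self._shadows = None


def make_working_copy(source):
    """Return a well-behaved copy that writes itself back when destroyed."""
    work = ShadowedBuffer(source.data)
    work._shadows = source
    return work


source = ShadowedBuffer([1, 2, 3], byteswapped=True)
work = make_working_copy(source)
work.data[0] = 99           # the "extension function" modifies the copy
before = list(source.data)  # the source is still unchanged at this point
del work                    # CPython runs __del__ once the refcount hits zero
after = list(source.data)   # the source now holds the modified contents
```

The caller never has to ask for the copy-back explicitly, which is what
keeps the high-level API clean.  The cost is that every array carries a
destructor, whether or not it shadows anything.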

I attached a patch against CVS that reinstates the old tp_dealloc; this
shows where I left off in case someone has insight on how to fix it.  I
haven't tested it recently for a non-debug Python; I think it works.
The patch segfaults after the C-API examples/selftest for debug Pythons:

% python setup.py install --selftest
Using EXTRA_COMPILE_ARGS = []
running install
running build
running build_py
copying Lib/numinclude.py -> build/lib.linux-i686-2.4/numarray
running build_ext
running install_lib
copying build/lib.linux-i686-2.4/numarray/numinclude.py -> /home/jmiller/work/lib/python2.4/site-packages/numarray
byte-compiling /home/jmiller/work/lib/python2.4/site-packages/numarray/numinclude.py to numinclude.pyc
running install_headers
copying Include/numarray/numconfig.h -> /home/jmiller/work/include/python2.4/numarray
running install_data
Testing numarray 1.2a on Python (2, 4, 0, 'final', 0)
numarray.numtest:                       ((0, 1231), (0, 1231))
numarray.ieeespecial:                   (0, 20)
numarray.records:                       (0, 48)
numarray.strings:                       (0, 186)
numarray.memmap:                        (0, 82)
numarray.objects:                       (0, 105)
numarray.memorytest:                    (0, 16)
numarray.examples.convolve:             ((0, 20), (0, 20), (0, 20), (0, 20))
Segmentation fault (core dumped)

That's all I can add for now.  

Regards,
Todd

On Mon, 2005-01-31 at 11:54 +1100, Andrew McNamara wrote: 
> >This benchmark made me suspicious since I had already found it odd before that 
> >killing a numarray calculation with Ctrl-C nearly always gives a backtrace 
> >starting in __del__
> 
> Much of the python machinery may have been torn down when your __del__
> method is called while the interpreter is exiting (I'm assuming you're
> talking about a script, rather than interactive mode). Code should
> be prepared for anything to fail - it's quite common for parts of
> __builtins__ to have been disassembled, etc.
> 
> The language reference has this to say:
> 
>     http://python.org/doc/2.3.4/ref/customization.html#l2h-174
> 
>     Warning: Due to the precarious circumstances under which __del__()
>     methods are invoked, exceptions that occur during their execution
>     are ignored, and a warning is printed to sys.stderr instead. Also,
>     when __del__() is invoked in response to a module being deleted (e.g.,
>     when execution of the program is done), other globals referenced by
>     the __del__() method may already have been deleted. For this reason,
>     __del__() methods should do the absolute minimum needed to maintain
>     external invariants. Starting with version 1.5, Python guarantees that
>     globals whose name begins with a single underscore are deleted from
>     their module before other globals are deleted; if no other references
>     to such globals exist, this may help in assuring that imported modules
>     are still available at the time when the __del__() method is called.
> 
> Another important caveat of classes with __del__ methods is mentioned in
> the library reference for the "gc" module:
> 
>     http://python.org/doc/2.3.4/lib/module-gc.html
> 
>     Objects that have __del__() methods and are part of a reference
>     cycle cause the entire reference cycle to be uncollectable,
>     including objects not necessarily in the cycle but reachable only
>     from it. Python doesn't collect such cycles automatically because,
>     in general, it isn't possible for Python to guess a safe order in
>     which to run the __del__() methods.
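
The cycle caveat quoted above can be checked with a small sketch like
the following.  Node and make_cycle are hypothetical names used only for
illustration; nothing here comes from numarray.

```python
import gc


class Node:
    """Hypothetical class with a finalizer, used only to build a cycle."""

    finalized = 0  # counts how many times __del__ has run

    def __init__(self):
        self.partner = None

    def __del__(self):
        Node.finalized += 1


def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # "user introduced" reference cycle
    # On return a and b are unreachable, but their refcounts stay nonzero.


gc.disable()              # keep automatic collection out of the picture
gc.collect()              # start from a clean slate
make_cycle()
pending = Node.finalized  # refcounting alone has not run any __del__
gc.collect()              # ask the cycle collector to look at the cycle
collected = Node.finalized
stuck = [o for o in gc.garbage if isinstance(o, Node)]
gc.enable()
```

On the Python 2.x interpreters discussed in this thread, both nodes end
up in gc.garbage and __del__ never runs.  Since Python 3.4 (PEP 442) the
collector finalizes and frees such cycles, so there "stuck" stays empty
and both finalizers run.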
-------------- next part --------------
? Lib/numinclude.py
? Lib/ufunc.warnings
? Lib/codegenerator/basecode.pyc
? Lib/codegenerator/bytescode.pyc
? Lib/codegenerator/convcode.pyc
? Lib/codegenerator/sortcode.pyc
? Lib/codegenerator/template.pyc
? Lib/codegenerator/ufunccode.pyc
? Src/_ufuncmodule.new
Index: Lib/numarraycore.py
===================================================================
RCS file: /cvsroot/numpy/numarray/Lib/numarraycore.py,v
retrieving revision 1.101
diff -c -r1.101 numarraycore.py
*** Lib/numarraycore.py	25 Jan 2005 11:25:09 -0000	1.101
--- Lib/numarraycore.py	31 Jan 2005 16:36:47 -0000
***************
*** 693,703 ****
              v._byteorder = self._byteorder
              return v
  
-     def __del__(self):
-         if self._shadows != None:
-             self._shadows._copyFrom(self)
-             self._shadows = None
-  
      def __getstate__(self):
          """returns state of NumArray for pickling."""
          # assert not hasattr(self, "_shadows") # Not a good idea for pickling.
--- 693,698 ----
Index: Src/_numarraymodule.c
===================================================================
RCS file: /cvsroot/numpy/numarray/Src/_numarraymodule.c,v
retrieving revision 1.65
diff -c -r1.65 _numarraymodule.c
*** Src/_numarraymodule.c	5 Jan 2005 19:49:02 -0000	1.65
--- Src/_numarraymodule.c	31 Jan 2005 16:36:47 -0000
***************
*** 105,128 ****
  }
  
  static PyObject *
! _numarray_shadows_get(PyArrayObject *self)
  {
! 	if (self->_shadows) {
! 		Py_INCREF(self->_shadows);
! 		return self->_shadows;
! 	} else {
! 		Py_INCREF(Py_None);
! 		return Py_None;
  	}
  }
  
! static int
! _numarray_shadows_set(PyArrayObject *self, PyObject *s)
  {
! 	Py_XDECREF(self->_shadows);
! 	if (s) Py_INCREF(s);
! 	self->_shadows = s;
! 	return 0;
  }
  
  static PyObject *
--- 105,138 ----
  }
  
  static PyObject *
! _numarray_new(PyTypeObject *type, PyObject  *args, PyObject *kwds)
  {
! 	PyArrayObject *self;
! 	self = (PyArrayObject *) 
! 		_numarray_type.tp_base->tp_new(type, args, kwds);
! 	if (!self) return NULL;
! 	if (!(self->descr = PyArray_DescrFromType( tAny))) {
! 		PyErr_Format(PyExc_RuntimeError, 
! 			     "_numarray_new: bad type number");
! 		return NULL;
  	}
+ 	return (PyObject *) self;
  }
  
! static void
! _numarray_dealloc(PyObject *self)
  {
! 	PyArrayObject *me = (PyArrayObject *) self;
! 	Py_INCREF(self);
! 	if (me->_shadows) {
! 		PyObject *result = PyObject_CallMethod(me->_shadows, 
! 						  "_copyFrom", "(O)", self);
! 		Py_XDECREF(result);    /* Should be None. */
! 		Py_DECREF(me->_shadows);
! 		me->_shadows = NULL;
! 	}
! 	self->ob_refcnt = 0;
! 	_numarray_type.tp_base->tp_dealloc(self);
  }
  
  static PyObject *
***************
*** 218,226 ****
  }
  
  static PyGetSetDef _numarray_getsets[] = {
-  	{"_shadows", 
- 	 (getter)_numarray_shadows_get, 
- 	 (setter) _numarray_shadows_set, "numeric shadows object"}, 
   	{"_type", 
  	 (getter)_numarray_type_get, 
  	 (setter) _numarray_type_set, "numeric type object"}, 
--- 228,233 ----
***************
*** 418,424 ****
  	"numarray._numarray._numarray",
  	sizeof(PyArrayObject),
          0,
! 	0,			                /* tp_dealloc */
  	0,					/* tp_print */
  	0,					/* tp_getattr */
  	0,					/* tp_setattr */
--- 425,431 ----
  	"numarray._numarray._numarray",
  	sizeof(PyArrayObject),
          0,
! 	_numarray_dealloc,			/* tp_dealloc */
  	0,					/* tp_print */
  	0,					/* tp_getattr */
  	0,					/* tp_setattr */
***************
*** 452,458 ****
  	0,					/* tp_dictoffset */
  	(initproc)_numarray_init,		/* tp_init */
  	0,					/* tp_alloc */
! 	0,      				/* tp_new */
  };
  
  typedef void Sigfunc(int);
--- 459,465 ----
  	0,					/* tp_dictoffset */
  	(initproc)_numarray_init,		/* tp_init */
  	0,					/* tp_alloc */
! 	_numarray_new, 				/* tp_new */
  };
  
  typedef void Sigfunc(int);

