[NumPy-Tickets] [NumPy] #1779: array.tolist() enhancement

NumPy Trac numpy-tickets@scipy....
Wed Mar 23 08:17:03 CDT 2011


#1779: array.tolist() enhancement
-------------------------+--------------------------------------------------
 Reporter:  Han          |       Owner:  somebody
     Type:  enhancement  |      Status:  new     
 Priority:  normal       |   Milestone:  1.6.0   
Component:  Other        |     Version:  devel   
 Keywords:  tolist       |  
-------------------------+--------------------------------------------------
 Hi,

 For a while, a small issue has been bugging me, where array.tolist() takes
 a huge amount of time, compared to the speed of Python.

 To illustrate, here are some timings (on Windows XP):


 {{{
 >>> timeit.timeit('a.tolist()', 'from numpy import arange; a = arange(1e5)', number=500)
 19.07303038231646
 }}}


 The conversion of 500 x 100000 elements takes nearly 20(!) seconds.


 {{{
 >>> timeit.timeit('a = arange(1e5)', 'from numpy import arange', number=500)
 0.4194663546483639
 }}}

 Meanwhile, creating a NumPy array with the same number of items takes less
 than half a second.


 {{{
 >>> timeit.timeit('a = range(int(1e5))', number=500)
 0.97656364422391562
 }}}


 And creating a plain Python list takes no more than 1 second, which is
 about 20x faster than array.tolist(). So where does this discrepancy come
 from?
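 For reference, the whole comparison can be reproduced in one snippet
 (absolute numbers will of course vary with machine and NumPy version; a
 smaller repeat count is used here to keep it quick):

 {{{
 import timeit

 setup = 'from numpy import arange; a = arange(1e5)'
 # the slow path under discussion
 t_tolist = timeit.timeit('a.tolist()', setup, number=50)
 # building a plain Python list of the same size
 t_list = timeit.timeit('a = list(range(int(1e5)))', number=50)
 print('tolist: %.3fs  list: %.3fs' % (t_tolist, t_list))
 }}}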

 To find out, I did some runs with valgrind on NumPy 1.4.1 dbg (Debian),
 and used kcachegrind to produce a few calling graphs.

 The first thing I noticed was the number of array_alloc and subsequent
 array_dealloc calls: it equals the number of elements in the array!
 PyArray_NewFromDescr is called once per array element, which creates a lot
 of overhead.

 In NumPy 1.6.1b1, this overhead still exists; it stems from the
 PyArray_ToList function in convert.c:

 {{{
 NPY_NO_EXPORT PyObject *
 PyArray_ToList(PyArrayObject *self)
 {
     PyObject *lp;
     PyArrayObject *v;
     intp sz, i;

     if (!PyArray_Check(self)) {
         return (PyObject *)self;
     }
     if (self->nd == 0) {
         return self->descr->f->getitem(self->data,self);
     }

     sz = self->dimensions[0];
     lp = PyList_New(sz);
     for (i = 0; i < sz; i++) {
         v = (PyArrayObject *)array_big_item(self, i);
         if (PyArray_Check(v) && (v->nd >= self->nd)) {
             PyErr_SetString(PyExc_RuntimeError,
                             "array_item not returning smaller-" \
                             "dimensional array");
             Py_DECREF(v);
             Py_DECREF(lp);
             return NULL;
         }
         PyList_SetItem(lp, i, PyArray_ToList(v));
         Py_DECREF(v);
     }
     return lp;
 }
 }}}

 For every element in the array, an array_big_item call creates a new array
 holding just that element, which is then passed recursively to
 PyArray_ToList to extract the actual Python object.
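 In Python terms, the recursion behaves roughly like the following sketch
 (a hypothetical analogue for illustration, not NumPy's actual code):

 {{{
 import numpy as np

 def tolist_recursive(arr):
     # Every indexing step materializes a fresh per-element object
     # (array_big_item in the C code) before a scalar is finally
     # extracted at 0 dimensions.
     if arr.ndim == 0:
         return arr.item()  # the getitem step: convert to a Python scalar
     return [tolist_recursive(arr[i]) for i in range(arr.shape[0])]
 }}}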

 I added an extra clause to the function to account for 1-dimensional
 arrays:

 {{{
     if (self->nd == 1) {
         sz = self->dimensions[0];
         lp = PyList_New(sz);
         for (i = 0; i < sz; i++) {
             PyList_SetItem(lp, i, self->descr->f->getitem(index2ptr(self, i), self));
         }
         return lp;
     }
 }}}

 This gets the time down to about 2 seconds on Windows.
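 In pure Python terms, the added fast path is roughly equivalent to this
 sketch (hypothetical; ndarray.item() with a flat index stands in for the
 C-level getitem call):

 {{{
 import numpy as np

 def tolist_flat(arr):
     # 1-D fast path: pull each scalar out directly, skipping the
     # per-element subarray allocation entirely.
     assert arr.ndim == 1
     return [arr.item(i) for i in range(arr.shape[0])]
 }}}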

 I'm not sure about the patch, though: it does no error checking, and it
 could probably be optimized and streamlined further.

 Anyway, I hope it can go into 1.6.0 in some form, because it really helps
 with NumPy->Python conversions!
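 As a side note on why tolist() matters for these conversions: simply
 iterating the array only yields NumPy scalar objects, not native Python
 ones, as a quick check shows:

 {{{
 import numpy as np

 a = np.arange(5, dtype=float)
 converted = a.tolist()   # elements are native Python floats
 wrapped = list(a)        # elements are NumPy scalars (np.float64),
                          # so tolist() is the right tool for a true
                          # NumPy->Python conversion
 print(type(converted[0]), type(wrapped[0]))
 }}}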

 [attached: calling graphs]

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1779>
NumPy <http://projects.scipy.org/numpy>


More information about the NumPy-Tickets mailing list