[NumPy-Tickets] [NumPy] #1779: array.tolist() enhancement
NumPy Trac
numpy-tickets@scipy....
Wed Mar 23 08:17:03 CDT 2011
#1779: array.tolist() enhancement
-------------------------+--------------------------------------------------
Reporter: Han | Owner: somebody
Type: enhancement | Status: new
Priority: normal | Milestone: 1.6.0
Component: Other | Version: devel
Keywords: tolist |
-------------------------+--------------------------------------------------
Hi,
For a while, a small issue has been bugging me, where array.tolist() takes
a huge amount of time, compared to the speed of Python.
To illustrate, here are some timings (on Windows XP):
{{{
>>> timeit.timeit('a.tolist()', 'from numpy import arange; a =
arange(1e5)', number=500)
19.07303038231646
}}}
The conversion of 500 x 100000 elements takes up to 20(!) seconds.
{{{
>>> timeit.timeit('a = arange(1e5)', 'from numpy import arange',
number=500)
0.4194663546483639
}}}
While creating a NumPy arrays with the same amount of items takes .5
seconds.
{{{
>>> timeit.timeit('a = range(int(1e5))', number=500)
0.97656364422391562
}}}
And creating a Python list takes no more than 1 second, this is 20x faster
than array.tolist(). So where is this discrepancy coming from?
To find out, I did some runs with valgrind on NumPy 1.4.1 dbg (Debian),
and used kcachegrind to produce a few calling graphs.
The first thing I noticed was the amount of array_alloc and consequent
array_dealloc calls. The number of calls amount up to the number of
elements in the array! PyArray_NewFromDescr is called per array element,
which creates a lot of overhead.
In NumPy 1.6.1b1, this overhead still exists; it stems from the
PyArray_ToList function in convert.c:
{{{
NPY_NO_EXPORT PyObject *
PyArray_ToList(PyArrayObject *self)
{
PyObject *lp;
PyArrayObject *v;
intp sz, i;
if (!PyArray_Check(self)) {
return (PyObject *)self;
}
if (self->nd == 0) {
return self->descr->f->getitem(self->data,self);
}
sz = self->dimensions[0];
lp = PyList_New(sz);
for (i = 0; i < sz; i++) {
v = (PyArrayObject *)array_big_item(self, i);
if (PyArray_Check(v) && (v->nd >= self->nd)) {
PyErr_SetString(PyExc_RuntimeError,
"array_item not returning smaller-" \
"dimensional array");
Py_DECREF(v);
Py_DECREF(lp);
return NULL;
}
PyList_SetItem(lp, i, PyArray_ToList(v));
Py_DECREF(v);
}
return lp;
}
}}}
For every element in the array, a array_big_item call is made to create a
new array with that element and given recursively to PyArray_ToList to get
the actual element item.
I added an extra clause to the function to account for 1-dimensional
arrays:
{{{
if (self->nd == 1) {
sz = self->dimensions[0];
lp = PyList_New(sz);
for (i = 0; i < sz; i++) {
PyList_SetItem(lp, i, self->descr->f->getitem(index2ptr(self,
i),self));
}
return lp;
}
}}}
Which gets the time down to 2 seconds on Windows.
I'm not sure about the patch, though, because it does not account for
errors, and might be more optimized / streamlined.
Anyway, hope it can go in at 1.6.0 in some way or another, because it
really helps in NumPy->Python conversions!
[attached: calling graphs]
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/1779>
NumPy <http://projects.scipy.org/numpy>
My example project
More information about the NumPy-Tickets
mailing list