[Numpy-discussion] NumPy re-factoring project
Pauli Virtanen
pav@iki...
Fri Jun 11 03:17:40 CDT 2010
Thu, 10 Jun 2010 23:56:56 +0200, Sturla Molden wrote:
[clip]
> Also about array iterators in NumPy's C base (i.e. for doing something
> along an axis): we don't need those. There is a different way of coding
> which leads to faster code.
>
> 1. Collect an array of pointers to each subarray (e.g. using
> std::vector<dtype*> or dtype**)
> 2. Dispatch on the pointer array...
This is actually what the current ufunc code does.
The innermost dimension is handled via the ufunc loop, which is a simple
for loop with a constant-size step and a given number of iterations. The
array iterator objects are used only for stepping through the outer
dimensions. That is, it essentially steps through your dtype** array
without explicitly constructing it.
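As a schematic sketch of that inner-loop shape (the names and signature here are illustrative, not NumPy's actual kernel API, which passes strides in bytes much like this):

```c
#include <stddef.h>

/* Sketch of a ufunc-style 1-d inner loop: a plain for loop over n
   elements, stepping by constant byte strides.  The pointer arithmetic
   mirrors what the ufunc machinery sets up for each innermost run. */
static void
double_it_loop(char *in, char *out, size_t n,
               ptrdiff_t in_step, ptrdiff_t out_step)
{
    for (size_t i = 0; i < n; i++) {
        *(double *)out = 2.0 * *(double *)in;
        in += in_step;
        out += out_step;
    }
}
```

The outer machinery only has to position `in`/`out` at the start of each innermost run; this loop never sees the outer dimensions.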
An abstraction for iterating through an array is also useful for
building the dtype** array, so we should probably retain the iterators.
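Building the dtype** table for a strided 2-d array is itself just a short loop; a minimal sketch (illustrative names, double elements assumed):

```c
#include <stddef.h>

/* Collect a pointer to the start of each innermost row of a 2-d
   strided array -- the dtype** table discussed above.  row_stride is
   in bytes, as array strides are in NumPy. */
static void
collect_row_pointers(char *base, size_t nrows, ptrdiff_t row_stride,
                     double **rows)
{
    for (size_t i = 0; i < nrows; i++)
        rows[i] = (double *)(base + (ptrdiff_t)i * row_stride);
}
```

For higher-dimensional arrays the same table can be filled by walking an outer-dimension iterator, which is where the iterator abstraction earns its keep.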
For multidimensional arrays, you run into the optimization problem of
choosing the order of the axes so that the memory access pattern makes
maximally efficient use of the cache. Currently, this optimization
heuristic is really simple, and makes the wrong choice even in the
simple 2-D case.
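One very simple version of such a heuristic is to traverse the axis with the smaller byte stride in the inner loop, so memory is accessed sequentially. A sketch for the 2-d case (not NumPy's actual heuristic):

```c
#include <stddef.h>

/* Sum a 2-d strided array of doubles, swapping the axis order so that
   the axis with the smaller absolute byte stride is innermost.  For a
   C-contiguous array this picks rows outer, columns inner. */
static double
sum2d(const char *base, size_t n0, size_t n1,
      ptrdiff_t s0, ptrdiff_t s1)
{
    ptrdiff_t a0 = s0 < 0 ? -s0 : s0;
    ptrdiff_t a1 = s1 < 0 ? -s1 : s1;
    if (a0 < a1) {               /* axis 0 has the smaller stride: swap */
        size_t tn = n0; n0 = n1; n1 = tn;
        ptrdiff_t ts = s0; s0 = s1; s1 = ts;
    }
    double total = 0.0;
    for (size_t i = 0; i < n0; i++) {
        const char *p = base + (ptrdiff_t)i * s0;
        for (size_t j = 0; j < n1; j++) {
            total += *(const double *)p;
            p += s1;
        }
    }
    return total;
}
```

The general n-dimensional problem (sorting all axes by stride, across several operands with possibly conflicting layouts) is where the current heuristic falls short.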
We could in principle use OpenMP to parallelize the outer loop, as long
as the inner ufunc part is implemented in C and does not go back into
Python. But if the problem is memory-bound, it is not clear that
parallelization helps.
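The shape of such a parallelization would be a pragma on the outer loop only, with the inner kernel staying pure C. A sketch under that assumption (the pragma is simply ignored if compiled without OpenMP support):

```c
#include <stddef.h>

/* Sketch: parallelize the outer loop over rows with OpenMP while the
   inner 1-d loop stays in C.  A signed loop index is used because
   OpenMP loop variables traditionally must be signed.  Whether this
   wins at all depends on whether the operation is memory-bound. */
static void
scale_rows(double *data, size_t nrows, size_t ncols, double factor)
{
    #pragma omp parallel for   /* no effect without -fopenmp */
    for (ptrdiff_t i = 0; i < (ptrdiff_t)nrows; i++) {
        double *row = data + (size_t)i * ncols;
        for (size_t j = 0; j < ncols; j++)
            row[j] *= factor;
    }
}
```

Since each thread touches a disjoint set of rows, no synchronization is needed inside the loop; the open question is only whether memory bandwidth leaves any headroom for extra cores.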
Another task for optimization would perhaps be to implement specialized
loops for each operation; currently we do one function indirection per
iteration, which probably costs something. But again, the operations
may already be bounded by memory speed, in which case this would not
improve performance.
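To make the cost concrete: the generic path calls the inner kernel through a function pointer once per outer iteration, which a specialized fused loop would avoid. A sketch of the generic side (illustrative names only):

```c
#include <stddef.h>

/* Inner-loop kernel type, in the spirit of a ufunc 1-d loop. */
typedef void (*loop_fn)(double *dst, const double *src, size_t n);

static void
add_one_loop(double *dst, const double *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0;
}

/* Generic driver: one indirect call per row.  A specialized loop for
   this operation would inline the kernel and fuse the two loops,
   removing the per-row indirection. */
static void
apply_generic(loop_fn fn, double *dst, const double *src,
              size_t nrows, size_t ncols)
{
    for (size_t i = 0; i < nrows; i++)
        fn(dst + i * ncols, src + i * ncols, ncols);
}
```

For long rows the indirect call is amortized over many elements, which is one reason the saving may not show up when the operation is memory-bound.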
--
Pauli Virtanen