[Numpy-discussion] fortran array storage question
Travis E. Oliphant
Fri Oct 26 10:38:59 CDT 2007
Anne Archibald wrote:
> On 26/10/2007, Georg Holzmann <email@example.com> wrote:
>> if in that example I also change the strides:
>> int s = tmp->strides[1];
>> tmp->strides[0] = s;
>> tmp->strides[1] = s * dim0;
>> Then I get in python the fortran-style array in right order.
> This is the usual way. More or less, at least. numpy is designed from
> the start to handle arrays with arbitrary striding; this is how slices
> are implemented, for example. There will be no major performance hit
> from numpy code itself. The actual organization of data in memory will
> of course affect the speed at which your code runs. The flags, as you
> discovered, are just a performance optimization, so that code that
> needs arrays organized as C- or FORTRAN-standard doesn't need to check
> the strides every time.
> I don't think numpy's loops - for example in ones((100,100))+eye(100)
> - are smart about doing operations in an order that makes
> cache-coherent use of memory. The important exception is the loops
> that use ATLAS, which I think is mostly the dot() function.
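For reference, here is a small Python sketch of the stride change discussed above. It assumes a 2-D float64 buffer; the names buf, c_view and f_view are made up for illustration, and as_strided stands in for editing the strides field from C.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

dim0, dim1 = 3, 4

# Hypothetical buffer whose elements were written column by column
# (Fortran order) by some external code.
buf = np.arange(dim0 * dim1, dtype=np.float64)

# Read as a C-ordered (dim0, dim1) array, the strides are
# (dim1 * itemsize, itemsize).
c_view = buf.reshape(dim0, dim1)
s = c_view.strides[1]  # itemsize, i.e. the smallest stride

# Same memory, same shape, but with Fortran strides (s, s * dim0).
f_view = as_strided(buf, shape=(dim0, dim1), strides=(s, s * dim0))

# numpy recomputes the contiguity flags from the shape and strides.
print(f_view.strides, f_view.flags['F_CONTIGUOUS'], f_view.flags['C_CONTIGUOUS'])

# Slices use the same mechanism: a new view with different strides
# over the same memory, no copy.
every_other_column = c_view[:, ::2]
print(every_other_column.strides)
```

In pure Python the same view is available without touching strides by hand, via buf.reshape((dim0, dim1), order='F').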
There is an optimization wherein the inner loops are done over the
dimension with the smallest stride.
What other cache-coherent optimizations do you recommend?
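A rough Python sketch of that idea, for illustration only. This is not the actual C loop in numpy; loop_order and add_2d are hypothetical names, and the sketch only looks at the strides of the first operand.

```python
import numpy as np

def loop_order(arr):
    """Return the axes sorted from largest to smallest absolute stride.

    The last axis in the result has the smallest stride and is the
    one to iterate over in the innermost loop.
    """
    return sorted(range(arr.ndim), key=lambda ax: abs(arr.strides[ax]), reverse=True)

def add_2d(a, b):
    """Add two 2-D arrays of equal shape, looping innermost over the
    axis of `a` with the smallest stride."""
    out = np.empty(a.shape, dtype=np.result_type(a, b))
    outer, inner = loop_order(a)
    for i in range(a.shape[outer]):
        for j in range(a.shape[inner]):
            # Put i and j back into (axis 0, axis 1) order.
            idx = (i, j) if outer == 0 else (j, i)
            out[idx] = a[idx] + b[idx]
    return out

c_arr = np.ones((5, 5))                    # C order: axis 1 has the smallest stride
f_arr = np.asfortranarray(np.ones((5, 5))) # Fortran order: axis 0 has the smallest stride
print(loop_order(c_arr), loop_order(f_arr))
result = add_2d(f_arr, np.eye(5))
```

With a Fortran-ordered input the inner loop runs down axis 0, so consecutive iterations touch adjacent memory for that operand.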