[Numpy-discussion] Array views

Sturla Molden sturla@molden...
Mon Mar 28 05:55:59 CDT 2011


Den 28.03.2011 09:34, skrev Dag Sverre Seljebotn:
> What would we do exactly -- pass the entire underlying buffer to Fortran
> and then re-slice it Fortran side?

Pass a C pointer to the first element along with shape and strides, get 
a Fortran pointer using c_f_pointer, then reslice the Fortran pointer to 
a second Fortran pointer.

Remember to argsort strides and dimentions on strides in ascending order.

> That's more complicated than doing a
> copy Python-side, so unless the Fortran compiler actually supports
> keeping the array strided throughout the code it's not worth it. I'm not
> aware of the rules here -- basically, if you keep an array
> assumed-shape, it can be passed around in non-contiguous form in
> Fortran?

The Fortran standard does not specify this. A compiler is free to make a 
copy (copy-in copy-out), which is why a function call can invalidate a 
pointer. A compiler can decide to make a local copy, pass a dope array 
struct, or even remap virtual memory. It might e.g. depend on 
optimization rules, such as whether to favour speed or size. We cannot 
instruct a Fortran compiler to use strided memory, but we can allow it 
to use strided memory if it wants.

> And it will be copied whenever it is passed as an
> explicit-shape array?

That is also compiler dependent. The standard just says that array 
indexing shall work. There is no requirement that explicit-shape arrays 
are contiguous. A compiler is free to use strided memory if it wants to 
here as well. Most compilers will assume that explicit-shape and 
assumed-size arrays are contiguous and passed as a pointer to the first 
element. But the standard does not require this, which is why there were 
no portable way of interfacing C and Fortran prior to Fortran 2003. Only 
a C array using Fortran 2003 C bindings is assumed contiguous by the 
standard. Apart from that, the standard does not care about the binary 
representation.

f2py actually depends on knowing implementation details for common 
Fortran compilers, not a strandard C interface.

I seem to remember that some Fortran 77 compilers even used virtual 
memory remapping to make strided memory to look contiguous to the 
callee, instead of making a local copy. That would be an alternative to 
a dope array, while presering a simple interface to C.

So the answer is this is messy, the Fortran standard only require 
indexing to work as expected, and Fortran compilers can do whatever they 
want to achieve this. Just to repeat myself, "we cannot instruct a 
Fortran compiler to use strided memory, but we can allow it to use 
strided memory if it wants."

It is indeed easier to make a local contiguous copy when calling 
Fortran. My experience when working with C is that "copy-in copy-out" is 
faster for computation than strided memory, particularly when it can fit 
in CPU cache. So just making a copy can be a good strategy, but it is 
depending on small array size. Generally I would say that it is a bad 
idea to try to be smarter than a Fortran compiler when it comes to 
decisions on array access. Those that make Fortran compilers have spent 
man years tuning this. So my preference is to just tell Fortran that 
memory is strided, and let it do whatever it wants.

Sturla


More information about the NumPy-Discussion mailing list