[Numpy-discussion] Proclamation: column-wise arrays

Pearu Peterson pearu at ioc.ee
Wed Jan 26 11:00:31 CST 2000


Hi!

Problem:
	Using Fortran routines from Python C/API is "tricky" when
multi-dimensional arrays are passed in.

Cause:
	Arrays in Fortran are stored in column-wise order while arrays in
C are stored in row-wise order.
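
To make the difference concrete, here is a minimal sketch in plain Python (no Numeric required) of where element (i, j) of a 2x3 array lands in memory under each convention. The helper names row_major_offset and col_major_offset are made up for this illustration and are not part of any API.

```python
def row_major_offset(i, j, nrows, ncols):
    # C convention: the last index varies fastest.
    return i * ncols + j

def col_major_offset(i, j, nrows, ncols):
    # Fortran convention: the first index varies fastest.
    return i + j * nrows

nrows, ncols = 2, 3
row_layout = [None] * (nrows * ncols)
col_layout = [None] * (nrows * ncols)
for i in range(nrows):
    for j in range(ncols):
        value = 10 * (i + 1) + (j + 1)   # e.g. 23 is row 2, column 3
        row_layout[row_major_offset(i, j, nrows, ncols)] = value
        col_layout[col_major_offset(i, j, nrows, ncols)] = value

print(row_layout)
print(col_layout)
```

The same six values end up in a different linear order, so a buffer filled by C code is read by a Fortran routine as a different (transposed) array unless it is reordered first.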

Standard solutions:
	1) Create a new C array; copy the data from the old one into it in
column-wise order; pass the new array to Fortran; copy the changed array
back to the old one in row-wise order; deallocate the new array.
	2) Change the storage order of the array in place by element-wise
swapping; pass the array to Fortran; change the storage order back by
element-wise swapping.
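
Solution 1 can be sketched roughly as follows, in plain Python on flat lists. All names here are hypothetical, and fake_fortran_scale_column is only a stand-in for a real Fortran routine that expects column-wise data and modifies it in place.

```python
def to_column_order(flat_row, nrows, ncols):
    # First element-wise copy: row-wise buffer -> new column-wise buffer.
    return [flat_row[i * ncols + j] for j in range(ncols) for i in range(nrows)]

def to_row_order(flat_col, nrows, ncols):
    # Second element-wise copy: column-wise buffer -> row-wise order.
    return [flat_col[i + j * nrows] for i in range(nrows) for j in range(ncols)]

def fake_fortran_scale_column(buf, nrows, ncols, col, factor):
    # Stand-in for a Fortran routine: in column-wise storage,
    # one column occupies nrows consecutive elements.
    start = col * nrows
    for k in range(start, start + nrows):
        buf[k] *= factor

a = [1, 2, 3,
     4, 5, 6]                              # 2x3 array in row-wise order
tmp = to_column_order(a, 2, 3)             # extra allocation + copy
fake_fortran_scale_column(tmp, 2, 3, 1, 10)
a[:] = to_row_order(tmp, 2, 3)             # copy the result back
del tmp                                    # "deallocate" the temporary
print(a)
```

The temporary buffer and the two full passes over the data are exactly the overhead in question.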

Why are the standard solutions not good?
	1) Additional memory allocation, which is a problem for large arrays;
element-wise copying is time consuming (it is done 2 times).
	2) It is good in that no extra memory is needed, but element-wise
swapping (2 times) is approx. equivalent to element-wise copying (4
times).

Proclamation:
	Introduce a column-wise array to Numeric Python, in which data is
stored in column-wise order and which can be used specifically for Fortran
routines.

Proposal sketch:
	1) Introduce a new flag `row_order' to the PyArrayObject structure:
row_order == 1  -> the data is stored in row-wise order (default, as it is
		now)
row_order == 0  -> the data is stored in column-wise order
Note that now the concept of contiguousness depends on this flag. 
	2) Introduce new array "constructors" such as PyArray_CW_FromDims,
PyArray_CW_FromDimsAndData, PyArray_CW_ContiguousFromObject,
PyArray_CW_CopyFromObject, PyArray_CW_FromObject, etc. that all return
arrays with row_order=0 and data stored in column-wise order (that is, in
the case of contiguous results; otherwise the strides feature is employed).
	3) In order for operations between arrays (possibly with different
storage order) to work correctly, many internal functions of the NumPy
C/API need to be modified.
	4) anything else?
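
As a rough illustration of item 1, here is how the strides of a contiguous array, and hence the contiguity test, could depend on the proposed flag. This is only a sketch in plain Python: strides_for and is_contiguous are hypothetical helpers, not existing C/API functions, and the flag semantics are the ones assumed in the sketch above (1 = row-wise, 0 = column-wise).

```python
def strides_for(dims, itemsize, row_order):
    # Byte strides of a contiguous array with the given storage order.
    strides = [0] * len(dims)
    step = itemsize
    if row_order:
        axes = range(len(dims) - 1, -1, -1)   # last axis varies fastest
    else:
        axes = range(len(dims))               # first axis varies fastest
    for axis in axes:
        strides[axis] = step
        step *= dims[axis]
    return strides

def is_contiguous(dims, strides, itemsize, row_order):
    # "Contiguous" now means: matches the layout implied by the flag.
    return list(strides) == strides_for(dims, itemsize, row_order)

def byte_offset(index, strides):
    # Address computation is the same for either storage order.
    return sum(i * s for i, s in zip(index, strides))

dims, itemsize = (2, 3), 8                     # 2x3 array of 8-byte doubles
cw = strides_for(dims, itemsize, row_order=0)
print(strides_for(dims, itemsize, row_order=1), cw)
print(is_contiguous(dims, cw, itemsize, row_order=1))
print(is_contiguous(dims, cw, itemsize, row_order=0))
```

Since element access goes through the strides in both cases, code that already handles non-contiguous arrays via strides would, under these assumptions, address a column-wise array correctly; only the contiguity shortcut needs to look at the flag.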

What is the good of this?
	1) The fact is that there is a large number of very good scientific
tools written in Fortran freely available (Netlib, for instance). And I
don't mean only Fortran 77 codes but also Fortran 90/95 codes.
	2) With Numeric Python arrays that store their data in column-wise
order, calling Fortran routines from Python becomes really efficient and
space-saving, because no copying or reordering is needed.
	3) There should be little performance hit if, say, two
arrays with different storage order are multiplied (compared to the
operations between non-contiguous arrays in the current implementation).
	4) I don't see any reason why older C/API modules would break
because of this change if it is carried out carefully enough. So,
backward compatibility should be there.
	5) anything else?

What speaks against this?
	1) Lots of work but with current experience it should not be a
problem.
	2) The size of the code will grow.
	3) I suppose that most people using Numerical Python will not care
about calling Fortran routines from Python. Possible reasons: too "tricky"
or no need. In the first case, the answer is that there are tools such as
PyFort and f2py that solve this problem. In the latter case, there is no
problem:-)
	4) anything else?

I understand that my proposal is quite radical, but taking into account
that we want to use Python for many years to come, using it would be more
pleasant if there were one less cause of (constant) confusion during this
time.

Best regards,
	Pearu




