[Numpy-discussion] C-API for non-contiguous arrays

Oliver Kranz o.kranz@gmx...
Fri Oct 26 10:38:47 CDT 2007


David Cournapeau wrote:
> Oliver Kranz wrote:
>> Hi,
>>
>> I am working on a Python extension module using the NumPy C-API. The 
>> extension module is an interface to an image processing and analysis 
>> library written in C++. The C++ functions are exported with 
>> boost::python. Currently I am implementing support for 
>> three-dimensional data sets which can consume a huge amount of memory. 
>> The 3D data is stored in a numpy.ndarray. This array is passed to C++ 
>> functions which do the calculations.
>>
>> In general, multi-dimensional arrays can be organized in memory in four 
>> different ways:
>> 1. C order contiguous
>> 2. Fortran order contiguous
>> 3. C order non-contiguous
>> 4. Fortran order non-contiguous
>>
>> Am I right that the NumPy C-API can only distinguish between three ways 
>> the array is organized in memory? These are:
>> 1. C order contiguous e.g. with PyArray_ISCONTIGUOUS(arr)
>> 2. Fortran order contiguous e.g. with PyArray_ISFORTRAN(arr)
>> 3. non-contiguous e.g. with !PyArray_ISCONTIGUOUS(arr) &&
>> !PyArray_ISFORTRAN(arr)
>>
>> So there is no way to find out if a non-contiguous array has C order or 
>> Fortran order. The same holds for Python code e.g. by use of the flags.
>>
>> a.flags.contiguous
>> a.flags.fortran
>>
>> This is very important for me because I want to avoid copying every 
>> non-contiguous array into a contiguous array. This would be very 
>> inefficient. But I can't find any solution other than copying the array.
> Whether copying is inefficient depends on what you mean by inefficient. 
> Memory-wise, copying is obviously inefficient. But speed-wise, copying 
> the array into a contiguous array in C order is faster in most if not 
> all cases, because of memory access times.
> 
> You may want to read the following article by Ulrich Drepper on memory 
> and cache:
> 
> http://lwn.net/Articles/252125/

That's an interesting note. We already thought about this. For now, we 
have decided to consistently avoid copying in our special case. It's 
not unusual to work with data sets consuming about 1 GB of memory. For 
arrays that are not in contiguous C order, we have to live with the 
reduced speed.
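
On the original question of telling C-like from Fortran-like ordering 
when neither contiguity flag is set: the strides still seem to carry 
that information. Below is a minimal sketch in Python, assuming that 
comparing stride magnitudes is a good enough criterion. stride_order is 
a hypothetical helper written for illustration, not part of NumPy. The 
same values should be readable in C through PyArray_STRIDES, 
PyArray_DIMS and PyArray_NDIM.

```python
import numpy as np

def stride_order(arr):
    """Classify the memory ordering of an ndarray from its strides.

    Hypothetical helper (not a NumPy function). Returns 'C' if the
    strides never increase from the first axis to the last, 'F' if they
    never decrease, 'both' if both hold (e.g. 1-D arrays), and
    'neither' otherwise (e.g. after an arbitrary transpose).
    """
    # Axes of length 1 are skipped: their strides do not affect the layout.
    strides = [abs(s) for s, n in zip(arr.strides, arr.shape) if n > 1]
    c_like = all(a >= b for a, b in zip(strides, strides[1:]))
    f_like = all(a <= b for a, b in zip(strides, strides[1:]))
    if c_like and f_like:
        return "both"
    if c_like:
        return "C"
    if f_like:
        return "F"
    return "neither"

c_vol = np.zeros((4, 5, 6))             # C order contiguous
f_vol = np.zeros((4, 5, 6), order="F")  # Fortran order contiguous
c_view = c_vol[::2, ::2, ::2]           # non-contiguous view, C-like strides
f_view = f_vol[::2, ::2, ::2]           # non-contiguous view, Fortran-like strides

for name, a in [("c_view", c_view), ("f_view", f_view)]:
    print(name, a.flags.c_contiguous, a.flags.f_contiguous, stride_order(a))
```

Both views report False for the two contiguity flags, yet the stride 
check still distinguishes them, so the C++ side could pick a loop order 
without copying. An array that is transposed arbitrarily falls into the 
'neither' case and would need separate handling.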

Cheers,
Oliver

