[Numpy-discussion] np.asfortranarray: unnecessary copying?

Kurt Smith kwmsmith@gmail....
Sat Jul 31 17:17:37 CDT 2010


On Fri, Jul 30, 2010 at 1:33 PM, Anne Archibald
<aarchiba@physics.mcgill.ca> wrote:
> This seems to me to be a bug, or rather, two bugs. 1D arrays are
> automatically Fortran-ordered, so isfortran should return True for
> them (incidentally, the documentation should be edited to indicate
> that the data must also be contiguous in memory). Whether or not this
> change is made, there's no point in asfortranarray making a copy of a
> 1D array, since the copy isn't any more Fortran-ordered than the input
> array.

Yep, these seem like bugs to me too.  And they both come from the same
thing in the numpy source: the FORTRAN flag check always requires an
array of at least 2 dimensions, although some comments contradict this:

see: numpy/include/ndarrayobject.h:592

/*
  Note: all 0-d arrays are CONTIGUOUS and FORTRAN contiguous. If a
   1-d array is CONTIGUOUS it is also FORTRAN contiguous
*/

There is an array flag, 'fnc', that means "Fortran contiguous and not
C contiguous"; this is what isfortran checks.
Counterintuitively, an array can have a.flags.f_contiguous == True but
a.flags.fnc == False (0-D and 1-D contiguous arrays, for example).
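
To make the mismatch concrete, here is a small check of the flags on a
1-D array. It is only a sketch: it uses np.isfortran and the
f_contiguous / fnc flags mentioned above, and the 2-D array `m` is
something added purely for comparison.

```python
import numpy as np

v = np.arange(5)                         # 1-D, contiguous
m = np.asfortranarray(np.ones((2, 3)))   # 2-D, Fortran-ordered (illustrative)

# 1-D contiguous array: the three checks do not all agree
print(v.flags.f_contiguous, v.flags.fnc, np.isfortran(v))

# 2-D Fortran-ordered array: the three checks agree
print(m.flags.f_contiguous, m.flags.fnc, np.isfortran(m))
```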

Is there some rationale for this behavior in the code?  It's enforced
everywhere (ignoring the comments to the contrary) so it's
intentional, but makes no sense to me.

This fortran stuff is very important to have working correctly for my
project, 'fwrap'.  I'm thinking of creating a wrapper module that has
fixed versions of these functions.  (I'd also like this fixed in
numpy, but that might break old code that depends on the current
semantics.)

Something like:

>>> from fwrap import fnp
>>> fnp.isfortran(a) # handles 0- and 1-D arrays correctly
>>> fnp.asfortranarray(a) # won't make unnecessary copies, etc.
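
A rough idea of what such wrappers might look like, assuming nothing
beyond the public array flags. The names isfortran_any_dim and
asfortranarray_nocopy are placeholders invented for this sketch; they
are not actual fwrap or numpy API.

```python
import numpy as np

def isfortran_any_dim(a):
    """Hypothetical helper: True whenever the data is Fortran contiguous,
    including 0-D and 1-D arrays."""
    return bool(np.asarray(a).flags.f_contiguous)

def asfortranarray_nocopy(a, dtype=None):
    """Hypothetical helper: return `a` unchanged when it is already
    Fortran contiguous, otherwise return a Fortran-ordered copy."""
    arr = np.asarray(a, dtype=dtype)
    if arr.flags.f_contiguous:
        return arr                    # no copy needed, 1-D included
    return np.array(arr, order='F')   # np.array copies by default

# Usage: a 1-D array is passed through, a C-ordered 2-D array is copied.
v = np.arange(3)
w = asfortranarray_nocopy(v)
c = np.arange(6).reshape(2, 3)
f = asfortranarray_nocopy(c)
```

The design choice here is to trust flags.f_contiguous alone rather
than the dimension-dependent fnc flag.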


>
> Another kind of iffy case is axes of length one. These should not
> affect C/Fortran order, since the length of their strides doesn't
> matter, but they do; if you use newaxis to add an axis to an array,
> it's still just as C/Fortran ordered as it was, but np.isfortran
> reports False. (Is there a np.isc or equivalent function?)

Good point; taking a C- or F-contiguous array and using np.newaxis
sets a.flags.contiguous == False and a.flags.f_contiguous == False.
So this is more general than just a fortran issue.

Example:

In [17]: a = np.arange(10)

In [18]: a.flags.contiguous
Out[18]: True

In [19]: anew = a[np.newaxis, :]

In [20]: anew.flags.contiguous
Out[20]: False

In [21]: anew.shape
Out[21]: (1, 10)

In [22]: other = np.empty((1,10))

In [23]: other.flags.contiguous
Out[23]: True

In [24]: other.shape
Out[24]: (1, 10)

In [25]: other.strides
Out[25]: (80, 8)

In [26]: anew.strides
Out[26]: (0, 8)
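
One way to write the contiguity test so that length-one axes are
ignored, as Anne suggests they should be. This is an illustrative
helper written for this message (the name
c_contiguous_ignoring_unit_axes is made up), not how numpy computes
its flags.

```python
import numpy as np

def c_contiguous_ignoring_unit_axes(a):
    """Hypothetical check: C contiguity where the stride of any
    length-1 axis is treated as irrelevant."""
    if a.size == 0:
        return True                   # nothing to lay out
    expected = a.itemsize             # stride the last axis must have
    for n, stride in zip(reversed(a.shape), reversed(a.strides)):
        if n == 1:
            continue                  # stride of a unit axis never matters
        if stride != expected:
            return False
        expected *= n
    return True

a = np.arange(10)
anew = a[np.newaxis, :]               # shape (1, 10), same data as `a`
other = np.empty((1, 10))
```

With this definition anew and other are treated the same way, whatever
stride the length-one axis happens to carry.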


>
> Incidentally, there is a subtle misconception in your example code:
> when reshaping an array, the order='F' has a different meaning. It has
> nothing direct to do with the memory layout; what it does is define
> the logical arrangement of elements used while reshaping the array.
> The array returned will be in C order if a copy must be made, or in
> whatever arbitrarily-strided order is necessary if the reshape can be
> done without a copy. As it happens, in your example, the latter case
> occurs and works out to Fortran order.

Good catch; I should have done 'arr.reshape(2,5).copy('F')'.
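
To spell out the difference, a short comparison of the two calls. The
variable names are only for illustration.

```python
import numpy as np

a = np.arange(10)

# order='F' in reshape changes how elements are *arranged logically*:
# the first axis varies fastest when filling the new shape.
r = a.reshape((2, 5), order='F')

# copy('F') keeps the logical arrangement of the C-order reshape and
# only changes the *memory layout* of the result.
c = a.reshape(2, 5).copy('F')

print(r)
print(c)
print(r.flags.f_contiguous, c.flags.f_contiguous)
```

Both results are Fortran contiguous here, but they hold different
matrices: r is a view onto a with its elements rearranged, while c is
a fresh copy with the same rows as a.reshape(2, 5).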

>
> Anne
>
> On 30 July 2010 13:50, Kurt Smith <kwmsmith@gmail.com> wrote:
>> What are the rules for when 'np.asarray' and 'np.asfortranarray' make a copy?
>>
>> This makes sense to me:
>>
>> In [3]: carr = np.arange(3)
>>
>> In [6]: carr2 = np.asarray(carr)
>>
>> In [8]: carr2[0] = 1
>>
>> In [9]: carr
>> Out[9]: array([1, 1, 2])
>>
>> No copy is made.
>>
>> But doing the same with a fortran array makes a copy:
>>
>> In [10]: farr = np.arange(3).copy('F')
>>
>> In [12]: farr2 = np.asfortranarray(farr)
>>
>> In [13]: farr2[0] = 1
>>
>> In [14]: farr
>> Out[14]: array([0, 1, 2])
>>
>> Could it be a 1D thing, since it's both C contiguous & F contiguous?
>>
>> Here's a 2D example:
>>
>> In [15]: f2D = np.arange(10).reshape((2,5), order='F')
>>
>> In [17]: f2D2 = np.asfortranarray(f2D)
>>
>> In [19]: f2D2[0,0] = 10
>>
>> In [20]: f2D
>> Out[20]:
>> array([[10,  2,  4,  6,  8],
>>       [ 1,  3,  5,  7,  9]])
>>
>> So it looks like np.asfortranarray makes an unnecessary copy if the
>> array is simultaneously 1D, C contiguous and F contiguous.
>>
>> Coercing the array with np.atleast_2d() makes asfortranarray behave.
>>
>> Looking further, np.isfortran always returns false if the array is 1D,
>> even if it's Fortran contiguous (and np.isfortran is documented as
>> such).
>>
>> What is the rationale here?  Is it a 'column' vs. 'row' thing?
>>
>> Kurt
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>

