[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Matthew Brett matthew.brett@gmail....
Fri Mar 29 21:08:23 CDT 2013

Hi,

We were teaching today, and found ourselves getting very confused
about ravel and reshape in numpy.

Summary
--------------

There are two separate ideas needed to understand ordering in ravel and reshape:

Idea 1) ravel / reshape can proceed from the last axis to the first,
or from the first to the last.  This is "ravel index ordering".
Idea 2) The physical layout of the array (on disk or in memory) can be
"C" or "F" contiguous, or neither.  This is "memory ordering".

The index ordering is usually (but see below) orthogonal to the memory ordering.

The 'ravel' and 'reshape' commands use "C" and "F" in the sense of
index ordering, and this mixes the two ideas and is confusing.

What the current situation looks like
----------------------------------------------------

Specifically, we rolled this around four experienced numpy users,
and each of us predicted at least one of the results below wrongly.

This was what we knew, or should have known:

In [2]: import numpy as np

In [3]: arr = np.arange(10).reshape((2, 5))

In [5]: arr.ravel()
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

So, the 'ravel' operation unravels over the last axis (1) first,
followed by axis 0.

So far so good (even if it is the opposite of MATLAB and Octave).

Then we found the 'order' flag to ravel:

In [10]: arr.flags
Out[10]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False

In [11]: arr.ravel('C')
Out[11]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]: arr_F = np.array(arr, order='F')

In [13]: arr_F.flags
Out[13]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False

In [16]: arr_F
Out[16]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [17]: arr_F.ravel('C')
Out[17]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Right - so the 'C' flag to ravel has nothing to do with *memory*
ordering; it refers to *index* ordering.
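
The same holds for 'F'.  Here is a minimal sketch, rebuilding the two
arrays from the session above, that checks each index order against
both memory layouts.  The variable names are ours, for illustration only:

```python
import numpy as np

arr = np.arange(10).reshape((2, 5))   # C-contiguous in memory
arr_F = np.array(arr, order='F')      # same values, F-contiguous in memory

# order='C' walks the last axis fastest, whatever the memory layout
c_from_c = arr.ravel('C')
c_from_f = arr_F.ravel('C')

# order='F' walks the first axis fastest, whatever the memory layout
f_from_c = arr.ravel('F')
f_from_f = arr_F.ravel('F')

print(c_from_c, c_from_f)
print(f_from_c, f_from_f)
```

Each pair prints identically, so the result depends only on the
requested index order and not on how the array is stored.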

And in fact, we can ask for memory ordering specifically:

In [22]: arr.ravel('K')
Out[22]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]: arr_F.ravel('K')
Out[23]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])

In [24]: arr.ravel('A')
Out[24]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [25]: arr_F.ravel('A')
Out[25]: array([0, 5, 1, 6, 2, 7, 3, 8, 4, 9])
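
'K' and 'A' agree above because both arrays are contiguous.  As far as
we understand the documented behaviour, 'A' uses 'F' index order only
when the array is F-contiguous and otherwise falls back to 'C' index
order, while 'K' follows the order of the elements in memory.  A small
sketch with an array that is neither C- nor F-contiguous (the swapaxes
example is our choice for illustration) shows the two diverging:

```python
import numpy as np

# Swapping two axes of a C-contiguous array gives a view that is
# neither C- nor F-contiguous.
a = np.arange(12).reshape(2, 3, 2).swapaxes(1, 2)
neither = not a.flags['C_CONTIGUOUS'] and not a.flags['F_CONTIGUOUS']

k_order = a.ravel('K')   # elements in the order they sit in memory
a_order = a.ravel('A')   # not F-contiguous, so same as 'C' index order
c_order = a.ravel('C')

print(neither)
print(k_order)
print(a_order)
```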

The 'order' flag to reshape invites the same kind of confusion.
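
For reshape, a minimal sketch along the same lines (variable names are
ours): 'order' controls the index order used both to read the elements
and to place them into the new shape, and again the memory layout of
the input does not change the answer.

```python
import numpy as np

flat = np.arange(6)

# 'C': fill the new shape with the last axis changing fastest
by_c = flat.reshape((2, 3), order='C')   # rows are [0 1 2] and [3 4 5]

# 'F': fill the new shape with the first axis changing fastest
by_f = flat.reshape((2, 3), order='F')   # rows are [0 2 4] and [1 3 5]

# An F-contiguous copy reshapes to the same values as the C-contiguous one
stored_F = np.asfortranarray(by_c)
from_c_layout = by_c.reshape((3, 2), order='C')
from_f_layout = stored_F.reshape((3, 2), order='C')

print(by_c)
print(by_f)
print(np.array_equal(from_c_layout, from_f_layout))
```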

Ravel and reshape use the terms 'C' and 'F' in the sense of index ordering.

This is very confusing.  We think the index ordering and memory
ordering ideas need to be separated, and specifically, we should avoid
using "C" and "F" to refer to index ordering.

Proposal
-------------

* Deprecate the use of "C" and "F" meaning backwards and forwards
index ordering for ravel, reshape
* Prefer "Z" and "N", being graphical representations of unraveling in
2 dimensions, axis1 first and axis0 first respectively (excellent
naming idea by Paul Ivanov)
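
To make the proposal concrete, here is a rough sketch of how the new
names might read.  Nothing here exists in numpy: 'Z' and 'N' are only
the proposed spellings, and ravel_zn is a hypothetical helper that
maps them onto the current 'C' and 'F' values.

```python
import numpy as np

# Hypothetical mapping: 'Z' and 'N' are the proposed names, not numpy API.
_INDEX_ORDER = {'Z': 'C',   # last axis changes fastest
                'N': 'F'}   # first axis changes fastest

def ravel_zn(arr, order='Z'):
    """Ravel using the proposed index-order names (hypothetical helper)."""
    try:
        numpy_order = _INDEX_ORDER[order]
    except KeyError:
        raise ValueError("order must be 'Z' or 'N', got %r" % (order,))
    return np.ravel(arr, order=numpy_order)

arr = np.arange(10).reshape((2, 5))
z_result = ravel_zn(arr, 'Z')
n_result = ravel_zn(arr, 'N')
print(z_result)
print(n_result)
```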

What do y'all think?

Cheers,

Matthew
Paul Ivanov
JB Poline