[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

josef.pktd@gmai... josef.pktd@gmai...
Sat Mar 30 23:05:20 CDT 2013


On Sat, Mar 30, 2013 at 11:43 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> Hi,
>
> On Sat, Mar 30, 2013 at 7:02 PM,  <josef.pktd@gmail.com> wrote:
>> On Sat, Mar 30, 2013 at 8:29 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
>>> Hi,
>>>
>>> On Sat, Mar 30, 2013 at 7:50 PM,  <josef.pktd@gmail.com> wrote:
>>>> On Sat, Mar 30, 2013 at 7:31 PM, Bradley M. Froehle
>>>> <brad.froehle@gmail.com> wrote:
>>>>> On Sat, Mar 30, 2013 at 3:21 PM, Matthew Brett <matthew.brett@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Sat, Mar 30, 2013 at 2:20 PM,  <josef.pktd@gmail.com> wrote:
>>>>>> > On Sat, Mar 30, 2013 at 4:57 PM,  <josef.pktd@gmail.com> wrote:
>>>>>> >> On Sat, Mar 30, 2013 at 3:51 PM, Matthew Brett
>>>>>> >> <matthew.brett@gmail.com> wrote:
>>>>>> >>> On Sat, Mar 30, 2013 at 4:14 AM,  <josef.pktd@gmail.com> wrote:
>>>>>> >>>> On Fri, Mar 29, 2013 at 10:08 PM, Matthew Brett
>>>>>> >>>> <matthew.brett@gmail.com> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Ravel and reshape use the terms 'C' and 'F' in the sense of index
>>>>>> >>>>> ordering.
>>>>>> >>>>>
>>>>>> >>>>> This is very confusing.  We think the index ordering and memory
>>>>>> >>>>> ordering ideas need to be separated, and specifically, we should
>>>>>> >>>>> avoid using "C" and "F" to refer to index ordering.
>>>>>> >>>>>
>>>>>> >>>>> Proposal
>>>>>> >>>>> -------------
>>>>>> >>>>>
>>>>>> >>>>> * Deprecate the use of "C" and "F" meaning backwards and forwards
>>>>>> >>>>> index ordering for ravel, reshape
>>>>>> >>>>> * Prefer "Z" and "N", being graphical representations of unraveling
>>>>>> >>>>> in 2 dimensions, axis1 first and axis0 first respectively (excellent
>>>>>> >>>>> naming idea by Paul Ivanov)
>>>>>> >>>>>
>>>>>> >>>>> What do y'all think?
>>>>>> >>>>
>>>>>> >>>> I always thought "F" and "C" are easy to understand, I always thought
>>>>>> >>>> about the content and never about the memory when using it.
>>>>>> >>
>>>>>> >> changing the names doesn't make it easier to understand.
>>>>>> >> I think the confusion is because the new A and K refer to existing memory
>>>>>> >>
>>>>>>
>>>>>> I disagree, I think it's confusing, but I have evidence, and that is
>>>>>> that four out of four of us tested ourselves and got it wrong.
>>>>>>
>>>>>> Perhaps we are particularly dumb or poorly informed, but I think it's
>>>>>> rash to assert there is no problem here.
>>>>
>>>> I think you are overcomplicating things or phrased it as a "trick question"
>>>
>>> I don't know what you mean by trick question - was there something
>>> over-complicated in the example?  I deliberately didn't include
>>> various much more confusing examples in "reshape".
>>
>> I meant making the "candidates" think about memory instead of just
>> column versus row stacking.
>> I don't think I ever get confused about reshape "F" in 2d.
>> But when I work with 3d or larger ndim nd-arrays, I always have to
>> try an example to check my intuition (in general not just reshape).
>>
>>>
>>>> ravel F and C have *nothing* to do with memory layout.
>>>
>>> We do agree on this of course - but you said in an earlier mail that
>>> you thought of 'C' and 'F' as referring to target memory layout (which
>>> they don't in this case) so I think we also agree that "C" and "F" do
>>> often refer to memory layout elsewhere in numpy.
>>
>> I guess that wasn't so helpful.
>> (emphasis on *target*. There are very few places where an order
>> keyword refers to *existing* memory layout.)
>
> It is helpful because it shows how easy it is to get confused between
> memory order and index order.
>
>> What's reverse index order?
>
> I am not being clear, sorry about that:
>
> import numpy as np
>
> def ravel_iter_last_fastest(arr):
>     res = []
>     for i in range(arr.shape[0]):
>         for j in range(arr.shape[1]):
>             for k in range(arr.shape[2]):
>                 # Iterating over last dimension fastest
>                 res.append(arr[i, j, k])
>     return np.array(res)
>
>
> def ravel_iter_first_fastest(arr):
>     res = []
>     for k in range(arr.shape[2]):
>         for j in range(arr.shape[1]):
>             for i in range(arr.shape[0]):
>                 # Iterating over first dimension fastest
>                 res.append(arr[i, j, k])
>     return np.array(res)

good example

that's just C and F order in the terminology of numpy
http://docs.scipy.org/doc/numpy/reference/arrays.nditer.html#controlling-iteration-order
(independent of memory)
http://docs.scipy.org/doc/numpy/reference/generated/numpy.flatiter.html#numpy.flatiter
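
A minimal sketch of the "independent of memory" point (not from the
thread; the names a_c and a_f are made up for illustration). It builds
the same 2x3x4 values in C-contiguous and in Fortran-contiguous memory
and checks that ravel, nditer and .flat walk the indices the same way
for both:

```python
import numpy as np

# Same values and shape, two different memory layouts (illustrative names).
a_c = np.arange(24).reshape((2, 3, 4))   # C-contiguous in memory
a_f = np.asfortranarray(a_c)             # Fortran-contiguous in memory

layouts_differ = bool(a_c.flags['C_CONTIGUOUS'] and a_f.flags['F_CONTIGUOUS']
                      and not a_f.flags['C_CONTIGUOUS'])

# The order keyword of ravel picks an index order (last index fastest
# for 'C', first index fastest for 'F'); the memory layout of the input
# does not change the result.
ravel_c_same = np.array_equal(a_c.ravel('C'), a_f.ravel('C'))
ravel_f_same = np.array_equal(a_c.ravel('F'), a_f.ravel('F'))

# nditer with an explicit order behaves the same way.
nditer_c = [int(x) for x in np.nditer(a_f, order='C')]
nditer_f = [int(x) for x in np.nditer(a_c, order='F')]

# .flat (a flatiter) always iterates with the last index fastest.
flat_f = [int(x) for x in a_f.flat]

print(layouts_differ, ravel_c_same, ravel_f_same)
print(nditer_c == flat_f == a_c.ravel('C').tolist())
print(nditer_f == a_c.ravel('F').tolist())
```

Only order='A' and order='K' look at the existing memory layout, so
they are left out of this sketch.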

I don't think we want to rename a large part of the basic terminology of numpy


Josef


>
>
> a = np.arange(24).reshape((2, 3, 4))
>
> print(np.all(a.ravel('C') == ravel_iter_last_fastest(a)))
> print(np.all(a.ravel('F') == ravel_iter_first_fastest(a)))
>
> By 'reverse index ordering' I mean 'ravel_iter_last_fastest' above.  I
> guess one could argue that this was not 'reverse' but 'forward' index
> ordering, but I am not arguing about which is better, or those names,
> only that it's the order of indices that differs, not the memory
> layout, and that these ideas need to be kept separate.
>
> Cheers,
>
> Matthew