[Numpy-discussion] C vs. Fortran order -- misleading documentation?

Benjamin Root ben.root@ou....
Tue Jun 8 15:56:59 CDT 2010


On Tue, Jun 8, 2010 at 1:36 PM, Eric Firing <efiring@hawaii.edu> wrote:

> On 06/08/2010 08:16 AM, Eric Firing wrote:
> > On 06/08/2010 05:50 AM, Charles R Harris wrote:
> >>
> >>
> >> On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith<d.l.goldsmith@gmail.com
> >> <mailto:d.l.goldsmith@gmail.com>>  wrote:
> >>
> >>      On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant<MaxPlanck@seznam.cz
> >>      <mailto:MaxPlanck@seznam.cz>>  wrote:
> >>
> >>
> >>           >  >  Correct me if I am wrong, but the paragraph
> >>           >  >
> >>           >  >  Note to those used to IDL or Fortran memory order as it
> >>          relates to
> >>           >  >  indexing. Numpy uses C-order indexing. That means that
> the
> >>          last index
> >>           >  >  usually (see xxx for exceptions) represents the most
> >>          rapidly changing memory
> >>           >  >  location, unlike Fortran or IDL, where the first index
> >>          represents the most
> >>           >  >  rapidly changing location in memory. This difference
> >>          represents a great
> >>           >  >  potential for confusion.
> >>           >  >
> >>           >  >  in
> >>           >  >
> >>           >  >
> http://docs.scipy.org/doc/numpy/user/basics.indexing.html
> >>           >  >
> >>           >  >  is quite misleading, as C-order means that the last
> index
> >>          changes rapidly,
> >>           >  >  not the
> >>           >  >  memory location.
> >>           >  >
> >>           >  >
> >>           >  Any index can change rapidly, depending on whether it is in an
> >>          inner loop or
> >>           >  not. The important distinction between C and Fortran order
> is
> >>          how indices
> >>           >  translate to memory locations. The documentation seems
> >>          correct to me,
> >>           >  although it might make more sense to say the last index
> >>          addresses a
> >>           >  contiguous range of memory. Of course, with modern
> >>          processors, actual
> >>           >  physical memory can be mapped all over the place.
> >>           >
> >>           >  Chuck
> >>
> >>          To me, saying that the last index represents the most rapidly
> >>          changing memory
> >>          location means that if I change the last index, the memory
> >>          location changes
> >>          a lot, which is not true for C-order. So for C-order, supposing
> >>          one scans the memory
> >>          linearly (the desired scenario),  it is the last *index* that
> >>          changes most rapidly.
> >>
> >>          The inverted picture looks like this: For C-order,  changing
> the
> >>          first index
> >>          leads to the most rapid jump in *memory*.
> >>
> >>          Still have the feeling the doc is very misleading at this
> >>          important issue.
> >>
> >>          Pavel
> >>
> >>
> >>      The distinction between your two perspectives is that one is using
> >>      for-loop traversal of indices, the other is using pointer-increment
> >>      traversal of memory; from each of your perspectives, your
> >>      conclusions are "correct," but my inclination is that the
> >>      pointer-increment traversal of memory perspective is closer to the
> >>      "spirit" of the docstring, no?
> >>
> >>
> >> I think the confusion is in "most rapidly changing memory location",
> >> which is kind of ambiguous because a change in the indices is always a
> >> change in memory location if one hasn't used index tricks and such. So
> >> from a time perspective it means nothing, while from a memory
> >> perspective the largest address changes come from the leftmost indices.
> >
> > Exactly.  Rate of change with respect to what, or as you do what?
> >
> > I suggest something like the following wording, if you don't mind the
> > verbosity as a means of conjuring up an image (although putting in
> > diagrams would make it even clearer--undoubtedly there are already good
> > illustrations somewhere on the web):
> >
> > ------------
> >
> > Note to those used to Matlab, IDL, or Fortran memory order as it relates
> > to indexing. Numpy uses C-order indexing by default, although a numpy
> > array can be designated as using Fortran order. [With C-order,
> > sequential memory locations are accessed by incrementing the last
>
> Maybe change "sequential" to "contiguous".
>
> I was thinking maybe "subsequent" might be a better word.

In the end, we need to communicate this clearly.  No matter which language,
I have always found it difficult to get new programmers to understand the
importance of knowing the difference between row-major and column-major.  A
"thick" paragraph isn't going to help to get the idea across to a person who
doesn't even know that a problem exists.
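
To make the difference concrete, something like the following might help. It
is only a sketch in plain Python with no NumPy calls, and the helper names
c_offset and f_offset are made up for this illustration. It shows how far
apart two neighbouring elements end up in flat memory under each convention
for a table with 3 rows and 4 columns:

```python
def c_offset(i, j, nrows, ncols):
    """Row-major (C order): rows are stored one after another."""
    return i * ncols + j


def f_offset(i, j, nrows, ncols):
    """Column-major (Fortran order): columns are stored one after another."""
    return j * nrows + i


nrows, ncols = 3, 4

# How far does one step of each index move us in flat memory?
c_step_last = c_offset(0, 1, nrows, ncols) - c_offset(0, 0, nrows, ncols)
c_step_first = c_offset(1, 0, nrows, ncols) - c_offset(0, 0, nrows, ncols)
f_step_last = f_offset(0, 1, nrows, ncols) - f_offset(0, 0, nrows, ncols)
f_step_first = f_offset(1, 0, nrows, ncols) - f_offset(0, 0, nrows, ncols)

print("C order:       last index step =", c_step_last,
      " first index step =", c_step_first)   # 1 and 4
print("Fortran order: last index step =", f_step_last,
      " first index step =", f_step_first)   # 3 and 1
```

In C order the last index moves to the adjacent memory slot; in Fortran order
the first index does.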

Maybe a car analogy would be good here...

Imagine city streets (many of them one-way) where you need to drop off mail
at every address.  Would it be more efficient to go up and down one street at
a time, or to drop off mail at the first address of each street in turn, then
the second address of each street, and so on?
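
The analogy can be put into numbers with a rough sketch. This is plain Python,
the names address and total_travel are invented for the example, and I am
assuming the houses on one street sit next to each other in memory (a
row-major layout). The "distance travelled" is just the sum of the jumps
between consecutive stops:

```python
n_streets, n_houses = 4, 5


def address(street, house):
    # Assumed layout: all houses of street 0 first, then street 1, and so on.
    return street * n_houses + house


def total_travel(route):
    """Sum of the jumps in flat memory between consecutive stops."""
    positions = [address(s, h) for s, h in route]
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


# Route A: finish one street before moving on to the next.
street_by_street = [(s, h) for s in range(n_streets) for h in range(n_houses)]

# Route B: first house of every street, then second house of every street, ...
house_by_house = [(s, h) for h in range(n_houses) for s in range(n_streets)]

print("street by street:", total_travel(street_by_street))  # 19
print("house by house:  ", total_travel(house_by_house))    # 131
```

Both routes visit the same 20 addresses, but the second one covers roughly
seven times the distance. On a real machine the cost shows up as cache misses
rather than mileage, so the actual ratio will differ.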

Just my two cents...

Ben Root


>
> > index.]  For a two-dimensional array, think of it as a table.  With
> > C-order indexing the table is stored as a series of rows, so that one is
> > reading from left to right, incrementing the column (last) index, and
> > jumping ahead in memory to the next row by incrementing the row (first)
> > index. With Fortran order, the table is stored as a series of columns,
> > so one reads memory sequentially from top to bottom, incrementing the
> > first index, and jumps ahead in memory to the next column by
> > incrementing the last index.
> >
> > One more difference to be aware of: numpy, like python and C, uses
> > zero-based indexing; Matlab, [IDL???], and Fortran start from one.
> >
> > -----------------
> >
> > If you want to keep it short, the key wording is in the sentence in
> > brackets, and you can chop out the table illustration.
> >
> > Eric
> >
> >
> >>
> >> Chuck
> >>
> >>
> >>
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion