[Numpy-discussion] C vs. Fortran order -- misleading documentation?
Tue Jun 8 15:56:59 CDT 2010
On Tue, Jun 8, 2010 at 1:36 PM, Eric Firing <email@example.com> wrote:
> On 06/08/2010 08:16 AM, Eric Firing wrote:
> > On 06/08/2010 05:50 AM, Charles R Harris wrote:
> >> On Tue, Jun 8, 2010 at 9:39 AM, David Goldsmith<firstname.lastname@example.org
> >> <mailto:email@example.com>> wrote:
> >> On Tue, Jun 8, 2010 at 8:27 AM, Pavel Bazant<MaxPlanck@seznam.cz
> >> <mailto:MaxPlanck@seznam.cz>> wrote:
> >> > > Correct me if I am wrong, but the paragraph
> >> > >
> >> > > Note to those used to IDL or Fortran memory order as it
> >> relates to
> >> > > indexing. Numpy uses C-order indexing. That means that
> >> last index
> >> > > usually (see xxx for exceptions) represents the most
> >> rapidly changing memory
> >> > > location, unlike Fortran or IDL, where the first index
> >> represents the most
> >> > > rapidly changing location in memory. This difference
> >> represents a great
> >> > > potential for confusion.
> >> > >
> >> > > in
> >> > >
> >> > >
> >> > >
> >> > > is quite misleading, as C-order means that the last index
> >> changes rapidly,
> >> > > not the
> >> > > memory location.
> >> > >
> >> > >
> >> > Any index can change rapidly, depending on whether it is in an
> >> inner loop or
> >> > not. The important distinction between C and Fortran order is
> >> how indices
> >> > translate to memory locations. The documentation seems
> >> correct to me,
> >> > although it might make more sense to say the last index
> >> addresses a
> >> > contiguous range of memory. Of course, with modern
> >> processors, actual
> >> > physical memory can be mapped all over the place.
> >> >
> >> > Chuck
> >> To me, saying that the last index represents the most rapidly
> >> changing memory
> >> location means that if I change the last index, the memory
> >> location changes
> >> a lot, which is not true for C-order. So for C-order, supposing
> >> one scans the memory
> >> linearly (the desired scenario), it is the last *index* that
> >> changes most rapidly.
> >> The inverted picture looks like this: For C-order, changing
> >> the first index
> >> leads to the most rapid jump in *memory*.
> >> I still have the feeling the doc is very misleading on this
> >> important issue.
> >> Pavel
> >> The distinction between your two perspectives is that one is using
> >> for-loop traversal of indices, the other is using pointer-increment
> >> traversal of memory; from each of your perspectives, your
> >> conclusions are "correct," but my inclination is that the
> >> pointer-increment traversal of memory perspective is closer to the
> >> "spirit" of the docstring, no?
> >> I think the confusion is in "most rapidly changing memory location",
> >> which is kind of ambiguous because a change in the indices is always a
> >> change in memory location if one hasn't used index tricks and such. So
> >> from a time perspective it means nothing, while from a memory
> >> perspective the largest address changes come from the leftmost indices.
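To make the "largest address changes come from the leftmost indices" remark concrete, here is a minimal pure-Python sketch of how an index tuple maps to a flat memory offset under each convention. The helpers `strides` and `offset` are hypothetical names made up for this illustration, not NumPy functions, and they count in elements rather than bytes (NumPy's own `ndarray.strides` attribute reports the same kind of steps in bytes).

```python
def strides(shape, order="C"):
    """Element strides (not bytes) of a dense array with the given shape."""
    result = [0] * len(shape)
    step = 1
    # C order: the last axis is contiguous. Fortran order: the first axis is.
    if order == "C":
        axes = range(len(shape) - 1, -1, -1)
    else:
        axes = range(len(shape))
    for axis in axes:
        result[axis] = step
        step *= shape[axis]
    return tuple(result)


def offset(index, shape, order="C"):
    """Flat memory offset, in elements, of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides(shape, order)))


shape = (2, 3, 4)                 # hypothetical example shape
c_strides = strides(shape, "C")   # (12, 4, 1)
f_strides = strides(shape, "F")   # (1, 2, 6)

# Incrementing the first index by one:
first_c = offset((1, 0, 0), shape, "C")   # 12 elements ahead
first_f = offset((1, 0, 0), shape, "F")   # 1 element ahead
# Incrementing the last index by one:
last_c = offset((0, 0, 1), shape, "C")    # 1 element ahead
last_f = offset((0, 0, 1), shape, "F")    # 6 elements ahead
```

In C order a step in the last index moves to the neighbouring element and a step in the first index makes the biggest jump; in Fortran order the roles are swapped.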
> > Exactly. Rate of change with respect to what, or as you do what?
> > I suggest something like the following wording, if you don't mind the
> > verbosity as a means of conjuring up an image (although putting in
> > diagrams would make it even clearer--undoubtedly there are already good
> > illustrations somewhere on the web):
> > ------------
> > Note to those used to Matlab, IDL, or Fortran memory order as it relates
> > to indexing. Numpy uses C-order indexing by default, although a numpy
> > array can be designated as using Fortran order. [With C-order,
> > sequential memory locations are accessed by incrementing the last
> Maybe change "sequential" to "contiguous".
> I was thinking maybe "subsequent" might be a better word.
In the end, we need to communicate this clearly. No matter which language,
I have always found it difficult to get new programmers to understand the
importance of knowing the difference between row-major and column-major. A
"thick" paragraph isn't going to help to get the idea across to a person who
doesn't even know that a problem exists.
Maybe a car analogy would be good here...
Imagine city streets (where many of the streets are one-way) along which one
needs to drop off mail at each address. Would it be more efficient to go up and
back along each street, or to drop off mail at the first address of one street
and then move on to the first address of the next street?
Just my two cents...
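The mail-delivery picture can be sketched in a few lines of pure Python. This is only an illustration under assumed names: `ROWS`, `COLS`, `c_offset` and `jumps` are invented here, and the table is assumed to be stored in C (row-major) order. It lists the memory offsets visited by the two loop orders and counts how many steps fail to land on the very next location.

```python
ROWS, COLS = 3, 4  # hypothetical table size


def c_offset(row, col):
    """Flat offset of (row, col) in a C-ordered (row-major) table."""
    return row * COLS + col


# "Walk each street": inner loop over the last index.
row_wise = [c_offset(r, c) for r in range(ROWS) for c in range(COLS)]
# "First house of every street, then the second...": inner loop over the first index.
col_wise = [c_offset(r, c) for c in range(COLS) for r in range(ROWS)]


def jumps(offsets):
    """Count steps that do not land on the very next memory location."""
    return sum(1 for a, b in zip(offsets, offsets[1:]) if b - a != 1)


print(row_wise, jumps(row_wise))  # [0, 1, ..., 11] with 0 jumps
print(col_wise, jumps(col_wise))  # [0, 4, 8, 1, ...] with 11 jumps
```

With a Fortran-ordered table the result would be reversed, so the efficient loop order depends on the layout, not on the language of the loop.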
> > index.] For a two-dimensional array, think of it as a table. With
> > C-order indexing the table is stored as a series of rows, so that one is
> > reading from left to right, incrementing the column (last) index, and
> > jumping ahead in memory to the next row by incrementing the row (first)
> > index. With Fortran order, the table is stored as a series of columns,
> > so one reads memory sequentially from top to bottom, incrementing the
> > first index, and jumps ahead in memory to the next column by
> > incrementing the last index.
> > One more difference to be aware of: numpy, like Python, C, and IDL, uses
> > zero-based indexing; Matlab and Fortran (by default) start from one.
> > -----------------
> > If you want to keep it short, the key wording is in the sentence in
> > brackets, and you can chop out the table illustration.
> > Eric
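The table description above can be tried out on a tiny example. The sketch below uses a plain nested list as a stand-in for a two-dimensional array (the name `table` and its values are made up for illustration) and writes out the order in which its entries would sit in memory under each convention.

```python
# A 2 x 3 table: two rows, three columns (hypothetical values).
table = [[1, 2, 3],
         [4, 5, 6]]

n_rows, n_cols = len(table), len(table[0])

# C order: stored as a series of rows, the last (column) index varies fastest.
c_order = [table[r][c] for r in range(n_rows) for c in range(n_cols)]

# Fortran order: stored as a series of columns, the first (row) index varies fastest.
f_order = [table[r][c] for c in range(n_cols) for r in range(n_rows)]

print(c_order)  # [1, 2, 3, 4, 5, 6]
print(f_order)  # [1, 4, 2, 5, 3, 6]
```

For comparison, NumPy's `ravel` accepts `order="C"` or `order="F"` and reads the elements out in these same two index orders.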
> >> Chuck
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion