[Numpy-discussion] order flag again
Zachary Pincus
zpincus at stanford.edu
Mon Mar 27 17:14:18 CST 2006
>> Does this mean that if I create a new array with FORTRAN order
>> from the numpy.array() function, every time I reshape that array
>> I need to tell reshape to use FORTRAN order? That seems a bit odd...
>
> Yes, that is what it means. Doing anything else leads to a host
> of problems. Let me illustrate.
>
> from numpy.random import rand
> a = rand(10,3)
> b = a.transpose()
> b.flags    # b is now in Fortran order
> b.ravel()  # do you really want this to now be interpreted in
>            # FORTRAN order? I didn't think so.
>
> What this illustrates is that the FORTRAN and CONTIGUOUS flags on
> the array are just special cases of the strides (and in fact can be
> updated from the strides at any time). But, an ORDER flag would
> be an independent concept that could be set and reset at will. The
> ORDER flag is only necessary when somebody is interpreting the
> array as a linear sequence of bytes.
I think I understand now. Thanks.
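For anyone following along, the quoted example can be reproduced roughly
like this. It is only a sketch against a current NumPy release; np.arange
stands in for rand so the values are easy to follow, and the variable
names flat_default and flat_fortran are made up for the illustration.

```python
import numpy as np

# A C-contiguous array of shape (10, 3) holding 0.0 .. 29.0.
a = np.arange(30, dtype=np.float64).reshape(10, 3)
b = a.transpose()  # a view on the same memory, shape (3, 10)

# The contiguity flags follow from the shape and strides alone.
print(a.strides, a.flags['C_CONTIGUOUS'], a.flags['F_CONTIGUOUS'])
print(b.strides, b.flags['C_CONTIGUOUS'], b.flags['F_CONTIGUOUS'])

# ravel() walks the array in C (row-major) index order by default,
# whatever the memory layout happens to be.
flat_default = b.ravel()

# Asking for Fortran order explicitly walks b column by column, which
# for this transposed view is the same as walking a row by row.
flat_fortran = b.ravel(order='F')
```

So the transposed view reports itself as Fortran-contiguous, yet a plain
ravel() still returns the elements in C index order.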
Now a question concerning a common use-case:
Frequently I have either a string or a buffer protocol object that is
basically a chunk of memory returned from some wrapped C library
function. Sometimes this string/object is in Fortran-strided memory
order, and I need to tell the ndarray constructor that so that
indexing works correctly.
Clearly, Fortran-strided memory order is just a special case of the
strides, as you point out. So what I need to do in this case is
provide an appropriate strides argument to the ndarray constructor.
But that's harder (and less transparent for someone reading the code)
than just setting a flag like Fortran=True or Order='FORTRAN'. So
from this perspective it would be great if the ndarray constructor
allowed something like strides='FORTRAN'.
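To make the comparison concrete, here is a minimal sketch of both
approaches. It assumes a current NumPy release, where the ndarray
constructor accepts order='F' alongside strides; that may not match the
API under discussion in this thread. The buffer contents and the helper
name fortran_strides are hypothetical, made up for the example.

```python
import numpy as np

def fortran_strides(shape, itemsize):
    """Hypothetical helper: byte strides for column-major storage."""
    strides = []
    step = itemsize
    for n in shape:
        strides.append(step)
        step *= n  # each later axis steps over all earlier axes
    return tuple(strides)

# Stand-in for memory returned by a wrapped C library: six float64
# values stored column by column for a 2x3 array.
raw = np.array([0., 1., 2., 3., 4., 5.]).tobytes()
shape = (2, 3)
itemsize = np.dtype(np.float64).itemsize

# Option 1: spell out the Fortran strides by hand.
by_strides = np.ndarray(shape, dtype=np.float64, buffer=raw,
                        strides=fortran_strides(shape, itemsize))

# Option 2: let the constructor derive the same strides from a flag.
by_order = np.ndarray(shape, dtype=np.float64, buffer=raw, order='F')

print(by_strides)
print(by_order.strides)
```

Both arrays index the same bytes identically; the flag version just
saves the reader from having to work out what the strides tuple means.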
Of course, if an order flag is provided, then the ndarray constructor
would have two different and orthogonal things that could/should
accept a parameter of 'FORTRAN' -- Order and Strides. Now that's
confusing!
I guess the problem is (as you demonstrate clearly) that there are
two semi-orthogonal issues around FORTRAN-ness. The first is the issue
of how a 1D memory region is to be indexed as a multidimensional array
(e.g. when constructing an array from a buffer). The second is the
issue of how a multidimensional array is to be treated as a 1D memory
region (e.g. when using ravel, or resize if resize is padding with
the array contents).
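A small worked example of the two issues, again only a sketch against a
current NumPy release (the order keyword on reshape and ravel is what is
assumed here; the variable names are made up):

```python
import numpy as np

flat = np.arange(6)  # the 1D "memory region": 0, 1, 2, 3, 4, 5

# Issue 1: how the 1D region is indexed as a multidimensional array.
as_c = flat.reshape((2, 3), order='C')  # rows filled first
as_f = flat.reshape((2, 3), order='F')  # columns filled first

# Issue 2: how a multidimensional array is read back out as 1D.
# The choice here is independent of how the array was built.
out_c = as_f.ravel(order='C')
out_f = as_f.ravel(order='F')

print(as_c.tolist())   # [[0, 1, 2], [3, 4, 5]]
print(as_f.tolist())   # [[0, 2, 4], [1, 3, 5]]
print(out_c.tolist())  # [0, 2, 4, 1, 3, 5]
print(out_f.tolist())  # [0, 1, 2, 3, 4, 5]
```

The same array gives two different 1D sequences depending only on the
order requested when flattening, which is the second property above.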
This is very confusing, and could certainly benefit from some
notational clarity. Does someone have any good ideas for what to call
each of these properties? Strides and Order (respectively) seems to
be what people are using in this email thread, but I'm not sure if
there could be something better...
Zach