[Numpy-discussion] array, asarray as contiguous and friends
tim.hochberg at cox.net
Fri Mar 24 08:39:03 CST 2006
Colin J. Williams wrote:
> Tim Hochberg wrote:
>> Sasha wrote:
>>> On 3/23/06, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>>>> At any rate, if the fortran flag is there, we need to specify the
>>>> contiguous case as well. So, either propose a better interface (we
>>>> could change it still --- the fortran flag doesn't have that much
>>>> history) to handle the situation or accept what I do ;-)
> Contiguity is separable from fortran:
> [Dbg]>>> b= _n.array([[1, 2, 3], [4, 5, 6]])
> [Dbg]>>> b.flags.contiguous
> [Dbg]>>> c= b.transpose()
> [Dbg]>>> c
> array([[1, 4],
>        [2, 5],
>        [3, 6]])
> [Dbg]>>> c.flags.contiguous
This is true, but irrelevant. To the best of my knowledge, the only
reason to force an array to be in a specific order is to pass it to a C
function that expects either FORTRAN- or C-ordered arrays. And, in that
case, the array also needs to be contiguous. So, for the purpose of
creating arrays (and for the purposes of ascontiguous), the only cases
that matter are arrays that are both contiguous and the specified order.
Thus, specifying contiguity and order separately to the constructor
needlessly complicates the interface. Or since I'm feeling jargon happy
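To make the point above concrete, here is a minimal sketch. It assumes a present-day NumPy, where the flags are spelled `c_contiguous` and `f_contiguous` and the helpers are `ascontiguousarray` and `asfortranarray`; those names are not taken from this thread. It shows that a transpose or a strided slice changes the flags without copying, and that what a C or Fortran routine actually wants is a buffer that is both contiguous and in the requested order.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])
c = b.transpose()            # a view with swapped strides, no copy

# b is C-contiguous; its transpose is Fortran-contiguous instead.
b_flags = (b.flags.c_contiguous, b.flags.f_contiguous)
c_flags = (c.flags.c_contiguous, c.flags.f_contiguous)

# A strided slice is contiguous in neither order.
s = b[:, ::2]
s_flags = (s.flags.c_contiguous, s.flags.f_contiguous)

# What an external routine needs: contiguous AND in the requested order.
for_c = np.ascontiguousarray(s)
for_fortran = np.asfortranarray(s)
```

In both conversions a single request settles order and contiguity together, which is the combination argued for above.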
>>> Let me try. I propose to eliminate the fortran flag in favor of a more
>>> general "strides" argument. This argument can be either a sequence of
>>> integers that becomes the strides, or a callable object that takes
>>> shape and dtype arguments and return a sequence that becomes the
>>> strides. For fortran and c order functions that generate appropriate
>>> stride sequences should be predefined to enable array(...,
>>> strides=fortran, ...) and array(..., strides=contiguous).
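The strides proposal quoted above could look roughly like the sketch below. The names `c_strides`, `fortran_strides` and `empty_with_strides` are hypothetical and exist only for illustration; the only real API used is the `numpy.ndarray` constructor with its `buffer` and `strides` arguments, as found in a present-day NumPy.

```python
import math
import numpy as np

def c_strides(shape, dtype):
    """Hypothetical helper: byte strides for row-major (C) layout."""
    step = np.dtype(dtype).itemsize
    strides = []
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))

def fortran_strides(shape, dtype):
    """Hypothetical helper: byte strides for column-major (Fortran) layout."""
    step = np.dtype(dtype).itemsize
    strides = []
    for n in shape:
        strides.append(step)
        step *= n
    return tuple(strides)

def empty_with_strides(shape, dtype, strides):
    """Hypothetical factory: `strides` is a sequence or a callable(shape, dtype).

    The buffer is sized for a dense array, so only strides that describe
    a dense layout of that size will be accepted by NumPy.
    """
    if callable(strides):
        strides = strides(shape, dtype)
    nbytes = math.prod(shape) * np.dtype(dtype).itemsize
    buf = bytearray(nbytes)
    return np.ndarray(shape, dtype=dtype, buffer=buf, strides=tuple(strides))

f = empty_with_strides((2, 3), np.float64, fortran_strides)
c = empty_with_strides((2, 3), np.float64, c_strides)
e = empty_with_strides((2, 3), np.float64, (8, 16))   # explicit sequence
```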
>> I like the idea of being able to create an array with custom strides.
>> The applications aren't entirely clear yet, but it does seem like it
>> could have some interesting and useful consequences. That said, I
>> don't think this belongs in 'array'. Historically, array has been
>> used for all sorts of array creation activities, which is why it
>> always seems to have a wide, somewhat incoherent interface. However,
>> most uses of array() boil down to one thing: creating a *new* array
>> from a python object. My preference would be to focus on that
>> functionality for array() and spin off its other historical uses and
>> new uses, like this custom strided array stuff, into separate factory
>> functions. For example (and just for example, I make no great claims
>> for either this name or interface):
>> a = array_from_data(a_buffer_object, dtype, dims, strides) [***]
>> One thing that you do make clear is that contiguous and fortran
>> should really be two values of the same flag.
> Please see the transpose example above.
>> If you combine this with one other simplification: array() always
>> copies, we end up with a nice thin interface:
>> # Create a new array in 'order' order. Defaults to "C" order.
>> array(object, dtype=None, order="C"|"FORTRAN")
> I feel that [***] above is much cleaner than this. I suggest that
> string constants be deprecated.
I'm no huge fan of string constants myself, but I think you need to
think this through more. First off, the interface I tossed off above
doesn't cover the same ground as array, since it works off an already
created buffer object. That means you'd have to go through all sorts of
contortions and do at least one copy to get data into Fortran order. You
could allow arbitrary, 1D, python sequences instead, but that doesn't
help the common case of converting a 2D python object into a 2D array.
You could allow N-D python objects, but then you have two ways of
specifying the dims of the object and things become a big krufty mess.
Compared to that, string constants are great.
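A rough illustration of the difference, assuming a present-day NumPy in which `array` accepts `order="F"` and the `ndarray` constructor can wrap an existing buffer (the variable names are only for illustration): the order string converts a nested Python object in one call, while the buffer route has to wrap row-major data first and then copy it into Fortran layout.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]       # an ordinary nested Python sequence

# One call: shape is inferred from the nesting, layout from the order string.
f = np.array(rows, dtype=np.int64, order="F")

# Buffer route: the flat data is already laid out row by row, so reaching
# Fortran layout means wrapping it and then copying into the new layout.
flat = np.array([1, 2, 3, 4, 5, 6], dtype=np.int64)
wrapped = np.ndarray(shape=(2, 3), dtype=np.int64, buffer=flat)  # no copy
g = np.asfortranarray(wrapped)                                    # one copy
```

Both `f` and `g` end up with the same strides; only the second path needs the shape spelled out separately from the data.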
>> # Returns an array. If object is an array and order is satisfied,
>> # return object otherwise a new array.
>> # If order is set the returned array will be contiguous and have
>> # that ordering
>> asarray(object, dtype=None, order=None|"C"|"FORTRAN")
>> # Just the same, but allow subtypes.
>> asanyarray(object, dtype=None, order=None|"C"|"FORTRAN")
>> You could build asarray, asanyarray, etc on top of the proposed array
>> without problems by using type(object)==ndarray and isinstance(object,
>> ndarray) respectively. Stuff like convenience functions for minnd
>> would also be easy to build on top of that. This looks great to me.
>> Embrace simplicity: you have nothing to lose but your clutter;)
> If [***] above were adopted, it would still be helpful to adopt
> numarray's iscontiguous method, or better, use a property.
-0. In my experience, 99% of my use cases would be covered by
ascontiguous, and for the remaining 1% I'm happy to use a.flags.contiguous.
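As a sketch of that split, assuming a present-day NumPy where the function is spelled `ascontiguousarray` and `asarray` takes an `order` argument: the conversion helpers hand back the input untouched when it already qualifies and copy only when they must, and the flag is still there for the rare case where you want to check by hand.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # C-ordered and contiguous
t = a.T                              # a view; not C-contiguous

same = np.asarray(a, order="C")      # requirement already met: no copy
copied = np.asarray(t, order="C")    # wrong layout: a new array is made

kept = np.ascontiguousarray(a)       # the common case, nothing to do
fixed = np.ascontiguousarray(t)      # copies into C order

# The remaining 1%: inspect the flag directly.
needs_copy = not t.flags.contiguous
```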