[Numpy-discussion] PEP 209: Multi-dimensional Arrays
Barrett at stsci.edu
Wed Feb 14 17:05:17 CST 2001
Rob W. W. Hooft writes:
> >>>>> "PB" == Paul Barrett <Barrett at stsci.edu> writes:
> PB> By way of bootstrapping, only one predefined type need be known,
> PB> say, Int32. The operations associated with this type can only be
> PB> Int32 operations, because this is the only type it knows about.
> PB> Yet, we can add another type, say Real64, which has not only
> PB> Real64 operations, BUT also Int32 and Real64 mixed operations,
> PB> since it knows about Int32. The Real64 type provides the
> PB> necessary information to relate the Int32 and Real64 types. Let's
> PB> now add a third type, then a fourth, etc., each knowing about its
> PB> predecessor types but not its successors.
> PB> This approach is identical to the way core Python adds new
> PB> classes or C-extension types, so this is nothing new. The
> PB> current types do not know about the new type, but the new type
> PB> knows about them. As long as one type knows the relationship
> PB> between the two that is sufficient for the scheme to work.
> Yuck. I'm thinking how long it would take to load the Int256 class,
> because it will need to import all other types before defining the
> relations.... [see below for another idea]
First, I'm not proposing that we use this method of bootstrapping from
just one type. I was just demonstrating that it could be done. Users
could then create their own types and dynamically add them to the
module by the above scheme.
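
A minimal sketch of that scheme, under stated assumptions: the names ArrayType, add_rule, and coerce are hypothetical and do not come from the PEP. It only illustrates that a later type can carry the rules relating it to earlier types, while the earlier types stay unchanged.

```python
# Hypothetical illustration; ArrayType, add_rule, and coerce are not PEP names.
class ArrayType:
    """A numeric type that stores coercion rules only for types it knows about."""

    def __init__(self, name):
        self.name = name
        # Sparse rule table {other_type: result_type}; a type always knows itself.
        self.rules = {self: self}

    def add_rule(self, other, result):
        """Record the result type of a mixed operation with an existing type."""
        self.rules[other] = result

    def __repr__(self):
        return self.name


def coerce(a, b):
    """Find the result type; it is enough that one operand holds the rule."""
    for owner, other in ((a, b), (b, a)):
        if other in owner.rules:
            return owner.rules[other]
    raise TypeError(f"no coercion rule between {a} and {b}")


# Int32 is defined first and knows only about itself.
Int32 = ArrayType("Int32")

# Real64 is added later and brings the Int32/Real64 rule with it.
Real64 = ArrayType("Real64")
Real64.add_rule(Int32, Real64)
```

Int32 is never modified when Real64 is added, yet coerce(Int32, Real64) still resolves because the lookup consults both operands.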
Second, I think you're making the situation more complex than it really
is. It doesn't take that long to initialize the type rules and
register the functions, because both arrays are sparsely populated.
If there isn't a rule between two types, you don't have to create a
dictionary entry. The size of the coercion table is equal to or less
than the number of types, so that's small. The function table is a
sparsely populated square array. We just envision populating its
diagonal elements and using coercion rules for the empty off-diagonal
elements. The point is that if an off-diagonal element is filled,
then it will be used.
I'll include our proposed implementation in the PEP for clarification.
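
Until then, here is a rough sketch of the lookup described above. It is not the proposed implementation: the table layout, the helper names (register, promotions, lookup), and the assumption that each type has at most one coercion rule naming the type it promotes to are all illustrative guesses. The Python functions stand in for the C functions.

```python
# Assumption: at most one coercion rule per type, naming the type it promotes to.
COERCION = {"Int32": "Real64"}

# Sparse function table keyed by (operation, left type, right type).
FUNCTIONS = {}


def register(name, signature, func):
    """Store func for one (left, right) cell; signature is (left, right, result)."""
    left, right, result = signature
    FUNCTIONS[(name, left, right)] = (func, result)


def promotions(type_name):
    """Yield type_name followed by every type reachable via the coercion table."""
    while type_name is not None:
        yield type_name
        type_name = COERCION.get(type_name)


def lookup(name, left, right):
    """Return (func, result type) for an operation on two operand types."""
    # A filled cell, diagonal or off-diagonal, is always used as is.
    if (name, left, right) in FUNCTIONS:
        return FUNCTIONS[(name, left, right)]
    # Empty off-diagonal cell: coerce both operands to a common type
    # and fall back to that type's diagonal entry.
    right_chain = list(promotions(right))
    for common in promotions(left):
        if common in right_chain and (name, common, common) in FUNCTIONS:
            return FUNCTIONS[(name, common, common)]
    raise TypeError(f"no {name} function for ({left}, {right})")


# Python stand-ins for the C functions that would be registered.
def add_int32(a, b):
    return [int(x) + int(y) for x, y in zip(a, b)]


def add_real64(a, b):
    return [float(x) + float(y) for x, y in zip(a, b)]


# Only the diagonal is populated.
register("add", ("Int32", "Int32", "Int32"), add_int32)
register("add", ("Real64", "Real64", "Real64"), add_real64)
```

With only the diagonal filled, lookup("add", "Int32", "Real64") falls back to the Real64 entry. Registering a function for ("Int32", "Real64", "Real64") later would make that cell take precedence.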
> PB> Attributes: .name: e.g. "Int32", "Float64", etc. .typecode:
> PB> e.g. 'i', 'f', etc. (for backward compatibility)
> >> .typecode() is a method now.
> PB> Yes, I propose that it become a settable attribute.
> Then it is not backwards compatible anyway, and you could leave it out.
I'd like to, but others have strongly objected to leaving out
> PB> .size (in bytes): e.g. 4, 8, etc.
> >> "element size?"
> PB> Yes.
> I think it should be called like that in that case. I dnt lk abbrvs.
> size could be misread as the size of the total object.
How about item_size?
> >> >> add.register('add', (Int32, Int32, Int32), cfunc-add)
> >> Typo: cfunc-add is an expression, not an identifier.
> PB> No, it is a Python object that encompasses and describes a C
> PB> function that adds two Int32 arrays and returns an Int32 array.
> I understand that, but in general a "-" in pseudo-code is the
> minus operator. I'd write cfunc_add instead.
Yes. I understand now.
> PB> 4. ArrayView
> PB> This class is similar to the Array class except that the reshape
> PB> and flat methods will raise exceptions, since non-contiguous
> PB> arrays cannot be reshaped or flattened using just pointer and
> PB> step-size information.
> >> This was completely unclear to me until here. I must say I find
> >> this a strange way of handling things. I haven't looked into
> >> implementation details, but wouldn't it feel more natural if an
> >> Array would just be the "data", and an ArrayView would contain the
> >> dimensions and strides. Completely separated. One would always
> >> need a pair, but more than one ArrayView could use the same Array.
> PB> In my definition, an Array that has no knowledge of its shape and
> PB> type is not an Array, it's a data or character buffer. An array
> PB> in my definition is a data buffer with information on how that
> PB> buffer is to be mapped, i.e. shape, type, etc. An ArrayView is
> PB> an Array that shares its data buffer with another Array, but may
> PB> contain a different mapping of that Array, ie. its shape and type
> PB> are different.
> PB> If this is what you mean, then the answer is "Yes". This is how
> PB> we intend to implement Arrays and ArrayViews.
> No, it is not what I meant. Reading your answer I'd say that I wouldn't
> see the need for an Array. We only need a data buffer and an ArrayView.
> If there are two parts of the functionality, it is much cleaner to make
> the cut in an orthogonal way.
I just don't see what you are getting at here! What attributes does
your Array have, if it doesn't have a shape or type?
If Arrays only have view behavior, then yes, there is no need for the
ArrayView class. Whereas if Arrays have copy behavior, it might be a
good idea to distinguish between an ordinary Array and an ArrayView.
An alternative would be to have a view attribute.
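
To make the copy-versus-view distinction concrete, here is a one-dimensional sketch. It assumes that slicing an Array copies and that a view attribute returns an ArrayView sharing the buffer. The class layout, the _remap helper, and the _ViewMaker class are hypothetical and not part of the proposal.

```python
class Array:
    """1-D sketch: a data buffer plus its mapping (offset, stride, length)."""

    def __init__(self, buffer, offset=0, stride=1, length=None):
        self.buffer = buffer
        self.offset = offset
        self.stride = stride
        self.length = len(buffer) if length is None else length

    def __len__(self):
        return self.length

    def tolist(self):
        return [self.buffer[self.offset + i * self.stride]
                for i in range(self.length)]

    def _remap(self, key):
        """Translate a slice into (offset, stride, length) on the shared buffer."""
        start, stop, step = key.indices(self.length)
        length = len(range(start, stop, step))
        return self.offset + start * self.stride, self.stride * step, length

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Copy behavior: gather the selected items into a fresh buffer.
            offset, stride, length = self._remap(key)
            return Array([self.buffer[offset + i * stride] for i in range(length)])
        # Non-negative integer index only, to keep the sketch short.
        return self.buffer[self.offset + key * self.stride]

    def __setitem__(self, index, value):
        self.buffer[self.offset + index * self.stride] = value

    @property
    def view(self):
        # A.view[...] returns an ArrayView instead of a copy.
        return _ViewMaker(self)


class ArrayView(Array):
    """Shares its data buffer with another Array but has its own mapping."""


class _ViewMaker:
    """Helper so that A.view[:10] can use slice syntax."""

    def __init__(self, array):
        self.array = array

    def __getitem__(self, key):
        offset, stride, length = self.array._remap(key)
        return ArrayView(self.array.buffer, offset, stride, length)


A = Array(list(range(20)))
B = A.view[:10]   # shares A's buffer
C = A[:10]        # independent copy
B[0] = 99         # writes through to A, but not to C
```

After the last line, A[0] is 99 while C[0] is still 0, which is the behavioral difference the two class names would signal.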
> PB> B = A.V[:10] or A.view[:10] are some possibilities. B is now an
> PB> ArrayView class.
> I hate magic attributes like this. I do not like abbrevs at all. It is
> not at all obvious what A.T or A.V mean.
I'm not a fan of them either, but I'm looking for consensus on these
> PB> 2. Does item syntax default to copy or view behavior?
> >> view.
> PB> Yet, c[i] can be considered just a shorthand for c[i,:] which
> PB> would imply copy behavior assuming slicing syntax returns a copy.
> >> If you reason that way, then c is just a shorthand for c[...]
> >> too.
> PB> Yes, that is correct, but that is not how Python currently
> PB> behaves.
> Current python also doesn't treat c[i] as a shorthand for c[i,:] or
Because there aren't any multi-dimensional lists in Python, only
nested 1-dimensional lists. There is a structural difference.
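
A short illustration of that structural difference, using only built-in lists (the variable names are arbitrary):

```python
c = [[1, 2, 3], [4, 5, 6]]

row = c[0]            # item syntax returns the inner list itself, not a copy
row[0] = 99           # so the change is visible through c

row_copy = c[1][:]    # slice syntax on a list makes a shallow copy
row_copy[0] = -1      # so c is unaffected

error = None
try:
    c[0, :]           # a list has no second dimension to index
except TypeError as exc:
    error = exc       # lists reject tuple indices
```

Because c[0] is a separate list object, there is no c[i,:] for it to be shorthand for.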
Dr. Paul Barrett Space Telescope Science Institute
Phone: 410-338-4475 ESS/Science Software Group
FAX: 410-338-4767 Baltimore, MD 21218