[Numpy-discussion] PEP 209: Multi-dimensional Arrays

Paul Barrett Barrett at stsci.edu
Fri Feb 16 10:18:53 CST 2001


Rob W. W. Hooft writes:
 > 
 >  >>  No, it is not what I meant. Reading your answer I'd say that I
 >  >> wouldn't see the need for an Array. We only need a data buffer and
 >  >> an ArrayView. If there are two parts of the functionality, it is
 >  >> much cleaner to make the cut in an orthogonal way.
 > 
 > 
 >  PB> I just don't see what you are getting at here!  What attributes
 >  PB> does your Array have, if it doesn't have a shape or type?
 > 
 > A piece of memory. It needs nothing more. A buffer[1]. You'd always
 > need an ArrayView.  The ArrayView contains information like
 > dimensions, strides, data type, endianness.
 > 
 > Making a new _view_ would consist of making a new ArrayView, and pointing
 > its data object to the same data array. 
 > 
 > Making a new _copy_ would consist of making a new ArrayView, and
 > marking the "copy-on-write" features (however that needs to be
 > implemented, I have never done that. Does it involve weak
 > references?).
 > 
 > Different Views on the same data can even have different data types:
 > e.g. character and byte, or even floating point and integer (I am
 > a happy user of the fortran EQUIVALENCE statement that way too).

I think our approaches are very similar.  It's the meaning that we
ascribe to Array and ArrayView that appears to be causing the
confusion.  Your Array object is our Data object and your ArrayView
object is our Array attributes, i.e. the information to map/interpret
the Data object.  We view an Array as being composed of two entities,
its attributes and a Data object.  And we entirely agree with the
above definitions of _view_ and _copy_.  But you haven't told us what
object associates your Array and ArrayView to make a usable array that 
can be sliced, diced, and julienne fried.
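
To make the composition concrete, here is a rough Python sketch of an
Array built from attributes plus a shared data object.  Every name in it
(Array, view, transpose, copy, the byte-stride layout) is hypothetical
and chosen only for illustration; it is not the PEP 209 design, and the
copy is made eagerly rather than copy-on-write.

```python
import struct

class Array:
    """Hypothetical: attributes that map/interpret a separate data object."""

    def __init__(self, data, shape, typecode="d", strides=None, offset=0):
        self.data = data                  # shared buffer, e.g. a bytearray
        self.shape = tuple(shape)
        self.typecode = typecode          # struct format character
        if strides is None:
            # Default to C-contiguous strides, measured in bytes.
            strides, step = [], struct.calcsize(typecode)
            for n in reversed(self.shape):
                strides.insert(0, step)
                step *= n
        self.strides = tuple(strides)
        self.offset = offset

    def _pos(self, index):
        # Byte position of the element addressed by a tuple index.
        return self.offset + sum(i * s for i, s in zip(index, self.strides))

    def __getitem__(self, index):
        return struct.unpack_from(self.typecode, self.data, self._pos(index))[0]

    def __setitem__(self, index, value):
        struct.pack_into(self.typecode, self.data, self._pos(index), value)

    def view(self):
        # New attributes, same data object.
        return Array(self.data, self.shape, self.typecode,
                     self.strides, self.offset)

    def transpose(self):
        # Also a view: only the attributes are rearranged.
        return Array(self.data, self.shape[::-1], self.typecode,
                     self.strides[::-1], self.offset)

    def copy(self):
        # New attributes and a new data object (eager, not copy-on-write).
        return Array(bytearray(self.data), self.shape, self.typecode,
                     self.strides, self.offset)

a = Array(bytearray(2 * 3 * 8), (2, 3))
a[0, 1] = 5.0
v = a.view()
t = a.transpose()       # t[1, 0] addresses the same bytes as a[0, 1]
c = a.copy()
c[0, 1] = 7.0           # does not touch a's data
```

With this arrangement slicing would be a method that returns another
Array over the same data, so no separate (Array, ArrayView) pair has to
be passed around.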

My impression of your slice method would be:

slice(Array, ArrayView, slice expression)

I'm not too keen on this approach. :-)

 > The speed up by re-use of temporary arrays becomes very easy this way
 > too: one can even re-use a floating point data array as integer result
 > if the reference count of both the data array and its (only) view is
 > one.

Yes!  This is our intended implementation.  But instead of re-using
your Array object, we will be re-using a (data-) buffer object, or a
memory-mapped object, or whatever else in which the data is stored.
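
As an illustration only, present-day Python can show the buffer re-use
with memoryview.cast(): two views of different type over one bytearray,
with the integer result written back into the storage that held the
floats.  The reference-count check mentioned above is left out, and the
variable names are made up for this sketch.

```python
# One data buffer holding four 8-byte items.
buf = bytearray(4 * 8)

# Interpret the buffer as C doubles and fill it.
floats = memoryview(buf).cast("d")
for i, x in enumerate([1.5, 2.5, 3.5, 4.5]):
    floats[i] = x

# Interpret the very same bytes as 8-byte signed integers.
ints = memoryview(buf).cast("q")

# Each float is read before its slot is overwritten, so the integer
# result can re-use the float storage item by item.
for i in range(len(ints)):
    ints[i] = int(floats[i])

result = list(ints)
```

This only works so directly because both item types are the same size;
a result type of a different size would need its own buffer.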

 > [1] Could the python buffer interface be used as a pre-existing
 >     implementation here? Would that make it possible to implement
 >     Array.append()? I don't always know beforehand how large my
 >     numeric arrays will become.

In a way, yes.  I've considered creating an in-memory object that has
similar properties to the memory-mapped object (e.g. it might have a
read-only property), so that the two data objects can be used
interchangeably.  The in-memory object would replace the string object 
as a data store, since the string object is meant to be read-only.
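
For what it is worth, here is a small sketch of that interchangeability
in present-day Python, where bytearray (in-memory), mmap (memory-mapped)
and bytes (read-only) all expose the buffer protocol.  The fill() helper
is hypothetical, and none of this is the object proposed above.

```python
import mmap

def fill(data, value):
    """Set every byte of a writable data object to value (hypothetical helper)."""
    with memoryview(data) as view:
        if view.readonly:
            raise TypeError("data object is read-only")
        view[:] = bytes([value]) * len(view)

# An in-memory data object.
in_memory = bytearray(16)
fill(in_memory, 7)

# An anonymous memory-mapped data object, used through the same helper.
mapped = mmap.mmap(-1, 16)
fill(mapped, 7)
mapped_bytes = mapped[:]
mapped.close()

# An immutable bytes object reports itself as read-only.
frozen_is_readonly = memoryview(b"\x00" * 16).readonly
```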

 -- Paul

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



