[Numpy-discussion] Re-implementation of Python Numerical arrays (Numeric) available for download

Perry Greenfield greenfield at home.com
Sat Nov 17 16:28:02 CST 2001


> > I think that we also don't like that, and after doing the original,
> > somewhat incomplete, implementation using the subarray approach,
> > I began to feel that implementing it in C (albeit using a different
> > approach for the code generation) was probably easier and more
> > elegant than what was done here. So you are very likely to see
> > it integrated as a regular numeric type, with a more C-based
> > implementation.
>
> Sounds good.   Is development going to take place on the CVS
> tree?  If so, I could help out by committing changes directly.
>
> >
> > > 2)  Also,  in your C-API, you have a different pointer to the
> > > imaginary data.
> > >   I much prefer the way it is done currently to have complex numbers
> > > represented as an 8-byte, or 16-byte chunk of contiguous memory.
> >
> > Any reason not to allow both? (The pointer to the real can be
> > interpreted as either a pointer to 8-byte or 16-byte quantities).
> > It is true that figuring out the imaginary pointer from the real
> > is trivial so I suppose it really isn't necessary.
>
> I guess the way you've structured the ndarray, it is possible.  I figured
> some operations might be faster, but perhaps not if you have two pointers
> running at the same time, anyway.
>
Well, the C implementation I was thinking of would only use
one pointer. The API could supply both if some algorithms would
find it useful to just access the imaginary data alone. But as
mentioned, I don't think it is important to include, so we
could easily get rid of it (and probably should).
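
For readers following along, here is a minimal sketch of the single-pointer layout. It uses present-day NumPy (an assumption; it is not the C API under discussion) to show that the imaginary part sits at a fixed offset from the real part in one contiguous buffer, so a second pointer carries no extra information.

```python
import numpy as np

# One contiguous buffer of complex128 values: each element is 16 bytes,
# an 8-byte real part followed by an 8-byte imaginary part.
z = np.array([1 + 2j, 3 + 4j, 5 + 6j], dtype=np.complex128)

re = z.real   # view on the real parts (no copy)
im = z.imag   # view on the imaginary parts (no copy)

# Both views step through the same buffer one complex element at a time.
assert re.strides == (16,) and im.strides == (16,)

# The imaginary view starts one float64 (8 bytes) past the real view.
re_addr = re.__array_interface__["data"][0]
im_addr = im.__array_interface__["data"][0]
offset = im_addr - re_addr

# Writing through the view changes the original complex array.
im[0] = 20.0
```

An algorithm that only needs the imaginary data can derive its pointer from the real pointer plus the component size, which is the "trivial" computation mentioned above.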

> >
> > > Index Arrays:
> > > ===========
> > >
> > > 1)  For what it's worth, my initial reaction to your indexing
> > > scheme is negative.  I would prefer that if
> > >
> > > a = [[1,2,3,4],
> > >       [5,6,7,8],
> > >       [9,10,11,12],
> > >       [13,14,15,16]]
> > >
> > > then
> > >
> > > a[[1,3],[0,3]] returns the sub-matrix:
> > >
> > > [[   4,  6],
> > >  [ 12, 14]
> > >
> > > i.e. the cross-product of [1,3] x [0,3].  This is the way MATLAB
> > > works.  I'm not sure what IDL does.
> >
> > I'm afraid I don't understand the example. Could you elaborate
> > a bit more how this is supposed to work? (Or is it possible
> > there is an error? I would understand it if the result were
> > [[5, 8],[13,16]] corresponding to the index pairs
> > [[(1,0),(1,3)],[(3,0),(3,3)]])
> >
>
> The idea is to consider indexing with arrays of integers to be a
> generalization of slice index notation.   Simply interpret the
> slice as an array of integers that would be formed by using the
> range operator.
>
> For example, I would like to see
>
> a[1:5,1:3] be the same thing  as  a[[1,2,3,4],[1,2]]
>
> a[1:5,1:3] selects the 2-d subarray consisting of rows 1 to 4 and
> columns 1 to 2 (inclusive, with the first row being row 0).  In
> other words, the indices used to select the elements of a are
> ordered pairs taken from the cross-product of the index sets:
>
> [1,2,3,4] x [1,2] = [(1,1), (1,2), (2,1), (2,2), (3,1), (3,2),
>                      (4,1), (4,2)]
> and these selected elements are structured as a 2-d array of shape (4,2).
>
> Does this make more sense?  Indexing would be a natural extension of this
> behavior, but allowing sets that can't necessarily be formed from
> the range function.
>
I understand this (but is the example in the first message
consistent with this?). This is certainly a reasonable
interpretation. But if this is the way multiple index arrays
are interpreted, how does one easily specify scattered points
in a multidimensional array? The only alternative I can
think of is to use one of the dimensions of a multidimensional
index array to hold the indices for each of the dimensions. For
example, if one wanted to index random points in a 2d array, then
supplying an nx2 array would provide a list of n such points.
But I see this as a more limiting way to do this (and there
are often benefits to being able to keep the indices for
different dimensions in separate arrays).
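
As an illustration only, the two ways of naming scattered points can be sketched in present-day NumPy, which pairs index arrays element by element. That behavior is assumed here for the sake of the example and is not necessarily identical to the implementation being discussed.

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)   # the 4x4 example array from the thread

# Separate index arrays, one per dimension, paired element by element:
rows = np.array([1, 3])
cols = np.array([0, 3])
scattered = a[rows, cols]            # picks a[1, 0] and a[3, 3]

# The alternative: a single n x 2 array of (row, col) points.
# It has to be split back into per-dimension columns before indexing.
points = np.array([[1, 0],
                   [3, 3]])
from_points = a[points[:, 0], points[:, 1]]
```

Both forms select the same two elements; keeping `rows` and `cols` separate avoids the extra packing and unpacking step.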

But I think doing what you would like to do is straightforward
even with the existing implementation. For example, if x is a
2d array we could easily develop a function such that:

x[outer_index_product([1,3,4],[1,5])]
# with a better function name!

The function outer_index_product would return a tuple of two
index arrays each with a shape of 3x2. These arrays
would not take up more space than the original
arrays even though they appear to have a much
larger size (the one dimension is replicated by
use of a 0 stride size so the data buffer is
the same as the original). Would this be acceptable?
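
A rough sketch of such a function, written against present-day NumPy rather than the code discussed here. The name `outer_index_product` comes from the message above; the body, including the use of `np.broadcast_to` to get the zero strides, is an assumption about how it could be done.

```python
import numpy as np

def outer_index_product(rows, cols):
    """Return two index arrays covering the cross-product rows x cols.

    Hypothetical helper: each result has shape (len(rows), len(cols)),
    but the repeated dimension has stride 0, so no data is duplicated.
    """
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    shape = (rows.size, cols.size)
    row_idx = np.broadcast_to(rows[:, np.newaxis], shape)
    col_idx = np.broadcast_to(cols[np.newaxis, :], shape)
    return row_idx, col_idx

x = np.arange(36).reshape(6, 6)      # x[i, j] == 6*i + j

r, c = outer_index_product([1, 3, 4], [1, 5])
sub = x[outer_index_product([1, 3, 4], [1, 5])]

# With index sets built from ranges, the result matches plain slicing.
same_as_slice = np.array_equal(
    x[1:5, 1:3],
    x[outer_index_product(range(1, 5), range(1, 3))],
)
```

Here `r.strides[1]` and `c.strides[0]` are both 0, which is the replication trick described above. NumPy's `np.ix_` gives the same selection by returning arrays of shape (3, 1) and (1, 2) and leaving the broadcasting to the indexing step.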

In the end, all these indexing behaviors can be provided
by different functions. So it isn't really a question of
which one to have and which not to have. The question is
what is supported by the indexing notation? For us, the
behavior we have implemented is far more useful for our
applications than the one you propose. But perhaps we are
in the minority, so I'd be very interested in hearing which
indexing interpretation is most useful to the general
community.

> > Why not:
> >
> > ravel(a)[[9,10,11]] ?
>
> sure, that would work, especially if ravel doesn't make a copy of
> the data (which I presume it does not).
>
Correct.
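
For anyone who wants to check this today, a small sketch using present-day NumPy (an assumption; its `ravel` returns a view only when the memory layout allows it and copies otherwise):

```python
import numpy as np

a = np.arange(1, 17).reshape(4, 4)   # contiguous 4x4 array

flat = np.ravel(a)
no_copy = np.shares_memory(flat, a)  # True when ravel returned a view

picked = flat[[9, 10, 11]]           # flat positions 9, 10, 11

# A transposed array cannot be flattened in C order without copying.
copied = not np.shares_memory(np.ravel(a.T), a)
```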

Perry




