[Numpy-discussion] Slice assignment and overlapping views (was: Strange behavior in setting masked array values in Numpy 1.1.0)

Charles R Harris charlesr.harris@gmail....
Sun Jun 1 10:08:06 CDT 2008

On Sun, Jun 1, 2008 at 1:37 AM, Anne Archibald <peridot.faceted@gmail.com>

> 2008/5/31 Pauli Virtanen <pav@iki.fi>:
> > The reason for the strange behavior of slice assignment is that when the
> > left and right sides in a slice assignment are overlapping views of the
> > same array, the result is currently effectively undefined. Same is true
> > for ndarrays:
> >
> >>>> import numpy
> >>>> a = numpy.array([1, 2, 3, 4, 5])
> >>>> a[::-1]
> > array([5, 4, 3, 2, 1])
> >>>> a[:] = a[::-1]
> >>>> a
> > array([5, 4, 3, 4, 5])
> I think that the current rule is, slices are walked from low index to
> high index. This doesn't help with multidimensional arrays, where the
> order of the axes is (and should be) determined by efficiency
> considerations.
> Unfortunately there's really no good way to warn about overlapping
> copies. Remember that this is a frequent operation, so it has to be
> fast for small arrays.
> I think changing base so that it points to the real base and not the
> parent would help (and clear up a memory leak: try "while True: A =
> A[::-1]" some time) eliminate some cases where overlap cannot occur,
> but what about the following cases?
> A[:5] = A[-5:]
> A[::2] = A[1::2]
> A[1:] = A[-1:]
> The last is actually fairly common (I've needed it), and relies on
> numpy's ordering of copies. The middle one is very common, and the
> first one would be a royal pain to code around if the slices were not
> allowed to overlap.
> I think I at one point wrote an algorithm that would detect many cases
> where overlap could not occur (maybe in the smarter reshape code?) but
> I came to the conclusion that detecting all the cases was infeasible.
> It's a number-theoretic problem - "can you write the same number as
> the sum of multiples of these two lists of numbers with nonnegative
> integer coefficients less than these other two lists of numbers?" -
> and I suspect it's NP-complete. (Ah, yes, you can make a knapsack out
> of it - take an array with strides a_1, ... a_n and shape (2,...,2),
> and a single-element array starting at position N.) In any case it's
> infeasible to solve it every time somebody tries to do a slice
> assignment.

Since one could compute all the addresses and check for duplicates, I would
guess it is O(n), maybe O(n*log(n)) worst case, but I can't convince myself
that it would help much to know about overlaps. As you point out,  a lot of
times one needs overlapping ranges, it is when the user has designed an
algorithm without taking the possibility into account that problems arise.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20080601/2765c6cc/attachment.html 

More information about the Numpy-discussion mailing list