[Numpy-discussion] broadcasting behavior for 1.6 (was: Numpy 1.6 schedule)
Charles R Harris
charlesr.harris@gmail....
Fri Mar 11 09:51:08 CST 2011
On Fri, Mar 11, 2011 at 8:06 AM, Wes McKinney <wesmckinn@gmail.com> wrote:
> On Fri, Mar 11, 2011 at 9:57 AM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > On Fri, Mar 11, 2011 at 7:42 AM, Charles R Harris
> > <charlesr.harris@gmail.com> wrote:
> >>
> >>
> >> On Fri, Mar 11, 2011 at 2:01 AM, Ralf Gommers
> >> <ralf.gommers@googlemail.com> wrote:
> >>>
> >>> I'm just going through the very long 1.6 schedule thread to see what
> >>> is still on the TODO list before a 1.6.x branch can be made. So I'll
> >>> send a few separate mails, one for each topic.
> >>>
> >>> On Mon, Mar 7, 2011 at 8:30 PM, Francesc Alted <faltet@pytables.org>
> >>> wrote:
> >>> > On Sunday 06 March 2011 06:47:34 Mark Wiebe wrote:
> >>> >> I think it's ok to revert this behavior for backwards compatibility,
> >>> >> but believe it's an inconsistent and unintuitive choice. In
> >>> >> broadcasting, there are two operations, growing a dimension 1 -> n,
> >>> >> and appending a new 1 dimension to the left. The behaviour under
> >>> >> discussion in assignment is different from normal broadcasting in
> >>> >> that only the second one is permitted. It is broadcasting the output
> >>> >> to the input, rather than broadcasting the input to the output.
> >>> >>
> >>> >> Suppose a has shape (20,), b has shape (1,20), and c has shape
> >>> >> (20,1). Then a+b has shape (1,20), a+c has shape (20,20), and b+c
> >>> >> has shape (20,20).
> >>> >>
> >>> >> If we do "b[...] = a", a will be broadcast to match b by adding a 1
> >>> >> dimension to the left. This is reasonable and consistent with
> >>> >> addition.
> >>> >>
> >>> >> If we do "a[...]=b", under 1.5 rules, a will once again be broadcast
> >>> >> to match b by adding a 1 dimension to the left.
> >>> >>
> >>> >> If we do "a[...]=c", we could broadcast both a and c together to the
> >>> >> shape (20,20). This results in multiple assignments to each element
> >>> >> of a, which is inconsistent. This is not analogous to a+c, but
> >>> >> rather to np.add(c, c, out=a).
> >>> >>
> >>> >> The distinction is subtle, but the inconsistent behavior is harmless
> >>> >> enough for assignment that keeping backwards compatibility seems
> >>> >> reasonable.
> >>> >
> >>> > For what it's worth, I also like the behaviour that Mark proposes, and
> >>> > have updated the tables test suite to adapt to this. But I'm fine if it
> >>> > is decided to revert to the previous behaviour.
> >>>
> >>> The conclusion on this topic, as I read the discussion, is that we
> >>> need to keep backwards compatible behavior (even though the proposed
> >>> change is more intuitive). Has backwards compatibility been fixed
> >>> already?
> >>>
> >>
> >> I don't think an official conclusion was reached, at least in so far as
> >> numpy has an official anything ;) But this change does show up as an
> >> error in one of the pandas tests, so it is likely to affect other folks as
> >> well. Probably the route of least compatibility hassle is to revert to the
> >> old behavior and maybe switch to the new behavior, which I prefer, for 2.0.
> >>
> >
> > That said, apart from pandas and pytables, and the latter has been fixed,
> > the new behavior doesn't seem to have much fallout. I think it actually
> > exposes unnoticed assumptions in code that slipped by because there was no
> > consequence.
> >
> > Chuck
> >
>
> I've fixed the pandas issue-- I'll put out a bugfix release whenever
> NumPy 1.6 final is out. I don't suspect it will cause very many
> problems (and those problems will--hopefully--be easy to fix).
>
Now I'm really vacillating. I do prefer the new behavior and the fallout
does seem minimal. Put me +1 for the change unless a strong objection
surfaces.
Chuck
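
For readers following the thread, the shapes in Mark's example can be checked with a short sketch. The array names a, b, c follow his message; the helper try_assign is hypothetical and only reports whether an assignment is accepted. Which assignments succeed depends on the NumPy version installed, so the script prints those results rather than asserting an outcome.

```python
import numpy as np

a = np.zeros(20)        # shape (20,)
b = np.zeros((1, 20))   # shape (1, 20)
c = np.zeros((20, 1))   # shape (20, 1)

# Ordinary broadcasting in arithmetic: both operands grow to a common shape.
shapes = {
    "a+b": (a + b).shape,
    "a+c": (a + c).shape,
    "b+c": (b + c).shape,
}
print(shapes)

# Assignment where the source only needs a new length-1 axis on the left:
# (20,) -> (1, 20). This matches what addition would do.
b[...] = np.arange(20)


def try_assign(dst, src):
    """Hypothetical helper: True if dst[...] = src is accepted, else False."""
    try:
        dst[...] = src
        return True
    except ValueError:
        return False


# The two cases under discussion. The outcome is version dependent,
# so it is reported here instead of being assumed.
a_from_b = try_assign(a, b)   # (1, 20) assigned into (20,)
a_from_c = try_assign(a, c)   # (20, 1) assigned into (20,)
print("a[...] = b accepted:", a_from_b)
print("a[...] = c accepted:", a_from_c)
```

The arithmetic results are independent of the assignment question: a+b, a+c and b+c always broadcast both operands, whereas assignment has a fixed destination shape, which is where the 1.5 behaviour and the proposed behaviour differ.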
