[Numpy-discussion] Broadcasting rules (Ticket 76).

Sasha ndarray at mac.com
Tue Apr 25 18:17:04 CDT 2006

On 4/25/06, tim.hochberg at cox.net <tim.hochberg at cox.net> wrote:
> ---- Travis Oliphant <oliphant.travis at ieee.org> wrote:
> > Sasha wrote:
> > > In this category, I would suggest to allow broadcasting to any
> > > multiple of the dimension even if the dimension is not 1.  I don't see
> > > what makes 1 so special.
> > >
> > What's so special about 1 is that the code for it is relatively
> > straightforward and already implemented using strides.  Altering the
> > code to allow any multiple of the dimension would be harder and slower.

I don't think so. The same zero-stride trick that allows size-1
broadcasting can be used to implement repetition.  I did not review
the C code, but the following Python fragment shows that the loop that
is already in numpy can be used to implement repetition by simply
manipulating shapes and strides:

>>> from numpy import zeros, reshape
>>> x = zeros(6, dtype=int)  # integer array, matching the output below
>>> reshape(x,(3,2))[...] = 1,2
>>> x
array([1, 2, 1, 2, 1, 2])
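
To spell out the zero-stride idea behind the fragment above, here is a
minimal sketch. It assumes a current NumPy release and uses
numpy.lib.stride_tricks.as_strided purely for illustration; it is not
taken from the NumPy C code, and the names pair, rows and x are mine.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Size-1 style broadcasting: a stride of 0 along the new leading axis
# re-reads the same two elements for every row, so nothing is copied.
pair = np.array([1, 2])
rows = as_strided(pair, shape=(3, 2), strides=(0, pair.strides[0]))

# Repetition: view the 6-element target as 3 rows of 2, then let the
# ordinary broadcasting loop fill every row with the same pair.
x = np.zeros(6, dtype=int)
x.reshape(3, 2)[...] = pair

print(rows.strides[0])  # stride along the repeated axis
print(x)
```

The second half is the same operation as the interpreter session above:
only the shape of the target changes, and the existing loop does the
rest.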

> It also does the right thing most of the time and is easy to understand.

Easy to understand?  Let me quote Travis' book on this:

"Broadcasting can be understood by four rules: ... While perhaps
somewhat difficult to explain, broadcasting can be quite useful and
becomes second nature rather quickly."

I may be slow, but it did not become second nature for me.  I am still
getting bitten by subtle differences between unit length 1-d arrays
and 0-d arrays.
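
A small sketch of the kind of difference I mean, as it behaves in a
current NumPy release (details may differ from the version under
discussion here). The variable names are illustrative only.

```python
import numpy as np

zero_d = np.array(5)     # 0-d array: shape (), no axes at all
one_elem = np.array([5])  # 1-d array: shape (1,), one axis of length 1

# Both broadcast against a longer array in the same way...
base = np.arange(3)
same_sum = np.array_equal(base + zero_d, base + one_elem)

# ...but they differ as soon as the shape itself matters.
shapes = (zero_d.shape, one_elem.shape)
ndims = (zero_d.ndim, one_elem.ndim)

# Integer indexing works on the 1-d array but not on the 0-d one.
try:
    zero_d[0]
    zero_d_indexable = True
except IndexError:
    zero_d_indexable = False

print(same_sum, shapes, ndims, zero_d_indexable)
```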

> It's my expectation that opening up broadcasting will be more effective in masking
> errors than in enabling useful new behaviour.

In my experience, broadcasting length-1 dimensions while refusing to
broadcast other lengths is already very error-prone.  I understand
that restricting broadcasting to a strictly dimension-increasing
operation is not possible, for two reasons:

1. Numpy cannot break legacy Numeric code.
2. It is not possible to differentiate between a 1-d array that
broadcasts column-wise and one that broadcasts row-wise.

In my view neither of these reasons is valid.  In my experience Numeric
code that relies on dimension-preserving broadcasting is already
broken, only in a subtle and hard-to-reproduce way.  Similarly, the
need to broadcast over a non-leading dimension is a sign of bad design.
In the rare cases where such broadcasting is desirable, it can easily
be done via swapaxes, which is a cheap operation.
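
One way to read the swapaxes remark, sketched against a current NumPy
release: broadcasting matches a 1-d array against the trailing axis, so
to apply it along the other axis you swap, operate, and swap back. The
arrays a and v are made up for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
v = np.array([100, 200, 300])    # one value per row of a

# a + v would fail: v (length 3) is matched against the trailing axis
# of a (length 4).  Swap the axes so the length-3 axis is trailing,
# add, then swap back.  swapaxes returns a view, so no data is copied
# by the swaps themselves.
result = (a.swapaxes(0, 1) + v).swapaxes(0, 1)

print(result)
```

The same result can be had with a + v[:, np.newaxis], which is the
dimension-increasing spelling of the same operation.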

Nevertheless, I lost that battle some time ago.

On the other hand I don't see much problem in making
dimension-preserving broadcasting more permissive.  In R, for example,
(1-d) arrays can be broadcast to arbitrary size.  This has an
additional benefit that 1-d to 2-d broadcasting requires no special
code; it just happens because matrices inherit arithmetic from
vectors.  I've never had a problem with the R rules being too loose.
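
For readers who do not use R, here is a rough sketch of that recycling
behaviour for 1-d inputs, written with numpy.resize.  The helper
recycle_add is hypothetical, not an existing or proposed NumPy
function, and the multiple-of-length check is my assumption, chosen to
match the "any multiple of the dimension" suggestion earlier in this
thread.  R fills matrices column by column, which this sketch does not
attempt to reproduce.

```python
import numpy as np

def recycle_add(long, short):
    """Hypothetical helper: add 1-d `short` to 1-d `long`, repeating
    `short` as many times as needed to match the length of `long`."""
    long = np.asarray(long)
    short = np.asarray(short)
    if long.size % short.size != 0:
        raise ValueError("length of long is not a multiple of length of short")
    # np.resize fills the requested shape with repeated copies of short.
    return long + np.resize(short, long.shape)

total = recycle_add(np.arange(1, 7), [10, 20])
print(total)
```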

> I think that's my ticket being discussed here. If so, it was motivated by a case that
> stopped working because the looser broadcasting behaviour was preventing some
> other broadcasting from taking place. I'm not home right now, so I can't provide
> details; I'll do that on Thursday.

In my view the problem that your ticket highlighted is not so much in
the particular set of broadcasting rules, but in the fact that
a[...] = b uses one set of rules while a[...] += b uses another.  This
is *very* confusing.

> Just keep in mind that it's much easier to keep the broadcasting rules restrictive for
> now and loosen them up later than to try to tighten them up later if loosening them up
> turns out to not be a good idea.

You are preaching to the choir!
