[Numpy-discussion] Broadcasting rules (Ticket 76).
Christopher Barker
Chris.Barker at noaa.gov
Thu Apr 27 00:00:05 CDT 2006
As Sasha quite clearly pointed out, when you do aggregation, you really
do want to reduce the dimensionality of your data. In fact, that's
something that always bit me with MATLAB. If I had a matrix that
happened to have a dimension of 1, MATLAB would interpret it as a
vector. I ended up writing functions like "SumColumns" that would check
if it was a single row vector before calling sum, so that I wouldn't
suddenly get a scalar result if a matrix happened to have one row.
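
For comparison, here is a minimal sketch of the same situation in current NumPy (the names single_row, column_sums and total are made up for illustration). With an explicit axis, the sum drops exactly that axis, so a one-row matrix still gives a rank-1 result rather than a scalar:

```python
import numpy as np

# A "matrix" that happens to have only one row.
single_row = np.arange(5).reshape(1, 5)

# Summing along axis 0 removes that axis only, so the result is a
# rank-1 array of length 5, even though there is just one row.
column_sums = single_row.sum(axis=0)

# Summing with no axis argument collapses every axis to a scalar.
total = single_row.sum()

print(column_sums.shape)
print(total)
```
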
Once you reduce dimensionality with aggregating functions, I can see how
it would be natural to want to use broadcasting to merge the reduced
data and full data. However, I can't see how you could do that cleanly.
How is the code to know whether a rank-1 array represents a column or
row when multiplied with a rank-2 array? There is simply no way to know,
in general. I suppose we could define a convention, like:
"rank-1 arrays will be interpreted as row vectors for broadcasting."
etc. for higher dimensions.
However, I've found that even in my code, I don't find one convention
always makes the most sense for all applications, so I'm just as happy
to make it clear with a lot of calls like:
v.shape = (-1, 1)
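
A short sketch of what that looks like when merging reduced data back into the full array, assuming current NumPy and made-up names (m, row_means, col_means). Per-column means line up with the last axis and broadcast directly, while per-row means have to be turned into a column first. Note that reshape returns a new array, whereas assigning to .shape changes the array in place:

```python
import numpy as np

m = np.arange(12, dtype=float).reshape(3, 4)

col_means = m.mean(axis=0)  # shape (4,), one value per column
row_means = m.mean(axis=1)  # shape (3,), one value per row

# A rank-1 array is matched against the last axis, so the per-column
# means can be subtracted directly.
centered_cols = m - col_means

# The per-row means do not line up with the last axis (3 vs. 4), so
# using them as-is raises an error.
try:
    m - row_means
    direct_worked = True
except ValueError:
    direct_worked = False

# Making the orientation explicit turns them into a (3, 1) column.
row_means_col = row_means.reshape(-1, 1)
centered_rows = m - row_means_col
```
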
NOTE:
It appears that numpy does, in fact, use such a convention:
>>> v = N.arange(5)
>>> m = N.ones((5,5))
>>> v * m
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
>>> v.shape = (-1,1)
>>> v * m
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])
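
The same check can be written out with the orientation made explicit. This is only a sketch against current NumPy (imported as np rather than N, and with illustrative variable names): the implicit rank-1 case should match an explicit (1, 5) row, and the column case should be its transpose.

```python
import numpy as np

v = np.arange(5)
m = np.ones((5, 5))

# Rank-1 operand used as-is: shapes are matched from the last axis.
implicit = v * m

# The same operand with its orientation spelled out.
explicit_row = v.reshape(1, -1) * m   # shape (1, 5) against (5, 5)
explicit_col = v.reshape(-1, 1) * m   # shape (5, 1) against (5, 5)

# Shape that results from broadcasting v against m.
result_shape = np.broadcast(v, m).shape

print(np.array_equal(implicit, explicit_row))
print(np.array_equal(explicit_col, implicit.T))
```
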
So what's the disagreement about?
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov