FW: [Numpy-discussion] Bug: extremely misleading array behavior

Alexander Schmolck a.schmolck at gmx.net
Tue Jun 11 16:03:02 CDT 2002


"eric jones" <eric at enthought.com> writes:


> I think the consistency with Python is less of an issue than it seems.
> I wasn't aware that add.reduce(x) would generate the same results as
> the Python version of reduce(add,x) until Perry pointed it out to me.
> There are some inconsistencies between Python the language and Numeric
> because of the needs of the Numeric community.  For instance, slices create
> views instead of copies as in Python.  This was a correct break with
> consistency in a very utilized area of Python because of efficiency.  

Ahh, a loaded example ;) I always thought that Numeric's view-slicing is a
fairly problematic deviation from standard Python behavior and I'm not
entirely sure why it needs to be done that way.
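To make the deviation concrete, here is a minimal sketch using only the
standard library. It is an illustration, not Numeric itself: a ``memoryview``
over an ``array.array`` stands in for a Numeric array because its slices are
also views, and the helper ``normalized_head`` is a hypothetical function
written with list semantics in mind.

```python
from array import array


def normalized_head(seq, n):
    """Return the first n items scaled so that the largest becomes 1.0.

    Written for list-like input: the slice is assumed to be a private copy.
    """
    head = seq[:n]
    peak = max(head)
    for i in range(len(head)):
        head[i] = head[i] / peak
    return head


# With a list the slice is a copy, so the caller's data is untouched.
data_list = [2.0, 4.0, 8.0]
normalized_head(data_list, 2)
print(data_list)            # [2.0, 4.0, 8.0]

# With a view-slicing sequence the same function silently edits its input.
data_view = memoryview(array('d', [2.0, 4.0, 8.0]))
normalized_head(data_view, 2)
print(data_view.tolist())   # [0.5, 1.0, 8.0]
```

The function is identical in both calls; only the slicing semantics of the
argument differ.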

Couldn't one have both consistency *and* efficiency by implementing a
copy-on-demand scheme (which is what matlab does, if I'm not entirely
mistaken; a real copy only gets created if either the original or the 'copy'
is modified)? The current behavior is problematic not just because it
breaks consistency and hence user expectations. It also breaks code that is
written with more pythonic sequences in mind (in a potentially hard to track
down manner) and is, IMHO, generally undesirable and error-prone, for pretty
much the same reasons that dynamic scope and global variables are generally
undesirable and error-prone -- one can unwittingly create intricate
interactions between remote parts of a program that can be very difficult to
track down.
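A copy-on-demand scheme of this kind could be sketched roughly as follows.
This is a toy, pure-Python, one-dimensional illustration and not how Numeric,
numarray or matlab are implemented; the names ``LazyVector`` and ``_Buffer``
are made up for the example. A slice shares storage with its parent until
either side is written to, at which point the writer takes a private copy.

```python
class _Buffer:
    """Storage that may be shared by several vectors (hypothetical helper)."""

    def __init__(self, items):
        self.items = list(items)
        self.sharers = 1


class LazyVector:
    """1-D vector whose slices are copied only when one side is modified."""

    def __init__(self, items):
        self._buf = _Buffer(items)
        self._start = 0
        self._stop = len(self._buf.items)

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            if step != 1:
                raise NotImplementedError("sketch handles unit steps only")
            piece = LazyVector.__new__(LazyVector)
            piece._buf = self._buf          # share storage, no copy yet
            piece._buf.sharers += 1
            piece._start = self._start + start
            piece._stop = self._start + max(start, stop)
            return piece
        # range() normalizes negative indices and raises IndexError for us.
        return self._buf.items[self._start + range(len(self))[key]]

    def __setitem__(self, index, value):
        index = range(len(self))[index]
        if self._buf.sharers > 1:
            # Someone else may still see this storage: copy before writing.
            self._buf.sharers -= 1
            self._buf = _Buffer(self._buf.items[self._start:self._stop])
            self._start, self._stop = 0, len(self._buf.items)
        self._buf.items[self._start + index] = value

    def tolist(self):
        return self._buf.items[self._start:self._stop]


v = LazyVector([1, 2, 3, 4])
s = v[1:3]              # cheap: shares storage with v
s[0] = 99               # first write triggers the real copy
print(v.tolist())       # [1, 2, 3, 4]
print(s.tolist())       # [99, 3]
```

The sketch never decrements the share count when a vector is discarded, so it
may copy more often than strictly necessary; a real implementation would have
to track that.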

Obviously there *are* cases where one really wants a (partial) view of an
existing array. It would seem to me, however, that these cases are exceedingly
rare (In all my Numeric code I'm only aware of one instance where I actually
want the aliasing behavior, so that I can manipulate a large array by
manipulating its views and vice versa).  Thus rather than being the default
behavior, I'd rather see those cases accommodated by a special syntax that
makes it explicit that an alias is desired and that care must be taken when
modifying either the original or the view (e.g. one possible syntax would be
``aliased_vector = m.view[:,1]``).  Again I think the current behavior is
somewhat analogous to having variables declared in global (or dynamic) scope
by default, which is not only error-prone but also masks those cases where
global (or dynamic) scope *is* actually desired and necessary.
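As a rough illustration of the proposed syntax, the sketch below gives a toy
one-dimensional container whose ordinary slices copy, while slicing through a
``view`` attribute returns an alias. Everything here is hypothetical
(``CopyingVector``, ``_ViewAccessor`` and their methods are not part of
Numeric or numarray), and it handles only unit-step 1-D slices rather than the
two-dimensional ``m.view[:,1]`` case.

```python
class CopyingVector:
    """Toy 1-D vector: v[a:b] copies, v.view[a:b] aliases."""

    def __init__(self, items):
        self._items = list(items)
        self._start = 0
        self._stop = len(self._items)

    @classmethod
    def _alias(cls, items, start, stop):
        # Build a vector that shares the given storage instead of copying it.
        obj = cls.__new__(cls)
        obj._items, obj._start, obj._stop = items, start, stop
        return obj

    def _bounds(self, key):
        start, stop, step = key.indices(len(self))
        if step != 1:
            raise NotImplementedError("sketch handles unit steps only")
        return self._start + start, self._start + max(start, stop)

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, key):
        if isinstance(key, slice):
            lo, hi = self._bounds(key)
            return CopyingVector(self._items[lo:hi])    # default: a copy
        return self._items[self._start + range(len(self))[key]]

    def __setitem__(self, index, value):
        self._items[self._start + range(len(self))[index]] = value

    @property
    def view(self):
        return _ViewAccessor(self)

    def tolist(self):
        return self._items[self._start:self._stop]


class _ViewAccessor:
    """Returned by .view; slicing it yields an explicit alias."""

    def __init__(self, owner):
        self._owner = owner

    def __getitem__(self, key):
        lo, hi = self._owner._bounds(key)
        return CopyingVector._alias(self._owner._items, lo, hi)


m = CopyingVector([1, 2, 3, 4])

copied = m[1:3]
copied[0] = 0            # does not touch m
print(m.tolist())        # [1, 2, 3, 4]

aliased = m.view[1:3]
aliased[0] = 20          # explicitly requested alias: m changes too
print(m.tolist())        # [1, 20, 3, 4]
```

The point of the design is that the aliasing case becomes visible at the call
site, while the default matches ordinary Python sequence slicing.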

It might be that the problems associated with a copy-on-demand scheme
outweigh the error-proneness and the interface breakage that the deviation
from standard Python slicing behavior causes, but otherwise copying on slicing
would be a backwards incompatibility in numarray I'd rather like to see
(especially since one could easily add a view attribute to Numeric, for
forwards-compatibility). I would also suspect that this would make it *a lot*
easier to get numarray (or parts of it) into the core, but this is just a
guess.


> 
> I don't see choosing axis=-1 as a break with Python -- multi-dimensional
> arrays are inherently different and used differently than lists of lists
> in Python.  Further, reduce() is a "corner" of the Python language that
> has been superseded by list comprehensions.  Choosing an alternative

Guido might nowadays think that adding reduce was a mistake, so in that sense
it might be a "corner" of the Python language (although some people, including
me, still rather like using reduce), but I can't see how you can generally
replace reduce with anything but a loop. Could you give an example?
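For reference, a small sketch of the reduce-versus-loop equivalence. It
assumes a modern Python 3, where reduce lives in ``functools`` rather than
being a builtin; the helper ``fold`` is a made-up name for the hand-written
loop. A list comprehension builds a new list element by element, whereas
reduce threads an accumulator through the sequence, which is what the explicit
loop does.

```python
from functools import reduce
from operator import add, mul

values = [1, 2, 3, 4]

# reduce folds the sequence into a single value.
total = reduce(add, values)
product = reduce(mul, values, 1)


def fold(function, items, initial):
    """Hand-written loop equivalent of reduce(function, items, initial)."""
    result = initial
    for item in items:
        result = function(result, item)
    return result


print(total)                    # 10
print(product)                  # 24
print(fold(mul, values, 1))     # 24
```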


alex

-- 
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/




