FW: [Numpy-discussion] Bug: extremely misleading array behavior

Alexander Schmolck a.schmolck at gmx.net
Wed Jun 12 08:44:04 CDT 2002


"eric jones" <eric at enthought.com> writes:

> > Couldn't one have both consistency *and* efficiency by implementing a
> > copy-on-demand scheme (which is what matlab does, if I'm not entirely
> > mistaken; a real copy gets only created if either the original or the
> > 'copy' is modified)?
> 
> Well, slices creating copies is definitely a bad idea (which is what I
> have heard proposed before) -- finite difference calculations (and
> others) would be very slow with this approach.  Your copy-on-demand
> suggestion might work though.  Its implementation would be more complex,
> but I don't think it would require cooperation from the Python core.
> It could be handled in the ufunc code.  It would also require extension
> modules to make copies before they modified any values.  
> 
> Copy-on-demand doesn't really fit with Python's 'assignments are
> references' approach to things though, does it?  Using foo = bar in
> Python and then changing an element of foo will also change bar.  So, I

My suggestion wouldn't conflict with any standard Python behavior. Indeed,
the main motivation is to have numarray conform to it: ``foo = bar`` and
``foo = bar[20:30]`` would behave exactly as they do for other sequences in
Python. The first creates an alias to bar; in the second, the slicing
operation creates a copy of part of the sequence, which is then bound to foo.
Sequences are atomic in Python, in the sense that slicing them creates a new
object, and I think that is not in contradiction to Python's nice and
consistent 'assignments are references' behavior.
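
To make the point concrete, here is a minimal sketch using a plain Python
list (not Numeric or numarray); the variable names are only illustrative:

```python
bar = list(range(100))

# Plain assignment: foo is just another name for the same list object.
foo = bar
foo[0] = -1
alias_sees_change = (bar[0] == -1)

# Slicing a list builds a new list, so writing to it leaves bar alone.
part = bar[20:30]
part[0] = -2
original_untouched = (bar[20] == 20)
```

Under the scheme I have in mind, an array slice would behave like ``part``
above, not like ``foo``.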


> guess there would have to be a distinction made here.  This adds a
> little more complexity.
> 
> Personally, I like being able to pass views around because it allows for
> efficient implementations.  The option to pass arrays into extension
> functions and edit them in-place is very nice.  Copy-on-demand might
> allow for equal efficiency -- I'm not sure.

I don't know how much of a performance drawback copy-on-demand would have
compared to views; I'd suspect it would not be significant. The fact that the
runtime behavior becomes a bit more difficult to predict might be more of a
drawback (but then I haven't heard matlab users complain, and one could always
force an eager copy). Another reason why I think a copy-on-demand scheme for
slicing operations might be attractive is that I'd suspect one could gain
significant benefits from doing other operations lazily too, optionally
caching some results (at the moment, transposing seems to cause copies that
are in principle unnecessary, at least in some cases).
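
A rough sketch of what I mean by copy-on-demand, in pure Python. This is only
an illustration under my own assumptions: ``CowArray`` and ``_Buffer`` are
hypothetical names, it handles contiguous slices only, and it is not how
numarray or matlab actually implement anything. A slice shares storage with
its parent until one of the two is written to; the writer then takes its
private copy.

```python
class _Buffer:
    """Storage plus a count of the arrays currently sharing it (hypothetical helper)."""

    def __init__(self, items):
        self.items = items
        self.sharers = 1


class CowArray:
    """Hypothetical 1-d array whose slices are copied only on first write."""

    def __init__(self, values):
        self._buf = _Buffer(list(values))
        self._start = 0
        self._stop = len(self._buf.items)

    def __len__(self):
        return self._stop - self._start

    def _index(self, i):
        # Normalise a possibly negative index and bounds-check it.
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        return i

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            if step != 1:
                raise NotImplementedError("sketch handles contiguous slices only")
            stop = max(stop, start)
            # No data is copied here: the slice is a window onto the same buffer.
            new = CowArray.__new__(CowArray)
            new._buf = self._buf
            new._buf.sharers += 1
            new._start = self._start + start
            new._stop = self._start + stop
            return new
        return self._buf.items[self._start + self._index(key)]

    def __setitem__(self, i, value):
        i = self._index(i)
        if self._buf.sharers > 1:
            # First write while the storage is shared: make the real copy now.
            self._buf.sharers -= 1
            self._buf = _Buffer(self._buf.items[self._start:self._stop])
            self._start, self._stop = 0, len(self._buf.items)
        self._buf.items[self._start + i] = value

    def shares_storage_with(self, other):
        return self._buf is other._buf

    def tolist(self):
        return self._buf.items[self._start:self._stop]


bar = CowArray(range(100))
foo = bar[20:30]                               # cheap: no copy yet
shared_before = foo.shares_storage_with(bar)
foo[0] = -1                                    # triggers the copy of foo's 10 items
shared_after = foo.shares_storage_with(bar)
```

The sketch never decrements the sharer count when an array is garbage
collected, so it may copy more often than strictly needed; a real
implementation would have to track that, and extension modules would have to
go through the same write path.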

> 
> I haven't found the current behavior very problematic in practice and
> haven't seen it as a major stumbling block to new users.  I'm happy


