[Numpy-discussion] copy on demand

Perry Greenfield perry at stsci.edu
Fri Jun 14 08:02:04 CDT 2002

<Alexander Schmolck writes>:
    <Perry Greenfield writes>:
> > I guess that depends on what you mean by unnecessary copies.
> In most cases the array of which I desire a flattened representation is
> contiguous (plus, I usually don't intend to modify it). Consequently, in
> most cases I don't want any copies of it to be created (especially not if
> it is really large -- which is not seldom the case).
Numarray already returns a view of the array if it is contiguous.
Copies are only produced if it is non-contiguous. I assume that
is the behavior you are asking for?
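
To make the view-versus-copy distinction concrete, here is a minimal sketch. It uses NumPy as a stand-in (an assumption: this is not numarray code, though NumPy's ravel follows the same rule of returning a view when the data is contiguous and a copy otherwise).

```python
import numpy as np  # assumption: NumPy stands in for numarray here

a = np.arange(12).reshape(3, 4)      # freshly created, C-contiguous
flat_a = np.ravel(a)                 # no copy needed for contiguous data
print(np.shares_memory(a, flat_a))   # True: flat_a is a view of a

b = a[:, ::2]                        # strided slice, no longer contiguous
flat_b = np.ravel(b)                 # a copy is unavoidable here
print(np.shares_memory(b, flat_b))   # False: flat_b owns its own data
```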

> The fact that you can never really be sure whether you can actually use
> ``.flat``, without checking beforehand if the array is in fact contiguous
> (I don't think there are many guarantees about something being contiguous,
> or are there?) and that ravel will always work but has a huge overhead,
> suggests to me that something is not quite right.
Not for numarray, at least in this context.

> > If the array is non-contiguous what would you have it do?
> Simple -- in that case 'lazy ravel' would do the same as 'ravel' currently
> does, create a copy (or alternatively rearrange the memory representation
> to make it contiguous and then create a lazy copy, but I don't know whether
> this would be a good or even feasible idea).
> A lazy version of ravel would have the same semantics as ravel but only
> create an actual copy if necessary -- which means as long as no
> modification takes place and the array is contiguous, it will be sufficient
> to return the ``.flat`` (for starters). If it is non-contiguous then the
> copying can't be helped, but these cases are rare and currently you either
> have to test for them explicitly or slow everything down and waste memory
> by just always using ``ravel()``.
Currently for numarray .flat will fail if the array isn't contiguous. It
isn't clear if this should change. If .flat is meant to be a view always,
then it should always fail if the array is not contiguous. Ravel is not
guaranteed to be a view.
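
The "always a view, or fail" rule could be sketched as below. This is only an illustration in NumPy, not numarray's implementation; the helper name flat_view is hypothetical.

```python
import numpy as np  # assumption: NumPy stands in for numarray here

def flat_view(arr):
    """Hypothetical helper: return a flattened view, or fail. Never copies."""
    if not arr.flags.c_contiguous:
        raise ValueError("array is not contiguous; no flat view is possible")
    return arr.reshape(-1)   # reshaping a C-contiguous array yields a view

a = np.zeros((3, 4))
v = flat_view(a)
v[0] = 1.0                   # writes through to a, because v is a view
```

Calling flat_view(a[:, ::2]) raises ValueError instead of silently copying, which is the strict behavior described above.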

This is a problematic issue if we decide to switch from view to copy
semantics. If slices produce copies, then does .flat? If so, then
how does one produce a flattened view? x.view.flat?

> For example, if bar is contiguous ``foo = ravel(bar)`` would be
> computationally equivalent to ``bar.flat``, as long as neither of them is
> modified, but semantically equivalent to the current ``foo = ravel(bar)``
> in all cases.
> Thus you could now write:
> >>> a = ravel(a)[20:]
> wherever you've written this boiler-plate code before:
> >>> if a.iscontiguous():
> ...     a = a.flat[20:]
> ... else:
> ...     a = ravel(a)[20:]
> without any loss of performance.
I believe this is already true in numarray.
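
For reference, the copy-on-demand idea in the quoted text could be sketched roughly as follows. The class LazyRavel is hypothetical (it is not a numarray or NumPy API), and NumPy is assumed as a stand-in for numarray.

```python
import numpy as np  # assumption: NumPy stands in for numarray here

class LazyRavel:
    """Hypothetical copy-on-demand flattening: copy only on first write."""

    def __init__(self, arr):
        self._data = np.ravel(arr)   # view if contiguous, copy otherwise
        self._shared = np.shares_memory(self._data, arr)

    def __getitem__(self, index):
        return self._data[index]     # reads never trigger a copy

    def __setitem__(self, index, value):
        if self._shared:             # first write: detach from the source
            self._data = self._data.copy()
            self._shared = False
        self._data[index] = value

bar = np.arange(6).reshape(2, 3)
foo = LazyRavel(bar)
foo[0] = 99                          # triggers the copy; bar is untouched
```

Note that this sketch only guards writes made through the wrapper. A write to bar itself before the copy happens would still show up in foo, so real copy-on-demand would need support inside the array type.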

> I personally don't find it messy.  And please keep in mind that the
> ``view`` construct would only very seldom be used if copy-on-demand is the
> default -- as I said, I've only needed the aliasing behavior once -- no
> doubt it was really handy then, but the fact that e.g. matlab doesn't have
> anything along those lines (AFAIK) suggests that many people will never
> need it.
You're kidding, right? Particularly after arguing for aliasing
semantics in the previous paragraph for .flat ;-)

> Also what exactly is the confused person's notion of the purpose of
> ``x = a.view`` supposed to be? That ``x = a`` is what ``x = a.copy()``
> really does and that to create an alias to ``a`` they would have to use
> ``x = a.view``? In that case they'd better read the python tutorial before
> they do any more python programming, because they are in for all kinds of
> unpleasant surprises (``a = [0, 0]; b = a; b[1] = 3; print a`` -- oops).
This is basically true, though the confusion may be that a.view is
an array object that has different slicing behavior instead of
a non-array object that can be sliced to produce a view. I don't
view it as a major issue but I do see how one may mistakenly infer that.

