[Numpy-discussion] copy on demand

Alexander Schmolck a.schmolck at gmx.net
Sun Jun 16 15:59:02 CDT 2002


"Perry Greenfield" <perry at stsci.edu> writes:

> <Alexander Schmolck writes>:
>     <Perry Greenfield writes>:
> > > I guess that depends on what you mean by unnecessary copies.
> >
> > In most cases the array of which I desire a flattened representation is
> > contiguous (plus, I usually don't intend to modify it). Consequently, in
> > most cases I don't want any copies of it to be created (especially not if
> > it is really large -- which is not seldom the case).
> >
> Numarray already returns a view of the array if it is contiguous.
> Copies are only produced if it is non-contiguous. I assume that
> is the behavior you are asking for?

Not at all -- in fact I was rather shocked when my attention was drawn to the
fact that this is also the behavior of Numeric -- I had thought that ravel
would *always* create a copy. I absolutely agree with the other posters that
remarked that different behavior of ravel (creating a copy vs creating a view,
depending on whether the argument is contiguous) is highly undesirable and
error-prone (especially since it is not even possible to determine at compile
time which behavior will occur, if I'm not mistaken). In fact, I think this
behavior is worse than what I incorrectly assumed to be the case.
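
To make the problem concrete, here is a minimal sketch. It assumes present-day
NumPy rather than the Numeric/numarray releases discussed in this thread;
``np.ravel`` there likewise returns a view when it can and a copy otherwise.
The array names are purely illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C-contiguous
t = a.T                          # transposed view of a, not C-contiguous

ra = np.ravel(a)                 # contiguous input: a view is returned
rt = np.ravel(t)                 # non-contiguous input: a copy is returned

ra_is_view = np.shares_memory(ra, a)
rt_is_view = np.shares_memory(rt, t)

ra[0] = 99    # writes through to a (and therefore to t)
rt[1] = -1    # changes only the copy; t is untouched
```

The two assignments look identical, yet only the first one alters the array
that was passed to ravel; which of the two happens depends on the memory
layout of the argument at runtime.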

What I was arguing for is a ravel that always has the same semantics (namely
creating a copy) but that -- because it would create the copy only on demand --
would be just as efficient as using .flat when

a) its argument were contiguous; and
b) neither the result nor the argument were modified while both are alive.
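
A rough sketch of what such copy-on-demand semantics could look like follows.
``CowArray`` is a hypothetical wrapper invented for illustration; it is not part
of Numeric, numarray or NumPy, and it uses present-day NumPy for the underlying
storage. It also ignores garbage collection (the sharing counter is never
decremented when a wrapper dies) and hands out raw arrays on reads, so it only
shows the idea.

```python
import numpy as np

class CowArray:
    """Hypothetical copy-on-demand wrapper around a NumPy array."""

    def __init__(self, data, owners=None):
        self._data = data
        # One-element list shared by every wrapper that aliases the same buffer.
        self._owners = owners if owners is not None else [1]

    def ravel(self):
        if self._data.flags["C_CONTIGUOUS"]:
            # Share the buffer for now; the copy is deferred until a write.
            self._owners[0] += 1
            return CowArray(self._data.reshape(-1), self._owners)
        # Non-contiguous: flatten() always copies, so the result is independent.
        return CowArray(self._data.flatten())

    def __getitem__(self, idx):
        return self._data[idx]

    def __setitem__(self, idx, value):
        if self._owners[0] > 1:
            # The buffer is shared: detach by taking a private copy first.
            self._owners[0] -= 1
            self._data = self._data.copy()
            self._owners = [1]
        self._data[idx] = value


a = CowArray(np.arange(6).reshape(2, 3))
r = a.ravel()      # no copy yet
r[0] = 99          # first write triggers the copy; a is unaffected
```

With this scheme the result always behaves like a copy, whether the write goes
to the result or to the argument, and no data is duplicated as long as
conditions a) and b) hold.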

The reason that I view ``.flat`` as a hack is that it is an operation that is
there exclusively for efficiency reasons and has no well defined semantics --
it will only work stochastically, giving better performance in certain
cases. Thus you have to decide at runtime whether you can actually use it
(by calling .iscontiguous) and always have a fall-back scheme (most likely using
ravel) at hand -- there seems to be no way to determine at compile time what's
going to happen.

I don't think a language or a library should have any such constructs, or it
should at least strive to minimize their number. The fact that the current
behavior of ravel actually achieves the effect I want in most cases doesn't
justify its obscure behavior in my eyes, which translates into a variation of
the boiler-plate code previously mentioned (``if a.iscontiguous(): ... else:
...``) when you actually want a *single* ravelled copy. It is also a very
likely candidate for extremely hard to find bugs.
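
For illustration, the boiler-plate needed to get exactly one independent
ravelled copy might look like the sketch below. It assumes present-day NumPy
(not the Numeric API of this thread), and ``ravelled_copy`` is a hypothetical
helper name. The aliasing test plays the role of the ``iscontiguous`` check.

```python
import numpy as np

def ravelled_copy(a):
    """Hypothetical helper: flat array that never aliases `a`, copied once."""
    flat = np.ravel(a)             # view if possible, copy otherwise
    if np.shares_memory(flat, a):
        flat = flat.copy()         # it was a view, so copy explicitly
    return flat


a = np.arange(6).reshape(2, 3)
f = ravelled_copy(a)      # contiguous input: ravel gave a view, so we copied
g = ravelled_copy(a.T)    # non-contiguous input: ravel already copied
f[0] = 9                  # a is unaffected
```

In present-day NumPy ``a.flatten()`` is documented to always return a copy,
which gives the same result without the branch.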

One nice thing about python is that there is very little undefined
behavior. I'd like to keep it that way.

[snipped]
> > I personally don't find it messy.  And please keep in mind that the
> > ``view`` construct would only very seldom be used if copy-on-demand is the
> > default -- as I said, I've only needed the aliasing behavior once -- no
> > doubt it was really handy then, but the fact that e.g. matlab doesn't have
> > anything along those lines (AFAIK) suggests that many people will never
> > need it.
> >
> You're kidding, right? Particularly after arguing for aliasing
> semantics in the previous paragraph for .flat ;-)

I didn't argue for any semantics of ``.flat`` -- I just pointed out that I
found the division of labour that I (incorrectly) assumed to be the case an
ugly hack (for the reasons outlined above):

``ravel``: always works, but always creates copy
            (which might be undesirable wastage of resources); [this was
            mistaken; the real semantics are: always works, creates view if
            contiguous, copy otherwise]

``.flat``: behavior undefined at compile time, a runtime-check can be used
           to ensure that it can be used as a more efficient alternative to
           ``ravel`` in some cases.

If I now understand the behavior of both ``ravel`` and ``.flat`` correctly
then I can't currently see *any* raison d'être for a ``.flat`` attribute.

If, as I would hope, the behavior of ravel is changed to always create copies
(ideally on-demand), then matters might look different. In that case, it might
be justifiable to have ``.flat`` as a specialized construct analogous to what
I proposed as ``.view``, but only if there is some way to make it work (the
same) for both contiguous and non-contiguous arrays. I'm not sure that it
would be needed at all (especially with a lazy ravel).

alex

-- 
Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/




