[Numpy-discussion] Shouldn't all in-place operations simply return self?

Dag Sverre Seljebotn d.s.seljebotn@astro.uio...
Thu Jan 17 12:08:36 CST 2013


On 01/17/2013 05:33 PM, Nathaniel Smith wrote:
> On Thu, Jan 17, 2013 at 2:32 PM, Alan G Isaac <alan.isaac@gmail.com> wrote:
>> Is it really better to have `permute` and `permuted`
>> than to add a keyword?  (Note that these are actually
>> still ambiguous, except by convention.)
>
> The convention in question, though, is that of English grammar. In
> practice everyone who uses numpy is a more-or-less skilled English
> speaker in any case, so re-using the conventions is helpful!
>
> "Shake the martini!" <- an imperative command
>
> This is a complete statement all by itself. You can't say "Hand me the
> shake the martini". In procedural languages like Python, there's a
> strong distinction between statements (whole lines, a = 1), which only
> matter because of their side-effects, and expressions (a + b) which
> have a value and can be embedded into a larger statement or expression
> ((a + b) + c). "Shake the martini" is clearly a statement, not an
> expression, and therefore clearly has a side-effect.
>
> "shaken martini" <- a noun phrase
>
> Grammatically, this is like plain "martini", you can use it anywhere
> you can use a noun. "Hand me the martini", "Hand me the shaken
> martini". In programming terms, it's an expression, not a statement.
> And side-effecting expressions are poor style, because when you read
> procedural code, you know each statement contains at least 1
> side-effect, and it's much easier to figure out what's going on if
> each statement contains *exactly* one side-effect, and it's the
> top-most operation.
>
> This underlying readability guideline is actually baked much more
> deeply into Python than the sort/sorted distinction -- this is why in
> Python, 'a = 1' is *not* an expression, but a statement. C allows you
> to say things like "b = (a = 1)", but in Python you have to say "a =
> 1; b = a".
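
[Illustration: a minimal sketch of the statement/expression split described above, using only the standard-library `ast` module. The helper name `is_valid_expression` is made up for this example.]

```python
import ast

def is_valid_expression(source):
    """Return True if `source` parses as a single Python expression."""
    try:
        ast.parse(source, mode="eval")  # "eval" mode accepts expressions only
    except SyntaxError:
        return False
    return True

# An assignment parses as a statement node, never as an expression node.
assign_node = ast.parse("a = 1").body[0]
assign_is_statement = isinstance(assign_node, ast.stmt)

print(is_valid_expression("(a + b) + c"))   # True
print(is_valid_expression("a = 1"))         # False
print(is_valid_expression("b = (a = 1)"))   # False: the C idiom is a syntax error
```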
>
>> Btw, two separate issues seem to be running side by side.
>>
>> i. should in-place operations return their result?
>> ii. how can we signal that an operation is inplace?
>>
>> I expect NumPy to do inplace operations when feasible,
>> so maybe they could take an `out` keyword with a None default.
>> Possibly recognize `out=True` as asking for the original array
>> object to be returned (mutated); `out='copy'` as asking for a copy to
>> be created, operated upon, and returned; and `out=a` to ask
>> for array `a` to be used for the output (without changing
>> the original object, and with a return value of None).
>
> Good point that numpy also has a nice convention with out= arguments
> for ufuncs. I guess that convention is, by default return a new array,
> but also allow one to modify the same (or another!) array in-place, by
> passing out=. So this would suggest that we'd have
>    b = shuffled(a)
>    shuffled(a, out=a)
>    shuffled(a, out=b)
>    shuffle(a) # same as shuffled(a, out=a)
> and if people are bothered by having both 'shuffled' and 'shuffle',
> then we drop 'shuffle'. (And the decision about whether to include the
> imperative form can be made on a case-by-case basis; having both
> shuffled and shuffle seems fine to me, but probably there are other
> cases where this is less clear.)
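
[Illustration: one way the four calls listed above could behave, written as a rough sketch rather than an actual NumPy API. `shuffled`, `shuffle`, and the `rng` parameter are hypothetical names. Returning None whenever `out=` is given is an assumption taken from the compromise suggested in this thread, not something NumPy does for existing ufuncs.]

```python
import numpy as np

def shuffled(a, out=None, rng=None):
    """Hypothetical: shuffle `a` along its first axis.

    With out=None a new array is returned and `a` is untouched.
    With out=<array> the result is written there and None is returned.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # Fancy indexing builds a fresh array, so out=a cannot alias the input.
    result = a[rng.permutation(a.shape[0])]
    if out is None:
        return result
    out[...] = result
    return None

def shuffle(a, rng=None):
    """Hypothetical imperative form: same as shuffled(a, out=a)."""
    shuffled(a, out=a, rng=rng)

a = np.arange(10)
b = shuffled(a)        # new array, a unchanged
shuffled(a, out=b)     # writes into b, returns None
shuffle(a)             # mutates a, returns None
```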

In addition to the verb tense, I think it's important that mutators are 
methods whereas functions do not mutate their arguments:

lst.sort()
sorted(lst)

So -1 on shuffle(a) and a.shuffled().
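
For reference, a short sketch of how the two built-in spellings behave; the variable names are only for illustration:

```python
lst = [3, 1, 2]

new = sorted(lst)        # function: returns a new list, argument untouched
before = list(lst)       # snapshot to show lst has not changed yet

ret = lst.sort()         # method: mutates lst in place and returns None

print(new)      # [1, 2, 3]
print(before)   # [3, 1, 2]
print(ret)      # None
print(lst)      # [1, 2, 3]
```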

Dag Sverre

>
> There is also an argument that if out= is given, then we should always
> return None, in general. I'm having a lot of trouble thinking of any
> situation where it would be acceptable style (or even useful) to write
> something like:
>    c = np.add(a, b, out=a) + 1
> But, 'out=' is very large and visible (which makes the readability
> less terrible than it could be). And np.add always returns the out
> array when working out-of-place (so there's at least a weak
> countervailing convention). So I feel much more strongly that
> shuffle() should return None, than I do that np.add(out=...) should
> return None.
>
> A compromise position would be to make all new functions that take
> out= return None when out= is given, while leaving existing ufuncs and
> such as they are for now.
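
[Illustration: a small sketch of the existing ufunc behavior discussed above. When `out=` is passed, `np.add` hands back that same array object, which is why the chained form works at all. The array values are made up for the example.]

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

ret = np.add(a, b, out=a)   # a is overwritten with a + b
print(ret is a)             # True: the return value is the out array itself
print(a.tolist())           # [11, 22, 33]

# The style questioned above: a side effect buried inside an expression.
c = np.add(a, b, out=a) + 1
print(a.tolist())           # [21, 42, 63]
print(c.tolist())           # [22, 43, 64]
```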
>
> -n
>


