[Numpy-discussion] Ransom Proposals

Tim Hochberg tim.hochberg at cox.net
Fri Mar 24 18:30:02 CST 2006


Now that the fortran keyword is dead, I'll take time off from rejoicing 
to briefly go over the other proposals that were involved in that 
thread, lest they get lost in the shuffle.

* reshape -- That whole, enormous thread originated with reshape and 
what its return behaviour should be. Currently it returns a view if 
possible; otherwise it returns a copy. That's evil behaviour. I believe 
that it should either always return a copy, or return a view if 
possible and otherwise raise an exception. I think I actually prefer the 
latter, since it's more flexible when combined with asarray and 
friends, although it's probably the more disruptive change. One 
possible compromise would be to have numpy.reshape always copy, while 
array.reshape always returns a view.
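
As a rough illustration of the second option (a view if possible, 
otherwise an exception), here is a minimal sketch. The helper name 
reshape_view_or_raise is hypothetical, not an existing numpy function, 
and the sketch assumes a numpy recent enough to provide 
np.shares_memory. It still performs the copy before noticing and 
raising, so it shows the semantics only, not an efficient 
implementation.

```python
import numpy as np

def reshape_view_or_raise(a, shape):
    """Return a reshaped view of 'a'; raise instead of silently copying.

    Hypothetical helper for illustration, not part of numpy.
    """
    b = a.reshape(shape)
    # If reshape had to copy, the result no longer overlaps 'a'.
    # Zero-size arrays never overlap anything, so let them through.
    if b.size and not np.shares_memory(a, b):
        raise ValueError("reshape would require a copy")
    return b

a = np.arange(6)
v = reshape_view_or_raise(a, (2, 3))   # contiguous input: a view
v[0, 0] = 99                           # writes through to 'a'

t = a.reshape(2, 3).T                  # transposed, so not contiguous
try:
    reshape_view_or_raise(t, (6,))     # cannot be done without copying
    raised = False
except ValueError:
    raised = True
```

With this behaviour, a caller who gets a result back knows that it 
shares data with the input, and a caller who wants a copy asks for one 
explicitly.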

* ascontiguous -- In private email Chris Barker mentioned that the name 
ascontiguous was confusing, or at least not to his taste, and suggested 
"something like" ascontiguous_array. I don't like that one, but it might 
be worth considering something that matches asarray and asanyarray. 
ascontigarray looks okay to me, but it's quite possible that I've been 
staring at this too long and that's just cryptic.

* ndmin -- I still think ndmin should be spun off to a separate 
function. It's trivial to implement such a function; call it paddims 
for lack of a better name (asatleastndimensionarray seems kind of long 
and cryptic!). It should have minimal performance impact since no data 
gets copied, and if it makes a difference I would be willing to 
implement it in C if need be so that the performance impact is 
minimized.

    def paddims(a, n):
        "return a view of 'a' with at least 'n' dimensions"
        dims = a.shape
        b = a.view()
        b.shape = (1,)*(n - len(dims)) + dims
        return b
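
A short usage sketch of the idea follows. To keep it self-contained it 
repeats paddims, but as a variant that uses reshape rather than 
assigning to .shape; that substitution is an assumption of this 
sketch, chosen because prepending length-1 axes never requires a copy. 
The comparison with array(..., ndmin=3) only checks the resulting 
shape.

```python
import numpy as np

def paddims(a, n):
    "return a view of 'a' with at least 'n' dimensions"
    # Variant of the function above: prepend length-1 axes via reshape.
    # If 'a' already has n or more dimensions, the shape is unchanged.
    return a.reshape((1,) * (n - a.ndim) + a.shape)

a = np.arange(3)
b = paddims(a, 3)        # padded on the left to three dimensions
c = paddims(a, 1)        # already has enough dimensions

# Same shape as the ndmin keyword would produce.
same_as_ndmin = b.shape == np.array(a, ndmin=3).shape

b[0, 0, 0] = 42          # 'b' is a view, so this writes through to 'a'
```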

* copy -- Yes, I understand that the copy flag is probably not going to 
change for backward compatibility reasons if nothing else, but there was 
one last point I wanted to make about the copy flag. One of the warts 
that results from the copy flag, and I think that this is a common 
problem for functions that take parameters that switch their mode of 
operation, is that some combinations of input become nonsensical. 
Consider these five possibilities:

array(anarray, dtype=matchingtype, order=matchingorder, copy=True) # OK; copied
array(anarray, dtype=matchingtype, order=matchingorder, copy=False) # OK; not copied
array(anarray, dtype=nonmatchingtype, order=nonmatchingorder, copy=True) # OK; copied
array(anarray, dtype=nonmatchingtype, order=nonmatchingorder, copy=False) # Ugly; copied
array(nonarray, dtype=whatever, order=whatever, copy=False) # Ugly; copied

[Note that I've folded nonmatchingtype and nonmatchingorder together 
since they have the same effect]

Of these five possibilities, two have results where the arguments and 
the action taken become uncoupled. One way to address this would be to 
change the name of the copy flag to something that matches reality: 
force_copy. However, that seems kind of pointless, since it still 
leaves the underlying problem that some of the modes the array 
function can operate in are kind of bogus. Compare this to the case 
where the two primitives are array and asarray:

array(anarray, dtype=matchingtype, order=matchingorder) # copied
array(anarray, dtype=nonmatchingtype, order=nonmatchingorder) # copied

asarray(anarray, dtype=matchingtype, order=matchingorder) # not copied
asarray(anarray, dtype=nonmatchingtype, order=nonmatchingorder) # copied
asarray(nonarray, dtype=whatever, order=whatever) # copied

There are still five cases, so the interface hasn't narrowed any[*], but 
all the possible argument combinations make sense (or raise a 
straightforward error). And think how much easier this behaviour is to 
explain!
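
The split between the two primitives can be checked directly. This is 
a minimal sketch using the variable names from the listings above; the 
order argument is left out for brevity, and float64 and float32 stand 
in for the matching and non-matching types.

```python
import numpy as np

anarray = np.arange(4, dtype=np.float64)
nonarray = [0.0, 1.0, 2.0, 3.0]

# array: always copies, whether or not the dtype already matches.
copy_match = np.array(anarray, dtype=np.float64)
copy_nomatch = np.array(anarray, dtype=np.float32)

# asarray: passes the input through when nothing needs to change...
same = np.asarray(anarray, dtype=np.float64)
# ...and copies only when a conversion is unavoidable.
converted = np.asarray(anarray, dtype=np.float32)
fromlist = np.asarray(nonarray)
```

Here same is the very object that was passed in, while the other four 
results own fresh data.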

Anyway that's it for now and hopefully for a while.

Regards,

-tim


[*] In reality it does narrow the interface because we already have 
asarray, but now you really need asarray whereas before it was really 
just shorthand for array(copy=False).






