[Numpy-discussion] Ransom Proposals

Tim Hochberg tim.hochberg at cox.net
Mon Mar 27 12:18:07 CST 2006

Travis Oliphant wrote:

> Tim Hochberg wrote:
>>> The transpose function handles a list and returns an array.  
>> Yes. And this is exactly the same property that I've been ranting 
>> about incessantly for the past several days in the case of reshape. 
>> It seems convenient, but it's evil. It leads to intermittent, hard to 
>> find bugs, since when passed a list it necessarily creates a new 
>> array object, but when passed an array, it returns a view.
> And this is also the area where Tim is going to find people that 
> disagree.  The functional interfaces have always been there so that 
> other objects could be manipulated as arrays.  There is *a lot* of 
> precedent behind the functional interfaces.  It seems like a big 
> change to simply get rid of them.

Yes. I certainly hope to get disagreement. While I'm certain that I'm 
right, this is a big change and I would feel uncomfortable if it were 
adopted without being thoroughly thrashed out. This way, if it does get 
adopted, I'll feel comfortable that it was fully vetted. And if it 
doesn't, I get to say I told you so every time it bites someone ;) And 
who knows, perhaps you can change my mind.

> Yes, having this ability means that you have to think about it a bit 
> if you are going to use the functional interface and try to do 
> in-place operations.  But, I would argue that this is an "advanced" 
> usage which is implemented to save space and time.

How is this true though? In what way, for instance, is:

    b = asarray(a).reshape(newshape)

slower or less space efficient than today's:

    b = reshape(a, newshape)

? If the answer is just the overhead of the call to asarray, I'll 
rewrite it in C for you at which point the overhead will truly be 
negligible. [Note I don't encourage the former style. In general I think 
all objects you want to use as arrays should be converted at the 
function boundaries.]
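
To make the comparison concrete, here is a minimal sketch, assuming a 
current NumPy imported as np (the names data and arr are made up for 
illustration). It suggests that the two spellings give the same result, 
and it also shows the list-versus-array asymmetry: given a list, reshape 
has to build a new array, so writes through the result never reach the 
original.

```python
import numpy as np

data = [0, 1, 2, 3, 4, 5]      # a plain list (hypothetical input)
arr = np.arange(6)             # an actual array

# Functional form and "convert, then call the method" give equal results.
via_function = np.reshape(data, (2, 3))
via_method = np.asarray(data).reshape((2, 3))
same_result = np.array_equal(via_function, via_method)

# Passed a list, reshape must create a new array: the list is untouched.
via_function[0, 0] = 999
list_unchanged = (data[0] == 0)

# Passed a contiguous array, reshape returns a view: the write shows through.
view = np.reshape(arr, (2, 3))
view[0, 0] = 999
array_changed = (arr[0] == 999)
```

All three flags come out true here, which is the whole problem: the same 
call either aliases its argument or doesn't, depending on what it was 
handed.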

In fact, I'd argue just the opposite. The functions encourage people to 
*not* convert objects to arrays at the boundaries of their functions. As 
a result, in addition to getting subtle, intermittent bugs, they tend to 
end up converting from objects to arrays multiple times inside a given 
function, wasting both space and time.
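
As a rough illustration of converting at the boundary, something like 
the following sketch would do. The helper normalize_rows is hypothetical 
and exists only for this example; the point is that asarray is called 
exactly once, and everything after that line can assume a real array.

```python
import numpy as np

def normalize_rows(x):
    """Hypothetical helper: scale each row of a 2-D input to sum to 1."""
    x = np.asarray(x, dtype=float)         # single conversion at the boundary
    totals = x.sum(axis=1, keepdims=True)  # methods only from here on
    return x / totals

# Lists, tuples, and arrays are all accepted by the caller-facing function.
result = normalize_rows([[1, 3], [2, 2]])
```

If x is already a float array, asarray hands it back without copying, so 
the conversion costs next to nothing in the common case.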

Let me muddy the waters a bit by saying that the subtle bugs you can get 
from reshape accepting lists pale next to the issue that whether you get 
a view or a copy depends on what the strides of the array are. Who keeps 
track of strides? I'd forgotten about that until I was looking at ravel, 
at which point I recalled the true horror of the function that is 
reshape. For those of you playing along at home:

    >>> import numpy
    >>> a = numpy.arange(9).reshape([3,3])
    >>> b = a[1:,1:]
    >>> flat_a = a.reshape(-1)
    >>> flat_b = b.reshape(-1)
    >>> flat_a[0] = 999
    >>> flat_b[0] = 999
    >>> a
    array([[999,   1,   2],
           [  3,   4,   5],
           [  6,   7,   8]])
    >>> b
    array([[4, 5],
           [7, 8]])
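
For anyone who wants to check rather than guess, a sketch along these 
lines should tell the two cases apart. It assumes a reasonably recent 
NumPy that provides shares_memory, which is not something this thread 
takes for granted; the variable names are only illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = a[1:, 1:]                 # non-contiguous slice of a

flat_a = a.reshape(-1)
flat_b = b.reshape(-1)

# Does the flattened result alias the array it came from?
a_is_view = np.shares_memory(flat_a, a)   # contiguous input: a view
b_is_view = np.shares_memory(flat_b, b)   # strided input: a copy

# The contiguity flags explain the difference.
contiguous_a = a.flags['C_CONTIGUOUS']
contiguous_b = b.flags['C_CONTIGUOUS']
```

Here a_is_view is True and b_is_view is False, matching the contiguity 
flags, which is exactly the stride-dependent behavior shown in the 
session above.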


> I fully support such "advanced" usages, but believe the person using 
> them should know what they are doing.  Understanding why the 
> functional interface may return a copy does not seem to me to be a big 
> deal for somebody thinking about "in-place" operations.

If I truly saw any benefit to these, other than saving the typing of a 
few characters, I'd agree. However, I don't see such benefit. I don't 
see them as advanced either, as they offer no benefit over the 
corresponding "basic" usages other than marginally less typing. And you 
pay for that, as it encourages bad practices and results in subtle bugs.

> I'm -1 right now on removing (or even deprecating) the functional 
> interfaces.

How about just segregating them? You can even call the submodule 
"advanced" if you like, although I'd prefer "danger_danger_warning", so 
perhaps "functions" would be a good compromise. Then I, and other 
like-minded numpy users, can simply ban the use of the "functions" 
submodule from our code and live blissful, efficient, bug free 
existences. Or something like that.

> I'm +1 on educating users on the advantages of using methods and 
> that the functional interfaces are just basic wrappers around them.

Let me restate that I'm not a method-ist (nor a Methodist for that 
matter), I'd be perfectly happy with functions. However, it seems 
impossible for backwards compatibility and cultural reasons to have 
non-broken function semantics. So for that reason I'm in favor of 
deprecating or segregating the borked functions. I'm not actually in 
favor of removing them altogether, as that would unnecessarily 
complicate migration from Numeric, but I am in favor of booting them out 
of the main numpy namespace.


