[Numpy-discussion] subclassing ndarray

Travis Oliphant oliphant.travis at ieee.org
Mon Feb 27 22:13:03 CST 2006

> Travis Oliphant wrote:
>> Stefan van der Walt wrote:
>>>> The __init__ and __new__ methods are not called because they may 
>>>> have arbitrary signatures.  Instead,  the __array_finalize__ method 
>>>> is always called.  So, you should use that instead of __init__.
>> This is now true in SVN.  Previously, __array_finalize__ was not 
>> called if the "parent" was NULL.  However, now, it is still called 
>> with None as the value of the first argument.
>> Thus __array_finalize__ will be called whenever ndarray.__new__(<some 
>> subclass>,...) is called.
> Why this change in style from the common Python idiom of __new__, 
> __init__, with the same signature to __new__, __array_finalize__ with 
> possibly different signatures?

I don't see it as a change in style but as adding a capability to the 
ndarray subclass.  The problem is that arrays can be created in many 
ways (slicing, ufuncs, etc.).  Not all of these ways should go through 
the __new__/__init__-style creation mechanism.  Try inheriting from 
the builtin float and adding attributes.  Then add a plain float to an 
instance of your new class and see what happens.

You will get a plain float as the output.  This is the essence of Paul's 
insight that sub-classing is rarely useful because you end up having to 
re-define all the operators anyway to return the value that you want.  
He knows whereof he speaks as well, because he wrote MA and UserArray 
and had experience with Python sub-classing. 
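
The float experiment described above can be sketched as follows.  The 
class name TaggedFloat and its unit attribute are made up for 
illustration; only the builtin float behaviour is being shown.

```python
class TaggedFloat(float):
    """A float carrying an extra attribute (hypothetical example class)."""

    def __new__(cls, value, unit="m"):
        obj = super().__new__(cls, value)
        obj.unit = unit
        return obj


x = TaggedFloat(1.5, unit="s")

# float.__add__ builds a plain float directly; TaggedFloat.__new__ is never
# consulted, so the subclass and its attribute are both lost.
y = x + 2.0

print(type(x).__name__, x.unit)
print(type(y).__name__, hasattr(y, "unit"))
```

To keep the subclass here you would have to override __add__, __radd__, 
and every other operator by hand.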

I wanted a mechanism to make it easier to sub-class arrays and have the 
operators return your object if possible (including all of its attributes).
So two special attributes were invented:

__array_priority__  (a floating point attribute)
__array_finalize__  (a method called on internal construction of the 
array wrapper)

along with __array_wrap__, which any class can define to have its 
objects survive ufuncs. 
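
As a rough illustration of __array_priority__, here is a minimal sketch 
with two unrelated subclasses.  The class names Low and High and the 
priority values are assumptions chosen for the example; the behaviour 
shown is what current NumPy releases do and may differ in detail from 
the 2006 SVN code discussed here.

```python
import numpy as np


class Low(np.ndarray):
    __array_priority__ = 1.0  # arbitrary value, only the ordering matters


class High(np.ndarray):
    __array_priority__ = 2.0


a = np.arange(3.0).view(Low)
b = np.ones(3).view(High)

# When a ufunc sees inputs of different subclasses, the input with the
# larger __array_priority__ decides the type of the result.
result = np.add(a, b)
print(type(result).__name__)
```

With these values the result should come back as a High instance 
regardless of the argument order.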

It was easy enough to see where to call __array_finalize__ in the C code, 
if somewhat difficult to explain (and to get exception handling to work, 
because of my initial over-thinking). 

The signature is just

__array_finalize__(self, parent):

i.e. any return value is ignored (but exceptions are caught).
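
To see what arrives as the parent argument, a small tracing subclass can 
help.  This is only a sketch: the class name Traced and the module-level 
list seen are invented for the example, and it assumes a current NumPy 
release.

```python
import numpy as np

seen = []  # records the type of each parent passed to __array_finalize__


class Traced(np.ndarray):
    def __array_finalize__(self, parent):
        seen.append(type(parent).__name__)


a = np.ndarray.__new__(Traced, (4,))  # explicit construction: parent is None
b = np.zeros(4).view(Traced)          # view casting: parent is the plain ndarray
c = b[1:]                             # slicing: parent is the Traced instance b

print(seen)
```

The three entries correspond to the three ways of creating an instance: 
explicit construction, view casting, and deriving a new array from an 
existing instance.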

I've used the feature successfully on at least three subclasses (chararray, 
memmap, and matrix) and so I'm actually pretty happy with it. 

__new__ and __init__ are still relevant for constructing your brand-new 
object.  The __array_finalize__ function is just what the internal 
constructor that actually allocates memory will always call to let you 
set final attributes *every* time your sub-class gets created.
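
The division of labour between __new__ and __array_finalize__ might look 
like the sketch below.  The class LabeledArray, its label attribute, and 
the "unnamed" default are hypothetical names chosen for illustration, and 
the code assumes a current NumPy release.

```python
import numpy as np


class LabeledArray(np.ndarray):
    def __new__(cls, data, label="unnamed"):
        # Explicit construction: view the input as our class, then attach
        # the attribute the caller asked for.
        obj = np.asarray(data).view(cls)
        obj.label = label
        return obj

    def __array_finalize__(self, parent):
        # Runs for every new instance, however it was created.  Copy the
        # label from the parent if it has one, else fall back to a default.
        self.label = getattr(parent, "label", "unnamed")


x = LabeledArray([1.0, 2.0, 3.0], label="velocity")

print(x[1:].label)                            # slice of x
print((x * 2).label)                          # ufunc result
print(np.zeros(2).view(LabeledArray).label)   # view of a plain ndarray
```

The slice and the ufunc result never pass through LabeledArray.__new__, 
yet they keep the label because __array_finalize__ copies it over.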

