[Numpy-discussion] ndarray __getattr__ to perform __getitem__
Thu Oct 28 16:42:54 CDT 2010
On Thu, Oct 28, 2010 at 16:37, Ian Stokes-Rees wrote:
> On 10/28/10 5:29 PM, Robert Kern wrote:
>> On Thu, Oct 28, 2010 at 15:17, Ian Stokes-Rees
>> <firstname.lastname@example.org> wrote:
>>> I have an ndarray with named dimensions. I find myself writing some
>>> fairly laborious code with lots of square brackets and quotes. It seems
>>> like it wouldn't be such a big deal to overload __getattribute__ so
>>> instead of doing:
>>> r = genfromtxt('results.dat',dtype=[('a','int'), ('b', 'f8'),
>>> ('c','int'), ('d', 'a20')])
>>> scatter(r[r['d'] == 'OK']['a'], r[r['d'] == 'OK']['b'])
>>> I could do:
>>> scatter(r[r.d == 'OK'].a, r[r.d == 'OK'].b)
>>> which is really a lot clearer. Is something like this already possible
>> See recarray which uses __getattribute__.
> Thanks -- I'll look into it.
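
For reference, a minimal sketch of the recarray route mentioned above. The sample rows are made up for illustration (the thread's results.dat is not available), and 'U20' stands in for the 'a20' field so the comparison against 'OK' works on plain Python strings:

```python
import numpy as np

# Made-up sample rows standing in for results.dat (an assumption, not the
# original data). Field names follow the dtype used in the thread.
r = np.array(
    [(1, 0.5, 10, "OK"), (2, 1.5, 20, "FAIL"), (3, 2.5, 30, "OK")],
    dtype=[("a", "int"), ("b", "f8"), ("c", "int"), ("d", "U20")],
)

# Viewing the structured array as a recarray exposes fields as attributes.
rec = r.view(np.recarray)

# Build the mask once and reuse it for both columns.
ok = rec[rec.d == "OK"]
x, y = ok.a, ok.b  # the two sequences one would pass to scatter()
```

Building the mask once also avoids evaluating r['d'] == 'OK' twice, as the original scatter() call does.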
>>> Is there some reason not to map __getattr__ to __getitem__?
>> Using __getattribute__ tends to slow down almost all operations on the
>> array substantially. Perhaps __getattr__ would work better, but all of
>> the methods and attributes would mask the fields. If you can find a
>> better solution that doesn't have such an impact on normal
>> performance, we'd be happy to hear it.
> But wouldn't the performance hit only come when I use it in this way?
> __getattr__ is only called if the named attribute is *not* found (I
> guess it falls off the end of the case statement, or is the result of
> the attribute hash table "miss").
That's why I said that __getattr__ would perhaps work better.
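
A rough sketch of what the __getattr__ variant could look like, to make the trade-off concrete. FieldArray is a hypothetical subclass written for this illustration, not something NumPy provides, and the sample data (including the deliberately awkward field name 'size') is made up:

```python
import numpy as np


class FieldArray(np.ndarray):
    """Hypothetical ndarray subclass: fall back to field lookup on a miss."""

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup has
        # failed, so ordinary methods and attributes take the usual path.
        names = self.dtype.names
        if names is not None and name in names:
            return self[name]
        raise AttributeError(name)


# Made-up data; the field called 'size' collides with ndarray.size on purpose.
data = np.array(
    [(1, 0.5, "OK", 10), (2, 1.5, "FAIL", 20), (3, 2.5, "OK", 30)],
    dtype=[("a", "int"), ("b", "f8"), ("d", "U20"), ("size", "int")],
)
r = data.view(FieldArray)

ok = r[r.d == "OK"]           # 'd' is not an ndarray attribute: field lookup
x, y = ok.a, ok.b

n_elements = r.size           # the ndarray attribute wins: element count
size_field = r["size"]        # the masked field still needs brackets
```

Here r.size never reaches __getattr__, which is the masking problem: any field that shares a name with an existing method or attribute stays reachable only through r['size'].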
> So the proviso is "this shortcut only works if the field names are
> distinct from any methods or attributes on the ndarray object (or its
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco