[Numpy-discussion] first impressions with numpy

Tim Hochberg tim.hochberg at cox.net
Sun Apr 2 08:11:17 CDT 2006


Colin J. Williams wrote:

> Tim Hochberg wrote:
>
>> Sebastian Haase wrote:
>>
>>> Thanks Tim,
>>> that's OK - I got the idea...
>>> BTW, is there a (policy) reason that you sent the first email just 
>>> to me and not the mailing list !?
>>
>>
>>
>> No. Just clumsy fingers. Probably the same reason the functions got 
>> all garbled!
>>
>>>
>>> I would really be more interested in comments to my first point ;-)
>>> I think it's important that numpy will not be too cryptic and only 
>>> for "hackers", but nice to look at ...  (hope you get what I mean ;-)
>>
>>
>>
>> Well, I think it's probably a good idea and it sounds like Travis 
>> likes the idea "for some of the builtin types". I suspect that's code 
>> for "not types for which it doesn't make sense, like recarrays".
>>
> Tim,
>
> Could you elaborate on this please?  Surely, it would be good for all 
> functions and methods to have meaningful parameter lists and good doc 
> strings.

This isn't really about parameter lists and docstrings; it's about 
__str__ and possibly __repr__. The basic issue is that the way dtypes 
are displayed is powerful but unfriendly. If I create an array of integers:

    >>> a = arange(4)
    >>> print repr(a.dtype), str(a.dtype)
    dtype('<i4') '<i4'

This result is sort of cryptic. It would probably be reasonable to have 
this print

    dtype(int32), int32

instead. This is much less cryptic, and dtype(int32) is itself a valid 
expression that rebuilds the same dtype, so it's an acceptable 
substitute for repr.
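
As a minimal sketch of that round-trip claim, the snippet below checks that 
the friendly spelling and the type-string spelling construct equal dtypes. 
It assumes a current NumPy under Python 3 (imported as np), not the 2006 
release discussed here, and it uses the native-order string 'i4' so the 
result does not depend on the machine's byte order.

```python
import numpy as np

# Native-order 4-byte integer, written as a type string.
native = np.dtype("i4")

# The friendly spelling is a valid constructor argument, so a repr
# built from it can be evaluated back into an equal dtype.
friendly = np.dtype(np.int32)
assert friendly == native

# .str is the type-string form; its byte-order prefix ('<' or '>')
# depends on the platform. .name is the friendlier label.
print(native.str, native.name)
```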

On the other hand, some things don't map neatly onto the builtin types. 
Data that's not in the native byte order would be one case. For example, 
dtype('>i4') is not the same as dtype(int32) on my machine and should 
probably not be displayed using int32[1]. These cases should be rare in 
practice and it seems fine to fall back to the less friendly but more 
flexible notation.
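
One way to picture the fallback rule is the sketch below. The helper 
friendly_repr is hypothetical (it is not part of numpy), and restricting it 
to integer, float and complex kinds is an assumption made to keep the 
example small. It assumes a current NumPy under Python 3.

```python
import numpy as np

def friendly_repr(dt):
    """Hypothetical helper: use the type name only when that is unambiguous."""
    dt = np.dtype(dt)
    # Plain numeric kinds in native byte order map onto a builtin type name.
    if dt.kind in "iufc" and dt.isnative:
        return "dtype(%s)" % dt.name
    # Anything else (e.g. byte-swapped data) keeps the type-string notation.
    return "dtype(%r)" % dt.str

# Exactly one of these two is native on any given machine, so one
# prints the friendly form and the other falls back to the type string.
print(friendly_repr("<i4"))
print(friendly_repr(">i4"))
```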

Recarrays were probably not such a good example of a case that doesn't 
make sense. Here is a dtype from a recarray:

    dtype([('x', '<f8'), ('z', '<c16')])

This would work fine if repr were instead:

    dtype([('x', float64), ('z', complex128)])
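
A quick check that the two spellings describe the same record layout, as 
a sketch assuming a current NumPy under Python 3. The explicit '<' prefix 
is dropped here (an adjustment to the example above) so that both forms 
use native byte order on any machine.

```python
import numpy as np

# Field list written with type strings (native byte order).
by_string = np.dtype([("x", "f8"), ("z", "c16")])

# The same field list written with the builtin scalar types.
by_type = np.dtype([("x", np.float64), ("z", np.complex128)])

# Both constructors accept the list, and the resulting dtypes are equal.
assert by_string == by_type
print(by_type.names, by_type.itemsize)
```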

Anyway, this all seems reasonable to me at first glance. That said, I 
don't plan to work on this, I've got other fish to fry at the moment.

Regards,

-tim

[1] There does seem to be something squirrelly going on here, though: 
dtype('>i4').name is 'int32', which seems wrong.
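
The observation in [1] can be reproduced with the sketch below, assuming 
a current NumPy under Python 3. It builds the byte-swapped dtype with 
dtype.newbyteorder() rather than a hard-coded '>i4', so it behaves the 
same on big-endian and little-endian machines.

```python
import numpy as np

native = np.dtype(np.int32)
swapped = native.newbyteorder()  # same size and kind, opposite byte order

# The two dtypes compare unequal...
print(native == swapped)

# ...but .name does not encode byte order, so both report the same name.
print(native.name, swapped.name)

# .isnative is the attribute that actually tells them apart.
print(native.isnative, swapped.isnative)
```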




