[Numpy-discussion] Datetime branch
Thu Jun 11 14:07:12 CDT 2009
On Jun 11, 2009, at 1:44 PM, Charles R Harris wrote:
> The implementation of PyArray_CanCastSafely illustrates two other
> points that bother me.
> 1) The rules are encoded in the program logic. This makes them
> difficult to find or to see what they are and requires editing the
> code to make changes.
I agree that this is all sub-optimal. I didn't do much to fix what
was there with Numeric except add a semi-orthogonal user-defined
I like the generic function concept that was added to the ufuncs quite
a bit. I'm wondering if most of the functions currently in the *f
member of the data-type structure couldn't be implemented under that
Also, should we attach coercion information to each data-type directly
and add an API to extend the coercion information? I agree that the
"implicit" ordering of the data-types for coercion is wonky, but it
allowed the code from Numeric to be used to dispatch in the ufunc
instead of designing a new approach. Do you have other ideas about
how this might work?
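
To make the "coercion information attached to each data-type" idea concrete, here is a minimal pure-Python sketch. Everything in it (the DataType class, add_safe_cast, this can_cast_safely) is a hypothetical stand-in for illustration, not NumPy's actual structures or C API. The only point is that the rules become data you can inspect and extend rather than program logic.

```python
class DataType:
    """Toy stand-in for a data-type descriptor (hypothetical, not NumPy's)."""

    def __init__(self, name, safe_casts=()):
        self.name = name
        # Names of the types this type can be safely cast to, besides itself.
        self.safe_casts = set(safe_casts)

    def add_safe_cast(self, target):
        """Extension hook: declare one more safe target type."""
        self.safe_casts.add(target.name)


def can_cast_safely(src, dst):
    """Answer the question by table lookup instead of hard-coded branches."""
    return src.name == dst.name or dst.name in src.safe_casts


int32 = DataType("int32", ["int64", "float64"])
float64 = DataType("float64")

# A new (made-up) data-type registers its rule without touching the lookup code.
myrational = DataType("myrational")
int32.add_safe_cast(myrational)
```

With this layout, finding out what int32 may be cast to is a matter of reading int32.safe_casts.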
> 2) Some of the rules are maintained by the types. That is even more
> obscure and reminiscent of the "friend" functions in c++ that encode
> the same sort of thing when the operators are overloaded. I never
> did like that as a general system ;)
Are you referring to the user-defined data-types? I agree it's
pretty kludgy. Are you envisioning a "global" coercion table? It
seems like this may need to be operation specific and extensible to
allow new data-types to be added fairly easily.
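
An operation-specific, extensible table could look roughly like the sketch below. The names (coercion_table, register, result_type) and the rules registered are assumptions made up for illustration, not an existing NumPy interface. The key includes the operation, so the same pair of types can coerce differently, or not at all, depending on the operation.

```python
# Maps (operation, left type, right type) to the result type.
coercion_table = {}


def register(op, left, right, result):
    """Add or override one rule; new data-types call this to plug in."""
    coercion_table[(op, left, right)] = result


def result_type(op, left, right):
    """Look up the result type, failing loudly when no rule exists."""
    try:
        return coercion_table[(op, left, right)]
    except KeyError:
        raise TypeError(f"no coercion rule for {op}({left}, {right})") from None


# Illustrative rules only.
register("add", "int32", "float64", "float64")
register("subtract", "datetime", "datetime", "timedelta")
register("add", "datetime", "timedelta", "datetime")
```

Here result_type("subtract", "datetime", "datetime") gives "timedelta", while result_type("add", "datetime", "datetime") raises TypeError because nothing was registered for it.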
> BTW, what is the metadata that is going to be added to the types?
> What purpose does it serve?
In the date-time case, it holds what frequency the integer in the
data-type represents. There will only be 2 new static data-types,
"Datetime" and "Timedelta", which use 8 bytes each.
What those 8 bytes represent will be determined by the metadata
(years, months, seconds, etc...).
But, generally, it will be an extra dictionary that can store anything
you want (anybody want to define a "float" data-type that uses IBM
format bits?). The ufunc machinery needs to change to handle passing
that information in somehow. The approaches we take to doing that
will also hopefully allow us to define ufuncs for string, unicode, and
void * arrays as well.
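
For the date-time case, the way metadata gives meaning to the same 8 bytes can be sketched in pure Python. This is only an illustration: the metadata dictionary layout, the "unit" key, the 1970-01-01 epoch, and the restriction to fixed-length units are all assumptions of the sketch, not the design of the branch.

```python
import struct
from datetime import datetime, timedelta

# Assumed unit table; years and months are left out because they are
# not a fixed number of seconds.
SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "D": 86400}

# Assumed reference point for the stored count.
EPOCH = datetime(1970, 1, 1)


def decode(raw, metadata):
    """Interpret 8 raw bytes as a count of metadata['unit'] since EPOCH."""
    (count,) = struct.unpack("<q", raw)  # little-endian signed 64-bit integer
    return EPOCH + timedelta(seconds=count * SECONDS_PER_UNIT[metadata["unit"]])


raw = struct.pack("<q", 365)            # the same 8 bytes in both calls
as_days = decode(raw, {"unit": "D"})    # 1971-01-01 00:00:00
as_hours = decode(raw, {"unit": "h"})   # 1970-01-16 05:00:00
```

The stored integer never changes; only the metadata passed alongside it decides which point in time it names.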