[Numpy-discussion] Datetime branch

Robert Kern robert.kern@gmail....
Thu Jun 11 13:55:38 CDT 2009


On Thu, Jun 11, 2009 at 13:44, Charles R
Harris<charlesr.harris@gmail.com> wrote:
>
>
> On Thu, Jun 11, 2009 at 12:18 PM, Robert Kern <robert.kern@gmail.com> wrote:
>>
>> On Thu, Jun 11, 2009 at 13:06, Charles R
>> Harris<charlesr.harris@gmail.com> wrote:
>> >
>> >
>> > On Thu, Jun 11, 2009 at 11:47 AM, Robert Kern <robert.kern@gmail.com>
>> > wrote:
>> >>
>> >> On Thu, Jun 11, 2009 at 12:39, Charles R
>> >> Harris<charlesr.harris@gmail.com> wrote:
>> >> >
>> >> >
>> >> > On Thu, Jun 11, 2009 at 11:34 AM, Robert Kern <robert.kern@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Thu, Jun 11, 2009 at 12:29, Charles R
>> >> >> Harris<charlesr.harris@gmail.com> wrote:
>> >> >> > Oh, and slipping the new types in between 64 bit integers and
>> >> >> > floats is a bit iffy.
>> >> >>
>> >> >> Where, specifically? There are several linear orders of types in
>> >> >> numpy. I tried to be careful to do the right thing in each. The enum
>> >> >> numbers are after NPY_VOID, of course, for compatibility.
>> >> >
>> >> > I noticed. I'm not saying it's wrong, just that a linear order lacks
>> >> > descriptive power and is difficult to maintain. I expect you ran into
>> >> > that problem when trying to make everything work as you wanted.
>> >>
>> >> Yes. Now, which place am I slipping in the new types between 64-bit
>> >> integers and floats?
>> >
>> > In the ufunc generator.
>>
>> This line from generate_umath.py?
>>
>>  all = '?bBhHiIlLqQtTfdgFDGO'
>>
>> > But most of the macros use the type ordering
>>
>> Not quite. They use the order of the loops given to the ufunc. The
>> order of the types in that string, which I think you are referring to,
>> doesn't affect much. Basically, just the comparisons where every type
>> has a loop.
>>
>> > and how
>> > do you control the promotion (or lack thereof) of the various types
>> > to/from the datetime types?
>>
>> PyArray_CanCastSafely() in convert_datatype.c. datetime and timedelta
>> types cannot be auto-cast to or from any datatype. They can be
>> explicitly cast, but ufuncs won't auto-cast them when trying to find
>> the right loop. The datetime types are a bit unique in that they need
>> to exclude certain combinations (e.g. datetime+datetime). Allowing
>> auto-casts prevented me from doing that.
>
> The implementation of PyArray_CanCastSafely illustrates two other points
> that bother me.
>
> 1) The rules are encoded in the program logic. This makes them difficult to
> find or to see what they are and requires editing the code to make changes.
>
> 2) Some of the rules are maintained by the types. That is even more obscure
> and reminiscent of the "friend" functions in c++ that encode the same sort
> of thing when the operators are overloaded. I never did like that as a
> general system ;)

Yeah, I'm not much of a fan of it, either. But it's what I had to work with.
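
To make the rule quoted above concrete (datetime and timedelta are never
auto-cast, and the ufunc takes the first loop, in registration order, that
every input can be safely cast to), here is a toy model in Python. It also
keeps the rules in a table rather than in program logic, which is roughly the
alternative being asked for. Everything in it is hypothetical: the type names,
SAFE_CASTS, ADD_LOOPS and find_loop are illustrative only and are not NumPy's
actual tables or code.

```python
# Toy model only: hypothetical names, not NumPy's real casting tables.
# Each source type maps to the set of types it may be auto-cast to.
SAFE_CASTS = {
    "int32": {"int32", "int64", "float64"},
    "int64": {"int64", "float64"},
    "float64": {"float64"},
    # datetime and timedelta auto-cast only to themselves.
    "datetime": {"datetime"},
    "timedelta": {"timedelta"},
}


def can_cast_safely(src, dst):
    """Return True if src may be auto-cast to dst according to the table."""
    return dst in SAFE_CASTS[src]


# Loops of a toy "add", in registration order: (input types, output type).
# There is deliberately no (datetime, datetime) loop.
ADD_LOOPS = [
    (("int32", "int32"), "int32"),
    (("int64", "int64"), "int64"),
    (("float64", "float64"), "float64"),
    (("timedelta", "timedelta"), "timedelta"),
    (("datetime", "timedelta"), "datetime"),
    (("timedelta", "datetime"), "datetime"),
]


def find_loop(loops, in_types):
    """Return the first loop whose signature all inputs can be safely cast to."""
    for signature, out_type in loops:
        if all(can_cast_safely(a, s) for a, s in zip(in_types, signature)):
            return signature, out_type
    raise TypeError("no loop matches input types %r" % (in_types,))
```

In this model find_loop(ADD_LOOPS, ("int32", "int64")) falls through to the
int64 loop, while ("datetime", "datetime") and ("int64", "datetime") find no
loop at all and raise TypeError, because nothing auto-casts into or out of the
datetime types.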

>> In fact, the placement of the datetime typecodes in that string was a
>> leftover from when I was trying to allow auto-casts between integers
>> and datetime types. Now that I disallow them, the ordering can be
>> changed.
>>
>> > There also seems to be some mechanism for raising errors that has been
>> > added, maybe to loops. I'm not clear on that, did you add some such
>> > mechanism?
>>
>> Not really. Object loops already had such a mechanism; I just extended
>> that to do the same thing for the datetime types, too. You will be
>> able to raise a Python exception in the datetime loops. Of course, you
>> pay for that a little because that means that you can't release the
>> GIL. I don't think that will be a substantial problem.
>
> Didn't say it was a problem, just that the issue of raising errors in the
> ufunc loops has come up before and I wondered if you were developing some
> mechanism for that.

We were brainstorming, but there isn't a good way to do it (i.e.
allowing a useful message rather than just an error flag) without
holding on to the GIL or much more extensive modifications to the
machinery.

> BTW, what is the metadata that is going to be added to the types? What
> purpose does it serve?

Storage for the time frequency (days, weeks, months, etc.) per the NEP.
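
For a sense of what that metadata looks like from Python, here is a small
sketch. It assumes a later NumPy release that ships datetime64; the unit codes
('D', 'M') and the np.datetime_data helper come from that released API, so the
spelling on the branch under discussion may differ.

```python
import numpy as np

# The unit (frequency) is part of the dtype itself, not of each value.
days = np.dtype("datetime64[D]")
months = np.dtype("datetime64[M]")

# np.datetime_data returns the (unit, count) pair stored on the dtype.
day_unit = np.datetime_data(days)
month_unit = np.datetime_data(months)

# An explicit cast moves a value from one frequency to another.
day_value = np.datetime64("2009-06-11", "D")
month_value = day_value.astype(months)
```

Here day_unit is ('D', 1), month_unit is ('M', 1), and month_value prints as
2009-06: same underlying integer storage, different interpretation depending
on the frequency carried by the dtype.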

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

