[Numpy-discussion] fixing up datetime
Thu Jun 2 11:22:24 CDT 2011
Charles R Harris wrote:
> Good support for units and delta times is very useful, but
> parsing dates and times and handling timezones, daylight savings, leap
> seconds, business days, etc., is probably best served by addon packages
> specialized to an area of interest. Just my $.02
I agree here -- I think for numpy, what's key is to focus on the kind of
things needed for computational use -- that is, the performance-critical
parts.
I suppose business-day type calculations would be both key and
performance-critical, but that sure seems like the kind of thing that
should go in an add-on package, rather than in numpy.
The stdlib datetime package is a little bit too small for my taste
(couldn't I at least ask for a TimeDelta to be expressed in, say,
seconds, without doing any math on my own?), but the idea is good --
create the core types, let add-on packages do the more specialized stuff.
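
On the seconds point: Python 2.7 and 3.2 added timedelta.total_seconds(), which covers that one case. A minimal sketch using only the stdlib; the variable names are just for illustration:

```python
from datetime import timedelta

delta = timedelta(days=1, hours=2, seconds=30)

# total_seconds() folds days, seconds and microseconds into one float.
seconds = delta.total_seconds()

# Other units still need a division; dividing one timedelta by another
# (Python 3.2+) yields a plain float.
hours = delta / timedelta(hours=1)
```

Anything beyond seconds still takes a little arithmetic on the user's side, which is roughly the complaint above.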
> * The existing datetime-related API is probably not useful, and in fact
> those functions aren't used internally anymore. Is it reasonable to
> remove the functions, or do we just deprecate them?
I say remove, but some polling to see if anyone is using it might be in
order.
> * Leap seconds probably deserve a rigorous treatment, but having an
> internal representation with leap-seconds overcomplicates otherwise very
> simple and fast operations.
Could you explain more? I don't get the issues -- leap seconds would come
in for calculations like: a_given_datetime + a_timedelta, correct?
Given leap years, and all the other ugliness, do leap seconds really
make it worse?
> * Default conversion to string - should it be in UTC or with the local
> timezone baked in?
Most date-time handling should be time-zone neutral -- i.e. assume
everything is in the same timezone (and daylight savings status). Libs
that assume you want the locale setting do nothing but cause major pain
if you have anything out of the ordinary to do (and sometimes ordinary
things, too).
If you MUST include time-zone, explicit is better than implicit -- have
the user specify, or, at the very least make it easy for the user to
override any defaults.
> As UTC it may be confusing because 'now' will print
> as a different time than people would expect.
I think "now" should be expressed (but also stored) in the local time,
unless the user asks for UTC. This is consistent with the std lib
datetime.now(), if nothing else.
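
For reference, a small sketch of the stdlib behaviour mentioned here, assuming Python 3 (the timezone class arrived in 3.2). It only shows what datetime.now() returns, not how numpy should behave:

```python
from datetime import datetime, timezone

# Default: naive local wall-clock time, no timezone attached.
local_now = datetime.now()

# UTC only when explicitly requested; the result carries tzinfo.
utc_now = datetime.now(timezone.utc)

is_naive = local_now.tzinfo is None
is_utc_aware = utc_now.tzinfo is timezone.utc
```

The default is local and timezone-free, and UTC is something the caller has to ask for.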
> * Should the NaT (not-a-time) value behave like floating-point NaN? i.e.
> NaT == NaT return false, etc. Should operations generating NaT trigger
> an 'invalid' floating point exception in ufuncs?
Makes sense to me -- at least many folks are used to NaN semantics.
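
As a reminder of what those semantics are, here is a stdlib-only sketch using plain floats. It illustrates the NaN behaviour a NaT would be copying; it does not use or assume any numpy datetime API:

```python
import math

nan = float("nan")

# NaN never compares equal, not even to itself.
nan_equals_itself = (nan == nan)
nan_differs_from_itself = (nan != nan)

# So detection needs a dedicated predicate rather than ==.
detected = math.isnan(nan)

# NaN also propagates through arithmetic.
propagated = math.isnan(nan + 1.0)
```

A NaT following the same rules would presumably need its own "is this NaT?" test, since comparing against NaT with == could never succeed.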
> And after the removal of datetime from 1.4.1 and now this, I'd be in
> favor of putting a large "experimental" sticker over the whole thing
> until further notice.
> Do we have a good way to do that?
Maybe an "experimental" warning, analogous to the "deprecation" warning.
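
One way this could look, sketched with the stdlib warnings module. ExperimentalWarning and experimental_datetime_feature are hypothetical names made up for this example, not anything numpy provides:

```python
import warnings


class ExperimentalWarning(UserWarning):
    """Hypothetical category for APIs that may still change."""


def experimental_datetime_feature():
    # Hypothetical stand-in for any function in the unstable API.
    warnings.warn(
        "this datetime functionality is experimental and may change",
        ExperimentalWarning,
        stacklevel=2,  # point the warning at the caller's line
    )
    return "result"


# Users who accept the risk could silence the whole category:
# warnings.simplefilter("ignore", ExperimentalWarning)
```

Because it is its own category, users can filter it independently of real deprecation warnings.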
> Good support for units and delta times is very useful,
> This part works fairly well now, except for some questions like what
> should datetime("2011-01-30", "D") + timedelta(1, "M") produce. Maybe
> "2011-02-28", or "2011-03-02"?
Neither -- "month" should not be a valid unit to express a timedelta in.
Nor should year, or anything else that is not clearly defined (we can
argue about day, which does change a bit as the earth slows down, yes?)
Yes, it's nice to have an easy way of expressing things like "every
month", or "a month from now" when you mean a calendar month, but
it's a heck of a can of worms.
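
To make the ambiguity concrete, here is a stdlib sketch of two plausible readings of "one month after 2011-01-30". The helper add_month_clamped is hypothetical and is only one of several possible rules:

```python
import calendar
from datetime import date, timedelta

start = date(2011, 1, 30)


def add_month_clamped(d):
    """Hypothetical rule: same day next month, clamped to month end."""
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Calendar reading: clamp to the last day of February.
clamped = add_month_clamped(start)       # date(2011, 2, 28)

# Fixed-length reading: treat "a month" as 31 days.
fixed_31 = start + timedelta(days=31)    # date(2011, 3, 2)
```

Both answers are defensible, which is the problem with treating a month as a unit of elapsed time.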
We just had a big discussion about this on the netCDF CF metadata list.
We more or less came to the conclusion (I did, anyway) that there were
two distinct, but related concepts:
1) time as a strict unit of measurement, like length, mass, etc. In that
case, don't use "months" as a unit.
2) Calendars -- these are what months, days of week, etc, etc, etc. are
from, and these get ugly. I also learned that there are even more
calendars than I thought. Beyond the Julian, Gregorian, etc, there are
special ones used for climate modeling and the like, that have nice
properties like all months being 30 days long, etc. Plus, as discussed,
various "business" calendars.
So: I think that the calendar-related functions need a fairly
self-contained library, with various classes for the various calendars
one might want to use, and a well-specified way to define new ones.
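
A very rough sketch of what such a library's shape might be. Every class and method name here is hypothetical; the point is only that a new calendar is defined by subclassing and supplying month lengths:

```python
class Calendar:
    """Hypothetical base class: subclasses supply month lengths."""

    def days_in_month(self, year, month):
        raise NotImplementedError

    def days_in_year(self, year):
        return sum(self.days_in_month(year, m) for m in range(1, 13))


class GregorianCalendar(Calendar):
    _DAYS = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

    def is_leap(self, year):
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def days_in_month(self, year, month):
        if month == 2 and self.is_leap(year):
            return 29
        return self._DAYS[month - 1]


class Calendar360(Calendar):
    """Every month has 30 days, as in some climate-model calendars."""

    def days_in_month(self, year, month):
        return 30
```

Business calendars and the like would then be further subclasses, kept out of the core time types.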
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception