[Numpy-discussion] fixing up datetime
Thu Jun 2 11:57:52 CDT 2011
Mark Wiebe wrote:
> I'm following what I understand the NEP to mean for combining dates and
> deltas of different units. This means for timedeltas, the metadata
> becomes more precise, in particular it becomes the GCD of the input
> metadata, and between timedelta and datetime the datetime always dominates.
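To make the GCD rule concrete, here is a rough sketch of how two timedelta metadata values might combine. It is not NumPy's implementation: the `SECONDS` table and the `combine_metadata` helper are hypothetical names, and only fixed-length units are covered.

```python
from math import gcd

# Hypothetical table of fixed-length units, expressed in seconds per tick.
SECONDS = {"s": 1, "m": 60, "h": 3600, "D": 86400}

def combine_metadata(a, b):
    """Combine two (multiple, unit) pairs into the coarsest step that divides both."""
    step = gcd(a[0] * SECONDS[a[1]], b[0] * SECONDS[b[1]])
    # Report the step in the largest unit that divides it evenly.
    for unit in ("D", "h", "m", "s"):
        if step % SECONDS[unit] == 0:
            return (step // SECONDS[unit], unit)

two_and_three_hours = combine_metadata((2, "h"), (3, "h"))
hour_and_half_hour = combine_metadata((1, "h"), (30, "m"))
day_and_second = combine_metadata((1, "D"), (1, "s"))
```

Under these assumptions, 2-hour and 3-hour ticks combine to 1-hour ticks, and mixing days with seconds falls back to seconds.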
Thanks for posting this link -- a few comments on that doc follow.
> Only Years, Months, and Business Days have a nonlinear relationship with
> the other units, so they're the only problem case for this. They can be
> arbitrarily special-cased based on what is decided to make the most sense.
As mentioned in my recent post -- this stuff should be handled by some
sort of "calendar" classes -- there is no one way to do that! So numpy
should provide datetime and timedelta data types that can be used, but a
timedelta should _not_ ever be defined by these weird variable units.
I guess what I'm getting at is that:
a_date_time + a_timedelta
is a fundamentally different operation than:
a_date_time + a_calendar_defined_timespan
The former can follow all the usual math properties for addition, but
the latter doesn't.
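A small sketch of the difference, using only the standard library. The `add_months` helper is a hypothetical calendar rule (clamp the day to the length of the target month); other calendars could reasonably choose differently, which is rather the point.

```python
import calendar
from datetime import datetime, timedelta

def add_months(dt, months):
    """Hypothetical calendar rule: shift by whole months, clamping the day."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

start = datetime(2011, 1, 31)
day = timedelta(days=1)

# Fixed-length deltas: the grouping of the additions does not matter.
stepwise_days = (start + day) + day
direct_days = start + (day + day)

# Calendar-defined spans: two one-month steps differ from one two-month step.
stepwise_months = add_months(add_months(start, 1), 1)  # Jan 31 -> Feb 28 -> Mar 28
direct_months = add_months(start, 2)                   # Jan 31 -> Mar 31
```

With this clamping rule, adding one month twice lands on March 28, while adding two months at once lands on March 31.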
About the NEP:
A representation is also supported such that the stored date-time
integer can encode both the number of a particular unit as well as a
number of sequential events tracked for each unit.
I'm not sure I understand what this really means, but I _think_ I agree
with Pierre that this is unnecessary complication - couldn't it be
handled by multiple arrays, or maybe a structured dtype?
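For what it's worth, a structured dtype along these lines might look like the sketch below. It assumes a NumPy version in which `datetime64[s]` is available as a field type, and the field names `when` and `event` are made up for illustration.

```python
import numpy as np

# Hypothetical layout: a timestamp plus a sequence number for events
# that share the same timestamp.
record = np.dtype([("when", "datetime64[s]"), ("event", "i4")])

t0 = np.datetime64("2011-06-02T11:57:52")
t1 = np.datetime64("2011-06-02T11:57:53")

samples = np.array([(t0, 0), (t0, 1), (t1, 0)], dtype=record)

# Select all events recorded within the same second.
same_second = samples[samples["when"] == t0]
```

The event counter lives in its own field, so the datetime field stays a plain point in time.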
The datetime64 represents an absolute time. Internally it is represented
as the number of time units between the intended time and the epoch
(12:00am on January 1, 1970 --- POSIX time including its lack of leap
seconds).
The CF netcdf metadata standard provides for times to be specified as
"units since a_date_time". units can be seconds, hours, days, etc (it
does allow months and years, but it shouldn't!). This is a nice, flexible
system that makes it easy to capture the wildly different scales needed:
from nanoseconds to millennia. Similarly, we might want to consider a
datetime dtype as containing a reference datetime, and a tic unit.
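As a rough illustration of the "units since a_date_time" idea, here is a minimal decoder using only the standard library. The `decode_cf_times` function and the `TICK` table are hypothetical names, the origin format is assumed to be `YYYY-MM-DD HH:MM:SS`, and months and years are deliberately left out.

```python
from datetime import datetime, timedelta

# Fixed-length tick units only.
TICK = {
    "seconds": timedelta(seconds=1),
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}

def decode_cf_times(units, values):
    """Decode '<unit> since <origin>' plus numeric offsets into datetimes."""
    unit, since, origin_text = units.split(" ", 2)
    if since != "since":
        raise ValueError(f"expected '<unit> since <datetime>', got {units!r}")
    origin = datetime.strptime(origin_text, "%Y-%m-%d %H:%M:%S")
    return [origin + value * TICK[unit] for value in values]

times = decode_cf_times("hours since 2011-06-02 00:00:00", [0, 6, 12.5])
```

The stored values are just numbers; the reference datetime and the tick unit together give them meaning.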
I think the "Time units" section does specify that you can use various
units, but it looks like the NEP sticks with the single POSIX epoch.
I see later in the NEP:
However, after thinking more about this, we found that the combination
of an absolute datetime64 with a relative timedelta64 does offer the
same functionality while removing the need for the additional origin
metadata. This is why we have removed it from this proposal.
hmmm -- I don't think that's the case -- you need the "origin" if you
want to represent something like nanoseconds as a datetime, far away
from the epoch. Sure, you can supply your own by keeping the origin and
a timedelta array separately, but you could do that for all uses, also,
and the point of this is to make working with datetimes easy. If we're
going to allow different units, we might as well have different "origins".
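A back-of-the-envelope sketch of why the origin matters, assuming the value is stored as a signed 64-bit tick count. The helper names are hypothetical, and the 365.25-day year is only a nominal figure for estimating the range.

```python
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 86400  # nominal year, for a rough estimate only
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}

def reach_in_years(unit):
    """Approximate distance from the origin that a 64-bit tick count can span."""
    return INT64_MAX / TICKS_PER_SECOND[unit] / SECONDS_PER_YEAR

def representable(year, origin_year, unit):
    """Rough check: is `year` within reach of `origin_year` at this resolution?"""
    return abs(year - origin_year) < reach_in_years(unit)

ns_reach = reach_in_years("ns")                       # a little over 292 years
from_posix_epoch = representable(1066, 1970, "ns")    # too far from 1970
from_custom_origin = representable(1066, 1000, "ns")  # fine with a nearby origin
```

With nanosecond ticks the reach is only about 292 years either side of the origin, so a fixed 1970 epoch cannot address medieval dates at that resolution, while a nearby origin can.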
I also don't think that units like "month", "year", "business day"
should be allowed -- it just adds confusion. It's not a killer if they
are defined in the spec:
1 year = 365.25 days (for instance)
1 month = 1 year / 12
But I think it's better to simply disallow them, and keep that use for
what I'm calling the "Calendar" functions. And "business day" is
particularly ugly, and, I'm sure, defined differently in different places.
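To show what such fixed definitions would imply, here is a quick sketch using the 365.25-day year given above as an example. The constant names are my own, and this is not how any existing library defines these units.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86400
YEAR_SECONDS = 36525 * SECONDS_PER_DAY // 100  # 365.25 days
MONTH_SECONDS = YEAR_SECONDS // 12             # 30.4375 days

start = datetime(2011, 1, 1)

# Adding a fixed "month" or "year" does not land on a calendar boundary.
one_fixed_month = start + timedelta(seconds=MONTH_SECONDS)
one_fixed_year = start + timedelta(seconds=YEAR_SECONDS)
```

Here one fixed month after January 1, 2011 is January 31 at 10:30, and one fixed year later is January 1, 2012 at 06:00, which is well defined but is not what most people mean by "a month later".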
Christopher Barker, Ph.D.
Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception