[Numpy-discussion] RFC: A (second) proposal for implementing some date/time types in NumPy
Fri Jul 25 15:47:02 CDT 2008
Could you clarify a couple of points?
If I understand properly, your datetime64 would be time units from the POSIX
epoch (1970/01/01 00:00:00), right? So:
+7d would be 1970/01/08 (7 days after the epoch)
-7W would be 1969/11/13 (7*7 days before the epoch)
With this approach, a series [1,2,3,7] at a resolution 'd' would correspond to
1970/01/02, 1970/01/03, 1970/01/04 and 1970/01/08, right?
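For illustration, the offsets above can be checked against the datetime64 type that later shipped in NumPy. This is only a sketch: that type postdates this thread and may differ from the proposal under discussion, and the variable names are made up for the example.

```python
import numpy as np

# Assumption: this uses the datetime64/timedelta64 types as they exist in
# released NumPy, not necessarily the semantics of the proposal.
epoch = np.datetime64("1970-01-01", "D")

plus_7d = epoch + np.timedelta64(7, "D")    # 7 days after the epoch
minus_7w = epoch - np.timedelta64(7, "W")   # 7 * 7 days before the epoch

# Integer offsets at day resolution, counted from the epoch (offset 0).
series = epoch + np.array([1, 2, 3, 7], dtype="timedelta64[D]")

print(plus_7d, minus_7w)
print(series)
```

With this convention, offset 0 is the epoch itself, so an integer n at resolution 'd' lands n days after 1970/01/01.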
I'm all for that, **AS LONG AS we have a business day resolution** 'b', so
+7b would be 1970/01/12 (1970/01/01 was a Thursday, so the count skips two
weekends).
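As a point of comparison, released NumPy does not expose a 'b' resolution on datetime64, but it does provide business-day helper functions. A minimal sketch, assuming the default Monday-to-Friday week mask and no holidays:

```python
import numpy as np

# Assumption: default weekmask (Mon-Fri) and an empty holiday list.
epoch = np.datetime64("1970-01-01", "D")   # a Thursday

# Step forward 7 valid business days from the epoch.
plus_7b = np.busday_offset(epoch, 7)

# Count the business days in the half-open interval [epoch, plus_7b).
n_business_days = np.busday_count(epoch, plus_7b)

print(plus_7b, n_business_days)
```

Counting from offset 0 at the epoch, the seventh business day falls on a Monday because two weekends are skipped along the way.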
I like your idea of a timedelta64 being relative, but in that case, why not
have the same resolutions as datetime64?
We can currently perform the following operations in scikits.timeseries:
>>> import scikits.timeseries as ts
>>> series = ts.date_array(['1970-01', '1970-02', '1970-09'], freq='M')
>>> series
DateArray([Jan-1970, Feb-1970, Sep-1970], freq='M')
>>> series.asfreq('A')
DateArray([1970, 1970, 1970], freq='A-DEC')
>>> series.asfreq('A-MAR')
DateArray([1970, 1970, 1971], freq='A-MAR')
"A-MAR" means that year YY ends on 03/31 and that year (YY+1) starts on 04/01.
I use that a lot in my work, when I need to average daily data by water years
(a water year usually starts on 04/01 and ends on the following 03/31).
How would I do that with datetime64 and timedelta64?
Apart from that, I'd of course be quite happy to help as much as I can.
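One possible approach, sketched with the datetime64 type that later shipped in NumPy (an assumption, since it postdates this thread): derive the calendar year and the month of the year by casting between resolutions, then shift the year label for months that fall after the closing month. The helper name `year_ending_in` is hypothetical and not part of NumPy or scikits.timeseries.

```python
import numpy as np

def year_ending_in(months, end_month=3):
    """Label each month with the year in which its 12-month period ends.

    Hypothetical helper mimicking an 'A-MAR'-style label when end_month=3:
    year YY ends on 03/31 of YY and year YY+1 starts on 04/01.
    """
    months = np.asarray(months, dtype="datetime64[M]")
    years = months.astype("datetime64[Y]")           # truncate to the calendar year
    calendar_year = years.astype(np.int64) + 1970    # years are stored as offsets from 1970
    month_of_year = (months - years).astype(np.int64) + 1   # 1..12
    # Months after the closing month belong to the period ending next year.
    return calendar_year + (month_of_year > end_month)

months = np.array(["1970-01", "1970-02", "1970-09"], dtype="datetime64[M]")
labels = year_ending_in(months, end_month=3)
print(labels)
```

Daily values could then be averaged per label, for example by grouping on the returned integers, which keeps the period logic outside the dtype itself.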
On Friday 25 July 2008 07:09:33 Francesc Alted wrote:
> Well, as there were no replies to our second proposal for the date/time
> dtype, I assume that everybody agrees with it ;-) At any rate, we would
> like to proceed with the implementation phase very soon now.
> However, it happens that Enthought is sponsoring this job and they
> clearly stated that the implementation should cover the needs of as
> many users as possible. So, in particular, we would like some
> of the heaviest users of date/time objects, i.e. the TimeSeries
> authors, to be comfortable with the new date/time dtypes, and
> especially to be able to benefit from them.
> For this goal, we are proposing a decoupling of the date/time use cases
> in two different groups:
> 1. A pure ``datetime`` dtype (absolute or relative) that would be useful
> for timestamping purposes in general (i.e. registering dates without a
> need that they be evenly spaced in time).
> 2. A class based on the ``frequency`` concept that would be useful for
> measurements that are done on a regular basis or in business
> With this, we are preventing the dtype implementation at the core of
> NumPy from being too cluttered with the relatively complex needs of the
> ``frequency`` concept users, factoring it out to an external class
> (``Date`` to follow the TimeSeries naming convention). More
> importantly, this decoupling will also avoid mixing those two
> concepts which, although both are about time measurements, have
> quite different meanings.
> Another important advantage of this distinction is that the ``datetime``
> timestamp requires less meta-information to worry about (basically,
> the 'resolution' property), while a ``frequency`` à la TimeSeries will
> need additional meta-information, like the 'start' and 'end' of
> periods, as well as a more complex way to code frequencies (there
> are many more time periods to be coded, as can be seen in [1]_).
> This can be very important for allowing NumPy data based on the
> ``datetime`` dtype to be quickly saved to and retrieved from databases
> like ZODB (object database) or PyTables (HDF5-based database).
> Our ultimate goal is that the ``Date`` and ``DateArray`` classes in
> TimeSeries would be rewritten in terms of the new date/time dtype so as
> to take advantage of its features and also to get rid of duplicated
> code. I honestly think that this can be a big advantage for TimeSeries
> (at the cost of taking some time to do the migration).
> Does that approach make sense for people?
> .. [1] http://scipy.org/scipy/scikits/wiki/TimeSeries#Frequencies