[Numpy-discussion] NumPy date/time types and the resolution concept
Francesc Alted
faltet@pytables....
Mon Jul 14 08:07:47 CDT 2008
Hi,
Before giving more thought to the new proposal for the NumPy date/time
types, based on the concept of 'resolution', I'd like to gather more
feedback from you about it.
After pondering the feedback on the first proposal, the idea we
are incubating is to complement ``datetime64`` with a 'resolution'
metainfo. The ``datetime64`` will still be based on an int64 type, but
the meaning of the 'ticks' would depend on a 'resolution' property.
This is best seen with an example:
In [21]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'sec'))
Out [21]: [1970-01-01T00:00:00, 1970-01-01T00:00:01,
1970-01-01T00:00:02]
In [22]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'hour'))
Out [22]: [1970-01-01T00, 1970-01-01T01, 1970-01-01T02]
i.e. the 'resolution' gives the actual meaning to the 'int64' counter.
The advantage of this abstraction is that the user can easily choose the
resolution that best fits their needs. I'm thinking of
providing the following resolutions:
["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec", "min",
"hour", "month", "year"]
Also, together with the absolute ``datetime64`` one can have a relative
counterpart, say, ``timedelta64``, that also supports the notion
of 'resolution'. Between the two, one would cover most needs
while giving the user a lot of flexibility, IMO. We very much
prefer this new approach to the one stated in our first proposal.
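To make the 'resolution' idea a bit more concrete, here is a rough
pure-Python sketch of how the same integer counter could be read under
different resolutions. It is only an illustration, not the proposed
implementation: the names ``EPOCH``, ``FIXED`` and ``ticks_to_datetime``
are hypothetical, and the sub-microsecond resolutions are left out
because the standard ``datetime`` module cannot represent them.

```python
from datetime import datetime, timedelta

# Hypothetical epoch for the absolute type, matching the example above.
EPOCH = datetime(1970, 1, 1)

# Fixed-length resolutions that the stdlib datetime can represent exactly.
# (femtosec/picosec/nanosec are omitted: datetime stops at microseconds.)
FIXED = {
    "microsec": timedelta(microseconds=1),
    "millisec": timedelta(milliseconds=1),
    "sec": timedelta(seconds=1),
    "min": timedelta(minutes=1),
    "hour": timedelta(hours=1),
}


def ticks_to_datetime(ticks, resolution):
    """Interpret an integer tick count under the given resolution."""
    if resolution in FIXED:
        return EPOCH + ticks * FIXED[resolution]
    if resolution == "month":
        # Calendar unit: count whole months since 1970-01.
        years, months = divmod(ticks, 12)
        return datetime(1970 + years, 1 + months, 1)
    if resolution == "year":
        # Calendar unit: count whole years since 1970.
        return datetime(1970 + ticks, 1, 1)
    raise ValueError("unsupported resolution: %r" % (resolution,))


# The same counter values mean different instants under each resolution.
print([ticks_to_datetime(i, "sec").isoformat() for i in range(3)])
print([ticks_to_datetime(i, "hour").isoformat() for i in range(3)])
print(ticks_to_datetime(14, "month").isoformat())
```

In this sketch a relative value falls out naturally as the difference of
two absolute ones, e.g. ``ticks_to_datetime(5, "hour") -
ticks_to_datetime(2, "hour")`` is a ``timedelta`` of three hours.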
Now comes the tricky part: how to integrate the notion
of 'resolution' with the 'dtype' data type factory of NumPy? Well, we
have thought of a couple of possibilities.
1) Using the NumPy 'dtype' factory:
nanoabs = numpy.dtype('datetime64', resolution="nanosec")
nanorel = numpy.dtype('timedelta64', resolution="nanosec")
2) Extending the string notation by using the '[]' square brackets:
nanoabs = numpy.dtype('datetime64[nanosec]') # long notation
nanoabs = numpy.dtype('T[nanosec]') # short notation
nanorel = numpy.dtype('timedelta64[nanosec]') # long notation
nanorel = numpy.dtype('t[nanosec]') # short notation
With these building blocks, one may obtain more complex dtype structures
easily.
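As a rough idea of what option 2 would involve on the parsing side,
here is a small pure-Python sketch that splits such a string into a
type name and a resolution. It is only an assumption of how this could
look, not NumPy code: ``parse_time_spec``, ``KINDS`` and ``RESOLUTIONS``
are hypothetical names, and the accepted resolutions are simply the ones
listed earlier in this message.

```python
import re

# Resolutions proposed earlier in this message.
RESOLUTIONS = ["femtosec", "picosec", "nanosec", "microsec", "millisec",
               "sec", "min", "hour", "month", "year"]

# Long and short names from option 2, mapped to the long type name.
KINDS = {
    "datetime64": "datetime64", "T": "datetime64",
    "timedelta64": "timedelta64", "t": "timedelta64",
}

# A type name followed by a resolution in square brackets.
_SPEC = re.compile(r"^(datetime64|timedelta64|T|t)\[(\w+)\]$")


def parse_time_spec(spec):
    """Split e.g. 'T[nanosec]' into ('datetime64', 'nanosec')."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError("not a date/time type string: %r" % (spec,))
    kind, resolution = match.groups()
    if resolution not in RESOLUTIONS:
        raise ValueError("unknown resolution: %r" % (resolution,))
    return KINDS[kind], resolution


# Long and short notations resolve to the same (type, resolution) pair.
print(parse_time_spec("datetime64[nanosec]"))
print(parse_time_spec("T[nanosec]"))
print(parse_time_spec("t[hour]"))
```

The point of the sketch is that the bracket notation stays a plain
string, so it could be used anywhere a type string is accepted today.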
Now, the question is: would this proposal conflict with the
spirit of the current 'dtype' factory? And another important one:
would it complicate the implementation too much?
If the answer to both of the previous questions is 'no', then we will study
this further and provide another proposal based on it. BTW, I suppose
that the best candidate to answer these would be Travis O., but if
anybody feels brave enough ;-) please go ahead and give your advice.
Cheers,
--
Francesc Alted