[Numpy-discussion] NumPy date/time types and the resolution concept

Francesc Alted faltet@pytables....
Mon Jul 14 08:07:47 CDT 2008


Hi,

Before giving more thought to the new proposal for the date/time types 
for NumPy, based on the concept of 'resolution', I'd like to gather more 
feedback from you about it.

After pondering the feedback on the first proposal, the idea we 
are incubating is to complement ``datetime64`` with a 'resolution' 
metainfo.  The ``datetime64`` will still be based on an int64 type, but 
the meaning of the 'ticks' would depend on a 'resolution' property.  
This is best seen with an example:

In [21]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'sec'))
Out [21]: [1970-01-01T00:00:00, 1970-01-01T00:00:01, 
1970-01-01T00:00:02]

In [22]: numpy.arange(3, dtype=numpy.dtype('datetime64', 'hour'))
Out [22]: [1970-01-01T00, 1970-01-01T01, 1970-01-01T02]

i.e. the 'resolution' gives the actual meaning to the 'int64' counter.
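
To make the idea concrete without touching NumPy at all, here is a rough 
pure-Python sketch of how the same integer counter could be read under 
different resolutions.  The ``TICK`` table and the ``ticks_to_datetime`` 
helper are hypothetical names used only for illustration; the epoch of 
1970-01-01 is taken from the example above, and only a few fixed-length 
resolutions are covered.

```python
from datetime import datetime, timedelta

# Hypothetical table: the length of one tick for a few fixed-length
# resolutions (calendar units such as "month" and "year" are left out).
TICK = {
    "microsec": timedelta(microseconds=1),
    "millisec": timedelta(milliseconds=1),
    "sec": timedelta(seconds=1),
    "min": timedelta(minutes=1),
    "hour": timedelta(hours=1),
}

# Tick 0 corresponds to this instant, as in the example output above.
EPOCH = datetime(1970, 1, 1)


def ticks_to_datetime(ticks, resolution):
    """Interpret an integer counter as an absolute time at `resolution`."""
    return EPOCH + ticks * TICK[resolution]


# The same counter values 0, 1, 2 read under two different resolutions.
as_seconds = [ticks_to_datetime(i, "sec").isoformat() for i in range(3)]
as_hours = [ticks_to_datetime(i, "hour").isoformat() for i in range(3)]
print(as_seconds)
print(as_hours)
```

The stored integers are identical in both cases; only the interpretation 
attached to them changes.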

The advantage of this abstraction is that the user can easily choose the 
scale of resolution that best fits their needs.  I'm thinking of 
providing the following resolutions:

["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec", "min",
"hour", "month", "year"]
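
Choosing a resolution is also a trade-off against the time span that a 
signed 64-bit counter can cover.  The following back-of-the-envelope 
sketch computes the reachable span on either side of the epoch for the 
sub-second and second resolutions.  The names are illustrative, and the 
year length is the mean Gregorian year, which is an approximation.

```python
# Ticks per second for the sub-second and second resolutions listed above.
TICKS_PER_SECOND = {
    "femtosec": 10**15,
    "picosec": 10**12,
    "nanosec": 10**9,
    "microsec": 10**6,
    "millisec": 10**3,
    "sec": 1,
}

MAX_INT64 = 2**63 - 1
SECONDS_PER_YEAR = 365.2425 * 24 * 3600  # mean Gregorian year (approximation)


def half_span_seconds(resolution):
    """Seconds reachable on one side of the epoch with a signed int64 counter."""
    return MAX_INT64 / TICKS_PER_SECOND[resolution]


for name in TICKS_PER_SECOND:
    secs = half_span_seconds(name)
    years = secs / SECONDS_PER_YEAR
    print(f"{name:>9}: +/- {secs:.3e} s (about {years:.3e} years)")
```

Under these assumptions, femtosecond ticks reach only a few hours around 
the epoch, while nanosecond ticks reach a few hundred years.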

Also, together with the absolute ``datetime64`` one can have a relative 
counterpart, say, ``timedelta64``, that also supports the notion 
of 'resolution'.  Between the two, one would cover the needs of most 
uses, while providing the user with a lot of flexibility, IMO.  We very 
much prefer this new approach to the one stated in our first proposal.
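
As a rough illustration of how the absolute and relative types could 
interact, here is a toy pure-Python model.  The ``AbsTime`` and 
``RelTime`` classes are hypothetical stand-ins for ``datetime64`` and 
``timedelta64``, and the rule that both operands must share a resolution 
is an assumption of this sketch, not something the proposal has settled.

```python
from dataclasses import dataclass


def _same_resolution(a, b):
    # Assumption: operands must share a resolution; no automatic conversion.
    if a.resolution != b.resolution:
        raise ValueError(f"resolution mismatch: {a.resolution} vs {b.resolution}")


@dataclass(frozen=True)
class RelTime:
    """Toy stand-in for timedelta64: a tick count with no epoch attached."""
    ticks: int
    resolution: str


@dataclass(frozen=True)
class AbsTime:
    """Toy stand-in for datetime64: a tick count measured from the epoch."""
    ticks: int
    resolution: str

    def __add__(self, other):
        # absolute + relative -> absolute
        if not isinstance(other, RelTime):
            return NotImplemented
        _same_resolution(self, other)
        return AbsTime(self.ticks + other.ticks, self.resolution)

    def __sub__(self, other):
        # absolute - absolute -> relative
        if isinstance(other, AbsTime):
            _same_resolution(self, other)
            return RelTime(self.ticks - other.ticks, self.resolution)
        # absolute - relative -> absolute
        if isinstance(other, RelTime):
            _same_resolution(self, other)
            return AbsTime(self.ticks - other.ticks, self.resolution)
        return NotImplemented


start = AbsTime(10, "sec")
later = start + RelTime(5, "sec")
elapsed = later - start
print(later, elapsed)
```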

Now comes the tricky part: how to integrate the notion 
of 'resolution' with the 'dtype' data type factory of NumPy?  Well, we 
have thought of a couple of possibilities.

1) Using the NumPy 'dtype' factory:

nanoabs = numpy.dtype('datetime64', resolution="nanosec")
nanorel = numpy.dtype('timedelta64', resolution="nanosec")

2) Extending the string notation by using the '[]' square brackets:

nanoabs = numpy.dtype('datetime64[nanosec]')  # long notation
nanoabs = numpy.dtype('T[nanosec]')  # short notation
nanorel = numpy.dtype('timedelta64[nanosec]')  # long notation
nanorel = numpy.dtype('t[nanosec]')  # short notation
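
To give a feeling for what option 2) would ask of the string parser, here 
is a small pure-Python sketch that splits such a specification into a 
type name and a resolution.  The ``parse_time_dtype`` function is 
hypothetical and is not NumPy code; the accepted resolution names are the 
ones listed earlier in this message, and rejecting anything else is an 
assumption of the sketch.

```python
import re

# Short notation from option 2) mapped to the long type names.
ALIASES = {"T": "datetime64", "t": "timedelta64"}

# Resolution names as listed earlier in this message.
RESOLUTIONS = {
    "femtosec", "picosec", "nanosec", "microsec", "millisec",
    "sec", "min", "hour", "month", "year",
}

_SPEC = re.compile(r"(datetime64|timedelta64|T|t)\[(\w+)\]")


def parse_time_dtype(spec):
    """Split e.g. 'datetime64[nanosec]' into ('datetime64', 'nanosec')."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a date/time dtype string: {spec!r}")
    kind, resolution = match.groups()
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return ALIASES.get(kind, kind), resolution


print(parse_time_dtype("datetime64[nanosec]"))
print(parse_time_dtype("t[hour]"))
```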

With these building blocks, one may obtain more complex dtype structures 
easily.

Now, the question is:  would this proposal conflict with the 
spirit of the current 'dtype' factory?  And another important one: 
would it complicate the implementation too much?

If the answer to both of the previous questions is 'no', then we will 
study this further and provide another proposal based on it.  BTW, I 
suppose that the best candidate to answer these would be Travis O., but 
if anybody feels brave enough ;-) please go ahead and give your advice.

Cheers,

-- 
Francesc Alted

