[Numpy-discussion] The date/time dtype and the casting issue
Francesc Alted
faltet@pytables....
Tue Jul 29 08:12:52 CDT 2008
Hi,
While preparing the date/time proposals, and during the subsequent
discussions on this list, we have changed our minds a couple of times
about how casting should work between the different date/time types
and the different time units (previously called resolutions). So I'd
like to lay out this issue in detail here and offer yet another
proposal, in order to gather feedback from the community before
consolidating it into the final date/time proposal.
Casting proposal for date/time types
====================================
The operations among the proposed date/time types can be divided into
three groups:
* Absolute time versus relative time
* Absolute time versus absolute time
* Relative time versus relative time
Now, here are our considerations for each case:
Absolute time versus relative time
----------------------------------
We think that in this case the absolute time should have priority in
determining the time unit of the outcome. That is what people will
want most of the time. For example, this would allow:
>>> series = numpy.array(['1970-01-01', '1970-02-01', '1970-09-01'],
dtype='datetime64[D]')
>>> series2 = series + numpy.timedelta64(2, 'Y') # Add 2 relative years
>>> series2
array(['1972-01-01', '1972-02-01', '1972-09-01'],
dtype='datetime64[D]') # the 'D'ay time unit has been chosen
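The intended result can be sketched in plain Python, without the proposed dtypes. The helper name `add_years` is hypothetical and only illustrates the rule that the absolute operand keeps its day unit while the relative years are applied as calendar years:

```python
from datetime import date

def add_years(days, years):
    """Shift each date by whole calendar years, keeping day resolution.

    Hypothetical helper, not part of NumPy. date.replace() raises
    ValueError for Feb 29 shifted onto a non-leap year; a real
    implementation would need a policy for that case.
    """
    return [d.replace(year=d.year + years) for d in days]

series = [date(1970, 1, 1), date(1970, 2, 1), date(1970, 9, 1)]
series2 = add_years(series, 2)  # add 2 relative years
print([d.isoformat() for d in series2])
# ['1972-01-01', '1972-02-01', '1972-09-01']
```

The outcome is still expressed in days, which is what giving priority to the absolute operand means here.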
Absolute time versus absolute time
----------------------------------
When operating on two absolute times with different time units
(basically, only subtraction will be allowed), we propose that the
outcome be an exception. This is because the ranges and timespans of
the different time units can be very different, and it is not at all
clear which time unit the user would prefer. For example, this should
be allowed:
>>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[Y]")
array([1, 1, 1], dtype="timedelta64[Y]")
But the following should not:
>>> numpy.ones(3, dtype="T8[Y]") - numpy.zeros(3, dtype="T8[ns]")
raise numpy.IncompatibleUnitError # what unit to choose?
Relative time versus relative time
----------------------------------
This case would be the same as the previous one (absolute vs
absolute). Our proposal is to forbid this operation if the time units
of the operands are different. For example, this should be allowed:
>>> numpy.ones(3, dtype="t8[Y]") + 3*numpy.ones(3, dtype="t8[Y]")
array([4, 4, 4], dtype="timedelta64[Y]")
But the following should not:
>>> numpy.ones(3, dtype="t8[Y]") + numpy.zeros(3, dtype="t8[fs]")
raise numpy.IncompatibleUnitError # what unit to choose?
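As a rough illustration of the same-unit rule, here is a minimal pure-Python sketch. The `RelTime` class is hypothetical, and `IncompatibleUnitError` is only a stand-in for the exception name used above; neither is an existing NumPy object:

```python
from dataclasses import dataclass

class IncompatibleUnitError(TypeError):
    """Stand-in for the exception named in the proposal (hypothetical)."""

@dataclass(frozen=True)
class RelTime:
    """Toy relative time: an integer count plus a time unit string."""
    value: int
    unit: str

    def __add__(self, other):
        if not isinstance(other, RelTime):
            return NotImplemented
        if self.unit != other.unit:
            # Refuse to guess which unit the user wants for the result.
            raise IncompatibleUnitError(
                f"cannot combine unit {self.unit!r} with {other.unit!r}")
        return RelTime(self.value + other.value, self.unit)

same = RelTime(1, "Y") + RelTime(3, "Y")  # allowed: both operands use 'Y'

try:
    RelTime(1, "Y") + RelTime(0, "fs")    # forbidden: 'Y' versus 'fs'
except IncompatibleUnitError as exc:
    error_message = str(exc)
```

The same check would apply to the subtraction of two absolute times.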
Introducing a time casting function
-----------------------------------
As forbidding operations between absolute/absolute and
relative/relative types with different units can be unacceptable in
many situations, we are proposing an explicit casting mechanism so
that the user can state the desired time unit of the outcome. For
this, a new NumPy function, called, say, ``numpy.change_unit()`` (this
name is for the purposes of the discussion and can be changed) will be
provided. The signature for the function will be:
change_unit(time_object, new_unit, reference)
where 'time_object' is the time object whose unit is to be
changed, 'new_unit' is the desired new time unit, and 'reference' is an
absolute date that makes the conversion of relative times possible
when the original time unit spans an uncertain number of smaller time
units (a relative year or month does not correspond to a fixed number
of days). For example, that would allow:
>>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
array([365, 730], dtype="datetime64[d]")
or:
>>> ref = numpy.datetime64('1971', 'T[Y]')
>>> numpy.change_unit( numpy.array([1,2], 't[Y]'), 't[d]', ref )
array([365, 731], dtype="timedelta64[d]")
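The arithmetic behind such a conversion can be checked with the standard library. This is only a sketch of one possible reading: the function names are hypothetical, and it assumes that absolute years count from 1970 and that a relative span starts on January 1st of the reference year:

```python
from datetime import date

EPOCH_YEAR = 1970  # assumption: absolute years are counted from 1970

def abs_years_to_days(years):
    """Days since the epoch for each absolute year count (hypothetical)."""
    epoch = date(EPOCH_YEAR, 1, 1)
    return [(date(EPOCH_YEAR + y, 1, 1) - epoch).days for y in years]

def rel_years_to_days(years, ref_year):
    """Length in days of each span of relative years starting at ref_year."""
    start = date(ref_year, 1, 1)
    return [(date(ref_year + y, 1, 1) - start).days for y in years]

print(abs_years_to_days([1, 2]))        # [365, 730]
print(rel_years_to_days([1, 2], 1971))  # [365, 731]
```

The second result depends on the reference because the span starting in 1971 includes the leap year 1972, which is why relative years need a reference date to be converted to days.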
Note: we decided against using the ``.astype()`` method because the
additional 'reference' parameter would look out of place in the other
typical uses of ``.astype()``.
Opinions?
--
Francesc Alted