[Numpy-discussion] RFC: A proposal for implementing some date/time types in NumPy

Christopher Barker Chris.Barker@noaa....
Mon Jul 14 12:46:53 CDT 2008


Matt Knox wrote:
> The DateArray class in the timeseries scikits can do part of what you want.
> Observe...

>>>> a.year
> array([2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008,
>        2008, 2008, 2008, 2008])
>>>> a.hour
> array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,  0,  1])
>>>> a.day
> array([12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13])

This is great for what I often need: to output data in a format with 
columns of:

year, month, day, hour, min, sec

But I also often need to be able to convert a "TimeDelta" to a 
particular unit, for example (using the standard library's datetime module):

 >>> import datetime
 >>> td = datetime.datetime(2008, 7, 14, 12) - datetime.datetime(2008, 7, 13, 10)
 >>> td
datetime.timedelta(1, 7200)

so we have a timedelta of one day, 7200 seconds.

I'd like:
 >>> td.as_hours
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
AttributeError: 'datetime.timedelta' object has no attribute 'as_hours'

which doesn't exist in the datetime module, so I do:

 >>> hours = td.days*24 + td.seconds/3600.
 >>> hours
26.0

I find myself writing this code a lot, so I'd love to have it built in.
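
For what it's worth, a small helper along these lines would cover the case above. It is only a sketch: `as_hours` is a hypothetical name, not something the datetime module provides, and it relies on dividing one timedelta by another, which needs Python 3.2 or later.

```python
import datetime


def as_hours(td):
    """Return a timedelta as a float number of hours (hypothetical helper)."""
    # Dividing one timedelta by another yields a float ratio (Python 3.2+).
    return td / datetime.timedelta(hours=1)


td = datetime.datetime(2008, 7, 14, 12) - datetime.datetime(2008, 7, 13, 10)
hours = as_hours(td)
print(hours)
```

The same division works for any unit, e.g. `td / datetime.timedelta(minutes=1)`, which is part of why a single generic conversion may be preferable to one attribute per unit.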

Which brings up an issue:

The reason it isn't built in is that the philosophy behind the datetime 
module is that it provides the building blocks with which to build more 
feature-full packages. Personally, I really wish it had a bit more built 
in, but what can we do?

As for the numpy datetime types, we need to decide how much to build in. 
I think the kind of functionality described here is pretty basic, and 
should be included, but if we include everyone's idea of basic, it could 
get pretty bloated!

> I would encourage you to take a look at the wiki
> (http://scipy.org/scipy/scikits/wiki/TimeSeries) as you may find some surprises
> in there that prove useful.

So maybe we should have very little in the numpy datetime type, and have 
scikits.TimeSeries as a more feature-full package built on top of it.

> Would many people be interested in seeing this kind 
> of string date parsing integrated in the native NumPy types?

I think that supporting more than one string format is a feature for a meta package.

>  the idea we 
> are incubating is to complement the ``datetime64`` with a 'resolution' 
> metainfo.  The ``datetime64`` will still be based on a int64 type, but 
> the meaning of the 'ticks' would depend on a 'resolution' property.  

I like this! Would there be conversion between different resolutions 
available? I wonder what the syntax for that should be?
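
To make the question concrete, here is a rough pure-Python sketch of what converting integer ticks between resolutions involves. Everything in it is an assumption for illustration: the function name `convert_ticks`, the unit codes, and the choice to floor when going to a coarser resolution are not taken from the proposal.

```python
# Microseconds per tick for a few resolutions (illustrative subset only).
US_PER_TICK = {
    "us": 1,
    "ms": 10**3,
    "s": 10**6,
    "m": 60 * 10**6,
    "h": 3600 * 10**6,
    "D": 86400 * 10**6,
}


def convert_ticks(ticks, from_res, to_res):
    """Convert an integer tick count from one resolution to another.

    Converting to a coarser resolution floors the result, so precision
    can be lost; converting to a finer one is exact.
    """
    total_us = ticks * US_PER_TICK[from_res]
    return total_us // US_PER_TICK[to_res]


in_seconds = convert_ticks(26, "h", "s")  # finer resolution: exact
in_days = convert_ticks(26, "h", "D")     # coarser resolution: floored
print(in_seconds, in_days)
```

Whatever the syntax ends up being, the lossy direction (hours to days here) is the one where the behaviour needs to be spelled out.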

>  And 
> definitely, "offset" would be similar to "origin".  So yes, we will try 
> to introduce both concepts.

yup -- origin is critical!

What resolution (and numerical format) do you use to express the origin? 
Even if your data is in days, you may want to specify the origin with 
more precision, so as not to have confusion about what "0 days" means in 
some higher resolution unit. Also, if you want picosecond resolution, 
then the origin needs to be picosecond resolution as well.

Thanks for working on this -- I'm looking forward to using it!

-Chris




-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

