[Numpy-discussion] fixing up datetime

Chris Barker Chris.Barker@noaa....
Thu Jun 2 23:56:39 CDT 2011


On 6/2/11 12:57 PM, Robert Kern wrote:
>>> Anyhow, years and months are simple enough.
>>
>> no, they are not -- they are fundamentally different than hours, days, etc.
>
> That doesn't make them *difficult*.

I won't comment on how difficult it is -- I'm not writing the code. My 
core point is that they are different, and that should be clear in the API.


> It's tricky to convert between
> months and hours, yes, but that's not the only operation that we're
> looking to represent. A *ton* of important time series are in these
> calendrical units (monthly, semi-monthly, quarterly, yearly, etc.). If
> you collect economic data on a monthly basis, you don't need to
> unambiguously convert "January 2011" to a microsecond timestamp. You
> do need to be able to "add 6 months", "add 1 year", and "roll up each
> 3 month period into quarters", etc. Each of those operations *is*
> unambiguous.

I'm not all that sure that there isn't ambiguity in there, actually. But 
anyway, the ambiguity really shows up if people use the SAME dtypes, 
APIs, etc for these types of operations as they do for regular old time 
series.

> The data simply doesn't represent microsecond-sized
> events.

true -- and this brings up another point -- would it be useful to have a 
way to capture the precision? Kind of like significant figures? I 
suppose that is implied by one's choice of units in a datetime:

"weeks since a datetime" certainly implies something different than 
"microseconds since a datetime", but maybe it should be more formal.
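
As a rough sketch of what "more formal" could look like, here is how the 
unit travels with the dtype, assuming the datetime64/timedelta64 API as it 
exists in current NumPy (which may differ from what was on the table in 
this thread). The variable names are made up for illustration:

```python
import numpy as np

# The same elapsed time, stored at two different resolutions.
three_weeks = np.timedelta64(3, "W")
same_in_us = np.timedelta64(3 * 7 * 24 * 3600 * 10**6, "us")

# The unit is part of the dtype, so it can be read back as a crude
# "precision" tag: a (unit, count) tuple.
weeks_unit = np.datetime_data(three_weeks.dtype)  # ('W', 1)
us_unit = np.datetime_data(same_in_us.dtype)      # ('us', 1)

# The durations compare equal, but the dtypes record different precision.
same_duration = bool(three_weeks == same_in_us)
same_dtype = three_weeks.dtype == same_in_us.dtype
```

Note that this only records the storage unit; it is not a real 
significant-figures mechanism.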

This is where the proposed approach is better than the CF standard -- 
with CF, most folks store times as floating point values, so the concept 
of precision is lost, even if your time units are hours rather than 
microseconds.

> The machinery to handle both is basically the same inside their areas
> of applicability;

I'm confused about that -- I expect the machinery to be quite different 
for "Calendar" calculations than for "straight time" calculations. 
Months, years, etc. are different lengths depending on when they occur.

Or are you suggesting that if we keep the precision never higher than it 
should be, then it's easy?

i.e. if you are storing your datetimes and time deltas as integer 
months, then you don't need to care that some months are 30 days, and 
some 31 (or 28, or 29).
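
A minimal sketch of that idea, assuming current NumPy's month-resolution 
datetime64 (again, not necessarily the API being proposed here). The 
monthly values are invented, and the reshape trick for quarters assumes 
the series starts on a quarter boundary and covers whole quarters:

```python
import numpy as np

# A month-resolution timestamp: stored as an integer count of months.
jan = np.datetime64("2011-01", "M")

# "add 6 months" and "add 1 year" never need to know month lengths.
six_months_later = jan + np.timedelta64(6, "M")  # 2011-07
one_year_later = jan + np.timedelta64(1, "Y")    # 2012-01

# "roll up each 3 month period into quarters" for one year of
# made-up monthly values (1, 2, ..., 12).
months = np.arange("2011-01", "2012-01", dtype="datetime64[M]")
monthly_values = np.arange(1.0, 13.0)
quarterly = monthly_values.reshape(-1, 3).sum(axis=1)
```

Everything here stays in whole months, so the 28/29/30/31 question never 
comes up.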

I'd have to think on that more -- my experience is not with financial 
data -- so that may make sense there, but in the sciences I'm familiar 
with (hydrology, oceanography, meteorology, ...), people are very quick 
to convert months to days or hours, and want to differentiate time 
series, etc., and then this stuff does get messy. And this applies to 
data that is similar to economic data: monthly rainfall, monthly mean 
wind speed, whatever.
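
To illustrate where it gets messy, here is a sketch of turning monthly 
rainfall totals into mean daily rates, which needs the actual length of 
each month. It assumes current NumPy's datetime64 behavior; the rainfall 
numbers and variable names are invented:

```python
import numpy as np

# Month boundaries for 2011, plus the start of January 2012.
edges = np.arange("2011-01", "2012-02", dtype="datetime64[M]")

# Casting to day resolution pins each month to its first day, so the
# differences give the real month lengths (2011 is not a leap year).
days_per_month = np.diff(edges.astype("datetime64[D]")) / np.timedelta64(1, "D")

# Made-up monthly rainfall totals in mm, all equal on purpose.
monthly_rain_mm = np.full(12, 62.0)

# Equal monthly totals do not give equal daily rates.
daily_rate_mm = monthly_rain_mm / days_per_month
```

Here January comes out at 2.0 mm/day while February is higher, even 
though the monthly totals are identical.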

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

