[Numpy-discussion] fixing up datetime

Mark Wiebe mwwiebe@gmail....
Thu Jun 2 11:22:32 CDT 2011


On Thu, Jun 2, 2011 at 10:36 AM, Pierre GM <pgmdevlist@gmail.com> wrote:

>
> On Jun 1, 2011, at 11:16 PM, Mark Wiebe wrote:
>
> > On Wed, Jun 1, 2011 at 3:52 PM, Charles R Harris
> > <charlesr.harris@gmail.com> wrote:
> >
> > <snip>
> > Just a quick comment, as this really needs more thought, but time is a
> > bag of worms.
> >
> > Certainly a bag of worms, I agree.
>
> Oh yes... Keep in mind that the inclusion of time in numpy was tricky at
> the beginning. Travis O. started to work on that at the same time as a GSoC
> student Jared and I were supervising a couple of years ago. This student
> tried to implement in numpy some of the ideas Matt Knox and I had put in
> scikits.timeseries. I myself tried to port the new dtype to
> scikits.timeseries, but ran into problems on the C side (I've never been
> able to completely subclass an array in C...), and then life got in the
> way. Anyhow.
> I would advise you to check the very experimental git version of
> scikits.timeseries on github; it can give you some ideas of what (not) to
> do.


Cool, I'll take a look.


> > This part works fairly well now, except for some questions like what
> > datetime("2011-01-30", "D") + timedelta(1, "M") should produce. Maybe
> > "2011-02-28", or "2011-03-02"?
>
> Great example. What rules do you have for adding dates and deltas of
> different frequencies? What's the frequency of the result?


I'm following what I understand the NEP to mean for combining dates and
deltas of different units. For two timedeltas, the resulting metadata becomes
more precise: specifically, it becomes the GCD of the input metadata. Between
a timedelta and a datetime, the datetime's metadata always dominates.

https://github.com/numpy/numpy/blob/master/doc/neps/datetime-proposal.rst
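
To make the GCD rule for timedelta metadata concrete, here is a minimal
pure-Python sketch. It only illustrates the rule as I described it, not the
actual numpy code; the UNIT_US table and the combine_timedelta_meta helper
are hypothetical names made up for this example, and metadata is modelled
as a (multiplier, unit) pair over the linear units only.

```python
from math import gcd

# Length of each linear unit in microseconds, ordered coarsest to finest.
# Years, months and business days are left out on purpose: they have no
# fixed length, so a plain GCD does not apply to them.
UNIT_US = {
    "W": 604_800_000_000,
    "D": 86_400_000_000,
    "h": 3_600_000_000,
    "m": 60_000_000,
    "s": 1_000_000,
    "ms": 1_000,
    "us": 1,
}


def combine_timedelta_meta(a, b):
    """Return the (multiplier, unit) metadata that is the GCD of a and b."""
    span = gcd(a[0] * UNIT_US[a[1]], b[0] * UNIT_US[b[1]])
    # Express the common span in the coarsest unit that divides it evenly.
    for unit, length in UNIT_US.items():
        if span % length == 0:
            return span // length, unit


print(combine_timedelta_meta((2, "h"), (90, "m")))  # (30, 'm')
print(combine_timedelta_meta((1, "D"), (1, "h")))   # (1, 'h')
```

The result is always at least as fine as either input, which is what "the
metadata becomes more precise" amounts to.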

Only Years, Months, and Business Days have a nonlinear relationship with the
other units, so they're the only problem case for this. They can be
arbitrarily special-cased based on what is decided to make the most sense.
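
As an illustration of why months are a problem case, the two candidate
answers for "2011-01-30" plus one month correspond to two different
policies. The sketch below uses only the Python standard library; the
function names add_months_clamp and add_months_overflow are hypothetical,
and neither is meant as a statement of what numpy does or should do.

```python
import calendar
from datetime import date, timedelta


def _target_year_month(d, months):
    """Return (year, month) reached by moving `months` months from d."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    return year, month0 + 1


def add_months_clamp(d, months):
    """Add months, clamping the day to the last day of the target month."""
    year, month = _target_year_month(d, months)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def add_months_overflow(d, months):
    """Add months, letting excess days spill into the following month."""
    year, month = _target_year_month(d, months)
    return date(year, month, 1) + timedelta(days=d.day - 1)


start = date(2011, 1, 30)
print(add_months_clamp(start, 1))     # 2011-02-28
print(add_months_overflow(start, 1))  # 2011-03-02
```

Both policies agree whenever the day of the month exists in the target
month; they only differ for days 29 to 31.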

As an aside, I don't actually like the name 'frequency' for the metadata,
'unit' sounds better to me. A frequency to me means a number with a
particular form of unit, such as Hz or (1/s).

>
> > but parsing dates and times
> >
> > I've implemented an almost-ISO 8601 date-time parser. I had to deviate a
> > bit to support years outside the little 10000-year window we use. I think
> > supporting more formats could be handled by writing a function which takes
> > its date format and outputs ISO 8601 format to feed numpy.
>
> Good to have a nice parser independent of eGenix's, but I don't think it
> should be the first priority.
>

Yeah, I wasn't planning on implementing such a parser, at least in the near
term.
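
For what it's worth, the "convert to ISO 8601 first" idea quoted above can
be sketched with the standard library alone. This is a minimal example
under assumptions: to_iso8601 is a hypothetical helper name, and because
Python's datetime only covers years 1 through 9999, it does not handle the
extended year range mentioned above.

```python
from datetime import datetime


def to_iso8601(text, fmt):
    """Parse `text` with a strptime-style format, return an ISO 8601 string."""
    return datetime.strptime(text, fmt).isoformat()


# The resulting strings are plain ISO 8601 and could then be handed to an
# ISO-only parser.
print(to_iso8601("30/01/2011", "%d/%m/%Y"))              # 2011-01-30T00:00:00
print(to_iso8601("02.06.2011 11:22", "%d.%m.%Y %H:%M"))  # 2011-06-02T11:22:00
```

Keeping the format handling in a separate function like this means the
core parser only ever has to understand one format.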

> Other comments:
> * I've never been really happy with the idea of putting a number of events
> in the dtype. I think it makes things more complicated than they already
> are, for no big advantage (if you really need to keep track of several
> events falling in the same unit of time, why don't you make several
> arrays?). If there's a will to change the API, I'd be glad to see this
> part dropped...
>

The NEP says that events were a requirement of a commercial sponsor. Does
anyone else have opinions about the feature?


> * I quite agree with Mark W. It'd be great if there could be some basic
> mechanism to handle holidays in numpy, for example using the metadata. It'd
> be up to the user to decide what a holiday is (or holihour...). Wouldn't it
> be too much overhead for most users, though? And those holidays would
> affect __add__/__sub__ and?
>

In the current design, holidays would only be a feature of business days,
nothing else. Sorry, no holihours. ;)

I think the business days functionality might have to deviate from the
general pattern quite a bit to work naturally for what it does. I imagine
people will want to sometimes get a NaT, sometimes fall forward to the next
valid business day, and sometimes fall backwards to the closest prior valid
business day.
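
The three behaviours above could look roughly like the following sketch.
It is pure Python and purely illustrative: roll_to_business_day, the "how"
argument and its values are hypothetical names, None stands in for NaT,
and a Monday-to-Friday week is assumed.

```python
from datetime import date, timedelta


def is_business_day(d, holidays):
    """True if d is Monday to Friday and not a user-supplied holiday."""
    return d.weekday() < 5 and d not in holidays


def roll_to_business_day(d, how, holidays=frozenset()):
    """Resolve d to a valid business day according to `how`.

    how="nat"      -> None (standing in for NaT) if d is not valid
    how="forward"  -> next valid business day
    how="backward" -> closest prior valid business day
    """
    if how not in ("nat", "forward", "backward"):
        raise ValueError("unknown rolling policy: %r" % (how,))
    if is_business_day(d, holidays):
        return d
    if how == "nat":
        return None
    step = timedelta(days=1 if how == "forward" else -1)
    while not is_business_day(d, holidays):
        d += step
    return d


saturday = date(2011, 6, 4)
print(roll_to_business_day(saturday, "nat"))       # None
print(roll_to_business_day(saturday, "forward"))   # 2011-06-06
print(roll_to_business_day(saturday, "backward"))  # 2011-06-03
```

Passing holidays={date(2011, 6, 6)} would make the "forward" case land on
2011-06-07 instead, since the holiday set is entirely up to the user.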


> * As Chuck pointed out, users in different fields won't have the same
> expectations. In environmental sciences, there's no need for leap seconds
> nor business days. Astronomers and financial analysts have different
> needs...


Is there anyone watching this thread with specific needs for environmental
sciences, astronomy, or other areas which might have an interest?

-Mark