[Numpy-discussion] fixing up datetime
Thu Jun 2 10:36:20 CDT 2011
On Jun 1, 2011, at 11:16 PM, Mark Wiebe wrote:
> On Wed, Jun 1, 2011 at 3:52 PM, Charles R Harris <email@example.com> wrote:
> Just a quick comment, as this really needs more thought, but time is a bag of worms.
> Certainly a bag of worms, I agree.
Oh yes... Keep in mind that the inclusion of time in numpy was tricky at the beginning. Travis O. started to work on that at the same time as a GSoC student that Jared and I were supervising a couple of years ago. This student tried to implement in numpy some of the ideas Matt Knox and I had put in scikits.timeseries. I myself tried to port the new dtype to scikits.timeseries, but ran into problems on the C side (I've never been able to completely subclass an array in C...), and then life got in the way. Anyhow.
I would advise you to check the very experimental git version of scikits.timeseries on github, it can give you some ideas of what (not) to do.
> This part works fairly well now, except for some questions like what should datetime("2011-01-30", "D") + timedelta(1, "M") produce. Maybe "2011-02-28", or "2011-03-02"?
Great example. What rules do you have for adding dates and deltas of different frequencies? What's the frequency of the result?
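To make the two candidate answers concrete, here is a minimal sketch using only the Python standard library, not the numpy datetime dtype. The function names `add_month_clamped` and `add_month_overflow` are made up for illustration; they just spell out the two conventions mentioned in the quoted example (clamp to the end of the target month, or let the extra days spill into the next month).

```python
import calendar
from datetime import date, timedelta


def _target_month(d: date, months: int):
    """Return (year, month, last_day) of the month `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return year, month, last_day


def add_month_clamped(d: date, months: int = 1) -> date:
    """Hypothetical rule 1: clamp the day to the end of the target month."""
    year, month, last_day = _target_month(d, months)
    return date(year, month, min(d.day, last_day))


def add_month_overflow(d: date, months: int = 1) -> date:
    """Hypothetical rule 2: days past the end of the target month spill over."""
    year, month, last_day = _target_month(d, months)
    overflow = max(d.day - last_day, 0)
    return date(year, month, min(d.day, last_day)) + timedelta(days=overflow)


start = date(2011, 1, 30)
print(add_month_clamped(start))   # 2011-02-28
print(add_month_overflow(start))  # 2011-03-02
```

Neither rule is obviously right, which is the point: whichever one is chosen has to be documented, along with the frequency of the result.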
> but parsing dates and times
> I've implemented an almost-ISO 8601 date-time parser. I had to deviate a bit to support years outside the little 10000-year window we use. I think supporting more formats could be handled by writing a function which takes its date format and outputs ISO 8601 format to feed numpy.
Good to have a nice parser independent of eGenix's, but I don't think it should be the first priority.
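For what it's worth, the "convert to ISO 8601 first" idea could look something like the sketch below. It assumes the input format is known in advance and uses only the standard library; `to_iso8601` is a hypothetical helper name, not something in numpy. Note that Python's `datetime` only covers years 1 to 9999, so this does nothing for the extended year range mentioned above.

```python
from datetime import datetime


def to_iso8601(text: str, fmt: str) -> str:
    """Parse `text` with a strptime-style format and return an ISO 8601 string.

    Hypothetical helper: the resulting string is what would then be handed
    to an ISO 8601 parser such as the one described in the quoted message.
    """
    return datetime.strptime(text, fmt).isoformat()


print(to_iso8601("02/06/2011 10:36", "%d/%m/%Y %H:%M"))  # 2011-06-02T10:36:00
```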
* I've never been really happy with the idea of putting a number of events in the dtype. I think it makes things more complicated than they already are, for no big advantage (if you really need to keep track of several events falling in the same unit of time, why don't you make several arrays?). If there's a will to change the API, I'd be glad to see this part dropped...
* I quite agree with Mark W. It'd be great if there could be some basic mechanism to handle holidays in numpy, for example using the metadata. It'd be up to the user to decide what a holiday is (or holihour...). Wouldn't it be too much overhead for most users, though? And those holidays would affect __add__/__sub__ and ?
* As Chuck pointed out, users in different fields won't have the same expectations. In environmental sciences, there's no need for leap seconds nor business days. Astronomers and financial analysts have different needs...
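On the holiday point, here is a rough pure-Python sketch of what "holidays affecting addition" could mean, assuming the user supplies the set of holidays and the weekend days. The name `add_business_days` and its parameters are made up for illustration; this is not a numpy API, and a real implementation in the dtype would presumably carry the holiday list in the metadata.

```python
from datetime import date, timedelta


def add_business_days(start: date, n: int, holidays=frozenset(), weekend=(5, 6)) -> date:
    """Step forward (n > 0) or backward (n < 0) by n working days.

    `holidays` is a user-defined collection of dates to skip; `weekend`
    lists the weekday numbers (Monday == 0) that never count as working days.
    """
    step = 1 if n >= 0 else -1
    current = start
    remaining = abs(n)
    while remaining:
        current += timedelta(days=step)
        # Only days that are neither weekend days nor holidays are counted.
        if current.weekday() not in weekend and current not in holidays:
            remaining -= 1
    return current


holidays = {date(2011, 7, 4)}
print(add_business_days(date(2011, 7, 1), 1))            # 2011-07-04
print(add_business_days(date(2011, 7, 1), 1, holidays))  # 2011-07-05
```

The day-by-day loop is fine as an illustration but shows where the overhead question comes from: every addition has to consult the holiday list.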