[Numpy-discussion] fixing up datetime
Tue Jun 7 18:24:38 CDT 2011
On Tue, Jun 7, 2011 at 3:56 PM, Pierre GM <firstname.lastname@example.org> wrote:
> On Jun 7, 2011, at 5:54 PM, Robert Kern wrote:
> > On Tue, Jun 7, 2011 at 07:34, Dave Hirschfeld <email@example.com> wrote:
> >> I'm not convinced about the events concept - it seems to add complexity
> >> for something which could be accomplished better in other ways. A [Y]//4
> >> dtype is better specified as [3M] dtype, a [D]//100 is an [864S]. There
> >> may well be a good reason for it; however, I can't see the need for it
> >> in my own applications.
> > Well, [D/100] doesn't represent [864s]. It represents something that
> > happens 100 times a day, but not necessarily at precise regular
> > intervals. For example, suppose that I am representing payments that
> > happen twice a month, say on the 1st and 15th of every month, or the
> > 5th and 20th. I would use [M/2] to represent that. It's not [2W], and
> > it's not [15D]. It's twice a month.
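To make the distinction above concrete, here is a minimal sketch using present-day NumPy datetime64 arithmetic. It does not use the 'events' metadata under discussion; the date range and the variable names are arbitrary choices for illustration. It builds payments on the 1st and 15th of each month and shows that the gaps between them are not constant, so no fixed-length unit such as [15D] or [2W] reproduces the schedule.

```python
import numpy as np

# Months to cover (2011-01 through 2011-03); the range is an arbitrary choice.
months = np.arange('2011-01', '2011-04', dtype='datetime64[M]')

# Payment dates: the 1st and the 15th of each month.
first = months.astype('datetime64[D]')
fifteenth = first + np.timedelta64(14, 'D')
payments = np.sort(np.concatenate([first, fifteenth]))

# Gaps between consecutive payments, in days.
gaps = np.diff(payments).astype('int64')
print(payments)
print(gaps)
```

The gaps alternate with the length of each month, which is the sense in which "twice a month" is an event count rather than a duration.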
> I understand that, that was how the concept was developed in the first
> place. I still wonder if it's that necessary. I would imagine that a structured
> type could do the same job, without too much hassle (I'm thinking of a
> ('D',100), like you can have a (np.float,100), for example...)
It appears to me that a structured dtype with some further NumPy extensions
could entirely replace the 'events' metadata fairly cleanly. If the ufuncs
are extended to operate on structured arrays, and integers modulo n are
added as a new dtype, a dtype like [('date', 'M8[D]'), ('event', 'i8[mod
100]')] could replace the current 'M8[D]//100'.
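As a rough sketch of that idea with what NumPy offers today: there is no "integer modulo n" dtype and ufuncs do not operate on structured arrays, so a plain 'i8' field stands in for the event counter and a helper does the wrap-around by hand. The names event_dtype, EVENTS_PER_DAY and add_events are hypothetical, chosen only for this example.

```python
import numpy as np

# Hypothetical layout: a plain int64 stands in for the proposed
# "integer modulo 100" dtype, which NumPy does not provide.
EVENTS_PER_DAY = 100
event_dtype = np.dtype([('date', 'M8[D]'), ('event', 'i8')])

def add_events(arr, n):
    """Advance each element by n events, carrying overflow into the date."""
    total = arr['event'] + n
    out = np.empty_like(arr)
    # Floor division gives the whole days to carry; it also handles negative n.
    out['date'] = arr['date'] + (total // EVENTS_PER_DAY).astype('m8[D]')
    out['event'] = total % EVENTS_PER_DAY
    return out

stamps = np.array([('2011-06-07', 98), ('2011-06-30', 5)], dtype=event_dtype)
later = add_events(stamps, 3)
print(later)
```

The first element rolls over into the next day (event 98 + 3 becomes event 1 on 2011-06-08), while the second stays within its day.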
> > The default conversions may seem to imply that [D/100] is equivalent
> > to [864s], but they are not intended to.
> Well, like Chris suggested (I think), we could prevent type conversion when
> the denominator is not 1...
I'd like to remove the '/100' functionality here; I don't think it gives any
benefit. Saying 'M8[3M]' is better than 'M8[Y/4]', in my opinion, and maybe
adding 'M8[Q]' and removing the number in front of the unit would be good.
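For reference, a small sketch of how the multiplied form looks in present-day NumPy: a dtype string such as 'M8[3M]' parses into a base unit plus a count. The quarter-start calculation below is done by hand on month counts so that it does not depend on any particular casting behaviour; the sample dates are arbitrary.

```python
import numpy as np

# A multiple of a base unit parses as (unit, count).
quarterly = np.dtype('M8[3M]')
print(np.datetime_data(quarterly))

# Quarter starts computed by hand from month counts since 1970-01.
months = np.array(['2011-02', '2011-06', '2011-12'], dtype='M8[M]')
month_index = months.astype('int64')
quarter_start = (month_index // 3 * 3).astype('M8[M]')
print(quarter_start)
```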
> > They are just a starting
> > point for one to write one's own, more specific conversions.
> > Similarly, we have default conversions from low frequencies to high
> > frequencies defaulting to representing the higher precision event at
> > the beginning of the low frequency interval. E.g. for days->seconds,
> > we assume that the day is representing the initial second at midnight
> > of that day. We then use offsets to allow the user to add more
> > information to specify it more precisely.
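The days-to-seconds default described above matches what current NumPy does when casting between units, as this short sketch shows. The particular date and offset are arbitrary.

```python
import numpy as np

day = np.datetime64('2011-06-07', 'D')

# Casting to seconds lands on the first second of the day (midnight).
as_seconds = day.astype('datetime64[s]')
print(as_seconds)

# An explicit offset pins down a later moment within that day.
offset = np.timedelta64(18, 'h') + np.timedelta64(24, 'm')
print(as_seconds + offset)
```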
> As Dave H. summarized, we used a basic keyword to do the same thing in
> scikits.timeseries, with the addition of some subfrequencies like A-SEP to
> represent a year starting in September, for example. It works, but it's
> really not an elegant solution.
This kind of thing definitely belongs in a layer above datetime.
> On Jun 7, 2011, at 6:53 PM, Christopher Barker wrote:
> > Pierre GM wrote:
> >> Using the ISO as reference, you have a good definition of months.
> > Yes, but only one. There are others. For instance, the climate modelers
> > like to use a calendar that has 360 days a year: twelve 30-day months. That
> > way they get something with the same timescale as months and years, but
> > have nice, linear, easy to use units (differentiable, and all that).
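A minimal sketch of why the 360-day calendar is easy to work with: every month and year has a fixed length, so a date maps to a day count with plain integer arithmetic. The function names and the choice of year 0 as the origin are assumptions made for this example, not part of any library.

```python
DAYS_PER_MONTH = 30
MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = DAYS_PER_MONTH * MONTHS_PER_YEAR  # 360

def to_day_number(year, month, day):
    """Days since day 1 of month 1 of year 0 in the 360-day calendar."""
    return year * DAYS_PER_YEAR + (month - 1) * DAYS_PER_MONTH + (day - 1)

def from_day_number(n):
    """Inverse of to_day_number; returns (year, month, day)."""
    year, rem = divmod(n, DAYS_PER_YEAR)
    month, day = divmod(rem, DAYS_PER_MONTH)
    return year, month + 1, day + 1

# One month later is always exactly 30 days later.
print(to_day_number(2011, 7, 7) - to_day_number(2011, 6, 7))
print(from_day_number(to_day_number(2011, 6, 7)))
```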
> Ah Chris... That's easy for climate models, but when you have to deal with
> actual station data ;) More seriously, that's the kind of specific case that
> could be achieved by subclassing an array with a standard unit (like, days).
> I do understand your arguments for not recognizing a standard Gregorian
> calendar month as a universal unit. Nevertheless, it can be converted to
> days without ambiguity in the ISO8601 system.
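The last point can be illustrated with current NumPy: once a specific Gregorian month is named, its starting day and its length in days are fully determined. This is only a sketch, and month_span_in_days is a hypothetical helper name.

```python
import numpy as np

def month_span_in_days(month):
    """Return (first day, number of days) for a month given as 'YYYY-MM'."""
    start = np.datetime64(month, 'M')
    first_day = start.astype('datetime64[D]')
    next_first_day = (start + np.timedelta64(1, 'M')).astype('datetime64[D]')
    n_days = int((next_first_day - first_day) / np.timedelta64(1, 'D'))
    return first_day, n_days

for m in ('2011-02', '2012-02', '2011-06'):
    print(m, month_span_in_days(m))
```

The length depends on which month is meant, so the conversion is unambiguous for a month on the calendar but not for a bare duration of "one month".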