[Numpy-discussion] timezones and datetime64
Wed Apr 3 11:51:38 CDT 2013
Mark Wiebe and I are both still tracking NumPy development and can provide
context and even help when needed. Apologies if we've left a different
impression. We have to be prudent about the time we spend as we have
other projects we are pursuing as well, but we help clients with NumPy
issues all the time and are eager to continue to improve the code base.
It seems to me that the biggest issue is just the automatic conversion that
is occurring on string or date-time input. We should stop using the local
time-zone (explicit is better than implicit strikes again) and not use any
time-zone unless time-zone information is provided in the string. I am
definitely +1 on that.
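As a rough illustration of that rule, here is a minimal sketch in plain Python rather than in NumPy's own parsing code. The helper name `parse_timestamp` and its return shape are assumptions made for this example only; it relies on the standard library's `datetime.fromisoformat` (Python 3.7+). A string without an offset keeps its fields untouched, and only a string that carries an explicit offset is converted to UTC.

```python
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse ISO 8601 text without ever guessing a time zone.

    Returns (datetime, is_aware). Hypothetical helper for illustration;
    this is not NumPy's parser.
    """
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # No offset in the string: stay naive, apply no conversion at all.
        return dt, False
    # Explicit offset in the string: normalize to UTC using that offset only.
    return dt.astimezone(timezone.utc), True


naive, naive_flag = parse_timestamp("2013-04-03T11:51:38")
aware, aware_flag = parse_timestamp("2013-04-03T11:51:38-05:00")
```

The machine's local time zone never enters either result, so the same string should parse identically on every host.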
It may also be necessary to carry another flag in the data-type to
indicate whether the date-time is naive (not time-zone aware) or
time-zone aware, so that string printing does not print a time-zone for a
value that didn't have one to begin with.
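To make the flag idea concrete, here is a small sketch, again in plain Python and not a proposal for the actual dtype layout. The class name `DateTimeValue` and the field `tz_aware` are hypothetical; the sketch assumes the stored value is a plain datetime that is interpreted as UTC only when the flag is set.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DateTimeValue:
    """Hypothetical stand-in for a datetime64 element plus an awareness flag."""

    value: datetime   # stored without tzinfo; read as UTC when tz_aware is True
    tz_aware: bool    # the extra flag carried alongside the value

    def __str__(self):
        text = self.value.isoformat()
        # Print an offset only if the input had time-zone information.
        return (text + "+0000") if self.tz_aware else text


naive_text = str(DateTimeValue(datetime(2014, 1, 1), tz_aware=False))
aware_text = str(DateTimeValue(datetime(2014, 1, 1), tz_aware=True))
```

With a flag like this, a naive input would round-trip through printing without picking up a `+0000` suffix it never had.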
If others agree that this is the best way forward, then Mark or I can
definitely help contribute a patch.
On Wed, Apr 3, 2013 at 9:38 AM, Dave Hirschfeld wrote:
> Nathaniel Smith <njs <at> pobox.com> writes:
> > On Wed, Apr 3, 2013 at 2:26 PM, Dave Hirschfeld
> > <dave.hirschfeld <at> gmail.com> wrote:
> > >
> > > This isn't acceptable for my use case (in a multinational company) and
> > > there is no reasonable way around it other than bypassing the numpy
> > > conversion by setting the dtype to object, manually parsing the strings
> > > and creating an array from the list of datetime objects.
> > Wow, that's truly broken. I'm sorry.
> > I'm skeptical that just switching to UTC everywhere is actually the
> > right solution. It smells like one of those solutions that's simple,
> > neat, and wrong. (I don't know anything about calendar-time series
> > handling, so I have no ability to actually judge this stuff, but
> > wouldn't one problem be if you want to know about business days/hours?
> > You lose the original day-of-year once you move everything to UTC.)
> > Maybe datetime dtypes should be parametrized by both granularity and
> > timezone? Or we could just declare that datetime64 is always
> > timezone-naive and adjust the code to match?
> > I'll CC the pandas list in case they have some insight. Unfortunately
> > AFAIK no-one who's regularly working on numpy at this point works with
> > datetimes, so we have limited ability to judge solutions... please
> > help!
> > -n
> I think simply setting the timezone to UTC if it's not specified would cover
> 99% of use cases because IIUC the internal representation is UTC so numpy would
> be doing no conversion of the dates that were passed in. It was the conversion
> which was the source of the error in my example.
> The only potential issue with this is that the dates might take along an
> incorrect UTC timezone, making it more difficult to work with naive datetimes:
> In : d = np.datetime64('2014-01-01 00:00:00', dtype='M8[ns]')
> In : d
> Out: numpy.datetime64('2014-01-01T00:00:00+0000')
> In : str(d)
> Out: '2014-01-01T00:00:00+0000'
> In : pydate(str(d))
> Out: datetime.datetime(2014, 1, 1, 0, 0, tzinfo=tzutc())
> In : pydate(str(d)) == datetime.datetime(2014, 1, 1)
> Traceback (most recent call last):
> File "<ipython-input-46-abfc0fee9b97>", line 1, in <module>
> pydate(str(d)) == datetime.datetime(2014, 1, 1)
> TypeError: can't compare offset-naive and offset-aware datetimes
> In : pydate(str(d)) == datetime.datetime(2014, 1, 1, tzinfo=tzutc())
> Out: True
> In : pydate(str(d)).replace(tzinfo=None) == datetime.datetime(2014, 1, 1)
> Out: True
> In this case it may be best to have numpy not try to set the timezone at all
> if none was specified. Given the internal representation is UTC I'm not sure
> this is feasible though, so defaulting to UTC may be the best solution.
Continuum Analytics, Inc.