[SciPy-User] scikits.timeseries question
Mon Nov 30 19:39:39 CST 2009
On Nov 30, 2009, at 8:16 PM, Christopher Barker wrote:
> nope -- not duplicated, but maybe there are missing ones. The point is
> that I have an array of "days since", and I want array of
> timeseries.dates (which is a DateArray, yes?)
Got it. Duplicated and/or missing dates correspond to the same problem: you can't assume that your dates are regularly spaced, so you can't use start_date and length.
>> np.array(...) + sd gives you a ndarray of Date objects (so its dtype
>> is np.object), and you use that as the input of date_array. The
>> frequency should be recognized properly.
> OK -- though it seems I SHOULD be able to go straight to an DateArray,
> and I'm still confused about what this means:
Well, that depends on the type of starting date, actually. If it's a Date, adding an ndarray to it will give you an ndarray of Date objects. If it's a DateArray of length 1, it'll give you a DateArray. (Note to self: we could probably be a bit more consistent on this one...)
>>> In : da = ts.date_array((1,2,3,4), start_date=sd)
>> Check the doc for date_array: the first argument can be
>> * an existing :class:`DateArray` object;
>> * a sequence of :class:`Date` objects with the same frequency;
>> * a sequence of :class:`datetime.datetime` objects;
>> * a sequence of dates in string format;
>> * a sequence of integers corresponding to the representation of
>> :class:`Date` objects.
> That's what I have: a sequence of integers corresponding to the
> representation of the Date objects (doesn't it represent them as "units
> since start date" where units is the "freq" ?
No, not exactly: the representation of a Date object is relative to an absolute built-in reference (Day #1 being 01/01/01). (Likewise, numpy.datetime64 uses the standard 1970/01/01.)
We can't have a variable reference as it would be far too messy too quickly. Instead, you have to use the trick start_date + ndarray of integers to get what you want.
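To make the "start date + offsets" idea concrete without scikits.timeseries installed, here is a minimal sketch using only the standard library's datetime module, whose ordinals also count from Day #1 = 01/01/01. The start date and the offsets are made-up values for illustration; with the package itself, the equivalent would be the start_date + ndarray trick described above.

```python
from datetime import date, timedelta

# Hypothetical inputs: a reference date and irregular "days since" offsets
# (note the gap between 2 and 5: missing days are fine).
start = date(2001, 1, 1)
days_since = [0, 1, 2, 5]

# Shift each offset by the start date to get absolute dates...
dates = [start + timedelta(days=n) for n in days_since]

# ...then take the absolute integer representation (days since 01/01/01).
ordinals = [d.toordinal() for d in dates]
```

The resulting integers no longer depend on the start date, which is the kind of absolute value a sequence-of-integers input refers to.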
> If that's not what if means, then what does it mean?
If you have a 'A' frequency, that'd be a sequence like 2001, 2002, ...
For a 'M' frequency, that'd be 24001 (for 2001/01), 24002 (for 2001/02)...
For a 'D' frequency, that'd be 730486, 730487... for 2001/01/01, 2001/01/02...
In other terms, the number of units since the absolute reference.
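The three examples above can be reproduced with a few lines of plain Python. This is only a sketch: the helper names are hypothetical, and the monthly formula is inferred from the example values quoted here, not taken from the scikits.timeseries source.

```python
from datetime import date

def annual_repr(d):
    # 'A' frequency: simply the year number.
    return d.year

def monthly_repr(d):
    # 'M' frequency: months elapsed since year 1, counting January of year 1 as 1.
    return (d.year - 1) * 12 + d.month

def daily_repr(d):
    # 'D' frequency: days since the reference, with 01/01/01 being day 1.
    # This matches the proleptic Gregorian ordinal of datetime.date.
    return d.toordinal()

jan_2001 = date(2001, 1, 1)
feb_2001 = date(2001, 2, 1)
```

For instance, monthly_repr(jan_2001) and monthly_repr(feb_2001) give the two monthly values listed above, and daily_repr(jan_2001) gives the first daily value.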
> hmm -- I see this:
> ts_lib.mov_average(data, span, dtype=None)
> Calculates the moving average of a series.
> data : array-like
> Input data, as a sequence or (subclass of) ndarray.
> Masked arrays and TimeSeries objects are also accepted.
> The input array should be 1D or 2D at most.
> If the input array is 2D, the function is applied on each
> I've got a 3-d array -- darn! Maybe I'll poke into it and see if it can
> be generalized.
3D? What are your actual variables? Keep in mind that when we talk about dimensions with time series, we zap the time one, so if you have a series of maps, your array is only 2D in our terminology.
If you have a time series of (lat, lon), mov_average will average your lats independently of your lons.
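As an illustration of what "independently" means here, the following pure-Python sketch applies a trailing moving average to each column of a (time, variable) table separately. It is not the ts_lib.mov_average implementation: the helper names are hypothetical, and filling the first span-1 entries with None is an assumption made for the sketch.

```python
def trailing_mean(column, span):
    """Trailing moving average of one variable over time.

    The first span-1 entries have no full window and are set to None.
    """
    out = [None] * len(column)
    for i in range(span - 1, len(column)):
        window = column[i - span + 1 : i + 1]
        out[i] = sum(window) / span
    return out

def mov_average_by_column(series, span):
    # series: one row per time step, one column per variable (e.g. lat, lon).
    columns = list(zip(*series))                    # transpose to per-variable
    averaged = [trailing_mean(col, span) for col in columns]
    return [list(row) for row in zip(*averaged)]    # transpose back

# Made-up (lat, lon) samples at three time steps.
track = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
smoothed = mov_average_by_column(track, span=2)
```

A series of maps with shape (time, lat, lon) could be handled the same way by first flattening each map into one row of lat*lon columns.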