[Scipy-svn] r2653 - in trunk/Lib/sandbox/timeseries: . examples tests

scipy-svn at scipy.org
Tue Jan 30 15:11:13 CST 2007


Author: pierregm
Date: 2007-01-30 15:11:09 -0600 (Tue, 30 Jan 2007)
New Revision: 2653

Modified:
   trunk/Lib/sandbox/timeseries/examples/example.wiki
   trunk/Lib/sandbox/timeseries/tdates.py
   trunk/Lib/sandbox/timeseries/tests/test_dates.py
   trunk/Lib/sandbox/timeseries/tests/test_multitimeseries.py
   trunk/Lib/sandbox/timeseries/tmulti.py
   trunk/Lib/sandbox/timeseries/tseries.py
Log:
removed outdated comments

Modified: trunk/Lib/sandbox/timeseries/examples/example.wiki
===================================================================
--- trunk/Lib/sandbox/timeseries/examples/example.wiki	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/examples/example.wiki	2007-01-30 21:11:09 UTC (rev 2653)
@@ -1,287 +1,2 @@
-
-== Requirements ==
-
-In order to use the `TimeSeries` package, you need to have the following packages installed:
-    * `numpy` 
-    * `maskedarray` : the new implementation where masked arrays are subclasses of regular ndarrays. You can find this version in the sandbox on the scipy SVN server
-    * `mx.DateTime` : This external module is needed to process some of the dates in C. Eventually, we hope to be able to get rid of it. In the meantime, sorry for the inconvenience.
-
-
-The `TimeSeries` package is organized as a group of modules. Two modules are especially useful: `tdates` for managing `Date` and `DateArray` objects, and `tseries`, for managing time series. 
-{{{#!python numbers=disable
->>> import numpy as N
->>> import maskedarray as MA
->>> import mx.DateTime
->>> import datetime
->>> import tseries as TS
->>> import tdates as TD
-}}}
-Note that you could also just import the generic `timeseries` package to achieve the same results.
-
-
-== Dates ==
-
-A `Date` object combines some date and/or time related information with a given frequency. You can picture the frequency as the unit into which the date is expressed. For example, the three objects:
-{{{#!python  numbers=disable
->>> D = TD.Date(freq='D', year=2007, month=1, day=1)
->>> M = TD.Date(freq='M', year=2007, month=1, day=1)
->>> Y = TD.Date(freq='A', year=2007, month=1, day=1)
-}}}
-use the same date information, but different frequencies. They correspond to the day 'Jan. 01, 2007', the month 'Jan. 2007' and the year '2007', respectively. The importance of the frequency will become clearer later on.
-~- A more technical note: `Date` objects are internally stored as integers. The conversion to integers and back is controlled by the frequency. In the example above, the internal representations of the three objects `D`, `M` and `Y` are 732677, 24073 and 2007, respectively. -~
-
-==== Construction of a `Date` object ====
-Several options are available to construct a Date object explicitly. In each case, the `frequency` argument must be given.
-
-    * Give appropriate values to any of the `year`, `month`, `day`, `quarter`, `hours`, `minutes`, `seconds` arguments.
-{{{#!python numbers=disable
->>> TD.Date(freq='Q',year=2004,quarter=3)
-<Q : 2004Q3>
->>> TD.Date(freq='D',year=2001,month=1,day=1)
-<D : 01-Jan-2001>
-}}}
-      
-    * Use the `string` keyword. This method calls the `mx.DateTime.Parser` submodule; more information is available in that submodule's documentation.
-{{{#!python numbers=disable      
->>> TD.Date('D', string='2007-01-01')
-<D : 01-Jan-2007>
-}}}      
-
-    * Use the `mxDate` keyword with an existing `mx.DateTime.DateTime` object, or even a `datetime.datetime` object.
-{{{#!python numbers=disable
->>> TD.Date('D', mxDate=mx.DateTime.now())
->>> TD.Date('D', mxDate=datetime.datetime.now())
-}}}
-
-
-==== Manipulating dates ====
-
-You can add and subtract integers from a `Date` object to get a new `Date` object. The frequency of the new object is the same as the original one. For example:
-{{{#!python numbers=disable
->>> mybirthday = D-1
-<D : 31-Dec-2006>
->>> infivemonths = M + 5
-<M : Jun-2007>
-}}}
-
-You can convert a `Date` object from one frequency to another with the `asfreq` method. When converting to a higher frequency (for example, from monthly to daily), the new object will fall on the earliest date possible by default. Thus, if you convert a daily `Date` to a monthly one and back to a daily one, you will lose your day information in the process:
-{{{#!python numbers=disable
->>> mybirthday.asfreq('M')
-<M: Dec-2006>
->>> mybirthday.asfreq('M').asfreq('D')
-<D: 01-Dec-2006>
-}}}
-
-Some other methods worth mentioning are:
-    * `toordinal` : converts an object to the equivalent proleptic Gregorian date
-    * `tostring`  : converts an object to the corresponding string.
-
-----
-
-== DateArray objects ==
-
-DateArrays are simply ndarrays of `Date` objects. They accept the same methods as a `Date` object, with the addition of:
-    * `tovalue` : converts the array to an array of integers. Each integer is the internal representation of the corresponding date
-    * `has_missing_dates` : outputs a boolean on whether some dates are missing or not. 
-    * `has_duplicated_dates` : outputs a boolean on whether some dates are duplicated or not.
-
-==== Construction ====
-
-To construct a `DateArray` object, you can call the class directly
-{{{#!python numbers=disable
-DateArray(dates=None, freq='U', copy=False)
-}}}
-where `dates` can be ''(i)'' an existing `DateArray`; ''(ii)'' a sequence of `Date` objects; ''(iii)'' a sequence of objects that `Date` can recognize (such as strings, integers, `mx.DateTime` objects...).
-Alternatively, you can use the `date_array` constructor:
-{{{#!python numbers=disable
-date_array(dlist=None, start_date=None, end_date=None, 
-           include_last=True, length=None,  freq=None)
-}}}
-If `dlist` is None, a new list of dates will be created from `start_date` and `end_date`. You should set `include_last` to True if you want `end_date` to be included. If `end_date` is None, then a series of length `length` will be created.
-
-
-----
-
-== TimeSeries ==
-
-A `TimeSeries` object is the combination of three ndarrays:
-    * `dates`: DateArray object.
-    * `data` : ndarray.
-    * `mask` : Boolean ndarray, indicating missing or invalid data.
-These three arrays can be accessed as attributes of a TimeSeries object. Another very useful attribute is `series`, which gives you direct access to `data` and `mask` as a masked array.
-
-==== Construction ====
-
-To construct a TimeSeries, you can use the class constructor:
-
-{{{#!python numbers=disable
-TimeSeries(data, dates=None, mask=nomask, 
-           freq=None, observed=None, start_date=None, 
-           dtype=None, copy=False, fill_value=None,
-           keep_mask=True, small_mask=True, hard_mask=False)
-}}}               
-where `data` is a sequence, an ndarray or a MaskedArray. If `dates` is None, a DateArray of the same length as the data is constructed at a `freq` frequency, starting at `start_date`.
-
-Alternatively, you can use the `time_series` function:
-{{{#!python numbers=disable
-time_series(data, dates=None, freq=None, 
-            start_date=None, end_date=None, length=None, include_last=True,
-            mask=nomask, dtype=None, copy=False, fill_value=None,
-            keep_mask=True, small_mask=True, hard_mask=False)    
-}}}
-
-Let us construct a series of 600 random elements, starting 600 business days ago, at a business daily frequency:
-{{{#!python numbers=disable
->>> data = N.random.uniform(-100,100,600)
->>> today = TD.thisday('B')
->>> series = TS.time_series(data, dtype=N.float_, freq='B', observed='SUMMED',
-...                         start_date=today-600)
-}}}
-We can check that `series.dates` is a `DateArray` object and that `series.series` is a `MaskedArray` object.
-{{{#!python numbers=disable
->>> isinstance(series.dates, TD.DateArray)
-True
->>> isinstance(series.series, MA.MaskedArray)
-True
-}}}
-So, if you are already familiar with `MaskedArray`, using `TimeSeries` should be straightforward. Just keep in mind that another attribute is always present, `dates`.
-
-
-==== Indexing ====
-
-Elements of a TimeSeries can be accessed just like with regular ndarrays. Thus,
-{{{#!python numbers=disable
->>> series[0]
-}}}
-outputs the first element, while
-{{{#!python numbers=disable
->>> series[-30:]
-}}}
-outputs the last 30 elements.
-
-But you can also use a date:
-{{{#!python numbers=disable
->>> thirtydaysago = today - 30
->>> series[thirtydaysago:]
-}}}
-or even a string...
-{{{#!python numbers=disable
->>> series[thirtydaysago.tostring():]
-}}}
-or a sequence/ndarray of integers... 
-{{{#!python numbers=disable
->>> series[[0,-1]]
-}}}
-~-This last form is quite useful: it gives you the first and last data of your series.-~
-
-In a similar way, setting elements of a TimeSeries works seamlessly.
-Let us set negative values to zero...
-{{{#!python numbers=disable
->>> series[series<0] = 0
-}}}
-... and the values falling on Fridays to 100
-{{{#!python numbers=disable
->>> series[series.day_of_week == 4] = 100
-}}}
-Note that we could also create a temporary array of days of the week for the 
-corresponding period, and use it as the condition.
-{{{#!python numbers=disable
->>> weekdays = TD.day_of_week(series)
->>> series[weekdays == 4] = 100
-}}}
-You should keep in mind that TimeSeries are basically MaskedArrays. If some data are masked, you will not be able to use a condition as an index; you will have to fill the data first.
-
-==== Operations on TimeSeries ====
-
-If you work with only one TimeSeries, you can use regular commands to process the data. For example:
-{{{#!python numbers=disable
->>> series_log = N.log(series)
-}}}
-Note that invalid values (negative, in that case), are automatically masked. Note also that you could use the corresponding function of the `maskedarray` module. This latter approach is actually recommended when you want to use the `reduce` and `accumulate` methods of some ufuncs (such as add or multiply). ~-The reason is that the methods of the numpy.ufuncs do not communicate well with subclasses: as they do not call the `__array_wrap__` method, there is no postprocessing.-~
-
-When working with multiple series, only series of the same frequency, size and starting date can be used in basic operations. The function `align_series` ~-(or its alias `aligned`)-~ forces series to have matching starting and ending dates. By default, the starting date will be set to the smallest starting date of the series, and the ending date to the largest.
-
-Let's construct a list of months, starting on Jan 2005 and ending on Dec 2006, with a gap from Oct 2005 to Jan 2006.
-{{{#!python numbers=disable
->>> mlist_1 = ['2005-%02i' % i for i in range(1,10)]
->>> mlist_1 += ['2006-%02i' % i for i in range(2,13)]
->>> mdata_1 = N.arange(len(mlist_1))
->>> mser_1 = TS.time_series(mdata_1, mlist_1, observed='SUMMED')
-}}}
-Note that the frequency is 'U', for undefined. In fact, you have to specify what kind of data is actually missing by forcing a given frequency.
-{{{#!python numbers=disable
->>> mser = mser_1.asfreq('M')
-}}}
-Let us check whether there are some duplicated dates (no):
-{{{#!python numbers=disable
->>> mser_1.has_duplicated_dates()
-False
-}}}
-...or missing dates (yes):
-{{{#!python numbers=disable
->>> mser_1.has_missing_dates()
-True
-}}}
-
-
-Let us construct a second monthly series, this time without missing dates
-{{{#!python numbers=disable
->>> mlist_2 = ['2004-%02i' % i for i in range(1,13)]
->>> mlist_2 += ['2005-%02i' % i for i in range(1,13)]
->>> mser_2 = TS.time_series(N.arange(len(mlist_2)), mlist_2, observed='SUMMED')
-}}}
-Let's try to add the two series:
-{{{#!python numbers=disable
->>> mser_3 = mser_1 + mser_2
-}}}
-That doesn't work, as the series have different starting dates. We need to align them first.
-{{{#!python numbers=disable
->>> (malg_1,malg_2) = aligned(mser_1, mser_2) 
-}}}
-That still doesn't work, as `malg_1` has missing dates. Let us fill it, then: data falling on a date that was missing will be masked.
-{{{#!python numbers=disable
->>> mser_1_filled = fill_missing_dates(mser_1)
->>> (malg_1,malg_2) = align_series(mser_1_filled, mser_2) 
-}}}
-Now we can add the two series. Only the data that fall on dates common to the original, non-aligned series will actually be added; the others will be masked. After all, we are adding masked arrays.
-{{{#!python numbers=disable
->>> mser_3 = malg_1 + malg_2
-}}}
-We could have filled the initial series first:
-{{{#!python numbers=disable
->>> mser_3 = filled(malg_1,0) + filled(malg_2,0)
-}}}
-When aligning the series, we could have forced the series to start/end at some given dates:
-{{{#!python numbers=disable
->>> (malg_1,malg_2) = aligned(mser_1_filled, mser_2, 
-...                           start_date='2004-06', end_date='2006-06')
-}}}
-
-
-==== TimeSeries Conversion ====
-
-To convert a TimeSeries to another frequency, use the `convert` method or function. The optional argument `func` must be a function that acts on a 1D masked array and returns a scalar. 
-{{{#!python numbers=disable
->>> mseries = series.convert('M',func=MA.average)
-}}}
-If `func` is not specified, the default value `'auto'` is used instead. In that case,
-the conversion function is estimated from the `observed` attribute of the series.
-For example, if `observed='SUMMED'`, then `func='auto'` is in fact `func=sum`.
-{{{#!python  numbers=disable
->>> mseries_default = series.convert('M')
-}}}
-If `func` is None, the convert method/function returns a 2D array, where each row corresponds to the new frequency, and the columns to the original data. In our example, `convert` will return a 2D array with 23 columns, as there are at most 23 business days per month.
-{{{#!python numbers=disable
->>> mseries_2d = series.convert('M',func=None)
-}}}
-When converting from a lower frequency to a higher frequency, an extra argument `position` is required. The value of the argument is either 'START' or 'END', and determines where the data point will be placed in the period. In the future, interpolation methods will be supported to fill in the resulting masked values.
-
-Let us create a second series, this time with a monthly frequency, starting 110 months ago.
-{{{#!python numbers=disable
->>> data = N.random.uniform(-100,100,100).astype(N.float_)
->>> today = TS.today.asfreq('M') - 110
->>> nseries = TS.TimeSeries(data, freq='m', observed='END',start_date=today)
->>> sixtymonthsago = today-60
->>> nseries[sixtymonthsago:sixtymonthsago+10] = 12
-}}}
-
+The page can be accessed at:
+http://www.scipy.org/TimeSeriesPackage
\ No newline at end of file
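
The removed wiki text above states that `Date` objects are stored as integers and quotes 732677, 24073 and 2007 for the daily, monthly and annual versions of Jan. 01, 2007. The sketch below reproduces those three numbers with the standard library only. The function names are hypothetical, and the monthly formula is an assumption chosen because it matches the quoted value; neither is taken from tdates.py.

```python
import datetime

def daily_value(year, month, day):
    # Proleptic Gregorian ordinal, as returned by datetime.date.toordinal().
    return datetime.date(year, month, day).toordinal()

def monthly_value(year, month):
    # Assumed encoding: months counted from January of year 1.
    return (year - 1) * 12 + month

def month_from_value(value):
    # Inverse of monthly_value: returns (year, month).
    years, months = divmod(value - 1, 12)
    return (years + 1, months + 1)

print(daily_value(2007, 1, 1))                       # 732677
print(monthly_value(2007, 1))                        # 24073
print(month_from_value(monthly_value(2007, 1) + 5))  # (2007, 6)
```

Under this assumed encoding, adding 5 to the monthly integer lands on June 2007, which agrees with the `M + 5` example in the removed text.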

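The removed wiki text also notes that taking the log of a series masks the invalid (non-positive) values, and that adding two aligned series only keeps data where neither operand is masked. A minimal sketch of both behaviours follows. It uses `numpy.ma`, the successor of the sandbox `maskedarray` package, rather than `TimeSeries` itself, so it only illustrates the masked-array part and makes no claim about the `timeseries` API.

```python
import numpy as np

# Log of a non-positive value is masked instead of producing nan or -inf.
values = np.ma.array([-1.0, 1.0, 10.0])
logged = np.ma.log(values)
log_mask = logged.mask.tolist()
print(log_mask)  # [True, False, False]

# Adding two masked arrays masks every position masked in either operand.
a = np.ma.array([1, 2, 3], mask=[False, True, False])
b = np.ma.array([10, 20, 30], mask=[False, False, True])
total = a + b
sum_mask = total.mask.tolist()
print(sum_mask)  # [False, True, True]

# Filling first treats masked entries as 0, so every position contributes.
filled_sum = (a.filled(0) + b.filled(0)).tolist()
print(filled_sum)  # [11, 20, 3]
```

The last two results show the difference between `malg_1 + malg_2` and `filled(malg_1,0) + filled(malg_2,0)` in the removed text: the first keeps the mask, the second replaces masked entries before adding.
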
Modified: trunk/Lib/sandbox/timeseries/tdates.py
===================================================================
--- trunk/Lib/sandbox/timeseries/tdates.py	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/tdates.py	2007-01-30 21:11:09 UTC (rev 2653)
@@ -470,7 +470,6 @@
             return Date(freq=tofreq, value=value)
         else:
             return None
-            
 Date.asfreq = asfreq
             
 def isDate(data):
@@ -697,14 +696,9 @@
                 if self.__hasdups is None:
                     self.__hasdups = (steps.min() == 0)
             else:
-#            elif val.size == 1:
                 self.__full = True
                 self.__hasdups = False
                 steps = numeric.array([], dtype=int_)
-#            else:
-#                self.__full = False
-#                self.__hasdups = False
-#                steps = None
             self.__steps = steps
         return self.__steps
     
@@ -855,17 +849,9 @@
     # Case #3: dates as objects .................
     elif dlist.dtype.kind == 'O':
         template = dlist[0]
-#        if dlist.size > 1:
-#            template = dlist[0]
-#        else:
-#            template = dlist.item()
         #...as Date objects
         if isinstance(template, Date):
             dates = numpy.fromiter((d.value for d in dlist), int_)
-#            if dlist.size > 1:
-#                dates = numpy.fromiter((d.value for d in dlist), int_)
-#            else:
-#                dates = [template]
         #...as mx.DateTime objects
         elif hasattr(template,'absdays'):
             # no freq given: try to guess it from absdays
@@ -904,7 +890,6 @@
     # Case #2: we have a starting date ..........
     if start_date is None:
         raise InsufficientDateError
-#    if not isDateType(start_date):
     if not isinstance(start_date, Date):
         raise DateError, "Starting date should be a valid Date instance!"
     # Check if we have an end_date
@@ -914,8 +899,6 @@
     else:
         if not isinstance(end_date, Date):
             raise DateError, "Ending date should be a valid Date instance!"
-#        assert(isDateType(end_date),
-#               "Starting date should be a valid Date instance!")
         length = end_date - start_date
         if include_last:
             length += 1

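The last tdates.py hunk above computes `length = end_date - start_date` and adds 1 when `include_last` is set. The same inclusive/exclusive counting can be sketched with plain `datetime.date` objects at daily frequency. `span_length` is a hypothetical helper written for illustration, not a function from tdates.py.

```python
import datetime

def span_length(start, end, include_last=True):
    # Number of daily steps from start to end, optionally counting end itself.
    length = end.toordinal() - start.toordinal()
    if include_last:
        length += 1
    return length

start = datetime.date(2007, 1, 1)
end = datetime.date(2007, 1, 31)
print(span_length(start, end))                      # 31
print(span_length(start, end, include_last=False))  # 30
```
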
Modified: trunk/Lib/sandbox/timeseries/tests/test_dates.py
===================================================================
--- trunk/Lib/sandbox/timeseries/tests/test_dates.py	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/tests/test_dates.py	2007-01-30 21:11:09 UTC (rev 2653)
@@ -26,9 +26,9 @@
 from maskedarray.testutils import assert_equal, assert_array_equal
 
 from timeseries import tdates
-reload(tdates)
+#reload(tdates)
 from timeseries import tcore
-reload(tcore)
+#reload(tcore)
 from timeseries.tdates import date_array_fromlist, Date, DateArray, date_array, mxDFromString
 
 class test_creation(NumpyTestCase):

Modified: trunk/Lib/sandbox/timeseries/tests/test_multitimeseries.py
===================================================================
--- trunk/Lib/sandbox/timeseries/tests/test_multitimeseries.py	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/tests/test_multitimeseries.py	2007-01-30 21:11:09 UTC (rev 2653)
@@ -26,17 +26,12 @@
 
 from maskedarray.core import getmaskarray, nomask, masked_array
 
-##reload(MA)
-#import maskedarray.mrecords
-##reload(maskedarray.mrecords)
-#from maskedarray.mrecords import mrecarray, fromarrays, fromtextfile, fromrecords
 from timeseries import tmulti
 reload(tmulti)
 from timeseries.tmulti import MultiTimeSeries, TimeSeries,\
     fromarrays, fromtextfile, fromrecords, \
     date_array, time_series
 
-#from timeseries.tseries import time_series, TimeSeries
 
 #..............................................................................
 class test_mrecords(NumpyTestCase):

Modified: trunk/Lib/sandbox/timeseries/tmulti.py
===================================================================
--- trunk/Lib/sandbox/timeseries/tmulti.py	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/tmulti.py	2007-01-30 21:11:09 UTC (rev 2653)
@@ -2,11 +2,11 @@
 """
 Support for multi-variable time series, through masked recarrays.
 
-:author: Pierre Gerard-Marchant
-:contact: pierregm_at_uga_dot_edu
+:author: Pierre GF Gerard-Marchant & Matt Knox
+:contact: pierregm_at_uga_dot_edu - mattknow_ca_at_hotmail_dot_com
 :version: $Id$
 """
-__author__ = "Pierre GF Gerard-Marchant ($Author$)"
+__author__ = "Pierre GF Gerard-Marchant & Matt Knox ($Author$)"
 __version__ = '1.0'
 __revision__ = "$Revision$"
 __date__     = '$Date$'
@@ -51,7 +51,6 @@
 reserved_fields = MR.reserved_fields + ['_dates']
 
 import warnings
-#                    format='%(name)-15s %(levelname)s %(message)s',)
 
 __all__ = [
 'MultiTimeSeries','fromarrays','fromrecords','fromtextfile',           
@@ -107,11 +106,6 @@
             cls._defaulthardmask = data._series._hardmask | hard_mask
             cls._fill_value = data._series._fill_value
             return data._data.view(cls)
-#        elif isinstance(data, TimeSeries):
-#            cls._defaultfieldmask = data._series._fieldmask
-#            cls._defaulthardmask = data._series._hardmask | hard_mask
-#            cls._fill_value = data._series._fill_value
-            
         # .......................................
         _data = MaskedRecords(data, mask=mask, dtype=dtype, **mroptions)
         if dates is None:
@@ -122,36 +116,30 @@
             newdates = date_array(dlist=dates, freq=freq)
         else:
             newdates = dates
-#            _data = data
-#            if hasattr(data, '_mask') :
-#                mask = mask_or(data._mask, mask)
         cls._defaultdates = newdates    
         cls._defaultobserved = observed  
         cls._defaultfieldmask = _data._fieldmask
-#        assert(_datadatescompat(data,newdates))
         #
         return _data.view(cls)
-    
-        #..................................
-    def __array_wrap__(self, obj, context=None):
-        """Special hook for ufuncs.
-Wraps the numpy array and sets the mask according to context.
-        """
-#        mclass = self.__class__
-        #..........
-        if context is None:
-#            return mclass(obj, mask=self._mask, copy=False)
-            return MaskedArray(obj, mask=self._mask, copy=False,
-                               dtype=obj.dtype,
-                               fill_value=self.fill_value, )
-        #..........
-        (func, args) = context[:2]
- 
-#        return mclass(obj, copy=False, mask=m)
-        return MultiTimeSeries(obj, copy=False, mask=m,)
-#                           dtype=obj.dtype, fill_value=self._fill_value)
-    
-        
+#    
+#        #..................................
+#    def __array_wrap__(self, obj, context=None):
+#        """Special hook for ufuncs.
+#Wraps the numpy array and sets the mask according to context.
+#        """
+##        mclass = self.__class__
+#        #..........
+#        if context is None:
+##            return mclass(obj, mask=self._mask, copy=False)
+#            return MaskedArray(obj, mask=self._mask, copy=False,
+#                               dtype=obj.dtype,
+#                               fill_value=self.fill_value, )
+#        #..........
+#        (func, args) = context[:2]
+# 
+##        return mclass(obj, copy=False, mask=m)
+#        return MultiTimeSeries(obj, copy=False, mask=m,)
+##                           dtype=obj.dtype, fill_value=self._fill_value)        
     def __array_finalize__(self,obj):
         if isinstance(obj, MultiTimeSeries):
             self.__dict__.update(_dates=obj._dates,
@@ -234,13 +222,11 @@
                     for k in _names:
                         m = mask_or(val, base_fmask.__getattr__(k))
                         base_fmask.__setattr__(k, m)
-                else:
-                    return
             else:
                 mval = getmaskarray(val)
                 for k in _names:
                     base_fmask.__setattr__(k, mval)  
-                return
+            return
     #............................................
     def __getitem__(self, indx):
         """Returns all the fields sharing the same fieldname base.

Modified: trunk/Lib/sandbox/timeseries/tseries.py
===================================================================
--- trunk/Lib/sandbox/timeseries/tseries.py	2007-01-30 20:52:09 UTC (rev 2652)
+++ trunk/Lib/sandbox/timeseries/tseries.py	2007-01-30 21:11:09 UTC (rev 2653)
@@ -64,7 +64,6 @@
            ]
 
 #...............................................................................
-#                    format='%(name)-15s %(levelname)s %(message)s',)
 
 ufunc_domain = {}
 ufunc_fills = {}
@@ -1195,7 +1194,6 @@
         inidata = series._series.copy()
     else:
         inidata = series._series
-    
     if nper < 0:
         nper = max(-len(series), nper)
         newdata[-nper:] = inidata[:nper]


