[SciPy-dev] [ANN] alpha release of the timeseries module
Wed Mar 7 12:46:20 CST 2007
As you may or may not be aware, over the past several months we (Matt and Pierre) have been busy developing a timeseries module for scipy. The module has been completely overhauled from its original form when it was added to the scipy sandbox at the end of last year. In particular, masked arrays are now fully supported, and the dependence on the external package mxDateTime has been removed.
The purpose of this package is to manipulate data indexed by dates (and times). The package defines three main classes: Date (a single element), DateArray (an ndarray of Date objects), and TimeSeries (the combination of a DateArray and a masked array). Each of these main classes supports a wide variety of frequencies: annual, quarterly, monthly, weekly, daily, business day, hourly, minutely, secondly, and more. Additional frequencies can be added fairly easily.
Note that by frequency, we really mean time units. Time series generally do NOT have to be regularly spaced: the smallest time increment between two consecutive data points is simply expressed in one particular set of units.
Dates (and hence DateArrays and TimeSeries) can be easily converted to other frequencies (the conversion algorithms are implemented in C).
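To make the idea of converting a date to another frequency concrete, here is a minimal pure-Python sketch. It is not the module's API: the `to_period` helper and the "A"/"Q"/"M" codes are hypothetical names chosen for illustration, and the real conversions are done in C as noted above.

```python
from datetime import date


def to_period(d, freq):
    """Map a calendar date to a (year, index) label at a coarser frequency.

    Hypothetical helper for illustration only; not part of the timeseries module.
    """
    if freq == "A":   # annual: a single period per year
        return (d.year, 1)
    if freq == "Q":   # quarterly: quarters numbered 1..4
        return (d.year, (d.month - 1) // 3 + 1)
    if freq == "M":   # monthly: months numbered 1..12
        return (d.year, d.month)
    raise ValueError("unsupported frequency: %r" % (freq,))


d = date(2007, 3, 7)
monthly = to_period(d, "M")
quarterly = to_period(d, "Q")
annual = to_period(d, "A")
```

The sketch only goes from a finer unit to a coarser one; converting the other way requires choosing a position within the coarser period (start or end, say), which is a design decision this example deliberately leaves out.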
TimeSeries can be indexed (and sliced!) by Date objects or strings, or indexed in the conventional numpy way. The TimeSeries class is a subclass of the new MaskedArray class (the version in the scipy sandbox, which is itself a subclass of ndarray). Therefore, missing data are naturally supported, and the series are recognized as ndarrays by asarray/asanyarray.
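As a rough illustration of what indexing by date versus by position means, the following toy class pairs a sorted list of dates with a list of values. Everything here is an assumption made for the example: `TinySeries` is a hypothetical name, `None` stands in for a masked value, and date slices exclude their stop date. The module's own TimeSeries class may behave differently.

```python
from bisect import bisect_left
from datetime import date


class TinySeries:
    """Toy date-indexed series; hypothetical, not the module's TimeSeries."""

    def __init__(self, dates, values):
        if len(dates) != len(values):
            raise ValueError("dates and values must have the same length")
        if any(a >= b for a, b in zip(dates, dates[1:])):
            raise ValueError("dates must be strictly increasing")
        self.dates = list(dates)
        self.values = list(values)

    def __getitem__(self, key):
        if isinstance(key, date):
            # Exact-date lookup via binary search.
            i = bisect_left(self.dates, key)
            if i == len(self.dates) or self.dates[i] != key:
                raise KeyError(key)
            return self.values[i]
        if isinstance(key, slice):
            ends = (key.start, key.stop)
            if any(isinstance(e, date) for e in ends):
                # Date slice: translate the endpoints to positions (stop excluded).
                lo = 0 if key.start is None else bisect_left(self.dates, key.start)
                hi = len(self.dates) if key.stop is None else bisect_left(self.dates, key.stop)
                key = slice(lo, hi)
            return TinySeries(self.dates[key], self.values[key])
        # Plain integer position, as with an ordinary sequence.
        return self.values[key]


# Irregularly spaced daily data with one missing observation (None).
s = TinySeries(
    [date(2007, 1, 1), date(2007, 1, 3), date(2007, 1, 8)],
    [1.0, None, 3.0],
)
by_date = s[date(2007, 1, 8)]
by_position = s[0]
window = s[date(2007, 1, 2):date(2007, 1, 8)]
```

Note that the dates are not evenly spaced: the unit is the day, but observations exist only on three of the eight days covered.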
In addition to these three classes, we developed a series of add-ons:
* A plotting add-on that allows TimeSeries objects to be easily plotted using matplotlib with dynamic, intelligent axis labels (a tad slow at the moment, but very pretty...).
* A report class for generating TimeSeries reports: exporting results to a spreadsheet program via csv, generating html tables, inspecting data from the console, etc.
* An 'interpolate' sub-module for interpolating masked values in a MaskedArray (and hence in a TimeSeries as well).
* A 'filters' sub-module that provides some functions for 'running window' based filtering (this is rather incomplete at this point).
* An io.fame sub-module that allows reading from and writing to FAME databases (still highly experimental).
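To give a flavour of what interpolating masked values involves, here is a small stdlib-only sketch of linear gap filling. It is an assumption-laden illustration, not the sub-module's code: `interpolate_missing` is a hypothetical name, `None` plays the role of a masked entry, and positions are treated as equally spaced.

```python
def interpolate_missing(values):
    """Linearly fill interior None entries; leading/trailing gaps stay None.

    Hypothetical helper; the module's own 'interpolate' functions may differ.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    # Walk over each pair of neighbouring known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


result = interpolate_missing([None, 1.0, None, None, 4.0, None])
```

Gaps at either end are left untouched because there is no second known point to interpolate towards; filling them would be extrapolation, which is a separate choice.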
Our future plans include:
* Porting the Date class entirely to C. This is partly underway, so you may notice some unusual redundancy between the Python and C code with regard to Date handling.
* Adding some more frequencies (quarterly with different year ends, groups of months, possibly higher frequencies than secondly, possibly user-defined custom frequencies).
* Improving the frequency support of the plotting module (it currently supports only annual, quarterly, monthly, daily, and business day frequencies, plus weekly to a limited extent).
* Adding more library functions: percent change, ARMA models, moving standard deviation...
If you are interested in the module, we would very much appreciate your feedback. The module is still undergoing active development, so API changes are quite possible at this point (even if most of the core is fairly stable).
Please see the wiki at http://www.scipy.org/TimeSeriesPackage for some documentation. There should be enough there to get you started, although we don't claim it is completely comprehensive at this point.
You can download the maskedarray and timeseries modules from the scipy sandbox in SVN.
Please feel free to contact us on or off the mailing list with any questions/comments/suggestions you have about the module.
- Matt Knox & Pierre Gerard-Marchant