[SciPy-dev] interest in Time series functionality?
mattknox_ca at hotmail.com
Sat Apr 22 10:23:12 CDT 2006
I hope this is the correct mailing list to post this kind of question to... and if not, my apologies.
I work in the quantitative finance division of a financial services company in Canada, and over the last year or so we have been doing a lot more Python-based work. Most of the data we work with is time series data (stock price data, etc.), and we have traditionally used FAME (a product of SunGard) to store and manipulate this data. We have developed a Python API on top of the included FAME C API to access FAME data from Python, but the problem now is that there aren't any available Python libraries (to my knowledge) for manipulating time series data in a way that comes close to the power of FAME.
Our motivation for being able to manipulate this data in Python is primarily web-based applications, although being able to bring this data into the Python world opens up many other possibilities for us as well (FAME has no matrix capabilities whatsoever, which is pretty sad really given their target market).
We have done some preliminary work in developing a time series class built on top of the numpy array class, and it has gotten to the point where it works reasonably well for what we are using it for, although I'm certain it could be optimized a great deal.
The key features of this module are:
- works with different frequencies of data (currently supports monthly, daily, business days, and secondly frequencies)
- can index the time series directly by date objects (from a custom date class we have created)
- handles missing values (along the lines of masked arrays)
- has global module settings to dictate how certain scenarios are handled
- performs operations (+, -, *, /) on time series that do not necessarily have the same start/end dates, handling missing values in the operation according to certain global option settings. This involves an implicit resizing of the arrays.
- performs operations on time series that do not have the same frequency, with implicit frequency conversions (according to certain global option settings). Again, this involves implicitly resizing the arrays.
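As an illustration only (this is not the module described above), here is a minimal sketch of two of the listed behaviours using numpy masked arrays and the standard datetime module. The function names add_aligned and daily_to_monthly_mean are hypothetical, and both policies are assumptions: dates covered by only one series come back masked, and daily data is converted to monthly by averaging. The real module leaves such choices to its global settings.

```python
import datetime as dt

import numpy.ma as ma


def add_aligned(start_a, values_a, start_b, values_b):
    """Add two daily series that may cover different date ranges.

    Each series is a start date plus one value per calendar day.
    The result spans the union of both ranges; dates present in only
    one series are masked (an assumed policy, see the note above).
    """
    a = ma.asarray(values_a, dtype=float)
    b = ma.asarray(values_b, dtype=float)
    start = min(start_a, start_b)
    # Exclusive end date of the combined range.
    end = max(start_a + dt.timedelta(days=len(a)),
              start_b + dt.timedelta(days=len(b)))
    n = (end - start).days

    def widen(series_start, series):
        # Implicit resize: place the series inside a fully masked array.
        out = ma.masked_all(n, dtype=float)
        offset = (series_start - start).days
        out[offset:offset + len(series)] = series
        return out

    return start, widen(start_a, a) + widen(start_b, b)


def daily_to_monthly_mean(start, values):
    """Convert a daily series to monthly by averaging (assumed policy).

    Returns a dict mapping (year, month) to the mean of that month's values.
    """
    totals = {}
    for i, value in enumerate(values):
        day = start + dt.timedelta(days=i)
        key = (day.year, day.month)
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + value, count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    start, total = add_aligned(dt.date(2006, 4, 1), [1, 2, 3],
                               dt.date(2006, 4, 2), [10, 20, 30])
    print(start, total)
    print(daily_to_monthly_mean(dt.date(2006, 1, 30), [1.0, 2.0, 3.0, 4.0]))
```

In the first call the two series overlap on April 2 and 3 only, so the four-day result has values on those two days and masked entries on April 1 and April 4.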
We have basically attempted to model the time series functionality to be similar to how FAME handles it since that works reasonably well.
I'm wondering if there is any interest in this? Our group consists mostly of financial practitioners and engineers, not really pure software developers, so if somebody is interested in taking this to the next level, I would be willing to release the code (both the FAME API and the time series module) to anyone who wants to improve upon it and share their improvements in the future. The code is definitely not a polished product right now, but it is functional.
If you have any thoughts on this (positive or negative) I would love to hear them. Thanks,
- Matt Knox