[SciPy-user] scikits.timeseries

Robert Ferrell ferrell@diablotech....
Thu Nov 27 10:23:15 CST 2008


Timeseries is an awesome package.  Great contribution.  I have 2  
questions about it, though.

1. Is scipy-user the right place for questions?

2. I've noticed that 'business frequency' includes holidays, and that
can create holes in what are actually complete data sets.  For
instance, Sep 01, 2008 was a holiday in the US (Labor Day), yet it is
included in a DateArray spanning that date:

In [640]: ts.date_array(ts.Date('B','2008-08-25'), length=12)
Out[640]:
DateArray([25-Aug-2008, 26-Aug-2008, 27-Aug-2008, 28-Aug-2008, 29-Aug-2008,
        01-Sep-2008, 02-Sep-2008, 03-Sep-2008, 04-Sep-2008, 05-Sep-2008,
        08-Sep-2008, 09-Sep-2008],
           freq='B')
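
The same Monday-to-Friday calendar can be reproduced with plain NumPy, as a
minimal sketch that does not use scikits.timeseries at all.  It assumes the
'B' frequency simply means "every weekday", and the one-entry holiday list
is supplied by hand for illustration:

```python
import numpy as np

# Calendar days covering the same span as the DateArray above
# (25-Aug-2008 through 09-Sep-2008; the end of np.arange is exclusive).
days = np.arange('2008-08-25', '2008-09-10', dtype='datetime64[D]')

# Default business-day mask: Monday to Friday, no holiday calendar.
weekdays = days[np.is_busday(days)]

# Same mask with Labor Day passed explicitly as a holiday (assumption:
# the holiday list is maintained by the user, not by the library).
trading_days = days[np.is_busday(days, holidays=['2008-09-01'])]

print(len(weekdays), len(trading_days))
```

The weekday-only calendar keeps 01-Sep-2008, while the calendar with an
explicit holiday list drops it.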

This makes stock ticker data look like it's incomplete - no data for  
Sep 01, since the markets were closed.  For instance, if I use  
matplotlib.finance.quotes_historical_yahoo to download Intel data, and  
put that into the date array above, I get the series:

masked_array(data = [22.77 22.95 23.21 23.39 22.67 -- 22.39 21.35 20.34 20.43 20.79],
       mask = [False False False False False  True False False False False False],
       fill_value=1e+20)

That has a hole on Sep 1.  This matters for things like moving average
calculation.  Sep 1 should be treated like a Saturday or Sunday, but
instead it masks every 5-day mov_average window that contains it, so
nothing is computed from Sep 1 through Sep 5:

timeseries([-- -- -- -- 22.998 -- -- -- -- -- 21.06],
            dates = [25-Aug-2008 ... 08-Sep-2008],
            freq  = B)
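
The effect can be reproduced with numpy.ma alone.  This is only a sketch:
trailing_mean is a hypothetical helper written for illustration, not the
scikits.timeseries mov_average implementation, and it assumes a window is
masked as soon as it contains one masked value.  The masked slot holds a
placeholder 0.0 that is never used:

```python
import numpy.ma as ma

# Closing prices quoted above; the masked entry is the Sep 1 holiday.
prices = ma.masked_array(
    [22.77, 22.95, 23.21, 23.39, 22.67, 0.0,
     22.39, 21.35, 20.34, 20.43, 20.79],
    mask=[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
)


def trailing_mean(series, span):
    """Hypothetical helper: trailing mean over `span` points, masked when
    the window is incomplete or contains a masked value."""
    out = ma.masked_all(len(series))
    for i in range(span - 1, len(series)):
        window = series[i - span + 1:i + 1]
        if ma.count_masked(window) == 0:
            out[i] = window.mean()  # assignment unmasks this element
    return out


# One masked day masks every window that touches it.
with_hole = trailing_mean(prices, 5)

# Dropping the masked day first treats the holiday like a weekend.
without_hole = trailing_mean(prices.compressed(), 5)

print(with_hole)
print(without_hole)
```

Dropping the masked entries with compressed() gives a gap-free average, but
it also discards the position of each value, so the matching dates would
have to be filtered the same way to keep values and dates aligned.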

My question: What is a good way to handle (get rid of?) the holes in  
the series?

thanks,
-robert

