# [SciPy-user] Calculating daily, monthly and seasonal averages of hourly time series data.

Dharhas Pothina Dharhas.Pothina@twdb.state.tx...
Thu Oct 9 12:02:10 CDT 2008

```
This sounds great. I'm going to have to see how complicated it is to
so I can try the timeseries scikit out. From what I could tell there are
no repositories or rpms available for matplotlib 0.98 on Fedora 8 or 9.

- dharhas

>>> Lionel Roubeyrie <lroubeyrie@limair.asso.fr> 10/9/2008 9:39 AM >>>

> Ok I think I understood your example below. Can you give me an example
> of how to deal with missing data?
If you take my last example, you have 3 separate arrays:
mes.dates : the date array
Trying to mask the first 10 values:
##################################

Out[18]: array([ 1.,  1.,  1., ...,  0.,  0.,  0.])

In [20]: mes2
Out[20]:
timeseries([-- -- -- ..., 17.9699245692 66.8968405206 24.7117965045],
dates = [01-jan-2007 00:00 ... 30-déc-2008 23:00],
freq  = H)
##################################
A timeseries can be constructed based on another timeseries, like I do
here with mes2. Note that just the values are masked (missing), not the
dates, because all fields have a value (masked or not).

> Does this general technique work for data that is on a 15 minute
> frequency
Yes, but no :-) The timeseries module doesn't directly handle a
quarter-hourly (QH) frequency, only a minute frequency (freq='T').
Look at this:
#################################
In [28]: fielddates=ts.date_array(['2007-01-01 00:00', '2007-01-01 00:15',
'2007-01-01 00:30', '2007-01-01 00:45'], freq='T')

In [29]: salinity=random(4)*100

In [30]: mes=ts.time_series(data=salinity, dates=fielddates)

In [31]: mes.has_missing_dates()
Out[31]: True
#################################
There's no native QH frequency, so at minute frequency there are some
missing dates (you can also look for duplicated dates, very
convenient!). But you can fill these missing dates:
###############################
In [36]: mes2=mes.fill_missing_dates()

In [37]: mes2.has_missing_dates()
Out[37]: False

In [38]: mes2
Out[38]:
timeseries([2.33824586442 -- -- -- -- -- -- -- -- -- -- -- -- -- --
36.180901427 -- -- -- -- -- -- -- -- -- -- -- -- -- --
39.0648471531 -- -- -- -- -- -- -- -- -- -- -- -- -- --
55.4226606997],
dates = [01-jan-2007 00:00 ... 01-jan-2007 00:45],
freq  = T)
###############################
Or the module can handle these missing dates directly when you convert
the timeseries to a lower frequency:
###############################
In [39]: mes.convert(freq='H', func=mean)
Out[39]:
timeseries([ 33.25166379],
dates = [01-jan-2007 00:00],
freq  = H)
###############################
You can try with func=None, it will just fill the missing dates with
missing values :-p

> or datasets where the frequency is
> variable (ie some months we have 10 readings other months we may have
> 30?
As you can see, just pass your data with the correct dates, and it
rocks, but don't mix minute frequency with hour frequency!
Here I take 3 daily samples in January, and one in October:
#################################
In [41]: fielddates=ts.date_array(['2007-01-01', '2007-01-02',
'2007-01-03', '2007-10-15'], freq='D')

In [42]: salinity=random(4)*100

In [43]: mes=ts.time_series(data=salinity, dates=fielddates)

In [44]: mes
Out[44]:
timeseries([ 59.63468614  38.60721076  64.52554805  66.17637291],
dates = [01-jan-2007 02-jan-2007 03-jan-2007 15-oct-2007],
freq  = D)

In [45]: mes.convert(freq='M', func=mean)
Out[45]:
timeseries([54.2558149823 -- -- -- -- -- -- -- -- 66.1763729106],
dates = [jan-2007 ... oct-2007],
freq  = M)
###################################
Computing the monthly average goes fine; the module fills the missing
months with masked values.

>
> Also how stable is the scikits.timeseries? Is it reasonably usable?
Yes, we use it intensively on large projects and Pierre G.M. has made a
very good tool.
Cordially

>
> thanks,
>
> - dharhas
>
> >>> Lionel Roubeyrie <lroubeyrie@limair.asso.fr> 10/9/2008 3:51 AM >>>
> Hi Dharhas,
> scikits.timeseries is perfect for what you want, in a very usable way:
>
> ###############################
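> # Assumption: random() and mean() used below are NumPy's, e.g. from an
> # "ipython -pylab" session; they are not imported explicitly here.
> In [29]: import scikits.timeseries as ts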
>
> In [30]: sdate=ts.Date('H', '2007-01-01 00:00')
>
> In [31]: fielddates=ts.date_array(start_date=sdate, freq='H',
> length=365*24*2)
>
> In [32]: salinity=random(365*24*2)*100
>
> In [33]: mes=ts.time_series(data=salinity, dates=fielddates)
>
> In [34]: mes
> Out[34]:
> timeseries([ 23.84116045  49.51437251  89.29221711 ...,  37.00510947
>   41.12589836  78.65572656],
>            dates = [01-jan-2007 00:00 ... 30-déc-2008 23:00],
>            freq  = H)
>
>
> In [35]: mes_avmonth=mes.convert(freq='M', func=mean)
>
> In [36]: mes_avmonth
> Out[36]:
> timeseries([ 49.29718906  50.64688937  49.88193999  48.97144253  49.5788259
>   50.41340038  50.15047009  51.70933261  50.5635153   51.15084406
>   51.15362514  51.51443468  49.17556599  49.26877667  50.21416724
>   49.37037657  51.00724033  49.43337134  49.60398056  50.24470761
>   50.62350109  51.15572702  51.37652011  49.24193747],
>            dates = [jan-2007 ... déc-2008],
>            freq  = M)
>
>
> In [37]: mes_avyear=mes.convert(freq='Y', func=mean)
>
> In [38]: mes_avyear
> Out[38]:
> timeseries([ 50.41903159  50.06468157],
>            dates = [2007 2008],
>            freq  = A-DEC)
>
>
> In [39]: mes_avseason=mes[(mes.month>=5) & (mes.month<=9)].mean()
>
> In [40]: mes_avseason
> Out[40]: 50.33380690600049
> ###############################
>
>
> On Wednesday 08 October 2008 at 14:54 -0500, Dharhas Pothina wrote:
> > Hi,
> >
> > I'm trying to analyze hourly salinity data. I was wondering if there
> > is a simple way of calculating daily, monthly and seasonal averages of
> > hourly time series data.
> >
> > So assuming I have two arrays that contain several years of hourly
> > (or every 15min) salinity data: a datetime array called 'fielddates' & a
> > data array called 'salinity'
> >
> > How would I go about getting the various averages. The seasonal
> > averages would be say defined as May through September etc.
> >
> > I had a look at scikits.timeseries but it looks like it would require
> > upgrading numpy to install and there isn't enough high level
> > documentation on how to use it for me to be confident in picking it up
> > in the time frame I'm looking at. I'm also not completely clear if it
> > can handle stuff that happens on a scale smaller than a day. If anyone
> > can point me to any usage examples for it that would be appreciated.
> >
> > Thanks,
> >
> > - dharhas
> >
> > _______________________________________________
> > SciPy-user mailing list
> > SciPy-user@scipy.org
> > http://projects.scipy.org/mailman/listinfo/scipy-user
> >
--
Lionel Roubeyrie
chargé d'études
LIMAIR - La Surveillance de l'Air en Limousin
http://www.limair.asso.fr

```
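
For readers who cannot install scikits.timeseries, the grouping idea behind the thread (bucket each reading by day or by month, skip missing values, then average) can be sketched with the Python standard library alone. This is not the scikits.timeseries API: the helper name `group_mean`, the use of `None` to mark a missing reading, and the synthetic data are all assumptions made for illustration. The season follows the May through September definition from the original question.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def group_mean(dates, values, key):
    """Average the values whose dates map to the same key(date).

    A value of None is treated as a missing reading and skipped.
    (Hypothetical helper, not part of any library.)
    """
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        if v is not None:
            buckets[key(d)].append(v)
    return {k: mean(vs) for k, vs in sorted(buckets.items())}


# Synthetic hourly data for 2007: each reading equals its month number,
# so every daily and monthly mean is easy to check by eye.
start = datetime(2007, 1, 1)
fielddates = [start + timedelta(hours=h) for h in range(365 * 24)]
salinity = [float(d.month) for d in fielddates]
salinity[0] = None  # one missing reading, ignored by the averages

daily = group_mean(fielddates, salinity, key=lambda d: d.date())
monthly = group_mean(fielddates, salinity, key=lambda d: (d.year, d.month))

# Seasonal mean: May through September, as defined in the question.
in_season = [v for d, v in zip(fielddates, salinity)
             if v is not None and 5 <= d.month <= 9]
seasonal = mean(in_season) if in_season else None

print(monthly[(2007, 1)], seasonal)
```

The same `group_mean` call works for 15 minute or irregularly spaced readings, since it only looks at the timestamp of each value and never assumes a fixed frequency. Unlike `convert` in the thread, periods with no readings at all are simply absent from the result rather than returned as masked entries.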