[SciPy-User] Averaging over unevenly-spaced records

Camilo Polymeris cpolymeris@gmail....
Mon Oct 17 16:55:03 CDT 2011


> You should take a look at my project, pandas
> (http://pandas.sourceforge.net/groupby.html). It has much richer
> built-in functionality for this kind of task than anything else in
> the scientific Python ecosystem.
>
> Assuming your timestamps are unique, you need only do:
>
> def normalize(dt):
>    return dt.replace(microsecond=0)
> data.groupby(normalize).agg({'A' : np.mean, 'B' : np.sum})
>
> and that will give you exactly what you want

Yes, the timestamps are unique. Looks neat.
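
For reference, a minimal sketch of the suggestion above as I understand it,
written against a recent pandas release. The sample rows and the column
names 'datetime', 'A' and 'B' are made up for illustration, and I pass the
aggregations as strings ('mean', 'sum') rather than np.mean / np.sum, which
is an assumption about what current pandas prefers:

```python
from datetime import datetime

import pandas as pd


def normalize(dt):
    # Drop the sub-second part so all records within one second share a key.
    return dt.replace(microsecond=0)


# Hypothetical sample records: (timestamp, A, B), with unique timestamps.
rows = [
    (datetime(2011, 10, 17, 16, 55, 3, 100000), 1.0, 10),
    (datetime(2011, 10, 17, 16, 55, 3, 900000), 3.0, 20),
    (datetime(2011, 10, 17, 16, 55, 4, 250000), 5.0, 30),
]

# Use the timestamps as the index (row labels) of the DataFrame.
data = pd.DataFrame.from_records(
    rows, columns=["datetime", "A", "B"], index="datetime"
)

# A function passed to groupby is applied to each index label.
per_second = data.groupby(normalize).agg({"A": "mean", "B": "sum"})
print(per_second)
```

Each output row then corresponds to one whole second, with A averaged and
B summed over the records that fall into it.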

> Here, data is a pandas DataFrame object. To get your record array into
> the right format, do:
>
> data = DataFrame.from_records(r, index='datetime')
>
> that will turn the datetimes into the index (row labels) of the DataFrame.
>
> However, if the datetimes are not unique, all is not lost. Don't set
> the DataFrame index and do instead:
>
> data = DataFrame(r)
> grouper = data['datetime'].map(normalize)
> data.groupby(grouper).agg({'A' : np.mean, 'B' : np.sum})
>
> I think you'll find this a lot more palatable than a DIY approach
> using itertools.
>
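
A sketch of the non-unique case, again with invented sample data. Here the
timestamps stay in an ordinary column and the group key is a Series built
from it. I build the frame from a dict instead of a record array only to
keep the example short:

```python
import pandas as pd


def normalize(ts):
    # Truncate a timestamp to whole seconds.
    return ts.replace(microsecond=0)


# Hypothetical data; the first two rows share the same timestamp.
frame = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2011-10-17 16:55:03.100000",
                "2011-10-17 16:55:03.100000",
                "2011-10-17 16:55:04.250000",
            ]
        ),
        "A": [1.0, 3.0, 5.0],
        "B": [10, 20, 30],
    }
)

# Map every timestamp to its second; rows with equal keys form one group.
grouper = frame["datetime"].map(normalize)
by_second = frame.groupby(grouper).agg({"A": "mean", "B": "sum"})
print(by_second)
```

Because the grouping key is a separate Series, duplicate timestamps are
not a problem: both rows simply land in the same one-second group.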

Thanks for your suggestions. I think I'll give the DIY approach a
try first, for pedagogical reasons, but I may switch to pandas later.
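
Roughly what I have in mind for the DIY version, using only the standard
library. This is a sketch, not tested on my real data: it assumes the
records are (datetime, A, B) tuples, and the function name
aggregate_per_second is my own invention:

```python
from datetime import datetime
from itertools import groupby


def normalize(dt):
    # Truncate to whole seconds, as in the pandas suggestion.
    return dt.replace(microsecond=0)


def aggregate_per_second(records):
    """Return (second, mean of A, sum of B) for each one-second bucket."""
    # itertools.groupby only merges adjacent items, so sort by time first.
    ordered = sorted(records, key=lambda rec: rec[0])
    result = []
    for second, group in groupby(ordered, key=lambda rec: normalize(rec[0])):
        rows = list(group)
        mean_a = sum(rec[1] for rec in rows) / len(rows)
        sum_b = sum(rec[2] for rec in rows)
        result.append((second, mean_a, sum_b))
    return result


# Hypothetical sample records, deliberately out of order.
records = [
    (datetime(2011, 10, 17, 16, 55, 4, 250000), 5.0, 30),
    (datetime(2011, 10, 17, 16, 55, 3, 100000), 1.0, 10),
    (datetime(2011, 10, 17, 16, 55, 3, 900000), 3.0, 20),
]
summary = aggregate_per_second(records)
for second, mean_a, sum_b in summary:
    print(second, mean_a, sum_b)
```

The sort matters: without it, groupby would start a new group every time
the key changes, so unsorted input could yield several partial groups for
the same second.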

Regards,
Camilo

