[SciPy-User] Averaging over unevenly-spaced records

nicky van foreest vanforeest@gmail....
Sun Oct 16 17:18:12 CDT 2011


Hi,

Have you considered using itertools.groupby? With it you can
group elements by datetime at one-second accuracy (use a key function that
strips the subsecond part from your datetime objects). Then, for each
group, sum columns A and B, and divide the sum of A by the length
of the group to get the average.
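
A minimal sketch of what I mean, assuming the records are plain
(datetime, A, B) tuples that are already sorted by time. The names
whole_second and aggregate_per_second are just made up for this example,
and the sample rows are the ones from your message:

```python
import datetime
from itertools import groupby

# Sample rows (timestamp, A, B) copied from the question quoted below.
records = [
    (datetime.datetime(2011, 3, 30, 16, 1, 15, 911000), 1.39, 18),
    (datetime.datetime(2011, 3, 30, 16, 1, 16, 181000), 1.34, 22),
    (datetime.datetime(2011, 3, 30, 16, 1, 16, 630000), 1.37, 19),
    (datetime.datetime(2011, 3, 30, 16, 1, 16, 922000), 1.34, 19),
    (datetime.datetime(2011, 3, 30, 16, 1, 17, 324000), 1.33, 19),
]


def whole_second(record):
    """Key function (hypothetical name): drop the subsecond part."""
    return record[0].replace(microsecond=0)


def aggregate_per_second(records):
    """Return a list of (second, mean of A, sum of B) tuples.

    groupby only merges consecutive items with equal keys, so the
    records are assumed to be sorted by time already.
    """
    result = []
    for second, group in groupby(records, key=whole_second):
        rows = list(group)  # materialize the group so we can take len()
        mean_a = sum(row[1] for row in rows) / len(rows)
        sum_b = sum(row[2] for row in rows)
        result.append((second, mean_a, sum_b))
    return result


per_second = aggregate_per_second(records)
```

Seconds without any records simply do not show up in the result, so you
would have to fill those in afterwards. Also keep in mind that this is a
pure-Python loop, so with 10^8 records or more it will probably be slow.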

HTH

Nicky

On 14 October 2011 19:24, Camilo Polymeris <cpolymeris@gmail.com> wrote:
> Hello all,
>
> I am pretty new to numpy (and numerical software packages in general),
> so this may be a basic question, but I would appreciate any help.
>
> Say I have a recarray like the following:
>
>    r = array([
> ...
>       (datetime.datetime(2011, 3, 30, 16, 1, 15, 911000), 1.39, 18),
>       (datetime.datetime(2011, 3, 30, 16, 1, 16, 181000), 1.34, 22),
>       (datetime.datetime(2011, 3, 30, 16, 1, 16, 630000), 1.37, 19),
>       (datetime.datetime(2011, 3, 30, 16, 1, 16, 922000), 1.34, 19),
>       (datetime.datetime(2011, 3, 30, 16, 1, 17, 324000), 1.33, 19),
> ...
>      dtype=[('datetime', '|O8'), ('A', '<f8'), ('B', '<i8')])
>
> I would like to, for every whole second, e.g. datetime(2011, 3, 30,
> 16, 1, 16), get the average of column A and the sum of column B, like
> this:
>
> r1 = array([
> ...
>       [1.35, 60],  # for second datetime(2011, 3, 30, 16, 1, 16)
> ...
>       ])
>
> As you can see, the datetimes are not homogeneously spaced. There can
> be any number of data points in one second (even zero -- then I would
> just keep the last value or 0 or NaN, whichever is easier). I have on
> the order of 10^8 to 10^9 records.
> I think it can be done with reduceat, but I would have to manually
> find the indices, which I don't think is the numpythonicest way to do
> this. Another option is to use griddata to interpolate the values at
> e.g. 1 ms intervals to get evenly spaced data, and then use evenly
> spaced indices -- more elegant, but it seems inefficient. Any suggestions?
>
> Thanks & best regards,
>
> Camilo