[Numpy-discussion] Rebinning numpy array
Robert Kern
robert.kern@gmail....
Sun Nov 13 11:27:18 CST 2011
On Sun, Nov 13, 2011 at 16:04, Johannes Bauer <dfnsonfsduifb@gmx.de> wrote:
> Hi group,
>
> I have a rather simple problem, or so it would seem. However I cannot
> seem to find the right solution. Here's the problem:
>
> A Geiger counter measures counts in distinct time intervals. The time
> intervals are not of constant length. Imagine for example that the
> counter would always create a table entry when the counts reach 10. Then
> we would have the following bins (made-up data for illustration):
>
> Seconds     Counts   Len   CPS
> 0 - 44      10       44    0.23
> 44 - 120    10       76    0.13
> 120 - 140   10       20    0.5
> 140 - 200   10       60    0.16
>
> So we have n bins (in this example 4), but they're not equidistant. I
> want to rebin samples to make them equidistant. For example, I would
> like to rebin into 5 bins of 40 seconds time each. Then the rebinned
> example (I calculate by hand so this might contain errors):
>
> 0-40       9.09
> 40-80      5.65
> 80-120     5.26
> 120-160   13.33
> 160-200    6.66
>
> That means, if a destination bin completely overlaps a source bin, its
> complete value is taken. If it overlaps partially, linear interpolation
> of bin sizes should be used.
What you want to do is set up a linear interpolation of the cumulative
counts at the boundaries of the uneven bins.

Seconds   Value
0         0
44        10
120       20
140       30
200       40

Then evaluate that linear interpolation on the boundaries of the uniform
bins. The differences between successive values are the rebinned counts.
[~]
|18> bin_bounds = np.array([0.0, 44.0, 120, 140, 200])
[~]
|19> bin_values = np.array([0.0, 10, 10, 10, 10])
[~]
|20> cum_bin_values = bin_values.cumsum()
[~]
|21> new_bounds = np.array([0.0, 40, 80, 120, 160, 200])
[~]
|22> ecdf = np.interp(new_bounds, bin_bounds, cum_bin_values)
[~]
|23> ecdf
array([ 0. , 9.09090909, 14.73684211, 20. ,
33.33333333, 40. ])
[~]
|24> uniform_histogram = np.diff(ecdf)
[~]
|25> uniform_histogram
array([ 9.09090909, 5.64593301, 5.26315789, 13.33333333, 6.66666667])
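For reuse, the session above could be wrapped in a small function along
these lines. This is only a sketch: the name rebin_counts and its argument
names are made up for illustration, and it assumes the counts are spread
uniformly within each source bin, which is what the linear interpolation
of the cumulative counts implies.

```python
import numpy as np


def rebin_counts(old_edges, counts, new_edges):
    """Redistribute per-bin counts from old_edges onto new_edges.

    Hypothetical helper: assumes counts are uniformly distributed
    inside each source bin.
    """
    old_edges = np.asarray(old_edges, dtype=float)
    counts = np.asarray(counts, dtype=float)
    new_edges = np.asarray(new_edges, dtype=float)
    if old_edges.size != counts.size + 1:
        raise ValueError("old_edges must have one more entry than counts")
    # Cumulative counts at each source edge, starting from zero.
    cumulative = np.concatenate(([0.0], np.cumsum(counts)))
    # Evaluate the piecewise-linear cumulative curve at the new edges.
    new_cumulative = np.interp(new_edges, old_edges, cumulative)
    # Differences between successive edges are the counts per new bin.
    return np.diff(new_cumulative)


rebinned = rebin_counts(
    [0, 44, 120, 140, 200],        # uneven bin boundaries in seconds
    [10, 10, 10, 10],              # counts per uneven bin
    [0, 40, 80, 120, 160, 200],    # uniform bin boundaries in seconds
)
```

As long as the new boundaries span the same range as the old ones, the
total number of counts is preserved. np.interp holds the end values
constant outside the original range, so any part of a new bin that lies
outside the old boundaries contributes no counts.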
This may be what you are doing already. I'm not sure what is in your
getx() and gety() methods. If so, then I think you are on the right
track. If you still have problems, then we might need to see some of
the problematic data and results.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco