[Numpy-discussion] Rebinning numpy array

Olivier Delalleau shish@keba...
Sun Nov 13 11:40:17 CST 2011


Just one thing: numpy.interp says it doesn't check that the x coordinates
are increasing, so make sure it's the case.
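
A quick way to check this before calling numpy.interp could look like the sketch below. The helper name is_strictly_increasing and the sample timestamps are made up for illustration and are not taken from your code.

```python
import numpy as np

def is_strictly_increasing(xp):
    """Return True if every x coordinate is larger than the one before it."""
    xp = np.asarray(xp, dtype=float)
    # np.diff gives the gaps between neighbours; all of them must be positive.
    return bool(np.all(np.diff(xp) > 0))

# Made-up timestamps: the second list has two samples out of order.
print(is_strictly_increasing([0.0, 44.0, 120.0, 140.0, 200.0]))
print(is_strictly_increasing([0.0, 120.0, 44.0, 140.0, 200.0]))
```

If the check fails, sorting the x values together with their y values (for example via numpy.argsort) before interpolating should restore the ordering.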

Assuming this is ok, I could still see how you may get some non-smooth
behavior: this may be because your spike can either be split between two
bins (which "dilutes" it somehow), or be included in a single bin (which
would make it stand out more). And as you increase your bin size, you will
switch between these two situations.
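
One possible way to avoid this dependence on bin alignment, given here as a sketch under assumptions and not tested on your data, is to interpolate the cumulative counts at the bin edges and then take differences, instead of interpolating the per-bin values. This assumes the counts are spread uniformly within each source bin, which is the partial-overlap rule described in the quoted message. The function name rebin_counts and its arguments are hypothetical; the numbers are the made-up example from the quoted message.

```python
import numpy as np

def rebin_counts(src_edges, src_counts, dst_edges):
    """Redistribute counts from source bins to destination bins.

    Assumes counts are spread uniformly inside each source bin.
    src_edges has len(src_counts) + 1 entries and must be increasing.
    """
    src_edges = np.asarray(src_edges, dtype=float)
    src_counts = np.asarray(src_counts, dtype=float)
    dst_edges = np.asarray(dst_edges, dtype=float)
    if not np.all(np.diff(src_edges) > 0):
        raise ValueError("src_edges must be strictly increasing")
    # Cumulative counts seen up to each source bin edge, starting at 0.
    cumulative = np.concatenate(([0.0], np.cumsum(src_counts)))
    # Linear interpolation of the cumulative curve at the new edges.
    cumulative_at_dst = np.interp(dst_edges, src_edges, cumulative)
    # Counts per destination bin are differences of the cumulative curve.
    return np.diff(cumulative_at_dst)

# Example data from the quoted message: 4 uneven bins of 10 counts each.
src_edges = [0, 44, 120, 140, 200]
src_counts = [10, 10, 10, 10]
dst_edges = np.linspace(0, 200, 6)  # 5 bins of 40 seconds each

new_counts = rebin_counts(src_edges, src_counts, dst_edges)
print(new_counts)
print(new_counts.sum())
```

As long as the destination edges span the same range as the source edges, the total count is preserved, because the differences of the cumulative curve add up to its last value minus its first.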

-=- Olivier

2011/11/13 Johannes Bauer <dfnsonfsduifb@gmx.de>

> Hi group,
>
> I have a rather simple problem, or so it would seem. However I cannot
> seem to find the right solution. Here's the problem:
>
> A Geiger counter measures counts in distinct time intervals. The time
> intervals are not of constant length. Imagine for example that the
> counter would always create a table entry when the counts reach 10. Then
> we would have the following bins (made-up data for illustration):
>
> Seconds         Counts  Len     CPS
> 0 - 44          10      44      0.23
> 44 - 120        10      76      0.13
> 120 - 140       10      20      0.5
> 140 - 200       10      60      0.16
>
> So we have n bins (in this example 4), but they're not equidistant. I
> want to rebin samples to make them equidistant. For example, I would
> like to rebin into 5 bins of 40 seconds each. Then the rebinned
> example would be (I calculated by hand, so this might contain errors):
>
> 0-40            9.09
> 40-80           5.65
> 80-120          5.26
> 120-160         13.33
> 160-200         6.66
>
> That means, if a destination bin completely overlaps a source bin, its
> complete value is taken. If it overlaps partially, linear interpolation
> of bin sizes should be used.
>
> It is very important that the overall count amount stays the same (in
> this case 40, so my numbers seem to be correct, I checked that). In this
> example I increased the bin size, but usually I will want to decrease
> bin size (even dramatically).
>
> Now my pathetic attempts look something like this:
>
> interpolation_points = 4000
> xpts = [ time.mktime(x.timetuple()) for x in self.getx() ]
>
> interpolatedx = numpy.linspace(xpts[0], xpts[-1], interpolation_points)
> interpolatedy = numpy.interp(interpolatedx, xpts, self.gety())
>
> self._xreformatted = [ datetime.datetime.fromtimestamp(x)
>                        for x in interpolatedx ]
> self._yreformatted = interpolatedy
>
> This works somewhat, however I see artifacts depending on the
> destination sample size: for example when I have a spike in the sample
> input and reduce the number of interpolation points (i.e. increase
> destination bin size) slowly, the spike will get smaller and smaller
> (expected behaviour). After some amount of increasing, the spike however
> will "magically" reappear. I believe this to be an interpolation artifact.
>
> Is there some standard way to get from a non-uniformly distributed bin
> distribution to a uniformly distributed bin distribution of arbitrary
> bin width?
>
> Best regards,
> Joe