[Numpy-discussion] reduce array by computing min/max every n samples
Brad Buran
bburan@cns.nyu....
Mon Jun 21 13:58:37 CDT 2010
Hmmm, if I force the reshaped array to be copied, it speeds up the min/max
and makes the overall operation a bit faster (times are below, generated
using line profiler with kernprof.py). I'd certainly like to get rid of
this copy() operation if possible. Is there any way to avoid it?
Brad
<use fixed width>
Line #   Hits         Time     Per hit   Line contents
   186   1040     22625448     21755.2   val_pts = val_pts[offset:].reshape((-1, downsample))
   187   1040   2590217880   2490594.1   val_min = val_pts.min(-1)
   188   1040   2169869368   2086412.9   val_max = val_pts.max(-1)
Time per hit (for all three lines): 4,598,762

   189   1040   2005172872   1928050.8   val_pts = val_pts[offset:].reshape((-1, downsample)).copy()
   190   1040    592062968    569291.3   val_min = val_pts.min(-1)
   191   1040    560845192    539274.2   val_max = val_pts.max(-1)
Time per hit (for all three lines): 3,036,616
</use fixed width>
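
One way to check whether the slice-and-reshape step itself copies anything is to ask NumPy directly. The sketch below assumes a contiguous 1D float64 array standing in for val_pts (the real dtype and memory layout are not shown in this thread) and uses np.shares_memory to test whether the reshaped result is still a view of the original buffer.

```python
import numpy as np

downsample = 100
data = np.random.rand(100_050)      # stand-in for val_pts (assumed contiguous 1D)
offset = data.size % downsample     # drop leading samples that do not fill a chunk

chunks = data[offset:].reshape((-1, downsample))

# True when reshape returned a view of `data` rather than a fresh copy
is_view = np.shares_memory(chunks, data)
is_contiguous = chunks.flags['C_CONTIGUOUS']

# Both reductions reuse the same reshaped array
val_min = chunks.min(-1)
val_max = chunks.max(-1)
print(is_view, is_contiguous, val_min.shape)
```

If both flags come out True for the real array as well, the reshape is probably not where a copy happens, and the gap in the timings would have to come from something else, such as the dtype or layout of the original buffer.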
On Sat, Jun 19, 2010 at 4:37 PM, Benjamin Root <ben.root@ou.edu> wrote:
> Brad, I think you are doing it the right way, but I think what is happening
> is that the reshape() call on the sliced array is forcing a copy to be made
> first. The fact that the copy has to be made twice just worsens the issue.
> I would keep a reference to the reshape result (it is usually a view of the
> original data, unless a copy is forced), and then perform the min/max calls
> on that with the appropriate axis.
>
> On that note, would it be a bad idea to have a function that returns a
> min/max tuple? A single iteration that gathers both the min and the max at
> the same time, instead of two separate iterations, would be useful. I should
> note that there is a numpy.ptp() function that returns the difference
> between the min and the max, but I don't see anything that returns the
> actual values.
>
> Ben Root
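
Ben's suggestion of reusing one reshaped array, together with the idea of a min/max tuple, could be sketched as below. chunk_min_max is a hypothetical helper name, not a NumPy function, and it still makes two passes over the data (one for min, one for max); it only avoids slicing and reshaping twice.

```python
import numpy as np

def chunk_min_max(samples, n):
    """Return (mins, maxs) over consecutive chunks of n samples.

    Hypothetical helper for illustration. Leading samples that do not
    fill a whole chunk are dropped, as in the original code.
    """
    samples = np.asarray(samples)
    offset = samples.size % n
    chunks = samples[offset:].reshape((-1, n))   # reshaped once, reused below
    return chunks.min(axis=-1), chunks.max(axis=-1)

mins, maxs = chunk_min_max(np.arange(10), 3)
print(mins, maxs)   # [1 4 7] [3 6 9]
```

The difference maxs - mins is what numpy.ptp() would return along the same axis.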
>
> On Thu, Jun 17, 2010 at 4:50 PM, Brad Buran <bburan@cns.nyu.edu> wrote:
>>
>> I have a 1D array with >100k samples that I would like to reduce by
>> computing the min/max of each "chunk" of n samples. Right now, my
>> code is as follows:
>>
>> n = 100
>> offset = array.size % n
>> array_min = array[offset:].reshape((-1, n)).min(-1)
>> array_max = array[offset:].reshape((-1, n)).max(-1)
>>
>> However, this appears to be running pretty slowly. The array is data
>> streamed in real-time from external hardware devices and I need to
>> downsample this and compute the min/max for plotting. I'd like to
>> speed this up so that I can plot updates to the data as quickly as new
>> data comes in.
>>
>> Are there recommendations for faster ways to perform the downsampling?
>>
>> Thanks,
>> Brad
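
As one possible alternative to the reshape approach, the ufunc method reduceat can compute the same per-chunk values directly on the 1D array. This is only a sketch with random stand-in data and has not been benchmarked here; whether it is any faster will depend on the NumPy version and the data.

```python
import numpy as np

n = 100
array = np.random.rand(100_050)     # stand-in for the streamed samples
offset = array.size % n             # skip leading samples that do not fill a chunk

# Start index of each chunk; reduceat reduces array[starts[i]:starts[i+1]],
# and the last chunk runs to the end of the array
starts = np.arange(offset, array.size, n)
array_min = np.minimum.reduceat(array, starts)
array_max = np.maximum.reduceat(array, starts)
```

Because offset removes the partial chunk at the front, every chunk here has exactly n samples, so the results should match the reshape-based version.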
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion