[Numpy-discussion] reduce array by computing min/max every n samples

Brad Buran bburan@cns.nyu....
Mon Jun 21 13:58:37 CDT 2010


Hmmm, if I force the reshaped array to be copied, it speeds up the min/max
and makes the overall operation a bit faster (times are below, generated
using line profiler with kernprof.py).  I'd certainly like to get rid of
this copy() operation if possible.  Is there any way to avoid it?

Brad

<use fixed width>
Line #   Hits         Time    Per Hit  Line Contents
   186   1040     22625448    21755.2  val_pts = val_pts[offset:].reshape((-1, downsample))
   187   1040   2590217880  2490594.1  val_min = val_pts.min(-1)
   188   1040   2169869368  2086412.9  val_max = val_pts.max(-1)
Time per hit (for all three lines): 4,598,762

   189   1040   2005172872  1928050.8  val_pts = val_pts[offset:].reshape((-1, downsample)).copy()
   190   1040    592062968   569291.3  val_min = val_pts.min(-1)
   191   1040    560845192   539274.2  val_max = val_pts.max(-1)
Time per hit (for all three lines): 3,036,616
</use fixed width>
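
One way to investigate where the time goes is to check whether the reshaped array is a view of the original buffer and whether it is C-contiguous. The sketch below is only an illustration: the helper name describe_reshape and the test arrays are made up for this example, and np.shares_memory comes from NumPy releases newer than the one discussed in this thread. It does not establish why min/max were slower in the profile above.

```python
import numpy as np

def describe_reshape(val_pts, downsample):
    """Report whether slicing + reshaping gave a view and how it is laid out.

    Hypothetical helper for illustration; not part of NumPy.
    """
    offset = val_pts.size % downsample
    chunks = val_pts[offset:].reshape((-1, downsample))
    return {
        # True if chunks still refers to the memory of val_pts (no copy made)
        "shares_memory": bool(np.shares_memory(chunks, val_pts)),
        # True if the rows of chunks are laid out back to back in memory
        "c_contiguous": bool(chunks.flags["C_CONTIGUOUS"]),
        "shape": chunks.shape,
    }

# A plain contiguous 1D buffer.
info_contiguous = describe_reshape(np.arange(1050, dtype=np.float64), 100)

# A strided 1D view (every other sample of a larger buffer).
info_strided = describe_reshape(np.arange(2100, dtype=np.float64)[::2], 100)

print(info_contiguous)
print(info_strided)
```

If the real val_pts turns out to be a non-contiguous view, that would be one possible reason why an explicit copy() makes the reductions faster, but this is an assumption to verify against the actual data.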


On Sat, Jun 19, 2010 at 4:37 PM, Benjamin Root <ben.root@ou.edu> wrote:
> Brad, I think you are doing it the right way, but I think what is happening
> is that the reshape() call on the sliced array is forcing a copy to be made
> first.  The fact that the copy has to be made twice just worsens the issue.
> I would save the reshape result once (it is usually a view of the
> original data, unless a copy is forced), and then perform the min/max calls
> on that with the appropriate axis.
>
> On that note, would it be a bad idea to have a function that returns a
> min/max tuple?  A single iteration that gathers both the min and the max at
> the same time, rather than two separate iterations, would be useful.  I
> should note that there is a numpy.ptp() function that returns
> the difference between the min and the max, but I don't see anything that
> returns the actual values.
>
> Ben Root
>
> On Thu, Jun 17, 2010 at 4:50 PM, Brad Buran <bburan@cns.nyu.edu> wrote:
>>
>> I have a 1D array with >100k samples that I would like to reduce by
>> computing the min/max of each "chunk" of n samples.  Right now, my
>> code is as follows:
>>
>> n = 100
>> offset = array.size % n
>> array_min = array[offset:].reshape((-1, n)).min(-1)
>> array_max = array[offset:].reshape((-1, n)).max(-1)
>>
>> However, this appears to be running pretty slowly.  The array is data
>> streamed in real-time from external hardware devices and I need to
>> downsample this and compute the min/max for plotting.  I'd like to
>> speed this up so that I can plot updates to the data as quickly as new
>> data comes in.
>>
>> Are there recommendations for faster ways to perform the downsampling?
>>
>> Thanks,
>> Brad
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
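
The suggestion in the thread (reshape once, then take min and max along the last axis, returned as a tuple) can be sketched as follows. The function name chunked_min_max is hypothetical, and this is not a single-pass implementation: it still makes two reductions over the reshaped array and only avoids repeating the slice and reshape.

```python
import numpy as np

def chunked_min_max(samples, n):
    """Return (mins, maxs) for consecutive chunks of n samples.

    Hypothetical helper for illustration. Leading samples that do not
    fill a whole chunk are dropped, as in the original snippet.
    """
    samples = np.asarray(samples)
    offset = samples.size % n
    # Reshape once and reuse the result for both reductions.
    chunks = samples[offset:].reshape((-1, n))
    return chunks.min(axis=-1), chunks.max(axis=-1)

# Usage: 10 samples in chunks of 3; the first sample is dropped.
mins, maxs = chunked_min_max(np.arange(10), 3)
print(mins, maxs)
```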