[Numpy-discussion] avoiding loops when downsampling arrays
eat
e.antero.tammi@gmail....
Mon Feb 6 16:38:34 CST 2012
Hi,
Sorry for my latest post, hands way too quick ;(
On Mon, Feb 6, 2012 at 9:16 PM, Moroney, Catherine M (388D) <
Catherine.M.Moroney@jpl.nasa.gov> wrote:
> Hello,
>
> I have to write a code to downsample an array in a specific way, and I am
> hoping that somebody can tell me how to do this without the nested
> do-loops. Here is the problem statement: Segment a (MXN) array into 4x4
> squares and set a flag if any of the pixels in that 4x4 square meet a
> certain condition.
>
> Here is the code that I want to rewrite avoiding loops:
>
> shape_out = (data_in.shape[0]/4, data_in.shape[1]/4)
> found = numpy.zeros(shape_out).astype(numpy.bool)
>
> for i in xrange(0, shape_out[0]):
>     for j in xrange(0, shape_out[1]):
>
>         excerpt = data_in[i*4:(i+1)*4, j*4:(j+1)*4]
>         mask = numpy.where( (excerpt >= t1) & (excerpt <= t2),
>                             True, False)
>         if (numpy.any(mask)):
>             found[i,j] = True
>
> Thank you for any hints and education!
>
Following closely on Warren's answer, here is a small demonstration of
code along these lines:
import numpy as np

def ds_0(data_in, t1=1, t2=4):
    # Reference version: loop over every 4x4 block.
    shape_out = (data_in.shape[0] // 4, data_in.shape[1] // 4)
    found = np.zeros(shape_out, dtype=bool)
    for i in range(shape_out[0]):
        for j in range(shape_out[1]):
            excerpt = data_in[i*4:(i+1)*4, j*4:(j+1)*4]
            mask = np.where((excerpt >= t1) & (excerpt <= t2), True, False)
            if np.any(mask):
                found[i, j] = True
    return found

# with stride_tricks you may cook up something like this:
from numpy.lib.stride_tricks import as_strided as ast

def _ss(dt, ds, s):
    # dt: strides of the input, ds: its shape, s: block shape.
    return {'shape': (ds[0] // s[0], ds[1] // s[1]) + s,
            'strides': (s[0] * dt[0], s[1] * dt[1]) + dt}

def _view(D, shape=(4, 4)):
    # 4-D view: (block row, block column, row in block, column in block).
    return ast(D, **_ss(D.strides, D.shape, shape))

def ds_1(data_in, t1=1, t2=4):
    excerpt = _view(data_in)
    mask = np.where((excerpt >= t1) & (excerpt <= t2), True, False)
    # Collapse the two in-block axes; a nonzero count means a hit.
    return mask.sum(2).sum(2).astype(bool)

if __name__ == '__main__':
    from numpy.random import randint
    r = randint(777, size=(64, 288)); print(r)
    print(np.allclose(ds_0(r), ds_1(r)))
When run, it produces output like:
In []: run dsa
[[ 60 470 521 ..., 147 435 295]
[246 127 662 ..., 718 525 256]
[354 384 205 ..., 225 364 239]
...,
[277 428 201 ..., 460 282 433]
[ 27 407 130 ..., 245 346 309]
[649 157 153 ..., 316 613 570]]
True
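To see what the strided view actually contains, a small sketch on an 8x8
array may help. The name block_view is made up for this illustration; it
builds the same shape and strides as _view above, and additionally passes
writeable=False, which I am assuming your NumPy supports (it was added to
as_strided in a later release than the one used in this thread).

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def block_view(a, block=(4, 4)):
    """Hypothetical helper: read-only view of 2-D `a` as non-overlapping blocks."""
    rows, cols = a.shape[0] // block[0], a.shape[1] // block[1]
    # First two axes pick the block, last two pick the pixel inside it.
    shape = (rows, cols) + block
    # Stepping to the next block means skipping a whole block of rows/columns.
    strides = (block[0] * a.strides[0], block[1] * a.strides[1]) + a.strides
    return as_strided(a, shape=shape, strides=strides, writeable=False)

a = np.arange(64).reshape(8, 8)
v = block_view(a)
print(v.shape)    # (2, 2, 4, 4)
print(v[0, 1])    # same values as a[0:4, 4:8]
```

No data is copied here: v[i, j] is just another way of addressing
a[4*i:4*i+4, 4*j:4*j+4].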
and a performance comparison:
In []: %timeit ds_0(r)
10 loops, best of 3: 56.3 ms per loop
In []: %timeit ds_1(r)
100 loops, best of 3: 2.17 ms per loop
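For comparison, the same per-block test can probably also be written with a
plain reshape instead of as_strided. This is only a sketch, not something I
have timed; the function name ds_reshape and the block parameter are
assumptions of mine, and it relies on any() accepting a tuple of axes.

```python
import numpy as np

def ds_reshape(data_in, t1=1, t2=4, block=4):
    """Hypothetical alternative to ds_1 using reshape rather than as_strided."""
    rows, cols = data_in.shape[0] // block, data_in.shape[1] // block
    # Drop trailing rows/columns that do not fill a whole block,
    # matching the integer division in the loop version.
    cropped = data_in[:rows * block, :cols * block]
    mask = (cropped >= t1) & (cropped <= t2)
    # After the reshape, axes 1 and 3 index the pixels inside each block.
    return mask.reshape(rows, block, cols, block).any(axis=(1, 3))

r = np.random.randint(777, size=(64, 288))
found = ds_reshape(r)
print(found.shape)    # (16, 72)
```

The comparison already yields a boolean array, so np.where(..., True, False)
is not needed, and any() avoids the detour through sum() and astype().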
My 2 cents,
eat
> Catherine