[Numpy-discussion] avoiding loops when downsampling arrays

eat e.antero.tammi@gmail....
Mon Feb 6 16:38:34 CST 2012


Hi,

Sorry for my previous post, my hands were way too quick ;(

On Mon, Feb 6, 2012 at 9:16 PM, Moroney, Catherine M (388D) <
Catherine.M.Moroney@jpl.nasa.gov> wrote:

> Hello,
>
> I have to write a code to downsample an array in a specific way, and I am
> hoping that somebody can tell me how to do this without the nested
> do-loops.  Here is the problem statement:  Segment a (MxN) array into 4x4
> squares and set a flag if any of the pixels in that 4x4 square meet a
> certain condition.
>
> Here is the code that I want to rewrite avoiding loops:
>
> shape_out = (data_in.shape[0]/4, data_in.shape[1]/4)
> found = numpy.zeros(shape_out).astype(numpy.bool)
>
> for i in xrange(0, shape_out[0]):
>        for j in xrange(0, shape_out[1]):
>
>                excerpt = data_in[i*4:(i+1)*4, j*4:(j+1)*4]
>                mask = numpy.where( (excerpt >= t1) & (excerpt <= t2), True, False)
>                if (numpy.any(mask)):
>                        found[i,j] = True
>
> Thank you for any hints and education!
>
Following closely on Warren's answer, here is a small demonstration. First,
your loop version wrapped in a function as a baseline:

import numpy as np


def ds_0(data_in, t1=1, t2=4):
    # Integer division, so the shape stays integral on Python 3 as well.
    shape_out = (data_in.shape[0] // 4, data_in.shape[1] // 4)
    found = np.zeros(shape_out, dtype=bool)
    for i in range(shape_out[0]):
        for j in range(shape_out[1]):
            # The (i, j)-th 4x4 block of the input.
            excerpt = data_in[i * 4:(i + 1) * 4, j * 4:(j + 1) * 4]
            mask = np.where((excerpt >= t1) & (excerpt <= t2), True, False)
            if np.any(mask):
                found[i, j] = True
    return found


# With stride_tricks you may cook up something like this:
from numpy.lib.stride_tricks import as_strided as ast


def _ss(dt, ds, s):
    # dt: strides of the input, ds: shape of the input, s: block shape.
    return {'shape': (ds[0] // s[0], ds[1] // s[1]) + s,
            'strides': (s[0] * dt[0], s[1] * dt[1]) + dt}


def _view(D, shape=(4, 4)):
    # 4-D view: axes 0 and 1 pick the block, axes 2 and 3 index inside it.
    return ast(D, **_ss(D.strides, D.shape, shape))


def ds_1(data_in, t1=1, t2=4):
    excerpt = _view(data_in)
    mask = np.where((excerpt >= t1) & (excerpt <= t2), True, False)
    # Count the hits within each block; a nonzero count means "found".
    return mask.sum(2).sum(2).astype(bool)


if __name__ == '__main__':
    from numpy.random import randint

    r = randint(777, size=(64, 288))
    print(r)
    # Both versions must flag exactly the same blocks.
    print(np.array_equal(ds_0(r), ds_1(r)))


When run, it prints something like:

In []: run dsa
[[ 60 470 521 ..., 147 435 295]
 [246 127 662 ..., 718 525 256]
 [354 384 205 ..., 225 364 239]
 ...,
 [277 428 201 ..., 460 282 433]
 [ 27 407 130 ..., 245 346 309]
 [649 157 153 ..., 316 613 570]]
True
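
To make the strided view less magical, here is a minimal sketch of what it
contains. The 8x12 array and the names `a` and `v` are only for illustration
and are not part of the script; the shape and strides are the ones that _ss()
would build for 4x4 blocks:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# Illustrative 8x12 array (an assumption for this sketch); both sides
# are multiples of 4.
a = np.arange(8 * 12).reshape(8, 12)
s0, s1 = a.strides

# Stepping one block down skips 4 rows, one block right skips 4 columns;
# the last two axes walk inside a block with the original strides.
v = as_strided(a, shape=(8 // 4, 12 // 4, 4, 4),
               strides=(4 * s0, 4 * s1, s0, s1))

print(v.shape)
# v[i, j] is the (i, j)-th 4x4 block of a, and no data is copied.
print(np.array_equal(v[1, 2], a[4:8, 8:12]))
```

Note that as_strided does no bounds checking, so a wrong shape or strides
can read memory outside the array. That is why the shape uses integer
division and covers whole blocks only.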

And a performance comparison:
In []: %timeit ds_0(r)
10 loops, best of 3: 56.3 ms per loop

In []: %timeit ds_1(r)
100 loops, best of 3: 2.17 ms per loop
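
If you would rather avoid as_strided, a plain reshape should give the same
flags. This is only a sketch and was not timed here: `ds_2` and the `block`
parameter are names made up for this example, and it assumes that trailing
rows and columns that do not fill a whole block can be ignored, which is what
the loop version does as well:

```python
import numpy as np


def ds_2(data_in, t1=1, t2=4, block=4):
    # Hypothetical variant, not part of the original script.
    m = data_in.shape[0] // block
    n = data_in.shape[1] // block
    # Drop any leftover rows/columns that do not fill a whole block.
    trimmed = data_in[:m * block, :n * block]
    # Axes: (block row, row in block, block column, column in block).
    blocks = trimmed.reshape(m, block, n, block)
    mask = (blocks >= t1) & (blocks <= t2)
    # A block is flagged if any pixel inside it meets the condition.
    return mask.any(axis=(1, 3))


if __name__ == '__main__':
    r = np.random.randint(777, size=(64, 288))
    print(ds_2(r).shape)
```

Using any() instead of summing also states the intent ("set a flag if any
pixel matches") directly.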


My 2 cents,

eat



> Catherine