# [Numpy-discussion] avoiding loops when downsampling arrays

Sturla Molden sturla@molden...
Tue Feb 7 11:10:43 CST 2012

```On 07.02.2012 15:27, eat wrote:

> This is elegant and very fast as well!

Just be aware that it depends on C ordered input. So:

m,n = data.shape
cond = lamda x : (x >= t1) & (x <= t2)
x = cond(np.ascontiguousarray(data)).reshape((m//4, 4, n//4, 4))
found = np.any(np.any(x, axis=1), axis=2)

With Fortran ordered data, we must reshape in the opposite direction:

(m,n) ---> (4, m//4, 4, n//4)

That is:

m,n = data.shape
cond = lamda x : (x >= t1) & (x <= t2)
x = cond(np.asfortranarray(data)).reshape((4, m//4, 4, n//4))
found = np.any(np.any(x, axis=0), axis=1)

On the other hand, using as_strided instead of reshape should work with
any ordering. Think of as_strided as a generalization of reshape.

A problem with both these approaches is that it implicitely loops over
discontiguous memory. The only solution to that is to use C, Fortran or
Cython.

So in Cython:

import numpy as np
cimport numpy as np
cimport cython

cdef inline np.npy_bool cond(np.float64_t x,
np.float64_t t1, np.float64_t t2):
return 1 if (x >= t1 and x <= t2) else 0

@cython.wraparound(False)
@cython.boundscheck(False)
@cython.cdivision(True)
def find(object data, np.float64_t t1, np.float64_t t2):

cdef np.ndarray[np.float64_t, ndim=2, mode='c'] x
cdef np.ndarray[np.npy_bool, ndim=2, mode='c'] found
cdef Py_ssize_t m, n, i, j

x = np.ascontiguousarray(data)
m,n = x.shape[0], x.shape[1]
found = np.zeros((m//4,n//4),dtype=bool)

for i in range(m):
for j in range(n):
found[i//4,j//4] = cond(x[i,j])

return found

It might be that the C compiler has difficulties optimizing the loop if
it think x and found cound be aliased, in which case the next logical
step would be C99 or Fortran...

Sturla

```