[Numpy-discussion] Optimized half-sizing of images?

Robert Bradshaw robertwb@math.washington....
Fri Aug 7 02:34:17 CDT 2009


On Aug 7, 2009, at 12:23 AM, Sebastian Haase wrote:

> On Fri, Aug 7, 2009 at 3:46 AM, Zachary Pincus
> <zachary.pincus@yale.edu> wrote:
>>> We have a need to generate half-size versions of RGB images as
>>> quickly as possible.
>>
>> How good do these need to look? You could just throw away every other
>> pixel... image[::2, ::2].
>>
>> Failing that, you could also try using ndimage's convolve routines to
>> run a 2x2 box filter over the image, and then throw away half of the
>> pixels. But this would be slower than optimal, because the kernel
>> would be convolved over every pixel, not just the ones you intend to
>> keep.
>>
>> Really though, I'd just bite the bullet

You say that as if it's painful to do so :)

-------------------------------------
import cython
import numpy as np
cimport numpy as np

@cython.boundscheck(False)
def halfsize_cython(np.ndarray[np.uint8_t, ndim=2, mode="c"] a):
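    cdef unsigned int i, j, w, h
    w, h = a.shape[0], a.shape[1]
    # Output buffer; // keeps the sizes integral under Python 3 semantics.
    cdef np.ndarray[np.uint8_t, ndim=2, mode="c"] a2 = np.empty((w // 2, h // 2), np.uint8)
    for i in range(w // 2):
        for j in range(h // 2):
            # Cast to int first so the four-pixel sum cannot wrap around in uint8.
            a2[i, j] = (<int>a[2*i, 2*j] + a[2*i+1, 2*j] + a[2*i, 2*j+1] + a[2*i+1, 2*j+1]) // 4
    return a2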

def halfsize_slicing(a):
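    # Divide each term by 4 before summing so the uint8 accumulator cannot
    # overflow; this rounds slightly differently from halfsize_cython.
    a2 = a[0::2, 0::2].astype(np.uint8) // 4
    a2 += a[0::2, 1::2] // 4
    a2 += a[1::2, 0::2] // 4
    a2 += a[1::2, 1::2] // 4
    return a2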
-------------------------------------

sage: import numpy; from half_size import *
sage: a = numpy.ndarray((512, 512), numpy.uint8)
sage: timeit("halfsize_cython(a)")
625 loops, best of 3: 604 µs per loop
sage: timeit("halfsize_slicing(a)")
5 loops, best of 3: 2.72 ms per loop
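
Both functions above take a 2-D array, while the original question was about RGB images. As a rough sketch only, here is how the same 2x2 box average might be written in plain NumPy for an (H, W, C) uint8 image. The name halfsize_rgb and the choice to drop an odd trailing row or column are my own assumptions, not something from the thread, and I have not timed it against the versions above.

```python
import numpy as np

def halfsize_rgb(img):
    """Average each 2x2 block of an (H, W, C) uint8 image (hypothetical helper).

    An odd trailing row or column is dropped.
    """
    h, w = img.shape[0] // 2, img.shape[1] // 2
    # Crop to even dimensions, then expose each 2x2 block on axes 1 and 3.
    blocks = img[:2 * h, :2 * w].reshape(h, 2, w, 2, -1)
    # Sum in uint16: four uint8 values total at most 4 * 255 = 1020.
    total = blocks.sum(axis=(1, 3), dtype=np.uint16)
    # Sum first, then divide, as in halfsize_cython.
    return (total // 4).astype(np.uint8)
```

Calling halfsize_rgb(img) on a (512, 512, 3) array should give a (256, 256, 3) uint8 array. Like the slicing version, this makes intermediate copies, so it is more memory-bound than the Cython loop.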


>> and write a C extension (or Cython, whatever; an extension that
>> works for a defined-dimensionality, defined-dtype array is pretty
>> simple), or as suggested before, do it on the GPU. (Though I find
>> that readback from the GPU can be slow enough that C code can beat
>> it in some cases.)
>>
>> Zach
>
> Chris,
> regarding your concerns about doing too fancy an interpolation at the
> cost of speed, I would guess the overall bottleneck is memory access
> rather than the extra CPU cycles needed for interpolation.
> Regarding ndimage.zoom it should be able to "not zoom" the color-axis
> but the others in one call.

I was about to say the same thing: it's probably memory access, not
CPU cycles, that's hurting you. Of course, a 512x512 uint8 array
(256 KiB) is still small enough to fit in the L2 cache of any modern
computer.

- Robert

