[Numpy-discussion] efficient 3d histogram creation
Chris Colbert
sccolbert@gmail....
Wed May 6 19:34:18 CDT 2009
nice!
This was really my first attempt at doing anything constructive with Cython.
It was actually unbelievably easy to work with. I think i spent less time
working on this, than I did trying to find an optimized solution using pure
numpy and python.
Chris
On Wed, May 6, 2009 at 8:21 PM, <josef.pktd@gmail.com> wrote:
> On Wed, May 6, 2009 at 7:39 PM, Chris Colbert <sccolbert@gmail.com> wrote:
> > i just realized I don't need the line:
> >
> > cdef int z = img.shape(2)
> >
> > it's left over from tinkering. sorry. And i should probably convert the
> out
> > array to type float to handle large data sets.
> >
> > Chris
> >
> > On Wed, May 6, 2009 at 7:30 PM, <josef.pktd@gmail.com> wrote:
> >>
> >> On Wed, May 6, 2009 at 6:06 PM, Chris Colbert <sccolbert@gmail.com>
> wrote:
> >> > I decided to hold myself over until being able to take a hard look at
> >> > the
> >> > numpy histogramdd code:
> >> >
> >> > Here is a quick thing a put together in cython. It's a 40x speedup
> over
> >> > histogramdd on Vista 32 using the minGW32 compiler. For a (480, 630,
> 3)
> >> > array, this executed in 0.005 seconds on my machine.
> >> >
> >> > This only works for arrays with uint8 data types having dimensions (x,
> >> > y, 3)
> >> > (common image format). The return array is a (16, 16, 16) equal width
> >> > bin
> >> > histogram of the input.
> >> >
> >> > If anyone wants the cython C-output, let me know and I will email it
> to
> >> > you.
> >> >
> >> > If there is interest, I will extend this for different size bins and
> >> > aliases
> >> > for different data types.
> >> >
> >> > Chris
> >> >
> >> > import numpy as np
> >> >
> >> > cimport numpy as np
> >> >
> >> > DTYPE = np.uint8
> >> > DTYPE32 = np.int
> >> >
> >> > ctypedef np.uint8_t DTYPE_t
> >> > ctypedef np.int_t DTYPE_t32
> >> >
> >> > def hist3d(np.ndarray[DTYPE_t, ndim=3] img):
> >> > cdef int x = img.shape[0]
> >> > cdef int y = img.shape[1]
> >> > cdef int z = img.shape[2]
> >> > cdef int addx
> >> > cdef int addy
> >> > cdef int addz
> >> > cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16],
> >> > dtype=DTYPE32)
> >> > cdef int i, j, v0, v1, v2
> >> >
> >> >
> >> > for i in range(x):
> >> > for j in range(y):
> >> > v0 = img[i, j, 0]
> >> > v1 = img[i, j, 1]
> >> > v2 = img[i, j, 2]
> >> > addx = (v0 - (v0 % 16)) / 16
> >> > addy = (v1 - (v1 % 16)) / 16
> >> > addz = (v2 - (v2 % 16)) / 16
> >> > out[addx, addy, addz] += 1
> >> >
> >> > return out
> >> >
> >>
> >> Thanks for the example for using cython. Once I figure out what the
> >> types are, cython will look very convenient for loops, and pyximport
> >> takes care of the compiler.
> >>
> >> Josef
> >>
> >> import pyximport; pyximport.install()
> >> import hist_rgb #name of .pyx files
> >>
> >> import numpy as np
> >> factors = np.random.randint(256,size=(480, 630, 3))
> >> h = hist_rgb.hist3d(factors.astype(np.uint8))
> >> print h[:,:,0]
>
>
> playing some more with cython: here is a baby on the fly code generator
> input type int, output type float64
>
> a dispatch function by type is missing
>
> no segfaults, even though most of the time a call the function with
> the wrong type.
>
> Josef
>
> code = '''
> import numpy as np
> cimport numpy as np
>
> __all__ = ["hist3d"]
> DTYPE = ${imgtype}
> DTYPE32 = ${outtype}
>
> ctypedef ${imgtype}_t DTYPE_t
> ctypedef ${outtype}_t DTYPE_t32
>
> def hist3d(np.ndarray[DTYPE_t, ndim=3] img):
> cdef int x = img.shape[0]
> cdef int y = img.shape[1]
> #cdef int z = img.shape[2]
> cdef int addx
> cdef int addy
> cdef int addz
> cdef np.ndarray[DTYPE_t32, ndim=3] out = np.zeros([16, 16, 16],
> dtype=DTYPE32)
> cdef int i, j, v0, v1, v2
>
>
> for i in range(x):
> for j in range(y):
> v0 = img[i, j, 0]
> v1 = img[i, j, 1]
> v2 = img[i, j, 2]
> addx = (v0 - (v0 % 16)) / 16
> addy = (v1 - (v1 % 16)) / 16
> addz = (v2 - (v2 % 16)) / 16
> out[addx, addy, addz] += 1
>
> return out
> '''
>
> from string import Template
> s = Template(code)
> src = s.substitute({'imgtype': 'np.int', 'outtype': 'np.float64'})
> open('histrgbintfl2.pyx','w').write(src)
>
> import pyximport; pyximport.install()
> import histrgbintfl2
>
> import numpy as np
> factors = np.random.randint(256,size=(480, 630, 3))
> h = histrgbintfl2.hist3d(factors)#.astype(np.uint8))
> print h[:,:,0]
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20090506/07013074/attachment-0001.html
More information about the Numpy-discussion
mailing list