[Numpy-discussion] Help speeding up element-wise operations for video processing

Brendan Simons spam4bsimons@yahoo...
Tue Sep 16 18:48:26 CDT 2008


Why would I need the GPU to do parallel operations?  I thought most  
modern processors have vector units.  I just don't know if there's a  
way to have my code use them.

I tried a quick test with PyOpenGL (as quick as can be done in that  
crazy API), but I found that adding two textures of the same size as ain  
took nearly as long as numpy: maybe twice as fast, but I need it to be  
more like 10x faster.  I can send you the code if you like, but it  
may be getting a bit out of scope for this forum.

Thanks,
    Brendan

On 16-Sep-08, at 10:04 AM, David Huard wrote:

> Brendan,
>
> Not sure if I understand correctly what you want, but ...
>
> Numpy vector operations are performed in C, so there will be an  
> iteration over the array elements.
>
> For parallel operations over all pixels, you'd need a package that  
> talks to your GPU, such as pyGPU.
> I've never tried it and if you do, please report your experience,  
> I'd be very interested to hear about it.
>
> HTH,
>
> David
>
>
>
>
> On Tue, Sep 16, 2008 at 4:50 AM, Stéfan van der Walt  
> <stefan@sun.ac.za> wrote:
> Hi Brendan
>
> 2008/9/16 brendan simons <spam4bsimons@yahoo.ca>:
> > #interpolate the green pixels from the bayer filter image ain
> > g = greenMask * ain
> > gi = g[:-2, 1:-1].astype('uint16')
> > gi += g[2:, 1:-1]
> > gi += g[1:-1, :-2]
> > gi += g[1:-1, 2:]
> > gi //= 4
> > gi += g[1:-1, 1:-1]
> > return gi
>
> I may be completely off base here, but you should be able to do this
> *very* quickly using your GPU, or even just using OpenGL.  Otherwise,
> coding it up in ctypes is easy as well (I can send you a code snippet,
> if you need).
>
> > I do something similar for red and blue, then stack the interpolated red,
> > green and blue integers into an array of 24 bit integers and blit to the
> > screen.
> >
> > I was hoping that none of the lines would have to iterate over pixels, and
> > would instead do the adds and multiplies as single operations.  Perhaps numpy
> > has to iterate when copying a subset of an array?  Is there a faster array
> > "crop"?  Any hints as to how I might code this part up using ctypes?
>
> Have you tried formulating this as a convolution, and using
> scipy.signal's 2-d convolve or fftconvolve?
>
> Cheers
> Stéfan
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion@scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion
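
The convolution formulation suggested in the quoted message can be sketched roughly as below. This is only an illustration, not code from anyone in the thread: the checkerboard `make_green_mask` helper, the function names, and the test frame size are assumptions, and the kernel is simply the 4-neighbour cross implied by the slicing code. It uses `scipy.signal.convolve2d` with `mode="valid"` and keeps a slicing version alongside so the two results can be compared.

```python
import numpy as np
from scipy.signal import convolve2d

# 4-neighbour cross. It is symmetric, so convolution and correlation agree.
KERNEL = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]], dtype=np.int32)


def make_green_mask(shape):
    """Hypothetical checkerboard mask: 1 where (row + col) is even, else 0."""
    rows, cols = np.indices(shape)
    return ((rows + cols) % 2 == 0).astype(np.uint8)


def interpolate_green_slices(ain, green_mask):
    """Slicing version, following the snippet quoted in the thread."""
    g = (green_mask * ain).astype(np.uint16)
    gi = g[:-2, 1:-1].copy()   # neighbour above (copy, so g is not modified)
    gi += g[2:, 1:-1]          # neighbour below
    gi += g[1:-1, :-2]         # neighbour to the left
    gi += g[1:-1, 2:]          # neighbour to the right
    gi //= 4                   # integer average of the four neighbours
    gi += g[1:-1, 1:-1]        # add back the pixels that were already green
    return gi


def interpolate_green_convolve(ain, green_mask):
    """Same computation, with the neighbour sum done by a 2-d convolution."""
    g = (green_mask * ain).astype(np.int32)
    # mode="valid" drops the one-pixel border, matching the slices above.
    neighbours = convolve2d(g, KERNEL, mode="valid")
    return (neighbours // 4 + g[1:-1, 1:-1]).astype(np.uint16)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)
    mask = make_green_mask(frame.shape)
    same = np.array_equal(interpolate_green_slices(frame, mask),
                          interpolate_green_convolve(frame, mask))
    print("results match:", same)
```

Whether the convolution is actually faster than the slicing version is not established here; that would have to be timed on real frames.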
