[Numpy-discussion] Help speeding up element-wise operations for video processing

Francesc Alted faltet@pytables....
Wed Sep 17 04:34:02 CDT 2008


On Wednesday 17 September 2008, Brendan Simons wrote:
[clip]
> I would love a c-types code snippet.  I'm not very handy in c.  Since
> I gather numpy is row-major, I thought I could do up and down crops very
> quickly by moving the start and end pointers of the array.  For
> cropping left and right, is there a fast c command for "copy while
> skipping every nth hundred bytes"?

There is no efficient "copy while skipping every nth hundred bytes" 
operation in C or in any other language.  You are facing a fundamental 
limitation of modern processor architectures here, namely the (large) 
memory latency.  When memory is accessed in discontiguous patterns like 
the one you describe, the processor spends most of its time waiting for 
data.  There are ways to give hints to compilers so that they prefetch 
the data of interest better, but this is a rather complex process, and 
the improvements can be meager in most cases.
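
To make the access pattern concrete, here is a minimal NumPy sketch.  The 
frame size and crop margins are made-up values, not taken from your code.  
It only illustrates that an up/down crop of a row-major array is still one 
contiguous block, while a left/right crop keeps a short run of each row 
and skips the rest:

```python
import numpy as np

# Hypothetical 8-bit grayscale frame, 480 rows x 640 columns, row-major (C order).
frame = np.zeros((480, 640), dtype=np.uint8)

# Up/down crop: drops whole rows, so the result is one contiguous block.
rows_only = frame[100:380, :]

# Left/right crop: keeps 440 bytes of every 640-byte row and skips the other 200.
cols_only = frame[:, 100:540]

print(rows_only.flags['C_CONTIGUOUS'])  # True
print(cols_only.flags['C_CONTIGUOUS'])  # False
print(cols_only.strides)                # (640, 1): rows are still 640 bytes apart
```

Both slices are views on the original buffer; nothing is copied until you 
ask for a contiguous result.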

In brief, if you don't have much time to spend on this, my advice is to 
just use regular assignment or memcpy (whichever is more convenient for 
your situation), because you won't be able to get more performance than 
what these offer.
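
As a rough sketch of what "regular assignment" could look like from NumPy 
(the array names, sizes and crop window below are illustrative 
assumptions), slicing selects the crop as a view, and assigning it into a 
preallocated buffer lets NumPy do the row-by-row copy in C, with no 
Python-level loop:

```python
import numpy as np

# Hypothetical frame; values are row-major indices so the copy can be checked.
frame = np.arange(480 * 640, dtype=np.uint32).reshape(480, 640)

top, bottom, left, right = 100, 380, 100, 540  # made-up crop window

# Preallocate the destination once so it can be reused for every frame.
cropped = np.empty((bottom - top, right - left), dtype=frame.dtype)

# Regular assignment: each row segment is copied into the contiguous buffer.
cropped[...] = frame[top:bottom, left:right]
```

Reusing the destination buffer avoids allocating a new array per frame; 
whether that matters in practice depends on your frame rate and sizes.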

However, if you have more time and want to look for ways to squeeze 
more performance out of different memory access patterns, it is always 
wise to have a look at the excellent "What Every Programmer Should Know 
About Memory" report:

http://people.redhat.com/drepper/cpumemory.pdf

[Incidentally, this is possibly one of the best reports available on the 
subject of memory access on modern architectures (a critical thing 
for achieving maximum performance), and besides, it is available for 
free, so there is no excuse not to have a look at it ;-)]

Cheers,

-- 
Francesc Alted

