[Numpy-discussion] Help speeding up element-wise operations for video processing
Francesc Alted
faltet@pytables....
Wed Sep 17 04:34:02 CDT 2008
On Wednesday 17 September 2008, Brendan Simons wrote:
[clip]
> I would love a ctypes code snippet. I'm not very handy in C. Since
> I gather numpy is row-major, I thought I could do up and down crops
> very quickly by moving the start and end pointers of the array. For
> cropping left and right, is there a fast c command for "copy while
> skipping every nth hundred bytes"?
There is no efficient "copy while skipping every nth hundred bytes"
operation in C or in any other language. You are facing a
fundamental problem in the design of modern processor architectures,
namely the (large) memory latency. When memory is accessed in
discontiguous patterns like the one you describe, the processor spends
most of its time waiting for data. There are ways to give hints to
compilers so that they pre-fetch the needed data more effectively, but
this is a rather complex process, and the improvements are meager in
most cases.
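
To make the access pattern concrete, here is a small NumPy sketch. The
frame size and crop bounds are made up for illustration. It shows that a
top/bottom crop of a row-major array is still one contiguous block,
while a left/right crop leaves a gap between consecutive rows:

```python
import numpy as np

# Hypothetical frame: 480 rows x 640 columns of 8-bit pixels, row-major.
frame = np.zeros((480, 640), dtype=np.uint8)

# Top/bottom crop: a view covering one contiguous block of memory.
rows = frame[100:300, :]

# Left/right crop: a view where each row holds 200 bytes, but the
# next row still starts 640 bytes after the previous one.
cols = frame[:, 200:400]

print(rows.flags['C_CONTIGUOUS'])
print(cols.flags['C_CONTIGUOUS'])
print(cols.strides)
```

Neither slice copies any data; both are views on the original buffer.
Only the second one forces a strided walk through memory when it is
read or copied.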
In brief, if you don't have much time to spend on this, my advice is to
just use regular assignment or memcpy (whichever is more convenient for
your situation), because you won't be able to get more performance than
what these offer.
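
As a rough sketch of the "regular assignment" route in NumPy: the helper
name crop_copy, the frame size and the crop bounds below are all
hypothetical, chosen only for illustration. Slicing gives a view, and a
plain copy or assignment then moves the data row by row:

```python
import numpy as np

def crop_copy(frame, top, bottom, left, right):
    """Return a contiguous copy of the cropped region (hypothetical helper)."""
    view = frame[top:bottom, left:right]   # no data is copied here
    return view.copy()                     # a single strided copy

# Hypothetical 480x640 frame filled with distinct values.
frame = np.arange(480 * 640, dtype=np.uint32).reshape(480, 640)

cropped = crop_copy(frame, 10, 20, 30, 50)

# Alternative: reuse a preallocated buffer with regular assignment,
# which avoids allocating a new array for every video frame.
out = np.empty((10, 20), dtype=frame.dtype)
out[...] = frame[10:20, 30:50]
```

Reusing a preallocated buffer may help when the same crop is applied to
every frame, though the copy itself is still bound by memory access.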
However, if you have more time and want to look for ways to squeeze
more performance out of different memory access patterns, it is always
wise to have a look at the excellent "What Every Programmer Should Know
About Memory" report:
http://people.redhat.com/drepper/cpumemory.pdf
[Incidentally, this is possibly one of the best reports available on the
subject of memory access on today's architectures (a critical thing
for achieving maximum performance), and besides it is available for
free, so there is no excuse not to have a look at it ;-)]
Cheers,
--
Francesc Alted