[Numpy-discussion] performance comparison of C++ vs Numeric (MA) operations.
Paul F. Dubois
paul at pfdubois.com
Tue Jun 12 20:08:46 CDT 2001
I have a timing benchmark for MA that computes the ratio MA/Numeric for two cases:
1. there is actually no mask
2. there is a mask
For N=50,000 these ratios are usually around 1.3 and 1.8 respectively.
It makes sense in the second case that the number might be around 2 since
you have to pass through the mask data as well, even if it is only bytes.
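
A ratio benchmark of this kind could be sketched roughly as below. This is not the original MA benchmark: it assumes modern numpy and numpy.ma as stand-ins for Numeric and MA, and the function name ma_overhead, the 10% mask density and the repeat counts are all made up for illustration. The ratios it prints depend on the machine and library version and need not match the 1.3 and 1.8 quoted above.

```python
import timeit

import numpy as np
import numpy.ma as ma


def ma_overhead(n=50_000, repeats=20):
    """Return (no-mask ratio, with-mask ratio) of masked vs plain multiply time.

    Assumption: numpy / numpy.ma stand in for Numeric / MA.
    """
    rng = np.random.default_rng(0)
    a = rng.random(n)
    b = rng.random(n)
    mask = rng.random(n) < 0.1  # hypothetical: roughly 10% of elements masked

    # Case 1: masked arrays that carry no mask at all.
    a_nomask, b_nomask = ma.array(a), ma.array(b)
    # Case 2: masked arrays that carry a real mask.
    a_mask, b_mask = ma.array(a, mask=mask), ma.array(b, mask=mask)

    def best(fn):
        # Take the fastest of several runs to reduce timing noise.
        return min(timeit.repeat(fn, number=repeats, repeat=5))

    t_plain = best(lambda: a * b)
    t_nomask = best(lambda: a_nomask * b_nomask)
    t_mask = best(lambda: a_mask * b_mask)
    return t_nomask / t_plain, t_mask / t_plain


if __name__ == "__main__":
    r_nomask, r_mask = ma_overhead()
    print(f"masked/plain, no mask:   {r_nomask:.2f}")
    print(f"masked/plain, with mask: {r_mask:.2f}")
```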
In short, there is this much overhead to MA. If you got MA/C++ = 1.67 it
would indicate Numpy/C++ comparable. The tests Jim did when he first wrote
it were about 10% worse than C.
Your C++ uses a special value instead of a mask array, which may mean that
you traded space for CPU time, and with arrays that large maybe that
causes some page faults (?) Anyway you're comparing apples and oranges a
Anyway, my point is this is probably an MA issue rather than a Numpy issue.
However, please note that I did not (yet) do any of the normal profiling and
testing that one would do to speed MA up, such as putting key parts in C.
This is just not an issue for me right now.
From: numpy-discussion-admin at lists.sourceforge.net
[mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Joe
Sent: Tuesday, June 12, 2001 5:20 PM
Subject: [Numpy-discussion] performance comparison of C++ vs Numeric
I was curious about the relative performance of C++ vs Numeric Python,
for operations on arrays of roughly 400,000 array elements. I built a
simple array single precision multiplication function in C++, that
performs an element by element multiply, checking whether each element
is "valid" or "missing data".
Then, for comparison, I wrote a similar multiplication routine, using
the Masked Array (MA) package of Numeric Python.
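
The two approaches being compared can be sketched side by side as follows. This is neither the attached C++ code nor the original MA routine. It assumes numpy and numpy.ma in place of Numeric and MA, and the sentinel value -9999.0 and both function names are hypothetical.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical "missing data" sentinel; the value used in the C++ code is not given.
MISSING = np.float32(-9999.0)


def multiply_sentinel(a, b, missing=MISSING):
    """Special-value approach: check every element against the sentinel."""
    valid = (a != missing) & (b != missing)
    out = np.full(a.shape, missing, dtype=np.float32)
    # Multiply only where both inputs are valid; other slots keep the sentinel.
    np.multiply(a, b, out=out, where=valid)
    return out


def multiply_masked(a, b, missing=MISSING):
    """Mask-array approach: carry a separate boolean mask next to the data."""
    return ma.masked_values(a, missing) * ma.masked_values(b, missing)


if __name__ == "__main__":
    a = np.array([1.0, 2.0, MISSING, 4.0], dtype=np.float32)
    b = np.array([2.0, MISSING, 3.0, 0.5], dtype=np.float32)
    print(multiply_sentinel(a, b))
    print(multiply_masked(a, b))
```

The sentinel version stores nothing extra but has to compare every element; the masked version reads and writes a second array of flags, which is the extra pass over memory that a mask implies.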
I compiled Numeric Python (20.1.0b2) with '-O3', by modifying setup.py
to contain lines like
On an 800 MHz dual processor Dell Linux box, using gcc 2.95.3,

Numeric Python        5.0e6 multiplies/second
Numeric Python -O3    6.1e6 multiplies/second
C++                  10.3e6 multiplies/second
C++ -O3              10.3e6 multiplies/second
(I tried using "plain" Numeric arrays, rather than Masked arrays, and it
didn't seem to make much difference.)
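
For reference, the time ratios implied by these throughput figures can be worked out as below. Only the four rates come from the measurements above. Dividing by the MA/Numeric overhead of about 1.3 to 1.8 reported in the reply is an extrapolation, since that overhead was measured at N=50,000 rather than at roughly 400,000 elements, and the variable names are made up for this sketch.

```python
# Throughput figures from the benchmark above, in multiplies/second.
rates = {
    "Numeric Python": 5.0e6,
    "Numeric Python -O3": 6.1e6,
    "C++": 10.3e6,
    "C++ -O3": 10.3e6,
}

# A time ratio is the inverse of a throughput ratio.
ma_over_cpp_plain = rates["C++"] / rates["Numeric Python"]
ma_over_cpp_o3 = rates["C++ -O3"] / rates["Numeric Python -O3"]

# Assumption: the MA/Numeric overhead (about 1.3 without a mask, 1.8 with one)
# measured at N=50,000 also applies at this array size.
numeric_over_cpp_low = ma_over_cpp_o3 / 1.8
numeric_over_cpp_high = ma_over_cpp_o3 / 1.3

print(f"MA/C++ without -O3:  {ma_over_cpp_plain:.2f}")  # 2.06
print(f"MA/C++ with -O3:     {ma_over_cpp_o3:.2f}")     # 1.69
print(f"implied Numeric/C++: {numeric_over_cpp_low:.2f} to {numeric_over_cpp_high:.2f}")
```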
Has anyone else benchmarked the relative performance of C/C++ vs Numeric
Does anyone know of other optimizations to Numeric Python that could be
I know a more realistic benchmark would include I/O, which might tend to
reduce the apparent difference in performance.
I've attached the benchmark modules, in case someone would like to
National Center for Atmospheric Research
Internet: vanandel at ucar.edu