[Numpy-discussion] Matlab/Numeric/numarray benchmarks

Jason Rennie jrennie at csail.mit.edu
Thu Jan 20 07:25:49 CST 2005

I have access to a variety of Intel machines running Debian Sarge, and
I'm trying to decide between numarray and Numeric for some experiments
I'm about to run, so I thought I'd try out this benchmark.  I need
fast matrix multiplication and element-wise operations.  Here are the
results I see:

Matlab:     0.0475  1.44    5.78
Numeric:    0.0842  1.19    6.28
numarray:   7.62    9.78    Floating point exception

Matlab:     0.0143  1.00    3.08
Numeric:    0.0653  1.19    6.26
numarray:   3.46    8.30    Floating point exception

Matlab:     0.0102  0.886   2.71
Numeric:    0.0272  10.2    2.46
numarray:   2.23    3.43    Floating point exception

Numarray performance is pitiful.  Numeric ain't bad, except for that
matrixmultiply on the Xeon.  As luck would have it, our
cpu-cycle-servers are all Xeons, and the main big computations I have
to do are matrix multiplies...  Grrr...

All three machines are Debian Sarge with atlas3-sse2 plus all the
python2.3 packages installed.  I had to include /usr/lib/atlas/sse2 in
my LD_LIBRARY_PATH.  Anyone have any clue why the Xeon would balk at
the Numeric matrixmultiply?  Thinking it might be an atlas3-sse2
issue, I tried atlas3-sse:

Xeon/atlas3-sse/Numeric:    0.0269  10.2    2.44
Xeon/atlas3-sse/numarray:   2.24    3.41    2.48
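
For anyone repeating the LD_LIBRARY_PATH step mentioned above, here is a
minimal sketch of one way to script it.  Only the directory
/usr/lib/atlas/sse2 comes from this message; the helper name
env_with_library_dir, the interpreter name and the script location in the
commented-out launch line are assumptions for illustration.  It is written
for a current Python 3 interpreter rather than python2.3.

```python
import os


def env_with_library_dir(directory, base_env=None):
    """Return a copy of base_env with directory prepended to LD_LIBRARY_PATH.

    base_env defaults to the current process environment.  The original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    if existing:
        env["LD_LIBRARY_PATH"] = directory + os.pathsep + existing
    else:
        env["LD_LIBRARY_PATH"] = directory
    return env


# Directory taken from the message; the base environment here is a toy one.
demo_env = env_with_library_dir("/usr/lib/atlas/sse2", {"PATH": "/usr/bin"})
print(demo_env["LD_LIBRARY_PATH"])

# Hypothetical launch (interpreter and script names are assumptions):
# import subprocess
# subprocess.call(["python", "benchmark.py"],
#                 env=env_with_library_dir("/usr/lib/atlas/sse2"))
```

The variable has to be set before the interpreter starts, since the dynamic
loader reads it at process start-up, which is why the sketch builds an
environment for a child process instead of changing os.environ in place.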

Apparently, there's a bug in the sse2 libraries that numarray is
tripping...  Still horrible Numeric/matrixmultiply
performance... Interesting that sse2 doesn't provide a performance
boost over sse.  I tried it on another Xeon machine... same bad
Numeric/matrixmultiply performance.  I tried atlas3-base (386
instructions only):

Xeon/atlas3-base/Numeric:   0.0269  10.2    2.60
Xeon/atlas3-base/numarray:  2.23    3.41    2.54

Sheesh!  No worse than the libraries w/ sse instructions...  But
still, no improvement in the Numeric/matrixmultiply test.  Next,

Xeon/Numeric:               0.0271  3.45    2.72
Xeon/numarray:              2.24    3.42    2.62

Progress!  Though, the Numeric/matrixmultiply is still four times
slower than Matlab...
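
The "four times" figure can be checked with a few lines of arithmetic.
This sketch assumes that the second column of each row is the
matrixmultiply timing and that the third block of results at the top is
the Xeon (its Numeric row matches the Xeon rows further down); neither is
stated explicitly above.  The function name slowdown is only illustrative.

```python
# Second-column timings copied from this message.
# Assumption: the second column is matrixmultiply and the third
# Matlab/Numeric/numarray block above was measured on the Xeon.
matlab_xeon = 0.886
numeric_xeon = {
    "Xeon/atlas3-sse/Numeric": 10.2,
    "Xeon/Numeric": 3.45,
}


def slowdown(timing, baseline):
    """How many times slower `timing` is than `baseline`."""
    return timing / baseline


for label, timing in numeric_xeon.items():
    print("%-26s %5.1fx slower than Matlab" % (label, slowdown(timing, matlab_xeon)))
```

Under those assumptions the last run works out to roughly 3.9 times the
Matlab timing, and the atlas3 runs to roughly 11.5 times.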

As far as I can tell, I'm out of (Debian Sarge) libraries to
try... Any ideas as to why the Numeric matrixmultiply would be so slow
on the Xeon?



P.S. I had to move the import statements to the top of the file to get
benchmark.py to work.  As a sanity check, I tried only importing sys,
time, Numeric, and RandomArray, defining test10.  I then called
test10().  Same results as above.
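
For a similar sanity check outside benchmark.py, here is a minimal timing
sketch.  It does not reproduce test10, whose body is not shown in this
message: standin_workload is a hypothetical pure-Python placeholder, and
taking the best of a few repeats is my own choice, not necessarily what
benchmark.py does.  It targets a current Python 3 interpreter.

```python
import time


def best_time(func, repeats=3):
    """Run func() `repeats` times and return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def standin_workload(n=40):
    """Hypothetical placeholder for test10: a naive n x n matrix product."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    return [
        [sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


print("stand-in workload: %.4f s" % best_time(standin_workload))
```

Swapping standin_workload for the real test function would give a
single-function timing that does not depend on the rest of the script.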
