[Numpy-discussion] Speeding up Numeric

Rory Yorke ryorke at telkomsa.net
Mon Jan 24 10:54:16 CST 2005

Todd Miller <jmiller at stsci.edu> writes:

> This looked fantastic so I tried it over the weekend.  On Fedora Core 3,
> I couldn't get any information about numarray runtime (in the shared
> libraries),  only Python.  Ditto with Numeric,  although from your post
> you apparently got great results including information on Numeric .so's.
> I'm curious: has anyone else tried this for numarray (or Numeric) on
> Fedora Core 3?  Does anyone have a working profile script?

I think you need to pass --separate=lib to opcontrol when starting the
profiler. (See the sample session later in this message.)
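As a rough starting point for a profile script, the steps from the sample session later in this message can be written down as an ordered command list. This is only a sketch: oprofile_commands and its default arguments are names made up for illustration and are not part of oprofile. The function only builds the commands and does not run them.

```python
import shlex


def oprofile_commands(script, python="python2.4",
                      opcontrol="opcontrol", opreport="opreport"):
    """Return the argv lists for one profiling run, in order.

    Hypothetical helper: nothing is executed here. `python` should be
    the full path to the interpreter binary, since opreport is given
    the image to report on (the session uses $(which python2.4)).
    """
    return [
        ["sudo", "modprobe", "oprofile"],                   # load the kernel module
        ["sudo", opcontrol, "--start", "--separate=lib"],   # per-library samples
        ["sudo", opcontrol, "--reset"],                     # discard old samples
        [python, script],                                   # the workload
        ["sudo", opcontrol, "--shutdown"],                  # stop the daemon
        [opreport, "-t", "2", "-l", python],                # symbol-level report
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in oprofile_commands("longadd.py"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the list first makes it easy to check paths (for example a privately installed opcontrol) before running anything under sudo.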

Some comments on oprofile:

- I think the oprofile tools (opcontrol, opreport etc.) are separate
  from the oprofile module, which is part of the kernel. I installed
  oprofile-0.8.1 from source, and it works with my standard Ubuntu
  kernel. It is easy to install it in a non-standard location
  ($HOME/usr on my system).

- I think opstack is part of oprofile 0.8 (or maybe 0.8.1) -- it
  wasn't in the 0.7.1 package available for Ubuntu. Also, to actually
  get callgraphs (from opstack), you need a patched kernel; see here:


- I think you probably *shouldn't* compile with -pg if you use
  oprofile, but you should use -g.

To profile shared libraries, I also tried the following:

- sprof. Some sort of dark art glibc tool. I couldn't get this to work
  with dlopen()'ed libraries (in which class I believe Python C
  extensions fall).

- qprof (http://www.hpl.hp.com/research/linux/qprof/). Almost worked,
  but I couldn't get it to identify symbols in shared libraries. Their
  page has a list of other profilers.

I also tried the Python 2.4 profile module; it does support
C-extension functions as advertised, but it seemed to miss object
instantiation calls (_numarray._numarray's instantiation, in this

Sample oprofile usage on my Ubuntu box:

rory@foo:~/hack/numarray/profile $ cat longadd.py
import numarray as na
a = na.arange(1000.0)
b = na.arange(1000.0)
for i in xrange(1000000):
    a + b
rory@foo:~/hack/numarray/profile $ sudo modprobe oprofile
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --start --separate=lib
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/oprofiled.log
Daemon started.
Profiler running.
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --reset
Signalling daemon... done
rory@foo:~/hack/numarray/profile $ python2.4 longadd.py
rory@foo:~/hack/numarray/profile $ sudo ~/usr/bin/opcontrol --shutdown
Stopping profiling.
Killing daemon.
rory@foo:~/hack/numarray/profile $ opreport -t 2 -l $(which python2.4)
CPU: Athlon, speed 1836.45 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
47122    11.2430  _ufuncFloat64.so         add_ddxd_vvxv
26731     6.3778  python2.4                PyEval_EvalFrame
24122     5.7553  libc-2.3.2.so            memset
21228     5.0648  python2.4                lookdict_string
10583     2.5250  python2.4                PyObject_GenericGetAttr
9131      2.1786  libc-2.3.2.so            mcount
9026      2.1535  python2.4                PyDict_GetItem
8968      2.1397  python2.4                PyType_IsSubtype
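To see how the samples split between the interpreter, libc and the ufunc module, the report rows above can be summed per image. The following is a minimal sketch, assuming the column layout shown (samples, percent, image name, symbol name) and image names without spaces. samples_by_image is a hypothetical helper, not an oprofile tool.

```python
from collections import defaultdict


def samples_by_image(report_text):
    """Sum the 'samples' column of opreport -l output per image name."""
    totals = defaultdict(int)
    for line in report_text.splitlines():
        fields = line.split()
        # Data rows start with an integer sample count; the CPU/event
        # header lines and the column-title line do not, so skip them.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        totals[fields[2]] += int(fields[0])
    return dict(totals)


# A few rows copied from the report above.
REPORT = """\
samples  %        image name               symbol name
47122    11.2430  _ufuncFloat64.so         add_ddxd_vvxv
26731     6.3778  python2.4                PyEval_EvalFrame
24122     5.7553  libc-2.3.2.so            memset
21228     5.0648  python2.4                lookdict_string
"""

if __name__ == "__main__":
    totals = samples_by_image(REPORT)
    # Largest consumers first.
    for image, count in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(image, count)
```

Since the report was cut off at 2% with -t 2, sums computed this way only cover the listed symbols and understate each image's true share.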

(The idea wasn't really to discuss the results, but anyway: The
prominence of memset is a little odd -- are destination arrays zeroed
before being assigned the sum result?)

To get the libc symbols you need a libc with debug symbols -- on
Ubuntu this is the libc-dbg package; I don't know what it'll be on
Fedora or other systems. Set the LD_LIBRARY_PATH variable to force
these debug libraries to be loaded:

  export LD_LIBRARY_PATH=/usr/lib/debug

Resolving the libc symbols is probably not all that useful on its own
-- I suppose it might be interesting if one generates callgraphs. I
don't (yet) have a modified kernel, so I haven't tried this.
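If exporting LD_LIBRARY_PATH for the whole shell is undesirable, the variable can be set for the profiled process only. Here is a small sketch of that idea; debug_env and run_with_debug_libs are hypothetical names, and /usr/lib/debug is simply the directory used above. Note one difference from the plain export: this version prepends to an existing LD_LIBRARY_PATH rather than overwriting it.

```python
import os
import subprocess


def debug_env(debug_dir="/usr/lib/debug", base=None):
    """Return a copy of the environment with debug_dir searched first."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_LIBRARY_PATH")
    if existing:
        # Keep whatever was already set, but search debug_dir first.
        env["LD_LIBRARY_PATH"] = debug_dir + ":" + existing
    else:
        env["LD_LIBRARY_PATH"] = debug_dir
    return env


def run_with_debug_libs(argv, debug_dir="/usr/lib/debug"):
    """Run argv (e.g. ["python2.4", "longadd.py"]) with the debug libc."""
    return subprocess.run(argv, env=debug_env(debug_dir), check=True)
```

The calling shell's environment is left untouched, so other commands keep using the normal libraries.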

Have fun,

