[Numpy-discussion] FYI: a (new ?) practical profiling tool on linux

David Cournapeau cournape@gmail....
Thu Jan 7 21:12:20 CST 2010


Hi,

I don't know how many people are aware of it, but I recently
discovered perf, a profiling tool available from the Linux kernel
sources. It is extremely simple to use, and very useful when looking at
numpy/scipy performance issues in compiled code. For example, I can get
this kind of result for the numpy neighborhood iterator with one simple
command, without special compilation flags:

    44.69%  python  /home/david/local/stow/scipy.git/lib/python2.6/site-packages/scipy/signal/sigtools.so  [.] _imp_correlate_nd_double
    39.47%  python  /home/david/local/stow/numpy-1.4.0/lib/python2.6/site-packages/numpy/core/multiarray.so  [.] get_ptr_constant
     9.98%  python  /home/david/local/stow/numpy-1.4.0/lib/python2.6/site-packages/numpy/core/multiarray.so  [.] get_ptr_simple
     0.65%  python  /usr/bin/python2.6  [.] 0x0000000012b8a0
     0.40%  python  /usr/bin/python2.6  [.] 0x000000000a6662
     0.37%  python  /usr/bin/python2.6  [.] 0x0000000004c10d
     0.32%  python  /usr/bin/python2.6  [.] PyEval_EvalFrameEx
     0.15%  python  [kernel]  [k] __d_lookup
     0.14%  python  /lib/libc-2.10.1.so  [.] _int_malloc
     0.12%  python  /usr/bin/python2.6  [.] 0x0000000004f90e
     0.10%  python  [kernel]  [k] __link_path_walk
     0.09%  python  /usr/bin/python2.6  [.] PyObject_Malloc
     0.09%  python  /lib/ld-2.10.1.so  [.] do_lookup_x
     0.09%  python  /lib/libc-2.10.1.so  [.] __GI_memcpy
     0.08%  python  [kernel]  [k] __ticket_spin_lock
     0.07%  python  /usr/bin/python2.6  [.] PyParser_AddToken
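
The exact command is not quoted above, so here is a minimal sketch of how such a report is typically produced with a current perf: record samples while the interpreter runs a script, then print the per-symbol breakdown. The script name bench_correlate.py, the helper perf_commands and the perf.data file name are assumptions for illustration only; the flags shown (-o, -i, --stdio) are from recent perf releases and may differ from the 2010 version.

```python
import sys


def perf_commands(script, data_file="perf.data"):
    """Build `perf record` and `perf report` command lines for a Python script.

    Hypothetical helper: it only assembles the argument lists, it does not
    run perf itself.
    """
    # Sample the Python interpreter while it runs the script; the samples
    # are written to data_file.
    record = ["perf", "record", "-o", data_file, sys.executable, script]
    # Read the samples back and print the per-symbol table as plain text.
    report = ["perf", "report", "-i", data_file, "--stdio"]
    return record, report


# "bench_correlate.py" is a placeholder name, not a script from the post.
record, report = perf_commands("bench_correlate.py")
print(" ".join(record))
print(" ".join(report))
```

Each list can be passed to subprocess.run(..., check=True) on a machine where perf is installed; building the lists separately keeps the sketch usable for inspection on systems without perf.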

And even cooler, annotated sources:

------------------------------------------------
 Percent |      Source code & Disassembly of multiarray.so
------------------------------------------------
         :
         :
         :
         :      Disassembly of section .text:
         :
         :      000000000001d8a0 <get_ptr_constant>:
         :          _coordinates[c] = bd;
         :
         :      /* set the dataptr from its current coordinates */
         :      static char*
         :      get_ptr_constant(PyArrayIterObject* _iter, npy_intp *coordinates)
         :      {
   15.69 :         1d8a0:       48 81 ec 08 01 00 00    sub    $0x108,%rsp
         :          int i;
         :          npy_intp bd, _coordinates[NPY_MAXDIMS];
         :          PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter;
         :          PyArrayIterObject *p = niter->_internal_iter;
         :
         :          for(i = 0; i < niter->nd; ++i) {
    0.02 :         1d8a7:       48 83 bf 48 0a 00 00    cmpq   $0x0,0xa48(%rdi)
    0.00 :         1d8ae:       00
         :      get_ptr_constant(PyArrayIterObject* _iter, npy_intp *coordinates)
         :      {
         :          int i;
         :          npy_intp bd, _coordinates[NPY_MAXDIMS];
         :          PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter;
         :          PyArrayIterObject *p = niter->_internal_iter;
    0.01 :         1d8af:       48 8b 87 50 0b 00 00    mov    0xb50(%rdi),%rax
         :
         :          for(i = 0; i < niter->nd; ++i) {
    7.92 :         1d8b6:       7e 64                   jle    1d91c <get_ptr_constant+0x7c>
         :              _INF_SET_PTR(i)
    0.01 :         1d8b8:       48 8b 0e                mov    (%rsi),%rcx
    0.00 :         1d8bb:       48 03 48 28             add    0x28(%rax),%rcx
    0.03 :         1d8bf:       48 3b 88 40 07 00 00    cmp    0x740(%rax),%rcx
    7.97 :         1d8c6:       7c 68                   jl     1d930 <get_ptr_constant+0x90>
    0.02 :         1d8c8:       45 31 c9                xor    %r9d,%r9d
    0.00 :         1d8cb:       31 d2                   xor    %edx,%edx
    0.00 :         1d8cd:       48 3b 88 48 07 00 00    cmp    0x748(%rax),%rcx
    7.75 :         1d8d4:       7e 32                   jle    1d908 <get_ptr_constant+0x68>
    0.00 :         1d8d6:       eb 58                   jmp    1d930 <get_ptr_constant+0x90>
    0.00 :         1d8d8:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
    0.00 :         1d8df:       00
    7.68 :         1d8e0:       4c 8d 42 74             lea    0x74(%rdx),%r8
    0.00 :         1d8e4:       48 8b 0c d6             mov    (%rsi,%rdx,8),%rcx
    0.00 :         1d8e8:       48 03 4c d0 28          add    0x28(%rax,%rdx,8),%rcx
    0.00 :         1d8ed:       49 c1 e0 04             shl    $0x4,%r8
    7.89 :         1d8f1:       49 3b 0c 00             cmp    (%r8,%rax,1),%rcx
    0.00 :         1d8f5:       7c 39                   jl     1d930 <get_ptr_constant+0x90>
    0.01 :         1d8f7:       49 89 d0                mov    %rdx,%r8
    0.11 :         1d8fa:       49 c1 e0 04             shl    $0x4,%r8
    7.18 :         1d8fe:       4a 3b 8c 00 48 07 00    cmp    0x748(%rax,%r8,1),%rcx
    0.00 :         1d905:       00
    0.09 :         1d906:       7f 28                   jg     1d930 <get_ptr_constant+0x90>
         :          int i;
         :          npy_intp bd, _coordinates[NPY_MAXDIMS];
         :          PyArrayNeighborhoodIterObject *niter = (PyArrayNeighborhoodIterObject*)_iter;
         :          PyArrayIterObject *p = niter->_internal_iter;
         :
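
An annotated listing like the one above normally comes from perf annotate, run against previously recorded samples. The following is a minimal sketch, assuming the samples are in a file called perf.data and using a hypothetical helper perf_annotate_command; the -i and --stdio flags are from recent perf releases. Source lines are interleaved with the disassembly only when the shared object carries debug information; otherwise perf shows the disassembly alone.

```python
def perf_annotate_command(symbol, data_file="perf.data"):
    """Build a `perf annotate` command line for one symbol.

    Hypothetical helper: it only assembles the argument list so it can be
    inspected or handed to subprocess.run on a machine that has perf.
    """
    # -i selects the recorded samples, --stdio prints plain text, and the
    # trailing positional argument restricts the output to one symbol.
    return ["perf", "annotate", "-i", data_file, "--stdio", symbol]


# get_ptr_constant is the hot multiarray.so function from the report above.
print(" ".join(perf_annotate_command("get_ptr_constant")))
```

Restricting the annotation to a single symbol keeps the output short; without a symbol argument perf annotates every function that received samples.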

It works for both C and Fortran, BTW.

cheers,

David

