[Numpy-discussion] numpy : your experiences?

bernhard.voigt@gmai... bernhard.voigt@gmai...
Wed Nov 21 04:27:35 CST 2007


> a) Can you guys tell me briefly about the kind of problems you are
> tackling with numpy and scipy?

I'm using Python with numpy, scipy, pytables and matplotlib for data
analysis in the field of high energy particle physics. Most of the
work is histogramming millions of events, fitting functions to the
distributions, or applying cuts to optimize signal/background
ratios. I often use the random number and optimization facilities for
these purposes.
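
As a rough illustration of this kind of workflow, here is a toy sketch (not
the actual analysis): it applies a cut, histograms the surviving events and
fits a peak. The data, the cut window and the Gaussian-plus-constant model
are all assumptions made up for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(42)

# Toy data (assumed): a Gaussian "signal" peak on a flat "background".
signal = rng.normal(loc=5.0, scale=0.5, size=20_000)
background = rng.uniform(0.0, 10.0, size=80_000)
energy = np.concatenate([signal, background])

# Apply a cut with a boolean mask: keep events in a window around the peak.
in_window = (energy > 3.0) & (energy < 7.0)
selected = energy[in_window]

# Histogram the selected events and compute the bin centers.
counts, edges = np.histogram(selected, bins=40, range=(3.0, 7.0))
centers = 0.5 * (edges[:-1] + edges[1:])

def model(x, amp, mean, width, offset):
    """Gaussian peak plus a constant background level (illustrative model)."""
    return amp * np.exp(-0.5 * ((x - mean) / width) ** 2) + offset

# Unweighted least-squares fit, started from rough guesses.
p0 = (counts.max(), 5.0, 0.5, counts.min())
popt, pcov = curve_fit(model, centers, counts, p0=p0)

print("fitted mean  =", popt[1])
print("fitted width =", abs(popt[2]))
```

The fitted mean and width should come out close to the values used to
generate the toy signal (5.0 and 0.5).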

Most of my colleagues use ROOT (root.cern.ch), which also has Python
bindings. However, I love the simplicity of numpy's ufuncs and indexing
capabilities, which make the code much denser and more readable.
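
A small example of what is meant by dense code with ufuncs and indexing. The
track momenta and the threshold are hypothetical values chosen only for
illustration.

```python
import numpy as np

# Hypothetical momentum components for four tracks (arbitrary units).
px = np.array([3.0, 0.3, -6.0, 0.0])
py = np.array([4.0, 0.4, 8.0, 1.0])

# A ufunc operates element-wise on whole arrays, so no explicit loop is needed.
pt = np.hypot(px, py)

# Boolean indexing selects the tracks above a threshold in one expression.
high_pt = pt[pt > 2.0]

print(high_pt)
```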

Another part is developing toy simulation and reconstruction
algorithms in Python, which will later be translated to C++ since the
main software framework in our collaboration is written in C++. This
includes log-likelihood track reconstruction algorithms based on the
arrival times of photons measured with photomultipliers. The simulation
involves particle generators and detector response simulations to test
the reconstruction with known inputs.
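
To give an idea of the principle, here is a heavily simplified one-dimensional
toy, not the real reconstruction. It assumes a point source emitting at a
known time zero, sensors along a line, Gaussian timing noise and a simple grid
scan of the negative log-likelihood; the geometry, speed and resolution are
invented numbers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy setup: sensors on a line, photons travelling at speed v.
sensor_x = np.linspace(0.0, 100.0, 21)   # sensor positions in m
v = 0.22                                  # propagation speed in m/ns (assumed)
sigma_t = 2.0                             # timing resolution in ns (assumed)
true_x0 = 37.0                            # known input: true source position

# Simulation: expected arrival time plus Gaussian timing jitter.
measured_t = np.abs(sensor_x - true_x0) / v + rng.normal(0.0, sigma_t, sensor_x.size)

# Reconstruction: evaluate the negative log-likelihood on a grid of
# hypotheses. Broadcasting gives one row per hypothesis, one column per sensor.
hypotheses = np.linspace(0.0, 100.0, 2001)
expected_t = np.abs(sensor_x[np.newaxis, :] - hypotheses[:, np.newaxis]) / v
nll = 0.5 * np.sum(((measured_t - expected_t) / sigma_t) ** 2, axis=1)

reco_x0 = hypotheses[np.argmin(nll)]
print("true position:", true_x0, "reconstructed:", reco_x0)
```

Because the input position is known, the difference between the reconstructed
and the true value directly measures how well the reconstruction performs.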

> b) Have you ever felt that numpy/scipy was slow and had to switch to
> C/C++/Fortran?

For the simulation in particular, yes, depending on the level of detail
of course. But only parts of it, e.g. random number generation for
certain distributions, had to be coded in C/C++.
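
For distributions with an invertible cumulative distribution function, one
option that stays within numpy is vectorized inverse-transform sampling. The
sketch below uses a truncated power law purely as an example; the spectral
index and the limits are assumptions, and this is not one of the cases that
needed C/C++.

```python
import numpy as np

def sample_power_law(rng, gamma, e_min, e_max, size):
    """Draw samples with density proportional to E**(-gamma) on [e_min, e_max].

    Uses inverse-transform sampling; valid for gamma != 1.
    """
    a = 1.0 - gamma
    u = rng.uniform(0.0, 1.0, size=size)
    # Invert the cumulative distribution function for all samples at once.
    return (e_min**a + u * (e_max**a - e_min**a)) ** (1.0 / a)

rng = np.random.default_rng(7)
energies = sample_power_law(rng, gamma=2.0, e_min=1.0, e_max=100.0, size=100_000)

print("min:", energies.min(), "max:", energies.max())
print("median:", np.median(energies))
```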

Since the main software for my work is coded in C++, I often end up
writing wrappers around parts of this software to extract the data I
need for doing the analysis work in Python.

> c) Do you use any form of parallel processing? Multicores? SMPs?
> Clusters? If yes how did u utilize them?

We have a cluster at our lab which I use for my computations. This is
not very difficult, since the data can be split into several files and
each file can be treated in the same way. One just needs to submit the
same script to the cluster over and over again, once per file. This is
done with a shell script or with the tools provided by the cluster
scheduling system.
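
The per-file pattern can be sketched as follows. The file format (.npy), the
file names and the binning are hypothetical; the temporary files stand in for
the split data set, and the loop stands in for the separate cluster jobs.

```python
import os
import tempfile
import numpy as np

BINS = 20
RANGE = (0.0, 10.0)  # fixed binning, so per-file histograms can be added

def process_file(path):
    """Histogram the events stored in one file (hypothetical .npy format)."""
    events = np.load(path)
    counts, _ = np.histogram(events, bins=BINS, range=RANGE)
    return counts

# Stand-in for the split data set: three small files with toy events.
rng = np.random.default_rng(0)
tmpdir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmpdir, f"events_{i}.npy")
    np.save(path, rng.uniform(0.0, 10.0, size=1000))
    paths.append(path)

# On a cluster each call would be its own job; here they run one after another.
partial = [process_file(p) for p in paths]

# Merging is just a sum, because every file used the same binning.
total = np.sum(partial, axis=0)
print(total)
```

Since the files are independent, no communication between jobs is needed;
only the final merge has to see all partial results.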

Cheers! Bernhard

