[SciPy-dev] Parallel cKDTree for SciPy
Sat Jan 24 12:45:18 CST 2009
This is a follow up to my post on scipy-user this morning. As shown, using
OpenMP with cKDTree works very well. But it remained that I hand-edited
the output from Cython. Therefore, I rewrote Anne Archibald's cKDTree
sliglity, putting all the OpenMP calls in a separate C file. Now it works
without having to modify the Cython output.
I have also made sure that:
- The GIL is released around the time-consuming loop in cKDTree.query.
- If a single query is made, OpenMP threads is not used in cKDTree.query.
- The number of OpenMP threads can be specified when calling cKDTree.query.
For compilation I used GCC 4.4 with -O3 and -fopenmp, and linked -lgomp
and -lpthread. I have attached the C and Cython code, and a speed test. In
the benchmark, the black line is SciPy's 0.7.0rc2 cKDTree, green line is
mine using one OpenMP thread, and the red is letting OpenMP pick the
number of threads. I have a dual core laptop so OpenMP used two threads.
The code is donated to SciPy if you want it. I think it is substantially
better than the present version in SciPy SVN, as speed is the sole purpose
of using kd-trees; otherwise, we could just as well do O(N**2) brute force
Code and benchmark is available here:
Windows binary compiled for Python 2.5 can be uploaded on request.
More information about the Scipy-dev