[SciPy-User] griddata() performance

Pauli Virtanen pav@iki...
Tue Jan 15 04:43:32 CST 2013


Andreas Hilboll <lists <at> hilboll.de> writes:
> I'm wondering what performance I can expect from griddata(). I need to
> interpolate from 4-D unstructured data, with dimensions of at least
> 5x17x38x52, and I need the interpolated values at ~15k points. So far,
> the routine has been running for > 30 minutes, and I'm wondering if this
> is to be expected. My machine has an AMD Opteron(tm) Processor 8439 SE
> 2.8 GHz CPU. I guess there's no easy way to parallelize this?

The step taking the time is the Delaunay triangulation of your data point
set. In 4-D the number of simplices is probably huge, and computing them
takes time. This is done by the Qhull library, and it is probably
nontrivial (or impossible) to parallelize. The algorithm is probably not
far from the state of the art, so I don't think this step can be sped up
significantly in 4-D.
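
(If you end up interpolating several value arrays on the same point set,
the triangulation can at least be computed once and reused, since
LinearNDInterpolator accepts a precomputed Delaunay object. A rough
sketch with made-up data standing in for yours:

import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

# Hypothetical unstructured 4-D sample (stand-in for the real data).
points = np.random.rand(2000, 4)
values = np.sin(points).sum(axis=1)

# The expensive step: Qhull builds the 4-D Delaunay triangulation.
tri = Delaunay(points)

# With the triangulation in hand, building an interpolant is cheap,
# and can be repeated for other value arrays on the same points.
interp = LinearNDInterpolator(tri, values)
xi = np.random.rand(15000, 4)
yi = interp(xi)

This doesn't make the triangulation itself any faster, it only avoids
paying for it more than once.)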

However, doesn't the fact that you say "5x17x38x52" mean that your data
has structure that you can exploit? (A grid with non-uniform spacing in
each dimension?) If so, griddata() is not the best approach. One answer
for rectangular grids is tensor-product splines. SciPy doesn't have an
implementation of these yet, but it's possible to roll your own; see the
sketch below.
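
One minimal way to roll your own is to apply 1-D splines along one axis
at a time, which is exactly the tensor-product construction. A rough
sketch, assuming the grid is given as four 1-D coordinate arrays and the
data as a 5x17x38x52 array (the function and variable names here are made
up for illustration):

import numpy as np
from scipy.interpolate import interp1d

def tensor_spline_eval(axes, data, point, kind="cubic"):
    # Evaluate a tensor-product spline on a rectangular grid at one
    # point by interpolating along one axis at a time.  `axes` is a
    # list of 1-D coordinate arrays, `data` the gridded values,
    # `point` a sequence of coordinates, one per dimension.
    out = data
    for coord, x in zip(axes, point):
        # interp1d broadcasts over the remaining axes; evaluate at [x]
        # and drop the resulting leading axis of length 1.
        out = interp1d(coord, out, axis=0, kind=kind)([x])[0]
    return out

# Hypothetical grid matching the 5x17x38x52 case:
axes = [np.linspace(0.0, 1.0, n) for n in (5, 17, 38, 52)]
data = np.random.rand(5, 17, 38, 52)
print(tensor_spline_eval(axes, data, [0.3, 0.5, 0.1, 0.9]))

Rebuilding the splines for every query point is wasteful, so a real
implementation would factor that out, but the idea is the same, and for a
grid of this size it should be orders of magnitude faster than
triangulating the full point set.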

Alternatively, if your data is really unstructured, something like RBF or
inverse distance weighting may work (more or less well, depending on what
you want).
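
(If you go that route, scipy.interpolate.Rbf is one option. Be aware that
it solves a dense system in the number of data points, so it only scales
to a few thousand points, or a subset of yours. A rough sketch with
made-up data:

import numpy as np
from scipy.interpolate import Rbf

# Hypothetical unstructured 4-D sample; Rbf builds a dense N x N
# system, so keep the number of data points modest.
pts = np.random.rand(3000, 4)
vals = np.sin(pts).sum(axis=1)

rbf = Rbf(pts[:, 0], pts[:, 1], pts[:, 2], pts[:, 3], vals,
          function="multiquadric")

xi = np.random.rand(100, 4)
yi = rbf(xi[:, 0], xi[:, 1], xi[:, 2], xi[:, 3]))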

-- 
Pauli Virtanen



