# [SciPy-User] weighted griddata

Pauli Virtanen pav@iki...
Thu Sep 16 15:27:46 CDT 2010

```
Thu, 16 Sep 2010 10:20:14 -0700, Adam Ryan wrote:
[clip]
> I was wondering if there is a way to use scipy.interpolate.griddata with
> weights.

You are probably looking for routines for data smoothing, and not for
interpolation. griddata only computes interpolants, i.e. functions f(x)
that go through all the data points:

f(x_i) = y_i

If you want some data points to have more weight than others, then
you probably don't want this condition to hold (otherwise you get
sharp peaks around points with low weights).

Data smoothing is a different problem than interpolation, and the
algorithms in griddata cannot do it, and they are not easily modified to
do it either.

> Specifically, I have a list of lines.  Each line represents the position
> of a wave on a beach at a timestamp, and is comprised of a list of
> mostly connected points.  I can use griddata and all the points of all
> the lines to create an interpolation of the position of the wave vs
> time.  The problem is that the lines are the result of image processing
> and the points vary in confidence level.

So you have data

(x[i], y[i], t[i])

and you'd like to fit a 2-D surface to it? I guess you used griddata to
find the graph of the function y = f(x, t)?

> Currently I'm just tossing out
> points that don't pass muster, but I'd like a more robust solution,
> something like a weighted griddata, or some other method.  Any advice
> would be great.

Since your data is 2-D, you can in principle use the spline routines in
scipy.interpolate for data smoothing. For example,

    from numpy import mgrid
    from scipy import interpolate

    # weighted smoothing spline for y = f(x, t)
    ip = interpolate.SmoothBivariateSpline(x=x, y=t, z=y, w=weights)
    xi, ti = mgrid[0:1:70j, 0:1:80j]
    yi = ip.ev(xi.ravel(), ti.ravel()).reshape(xi.shape)

instead of the plain interpolation

    yi = griddata((x, t), y, (xi, ti))

You may need to fiddle with the smoothing parameter `s=...` for
SmoothBivariateSpline.

Ok, a word of warning: personally, I've found that these 2-D spline
routines often produce garbage, especially when `s` is small, and you can
waste a lot of time fiddling with them. These routines come from the
ancient FITPACK library (http://www.netlib.org/dierckx/), and I suppose
it's not fully refined in this respect...

Or, you can maybe fit a 1-D spline (UnivariateSpline) at each `t`
separately, and use the smoothed results in griddata. The 1D spline
routines are robust.

Another option that you can try is to cook up some inverse distance
weighting scheme -- you can use scipy.spatial.cKDTree/KDTree to do the
fast N nearest neighbour lookups. Scipy doesn't have an implementation
of these algorithms at the moment, so you'd have to do it from scratch.

Maybe other people have more ideas?

--
Pauli Virtanen

```
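The suggestion above to fit a 1-D spline at each `t` and then feed the smoothed values to griddata might be sketched roughly as follows. This is only an illustration under assumptions not in the original message: the helper name `smooth_lines` is hypothetical, the data are synthetic, every line is assumed to have at least four points with distinct `x` values, and the weights are taken to be `1/sigma` for a made-up per-point uncertainty `sigma`.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline, griddata


def smooth_lines(x, y, t, weights, s=None):
    """Fit a weighted 1-D smoothing spline y(x) separately for each timestamp.

    Hypothetical helper: returns the smoothed y at the original sample
    positions. `s` is passed straight to UnivariateSpline.
    """
    x, y, t, weights = (np.asarray(a, dtype=float) for a in (x, y, t, weights))
    y_smooth = np.empty_like(y)
    for tk in np.unique(t):
        idx = np.flatnonzero(t == tk)
        order = idx[np.argsort(x[idx])]  # UnivariateSpline needs increasing x
        spl = UnivariateSpline(x[order], y[order], w=weights[order], s=s)
        y_smooth[order] = spl(x[order])
    return y_smooth


# Synthetic stand-in data (assumption): 6 wave lines with 50 points each.
rng = np.random.default_rng(0)
t = np.repeat(np.linspace(0.0, 1.0, 6), 50)
x = rng.uniform(0.0, 1.0, t.size)
sigma = rng.uniform(0.02, 0.2, t.size)  # made-up per-point uncertainty
y = np.sin(2 * np.pi * x) * (1 + t) + sigma * rng.standard_normal(t.size)

# Smooth each line with confidence weights, then interpolate as before.
y_smooth = smooth_lines(x, y, t, weights=1.0 / sigma)
xi, ti = np.mgrid[0:1:70j, 0:1:80j]
yi = griddata((x, t), y_smooth, (xi, ti))  # NaN outside the convex hull
```

With weights of the form `1/sigma`, leaving `s` at its default is usually a reasonable starting point; real confidence scores from image processing would probably need rescaling or an explicit `s`.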
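The inverse distance weighting idea mentioned above could be put together along these lines. This is a from-scratch sketch, not a SciPy routine: the function `idw_weighted` and its parameters `k`, `power` and `eps` are hypothetical, the way confidence is combined with distance (a simple product) is one choice among many, and the data are synthetic. Only `cKDTree` and its `query` method come from SciPy.

```python
import numpy as np
from scipy.spatial import cKDTree


def idw_weighted(points, values, weights, query, k=8, power=2.0, eps=1e-12):
    """Confidence-weighted inverse-distance average of the k nearest samples.

    Hypothetical helper. Requires 2 <= k <= len(points). A larger `eps`
    acts like a smoothing length and flattens peaks at the sample points.
    """
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    tree = cKDTree(points)
    dist, idx = tree.query(np.asarray(query, dtype=float), k=k)  # both (m, k)
    # Each neighbour counts as confidence / distance**power.
    w = weights[idx] / (dist + eps) ** power
    return (w * values[idx]).sum(axis=1) / w.sum(axis=1)


# Synthetic stand-in data (assumption): 500 scattered (x, t) samples.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(500, 2))
vals = np.sin(2 * np.pi * pts[:, 0]) * (1 + pts[:, 1])
conf = rng.uniform(0.1, 1.0, 500)  # made-up confidence levels

xi, ti = np.mgrid[0:1:70j, 0:1:80j]
grid = np.column_stack([xi.ravel(), ti.ravel()])
yi = idw_weighted(pts, vals, conf, grid).reshape(xi.shape)
```

Because the result is a weighted average with positive weights, it always stays between the smallest and largest sample value, and low-confidence points contribute less wherever better neighbours are nearby.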