# [SciPy-User] Bivariate Spline Surface Fitting

Thu Nov 26 18:30:27 CST 2009

```Hi Peter,

would that be localization microscopy data by any chance? Which method are you using?

We're using a very similar approach to correct our chromatic shift, I suspect that the problem you're having with the standard spline is that if you have a few points where the shift estimation is way off (when you've got 1000's of points you're almost guaranteed to have a few in the tails of a distribution) and they're pulling the interpolation out of whack in their neighbourhood. To get around this, we've ended up using a smoothing spline rather than a simple bivariate spline (see code below). We also preprocess the data to throw out any shift measurements which are pointing in dramatically different directions to their neighbours.

def genShiftVectorFieldSpline(x,y, sx, sy, err_sx, err_sy):
'''interpolates shift vectors using smoothing splines'''
wonky = findWonkyVectors(x, y, sx, sy, tol=2*err_sx.mean())
good = wonky == 0

print '%d wonky vectors found and discarded' % wonky.sum()

spx = SmoothBivariateSpline(x[good], y[good], sx[good], 1./err_sx[good])
spy = SmoothBivariateSpline(x[good], y[good], sy[good], 1./err_sy[good])

X, Y = np.meshgrid(np.arange(0, 512*70, 100), np.arange(0, 256*70, 100))

dx = spx.ev(X.ravel(),Y.ravel()).reshape(X.shape)
dy = spy.ev(X.ravel(),Y.ravel()).reshape(X.shape)

return (dx.T, dy.T, spx, spy)

I've never found that I need more than a few thousand points to calculate a shift field which will get the error down to the 10nm regime, and the most I've tried fitting is probably ~10-20K points, which would only have taken a couple of minutes. Evaluating the splines is fast though, so you should have no worries evaluating with millions of points.

Cheers,
David

--- On Fri, 27/11/09, Peter Combs <peter.combs@berkeley.edu> wrote:

> From: Peter Combs <peter.combs@berkeley.edu>
> Subject: [SciPy-User] Bivariate Spline Surface Fitting
> To: scipy-user@scipy.org
> Received: Friday, 27 November, 2009, 10:34 AM
> Hi all,
> I have localization data in 2 color channels that should
> agree with each other, but in practice, they don't to the
> level we want. I thought I'd try doing a straight polynomial
> least squares fit, and while that gives better registration
> between the two, I'm still not to the level I want. My next
> thought was a spline fit, so I'm trying to make two
> least-squares bivariate spline fits: one for taking (x,y) to
> x', and one for taking (x,y) to y'.
>
>
> import scipy.interpolate as interp
> ...
> def makeLSQspline(xl, yl, xr, yr):
>    """docstring for makespline"""
>
>    xmin = xr.min()-1
>    xmax = xr.max()+1
>    ymin = yr.min()-1
>    ymax = yr.max()+1
>    n = len(xl)
>
>    print "xrange: ", xmin, xmax, '\t',
> "yrange: ", ymin, ymax
>
>    yknots, xknots = mgrid[ymin:ymax:10j,
> xmin:xmax:10j]   # Makes an 11x11 regular
> grid of knot locations
>
>    xspline = interp.LSQBivariateSpline(xr,
> yr, xl, xknots.flat, yknots.flat)
>    yspline = interp.LSQBivariateSpline(xr,
> yr, yl, xknots.flat, yknots.flat)
>
>    def mapping(xr, yr):
>       xl = xspline.ev(xr, yr)
>       yl = yspline.ev(xr, yr)
>       return xl, yl
>    return mapping
>
>
> I have a "Registration Error" function which calculates a
> mapping for all but the ith point, then plugs that point
> into the mapping and finds the difference between the
> predicted value and the known value. For the 2nd order
> polynomial fit, I get a mean registration error around 7nm,
> but for the spline fitting using the function above, the
> mean error is more like 20,000nm. Which (along with all the
> random junk that gets spit out, such as
> /Library/Frameworks/Python.framework/Versions/5.1.0/lib/python2.5/site-packages/scipy/interpolate/fitpack2.py:498:
> UserWarning:
> Error on entry, no approximation returned. The following
> conditions
> must hold:
> xb<=x[i]<=xe, yb<=y[i]<=ye, w[i]>0,
> i=0..m-1
> If iopt==-1, then
> xb<tx[kx+1]<tx[kx+2]<...<tx[nx-kx-2]<xe
> yb<ty[ky+1]<ty[ky+2]<...<ty[ny-ky-2]<ye
> warnings.warn(message)
>
> about 1 copy of this (or something similar) per call of
> makeLSQSpline:
> tx= 0.00000000000000 0.00000000000000 0.00000000000000
> -264.022095089756 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358
>   12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 28778.4083647834 0.00000000000000
> 0.00000000000000 0.00000000000000
> tx= 0.00000000000000 0.00000000000000 0.00000000000000
> -264.022095089756 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358
>   12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 337.266858978529 3468.05790461355
> 6598.84895024857 9729.63999588358 12860.4310415186
> 15991.2220871536 19122.0131327886 22252.8041784237
> 25383.5952240587 28514.3862696937 337.266858978529
> 3468.05790461355 6598.84895024857 9729.63999588358
> 12860.4310415186 15991.2220871536 19122.0131327886
> 22252.8041784237 25383.5952240587 28514.3862696937
> 337.266858978529 3468.05790461355 6598.84895024857
> 9729.63999588358 12860.4310415186 15991.2220871536
> 19122.0131327886 22252.8041784237 25383.5952240587
> 28514.3862696937 28778.4083647834 0.00000000000000
> 0.00000000000000 0.00000000000000
>
> makes me think something isn't quite right. Any guesses
> what's going on? I have ~3400 data points, roughly evenly
> spread out over a 28,000nm x 23,000nm grid.
>
> On another note, how well is this going to scale up? If I
> end up collecting hundreds of thousands to low-millions of
> points, does spline fitting go as O(n^2), or more like O(n)?
> The error registration function runs as O(n*O(fitting)), and
> takes around 5 seconds now, so O(N) spline fitting is fine,
> about an hour run time total, but O(n^2) is very much not.
> Peter Combs
> peter.combs@berkeley.edu
>
>
>
>
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

```