[SciPy-user] interpolate

Anne Archibald aarchiba@physics.mcgill...
Sat Nov 15 16:22:14 CST 2008


2008/11/14 David Warde-Farley <dwf@cs.toronto.edu>:
>
> On 14-Nov-08, at 6:26 PM, Anne Archibald wrote:
>
>> The knots are specified in a form that allows them all to be treated
>> identically. This sometimes means repeating knots or having zero
>> coefficients.
>>
>> If you have more data points than you want knots, then you are going
>> to be producing a spline which does not pass through all the data. The
>> smoothing splines include an automatic number-of-knots selector, which
>> you may prefer to specifying the number of knots yourself. It chooses
>> (approximately) the minimum number of knots needed to let the curve
>> pass within one sigma of the data points, so by adjusting the
>> smoothing parameter and the weights you can tune the number of knots.
>> Evaluation time is not particularly sensitive to the number of knots
>> (though of course memory usage is).
>
> I see. What I'm interested in doing is modeling the variation in the
> curves, presumably via a description of the joint distribution of the
> spline coefficients.  This gets difficult if the number of knots is
> variable, which is why I've gone this route.  It's not important that
> the curves fit the data exactly, but part of the reason for fitting
> splines is to reduce each of many, many curves to a fixed-length
> description. Does this make sense?

This makes sense but may pose some additional difficulties. In
particular, the way the fitpack routines select their knots, even when
the number is specified, is by successive subdivision. So you're going
to get "jumps" in your description where a knot hops from one place to
another as you vary the data you're fitting to.
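
To see how the knot vector depends on the fit settings, here is a
minimal sketch using scipy.interpolate.splrep. The test data, noise
level sigma, and the three values of s are made up for illustration.
With weights 1/sigma, s = 0 forces interpolation (as many knots as the
data allow), while a very large s collapses the fit to a single
polynomial with no interior knots; values in between give some
intermediate number.

```python
import numpy as np
from scipy.interpolate import splrep

# Hypothetical noisy data; sigma is an assumed, known noise level.
rng = np.random.default_rng(1)
m, sigma = 100, 0.1
x = np.linspace(0.0, 1.0, m)
y = np.sin(2 * np.pi * x) + sigma * rng.standard_normal(m)
w = np.full(m, 1.0 / sigma)  # weights = 1 / sigma

# Fit with three smoothing values and record the length of the knot vector.
knot_counts = {}
for s in (0.0, float(m), 1e6):
    t, c, k = splrep(x, y, w=w, k=3, s=s)
    knot_counts[s] = len(t)
    print(f"s = {s:g}: {len(t)} knots (including repeated end knots)")
```

The length of t (and hence of c) changes with s and with the data, so
the tck tuples from different curves are not directly comparable as
fixed-length vectors.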

You might want to avoid the fitpack fitting routines entirely, at
least in the stage where you are varying the curve: fix not just the
number of knots but the knot positions, and vary only the
coefficients. If you correctly identify a basis for the space of
splines on your given set of knots, fitting each curve becomes a
linear least-squares fit, which you can easily do in scipy. The
fitting won't be quite as efficient as fitpack, though if you use the
B-spline basis, each basis function is nonzero over only a few knot
intervals, so the problem is sparse (banded). But this
ought to free you from ugly discontinuities in your parameterization.
You could of course do this with splines implemented from scratch, but
if you can understand the fitpack tck representation well enough, you
should be able to both use fitpack to evaluate your splines
(efficiently, in C code) and use fitpack to come up with an initial
set of knots.
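
A minimal sketch of the fixed-knot approach, under some assumptions:
cubic splines on [0, 1], six evenly spaced interior knots, and made-up
test data. The helper name design_matrix is hypothetical. It builds
the B-spline basis by calling splev with unit coefficient vectors, so
the only thing fitted per curve is the coefficient vector.

```python
import numpy as np
from scipy.interpolate import splev

k = 3  # cubic splines
interior = np.linspace(0.0, 1.0, 8)[1:-1]  # assumed fixed interior knots
# fitpack-style knot vector: each end knot appears k + 1 times.
t = np.concatenate(([0.0] * (k + 1), interior, [1.0] * (k + 1)))
n_coef = len(t) - k - 1  # number of basis functions / free coefficients


def design_matrix(x, t, k):
    """Column j holds the j-th B-spline basis function evaluated at x."""
    n = len(t) - k - 1
    A = np.empty((len(x), n))
    for j in range(n):
        c = np.zeros(len(t))  # same padded length that splrep returns
        c[j] = 1.0
        A[:, j] = splev(x, (t, c, k))
    return A


# Made-up data standing in for one of the curves to be described.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(x.size)

# Linear least-squares fit for the coefficients on the fixed knots.
A = design_matrix(x, t, k)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Evaluate through fitpack by padding the coefficients back into tck form.
c_full = np.concatenate((coef, np.zeros(k + 1)))
y_fit = splev(x, (t, c_full, k))
```

If all your curves are sampled at the same x values, A can be built
once and reused, and each curve is then described by the same n_coef
numbers. The dense lstsq call ignores the banded structure of A; a
sparse solver could exploit it.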

Anne

