[SciPy-user] Linear Interpolation Question
Anne Archibald
peridot.faceted@gmail....
Wed Apr 30 02:16:20 CDT 2008
2008/4/28 Andrea Gavana <andrea.gavana@gmail.com>:
> I have 2 matrices coming from 2 different simulations: the first
> column of the matrices is a date (time) at which all the other results
> in the matrix have been reported (simulation step). In these 2
> matrices, very often the simulation steps do not coincide, so I just
> want to interpolate the results in the second matrix using the dates
> in the first matrix. The problem is, I have close to 13,000 columns in
> every matrices, and repeating interp1d all over the columns is quite
> expensive. An example of what I am doing is as follows:
>
> # Loop over all the columns
> for indx in indices:
>
> # Set up a linear interpolation with:
> # x = dates in the second simulation
> # y = single column in the second matrix simulation
> function = interp1d(secondaryMatrixDates,
> secondaryMatrixResults[:, indx], kind='linear')
>
> # Interpolate the second matrix results using the first simulation dates
> interpolationResults = function(mainMatrixDates)
>
> # I need the difference between the first simulation and the second
> newMatrix[:, indx] = mainMatrixResults[:, indx] - interpolationResults
>
> This is somehow a costly step, as it's taking up a lot of CPU
> (increasing at every iteration) and quite a long time (every column
> has about 350 data). Is there anything I can do to speed up this loop?
> Or may someone suggest a better approach?
You have run into an unfortunate limitation of interp1d; it only
handles scalar-valued data. That python loop, through all those
interp1d objects, is pretty wasteful. Since you have only several
hundred values to interpolate to, and thirteen thousand columns, I
would write a vectorized linear interpolation by hand. That is, for
each date in the main matrix, use searchsorted() to find it in the
secondary matrix dates, then do something like
for j in num_dates:
date = main_matrix[j]
i = searchsorted(date,secondary_date) # check the docstring
t = (date-secondary_date[i])/(secondary_date[i+1]-secondary_date[i])
new_matrix[j,:] = t*secondary_matrix[i,:]+(1-t)*secondary_matrix[i+1,:]
Good luck,
Anne
With 13000 columns, the overhead
More information about the SciPy-user
mailing list