[SciPy-User] synchronizing timestamps from different systems; unpaired linear regression
Tue Apr 10 23:16:33 CDT 2012
Thanks very much for the suggestions!
Re a new hardware implementation: I bet this would totally work and
honestly is probably the fastest way to get it working. I think even a
rough system clock would do the trick. The downsides are 1) many data
have already been collected with the old setup; 2) I'm getting
stubbornly interested in this problem for its own sake since it feels
so solvable. So perhaps I'll change the hardware for future data and
keep working on algorithms for the old data. (I'd never heard of
Lamport timestamps. The wikipedia article is really interesting. If I
understand it correctly, it would still require a hardware change.)
Re Nathaniel's suggestion:
I think this is pretty similar to the algorithm I'm currently using. Pseudocode:
current_guess = estimate_from_correlation(x, y)
for timescale in decreasing_order:
    xm, ym = find_matches(
        x, y, current_guess, within=timescale)
    current_guess = linfit(xm, ym)
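A runnable sketch of that loop, under assumptions the pseudocode leaves open: x and y are sorted 1-D NumPy arrays of event times (x with at least two events), the model is x = slope * y + intercept, matching is nearest-neighbour, and np.polyfit stands in for linfit. The function bodies, the name refine, and the (slope, intercept) form of the guess are illustrative choices, not the code actually in use.

```python
import numpy as np

def find_matches(x, y, slope, intercept, within):
    """Pair each y event with its nearest x event; keep pairs closer than `within`."""
    pred = slope * y + intercept              # y times mapped onto the x clock
    idx = np.clip(np.searchsorted(x, pred), 1, len(x) - 1)
    left, right = x[idx - 1], x[idx]
    # pick whichever neighbour of the insertion point is nearer
    idx = np.where(pred - left < right - pred, idx - 1, idx)
    keep = np.abs(x[idx] - pred) < within
    return x[idx][keep], y[keep]

def refine(x, y, slope, intercept, timescales):
    """Re-fit on matched pairs while shrinking the match tolerance."""
    for within in sorted(timescales, reverse=True):
        xm, ym = find_matches(x, y, slope, intercept, within)
        if len(xm) < 2:                       # too few pairs to fit a line
            break
        slope, intercept = np.polyfit(ym, xm, 1)
    return slope, intercept
```

Something like refine(x, y, 1.0, 5.0, [0.3, 0.1, 0.03]) would start from a rough guess and tighten the tolerance on each pass. Nothing here guards against a bad starting guess, so it inherits any mismatch errors made in the first round.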
The problem is the local minima caused by mismatch errors. If the
clockspeed estimate is off, then late events are incorrectly matched
with a delay of one event. Then the updated guess moves closer to this
incorrect solution. So by killing off the points that disagree, we
reinforce the current orthodoxy!
Actually the truest objective function would be the number of matches
within some specified error.
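ERR = .1
def objective(offset, clockspeed):
    # adjust parametrization to suit
    adj_y_times = y_times * clockspeed + offset
    # x_midpoints lie between consecutive x_times, so searchsorted
    # returns the index of the nearest x_time
    closest_x_times = np.searchsorted(x_midpoints, adj_y_times)
    pred_err = abs(adj_y_times - x_times[closest_x_times])
    closest_good = closest_x_times[pred_err < ERR]
    # number of distinct x events matched within ERR
    return len(np.unique(closest_good))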
That function has some ugly non-smoothness due to the
len(unique(...)). Would something like optimize.brent work for this or
am I on the wrong track?
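For context on that question: scipy.optimize.brent minimizes a function of a single variable, while this objective has two parameters and is piecewise constant. One derivative-free alternative, which nobody in the thread has proposed, is a plain grid scan. The sketch below assumes sorted NumPy arrays; the names match_count and grid_search and the grid arguments are hypothetical.

```python
import numpy as np

def match_count(x_times, y_times, offset, clockspeed, err=0.1):
    """Number of distinct x events that have a mapped y event within `err`."""
    adj_y = y_times * clockspeed + offset
    x_mid = (x_times[:-1] + x_times[1:]) / 2   # boundaries between x events
    nearest = np.searchsorted(x_mid, adj_y)    # index of the nearest x event
    good = nearest[np.abs(adj_y - x_times[nearest]) < err]
    return len(np.unique(good))

def grid_search(x_times, y_times, offsets, clockspeeds, err=0.1):
    """Score every (offset, clockspeed) pair; return (count, offset, clockspeed)."""
    best = (-1, None, None)
    for c in clockspeeds:
        for o in offsets:
            n = match_count(x_times, y_times, o, c, err)
            if n > best[0]:                    # keep the first pair with the top count
                best = (n, o, c)
    return best
```

Because the count is flat over whole regions of parameter space, many grid points can tie for the maximum. The scan returns the first one it meets, so its result is probably best treated as a starting point for a finer fit rather than a final answer.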
Thanks again all!
On Tue, Apr 10, 2012 at 2:22 PM, Nathaniel Smith <email@example.com> wrote:
> On Tue, Apr 10, 2012 at 10:18 PM, Nathaniel Smith <firstname.lastname@example.org> wrote:
>> return np.sum((y_times - x_times[closest_x_times]) ** 2)
> On further thought, squaring is probably exactly the wrong
> transformation here -- squared error focuses on minimizing the large
> errors, and in this case we know that the large errors are caused by
> events that got dropped on the X side, and that these contain no
> information about the proper (offset, clockspeed).
> np.sqrt(np.abs(...)) would probably do better, or something similar
> that flattens out for larger values. Easy to play around with, though.
> Also on further thought, it might make sense to run that both
> directions, and match x values against y values too.
> -- Nathaniel
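A minimal sketch of the two suggestions quoted above, a cost that flattens out for large errors and matching in both directions. It assumes sorted NumPy arrays and a positive clockspeed, so the adjusted y times stay sorted. The names nearest_residuals and robust_cost are hypothetical, and the square root of the absolute residual is just the one transformation mentioned in the quote.

```python
import numpy as np

def nearest_residuals(a, b):
    """Distance from each element of b to its nearest element of sorted a."""
    mid = (a[:-1] + a[1:]) / 2                 # boundaries between a's elements
    return np.abs(b - a[np.searchsorted(mid, b)])

def robust_cost(x_times, y_times, offset, clockspeed):
    """Sum of sqrt(|residual|), matching y against x and x against y."""
    adj_y = y_times * clockspeed + offset
    forward = nearest_residuals(x_times, adj_y)    # each y to its nearest x
    backward = nearest_residuals(adj_y, x_times)   # each x to its nearest y
    return np.sum(np.sqrt(forward)) + np.sum(np.sqrt(backward))
```

With this cost, an event dropped on one side contributes roughly the square root of one inter-event gap instead of the gap squared, so it pulls on the fit much less than it would under squared error.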