# [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression

Chris Rodgers xrodgers@gmail....
Tue Apr 10 04:27:44 CDT 2012

I have what seems like a straightforward problem but it is becoming
more difficult than I thought. I have two different computers
recording timestamps from the same stream of events. I get lists X and
Y from each computer and the question is how to figure out which entry
in X corresponds to which entry in Y.

Complications:
1) There are an unknown number of missing or spurious events in each
list. I do not know which events in X match up to which in Y.
2) The temporal offset between the two lists is unknown, because each
timer begins at a different time.
3) The clocks seem to run at slightly different speeds (a ~0.3%
difference, which adds up to about 10 seconds over my 1 hr recording).

I know this problem is solvable because once you find the temporal
offset and clock-speed ratio, the matching timestamps agree to within
10 ms. That is, under the correct (but unknown) X->Y mapping, there is
a strong linear relationship between the matched timestamps.

Basically, the problem is: given list X and list Y, and specifying a
certain minimum R**2 value, what is the largest set of matched points
from X and Y that satisfy this R**2 value? I have tried googling
"unmatched linear regression" but this must not be the right search
term.
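
For a candidate set of matched pairs, the line fit and its R**2 can be checked with a few lines of NumPy. This is only a sketch on synthetic data: the helper name `fit_quality`, the 1.003 clock ratio and the 5 s offset are made up for illustration and are not taken from the linked repository.

```python
import numpy as np

def fit_quality(x_matched, y_matched):
    """Least-squares line through matched pairs.

    Returns (slope, intercept, R**2, largest absolute residual).
    """
    x_matched = np.asarray(x_matched, dtype=float)
    y_matched = np.asarray(y_matched, dtype=float)
    slope, intercept = np.polyfit(x_matched, y_matched, 1)
    resid = y_matched - (slope * x_matched + intercept)
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_matched - y_matched.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot, np.max(np.abs(resid))

# Synthetic events (assumption): irregular intervals over roughly one hour,
# seen by a second clock that runs 0.3% fast and starts 5 s later.
rng = np.random.default_rng(0)
x = np.cumsum(rng.uniform(1.0, 3.0, size=1800))
y = 1.003 * x + 5.0

good = fit_quality(x, y)            # correct pairing
bad = fit_quality(x[:-1], y[1:])    # every pair off by one event

print("correct pairing: R**2 = %.9f, max |residual| = %.3g s" % (good[2], good[3]))
print("off-by-one:      R**2 = %.9f, max |residual| = %.3g s" % (bad[2], bad[3]))
```

Because both lists increase monotonically, R**2 stays very close to 1 even when every pair is off by one event, so the largest residual (compared against the ~10 ms agreement mentioned above) may be a more sensitive acceptance test than R**2 alone.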

One approach that I've tried is to create an analog trace for X and Y
with a Gaussian centered at each timestamp, then find the lag that
maximizes the cross-correlation between the two. This is good for
finding the temporal offset but can't handle the clock-speed
difference. (It also takes a really long time because each series is
1 hr of data sampled at 10 Hz.) Then I can choose the closest matches
between X and Y and fit them with a line, which gives me the
clock-speed ratio. The problem is that the way I match up the points
in X and Y creates a ton of local minima, so the fit gets stuck on the
wrong answer.
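
The two-stage approach described above might be sketched as follows, assuming sorted NumPy arrays of timestamps in seconds. The function names, the 0.5 s Gaussian width and the 0.5 s matching tolerance are illustrative choices, not the code from the linked repository. Note that `np.correlate` computes the correlation directly (no FFT), so its cost grows quadratically with trace length.

```python
import numpy as np

def estimate_offset(x, y, fs=10.0, sigma=0.5):
    """Estimate d (seconds) such that y is roughly x + d, ignoring clock drift."""
    t0 = min(x.min(), y.min())
    t1 = max(x.max(), y.max())
    edges = np.arange(t0, t1 + 2.0 / fs, 1.0 / fs)
    # Gaussian kernel sampled at fs, truncated at +/- 4 sigma
    k = np.arange(-4.0 * sigma, 4.0 * sigma + 0.5 / fs, 1.0 / fs)
    kernel = np.exp(-0.5 * (k / sigma) ** 2)
    # "Analog" traces: event counts per bin, smoothed by the kernel
    tx = np.convolve(np.histogram(x, edges)[0], kernel, mode="same")
    ty = np.convolve(np.histogram(y, edges)[0], kernel, mode="same")
    xc = np.correlate(ty, tx, mode="full")
    lag = np.argmax(xc) - (len(tx) - 1)   # zero lag sits at index len(tx) - 1
    return lag / fs

def fit_clock(x, y, offset, tol=0.5):
    """Pair each x with its nearest y after shifting by offset, then fit a line."""
    pred = x + offset
    # Index of the first y >= pred, clipped so idx - 1 and idx are both valid
    idx = np.clip(np.searchsorted(y, pred), 1, len(y) - 1)
    nearest = np.where(pred - y[idx - 1] <= y[idx] - pred, idx - 1, idx)
    ok = np.abs(y[nearest] - pred) <= tol   # drop x events with no nearby y
    slope, intercept = np.polyfit(x[ok], y[nearest[ok]], 1)
    return slope, intercept, int(ok.sum())

# Synthetic test data (assumption): about 5 minutes of events, a second clock
# that runs 0.1% fast and starts 7.3 s later, and two events missing per list.
rng = np.random.default_rng(0)
x_all = np.cumsum(rng.uniform(2.0, 4.0, size=100))
y_all = 1.001 * x_all + 7.3
x = np.delete(x_all, [20, 60])
y = np.delete(y_all, [40, 80])

offset = estimate_offset(x, y)
slope, intercept, n_used = fit_clock(x, y, offset)
print(offset, slope, intercept, n_used)
```

On this short synthetic recording the drift is small enough for a single offset to pair the events correctly. With a drift of several seconds, as in the 1 hr recording, nearest-neighbour matching from one global offset can pair the wrong events, which is the failure mode described above.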

Any tips?

Thanks!
Chris

PS: my current code and test data are here:
https://github.com/cxrodgers/DiscreteAnalyze

--
Chris Rodgers