# [SciPy-User] synchronizing timestamps from different systems; unpaired linear regression

josef.pktd@gmai... josef.pktd@gmai...
Tue Apr 10 09:13:52 CDT 2012

```On Tue, Apr 10, 2012 at 5:27 AM, Chris Rodgers <xrodgers@gmail.com> wrote:
> I have what seems like a straightforward problem but it is becoming
> more difficult than I thought. I have two different computers
> recording timestamps from the same stream of events. I get lists X and
> Y from each computer and the question is how to figure out which entry
> in X corresponds to which entry in Y.
>
> Complications:
> 1) There are an unknown number of missing or spurious events in each
> list. I do not know which events in X match up to which in Y.
> 2) The temporal offset between the two lists is unknown, because each
> timer begins at a different time.
> 3) The clocks seem to run at slightly different speeds (~0.3%
> difference adds up to about 10 seconds over my 1hr recording time).
>
> I know this problem is solvable because once you find the temporal
> offset and clock-speed ratio, the matching timestamps agree to within
> 10ms. That is, there is a strong linear relationship between some
> unknown X->Y mapping.
>
> Basically, the problem is: given list X and list Y, and specifying a
> certain minimum R**2 value, what is the largest set of matched points
> from X and Y that satisfy this R**2 value? I have tried googling
> "unmatched linear regression" but this must not be the right search
> term.
>
> One approach that I've tried is to create an analog trace for X and Y
> with a Gaussian centered at each timestamp, then finding the lag that
> optimizes the cross-correlation between the two. This is good for
> finding the temporal offset but can't handle the clock-speed
> difference. (Also it takes a really long time because the series are
> 1hr of data sampled at 10Hz.) Then I can choose the closest matches
> between X and Y and fit them with a line, which gives me the
> clock-difference parameter. The problem is that there are a ton of
> local minima created by how I choose to match up the points in X and
> Y, so it gets stuck on the wrong answer.
>
> Any tips?

I'm pretty sure someone more experienced has a better answer; this
sounds similar to image registration.

What I would try is to do your correlation or regression matching
separately on two subsamples, for example a segment at the beginning
and a segment at the end. Within a short segment the different clock
speeds will have only a small effect, so each segment mainly shows a
constant offset. Then recover the clock-speed difference by comparing
the offsets found for the two subsamples/segments.

Josef

>
> Thanks!
> Chris
>
> PS: my current code and test data are here:
> https://github.com/cxrodgers/DiscreteAnalyze
>
> --
> Chris Rodgers