[SciPy-User] synchronizing timestamps from different systems; unpaired linear regression
Thu Apr 12 10:13:44 CDT 2012
On Thu, Apr 12, 2012 at 10:14 AM, Samuel Garcia wrote:
> What algorithm do you use for the online spike detection (sorting)?
> Do you have a commercial system or a home-made system?
> Did you try some Metropolis-Hastings-like method:
> * sample a subset of points
> * estimate offset and clockspeed at each step
> * mark as bad any point in X that has no pair in Y under that estimated
> offset and clockspeed
> * sample again, using only points not marked bad
> * check whether the log-likelihood improves, and accept or reject the
> new state
Estimate the offset from many samples, and the clockspeed e.g. from the
time differences of nearby samples.
If you get lots of estimates that are wrong but many that are right,
then a robust estimator like statsmodels RLM might be able to estimate
the offset and clockspeed from the individual estimates by downweighting
the outliers (similar to trimmed means).
Maybe iterate with a better initial estimate for the individual
subsamples.
(I have no idea what data like this might look like)
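A minimal sketch of that robust-fit idea, assuming x_est and y_est are
arrays of candidate matched event times, many of them mismatches (the
function name and the Huber norm choice are illustrative):

    import statsmodels.api as sm

    def robust_offset_clockspeed(x_est, y_est):
        # fit x_est ~ offset + clockspeed * y_est, downweighting pairs
        # with large residuals (the presumed mismatches)
        exog = sm.add_constant(y_est)
        res = sm.RLM(x_est, exog, M=sm.robust.norms.HuberT()).fit()
        offset, clockspeed = res.params
        return offset, clockspeed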
> On 12/04/2012 07:02, Chris Rodgers wrote:
>> The platform is specialized because the data come from equipment in my
>> lab. One system is a 3 kHz real-time operating system detecting voltage
>> pulses on one input. The other is a data-acquisition card sampling at
>> 30 kHz and detecting deflections on another input. The temporal
>> uncertainty arises not from the raw sampling rates but from the
>> uncertainties in the pulse detection algorithms.
>>> Yes, that's why I was suggesting doing something similar to your
>>> current algorithm, but different in ways that might avoid this problem.
>> I could be wrong but I think your suggestion retains the matching
>> problem implicitly, because it uses the closest value in X to the
>> transformed point from Y.
>> I wrote a new objective function that uses the sum of squared errors
>> between every value in X and every transformed point from Y (but
>> thresholds any squared error that is above a certain point). I played
>> around with this function, and the one you suggested, and a couple of
>> others, using scipy.optimize.brute. It was a fun exercise but the
>> energy landscape is pathological. There are local minima for every
>> nearby mismatch error, and the minimum corresponding to the true
>> solution is extremely narrow and surrounded by peaks. The problem is
>> that a good guess for one parameter (dilation, say) means that a small
>> error in the other parameter (offset) results in a very bad solution.
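A minimal sketch of that thresholded objective, assuming x and y are 1-D
arrays of event times (the cap value and the search box are illustrative,
not Chris's actual numbers):

    import numpy as np
    from scipy import optimize

    def thresholded_sse(params, x, y, cap=0.01):
        offset, clockspeed = params
        adj_y = y * clockspeed + offset
        # squared error between every x and every transformed y, capped
        # so that hopeless pairings contribute only a constant penalty
        sq = (x[:, None] - adj_y[None, :]) ** 2
        return np.minimum(sq, cap).sum()

    # coarse grid search over a plausible (offset, clockspeed) box
    best = optimize.brute(thresholded_sse,
                          ranges=((-1.0, 1.0), (0.99, 1.01)),
                          args=(x, y), Ns=50, finish=None)

The pathological landscape Chris describes is visible here too: the capped
sum is flat except near alignments, so only a fine grid finds the narrow
true minimum.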
>> It still feels like there should be a way to do this from the
>> statistics of the input. I tried some crazier things like taking many
>> subsamples of X and Y, fitting them, and then looking at the
>> distribution of all the discovered fits. But the problem with this is
>> that I'm very unlikely to choose useful subsamples, of course. I think
>> I'll have to use the brute force approach with very fine sampling, or
>> one of the practical hardware solutions suggested previously.
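The subsampling experiment might look like the following sketch (purely
hypothetical; the subsample size and the rank-order pairing are
assumptions, and the naive pairing is exactly why most draws are
uninformative):

    import numpy as np

    def subsample_fits(x, y, n_iter=1000, k=5, seed=0):
        # fit a line through k random events from each stream, paired
        # by rank order, and collect all the fitted parameters; draws
        # that happen to pick truly corresponding events land near the
        # true (offset, clockspeed), so one looks for a dense cluster
        rng = np.random.default_rng(seed)
        fits = []
        for _ in range(n_iter):
            xs = np.sort(rng.choice(x, size=k, replace=False))
            ys = np.sort(rng.choice(y, size=k, replace=False))
            clockspeed, offset = np.polyfit(ys, xs, 1)
            fits.append((offset, clockspeed))
        return np.asarray(fits)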
>> On Wed, Apr 11, 2012 at 9:53 AM, Charles R Harris
>> <email@example.com> wrote:
>>> On Wed, Apr 11, 2012 at 10:37 AM, Nathaniel Smith <firstname.lastname@example.org> wrote:
>>>> On Wed, Apr 11, 2012 at 5:16 AM, Chris Rodgers <email@example.com> wrote:
>>>>> Re Nathaniel's suggestion:
>>>>> I think this is pretty similar to the algorithm I'm currently using.
>>>>> current_guess = estimate_from_correlation(x, y)
>>>>> for timescale in decreasing_order:
>>>>>     xm, ym = find_matches(
>>>>>         x, y, current_guess, within=timescale)
>>>>>     current_guess = linfit(xm, ym)
>>>>> The problem is the local minima caused by mismatch errors. If the
>>>>> clockspeed estimate is off, then late events are incorrectly matched
>>>>> with a delay of one event. Then the updated guess moves closer to this
>>>>> incorrect solution. So by killing off the points that disagree, we
>>>>> reinforce the current orthodoxy!
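For concreteness, a hypothetical version of the find_matches step in that
loop, assuming x is a sorted array of event times and guess is an
(offset, clockspeed) pair, could be:

    import numpy as np

    def find_matches(x, y, guess, within):
        # transform y onto x's clock, find the nearest x for each
        # transformed event, and keep pairs closer than `within`
        offset, clockspeed = guess
        adj_y = y * clockspeed + offset
        idx = np.clip(np.searchsorted(x, adj_y), 1, len(x) - 1)
        # step back one where the left neighbour is closer
        idx -= (adj_y - x[idx - 1] < x[idx] - adj_y).astype(int)
        keep = np.abs(x[idx] - adj_y) < within
        return x[idx[keep]], y[keep]

Note that this still pairs each y event with its single closest x event,
so it inherits exactly the mismatch lock-in described above.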
>>>> Yes, that's why I was suggesting doing something similar to your
>>>> current algorithm, but different in ways that might avoid this problem.
>>>>> Actually the truest objective function would be the number of matches
>>>>> within some specified error.
>>>>> ERR = .1
>>>>> def objective(offset, clockspeed):
>>>>>     # adjust parametrization to suit
>>>>>     adj_y_times = y_times * clockspeed + offset
>>>>>     closest_x_times = np.searchsorted(x_midpoints, adj_y_times)
>>>>>     # searchsorted may return len(x_midpoints); clip to valid indices
>>>>>     closest_x_times = np.clip(closest_x_times, 0, len(x_midpoints) - 1)
>>>>>     pred_err = np.abs(adj_y_times - x_midpoints[closest_x_times])
>>>>>     closest_good = closest_x_times[pred_err < ERR]
>>>>>     return len(np.unique(closest_good))
>>>>> That function has some ugly non-smoothness due to the
>>>>> len(np.unique(...)). Would something like optimize.brent work for
>>>>> this, or am I on the wrong track?
>>>> That also has the problem that the objective function is totally flat
>>>> whenever the points are all within ERR, so you are guaranteed to get
>>>> an offset inaccuracy on the same order of magnitude as ERR. It sounds
>>>> like your clocks have a lot of jitter, though, if you can't do better
>>>> than 10 ms agreement, so maybe you can't get more accurate than that.
>>> I'd like to know the platform. The default Windows clock ticks every ~15.6 ms (64 Hz),
> Samuel Garcia
> Lyon Neuroscience
> CNRS - UMR5292 - INSERM U1028 - Universite Claude Bernard LYON 1
> R&D Team
> 50, avenue Tony Garnier
> 69366 LYON Cedex 07
> Tel: 04 37 28 74 24
> Fax: 04 37 28 76 01