# [SciPy-User] How to do symmetry detection?

josef.pktd@gmai... josef.pktd@gmai...
Wed Jan 20 14:00:46 CST 2010

```On Wed, Jan 20, 2010 at 2:39 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> On Wed, Jan 20, 2010 at 12:26 PM, iCy-fLaME <icy.flame.gm@gmail.com> wrote:
>>
>> Thanks for the replies!
>>
>> Perhaps I should clarify that the input data can be int or float, and
>> most of them will have a very large DC offset (i.e. sum(data) >> 0),
>> and no, the signal duration can be anything, I can not "guess"
>>
>
> You should remove the offset, it is translation invariant anyway and gives
> no symmetry information.
>
>>
>> The problem with convolution (scipy.signal.convolve) with self is, it
>> will only produce one "valid" point in the middle, because anywhere
>> else there is a mis-match of array shape.
>>
>
> This can be a problem if the symmetry is near an end, but won't matter much
> if the relevant part is short or near the middle. The end effect will be a
> problem no matter what method you use. Think of convolution as a matched
> filter.
>
>>
>> I believe scipy.signal.convolve does not take into account the number
>> of points being integrated, and in the case of a large DC offset, any
>> matches far from the middle of the data will be drowned out by other
>> areas which have more points to integrate over.
>>
>> Self convolution also has a problem of signal features matching
>> itself. Imagine the input of the following:
>>
>> data: ______W____M_____
>> data[::-1]: _____M____W______
>>
>> As you do the convolution, feature W will match itself first, then
>> comes the W-M pair matching, then the M-M matching, whereas a valid
>> algorithm should only produce results for the W-M pair matching.
>>
>
> Well, there is no symmetry in that example. If you don't know if there is
> symmetry then you have to account for that possibility in setting up the
> statistics. I'm thinking Bayesian here.
>
> Chuck

And I think that convolve, and especially fftconvolve for longer series,
has such a large speed advantage that running your loop only to confirm
the result (or several candidates) will still be much faster than running
the Python loop over the entire array.
Also, if the series is normalized to mean zero, then the out-of-bounds
effect of the full self-convolution will not matter so much.

Josef

```
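The approach suggested in this reply (remove the mean, run a full self-convolution with fftconvolve, then confirm only the best candidates with a direct check) might be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread: the function names `symmetry_candidates` and `mirror_error`, the `n_candidates` parameter, and the Gaussian test signal are all hypothetical. The direct check averages over the overlapping samples only, which is one possible way to address the concern about the number of points being integrated.

```python
import numpy as np
from scipy.signal import fftconvolve


def symmetry_candidates(data, n_candidates=3):
    """Rank candidate mirror centres by the full self-convolution.

    Hypothetical helper. Returns centres in sample units; values may be
    half-integers because lag k corresponds to centre k / 2.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean()  # remove the DC offset first
    # conv[k] = sum_i x[i] * x[k - i]; a mirror symmetry about centre c
    # shows up as a peak at lag k = 2 * c.
    conv = fftconvolve(x, x, mode="full")
    best_lags = np.argsort(conv)[::-1][:n_candidates]
    return best_lags / 2.0


def mirror_error(data, centre):
    """Mean squared mismatch between the two sides of `centre`.

    Hypothetical helper. Only samples whose mirror image lies inside the
    array are compared, and the result is averaged over that overlap.
    """
    x = np.asarray(data, dtype=float)
    n = len(x)
    k = int(round(2 * centre))
    i = np.arange(max(0, k - (n - 1)), min(n - 1, k) + 1)
    return np.mean((x[i] - x[k - i]) ** 2)


# Assumed test signal: a symmetric bump centred at sample 30 on a large offset.
t = np.arange(100)
data = 1000.0 + np.exp(-0.5 * ((t - 30) / 4.0) ** 2)

candidates = symmetry_candidates(data)
errors = [mirror_error(data, c) for c in candidates]
```

The cheap FFT-based step only proposes centres; the slower sample-by-sample comparison is then run on a handful of candidates instead of on every possible position in the array.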