[SciPy-User] FW: curve fitting by a sum of gaussian with scipy

Charles R Harris charlesr.harris@gmail....
Thu Apr 18 10:05:19 CDT 2013

```
On Thu, Apr 18, 2013 at 8:59 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:

>
>
> On Thu, Apr 18, 2013 at 6:24 AM, Stéphanie haaaaaaaa <
> flower_des_iles@hotmail.com> wrote:
>
>> Dear all,
>>
>>
>> I'm doing bioinformatics and we map small RNA on mRNA. We have the
>> mapping coordinate of a protein on each mRNA and we calculate the relative
>> distance between the place where the protein is bound on the mRNA and the
>> site that is bound by a small RNA.
>> I obtain the following dataset:
>>
>>
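>> dist    eff
>> -69     3
>> -68     2
>> -67     1
>> -66     1
>> -60     1
>> -59     1
>> -58     1
>> -57     2
>> -56     1
>> -55     1
>> -54     1
>> -52     1
>> -50     2
>> -48     3
>> -47     1
>> -46     3
>> -45     1
>> -43     1
>> 0       1
>> 1       2
>> 2       12
>> 3       18
>> 4       18
>> 5       13
>> 6       9
>> 7       7
>> 8       5
>> 9       3
>> 10      1
>> 13      2
>> 14      3
>> 15      2
>> 16      2
>> 17      2
>> 18      2
>> 19      2
>> 20      2
>> 21      3
>> 22      1
>> 24      1
>> 25      1
>> 26      1
>> 28      2
>> 31      1
>> 38      1
>> 40      2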
>>
>>
>> When I plot the data, I see 3 peaks: one at around 3 or 4, another one
>> around 20, and a last one around -50 (see attached file, upper graph).
>>
>> I tried cubic spline interpolation, but it doesn't work very well for my
>> data (see attached file 2, red curve).
>> My idea was to do curve fitting with a sum of Gaussians. For example, in
>> my case, estimate 3 Gaussian curves around the peaks (at points 5, 20 and
>> -50). How can I do so?
>> I looked at scipy.optimize.curve_fit(), but how can I fit the curve on a
>> specific interval? How can I add the curves to get one single curve?
>>
>>
> That's interesting. On thinking about it, I suspect that if you built the
> design matrix for, say, fitting a uniform spline with fairly closely spaced
> sample points, it would be nearly singular. That would be a good thing
> here, because the pseudo-inverse would minimize the sum of squares of the
> coefficients, which in turn would knock down the curve where there is no
> data. Mind, I'm just speculating here and haven't tried it. Is the data
> you posted complete?
>

And thinking some more, always a bad sign here, this looks like a
histogram, but you have left out all the distance data points that had zero
matches. I think you need to keep them in.

Chuck
```
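
For readers who want to try the two suggestions in this thread together (keep the zero-count distances, then fit a sum of Gaussians with scipy.optimize.curve_fit), here is a minimal sketch. It is not code from the thread. The (dist, eff) pairs are my reading of the flattened table in the original post, and the helper names fill_zeros and three_gaussians, the starting guesses, and the bounds are all illustrative assumptions; the peak positions in p0 (-50, 4, 20) come from the poster's description.

```python
import numpy as np
from scipy.optimize import curve_fit

# (dist, eff) pairs as read from the table in the original post (assumed reading).
pairs = [
    (-69, 3), (-68, 2), (-67, 1), (-66, 1), (-60, 1), (-59, 1), (-58, 1),
    (-57, 2), (-56, 1), (-55, 1), (-54, 1), (-52, 1), (-50, 2), (-48, 3),
    (-47, 1), (-46, 3), (-45, 1), (-43, 1), (0, 1), (1, 2), (2, 12),
    (3, 18), (4, 18), (5, 13), (6, 9), (7, 7), (8, 5), (9, 3), (10, 1),
    (13, 2), (14, 3), (15, 2), (16, 2), (17, 2), (18, 2), (19, 2), (20, 2),
    (21, 3), (22, 1), (24, 1), (25, 1), (26, 1), (28, 2), (31, 1), (38, 1),
    (40, 2),
]


def fill_zeros(pairs):
    """Return (dist, eff) on every integer distance, with 0 where no match was listed."""
    d = np.array([p[0] for p in pairs])
    e = np.array([p[1] for p in pairs], dtype=float)
    dist = np.arange(d.min(), d.max() + 1)
    eff = np.zeros(dist.size)
    eff[d - d.min()] = e
    return dist.astype(float), eff


def three_gaussians(x, a1, m1, s1, a2, m2, s2, a3, m3, s3):
    """Sum of three Gaussians; each one has an amplitude a, a mean m and a width s."""
    total = np.zeros_like(x, dtype=float)
    for a, m, s in ((a1, m1, s1), (a2, m2, s2), (a3, m3, s3)):
        total += a * np.exp(-0.5 * ((x - m) / s) ** 2)
    return total


dist, eff = fill_zeros(pairs)

# Starting guesses (amplitude, mean, width) for the peaks near -50, 4 and 20.
p0 = [1.5, -50.0, 8.0, 18.0, 4.0, 2.0, 2.0, 20.0, 6.0]

# Illustrative bounds: non-negative amplitudes, each mean kept near its own peak.
lower = [0.0, -80.0, 0.5, 0.0, -10.0, 0.5, 0.0, 10.0, 0.5]
upper = [np.inf, -30.0, 40.0, np.inf, 10.0, 40.0, np.inf, 50.0, 40.0]

# Passing bounds makes curve_fit use the 'trf' solver; max_nfev raises its evaluation cap.
popt, pcov = curve_fit(three_gaussians, dist, eff, p0=p0,
                       bounds=(lower, upper), max_nfev=5000)

# One single curve: the three fitted components are simply added together.
fitted = three_gaussians(dist, *popt)
```

All three components are fitted at once over the whole distance range, so there is no need to fit each peak on its own interval and add the pieces afterwards; the bounds on the means do the job of keeping each Gaussian near its peak. Plotting fitted against dist on top of the zero-filled eff values shows whether three components are enough for this data.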