# [SciPy-user] nonlinear fit with non uniform error?

Trevis Crane t_crane@mrl.uiuc....
Thu Jun 21 08:40:45 CDT 2007

As an aside, will those of you who are *more* in the know on this topic
than the rest of us suggest a good text that has a worthwhile treatment
of this subject (as well as other related data analysis/statistical
issues)?

I'd love to learn more about it, but just jumping on Amazon and picking
a book almost at random seems like a good way to waste a lot of money I
don't have on books I don't need. So if you have a favorite reference or
text, I'm interested in knowing about it.

thanks,

trevis

-----Original Message-----
From: scipy-user-bounces@scipy.org [mailto:scipy-user-bounces@scipy.org]
On Behalf Of David Huard
Sent: Thursday, June 21, 2007 8:09 AM
To: SciPy Users List
Subject: Re: [SciPy-user] nonlinear fit with non uniform error?

Hi,

What you have is a heteroscedastic normal distribution (varying
variance) describing the residuals.

2007/6/21, Matthieu Brucher <matthieu.brucher@gmail.com>:

1)Does this mean that least squares is NOT ok?

Yes, LS is _NOT_ OK because it assumes that the distribution (with its
parameters) is the same for all errors. I don't remember exactly, but
this may be due to ergodicity.

Well, let's put things in perspective. You can still use ordinary
least squares. In theory, that amounts to assuming that the error mean
and variance are the same for every observation. In your case this is
not true, so you should treat the LS solution as an approximation. Under
this approximation, the large errors on Cy will tend to dominate the
residuals, and the values in Ay will probably not be fitted optimally. I
advise you to try it anyway and check visually whether that matters to
you.

2)What does "rescaling" mean in this context?

You must rescale B and C so that all three data sets have the same error:
Ay +/- 5
B'y +/- 5
C'y +/- 5
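
One way to read the rescaling above is to divide each residual by its standard deviation, so that every point ends up with the same (unit) error, and then run ordinary least squares on the scaled residuals. Here is a minimal sketch of that idea using `scipy.optimize.least_squares`. The exponential `model`, the error levels in `sigma`, and the synthetic data are all assumptions made up for illustration; they are not from the original problem.

```python
import numpy as np
from scipy.optimize import least_squares

def model(params, x):
    """Hypothetical model: exponential decay a * exp(-b * x)."""
    a, b = params
    return a * np.exp(-b * x)

def scaled_residuals(params, x, y_obs, sigma):
    """Residuals divided by their standard deviation (unit error each)."""
    return (y_obs - model(params, x)) / sigma

# Synthetic data: small errors on the first half, large on the second.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 60)
true_params = np.array([10.0, 0.8])
sigma = np.where(x < 2.0, 0.1, 1.0)
y_obs = model(true_params, x) + rng.normal(0.0, sigma)

# Ordinary least squares on the rescaled residuals.
fit = least_squares(scaled_residuals, x0=[5.0, 0.5], args=(x, y_obs, sigma))
print(fit.x)
```

Dropping the division by `sigma` in `scaled_residuals` gives the plain least-squares fit, which is a quick way to compare the two visually.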

Or maximize the likelihood of a multivariate normal distribution, whose
covariance matrix describes your assumption about the heteroscedasticity
of the residuals.

\Sigma = \begin{pmatrix}
  \sigma_A^2 & 0          & 0          \\
  0          & \sigma_B^2 & 0          \\
  0          & 0          & \sigma_C^2
\end{pmatrix}

Heteroscedastic log-likelihood:
\ln L = -n/2 \ln(2\pi) - 1/2 \sum_i \ln(\sigma_i^2)
        - 1/2 \sum_i \sigma_i^{-2} (y_{obs,i} - y_{sim,i})^2
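
The expression above translates almost term by term into NumPy. The sketch below maximizes it (by minimizing its negative) with `scipy.optimize.minimize`. The straight-line model `line`, the error model `sigma = 0.2 + 0.3 * x`, and the data are hypothetical choices for illustration only. When the `sigma` values are known and fixed, as assumed here, the first two terms are constants and the result should match a weighted least-squares fit.

```python
import numpy as np
from scipy.optimize import minimize

def log_likelihood(y_obs, y_sim, sigma):
    """Log-likelihood of independent normal residuals with per-point sigma."""
    n = y_obs.size
    return (-0.5 * n * np.log(2.0 * np.pi)
            - 0.5 * np.sum(np.log(sigma**2))
            - 0.5 * np.sum((y_obs - y_sim)**2 / sigma**2))

def line(params, x):
    """Hypothetical model: straight line slope * x + intercept."""
    slope, intercept = params
    return slope * x + intercept

# Synthetic data whose error grows with x (assumed error model).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
sigma = 0.2 + 0.3 * x
y_obs = line([2.0, 1.0], x) + rng.normal(0.0, sigma)

# Maximize the log-likelihood by minimizing its negative.
result = minimize(lambda p: -log_likelihood(y_obs, line(p, x), sigma),
                  x0=[1.0, 0.0], method="Nelder-Mead")
print(result.x)
```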

You might also consider the possibility that your errors are
multiplicative rather than additive. In this case, describing the
residuals by a lognormal distribution could make more sense.

Maximize lognormal likelihood:  L=lognormal(y_sim | ln(y_obs), \sigma)
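
For the multiplicative case, here is one possible sketch. It assumes the observations are generated as y_obs = y_sim * exp(eps) with eps normally distributed with standard deviation sigma, so that y_obs is lognormal with log-scale location ln(y_sim). That is one common reading and may not be exactly the parameterization intended above. The `model`, the data, and the choice to estimate `sigma` along with the model parameters are all assumptions. In `scipy.stats.lognorm`, `s` is the log-scale standard deviation and `scale` is exp(mu), hence `scale=y_sim`.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import lognorm

def model(params, x):
    """Hypothetical positive-valued model: a * exp(-b * x)."""
    a, b = params
    return a * np.exp(-b * x)

def neg_log_likelihood(theta, x, y_obs):
    """Negative lognormal log-likelihood; theta = (a, b, log(sigma))."""
    a, b, log_sigma = theta
    y_sim = model((a, b), x)
    if np.any(y_sim <= 0.0):
        return np.inf  # the lognormal needs a positive scale
    return -np.sum(lognorm.logpdf(y_obs, s=np.exp(log_sigma), scale=y_sim))

# Synthetic data with 10% multiplicative noise (assumed).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 4.0, 60)
y_obs = model((10.0, 0.8), x) * np.exp(rng.normal(0.0, 0.1, size=x.size))

result = minimize(neg_log_likelihood, x0=[5.0, 0.5, np.log(0.5)],
                  args=(x, y_obs), method="Nelder-Mead",
                  options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-8})
a_hat, b_hat = result.x[:2]
sigma_hat = np.exp(result.x[2])
print(a_hat, b_hat, sigma_hat)
```

Optimizing over log(sigma) rather than sigma keeps the standard deviation positive without needing a bounded optimizer.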

Cheers,

David

Matthieu

_______________________________________________
SciPy-user mailing list
SciPy-user@scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
