[SciPy-User] Orthogonal distance regression in 3D

Robert Kern robert.kern@gmail....
Wed Mar 7 06:30:48 CST 2012


On Wed, Mar 7, 2012 at 06:57, Владимир <draft2008@bk.ru> wrote:
>
>
>
> 06 марта 2012, 15:15 от Robert Kern <robert.kern@gmail.com>:
>> On Tue, Mar 6, 2012 at 08:22, Владимир <draft2008@bk.ru> wrote:
>> >
>> > 05 марта 2012, 14:59 от Robert Kern <robert.kern@gmail.com>:
>> >> On Mon, Mar 5, 2012 at 10:26, Владимир <draft2008@bk.ru> wrote:
>> >> > 02 марта 2012, 15:49 от Robert Kern <robert.kern@gmail.com>:
>> >> >> On Fri, Mar 2, 2012 at 06:02, Владимир <draft2008@bk.ru> wrote:
>> >> >> > Hello!
>> >> >> > I'm working with orthogonal distance regression (scipy.odr).
>> >> >> > I try to fit the curve to a point cloud (3d), but it doesn work properly, it
>> >> >> > returns wrong results
>> >> >> >
>> >> >> > For example I want to fit the simple curve y = a*x + b*z + c to some point
>> >> >> > cloud (y_data, x_data, z_data)
>> >> >> >
>> >> >> >
>> >> >> >     def func(p, input):
>> >> >> >
>> >> >> >     x,z = input
>> >> >> >
>> >> >> >     x = np.array(x)
>> >> >> >
>> >> >> >     z = np.array(z)
>> >> >> >
>> >> >> >     return (p[0]*x + p[1]*z + p[2])
>> >> >> >
>> >> >> >
>> >> >> >     initialGuess = [1,1,1]
>> >> >> >
>> >> >> >     myModel = Model(func)
>> >> >> >
>> >> >> >     myData = Data([x_data, z_daya], y_data)
>> >> >> >
>> >> >> >     myOdr = ODR(myData, myModel, beta0 = initialGuess)
>> >> >> >
>> >> >> >     myOdr.set_job(fit_type=0)
>> >> >> >
>> >> >> >     out = myOdr.run()
>> >> >> >
>> >> >> >     print out.beta
>> >> >> >
>> >> >> > It works perfectly in 2d dimension (2 axes), but in 3d dimension the results
>> >> >> > are not even close to real, moreover it is very sensitive to initial Guess,
>> >> >> > so it returns different result even if i change InitiaGuess from [1,1,1]
>> >> >> > to [0.99,1,1]
>> >> >> >
>> >> >> > What do I do wrong?
>> >> >>
>> >> >> Can you provide a complete runnable example including some data? Note
>> >> >> that if you do not specify any errors on your data, they are assumed
>> >> >> to correspond to a standard deviation of 1 for all dimensions. If that
>> >> >> is wildly different from the actual variance around the "true"
>> >> >> surface, then it might lead the optimizer astray.
>> >> >>
>> >> >> --
>> >> >> Robert Kern
>> >> >>
>> >> >
>> >> > I wonder why when I change the initial guess the results changes too. As it, the result depends on the initial guess directly. This is wrong.
>> >> >
>> >> > Here is an example (Sorry for the huge array of data, but its important to show what happens on it)
>> >> >
>> >> > import numpy as np
>> >> > from scipy.odr import *
>> >> > from math import *
>> >>
>> >> [snip]
>> >>
>> >> > def funcReturner(p, input):
>> >> >        input = np.array(input)
>> >> >        x = input[0]
>> >> >        z = input[1]
>> >> >        return 10**(p[0]*x + p[1]*z +p[2])
>> >>
>> >> Ah. 10**(p[0]*x+p[1]*z+p[2]) is a *lot* different from the linear
>> >> problem you initially asked about. Setting the uncertainties
>> >> accurately on all axes of your data is essential. Do you really know
>> >> what they are? It's possible that you want to try fitting a plane to
>> >> np.log10(y_data) instead.
>> >>
>> >> > myModel = Model(funcReturner)
>> >> > myData = Data([x_data,z_data], y_data)
>> >> > myOdr = ODR(myData, myModel, beta0=[0.04, -0.02,  1.75])
>> >> > myOdr.set_job(fit_type=0)
>> >> > out = myOdr.run()
>> >> > result = out.beta
>> >> >
>> >> > print "Optimal coefficients: ", result
>> >> >
>> >> > I tryed to specify sx, sy, we, wd, delta, everything: and I get the better results, but they are still not what I need. And they are still depends directly on initial guess as well.
>> >> > If I set initial guess to [1,1,1], it fails to find any close solution and returns totally wrong result with huge Residual Variance like 3.21014784829e+209
>> >>
>> >> For such a nonlinear problem, finding reasonable initial guesses is
>> >> useful. There is also a maximum iteration limit defaulting to a fairly
>> >> low 50. Check out.stopreason to see if it actually converged or just
>> >> ran into the iteration limit. You can keep calling myOdr.restart()
>> >> until it converges. If I start with beta0=[1,1,1], it converges
>> >> somewhere between 300 and 400 iterations.
>> >>
>> >> --
>> >> Robert Kern
>> >>
>> >
>> > Yeah, increasing the number of iterations (maxit parameter) makes the results slightly more accurate, but not better. I mean if I attain that the stop reason is "sum square convergence", results are even worse. But, I tryed to fit converted function, like you recommended - np.log10(y_data). And it gave me the proper results. Why that happens and is it possible to achieve these results without convertion?
>>
>> As I mentioned before, in a nonlinear case, you really need to have
>> good estimates of the uncertainties on each point. Since your Y
>> variable varies over several orders of magnitude, I really doubt that
>> the uncertainties are the same for each point. It's more likely that
>> you want to assign a relative 10% (or whatever) uncertainty to each
>> point rather than the same absolute uncertainty to each. I don't think
>> that you have really measured both 1651.5+-1.0 and 0.05+-1.0, but
>> that's what you are implicitly saying when you don't provide explicit
>> estimates of the uncertainties.
>>
>> One problem that you are going to run into is that least-squares isn't
>> especially appropriate for your model. Your Y output is strictly
>> positive, but it goes very close to 0.0. The error model that
>> least-squares fits is that each measurement follows a Gaussian
>> distribution about the true point, and the Gaussian has infinite
>> support (specifically, it crosses that 0 line, and you know a priori
>> that you will never observe a negative value). For the observations
>> ~1000.0, that doesn't matter much, but it severely distorts the
>> problem at 0.05. Your true error distribution is probably something
>> like log-normal; the errors below the curve are small but the errors
>> above can be large. Transforming strictly-positive data with a
>> logarithm is a standard technique. In a sense, the "log-transformed"
>> model is the "true" model to be using, at least if you want to use
>> least-squares. Looking at the residuals of both original and the
>> log10-transformed problem (try plot(x_data, out.eps, 'k.'),
>> plot(x_data, out.delta[0], 'k.'), etc.), it looks like the
>> log10-transformed data does fit fairly well; the residuals mostly
>> follow a normal distribution of the same size across the dataset.
>> That's good! But it also means that if you transform these residuals
>> back to the original space, they don't follow a normal distribution
>> anymore, and using least-squares to fit the problem isn't appropriate
>> anymore.
>>
>> > I could use converted function further, but the problem is that I have the whole list of different functions to fit. And I'd like to create universal fitter for all of them.
>>
>> Well, you will have to go through those functions (and their implicit
>> error models) and determine if least-squares is truly appropriate for
>> them. Least-squares is not appropriate for all models. However,
>> log-transforming the strictly-positive variables in a model quite
>> frequently is all you need to do to turn a least-squares-inappropriate
>> model into a least-squares-appropriate one. You can write your
>> functions in that log-transformed form and write a little adapter to
>> transform the data (which is given to you in the original form).
>>
>> --
>> Robert Kern
>>
>
> Robert, thank you very much for detailed answer, now I see what is the problem. I don't really have any uncertainties, and I guess it would be hard to compute them from the data. Moreover, this data is just the sample, and I will have a different types of data in real. Transformation actually helps just for the the couple of functions, for instance, 10**(A*x + B*z +C) and C*(A)**X*(B)**Z functions fit just perfectly, but doesn't work for any others (like A*lg(X) + B*Z + C, C/(1 + A*X + B*Z)).

Right. That's why I said that you have to go through all of the
functions to see if it's applicable or not.

> I transform the function like this (conditionally):
> y_data = np.log10(y_data)
> function = np.log10(function)
> , is that correct?

Yes.

> And what do you mean by little adapter to transform the data?

I just meant the "y_data = np.log10(y_data)" part.

> By the way, the problem appears only in 3d mode. When I use the same logarithmic data in 2d mode (no Z axis), it works perfectly for all functions, and no log10 transformation needed (this transformation distort the results, make them worse in that case).

Without knowing how you are getting the data to make that
determination, I don't have much to say about it. The problem of
inaccurate uncertainties will probably get larger as you increase the
dimension of the inputs. Since you don't know them, it's probably not
a good idea to keep trying to do ODR instead of ordinary least
squares. Use myOdr.set_job(fit_type=2) to use OLS instead. You still
have a problem with the unknown uncertainties on the Y output, but
that narrows some of the problems down.

> Do you know any other fitting methods, available in python?

None that free you from this kind of thoughtful analysis. Fitting
functions isn't a black box, I'm afraid. You need to consider your
models and your data before fitting, and you need to look at the
residuals afterwards to check that the results make sense. Then you
improve your model. Least-squares is not suitable for all models. You
may need to roll your own error model and use the generic minimization
routines in scipy.optimize. If you can formulate your models as
generalized linear models (which you can for a couple of them, but not
all), you could look at the functionality for this in the statsmodels
package.

-- 
Robert Kern


More information about the SciPy-User mailing list