Least median of squares for regression in scipy?
"least median of squares" doesn't mean anything to me. But, I know that
minimizing the sum of absolute differences (least absolute deviations) fits
the conditional median rather than the mean, which makes it a good technique
for dealing with outliers:
http://en.wikipedia.org/wiki/Least_absolute_deviation
http://en.wikipedia.org/wiki/L1_norm
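As a rough sketch of the least-absolute-deviations idea, the fit can be written
as a linear program: introduce one bound t_i >= |y_i - x_i . beta| per data
point and minimize the sum of the t_i. The function name lad_fit and the toy
data are made up for illustration, and I'm assuming a SciPy recent enough to
have scipy.optimize.linprog with method="highs" (1.6 or later).

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(X, y):
    """Least-absolute-deviations fit of y ~ X @ beta, posed as an LP (illustrative helper)."""
    n, p = X.shape
    # Unknowns: p coefficients, then n bounds t_i on the absolute residuals.
    c = np.concatenate([np.zeros(p), np.ones(n)])  # minimize sum(t)
    I = np.eye(n)
    # X @ beta - t <= y  and  -X @ beta - t <= -y  together mean |y - X @ beta| <= t.
    A_ub = np.block([[X, -I], [-X, -I]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * p + [(0, None)] * n  # beta free, t nonnegative
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

# Toy data: a line y = 2x + 1 with small noise and a few gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)
y[::10] += 30.0
X = np.column_stack([x, np.ones_like(x)])

slope_lad, intercept_lad = lad_fit(X, y)
slope_ols, intercept_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("LAD:", slope_lad, intercept_lad)
print("OLS:", slope_ols, intercept_ols)
```

On data like this the LAD line should stay close to the true slope and
intercept, while the ordinary least-squares line gets pulled toward the
outliers. The dense constraint matrix grows as n^2, so this form is only
reasonable for small problems.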
Note that you'll need an LP solver, since the least-absolute-deviations fit is
usually posed as a linear program. Another option is a hybrid between the
squared and absolute value loss functions, such as the one that Peter Huber
devised:
http://en.wikipedia.org/wiki/Huber_Loss_Function
This loss keeps most of the outlier-insensitivity of L1 but is differentiable
everywhere, so it is easy to minimize using gradient descent and a line search.
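Here is a minimal sketch of a Huber fit, assuming a linear model and a
hand-picked threshold delta. Instead of plain gradient descent I've swapped in
SciPy's BFGS (a gradient-based method that uses a line search internally)
through scipy.optimize.minimize. The names huber_loss_and_grad and huber_fit,
the value delta=1.0, and the toy data are all made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def huber_loss_and_grad(beta, X, y, delta):
    """Huber loss of the residuals X @ beta - y, and its gradient w.r.t. beta."""
    r = X @ beta - y
    small = np.abs(r) <= delta
    # Quadratic for small residuals, linear for large ones.
    loss = np.where(small, 0.5 * r**2, delta * (np.abs(r) - 0.5 * delta)).sum()
    # Derivative w.r.t. each residual: r in the quadratic zone, +/-delta outside.
    psi = np.where(small, r, delta * np.sign(r))
    return loss, X.T @ psi

def huber_fit(X, y, delta=1.0):
    """Illustrative helper: minimize the Huber loss, starting from the OLS solution."""
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
    res = minimize(huber_loss_and_grad, beta0, args=(X, y, delta),
                   jac=True, method="BFGS")
    return res.x

# Toy data: a line y = 2x + 1 with small noise and a few gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)
y[::10] += 30.0
X = np.column_stack([x, np.ones_like(x)])

slope_huber, intercept_huber = huber_fit(X, y, delta=1.0)
print("Huber:", slope_huber, intercept_huber)
```

The threshold delta sets where the loss switches from quadratic to linear, so
it should be on the order of the noise you expect in the good points. Each
outlier still pulls on the fit with a bounded force, so the result is biased a
little toward them, just far less than with squared loss.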
> Hi,
> I have to perform a linear regression on noisy data. In the last paper I read,
> least median of squares was suggested for dealing with outliers. I have
> searched the scipy docs but it seems nothing is readily available. Searching
> the web for "(python OR scipy OR numpy) least median square" doesn't yield
> meaningful results. The best I found were Fortran and MATLAB code, which I
> would need to wrap (I have zero knowledge about Fortran or wrapping it into
> Python, except that there's a tool called f2py, but I would have to learn that
> as well) or rewrite (I used MATLAB in the past, so this should be feasible).
> I am asking here in the hope there's something I overlooked before I jump into
> one of the (probably more time-demanding) possibilities I mentioned above.
> Thanks in advance,
>
> Jorge
