[SciPy-User] Fwd: one-sided gauss fit -- or: how to estimate backgound noise ?
Tue Jul 20 14:14:20 CDT 2010
On Sat, Jul 17, 2010 at 12:07 AM, Zachary Pincus
>> I was thinking about 1D-fitting the histogram. And I need this to be
>> fully automatic so that I can apply that to many subregions of many
>> I have to think about your suggestions for a while.
> Well, if you don't want to estimate the mean/std of the underlying
> data but just work based on the histogram, you could use a nonlinear
> optimizer to fit a mean/std-parameterized gaussian PDF to the 1D
> histogram (with some amount of the tails chopped off). Just make the
> loss function the RMS between the data points (histogram height) and
> the fit curve at the positions of each data point? That would probably
> work too? But I'm not a statistician.
thanks for your reply. The idea is to calculate the mean/std of the
*background noise* of the underlying (2d or 3d) image data based on
the image's 1d image intensity histogram.
Regarding the "tail", the problem is that in general the signal
intensities are not well separated from the background. Thus, the
right half of the background's (Gaussian) noise distribution may
already be significantly miss-shaped - whereas the left side, i.e. all
values below the mean background level, should be nicely Bell-shape
distributed. (This idea comes also from looking at many hundreds of
image histogram [my wx/OpenGL based image viewer always displays the
image intensity histogram right below the image])
>From your answer I just got the idea of changing the error function in
such a way to only sum up the datapoint-model-errors (IOW, build the
RMS) over intensities smaller than the models mean value. So, the
resulting question becomes if the error function has the Gaussian-mean
value of the "current" fitting step available to it !? I can probably
find that out.
> Though, now that I think on it, I seem to recall that the EM algorithm
> was originally deployed to estimate parameters of gaussians with
> censored tails. (I think the problem was: how tall was the average
> Frenchman, given distribution of heights of French soldiers and the
> knowledge that there was a height minimum for army service?) I think
> you just estimate the mean/std from the censored data, then fill in
> the censored tails with samples from the fit distribution, and then re-
> estimate the mean/std from the new data, etc. I forget exactly how one
> does this (does it work on the histogram, or the underlying data,
> e.g.) but that's the general idea.
>> On Fri, Jul 16, 2010 at 1:04 AM, Christoph Deil
>> <Deil.Christoph@googlemail.com> wrote:
>>> Hi Sebastian,
>>> in astronomy a method called kappa-sigma-clipping is sometimes used
>>> to estimate the background level by clipping away most of the signal:
>>> I am not aware of a python implementation, but it's just a few
>>> lines of code.
>>> If you can identify the background level approximately by eye,
>>> e.g. by plotting a histogram of your data, you should be able to
>>> just fit the tail of the Gaussian that only contains background.
>>> Here is my attempt at doing such a fit using
>>> similar to but not exactly what you want:
>>> from scipy.stats import norm, halfnorm, uniform
>>> signal = - uniform.rvs(0, 3, size=10000)
>>> background = norm.rvs(size=10000)
>>> data = hstack((signal, background))
>>> hist(data, bins=30)
>>> selection = data[data>0]
>>> x = linspace(-3, 3, 100)
>>> y = selection.sum() * halfnorm.pdf(x)/3
>>> Good luck!
>>> On Jul 15, 2010, at 10:39 PM, Sebastian Haase wrote:
>>>> In image analysis one is often faced with (often unknown) background
>>>> levels (offset) + (Gaussian) background noise.
>>>> The overall intensity histogram of the image is in fact often
>>>> (Bell shaped), but depending on how many (foreground) objects are
>>>> present the histogram shows a positive tail of some sort.
>>>> So, I just got the idea if there was a function (i.e. mathematical
>>>> algorithm) that would allow to fit only the left half of a Gaussian
>>>> bell curve to data points !?
>>>> This would have to be done in a way that the center, the variance
>>>> sigma) and the peak height are free fitting parameters.
>>>> Any help or ideas are appreciated,
>>>> Sebastian Haase
More information about the SciPy-User