[SciPy-User] technical question: normed exponential fit for data?

Daniel Lepage dplepage@gmail....
Thu Mar 24 12:20:14 CDT 2011


On Thu, Mar 24, 2011 at 11:33 AM, Daniel Mader
<danielstefanmader@googlemail.com> wrote:

> this is not a software question or scipy problem but rather I have no
> clue how to tackle this on a mathematical level.

I'd write this in terms of probabilities, but that could be just
because I tend to write everything that way :-)

Your calibration measurements are samples of a probability
distribution P(count, concentration, temperature), which equals
P(concentration | count, temperature) P(count, temperature). You can
estimate both the joint distribution and P(count, temperature) from
these samples, and thus estimate P(concentration | count, temperature)
as their ratio.

Now you do your experiment - with an unknown concentration, you take
some [count, temperature] measurements. Based on the uncertainty of
your sensors, you estimate P(count, temperature) for this new sample.
You want to find P(concentration), so you write the marginalization:

P(concentration)
  = \int P(concentration, count, temperature) d(count) d(temperature)
  = \int P(concentration | count, temperature) P(count, temperature)
         d(count) d(temperature)

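where \int denotes an integral over all possible temperatures and
counts. You have P(concentration | count, temperature) from your
calibration, so you integrate to get P(concentration), and then choose
either the concentration that maximizes it (the most probable value,
which is the ML estimate when all concentrations are equally likely a
priori) or the one that minimizes the expected error under it (Bayes
estimator; for squared error this is the mean of P(concentration)).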
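
To make the marginalization concrete, here is a minimal numerical
sketch on a grid. Everything in it is an assumption for illustration:
the function implied_concentration stands in for whatever your
calibration gives you, the Gaussian widths are made-up sensor and
calibration uncertainties, and the "Bayes estimator" is taken to mean
the squared-error one (the mean).

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Normal density, used for both sensor noise and calibration scatter."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def implied_concentration(count, temp):
    """Hypothetical calibration result (made up for this sketch)."""
    return (count - 2.0 * temp) / 10.0

# Grids over the three variables.
conc_grid = np.linspace(0.0, 12.0, 241)
count_grid = np.linspace(85.0, 115.0, 121)
temp_grid = np.linspace(18.0, 22.0, 41)
C, T = np.meshgrid(count_grid, temp_grid, indexing="ij")

# P(count, temperature) for the new sample: assumed reading of
# count = 100 +/- 3 and temperature = 20 +/- 0.5, as discrete weights.
p_meas = gaussian(C, 100.0, 3.0) * gaussian(T, 20.0, 0.5)
p_meas /= p_meas.sum()

# P(concentration | count, temperature): assumed Gaussian scatter of 0.2
# around the concentration implied by the calibration.
p_cond = gaussian(conc_grid[:, None, None], implied_concentration(C, T)[None], 0.2)

# Marginalize: sum over the count and temperature axes.
p_conc = (p_cond * p_meas[None]).sum(axis=(1, 2))
dconc = conc_grid[1] - conc_grid[0]
p_conc /= p_conc.sum() * dconc  # normalize to a density over concentration

conc_mode = conc_grid[np.argmax(p_conc)]       # most probable concentration
conc_mean = (conc_grid * p_conc).sum() * dconc  # minimizes expected squared error
print(conc_mode, conc_mean)
```

With symmetric Gaussian assumptions like these the two estimates land
on essentially the same value; they differ once P(concentration) is
skewed.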

If you assume that all measurements come from some deterministic
system corrupted by Gaussian noise and that all concentrations and
temperatures are equally likely, and you choose to use a
maximum-likelihood (ML) estimator, then this takes a very simple
algorithmic form:
1) Fit a surface to your (count, concentration, temperature)
calibration points. If you assume that count is a linear function of
concentration and temperature, this surface will be a plane (very easy
to fit); if instead you expect it to be exponential in temperature and
linear in concentration, you'll be fitting a curved surface.
2) Each new [count, temperature] pair defines a line in this 3D space;
intersect this line with your surface to get the most probable
concentration.
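
The two steps above could be sketched as follows for the "linear in
concentration, exponential in temperature" case. This is only a sketch
under stated assumptions: the model form count_model, its parameters
a, b, c, the synthetic calibration data, and the search bracket for the
concentration are all invented here for illustration, not taken from
your setup.

```python
import numpy as np
from scipy.optimize import brentq, curve_fit

def count_model(X, a, b, c):
    """Assumed surface: linear in concentration, exponential in temperature."""
    conc, temp = X
    return (a * conc + b) * np.exp(c * temp)

# Synthetic calibration set (made up): 4 concentrations x 5 temperatures.
rng = np.random.default_rng(0)
conc = np.repeat([1.0, 2.0, 5.0, 10.0], 5)
temp = np.tile([15.0, 20.0, 25.0, 30.0, 35.0], 4)
true_params = (120.0, 30.0, 0.02)
counts = count_model((conc, temp), *true_params) + rng.normal(0.0, 2.0, conc.size)

# Step 1: least-squares fit of the surface to the calibration points.
params, _ = curve_fit(count_model, (conc, temp), counts, p0=(100.0, 10.0, 0.01))

def estimate_concentration(count, temp, params, lo=0.0, hi=100.0):
    """Step 2: intersect the line (count, temp fixed) with the fitted surface.

    The bracket [lo, hi] is an assumed plausible concentration range.
    """
    residual = lambda c: count_model((c, temp), *params) - count
    return brentq(residual, lo, hi)

# Usage: a new reading taken at temperature 22 from a sample whose
# (noise-free) count corresponds to concentration 3.
new_count = count_model((3.0, 22.0), *true_params)
print(estimate_concentration(new_count, 22.0, params))
```

For this particular model the intersection can also be solved in closed
form, since count is linear in concentration; the root finder is used
only so the same code works for other surface shapes.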

As Robert pointed out, step 1 will be a lot more robust if you have
calibration samples with more than 3 distinct concentrations.

Hope this helps,
Dan Lepage

