# [SciPy-User] technical question: normed exponential fit for data?

Thu Mar 24 13:30:27 CDT 2011

Dear Dan,

thank you very much for this approach; it really sounds very reasonable.

However, not being a probability pro, I don't understand the meaning
of some terms:

P(concentration | count, temperature) P(count, temperature) ?

I'd be grateful if you could elaborate a little more. It sounds very promising!

Best regards,
Daniel

2011/3/24 Daniel Lepage <dplepage@gmail.com>:
> On Thu, Mar 24, 2011 at 11:33 AM, Daniel Mader
>
>> this is not a software question or scipy problem but rather I have no
>> clue how to tackle this on a mathematical level.
>
> I'd write this in terms of probabilities, but that could be just
> because I tend to write everything that way :-)
>
> Your calibration measurements are samples of a probability
> distribution P(count, concentration, temperature), which equals
> P(concentration | count, temperature) P(count, temperature). You can
> estimate P(count, temperature) from these samples and thus estimate
> P(concentration | count, temperature).
>
> Now you do your experiment - with an unknown concentration, you take
> some [count, temperature] measurements. Based on the uncertainty of
> your sensors, you estimate P(count, temperature) for this new sample.
> You want to find P(concentration), so you write the marginalization:
>
> P(concentration)
>   = \int_{temperature, count} P(concentration, count, temperature)
>   = \int_{temperature, count} P(concentration | count, temperature) P(count, temperature)
>
> where \int denotes an integral over all possible temperatures and
> counts. You have P(concentration | count, temperature) from your
> calibration, so you integrate to get P(concentration), and then choose
> the concentration that maximizes it (ML estimator) or that minimizes
> the expected error from it (Bayes estimator).
>
> If you assume that all measurements come from some deterministic
> system corrupted by Gaussian noise and that all concentrations and
> temperatures are equally likely, and you choose to use a
> maximum-likelihood (ML) estimator, then this takes a very simple
> algorithmic form:
> 1) Fit a surface to your (count, concentration, temperature)
> calibration points. If you assume that count is a linear function of
> concentration and temperature, this surface will be a plane (very easy
> to fit); if instead you expect it to be exponential in temperature and
> linear in concentration, you'll be fitting a curved surface.
> 2) Each new [count, temperature] pair defines a line in this 3D space;
> intersect this line with your surface to get the most probable
> concentration.
>
> As Robert pointed out, step 1 will be a lot more robust if you have
> calibration samples with more than 3 distinct concentrations.
>
> Hope this helps,
> Dan Lepage
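
The two-step recipe quoted above (fit a surface to the calibration points, then intersect each new [count, temperature] line with it) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the method anyone in the thread actually ran: the model form count = (a + b * concentration) * exp(k * temperature) is one possible reading of "exponential in temperature and linear in concentration", the function names count_model, fit_calibration and estimate_concentration are made up for this example, and the calibration data are synthetic. The fit itself uses scipy.optimize.curve_fit.

```python
import numpy as np
from scipy.optimize import curve_fit


def count_model(X, a, b, k):
    """Assumed surface: linear in concentration, exponential in temperature."""
    conc, temp = X
    return (a + b * conc) * np.exp(k * temp)


def fit_calibration(conc, temp, count, p0):
    """Step 1: least-squares fit of the surface to the calibration points."""
    popt, _ = curve_fit(count_model, (conc, temp), count, p0=p0)
    return popt


def estimate_concentration(count, temp, params):
    """Step 2: intersect the line (count, temp fixed) with the fitted surface.

    For the assumed model this has a closed form, obtained by solving
    count = (a + b * conc) * exp(k * temp) for conc.
    """
    a, b, k = params
    return (count * np.exp(-k * temp) - a) / b


# Synthetic calibration set (hypothetical numbers, for illustration only):
# five distinct concentrations measured at five temperatures.
rng = np.random.default_rng(0)
true_params = (50.0, 20.0, 0.03)
conc_grid, temp_grid = np.meshgrid([0.0, 1.0, 2.0, 5.0, 10.0],
                                   [15.0, 20.0, 25.0, 30.0, 35.0])
conc_cal = conc_grid.ravel()
temp_cal = temp_grid.ravel()
count_cal = (count_model((conc_cal, temp_cal), *true_params)
             + rng.normal(0.0, 1.0, conc_cal.size))  # Gaussian sensor noise

params = fit_calibration(conc_cal, temp_cal, count_cal, p0=(30.0, 30.0, 0.02))

# A new measurement taken at 22 degrees with an unknown concentration
# (generated here from a concentration of 4.0 so the result can be checked).
new_temp = 22.0
new_count = count_model((4.0, new_temp), *true_params)
est = estimate_concentration(new_count, new_temp, params)
print("fitted (a, b, k):", params)
print("estimated concentration:", est)
```

If the surface is assumed to be a plane instead, step 1 reduces to ordinary linear least squares (for example with numpy.linalg.lstsq), and step 2 is again a one-line rearrangement of the fitted equation.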