[Numpy-discussion] Log Arrays

Robert Kern robert.kern@gmail....
Thu May 8 14:18:14 CDT 2008


On Thu, May 8, 2008 at 2:12 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
> On Thu, May 8, 2008 at 11:46 AM, Anne Archibald <peridot.faceted@gmail.com>
> wrote:
>>
>> 2008/5/8 Charles R Harris <charlesr.harris@gmail.com>:
>> >
>> > On Thu, May 8, 2008 at 10:56 AM, Robert Kern <robert.kern@gmail.com>
>> > wrote:
>> > >
>> > > When you're running an optimizer over a PDF, you will be stuck in the
>> > > region of exp(-1000) for a substantial amount of time before you get
>> > > to the peak. If you don't use the log representation, you will never
>> > > get to the peak because all of the gradient information is lost to
>> > > floating point error. You can consult any book on computational
>> > > statistics for many more examples. This is a long-established best
>> > > practice in statistics.
>> >
>> > But IEEE is already a log representation. You aren't gaining
>> > precision, you are gaining more bits in the exponent at the expense
>> > of bits in the mantissa, i.e., less precision. As I say, it may be
>> > convenient, but if CPU cycles matter, it isn't efficient.
>>
>> Efficiency is not the point here. IEEE floats simply cannot represent
>> the difference between exp(-1000) and exp(-1001). This difference can
>> matter in many contexts.
>>
>> For example, suppose I have observed a source in X-rays and I want to
>> fit a blackbody spectrum. I have, say, hundreds of spectral bins, with
>> a number of photons in each. For any temperature T and amplitude A, I
>> can compute the distribution of photons in each bin (Poisson with mean
>> depending on T and A), and hence the probability of obtaining the
>> observed number of photons in each bin. Even if a blackbody with
>> temperature T and amplitude A is a perfect fit, I should expect this
>> number to be very small, since the chance of obtaining *exactly* this
>> sequence of integers is quite small. But when I start the process I
>> need to guess a T and an A and evaluate the probability. If my values
>> are far off, the probability will almost certainly be lower than
>> exp(-1000). But I need to know whether, if I increase T, this
>> probability will increase or decrease, so I cannot afford to treat it
>> as zero, or throw up my hands and say "This is smaller than one over
>> the number of baryons in the universe! My optimization problem doesn't
>> make any sense!".
>
> You want the covariance of the parameters of the fit, which will be much
> more reasonable. And if not, then you have outliers. Mostly, folks are
> looking for the probability of classes of things, in this case the class
> of curves you are fitting; the probability of any given sequence, which
> approaches zero, is of much less interest. So the errors in the parameters
> of the model matter; the probability of an essentially unique sequence
> matters much less.
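
[For concreteness on the exponent-versus-mantissa trade-off under
discussion, numpy reports the relevant float64 limits directly; the
printed values are the standard IEEE 754 double-precision figures:

    import numpy as np

    info = np.finfo(np.float64)
    print(info.nmant, info.nexp)    # 52 mantissa bits, 11 exponent bits
    print(info.eps)                 # ~2.22e-16 relative precision
    print(info.tiny)                # ~2.23e-308, smallest normal number
    print(np.exp(-1000.0))          # 0.0: e**-1000 ~ 5e-435 is below even the subnormal range
]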

Except that when you are doing practical computations, you will often
be computing likelihoods as part of the larger computation. Yes, the
final probabilities that you interpret won't be exp(-1000) (and if
they are, then 0 is close enough), but the calculations *in between*
do require that exp(-1000) and exp(-1001) be distinguishable from each
other.
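
[One concrete example of such an in-between step is summing probabilities
that are only available as logs, e.g. when marginalizing over a nuisance
parameter. A sketch with made-up log-likelihoods of the magnitude under
discussion:

    import numpy as np

    log_p = np.array([-1000.0, -1001.0])         # two intermediate log-likelihood terms

    # Naive: exponentiate, sum, take the log again -- everything underflows to 0
    naive = np.log(np.exp(log_p).sum())          # -inf (plus a divide-by-zero warning)

    # Log-space: logaddexp does the same sum without ever leaving log space
    stable = np.logaddexp(log_p[0], log_p[1])    # about -999.69, still usable downstream
    print(naive, stable)
]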

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

