[Numpy-discussion] Log Arrays

David Cournapeau cournape@gmail....
Thu May 8 10:20:20 CDT 2008


On Thu, May 8, 2008 at 10:20 PM, Charles R Harris
<charlesr.harris@gmail.com> wrote:
>
>
> Floating point numbers are essentially logs to base 2, i.e., integer
> exponent and mantissa between 1 and 2. What does using the log buy you?

Precision, of course. I am not sure I understand the "base = 2"
notation, but doing computation in the so-called log domain is a must
in many statistical computations. In particular, in machine learning
with large datasets, it is common to have some points whose pdf is
extremely small, well below the smallest positive value a double can
represent. Typically, the computations in my EM toolbox are done
internally in the log domain, and use the logsumexp trick to compute
the likelihood given some data:

import numpy as np
a = np.array([-1000., -1001.])
np.log(np.sum(np.exp(a)))        # -> -inf, since exp(-1000) underflows to 0
-1000 + np.log(1 + np.exp(-1))   # -> about -999.6867, the correct result

This uses log(exp(x) + exp(y)) = x + log(1 + exp(y-x)). It is
useful when x and y are in the same range but far from 0, which happens
a lot in practice in many machine learning algorithms (EM, SVM, etc.:
everywhere you need to compute the likelihood of densities from the
exponential family, which covers most practical cases of parametric
estimation).
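
The same identity extends to any number of terms by factoring out the
largest element. Below is a minimal sketch, assuming a 1-D array of
log-values; the function name logsumexp is only illustrative and is not
necessarily how the EM toolbox implements it.

```python
import numpy as np

def logsumexp(a):
    """Return log(sum(exp(a))), factoring out max(a) to avoid underflow."""
    a = np.asarray(a, dtype=float)
    m = np.max(a)
    # If the maximum is -inf, +inf or nan, subtracting it would produce
    # nan, so return it directly.
    if not np.isfinite(m):
        return m
    # Every exponent a - m is <= 0 and the largest is exactly 0, so the
    # sum is at least 1 and the log is well defined.
    return m + np.log(np.sum(np.exp(a - m)))

a = np.array([-1000., -1001.])
print(logsumexp(a))  # close to -1000 + log(1 + exp(-1))
```

Subtracting the maximum rather than an arbitrary element keeps every
exponential in [0, 1], so the sum can neither overflow nor vanish.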

