[Numpy-discussion] Log Arrays

Charles R Harris charlesr.harris@gmail....
Thu May 8 11:04:28 CDT 2008

On Thu, May 8, 2008 at 9:20 AM, David Cournapeau <cournape@gmail.com> wrote:

> On Thu, May 8, 2008 at 10:20 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > Floating point numbers are essentially logs to base 2, i.e., integer
> > exponent and mantissa between 1 and 2. What does using the log buy you?
> Precision, of course. I am not sure I understand the notation base =
> 2,

lg(x) = ln(x)/ln(2)

> but doing computation in the so called log-domain is a must in many
> statistical computations. In particular, in machine learning with
> large datasets, it is common to have some points whose pdf is
> extremely small, and well below the precision of double.

<  1e-308 ?

> Typically,
> internally, the computations of my EM toolbox are done in the log
> domain, and use the logsumexp trick to compute the likelihood given some
> data:

Yes, logs can be useful there, but I still fail to see any precision
advantage. As I say, to all intents and purposes, IEEE floating point *is* a
logarithm. You will see that if you look at how log is implemented in
hardware. I'm less sure of the C floating point library because it needs to
be portable.
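
The claim that an IEEE double already stores an integer exponent plus a mantissa between 1 and 2 can be checked with the standard library. The following is only an illustrative sketch: the helper name float_as_log2_parts is hypothetical, and it assumes a positive, normal (not subnormal) input.

```python
import math

def float_as_log2_parts(x):
    """Split a positive float into (exponent, mantissa) with 1 <= mantissa < 2."""
    m, e = math.frexp(x)      # x == m * 2**e with 0.5 <= m < 1
    return e - 1, m * 2.0     # rescale so that 1 <= mantissa < 2

exponent, mantissa = float_as_log2_parts(10.0)
# exponent is the integer part of lg(10); the mantissa carries the remainder
lg10 = exponent + math.log2(mantissa)
```

Here the exponent plays the role of the integer part of the base-2 logarithm, and only the mantissa needs a further log2 call.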

> import numpy as np
> a = np.array([-1000., -1001.])
> np.log(np.sum(np.exp(a)))                 # -> -inf (exp underflows to 0)
> -1000 + np.log(np.sum([1 + np.exp(-1)]))  # -> correct result

What realistic probability is in the range exp(-1000) ?

> Where you use log(exp(x) + exp(y)) = x + log(1 + exp(y-x)). It is
> useful when x and y are in the same range but far from 0, which happens
> a lot in practice in many machine learning algorithms (EM, SVM, etc...
> everywhere you need to compute likelihood of densities from the
> exponential family, which covers most practical cases of parametric
> estimation)
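
The identity quoted above generalizes to any number of terms by factoring out the largest element. A minimal sketch follows, assuming a non-empty 1-D input; the function name logsumexp is chosen here for illustration and is not taken from the EM toolbox mentioned in the thread.

```python
import numpy as np

def logsumexp(a):
    """Return log(sum(exp(a))) for a non-empty 1-D array, avoiding underflow."""
    a = np.asarray(a, dtype=float)
    a_max = np.max(a)
    if not np.isfinite(a_max):
        # all entries are -inf, or some entry is +inf or nan: nothing to rescale
        return a_max
    # after shifting by a_max the largest term is exp(0) == 1, so the sum is >= 1
    return a_max + np.log(np.sum(np.exp(a - a_max)))

a = np.array([-1000.0, -1001.0])
result = logsumexp(a)   # finite, equal to -1000 + log(1 + exp(-1)) up to rounding
```

Shifting by the maximum guarantees that at least one term in the sum is exactly 1, so the argument of the logarithm never underflows to zero.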

If you have a hammer... It's portable, but there are wasted cpu cycles in
there. If speed was important, I suspect you could do better writing a low
level function that assumed IEEE doubles and twiddled the bits.
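
The thread does not spell out what such a low-level function would look like. As a rough illustration of the bit layout it would rely on, the sketch below reads the fields of an IEEE 754 double in Python. The helper name ieee_double_fields is hypothetical, and the reconstruction formula in the comment holds for normal numbers only (not zero, subnormals, inf, or nan).

```python
import struct

def ieee_double_fields(x):
    """Return (sign, unbiased_exponent, fraction_bits) of an IEEE 754 double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63
    exponent = ((bits >> 52) & 0x7FF) - 1023   # 11-bit field, bias of 1023
    fraction = bits & ((1 << 52) - 1)          # 52 explicit mantissa bits
    return sign, exponent, fraction

sign, exponent, fraction = ieee_double_fields(10.0)
# For normal numbers: x == (-1)**sign * (1 + fraction / 2**52) * 2**exponent
rebuilt = (1 + fraction / 2**52) * 2.0**exponent
```

A real speed-oriented version would presumably do this in C on the raw 64-bit word; the Python form only shows which bits hold the exponent and which hold the mantissa.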
