[Numpy-discussion] Log Arrays
Charles R Harris
charlesr.harris@gmail....
Thu May 8 11:04:28 CDT 2008
On Thu, May 8, 2008 at 9:20 AM, David Cournapeau <cournape@gmail.com> wrote:
> On Thu, May 8, 2008 at 10:20 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > Floating point numbers are essentially logs to base 2, i.e., integer
> > exponent and mantissa between 1 and 2. What does using the log buy you?
>
> Precision, of course. I am not sure I understand the notation base =
> 2,
lg(x) = ln(x)/ln(2)
> but doing computation in the so called log-domain is a must in many
> statistical computations. In particular, in machine learning with
> large datasets, it is common to have some points whose pdf is
> extremely small, and well below the precision of double.
< 1e-308 ?
> Typically,
> internally, the computations in my EM toolbox are done in the log
> domain, and use the logsumexp trick to compute the likelihood given some
> data:
>
Yes, logs can be useful there, but I still fail to see any precision
advantage. As I say, to all intents and purposes, IEEE floating point *is* a
logarithm. You will see that if you look at how log is implemented in
hardware. I'm less sure of the C floating point library because it needs to
be portable.
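To make the "floating point is a logarithm" remark concrete, here is a minimal sketch using only the standard library. The helper name log2_via_frexp is made up for illustration. Note that math.frexp normalizes the mantissa to [0.5, 1) rather than the [1, 2) convention mentioned above, so its exponent is one larger than the IEEE one.

```python
import math

def log2_via_frexp(x):
    """Compute log2(x) for x > 0 from the float's own exponent/mantissa split.

    Hypothetical helper for illustration only.
    """
    # math.frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1.
    m, e = math.frexp(x)
    # The integer exponent is the coarse part of the logarithm;
    # only the mantissa needs a real log evaluation.
    return e + math.log2(m)

print(math.frexp(8.0))       # mantissa and exponent of 8.0
print(log2_via_frexp(8.0))   # should agree with math.log2(8.0)
```

The integer exponent carries most of the logarithm, which is the sense in which a double already stores "log base 2" information.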
>
> a = np.array([-1000., -1001.])
> np.log(np.sum(np.exp(a))) -> -inf
> -1000 + np.log(np.sum([1 + np.exp(-1)])) -> correct result
>
What realistic probability is in the range exp(-1000) ?
>
> Where you use log(exp(x) + exp(y)) = x + log(1 + exp(y-x)). It is
> useful when x and y are in the same range but far from 0, which happens
> a lot in practice in many machine learning algorithms (EM, SVM, etc...
> everywhere you need to compute the likelihood of densities from the
> exponential family, which covers most practical cases of parametric
> estimation)
>
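For readers following along, the identity quoted above generalizes to an array by factoring out the maximum. The following is a minimal sketch of that trick, not the code from the EM toolbox mentioned in the thread; the function name logsumexp and its handling of infinities are assumptions made for this example.

```python
import numpy as np

def logsumexp(a):
    """Return log(sum(exp(a))) without underflow, by factoring out max(a).

    Illustrative sketch only; expects a non-empty array-like of floats.
    """
    a = np.asarray(a, dtype=float)
    m = np.max(a)
    if not np.isfinite(m):
        # All entries -inf gives -inf; an entry of +inf gives +inf.
        return m
    # After shifting, the largest term is exp(0) == 1, so the sum
    # cannot underflow to zero.
    return m + np.log(np.sum(np.exp(a - m)))

a = np.array([-1000.0, -1001.0])
print(logsumexp(a))  # finite, equal to -1000 + log(1 + exp(-1))
```

For the two-element case, more recent NumPy versions also provide np.logaddexp(x, y), which evaluates log(exp(x) + exp(y)) directly.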
If you have a hammer... It's portable, but there are wasted CPU cycles in
there. If speed were important, I suspect you could do better by writing a
low-level function that assumed IEEE doubles and twiddled the bits.
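As a rough illustration of what "twiddling the bits" could look like, the sketch below reads the exponent field straight out of IEEE 754 doubles by reinterpreting them as 64-bit integers. It assumes the platform's float64 is an IEEE double; the function name ieee_exponent is hypothetical, and this is only the exponent-extraction step, not a full fast log.

```python
import numpy as np

def ieee_exponent(x):
    """Return the unbiased binary exponent of each float64 in x.

    Illustrative sketch assuming IEEE 754 doubles. Zeros and subnormals
    come out as -1023, infinities and NaNs as 1024.
    """
    values = np.atleast_1d(np.asarray(x, dtype=np.float64))
    # Reinterpret the same 8 bytes as integers (no numeric conversion).
    bits = values.view(np.int64)
    # Bits 52..62 hold the biased exponent; the mask drops the sign bit.
    biased = (bits >> 52) & 0x7FF
    return biased - 1023

print(ieee_exponent([1.0, 8.0, 0.75]))  # floor(log2(x)) for normal positive x
```

For normal positive inputs the result is floor(log2(x)), so a complete log would only need to add a correction computed from the mantissa bits.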
Chuck