[Numpy-discussion] complex division
David M. Cooke
cookedm at physics.mcmaster.ca
Sun Feb 19 16:25:02 CST 2006
On Sun, Feb 19, 2006 at 03:32:34PM -0700, Tim Hochberg wrote:
>
> While rummaging around Python's complexobject.c looking for code to
> steal for complex power, I came across the following comment relating to
> complex division:
>
> /******************************************************************
> This was the original algorithm. It's grossly prone to spurious
> overflow and underflow errors. It also merrily divides by 0 despite
> checking for that(!). The code still serves a doc purpose here, as
> the algorithm following is a simple by-cases transformation of this
> one:
>
> Py_complex r;
> double d = b.real*b.real + b.imag*b.imag;
> if (d == 0.)
> errno = EDOM;
> r.real = (a.real*b.real + a.imag*b.imag)/d;
> r.imag = (a.imag*b.real - a.real*b.imag)/d;
> return r;
> ******************************************************************/
>
> /* This algorithm is better, and is pretty obvious: first
> divide the
> * numerators and denominator by whichever of {b.real, b.imag} has
> * larger magnitude. The earliest reference I found was to CACM
> * Algorithm 116 (Complex Division, Robert L. Smith, Stanford
> * University). As usual, though, we're still ignoring all IEEE
> * endcases.
> */
>
> The algorithm shown, and maligned, in this comment is pretty much
> exactly what is done in numpy at present. The function goes on to use
> the improved algorithm, which I will include at the bottom of the post.
> It seems nearly certain that using this algorithm will result in some
> speed hit, although I'm not certain how much. I will probably try this
> out at some point and see what the speed hit, but in case I drop the
> ball I thought I'd throw this out there as something we should at least
> look at. In most cases, I'll take accuracy over raw speed (within reason).
>
> -tim
The condition for accuracy on this is
|Z - z| < epsilon |z|
where I'm using Z for the computed value of z=a/b, and epsilon is on the
order of machine accuracy.
As pointed out by Steward (ACM TOMS, v. 11, pg 238 (1985)), this doesn't
mean that the real and imaginary components are accurate. The example he
gives is a = 1e70 + 1e-70i and b=1e56+1e-56i, where z=a/b=1e14 + 1e-99i,
which is susceptible to underflow for a machine with 10 decimal digits
and a exponent range of +-99.
Priest (ACM TOMS v30, pg 389 (2004)) gives an alternative, which I won't
show here, b/c it does bit-twiddling with the double representation. But
it does a better job of handling overflow and underflow in intermediate
calculations, is competitive in terms of accuracy, and is faster (at
least on a 750 MHz UltraSPARC-III ;) than the other algorithms except
for the textbook version. One problem is the sample code is for double
precision; for single or longdouble, we'd have to figure out some magic
constants.
Maybe I'll look into it later, but for now Smith's algorithm is better
than the textbook one we were using :-)
--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca
More information about the Numpy-discussion
mailing list