[SciPy-User] understanding machine precision

Francesc Alted faltet@pytables....
Tue Dec 14 13:11:42 CST 2010


A Tuesday 14 December 2010 19:38:12 josef.pktd@gmail.com escrigué:
> On Tue, Dec 14, 2010 at 1:09 PM, Francesc Alted <faltet@pytables.org> 
wrote:
> > A Tuesday 14 December 2010 18:42:40 josef.pktd@gmail.com escrigué:
> >> I thought that we get deterministic results, with identical
> >> machine precision errors, but I get (with some random a0, b0)
> >> 
> >> >>> for i in range(5):
> >>       x = scipy.linalg.lstsq(a0,b0)[0]
> >>       x2 = scipy.linalg.lstsq(a0,b0)[0]
> >>       print np.max(np.abs(x-x2))
> >> 
> >> 
> >> 9.99200722163e-016
> >> 9.99200722163e-016
> >> 0.0
> >> 0.0
> >> 9.99200722163e-016
> >> 
> >> >>> a0.shape
> >> 
> >> (100, 10)
> >> 
> >> >>> b0.shape
> >> 
> >> (100, 3)
> >> 
> >> Why is the result not always the same? just curious
> > 
> > That's really funny!  Could you please come up with a
> > self-contained example so as to see if others can reproduce that?
> 
> Essentially all I did was
> 
>     a0 = np.random.randn(100,10)
>     b0 = a0.sum(1)[:,None] + np.random.randn(100,3)
> 
> I copied scipy.linalg pinv, pinv2 and lstsq to a local module, and
> the results where not exactly the same as with the ones in scipy.
> So, I did this check for scipy also.
> 
> the attached script produces different results in each run (on
> WindowsXP 32), for example
> 
> lstsq
> 1.55431223448e-015
> 1.55431223448e-015
> 0.0
> 1.55431223448e-015
> 1.55431223448e-015
> 
> pinv
> 5.20417042793e-017
> 5.20417042793e-017
> 0.0
> 0.0
> 0.0
> 
> pinv2
> 0.0
> 0.0
> 5.76795555762e-017
> 5.76795555762e-017
> 4.51028103754e-017

I cannot reproduce that:

lstsq
0.0
0.0
0.0
0.0
0.0

pinv
0.0
0.0
0.0
0.0
0.0

Using numpy without ATLAS.  So yes, as others pointed out it seems to be 
ATLAS the responsible for this.

I think ATLAS always perform some small calculation in order to 
determine the optimal block sizes for its computations.  So, my guess is 
that, depending on the stress of your machine (and maybe on the phase of 
the moon too), these block sizes maybe slightly different, leading to a 
different order in computations and, hence, preventing strict 
reproducibility.

Anyway, I must confess that running these calculations for *every* 
computation strikes me a bit, but provided that the matrices can be 
large, it might have some sense (i.e. the cost for block size guessing 
is negligible in general).

-- 
Francesc Alted


More information about the SciPy-User mailing list