[SciPy-User] understanding machine precision
Francesc Alted
faltet@pytables....
Tue Dec 14 13:11:42 CST 2010
A Tuesday 14 December 2010 19:38:12 josef.pktd@gmail.com escrigué:
> On Tue, Dec 14, 2010 at 1:09 PM, Francesc Alted <faltet@pytables.org>
wrote:
> > A Tuesday 14 December 2010 18:42:40 josef.pktd@gmail.com escrigué:
> >> I thought that we get deterministic results, with identical
> >> machine precision errors, but I get (with some random a0, b0)
> >>
> >> >>> for i in range(5):
> >> x = scipy.linalg.lstsq(a0,b0)[0]
> >> x2 = scipy.linalg.lstsq(a0,b0)[0]
> >> print np.max(np.abs(x-x2))
> >>
> >>
> >> 9.99200722163e-016
> >> 9.99200722163e-016
> >> 0.0
> >> 0.0
> >> 9.99200722163e-016
> >>
> >> >>> a0.shape
> >>
> >> (100, 10)
> >>
> >> >>> b0.shape
> >>
> >> (100, 3)
> >>
> >> Why is the result not always the same? just curious
> >
> > That's really funny! Could you please come up with a
> > self-contained example so as to see if others can reproduce that?
>
> Essentially all I did was
>
> a0 = np.random.randn(100,10)
> b0 = a0.sum(1)[:,None] + np.random.randn(100,3)
>
> I copied scipy.linalg pinv, pinv2 and lstsq to a local module, and
> the results where not exactly the same as with the ones in scipy.
> So, I did this check for scipy also.
>
> the attached script produces different results in each run (on
> WindowsXP 32), for example
>
> lstsq
> 1.55431223448e-015
> 1.55431223448e-015
> 0.0
> 1.55431223448e-015
> 1.55431223448e-015
>
> pinv
> 5.20417042793e-017
> 5.20417042793e-017
> 0.0
> 0.0
> 0.0
>
> pinv2
> 0.0
> 0.0
> 5.76795555762e-017
> 5.76795555762e-017
> 4.51028103754e-017
I cannot reproduce that:
lstsq
0.0
0.0
0.0
0.0
0.0
pinv
0.0
0.0
0.0
0.0
0.0
Using numpy without ATLAS. So yes, as others pointed out it seems to be
ATLAS the responsible for this.
I think ATLAS always perform some small calculation in order to
determine the optimal block sizes for its computations. So, my guess is
that, depending on the stress of your machine (and maybe on the phase of
the moon too), these block sizes maybe slightly different, leading to a
different order in computations and, hence, preventing strict
reproducibility.
Anyway, I must confess that running these calculations for *every*
computation strikes me a bit, but provided that the matrices can be
large, it might have some sense (i.e. the cost for block size guessing
is negligible in general).
--
Francesc Alted
More information about the SciPy-User
mailing list