[SciPy-dev] Linalg2 benchmarks
pearu at scipy.org
Fri Apr 5 14:17:25 CST 2002
On 5 Apr 2002, Jochen Küpper wrote:
> Well, see above. Going from 300 x 100x100 to 4 x 500x500 scipy takes
> more or the same time, whereas numpy takes less.
Yes, because in the latter case most of the time is spent in ATLAS
routines and the time spent in interfaces is very small; after all, there
are only 4 calls. But in the former case (300 x 100x100), ATLAS routines
finish more quickly, and since there are lots of calls (300), the
time spent in interfaces becomes noticeable.
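A rough way to picture this is a cost model in which each call pays a fixed interface overhead plus an ATLAS compute time that grows with n. The sketch below is only an illustration: the two constants are made-up assumptions, not measurements, and the n^3 cost is the usual estimate for LU-based routines rather than anything taken from the benchmark.

```python
# Rough cost model: total = calls * (interface overhead + compute time).
# All constants are hypothetical, chosen only to illustrate the trend.

def interface_fraction(n, overhead_s, secs_per_unit):
    """Fraction of one call's time spent in the interface for an n x n problem."""
    compute_s = secs_per_unit * n ** 3  # assume O(n^3) work, as for LU
    return overhead_s / (overhead_s + compute_s)

OVERHEAD_S = 1e-3      # assumed fixed per-call interface cost (seconds)
SECS_PER_UNIT = 5e-9   # assumed compute cost per n^3 unit (seconds)

for calls, n in [(300, 100), (4, 500)]:
    frac = interface_fraction(n, OVERHEAD_S, SECS_PER_UNIT)
    total = calls * (OVERHEAD_S + SECS_PER_UNIT * n ** 3)
    print(f"{calls:4d} calls, n={n:3d}: total {total:.2f}s, interface share {frac:.1%}")
```

With any constants of this kind, the interface share shrinks quickly as n grows, which is why a faster interface matters mostly for many calls on small matrices.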
> pearu> 1) for n->oo, where n is the size of the problem, there would be no
> pearu> difference in speeds as the hard computation is done by the same ATLAS
> pearu> routine.
>
> The data does not oppose that.
But you don't believe me ;-)
> pearu> 2) for n fixed but repeating the computation for c times, then for c->oo
> pearu> you would find that scipy is 2-3 times faster than Numeric. This speed up
> pearu> is gained only because of the f2py generated interface between Python and
> pearu> the ATLAS routines that scipy uses.
>
> That is kind of what the data shows, but is only valid for small n
> though, as you said yourself in 1).
So, I should have said here:
scipy interfaces to ATLAS routines are X times faster than the
corresponding interfaces in Numeric.
To find X, I ran tests with small input data (taking n=2) so that most of
the time is spent in the interfaces rather than in ATLAS routines.
It turns out that the scipy interface is 3-5 times faster than the interface
of Numeric. The test results are included at the end of this message.
For increasing n and c->oo, the difference between scipy and Numeric
becomes smaller because of the reasons explained in 1).
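For illustration, X can be recomputed from the size-2, contiguous timings listed at the end of this message. The dictionary below just copies those numbers; the variable names are hypothetical and are not part of the benchmark script.

```python
# Timings in seconds for 4000 calls (n=2, contiguous input), copied from the
# result tables at the end of this message as (scipy, Numeric) pairs.
timings_n2 = {
    "det": (1.10, 5.20),
    "solve": (1.93, 6.21),
    "inv": (2.02, 8.77),
}

# X = Numeric time / scipy time; at n=2 this mostly measures interface overhead.
speedups = {
    name: t_numeric / t_scipy
    for name, (t_scipy, t_numeric) in timings_n2.items()
}

for name, x in speedups.items():
    print(f"{name:5s}: scipy interface is about {x:.1f}x faster")
```

All three ratios fall in the 3-5 range quoted above.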
Note also that the n's used in these tests are relatively small. If n is
really large, then I would also expect scipy to perform better than
Numeric, because the interfaces in scipy are also optimized to minimize
memory usage.
> pearu> So, I don't find these testing results strange as you commented.
>
> I think it is not a valid comparison to put scipy against lapack_lite,
> considering how easy it is to get numpy to use any LAPACK/BLAS.
> Esp. considering the people on this list.
I agree. However, this comparison is still generally useful: it gives
people a motivation to build numpy with optimized LAPACK/BLAS libraries.
> pearu> More surprising is that sometimes with non-contiguous input
> pearu> data the calculation is actually faster(!) and not slower as I
> pearu> would expect. I have no explanation for this one.
>
> I would assume that gives you a lower bound for the accuracy of these
> benchmarks...
Yes, but curiously enough, with non-contiguous input the calculation is
systematically faster or runs at the same speed, and is only rarely slower.
Pearu
---------------------------
Intel Mobile 400 MHz, 160 MB RAM, Debian Woody with Linux 2.4.14-6,
gcc version 2.95.4, Python 2.1.2-4. Both SciPy and NumPy use ATLAS-3.3.13.
Finding matrix determinant
==============================================
      |    contiguous     |  non-contiguous
----------------------------------------------
 size |  scipy  | Numeric |  scipy  | Numeric
    2 |   1.10  |   5.20  |   1.08  |   5.22   (secs for 4000 calls)
   20 |   0.91  |   4.28  |   1.12  |   3.52   (secs for 2000 calls)
  100 |   1.56  |   3.48  |   1.62  |   4.35   (secs for 300 calls)
  500 |   1.67  |   2.31  |   1.73  |   2.58   (secs for 4 calls)
.
Solving system of linear equations
==============================================
      |    contiguous     |  non-contiguous
----------------------------------------------
 size |  scipy  | Numeric |  scipy  | Numeric
    2 |   1.93  |   6.21  |   1.87  |   6.19   (secs for 4000 calls)
   20 |   1.34  |   3.69  |   1.33  |   3.94   (secs for 2000 calls)
  100 |   1.92  |   3.32  |   1.94  |   4.26   (secs for 300 calls)
  500 |   2.57  |   2.21  |   1.66  |   2.39   (secs for 4 calls)
.
Finding matrix inverse
==============================================
      |    contiguous     |  non-contiguous
----------------------------------------------
 size |  scipy  | Numeric |  scipy  | Numeric
    2 |   2.02  |   8.77  |   1.96  |   8.89   (secs for 4000 calls)
   20 |   1.73  |   5.89  |   1.75  |   6.13   (secs for 2000 calls)
  100 |   4.73  |   9.49  |   4.48  |  10.35   (secs for 300 calls)
  500 |   4.82  |   7.50  |   4.88  |   7.88   (secs for 4 calls)
.
----------------------------------------------------------------------
Ran 35 tests in 180.893s