[SciPy-User] How to efficiently do dot(dot( A.T, diag(d) ), A ) ?

Pauli Virtanen pav@iki...
Tue Sep 11 13:21:10 CDT 2012


11.09.2012 20:28, Hugh Perkins wrote:
[clip]
> It makes me wonder though.  There is an opensource project called
> 'Eigen', for C++.
> It seems to provide good performance for matrix-matrix multiplication,
> comparable to Intel MKL, and significantly better than ublas
> http://eigen.tuxfamily.org/index.php?title=Benchmark  I'm not sure
> what the relationship is between ublas and BLAS?

Eigen doesn't provide a BLAS interface, so it would be quite a lot of
work to use it.

Moreover, it probably derives some of its speed for small matrices from
compile-time specialization, which is not available via a BLAS interface.

However, OpenBLAS/GotoBLAS could be better than ATLAS; it also seems to
do well in the benchmarks you linked to:

    https://github.com/xianyi/OpenBLAS

If you are on Linux, you can easily swap the BLAS libraries used, like so:

*** OpenBLAS:

LD_PRELOAD=/usr/lib/openblas-base/libopenblas.so.0 ipython
...
In [11]: %timeit e = np.dot(d, c.T)
100 loops, best of 3: 14.8 ms per loop

*** ATLAS:

LD_PRELOAD=/usr/lib/atlas-base/atlas/libblas.so.3gf ipython
In [12]: %timeit e = np.dot(d, c.T)
10 loops, best of 3: 20.8 ms per loop

*** Reference BLAS:

LD_PRELOAD=/usr/lib/libblas/libblas.so.3gf:/usr/lib/libatlas.so ipython
...
In [11]: %timeit e = np.dot(d, c.T)
10 loops, best of 3: 89.3 ms per loop
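To double-check which BLAS your NumPy was built against, you can print its build configuration; note that this reflects build-time linkage, so an LD_PRELOAD override as above will not show up there. A sketch along the lines of the %timeit runs (the matrix sizes here are made up for illustration):

```python
import timeit
import numpy as np

# Build-time BLAS/LAPACK configuration (an LD_PRELOAD swap won't appear here).
np.show_config()

# Time a matrix product comparable to the %timeit runs above.
rng = np.random.default_rng(0)
c = rng.standard_normal((1000, 1000))
d = rng.standard_normal((1000, 1000))
t = min(timeit.repeat(lambda: np.dot(d, c.T), number=5, repeat=2)) / 5
print(f"np.dot(d, c.T): {t * 1e3:.1f} ms per loop")
```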


Yet another thing to watch out for is the possible use of multiple
processors at once (although I'm not sure how much that will matter in
this particular case).
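As for the product in the subject line itself: dot(dot(A.T, diag(d)), A)
never needs diag(d) as an explicit matrix. Scaling the rows of A by d via
broadcasting gives the same result while skipping the m-by-m intermediate.
A minimal sketch (sizes chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 500, 50
A = rng.standard_normal((m, n))
d = rng.standard_normal(m)

# Naive: materializes an m-by-m diagonal matrix and does two full products.
naive = np.dot(np.dot(A.T, np.diag(d)), A)

# Better: scale the rows of A by d via broadcasting, then one product.
fast = np.dot(A.T, d[:, None] * A)

assert np.allclose(naive, fast)
```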

-- 
Pauli Virtanen
