[SciPy-dev] sparse matrix multiplication - poor performance
Robert Cimrman
cimrman3 at ntc.zcu.cz
Thu Dec 7 06:41:27 CST 2006
Nathan Bell wrote:
> There appears to be something terribly wrong with the current
> implementation of sparse matrix multiplication. Consider the test
> below done in both MATLAB and scipy.
>
> MATLAB is [660.5872 936.8530 702.5506 650.2876] times faster for
> these four problems. Changing N = 100,000 puts the ratio in the
> neighborhood of 4000.
>
> Also, doing "B = A*A.T" takes *hundreds* of times longer than "B =
> A*A" or "B = A.T*A". This is more than the time required for dense
> matrix multiplication (try N=1000).
>
> Can anyone comment on this? Is the SPARSKIT code really that slow?
I use sparse matrices only for solving large linear systems (-> don't
need multiplication), but I would bet that, apart from sparskit speed, a
suboptimal code path is taken.
In any case, A_csr * B_csc (or A_csr * B_csr.T) should be very fast (in
theory), since the sparsity data are in the "right" order.
> Also, I made an improvement [1] to the coo_matrix.tocsr() and
> coo_matrix.tocsr() methods. Is there a reason it hasn't been
> included? I'd commit it myself, but I don't have the privileges.
> Who do I need so speak to about SVN commit access? FWIW I've
> contributed a few bug reports and some additional SWIG wrappers for
> UMFPACK (via Robert Cimrman). The sparse matrix functionality is
> especially important for my work, so I'd like to help on that front.
Your help is very appreciated. While debugging, could you determine
which code paths are really taken in the cases
csr * csc
csr.T * csc
csr * csr,
...
- some speed-up might be gained on python level, imho. If it is not
enough, a sparskit replacement should be considered. Ed Schofield (and
Travis O.) could know more?
> [1] http://projects.scipy.org/scipy/scipy/ticket/326
I have merged your changes, but try to ask the administrator (Jeff
Strunk?) to get SVN write access.
r.
More information about the Scipy-dev
mailing list