[SciPy-user] scipy.sparse: coo_matrix ignores sum_duplicates=False

Nathan Bell wnbell@gmail....
Mon Oct 13 14:30:29 CDT 2008


On Mon, Oct 13, 2008 at 10:52 AM, James Philbin <philbinj@gmail.com> wrote:
>
> Hmm, I see. This is quite subtle (+ surprising) as to all intents and
> purposes the csr_matrix behaves as if the duplicates had been summed
> whether or not sum_duplicates=True or False. The parameter name
> probably needs to be changed and/or something said in the docstring.
> What I was actually looking for was a way for duplicates to be
> ignored, which I've found with dok_matrix.
>

By "ignored" do you mean that you want only the first or last value to be used?

Summing duplicates when converting COO->CSR is fairly common (e.g.
UMFPACK does it) and quite useful if you're assembling FEM matrices.
Furthermore, regarding duplicate entries as parts of a sum is
necessary if one wants to maintain consistency with matrix-vector
multiplication (i.e. A*x == A.tocsr() * x).  In theory you could change
this as well, but it would be *very* costly.
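
A small sketch of the consistency point above, assuming a reasonably
recent SciPy and NumPy. The index and value arrays are made up for
illustration; entry (0, 0) is deliberately stored twice.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Entry (0, 0) appears twice, with values 1.0 and 2.0 (illustrative data).
row = np.array([0, 0, 1])
col = np.array([0, 0, 1])
data = np.array([1.0, 2.0, 5.0])

A = coo_matrix((data, (row, col)), shape=(2, 2))
x = np.array([1.0, 1.0])

# The COO product adds the contribution of every stored entry,
# so the two (0, 0) entries act as parts of a sum.
y_coo = A.dot(x)

# Converting to CSR sums the duplicates into a single stored entry.
B = A.tocsr()
y_csr = B.dot(x)

print(y_coo, y_csr)   # both products agree
print(B[0, 0])        # the duplicates were summed
```

If the conversion kept only the first or last duplicate instead, the
two products would no longer agree unless the COO product were changed
to match.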

FYI, others have expressed an interest in more general accumulation methods:
http://thread.gmane.org/gmane.comp.python.scientific.devel/7667

I'll think about how to implement this.  It should be straightforward
to do in pure numpy, but I'd want it to be fast for the common cases
Viral listed in the message above.
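
For illustration only, here is one way such an accumulation could be
sketched in pure NumPy. This is not an existing scipy.sparse feature:
the helper name coalesce and its mode argument are hypothetical, and
the sketch relies on np.unique and ufunc.at from current NumPy. It
makes no attempt to be fast.

```python
import numpy as np
from scipy.sparse import coo_matrix

def coalesce(row, col, data, shape, mode="sum"):
    """Hypothetical helper: merge duplicate (row, col) entries.

    mode is one of "sum", "max", "first", "last".
    """
    row = np.asarray(row, dtype=np.int64)
    col = np.asarray(col, dtype=np.int64)
    data = np.asarray(data, dtype=float)

    # Map each (row, col) pair to a single linear index.
    key = row * shape[1] + col

    if mode in ("sum", "max"):
        uniq, inverse = np.unique(key, return_inverse=True)
        if mode == "sum":
            out = np.zeros(len(uniq))
            np.add.at(out, inverse, data)       # unbuffered accumulation
        else:
            out = np.full(len(uniq), -np.inf)
            np.maximum.at(out, inverse, data)
    elif mode == "first":
        # return_index gives the first occurrence of each unique key.
        uniq, idx = np.unique(key, return_index=True)
        out = data[idx]
    elif mode == "last":
        # First occurrence in the reversed arrays is the last in the originals.
        uniq, idx = np.unique(key[::-1], return_index=True)
        out = data[::-1][idx]
    else:
        raise ValueError("unknown mode: %r" % (mode,))

    # Recover row and column indices from the linear index.
    return coo_matrix((out, (uniq // shape[1], uniq % shape[1])), shape=shape)

# Illustrative data: entry (0, 0) is stored twice.
row, col, data = [0, 0, 1], [0, 0, 1], [1.0, 2.0, 5.0]
summed = coalesce(row, col, data, (2, 2), mode="sum").toarray()
first = coalesce(row, col, data, (2, 2), mode="first").toarray()
last = coalesce(row, col, data, (2, 2), mode="last").toarray()
largest = coalesce(row, col, data, (2, 2), mode="max").toarray()
```

The "first" and "last" modes correspond to the two readings of
"ignored" asked about earlier in this message.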

-- 
Nathan Bell wnbell@gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

