[SciPy-User] Sparse Matrices, summing columns !=sum

Pauli Virtanen pav@iki...
Fri Nov 16 10:57:23 CST 2012


 <gabe.g <at> me.com> writes:
[clip]
> So-- is this a bug or am I doing something wrong?
> 
> I have a sparse matrix that I arrived at through 
> a complicated bunch of calculations which I cannot 
> reproduce here. I will try to find a simpler example of this.
[clip]
> In [170]: X
> Out[170]: 
> <196980x43 sparse matrix of type '<type 'numpy.uint16'>'
>         with 70875 stored elements in Compressed Sparse Row format>

It's an integer overflow: the sparse sum uses the matrix's own
integer type (uint16 here) as the accumulator, so any column total
above 65535 wraps around.
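
A minimal sketch of the wraparound, using a dense array and forcing
the accumulator to uint16 through the `dtype` argument of `sum`, to
mimic an accumulator of the same type as the data. The array `col`
and its length are made up for illustration; they are not taken from
the original matrix.

```python
import numpy as np

# Hypothetical column: 70000 ones stored as uint16 (illustrative only).
col = np.ones(70000, dtype=np.uint16)

# Force the accumulator to the element type; the running total
# exceeds the uint16 maximum (65535) and wraps modulo 65536.
wrapped = col.sum(dtype=np.uint16)

# A wider accumulator holds the full total.
exact = col.sum(dtype=np.uint64)

print(wrapped, exact)
```

The first value is the true total reduced modulo 65536, which is the
kind of silently wrong column sum reported above.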

Dense Numpy arrays however appear to use the platform default integer
as the accumulator:

>>> np.array([[1,1],[1,1]], dtype=np.uint16).sum(axis=0)
array([2, 2], dtype=uint64)

IIRC, this was changed in Numpy at some point to work like this.

The sparse matrices should perhaps also mirror this behavior; at the
least, it would avoid overflows in the most common cases.
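
In the meantime, one possible workaround is to cast the stored values
to a wider integer type before summing. The sketch below assumes the
sparse matrix supports `astype`; the small matrix `X` here is a
hypothetical stand-in, not the matrix from the original report.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in for X: 70000 rows, 2 columns, uint16 values,
# with every row holding a 1 in the first column.
n = 70000
rows = np.arange(n)
cols = np.zeros(n, dtype=np.intp)
data = np.ones(n, dtype=np.uint16)
X = sparse.csr_matrix((data, (rows, cols)), shape=(n, 2))

# Cast to a wider integer type first, so the column sums are
# accumulated in int64 rather than uint16.
col_sums = np.asarray(X.astype(np.int64).sum(axis=0)).ravel()

print(col_sums)
```

The cast copies the stored values, which is usually cheap for a
matrix with few stored elements.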

-- 
Pauli Virtanen


