[SciPy-User] overflow in .sum() of dtype bool sparse matrix

Yaroslav Halchenko lists@onerussian....
Tue Oct 30 08:38:30 CDT 2012


I wonder whether this is considered a feature, with manual casting being the
general advice in such cases: calling .sum() on a bool sparse matrix can easily
overflow and give bogus results (the same call works fine on ndarrays):

% git describe --tags                                                                   
v0.4.3-6232-g43c7982

% PYTHONPATH=$PWD ../demo-scipy-sparse-negativesoverflow.py
summing 128 booleans in <type 'numpy.ndarray'> leads to answer [128]
summing 128 booleans in <class 'scipy.sparse.csc.csc_matrix'> leads to answer [[-128]]

% cat ../demo-scipy-sparse-negativesoverflow.py
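#!/usr/bin/python
# Demo: summing a boolean sparse matrix overflows; the ndarray sum does not.

import numpy as np
import scipy.sparse as sp

# 128 random values in [0, 1); in practice all nonzero, so all True as bool
test = np.random.rand(128, 1)
test_m = sp.csc_matrix(test)

for t in test, test_m:
    test_bool = t.astype('bool')
    # named "total" to avoid shadowing the builtin sum()
    total = test_bool.sum(axis=0)
    print "summing %d booleans in %s leads to answer %s" \
           % (t.shape[0], test_bool.__class__, total)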
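
For reference, a minimal sketch of the manual-casting workaround mentioned
above. It assumes that casting the sparse matrix to a wider integer dtype
before calling .sum() avoids the wrap-around; int64 and the variable names
are choices made for this illustration, not something the report prescribes,
and the sketch does not check whether other SciPy versions still need the cast:

```python
import numpy as np
import scipy.sparse as sp

# 128 True entries in a single column (deterministic, unlike the random demo)
dense_bool = np.ones((128, 1), dtype=bool)
sparse_bool = sp.csc_matrix(dense_bool)

# Assumed workaround: widen the dtype first so the column sum
# is accumulated in 64-bit integers instead of a narrow type
sparse_total = sparse_bool.astype(np.int64).sum(axis=0)

# Plain ndarray sum, for comparison
dense_total = dense_bool.sum(axis=0)

print("sparse, cast to int64 first: %s" % (sparse_total,))
print("dense ndarray: %s" % (dense_total,))
```

The single-argument print calls are written so that they run under both
Python 2 and Python 3.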



-- 
Yaroslav O. Halchenko
Postdoctoral Fellow,   Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

