[Scipy-tickets] [SciPy] #1889: Discrepancy in non-zeros when constructing sparse matrix

SciPy Trac scipy-tickets@scipy....
Wed Apr 10 14:08:18 CDT 2013

#1889: Discrepancy in non-zeros when constructing sparse matrix
 Reporter:  avneesh  |       Owner:  somebody   
     Type:  defect   |      Status:  new        
 Priority:  normal   |   Milestone:  Unscheduled
Component:  Other    |     Version:  0.9.0      
 Keywords:           |  
 Hello all,

 I notice a somewhat bizarre issue when constructing sparse matrices by
 initializing with 3-tuples (row index, column index, value).

 The following is a slight abstraction of my actual code, but it
 shows the behavior:

 kNN = 10
 dataset_size = 1661165
 rowIdx = np.empty((kNN+1)*dataset_size)
 colIdx = np.empty((kNN+1)*dataset_size)
 vals = np.empty((kNN+1)*dataset_size)
 for i, line in enumerate(data):
   #perform certain operations
 print vals.size, colIdx.size, rowIdx.size
 print vals[np.nonzero(vals)].size
 W = sp.csc_matrix((vals, (rowIdx, colIdx)), shape=(dataset_size,
 print W.nnz

 The printed output I get is the following:
 18272815 18272815 18272815
 18272815
 18272465

 As you can see, 18272815 - 18272465 = 350 elements that should be
 non-zero in the resulting sparse matrix are missing from it.

 I have verified in the rowIdx and colIdx arrays that there are no
 duplicates, i.e., a given (rowIdx, colIdx) pair does not appear twice
 (otherwise two values would map to the same position in the sparse
 matrix).  As per my understanding, I should get 18272815 elements in the
 resulting sparse matrix, but I fall 350 elements short.
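
 For reference, the duplicate check described above can be sketched on toy
 data as follows. This is only an illustration, not the code from the
 ticket: the helper name count_duplicate_pairs and the small arrays are
 hypothetical, and it is written for Python 3. It relies on the fact that
 building a csc_matrix from (value, (row, column)) triples sums entries
 that share a position, so each repeated pair lowers nnz by one.

```python
import numpy as np
import scipy.sparse as sp


def count_duplicate_pairs(row_idx, col_idx, n_cols):
    """Return how many (row, col) pairs repeat an earlier pair (hypothetical helper)."""
    # Cast to integers first: arrays created with np.empty(n) default to
    # float64, which is not a natural dtype for indices.
    rows = np.asarray(row_idx, dtype=np.int64)
    cols = np.asarray(col_idx, dtype=np.int64)
    # Collapse each pair into a single linear index so np.unique can compare them.
    linear = rows * n_cols + cols
    return linear.size - np.unique(linear).size


# Toy data (made up for illustration): position (0, 1) appears twice.
rows = np.array([0, 0, 1, 2])
cols = np.array([1, 1, 2, 0])
vals = np.array([0.5, 0.25, 1.0, 2.0])
n = 3

# Entries that share a position are summed during construction.
W = sp.csc_matrix((vals, (rows, cols)), shape=(n, n))

n_dupes = count_duplicate_pairs(rows, cols, n)
print(vals.size, W.nnz, n_dupes)
```

 If the number of input values minus W.nnz equals the duplicate count,
 the shortfall is explained by repeated positions; if the duplicate count
 is zero, the cause lies elsewhere.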

 Is this expected behavior? Am I doing something wrong?

 I am running 64-bit (x86-64) openSUSE 11.4 Linux with NumPy version 1.5.1,
 SciPy version 0.9.0, and Python 2.7.

Ticket URL: <http://projects.scipy.org/scipy/ticket/1889>