[SciPy-User] Normalizing a sparse matrix

Warren Weckesser warren.weckesser@enthought....
Sun Mar 20 05:37:21 CDT 2011


On Sun, Mar 20, 2011 at 2:06 AM, coolhead.pranay@gmail.com <
coolhead.pranay@gmail.com> wrote:

> Hi,
>
> I have a sparse matrix with nearly (300*10000) entries constructed out of
> a 14000*14000 matrix. In each iteration, after performing some operations on
> the sparse matrix (like multiply and dot), I have to divide each row of the
> corresponding dense matrix by the sum of its elements.
>
> Since the sparse matrix format doesn't allow all the required matrix
> operations (divide), I tried to convert it to a dense format and then divide by
> the sum. But this raises a MemoryError exception because a 14000*14000 matrix
> doesn't fit in memory.
>
> Can someone tell me how to normalize a sparse matrix?
>
>

This will normalize the rows of R, a sparse matrix in CSR format:

-----

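# Normalize the rows of R in place (R must have a floating-point dtype).
import numpy as np
row_sums = np.asarray(R.sum(axis=1)).ravel()
# OR: row_sums = R.dot(np.ones(R.shape[1]))
# R.nonzero() lists entries in the same order as R.data, provided R
# stores no explicit zeros (call R.eliminate_zeros() first if unsure).
row_indices, col_indices = R.nonzero()
R.data /= row_sums[row_indices]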

-----
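
A variant of the same idea, offered as a sketch rather than as part of the original reply: the row index of each stored entry can be read from the CSR `indptr` array instead of from `R.nonzero()`, which keeps the index array aligned with `R.data` even if the matrix stores explicit zeros. The function name `normalize_rows_csr` and the choice to leave zero-sum rows unchanged are assumptions made for illustration.

```python
import numpy as np
from scipy import sparse


def normalize_rows_csr(R):
    """Return a float CSR copy of R with each row divided by its sum.

    Hypothetical helper: rows whose sum is zero are left unchanged.
    """
    R = sparse.csr_matrix(R, dtype=float, copy=True)
    row_sums = np.asarray(R.sum(axis=1)).ravel()
    # Avoid dividing by zero for rows whose entries sum to zero.
    row_sums[row_sums == 0] = 1.0
    # np.diff(R.indptr) is the number of stored entries in each row, so
    # repeating each row number that many times gives the row index of
    # every element of R.data, in storage order.
    row_of_entry = np.repeat(np.arange(R.shape[0]), np.diff(R.indptr))
    R.data /= row_sums[row_of_entry]
    return R


R = sparse.csr_matrix(np.array([[1.0, 0.0, 3.0],
                                [0.0, 0.0, 0.0],
                                [2.0, 2.0, 0.0]]))
N = normalize_rows_csr(R)
print(N.toarray())
```

Only the stored entries are touched, so memory use stays proportional to the number of stored entries and no dense 14000*14000 array is ever created.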

The attached code provides an example of that snippet in use.
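
The attachment itself was scrubbed from this archive. As a stand-in (not the original file), here is a minimal sketch that applies the snippet to a small made-up matrix and checks the result; the matrix values are arbitrary and chosen only for illustration.

```python
import numpy as np
from scipy import sparse

# Small made-up matrix; the third row is empty on purpose.
dense = np.array([[0.0, 2.0, 0.0, 6.0],
                  [1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0],
                  [3.0, 3.0, 3.0, 3.0]])
R = sparse.csr_matrix(dense)

# Two equivalent ways to compute the row sums as a 1-D array.
row_sums = np.asarray(R.sum(axis=1)).ravel()
row_sums_alt = R.dot(np.ones(R.shape[1]))

# Divide every stored entry by the sum of the row it belongs to.
row_indices, col_indices = R.nonzero()
R.data /= row_sums[row_indices]

new_sums = np.asarray(R.sum(axis=1)).ravel()
print(R.toarray())
print(new_sums)
```

Each nonempty row should now sum to 1. The empty row has no stored entries, so it is never divided and stays all zeros.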


Warren
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sparse_normalize_rows_example.py
Type: application/octet-stream
Size: 710 bytes
Desc: not available
Url : http://mail.scipy.org/pipermail/scipy-user/attachments/20110320/b7de96b3/attachment.obj 

