[SciPy-user] Sparse csr_matrix and column sum
Dinesh B Vadhia
dineshbvadhia@hotmail....
Sun Apr 27 21:32:29 CDT 2008
Thanks Nathan.
Sorry, I wasn't being precise. A.sum(0) didn't work and still doesn't; I've just tried again. However, A.todense().sum(0) does work, but takes a performance hit. My import statements are:
> import numpy
> import scipy
> from scipy import sparse
This means that I have to qualify each function/operation with a numpy. or scipy. prefix. Is the problem that I haven't qualified the statement:
> colSum = A.sum(0)
correctly?
Anyway, A.todense().sum(0) works for small matrices but, because the matrices I'm using are large, it results in a memory error. For I = 20000 and J = 66000, here is the traceback:
Traceback (most recent call last):
  File "C:\... sparseTest.py", line 42
    colSum = A.todense().sum(0) # sum of each column of A
  File "C:\Python25\Lib\site-packages\scipy\sparse\base.py", line 416, in todense
    return asmatrix(self.toarray())
  File "C:\Python25\Lib\site-packages\scipy\sparse\compressed.py", line 627, in toarray
    M = zeros(self.shape, dtype=self.dtype)
MemoryError
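A rough estimate shows why the dense conversion fails at this size. The sketch below assumes 8-byte float64 entries; the actual dtype of A is not stated in this message, so treat the figure as an order-of-magnitude guide only.

```python
import numpy as np

# I and J from the message above
rows, cols = 20000, 66000

# Assumption: the matrix holds float64 values (8 bytes each)
itemsize = np.dtype(np.float64).itemsize

# todense() allocates every entry, zeros included
dense_bytes = rows * cols * itemsize
dense_gib = dense_bytes / 2**30

print(f"{dense_bytes} bytes (~{dense_gib:.1f} GiB)")
```

Under that assumption the dense array would need roughly 10 GB, which is more than a 32-bit Python process can address.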
Is there a way around this?
Cheers
Dinesh
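
For reference, a minimal sketch of summing the columns without ever building the dense matrix. Since sum is a method of the matrix object, it needs no numpy. or scipy. prefix. The sketch assumes a recent SciPy in which csr_matrix.sum(axis=0) is available; the helper column_sums is a hypothetical name introduced here for illustration, and the small test matrix is made up.

```python
import numpy as np
from scipy import sparse


def column_sums(A):
    """Column sums of a sparse matrix as a 1-D array (hypothetical helper).

    Only the stored nonzeros are touched, so no dense copy is allocated.
    """
    A = sparse.csr_matrix(A)
    # In CSR format, A.indices holds the column index of each stored value
    # and A.data holds the value itself; accumulate the values per column.
    return np.bincount(A.indices, weights=A.data, minlength=A.shape[1])


# Small made-up example matrix, mostly zeros
rng = np.random.default_rng(0)
dense = rng.random((4, 5))
dense[dense < 0.6] = 0.0
A = sparse.csr_matrix(dense)

# Built-in method: returns a 1 x J result, flattened here to a 1-D array
col_sum = np.asarray(A.sum(axis=0)).ravel()

# Manual accumulation over the stored nonzeros
col_sum_manual = column_sums(A)
```

Both results should match dense.sum(axis=0) on a matrix small enough to densify, which makes a convenient sanity check before running on the large case.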
--------------------------------------------------------------------------------
From: Nathan Bell <wnbell <at> gmail.com>
Subject: Re: Sparse csr_matrix and column sum
Newsgroups: gmane.comp.python.scientific.user
Date: 2008-04-28 00:08:23 GMT
On Sun, Apr 27, 2008 at 6:41 PM, Dinesh B Vadhia
<dineshbvadhia <at> hotmail.com> wrote:
>
>
> If A is a sparse csr_matrix and you want to calculate the sum of each column
> then the 'normal' method is:
>
> import numpy
> import scipy
> from scipy import sparse
>
> colSum = scipy.asmatrix(scipy.zeros((1,J), dtype=numpy.float))
> colSum = A.mean(0)
>
> This isn't working. Do we have to do something else (eg. a todense()) for a
> sparse matrix? If so, how?
What do you mean by "isn't working"?
In [1]: from scipy import *
In [2]: from scipy.sparse import *
In [3]: A = csr_matrix(rand(3,3))
In [4]: A.todense()
Out[4]:
matrix([[ 0.95297535, 0.81029421, 0.79146232],
[ 0.88477059, 0.9025494 , 0.80259054],
[ 0.06691343, 0.76691617, 0.68518027]])
In [5]: A.mean(0)
Out[5]: matrix([[ 0.63488646, 0.82658659, 0.75974438]])
In [6]: A.mean(1)
Out[6]:
matrix([[ 0.85157729],
[ 0.86330351],
[ 0.50633662]])
In [7]: A.todense().mean(0)
Out[7]: matrix([[ 0.63488646, 0.82658659, 0.75974438]])
In [8]: A.todense().mean(1)
Out[8]:
matrix([[ 0.85157729],
[ 0.86330351],
[ 0.50633662]])
Dinesh, as a courtesy, would you provide specific details when
reporting your problems with SciPy? I'd rather not have to speculate
on the precise nature of each issue raised.
--
Nathan Bell wnbell <at> gmail.com
http://graphics.cs.uiuc.edu/~wnbell/