[SciPy-user] fast max() on sparse matrices

nicky van foreest vanforeest@gmail....
Mon Jan 5 15:43:53 CST 2009


Hi,

A few days ago I ran into exactly the same problem, and solved it by
taking the max of the stored values (B.data), just as suggested below.
However, it took me a few minutes to figure this out, and of course I
first tried the max() function. I therefore suggest adding a max
function to the sparse class. Is there a reason not to do so?
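
For illustration, here is a minimal sketch of what such a function might
do. It is not an existing SciPy API: the name sparse_max is hypothetical,
the code is written for Python 3, and it assumes the matrix holds no
duplicate entries (true for a CSR/CSC matrix built from a dense array). It
takes the max of the stored values and also handles the corner case Bill
mentions below, where an implicit zero is the largest entry.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sparse_max(M):
    """Max over all entries of a sparse matrix, implicit zeros included.

    Hypothetical helper, not part of SciPy. Assumes M exposes .data and
    .shape (CSR, CSC, COO) and stores no duplicate entries.
    """
    n_rows, n_cols = M.shape
    n_total = n_rows * n_cols
    if n_total == 0:
        raise ValueError("max of a matrix with no entries is undefined")

    stored = M.data                  # dense array of explicitly stored values
    if stored.size == 0:
        return M.dtype.type(0)       # every entry is an implicit zero

    largest = stored.max()
    if stored.size < n_total:
        # At least one entry is an implicit zero, so zero is a candidate.
        largest = max(largest, 0)
    return largest


A = np.array([[1, 2, 3], [1, 0, 0], [4, 5, 0]])
print(sparse_max(csr_matrix(A)))     # 5

N = np.array([[-1, -2], [0, -3]])
print(sparse_max(csr_matrix(N)))     # 0, the implicit zero wins
```

The only loop here is the one inside the NumPy max() call, so the cost
grows with the number of stored values rather than with the full
rows-times-columns size.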

bye

Nicky

2009/1/5 Peter Skomoroch <peter.skomoroch@gmail.com>:
> I knew I overlooked something simple :)  Thanks Bill
>
>
>>>> from numpy import array
>>>> from scipy.sparse import csr_matrix, csc_matrix
>>>> A = array([[1,2,3],[1,0,0],[4,5,0]])
>>>> A
> array([[1, 2, 3],
>        [1, 0, 0],
>        [4, 5, 0]])
>>>> B = csr_matrix(A)  # just for this simple example, construct with COO for speed
>>>> B
> <3x3 sparse matrix of type '<type 'numpy.int32'>'
>     with 6 stored elements in Compressed Sparse Row format>
>>>> print B
>   (0, 0)    1
>   (0, 1)    2
>   (0, 2)    3
>   (1, 0)    1
>   (2, 0)    4
>   (2, 1)    5
>>>> B.data
> array([1, 2, 3, 1, 4, 5])
>>>> max(B.data)
> 5
>
>
>
> On Sun, Jan 4, 2009 at 9:48 PM, Bill Baxter <wbaxter@gmail.com> wrote:
>>
>> On Mon, Jan 5, 2009 at 11:37 AM, Peter Skomoroch
>> <peter.skomoroch@gmail.com> wrote:
>> > Does anyone have suggestions on a fast max() function for sparse
>> > matrices
>> > (COO, CSC, or CSR format)?
>> >
>> > I was thinking of slicing CSC or CSR matrices, and iterating through the
>> > columns, but I suspect any loop based approach will be slow.
>> >
>> > def sparse_amax(V):
>> >     """Returns the max of a sparse CSR matrix V with shape (n,m)
>> >     n = dimensionality of examples (# rows),
>> >     m = number of examples (# columns) """
>> >     n,m = V.shape
>> >     # if type is CSR, slice by rows
>> >     maxvals = []
>> >     for row in xrange(n):
>> >         #find max of row
>> >         maxvals.append(max(array(V[row,:].todense())[0]))
>> >     Vmax = max(maxvals)
>> >     return Vmax
>>
>> The CSC and CSR formats both internally store a dense array of all the
>> non-zero values.
>> I'm not sure what the Python interface looks like in SciPy's versions,
>> but if there's a way to get at that values array, then you can just do
>> the max of that.  (But don't forget the corner case of an unset
>> implicit zero value being the max).
>>
>> --bb
>> _______________________________________________
>> SciPy-user mailing list
>> SciPy-user@scipy.org
>> http://projects.scipy.org/mailman/listinfo/scipy-user
>
>
>
> --
> Peter N. Skomoroch
> peter.skomoroch@gmail.com
> http://www.datawrangling.com
> http://del.icio.us/pskomoroch
>

