[SciPy-user] trouble saving sparse matrix

David Warde-Farley dwf@cs.toronto....
Fri Nov 28 02:13:44 CST 2008


On 26-Nov-08, at 11:51 AM, Robin wrote:

> Hi,
>
> I have a large sparse matrix (about 9GB):
>
> In [18]: a.A
> Out[18]:
> <21699x1048575 sparse matrix of type '<type 'numpy.int8'>'
>        with 1035272192 stored elements in Compressed Sparse Column  
> format>
>
> but I am having trouble saving it.
>
> I am on 64 bit linux.
>
> The problem is whatever I try I get :
> SystemError: Negative size passed to PyString_FromStringAndSize
>
> This happens with cPickle.dump, np.save, sp.io.savemat etc.

How exactly are you calling np.save? (Just to be sure.)

Have you tried saving the individual component vectors (x.data,
x.indices, x.indptr)? I usually call np.save() on each of these, as
well as on np.array(x.shape). Equivalently, I've found that

np.savez('mysparsematrix.npz', data=x.data, indices=x.indices,
         indptr=x.indptr, shape=np.array(x.shape))

is a nice way to save sparse matrices. Restoring is then as easy as

f = np.load('mysparsematrix.npz')
mymat = sp.sparse.csc_matrix((f['data'], f['indices'], f['indptr']),
                             shape=tuple(f['shape']))
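
For reference, here is a minimal, self-contained sketch of that round trip on a tiny stand-in matrix. The helper names save_csc and load_csc, the temporary path, and the example values are made up for illustration; I have not tried this on anything near 9GB, so treat it as a sketch rather than a tested recipe for your case.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sparse


def save_csc(path, x):
    """Store the three CSC component arrays plus the shape in one .npz file."""
    np.savez(path, data=x.data, indices=x.indices,
             indptr=x.indptr, shape=np.array(x.shape))


def load_csc(path):
    """Rebuild a csc_matrix from a file written by save_csc (hypothetical helper)."""
    with np.load(path) as f:
        # Indexing the archive reads each array fully into memory,
        # so the result remains valid after the file is closed.
        return sparse.csc_matrix((f['data'], f['indices'], f['indptr']),
                                 shape=tuple(f['shape']))


# Small stand-in matrix, int8 like the one in the question.
small = sparse.csc_matrix(np.array([[0, 1, 0], [2, 0, 3]], dtype=np.int8))

path = os.path.join(tempfile.mkdtemp(), 'mysparsematrix.npz')
save_csc(path, small)
restored = load_csc(path)
```

Storing the shape explicitly matters because trailing empty rows cannot be recovered from data, indices and indptr alone.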

At any rate, calling np.save on them individually might help you  
isolate the problem.
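
Something along these lines could do the per-array check. It is only a sketch: the function name save_components and the output directory are my own inventions, and it assumes the failure shows up as an ordinary Python exception (SystemError is one) rather than a hard crash.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sparse


def save_components(x, directory):
    """Call np.save on each component separately.

    Returns a dict mapping component name to None on success,
    or to the exception that np.save raised.
    """
    parts = {
        'data': x.data,
        'indices': x.indices,
        'indptr': x.indptr,
        'shape': np.array(x.shape),
    }
    results = {}
    for name, arr in parts.items():
        target = os.path.join(directory, name + '.npy')
        try:
            np.save(target, arr)
            results[name] = None
        except Exception as exc:  # includes SystemError
            results[name] = exc
        status = 'ok' if results[name] is None else 'FAILED: %r' % (results[name],)
        # The dtype and byte count hint at which array is too large.
        print(name, arr.dtype, arr.nbytes, 'bytes', status)
    return results


# Tiny stand-in for the real matrix.
x = sparse.csc_matrix(np.eye(4, dtype=np.int8))
outdir = tempfile.mkdtemp()
results = save_components(x, outdir)
```

If only one of the arrays fails (my guess would be data, since it is the largest), that narrows down where the size limit is being hit.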

Regards,

David

