[SciPy-User] Reading / writing sparse matrices
Lutz Maibaum
lutz.maibaum@gmail....
Thu Nov 11 23:33:16 CST 2010
On Thu, Nov 11, 2010 at 8:58 PM, Matthew Brett <matthew.brett@gmail.com> wrote:
> The problem I can see is that this would be confusing:
>
> a=scipy.sparse.lil_matrix((5,5), dtype=np.uint64)
> a[0,0]=9876543210
> mmwrite(fname, a)
> res = mmread(fname)
> res.data
> array([-2147483648], dtype=int32)
>
> That is, I think the writer shouldn't write something without warning,
> that it will read incorrectly by default. So, how about a
> compromise:
>
> In [7]: mmwrite(fname, a)
> ---------------------------------------------------------------------------
> TypeError Traceback (most recent call last)
> ...
> TypeError: Will not write unsigned integers by default. Please pass
> field="integer" to write unsigned integers
> In [8]: mmwrite(fname, a, field='integer')
> In [9]: res = mmread(fname, dtype=np.uint64)
> In [11]: res.todense()[0,0]
> Out[11]: 9876543210
That's one possibility, but I find it somewhat odd that this would
generate an exception when the matrix is being saved, even though
there is no ambiguity at this stage. It also wouldn't eliminate the
potential for confusion if someone tries to load a matrix that they
didn't save themselves, but got from some other source.
Are there other situations where the automated conversion from mmread
may cause problems? For example, reading a matrix with 64-bit integers
on a system where the default int dtype is only 32 bit?
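To make the range question concrete, here is a small pure-Python sketch (not anything taken from scipy.io) that checks which fixed-width integer types can hold the value from the example above. The helper name int_bounds is made up for illustration.

```python
def int_bounds(bits, signed=True):
    """Return the (min, max) values representable in an integer of this width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


value = 9876543210  # the entry stored in the uint64 matrix in the quoted example

# Report, for each width, whether the value is representable without loss.
for bits, signed in [(32, True), (32, False), (64, True), (64, False)]:
    lo, hi = int_bounds(bits, signed)
    name = f"{'' if signed else 'u'}int{bits}"
    print(f"{name:>6}: fits={lo <= value <= hi}")
```

The value exceeds both 32-bit ranges, so any reader that falls back to a 32-bit integer type cannot represent it, regardless of how the file was written.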
I think it would be ideal if mmread generated a warning or raised
an exception if the numerical value of the generated integer does not
coincide with the string that has been read from the file. I don't know
if that is feasible. Alternatively, one could store additional
information about the integer data type in the Matrix Market header
section as a comment.
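A rough sketch of how the two ideas could fit together, written in plain Python rather than against scipy.io: the writer records the integer type in a comment line, and the reader parses each entry exactly and raises if it does not fit. The "% dtype:" comment convention, the function names, and the int32 default are all assumptions made for illustration; none of this is part of the Matrix Market format or of mmread/mmwrite.

```python
import io

# Value ranges for a few integer type names (illustrative subset).
BOUNDS = {
    "int32": (-2**31, 2**31 - 1),
    "int64": (-2**63, 2**63 - 1),
    "uint64": (0, 2**64 - 1),
}
HINT = "% dtype:"  # hypothetical comment convention, not part of the format


def write_coordinate(stream, shape, entries, dtype_name):
    """Write integer entries in coordinate layout, recording the dtype as a comment."""
    stream.write("%%MatrixMarket matrix coordinate integer general\n")
    stream.write(f"{HINT} {dtype_name}\n")
    stream.write(f"{shape[0]} {shape[1]} {len(entries)}\n")
    for (row, col), value in entries.items():
        stream.write(f"{row + 1} {col + 1} {value}\n")  # indices are 1-based


def read_coordinate(stream, default_dtype="int32"):
    """Read the entries back, raising if a value does not fit the target dtype."""
    dtype_name = default_dtype
    lines = iter(stream)
    next(lines)  # skip the %%MatrixMarket banner
    for line in lines:
        if line.startswith(HINT):
            dtype_name = line[len(HINT):].strip()
        elif line.strip() and not line.startswith("%"):
            break  # first non-comment line is the size line; entries follow
    lo, hi = BOUNDS[dtype_name]
    entries = {}
    for line in lines:
        if not line.strip():
            continue
        row, col, token = line.split()
        value = int(token)  # exact: Python integers have arbitrary precision
        if not lo <= value <= hi:
            raise OverflowError(f"{token} does not fit in {dtype_name}")
        entries[(int(row) - 1, int(col) - 1)] = value
    return dtype_name, entries


buf = io.StringIO()
write_coordinate(buf, (5, 5), {(0, 0): 9876543210}, "uint64")
buf.seek(0)
print(read_coordinate(buf))
```

A file without the hint line falls back to default_dtype, so the same entry would raise OverflowError instead of being silently converted to a wrong value.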
I understand that these solutions would require much more thought.
Your solution would be a nice initial patch.
Thanks,
Lutz