[SciPy-Dev] 64-bit sparse matrix indices

Nathaniel Smith njs@pobox....
Fri Dec 14 09:54:52 CST 2012


On Fri, Dec 14, 2012 at 9:37 AM, Pauli Virtanen <pav@iki.fi> wrote:
> Hi,
>
> I've been looking a bit at making sparse matrices work with 64-bit
> indices:
>
>     https://github.com/pv/scipy-work/commits/ticket/1307
>
> The motivation is that 32-bit indices on 64-bit machines don't allow
> representing sparse matrices with large nnz.
>
> Option A (currently there) is to allow both int32 and int64 as
> indices, and use the larger one only when required by nnz.
>
> The second option B would be to just use intp for everything.
>
> The problem with A is that I'm far from certain that I found all the
> corner cases yet, and I'm fairly certain there are some undiscovered
> bugs still somewhere. The test suite doesn't yet have the level of
> coverage on this issue I'd be comfortable with.
>
> The problem with B is that on 64-bit systems, it increases the
> memory needs of sparse matrices by about 50%. However, as a solution
> it's more robust and elegant.

One problem with B is if there is code out there which "knows" that
sparse matrices use 32-bit indices. E.g. I can adapt
scikits.sparse.cholmod to handle 64-bit indices, but it will require
code changes, because you have to use different flags when calling the
underlying routines and so far there was no point in it. It looks like
I was paranoid enough that switching to option B would just require
changing ~4 lines of code, and that if you somehow passed 64-bit
indices to the current version then it will downcast and keep going
(not sure if this is better than crashing or not!). But there may well
be other code out there that passes scipy.sparse matrices to
C/Fortran, and if indices suddenly become 64-bit, then that code may
start simply returning nonsense... I'd be concerned, anyway.
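
To make the "downcast and keep going" concern concrete, here is a minimal
sketch of a checked downcast, assuming a wrapper whose underlying routine
only accepts 32-bit indices. The helper name to_int32_indices is
hypothetical (it is not taken from scikits.sparse or scipy); the input
would be something like the indices or indptr array of a CSR/CSC matrix.
It raises instead of silently truncating when a value does not fit:

```python
import numpy as np

def to_int32_indices(idx):
    """Return idx as an int32 array, refusing any lossy conversion.

    Hypothetical helper for illustration only.
    """
    idx = np.asarray(idx)
    if idx.dtype.kind != "i":
        raise TypeError("expected a signed integer index array")
    if idx.dtype == np.int32:
        return idx  # already the width the 32-bit routine expects
    info = np.iinfo(np.int32)
    # An unchecked astype(np.int32) would wrap around silently here.
    if idx.size and (idx.max() > info.max or idx.min() < info.min):
        raise ValueError("index values do not fit in int32")
    return idx.astype(np.int32)
```

Whether raising is better than crashing later is a design choice, but it
at least avoids handing wrapped-around indices to compiled code.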

I guess this is a problem with option A as well, but at least existing
code working on matrices that currently work would keep working. OTOH
option A also means that any future C/Fortran code has to be prepared
to handle both cases. Not really a big deal when working in Cython,
but I hear that some people still use other tools...
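
A rough sketch of what "prepared to handle both cases" could look like on
the Python side, assuming one compiled kernel per index width. Everything
here is made up for illustration: _row_sums is a pure-Python stand-in for
a compiled routine, and the _KERNELS table and row_sums wrapper are
hypothetical names, not anything in scipy:

```python
import numpy as np

def _row_sums(indptr, data):
    # Stand-in for a compiled kernel: sum the stored values in each CSR row.
    n_rows = len(indptr) - 1
    return np.array([data[indptr[i]:indptr[i + 1]].sum() for i in range(n_rows)])

# Hypothetical dispatch table: one entry per supported index dtype.
# In real code the two entries would point at different compiled routines.
_KERNELS = {
    np.dtype(np.int32): ("int32", _row_sums),
    np.dtype(np.int64): ("int64", _row_sums),
}

def row_sums(indptr, data):
    """Pick a kernel by index dtype; reject anything unexpected."""
    indptr = np.asarray(indptr)
    data = np.asarray(data)
    try:
        name, kernel = _KERNELS[indptr.dtype]
    except KeyError:
        raise TypeError("unsupported index dtype: %s" % indptr.dtype)
    return name, kernel(indptr, data)
```

The point is only that the dtype check happens once, at the boundary, and
an unexpected index type fails loudly rather than being reinterpreted.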

Do all the sparse matrix kernels we care about even handle 64-bit
indices? CHOLMOD does, but it takes special setup, and I don't know if
all kernel authors are so careful.

-n

