[SciPy-Dev] SVDLIBC for sparse SVDs

Jake Vanderplas vanderplas@astro.washington....
Mon Dec 10 12:29:43 CST 2012

Hi folks,
I just came across a sparse SVD implementation based on SVDLIBC [1] with 
a nice Python wrapper that uses Scipy's csc_matrix type [2]. Scipy 
currently includes a basic iterative sparse SVD implementation based on 
ARPACK (scipy.sparse.linalg.svds), but the implementation is somewhat 
hackish.  The SVDLIBC version uses the same principle as ARPACK -- 
Lanczos factorization -- and in my quick checks it was faster than 
the ARPACK version in some cases.  All the code, including the Python 
wrappers, is released under a BSD license, so it would be fairly 
seamless to include in Scipy.
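
For reference, here is a minimal sketch of how the existing ARPACK-based 
routine is called.  The matrix size, density, and k are arbitrary values 
chosen for illustration, and the dense SVD is there only as a sanity 
check on a matrix this small.  Since I would not rely on the order in 
which svds returns the singular values, the sketch sorts them explicitly.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import svds

# Build a small random sparse matrix (sizes and density are arbitrary).
rng = np.random.RandomState(0)
dense = rng.rand(200, 100)
dense[rng.rand(200, 100) > 0.05] = 0.0   # keep roughly 5% of the entries
A = csc_matrix(dense)

# Compute the k largest singular triplets with the ARPACK-based routine.
k = 5
u, s, vt = svds(A, k=k)

# Sort the triplets by descending singular value.
order = np.argsort(s)[::-1]
u, s, vt = u[:, order], s[order], vt[order, :]

# Cross-check against a dense SVD, which is feasible only at this size.
s_dense = np.linalg.svd(dense, compute_uv=False)[:k]
print(np.allclose(s, s_dense))
```

Any replacement or supplement would presumably need to reproduce at 
least this much: the k largest singular values and the corresponding 
left and right singular vectors of a sparse matrix.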

On the plus side, incorporating SVDLIBC would add some well-tested 
sparse functionality and give users more powerful options.  Where our 
current svds function performs iterations within Python, the SVDLIBC 
implementation performs the iterations directly within the C code.  It 
uses the csc_matrix format internally, so no data copying is involved.  
It could fairly easily supplement or replace our current sparse SVD.

On the minus side, the functionality duplicates what we already have, 
and including it would mean bundling another C package in Scipy.  This 
might cause some linking headaches (what if the user already has a 
different version of SVDLIBC on their system? We experienced this with 
ARPACK) and maintenance overhead (possible added compilation issues; the 
need to keep up with updates to SVDLIBC). Furthermore, sparsesvd is a 
fairly lightweight Python package, and users needing the functionality 
could easily install it with pip if the need arises.

I could be convinced either way, but I thought I'd ask the list: any 
thoughts on whether this would be worth including in Scipy?

[1] http://tedlab.mit.edu/~dr/SVDLIBC/
[2] http://pypi.python.org/pypi/sparsesvd/
