[SciPy-dev] Sparse matrix module

Travis Oliphant oliphant at ee.byu.edu
Mon Oct 24 10:03:17 CDT 2005


Ed Schofield wrote:

>Robert Cimrman wrote:
>  
>
>>Ed Schofield wrote:
>>    
>>
>
>  
>
>>>I wrote in the SVN change log:
>>>
>>>This is the beginning of a merge of Roman Geus's PySparse into scipy.
>>>The goals are to make it more 'Pythonic', integrate sparse matrix types
>>>into scipy so they act similarly to dense arrays, and to support more
>>>a few more sparse data types, particularly CSC, in a nice OO hierarchy.
>>>      
>>>
>>Does this basically mean to make Roman's spmatrix extension type also
>>a subclass of the original Travis' spmatrix? Or would you prefer to
>>keep scipy-ized PySparse separately and ultimately (merge &) replace
>>the original?
>>...
>>
>>Here is a little summary of the situation as I see it now:
>>
>>existing module (TravisSparse):
>>- Python spmatrix class + a number of Python subclasses with a
>>possible fortran/C underlying implementation
>>- pros: trivial addition of new methods / attributes, speeding up
>>could be done "later"; handles (or will handle :-) all numeric scipy
>>types (ints, floats, doubles, ...) automagically
>>- cons: speed in some situations?
>>
>>experimental module (RomanSparse) (random remarks, since I know nuts
>>about it):
>>- C extension class
>>- pros: constructor speed? and speed in general?
>>- cons: not so easy to change the low-level implementation, 'cause
>>it's on that level already?; the solvers (umfpack etc.) are
>>hand-wrapped - I would certainly prefer a generated interface (swig?)
>>which would be AFAIK much more flexible.
>>
>>I am personally in favor of the TravisSparse approach as the base with
>>RomanSparse subclass of spmatrix with a key feature "*speed* *speed*
>>*speed*". Also the _solvers_ should be split as much as practical from
>>the sparse matrix _type_. Of course, having some 'recommended format
>>hinting' (e.g. CSR for umfpack), that would tell the user "use this if
>>you want speed and to avoid implicit matrix conversion" would be
>>necessary.
>>    
>>
This is already the case.  Currently the CSC type is used for SuperLU 
decomposition, and any matrix with a matvec method can be used with the 
iterative approaches.

>>    
>>
>Yes, I agree that a Python implementation would be simpler and more
>flexible than one in C.  Perhaps our goal should be to build on the
>existing TravisSparse module but replace the sparsetools Fortran code
>with code from PySparse, which is probably better tested and debugged. 
>  
>
Let's not through the baby out with the bath water :-) I spent a bit of 
time checking that code, and I don't know that PySparse code has the 
same functionality.

If speed is the issue, then we first need to understand what is actually 
"slowing" us down.  We are basing things off a few comments.  

My suspicion is that speed improvements will be seen by using 
linked-lists for sparse matrix construction and perhaps a faster 
back-end for matrix factorization (I'm not sure how umfpack compares to 
SuperLU).  

>I've had a look at how objects with both C and Python member functions
>are possible in the Python standard library.  An example is the random
>module: there's a randommodule.c file that is imported in the Python
>random.py file and used as a base class for Python objects:
>  
>

We could do this, but I'm really not sure of the improvement.   The 
problem is that the different sparse matrices need such different 
structures underneath that there is not much benefit to having the base 
class in C.

However, you could take the typeobjects that PySparse defines and use 
them as mix-in classes, so that the Python csc_matrix class inherited 
from both the spmatrix base-class and the C-based class.

>Why don't we adopt a similar structure?  Then we'd use Travis's spmatrix
>as the base class, and derive csr_matrix from both spmatrix and an
>object we import from C.  
>
I see that you thought of it already.

>Or is it possible for a C module to derive
>directly from a Python class?!
>  
>
This is possible, but not particularly easy.    Let's solidify the 
desired structure and find out where the bottlenecks actually are before 
moving too much to C.

-Travis




More information about the Scipy-dev mailing list