[SciPy-dev] feedback on scipy.sparse
Robert Cimrman
cimrman3@ntc.zcu...
Thu Dec 13 03:21:58 CST 2007
Hi Nathan,
thanks for pushing scipy.sparse forwards!
Nathan Bell wrote:
> ===== Constructors =====
>
> Here are the current constructors for the various sparse classes:
>
> csr_matrix and csc_matrix
> def __init__(self, arg1, dims=None, dtype=None, copy=False):
>
> dok_matrix and lil_matrix
> def __init__(self, A=None, shape=None, dtype=None, copy=False):
>
> coo_matrix
> def __init__(self, arg1, dims=None, dtype=None):
>
> Empty matrices can now be constructed with xxx_matrix( (M,N) ) for
> all formats.
>
> 1) Should we prefer 'dims' over 'shape' or vice versa? IMO 'shape'
> is arguably more natural since all the types have a .shape attribute
Yes, please.
> 2) It would be nice if xxx_matrix( A ) always worked when A is a
> sparse or dense matrix. Does anyone object to this? The
> functionality is already present (through the various .toxxx() methods)
+1.
> 3) When the user defines the dim (or shape) argument but the data
> violates these bounds, what should happen? IMO this merits an
> exception, as opposed to expanding the dimensions to accommodate the
> data.
IMHO scipy.sparse should not assume anything that the user has not
explicitly asked for -> I am for an exception.
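To make the three constructor points concrete, here is a minimal pure-Python sketch of the behaviour being agreed on: a 'shape' keyword, construction from a shape tuple, a dense matrix or another sparse matrix, and an exception when the data do not fit the requested shape. ToyCOO and its attributes are hypothetical names for illustration only, not the scipy.sparse implementation.

```python
class ToyCOO:
    """Hypothetical coordinate-format matrix; not a scipy.sparse class."""

    def __init__(self, arg1, shape=None):
        if isinstance(arg1, ToyCOO):
            # Copy-construct from another sparse matrix.
            entries = dict(arg1.entries)
            inferred = arg1.shape
        elif isinstance(arg1, tuple):
            # A bare (M, N) tuple means an empty M-by-N matrix.
            entries = {}
            inferred = arg1
        else:
            # Otherwise treat arg1 as a dense list of rows; keep the nonzeros.
            entries = {(i, j): v
                       for i, row in enumerate(arg1)
                       for j, v in enumerate(row) if v != 0}
            inferred = (len(arg1), len(arg1[0]) if arg1 else 0)

        self.shape = inferred if shape is None else tuple(shape)

        # Refuse data that fall outside an explicitly requested shape
        # instead of silently growing the matrix.
        for (i, j) in entries:
            if i >= self.shape[0] or j >= self.shape[1]:
                raise ValueError("entry (%d, %d) lies outside shape %r"
                                 % (i, j, self.shape))
        self.entries = entries
```

With this sketch, ToyCOO((2, 3)) gives an empty 2x3 matrix, while ToyCOO([[1, 0], [0, 2]], shape=(1, 1)) raises ValueError rather than expanding to 2x2.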
> ===== sparse.py and sparse functions =====
>
> sparse.py currently weighs in at nearly 3000 lines and will continue
> growing. I propose that we move the functions (e.g. spidentity(),
> spdiags(), spkron(), etc. ) to a separate file. Any comments or
> proposals for the name of this file? Would it be prudent to move the
> classes into separate files also?
sputils? Splitting into class files sounds good.
> Also, these functions always return a specific sparse format. For
> example spidentity() always returns a csc_matrix, spkron() always
> returns a coo_matrix, etc. Currently, a user who wanted the identity
> matrix in CSR format would have to do a CSC->CSR conversion on the
> result of spidentity(). This is somewhat wasteful since the
> spidentity() could easily have generated the CSR format instead. It
> would be better to allow the user to specify the desired return type
> in the function call. For example,
> spidentity(n, dtype='d',format='csr')
> instead of
> spidentity(n, dtype='d').tocsr()
> Sometimes a given function has a very natural return type. For
> instance, when we have a dia_matrix() implementation (I'm working on
> one) then spdiags() would naturally use this format. If the user
> specified another type, spdiags( ..., format='csr') then spdiags()
> would, at worst, create the matrix in DIA format first and then
> convert to CSR (with dia_matrix.tocsr() ). I like this approach
> because it allows the implementation to be clever when cleverness is
> possible, but also doesn't place an undue burden on the programmer
> when implementing a new method. Furthermore, it shields the user from
> internal implementation changes that might change the default return
> format.
Good idea!
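A rough sketch of how a 'format' argument lets the function build the requested layout directly instead of converting afterwards. The dictionary layout and the name make_identity are assumptions made up for this illustration; they only mimic the usual indptr/indices/data and row/col/data arrays.

```python
def make_identity(n, format="csc"):
    """Return an n-by-n identity in a toy sparse representation."""
    if format in ("csr", "csc"):
        # For the identity the CSR and CSC arrays coincide, so either
        # layout can be generated directly with no conversion step.
        return {"format": format, "shape": (n, n),
                "indptr": list(range(n + 1)),
                "indices": list(range(n)),
                "data": [1.0] * n}
    if format == "coo":
        return {"format": "coo", "shape": (n, n),
                "row": list(range(n)),
                "col": list(range(n)),
                "data": [1.0] * n}
    raise ValueError("unknown format: %r" % (format,))
```

A caller who wants CSR then writes make_identity(n, format="csr") and never pays for a CSC->CSR conversion; the default stays an implementation detail that can change without breaking such callers.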
Concerning Stefan's idea of static methods for spidentity etc., we
could use only one method for all of them, e.g.

class spmatrix:
    @staticmethod
    def special(name, n, format=...):
        if name == 'identity':
            return spidentity(n, format=format)
        ...

to prevent cluttering of the class you mentioned.
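One possible variation on the single-method idea, sketched under my own assumptions: keep a name -> builder table so that new special matrices can be added without growing an if/elif chain. ToyMatrix, register and toy_identity are hypothetical names, and the builder returns a placeholder tuple rather than a real sparse matrix.

```python
class ToyMatrix:
    """Hypothetical stand-in for spmatrix."""

    _special = {}  # maps a name such as 'identity' to its builder function

    @classmethod
    def register(cls, name):
        """Decorator that records a builder under the given name."""
        def decorator(func):
            cls._special[name] = func
            return func
        return decorator

    @classmethod
    def special(cls, name, *args, **kwargs):
        """Single entry point: look up the builder and forward the arguments."""
        try:
            builder = cls._special[name]
        except KeyError:
            raise ValueError("unknown special matrix: %r" % (name,))
        return builder(*args, **kwargs)


@ToyMatrix.register("identity")
def toy_identity(n, format="csc"):
    # Placeholder result; a real builder would return a sparse matrix.
    return ("identity", n, format)
```

Usage would then look like ToyMatrix.special("identity", 3, format="csr"), and the class itself gains only the one public method.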
best,
r.