[SciPy-dev] feedback on scipy.sparse
Thu Dec 13 03:21:58 CST 2007
Thanks for pushing scipy.sparse forward!
Nathan Bell wrote:
> ===== Constructors =====
> Here are the current constructors for the various sparse classes:
> csr_matrix and csc_matrix
> def __init__(self, arg1, dims=None, dtype=None, copy=False):
> dok_matrix and lil_matrix
> def __init__(self, A=None, shape=None, dtype=None, copy=False):
> def __init__(self, arg1, dims=None, dtype=None):
> Empty matrices can now be constructed with xxx_matrix( (M,N) ) for
> all formats.
> 1) Should we prefer 'dims' over 'shape' or vice versa? IMO 'shape'
> is arguably more natural since all the types have a .shape attribute
> 2) It would be nice if xxx_matrix( A ) always worked when A is a
> sparse or dense matrix. Does anyone object to this? The
> functionality is already present (through the various .toxxx() methods)
> 3) When the user defines the dim (or shape) argument but the data
> violates these bounds, what should happen? IMO this merits an
> exception, as opposed to expanding the dimensions to accommodate the
IMHO scipy.sparse should not assume anything that the user has not asked
for explicitly -> I am for an exception.
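To make the preference for point 3 concrete, here is a minimal sketch of the kind of check I have in mind. The helper name check_bounds and its arguments are hypothetical, not existing scipy.sparse code; it assumes the data arrives as parallel lists of row and column indices.

```python
def check_bounds(rows, cols, shape):
    """Hypothetical helper: refuse entries that fall outside `shape`.

    `rows` and `cols` are parallel sequences of integer indices and
    `shape` is the (M, N) tuple the user passed explicitly.
    """
    M, N = shape
    for i, j in zip(rows, cols):
        if not (0 <= i < M and 0 <= j < N):
            # Raise instead of silently growing the matrix.
            raise ValueError(
                "entry (%d, %d) lies outside declared shape %r" % (i, j, shape)
            )
    return shape
```

With this, check_bounds([0, 1], [0, 2], (2, 3)) passes, while an entry at row 5 in a (2, 3) matrix raises ValueError rather than expanding the dimensions.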
> ===== sparse.py and sparse functions =====
> sparse.py currently weighs in at nearly 3000 lines and will continue
> growing. I propose that we move the functions (e.g. spidentity(),
> spdiags(), spkron(), etc. ) to a separate file. Any comments or
> proposals for the name of this file? Would it be prudent to move the
> classes into separate files also?
sputils? Splitting into class files sounds good.
> Also, these functions always return a specific sparse format. For
> example spidentity() always returns a csc_matrix, spkron() always
> returns a coo_matrix, etc. Currently, a user who wanted the identity
> matrix in CSR format would have to do a CSC->CSR conversion on the
> result of spidentity(). This is somewhat wasteful since the
> spidentity() could easily have generated the CSR format instead. It
> would be better to allow the user to specify the desired return type
> in the function call. For example,
> spidentity(n, dtype='d',format='csr')
> instead of
> spidentity(n, dtype='d').tocsr()
> Sometimes a given function has a very natural return type. For
> instance, when we have a dia_matrix() implementation (I'm working on
> one) then spdiags() would naturally use this format. If the user
> specified another type, spdiags( ..., format='csr') then spdiags()
> would, at worst, create the matrix in DIA format first and then
> convert to CSR (with dia_matrix.tocsr() ). I like this approach
> because it allows the implementation to be clever when cleverness is
> possible, but also doesn't place an undue burden on the programmer
> when implementing a new method. Furthermore, it shields the user from
> internal implementation changes that might change the default return
Concerning Stefan's idea of static methods for spidentity etc., we
could use only one method for all of them, e.g.
def special(name, format=...):
    if name == 'identity':
        ...
to prevent cluttering of the class you mentioned.
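A rough sketch of how such a single entry point could combine with the format keyword proposed above. Everything here is an assumption for illustration: plain dicts stand in for the sparse classes, the extra argument n and the builder names are hypothetical, and only 'identity' is covered.

```python
def identity_coo(n):
    """Identity as COO triplets (row, col, data)."""
    idx = list(range(n))
    return {"format": "coo", "shape": (n, n),
            "row": idx, "col": list(idx), "data": [1.0] * n}


def identity_csr(n):
    """Identity built directly in CSR layout, so no conversion is needed."""
    return {"format": "csr", "shape": (n, n),
            "indptr": list(range(n + 1)),   # one stored entry per row
            "indices": list(range(n)),      # column index of each entry
            "data": [1.0] * n}


# Hypothetical registry: (matrix name, format) -> builder function.
_BUILDERS = {
    ("identity", "coo"): identity_coo,
    ("identity", "csr"): identity_csr,
}


def special(name, n, format="csr"):
    """Single dispatcher for special matrices in the requested format."""
    try:
        builder = _BUILDERS[(name, format)]
    except KeyError:
        raise ValueError("unsupported matrix %r in format %r" % (name, format))
    return builder(n)
```

Here special('identity', 3, format='csr') produces the CSR arrays directly, and an unknown name or format raises ValueError instead of guessing. New matrices or formats would only need another registry entry.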