# [Numpy-discussion] Proposal for matrix_rank function in numpy

Matthew Brett matthew.brett@gmail....
Tue Dec 15 14:16:25 CST 2009

```Hi,

Is it reasonable to summarize that, to avoid confusion, we keep
'matrix_rank' as the name?

I've edited as Robert suggested, attempting to adopt a more suitable
tone in the docstring...

Thanks a lot,

Matthew

def matrix_rank(M, tol=None):
''' Return rank of matrix using SVD method

Rank of the array is the number of SVD singular values of the
array that are greater than `tol`.

Parameters
----------
M : array-like
array of <=2 dimensions
tol : {None, float}
threshold below which SVD values are considered zero. If `tol`
is None, and `S` is an array with singular values for `M`, and
`eps` is the epsilon value for datatype of `S`, then `tol` set
to ``S.max() * eps``.

Examples
--------
>>> matrix_rank(np.eye(4)) # Full rank matrix
4
>>> I=np.eye(4); I[-1,-1] = 0. # rank deficient matrix
>>> matrix_rank(I)
3
>>> matrix_rank(np.zeros((4,4))) # All zeros - zero rank
0
>>> matrix_rank(np.ones((4,))) # 1 dimension - rank 1 unless all 0
1
>>> matrix_rank(np.zeros((4,)))
0
>>> matrix_rank([1]) # accepts array-like
1

Notes
-----
Golub and van Loan [1]_ define "numerical rank deficiency" as using
tol=eps*S[0] (where S[0] is the maximum singular value and thus the
2-norm of the matrix). This is one definition of rank deficiency,
and the one we use here.  When floating point roundoff is the main
concern, then "numerical rank deficiency" is a reasonable choice. In
some cases you may prefer other definitions. The most useful measure
of the tolerance depends on the operations you intend to use on your
matrix. For example, if your data come from uncertain measurements
with uncertainties greater than floating point epsilon, choosing a
tolerance near that uncertainty may be preferable.  The tolerance
may be absolute if the uncertainties are absolute rather than
relative.

References
----------
.. [1] G. H. Golub and C. F. Van Loan, _Matrix Computations_.
Baltimore: Johns Hopkins University Press, 1996.
'''
M = np.asarray(M)
if M.ndim > 2:
raise TypeError('array should have 2 or fewer dimensions')
if M.ndim < 2:
return int(not np.all(M==0))
S = npl.svd(M, compute_uv=False)
if tol is None:
tol = S.max() * np.finfo(S.dtype).eps
return np.sum(S > tol)
```