[SciPy-User] linear algebra: quadratic forms without linalg.inv

Souheil Inati souheil.inati@nyu....
Mon Nov 2 08:53:25 CST 2009

On Nov 2, 2009, at 4:37 AM, Sturla Molden wrote:

> josef.pktd@gmail.com skrev:
>> Good, I didn't realize this when I worked on the eig and svd  
>> versions of
>> the pca. In a similar way, I was initially puzzled that pinv can be  
>> used
>> on the data matrix or on the covariance matrix (only the latter I  
>> have seen
>> in books).
> I'll try to explain... If you have a matrix C, you can factorize like
> this, with Sigma being a diagonal matrix:
>   C = U * Sigma * V'
>>>> u,s,vt = np.linalg.svd(c)
> If C is square (rank n x n), we now have the inverse
>   C**-1 = V * [S**-1] * U'
>>>> c_inv = np.mat(vt.T) * np.mat(np.eye(4)/s) * np.mat(u.T)
> And here you have the pathology diagnosis:
> A small value of s, will cause a huge value of 1/s. This is
> "ill-conditioning" that e.g. happens with multicolinearity. You get a
> small s, you divide by it, and rounding error skyrockets. We can  
> improve
> the situation by editing the tiny values in Sigma to zero. That just
> changes C by a tiny amount, but might have a dramatic stabilizing  
> effect
> on C**-1. Now you can do your LU and not worry. It might not be clear
> from statistics textbooks why multicolinearity is problem. But using
> SVD, we see both the problem and the solution very clearly: A small
> singular value might not contribute significantly to C, but could or
> severly affect or even dominate in C**-1. We can thus get a biased but
> numerically better approximation to C**-1 by deleting it from the
> equation. So after editing s, we could e.g. do:
>>>> c_fixed = np.mat(u) * np.mat(np.eye(4)*s) * np.mat(vt)
> and continue with LU on c_fixed to get the quadratic form.
> Also beware that you can solve
>   C * x = b
> like this
>   x = (V * [S**-1]) * (U' * b)
> But if we are to reapeat this for several values of b, it would make
> more sence to reconstruct C and go for the LU. Soving with LU also
> involves two matrix multiplications:
>   L * y = b
>   U * x = y
> but the computational demand is reduced by the triangular structure  
> of L
> and U.
> Please don't say you'd rather preprocess data with a PCA. If C was a
> covariance matrix, we just threw out the smallest principal components
> out of the data. Deleting tiny singular values is in fact why PCA  
> helps!
> Also beware that
>   pca = lambda x : np.linalg.svd(x-x.mean(axis=0), full_matrices=0)
> So we can get PCA from SVD without even calculating the covariance.  
> Now
> you have the standard deviations in Sigma, the principal components in
> V, and the factor loadings in U. SVD is how PCA is usually computed.  
> It
> is better than estimating Cov(X), and then apply Jacobi rotations to  
> get
> the eigenvalues and eigenvectors of  Cov(X). One reason is that Cov(X)
> should be estimated using a "two-pass algorithm" to cancel  
> accumulating
> rounding error (Am Stat, 37: p.  242-247). But that equation is not
> shown in most statistics textbooks, so most practitioners tend to not
> know of it .
> We can solve the common least squares problem using an SVD:
>   b = argmin { || X * b - Y  ||  **  2 }
> If we do an SVD of X, we can compute
>   b = sum( ((u[i,:] * Y )/s[i]) * vt[:,i].T )
> Unlike the other methods of fitting least squares, this one cannot  
> fail.
> And you also see clearly what a PCA will do:
>   Skip "(u[i,:] * Y )/s[i]" for too small values of s[i]
> So you can preprocess with PCA anf fit LS in one shot.
> Ridge-regression (Tychonov regularization) is another solution to the
> multicollinearity problem:
>    (A'A + lambda*I)*x = A'b
> But how would you choose the numerically optimal value of lambda? It
> turns out to be a case of SVD as well. Goloub & van Loan has that on
> page 583.
> QR with column pivoting can be seen as a case of SVD. Many use this  
> for
> least-squares, not even knowing it is SVD. So SVD is ubiquitous in  
> data
> modelling, even if you don't know it. :-)
> One more thing: The Cholesky factorization is always stabile, the LU  
> is
> not. But don't be fooled: This only applies to the facotization  
> itself.
> If you have multicolinearity, the problem is there even if you use
> Cholesky. You get the "singular value disease" (astronomic rounding
> error) when you solve the triangular system. A Cholesky can tell you  
> if
> a covariance matrix is singular at your numerical precision. An SVD  
> can
> tell you how close to singularity it is, and how to fix it. SVD  
> comes at
> a cost, which is  slower  computation. But usually it is worth the  
> extra
> investment in CPU cycles.
> Sturla Molden

I agree with Sturla's comment's above 100%.    You should almost  
always use SVD to understand your linear system properties.   For  
least squares fitting QR is the modern, stable algorithm of choise.   
(see for example the matlab \ operator).   It's really a crime that we  
don't teach SVD and QR.

There are two sources of error: 1. noise in the measurement and 2.  
noise in the numerics (rounding, division, etc.).   A properly  
constructed linear system solver will take care of the second type of  
error (rounding, etc.).  If your system is ill-conditioned, then you  
need to control the inversion so that the signal is maintained and the  
noise is not amplified too much.  In the overwhelming majority of  
applications, the SNR isn't better than 1000:1.  If you know your the  
relative size of your noise and signal, then you can control the SNR  
in your parameter estimates by choosing the svd truncation (noise  
amplification factor).

For those of you that want an accessible reference for numerical  
stability in linear algebra, this book is a must read:
Numerical Linear Algebra, Lloyd Trefethen



Souheil Inati, PhD
Research Associate Professor
Center for Neural Science and Department of Psychology
Chief Physicist, NYU Center for Brain Imaging
New York University
4 Washington Place, Room 809
New York, N.Y., 10003-6621
Office: (212) 998-3741
Fax:     (212) 995-4011
Email: souheil.inati@nyu.edu

More information about the SciPy-User mailing list