# [SciPy-user] PCA and Scipy

John Hunter jdhunter at ace.bsd.uchicago.edu
Thu Feb 2 13:32:19 CST 2006

```
>>>>> "Andrew" == Andrew Straw <strawman at astraw.com> writes:

Andrew> PCA is easily implemented using singular value
Andrew> decomposition, which is available in numpy. Sorry I don't
Andrew> have any code available at the moment, but hopefully this
Andrew> gives you a start.

from matplotlib.mlab ....

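# Assumed imports: NumPy stand-ins for the Numeric-era names used below.
from numpy import divide, dot as matrixmultiply, sum as asum, transpose
from numpy.linalg import svd

def prepca(P, frac=0):
    """
    Compute the principal components of P.  P is a numVars x
    numObservations numeric array.  frac is the minimum fraction of
    variance that a component must contain to be included.

    Return values are
    Pcomponents : a num components x num observations numeric array
    Trans       : the weights matrix, i.e., Pcomponents = Trans*P
    fracVar     : the fraction of the variance accounted for by each
                  component returned
    """
    U,s,v = svd(P)
    # variance carried by each component, and its share of the total
    varEach = s**2/P.shape[1]
    totVar = asum(varEach)
    fracVar = divide(varEach,totVar)
    # singular values are sorted in descending order, so the components
    # that pass the threshold are the first ind ones
    ind = int(asum(fracVar>=frac))

    # select the components whose variance fraction is at least frac
    Trans = transpose(U[:,:ind])
    # The transformed data
    Pcomponents = matrixmultiply(Trans,P)
    return Pcomponents, Trans, fracVar[:ind]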

```
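For anyone reading this against a current NumPy, here is a minimal sketch of the same SVD approach. It is not the matplotlib.mlab code: the name `pca_svd` and the variable names are hypothetical, and it makes two assumptions the posted function does not. It subtracts each variable's mean first (so that `s**2 / n` really is a variance when the rows of P are not already zero-mean), and it asks for the thin SVD to avoid building the full numObservations x numObservations factor.

```python
import numpy as np

def pca_svd(P, frac=0.0):
    """PCA of a (num_vars x num_observations) array via SVD.

    Hypothetical helper for illustration; not part of matplotlib or SciPy.
    Returns (components, trans, frac_var), with components = trans @ centered P.
    """
    P = np.asarray(P, dtype=float)
    # Assumption: centre each variable (row) so s**2 / n is a variance.
    centered = P - P.mean(axis=1, keepdims=True)
    # Thin SVD: U is num_vars x k, with k = min(num_vars, num_observations).
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    var_each = s**2 / centered.shape[1]
    frac_var = var_each / var_each.sum()
    # Singular values are returned in descending order, so the components
    # meeting the threshold are always the leading ones.
    n_keep = int(np.sum(frac_var >= frac))
    trans = U[:, :n_keep].T
    return trans @ centered, trans, frac_var[:n_keep]

# Usage: two strongly correlated variables, 200 observations.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
P = np.vstack([x, 2 * x + 0.01 * rng.normal(size=200)])
components, trans, frac_var = pca_svd(P, frac=0.05)
print(components.shape, trans.shape, frac_var)
```

With data this correlated, nearly all of the variance falls on the first component, so a 5% threshold keeps only that one.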