[SciPy-user] PCA and Scipy
John Hunter
jdhunter at ace.bsd.uchicago.edu
Thu Feb 2 13:32:19 CST 2006
>>>>> "Andrew" == Andrew Straw <strawman at astraw.com> writes:
Andrew> PCA is easily implemented using singular value
Andrew> decomposition, which is available in numpy. Sorry I don't
Andrew> have any code available at the moment, but hopefully this
Andrew> gives you a start.
from matplotlib.mlab ....
def prepca(P, frac=0):
    """
    Compute the principal components of P.  P is a numVars x
    numObservations numeric array.  frac is the minimum fraction of
    variance that a component must contain to be included.

    Return values are
    Pcomponents : a num components x num observations numeric array
    Trans       : the weights matrix, ie, Pcomponents = Trans*P
    fracVar     : the fraction of the variance accounted for by each
                  component returned
    """
    U,s,v = svd(P)
    varEach = s**2/P.shape[1]
    totVar = asum(varEach)
    fracVar = divide(varEach,totVar)
    ind = int(asum(fracVar>=frac))
    # select the components whose variance fraction is at least frac
    Trans = transpose(U[:,:ind])
    # The transformed data
    Pcomponents = matrixmultiply(Trans,P)
    return Pcomponents, Trans, fracVar[:ind]
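
For anyone reading this with a current NumPy, here is a rough sketch of the same function. The name prepca_np, the random test data, and the 0.05 threshold are made up for illustration and are not part of matplotlib. Note that prepca as posted does not subtract the mean from P, so this sketch assumes the caller centers each variable first.

```python
import numpy as np

def prepca_np(P, frac=0.0):
    """Sketch of prepca with current NumPy; P is numVars x numObservations."""
    P = np.asarray(P, dtype=float)
    # Thin SVD: the columns of U are the principal directions in variable space.
    U, s, _ = np.linalg.svd(P, full_matrices=False)
    var_each = s**2 / P.shape[1]
    frac_var = var_each / var_each.sum()
    # Singular values are returned in descending order, so the components
    # that pass the threshold form a leading block of this size.
    n_keep = int(np.sum(frac_var >= frac))
    trans = U[:, :n_keep].T
    return trans @ P, trans, frac_var[:n_keep]

# Hypothetical data: 3 variables, 200 observations, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 200))
X[1] = 2.0 * X[0] + 0.1 * X[1]
# Assumption: center each variable before the call, since the function does not.
Xc = X - X.mean(axis=1, keepdims=True)
pcs, trans, frac_var = prepca_np(Xc, frac=0.05)
```

The rows of trans are orthonormal, and pcs is just trans applied to the centered data, so the returned components are uncorrelated with each other.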