[SciPy-user] PCA and Scipy

John Hunter jdhunter at ace.bsd.uchicago.edu
Thu Feb 2 13:32:19 CST 2006

>>>>> "Andrew" == Andrew Straw <strawman at astraw.com> writes:

    Andrew> PCA is easily implemented using singular value
    Andrew> decomposition, which is available in numpy. Sorry I don't
    Andrew> have any code available at the moment, but hopefully this
    Andrew> gives you a start.

from matplotlib.mlab ....

def prepca(P, frac=0):
    """
    Compute the principal components of P.  P is a numVars x
    numObservations numeric array.  frac is the minimum fraction of
    variance that a component must contain to be included.

    Return values are
    Pcomponents : a num components x num observations numeric array
    Trans       : the weights matrix, i.e., Pcomponents = Trans*P
    fracVar     : the fraction of the variance accounted for by each
                  component returned
    """
    U,s,v = svd(P)
    varEach = s**2/P.shape[1]
    totVar = asum(varEach)
    fracVar = divide(varEach,totVar)
    ind = int(asum(fracVar>=frac))

    # keep the leading components whose variance fraction is >= frac
    Trans = transpose(U[:,:ind])
    # The transformed data
    Pcomponents = matrixmultiply(Trans,P)
    return Pcomponents, Trans, fracVar[:ind]
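
For anyone on a current NumPy, the same steps can be sketched roughly as
below. This is not the matplotlib.mlab code, just an illustration under a
few assumptions: prepca_numpy is a hypothetical name, the old Numeric-style
helpers are replaced by their NumPy equivalents, and the mean-centering
step is an addition (the function above uses P as given, so it relies on
the caller having removed each variable's mean already).

```python
import numpy as np


def prepca_numpy(P, frac=0.0):
    """Sketch of PCA via SVD; P is a numVars x numObservations array.

    Hypothetical helper, not part of matplotlib or NumPy.
    """
    P = np.asarray(P, dtype=float)
    # Assumption: subtract each variable's mean so s**2 reflects variance.
    P = P - P.mean(axis=1, keepdims=True)

    # Economy-size SVD; singular values come back sorted in descending order.
    U, s, Vt = np.linalg.svd(P, full_matrices=False)

    var_each = s**2 / P.shape[1]
    frac_var = var_each / var_each.sum()

    # Because frac_var is descending, the components to keep are the first ind.
    ind = int(np.sum(frac_var >= frac))

    trans = U[:, :ind].T      # weights matrix, num components x numVars
    components = trans @ P    # transformed (centered) data
    return components, trans, frac_var[:ind]


# Example: three variables, one of them nearly a multiple of another.
rng = np.random.default_rng(0)
P = rng.normal(size=(3, 200))
P[1] = 2.0 * P[0] + 0.01 * P[1]
components, trans, frac_var = prepca_numpy(P, frac=0.01)
```

With frac=0.01 the nearly redundant direction carries too little variance to
pass the cutoff, so it is dropped; frac=0 keeps every component.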
