# [Numpy-discussion] Efficient orthogonalisation with scipy/numpy

Charles R Harris charlesr.harris@gmail....
Tue Jan 19 17:48:44 CST 2010

On Tue, Jan 19, 2010 at 2:34 PM, <josef.pktd@gmail.com> wrote:

> On Tue, Jan 19, 2010 at 4:29 PM, Charles R Harris
> <charlesr.harris@gmail.com> wrote:
> >
> >
> > On Tue, Jan 19, 2010 at 1:47 PM, Gael Varoquaux
> > <gael.varoquaux@normalesup.org> wrote:
> >>
> >> On Tue, Jan 19, 2010 at 02:22:30PM -0600, Robert Kern wrote:
> >> > > y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0])
> >>
> >> > > with np = numpy and linalg = scipy.linalg where scipy calls ATLAS.
> >>
> >> > For clarification, are you trying to find the components of the y
> >> > vectors that are perpendicular to the space spanned by the 10
> >> > orthonormal vectors in confounds?
> >>
> >> Yes. Actually, what I am doing is calculating the partial correlation
> >> between x and y, conditional on the confounds, with the following code:
> >>
> >> def cond_partial_cor(y, x, confounds=[]):
> >>     """ Returns the partial correlation of y and x, conditioning on
> >>         confounds.
> >>     """
> >>     # First orthogonalise y and x relative to the confounds
> >>     if len(confounds):
> >>         y = y - np.dot(confounds.T, linalg.lstsq(confounds.T, y)[0])
> >>         x = x - np.dot(confounds.T, linalg.lstsq(confounds.T, x)[0])
> >>     return np.dot(x, y) / np.sqrt(np.dot(y, y) * np.dot(x, x))
> >>
> >> I am not sure that what I am doing is optimal.
> >>
> >> > > Most of the time is spent in linalg.lstsq. The length of the
> >> > > vectors is 810, and there are about 10 confounds.
> >>
> >> > Exactly what are the shapes? y.shape = (810, N);
> >> > confounds.shape = (810, 10)?
> >>
> >> Sorry, I should have been more precise:
> >>
> >> y.shape = (810, )
> >> confounds.shape = (10, 810)
> >>
> >
> > Column stack the bunch so that the last column is y, then do a qr
> > decomposition. The last column of q is the (normalized) orthogonal vector
> > and its amplitude is the last (bottom right) component of r.
>
> Do you have to do qr twice, once with x and once with y in the last
> column, or can this be combined?
>
> I was trying to do something similar for partial autocorrelation of
> time series, but didn't find anything better than repeated leastsq or
> a variant.
>
>
Depends on what you want to do. The QR decomposition is essentially
Gram-Schmidt applied to the columns. So if you just want an orthonormal
basis for the subspace spanned by a bunch of columns, the columns of Q form
exactly such a basis. To get the part of y orthogonal to that subspace you
can compute y - Q*Q.T*y, which is probably what you want if the x's are
fixed and the y's vary. If there is just one y, then putting it in the last
column lets the QR algorithm do that last bit of projection for you.
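
[A minimal sketch of both variants, for the shapes given above; the name
partial_cor_qr and the stacked-column demo are mine, not from the thread:]

```
import numpy as np

def partial_cor_qr(y, x, confounds):
    """Partial correlation of y and x given confounds, via one QR."""
    # Orthonormal basis for the span of the confound vectors.
    Q, _ = np.linalg.qr(confounds.T)            # Q.shape == (810, 10)
    # Residual of each vector after projecting onto that subspace:
    # v - Q (Q^T v).
    y = y - np.dot(Q, np.dot(Q.T, y))
    x = x - np.dot(Q, np.dot(Q.T, x))
    return np.dot(x, y) / np.sqrt(np.dot(y, y) * np.dot(x, x))

# One-y variant: stack y as the last column.  The last column of q is
# the normalized residual, and r[-1, -1] is its (signed) amplitude.
# q, r = np.linalg.qr(np.column_stack([confounds.T, y]))
# y_perp = q[:, -1] * r[-1, -1]
```

The first form factors the confounds once and reuses Q for any number of
vectors; the commented variant redoes the factorization for each new y but
reads the residual straight out of it.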

Note that if you apply the QR algorithm to a Vandermonde matrix with the
columns properly ordered, you get a collection of graded orthogonal
polynomials over the given set of points.
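
[A hedged illustration of that point; the point set and degree are
arbitrary choices, not from the thread:]

```
import numpy as np

pts = np.linspace(-1, 1, 810)
# Vandermonde matrix with columns graded by degree: 1, x, x**2, ...
V = np.vander(pts, 5, increasing=True)
Q, R = np.linalg.qr(V)
# Column k of Q samples a polynomial of exact degree k, and the columns
# are orthonormal over the sample points:
print(np.allclose(np.dot(Q.T, Q), np.eye(5)))   # True
```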

Chuck