[Numpy-discussion] performance matrix multiplication vs. matlab

Gael Varoquaux gael.varoquaux@normalesup....
Mon Jun 8 08:34:14 CDT 2009


On Mon, Jun 08, 2009 at 06:28:06AM -0700, Keith Goodman wrote:
> On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux
> <gael.varoquaux@normalesup.org> wrote:
> > On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.pktd@gmail.com wrote:
> >> What's the actual shape of the array/data you run your PCA on?

> > 50 000 dimensions, 820 datapoints.

> Have you tried shuffling each time series, performing PCA, looking at
> the magnitude of the largest eigenvalue, then repeating many times?
> That will give you an idea of how large the noise can be. Then you can
> see how many eigenvectors of the unshuffled data have eigenvalues
> greater than the noise. It would be kind of the empirical approach to
> random matrix theory.

Yes, that's the kind of thing that is done in the paper I pointed out,
and it is what I use to infer the number of principal components I retain.
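
For readers following the thread, here is a minimal sketch of the shuffling
procedure Keith describes. It is not taken from the paper mentioned above.
It assumes the data is a NumPy array laid out as (n_datapoints, n_dimensions),
so each column is one time series. The function names, the number of repeats,
and the use of a percentile of the shuffled largest eigenvalues as the noise
threshold are all illustrative assumptions.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of the sample covariance of X, in descending order.

    X has shape (n_datapoints, n_dimensions). The eigenvalues are obtained
    from the singular values of the centered data, which avoids forming the
    (n_dimensions x n_dimensions) covariance matrix explicitly.
    """
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    return s**2 / (X.shape[0] - 1)


def shuffled_noise_threshold(X, n_repeats=20, percentile=95, seed=0):
    """Estimate how large the top eigenvalue can get from noise alone.

    Each column (time series) is permuted independently, which keeps its
    marginal distribution but destroys correlations between columns.
    n_repeats and percentile are assumed defaults, not values from the thread.
    """
    rng = np.random.default_rng(seed)
    largest = np.empty(n_repeats)
    for i in range(n_repeats):
        # Sorting random keys column by column yields an independent
        # permutation of the rows for every column.
        order = rng.random(X.shape).argsort(axis=0)
        X_shuffled = np.take_along_axis(X, order, axis=0)
        largest[i] = pca_eigenvalues(X_shuffled)[0]
    return np.percentile(largest, percentile)


def n_components_above_noise(X, **kwargs):
    """Count eigenvalues of the unshuffled data above the noise threshold."""
    threshold = shuffled_noise_threshold(X, **kwargs)
    return int(np.sum(pca_eigenvalues(X) > threshold))
```

Calling n_components_above_noise(X) on an array of shape (820, 50000) would
give a candidate number of components to retain under these assumptions. Each
repeat costs one SVD of the full array, so n_repeats is worth keeping modest.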

Gaël

