[Numpy-discussion] Numpy Memory Error with corrcoef
Tue Mar 27 08:30:53 CDT 2012
On 27 March 2012 at 06:04, Nicole Stoffels <firstname.lastname@example.org> wrote:
> Hi Pierre,
> thanks for the fast answer!
> I actually have timeseries of 24 hours for 459375 gridpoints in Europe.
> The timeseries of every grid point is stored in a column. That's why in my
> real program I already transposed the data, so that the correlation is made
> column by column. What I finally need is the correlation of each gridpoint
> with every other gridpoint. I'm afraid that this results in a 459375*459375
> The correlation is actually just an interim result. So I'm currently
> trying to loop over every gridpoint to get single correlations which will
> then be processed further. Is this the right approach?
> for column in range(len(data_records)):
>     for columnnumber in range(len(data_records)):
>         correlation = corrcoef(data_records[column],
> Best wishes,
It may be painfully slow... The correlation matrix is symmetric, so make
sure you don't compute each off-diagonal element twice.
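
A minimal sketch of that idea, assuming (as in the quoted loop) that
data_records[k] is the time series of gridpoint k. The function name
pairwise_correlations is only illustrative. The inner loop starts at
column + 1, so each pair is visited once and the diagonal (always 1) is
skipped:

```python
import numpy as np

def pairwise_correlations(data_records):
    """Yield (column, columnnumber, r) once per unordered pair of rows."""
    n = len(data_records)
    for column in range(n):
        # Start after the diagonal: corr(a, b) == corr(b, a).
        for columnnumber in range(column + 1, n):
            # corrcoef of two 1-D series returns a 2x2 matrix;
            # the off-diagonal entry is their correlation.
            r = np.corrcoef(data_records[column],
                            data_records[columnnumber])[0, 1]
            yield column, columnnumber, r
```

This halves the number of corrcoef calls but is still one Python-level
call per pair, so it only addresses the duplicated work, not the speed
of the loop itself.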
Also, if all your computations can be vectorized, you'll probably get a
significant performance boost by computing your matrix by blocks instead of
element-by-element. Take blocks as big as can fit in memory.
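
One possible way to do this by blocks, as a sketch rather than a tuned
implementation. It assumes the rows of data are the gridpoints (shape
n_points x n_times), and the names standardize_rows, iter_corr_blocks and
block_size are made up for the example. Each row is centered and scaled
to unit length, so that a matrix product of two row blocks gives their
correlations directly. Only blocks on or above the diagonal are produced:

```python
import numpy as np

def standardize_rows(data):
    """Center each row and scale it to unit length, so the dot product
    of two rows equals their Pearson correlation.
    A constant row has zero norm and yields nan, as with np.corrcoef."""
    centered = data - data.mean(axis=1, keepdims=True)
    norms = np.sqrt((centered ** 2).sum(axis=1, keepdims=True))
    return centered / norms

def iter_corr_blocks(data, block_size):
    """Yield (i, j, block) for upper-triangle blocks only, where
    block[a, b] is the correlation of row i + a with row j + b."""
    z = standardize_rows(np.asarray(data, dtype=float))
    n = z.shape[0]
    for i in range(0, n, block_size):
        zi = z[i:i + block_size]
        # Start at j = i: blocks below the diagonal are transposes
        # of blocks already produced.
        for j in range(i, n, block_size):
            yield i, j, zi @ z[j:j + block_size].T
```

Each block can be processed and discarded before the next one is
computed, so only one block of at most block_size x block_size float64
values (8 * block_size**2 bytes) is held at a time, in addition to the
standardized copy of the data.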