[SciPy-User] operations on large arrays

Bruce Southey bsouthey@gmail....
Mon Mar 8 16:29:18 CST 2010


On 03/07/2010 12:05 AM, Vincent Davis wrote:
> I just figured out that I had a few arrays that were taking up a 
> bunch of the memory. That said, I still wonder if there is a better way.
>
> On Sat, Mar 6, 2010 at 10:22 PM, Vincent Davis 
> <vincent@vincentdavis.net <mailto:vincent@vincentdavis.net>> wrote:
>
>     I have arrays of 8-20 rows and 230,000 columns; all the data is float64.
>     I want to be able to find the difference in the correlation matrix
>     between arrays.
>     Let a and b be of size (10, 230000):
>     np.corrcoef(a)-np.corrcoef(b)
>
>     I can't seem to do this with more than 10000 columns at a time
>     because of memory limitations (about 9GB usable to Python).
>     Is there a better way?
>
>     I also have a problem finding the column means, which is surprising
>     to me: I was not able to get the column means for 10000 columns,
>     but I can compute the corrcoef?
>     np.mean(a, axis=0)
>
>     Do I just need to divide up the job or is there a better approach?
>
>     Thanks
>
Is there a better way to do what?
A problem with 'np.corrcoef(a)-np.corrcoef(b)' is that it is unclear 
what you want: if a and b have more than one dimension, you get an array 
back. If that array is near zero, what does that mean? One 
interpretation is that perhaps you should be checking whether these are 
the same array. If the array is not zero, what does that mean? Do you need 
to know which parts of a and b lead to different correlations?
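
To make those questions concrete, here is a minimal sketch. It assumes the rows are the variables (NumPy's default, rowvar=True) and uses small random stand-in data rather than your real arrays; the tolerance and the names diff, same, i and j are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (an assumption): 10 variables as rows, 1000 observations as columns.
a = rng.standard_normal((10, 1000))
b = rng.standard_normal((10, 1000))

# With the default rowvar=True each result is 10 x 10, one entry per pair of rows.
diff = np.corrcoef(a) - np.corrcoef(b)

# Question 1: are the two correlation matrices the same, up to a tolerance?
same = np.allclose(diff, 0.0, atol=1e-8)

# Question 2: which pair of variables differs most between a and b?
i, j = np.unravel_index(np.argmax(np.abs(diff)), diff.shape)

print(same, (i, j), diff[i, j])
```

The diagonal of diff is always zero, because each variable correlates perfectly with itself in both matrices, so only the off-diagonal entries carry information.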

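You can always do np.corrcoef(a, b), which also returns the correlations 
between a and b. With the default settings the rows are the variables, so 
if the two arrays are the same, the diagonal of the block relating each 
row of a to the corresponding row of b is one.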
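
As a rough illustration of that call, the sketch below uses made-up data and sets b to a copy of a, purely to show the case where the two arrays are the same. The names full and cross are mine, not part of NumPy.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in data (an assumption): 10 variables as rows, 1000 observations as columns.
a = rng.standard_normal((10, 1000))
b = a.copy()  # identical arrays, chosen only to show the diagonal-of-ones case

n = a.shape[0]

# corrcoef(a, b) treats the rows of b as additional variables,
# so the result is (2n, 2n).
full = np.corrcoef(a, b)

# Upper-right block: correlation of row i of a with row j of b.
cross = full[:n, n:]

print(full.shape)
print(np.allclose(np.diag(cross), 1.0))
```

If a and b differ, the diagonal of cross shows how strongly each row of a tracks the matching row of b.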

Bruce




