[Numpy-discussion] MemoryError with dot(A, A.T) where A is 800MB on 32-bit Vista

"V. Armando Solé" sole@esrf...
Wed Jun 9 11:57:38 CDT 2010


greg whittier wrote:
> When I run
>
> import numpy as np
>
> a = np.ones((400, 500000), dtype=np.float32)
> c = np.dot(a, a.T)
>
> produces a "MemoryError" on the 32-bit Enthought Python Distribution
> on 32-bit Vista.  I understand this has to do with the 2GB limit with
> 32-bit python and the fact numpy wants a contiguous chunk of memory
> for an array.  When I look at the memory use in the task manager
> though, it looks like it's trying to allocate enough for two
> 400x500000 arrays.  I guess it's explicitly forming a.T.  Is there a
> way to avoid this?  I tried
>
> c = scipy.lib.blas.fblas.dgemm(1.0, a, a, trans_b=1)
>
> but I get the same result.  It appears to be using a lot of extra
> memory.  Isn't this just a wrapper to the blas library that passes a
> pointer to the memory location of a?  Why does it seem to be
> generating the transpose?  Is there a way to do A*A.T without two
> copies of A?
>   
In such cases I create a matrix of zeros with the final size and fill it
in a loop with dot products of smaller chunks of the original matrix a.
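
A minimal sketch of that idea, assuming the chunks are taken along the long
(column) axis and their products are accumulated into the result. The name
chunked_gram and the chunk size are only illustrative, not from any library:

```python
import numpy as np

def chunked_gram(a, chunk=10000):
    """Compute dot(a, a.T) by summing products of column blocks of a.

    Hypothetical helper: only one block (plus any temporary copy numpy
    makes of that block) is needed at a time, never a full copy of a.
    """
    n_rows, n_cols = a.shape
    # Result has the final size (n_rows x n_rows) and starts at zero.
    c = np.zeros((n_rows, n_rows), dtype=a.dtype)
    for start in range(0, n_cols, chunk):
        block = a[:, start:start + chunk]   # a view, no copy of a
        c += np.dot(block, block.T)         # small temporary only
    return c

# Small stand-in for the 400 x 500000 array from the question.
a = np.ones((4, 50), dtype=np.float32)
c = chunked_gram(a, chunk=7)
```

A smaller chunk lowers the peak memory at the cost of more loop iterations.
With float32 input the sum is also accumulated in float32, so a float64
result array may be preferable if rounding matters.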

The MDP package also does something similar.

Armando





