[Numpy-discussion] MemoryError with dot(A, A.T) where A is 800MB on 32-bit Vista
"V. Armando Solé"
sole@esrf...
Wed Jun 9 11:57:38 CDT 2010
greg whittier wrote:
> When I run
>
> import numpy as np
>
> a = np.ones((400, 500000), dtype=np.float32)
> c = np.dot(a, a.T)
>
> I get a "MemoryError" on the 32-bit Enthought Python Distribution
> on 32-bit Vista. I understand this has to do with the 2 GB limit of
> 32-bit Python and the fact that numpy wants a contiguous chunk of
> memory for an array. When I look at the memory use in the task
> manager, though, it looks like it's trying to allocate enough for two
> 400x500000 arrays. I guess it's explicitly forming a.T. Is there a
> way to avoid this? I tried
>
> c = scipy.lib.blas.fblas.dgemm(1.0, a, a, trans_b=1)
>
> but I get the same result. It appears to be using a lot of extra
> memory. Isn't this just a wrapper to the BLAS library that passes a
> pointer to the memory location of a? Why does it seem to be
> generating the transpose? Is there a way to do A*A.T without two
> copies of A?
>
In such cases I create a matrix of zeros with the final size and fill it
in a loop with dot products of smaller chunks of the original matrix a.
The MDP package also does something similar.
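
A minimal sketch of the chunked approach, assuming the chunks are blocks
of columns of a whose partial products are accumulated into the result.
The function name chunked_gram and the chunk_cols parameter are
hypothetical, chosen only for illustration, and this is not necessarily
how MDP does it:

```python
import numpy as np

def chunked_gram(a, chunk_cols=50000):
    """Compute dot(a, a.T) by summing products of column blocks of a."""
    n_rows, n_cols = a.shape
    # Result has the final size (n_rows x n_rows) and starts at zero.
    c = np.zeros((n_rows, n_rows), dtype=a.dtype)
    for start in range(0, n_cols, chunk_cols):
        # Slicing gives a view; any copy made inside dot is at most
        # n_rows x chunk_cols, not the size of the full array.
        block = a[:, start:start + chunk_cols]
        c += np.dot(block, block.T)
    return c

# Small usage example (the sizes here are illustrative only).
a = np.ones((4, 1000), dtype=np.float32)
c = chunked_gram(a, chunk_cols=300)
```

A smaller chunk_cols should lower the peak temporary memory at the cost
of more loop iterations. Note that the sum is accumulated in the dtype of
a, so for float32 input the rounding may differ slightly from a single
dot call.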
Armando
