[Numpy-discussion] inplace matrix multiplication
Dan Goodman
dg.gmane@thesamovar....
Tue Apr 28 09:56:20 CDT 2009
Can anyone explain the results below? It seems that for small matrices
dot(x,y) is outperforming dgemm(1,x,y,0,y,overwrite_c=1), but for larger
matrices the latter is winning. In principle it seems like I ought to be
able to always do better with inplace rather than making copies?
From looking at the RAM used by python.exe it seems that indeed
dot(x,y) is allocating lots of memory and making copies, and dgemm is
not. My system is a Windows PC with a Pentium M processor 1.86GHz family
6 model 13 (which I think supports SSE2 but not SSE3), Python version
2.5, scipy version 0.7.0.dev5410 and numpy version 1.2.1.
In [71]: x=array(randn(3,3), order='F')
In [73]: n=1000
In [74]: y=array(randn(3,n), order='F')
In [75]: y0=copy(y)
In [76]: %timeit y[:]=y0[:]; dot(x,y)
10000 loops, best of 3: 48.5 µs per loop
In [77]: %timeit y[:]=y0[:]; dgemm(1,x,y,0,y,overwrite_c=1)
10000 loops, best of 3: 61.6 µs per loop
In [79]: n=100000
In [80]: y=array(randn(3,n), order='F')
In [81]: y0=copy(y)
In [82]: %timeit y[:]=y0[:]; dot(x,y)
10 loops, best of 3: 22.9 ms per loop
In [83]: %timeit y[:]=y0[:]; dgemm(1,x,y,0,y,overwrite_c=1)
100 loops, best of 3: 8.37 ms per loop
Dan
Frédéric Bastien wrote:
You are right, some here told me about this. I should have posted it here for reference.
> thanks
>
> Fred
On Fri, Apr 24, 2009 at 1:45 PM, David Warde-Farley wrote:
On 9-Jan-09, at 4:31 PM, Robert Kern wrote:
>
> You can't in numpy. With scipy.linalg.fblas.dgemm() and the right
> arguments, you can.
>
Make sure your output array is Fortran-ordered, however, otherwise copies will be made.
>
> David
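For reference, here is a minimal sketch of the point about Fortran ordering. It assumes a current SciPy, where the wrapper is importable as scipy.linalg.blas.dgemm (the thread refers to the older scipy.linalg.fblas path), and it writes into a separate preallocated buffer c rather than reusing y as both input and output. The array shapes and variable names are only illustrative.

```python
import numpy as np
# Assumption: modern SciPy import path; the thread used scipy.linalg.fblas.dgemm.
from scipy.linalg.blas import dgemm

rng = np.random.default_rng(0)
x = np.asfortranarray(rng.standard_normal((3, 3)))
y = np.asfortranarray(rng.standard_normal((3, 1000)))

# Preallocated, Fortran-ordered float64 output buffer.
c = np.zeros((3, 1000), order='F')

# Computes 1.0 * x @ y + 0.0 * c; overwrite_c=1 allows the wrapper to reuse c.
out = dgemm(1.0, x, y, beta=0.0, c=c, overwrite_c=1)

wrote_in_place = np.shares_memory(out, c)    # result lives in c's memory
matches_dot = np.allclose(out, np.dot(x, y))

# Same call with a C-ordered buffer: the wrapper has to copy it first,
# so the result no longer shares memory with the buffer that was passed in.
c_bad = np.zeros((3, 1000), order='C')
out_bad = dgemm(1.0, x, y, beta=0.0, c=c_bad, overwrite_c=1)
copied = not np.shares_memory(out_bad, c_bad)

print(wrote_in_place, matches_dot, copied)
```

Checking np.shares_memory on the returned array is a cheap way to confirm whether a given call really avoided the copy before timing it.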
