[Numpy-discussion] Should dot return 1x1 matrix as scalar?

Paulo J. S. Silva pjssilva at ime.usp.br
Fri Jan 13 11:18:01 CST 2006


> I would be interested to see how a raw-array solution compares. There
> is going to be some overhead of using a subclass, because of the
> attribute lookups that occur on all array creations and because the
> subclass is written in Python; that could be on the order of 10%.
>
> Quantifying the subclass slow-down would be useful...  Also, which
> BLAS are you using?



I am using Matlab 6.5.0 release 13 on an Athlon Thunderbird at 1.2 GHz.
The BLAS used with numpy is ATLAS, optimized for 3DNow.

I'm getting very interesting results. First, the time for a QR
factorization of a 1000x1000 random matrix:

Matlab:


>> t = cputime; R = houseqr(A); cputime - t

ans =

   67.1000

Numpy:

In [9]:i = time.clock(); R = houseqr.houseqr(A); time.clock() - i
Out[9]:79.009999999999991

If I code it naively in numpy, making an extra matrix slice, the time rises to:

In [14]:i = time.clock(); R = houseqr.houseqr(A); time.clock() - i
Out[14]:114.34999999999999

(See the code below).
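(The test inputs are not shown in this message; a setup along the
following lines is assumed, with A a 1000x1000 random matrix and b a
1000x1 column vector, both numpy.matrix objects so that '*' means
matrix multiplication:)

from numpy import matrix
from numpy.random import rand
import time
import houseqr

# Assumed setup, not part of the original message.
A = matrix(rand(1000, 1000))
b = matrix(rand(1000, 1))

t0 = time.clock()
R = houseqr.houseqr(A)
elapsed = time.clock() - t0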

I then decided to compare the BLAS, and here things get funny:

Matrix multiplication:

Matlab: 

>> t = cputime; A*A; cputime - t

ans =

    2.0600

Numpy:

In [32]:i = time.clock(); C=A*A; time.clock() - i
Out[32]:1.3600000000000136


It looks like numpy (ATLAS) is much better....

However, in the QR code the most important part is a matrix-vector
multiplication and an outer product. If I benchmark these operations I
get:

Matlab:

>> t = cputime; bench(A,b); cputime - t

ans =

    5.7500

Numpy:

In [27]:i = time.clock(); bench(A,b); time.clock() - i
Out[27]:10.610000000000014


Why is numpy so slow??????

Paulo

The code:

--- houseqr.py --- 

from numpy import *

def norm(x):
    # Euclidean norm of a column vector stored as a numpy matrix.
    return sqrt(x.T*x).item()

def sinal(x):
    # Sign helper for the Householder reflector; maps 0 to +1.
    if x < 0.0: return -1.0
    else: return 1.0

def houseqr(A):
    # Householder QR; A is expected to be a numpy.matrix, so '*' is
    # matrix multiplication.  Returns the upper-triangular factor R.
    m, n = A.shape
    R = A.copy()

    bigE1 = matrix(zeros((m, 1), float))
    bigE1[0] = 1.0
    for j in range(n):
        x = R[j:,j]
        # Householder vector that annihilates column j below the diagonal.
        v = sinal(x[0])*norm(x)*bigE1[:m-j] + x
        v = v / norm(v)
        # Slower version: slices R[j:,j:] twice per iteration.
        #R[j:,j:] -= 2.0*v*(v.T*R[j:,j:])
        # Faster version: R[j:,j:] is a view, so the in-place update
        # modifies R directly and avoids the extra slicing.
        upRight = R[j:,j:]
        upRight -= 2.0*v*(v.T*upRight)

    return R
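
(A quick sanity check, not part of the original post, assuming a small
random test matrix: R should come out upper triangular and agree with
numpy.linalg.qr up to the signs of its rows.)

from numpy import matrix, allclose, triu, absolute
from numpy.random import rand
from numpy.linalg import qr
import houseqr

# Assumed verification snippet, not from the original message.
A = matrix(rand(50, 50))
R = houseqr.houseqr(A)
assert allclose(R, triu(R), atol=1e-10)      # upper triangular
Rref = qr(A)[1]
assert allclose(absolute(R), absolute(Rref)) # agrees up to row signs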

--- houseqr.m ---

function [A] = houseqr(A)

  [m, n] = size(A);

  bigE1 = zeros(m,1);
  bigE1(1) = 1.0;
  for j = 1:n,
    x = A(j:m,j);
    % sinal: same sign helper as the Python version (presumably a separate sinal.m).
    v = sinal(x(1))*norm(x)*bigE1(1:m-j+1) + x;
    v = v / (norm(v));
    upRight = A(j:m,j:n);
    A(j:m,j:n) = upRight - 2*v*(v'*upRight);
  end

--- bench.py ---

from numpy import *

def bench(A, b):
    # The two operations that dominate the Householder QR update:
    # a row-vector times matrix product and an outer product.
    for i in range(100):
        c = b.T*A    # (1 x m) * (m x m)
        d = b*c      # (m x 1) * (1 x m): outer product
    
--- bench.m ---

function bench(A, b)
for i=1:100,
    c = b'*A;
    d = b*c;
end
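
(To go with the quoted question about a raw-array comparison, here is a
sketch of the same benchmark on plain ndarrays using dot(); bench_raw is
just an illustrative name, not part of the original code.)

--- bench_raw.py (sketch) ---

from numpy import dot

def bench_raw(A, b):
    # Same two operations as bench() above, but on plain ndarrays,
    # avoiding the numpy.matrix wrapper around every intermediate
    # result.  A is (m, m) and b is (m, 1).
    for i in range(100):
        c = dot(b.T, A)   # row-vector times matrix
        d = dot(b, c)     # outer product: (m, 1) times (1, m)

Timing bench_raw on asarray() versions of A and b, against bench() on
the matrix versions, should separate the subclass overhead from the
underlying BLAS speed.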









