[Numpy-discussion] speed of array vs matrix

Keith Goodman kwgoodman@gmail....
Mon Oct 25 09:04:48 CDT 2010


On Mon, Oct 25, 2010 at 6:48 AM, Citi, Luca <lciti@essex.ac.uk> wrote:
> Hello,
> I have noticed a significant speed difference between the array and the matrix implementation of the dot product, especially for not-so-big matrices.
> For example:
>
> In [1]: import numpy as np
> In [2]: b = np.random.rand(104,1)
> In [3]: bm = np.mat(b)
> In [4]: a = np.random.rand(8, 104)
> In [5]: am = np.mat(a)
> In [6]: %timeit np.dot(a, b)
> 1000000 loops, best of 3: 1.74 us per loop
> In [7]: %timeit am * bm
> 100000 loops, best of 3: 6.38 us per loop
>
> The results for two different PCs (PC1 with windows/EPD6.2-2 and PC2 with ubuntu/numpy-1.3.0) and two different sizes are below:
>
>                   array      matrix
>
> 8x104 * 104x1
>   PC1            1.74 us    6.38 us
>   PC2            1.23 us    5.85 us
>
> 8x10 * 10x5
>   PC1            2.38 us    7.55 us
>   PC2            1.56 us    6.01 us
>
> For bigger matrices the two timings seem to converge.
>
> Is it something worth trying to fix or should I just accept this as a fact and, when working with small matrices, stick to array?

I think the fixed overhead comes from matrix being a subclass of
ndarray. The subclass is implemented in Python, and whenever an
operation creates a new matrix, __array_finalize__ is called. All of
that adds up to a roughly constant cost per call, which is why it
matters most for small inputs.

http://github.com/numpy/numpy/blob/master/numpy/matrixlib/defmatrix.py
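
To make the __array_finalize__ point concrete, here is a minimal sketch.
CountingArray and count_finalize_calls are hypothetical names made up for
this illustration; they are not part of NumPy and this is not how
defmatrix.py is written. The sketch only shows that a Python-level hook
runs each time an operation produces a new instance of an ndarray subclass.

```python
import numpy as np


class CountingArray(np.ndarray):
    """Hypothetical ndarray subclass that counts __array_finalize__ calls."""

    finalize_calls = 0

    def __array_finalize__(self, obj):
        # NumPy calls this hook whenever a new CountingArray is created,
        # e.g. by view casting or as the result of an operation.
        CountingArray.finalize_calls += 1


def count_finalize_calls(operation):
    """Return how many times the hook ran while operation() executed."""
    before = CountingArray.finalize_calls
    operation()
    return CountingArray.finalize_calls - before


a = np.random.rand(8, 104)
b = np.random.rand(104, 1)

# View casting shares the data; only the Python-level type changes.
ca = a.view(CountingArray)
cb = b.view(CountingArray)

plain_result = np.dot(a, b)    # plain ndarray, no Python hook involved
sub_result = np.dot(ca, cb)    # result is a CountingArray
calls = count_finalize_calls(lambda: np.dot(ca, cb))

print("hook calls during one dot product:", calls)
```

The numerical result is the same either way; the subclass version just
does extra Python-level work on every call.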

I wrote a mean-variance optimizer with matrices. Switching to arrays
gave me a big speedup.
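
A rough sketch of what such a switch can look like, assuming the data
already lives in matrix objects. The helper best_time_us is a hypothetical
name for this example, and the timings it prints depend on the machine and
the NumPy version, so no numbers are given here. np.asmatrix is used in
place of np.mat, which newer NumPy releases no longer provide.

```python
import timeit

import numpy as np

a = np.random.rand(8, 104)
b = np.random.rand(104, 1)
am = np.asmatrix(a)
bm = np.asmatrix(b)

# np.asarray gives back plain ndarrays that share memory with the
# matrices, so the conversion itself does not copy the data.
a_plain = np.asarray(am)
b_plain = np.asarray(bm)


def best_time_us(func, repeat=3, number=2000):
    """Best time per call in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(func, repeat=repeat, number=number))
    return best / number * 1e6


array_us = best_time_us(lambda: np.dot(a_plain, b_plain))
matrix_us = best_time_us(lambda: am * bm)

print("array : %.2f us per call" % array_us)
print("matrix: %.2f us per call" % matrix_us)
```

If only a few hot spots matter, converting once with np.asarray before
the inner loop may be enough; the rest of the code can keep its matrices.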

