[Numpy-discussion] Slicing slower than matrix multiplication?
Sat Dec 12 09:17:05 CST 2009
On Sat, Dec 12, 2009 at 5:59 AM, Jasper van de Gronde wrote:
> Francesc Alted wrote:
>> Yeah, I think taking slices here is taking quite a lot of time:
>> In : timeit E + Xi2[P/2,:]
>> 100000 loops, best of 3: 3.95 µs per loop
>> In : timeit E + Xi2[P/2]
>> 100000 loops, best of 3: 2.17 µs per loop
>> I don't know why the additional ',:' in the slice takes so much time, but my
>> guess is that passing and analyzing the second argument (slice(None,None,None))
>> could be responsible for the slowdown (although even that seems like too much time).
>> Hmm, perhaps it would be worth studying this more carefully so that an
>> optimization could be done in NumPy.
> This is indeed interesting! And very nice that this actually works the
> way you'd expect it to. I guess I've just worked too long with Matlab :)
>>> I think the lesson mostly should be that with so little data,
>>> benchmarking becomes a very difficult art.
>> Well, I think it is not difficult, it is just that you are perhaps
>> benchmarking Python/NumPy machinery instead ;-) I'm curious whether Matlab
>> can do slicing much faster than NumPy. Jasper?
> I had a look, these are the timings for Python for 60x20:
> Dot product: 0.051165 (5.116467e-06 per iter)
> Add a row: 0.092849 (9.284860e-06 per iter)
> Add a column: 0.082523 (8.252348e-06 per iter)
> For Matlab 60x20:
> Dot product: 0.029927 (2.992664e-006 per iter)
> Add a row: 0.019664 (1.966444e-006 per iter)
> Add a column: 0.008384 (8.384376e-007 per iter)
> For Python 600x200:
> Dot product: 1.917235 (1.917235e-04 per iter)
> Add a row: 0.113243 (1.132425e-05 per iter)
> Add a column: 0.162740 (1.627397e-05 per iter)
> For Matlab 600x200:
> Dot product: 1.282778 (1.282778e-004 per iter)
> Add a row: 0.107252 (1.072525e-005 per iter)
> Add a column: 0.021325 (2.132527e-006 per iter)
> If I fit a line through these two data points (60 and 600 rows), I get
> the following equations:
> Python, AR: 3.8e-5 * n + 0.091
> Matlab, AC: 2.4e-5 * n + 0.0069
> This would suggest that Matlab performs the vector addition about 1.6
> times faster and has a 13 times smaller constant cost!
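As a quick check on the arithmetic, the two fits above can be reproduced from the quoted totals. This is only a sketch: the helper name `fit_line` is made up for illustration, and n is taken to be the number of rows (60 and 600).

```python
def fit_line(n1, t1, n2, t2):
    """Return (slope, intercept) of the line through (n1, t1) and (n2, t2)."""
    slope = (t2 - t1) / (n2 - n1)
    intercept = t1 - slope * n1
    return slope, intercept

# Total times quoted above for n = 60 and n = 600 rows.
py_slope, py_const = fit_line(60, 0.092849, 600, 0.113243)  # Python, add a row
ml_slope, ml_const = fit_line(60, 0.008384, 600, 0.021325)  # Matlab, add a column

print("Python, AR: %.1e * n + %.3f" % (py_slope, py_const))
print("Matlab, AC: %.1e * n + %.4f" % (ml_slope, ml_const))
print("slope ratio: %.1f, constant ratio: %.0f"
      % (py_slope / ml_slope, py_const / ml_const))
```

Note that the comparison is between the Python row addition and the Matlab column addition, presumably because those are the contiguous cases in C order and Fortran order respectively.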
> As for the questions about what I'm trying to compute, these tests are
> minimized as much as possible to show the bottleneck I encountered; they
> are part of a larger loop where it does make sense. In essence I'm
> iteratively adjusting w and E has to keep up (because that's what is
> used to determine the next change). Instead of recomputing E all the
> time based on E = Xi*w, a little linear algebra shows that the vector
> addition is sufficient.
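For readers following along, the update described above might look like the sketch below. It assumes that only one component of w changes per step; the names `j` and `delta`, the random data, and the 60x20 shape are illustrative and not taken from the original code. Under that assumption, E = Xi*w can be kept current by adding a scaled column of Xi.

```python
import numpy as np

rng = np.random.default_rng(0)
Xi = rng.standard_normal((60, 20))  # illustrative data, same shape as the small test
w = rng.standard_normal(20)
E = Xi.dot(w)                       # full product, computed once

j, delta = 3, 0.25                  # hypothetical single-coordinate change to w
w[j] += delta
E += delta * Xi[:, j]               # vector addition instead of recomputing Xi.dot(w)

print(np.allclose(E, Xi.dot(w)))
```

This works because Xi*(w + delta*e_j) = Xi*w + delta*Xi[:, j], where e_j is the j-th unit vector.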
I think the differences between Matlab and NumPy have been well
discussed elsewhere (http://www.scipy.org/NumPy_for_Matlab_Users),
especially the C/Fortran order difference.
The differences shown suggest that you are not using equally
optimized libraries in the two environments. Unfortunately, your code
does not separate the dot products from the slicing, so the slower dot
product dominates your times. You really need to time the slicing
alone, without any dot product and without the in-place addition.
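Something along these lines would time each piece separately. It is a rough sketch, not your benchmark: the array sizes and names are illustrative, and every figure includes the overhead of calling a lambda, so only the differences between cases are meaningful.

```python
import timeit
import numpy as np

P, N = 60, 20                 # illustrative sizes, matching the small test
Xi2 = np.random.rand(P, N)
E = np.random.rand(N)
i = P // 2                    # integer division, so the index stays an int
row = Xi2[i]                  # pre-sliced row, to time the addition by itself

number = 100000
timings = {
    "slice Xi2[i]": timeit.timeit(lambda: Xi2[i], number=number),
    "slice Xi2[i, :]": timeit.timeit(lambda: Xi2[i, :], number=number),
    "add E + row": timeit.timeit(lambda: E + row, number=number),
    "slice and add": timeit.timeit(lambda: E + Xi2[i, :], number=number),
}
for label, total in timings.items():
    print("%-16s %.6f (%.3e per iter)" % (label, total, total / number))
```

Comparing the first two lines shows the cost of the extra ',:', and comparing the last two shows how much of the combined time is slicing rather than addition.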
I would also suggest asking the list about the real problem, because
the solutions people come up with are often amazing.