[Numpy-discussion] speeding up operations on small vectors
Bruce Southey
bsouthey@gmail....
Tue Oct 11 12:29:41 CDT 2011
On 10/11/2011 12:06 PM, Skipper Seabold wrote:
> On Tue, Oct 11, 2011 at 12:41 PM, Christoph Groth<cwg@falma.de> wrote:
>> Skipper Seabold<jsseabold@gmail.com> writes:
>>
>>> So it's the dot function being called repeatedly on smallish arrays
>>> that's the bottleneck? I've run into this as well. See this thread
>>> [1].
>>> (...)
>> Thanks for the links. "tokyo" is interesting, though I fear the
>> intermediate matrix-size regime where it really makes a difference
>> will be rather small. My concern is with really tiny vectors, where
>> it's not even worth calling BLAS.
>>
>>
> IIUC, it's not so much the BLAS that's helpful but avoiding the
> overhead in calling numpy.dot from cython.
>
>>> I'd be very interested to hear if you achieve a great speed-up with
>>> cython+tokyo.
>> I'll try to solve this problem one way or another. I'll post here if
>> I end up with something interesting.
> Please do.
>
> Skipper
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
In the example, M is a 2-by-2 identity array. This creates a lot of
overhead: an array is built from a tuple on every call, followed by two
dot operations. But the tuple code is not exactly equivalent, because M
is 'expanded' into scalars to avoid some of the unnecessary
multiplications. The tuple code is thus already a different algorithm
than the numpy code, so the comparison is not really fair.
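To make the point concrete, here is a small sketch (the function names
are made up for illustration) contrasting the numpy.dot version with the
'expanded' scalar version of the same quadratic form:

```python
import numpy as np

M = np.eye(2)  # the 2-by-2 identity, as in the example discussed

def quad_form_numpy(x, y):
    """v.T @ M @ v, building an array from a tuple and calling dot twice."""
    v = np.array((x, y))
    return np.dot(v, np.dot(M, v))

def quad_form_unrolled(x, y, m00=1.0, m01=0.0, m10=0.0, m11=1.0):
    """The same quadratic form with M 'expanded' into scalars."""
    return x * (m00 * x + m01 * y) + y * (m10 * x + m11 * y)

# Both return x**2 + y**2 for the identity M, but the unrolled version
# is a genuinely different (cheaper) algorithm, so timing one against
# the other measures more than just numpy.dot's call overhead.
```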
All that is needed here, instead of looping over scalar values of x, y
and radius, is to evaluate (x*x + y*y) < radius**2. That could probably
be done with array multiplication and broadcasting.
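A minimal sketch of that broadcasting suggestion; the array names (xs,
ys, radii) are made up for illustration:

```python
import numpy as np

# Instead of looping over scalar (x, y, radius) triples, hold each
# coordinate in its own array...
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0, 2.0])
radii = np.array([1.0, 2.0, 2.0])

# ...and one vectorized, element-wise comparison replaces the loop.
inside = (xs * xs + ys * ys) < radii ** 2
# inside -> array([ True,  True, False])
```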
Bruce