[Numpy-discussion] mkl usage
Fri Feb 24 08:35:23 CST 2012
On 24.2.2012 at 13:54, Neal Becker wrote:
> Francesc Alted wrote:
>> On Feb 23, 2012, at 2:19 PM, Neal Becker wrote:
>>> Pauli Virtanen wrote:
>>>> 23.02.2012 20:44, Francesc Alted wrote:
>>>>> On Feb 23, 2012, at 1:33 PM, Neal Becker wrote:
>>>>>> Is MKL only used for linear algebra? Will it speed up e.g., elementwise
>>>>>> transcendental functions?
>>>>> Yes, MKL comes with VML, which has this type of optimization:
>>>> And also no, in the sense that Numpy and Scipy don't use VML.
>>> My question is:
>>> "Should I purchase MKL?"
>>> To what extent will it speed up my existing python code, without my having to
>>> exert (much) effort?
>>> So that would be numpy/scipy.
>> Pauli already answered you. If you are restricted to using numpy/scipy and your
>> aim is to accelerate the evaluation of transcendental functions, then there is
>> no point in purchasing MKL. If you can broaden your options and use numexpr,
>> then I think you should consider it.
>> -- Francesc Alted
> Thanks. One more thing, on theano I'm guessing MKL is required to be installed
> onto each host that would use it at runtime? So I'd need a licensed copy for
> each host that will execute the code? I'm guessing that because theano needs to
> compile code at runtime.
No, that is not required. Theano uses the MKL (BLAS) only for linear algebra (dot products). For linear algebra, Theano can also go through numpy instead of linking directly to the MKL. So if you have numpy built with an optimized BLAS library (MKL), you can use MKL indirectly without having it installed on each computer on which you use Theano.
If you want to use fast transcendental functions, you should go for numexpr. Christoph Gohlke provides Windows binaries linked against MKL (for numpy and numexpr), so you don't need to purchase MKL to check how much performance increase you can gain.
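As a rough illustration of what switching to numexpr looks like, here is a minimal sketch. The expression sin(x) * exp(-x), the array size, and the helper name sin_exp are arbitrary choices made for this example, and the numpy fallback is only there so the snippet runs when numexpr is not installed. Whether MKL/VML is actually used depends on how your numexpr binary was built.

```python
import numpy as np

try:
    import numexpr as ne  # optional; assumed to be installed for the fast path
except ImportError:
    ne = None


def sin_exp(x):
    """Evaluate sin(x) * exp(-x) elementwise (hypothetical example function)."""
    if ne is not None:
        # numexpr compiles the string expression and evaluates it in blocks,
        # avoiding the full-size temporaries that plain numpy would allocate.
        return ne.evaluate("sin(x) * exp(-x)", local_dict={"x": x})
    # Fallback: plain numpy, which creates intermediate arrays.
    return np.sin(x) * np.exp(-x)


x = np.linspace(0.0, 10.0, 1_000_000)
y = sin_exp(x)
```

Timing both paths on your own data (for example with timeit) is the simplest way to see whether the gain is worth it for your workload.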
Long ago I wrote a small package that injects the transcendental functions of MKL/VML into numpy, so you could get better performance without making any changes to your code. But my experience is that you gain only a little, because performance is typically limited by memory bandwidth. Perhaps on today's multi-core hosts you would gain more, since the VML library uses multi-threading.
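If you try one of the numexpr builds, it is worth checking whether it was really linked against VML before drawing conclusions from a benchmark. The sketch below is one way to do that; the helper name describe_numexpr is made up for this example, and it returns a plain dict so it also works when numexpr is missing.

```python
try:
    import numexpr as ne  # optional; the check degrades gracefully without it
except ImportError:
    ne = None


def describe_numexpr():
    """Report whether numexpr is importable and built with VML (hypothetical helper)."""
    if ne is None:
        return {"available": False}
    info = {
        "available": True,
        # True only if this numexpr build was linked against MKL/VML.
        "use_vml": bool(ne.use_vml),
        # Number of cores numexpr detects on this host.
        "cores": ne.detect_number_of_cores(),
    }
    if ne.use_vml:
        info["vml_version"] = ne.get_vml_version()
    return info


info = describe_numexpr()
print(info)
```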