[Numpy-discussion] Speed performance on array constant set

Travis Oliphant oliphant.travis at ieee.org
Fri Jan 20 08:42:01 CST 2006


Mark Heslep wrote:

> Travis Oliphant wrote:
>
>> It is actually a bit surprising that opencv can create and fill so
>> quickly.  Perhaps they are using optimized SSE functions for the
>> Intel platform, or something?
>> -Travis
>>
> Ah, sorry, I'm an unintentional fraud.  Yes, I have Intel's optimization
> library IPP turned on and had forgotten about it.  So one more time:
>
> With IPP on, as before (cvUseOptimized returns the number of cv
> functions available with IPP):
>
>> python -m timeit -s "import opencv.cv as cv; print 
>> cv.cvUseOptimized(1); im =cv.cvCreateImage(cv.cvSize(1000,1000), 8, 
>> 1)" "cv.cvSet( im, cv.cvRealScalar( 7 ) )"
>> 305
>> 305
>> 305
>> 305
>> 305
>> 100 loops, best of 3: 2.24 msec per loop
>
>
> And without:
>
>> python -m timeit -s "import opencv.cv as cv; print 
>> cv.cvUseOptimized(0); im =cv.cvCreateImage(cv.cvSize(1000,1000), 8, 
>> 1)" "cv.cvSet( im, cv.cvRealScalar( 7 ) )"
>> 0
>> 0
>> 0
>> 0
>> 0
>> 100 loops, best of 3: 6.94 msec per loop
>
> So IPP gives me 3x, which leads me to ask about plans for IPP / SSE
> for NumPy, no offense intended to non-Intel users.  I believe I recall
> some post saying that auto code generation in NumArray was the roadblock?


There was some talk of using liboil for this (similar to what _dotblas
does).  There could definitely be some gains.  I don't see any roadblock
other than time and effort.

With my own tests of a ctypes-wrapped function that just mallocs memory 
and fills it, I put numpy at about 3x slower than that function.  
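
For anyone who wants to try a comparison of this kind, here is a minimal
sketch of a ctypes-wrapped allocate-and-fill.  It is not the test code
referred to above.  It assumes a POSIX-like system where the C runtime can
be located by name, and malloc_and_fill is a hypothetical helper name.  The
buffer size matches the 1000x1000 8-bit image used in the timings quoted
earlier.

```python
import ctypes
import ctypes.util
import timeit

# Assumption: a POSIX-like system; find_library("c") locates the C runtime.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare signatures so pointers are not truncated on 64-bit platforms.
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.memset.restype = ctypes.c_void_p
libc.memset.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

N = 1000 * 1000  # bytes, same as a 1000x1000 single-channel 8-bit image


def malloc_and_fill(nbytes=N, value=7):
    """Hypothetical helper: malloc nbytes, set every byte to value,
    and return the first and last byte as a sanity check."""
    ptr = libc.malloc(nbytes)
    if not ptr:
        raise MemoryError("malloc failed")
    try:
        libc.memset(ptr, value, nbytes)
        # View the raw memory as an unsigned byte array to read it back.
        buf = (ctypes.c_ubyte * nbytes).from_address(ptr)
        return buf[0], buf[nbytes - 1]
    finally:
        libc.free(ptr)


# Best of 3 runs of 100 calls each, reported per call.
best = min(timeit.repeat(malloc_and_fill, number=100, repeat=3)) / 100
print("malloc + memset: %.3f msec per call" % (best * 1e3))
```

The figure printed includes the ctypes call overhead and the read-back of
two bytes, so treat it as a rough lower bound to compare against rather
than a precise measurement.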

The scalar fill function of numpy could definitely be made faster.    
Right now, it's still pretty generic.
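
A rough way to time the scalar fill from Python is sketched below.  It
assumes a current NumPy imported as np, and the helper names (fill_method,
fill_slice, msec_per_call) are made up for illustration.  No particular
result is implied; the numbers depend on the NumPy version and the machine.

```python
import timeit

import numpy as np

# Same shape and element size as the 1000x1000 8-bit image in the thread.
a = np.empty((1000, 1000), dtype=np.uint8)


def fill_method():
    # Scalar fill through the ndarray.fill method.
    a.fill(7)


def fill_slice():
    # Scalar fill through assignment to the whole array.
    a[...] = 7


def msec_per_call(fn, number=100, repeat=3):
    # Best of `repeat` runs of `number` calls, in milliseconds per call.
    return min(timeit.repeat(fn, number=number, repeat=repeat)) / number * 1e3


for fn in (fill_method, fill_slice):
    print("%-12s %.3f msec per call" % (fn.__name__, msec_per_call(fn)))
```

Both variants write the same values, so any difference between them comes
from the code path taken inside NumPy, not from the amount of memory
touched.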

-Travis






