[SciPy-dev] Sorting speed

Travis Oliphant oliphant.travis at ieee.org
Fri Dec 30 16:35:44 CST 2005


Alan G Isaac wrote:

>On Fri, 30 Dec 2005, Francesc Altet apparently wrote: 
>  
>
>>numarray is roughly 5x faster than scipy_core 
>>    
>>
>
>On my computer, it seems almost all that difference is due 
>to the time difference in initializing the arrays.  (I.e.,
>much of it disappears if I put the array creation in the
>setup statement.)
>
>  
>
Hmm...  This is strange to me.   Look at my times for array creation...


 >>> t4 = timeit.Timer('res = na.array(None,shape=10000)','import 
numarray as na')
 >>> t5 = timeit.Timer('res = sc.empty(shape=10000)','import scipy as sc')

 >>> t5.repeat(5,10000)
[0.1860051155090332, 0.13622808456420898, 0.12397193908691406, 
0.13260793685913086, 0.13042497634887695]

 >>> t4.repeat(5,10000)
[1.9458281993865967, 1.9462640285491943, 1.8813719749450684, 
1.8477518558502197, 1.8829448223114014]


Which seems to indicate that scipy is about 10x faster at array 
creation.   

But, in this instance the sorting is way more time consuming than array 
creation and so the improved numarray sort-times are holding sway.   On 
my system for pure sorting (with new SVN which does in-place sorting --- 
but still the old way using generic sort algorithm with compare function 
pointers) I get that numarray (with it's data-type specific sorting) is 
about 3x faster in this case:


 >>> t1 = timeit.Timer('a.sort()', 'import numarray as na; a = 
na.array(None,shape=10000)')
 >>> t2 = timeit.Timer('a.sort()', 'import scipy as sc; a = 
sc.empty(shape=10000)')
 >>> t1.repeat(3,100)
[0.26560115814208984, 0.25392889976501465, 0.25759387016296387]
 >>> t2.repeat(3,100)
[0.66517782211303711, 0.6185309886932373, 0.65760207176208496]

It could be that the 5x numbers were due to the copying of information 
over to a new array.  But still, I think 3x leaves room for improvement 
which no doubt specific (rather than generic) sorting algorithms would 
provide.

-Travis







More information about the Scipy-dev mailing list