[Numpy-discussion] when and where to use numpy arrays vs nested lists

Mark P. Miller mpmusu@cc.usu....
Thu Mar 1 10:41:05 CST 2007


Interesting...

I also tried the following and got similar results (using 1,000 x
1,000 arrays).  Initializing the nested list took much longer than
initializing the Numpy array (though the cost is still small in the
context of the overall time that my programs will run).  But array
element access is always faster when using the nested list approach.

Any other thoughts?  (more code below)

import numpy as NP

# Numpy array of integer zeros
array1 = NP.zeros((10000,10000), int)

# Equivalent nested list, built one row at a time
array2 = []
for aa in range(10000):
    array2.append([])
    for bb in range(10000):
        array2[aa].append(0)
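
For anyone who wants to reproduce the comparison without the rest of the code, here is a minimal sketch of the kind of per-element loop being compared. The names sum_ndarray and sum_nested and the size N are made up for illustration; they are not the random1 and random2 functions timed below, and the numbers will differ from machine to machine.

```python
import timeit
import numpy as np

N = 100  # hypothetical size, kept small so the loops finish quickly

grid_np = np.zeros((N, N), int)
grid_list = [[0] * N for _ in range(N)]

def sum_ndarray():
    # Read every element of the Numpy array one at a time
    total = 0
    for i in range(N):
        for j in range(N):
            total += grid_np[i, j]
    return total

def sum_nested():
    # Read every element of the nested list one at a time
    total = 0
    for i in range(N):
        for j in range(N):
            total += grid_list[i][j]
    return total

t_np = timeit.timeit(sum_ndarray, number=10)
t_list = timeit.timeit(sum_nested, number=10)
print("ndarray element access: %.4f s" % t_np)
print("nested list access:     %.4f s" % t_list)
```

Both functions do the same work in pure Python loops, so any difference comes from the cost of indexing a single element in each container.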

		
 >>> t=timeit.Timer("random1()", "from __main__ import random1")
 >>> t.timeit(10000)
0.51156278242069675
 >>> t.timeit(10000)
0.11799264990395386
 >>> t.timeit(10000)
0.11274142383990693
 >>> t.timeit(10000)
0.11590411630504605
 >>> t=timeit.Timer("random2()", "from __main__ import random2")
 >>> t.timeit(10000)
0.24217882440370886
 >>> t.timeit(10000)
0.077239146316060214
 >>> t.timeit(10000)
0.07531906988197079
 >>> t.timeit(10000)
0.075705711200498627




-Mark

Perry Greenfield wrote:
> On Mar 1, 2007, at 11:03 AM, Mark P. Miller wrote:
> 
>> I've been using Numpy arrays for some work recently.  Just for fun, I
>> compared some "representative" code using Numpy arrays and an object
>> comprised of nested lists to represent my arrays.  To my surprise, the
>> array of nested lists outperformed Numpy in this particular application
>> (in my actual code by 10%, results below are more dramatic).
>>
>> Can anyone shed some insight here?  The functions that I use in reality
>> are much more complicated than those listed below, but they are
>> nonetheless representative of the type of thing that I'm doing.
> 
> I'm guessing it has to do with the number of elements you are using.
> You have to understand that there is a fair amount of overhead in
> setting up an array operation (e.g., for ufuncs). Typically (I'm
> going from old results here so I may be off significantly) it isn't
> until you have arrays of around 1000 elements that as much time is
> spent doing the operation as is spent setting up for it. So for very
> large arrays (> 10,000 elements) the overhead is insignificant. For
> small arrays (e.g., 50 elements), it's all overhead. In that size
> range, lists will usually be much faster.
> 
> Perry Greenfield
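
The setup-overhead point above can be checked with a rough sketch like the following, which times a whole-array "add one" against the same operation on a plain list at several sizes. The helper time_add_one, the sizes, and the repeat count are assumptions chosen for illustration, not anything from the thread, and the crossover point will depend on the machine and the Numpy version.

```python
import timeit
import numpy as np

def time_add_one(n, repeats=200):
    # Time adding 1 to every element: one ufunc call vs. a list comprehension
    arr = np.zeros(n, int)
    lst = [0] * n
    t_arr = timeit.timeit(lambda: arr + 1, number=repeats)
    t_lst = timeit.timeit(lambda: [x + 1 for x in lst], number=repeats)
    return t_arr, t_lst

for n in (10, 50, 1000, 10000):
    t_arr, t_lst = time_add_one(n)
    print("n=%6d  ndarray: %.6f s  list: %.6f s" % (n, t_arr, t_lst))
```

Note that this measures whole-array operations, which is the case the reply describes; it is a different test from indexing elements one at a time inside a Python loop.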


