[Numpy-discussion] Timing array construction

Christopher Barker Chris.Barker@noaa....
Thu Apr 30 14:16:05 CDT 2009


Mark Janikas wrote:
> I have a lot of array constructions in my code that use
> NUM.array([list of values])... I am going to replace it with the
> empty allocation and insertion.

It may not be worth it, depending on where list_of_values comes from. 
A rule of thumb: going from a numpy array to a regular old python list 
or tuple and back to a numpy array is going to be slow. If your data is 
already a python list, then np.array(list) is a fine choice.


>> def useAsArray(xCoords, yCoords):
>>
>>     return NUM.asarray(zip(xCoords, yCoords))

Here are some of the issues with this one:

zip unpacks two generic python sequences, puts each pair of items into a 
tuple, and then puts the tuples in a list. Essentially this:

new_list = []
for i in range(len(xCoords)):
    new_list.append((xCoords[i], yCoords[i]))


In each iteration of that loop, it's indexing into the numpy arrays, 
making a python object out of each element, putting the two objects into 
a tuple, and appending that tuple to the list, which may have to 
re-allocate memory a few times.

Then the np.array() call loops through that list, unpacks each tuple, 
examines each python object, decides what it is, and turns it into a raw 
C type to put into the array.
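
To make those intermediate objects visible, here is a minimal sketch. The 
sample values and the names x, y, pairs, first and via_list are made up 
for illustration. It also assumes Python 3, where zip returns a lazy 
iterator rather than a list, so it has to be wrapped in list() before 
being handed to np.asarray():

```python
import numpy as np

# Hypothetical sample data, standing in for xCoords and yCoords
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# On Python 3, zip is lazy, so list() is needed to build the list of tuples
pairs = list(zip(x, y))

# Each entry is a python tuple holding one scalar object per input array
first = pairs[0]
print(type(pairs), type(first), type(first[0]))

# asarray then has to walk the list and inspect every object in every tuple
via_list = np.asarray(pairs)
print(via_list.shape, via_list.dtype)
```

Every element of both inputs gets wrapped in its own object and then 
unwrapped again, which is the round trip described above.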

whereas:

import numpy as np

def useEmpty(xCoords, yCoords):
    # xCoords and yCoords are assumed to be 1-d numpy arrays of equal length
    out = np.empty((len(xCoords), 2), dtype=xCoords.dtype)
    out[:,0] = xCoords
    out[:,1] = yCoords
    return out

It allocates an array of the right size, then directly copies the data 
from xCoords and yCoords into it.

That's it.

You can see why it's so much faster!
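
If you want to check this on your own machine, something like the 
following rough timing sketch should do. The function names, the array 
size n and the repeat count are arbitrary choices for illustration, and 
list(zip(...)) is used on the assumption that you are on Python 3, where 
zip is lazy. The actual numbers will depend on your hardware and numpy 
version, so none are quoted here:

```python
import timeit
import numpy as np

def use_as_array(x_coords, y_coords):
    # list() is needed on Python 3, where zip returns a lazy iterator
    return np.asarray(list(zip(x_coords, y_coords)))

def use_empty(x_coords, y_coords):
    # allocate once, then copy each input into its own column
    out = np.empty((len(x_coords), 2), dtype=x_coords.dtype)
    out[:, 0] = x_coords
    out[:, 1] = y_coords
    return out

n = 100_000  # arbitrary size, chosen only for illustration
x = np.random.rand(n)
y = np.random.rand(n)

# Sanity check: both approaches should build the same (n, 2) array
same = np.array_equal(use_as_array(x, y), use_empty(x, y))

# Total wall-clock time for 5 calls of each version
t_zip = timeit.timeit(lambda: use_as_array(x, y), number=5)
t_empty = timeit.timeit(lambda: use_empty(x, y), number=5)

print("same result:", same)
print("zip + asarray: %.4f s for 5 calls" % t_zip)
print("empty + assign: %.4f s for 5 calls" % t_empty)
```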

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

