[Numpy-discussion] Timing array construction

Christopher Barker Chris.Barker@noaa....
Thu Apr 30 14:16:05 CDT 2009

Mark Janikas wrote:
> I have a lot of array constructions in my code that use
> NUM.array([list of values])... I am going to replace it with the
> empty allocation and insertion.

It may not be worth it, depending on where list_of_values comes from.
A rule of thumb: going from a numpy array to a regular old python list
or tuple and then back to a numpy array is going to be slow. If your
data is already in a python list, then np.array(list) is a fine choice.
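
As a rough illustration of that rule of thumb, here is a small timing sketch using the standard timeit module. The array size and the names values and arr are made up for the example, and the actual numbers will vary with machine and numpy version, so treat it as a way to measure your own case rather than a benchmark.

```python
import timeit

import numpy as np

# Illustrative data only: the size and names are assumptions for this sketch.
values = [float(i) for i in range(100_000)]   # data that already lives in a python list
arr = np.arange(100_000, dtype=np.float64)    # data that already lives in a numpy array


def from_list():
    # list -> array: one conversion pass, hard to avoid if you start with a list
    return np.array(values)


def round_trip():
    # array -> list -> array: two conversion passes that buy nothing
    return np.array(arr.tolist())


def direct_copy():
    # array -> array: a plain memory copy, no python objects involved
    return arr.copy()


for fn in (from_list, round_trip, direct_copy):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__:12s} {seconds:.4f} s for 20 calls")
```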

>> def useAsArray(xCoords, yCoords):
>>     return NUM.asarray(zip(xCoords, yCoords))

Here are some of the issues with this one:

zip unpacks two generic python sequences, puts each pair of items into a
tuple, and then puts the tuples in a list. Essentially this:

new_list = []
for i in range(len(xCoords)):
    new_list.append((xCoords[i], yCoords[i]))

In each iteration of that loop, it's indexing into the numpy arrays,
making a python object out of each value, putting the two objects into a
tuple, and appending that tuple to the list, which may have to
re-allocate memory a few times.

Then the np.array() call loops through that list, unpacks each tuple,
examines each python object, decides what it is, and turns it into a raw
c-type to put into the array.
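
The steps described above can be written out in full to see where the work goes. This is only a sketch: the function names and the three-element sample arrays are invented for illustration. Note that on Python 3 zip returns a lazy iterator, so it has to be wrapped in list() before numpy can see the pairs.

```python
import numpy as np


def use_as_array_spelled_out(x_coords, y_coords):
    pairs = []
    for i in range(len(x_coords)):
        # Indexing a numpy array creates a new scalar object on every access.
        pairs.append((x_coords[i], y_coords[i]))
    # np.array walks the list, inspects every object, and converts each one
    # back into a raw value in the new array.
    return np.array(pairs)


def use_as_array(x_coords, y_coords):
    # Same result via zip; list() is needed on Python 3, where zip is lazy.
    return np.asarray(list(zip(x_coords, y_coords)))


# Tiny sample data, assumed for the example.
x = np.array([0.0, 1.0, 2.0])
y = np.array([10.0, 11.0, 12.0])
print(use_as_array(x, y))
```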


def useEmpty(xCoords, yCoords):
    out = np.empty((len(xCoords), 2), dtype=xCoords.dtype)
    out[:,0] = xCoords
    out[:,1] = yCoords
    return out

This allocates an array of the right size, then copies the data from
xCoords and yCoords directly into it.

That's it.

You can see why it's so much faster!
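
To check the difference on your own machine, something like the following sketch should do. The test data, sizes, and snake_case function names are assumptions for the example. np.column_stack was not part of the original comparison; it is included only as a library helper that builds the same (N, 2) array. The timings are not shown here because they depend on hardware and numpy version.

```python
import timeit

import numpy as np


def use_as_array(x_coords, y_coords):
    # zip route: builds python tuples, then converts them back to an array
    return np.asarray(list(zip(x_coords, y_coords)))


def use_empty(x_coords, y_coords):
    # allocate once, then copy each input straight into its column
    out = np.empty((len(x_coords), 2), dtype=x_coords.dtype)
    out[:, 0] = x_coords
    out[:, 1] = y_coords
    return out


def use_column_stack(x_coords, y_coords):
    # numpy helper that stacks 1-D arrays as columns of a 2-D array
    return np.column_stack((x_coords, y_coords))


# Made-up test data for the sketch.
x = np.linspace(0.0, 1.0, 100_000)
y = np.linspace(1.0, 2.0, 100_000)

for fn in (use_as_array, use_empty, use_column_stack):
    seconds = timeit.timeit(lambda: fn(x, y), number=5)
    print(f"{fn.__name__:18s} {seconds:.4f} s for 5 calls")
```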


Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

