[Numpy-discussion] fastest way to make two vectors into an array

Chris Barker Chris.Barker at noaa.gov
Fri Jan 31 12:44:02 CST 2003


John Hunter wrote:
>     John> I have two equal length 1D arrays of 256-4096 complex or
>     John> floating point numbers which I need to put into a
>     John> shape=(len(x),2) array.

> I tested all the suggested methods and the transpose with [x] and [y]
> was the clear winner, with an 8 fold speed up over my original code.
> The concatenate method was between 2-3 times faster.

I was a little surprised by this, as I figured that the transpose method
made an extra copy of the data (array() makes one copy, transpose()
another). So I looked at the source for concatenate:

def concatenate(a, axis=0):
    """concatenate(a, axis=0) joins the tuple of sequences in a into a
    single NumPy array.
    """
    if axis == 0:
        return multiarray.concatenate(a)
    else:
        new_list = []
        for m in a:
            new_list.append(swapaxes(m, axis, 0))
    return swapaxes(multiarray.concatenate(new_list), axis, 0)
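
To make the axis-swapping trick above concrete, here is a minimal sketch of
the same idea written against present-day NumPy rather than Numeric (that
substitution is an assumption on my part, and concat_via_swapaxes is a
hypothetical name, not part of either library):

```python
import numpy as np

def concat_via_swapaxes(arrays, axis):
    """Join arrays along `axis` using only an axis-0 concatenate."""
    # Move the target axis to the front of every input ...
    swapped = [np.swapaxes(m, axis, 0) for m in arrays]
    # ... join along axis 0, then swap the axes back into place.
    return np.swapaxes(np.concatenate(swapped, axis=0), axis, 0)

a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)
result = concat_via_swapaxes((a, b), axis=1)

# Same answer as asking concatenate for axis 1 directly.
print(np.array_equal(result, np.concatenate((a, b), axis=1)))
```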

So, if you are concatenating along anything other than the zero-th
axis, you end up doing something similar to the transpose method. Seeing
this, I tried something else:

def test_concat2(x,y):
    x.shape = (1,-1)
    y.shape = (1,-1)
    X = transpose( concatenate( (x, y) ) )
    x.shape = (-1,)
    y.shape = (-1,)

This then uses the native concatenate, but requires an extra copy in the
transpose.

Here's a somewhat cleaner version, though you get more copies:

def test_concat3(x,y):
    "Thanks to Chris Barker and Bryan Cole"
    X = transpose( concatenate( ( reshape(x,(1,-1)),
                                  reshape(y,(1,-1)) ) ) )

Here are the test results:

testing on vectors of length:  4096

test_concat 0.286280035973
test_transpose 0.100033998489
test_naive 0.805399060249
test_concat3 0.109319090843
test_concat2 0.136469960213
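
For anyone who wants to repeat this kind of comparison, the following is a
rough sketch of a timing harness using present-day NumPy and the standard
timeit module. It is not the script that produced the numbers above: the
function names, the element-by-element "naive" loop, and the repeat count
are all assumptions made for illustration.

```python
import timeit
import numpy as np

def via_transpose(x, y):
    # Build a (2, N) array from the pair, then transpose to (N, 2).
    return np.transpose([x, y])

def via_concat_axis1(x, y):
    # Give each vector a trailing length-1 axis and join along it.
    return np.concatenate((x[:, np.newaxis], y[:, np.newaxis]), axis=1)

def via_naive(x, y):
    # Hypothetical baseline: fill the result one element at a time.
    out = np.zeros((len(x), 2), dtype=np.result_type(x, y))
    for i in range(len(x)):
        out[i, 0] = x[i]
        out[i, 1] = y[i]
    return out

x = np.linspace(0.0, 1.0, 4096)
y = np.sin(x)

print("testing on vectors of length: ", len(x))
for func in (via_transpose, via_concat_axis1, via_naive):
    # number=20 is an arbitrary repeat count; timings are in seconds.
    elapsed = timeit.timeit(lambda: func(x, y), number=20)
    print(func.__name__, elapsed)
```

Absolute timings will depend on the machine and library version, so only
the relative ordering is worth reading from a run like this.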

All the transpose methods are essentially a tie. Would it be that hard
for concatenate to do its thing for any axis in C? It does seem like
this is a fairly basic operation, and shouldn't require more than one
copy.
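
As a point of comparison, present-day NumPy (not Numeric, so this is an
aside resting on that assumption) exposes this operation directly. The
sketch below shows three spellings that all give the shape=(len(x),2)
result; it makes no claim about how many copies each one performs
internally.

```python
import numpy as np

x = np.arange(5, dtype=float)
y = x * 10.0

# Stack the two 1D vectors as the columns of a 2D array.
by_column_stack = np.column_stack((x, y))

# Join along a new second axis.
by_stack = np.stack((x, y), axis=1)

# Ask concatenate for axis 1 after adding a trailing length-1 axis.
by_concat = np.concatenate((x[:, None], y[:, None]), axis=1)

print(by_column_stack.shape)
```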

By the way, I realised that the transpose method had an extra call.
transpose() can take an appropriate Python sequence, so this works just
fine:

def test_transpose2(x,y):
    X = transpose([x]+[y])

However, it doesn't really save you the copy, as I'm pretty sure
transpose makes a copy internally anyway. Test results:
testing on vectors of length:  4096

test_transpose 0.104995965958
test_transpose2 0.103582024574
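
The copy question can be checked directly in present-day NumPy, where
transpose returns a view. Whether Numeric behaves the same way is not
something this sketch establishes; it only illustrates how one might test
such a guess, and it assumes NumPy throughout.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 4096)
y = x ** 2

stacked = np.array([x, y])   # copies x and y into a new (2, N) array
X = stacked.T                # (N, 2) view of that array, no second copy

print(np.shares_memory(X, stacked))   # True: the transpose is a view
print(np.shares_memory(X, x))         # False: array() copied the inputs
print(X.flags["C_CONTIGUOUS"])        # False: the view is not row-major
```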

I think the winner is:

X = transpose([x]+[y])


Well, I learned a little bit more about Numeric today.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer
                                    		
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov



