[Numpy-discussion] numpy.concatenate slower than slice copying

Zbyszek Szmek zbyszek@in.waw...
Tue Aug 17 08:03:10 CDT 2010


Hi,
this is a problem that came up when trying to replace a hand-written
array concatenation with a call to numpy.vstack:
for some array sizes,

   numpy.vstack(data)

runs > 20% longer than a loop like

   alldata = numpy.empty((tlen, dim))
   pos = 0    # running write offset into the preallocated output
   for x in data:
        step = x.shape[0]
        alldata[pos:pos+step] = x
        pos += step

(example script attached)
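
Since attachments sometimes get scrubbed from the archive, here is a rough
sketch of what the script does. This is a reconstruction for illustration,
not the actual del_cum3.py -- the names and argument handling are mine.
(For reference, numpy.vstack on 2-D inputs is just a thin wrapper that calls
numpy.concatenate along the first axis, which is why the timings below talk
about concatenate directly.)

    import sys, time
    import numpy

    def concat_numpy(data):
        return numpy.concatenate(data)

    def concat_slices(data):
        # preallocate the output, then copy each block into place
        tlen = sum(x.shape[0] for x in data)
        alldata = numpy.empty((tlen, data[0].shape[1]))
        pos = 0
        for x in data:
            step = x.shape[0]
            alldata[pos:pos+step] = x
            pos += step
        return alldata

    if __name__ == '__main__':
        # usage: bench.py {numpy|concat} ROWS COLS NBLOCKS
        mode = sys.argv[1]
        rows, cols, n = map(int, sys.argv[2:5])
        data = [numpy.ones((rows, cols)) for _ in range(n)]
        f = concat_numpy if mode == 'numpy' else concat_slices
        t0 = time.time()
        f(data)
        print('%.3fs' % (time.time() - t0))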

$ python del_cum3.py numpy 10000 10000 1 10
problem size: (10000x10000) x 1 = 10^8
0.816s <------------------------------- numpy.concatenate of 10 arrays 10000x10000

$ python del_cum3.py concat 10000 10000 1 10
problem size: (10000x10000) x 1 = 10^8
0.642s <------------------------------- slice manipulation giving the same result

When the array size is reduced to around 100x100, the measured time drops
to nearly zero, so the per-array dtype and dimension checking seems to be
negligible. Does numpy.concatenate do some extra work?
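
A quick way to reproduce the small-array observation above, independent of
the attached script (a sketch; the sizes and repeat count are arbitrary):

    import timeit
    setup = "import numpy; data = [numpy.ones((100, 100)) for _ in range(10)]"
    # with number=1000, the total time in seconds equals milliseconds per call
    total = timeit.timeit("numpy.concatenate(data)", setup=setup, number=1000)
    print("concatenate, 10 arrays of 100x100: %.3f ms/call" % total)

At this size the time per call is tiny either way, which is what suggests
the per-array checking is not the bottleneck.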

Thanks for any pointers,
Zbyszek

PS. Architecture is amd64.
    Both python2.6 with numpy 1.3.0
    and python3.1 with numpy 2.0.0.dev (trunk@8510)
    give the same result.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: del_cum3.py
Type: text/x-python
Size: 884 bytes
Desc: not available
URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20100817/e2b26043/attachment.py

