[Numpy-discussion] fromstring, tostring slow?

Charles R Harris charlesr.harris@gmail....
Tue Feb 13 14:44:28 CST 2007


On 2/13/07, Mark Janikas <mjanikas@esri.com> wrote:
>
> Good call Stefan,
>
> I decoupled the timing from the application (duh!) and got better results:
>
> from numpy import *
> import numpy.random as RAND
> import time as TIME
>
> x = RAND.random(1000)
> xl = x.tolist()
>
> t1 = TIME.clock()
> xStringOut = [ str(i) for i in xl ]
> xStringOut = " ".join(xStringOut)
> f = file('blah.dat','w'); f.write(xStringOut)
> t2 = TIME.clock()
> total = t2 - t1
> t1 = TIME.clock()
> f = file('blah.bwt','wb')
> xBinaryOut = x.tostring()
> f.write(xBinaryOut)
> t2 = TIME.clock()
> total1 = t2 - t1
>
> >>> total
> 0.00661
> >>> total1
> 0.00229
>
> Printing x directly to a string took REALLY long: f.write(str(x)) = 0.0258
>
> The problem, therefore, must be in the way I am appending values to the
> empty arrays.  I am currently using the append method:
>
> myArray = append(myArray, newValue)
>
> Or would it be faster to concat or use a list append then convert?


I am going to guess that a list would be faster for appending. Both
concatenate and, I suspect, append allocate a brand-new array on every call,
rather like string concatenation in Python, so growing an array one element
at a time is quadratic in the number of elements. A Python list, on the
other hand, is optimized for appending in (amortized) constant time, so
appending to a list and converting once at the end should win. Another
option might be PyTables with extensible arrays. In any case, a bit of
timing should show the way if performance is that crucial to your
application.
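[Editor's note: the guess above is easy to check with a quick timing sketch.
This is illustrative only; the element count N and variable names are made
up here, and absolute timings will vary by machine. Both approaches build
the same array, but np.append copies the whole array on every call, while
the list is converted just once.]

```python
import time
import numpy as np

N = 5000

# Growing an array with np.append: each call allocates a new array and
# copies all existing elements, so the total work is O(N^2).
t0 = time.perf_counter()
a = np.empty(0)
for i in range(N):
    a = np.append(a, i)
t_append = time.perf_counter() - t0

# Growing a Python list (amortized O(1) per append), then converting
# to an array once at the end.
t0 = time.perf_counter()
lst = []
for i in range(N):
    lst.append(i)
b = np.array(lst, dtype=float)
t_list = time.perf_counter() - t0

# Both strategies produce identical arrays.
print("np.append: %.4fs  list + np.array: %.4fs" % (t_append, t_list))
```

On most machines the list version is faster by one to two orders of
magnitude at this size, and the gap widens as N grows.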

Chuck
