[Numpy-discussion] fromstring, tostring slow?
Mark Janikas
mjanikas@esri....
Tue Feb 13 14:36:01 CST 2007
Good call Stefan,
I decoupled the timing from the application (duh!) and got better results:
from numpy import *
import numpy.random as RAND
import time as TIME
x = RAND.random(1000)
xl = x.tolist()
t1 = TIME.clock()
xStringOut = [ str(i) for i in xl ]
xStringOut = " ".join(xStringOut)
f = file('blah.dat','w'); f.write(xStringOut)
t2 = TIME.clock()
total = t2 - t1
t1 = TIME.clock()
f = file('blah.bwt','wb')
xBinaryOut = x.tostring()
f.write(xBinaryOut)
t2 = TIME.clock()
total1 = t2 - t1
>>> total
0.00661
>>> total1
0.00229
Printing x directly to a string took REALLY long: f.write(str(x)) = 0.0258
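For anyone rerunning this comparison today, here is a minimal sketch of the same two timings, assuming Python 3 and a recent NumPy. It is not the code that produced the numbers above. The substitutions are assumptions on my part: open() replaces file(), time.perf_counter() replaces time.clock(), and tobytes() replaces tostring(). The helper name time_writes is hypothetical. Each file is closed inside the timed region so that the flush to disk is counted.

```python
import os
import tempfile
import time

import numpy as np


def time_writes(x, directory):
    """Return (text_seconds, binary_seconds) for writing x as text and as raw bytes."""
    text_path = os.path.join(directory, "blah.dat")
    binary_path = os.path.join(directory, "blah.bwt")

    # ASCII path: convert every element to a string and join with spaces.
    start = time.perf_counter()
    with open(text_path, "w") as f:
        f.write(" ".join(str(v) for v in x.tolist()))
    text_seconds = time.perf_counter() - start

    # Binary path: dump the array's raw memory buffer.
    start = time.perf_counter()
    with open(binary_path, "wb") as f:
        f.write(x.tobytes())
    binary_seconds = time.perf_counter() - start

    return text_seconds, binary_seconds


x = np.random.random(1000)
with tempfile.TemporaryDirectory() as tmp:
    text_seconds, binary_seconds = time_writes(x, tmp)
    binary_size = os.path.getsize(os.path.join(tmp, "blah.bwt"))

print("text: %.6f s, binary: %.6f s, binary size: %d bytes"
      % (text_seconds, binary_seconds, binary_size))
```

A single run like this is noisy; the timeit module or IPython's timeit, as used in the quoted reply below, repeats the measurement and gives steadier numbers.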
The problem, therefore, must be in the way I am appending values to the empty arrays. I am currently using the append method:
myArray = append(myArray, newValue)
Or would it be faster to concatenate, or to append to a list and then convert?
But to be more sure, I'll have to profile it. It seems a bit odd in that there are far fewer loops and conversions in my current implementation for the binary, yet it is still running slower.
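On the append question, a small sketch of the two approaches may help. numpy.append returns a new array on every call, so growing an array one value at a time copies the existing data each time, whereas a Python list can be appended to in place and converted once at the end. The function names and the sample values below are hypothetical, chosen only for illustration; this is not the application code.

```python
import numpy as np


def grow_with_np_append(values):
    """Grow an array one element at a time, as in myArray = append(myArray, newValue)."""
    out = np.empty(0, dtype=float)
    for v in values:
        # np.append allocates a new array and copies the old contents on each call.
        out = np.append(out, v)
    return out


def grow_with_list(values):
    """Collect values in a Python list, then convert to an array once."""
    buf = []
    for v in values:
        buf.append(v)
    return np.array(buf, dtype=float)


values = [0.5 * i for i in range(1000)]
a = grow_with_np_append(values)
b = grow_with_list(values)
print(a.shape, b.shape, np.array_equal(a, b))
```

Both functions produce the same array. Timing them on the real data sizes would show whether the repeated copying is what slows the binary path down.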
-----Original Message-----
From: numpy-discussion-bounces@scipy.org [mailto:numpy-discussion-bounces@scipy.org] On Behalf Of Stefan van der Walt
Sent: Tuesday, February 13, 2007 12:03 PM
To: numpy-discussion@scipy.org
Subject: Re: [Numpy-discussion] fromstring, tostring slow?
On Tue, Feb 13, 2007 at 11:42:35AM -0800, Mark Janikas wrote:
> I am finding that directly packing numpy arrays into binary using the tostring
> and fromstring methods does not provide a speed improvement over writing the same
> arrays to ascii files. Obviously, the size of the resulting files is far
> smaller, but I was hoping to get an improvement in the speed of writing. I got
> that speed improvement using the struct module directly, or by using generic
> python arrays. Let me further describe my methodological issue as it may
> directly relate to any solution you might have.
Hi Mark
Can you post a benchmark code snippet to demonstrate your results?
Here, using 1.0.2.dev3545, I see:
In [26]: x = N.random.random(100)
In [27]: timeit f = file('/tmp/blah.dat','w'); f.write(str(x))
100 loops, best of 3: 1.77 ms per loop
In [28]: timeit f = file('/tmp/blah','w'); x.tofile(f)
10000 loops, best of 3: 100 µs per loop
(I see the same results for heterogeneous arrays)
Cheers
Stéfan
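As a companion to the tofile timing above, here is a minimal round-trip sketch, assuming Python 3 and a recent NumPy. It writes an array with tofile and reads it back with numpy.fromfile, then does the in-memory equivalent with tobytes and numpy.frombuffer. The file name is a placeholder. The raw file stores no dtype or shape, so the reader has to supply the dtype (and reshape if the array was not one-dimensional).

```python
import os
import tempfile

import numpy as np

x = np.random.random(100)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blah")

    # Write the raw array buffer straight to disk.
    x.tofile(path)

    # Read it back; the dtype must be given because the file carries no metadata.
    y = np.fromfile(path, dtype=x.dtype)

# In-memory equivalent: raw bytes out, array view back in.
z = np.frombuffer(x.tobytes(), dtype=x.dtype)

print(np.array_equal(x, y), np.array_equal(x, z))
```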