[Numpy-discussion] Question about improving genfromtxt errors
Mon Sep 28 11:51:39 CDT 2009
On Mon, Sep 28, 2009 at 12:41 PM, Christopher Barker wrote:
> Skipper Seabold wrote:
>> FWIW, I have a script that creates and savez arrays from several text
>> files in total about 1.5 GB of text.
>> without the incrementing in genfromtxt
>> Run time: 122.043943 seconds
>> with the incrementing in genfromtxt
>> Run time: 131.698873 seconds
>> If we just want to always keep track of things, I would be willing to
>> take a poorly measured 8 % slowdown,
> I also think 8% is worth it, but I'm still surprised it's that much.
> What additional code is inside the inner loop? (Or, I guess, the
> per-line loop...)
This was probably due to the way that I timed it, honestly. I only
ran it once, so the 8% figure is not reliable. The only changes I made
for that part are the ones in the first post of the thread: two
incremented scalars for line numbers and column numbers, and a
try/except block.
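
To make the idea concrete, here is a minimal sketch of that kind of bookkeeping. It is not the genfromtxt code itself: `parse_rows`, its `converters` argument, and the error message format are all made up for illustration, and it assumes plain delimited text with one converter per column.

```python
def parse_rows(lines, converters, delimiter=","):
    """Convert delimited text rows, reporting where a conversion fails.

    Hypothetical helper for illustration only; not part of NumPy.
    """
    rows = []
    # line_no and col_no are the two incremented scalars (1-based).
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        row = []
        for col_no, (conv, field) in enumerate(zip(converters, fields), start=1):
            try:
                row.append(conv(field))
            except ValueError as exc:
                # Re-raise with the location so the bad cell is easy to find.
                raise ValueError(
                    f"line {line_no}, column {col_no}: cannot convert {field!r}"
                ) from exc
        rows.append(tuple(row))
    return rows
```

The extra work per field is one counter increment and entering a try block, which is why a large slowdown from this alone would be surprising.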
I'm really not against a debug mode if someone wants to do it, and
it's deemed necessary. If it could be made to log all of the errors
that would be extremely helpful. I still need to post some of my use
cases though. Anything to help make data cleaning less of a chore...
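
As a rough illustration of what "log all of the errors" could look like, here is a sketch that keeps going after a bad field and returns every failure at the end. Again, this is an assumption about one possible design, not an existing genfromtxt option: `parse_rows_collecting` and the shape of the error tuples are hypothetical.

```python
def parse_rows_collecting(lines, converters, delimiter=","):
    """Convert delimited text rows, collecting every conversion error.

    Hypothetical helper for illustration only; not part of NumPy.
    Returns (rows, errors), where errors is a list of
    (line_no, col_no, field, message) tuples with 1-based positions.
    """
    rows, errors = [], []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        row = []
        row_ok = True
        for col_no, (conv, field) in enumerate(zip(converters, fields), start=1):
            try:
                row.append(conv(field))
            except ValueError as exc:
                # Record the problem and keep scanning instead of stopping.
                errors.append((line_no, col_no, field, str(exc)))
                row_ok = False
        if row_ok:
            rows.append(tuple(row))
    return rows, errors


if __name__ == "__main__":
    good, bad = parse_rows_collecting(["1,2.5", "x,4.0", "3,y"], [int, float])
    for line_no, col_no, field, message in bad:
        print(f"line {line_no}, column {col_no}: {field!r} ({message})")
```

Rows with any bad field are dropped from the result here; a real debug mode might prefer to fill them with a missing value instead.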