[Numpy-discussion] Question about improving genfromtxt errors

Skipper Seabold jsseabold@gmail....
Mon Sep 28 11:51:39 CDT 2009


On Mon, Sep 28, 2009 at 12:41 PM, Christopher Barker
<Chris.Barker@noaa.gov> wrote:
> Skipper Seabold wrote:
>> FWIW, I have a script that creates arrays from several text files
>> (about 1.5 GB of text in total) and saves them with savez.
>>
>> without the incrementing in genfromtxt
>>
>> Run time: 122.043943 seconds
>>
>> with the incrementing in genfromtxt
>>
>> Run time: 131.698873 seconds
>>
>> If we just want to always keep track of things, I would be willing to
>> take a poorly measured 8 % slowdown,
>
> I also think 8% is worth it, but I'm still surprised it's that much.
> What additional code is inside the inner loop? (or, I guess, the
> per-line loop...)
>
> -Chris
>

This was probably due to the way that I timed it, honestly.  I only
ran it once.  The only changes I made for that part are the ones in
the first post of the thread: two incremented scalars for line numbers
and column numbers, and a try/except block.
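
For anyone who has not seen the first post, here is a rough sketch of
that kind of bookkeeping.  It is not the actual genfromtxt change:
parse_rows and its arguments are hypothetical names, and it only
assumes plain delimited text with one converter per column.

```python
def parse_rows(lines, converters, delimiter=","):
    """Convert delimited text rows, reporting where a conversion fails.

    Hypothetical helper for illustration; not part of numpy.
    """
    rows = []
    # Line and column counters are the only extra per-field bookkeeping.
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        row = []
        for colno, (conv, field) in enumerate(zip(converters, fields), start=1):
            try:
                row.append(conv(field))
            except ValueError as exc:
                # Re-raise with the position so the bad cell is easy to find.
                raise ValueError(
                    f"line {lineno}, column {colno}: cannot convert {field!r}"
                ) from exc
        rows.append(tuple(row))
    return rows
```

Calling parse_rows(["1,2.5", "3,x"], [int, float]) would raise a
ValueError that names line 2, column 2, instead of a bare conversion
error.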

I'm really not against a debug mode if someone wants to write one and
it's deemed necessary.  If it could be made to log all of the errors,
that would be extremely helpful.  I still need to post some of my use
cases though.  Anything to help make data cleaning less of a chore...
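
To make the "log all of the errors" idea concrete, here is one way such
a mode could behave.  This is only a sketch under my own assumptions:
parse_rows_debug, the "parse_debug" logger name, and the choice to drop
bad rows are all hypothetical, and nothing like this exists in
genfromtxt as far as this thread goes.

```python
import logging

logger = logging.getLogger("parse_debug")  # hypothetical logger name


def parse_rows_debug(lines, converters, delimiter=","):
    """Convert rows, collecting every conversion failure instead of stopping.

    Hypothetical helper for illustration; not part of numpy.
    Returns (good_rows, errors), where each error is (line, column, text).
    """
    rows, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        row, bad = [], False
        for colno, (conv, field) in enumerate(zip(converters, fields), start=1):
            try:
                row.append(conv(field))
            except ValueError:
                # Record the failure and keep going so all errors are seen.
                errors.append((lineno, colno, field))
                logger.warning("line %d, column %d: cannot convert %r",
                               lineno, colno, field)
                bad = True
        if not bad:
            rows.append(tuple(row))  # only fully converted rows are kept
    return rows, errors
```

One pass over a messy file then gives the complete list of bad cells
rather than one failure per run.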

Skipper

