[Numpy-discussion] genloadtxt: second serving
Thu Dec 4 06:22:33 CST 2008
Pierre GM wrote:
> Here's the second round of genloadtxt. It's a slightly cleaner version than
> the previous one, in which I tried to take into account the different
> comments and suggestions that were posted. So, tabs should be supported
> and explicit whitespaces are not collapsed.
> FYI, in the __main__ section, you'll find 2 hotshot tests and a timeit
> comparison: same input, no missing data, one with genloadtxt, one with
> np.loadtxt and a last one with matplotlib.mlab.csv2rec.
> As you'll see, genloadtxt is roughly twice as slow as np.loadtxt, but
> twice as fast as csv2rec. One explanation for the slowness is
> indeed the use of classes for splitting lines and converting values.
> Instead of a basic function, we use the __call__ method of the class,
> which itself calls another function depending on the attribute values.
> I'd like to reduce this overhead; any suggestions are more than welcome,
> as usual.
> Anyhow: as we do need speed, I suggest we put genloadtxt somewhere in
> numpy.ma, with an alias recfromcsv for John, using his defaults. Unless
> somebody comes up with a brilliant optimization.
Will loadtxt in that case remain as is? Or will the _faulttolerantconv
class be used?
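
For reference, the kind of overhead described in the quoted message (a
class whose __call__ dispatches to another function depending on its
attributes, compared with a basic function) can be sketched roughly as
below. This is only an illustration, not the genloadtxt code: the names
LineSplitter, split_plain and the sample line are made up for the
example, and the timings will vary from machine to machine.

```python
import timeit


def split_plain(line, delimiter="\t"):
    """Basic function: split on an explicit delimiter."""
    return line.split(delimiter)


class LineSplitter:
    """Hypothetical class-based splitter (not the genloadtxt class).

    __call__ checks an attribute and forwards to one of two methods,
    so every call pays for an extra lookup and an extra function call.
    """

    def __init__(self, delimiter=None):
        self.delimiter = delimiter

    def _split_on_whitespace(self, line):
        # No delimiter given: runs of whitespace are collapsed.
        return line.split()

    def _split_on_delimiter(self, line):
        # Explicit delimiter: empty fields are kept, not collapsed.
        return line.split(self.delimiter)

    def __call__(self, line):
        if self.delimiter is None:
            return self._split_on_whitespace(line)
        return self._split_on_delimiter(line)


line = "1\t2\t\t4"
splitter = LineSplitter("\t")

# Both approaches give the same fields, including the empty one.
assert splitter(line) == split_plain(line)

# One possible way to trim the overhead (an assumption, not a tested
# fix for genloadtxt): pick the worker method once, outside the loop.
bound_worker = splitter._split_on_delimiter

n = 100000
t_plain = timeit.timeit(lambda: split_plain(line), number=n)
t_class = timeit.timeit(lambda: splitter(line), number=n)
t_bound = timeit.timeit(lambda: bound_worker(line), number=n)

print("plain function : %.4f s" % t_plain)
print("class __call__ : %.4f s" % t_class)
print("bound method   : %.4f s" % t_bound)
```

Choosing the worker once, when the splitter is created, rather than on
every call is one option that might be worth timing against the real
code.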