All,
Here's the second round of genloadtxt. It's a slightly cleaner version
than the previous one, in which I tried to take into account the
various comments and suggestions that were posted. Tabs should now be
supported, and explicit whitespace is no longer collapsed.
FYI, in the __main__ section you'll find two hotshot tests and a timeit
comparison on the same input with no missing data: one run with
genloadtxt, one with np.loadtxt, and one with matplotlib.mlab.csv2rec.
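
For readers who want to try this kind of comparison on their own, here
is a minimal sketch of a timeit harness. It is not the code from the
__main__ section: the two loaders (load_with_split, load_with_csv), the
compare helper and the SAMPLE input are all hypothetical stand-ins,
built on the standard library only, so the numbers say nothing about
genloadtxt, np.loadtxt or csv2rec.

```python
import csv
import io
import timeit

# Hypothetical sample input: 1000 comma-separated rows, no missing data.
SAMPLE = "\n".join("1,2.5,3" for _ in range(1000))


def load_with_split(text):
    """Stand-in loader A: split each line by hand."""
    return [[float(v) for v in line.split(",")] for line in io.StringIO(text)]


def load_with_csv(text):
    """Stand-in loader B: let the csv module split each line."""
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text))]


def compare(loaders, text, number=20, repeat=3):
    """Return the best per-call time (in seconds) for each loader."""
    timings = {}
    for name, loader in loaders.items():
        runs = timeit.repeat(lambda: loader(text), number=number, repeat=repeat)
        # The minimum is the measurement least disturbed by other processes.
        timings[name] = min(runs) / number
    return timings


if __name__ == "__main__":
    results = compare({"split": load_with_split, "csv": load_with_csv}, SAMPLE)
    for name, seconds in sorted(results.items(), key=lambda item: item[1]):
        print(f"{name:>6}: {seconds * 1e3:.3f} ms per call")
```

Swapping the real loaders into the dictionary passed to compare is
enough to time them against the same input.
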
As you'll see, genloadtxt is roughly twice as slow as np.loadtxt, but
twice as fast as csv2rec. One explanation for the slowness is the use
of classes for splitting lines and converting values. Instead of a
basic function, we use the __call__ method of the class, which itself
calls another function depending on the attribute values. I'd like to
reduce this overhead; any suggestions are more than welcome, as usual.
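
To make the overhead concrete, here is a minimal sketch of the pattern
described above, plus one possible way to trim it. None of this is the
actual genloadtxt code: split_plain, DispatchingSplitter and
PreboundSplitter are hypothetical names invented for the illustration.
The idea is to pick the helper once, at construction time, and expose
it as an attribute, so the per-line call skips both __call__ and the
attribute test.

```python
import timeit


def split_plain(line, delimiter=","):
    """Baseline: a basic function, no class involved."""
    return line.split(delimiter)


class DispatchingSplitter:
    """Picks a helper on every call, depending on its attributes."""

    def __init__(self, delimiter=None):
        self.delimiter = delimiter

    def _split_on_whitespace(self, line):
        return line.split()

    def _split_on_delimiter(self, line):
        return line.split(self.delimiter)

    def __call__(self, line):
        # Two extra steps per line: the __call__ lookup and this test.
        if self.delimiter is None:
            return self._split_on_whitespace(line)
        return self._split_on_delimiter(line)


class PreboundSplitter(DispatchingSplitter):
    """Chooses the helper once and stores the bound method."""

    def __init__(self, delimiter=None):
        super().__init__(delimiter)
        # __call__ cannot be replaced per instance (Python looks special
        # methods up on the type), so the choice is exposed as `split`.
        if delimiter is None:
            self.split = self._split_on_whitespace
        else:
            self.split = self._split_on_delimiter


if __name__ == "__main__":
    line = "1,2,3,4,5"
    candidates = {
        "plain function": lambda: split_plain(line),
        "__call__ dispatch": lambda s=DispatchingSplitter(","): s(line),
        "prebound method": lambda s=PreboundSplitter(",").split: s(line),
    }
    for name, stmt in candidates.items():
        best = min(timeit.repeat(stmt, number=200000, repeat=3))
        print(f"{name:>18}: {best:.3f} s")
```

In a loop over many lines, binding the method to a local name first
(split = splitter.split) also avoids repeating the attribute lookup.
How much this saves in practice has to be measured on the real code.
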
Anyhow, since we do need speed, I suggest we put genloadtxt somewhere in
numpy.ma, with an alias recfromcsv for John that uses his defaults,
unless somebody comes up with a brilliant optimization.
Let me know how it goes,
Cheers,
P.
