[Numpy-discussion] More loadtxt() changes

Ryan May rmay31@gmail....
Tue Nov 25 14:13:56 CST 2008

> On Nov 25, 2008, at 2:37 PM, Ryan May wrote:
>> What about doing the parsing and type inference in a loop and holding
>> onto the already split lines?  Then loop through the lines with the
>> converters that were finally chosen?  In addition to making my use case
>> work, this has the benefit of not doing the I/O twice.
> You mean, filling a list and relooping on it if we need to ? Sounds  
> like a plan, but doesn't it create some extra temporaries we may not  
> want ?

It shouldn't create any *extra* temporaries since we already make a list 
of lists before creating the final array.  It just introduces an extra 
looping step. (I'd reuse the existing list of lists).
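
A minimal sketch of that two-loop idea, assuming a rectangular, whitespace-delimited file and a simple int -> float -> str fallback. The names load_rows and _infer_converter are hypothetical, made up for illustration; this is not the actual loadtxt() code and it does not use StringConverter:

```python
import io


def _infer_converter(values):
    """Return the first of int, float that parses every value, else str."""
    for conv in (int, float):
        try:
            for v in values:
                conv(v)
        except ValueError:
            continue  # this type failed on some value; try the next one
        return conv
    return str


def load_rows(fh, delimiter=None):
    """Hypothetical helper: read once, infer per-column types, then convert."""
    # Loop 1: read the file a single time and keep the already split lines.
    rows = [line.strip().split(delimiter) for line in fh if line.strip()]
    if not rows:
        return []
    # Pick one converter per column from the tokens we kept
    # (assumes every row has the same number of fields).
    converters = [_infer_converter(col) for col in zip(*rows)]
    # Loop 2: convert in place, reusing the same list of lists.
    for row in rows:
        row[:] = [conv(tok) for conv, tok in zip(converters, row)]
    return rows


sample = io.StringIO("1 2.5 abc\n3 4 def\n")
rows = load_rows(sample)
```

The file is read only once, and the second loop overwrites the existing rows rather than building a second list of lists.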

> Depends on how we do it. We could have a  modified np.loadtxt that  
> takes some of the ideas of the file I sent you (the StringConverter,  
> for example), then I could have a numpy.ma.io that would take care of  
> the missing data. And something in scikits.timeseries for the dates...
> The new np.loadtxt could use the default of the initial one, or we  
> could create yet another function (np.loadfromtxt) that would match  
> what I was suggesting, and np.loadtxt would be a special stripped-down
> case with dtype=float by default.
> thoughts?

My personal opinion is that, if it doesn't make loadtxt too unwieldy, we
should just add a few of the options to loadtxt() itself.  I'm working on
tweaking loadtxt() to add the auto dtype and the names, relying heavily 
on your StringConverter class (nice code btw.).  If my understanding of 
StringConverter is correct, tweaking the new loadtxt for ma or 
timeseries would only require passing in modified versions of 

I'll post that when I'm done and we can see if it looks like too much 
functionality stapled together or not.


Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
