[Numpy-discussion] genfromtxt - the return

Christopher Barker Chris.Barker@noaa....
Wed Oct 7 14:14:58 CDT 2009

Pierre GM wrote:
> On Oct 6, 2009, at 10:08 PM, Bruce Southey wrote:
>> option to merge delimiters - actually in SAS it is default

Wow! That sure strikes me as a bad choice.

> Ahah! I get it. Well, I remember that we discussed something like that a  
> few months ago when I started working on np.genfromtxt, and the  
> default of *not* merging whitespaces was requested. I gonna check  
> whether we can't put this option somewhere now...

I'd think you might want to have two options: either "whitespace", which 
would match any type or amount of whitespace, or a specific delimiter: say 
"\t" or " " or "  " (two spaces), etc. In that case, it would mean "one 
and only one of these".
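
To make the two modes concrete, here is a minimal sketch in plain Python. 
It uses only str.split and is not how np.genfromtxt is implemented; the 
helper name split_fields is made up for illustration.

```python
def split_fields(line, delimiter=None):
    """Split one line of text into fields.

    delimiter=None: "whitespace" mode, any run of whitespace is one separator.
    delimiter=str:  "one and only one of these", so two delimiters in a row
                    enclose an empty field.
    """
    line = line.rstrip("\r\n")
    if delimiter is None:
        return line.split()
    return line.split(delimiter)


# Whitespace mode: mixed spaces and tabs collapse into single separators.
whitespace_fields = split_fields("1 \t  2   3\n")

# Specific delimiter: the doubled tab marks an empty second field.
tab_fields = split_fields("1\t\t3\n", delimiter="\t")

# A two-space delimiter: four spaces in a row mean one empty field.
two_space_fields = split_fields("1  2    4\n", delimiter="  ")
```

In whitespace mode an empty field can never be represented, which is the 
price of being forgiving about the amount of whitespace.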

Of course, this would fail in Bruce's example:

 >>>> A B C D
 >>>> 1 2 3 4
 >>>> 1     4 5

as there is a space for the delimiter, and one for the data! This looks 
like fixed-format to me. If it were single-space delimited, it would 
look more like:

A B C D E
1 2 3 4 5
1   4 5

which is the same as:

A, B, C, D, E
1, 2, 3, 4, 5
1,  ,  , 4, 5
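
A quick check of that equivalence, again with plain str.split rather than 
np.genfromtxt (the variable names are just for illustration):

```python
# The row with two empty fields, written with a single-space delimiter
# and with a comma delimiter.
single_space_row = "1   4 5"
comma_row = "1,  ,  , 4, 5"

# Exactly one space per delimiter: three spaces in a row enclose two
# empty fields.
space_fields = single_space_row.split(" ")

# Split on commas, then strip the padding around each field.
comma_fields = [field.strip() for field in comma_row.split(",")]

# Both readings give five fields, with the second and third empty.
same_fields = space_fields == comma_fields
```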

If something like SAS actually does merge delimiters, which I interpret 
to mean that if there are a few empty fields and you call for 
tab-delimited, you only get one tab, then information has simply been 
lost -- there is no way to recover it!
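
Here is a small sketch of that loss. The merge_delimiters helper is 
hypothetical and only models my reading of "merge delimiters"; I don't 
know exactly what SAS does internally.

```python
import re


def merge_delimiters(line, delimiter="\t"):
    """Collapse every run of consecutive delimiters into a single one."""
    pattern = "(?:" + re.escape(delimiter) + ")+"
    return re.sub(pattern, delimiter, line)


# Two different four-column rows: one is missing column B, the other
# is missing column C.
row_missing_b = "1\t\t3\t4"
row_missing_c = "1\t3\t\t4"

merged_b = merge_delimiters(row_missing_b)
merged_c = merge_delimiters(row_missing_c)

# After merging, both rows are the same three-field line, so nothing
# downstream can tell which column was empty.
indistinguishable = merged_b == merged_c
fields_after_merge = merged_b.split("\t")
```

Two distinct inputs map to the same output, so no reader of the merged 
file can put the values back in the right columns.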


Christopher Barker, Ph.D.

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

