[Numpy-discussion] `missing` argument in genfromtxt only a string?

Bruce Southey bsouthey@gmail....
Tue Sep 15 08:43:16 CDT 2009


On 09/14/2009 09:31 PM, Skipper Seabold wrote:
> On Mon, Sep 14, 2009 at 9:59 PM, Pierre GM<pgmdevlist@gmail.com>  wrote:
>    
[snip]
>> OK, I see the problem...
>> When no dtype is defined, we try to guess what a converter should
>> return by testing its inputs. At first we check whether the input is a
>> boolean, then whether it's an integer, then a float, and so on. When
>> you define explicitly a converter, there's no need for all those
>> checks, so we lock the converter to a particular state, which sets the
>> conversion function and the value to return in case of missing.
>> Except that I messed it up and it fails in that case (the conversion
>> function is set properly, but the dtype of the output is still
>> undefined). That's a bug, I'll try to fix that once I've tamed my snow
>> kitten.
>>      
> No worries.  I really like genfromtxt (having recently gotten pretty
> familiar with it) and would like to help out with extending it towards
> these kinds of cases if there's an interest and this is feasible.
>
> I tried another workaround for the dates with my converters defined as conv
>
> conv.update({date : lambda s : datetime(*map(int,
> s.strip().split('/')[-1:]+s.strip().split('/')[:2]))})
>
> Where `date` is the column that contains a date.  The problem was that
> my dates are "mm/dd/yyyy" and datetime needs "yyyy,mm,dd".  It worked
> for a test case if my dates were "dd/mm/yyyy" and I just used reversed,
> but inside genfromtxt it gave an error about not finding the day in the
> third position, even though that lambda function worked for a test case
> outside of genfromtxt.
>
>    
>> Meanwhile, you can use tsfromtxt (in scikits.timeseries),
>>      
In SAS there are multiple ways to define formats, especially for dates:
http://support.sas.com/onlinedoc/913/getDoc/en/lrcon.hlp/a002200738.htm

It would be nice to accept the common variants (USA vs English dates) as
well as two-digit vs four-digit year codes.
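
A minimal sketch of what accepting those variants might look like as a
converter, using only datetime.strptime from the standard library. The
name parse_date, the DATE_FORMATS tuple and the order in which layouts
are tried are all assumptions made for illustration; nothing like this
is part of genfromtxt itself.

```python
from datetime import datetime

# Candidate layouts, tried in order (hypothetical choice). An ambiguous
# date such as 03/04/2009 is read with the first layout that fits, so
# here the USA month-first reading wins.
DATE_FORMATS = ("%m/%d/%Y", "%d/%m/%Y", "%m/%d/%y", "%d/%m/%y")


def parse_date(text, formats=DATE_FORMATS):
    """Return a datetime for the first layout in `formats` that matches."""
    # Some NumPy versions hand converters bytes rather than str.
    if isinstance(text, bytes):
        text = text.decode("ascii")
    text = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognized date: %r" % text)
```

Such a function could be passed through the converters argument, for
example converters={date: parse_date}. Truly ambiguous dates cannot be
resolved this way, so the order of the layouts would have to be chosen
by the user.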



>> or even
>> simpler, define a dtype for the output (you know that your first
>> column is a str, your second an object, and the others ints or floats...
>>
>>      
How do you specify different dtypes in genfromtxt?
I could not see the information in the docstring and the dtype argument 
does not appear to allow multiple dtypes.
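
For reference, a sketch of one form that appears to work in recent NumPy
releases: a structured dtype given as a list of (name, type) pairs, one
per column. The sample data, the field names and the explicit encoding
argument are assumptions for illustration (the encoding argument in
particular is newer than this thread), and this is not necessarily the
form Pierre had in mind.

```python
from io import StringIO

import numpy as np

# Made-up sample data: a string column, an integer column, a float column.
data = StringIO("a,1,2.5\nb,2,3.5\n")

# One (field name, type) pair per column; the field names are hypothetical.
column_types = [("name", "U8"), ("count", int), ("value", float)]

arr = np.genfromtxt(data, delimiter=",", dtype=column_types,
                    encoding="utf-8")

# Each column is then available by field name.
names = arr["name"]
counts = arr["count"]
values = arr["value"]
```

The result is a one-dimensional structured array with one record per
row rather than a two-dimensional array.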

Bruce


