[IPython-User] CSV import with malformed/blank data suppressed?

Johnny yggdrasil@gmx.co...
Tue Aug 30 14:57:32 CDT 2011


I have been fiddling with the csv reader to get CSV data imported into a
numpy array, but I am struggling to find a good method for it. The data file
contains one header row with descriptors, followed by CSV data that should
be imported as floats, but some of the data is missing or malformed. I would
like to import it regardless and replace any malformed values with NaN.
Here is what I have tried so far:

I create a csv reader object with:

,----
| reader = csv.reader(open("myfile.csv", "rb"))
`----

I get rid of the header row by doing:

,----
| descriptors = reader.next()
`----

Then I want to read in the rest as floats. Since the data is read in as
strings, I try to convert it with:

,----
| data = numpy.array(list(reader), dtype=float)
`----

Unfortunately, some of the CSV data is blank, as below, or otherwise malformed:

,----
| 123,7.3,,,12,18
`----

This means I get an error and the data is not imported. Ideally, I would
like to get at most a warning and have all non-numeric (non-float) values
replaced by NaN. How does one import CSV data like this efficiently?
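
To make the desired behaviour concrete, here is a minimal sketch of what I
am after. It is not a tuned solution: it assumes Python 3 and uses only the
standard library, the helper names to_float and read_csv_as_floats are
hypothetical, and the in-memory sample merely stands in for myfile.csv.
Every field that float() rejects is replaced by NaN and a warning is issued.

```python
import csv
import io
import warnings


def to_float(text):
    """Return float(text), or NaN (with a warning) if text is blank or malformed."""
    try:
        return float(text)
    except ValueError:
        warnings.warn("non-numeric field %r replaced by NaN" % (text,))
        return float("nan")


def read_csv_as_floats(handle):
    """Read CSV from an open text handle; return (descriptors, rows of floats)."""
    reader = csv.reader(handle)
    descriptors = next(reader)  # first row holds the descriptors
    rows = [[to_float(field) for field in row] for row in reader]
    return descriptors, rows


# Hypothetical sample data: one header row, one row with blanks,
# one row with a malformed field.
sample = io.StringIO("a,b,c,d,e,f\n123,7.3,,,12,18\n1,2,3,x,5,6\n")

# Record the warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    descriptors, rows = read_csv_as_floats(sample)
```

If every row has the same number of fields, numpy.array(rows) should then
give a 2-D float array with NaN where the blanks were. numpy.genfromtxt may
also be worth a look, since it is documented to handle missing values, but
the sketch above does not depend on it.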

Thanks for any advice, and if the question is misplaced, please point me
to the appropriate forum.

Cheers,
-- 
Johnny

