[NumPy-Tickets] [NumPy] #1071: loadtxt fails if the last column contains empty value

NumPy Trac numpy-tickets@scipy....
Thu Mar 31 12:10:05 CDT 2011


#1071: loadtxt fails if the last column contains empty value
---------------------------------+------------------------------------------
 Reporter:  Electrion            |       Owner:  somebody    
     Type:  defect               |      Status:  needs_review
 Priority:  normal               |   Milestone:  1.6.0       
Component:  numpy.lib            |     Version:  devel       
 Keywords:  loadtxt ascii strip  |  
---------------------------------+------------------------------------------

Comment(by bsouthey):

 Replying to [comment:4 derek]:
 > I am afraid I don't understand how this pertains to the ticket.
 I want to keep a clear distinction between loadtxt and genfromtxt.
 genfromtxt was added in 2009 specifically to address many of the
 limitations of loadtxt (added in 2007), so we should avoid adding
 features to loadtxt that duplicate ones already present in genfromtxt.

 The ticket says 'empty value', which means 'missing data'. My response is
 that loadtxt must fail in such cases, because the loadtxt docstring states
 in three separate and critical places that missing values are not
 handled:
   "Each row in the text file must have the same number of values."
   "genfromtxt : Load data with missing values handled as specified."
   "This function aims to be a fast reader for simply formatted files.
   The genfromtxt function provides more sophisticated handling of, e.g.,
   lines with missing values."
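
 As an illustration of the division of labour described above, the
 following is a minimal sketch of reading a file whose last column has an
 empty value with genfromtxt. The sample data and variable names are made
 up for this example, and it assumes Python 3 with a reasonably recent
 NumPy; with the default float dtype, the empty field should come back as
 nan rather than raising an error.

```python
import io

import numpy as np

# Hypothetical sample data: the second row has an empty last field.
text = "1,2,3\n4,5,\n"

# genfromtxt splits on the explicit delimiter and treats the empty
# field as a missing value, which becomes nan for a float array.
data = np.genfromtxt(io.StringIO(text), delimiter=",")

print(data.shape)   # two rows, three columns
print(data[1, 2])   # the missing value in the last column (nan)
```

 This only shows the genfromtxt side; it makes no claim about how loadtxt
 reports the same input in any particular NumPy version.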

 Really, the ticket and your last example arise from the misuse of
 loadtxt features (i.e. converters) to handle missing data. Since loadtxt
 is not meant to handle missing values, the ticket and your example are
 invalid. If missing values were allowed, then there would indeed be a
 problem.

 The attached patch would still be incorrect, because it should not include
 the space in the strip. However, I am not sure whether similar cases will
 be uncovered (such as more than one delimiter in certain lines), or
 whether the split_line function should really be changed so that it first
 splits valid lines on the delimiter.
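
 To make the concern about the strip concrete, here is a small sketch in
 plain Python. The helper split_fields is hypothetical and is not NumPy's
 split_line; it only shows, under that assumption, how the set of
 characters passed to str.strip changes the number of fields when the
 delimiter is itself a whitespace character.

```python
def split_fields(line, delimiter, strip_chars=None):
    """Hypothetical helper: strip the line, then split on the delimiter.

    strip_chars=None makes str.strip remove all leading and trailing
    whitespace, including tabs and spaces.
    """
    return line.strip(strip_chars).split(delimiter)


tab_line = "1\t2\t\n"      # tab-delimited, empty last field
space_line = "1 2 \n"      # space-delimited, empty last field

# Stripping all whitespace also removes the trailing delimiter,
# so the empty last field silently disappears.
tab_all = split_fields(tab_line, "\t")
# Stripping only line endings keeps the trailing delimiter,
# so the empty last field is preserved as ''.
tab_eol = split_fields(tab_line, "\t", "\r\n")

# Including the space in the strip has the same effect on
# space-delimited data as stripping all whitespace.
space_with_space = split_fields(space_line, " ", " \r\n")
space_eol = split_fields(space_line, " ", "\r\n")

print(tab_all, tab_eol)
print(space_with_space, space_eol)
```

 Whichever characters are stripped, the field count of a row can only be
 trusted if the trailing delimiter survives until the split.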

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1071#comment:6>
NumPy <http://projects.scipy.org/numpy>