[Numpy-discussion] Question about improving genfromtxt errors
Fri Sep 25 14:34:45 CDT 2009
On 09/25/2009 01:25 PM, Skipper Seabold wrote:
> On Fri, Sep 25, 2009 at 2:17 PM, Pierre GM<firstname.lastname@example.org> wrote:
>> Sorry all, I haven't been as responsive as I wished lately...
>> * About the patch: I don't like the idea of adding yet more
>> tests in the main loop. I was more into leaving things as they are,
>> but calling some error function if some 'setting an array element with
>> a sequence' exception is raised. This function would take 'rows' as an
>> input and would check the length of each row. That way, we don't slow
>> things down when everything works, but just add some delay when they
>> don't. I'll try to come up w/ something soon (in the next couple of
I tend to agree, but I think that the actual array() function should give a
more meaningful error about mismatched data, such as indicating the row. I
think that it would be too late to go back to the data and try to figure
out why the exception occurred. If you wait until array() is called
then you have passed up at least two opportunities to check the whole
data. The data is parsed at least twice: first in the
itertools.chain loop and then in the subsequent enumeration over
rows (lines 981 and 1006 of the unpatched io.py).
Really, it is a question of how useful the messages are and whether (or when)
genfromtxt should stop on an error. For a huge data set, I can see that
stopping on an error is useful because it avoids parsing all the data.
But listing all the errors is also useful, especially when you can fix
them all at once.
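
To make the two behaviours concrete, here is a rough sketch of a row-length
check that can either stop at the first bad row or list every bad row. It is
not code from io.py or from the patch. The name find_bad_rows and its
arguments are made up for illustration, and it assumes 'rows' is a sequence
of already-split rows, with the first row defining the expected number of
fields.

```python
def find_bad_rows(rows, expected=None, stop_on_first=False):
    """Return (row_number, n_fields) for each row with the wrong length.

    Hypothetical helper, not part of numpy. Row numbers start at 1.
    """
    rows = list(rows)
    if not rows:
        return []
    if expected is None:
        # Assumption: the first row has the correct number of fields.
        expected = len(rows[0])
    bad = []
    for number, row in enumerate(rows, start=1):
        if len(row) != expected:
            bad.append((number, len(row)))
            if stop_on_first:
                # Stop early: useful for huge inputs.
                break
    return bad


rows = [("1", "2", "3"), ("4", "5"), ("6", "7", "8"), ("9",)]
print(find_bad_rows(rows))                      # [(2, 2), (4, 1)]
print(find_bad_rows(rows, stop_on_first=True))  # [(2, 2)]
```

Something like this could run only after an exception is raised, which costs
nothing when the data are clean, or it could run during parsing, which costs
one extra length comparison per row.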
>> * About the converter error: there's indeed a bug in
>> StringConverter.upgrade, I need to write some unittests to make sure I
>> get it covered. If you could get me some sample code, that'd be great.
> Hmm, I'm not sure that the error I'm seeing is the same as the bug we
> had previously discussed. In this case, the converters are wrong and
> I need to know about it. I will try to post an example of the two
> times I've seen this error raised when I get a minute.
Samples of using it would be great.