[Numpy-discussion] major bug in fromstring, ascii mode
Eric Firing
efiring@hawaii....
Sun Jan 27 13:40:18 CST 2008
Pauli Virtanen wrote:
> On Sun, 2008-01-27 at 01:16 -0700, Charles R Harris wrote:
>>
>> On Jan 26, 2008 11:30 PM, Eric Firing <efiring@hawaii.edu> wrote:
>> In the course of trying to parse ascii times, I ran into a
>> puzzling bug.
>> Sometimes it works as expected:
>>
>> In [31]:npy.fromstring('23:19:01', dtype=int, sep=':')
>> Out[31]:array([23, 19, 1])
>>
>> But sometimes it doesn't:
>>
>> In [32]:npy.fromstring('23:09:01', dtype=int, sep=':')
>> Out[32]:array([23, 0])
>>
>> In [33]:npy.__version__
>> Out[33]:'1.0.5.dev4742'
>>
>> Works here.
>
> I think it's that some numbers work, and some don't. Consider:
>
> >>> npy.fromstring('23:06:01', dtype=int, sep=':')
> array([23, 6, 1])
> >>> npy.fromstring('23:07:01', dtype=int, sep=':')
> array([23, 7, 1])
> >>> npy.fromstring('23:08:01', dtype=int, sep=':')
> array([23, 0])
> >>> npy.fromstring('23:09:01', dtype=int, sep=':')
> array([23, 0])
>
> and
>
> >>> npy.fromstring('23:010:01', dtype=int, sep=':')
> array([23, 8, 1])
> >>> npy.fromstring('23:011:01', dtype=int, sep=':')
> array([23, 9, 1])
>
> and
>
> >>> npy.fromstring('23:0xff:01', dtype=int, sep=':')
> array([ 23, 255, 1])
>
> Smells like some scanf function is interpreting numbers beginning with
> zero as octal, and recognizing also hexadecimals.
That is it exactly. The code in core/src/arraytypes.inc.src is using
scanf, and scanf tries hard to recognize integers specified in different
ways. So, what caught me is a feature, not a bug, and I should have
recognized it as such right away. The bug was in my expectations, not
in the code.
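
To make the mechanism concrete, here is a rough pure-Python sketch of what I believe is going on. It is only an emulation of C-style integer parsing with automatic base detection (the behaviour of scanf's %i conversion), not the actual numpy code; the names c_style_int_prefix and parse_separated are made up for illustration, and the rule "stop when the separator does not follow" is my assumption about why the output is cut short.

```python
def c_style_int_prefix(text):
    """Emulate C integer parsing with automatic base detection.

    Returns (value, chars_consumed). A leading "0x" selects base 16,
    a leading "0" selects base 8, anything else is base 10. Parsing
    stops at the first character that is not a digit in that base.
    """
    i = 0
    sign = 1
    if text[:1] in ("+", "-"):
        sign = -1 if text[0] == "-" else 1
        i = 1
    hex_digits = "0123456789abcdef"
    if text[i:i + 2].lower() == "0x" and text[i + 2:i + 3].lower() in hex_digits[:16] and text[i + 2:i + 3]:
        base = 16
        i += 2
    elif text[i:i + 1] == "0":
        base = 8
    else:
        base = 10
    digits = hex_digits[:base]
    start = i
    value = 0
    while i < len(text) and text[i].lower() in digits:
        value = value * base + digits.index(text[i].lower())
        i += 1
    if i == start:
        return 0, 0  # nothing could be converted
    return sign * value, i


def parse_separated(text, sep):
    """Parse sep-delimited integers, giving up when sep does not follow."""
    values = []
    pos = 0
    while pos < len(text):
        value, used = c_style_int_prefix(text[pos:])
        if used == 0:
            break
        values.append(value)
        pos += used
        if not text.startswith(sep, pos):
            break  # e.g. the "9" left over from "09" is not a separator
        pos += len(sep)
    return values


print(parse_separated("23:19:01", ":"))
print(parse_separated("23:09:01", ":"))
print(parse_separated("23:010:01", ":"))
print(parse_separated("23:0xff:01", ":"))
```

In this sketch '09' is read as the octal number 0 followed by a stray '9', which is not the separator, so parsing ends after two values. That reproduces every result shown in the quoted examples.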
>
> This is a bit surprising, and whether this is the desired behavior is
> questionable.
>
From a user's standpoint it would be nice to be able to have numbers
with leading zeros interpreted as base 10 instead of octal, since this
turns up any time one converts date and time-of-day strings, and can
occur in many other contexts also. (Outside of computer science octal is
rare, as far as I know.) It looks like supporting this would require
quite a bit of change in the code, however. I suspect it would have to
go in as a kwarg that would be propagated through several layers of C
function calls. Otherwise, if octal conversion support were simply
dropped, I suspect someone else's code would break, and equally
reasonable expectations would be violated.
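
In the meantime, a workaround sketch for the time-of-day case, assuming it is acceptable to do the splitting in Python instead of in fromstring. The helper name parse_decimal_fields is hypothetical, not anything in numpy. Python's int() with an explicit base of 10 accepts leading zeros and never switches to octal.

```python
def parse_decimal_fields(text, sep=":"):
    """Split text on sep and convert each field as a base-10 integer.

    int(field, 10) accepts leading zeros, so '09' gives 9 rather
    than being treated as a malformed octal number.
    """
    return [int(field, 10) for field in text.split(sep)]


print(parse_decimal_fields("23:09:01"))
```

The resulting list can be handed to npy.array() if an array is needed. This is slower than a C-level parse for large inputs, so it is a stopgap rather than a replacement.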
Eric