[NumPy-Tickets] [NumPy] #2210: Mixing regular IO with numpy.fromfile confuses file offset
NumPy Trac
numpy-tickets@scipy....
Tue Sep 4 15:17:14 CDT 2012
#2210: Mixing regular IO with numpy.fromfile confuses file offset
------------------------------------+---------------------------------------
Reporter: allen@… | Owner: somebody
Type: defect | Status: new
Priority: normal | Milestone: Unscheduled
Component: numpy.core | Version: 1.6.2
Keywords: |
------------------------------------+---------------------------------------
I have a binary file which is written by a C program. It is essentially a
bunch of integer and single-precision floating point values written one
after another. I'm reading the file partly with Python's
{{{read()}}} method (usually followed by {{{struct.unpack()}}}) and
partly with {{{numpy.fromfile()}}}. Generally, I'm extracting the scalars
with {{{read()}}}/{{{unpack()}}} and the arrays with {{{fromfile()}}}. I've
discovered that {{{fromfile()}}} can become confused when the file itself is
larger than a particular size. On my Red Hat Enterprise Linux 6.3 and
Ubuntu 12.04 64-bit systems, this size is 4096 bytes. It appears to work
OK on Windows XP regardless of the file size.
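A minimal sketch of the read pattern described above, for readers without the attachment. It is not the attached program: the record layout (one little-endian int32 followed by 29 float32 values) and the names {{{write_record}}}, {{{read_record}}} and {{{N_VALUES}}} are assumptions made for illustration.

```python
import struct

import numpy as np

N_VALUES = 29  # hypothetical number of float32 values per record


def write_record(f, tag, values):
    """Write one int32 scalar followed by float32 array data (assumed layout)."""
    f.write(struct.pack("<i", tag))
    f.write(np.asarray(values, dtype="<f4").tobytes())


def read_record(f, count=N_VALUES):
    """Read the scalar with read()/unpack() and the array with fromfile()."""
    (tag,) = struct.unpack("<i", f.read(4))
    values = np.fromfile(f, dtype="<f4", count=count)
    # tell() after fromfile() is the offset this ticket is about.
    return tag, values, f.tell()
```

For the first record of a file opened with {{{open(path, "rb")}}}, the offset returned by {{{read_record}}} should be 4 + 29 * 4 bytes; a different value would indicate the offset confusion reported here.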
I attached a simple program which writes a simple binary file and then
reads it back. It should produce the output:
{{{
offset0: 125
[-70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10. -5. 0.
5. 10. 15. 20. 25. 30. 35. 40. 45. 50. 55. 60. 65. 70.]
offset1: 241
[-80. -75. -70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10.
-5. 0. 5. 10. 15. 20. 25. 30. 35. 40. 45. 50. 55. 60.]
offset2: 357
[-90. -85. -80. -75. -70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20.
-15. -10. -5. 0. 5. 10. 15. 20. 25. 30. 35. 40. 45. 50.]
offset3: 473
}}}
On Linux I get:
{{{
offset0: 125
[-70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10. -5. 0.
5. 10. 15. 20. 25. 30. 35. 40. 45. 50. 55. 60. 65. 70.]
offset1: 242
[ 1.78734834e-38 1.78698961e-38 1.78663088e-38 1.78627215e-38
1.78562643e-38 1.78490896e-38 1.78419150e-38 1.78347403e-38
1.78275657e-38 1.78203910e-38 1.78103465e-38 1.77959972e-38
1.77816479e-38 1.77644288e-38 1.77357302e-38 1.76898124e-38
0.00000000e+00 5.93486894e-39 5.98078669e-39 6.00948528e-39
6.02670444e-39 6.04105373e-39 6.05540303e-39 6.06544754e-39
6.07262218e-39 6.07979683e-39 6.08697148e-39 6.09414613e-39
6.10132078e-39]
offset2: 358
[ 1.78806581e-38 1.78770708e-38 1.78734834e-38 1.78698961e-38
1.78663088e-38 1.78627215e-38 1.78562643e-38 1.78490896e-38
1.78419150e-38 1.78347403e-38 1.78275657e-38 1.78203910e-38
1.78103465e-38 1.77959972e-38 1.77816479e-38 1.77644288e-38
1.77357302e-38 1.76898124e-38 0.00000000e+00 5.93486894e-39
5.98078669e-39 6.00948528e-39 6.02670444e-39 6.04105373e-39
6.05540303e-39 6.06544754e-39 6.07262218e-39 6.07979683e-39
6.08697148e-39]
offset3: 474
}}}
As you can see, the first array is read correctly, but the file offset
reported after the first {{{fromfile()}}} call ({{{offset1}}}) is wrong:
it should be 241 but is 242. The arrays that follow are then read from the
wrong position.
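The expected offsets can be checked by hand: each printed array holds 29 single-precision values, so each {{{fromfile()}}} call should advance the file by 29 * 4 = 116 bytes. The sketch below reproduces that arithmetic, assuming the arrays are stored back to back as 4-byte floats, which is consistent with the expected output. The helper name {{{expected_offsets}}} is made up for illustration.

```python
import struct

FLOAT_SIZE = struct.calcsize("<f")  # 4 bytes per single-precision float
N_VALUES = 29                       # values in each printed array


def expected_offsets(start, n_arrays, count=N_VALUES):
    """Return the file offset before the first array and after each array."""
    offsets = [start]
    for _ in range(n_arrays):
        offsets.append(offsets[-1] + count * FLOAT_SIZE)
    return offsets


print(expected_offsets(125, 3))  # [125, 241, 357, 473]
```

Each offset observed on Linux (242, 358, 474) is exactly one byte past the corresponding expected value, while the step between them is still 116 bytes.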
I glanced at the C code which implements {{{fromfile()}}} and didn't see
anything obviously incorrect, except that it makes a copy of the
underlying file handle to do the read. I wonder whether this is
exposing a bug in glibc or in the Python file handling layer.
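A possible workaround, sketched here and not confirmed anywhere in this ticket, is to keep all reads on the Python file object: read the raw bytes with {{{read()}}} and convert them with {{{numpy.frombuffer()}}}, so no second file handle is involved. The helper name {{{read_array}}} and the little-endian float32 default are assumptions.

```python
import numpy as np


def read_array(f, count, dtype="<f4"):
    """Read `count` items from an open binary file using only f.read()."""
    dt = np.dtype(dtype)
    nbytes = count * dt.itemsize
    raw = f.read(nbytes)
    if len(raw) != nbytes:
        raise EOFError("expected %d bytes, got %d" % (nbytes, len(raw)))
    # frombuffer() returns a read-only view of the bytes; copy() makes the
    # result an ordinary writable array.
    return np.frombuffer(raw, dtype=dt).copy()
```

Because the data passes through the same buffered reader as the {{{read()}}}/{{{unpack()}}} calls, {{{f.tell()}}} afterwards reflects exactly the bytes consumed.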
Thanks,
Allen
--
Ticket URL: <http://projects.scipy.org/numpy/ticket/2210>
NumPy <http://projects.scipy.org/numpy>