[NumPy-Tickets] [NumPy] #2210: Mixing regular IO with numpy.fromfile confuses file offset

NumPy Trac numpy-tickets@scipy....
Tue Sep 4 15:17:14 CDT 2012


#2210: Mixing regular IO with numpy.fromfile confuses file offset
------------------------------------+---------------------------------------
 Reporter:  allen@…                 |       Owner:  somebody   
     Type:  defect                  |      Status:  new        
 Priority:  normal                  |   Milestone:  Unscheduled
Component:  numpy.core              |     Version:  1.6.2      
 Keywords:                          |  
------------------------------------+---------------------------------------
 I have a binary file written by a C program. It is essentially a sequence
 of integer and single-precision floating-point values written one after
 another. I'm reading the file partly with the Python {{{read()}}} method
 (usually followed by {{{struct.unpack()}}}) and partly with
 {{{numpy.fromfile()}}}: the scalars with {{{read()}}}/{{{unpack()}}} and
 the arrays with {{{fromfile()}}}. I've discovered that {{{fromfile()}}}
 can lose track of the file offset, so that later reads return garbage,
 if the file is larger than a particular size. On my Red Hat Enterprise
 Linux 6.3 and Ubuntu 12.04 64-bit systems, this size is 4096 bytes. It
 appears to work correctly on Windows XP regardless of the file size.
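
 The access pattern described above can be sketched roughly as follows.
 This is not the attached program. The 125-byte header layout
 ({{{"<31iB"}}}), the padding size, and the names {{{write_file}}} and
 {{{read_file}}} are assumptions made for illustration; only the header
 size, the array length (29 float32 values) and the array contents are
 taken from the output shown in this ticket.

```python
import os
import struct
import tempfile

import numpy as np

HEADER_SIZE = 125  # matches "offset0: 125" in the report
N = 29             # float32 values per array, as in the printed arrays


def write_file(path, pad_bytes=8192):
    """Write a 125-byte header, three float32 arrays, then padding."""
    with open(path, "wb") as f:
        # Hypothetical header: 31 int32 values plus one byte = 125 bytes.
        f.write(struct.pack("<31iB", *range(32)))
        for start in (-70.0, -80.0, -90.0):
            arr = (start + 5.0 * np.arange(N)).astype("<f4")
            f.write(arr.tobytes())
        # Padding pushes the file size past the 4096-byte threshold.
        f.write(b"\0" * pad_bytes)


def read_file(path):
    """Read scalars with read()/struct.unpack() and arrays with fromfile()."""
    offsets, arrays = [], []
    with open(path, "rb") as f:
        # Scalars: plain Python I/O on the file object.
        header = struct.unpack("<31iB", f.read(HEADER_SIZE))
        offsets.append(f.tell())
        # Arrays: numpy.fromfile() on the same file object.
        for _ in range(3):
            arrays.append(np.fromfile(f, dtype="<f4", count=N))
            offsets.append(f.tell())
    return offsets, arrays


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "mixed_io.bin")
    write_file(path)
    offsets, arrays = read_file(path)
    for i, arr in enumerate(arrays):
        print("offset%d: %d" % (i, offsets[i]))
        print(arr)
    print("offset3: %d" % offsets[3])
```

 On a build without the problem, {{{read_file}}} should report the offsets
 125, 241, 357 and 473, each array read advancing the offset by 29 * 4 =
 116 bytes.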

 I have attached a short program that writes a simple binary file and
 then reads it back. It should produce this output:
 {{{
 offset0: 125
 [-70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10.  -5.   0.
    5.  10.  15.  20.  25.  30.  35.  40.  45.  50.  55.  60.  65.  70.]
 offset1: 241
 [-80. -75. -70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10.
   -5.   0.   5.  10.  15.  20.  25.  30.  35.  40.  45.  50.  55.  60.]
 offset2: 357
 [-90. -85. -80. -75. -70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20.
  -15. -10.  -5.   0.   5.  10.  15.  20.  25.  30.  35.  40.  45.  50.]
 offset3: 473
 }}}
 On Linux I get:
 {{{
 offset0: 125
 [-70. -65. -60. -55. -50. -45. -40. -35. -30. -25. -20. -15. -10.  -5.   0.
    5.  10.  15.  20.  25.  30.  35.  40.  45.  50.  55.  60.  65.  70.]
 offset1: 242
 [  1.78734834e-38   1.78698961e-38   1.78663088e-38   1.78627215e-38
    1.78562643e-38   1.78490896e-38   1.78419150e-38   1.78347403e-38
    1.78275657e-38   1.78203910e-38   1.78103465e-38   1.77959972e-38
    1.77816479e-38   1.77644288e-38   1.77357302e-38   1.76898124e-38
    0.00000000e+00   5.93486894e-39   5.98078669e-39   6.00948528e-39
    6.02670444e-39   6.04105373e-39   6.05540303e-39   6.06544754e-39
    6.07262218e-39   6.07979683e-39   6.08697148e-39   6.09414613e-39
    6.10132078e-39]
 offset2: 358
 [  1.78806581e-38   1.78770708e-38   1.78734834e-38   1.78698961e-38
    1.78663088e-38   1.78627215e-38   1.78562643e-38   1.78490896e-38
    1.78419150e-38   1.78347403e-38   1.78275657e-38   1.78203910e-38
    1.78103465e-38   1.77959972e-38   1.77816479e-38   1.77644288e-38
    1.77357302e-38   1.76898124e-38   0.00000000e+00   5.93486894e-39
    5.98078669e-39   6.00948528e-39   6.02670444e-39   6.04105373e-39
    6.05540303e-39   6.06544754e-39   6.07262218e-39   6.07979683e-39
    6.08697148e-39]
 offset3: 474
 }}}
 The first array is read correctly, but the file offset reported as
 {{{offset1}}} after the first {{{fromfile()}}} call is wrong: it should be
 241, but is 242 instead. Every later read therefore starts one byte late
 and returns garbage.

 I glanced at the C code that implements {{{fromfile()}}} and didn't see
 anything obviously incorrect. It does, however, make a copy of the
 underlying file handle to do the read. I wonder whether this is exposing
 a bug in glibc or in Python's file-handling layer.
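
 As a rough illustration of why a copied file handle could matter, the
 sketch below compares the offset a buffered Python file object reports
 with the offset of the underlying OS file descriptor after a small read.
 It does not establish the cause of this bug; the helper name
 {{{offsets_after_read}}} is hypothetical, and the buffer size in use
 depends on the Python version and platform.

```python
import os
import tempfile


def offsets_after_read(path, nbytes=125):
    """Return (logical offset, descriptor offset) after a buffered read."""
    with open(path, "rb") as f:
        f.read(nbytes)
        # Position as tracked by the buffered Python file object.
        logical = f.tell()
        # Position of the underlying descriptor; a zero-length relative
        # seek reports it without moving it.
        raw = os.lseek(f.fileno(), 0, os.SEEK_CUR)
    return logical, raw


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "buffered.bin")
    with open(path, "wb") as f:
        f.write(b"\0" * 16384)
    logical, raw = offsets_after_read(path)
    print("logical offset:", logical)
    print("descriptor offset:", raw)
```

 Because the buffered layer reads ahead, the descriptor offset is normally
 larger than the logical one, so any code that works on a duplicate of the
 handle has to reposition it explicitly before reading.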

 Thanks,
 Allen

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/2210>
NumPy <http://projects.scipy.org/numpy>