[SciPy-User] Python seek argument type

Nathaniel Smith njs@pobox....
Wed Apr 14 12:54:33 CDT 2010


On Wed, Apr 14, 2010 at 8:53 AM, Tom Kuiper <kuiper@jpl.nasa.gov> wrote:
> The argument in question, new_pos, has a value of 2,252,639,972, which
> is slightly too large to be a Python signed int:
> math.log(2252639972,2) = 31.068969607366373
> and, of course, a Python int is a C long on a 32-bit machine, which is
> the type specified for the first argument of the underlying C fseek command.
>
> Now a 2.1 GB file is not large by modern standards.  I imagine someone
> must have come up with a way to position past the 2 billionth byte in a
> file.  Does anyone know what it is?

The whole traditional POSIX file API has a 64-bit version -- on
Linux at least, you use functions that take "off_t" instead of "long".
For fseek the corresponding function is fseeko (a C library call, not
a syscall). And then you compile with -D_FILE_OFFSET_BITS=64 so that
off_t is 64 bits wide even on a 32-bit machine.

Python really should be taking care of all that mess for you, though.
Since Python runs on systems both with and without support for large
files, you do have to make sure that your build is using the right code. Is
your Python built by you, or does it come from some distributor? If
it's built by you, you might want to double-check how you compiled
it...? Or check if any bugs were fixed in this area in a later Python
point release? Since this problem doesn't have anything to do with
scipy, you might find more knowledgeable help on python-list.
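
One rough way to see how a given interpreter was compiled is to ask it
how wide the C off_t type is. This is only a sketch: it assumes a Python
new enough to have the sysconfig module (so not the 2.5 series discussed
here), it assumes the build records SIZEOF_OFF_T (Unix builds generally
do, others may return None), and off_t_bits is a helper name made up for
this example.

```python
import sysconfig


def off_t_bits():
    """Return the width of C off_t in bits for this build, or None if unknown."""
    # SIZEOF_OFF_T is taken from the build configuration and is in bytes.
    # It can be missing (None) on platforms that do not record it.
    size = sysconfig.get_config_var("SIZEOF_OFF_T")
    return None if size is None else int(size) * 8


bits = off_t_bits()
if bits is None:
    print("this build does not report the size of off_t")
elif bits >= 64:
    print("off_t is %d bits: large file offsets should work" % bits)
else:
    print("off_t is only %d bits: seeks past 2**31 - 1 will likely fail" % bits)
```

If this reports 32 bits, the interpreter was most likely built without
large file support and rebuilding it is the thing to look at.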

I can confirm that with Ubuntu's Python 2.5.4, fd.seek(2252639972, 0)
works fine.
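
For anyone who wants to repeat that check, here is a minimal sketch. It
assumes a Python built with large file support and a filesystem that
lets you seek past the end of an empty file. The names NEW_POS,
INT32_MAX and seek_past_2gb are made up for this example. Nothing is
written, so the temporary file never takes up 2 GB of disk.

```python
import os
import tempfile

NEW_POS = 2252639972      # the offset from the original question
INT32_MAX = 2**31 - 1     # largest value a signed 32-bit C long can hold


def seek_past_2gb(pos=NEW_POS):
    """Seek to pos in an empty temporary file and return the reported position."""
    with tempfile.TemporaryFile() as fd:
        # Seeking beyond end-of-file is allowed; no data is written here.
        fd.seek(pos, os.SEEK_SET)
        return fd.tell()


print("offset exceeds 32-bit signed range:", NEW_POS > INT32_MAX)
print("position after seek:", seek_past_2gb())
```

On a build without large file support the seek call is where you would
expect to see an error instead.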

-- Nathaniel

