[Numpy-discussion] numpy.load and gzip file handles

Matthew Miller mattdm@mattdm....
Sun Feb 1 20:58:18 CST 2009


Hi everyone.

I'd like to log the state of my program as it progresses. Using the
numpy.save / numpy.load functions on the same filehandle repeatedly works
very well for this -- but ends up making a file which very quickly grows to
gigabytes. The data compresses well, though, so I thought I'd use Python's
built-in gzip module underneath. This works great for saving -- but when it
comes time to play back, there's an issue:

  >>> import numpy
  >>> import gzip
  >>> f=open("test.gz")
  >>> g=gzip.GzipFile(None,"rb",9,f)
  >>> g
  <gzip open file 'test.gz', mode 'r' at 0xbaad50 0xc0ab90>
  >>> numpy.load(g)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib64/python2.5/site-packages/numpy/lib/io.py", line 133, in load
      fid.seek(-N,1) # back-up
  TypeError: seek() takes exactly 2 arguments (3 given)

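Turns out you can't rewind gzip file handles this way in Python: here
GzipFile.seek() takes only an offset, with no whence argument, so the
relative seek fails. Oops. The offending code is the part of numpy.load that
distinguishes between npy and npz files. Could something maybe be added to
just trust me that it's an npy?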

Or better yet, is there something I'm doing wrong / overlooking?
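
For illustration, the "just trust me that it's an npy" idea can be sketched
with numpy.lib.format.read_array, which parses the npy header itself and only
reads forward, so it should not need the relative seek that numpy.load
performs. This is a rough sketch, not a tested fix for the Python 2.5 setup in
the traceback: it assumes a recent Python 3 and NumPy, and the helper names
save_states and load_states are made up for the example.

```python
import gzip
import io

import numpy as np
from numpy.lib import format as npy_format


def save_states(fileobj, arrays):
    """Write each array, one after another, as npy records in a gzip stream.

    Hypothetical helper: `fileobj` is any binary file-like object.
    """
    with gzip.GzipFile(fileobj=fileobj, mode="wb") as gz:
        for arr in arrays:
            np.save(gz, arr)


def load_states(fileobj):
    """Yield the arrays back in order, reading strictly forward.

    read_array() is told nothing about npz; it assumes every record is npy.
    """
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
        # peek() returns an empty bytes object once the stream is exhausted.
        while gz.peek(1):
            yield npy_format.read_array(gz)


if __name__ == "__main__":
    buf = io.BytesIO()  # stands in for a real file such as test.gz
    save_states(buf, [np.arange(5), np.ones((2, 3))])
    buf.seek(0)
    for state in load_states(buf):
        print(state.shape)
```

Because the arrays are yielded one at a time, playback does not need to hold
the whole decompressed log in memory.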

Thanks!



-- 
Matthew Miller           mattdm@mattdm.org          <http://mattdm.org/>

