[Numpy-discussion] reading gzip compressed files using numpy.fromfile

Peter Schmidtke pschmidtke@mmb.pcb.ub...
Wed Oct 28 14:31:43 CDT 2009

Dear Numpy Mailing List Readers,

I have a fairly simple problem for which I have not found a solution so far.
I have a gzipped file lying around that has some numbers stored in it, and I
want to read them into a numpy array as fast as possible, but only a chunk
of data at a time.
So I would like to use numpy's fromfile function.

For now I have roughly the following code:

        f=gzip.open( "myfile.gz", "r" )

The idea is to read 400 entries from the file, keep it open, process my data,
come back and read the next 400 entries. If I pass f to fromfile, numpy
complains that f is not a normal file handle:
IOError: first argument must be an open file

In fact, f is a gzip file object rather than a normal file handle. However,
gzip gives access to the underlying file handle through f.fileobj.

So I tried:

        xyz=npy.fromfile(f.fileobj,dtype="float32",count=400)

But then I get only meaningless values (not the actual data), and when I
specify the sep=" " argument for npy.fromfile I get just .1 and nothing else.
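
A plausible reason for the meaningless values is that f.fileobj is the file as it sits on disk, so reading from it returns the compressed bytes rather than the numbers. The sketch below checks this on a small test file; the file name, the test data and the helper variable names are made up for illustration and are not part of the original setup.

```python
import gzip
import os
import tempfile

import numpy as np

# Hypothetical test data: 1000 float32 values stored as raw bytes in a gzip file.
data = np.arange(1000, dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "myfile.gz")
with gzip.open(path, "wb") as out:
    out.write(data.tobytes())

with gzip.open(path, "rb") as f:
    # f.fileobj is the underlying on-disk file, i.e. the compressed stream.
    f.fileobj.seek(0)
    raw_header = f.fileobj.read(2)

# A gzip stream starts with the magic bytes 0x1f 0x8b, not with float data.
is_gzip_header = raw_header == b"\x1f\x8b"
print(is_gzip_header)
```

Interpreting those compressed bytes as float32 would give values unrelated to the stored numbers, which matches the behaviour described above.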

Can you tell me why this happens and how to fix it? I know that I could read
everything into memory, but these files are rather big, so I simply have to
avoid this.
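
One possible way to read a chunk at a time without loading the whole file is to let gzip decompress a fixed number of bytes and hand them to numpy.frombuffer. This is only a sketch under assumptions: it assumes the file holds raw binary float32 values in native byte order (not text), and read_chunks, the test file and the chunk size are hypothetical names and values chosen for illustration.

```python
import gzip
import os
import tempfile

import numpy as np


def read_chunks(path, dtype=np.float32, count=400):
    """Yield arrays of up to `count` items decoded from a gzip-compressed binary file."""
    itemsize = np.dtype(dtype).itemsize
    with gzip.open(path, "rb") as f:
        while True:
            # Decompress only as many bytes as one chunk needs.
            buf = f.read(count * itemsize)
            # Ignore a trailing partial item, if the file is truncated.
            usable = len(buf) - len(buf) % itemsize
            if usable == 0:
                break
            yield np.frombuffer(buf[:usable], dtype=dtype)


# Hypothetical test file: 1000 float32 values written as raw bytes.
data = np.arange(1000, dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "myfile.gz")
with gzip.open(path, "wb") as out:
    out.write(data.tobytes())

chunks = list(read_chunks(path, count=400))
sizes = [len(c) for c in chunks]
print(sizes)
```

The arrays returned by numpy.frombuffer are read-only views of the decompressed bytes; calling .copy() on a chunk gives a writable array if the processing step needs one.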

Thanks in advance.


Peter Schmidtke

PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona
