[Numpy-discussion] numpy.savez does /not/ compress!?

Hans Meine meine@informatik.uni-hamburg...
Tue Jun 8 02:48:53 CDT 2010


Hi,

I just wondered why numpy.load("foo.npz") was so much faster than loading 
(gzip-compressed) hdf5 file contents, and found that numpy.savez did not 
compress my files at all.  So there is currently no point in using numpy.savez 
instead of numpy.save when you're not using the multiple-arrays-per-file 
feature.  (On the contrary, it even complicates loading, since you need to 
choose and remember a name for the archive member.)
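
To illustrate the extra step when loading, here is a minimal sketch. The array contents and the member name `data` are made up for this example, and in-memory buffers stand in for real files:

```python
import io

import numpy as np

arr = np.arange(10)  # illustrative data only

# numpy.save / numpy.load: one array per file, returned directly.
npy_buf = io.BytesIO()
np.save(npy_buf, arr)
npy_buf.seek(0)
from_npy = np.load(npy_buf)

# numpy.savez / numpy.load: the array becomes a named archive member,
# so the loader has to know (or look up) that name.
npz_buf = io.BytesIO()
np.savez(npz_buf, data=arr)
npz_buf.seek(0)
with np.load(npz_buf) as archive:
    member_names = list(archive.files)
    from_npz = archive["data"]
```

If the array is passed positionally instead of by keyword, numpy.savez picks the member name itself (`arr_0` for the first array), which still has to be spelled out when loading.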

But is that intended?  The numpy.savez docstring says "Save several arrays 
into a single, *compressed* file in ``.npz`` format." (emphasis mine), so I 
guess this might be a bug, or at least a missing feature.  In fact, the 
implementation simply uses the zipfile.ZipFile class, without specifying the 
'compression' argument to the constructor.  

From http://docs.python.org/library/zipfile.html :
> `compression` is the ZIP compression method to use when writing the archive,
> and should be ZIP_STORED or ZIP_DEFLATED; unrecognized values will cause
> RuntimeError to be raised. If ZIP_DEFLATED is specified but the zlib module
> is not available, RuntimeError is also raised. The default is ZIP_STORED.
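
The effect of that default can be checked with zipfile alone. The following is only a sketch: the helper `zipped_size`, the member name, and the all-zero payload are my own stand-ins for array data, not part of numpy, and it assumes the zlib module is available:

```python
import io
import zipfile


def zipped_size(payload, compression=None):
    """Return the size in bytes of an in-memory ZIP archive holding payload.

    With compression=None the ZipFile constructor is called without the
    'compression' argument, so its default applies.
    """
    buf = io.BytesIO()
    kwargs = {} if compression is None else {"compression": compression}
    with zipfile.ZipFile(buf, mode="w", **kwargs) as zf:
        zf.writestr("arr_0.npy", payload)
    return len(buf.getvalue())


# One million zero bytes: a highly compressible stand-in for array data.
payload = bytes(1000000)

default_size = zipped_size(payload)
stored_size = zipped_size(payload, zipfile.ZIP_STORED)
deflated_size = zipped_size(payload, zipfile.ZIP_DEFLATED)

print("default :", default_size)
print("stored  :", stored_size)
print("deflated:", deflated_size)
```

The default and ZIP_STORED archives come out the same size, slightly larger than the payload itself, while the ZIP_DEFLATED archive is much smaller for data like this.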

Greetings,
  Hans

