[SciPy-user] read/write compressed files

Antonino Ingargiola tritemio@gmail....
Sat Jun 23 04:02:59 CDT 2007


2007/6/21, Francesc Altet <faltet@carabos.com>:


> Ok, that's fine. In any case, I'm interested in knowing why you are
> using bzip2 instead of zlib.  Have you detected some data pattern where
> you get significantly more compression than by using zlib, for
> example?
> I'm asking this because, in my experience with numerical data, I was
> unable to detect important compression level differences between bzip2
> and zlib. See:
> http://www.pytables.org/docs/manual/ch05.html#compressionIssues
> for some experiments in that regard.
> I'd appreciate any input on this subject (bzip2 vs zlib).

This is probably not very meaningful, but with ASCII data (floats
stored as ASCII text) bzip2 seems to have some advantage over gzip,
both in compression speed and in compression ratio (decompression,
though, is slower):

  $ du -h lena.txt
  3,1M    lena.txt

  $ time gzip  -9 lena.txt

  real    0m4.937s        <=
  user    0m4.758s
  sys     0m0.018s

  $ du -h lena.txt.gz
  316K    lena.txt.gz

  $ time gunzip lena.txt.gz

  real    0m0.092s
  user    0m0.038s
  sys     0m0.020s

  $ time bzip2 lena.txt

  real    0m2.524s        <=
  user    0m2.396s
  sys     0m0.027s

  $ du -h lena.txt.bz2
  188K    lena.txt.bz2

  $ time bunzip2 lena.txt.bz2

  real    0m0.868s
  user    0m0.775s
  sys     0m0.040s
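
For anyone who wants to repeat this kind of comparison from Python, here
is a minimal sketch using the standard-library zlib and bz2 modules. It
does not reproduce the lena.txt numbers above: the input is synthetic
(random floats formatted as text), and make_ascii_floats and compare are
names made up for this illustration. Timings and sizes will depend on
the data and the machine.

```python
import bz2
import random
import time
import zlib


def make_ascii_floats(n_values=20000, seed=0):
    """Return synthetic numeric data as ASCII text, one float per line.

    Assumption: this is a stand-in for a real file such as lena.txt.
    """
    rng = random.Random(seed)
    lines = ["%.6f" % rng.gauss(0.0, 1.0) for _ in range(n_values)]
    return ("\n".join(lines) + "\n").encode("ascii")


def compare(data):
    """Compress and decompress `data` with zlib and bz2 at level 9.

    Returns a dict mapping codec name to compressed size and timings.
    """
    codecs = (
        ("zlib", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("bz2", lambda d: bz2.compress(d, 9), bz2.decompress),
    )
    results = {}
    for name, compress, decompress in codecs:
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        # Make sure the round trip is lossless before trusting the numbers.
        assert unpacked == data
        results[name] = {
            "size": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    data = make_ascii_floats()
    print("original: %d bytes" % len(data))
    for name, r in compare(data).items():
        print("%-5s %8d bytes  compress %.3fs  decompress %.3fs"
              % (name, r["size"], r["compress_s"], r["decompress_s"]))
```

To test a real file instead, read it in binary mode and pass the bytes
to compare().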

Even though it's usually a bad idea to store numerical data in ASCII
format, it can sometimes be handy.
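
As an illustration of the handy case, floats can be written to and read
back from a bzip2-compressed text file without an intermediate
uncompressed file. This is only a sketch, assuming a current Python 3
where bz2.open supports text mode; write_ascii_bz2 and read_ascii_bz2
are hypothetical helper names, not part of SciPy or NumPy.

```python
import bz2
import os
import tempfile


def write_ascii_bz2(path, values):
    """Write one float per line, as ASCII text, into a bzip2 file."""
    with bz2.open(path, "wt", encoding="ascii") as f:
        for v in values:
            # repr() of a float round-trips exactly through float().
            f.write(repr(float(v)) + "\n")


def read_ascii_bz2(path):
    """Read back the floats written by write_ascii_bz2."""
    with bz2.open(path, "rt", encoding="ascii") as f:
        return [float(line) for line in f if line.strip()]


if __name__ == "__main__":
    values = [0.1 * i for i in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.txt.bz2")
        write_ascii_bz2(path, values)
        restored = read_ascii_bz2(path)
        print("round trip ok:", restored == values)
        print("compressed size: %d bytes" % os.path.getsize(path))
```

The same pattern works with gzip.open in place of bz2.open if a zlib
based format is preferred.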


    ~ Antonio
