[SciPy-user] read/write compressed files
Sat Jun 23 04:35:06 CDT 2007
I remember that even for my binary data bzip compressed about 20% better, but
was significantly slower. Best of course would be to have both (and more)
compressors and choose whichever suits the case best. But in the real world
zlib is probably the more general choice, if only one compressor is intended.
PS. Yes, it's a very bad idea to keep real numbers as ascii.
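
For anyone who wants to check the trade-off on their own data, here is a minimal sketch that compresses the same values with zlib and bz2, once as raw binary doubles and once as ASCII text, and reports size and time. It uses only the Python standard library. The test signal (a sine wave plus a little noise) and the helper name `compare` are assumptions made up for illustration; they are not the lena.txt data from the quoted message, so the numbers will differ from those timings.

```python
import array
import bz2
import math
import random
import time
import zlib


def compare(payload):
    """Return {name: (compressed_size_in_bytes, seconds)} for zlib and bz2."""
    compressors = (
        ("zlib", lambda b: zlib.compress(b, 9)),  # level 9, like gzip -9
        ("bz2", lambda b: bz2.compress(b, 9)),    # 9 is also the bz2 default
    )
    results = {}
    for name, compress in compressors:
        start = time.perf_counter()
        packed = compress(payload)
        results[name] = (len(packed), time.perf_counter() - start)
    return results


# Hypothetical test data: a smooth signal plus small Gaussian noise.
random.seed(0)
values = [math.sin(0.001 * i) + random.gauss(0.0, 0.01) for i in range(50_000)]

# The same numbers in two representations.
binary = array.array("d", values).tobytes()  # 8 bytes per value
ascii_text = "\n".join(f"{v:.6f}" for v in values).encode("ascii")

for label, payload in (("binary", binary), ("ascii", ascii_text)):
    for name, (size, secs) in compare(payload).items():
        print(f"{label:6s} {name:4s} {len(payload):7d} -> {size:7d} bytes "
              f"in {secs:.3f}s")
```

Timings depend heavily on the machine and on the data, so treat the output as a rough guide only. With a NumPy array, `arr.tobytes()` should give the binary payload in the same way.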
Antonino Ingargiola wrote:
> 2007/6/21, Francesc Altet <firstname.lastname@example.org>:
>> Ok, that's fine. In any case, I'm interested in knowing the reasons on
>> why you are using bzip2 instead of zlib. Have you detected some data
>> pattern where you get significantly more compression than by using zlib,
>> for example?
>> I'm asking this because, in my experience with numerical data, I was
>> unable to detect important compression level differences between bzip2
>> and zlib. See:
>> for some experiments in that regard.
>> I'd appreciate any input on this subject (bzip2 vs zlib).
> Probably not very meaningful, but with ascii data (floats as ascii)
> bzip2 seems to have a certain advantage (both in speed and
> compression ratio):
> $ du -h lena.txt
> 3,1M lena.txt
> $ time gzip -9 lena.txt
> real 0m4.937s <=
> user 0m4.758s
> sys 0m0.018s
> $ du -h lena.txt.gz
> 316K lena.txt.gz
> $ time gunzip lena.txt.gz
> real 0m0.092s
> user 0m0.038s
> sys 0m0.020s
> $ time bzip2 lena.txt
> real 0m2.524s <=
> user 0m2.396s
> sys 0m0.027s
> $ du -h lena.txt.bz2
> 188K lena.txt.bz2
> $ time bunzip2 lena.txt.bz2
> real 0m0.868s
> user 0m0.775s
> sys 0m0.040s
> Even if it's usually a bad idea to put numerical data in ascii format,
> it may sometimes be handy.
> ~ Antonio
Dominik Szczerba, Ph.D.
Computer Vision Lab CH-8092 Zurich