[SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
Nathaniel Smith
njs@pobox....
Wed Feb 11 04:45:14 CST 2009
The improvements to loadmat in 0.7.0 are wonderful! Thanks for the
work on that. But I've run into one snag on a MATLAB file I care
about:
$ easy_install scipy==0.6.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real 0m4.172s
user 0m2.908s
sys 0m1.056s
$ easy_install scipy==0.7.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real 3m10.556s
user 1m14.713s
sys 1m55.731s
So it became ~50 times slower, and quite unusable.
All that time seems to be disappearing into GzipInputStream.__fill,
and in particular, line_profiler says:
    95      8509       270975     31.8      0.1      data = self.fileobj.read(n_to_fetch)
    96      8509        37703      4.4      0.0      self._bytes_read += len(data)
    97      8509        27164      3.2      0.0      if data:
    98      8509    190425980  22379.4     99.6          self.data += self._unzipper.decompress(data)
I'm thinking this is one of those times where the quadratic-time
overhead to string append is worth avoiding...
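To illustrate what I mean, here is a minimal sketch of the usual workaround: collect the decompressed chunks in a list and join them once at the end, so each byte is copied once rather than on every append. This is not the scipy code. The function name read_all_decompressed and the chunk_size parameter are made up for illustration, and it assumes a plain zlib stream read from a file-like object.

```python
import io
import zlib


def read_all_decompressed(fileobj, chunk_size=8192):
    """Decompress a zlib stream from fileobj (hypothetical helper).

    Chunks are gathered in a list and joined once, instead of
    growing a single string with += on every read.
    """
    unzipper = zlib.decompressobj()
    chunks = []
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            break
        # Appending to a list is cheap; no existing data is copied.
        chunks.append(unzipper.decompress(data))
    # Pick up anything the decompressor is still buffering.
    chunks.append(unzipper.flush())
    # A single join copies each byte exactly once.
    return b"".join(chunks)


# Small self-check on synthetic data (not the real test.mat).
payload = b"x" * 1000000
stream = io.BytesIO(zlib.compress(payload))
result = read_all_decompressed(stream)
```

A stream class that hands out data incrementally would need to keep the list (or a similar buffer) around between reads, but the idea is the same: avoid re-copying everything already decompressed each time a new chunk arrives.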
-- Nathaniel