[SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0

Nathaniel Smith njs@pobox....
Wed Feb 11 04:45:14 CST 2009


The improvements to loadmat in 0.7.0 are wonderful! Thanks for the
work on that. But I've run into one snag on a MATLAB file I care
about:

$ easy_install scipy==0.6.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real    0m4.172s
user    0m2.908s
sys     0m1.056s

$ easy_install scipy==0.7.0
$ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
real    3m10.556s
user    1m14.713s
sys     1m55.731s

So wall-clock time went from about 4 seconds to over 3 minutes,
roughly 45 times slower, which makes it quite unusable.

All that time seems to be disappearing into GzipInputStream.__fill,
and in particular, line_profiler says:


    95      8509       270975     31.8      0.1              data = self.fileobj.read(n_to_fetch)
    96      8509        37703      4.4      0.0              self._bytes_read += len(data)
    97      8509        27164      3.2      0.0              if data:
    98      8509    190425980  22379.4     99.6                  self.data += self._unzipper.decompress(data)

I'm thinking this is one of those times where the quadratic-time
overhead of repeated string appends is worth avoiding...
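
For illustration only, here is a minimal sketch of one common way to avoid
the repeated append: collect the decompressed pieces in a list and join them
once at the end. This is not the actual GzipInputStream code; the function
name read_all_decompressed and the chunk_size parameter are hypothetical,
and it assumes the whole stream can be read up front, which may not match
how loadmat consumes the data.

```python
import io
import zlib


def read_all_decompressed(fileobj, chunk_size=8192):
    """Decompress a zlib stream from fileobj without repeated appends.

    Hypothetical helper: pieces are collected in a list and joined once,
    so the total copying work is linear in the size of the output.
    """
    unzipper = zlib.decompressobj()
    pieces = []
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            break
        # Appending to a list does not copy the bytes accumulated so far.
        pieces.append(unzipper.decompress(data))
    # Pick up anything still buffered inside the decompressor.
    pieces.append(unzipper.flush())
    return b"".join(pieces)


# Usage: round-trip some made-up data through an in-memory stream.
payload = b"matrix data " * 100000
stream = io.BytesIO(zlib.compress(payload))
assert read_all_decompressed(stream) == payload
```

If the reader has to hand out data incrementally rather than all at once,
the same idea should still apply: keep a list of pending chunks and only
join them when the caller actually asks for bytes.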

-- Nathaniel

