[SciPy-dev] huge speed regression in loadmat from 0.6.0 to 0.7.0
Ryan May
rmay31@gmail....
Wed Feb 11 10:53:46 CST 2009
On Wed, Feb 11, 2009 at 4:45 AM, Nathaniel Smith <njs@pobox.com> wrote:
> The improvements to loadmat in 0.7.0 are wonderful! Thanks for the
> work on that. But I've run into one snag... on a matlab file I care
> about:
>
> $ easy_install scipy==0.6.0
> $ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
> real 0m4.172s
> user 0m2.908s
> sys 0m1.056s
>
> $ easy_install scipy==0.7.0
> $ time python -c 'import scipy.io; scipy.io.loadmat("test.mat")'
> real 3m10.556s
> user 1m14.713s
> sys 1m55.731s
>
> So it became ~50 times slower, and quite unusable.
>
> All that time seems to be disappearing into GzipInputStream.__fill,
> and in particular, line_profiler says:
>
>
>     95      8509       270975     31.8      0.1  data = self.fileobj.read(n_to_fetch)
>     96      8509        37703      4.4      0.0  self._bytes_read += len(data)
>     97      8509        27164      3.2      0.0  if data:
>     98      8509    190425980  22379.4     99.6      self.data += self._unzipper.decompress(data)
>
> I'm thinking this is one of those times where the quadratic-time
> overhead to string append is worth avoiding...
Well, here's a patch against gzipstreams.py that appends the decompressed
chunks to a list and only joins them into a single string at the very end.
See if it helps your case. If not, is there somewhere you can put the data
file so that we can test with it?
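
For anyone reading along without the attachment, here is a rough sketch of the
general idea (collect chunks in a list, join once). This is not the attached
diff. The function names decompress_concat, decompress_join and split_chunks
are made up for illustration, and zlib.decompressobj() merely stands in for
whatever self._unzipper is in gzipstreams.py. It is written for current
Python, where the data are bytes objects.

```python
import zlib


def decompress_concat(chunks):
    """Grow one bytes object with repeated +=.

    Each += may copy everything accumulated so far, so the total work
    can grow quadratically with the size of the output.
    """
    unzipper = zlib.decompressobj()
    data = b""
    for chunk in chunks:
        data += unzipper.decompress(chunk)
    return data + unzipper.flush()


def decompress_join(chunks):
    """Collect decompressed pieces in a list and join them once at the end."""
    unzipper = zlib.decompressobj()
    parts = []
    for chunk in chunks:
        parts.append(unzipper.decompress(chunk))
    parts.append(unzipper.flush())
    return b"".join(parts)


def split_chunks(blob, size):
    """Hypothetical helper: cut a bytes object into fixed-size pieces."""
    return [blob[i:i + size] for i in range(0, len(blob), size)]


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096
    chunks = split_chunks(zlib.compress(payload), 1024)
    # Both strategies must produce identical output; only the cost differs.
    assert decompress_join(chunks) == payload
    assert decompress_concat(chunks) == payload
```

Both functions return the same bytes. The second one just avoids re-copying
the accumulated buffer on every read, which is where the profile above says
the time is going.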
Ryan
--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
Sent from: Norman Oklahoma United States.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gzipstreams_speedup.diff
Type: application/octet-stream
Size: 1289 bytes
Desc: not available
Url : http://projects.scipy.org/pipermail/scipy-dev/attachments/20090211/6b602fa2/attachment-0001.obj