[Numpy-discussion] "import numpy" is slow
Andrew Dalke
dalke@dalkescientific....
Wed Jul 30 15:12:19 CDT 2008
On Jul 4, 2008, at 2:22 PM, Andrew Dalke wrote:
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'pass'
> 0.015u 0.042s 0:00.06 83.3% 0+0k 0+0io 0pf+0w
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke% time python -c 'import numpy'
> 0.084u 0.231s 0:00.33 93.9% 0+0k 0+8io 0pf+0w
> [josiah:numpy/build/lib.macosx-10.3-fat-2.5] dalke%
> For one of my clients I wrote a tool to analyze import times. I
> don't have it, but here's something similar I just now whipped up:
Based on those results I've been digging into the code trying to
figure out why numpy imports so many files, and at the same time I've
been trying to guess at the use case Robert Kern regards as typical
when he wrote:
    Your use case isn't so typical and so suffers on the import
    time end of the balance
and trying to figure out what code would break if those modules
weren't all eagerly imported and were instead written as most other
Python modules are written.
I have two thoughts for why mega-importing might be useful:
  - interactive users get to do tab complete and see everything
    (eg, "import numpy" means "numpy.fft.ifft" works, without
    having to do "import numpy.fft" manually)
  - class inspectors don't need to do directory checks to find
    possible modules
    (This is a stretch, since every general purpose inspector I
    know of has to know how to frob the directories to find directories.)
Are these the reasons numpy imports everything or are there other
reasons?
The first guess comes from the comment in numpy/__init__.py
"The following sub-packages must be explicitly imported:"
meaning, I take it, that the other modules (core, lib, random,
linalg, fft, testing)
do not need to be explicitly imported.
Is the numpy recommendation that people should do:
    import numpy
    numpy.fft.ifft(data)
? If so, the documentation should be updated to say that "random",
"ma", "ctypeslib" and several other libraries are included in that
list. Why is the last so important that it should be in the top-level
namespace?
In my opinion, this assistance is counter to standard practice in
effectively every other Python package. I don't see the benefit.
You may ask if there are possible improvements. There's no obvious
place taking up a bunch of time but there are plenty of small places
which add up.
For example:
1) I wondered why 'cPickle' needed to be imported. One of the places
it's used is numpy.lib.format which is only imported by
numpy.lib.io. It's easy to defer the 'import format' to be inside
the functions which need it. Note that io.py already defers the
import of zipfile, so function-local imports are not inappropriate.
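The deferral pattern can be sketched roughly as follows. This is only an illustration, not numpy's code: the function name "save_archive" and the member name "payload.txt" are hypothetical, and the sketch is written for Python 3.

```python
def save_archive(path, payload):
    """Write `payload` (a str) into a one-member zip archive at `path`.

    Hypothetical stand-in for a rarely used function; not numpy's code.
    """
    # Function-local import: the cost of loading zipfile is paid on the
    # first call, not when the enclosing module is imported. Later calls
    # find the module in sys.modules, so repeating the statement is cheap.
    import zipfile

    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr("payload.txt", payload)
```

Callers who never save an archive never pay for importing zipfile at all, which is the trade being proposed here.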
'io' imports 'tempfile', needing 0.016 seconds. This can be a
deferred cost only incurred by those who use io.savez, which already
has some function-local imports. The reason for the high import
costs? Here's what tempfile itself imports.
    tempfile: 0.016 (io)
      errno: 0.000 (tempfile)
      random: 0.010 (tempfile)
        binascii: 0.003 (random)
        _random: 0.003 (random)
      fcntl: 0.003 (tempfile)
      thread: 0.000 (tempfile)
(This is read as 'tempfile' is imported by 'io' and takes 0.016
seconds total, including all children, and the directly imported
children of 'tempfile' are 'errno', 'random', 'fcntl' and 'thread'.
'random' imports 'binascii' and '_random'.)
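For readers who want to produce a listing in this "name: seconds (parent)" form themselves, here is a minimal sketch. It is not the tool used for the numbers in this message (that tool is not shown here); the names "profile_imports" and "format_records" are hypothetical, it assumes Python 3, and it is not thread-safe because it temporarily replaces the global import hook.

```python
import builtins
import sys
import time


def profile_imports(module_name):
    """Import `module_name`, returning (name, parent, seconds) records.

    Each time includes all children. Records are appended when an import
    finishes, so children appear before their parent.
    """
    records = []
    stack = ["<top>"]  # name of the module currently doing the importing
    original_import = builtins.__import__

    def timed_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Skip relative imports and modules that are already loaded;
        # anything they import is still timed through this wrapper.
        if level != 0 or name in sys.modules:
            return original_import(name, globals, locals, fromlist, level)
        parent = stack[-1]
        stack.append(name)
        start = time.perf_counter()
        try:
            return original_import(name, globals, locals, fromlist, level)
        finally:
            elapsed = time.perf_counter() - start
            stack.pop()
            records.append((name, parent, elapsed))

    builtins.__import__ = timed_import
    try:
        timed_import(module_name)
    finally:
        builtins.__import__ = original_import  # always restore the hook
    return records


def format_records(records):
    """Render records in the 'name: seconds (parent)' style used above."""
    return ["%s: %.3f (%s)" % (name, seconds, parent)
            for name, parent, seconds in records]
```

Running it in a fresh interpreter matters: a module that is already in sys.modules produces no record. Recent Python 3 releases also offer "python -X importtime" for the same purpose.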
BTW, the load and save commands in io do an incorrect check.
    if isinstance(file, type("")):
        fid = _file(file,"rb")
    else:
        fid = file
Filenames can be unicode strings. This test should either be
    isinstance(file, basestring)
or
    not hasattr(file, 'read')
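A minimal sketch of the second, duck-typed form of the test, assuming Python 3 (where "basestring" no longer exists). The helper name "open_for_reading" is hypothetical and this is not the code in numpy's io module.

```python
def open_for_reading(file):
    """Return a binary file object for `file`.

    `file` may be a filename (of any string type) or an object that
    already provides a read() method. Hypothetical helper, not numpy's.
    """
    if hasattr(file, "read"):
        # Already file-like: hand it back unchanged.
        return file
    # Anything else is treated as a path and opened for binary reading.
    return open(file, "rb")
```

Testing for the read() method avoids enumerating every acceptable filename type, so unicode filenames are handled without a special case.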
2) What's the point of "add_newdocs"? According to the top of the
module
    # This is only meant to add docs to objects defined in C-extension modules.
    # The purpose is to allow easier editing of the docstrings without
    # requiring a re-compile.
which implies this aids development, but not deployment. The import
takes a minuscule 0.006 seconds of the 0.225 ("import lib" and its
subimports takes 0.141 seconds) but seems to add no direct end-user
benefit. Shouldn't this documentation be pushed into the C code at
least for each release?
3) I see that numpy/core/numerictypes.py imports 'string', which
takes 0.008 seconds. I wondered why. It's part of "english_lower",
"english_upper", and "english_capitalize", which are functions
defined in that module. The implementation can't be improved, and
using string.translate is the right approach.
However,
3a) these functions have no leading underscore and have
docstrings, which implies that they are part of the public API (although
they are not included in __all__). Are they meant for general use?
Note that english_capitalize is over-engineered for the use-case in
that file. There are no empty type names, so the test "if s" is
never false.
3b) there are only 33 types in that module so a hand-written
lookup table mapping the name to the appropriate name/alias would
work. Yes, it makes adding new types less than completely automatic,
but that's done rarely.
Getting rid of these functions, and thus getting rid of the import
speeds numpy startup time by 3.5%.
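As a point of comparison, here is a sketch of how the same locale-independent case conversion could be written without importing 'string' at all. It assumes Python 3, where str.maketrans and str.translate are built in; the function names mirror the ones discussed above, but the bodies are my illustration, not the code in numerictypes.py.

```python
# Translation tables built from literals, so no module import is needed.
_ASCII_LOWER = "abcdefghijklmnopqrstuvwxyz"
_ASCII_UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_TO_LOWER = str.maketrans(_ASCII_UPPER, _ASCII_LOWER)
_TO_UPPER = str.maketrans(_ASCII_LOWER, _ASCII_UPPER)


def english_lower(s):
    """Lowercase only the ASCII letters A-Z, ignoring the locale."""
    return s.translate(_TO_LOWER)


def english_upper(s):
    """Uppercase only the ASCII letters a-z, ignoring the locale."""
    return s.translate(_TO_UPPER)


def english_capitalize(s):
    """Uppercase the first character only; the empty string is returned as-is."""
    if s:
        return english_upper(s[0]) + s[1:]
    return s
```

For example, english_capitalize("int8") gives "Int8". The "if s" guard is kept here only so the sketch is safe for arbitrary input; as noted above, it is never needed for the type names in that file.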
4) numpy.testing takes 0.041 seconds to import. The text I quoted
above says that it's a numpy requirement that 'testing' always be
imported, even though I'm hard pressed to figure out why that's
important. Assuming it is important, 0.020 seconds is spent
importing 'difflib'
    difflib: 0.020 (utils)
      heapq: 0.016 (difflib)
        itertools: 0.003 (heapq)
        operator: 0.003 (heapq)
        bisect: 0.005 (heapq)
          _bisect: 0.003 (bisect)
        _heapq: 0.003 (heapq)
which is only used in numpy.testing.utils:assert_string . That can
be deferred.
Similarly,
    numpytest: 0.012 (numpy.testing)
      glob: 0.005 (numpytest)
        fnmatch: 0.002 (glob)
      shlex: 0.006 (numpytest)
        collections: 0.003 (shlex)
      numpy.testing.utils: 0.000 (numpytest)
but notice that 'glob' while imported is never used in 'numpytest',
and that 'shlex' can easily be a deferred import. This saves (for
the common case) 0.01 seconds.
5) There are some additional savings in _datasource
    _datasource: 0.016 (io)
      shutil: 0.003 (_datasource)
        stat: 0.000 (shutil)
      urlparse: 0.003 (_datasource)
      bz2: 0.003 (_datasource)
      gzip: 0.006 (_datasource)
        zlib: 0.003 (gzip)
This module provides the "Datasource" class, which is accessed
through "numpy.lib.io.Datasource".
Deferring the 'bz2' and 'gzip' imports until needed saves 0.01
seconds. This will require some modification to the code beyond
simply moving the import statement.
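One possible shape for that modification, sketched for Python 3: choose the opener by file extension and import the compression module only in the branch that needs it. The function name "open_by_extension" is hypothetical and this is not how _datasource is actually organized.

```python
def open_by_extension(path, mode="rb"):
    """Open `path`, transparently handling .gz and .bz2 files.

    Hypothetical helper for illustration. The compression modules are
    imported only when a file of that kind is actually opened.
    """
    if path.endswith(".gz"):
        import gzip  # deferred: only paid for by users of .gz files
        return gzip.open(path, mode)
    if path.endswith(".bz2"):
        import bz2  # deferred: only paid for by users of .bz2 files
        return bz2.BZ2File(path, mode)
    return open(path, mode)
```

Code that keeps a module-level table of opener functions would need a change along these lines, since building such a table eagerly is what forces both imports at startup.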
These together add up to about 0.08 seconds, which is about 30% of
the 'import numpy' cost.
I could probably get another 0.05 seconds if I dug around more, but I
can't without knowing what use case numpy is trying to achieve. Why
are all those ancillary modules (testing, ctypeslib) eagerly loaded
when there seems no need for that feature?
Andrew
dalke@dalkescientific.com